NOT IN subquery optimization

Started by Jim Finnertyalmost 7 years ago65 messages
#1Jim Finnerty
jfinnert@amazon.com

The semantics of NOT IN (SELECT ...) are subtly different from the semantics
of NOT EXISTS (SELECT ...). These differences center on how NULLs are
treated, and in general can result in statements that are harder to optimize
and slower to execute than the apparently similar NOT EXISTS statement.

A little over a year ago, Christian Antognini authored the blog "/How Well a
Query Optimizer Handles Subqueries?/" summarizing his findings about the
performance of PostgreSQL, MySQL, and Oracle on various subqueries:

https://antognini.ch/2017/12/how-well-a-query-optimizer-handles-subqueries/

His position was that you can classify the optimizations as correct or
incorrect, and based on that he provided the following comparison summary
(see below). In short, PostgreSQL was the worst of the three systems:

"Summary

The number of queries that the query optimizers handle correctly are
the following:

Oracle Database 12.2: 72 out of 80
MySQL 8.0.3: 67 out of 80
PostgreSQL 10.0: 60 out of 80

Since not all queries are handled correctly, for best performance it is
sometimes necessary to rewrite them."

The subqueries that were found to be optimized "incorrectly" were almost
entirely due to poor or absent NOT IN subquery optimization.

The PostgreSQL community has been aware of the deficiencies in NOT IN
optimization for quite some time. Based on an analysis of
psgsql-performance posts between 2013 and 2015, Robert Haas identified NOT
IN optimization as one of the common root causes of performance problems.

We have been working on improved optimization of NOT IN, and we would like
to share this optimizaton with the community. With respect to the test
cases mentioned in the blog post mentioned above, it will elevate PostgreSQL
from "worst" to "first". Generally the performance gains are large when the
optimization applies, though we have found one test case where performance
is worse. We are investigating this now to see if we can disable the
optimization in that case.

We would like to include a patch for this change in the current commitfest.
This thread can be used to track comments about this optimization.

-----
Jim Finnerty, AWS, Amazon Aurora PostgreSQL
--
Sent from: http://www.postgresql-archive.org/PostgreSQL-hackers-f1928748.html

#2Tom Lane
tgl@sss.pgh.pa.us
In reply to: Jim Finnerty (#1)
Re: NOT IN subquery optimization

Jim Finnerty <jfinnert@amazon.com> writes:

We have been working on improved optimization of NOT IN, and we would like
to share this optimizaton with the community.

The idea that's been kicked around in the past is to detect whether the
subselect's output column(s) can be proved NOT NULL, and if so, convert
to an antijoin just like NOT EXISTS. Is that what you're doing, or
something different?

We would like to include a patch for this change in the current commitfest.

Generally, people just send patches, they don't ask for permission first
;-)

Having said that, we have a general policy that we don't like complex
patches that first show up for the last commitfest of a dev cycle.
So unless this is a pretty small patch, it's probably going to get
delayed to v13. Still, we'd like to have it in the queue, so submit
away ...

regards, tom lane

#3Jim Finnerty
jfinnert@amazon.com
In reply to: Tom Lane (#2)
Re: NOT IN subquery optimization

re: The idea that's been kicked around in the past is to detect whether the
subselect's output column(s) can be proved NOT NULL, and if so, convert
to an antijoin just like NOT EXISTS

basically, yes. this will handle nullability of both the outer and inner
correlated expression(s), multiple expressions, presence or absence of
predicates in the WHERE clause, and whether the correlated expressions are
on the null-padded side of an outer join. If it is judged to be more
efficient, then it transforms the NOT IN sublink into an anti-join.

some complications enter into the decision to transform NOT IN to anti-join
based on whether a bitmap plan will/not be used, or whether it will/not be
eligible for PQ.

-----
Jim Finnerty, AWS, Amazon Aurora PostgreSQL
--
Sent from: http://www.postgresql-archive.org/PostgreSQL-hackers-f1928748.html

#4Tom Lane
tgl@sss.pgh.pa.us
In reply to: Jim Finnerty (#3)
Re: NOT IN subquery optimization

Jim Finnerty <jfinnert@amazon.com> writes:

re: The idea that's been kicked around in the past is to detect whether the
subselect's output column(s) can be proved NOT NULL, and if so, convert
to an antijoin just like NOT EXISTS

basically, yes. this will handle nullability of both the outer and inner
correlated expression(s), multiple expressions, presence or absence of
predicates in the WHERE clause, and whether the correlated expressions are
on the null-padded side of an outer join. If it is judged to be more
efficient, then it transforms the NOT IN sublink into an anti-join.

Hmm, that seems overcomplicated ...

some complications enter into the decision to transform NOT IN to anti-join
based on whether a bitmap plan will/not be used, or whether it will/not be
eligible for PQ.

... and that even more so, considering that this decision really needs
to be taken long before cost estimates would be available.

As far as I can see, there should be no situation where we'd not want
to transform to antijoin if we can prove it's semantically valid to
do so. If there are cases where that comes out as a worse plan,
that indicates a costing error that would be something to address
separately (because it'd also be a problem for other antijoin cases).
Also, as long as it nearly always wins, I'm not going to cry too hard
if there are corner cases where it makes the wrong choice. That's not
something that's possible to avoid completely.

regards, tom lane

#5Jim Finnerty
jfinnert@amazon.com
In reply to: Tom Lane (#4)
Re: NOT IN subquery optimization

We can always correctly transform a NOT IN to a correlated NOT EXISTS. In
almost all cases it is more efficient to do so. In the one case that we've
found that is slower it does come down to a more general costing issue, so
that's probably the right way to think about it.

-----
Jim Finnerty, AWS, Amazon Aurora PostgreSQL
--
Sent from: http://www.postgresql-archive.org/PostgreSQL-hackers-f1928748.html

#6David Rowley
david.rowley@2ndquadrant.com
In reply to: Jim Finnerty (#5)
1 attachment(s)
Re: NOT IN subquery optimization

On Thu, 21 Feb 2019 at 16:27, Jim Finnerty <jfinnert@amazon.com> wrote:

We can always correctly transform a NOT IN to a correlated NOT EXISTS. In
almost all cases it is more efficient to do so. In the one case that we've
found that is slower it does come down to a more general costing issue, so
that's probably the right way to think about it.

I worked on this over 4 years ago [1]/messages/by-id/CAApHDvqRB-iFBy68=dCgqS46aRep7AuN2pou4KTwL8kX9YOcTQ@mail.gmail.com. I think the patch there is not
completely broken and seems just to need a few things fixed. I rebased
it on top of current master and looked at it. I think the main
remaining issue is fixing the code that ensures the outer side join
quals can't be NULL. The code that's there looks broken still since
it attempts to use quals from any inner joined rel for proofs that
NULLs will be removed. That might not work so well in a case like:
SELECT * FROM t1 LEFT JOIN t2 ON t1.a = t2.a AND t2.b NOT IN(select b
from t3), however, I'd need to think harder about that since if there
was such a qual then the planner should convert the left join into an
inner join. But anyway, the function expressions_are_not_nullable()
was more intended to work with targetlists to ensure exprs there can't
be NULL. I just had done a poor job of trying to modify that into
allowing it to take exprs from any random place, likely that should be
a new function and expressions_are_not_nullable() should be put back
to what Tom ended up with.

I've attached the rebased and still broken version.

[1]: /messages/by-id/CAApHDvqRB-iFBy68=dCgqS46aRep7AuN2pou4KTwL8kX9YOcTQ@mail.gmail.com

--
David Rowley http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

Attachments:

not_in_anti_join_v1.0.patchapplication/octet-stream; name=not_in_anti_join_v1.0.patchDownload
diff --git a/src/backend/optimizer/plan/subselect.c b/src/backend/optimizer/plan/subselect.c
index c721054787..0942cba00b 100644
--- a/src/backend/optimizer/plan/subselect.c
+++ b/src/backend/optimizer/plan/subselect.c
@@ -87,6 +87,8 @@ static bool contain_dml(Node *node);
 static bool contain_dml_walker(Node *node, void *context);
 static void inline_cte(PlannerInfo *root, CommonTableExpr *cte);
 static bool inline_cte_walker(Node *node, inline_cte_walker_context *context);
+static bool is_NOTANY_compatible_with_antijoin(Query *outerquery,
+											   SubLink * sublink);
 static bool simplify_EXISTS_query(PlannerInfo *root, Query *query);
 static Query *convert_EXISTS_to_ANY(PlannerInfo *root, Query *subselect,
 					  Node **testexpr, List **paramIds);
@@ -1108,6 +1110,118 @@ inline_cte_walker(Node *node, inline_cte_walker_context *context)
 }
 
 
+/*
+ * is_NOTANY_compatible_with_antijoin
+ *		True if the NOT IN sublink can be safely converted into an ANTI JOIN.
+ *		Per SQL spec, NOT IN is not ordinarily equivalent to an anti-join,
+ *		however, if we can prove that all of the expressions on both sides of
+ *		the would be join condition are all certainly not NULL, then it's safe
+ *		to convert NOT IN to an anti-join.
+ *
+ * Note:
+ *	This function is quite locked into the NOT IN syntax. Certain assumptions
+ *	are made about the structure of the join conditions:
+ *
+ * 1.	We assume that when more than 1 join condition exists that these are
+ *		AND type conditions, i.e not OR conditions.
+ *
+ * 2.	We assume that each join qual has 2 arguments and the first arguments
+ *		in each join qual is the one that belongs to the outer side of the
+ *		NOT IN clause.
+ */
+static bool
+is_NOTANY_compatible_with_antijoin(Query *outerquery, SubLink *sublink)
+{
+	ListCell *lc;
+	List	 *outerexpr;
+	List	 *innerexpr;
+	Node	 *testexpr = sublink->testexpr;
+	Query	 *subselect = (Query *) sublink->subselect;
+
+	/*
+	 * We must build 2 lists of expressions, one for the outer query side and
+	 * one for the subquery/inner side of the NOT IN clause. It is these lists
+	 * that we pass to to expressions_are_not_nullable to allow it to determine
+	 * if each expression cannot contain NULL values or not. We'll try and be
+	 * as lazy about this as possible and we won't bother generating the 2nd
+	 * list if the first list can't be proved to be free from null-able
+	 * expressions. The order that we checks these lists does not really matter
+	 * as neither one seems to be more likely to allow us to exit from this
+	 * function more quickly.
+	 */
+
+	/* if it's a single expression */
+	if (IsA(testexpr, OpExpr))
+	{
+		OpExpr *opexpr = (OpExpr *) testexpr;
+
+		Assert(list_length(opexpr->args) == 2);
+
+		outerexpr = list_make1(linitial(opexpr->args));
+	}
+
+	/* multiple expressions ANDed together */
+	else if (IsA(testexpr, BoolExpr))
+	{
+		List	 *list = ((BoolExpr *) testexpr)->args;
+
+		outerexpr = NIL;
+
+		 /* loop through each expression appending to the list each iteration. */
+		foreach(lc, list)
+		{
+			OpExpr *opexpr = (OpExpr *) lfirst(lc);
+
+			Assert(list_length(opexpr->args) == 2);
+
+			outerexpr = lappend(outerexpr, linitial(opexpr->args));
+		}
+	}
+	else
+		elog(ERROR, "unrecognized node type: %d",
+					(int) nodeTag(testexpr));
+
+	if (outerexpr == NIL)
+		elog(ERROR, "unexpected empty clause");
+
+	/*
+	 * Check the expressions being used in the outer query side of the NOT IN
+	 * clause to ensure NULLs are not possible.
+	 */
+	if (!expressions_are_not_nullable(outerquery, outerexpr))
+		return false;
+
+	/*
+	 * Now we check to ensure each TargetEntry in the subquery can be proved to
+	 * never be NULL.
+	 */
+
+	innerexpr = NIL;
+
+	foreach(lc, subselect->targetList)
+	{
+		TargetEntry *tle = (TargetEntry *) lfirst(lc);
+
+		/* Resjunk columns can be ignored: they don't produce output values */
+		if (tle->resjunk)
+			continue;
+
+		innerexpr = lappend(innerexpr, tle->expr);
+	}
+
+	if (innerexpr == NIL)
+		elog(ERROR, "unexpected empty clause");
+
+	/*
+	 * Check if all the subquery's targetlist columns can be proved to be not
+	 * null.
+	 */
+	if (!expressions_are_not_nullable(subselect, innerexpr))
+		return false;
+
+	return true; /* supports ANTI JOIN */
+}
+
 /*
  * convert_ANY_sublink_to_join: try to convert an ANY SubLink to a join
  *
@@ -1117,11 +1231,14 @@ inline_cte_walker(Node *node, inline_cte_walker_context *context)
  * If so, form a JoinExpr and return it.  Return NULL if the SubLink cannot
  * be converted to a join.
  *
- * The only non-obvious input parameter is available_rels: this is the set
- * of query rels that can safely be referenced in the sublink expression.
- * (We must restrict this to avoid changing the semantics when a sublink
- * is present in an outer join's ON qual.)  The conversion must fail if
- * the converted qual would reference any but these parent-query relids.
+ * If under_not is true, the caller actually found NOT (ANY SubLink),
+ * so that what we must try to build is an ANTI not SEMI join.
+ *
+ * available_rels is the set of query rels that can safely be referenced
+ * in the sublink expression.  (We must restrict this to avoid changing the
+ * semantics when a sublink is present in an outer join's ON qual.)
+ * The conversion must fail if the converted qual would reference any but
+ * these parent-query relids.
  *
  * On success, the returned JoinExpr has larg = NULL and rarg = the jointree
  * item representing the pulled-up subquery.  The caller must set larg to
@@ -1144,7 +1261,7 @@ inline_cte_walker(Node *node, inline_cte_walker_context *context)
  */
 JoinExpr *
 convert_ANY_sublink_to_join(PlannerInfo *root, SubLink *sublink,
-							Relids available_rels)
+							bool under_not, Relids available_rels)
 {
 	JoinExpr   *result;
 	Query	   *parse = root->parse;
@@ -1159,6 +1276,16 @@ convert_ANY_sublink_to_join(PlannerInfo *root, SubLink *sublink,
 
 	Assert(sublink->subLinkType == ANY_SUBLINK);
 
+	/*
+	 * Per SQL spec, NOT IN is not ordinarily equivalent to an anti-join, so
+	 * that by default we have to fail when under_not.  However, if we can
+	 * prove that all of the expressions on both sides of the, would be, join
+	 * condition are all certainly not NULL, then it's safe to convert NOT IN
+	 * to an anti-join.
+	 */
+	if (under_not && !is_NOTANY_compatible_with_antijoin(parse, sublink))
+		return NULL;
+
 	/*
 	 * The sub-select must not refer to any Vars of the parent query. (Vars of
 	 * higher levels should be okay, though.)
@@ -1228,7 +1355,7 @@ convert_ANY_sublink_to_join(PlannerInfo *root, SubLink *sublink,
 	 * And finally, build the JoinExpr node.
 	 */
 	result = makeNode(JoinExpr);
-	result->jointype = JOIN_SEMI;
+	result->jointype = under_not ? JOIN_ANTI : JOIN_SEMI;
 	result->isNatural = false;
 	result->larg = NULL;		/* caller must fill this in */
 	result->rarg = (Node *) rtr;
@@ -1243,9 +1370,7 @@ convert_ANY_sublink_to_join(PlannerInfo *root, SubLink *sublink,
 /*
  * convert_EXISTS_sublink_to_join: try to convert an EXISTS SubLink to a join
  *
- * The API of this function is identical to convert_ANY_sublink_to_join's,
- * except that we also support the case where the caller has found NOT EXISTS,
- * so we need an additional input parameter "under_not".
+ * The API of this function is identical to convert_ANY_sublink_to_join's.
  */
 JoinExpr *
 convert_EXISTS_sublink_to_join(PlannerInfo *root, SubLink *sublink,
diff --git a/src/backend/optimizer/prep/prepjointree.c b/src/backend/optimizer/prep/prepjointree.c
index aebe162713..bab2b37fa7 100644
--- a/src/backend/optimizer/prep/prepjointree.c
+++ b/src/backend/optimizer/prep/prepjointree.c
@@ -388,7 +388,7 @@ pull_up_sublinks_qual_recurse(PlannerInfo *root, Node *node,
 		/* Is it a convertible ANY or EXISTS clause? */
 		if (sublink->subLinkType == ANY_SUBLINK)
 		{
-			if ((j = convert_ANY_sublink_to_join(root, sublink,
+			if ((j = convert_ANY_sublink_to_join(root, sublink, false,
 												 available_rels1)) != NULL)
 			{
 				/* Yes; insert the new join node into the join tree */
@@ -414,7 +414,7 @@ pull_up_sublinks_qual_recurse(PlannerInfo *root, Node *node,
 				return NULL;
 			}
 			if (available_rels2 != NULL &&
-				(j = convert_ANY_sublink_to_join(root, sublink,
+				(j = convert_ANY_sublink_to_join(root, sublink, false,
 												 available_rels2)) != NULL)
 			{
 				/* Yes; insert the new join node into the join tree */
@@ -499,14 +499,68 @@ pull_up_sublinks_qual_recurse(PlannerInfo *root, Node *node,
 	}
 	if (is_notclause(node))
 	{
-		/* If the immediate argument of NOT is EXISTS, try to convert */
+		/* If the immediate argument of NOT is ANY or EXISTS, try to convert */
 		SubLink    *sublink = (SubLink *) get_notclausearg((Expr *) node);
 		JoinExpr   *j;
 		Relids		child_rels;
 
 		if (sublink && IsA(sublink, SubLink))
 		{
-			if (sublink->subLinkType == EXISTS_SUBLINK)
+			if (sublink->subLinkType == ANY_SUBLINK)
+			{
+				if ((j = convert_ANY_sublink_to_join(root, sublink, true,
+												   available_rels1)) != NULL)
+				{
+					/* Yes; insert the new join node into the join tree */
+					j->larg = *jtlink1;
+					*jtlink1 = (Node *) j;
+					/* Recursively process pulled-up jointree nodes */
+					j->rarg = pull_up_sublinks_jointree_recurse(root,
+																j->rarg,
+																&child_rels);
+
+					/*
+					 * Now recursively process the pulled-up quals.  Because
+					 * we are underneath a NOT, we can't pull up sublinks that
+					 * reference the left-hand stuff, but it's still okay to
+					 * pull up sublinks referencing j->rarg.
+					 */
+					j->quals = pull_up_sublinks_qual_recurse(root,
+															 j->quals,
+															 &j->rarg,
+															 child_rels,
+															 NULL, NULL);
+					/* Return NULL representing constant TRUE */
+					return NULL;
+				}
+				if (available_rels2 != NULL &&
+					(j = convert_ANY_sublink_to_join(root, sublink, true,
+												   available_rels2)) != NULL)
+				{
+					/* Yes; insert the new join node into the join tree */
+					j->larg = *jtlink2;
+					*jtlink2 = (Node *) j;
+					/* Recursively process pulled-up jointree nodes */
+					j->rarg = pull_up_sublinks_jointree_recurse(root,
+																j->rarg,
+																&child_rels);
+
+					/*
+					 * Now recursively process the pulled-up quals.  Because
+					 * we are underneath a NOT, we can't pull up sublinks that
+					 * reference the left-hand stuff, but it's still okay to
+					 * pull up sublinks referencing j->rarg.
+					 */
+					j->quals = pull_up_sublinks_qual_recurse(root,
+															 j->quals,
+															 &j->rarg,
+															 child_rels,
+															 NULL, NULL);
+					/* Return NULL representing constant TRUE */
+					return NULL;
+				}
+			}
+			else if (sublink->subLinkType == EXISTS_SUBLINK)
 			{
 				if ((j = convert_EXISTS_sublink_to_join(root, sublink, true,
 														available_rels1)) != NULL)
diff --git a/src/backend/optimizer/util/clauses.c b/src/backend/optimizer/util/clauses.c
index 3737166e8f..239ff93038 100644
--- a/src/backend/optimizer/util/clauses.c
+++ b/src/backend/optimizer/util/clauses.c
@@ -42,6 +42,7 @@
 #include "parser/parse_agg.h"
 #include "parser/parse_coerce.h"
 #include "parser/parse_func.h"
+#include "parser/parsetree.h"
 #include "rewrite/rewriteManip.h"
 #include "tcop/tcopprot.h"
 #include "utils/acl.h"
@@ -113,6 +114,8 @@ static bool contain_context_dependent_node_walker(Node *node, int *flags);
 static bool contain_leaked_vars_walker(Node *node, void *context);
 static Relids find_nonnullable_rels_walker(Node *node, bool top_level);
 static List *find_nonnullable_vars_walker(Node *node, bool top_level);
+static void find_innerjoined_rels(Node *jtnode,
+					  Relids *innerjoined_rels, List **usable_quals);
 static bool is_strict_saop(ScalarArrayOpExpr *expr, bool falseOK);
 static Node *eval_const_expressions_mutator(Node *node,
 							   eval_const_expressions_context *context);
@@ -1458,6 +1461,10 @@ contain_leaked_vars_walker(Node *node, void *context)
 								  context);
 }
 
+/*****************************************************************************
+ *		  Nullability analysis
+ *****************************************************************************/
+
 /*
  * find_nonnullable_rels
  *		Determine which base rels are forced nonnullable by given clause.
@@ -1701,7 +1708,7 @@ find_nonnullable_rels_walker(Node *node, bool top_level)
  * but here we assume that the input is a Boolean expression, and wish to
  * see if NULL inputs will provably cause a FALSE-or-NULL result.  We expect
  * the expression to have been AND/OR flattened and converted to implicit-AND
- * format.
+ * format (but the results are still good if it wasn't AND/OR flattened).
  *
  * The result is a palloc'd List, but we have not copied the member Var nodes.
  * Also, we don't bother trying to eliminate duplicate entries.
@@ -2002,6 +2009,234 @@ find_forced_null_var(Node *node)
 	return NULL;
 }
 
+/*
+ * expressions_are_not_nullable
+ *		Returns TRUE if each Expr in the expression list is certainly not NULL.
+ *		All Exprs must return non-NULL to answer TRUE.
+ *
+ * The reason this takes a Query, and not just an individual tlist expression,
+ * is so that we can make use of the query's WHERE/ON clauses to prove it does
+ * not return nulls.
+ *
+ * In current usage, the passed sub-Query hasn't yet been through any planner
+ * processing.  This means that applying find_nonnullable_vars() to its WHERE
+ * clauses isn't really ideal: for lack of const-simplification, we might be
+ * unable to prove not-nullness in some cases where we could have proved it
+ * afterwards.  However, we should not get any false positive results.
+ *
+ * Like the other forms of nullability analysis above, we can err on the
+ * side of conservatism: if we're not sure, it's okay to return FALSE.
+ */
+bool
+expressions_are_not_nullable(Query *query, List *exprlist)
+{
+	Relids		innerjoined_rels = NULL;
+	bool		computed_innerjoined_rels = false;
+	List	   *usable_quals = NIL;
+	List	   *nonnullable_vars = NIL;
+	bool		computed_nonnullable_vars = false;
+	ListCell   *lc;
+
+	/*
+	 * If the query contains set operations, punt.  The set ops themselves
+	 * couldn't introduce nulls that weren't in their inputs, but the tlist
+	 * present in the top-level query is just dummy and won't give us useful
+	 * info.  We could get an answer by recursing to examine each leaf query,
+	 * but for the moment it doesn't seem worth the extra complication.
+	 *
+	 * Note that we needn't consider other top-level operators such as
+	 * DISTINCT, GROUP BY, etc, as those will not introduce nulls either.
+	 */
+	if (query->setOperations)
+		return false;
+
+	/*
+	 * Examine each Expr to prove that it can't produce NULL.
+	 */
+	foreach(lc, exprlist)
+	{
+		Expr *expr = (Expr *) lfirst(lc);
+
+		/*
+		 * For the most part we don't try to deal with anything more complex
+		 * than Consts and Vars; but it seems worthwhile to look through
+		 * binary relabelings, since we know those don't introduce nulls.
+		 */
+		while (expr && IsA(expr, RelabelType))
+			expr = ((RelabelType *) expr)->arg;
+
+		if (expr == NULL)		/* paranoia */
+			return false;
+
+		if (IsA(expr, Const))
+		{
+			/* Consts are easy: they're either null or not. */
+			if (((Const *) expr)->constisnull)
+				return false;
+		}
+		else if (IsA(expr, Var))
+		{
+			Var		   *var = (Var *) expr;
+
+			/* Currently, we punt for any nonlocal Vars */
+			if (var->varlevelsup != 0)
+				return false;
+
+			/*
+			 * Since the subquery hasn't yet been through expression
+			 * preprocessing, we must apply flatten_join_alias_vars to the
+			 * given Var, and to any Vars found by find_nonnullable_vars, to
+			 * avoid being fooled by join aliases.  If we get something other
+			 * than a plain Var out of the substitution, punt.
+			 */
+			var = (Var *) flatten_join_alias_vars(query, (Node *) var);
+
+			if (!IsA(var, Var))
+				return false;
+			Assert(var->varlevelsup == 0);
+
+			/*
+			 * We don't bother to compute innerjoined_rels and usable_quals
+			 * until we've found a Var we must analyze.
+			 */
+			if (!computed_innerjoined_rels)
+			{
+				find_innerjoined_rels((Node *) query->jointree,
+									  &innerjoined_rels, &usable_quals);
+				computed_innerjoined_rels = true;
+			}
+
+			/*
+			 * If Var is from a plain relation, and that relation is not on
+			 * the nullable side of any outer join, and its column is marked
+			 * NOT NULL according to the catalogs, it can't produce NULL.
+			 */
+			if (bms_is_member(var->varno, innerjoined_rels))
+			{
+				RangeTblEntry *rte = rt_fetch(var->varno, query->rtable);
+
+				if (rte->rtekind == RTE_RELATION &&
+					get_attnotnull(rte->relid, var->varattno))
+					continue;	/* cannot produce NULL */
+			}
+
+			/*
+			 * Even if that didn't work, we can conclude that the Var is not
+			 * nullable if find_nonnullable_vars can find a "var IS NOT NULL"
+			 * or similarly strict condition among the usable_quals.  Compute
+			 * the list of Vars having such quals if we didn't already.
+			 */
+			if (!computed_nonnullable_vars)
+			{
+				nonnullable_vars = find_nonnullable_vars((Node *) usable_quals);
+				nonnullable_vars = (List *)
+					flatten_join_alias_vars(query,
+											(Node *) nonnullable_vars);
+				/* We don't bother removing any non-Vars from the result */
+				computed_nonnullable_vars = true;
+			}
+
+			if (!list_member(nonnullable_vars, var))
+				return false;	/* we failed to prove the Var non-null */
+		}
+		else
+		{
+			/* Not a Const or Var; punt */
+			return false;
+		}
+	}
+
+	return true;				/* query cannot emit NULLs */
+}
+
+/*
+ * find_innerjoined_rels
+ *		Traverse jointree to locate non-outerjoined-rels and quals above them
+ *
+ * We fill innerjoined_rels with the relids of all rels that are not below
+ * the nullable side of any outer join (which would cause their Vars to be
+ * possibly NULL regardless of what's in the catalogs).  In the same scan,
+ * we locate all WHERE and JOIN/ON quals that constrain these rels, and add
+ * them to the usable_quals list (forming a list with implicit-AND semantics).
+ *
+ * Top-level caller must initialize innerjoined_rels/usable_quals to NULL/NIL.
+ */
+static void
+find_innerjoined_rels(Node *jtnode,
+					  Relids *innerjoined_rels, List **usable_quals)
+{
+	if (jtnode == NULL)
+		return;
+	if (IsA(jtnode, RangeTblRef))
+	{
+		int			varno = ((RangeTblRef *) jtnode)->rtindex;
+
+		*innerjoined_rels = bms_add_member(*innerjoined_rels, varno);
+	}
+	else if (IsA(jtnode, FromExpr))
+	{
+		FromExpr   *f = (FromExpr *) jtnode;
+		ListCell   *lc;
+
+		/* All elements of the FROM list are allowable */
+		foreach(lc, f->fromlist)
+			find_innerjoined_rels((Node *) lfirst(lc),
+								  innerjoined_rels, usable_quals);
+		/* ... and its WHERE quals are too */
+		if (f->quals)
+			*usable_quals = lappend(*usable_quals, f->quals);
+	}
+	else if (IsA(jtnode, JoinExpr))
+	{
+		JoinExpr   *j = (JoinExpr *) jtnode;
+
+		switch (j->jointype)
+		{
+			case JOIN_INNER:
+				/* visit both children */
+				find_innerjoined_rels(j->larg,
+									  innerjoined_rels, usable_quals);
+				find_innerjoined_rels(j->rarg,
+									  innerjoined_rels, usable_quals);
+				/* and grab the ON quals too */
+				if (j->quals)
+					*usable_quals = lappend(*usable_quals, j->quals);
+				break;
+
+			case JOIN_LEFT:
+			case JOIN_SEMI:
+			case JOIN_ANTI:
+
+				/*
+				 * Only the left input is possibly non-nullable; furthermore,
+				 * the quals of this join don't constrain the left input.
+				 * Note: we probably can't see SEMI or ANTI joins at this
+				 * point, but if we do, we can treat them like LEFT joins.
+				 */
+				find_innerjoined_rels(j->larg,
+									  innerjoined_rels, usable_quals);
+				break;
+
+			case JOIN_RIGHT:
+				/* Reverse of the above case */
+				find_innerjoined_rels(j->rarg,
+									  innerjoined_rels, usable_quals);
+				break;
+
+			case JOIN_FULL:
+				/* Neither side is non-nullable, so stop descending */
+				break;
+
+			default:
+				elog(ERROR, "unrecognized join type: %d",
+					 (int) j->jointype);
+		}
+	}
+	else
+		elog(ERROR, "unrecognized node type: %d",
+			 (int) nodeTag(jtnode));
+}
+
 /*
  * Can we treat a ScalarArrayOpExpr as strict?
  *
diff --git a/src/backend/utils/cache/lsyscache.c b/src/backend/utils/cache/lsyscache.c
index e88c45d268..1201fdeb5a 100644
--- a/src/backend/utils/cache/lsyscache.c
+++ b/src/backend/utils/cache/lsyscache.c
@@ -878,6 +878,34 @@ get_atttypetypmodcoll(Oid relid, AttrNumber attnum,
 	ReleaseSysCache(tp);
 }
 
+/*
+ * get_attnotnull
+ *
+ *		Given the relation id and the attribute number,
+ *		return the "attnotnull" field from the attribute relation.
+ */
+bool
+get_attnotnull(Oid relid, AttrNumber attnum)
+{
+	HeapTuple	tp;
+
+	/* XXX worth special casing system columns? */
+	tp = SearchSysCache2(ATTNUM,
+						 ObjectIdGetDatum(relid),
+						 Int16GetDatum(attnum));
+	if (HeapTupleIsValid(tp))
+	{
+		Form_pg_attribute att_tup = (Form_pg_attribute) GETSTRUCT(tp);
+		bool		result;
+
+		result = att_tup->attnotnull;
+		ReleaseSysCache(tp);
+		return result;
+	}
+	else
+		return false;
+}
+
 /*				---------- COLLATION CACHE ----------					 */
 
 /*
diff --git a/src/include/optimizer/clauses.h b/src/include/optimizer/clauses.h
index 5e10fb1d50..dcfed985b5 100644
--- a/src/include/optimizer/clauses.h
+++ b/src/include/optimizer/clauses.h
@@ -44,6 +44,7 @@ extern Relids find_nonnullable_rels(Node *clause);
 extern List *find_nonnullable_vars(Node *clause);
 extern List *find_forced_null_vars(Node *clause);
 extern Var *find_forced_null_var(Node *clause);
+extern bool expressions_are_not_nullable(Query *query, List *exprlist);
 
 extern bool is_pseudo_constant_clause(Node *clause);
 extern bool is_pseudo_constant_clause_relids(Node *clause, Relids relids);
diff --git a/src/include/optimizer/subselect.h b/src/include/optimizer/subselect.h
index 2d2c3bcbc0..efc5bf75cc 100644
--- a/src/include/optimizer/subselect.h
+++ b/src/include/optimizer/subselect.h
@@ -19,6 +19,7 @@
 extern void SS_process_ctes(PlannerInfo *root);
 extern JoinExpr *convert_ANY_sublink_to_join(PlannerInfo *root,
 							SubLink *sublink,
+							bool under_not,
 							Relids available_rels);
 extern JoinExpr *convert_EXISTS_sublink_to_join(PlannerInfo *root,
 							   SubLink *sublink,
diff --git a/src/include/utils/lsyscache.h b/src/include/utils/lsyscache.h
index 16b0b1d2dc..71370d8877 100644
--- a/src/include/utils/lsyscache.h
+++ b/src/include/utils/lsyscache.h
@@ -89,6 +89,7 @@ extern AttrNumber get_attnum(Oid relid, const char *attname);
 extern Oid	get_atttype(Oid relid, AttrNumber attnum);
 extern void get_atttypetypmodcoll(Oid relid, AttrNumber attnum,
 					  Oid *typid, int32 *typmod, Oid *collid);
+extern bool get_attnotnull(Oid relid, AttrNumber attnum);
 extern char *get_collation_name(Oid colloid);
 extern char *get_constraint_name(Oid conoid);
 extern char *get_language_name(Oid langoid, bool missing_ok);
diff --git a/src/test/regress/expected/subselect.out b/src/test/regress/expected/subselect.out
index cc3f5f3737..f2d5d6679f 100644
--- a/src/test/regress/expected/subselect.out
+++ b/src/test/regress/expected/subselect.out
@@ -1310,3 +1310,367 @@ select * from x for update;
    Output: subselect_tbl.f1, subselect_tbl.f2, subselect_tbl.f3
 (2 rows)
 
+--
+-- Check NOT IN performs an ANTI JOIN when NULLs are not possible
+-- on either side of the, would be, join condition.
+--
+BEGIN;
+CREATE TEMP TABLE a (id INT PRIMARY KEY);
+CREATE TEMP TABLE b (x INT NOT NULL, y INT);
+CREATE TEMP TABLE c (z INT NOT NULL);
+INSERT INTO b VALUES(1,1),(2,2),(3,NULL);
+-- No ANTI JOIN, b.x is from an outer join
+EXPLAIN (COSTS OFF)
+SELECT * FROM a
+LEFT OUTER JOIN b ON a.id = b.x
+WHERE b.x NOT IN(SELECT z FROM c);
+             QUERY PLAN             
+------------------------------------
+ Hash Right Join
+   Hash Cond: (b.x = a.id)
+   Filter: (NOT (hashed SubPlan 1))
+   ->  Seq Scan on b
+   ->  Hash
+         ->  Seq Scan on a
+   SubPlan 1
+     ->  Seq Scan on c
+(8 rows)
+
+-- ANTI JOIN, b.x is from an outer join but b.x > 100
+-- forces the join not to produce NULL on the righthand
+-- side.
+EXPLAIN (COSTS OFF)
+SELECT * FROM a
+LEFT OUTER JOIN b ON a.id = b.x
+WHERE b.x NOT IN(SELECT z FROM c)
+  AND b.x > 100;
+           QUERY PLAN            
+---------------------------------
+ Hash Join
+   Hash Cond: (b.x = a.id)
+   ->  Hash Anti Join
+         Hash Cond: (b.x = c.z)
+         ->  Seq Scan on b
+               Filter: (x > 100)
+         ->  Hash
+               ->  Seq Scan on c
+   ->  Hash
+         ->  Seq Scan on a
+(10 rows)
+
+-- No ANTI JOIN, b.x is from an outer join
+EXPLAIN (COSTS OFF)
+SELECT * FROM a
+FULL OUTER JOIN b ON a.id = b.x
+WHERE b.x NOT IN(SELECT y FROM c);
+         QUERY PLAN          
+-----------------------------
+ Hash Full Join
+   Hash Cond: (b.x = a.id)
+   Filter: (NOT (SubPlan 1))
+   ->  Seq Scan on b
+   ->  Hash
+         ->  Seq Scan on a
+   SubPlan 1
+     ->  Seq Scan on c
+(8 rows)
+
+-- No ANTI JOIN. y can have NULLs
+EXPLAIN (COSTS OFF)
+SELECT * FROM b WHERE y NOT IN (SELECT z FROM c);
+             QUERY PLAN             
+------------------------------------
+ Seq Scan on b
+   Filter: (NOT (hashed SubPlan 1))
+   SubPlan 1
+     ->  Seq Scan on c
+(4 rows)
+
+-- c is an empty relation so should cause no filtering on b
+SELECT * FROM b WHERE y NOT IN (SELECT z FROM c);
+ x | y 
+---+---
+ 1 | 1
+ 2 | 2
+ 3 |  
+(3 rows)
+
+INSERT INTO c VALUES(1);
+-- Records where y is NULL should be filtered out.
+SELECT * FROM b WHERE y NOT IN (SELECT z FROM c);
+ x | y 
+---+---
+ 2 | 2
+(1 row)
+
+-- No ANTI JOIN, y can be NULL
+EXPLAIN (COSTS OFF)
+SELECT * FROM a WHERE id NOT IN (SELECT y FROM b);
+             QUERY PLAN             
+------------------------------------
+ Seq Scan on a
+   Filter: (NOT (hashed SubPlan 1))
+   SubPlan 1
+     ->  Seq Scan on b
+(4 rows)
+
+-- No ANTI JOIN, x is NOT NULL, but we don't know if + 1 will change that.
+EXPLAIN (COSTS OFF)
+SELECT * FROM a WHERE id NOT IN (SELECT x+1 FROM b);
+             QUERY PLAN             
+------------------------------------
+ Seq Scan on a
+   Filter: (NOT (hashed SubPlan 1))
+   SubPlan 1
+     ->  Seq Scan on b
+(4 rows)
+
+-- ANTI JOIN 1 is a Const that is not null.
+EXPLAIN (COSTS OFF)
+SELECT * FROM a WHERE id NOT IN (SELECT 1 FROM b);
+        QUERY PLAN         
+---------------------------
+ Nested Loop Anti Join
+   Join Filter: (a.id = 1)
+   ->  Seq Scan on a
+   ->  Materialize
+         ->  Seq Scan on b
+(5 rows)
+
+-- No ANTI JOIN, results contain a NULL Const
+EXPLAIN (COSTS OFF)
+SELECT * FROM a WHERE id NOT IN (SELECT NULL::int FROM b);
+             QUERY PLAN             
+------------------------------------
+ Seq Scan on a
+   Filter: (NOT (hashed SubPlan 1))
+   SubPlan 1
+     ->  Seq Scan on b
+(4 rows)
+
+-- ANTI JOIN y = 1 means y can't be NULL
+EXPLAIN (COSTS OFF)
+SELECT * FROM a WHERE id NOT IN (SELECT y FROM b WHERE y = 1);
+          QUERY PLAN           
+-------------------------------
+ Hash Anti Join
+   Hash Cond: (a.id = b.y)
+   ->  Seq Scan on a
+   ->  Hash
+         ->  Seq Scan on b
+               Filter: (y = 1)
+(6 rows)
+
+-- No ANTI JOIN, OR condition does not ensure y = 1
+EXPLAIN (COSTS OFF)
+SELECT * FROM a WHERE id NOT IN (SELECT y FROM b WHERE y = 1 OR x = 1);
+               QUERY PLAN               
+----------------------------------------
+ Seq Scan on a
+   Filter: (NOT (hashed SubPlan 1))
+   SubPlan 1
+     ->  Seq Scan on b
+           Filter: ((y = 1) OR (x = 1))
+(5 rows)
+
+-- No ANTI JOIN, OR condition does not ensure y = 1 or y = 2
+EXPLAIN (COSTS OFF)
+SELECT * FROM a WHERE id NOT IN (SELECT y FROM b WHERE (y = 1 OR x = 1) AND (y = 2 OR x = 2));
+                            QUERY PLAN                             
+-------------------------------------------------------------------
+ Seq Scan on a
+   Filter: (NOT (hashed SubPlan 1))
+   SubPlan 1
+     ->  Seq Scan on b
+           Filter: (((y = 1) OR (x = 1)) AND ((y = 2) OR (x = 2)))
+(5 rows)
+
+-- ANTI JOIN y must be 2, so can't be NULL
+EXPLAIN (COSTS OFF)
+SELECT * FROM a WHERE id NOT IN (SELECT y FROM b WHERE (y = 1 OR x = 1) AND y = 2);
+                        QUERY PLAN                        
+----------------------------------------------------------
+ Hash Anti Join
+   Hash Cond: (a.id = b.y)
+   ->  Seq Scan on a
+   ->  Hash
+         ->  Seq Scan on b
+               Filter: ((y = 2) AND ((y = 1) OR (x = 1)))
+(6 rows)
+
+-- ANTI JOIN y can be 1 or 2, but can't be null.
+EXPLAIN (COSTS OFF)
+SELECT * FROM a WHERE id NOT IN (SELECT y FROM b WHERE (y = 1 OR y = 2));
+                 QUERY PLAN                 
+--------------------------------------------
+ Hash Anti Join
+   Hash Cond: (a.id = b.y)
+   ->  Seq Scan on a
+   ->  Hash
+         ->  Seq Scan on b
+               Filter: ((y = 1) OR (y = 2))
+(6 rows)
+
+-- No ANTI JOIN c.z is from a left outer join so it can have nulls.
+EXPLAIN (COSTS OFF)
+SELECT * FROM a WHERE id NOT IN (SELECT c.z FROM b LEFT JOIN c ON b.x = c.z);
+             QUERY PLAN             
+------------------------------------
+ Seq Scan on a
+   Filter: (NOT (hashed SubPlan 1))
+   SubPlan 1
+     ->  Merge Left Join
+           Merge Cond: (b.x = c.z)
+           ->  Sort
+                 Sort Key: b.x
+                 ->  Seq Scan on b
+           ->  Sort
+                 Sort Key: c.z
+                 ->  Seq Scan on c
+(11 rows)
+
+-- ANTI JOIN, c.z is not from an outer join
+EXPLAIN (COSTS OFF)
+SELECT * FROM a WHERE id NOT IN (SELECT c.z FROM b RIGHT JOIN c ON b.x = c.z);
+               QUERY PLAN                
+-----------------------------------------
+ Merge Anti Join
+   Merge Cond: (a.id = c.z)
+   ->  Index Only Scan using a_pkey on a
+   ->  Merge Left Join
+         Merge Cond: (c.z = b.x)
+         ->  Sort
+               Sort Key: c.z
+               ->  Seq Scan on c
+         ->  Sort
+               Sort Key: b.x
+               ->  Seq Scan on b
+(11 rows)
+
+-- No ANTI JOIN, b.x is from an outer join
+EXPLAIN (COSTS OFF)
+SELECT * FROM a WHERE id NOT IN (SELECT b.x FROM b RIGHT JOIN c ON b.x = c.z);
+             QUERY PLAN             
+------------------------------------
+ Seq Scan on a
+   Filter: (NOT (hashed SubPlan 1))
+   SubPlan 1
+     ->  Merge Right Join
+           Merge Cond: (b.x = c.z)
+           ->  Sort
+                 Sort Key: b.x
+                 ->  Seq Scan on b
+           ->  Sort
+                 Sort Key: c.z
+                 ->  Seq Scan on c
+(11 rows)
+
+-- No ANTI JOIN, c.z is from an outer join
+EXPLAIN (COSTS OFF)
+SELECT * FROM a WHERE id NOT IN (SELECT c.z FROM b FULL JOIN c ON b.x = c.z);
+             QUERY PLAN             
+------------------------------------
+ Seq Scan on a
+   Filter: (NOT (hashed SubPlan 1))
+   SubPlan 1
+     ->  Merge Full Join
+           Merge Cond: (b.x = c.z)
+           ->  Sort
+                 Sort Key: b.x
+                 ->  Seq Scan on b
+           ->  Sort
+                 Sort Key: c.z
+                 ->  Seq Scan on c
+(11 rows)
+
+-- No ANTI JOIN, b.x is from an outer join
+EXPLAIN (COSTS OFF)
+SELECT * FROM a WHERE id NOT IN (SELECT b.x FROM b FULL JOIN c ON b.x = c.z);
+             QUERY PLAN             
+------------------------------------
+ Seq Scan on a
+   Filter: (NOT (hashed SubPlan 1))
+   SubPlan 1
+     ->  Merge Full Join
+           Merge Cond: (b.x = c.z)
+           ->  Sort
+                 Sort Key: b.x
+                 ->  Seq Scan on b
+           ->  Sort
+                 Sort Key: c.z
+                 ->  Seq Scan on c
+(11 rows)
+
+-- ANTI JOIN, c.z is from an inner join and has a NOT NULL constraint.
+EXPLAIN (COSTS OFF)
+SELECT * FROM a WHERE id NOT IN (SELECT c.z FROM b INNER JOIN c ON b.x = c.z);
+               QUERY PLAN                
+-----------------------------------------
+ Merge Anti Join
+   Merge Cond: (a.id = c.z)
+   ->  Index Only Scan using a_pkey on a
+   ->  Merge Join
+         Merge Cond: (b.x = c.z)
+         ->  Sort
+               Sort Key: b.x
+               ->  Seq Scan on b
+         ->  Sort
+               Sort Key: c.z
+               ->  Seq Scan on c
+(11 rows)
+
+-- ANTI JOIN, c.z must be 1
+EXPLAIN (COSTS OFF)
+SELECT * FROM a WHERE id NOT IN (SELECT c.z FROM b LEFT JOIN c ON b.x = c.z WHERE c.z = 1);
+                QUERY PLAN                 
+-------------------------------------------
+ Hash Anti Join
+   Hash Cond: (a.id = c.z)
+   ->  Seq Scan on a
+   ->  Hash
+         ->  Nested Loop
+               ->  Seq Scan on c
+                     Filter: (z = 1)
+               ->  Materialize
+                     ->  Seq Scan on b
+                           Filter: (x = 1)
+(10 rows)
+
+-- ANTI JOIN, c.z can't be NULL
+EXPLAIN (COSTS OFF)
+SELECT * FROM a WHERE id NOT IN (SELECT c.z FROM b LEFT JOIN c ON b.x = c.z WHERE c.z IS NOT NULL);
+                 QUERY PLAN                  
+---------------------------------------------
+ Merge Anti Join
+   Merge Cond: (a.id = c.z)
+   ->  Index Only Scan using a_pkey on a
+   ->  Merge Join
+         Merge Cond: (b.x = c.z)
+         ->  Sort
+               Sort Key: b.x
+               ->  Seq Scan on b
+         ->  Sort
+               Sort Key: c.z
+               ->  Seq Scan on c
+                     Filter: (z IS NOT NULL)
+(12 rows)
+
+-- ANTI JOIN, b.y cannot be NULL due to the join condition b.y = c.z
+EXPLAIN (COSTS OFF)
+SELECT * FROM a WHERE id NOT IN (SELECT b.y FROM b INNER JOIN c ON b.y = c.z);
+               QUERY PLAN                
+-----------------------------------------
+ Merge Anti Join
+   Merge Cond: (a.id = b.y)
+   ->  Index Only Scan using a_pkey on a
+   ->  Merge Join
+         Merge Cond: (b.y = c.z)
+         ->  Sort
+               Sort Key: b.y
+               ->  Seq Scan on b
+         ->  Sort
+               Sort Key: c.z
+               ->  Seq Scan on c
+(11 rows)
+
+ROLLBACK;
diff --git a/src/test/regress/sql/subselect.sql b/src/test/regress/sql/subselect.sql
index 8bca1f5d55..d6fff1f2ae 100644
--- a/src/test/regress/sql/subselect.sql
+++ b/src/test/regress/sql/subselect.sql
@@ -683,3 +683,123 @@ select * from (with x as (select 2 as y) select * from x) ss;
 explain (verbose, costs off)
 with x as (select * from subselect_tbl)
 select * from x for update;
+
+--
+-- Check NOT IN performs an ANTI JOIN when NULLs are not possible
+-- on either side of the, would be, join condition.
+--
+
+BEGIN;
+
+CREATE TEMP TABLE a (id INT PRIMARY KEY);
+CREATE TEMP TABLE b (x INT NOT NULL, y INT);
+CREATE TEMP TABLE c (z INT NOT NULL);
+
+INSERT INTO b VALUES(1,1),(2,2),(3,NULL);
+
+-- No ANTI JOIN, b.x is from an outer join
+EXPLAIN (COSTS OFF)
+SELECT * FROM a
+LEFT OUTER JOIN b ON a.id = b.x
+WHERE b.x NOT IN(SELECT z FROM c);
+
+-- ANTI JOIN, b.x is from an outer join but b.x > 100
+-- forces the join not to produce NULL on the righthand
+-- side.
+EXPLAIN (COSTS OFF)
+SELECT * FROM a
+LEFT OUTER JOIN b ON a.id = b.x
+WHERE b.x NOT IN(SELECT z FROM c)
+  AND b.x > 100;
+
+-- No ANTI JOIN, b.x is from an outer join
+EXPLAIN (COSTS OFF)
+SELECT * FROM a
+FULL OUTER JOIN b ON a.id = b.x
+WHERE b.x NOT IN(SELECT y FROM c);
+
+-- No ANTI JOIN. y can have NULLs
+EXPLAIN (COSTS OFF)
+SELECT * FROM b WHERE y NOT IN (SELECT z FROM c);
+
+-- c is an empty relation so should cause no filtering on b
+SELECT * FROM b WHERE y NOT IN (SELECT z FROM c);
+
+INSERT INTO c VALUES(1);
+
+-- Records where y is NULL should be filtered out.
+SELECT * FROM b WHERE y NOT IN (SELECT z FROM c);
+
+-- No ANTI JOIN, y can be NULL
+EXPLAIN (COSTS OFF)
+SELECT * FROM a WHERE id NOT IN (SELECT y FROM b);
+
+-- No ANTI JOIN, x is NOT NULL, but we don't know if + 1 will change that.
+EXPLAIN (COSTS OFF)
+SELECT * FROM a WHERE id NOT IN (SELECT x+1 FROM b);
+
+-- ANTI JOIN 1 is a Const that is not null.
+EXPLAIN (COSTS OFF)
+SELECT * FROM a WHERE id NOT IN (SELECT 1 FROM b);
+
+-- No ANTI JOIN, results contain a NULL Const
+EXPLAIN (COSTS OFF)
+SELECT * FROM a WHERE id NOT IN (SELECT NULL::int FROM b);
+
+-- ANTI JOIN y = 1 means y can't be NULL
+EXPLAIN (COSTS OFF)
+SELECT * FROM a WHERE id NOT IN (SELECT y FROM b WHERE y = 1);
+
+-- No ANTI JOIN, OR condition does not ensure y = 1
+EXPLAIN (COSTS OFF)
+SELECT * FROM a WHERE id NOT IN (SELECT y FROM b WHERE y = 1 OR x = 1);
+
+-- No ANTI JOIN, OR condition does not ensure y = 1 or y = 2
+EXPLAIN (COSTS OFF)
+SELECT * FROM a WHERE id NOT IN (SELECT y FROM b WHERE (y = 1 OR x = 1) AND (y = 2 OR x = 2));
+
+-- ANTI JOIN y must be 2, so can't be NULL
+EXPLAIN (COSTS OFF)
+SELECT * FROM a WHERE id NOT IN (SELECT y FROM b WHERE (y = 1 OR x = 1) AND y = 2);
+
+-- ANTI JOIN y can be 1 or 2, but can't be null.
+EXPLAIN (COSTS OFF)
+SELECT * FROM a WHERE id NOT IN (SELECT y FROM b WHERE (y = 1 OR y = 2));
+
+-- No ANTI JOIN c.z is from a left outer join so it can have nulls.
+EXPLAIN (COSTS OFF)
+SELECT * FROM a WHERE id NOT IN (SELECT c.z FROM b LEFT JOIN c ON b.x = c.z);
+
+-- ANTI JOIN, c.z is not from an outer join
+EXPLAIN (COSTS OFF)
+SELECT * FROM a WHERE id NOT IN (SELECT c.z FROM b RIGHT JOIN c ON b.x = c.z);
+
+-- No ANTI JOIN, b.x is from an outer join
+EXPLAIN (COSTS OFF)
+SELECT * FROM a WHERE id NOT IN (SELECT b.x FROM b RIGHT JOIN c ON b.x = c.z);
+
+-- No ANTI JOIN, c.z is from an outer join
+EXPLAIN (COSTS OFF)
+SELECT * FROM a WHERE id NOT IN (SELECT c.z FROM b FULL JOIN c ON b.x = c.z);
+
+-- No ANTI JOIN, b.x is from an outer join
+EXPLAIN (COSTS OFF)
+SELECT * FROM a WHERE id NOT IN (SELECT b.x FROM b FULL JOIN c ON b.x = c.z);
+
+-- ANTI JOIN, c.z is from an inner join and has a NOT NULL constraint.
+EXPLAIN (COSTS OFF)
+SELECT * FROM a WHERE id NOT IN (SELECT c.z FROM b INNER JOIN c ON b.x = c.z);
+
+-- ANTI JOIN, c.z must be 1
+EXPLAIN (COSTS OFF)
+SELECT * FROM a WHERE id NOT IN (SELECT c.z FROM b LEFT JOIN c ON b.x = c.z WHERE c.z = 1);
+
+-- ANTI JOIN, c.z can't be NULL
+EXPLAIN (COSTS OFF)
+SELECT * FROM a WHERE id NOT IN (SELECT c.z FROM b LEFT JOIN c ON b.x = c.z WHERE c.z IS NOT NULL);
+
+-- ANTI JOIN, b.y cannot be NULL due to the join condition b.y = c.z
+EXPLAIN (COSTS OFF)
+SELECT * FROM a WHERE id NOT IN (SELECT b.y FROM b INNER JOIN c ON b.y = c.z);
+
+ROLLBACK;
#7David Rowley
david.rowley@2ndquadrant.com
In reply to: David Rowley (#6)
1 attachment(s)
Re: NOT IN subquery optimization

On Fri, 22 Feb 2019 at 09:44, David Rowley <david.rowley@2ndquadrant.com> wrote:

I've attached the rebased and still broken version.

I set about trying to make a less broken version of this.

A quick reminder of the semantics of NOT IN:

1. WHERE <nullable_column> NOT IN(SELECT <not null column> FROM table);

If table is non-empty:
will filter out rows where <nullable_column> is NULL
and only show values that are not in <not null column>

If table is empty:
Filters nothing.

2. WHERE <nonnullable_column> NOT IN(SELECT <null column> FROM table);

If table contains NULLs in the <null column> no records will match.

The previous patch handled #2 correctly but neglected to do anything
about #1. For #1 the only way we can implement this as a planner only
change is to insist that the outer side expressions also are not null.
If we could have somehow tested if "table" was non-empty then we could
have added a IS NOT NULL clause to the outer query and converted to an
anti-join, but ... can't know that during planning and can't add the
IS NOT NULL regardless as, if "table" is empty we will filter NULLs
when we shouldn't.

In the attached, I set about fixing #1 by determining if the outer
expressions could be NULL by checking

1. If expression is a Var from an inner joined relation it can't be
NULL if there's a NOT NULL constraint on the column; or
2. If expression is a Var from an inner joined relation and there is a
strict WHERE/ON clause, the expression can't be NULL; or
3. If expression is a Var from an outer joined relation check for
quals that were specified in the same syntactical level as the NOT IN
for proofs that NULL will be filtered.

An example of #3 is:

SELECT * FROM t1 LEFT JOIN t2 on t1.a = t2.a WHERE t2.a IS NOT NULL
AND t2.a NOT IN(SELECT a FROM t3); -- t2 becomes INNER JOINed later in
planning, but...
or;
SELECT * FROM t1 LEFT JOIN t2 on t1.a = t2.a AND t2.a NOT IN(SELECT a FROM t3);

In the latter of the two, the t1.a = t2.a join conditions ensures that
NULLs can't exist where the NOT IN is evaluated.

I implemented #3 by passing the quals down to
pull_up_sublinks_qual_recurse(). At the top level call 'node' and
'notnull_proofs' are the same, but that changes in recursive calls
like the one we make inside the is_andclause() condition.

Comments welcome.

--
David Rowley http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

Attachments:

not_in_anti_join_v1.1.patchapplication/octet-stream; name=not_in_anti_join_v1.1.patchDownload
diff --git a/src/backend/optimizer/plan/subselect.c b/src/backend/optimizer/plan/subselect.c
index c721054787..deab0fb4fd 100644
--- a/src/backend/optimizer/plan/subselect.c
+++ b/src/backend/optimizer/plan/subselect.c
@@ -87,6 +87,9 @@ static bool contain_dml(Node *node);
 static bool contain_dml_walker(Node *node, void *context);
 static void inline_cte(PlannerInfo *root, CommonTableExpr *cte);
 static bool inline_cte_walker(Node *node, inline_cte_walker_context *context);
+static bool is_NOTANY_compatible_with_antijoin(Query *outerquery,
+											   SubLink * sublink,
+											   Node *notnull_proofs);
 static bool simplify_EXISTS_query(PlannerInfo *root, Query *query);
 static Query *convert_EXISTS_to_ANY(PlannerInfo *root, Query *subselect,
 					  Node **testexpr, List **paramIds);
@@ -1108,6 +1111,84 @@ inline_cte_walker(Node *node, inline_cte_walker_context *context)
 }
 
 
+/*
+ * is_NOTANY_compatible_with_antijoin
+ *		True if the NOT IN sublink can be safely converted into an ANTI JOIN.
+ *		Per SQL spec, NOT IN is not ordinarily equivalent to an anti-join,
+ *		however, if we can prove that all of the expressions on both sides of
+ *		the, would be, join condition are all certainly not NULL, then it's
+ *		safe to convert NOT IN to an anti-join.
+ *
+ * 'outerquery' is the parse of the query that the NOT IN is present in.
+ * 'notnull_proofs' are quals from the same syntactical level as the NOT IN.
+ * These will be used to assist in proving the outer query's would be join
+ * expressions cannot be NULL.
+ *
+ * Note:
+ *	This function is quite locked into the NOT IN syntax. Certain assumptions
+ *	are made about the structure of the join conditions:
+ *
+ * 1.	We assume that when more than 1 join condition exists that these are
+ *		AND type conditions, i.e not OR conditions.
+ *
+ * 2.	We assume that each join qual has 2 arguments and the first arguments
+ *		in each join qual is the one that belongs to the outer side of the
+ *		NOT IN clause.
+ */
+static bool
+is_NOTANY_compatible_with_antijoin(Query *outerquery, SubLink *sublink,
+								   Node *notnull_proofs)
+{
+	List	 *outerexpr;
+	Node	 *testexpr = sublink->testexpr;
+
+	/* Extract the outer rel's join condition expressions. */
+
+	/* if it's a single expression */
+	if (IsA(testexpr, OpExpr))
+	{
+		OpExpr *opexpr = (OpExpr *) testexpr;
+
+		Assert(list_length(opexpr->args) == 2);
+
+		outerexpr = list_make1(linitial(opexpr->args));
+	}
+
+	/* Extract exprs from multiple expressions ANDed together */
+	else if (IsA(testexpr, BoolExpr))
+	{
+		List	 *list = ((BoolExpr *) testexpr)->args;
+		ListCell *lc;
+
+		outerexpr = NIL;
+
+		/* Build a list containing the lefthand expr from each OpExpr */
+		foreach(lc, list)
+		{
+			OpExpr *opexpr = (OpExpr *) lfirst(lc);
+
+			Assert(list_length(opexpr->args) == 2);
+
+			outerexpr = lappend(outerexpr, linitial(opexpr->args));
+		}
+	}
+	else
+		elog(ERROR, "unrecognized node type: %d",
+					(int) nodeTag(testexpr));
+
+	Assert(outerexpr != NIL);
+
+	/* Check if any outer expressions can be NULL. */
+	if (!expressions_are_not_nullable(outerquery, outerexpr, notnull_proofs))
+		return false;
+
+	/* Now validate the subquery targetlist to ensure no NULL are possible. */
+	if (!query_outputs_are_not_nullable((Query *) sublink->subselect))
+		return false;
+
+	return true; /* supports ANTI JOIN */
+}
+
 /*
  * convert_ANY_sublink_to_join: try to convert an ANY SubLink to a join
  *
@@ -1117,11 +1198,22 @@ inline_cte_walker(Node *node, inline_cte_walker_context *context)
  * If so, form a JoinExpr and return it.  Return NULL if the SubLink cannot
  * be converted to a join.
  *
- * The only non-obvious input parameter is available_rels: this is the set
- * of query rels that can safely be referenced in the sublink expression.
- * (We must restrict this to avoid changing the semantics when a sublink
- * is present in an outer join's ON qual.)  The conversion must fail if
- * the converted qual would reference any but these parent-query relids.
+ * If under_not is true, the caller actually found NOT (ANY SubLink),
+ * so that what we must try to build is an ANTI not SEMI join.  Per SQL spec,
+ * NOT IN is not ordinarily equivalent to an anti-join, so that by default we
+ * have to fail when under_not.  However, if we can prove that all of the
+ * expressions on both sides of the, would be, join condition are all
+ * certainly not NULL, then it's safe to convert NOT IN to an anti-join.
+ * To assist with nullability proofs for the join conditions on the outside
+ * of the join, we make use of 'notnull_proofs'. It is the caller's
+ * responsibility to ensure these proofs come from the same syntactical level
+ * as the NOT IN sublink.
+ *
+ * available_rels is the set of query rels that can safely be referenced
+ * in the sublink expression.  (We must restrict this to avoid changing the
+ * semantics when a sublink is present in an outer join's ON qual.)
+ * The conversion must fail if the converted qual would reference any but
+ * these parent-query relids.
  *
  * On success, the returned JoinExpr has larg = NULL and rarg = the jointree
  * item representing the pulled-up subquery.  The caller must set larg to
@@ -1144,7 +1236,8 @@ inline_cte_walker(Node *node, inline_cte_walker_context *context)
  */
 JoinExpr *
 convert_ANY_sublink_to_join(PlannerInfo *root, SubLink *sublink,
-							Relids available_rels)
+							bool under_not, Relids available_rels,
+							Node *notnull_proofs)
 {
 	JoinExpr   *result;
 	Query	   *parse = root->parse;
@@ -1159,6 +1252,14 @@ convert_ANY_sublink_to_join(PlannerInfo *root, SubLink *sublink,
 
 	Assert(sublink->subLinkType == ANY_SUBLINK);
 
+	/*
+	 * For NOT INs, only convert to an anti-join when join conditions exprs
+	 * certainly are not null.
+	 */
+	if (under_not &&
+		!is_NOTANY_compatible_with_antijoin(parse, sublink, notnull_proofs))
+		return NULL;
+
 	/*
 	 * The sub-select must not refer to any Vars of the parent query. (Vars of
 	 * higher levels should be okay, though.)
@@ -1228,7 +1329,7 @@ convert_ANY_sublink_to_join(PlannerInfo *root, SubLink *sublink,
 	 * And finally, build the JoinExpr node.
 	 */
 	result = makeNode(JoinExpr);
-	result->jointype = JOIN_SEMI;
+	result->jointype = under_not ? JOIN_ANTI : JOIN_SEMI;
 	result->isNatural = false;
 	result->larg = NULL;		/* caller must fill this in */
 	result->rarg = (Node *) rtr;
@@ -1243,9 +1344,7 @@ convert_ANY_sublink_to_join(PlannerInfo *root, SubLink *sublink,
 /*
  * convert_EXISTS_sublink_to_join: try to convert an EXISTS SubLink to a join
  *
- * The API of this function is identical to convert_ANY_sublink_to_join's,
- * except that we also support the case where the caller has found NOT EXISTS,
- * so we need an additional input parameter "under_not".
+ * The API of this function is identical to convert_ANY_sublink_to_join's.
  */
 JoinExpr *
 convert_EXISTS_sublink_to_join(PlannerInfo *root, SubLink *sublink,
diff --git a/src/backend/optimizer/prep/prepjointree.c b/src/backend/optimizer/prep/prepjointree.c
index aebe162713..27a7fafcf4 100644
--- a/src/backend/optimizer/prep/prepjointree.c
+++ b/src/backend/optimizer/prep/prepjointree.c
@@ -64,7 +64,8 @@ static Node *pull_up_sublinks_jointree_recurse(PlannerInfo *root, Node *jtnode,
 								  Relids *relids);
 static Node *pull_up_sublinks_qual_recurse(PlannerInfo *root, Node *node,
 							  Node **jtlink1, Relids available_rels1,
-							  Node **jtlink2, Relids available_rels2);
+							  Node **jtlink2, Relids available_rels2,
+							  Node *notnull_proofs);
 static Node *pull_up_subqueries_recurse(PlannerInfo *root, Node *jtnode,
 						   JoinExpr *lowest_outer_join,
 						   JoinExpr *lowest_nulling_outer_join,
@@ -261,7 +262,7 @@ pull_up_sublinks_jointree_recurse(PlannerInfo *root, Node *jtnode,
 		/* Now process qual --- all children are available for use */
 		newf->quals = pull_up_sublinks_qual_recurse(root, f->quals,
 													&jtlink, frelids,
-													NULL, NULL);
+													NULL, NULL, f->quals);
 
 		/*
 		 * Note that the result will be either newf, or a stack of JoinExprs
@@ -315,13 +316,13 @@ pull_up_sublinks_jointree_recurse(PlannerInfo *root, Node *jtnode,
 														 &jtlink,
 														 bms_union(leftrelids,
 																   rightrelids),
-														 NULL, NULL);
+														 NULL, NULL, j->quals);
 				break;
 			case JOIN_LEFT:
 				j->quals = pull_up_sublinks_qual_recurse(root, j->quals,
 														 &j->rarg,
 														 rightrelids,
-														 NULL, NULL);
+														 NULL, NULL, j->quals);
 				break;
 			case JOIN_FULL:
 				/* can't do anything with full-join quals */
@@ -330,7 +331,7 @@ pull_up_sublinks_jointree_recurse(PlannerInfo *root, Node *jtnode,
 				j->quals = pull_up_sublinks_qual_recurse(root, j->quals,
 														 &j->larg,
 														 leftrelids,
-														 NULL, NULL);
+														 NULL, NULL, j->quals);
 				break;
 			default:
 				elog(ERROR, "unrecognized join type: %d",
@@ -370,12 +371,17 @@ pull_up_sublinks_jointree_recurse(PlannerInfo *root, Node *jtnode,
  * and/or jtlink2 in the order we encounter them.  We rely on subsequent
  * optimization to rearrange the stack if appropriate.
  *
+ * notnull_proofs, if not passed as NULL, is used to help prove that the
+ * would be outer rel's join condition exprs cannot be NULL when we attempt
+ * to convert NOT IN to use an antijoin.
+ *
  * Returns the replacement qual node, or NULL if the qual should be removed.
  */
 static Node *
 pull_up_sublinks_qual_recurse(PlannerInfo *root, Node *node,
 							  Node **jtlink1, Relids available_rels1,
-							  Node **jtlink2, Relids available_rels2)
+							  Node **jtlink2, Relids available_rels2,
+							  Node *notnull_proofs)
 {
 	if (node == NULL)
 		return NULL;
@@ -388,8 +394,8 @@ pull_up_sublinks_qual_recurse(PlannerInfo *root, Node *node,
 		/* Is it a convertible ANY or EXISTS clause? */
 		if (sublink->subLinkType == ANY_SUBLINK)
 		{
-			if ((j = convert_ANY_sublink_to_join(root, sublink,
-												 available_rels1)) != NULL)
+			if ((j = convert_ANY_sublink_to_join(root, sublink, false,
+												 available_rels1, NULL)) != NULL)
 			{
 				/* Yes; insert the new join node into the join tree */
 				j->larg = *jtlink1;
@@ -409,13 +415,14 @@ pull_up_sublinks_qual_recurse(PlannerInfo *root, Node *node,
 														 &j->larg,
 														 available_rels1,
 														 &j->rarg,
-														 child_rels);
+														 child_rels,
+														 notnull_proofs);
 				/* Return NULL representing constant TRUE */
 				return NULL;
 			}
 			if (available_rels2 != NULL &&
-				(j = convert_ANY_sublink_to_join(root, sublink,
-												 available_rels2)) != NULL)
+				(j = convert_ANY_sublink_to_join(root, sublink, false,
+												 available_rels2, NULL)) != NULL)
 			{
 				/* Yes; insert the new join node into the join tree */
 				j->larg = *jtlink2;
@@ -435,7 +442,8 @@ pull_up_sublinks_qual_recurse(PlannerInfo *root, Node *node,
 														 &j->larg,
 														 available_rels2,
 														 &j->rarg,
-														 child_rels);
+														 child_rels,
+														 notnull_proofs);
 				/* Return NULL representing constant TRUE */
 				return NULL;
 			}
@@ -463,7 +471,8 @@ pull_up_sublinks_qual_recurse(PlannerInfo *root, Node *node,
 														 &j->larg,
 														 available_rels1,
 														 &j->rarg,
-														 child_rels);
+														 child_rels,
+														 notnull_proofs);
 				/* Return NULL representing constant TRUE */
 				return NULL;
 			}
@@ -489,7 +498,8 @@ pull_up_sublinks_qual_recurse(PlannerInfo *root, Node *node,
 														 &j->larg,
 														 available_rels2,
 														 &j->rarg,
-														 child_rels);
+														 child_rels,
+														 notnull_proofs);
 				/* Return NULL representing constant TRUE */
 				return NULL;
 			}
@@ -499,14 +509,70 @@ pull_up_sublinks_qual_recurse(PlannerInfo *root, Node *node,
 	}
 	if (is_notclause(node))
 	{
-		/* If the immediate argument of NOT is EXISTS, try to convert */
+		/* If the immediate argument of NOT is ANY or EXISTS, try to convert */
 		SubLink    *sublink = (SubLink *) get_notclausearg((Expr *) node);
 		JoinExpr   *j;
 		Relids		child_rels;
 
 		if (sublink && IsA(sublink, SubLink))
 		{
-			if (sublink->subLinkType == EXISTS_SUBLINK)
+			if (sublink->subLinkType == ANY_SUBLINK)
+			{
+				if ((j = convert_ANY_sublink_to_join(root, sublink, true,
+												   available_rels1, notnull_proofs)) != NULL)
+				{
+					/* Yes; insert the new join node into the join tree */
+					j->larg = *jtlink1;
+					*jtlink1 = (Node *) j;
+					/* Recursively process pulled-up jointree nodes */
+					j->rarg = pull_up_sublinks_jointree_recurse(root,
+																j->rarg,
+																&child_rels);
+
+					/*
+					 * Now recursively process the pulled-up quals.  Because
+					 * we are underneath a NOT, we can't pull up sublinks that
+					 * reference the left-hand stuff, but it's still okay to
+					 * pull up sublinks referencing j->rarg.
+					 */
+					j->quals = pull_up_sublinks_qual_recurse(root,
+															 j->quals,
+															 &j->rarg,
+															 child_rels,
+															 NULL, NULL,
+															 notnull_proofs);
+					/* Return NULL representing constant TRUE */
+					return NULL;
+				}
+				if (available_rels2 != NULL &&
+					(j = convert_ANY_sublink_to_join(root, sublink, true,
+												   available_rels2, notnull_proofs)) != NULL)
+				{
+					/* Yes; insert the new join node into the join tree */
+					j->larg = *jtlink2;
+					*jtlink2 = (Node *) j;
+					/* Recursively process pulled-up jointree nodes */
+					j->rarg = pull_up_sublinks_jointree_recurse(root,
+																j->rarg,
+																&child_rels);
+
+					/*
+					 * Now recursively process the pulled-up quals.  Because
+					 * we are underneath a NOT, we can't pull up sublinks that
+					 * reference the left-hand stuff, but it's still okay to
+					 * pull up sublinks referencing j->rarg.
+					 */
+					j->quals = pull_up_sublinks_qual_recurse(root,
+															 j->quals,
+															 &j->rarg,
+															 child_rels,
+															 NULL, NULL,
+															 notnull_proofs);
+					/* Return NULL representing constant TRUE */
+					return NULL;
+				}
+			}
+			else if (sublink->subLinkType == EXISTS_SUBLINK)
 			{
 				if ((j = convert_EXISTS_sublink_to_join(root, sublink, true,
 														available_rels1)) != NULL)
@@ -529,7 +595,8 @@ pull_up_sublinks_qual_recurse(PlannerInfo *root, Node *node,
 															 j->quals,
 															 &j->rarg,
 															 child_rels,
-															 NULL, NULL);
+															 NULL, NULL,
+															 notnull_proofs);
 					/* Return NULL representing constant TRUE */
 					return NULL;
 				}
@@ -555,7 +622,8 @@ pull_up_sublinks_qual_recurse(PlannerInfo *root, Node *node,
 															 j->quals,
 															 &j->rarg,
 															 child_rels,
-															 NULL, NULL);
+															 NULL, NULL,
+															 notnull_proofs);
 					/* Return NULL representing constant TRUE */
 					return NULL;
 				}
@@ -580,7 +648,8 @@ pull_up_sublinks_qual_recurse(PlannerInfo *root, Node *node,
 													  jtlink1,
 													  available_rels1,
 													  jtlink2,
-													  available_rels2);
+													  available_rels2,
+													  notnull_proofs);
 			if (newclause)
 				newclauses = lappend(newclauses, newclause);
 		}
diff --git a/src/backend/optimizer/util/clauses.c b/src/backend/optimizer/util/clauses.c
index 501b0e9e2d..28044b8588 100644
--- a/src/backend/optimizer/util/clauses.c
+++ b/src/backend/optimizer/util/clauses.c
@@ -42,6 +42,7 @@
 #include "parser/parse_agg.h"
 #include "parser/parse_coerce.h"
 #include "parser/parse_func.h"
+#include "parser/parsetree.h"
 #include "rewrite/rewriteManip.h"
 #include "tcop/tcopprot.h"
 #include "utils/acl.h"
@@ -113,6 +114,8 @@ static bool contain_context_dependent_node_walker(Node *node, int *flags);
 static bool contain_leaked_vars_walker(Node *node, void *context);
 static Relids find_nonnullable_rels_walker(Node *node, bool top_level);
 static List *find_nonnullable_vars_walker(Node *node, bool top_level);
+static void find_innerjoined_rels(Node *jtnode,
+					  Relids *innerjoined_rels, List **usable_quals);
 static bool is_strict_saop(ScalarArrayOpExpr *expr, bool falseOK);
 static Node *eval_const_expressions_mutator(Node *node,
 							   eval_const_expressions_context *context);
@@ -1467,6 +1470,10 @@ contain_leaked_vars_walker(Node *node, void *context)
 								  context);
 }
 
+/*****************************************************************************
+ *		  Nullability analysis
+ *****************************************************************************/
+
 /*
  * find_nonnullable_rels
  *		Determine which base rels are forced nonnullable by given clause.
@@ -1710,7 +1717,7 @@ find_nonnullable_rels_walker(Node *node, bool top_level)
  * but here we assume that the input is a Boolean expression, and wish to
  * see if NULL inputs will provably cause a FALSE-or-NULL result.  We expect
  * the expression to have been AND/OR flattened and converted to implicit-AND
- * format.
+ * format (but the results are still good if it wasn't AND/OR flattened).
  *
  * The result is a palloc'd List, but we have not copied the member Var nodes.
  * Also, we don't bother trying to eliminate duplicate entries.
@@ -2011,6 +2018,393 @@ find_forced_null_var(Node *node)
 	return NULL;
 }
 
+/*
+ * expressions_are_not_nullable
+ *		Return TRUE if all 'exprs' are certainly not NULL.
+ *
+ * The reason this takes a Query, and not just a list of expressions, is so
+ * that we can determine which relations are INNER JOINed and make use of the
+ * query's WHERE/ON clauses to help prove that inner joined rel's Vars cannot
+ * be NULL, in the absense of a NOT NULL constraint.
+ *
+ * 'query' is the query that the 'exprs' belong to.
+ * 'notnull_proofs' can be passed to assist in proving the non-nullability of
+ * 'exprs'.  These may be used for outer joined Vars or for Vars that allow
+ * NULLs to help determine if each 'exprs' cannot be NULL.  It is the callers
+ * responsibility to ensure 'notnull_proofs' come from the same syntactical
+ * level as 'exprs'.
+ *
+ * In current usage, the passed exprs haven't yet been through any planner
+ * processing.  This means that applying find_nonnullable_vars() on them isn't
+ * really ideal: for lack of const-simplification, we might be unable to prove
+ * not-nullness in some cases where we could have proved it afterwards.
+ * However, we should not get any false positive results.
+ *
+ * Here we can err on the side of conservatism: if we're not sure then it's
+ * okay to return FALSE.
+ */
+bool
+expressions_are_not_nullable(Query *query, List *exprs, Node *notnull_proofs)
+{
+	Relids		innerjoined_rels = NULL;
+	List	   *innerjoined_useful_quals = NIL;
+	bool		computed_innerjoined_rels = false;
+	List	   *nonnullable_vars = NIL;
+	bool		computed_nonnullable_vars = false;
+	List	   *nonnullable_inner_vars = NIL;
+	bool		computed_nonnullable_inner_vars = false;
+	ListCell   *lc;
+
+	foreach(lc, exprs)
+	{
+		Expr	   *expr = (Expr *) lfirst(lc);
+
+		/*
+		 * For the most part we don't try to deal with anything more complex
+		 * than Consts and Vars; but it seems worthwhile to look through
+		 * binary relabelings, since we know those don't introduce nulls.
+		 */
+		while (expr && IsA(expr, RelabelType))
+			expr = ((RelabelType *) expr)->arg;
+
+		if (expr == NULL)		/* paranoia */
+			return false;
+
+		if (IsA(expr, Const))
+		{
+			/* Consts are easy: they're either null or not. */
+			if (((Const *) expr)->constisnull)
+				return false;
+		}
+		else if (IsA(expr, Var))
+		{
+			Var		   *var = (Var *) expr;
+
+			/* Currently, we punt for any nonlocal Vars */
+			if (var->varlevelsup != 0)
+				return false;
+
+			/*
+			 * Since the subquery hasn't yet been through expression
+			 * preprocessing, we must apply flatten_join_alias_vars to the
+			 * given Var, and to any Vars found by find_nonnullable_vars, to
+			 * avoid being fooled by join aliases.  If we get something other
+			 * than a plain Var out of the substitution, punt.
+			 */
+			var = (Var *) flatten_join_alias_vars(query, (Node *) var);
+
+			if (!IsA(var, Var))
+				return false;
+			Assert(var->varlevelsup == 0);
+
+			/*
+			 * We don't bother to compute innerjoined_rels until we've found a
+			 * Var we must analyze.
+			 */
+			if (!computed_innerjoined_rels)
+			{
+				find_innerjoined_rels((Node *) query->jointree,
+									  &innerjoined_rels,
+									  &innerjoined_useful_quals);
+				computed_innerjoined_rels = true;
+			}
+
+			/* Check if the Var is from an INNER JOINed rel */
+			if (bms_is_member(var->varno, innerjoined_rels))
+			{
+				RangeTblEntry *rte = rt_fetch(var->varno, query->rtable);
+
+				/*
+				 * If Var is from a plain relation and its column is marked
+				 * NOT NULL according to the catalogs, it can't produce NULL.
+				 */
+				if (rte->rtekind == RTE_RELATION &&
+					get_attnotnull(rte->relid, var->varattno))
+					continue;	/* cannot produce NULL */
+
+				/*
+				 * Otherwise check for the existance of quals which filter
+				 * out NULL values for this Var.  We may need to compute the
+				 * nonnullable_inner_vars, if not done already.
+				 */
+				if (!computed_nonnullable_inner_vars)
+				{
+					nonnullable_inner_vars =
+						find_nonnullable_vars((Node *) innerjoined_useful_quals);
+					nonnullable_inner_vars = (List *)
+						flatten_join_alias_vars(query,
+											(Node *) nonnullable_inner_vars);
+					/* We don't bother removing any non-Vars from the result */
+					computed_nonnullable_inner_vars = true;
+				}
+
+				if (list_member(nonnullable_inner_vars, var))
+					continue;	/* cannot produce NULL */
+			}
+
+			/*
+			 * Even if that didn't work, we can conclude that the Var is not
+			 * nullable if find_nonnullable_vars can find a "var IS NOT NULL"
+			 * or similarly strict condition among the notnull_proofs.
+			 * Compute the list of Vars having such quals if we didn't
+			 * already.
+			 */
+			if (!computed_nonnullable_vars)
+			{
+				nonnullable_vars = find_nonnullable_vars(notnull_proofs);
+				nonnullable_vars = (List *)
+					flatten_join_alias_vars(query,
+											(Node *) nonnullable_vars);
+				/* We don't bother removing any non-Vars from the result */
+				computed_nonnullable_vars = true;
+			}
+
+			if (!list_member(nonnullable_vars, var))
+				return false;	/* we failed to prove the Var non-null */
+		}
+		else
+		{
+			/* Not a Const or Var; punt */
+			return false;
+		}
+	}
+
+	return true;				/* exprs cannot emit NULLs */
+}
+
+/*
+ * query_outputs_are_not_nullable
+ *		Returns TRUE if the output values of the Query are certainly not NULL.
+ *		All output columns must return non-NULL to answer TRUE.
+ *
+ * The reason this takes a Query, and not just an individual tlist expression,
+ * is so that we can make use of the query's WHERE/ON clauses to prove it does
+ * not return nulls.
+ *
+ * In current usage, the passed sub-Query hasn't yet been through any planner
+ * processing.  This means that applying find_nonnullable_vars() to its WHERE
+ * clauses isn't really ideal: for lack of const-simplification, we might be
+ * unable to prove not-nullness in some cases where we could have proved it
+ * afterwards.  However, we should not get any false positive results.
+ *
+ * Like the other forms of nullability analysis above, we can err on the
+ * side of conservatism: if we're not sure, it's okay to return FALSE.
+ */
+bool
+query_outputs_are_not_nullable(Query *query)
+{
+	Relids		innerjoined_rels = NULL;
+	bool		computed_innerjoined_rels = false;
+	List	   *usable_quals = NIL;
+	List	   *nonnullable_vars = NIL;
+	bool		computed_nonnullable_vars = false;
+	ListCell   *tl;
+
+	/*
+	 * If the query contains set operations, punt.  The set ops themselves
+	 * couldn't introduce nulls that weren't in their inputs, but the tlist
+	 * present in the top-level query is just dummy and won't give us useful
+	 * info.  We could get an answer by recursing to examine each leaf query,
+	 * but for the moment it doesn't seem worth the extra complication.
+	 *
+	 * Note that we needn't consider other top-level operators such as
+	 * DISTINCT, GROUP BY, etc, as those will not introduce nulls either.
+	 */
+	if (query->setOperations)
+		return false;
+
+	/*
+	 * Examine each targetlist entry to prove that it can't produce NULL.
+	 */
+	foreach(tl, query->targetList)
+	{
+		TargetEntry *tle = (TargetEntry *) lfirst(tl);
+		Expr	   *expr = tle->expr;
+
+		/* Resjunk columns can be ignored: they don't produce output values */
+		if (tle->resjunk)
+			continue;
+
+		/*
+		 * For the most part we don't try to deal with anything more complex
+		 * than Consts and Vars; but it seems worthwhile to look through
+		 * binary relabelings, since we know those don't introduce nulls.
+		 */
+		while (expr && IsA(expr, RelabelType))
+			expr = ((RelabelType *) expr)->arg;
+
+		if (expr == NULL)		/* paranoia */
+			return false;
+
+		if (IsA(expr, Const))
+		{
+			/* Consts are easy: they're either null or not. */
+			if (((Const *) expr)->constisnull)
+				return false;
+		}
+		else if (IsA(expr, Var))
+		{
+			Var		   *var = (Var *) expr;
+
+			/* Currently, we punt for any nonlocal Vars */
+			if (var->varlevelsup != 0)
+				return false;
+
+			/*
+			 * Since the subquery hasn't yet been through expression
+			 * preprocessing, we must apply flatten_join_alias_vars to the
+			 * given Var, and to any Vars found by find_nonnullable_vars, to
+			 * avoid being fooled by join aliases.  If we get something other
+			 * than a plain Var out of the substitution, punt.
+			 */
+			var = (Var *) flatten_join_alias_vars(query, (Node *) var);
+
+			if (!IsA(var, Var))
+				return false;
+			Assert(var->varlevelsup == 0);
+
+			/*
+			 * We don't bother to compute innerjoined_rels and usable_quals
+			 * until we've found a Var we must analyze.
+			 */
+			if (!computed_innerjoined_rels)
+			{
+				find_innerjoined_rels((Node *) query->jointree,
+									  &innerjoined_rels, &usable_quals);
+				computed_innerjoined_rels = true;
+			}
+
+			/*
+			 * If Var is from a plain relation, and that relation is not on
+			 * the nullable side of any outer join, and its column is marked
+			 * NOT NULL according to the catalogs, it can't produce NULL.
+			 */
+			if (bms_is_member(var->varno, innerjoined_rels))
+			{
+				RangeTblEntry *rte = rt_fetch(var->varno, query->rtable);
+
+				if (rte->rtekind == RTE_RELATION &&
+					get_attnotnull(rte->relid, var->varattno))
+					continue;	/* cannot produce NULL */
+			}
+
+			/*
+			 * Even if that didn't work, we can conclude that the Var is not
+			 * nullable if find_nonnullable_vars can find a "var IS NOT NULL"
+			 * or similarly strict condition among the usable_quals.  Compute
+			 * the list of Vars having such quals if we didn't already.
+			 */
+			if (!computed_nonnullable_vars)
+			{
+				nonnullable_vars = find_nonnullable_vars((Node *) usable_quals);
+				nonnullable_vars = (List *)
+					flatten_join_alias_vars(query,
+											(Node *) nonnullable_vars);
+				/* We don't bother removing any non-Vars from the result */
+				computed_nonnullable_vars = true;
+			}
+
+			if (!list_member(nonnullable_vars, var))
+				return false;	/* we failed to prove the Var non-null */
+		}
+		else
+		{
+			/* Not a Const or Var; punt */
+			return false;
+		}
+	}
+
+	return true;				/* query cannot emit NULLs */
+}
+
+/*
+ * find_innerjoined_rels
+ *		Traverse jointree to locate non-outerjoined-rels and quals above them
+ *
+ * We fill innerjoined_rels with the relids of all rels that are not below
+ * the nullable side of any outer join (which would cause their Vars to be
+ * possibly NULL regardless of what's in the catalogs).  In the same scan,
+ * we locate all WHERE and JOIN/ON quals that constrain these rels add them to
+ * the usable_quals list (forming a list with implicit-AND semantics).
+ *
+ * Top-level caller must initialize innerjoined_rels/usable_quals to NULL/NIL.
+ */
+static void
+find_innerjoined_rels(Node *jtnode,
+					  Relids *innerjoined_rels, List **usable_quals)
+{
+	if (jtnode == NULL)
+		return;
+	if (IsA(jtnode, RangeTblRef))
+	{
+		int			varno = ((RangeTblRef *) jtnode)->rtindex;
+
+		*innerjoined_rels = bms_add_member(*innerjoined_rels, varno);
+	}
+	else if (IsA(jtnode, FromExpr))
+	{
+		FromExpr   *f = (FromExpr *) jtnode;
+		ListCell   *lc;
+
+		/* All elements of the FROM list are allowable */
+		foreach(lc, f->fromlist)
+			find_innerjoined_rels((Node *) lfirst(lc),
+								  innerjoined_rels, usable_quals);
+		/* ... and its WHERE quals are too */
+		if (f->quals)
+			*usable_quals = lappend(*usable_quals, f->quals);
+	}
+	else if (IsA(jtnode, JoinExpr))
+	{
+		JoinExpr   *j = (JoinExpr *) jtnode;
+
+		switch (j->jointype)
+		{
+			case JOIN_INNER:
+				/* visit both children */
+				find_innerjoined_rels(j->larg,
+									  innerjoined_rels, usable_quals);
+				find_innerjoined_rels(j->rarg,
+									  innerjoined_rels, usable_quals);
+				/* and grab the ON quals too */
+				if (j->quals)
+					*usable_quals = lappend(*usable_quals, j->quals);
+				break;
+
+			case JOIN_LEFT:
+			case JOIN_SEMI:
+			case JOIN_ANTI:
+
+				/*
+				 * Only the left input is possibly non-nullable; furthermore,
+				 * the quals of this join don't constrain the left input.
+				 * Note: we probably can't see SEMI or ANTI joins at this
+				 * point, but if we do, we can treat them like LEFT joins.
+				 */
+				find_innerjoined_rels(j->larg,
+									  innerjoined_rels, usable_quals);
+				break;
+
+			case JOIN_RIGHT:
+				/* Reverse of the above case */
+				find_innerjoined_rels(j->rarg,
+									  innerjoined_rels, usable_quals);
+				break;
+
+			case JOIN_FULL:
+				/* Neither side is non-nullable, so stop descending */
+				break;
+
+			default:
+				elog(ERROR, "unrecognized join type: %d",
+					 (int) j->jointype);
+		}
+	}
+	else
+		elog(ERROR, "unrecognized node type: %d",
+			 (int) nodeTag(jtnode));
+}
+
 /*
  * Can we treat a ScalarArrayOpExpr as strict?
  *
diff --git a/src/backend/utils/cache/lsyscache.c b/src/backend/utils/cache/lsyscache.c
index e88c45d268..e0e4d1ed63 100644
--- a/src/backend/utils/cache/lsyscache.c
+++ b/src/backend/utils/cache/lsyscache.c
@@ -878,6 +878,33 @@ get_atttypetypmodcoll(Oid relid, AttrNumber attnum,
 	ReleaseSysCache(tp);
 }
 
+/*
+ * get_attnotnull
+ *
+ *		Given the relation id and the attribute number,
+ *		return the "attnotnull" field from the attribute relation.
+ */
+bool
+get_attnotnull(Oid relid, AttrNumber attnum)
+{
+	HeapTuple	tp;
+
+	tp = SearchSysCache2(ATTNUM,
+						 ObjectIdGetDatum(relid),
+						 Int16GetDatum(attnum));
+	if (HeapTupleIsValid(tp))
+	{
+		Form_pg_attribute att_tup = (Form_pg_attribute) GETSTRUCT(tp);
+		bool		result;
+
+		result = att_tup->attnotnull;
+		ReleaseSysCache(tp);
+		return result;
+	}
+	else
+		return false;
+}
+
 /*				---------- COLLATION CACHE ----------					 */
 
 /*
diff --git a/src/include/optimizer/clauses.h b/src/include/optimizer/clauses.h
index 5e10fb1d50..b4d189eabc 100644
--- a/src/include/optimizer/clauses.h
+++ b/src/include/optimizer/clauses.h
@@ -44,6 +44,9 @@ extern Relids find_nonnullable_rels(Node *clause);
 extern List *find_nonnullable_vars(Node *clause);
 extern List *find_forced_null_vars(Node *clause);
 extern Var *find_forced_null_var(Node *clause);
+extern bool expressions_are_not_nullable(Query *query, List *exprs,
+							 Node *notnull_proofs);
+extern bool query_outputs_are_not_nullable(Query *query);
 
 extern bool is_pseudo_constant_clause(Node *clause);
 extern bool is_pseudo_constant_clause_relids(Node *clause, Relids relids);
diff --git a/src/include/optimizer/subselect.h b/src/include/optimizer/subselect.h
index 2d2c3bcbc0..48216a6277 100644
--- a/src/include/optimizer/subselect.h
+++ b/src/include/optimizer/subselect.h
@@ -19,7 +19,9 @@
 extern void SS_process_ctes(PlannerInfo *root);
 extern JoinExpr *convert_ANY_sublink_to_join(PlannerInfo *root,
 							SubLink *sublink,
-							Relids available_rels);
+							bool under_not,
+							Relids available_rels,
+							Node *notnull_proofs);
 extern JoinExpr *convert_EXISTS_sublink_to_join(PlannerInfo *root,
 							   SubLink *sublink,
 							   bool under_not,
diff --git a/src/include/utils/lsyscache.h b/src/include/utils/lsyscache.h
index 16b0b1d2dc..71370d8877 100644
--- a/src/include/utils/lsyscache.h
+++ b/src/include/utils/lsyscache.h
@@ -89,6 +89,7 @@ extern AttrNumber get_attnum(Oid relid, const char *attname);
 extern Oid	get_atttype(Oid relid, AttrNumber attnum);
 extern void get_atttypetypmodcoll(Oid relid, AttrNumber attnum,
 					  Oid *typid, int32 *typmod, Oid *collid);
+extern bool get_attnotnull(Oid relid, AttrNumber attnum);
 extern char *get_collation_name(Oid colloid);
 extern char *get_constraint_name(Oid conoid);
 extern char *get_language_name(Oid langoid, bool missing_ok);
diff --git a/src/test/regress/expected/subselect.out b/src/test/regress/expected/subselect.out
index cc3f5f3737..f2d5d6679f 100644
--- a/src/test/regress/expected/subselect.out
+++ b/src/test/regress/expected/subselect.out
@@ -1310,3 +1310,367 @@ select * from x for update;
    Output: subselect_tbl.f1, subselect_tbl.f2, subselect_tbl.f3
 (2 rows)
 
+--
+-- Check NOT IN performs an ANTI JOIN when NULLs are not possible
+-- on either side of the, would be, join condition.
+--
+BEGIN;
+CREATE TEMP TABLE a (id INT PRIMARY KEY);
+CREATE TEMP TABLE b (x INT NOT NULL, y INT);
+CREATE TEMP TABLE c (z INT NOT NULL);
+INSERT INTO b VALUES(1,1),(2,2),(3,NULL);
+-- No ANTI JOIN, b.x is from an outer join
+EXPLAIN (COSTS OFF)
+SELECT * FROM a
+LEFT OUTER JOIN b ON a.id = b.x
+WHERE b.x NOT IN(SELECT z FROM c);
+             QUERY PLAN             
+------------------------------------
+ Hash Right Join
+   Hash Cond: (b.x = a.id)
+   Filter: (NOT (hashed SubPlan 1))
+   ->  Seq Scan on b
+   ->  Hash
+         ->  Seq Scan on a
+   SubPlan 1
+     ->  Seq Scan on c
+(8 rows)
+
+-- ANTI JOIN, b.x is from an outer join but b.x > 100
+-- forces the join not to produce NULL on the righthand
+-- side.
+EXPLAIN (COSTS OFF)
+SELECT * FROM a
+LEFT OUTER JOIN b ON a.id = b.x
+WHERE b.x NOT IN(SELECT z FROM c)
+  AND b.x > 100;
+           QUERY PLAN            
+---------------------------------
+ Hash Join
+   Hash Cond: (b.x = a.id)
+   ->  Hash Anti Join
+         Hash Cond: (b.x = c.z)
+         ->  Seq Scan on b
+               Filter: (x > 100)
+         ->  Hash
+               ->  Seq Scan on c
+   ->  Hash
+         ->  Seq Scan on a
+(10 rows)
+
+-- No ANTI JOIN, b.x is from an outer join
+EXPLAIN (COSTS OFF)
+SELECT * FROM a
+FULL OUTER JOIN b ON a.id = b.x
+WHERE b.x NOT IN(SELECT y FROM c);
+         QUERY PLAN          
+-----------------------------
+ Hash Full Join
+   Hash Cond: (b.x = a.id)
+   Filter: (NOT (SubPlan 1))
+   ->  Seq Scan on b
+   ->  Hash
+         ->  Seq Scan on a
+   SubPlan 1
+     ->  Seq Scan on c
+(8 rows)
+
+-- No ANTI JOIN. y can have NULLs
+EXPLAIN (COSTS OFF)
+SELECT * FROM b WHERE y NOT IN (SELECT z FROM c);
+             QUERY PLAN             
+------------------------------------
+ Seq Scan on b
+   Filter: (NOT (hashed SubPlan 1))
+   SubPlan 1
+     ->  Seq Scan on c
+(4 rows)
+
+-- c is an empty relation so should cause no filtering on b
+SELECT * FROM b WHERE y NOT IN (SELECT z FROM c);
+ x | y 
+---+---
+ 1 | 1
+ 2 | 2
+ 3 |  
+(3 rows)
+
+INSERT INTO c VALUES(1);
+-- Records where y is NULL should be filtered out.
+SELECT * FROM b WHERE y NOT IN (SELECT z FROM c);
+ x | y 
+---+---
+ 2 | 2
+(1 row)
+
+-- No ANTI JOIN, y can be NULL
+EXPLAIN (COSTS OFF)
+SELECT * FROM a WHERE id NOT IN (SELECT y FROM b);
+             QUERY PLAN             
+------------------------------------
+ Seq Scan on a
+   Filter: (NOT (hashed SubPlan 1))
+   SubPlan 1
+     ->  Seq Scan on b
+(4 rows)
+
+-- No ANTI JOIN, x is NOT NULL, but we don't know if + 1 will change that.
+EXPLAIN (COSTS OFF)
+SELECT * FROM a WHERE id NOT IN (SELECT x+1 FROM b);
+             QUERY PLAN             
+------------------------------------
+ Seq Scan on a
+   Filter: (NOT (hashed SubPlan 1))
+   SubPlan 1
+     ->  Seq Scan on b
+(4 rows)
+
+-- ANTI JOIN 1 is a Const that is not null.
+EXPLAIN (COSTS OFF)
+SELECT * FROM a WHERE id NOT IN (SELECT 1 FROM b);
+        QUERY PLAN         
+---------------------------
+ Nested Loop Anti Join
+   Join Filter: (a.id = 1)
+   ->  Seq Scan on a
+   ->  Materialize
+         ->  Seq Scan on b
+(5 rows)
+
+-- No ANTI JOIN, results contain a NULL Const
+EXPLAIN (COSTS OFF)
+SELECT * FROM a WHERE id NOT IN (SELECT NULL::int FROM b);
+             QUERY PLAN             
+------------------------------------
+ Seq Scan on a
+   Filter: (NOT (hashed SubPlan 1))
+   SubPlan 1
+     ->  Seq Scan on b
+(4 rows)
+
+-- ANTI JOIN y = 1 means y can't be NULL
+EXPLAIN (COSTS OFF)
+SELECT * FROM a WHERE id NOT IN (SELECT y FROM b WHERE y = 1);
+          QUERY PLAN           
+-------------------------------
+ Hash Anti Join
+   Hash Cond: (a.id = b.y)
+   ->  Seq Scan on a
+   ->  Hash
+         ->  Seq Scan on b
+               Filter: (y = 1)
+(6 rows)
+
+-- No ANTI JOIN, OR condition does not ensure y = 1
+EXPLAIN (COSTS OFF)
+SELECT * FROM a WHERE id NOT IN (SELECT y FROM b WHERE y = 1 OR x = 1);
+               QUERY PLAN               
+----------------------------------------
+ Seq Scan on a
+   Filter: (NOT (hashed SubPlan 1))
+   SubPlan 1
+     ->  Seq Scan on b
+           Filter: ((y = 1) OR (x = 1))
+(5 rows)
+
+-- No ANTI JOIN, OR condition does not ensure y = 1 or y = 2
+EXPLAIN (COSTS OFF)
+SELECT * FROM a WHERE id NOT IN (SELECT y FROM b WHERE (y = 1 OR x = 1) AND (y = 2 OR x = 2));
+                            QUERY PLAN                             
+-------------------------------------------------------------------
+ Seq Scan on a
+   Filter: (NOT (hashed SubPlan 1))
+   SubPlan 1
+     ->  Seq Scan on b
+           Filter: (((y = 1) OR (x = 1)) AND ((y = 2) OR (x = 2)))
+(5 rows)
+
+-- ANTI JOIN y must be 2, so can't be NULL
+EXPLAIN (COSTS OFF)
+SELECT * FROM a WHERE id NOT IN (SELECT y FROM b WHERE (y = 1 OR x = 1) AND y = 2);
+                        QUERY PLAN                        
+----------------------------------------------------------
+ Hash Anti Join
+   Hash Cond: (a.id = b.y)
+   ->  Seq Scan on a
+   ->  Hash
+         ->  Seq Scan on b
+               Filter: ((y = 2) AND ((y = 1) OR (x = 1)))
+(6 rows)
+
+-- ANTI JOIN y can be 1 or 2, but can't be null.
+EXPLAIN (COSTS OFF)
+SELECT * FROM a WHERE id NOT IN (SELECT y FROM b WHERE (y = 1 OR y = 2));
+                 QUERY PLAN                 
+--------------------------------------------
+ Hash Anti Join
+   Hash Cond: (a.id = b.y)
+   ->  Seq Scan on a
+   ->  Hash
+         ->  Seq Scan on b
+               Filter: ((y = 1) OR (y = 2))
+(6 rows)
+
+-- No ANTI JOIN c.z is from a left outer join so it can have nulls.
+EXPLAIN (COSTS OFF)
+SELECT * FROM a WHERE id NOT IN (SELECT c.z FROM b LEFT JOIN c ON b.x = c.z);
+             QUERY PLAN             
+------------------------------------
+ Seq Scan on a
+   Filter: (NOT (hashed SubPlan 1))
+   SubPlan 1
+     ->  Merge Left Join
+           Merge Cond: (b.x = c.z)
+           ->  Sort
+                 Sort Key: b.x
+                 ->  Seq Scan on b
+           ->  Sort
+                 Sort Key: c.z
+                 ->  Seq Scan on c
+(11 rows)
+
+-- ANTI JOIN, c.z is not from an outer join
+EXPLAIN (COSTS OFF)
+SELECT * FROM a WHERE id NOT IN (SELECT c.z FROM b RIGHT JOIN c ON b.x = c.z);
+               QUERY PLAN                
+-----------------------------------------
+ Merge Anti Join
+   Merge Cond: (a.id = c.z)
+   ->  Index Only Scan using a_pkey on a
+   ->  Merge Left Join
+         Merge Cond: (c.z = b.x)
+         ->  Sort
+               Sort Key: c.z
+               ->  Seq Scan on c
+         ->  Sort
+               Sort Key: b.x
+               ->  Seq Scan on b
+(11 rows)
+
+-- No ANTI JOIN, b.x is from an outer join
+EXPLAIN (COSTS OFF)
+SELECT * FROM a WHERE id NOT IN (SELECT b.x FROM b RIGHT JOIN c ON b.x = c.z);
+             QUERY PLAN             
+------------------------------------
+ Seq Scan on a
+   Filter: (NOT (hashed SubPlan 1))
+   SubPlan 1
+     ->  Merge Right Join
+           Merge Cond: (b.x = c.z)
+           ->  Sort
+                 Sort Key: b.x
+                 ->  Seq Scan on b
+           ->  Sort
+                 Sort Key: c.z
+                 ->  Seq Scan on c
+(11 rows)
+
+-- No ANTI JOIN, c.z is from an outer join
+EXPLAIN (COSTS OFF)
+SELECT * FROM a WHERE id NOT IN (SELECT c.z FROM b FULL JOIN c ON b.x = c.z);
+             QUERY PLAN             
+------------------------------------
+ Seq Scan on a
+   Filter: (NOT (hashed SubPlan 1))
+   SubPlan 1
+     ->  Merge Full Join
+           Merge Cond: (b.x = c.z)
+           ->  Sort
+                 Sort Key: b.x
+                 ->  Seq Scan on b
+           ->  Sort
+                 Sort Key: c.z
+                 ->  Seq Scan on c
+(11 rows)
+
+-- No ANTI JOIN, b.x is from an outer join
+EXPLAIN (COSTS OFF)
+SELECT * FROM a WHERE id NOT IN (SELECT b.x FROM b FULL JOIN c ON b.x = c.z);
+             QUERY PLAN             
+------------------------------------
+ Seq Scan on a
+   Filter: (NOT (hashed SubPlan 1))
+   SubPlan 1
+     ->  Merge Full Join
+           Merge Cond: (b.x = c.z)
+           ->  Sort
+                 Sort Key: b.x
+                 ->  Seq Scan on b
+           ->  Sort
+                 Sort Key: c.z
+                 ->  Seq Scan on c
+(11 rows)
+
+-- ANTI JOIN, c.z is from an inner join and has a NOT NULL constraint.
+EXPLAIN (COSTS OFF)
+SELECT * FROM a WHERE id NOT IN (SELECT c.z FROM b INNER JOIN c ON b.x = c.z);
+               QUERY PLAN                
+-----------------------------------------
+ Merge Anti Join
+   Merge Cond: (a.id = c.z)
+   ->  Index Only Scan using a_pkey on a
+   ->  Merge Join
+         Merge Cond: (b.x = c.z)
+         ->  Sort
+               Sort Key: b.x
+               ->  Seq Scan on b
+         ->  Sort
+               Sort Key: c.z
+               ->  Seq Scan on c
+(11 rows)
+
+-- ANTI JOIN, c.z must be 1
+EXPLAIN (COSTS OFF)
+SELECT * FROM a WHERE id NOT IN (SELECT c.z FROM b LEFT JOIN c ON b.x = c.z WHERE c.z = 1);
+                QUERY PLAN                 
+-------------------------------------------
+ Hash Anti Join
+   Hash Cond: (a.id = c.z)
+   ->  Seq Scan on a
+   ->  Hash
+         ->  Nested Loop
+               ->  Seq Scan on c
+                     Filter: (z = 1)
+               ->  Materialize
+                     ->  Seq Scan on b
+                           Filter: (x = 1)
+(10 rows)
+
+-- ANTI JOIN, c.z can't be NULL
+EXPLAIN (COSTS OFF)
+SELECT * FROM a WHERE id NOT IN (SELECT c.z FROM b LEFT JOIN c ON b.x = c.z WHERE c.z IS NOT NULL);
+                 QUERY PLAN                  
+---------------------------------------------
+ Merge Anti Join
+   Merge Cond: (a.id = c.z)
+   ->  Index Only Scan using a_pkey on a
+   ->  Merge Join
+         Merge Cond: (b.x = c.z)
+         ->  Sort
+               Sort Key: b.x
+               ->  Seq Scan on b
+         ->  Sort
+               Sort Key: c.z
+               ->  Seq Scan on c
+                     Filter: (z IS NOT NULL)
+(12 rows)
+
+-- ANTI JOIN, b.y cannot be NULL due to the join condition b.y = c.z
+EXPLAIN (COSTS OFF)
+SELECT * FROM a WHERE id NOT IN (SELECT b.y FROM b INNER JOIN c ON b.y = c.z);
+               QUERY PLAN                
+-----------------------------------------
+ Merge Anti Join
+   Merge Cond: (a.id = b.y)
+   ->  Index Only Scan using a_pkey on a
+   ->  Merge Join
+         Merge Cond: (b.y = c.z)
+         ->  Sort
+               Sort Key: b.y
+               ->  Seq Scan on b
+         ->  Sort
+               Sort Key: c.z
+               ->  Seq Scan on c
+(11 rows)
+
+ROLLBACK;
diff --git a/src/test/regress/sql/subselect.sql b/src/test/regress/sql/subselect.sql
index 8bca1f5d55..d6fff1f2ae 100644
--- a/src/test/regress/sql/subselect.sql
+++ b/src/test/regress/sql/subselect.sql
@@ -683,3 +683,123 @@ select * from (with x as (select 2 as y) select * from x) ss;
 explain (verbose, costs off)
 with x as (select * from subselect_tbl)
 select * from x for update;
+
+--
+-- Check NOT IN performs an ANTI JOIN when NULLs are not possible
+-- on either side of the, would be, join condition.
+--
+
+BEGIN;
+
+CREATE TEMP TABLE a (id INT PRIMARY KEY);
+CREATE TEMP TABLE b (x INT NOT NULL, y INT);
+CREATE TEMP TABLE c (z INT NOT NULL);
+
+INSERT INTO b VALUES(1,1),(2,2),(3,NULL);
+
+-- No ANTI JOIN, b.x is from an outer join
+EXPLAIN (COSTS OFF)
+SELECT * FROM a
+LEFT OUTER JOIN b ON a.id = b.x
+WHERE b.x NOT IN(SELECT z FROM c);
+
+-- ANTI JOIN, b.x is from an outer join but b.x > 100
+-- forces the join not to produce NULL on the righthand
+-- side.
+EXPLAIN (COSTS OFF)
+SELECT * FROM a
+LEFT OUTER JOIN b ON a.id = b.x
+WHERE b.x NOT IN(SELECT z FROM c)
+  AND b.x > 100;
+
+-- No ANTI JOIN, b.x is from an outer join
+EXPLAIN (COSTS OFF)
+SELECT * FROM a
+FULL OUTER JOIN b ON a.id = b.x
+WHERE b.x NOT IN(SELECT y FROM c);
+
+-- No ANTI JOIN. y can have NULLs
+EXPLAIN (COSTS OFF)
+SELECT * FROM b WHERE y NOT IN (SELECT z FROM c);
+
+-- c is an empty relation so should cause no filtering on b
+SELECT * FROM b WHERE y NOT IN (SELECT z FROM c);
+
+INSERT INTO c VALUES(1);
+
+-- Records where y is NULL should be filtered out.
+SELECT * FROM b WHERE y NOT IN (SELECT z FROM c);
+
+-- No ANTI JOIN, y can be NULL
+EXPLAIN (COSTS OFF)
+SELECT * FROM a WHERE id NOT IN (SELECT y FROM b);
+
+-- No ANTI JOIN, x is NOT NULL, but we don't know if + 1 will change that.
+EXPLAIN (COSTS OFF)
+SELECT * FROM a WHERE id NOT IN (SELECT x+1 FROM b);
+
+-- ANTI JOIN 1 is a Const that is not null.
+EXPLAIN (COSTS OFF)
+SELECT * FROM a WHERE id NOT IN (SELECT 1 FROM b);
+
+-- No ANTI JOIN, results contain a NULL Const
+EXPLAIN (COSTS OFF)
+SELECT * FROM a WHERE id NOT IN (SELECT NULL::int FROM b);
+
+-- ANTI JOIN y = 1 means y can't be NULL
+EXPLAIN (COSTS OFF)
+SELECT * FROM a WHERE id NOT IN (SELECT y FROM b WHERE y = 1);
+
+-- No ANTI JOIN, OR condition does not ensure y = 1
+EXPLAIN (COSTS OFF)
+SELECT * FROM a WHERE id NOT IN (SELECT y FROM b WHERE y = 1 OR x = 1);
+
+-- No ANTI JOIN, OR condition does not ensure y = 1 or y = 2
+EXPLAIN (COSTS OFF)
+SELECT * FROM a WHERE id NOT IN (SELECT y FROM b WHERE (y = 1 OR x = 1) AND (y = 2 OR x = 2));
+
+-- ANTI JOIN y must be 2, so can't be NULL
+EXPLAIN (COSTS OFF)
+SELECT * FROM a WHERE id NOT IN (SELECT y FROM b WHERE (y = 1 OR x = 1) AND y = 2);
+
+-- ANTI JOIN y can be 1 or 2, but can't be null.
+EXPLAIN (COSTS OFF)
+SELECT * FROM a WHERE id NOT IN (SELECT y FROM b WHERE (y = 1 OR y = 2));
+
+-- No ANTI JOIN c.z is from a left outer join so it can have nulls.
+EXPLAIN (COSTS OFF)
+SELECT * FROM a WHERE id NOT IN (SELECT c.z FROM b LEFT JOIN c ON b.x = c.z);
+
+-- ANTI JOIN, c.z is not from an outer join
+EXPLAIN (COSTS OFF)
+SELECT * FROM a WHERE id NOT IN (SELECT c.z FROM b RIGHT JOIN c ON b.x = c.z);
+
+-- No ANTI JOIN, b.x is from an outer join
+EXPLAIN (COSTS OFF)
+SELECT * FROM a WHERE id NOT IN (SELECT b.x FROM b RIGHT JOIN c ON b.x = c.z);
+
+-- No ANTI JOIN, c.z is from an outer join
+EXPLAIN (COSTS OFF)
+SELECT * FROM a WHERE id NOT IN (SELECT c.z FROM b FULL JOIN c ON b.x = c.z);
+
+-- No ANTI JOIN, b.x is from an outer join
+EXPLAIN (COSTS OFF)
+SELECT * FROM a WHERE id NOT IN (SELECT b.x FROM b FULL JOIN c ON b.x = c.z);
+
+-- ANTI JOIN, c.z is from an inner join and has a NOT NULL constraint.
+EXPLAIN (COSTS OFF)
+SELECT * FROM a WHERE id NOT IN (SELECT c.z FROM b INNER JOIN c ON b.x = c.z);
+
+-- ANTI JOIN, c.z must be 1
+EXPLAIN (COSTS OFF)
+SELECT * FROM a WHERE id NOT IN (SELECT c.z FROM b LEFT JOIN c ON b.x = c.z WHERE c.z = 1);
+
+-- ANTI JOIN, c.z can't be NULL
+EXPLAIN (COSTS OFF)
+SELECT * FROM a WHERE id NOT IN (SELECT c.z FROM b LEFT JOIN c ON b.x = c.z WHERE c.z IS NOT NULL);
+
+-- ANTI JOIN, b.y cannot be NULL due to the join condition b.y = c.z
+EXPLAIN (COSTS OFF)
+SELECT * FROM a WHERE id NOT IN (SELECT b.y FROM b INNER JOIN c ON b.y = c.z);
+
+ROLLBACK;
#8Li, Zheng
zhelli@amazon.com
In reply to: David Rowley (#7)
1 attachment(s)
Re: NOT IN subquery optimization

I'm attaching a working patch following the discussion.

You can also find the following patch description is the commit message:
NOT IN to ANTI JOIN transformation

The semantics of ANTI JOIN were created to match the semantics of NOT
EXISTS, which enables NOT EXISTS subqueries to be efficiently executed
as a kind of join. NOT IN subqueries have different NULL semantics than
NOT EXISTS, but since there is no special join operator for NOT IN it is
generally executed as a nested sub-plan. It is possible, however, to
transform NOT IN to a correlated NOT EXISTS so that it can be executed
it as an ANTI JOIN with additional correlated predicates.

A general transformation from NOT IN to NOT EXISTS for the
single-expression case (the multi-expression case is just ANDs of the
single-expressions) is:
t1.x NOT IN (SELECT t2.y from t2 where p) <=> NOT EXISTS (select 1
from t2 where (y=x or y is NULL or x is NULL) and p).

If x or y is non-nullable, we can safely remove the predicate "x
is NULL" or "y is NULL", and if there is no predicate p,
then "x is NULL" may be factored out of the subquery.
Experiments show that if we can remove one or the other ORed
predicates, or if we can factor out the "x is NULL", then
execution is typically much faster.

Basically, for the single expression case (we also handle the multi
expression case), we try to do the following transformation:
When p does not exist:
t1.x not in (t2.y) => ANTI JOIN t1(Filter: x is not null), t2 on
join condition: t1.x=t2.y or t2.y is null.
When x is non-nullable:
t1.x not in (t2.y where p) => ANTI JOIN t1, t2 on join condition:
(t1.x=t2.y or t2.y is null) and p.

We implemented a nullability test routine is_node_nonnullable().
Currently it handles Var, TargetEntry, CoalesceExpr and Const. Outer
joins are taken into consideration in the nullability test.

We adjust and apply reduce_outer_joins() before the transformation so
that the outer joins have an opportunity to be converted to inner joins
prior to the transformation.

Using this transformation, we measured performance improvements of
two to three orders of magnitude on most queries in a development
environment. In our performance experiments, table s (small) has 11
rows, table l (large) has 1 million rows. s.n and l.n have NULL
values. s.nn and l.nn are NOT NULL.

s.n not in (l.n) 1150 ms -> 0.49 ms
s.nn not in (l.nn) 1120 ms -> 0.45 ms
l.n not in (l.n) over 20 min -> 1700 ms
l.nn not in (l.nn) over 20 min -> 220 ms
l.n not in (s.n) 63 ms -> 750 ms
l.nn not in (s.nn) 58 ms -> 46 ms

For the only case that performance drops - l.n not in (s.n). It is
likely to be resolved by ending the nested loop anti join early as
soon as we find NULL inner tuple entry/entries that satisfies the
join condition during execution. This is still under investigation.

Comments are welcome.

With Regards,
---
Zheng Li, AWS, Amazon Aurora PostgreSQL

On 2/25/19, 7:32 AM, "David Rowley" <david.rowley@2ndquadrant.com> wrote:

On Fri, 22 Feb 2019 at 09:44, David Rowley <david.rowley@2ndquadrant.com> wrote:

I've attached the rebased and still broken version.

I set about trying to make a less broken version of this.

A quick reminder of the semantics of NOT IN:

1. WHERE <nullable_column> NOT IN(SELECT <not null column> FROM table);

If table is non-empty:
will filter out rows where <nullable_column> is NULL
and only show values that are not in <not null column>

If table is empty:
Filters nothing.

2. WHERE <nonnullable_column> NOT IN(SELECT <null column> FROM table);

If table contains NULLs in the <null column> no records will match.

The previous patch handled #2 correctly but neglected to do anything
about #1. For #1 the only way we can implement this as a planner only
change is to insist that the outer side expressions also are not null.
If we could have somehow tested if "table" was non-empty then we could
have added a IS NOT NULL clause to the outer query and converted to an
anti-join, but ... can't know that during planning and can't add the
IS NOT NULL regardless as, if "table" is empty we will filter NULLs
when we shouldn't.

In the attached, I set about fixing #1 by determining if the outer
expressions could be NULL by checking

1. If expression is a Var from an inner joined relation it can't be
NULL if there's a NOT NULL constraint on the column; or
2. If expression is a Var from an inner joined relation and there is a
strict WHERE/ON clause, the expression can't be NULL; or
3. If expression is a Var from an outer joined relation check for
quals that were specified in the same syntactical level as the NOT IN
for proofs that NULL will be filtered.

An example of #3 is:

SELECT * FROM t1 LEFT JOIN t2 on t1.a = t2.a WHERE t2.a IS NOT NULL
AND t2.a NOT IN(SELECT a FROM t3); -- t2 becomes INNER JOINed later in
planning, but...
or;
SELECT * FROM t1 LEFT JOIN t2 on t1.a = t2.a AND t2.a NOT IN(SELECT a FROM t3);

In the latter of the two, the t1.a = t2.a join conditions ensures that
NULLs can't exist where the NOT IN is evaluated.

I implemented #3 by passing the quals down to
pull_up_sublinks_qual_recurse(). At the top level call 'node' and
'notnull_proofs' are the same, but that changes in recursive calls
like the one we make inside the is_andclause() condition.

Comments welcome.

--
David Rowley http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

Attachments:

not_in_anti_join_transformation.patchapplication/octet-stream; name=not_in_anti_join_transformation.patchDownload
commit 4acf23d0308da002a62f0a58db8d11f5d0a71c29
Author: Zheng Li <zhelli@amazon.com>
Date:   Fri Feb 22 21:00:46 2019 +0000

    NOT IN to ANTI JOIN transformation
    
    The semantics of ANTI JOIN were created to match the semantics of NOT
    EXISTS, which enables NOT EXISTS subqueries to be efficiently executed
    as a kind of join.  NOT IN subqueries have different NULL semantics than
    NOT EXISTS, but since there is no special join operator for NOT IN it is
    generally executed as a nested sub-plan.  It is possible, however, to
    transform NOT IN to a correlated NOT EXISTS so that it can be executed
    it as an ANTI JOIN with additional correlated predicates.
    
    A general transformation from NOT IN to NOT EXISTS for the
    single-expression case (the multi-expression case is just ANDs of the
    single-expressions) is:
        t1.x NOT IN (SELECT t2.y from t2 where p) <=> NOT EXISTS (select 1
                from t2 where (y=x or y is NULL or x is NULL) and p).
    
    If x or y is non-nullable, we can safely remove the predicate "x
    is NULL" or "y is NULL", and if there is no predicate p,
    then "x is NULL" may be factored out of the subquery.
    Experiments show that if we can remove one or the other ORed
    predicates, or if we can factor out the "x is NULL", then
    execution is typically much faster.
    
    Basically, for the single expression case (we also handle the multi
    expression case), we try to do the following transformation:
    When p does not exist:
        t1.x not in (t2.y) => ANTI JOIN t1(Filter: x is not null), t2 on
        join condition: t1.x=t2.y or t2.y is null.
    When x is non-nullable:
        t1.x not in (t2.y where p) => ANTI JOIN t1, t2 on join condition:
        (t1.x=t2.y or t2.y is null) and p.
    
    We implemented a nullability test routine is_node_nonnullable().
    Currently it handles Var, TargetEntry, CoalesceExpr and Const. Outer
    joins are taken into consideration in the nullability test.
    
    We apply reduce_outer_joins() before the transformation so that the
    outer joins have an opportunity to be converted to inner joins prior
    to the transformation.
    
    Using this transformation, we measured performance improvements of
    two to three orders of magnitude on most queries in a development
    environment. In our performance experiments, table s (small) has 11
    rows, table l (large) has 1 million rows. s.n and l.n have NULL
    values. s.nn and l.nn are NOT NULL.
    
        s.n not in (l.n)       1150 ms -> 0.49 ms
        s.nn not in (l.nn)     1120 ms -> 0.45 ms
        l.n not in (l.n)   over 20 min -> 1700 ms
        l.nn not in (l.nn) over 20 min -> 220 ms
        l.n not in (s.n)         63 ms -> 750 ms
        l.nn not in (s.nn)       58 ms -> 46 ms
    
    For the only case that performance drops - l.n not in (s.n). It is
    likely to be resolved by ending the nested loop anti join early as
    soon as we find NULL inner tuple entry/entries that satisfies the
    join condition during execution. This is still under investigation.
    
    Authors: Zheng Li, Jim Finnerty
    Discussion:
    https://www.postgresql.org/message-id/1550706289606-0.post%40n3.nabble.com

diff --git a/src/backend/optimizer/plan/planner.c b/src/backend/optimizer/plan/planner.c
index bc81535905..e5091d2595 100644
--- a/src/backend/optimizer/plan/planner.c
+++ b/src/backend/optimizer/plan/planner.c
@@ -976,7 +976,7 @@ subquery_planner(PlannerGlobal *glob, Query *parse,
 	 * preprocessing.
 	 */
 	if (hasOuterJoins)
-		reduce_outer_joins(root);
+		reduce_outer_joins(parse);
 
 	/*
 	 * If we have any RTE_RESULT relations, see if they can be deleted from
diff --git a/src/backend/optimizer/plan/subselect.c b/src/backend/optimizer/plan/subselect.c
index c721054787..d076d89227 100644
--- a/src/backend/optimizer/plan/subselect.c
+++ b/src/backend/optimizer/plan/subselect.c
@@ -33,6 +33,7 @@
 #include "optimizer/prep.h"
 #include "optimizer/subselect.h"
 #include "parser/parse_relation.h"
+#include "parser/parsetree.h"
 #include "rewrite/rewriteManip.h"
 #include "utils/builtins.h"
 #include "utils/lsyscache.h"
@@ -1107,6 +1108,439 @@ inline_cte_walker(Node *node, inline_cte_walker_context *context)
 	return expression_tree_walker(node, inline_cte_walker, context);
 }
 
+/*
+ * check if relid appears on the NULL-padded side of an outer join.
+ */
+static bool
+on_null_sideof_outerjoin(Index relid, Node *jtnode)
+{
+	if(!jtnode)
+	{
+		return false;
+	}
+	if(IsA(jtnode, FromExpr))
+	{
+		ListCell		*lc;
+		FromExpr	*fexpr = (FromExpr *)jtnode;
+		foreach(lc, fexpr->fromlist)
+		{
+			if(on_null_sideof_outerjoin(relid, lfirst(lc)))
+			{
+				return true;
+			}
+		}
+	}
+	else if(IsA(jtnode, JoinExpr))
+	{
+		JoinExpr *expr = (JoinExpr *)jtnode;
+		Relids relids = NULL;
+		Relids relids2 = NULL;
+		switch(expr->jointype)
+		{
+		case JOIN_LEFT:
+			relids = get_relids_in_jointree(expr->rarg, true);
+
+			return bms_is_member(relid, relids) ||
+						on_null_sideof_outerjoin(relid, expr->larg);
+		case JOIN_FULL:
+			relids = get_relids_in_jointree(expr->larg, true);
+			relids2 = get_relids_in_jointree(expr->rarg, true);
+
+			return bms_is_member(relid, relids) ||
+						bms_is_member(relid, relids2);
+		case JOIN_RIGHT:
+			relids = get_relids_in_jointree(expr->larg, true);
+
+			return bms_is_member(relid, relids) ||
+						on_null_sideof_outerjoin(relid, expr->rarg);
+		case JOIN_INNER:
+			return on_null_sideof_outerjoin(relid, expr->larg) ||
+						on_null_sideof_outerjoin(relid, expr->rarg);
+
+		case JOIN_SEMI:
+		case JOIN_ANTI:
+		case JOIN_UNIQUE_OUTER:
+		case JOIN_UNIQUE_INNER:
+			/*
+			 * JOIN_SEMI and JOIN_ANTI should not appear yet since they
+			 * are introduced only by pull_up_sublinks and this routine is called
+			 * before we replace root->parse->jointree in pull_up_sublinks().
+			 * I don't think we will see JOIN_UNIQUE_OUTER or
+			 * JOIN_UNIQUE_INNER since they are only used internally in
+			 * the planner in a much later phase (in standard_join_search).
+			 */
+			break;
+
+		default:
+			elog(ERROR, "unrecognized join type: %d",
+							 (int) expr->jointype);
+			break;
+		}
+	}
+	/* should be RangeTblRef from the following three cases:
+	 * plain / under JOIN_INNER / on inner side of an outer join
+	 */
+	return false;
+}
+
+/* Returns a List of Nodes from the testexpr of an Any SubLink */
+static List *
+getTestExpr(SubLink *sublink)
+{
+	Node * testexpr = sublink->testexpr;
+	Assert(testexpr);
+
+	/* single expression */
+	if(IsA(testexpr, OpExpr))
+	{
+		OpExpr	*opexpr = (OpExpr *) testexpr;
+		Node	*testnode = linitial(opexpr->args);
+
+		return list_make1(testnode);
+	}
+	/* multi-expression */
+	else if(IsA(testexpr, BoolExpr))
+	{
+		BoolExpr	*bexpr = (BoolExpr *) testexpr;
+		ListCell		*lc;
+		Node		*node;
+		List			*result = NULL;
+
+		foreach(lc, bexpr->args)
+		{
+			node = lfirst(lc);
+			if(IsA(node, OpExpr))
+			{
+				OpExpr *expr = (OpExpr *) node;
+				result = lappend(result, linitial(expr->args));
+			}
+			else
+			{
+				elog(ERROR, "unrecognized node type for testexpr: %d",
+						(int) nodeTag(node));
+			}
+		}
+		return result;
+	}
+	else
+	{
+		elog(ERROR, "unrecognized node type for testexpr: %d",
+				(int) nodeTag(testexpr));
+	}
+}
+
+/*
+ * Returns true if the Node passed in is nonnullable. Currently handles Var,
+ * TargetEntry, CoaleseExpr and Const.
+ * TODO: Add more supporting cases.
+ * A Var is nonnullable if:
+ * it does not appear on  the NULL-padded side of an outer join and it has NOT NULL constraint.
+ * A CoalesceExpr is nonnullable if it has a non-null argument
+ */
+static bool
+is_node_nonnullable(Node * node, Query *parse)
+{
+	AttrNumber	attno = InvalidAttrNumber;
+	Oid				reloid;
+	Index			relid = 0;
+
+	switch (nodeTag(node))
+	{
+		case T_Var:
+		{
+			Var 					*var;
+			RangeTblEntry	*rte;
+
+			var = (Var *) node;
+			relid = var->varno;
+			attno = var->varattno;
+			rte = rt_fetch(relid, parse->rtable);
+			reloid = rte->relid;
+			break;
+		}
+		case T_TargetEntry:
+		{
+			TargetEntry	*te = (TargetEntry *)node;
+			switch(nodeTag(te->expr))
+			{
+				case T_Var:
+				{
+					attno = te->resorigcol;
+					reloid = te->resorigtbl;
+					relid = get_relid_oid(reloid, parse);
+					break;
+				}
+				/* recurse into is_node_nonnullable for other types of Node */
+				default:
+					return is_node_nonnullable((Node *)te->expr, parse);
+			}
+			break;
+		}
+		case T_CoalesceExpr:
+		{
+			ListCell   			*arg;
+			CoalesceExpr	*cexpr = (CoalesceExpr *)node;
+
+			/* handle COALESCE Function by looking for non-null argument */
+			foreach(arg, cexpr->args)
+			{
+				Node	*e = lfirst(arg);
+
+				/* recurse into is_node_nonnullable */
+				if (is_node_nonnullable(e, parse))
+				{
+					return true;
+				}
+			}
+			break;
+		}
+		case T_Const:
+		{
+			if(!((Const *) node)->constisnull)
+			{
+				return true;
+			}
+			break;
+		}
+		/*
+		 * TODO: handle more cases to make the nullability test more accurate
+		 * Assume unhandled cases are nullable.
+		 */
+		default:
+			return false;
+	}
+
+	/* check if the Var appears on the NULL-padded side of an outer join */
+	if(relid &&
+		on_null_sideof_outerjoin(relid, (Node *)parse->jointree))
+	{
+		return false;
+	}
+	/* check for NOT NULL Constraint */
+	if(attno != InvalidAttrNumber &&
+			get_attnotnull(reloid, attno))
+	{
+		return true;
+	}
+
+	return false;
+}
+
+/*
+ * Returns true if at least one Node in the input list is non-nullable
+ */
+static bool
+list_hasnonnullable(List * list, Query *parse)
+{
+	ListCell	*lc;
+	Node	*node;
+
+	foreach(lc, list)
+	{
+		node = lfirst(lc);
+		if(is_node_nonnullable(node, parse))
+		{
+			return true;
+		}
+	}
+	return false;
+}
+
+/* Try to reduce outer joins if there is one in the Query */
+static void
+reduce_outer_joins_NOT_IN(Query *parse)
+{
+	ListCell				*lc;
+	RangeTblEntry	*rte;
+
+	foreach(lc, parse->rtable)
+	{
+		rte = (RangeTblEntry *) lfirst(lc);
+		/* try to reduce outer joins if there is one */
+		if (rte->rtekind == RTE_JOIN &&
+				IS_OUTER_JOIN(rte->jointype))
+		{
+			reduce_outer_joins(parse);
+			return;
+		}
+	}
+}
+
+/*
+* Convert x NOT IN (y) to ANTI JOIN. See detailed comments below.
+*/
+static bool
+convert_NOT_IN_to_join(PlannerInfo *root, Node **quals,
+									SubLink *sublink, List *subquery_vars,
+									Node **pullout)
+{
+	Query			*parse = root->parse;
+	NullTest			*nt;
+	TargetEntry	*te;
+	List				*testnodes = getTestExpr(sublink);
+	bool				 isxnonnullable;
+	bool				 isynonnullable;
+	Query			*subselect = (Query *) sublink->subselect;
+	bool				 hassubquals = subselect->jointree->quals ? true: false;
+
+	/*
+	 * Try reduce_outer_joins since outer join affects the nullability test that's coming up next.
+	 * We have to call reduce_outer_joins for outer and inner query separately because we
+	 * don't have a global range table yet.
+	 */
+	reduce_outer_joins_NOT_IN(parse);
+
+	reduce_outer_joins_NOT_IN(subselect);
+
+	Assert(testnodes);
+	isxnonnullable =
+					list_hasnonnullable(testnodes, parse);
+	isynonnullable =
+					list_hasnonnullable(subselect->targetList, subselect);
+	/*
+	 * We consider the conversion: t1.x not in (t2.y where p) => ANTI JOIN
+	 * t1, t2 on join condition (t1.x=t2.y or t2.y is null or t1.x is null) and p.
+	 * This is logically equivalent to NOT EXISTS (select 1 from t2 where
+	 * (t1.x=t2.y or t2.y is null or t1.x is null) and p).
+	 * In general, the above conversion does not result in a faster plan due to the
+	 * multiple ORs in the join condition. However, if we can remove t1.x is null (in
+	 * the case when x is nonnullable) or pull "OR x is null" out as "AND x is not null"
+	 * (in the case where p does not exist and x is not a multi-expr (x1,x2...xn))
+	 * we will have a much faster plan after the conversion because it's eligible to use
+	 * bitmap index scan and bitmap ORs or index scan(if y is nonnullable too) by
+	 * downstream optimizations.
+	 * Basically we aim to do the following conversions:
+	 * when p does not exist:
+	 * t1.x not in (t2.y) => ANTI JOIN t1(Filter: x is not null), t2 on join
+	 * condition: t1.x=t2.y or t2.y is null.
+	 * when x is nonnullable:
+	 * t1.x not in (t2.y where p) => ANTI JOIN t1, t2 on join
+	 * condition: (t1.x=t2.y or t2.y is null) and p.
+	 * So do NOT convert if (p exists and the test expr x is nullable) or
+	 * if x is a multi-expr (x1,x2,...xn) and has no nonnullable
+	 * var. In other words, we only do the conversion when
+	 * x is nonnull or (p does not exist and x is a single test expression
+	 * (i.e. quals is of type OpExpr. If x is a multiple expression,
+	 * qulas will be of type BoolExpr))
+	 */
+	if(!(isxnonnullable ||
+		(!hassubquals && IsA(*quals, OpExpr))))
+		return false;
+
+	/* If there is only one node in the testexpr, do the following conversion:
+	 * when p does not exist:
+	 * t1.x not in (t2.y) => ANTI JOIN t1(Filter: x is not null), t2 on
+	 * join condition: t1.x=t2.y or t2.y is null.
+	 * when x is nonnullable:
+	 * t1.x not in (t2.y where p) => ANTI JOIN t1, t2 on join condition:
+	 * (t1.x=t2.y or t2.y is null) and p.
+	 * Notice the case where p exits and x is nullable has been ruled out earlier.
+	 */
+	if(IsA(*quals, OpExpr))
+	{
+		/* add "OR y IS NULL" if y is nullable */
+		if(!isynonnullable)
+		{
+			/* make expr y IS NULL */
+			nt = makeNode(NullTest);
+			nt->arg = (Expr *)linitial(subquery_vars);
+			nt->nulltesttype = IS_NULL;
+			nt->argisrow = false;
+			nt->location = -1;
+
+			/* make orclause (x = y OR y IS NULL) */
+			*quals = (Node *)make_orclause(list_make2(*quals,
+								(Node *)nt));
+		}
+		/* pullout "AND x IS NOT NULL" if x is nullable */
+		if(!isxnonnullable)
+		{
+			/* make expr x IS NOT NULL */
+			nt = makeNode(NullTest);
+			nt->arg = (Expr *)linitial(testnodes);
+			nt->argisrow = false;
+			nt->nulltesttype = IS_NOT_NULL;
+			nt->location = -1;
+			*pullout = (Node *)nt;
+		}
+	}
+	/*
+	 * When there are multiple nodes in the testexpr:
+	 * i.e. (x1, x2, ... xn) not in (y1, y2, ... yn ) convert it to
+	 * ANTI JOIN t1, t2 on join condition: (t1.x1 = t2.y1 or
+	 * t2.y1 is NULL or t1.x1 is null) and ...(t1.xi = t2.yi or t2.yi is NULL) ...
+	 * and (t1.xn = t2.yn or t2.yn is NULL or t1.xn is NULL). Notice there will
+	 * be at least one xi that is nonnullable due to the nullability test earlier.
+	 * Downstream optimization will pick the shorter clause
+	 * (t1.xi = t2.yi or t2.yi is NULL) for bitmap index scan on t2 and bitmap OR
+	 * (or index scan on t2 if t2.yi is also nonnullable).
+	 */
+	else if(IsA(*quals, BoolExpr))
+	{
+		ListCell    *qualc;
+		ListCell    *xc = list_head(testnodes);
+		ListCell    *yc = list_head(subquery_vars);
+		ListCell    *ytlc = list_head(subselect->targetList);
+		List        *quallist = ((BoolExpr *)*quals)->args;
+		Node		*tempnode = NULL;
+
+		/* Reconstruct quals in the loop */
+		*quals = NULL;
+		foreach(qualc, quallist)
+		{
+			te = (TargetEntry *)lfirst(ytlc);
+			isynonnullable = is_node_nonnullable((Node*)te, subselect);
+			/* x_i = y_i */
+			tempnode = lfirst(qualc);
+			/* append y_i IS NULL if y_i nonnullable */
+			if(!isynonnullable)
+			{
+				/* make expr y_i IS NULL */
+				nt = makeNode(NullTest);
+				nt->arg = (Expr *)lfirst(yc);
+				nt->nulltesttype = IS_NULL;
+				nt->argisrow = false;
+				nt->location = -1;
+
+				/* make orclause (x_i = y_i OR y_i IS NULL) */
+				tempnode = (Node *)make_orclause(list_make2(lfirst(qualc),
+										(Node *)nt));
+			}
+			isxnonnullable = is_node_nonnullable(lfirst(xc), parse);
+			/*
+			 * append x_i IS NULL if x_i nonnullable, can't pull it out as
+			 * x_i is NOT NULL due to multiple AND clauses.
+			 */
+			if(!isxnonnullable)
+			{
+				/* make expr x_i IS NULL */
+				nt = makeNode(NullTest);
+				nt->arg = (Expr *)lfirst(xc);
+				nt->argisrow = false;
+				nt->nulltesttype = IS_NULL;
+				nt->location = -1;
+				/* make orclause (x_i = y_i OR y_i IS NULL OR x_i IS NULL) */
+				tempnode = (Node *)make_orclause(list_make2(tempnode,
+															(Node *)nt));
+			}
+			/*
+			 * Now append tempnode to quals as one andclause.
+			 * (x_i = y_i OR y_i IS NULL OR x_i IS NULL) AND
+			 * (x_j = y_j OR y_j IS NULL OR x_j IS NULL)...
+			 */
+			*quals = (Node *)make_andclause(list_make2(*quals, tempnode));
+			xc = lnext(xc);
+			yc = lnext(yc);
+			ytlc = lnext(ytlc);
+		}
+	}
+	/* quals should be either OpExpr or BoolExpr, otherwise don't convert */
+	else
+	{
+		return false;
+	}
+	return true;
+}
 
 /*
  * convert_ANY_sublink_to_join: try to convert an ANY SubLink to a join
@@ -1144,6 +1578,7 @@ inline_cte_walker(Node *node, inline_cte_walker_context *context)
  */
 JoinExpr *
 convert_ANY_sublink_to_join(PlannerInfo *root, SubLink *sublink,
+							bool under_not, Node **pullout,
 							Relids available_rels)
 {
 	JoinExpr   *result;
@@ -1224,11 +1659,22 @@ convert_ANY_sublink_to_join(PlannerInfo *root, SubLink *sublink,
 	 */
 	quals = convert_testexpr(root, sublink->testexpr, subquery_vars);
 
+	/*
+	 * Try converting x NOT IN (y) to ANTI JOIN.
+	 */
+	if(under_not &&
+			!convert_NOT_IN_to_join(root, &quals,
+					sublink, subquery_vars, pullout))
+	{
+		return NULL;
+	}
+
 	/*
 	 * And finally, build the JoinExpr node.
 	 */
 	result = makeNode(JoinExpr);
-	result->jointype = JOIN_SEMI;
+	/* NOT IN will be converted to ANTI JOIN */
+	result->jointype = under_not ? JOIN_ANTI : JOIN_SEMI;
 	result->isNatural = false;
 	result->larg = NULL;		/* caller must fill this in */
 	result->rarg = (Node *) rtr;
diff --git a/src/backend/optimizer/prep/prepjointree.c b/src/backend/optimizer/prep/prepjointree.c
index aebe162713..184d71cd7c 100644
--- a/src/backend/optimizer/prep/prepjointree.c
+++ b/src/backend/optimizer/prep/prepjointree.c
@@ -104,7 +104,7 @@ static Query *pullup_replace_vars_subquery(Query *query,
 static reduce_outer_joins_state *reduce_outer_joins_pass1(Node *jtnode);
 static void reduce_outer_joins_pass2(Node *jtnode,
 						 reduce_outer_joins_state *state,
-						 PlannerInfo *root,
+						 Query *parse,
 						 Relids nonnullable_rels,
 						 List *nonnullable_vars,
 						 List *forced_null_vars);
@@ -256,6 +256,13 @@ pull_up_sublinks_jointree_recurse(PlannerInfo *root, Node *jtnode,
 		}
 		/* Build the replacement FromExpr; no quals yet */
 		newf = makeFromExpr(newfromlist, NULL);
+		/*
+		 * Replace parse->jointree with newf now because we might modify join types
+		 * during reduce_outer_joins() in convert_NOT_IN_to_join() which is called
+		 * in pull_up_sublinks_qual_recurse() that's coming up next.
+		 */
+		newf->quals = f->quals;
+		root->parse->jointree = newf;
 		/* Set up a link representing the rebuilt jointree */
 		jtlink = (Node *) newf;
 		/* Now process qual --- all children are available for use */
@@ -388,7 +395,7 @@ pull_up_sublinks_qual_recurse(PlannerInfo *root, Node *node,
 		/* Is it a convertible ANY or EXISTS clause? */
 		if (sublink->subLinkType == ANY_SUBLINK)
 		{
-			if ((j = convert_ANY_sublink_to_join(root, sublink,
+			if ((j = convert_ANY_sublink_to_join(root, sublink, false, NULL,
 												 available_rels1)) != NULL)
 			{
 				/* Yes; insert the new join node into the join tree */
@@ -414,7 +421,7 @@ pull_up_sublinks_qual_recurse(PlannerInfo *root, Node *node,
 				return NULL;
 			}
 			if (available_rels2 != NULL &&
-				(j = convert_ANY_sublink_to_join(root, sublink,
+				(j = convert_ANY_sublink_to_join(root, sublink, false, NULL,
 												 available_rels2)) != NULL)
 			{
 				/* Yes; insert the new join node into the join tree */
@@ -560,6 +567,69 @@ pull_up_sublinks_qual_recurse(PlannerInfo *root, Node *node,
 					return NULL;
 				}
 			}
+			else if (sublink->subLinkType == ANY_SUBLINK)
+			{
+				Node *pullout = NULL;
+
+				if ((j = convert_ANY_sublink_to_join(root, sublink, true, &pullout,
+													 available_rels1)) != NULL)
+				{
+					/* Yes; insert the new join node into the join tree */
+					j->larg = *jtlink1;
+					*jtlink1 = (Node *) j;
+					/* Recursively process pulled-up jointree nodes */
+					j->rarg = pull_up_sublinks_jointree_recurse(root,
+																j->rarg,
+																&child_rels);
+
+					/*
+					 * Now recursively process the pulled-up quals.  Any inserted
+					 * joins can get stacked onto either j->larg or j->rarg,
+					 * depending on which rels they reference.
+					 */
+					j->quals = pull_up_sublinks_qual_recurse(root,
+															 j->quals,
+															 &j->larg,
+															 available_rels1,
+															 &j->rarg,
+															 child_rels);
+					/*
+					 * Return pullout predicate (x is NOT NULL) if it's not null,
+					 * otherwise return NULL representing constant TRUE.
+					 */
+					return pullout? pullout : NULL;
+				}
+				if (available_rels2 != NULL &&
+					(j = convert_ANY_sublink_to_join(root, sublink, true, &pullout,
+													 available_rels2)) != NULL)
+				{
+					Node *pullout = NULL;
+					/* Yes; insert the new join node into the join tree */
+					j->larg = *jtlink2;
+					*jtlink2 = (Node *) j;
+					/* Recursively process pulled-up jointree nodes */
+					j->rarg = pull_up_sublinks_jointree_recurse(root,
+																j->rarg,
+																&child_rels);
+
+					/*
+					 * Now recursively process the pulled-up quals.  Any inserted
+					 * joins can get stacked onto either j->larg or j->rarg,
+					 * depending on which rels they reference.
+					 */
+					j->quals = pull_up_sublinks_qual_recurse(root,
+															 j->quals,
+															 &j->larg,
+															 available_rels2,
+															 &j->rarg,
+															 child_rels);
+					/*
+					 * Return pullout predicate (x is NOT NULL) if it's not null,
+					 * otherwise return NULL representing constant TRUE.
+					 */
+					return pullout? pullout : NULL;
+				}
+			}
 		}
 		/* Else return it unmodified */
 		return node;
@@ -2377,7 +2447,7 @@ flatten_simple_union_all(PlannerInfo *root)
  * alias-var expansion).
  */
 void
-reduce_outer_joins(PlannerInfo *root)
+reduce_outer_joins(Query *parse)
 {
 	reduce_outer_joins_state *state;
 
@@ -2390,14 +2460,14 @@ reduce_outer_joins(PlannerInfo *root)
 	 * join(s) below each side of each join clause. The second pass examines
 	 * qual clauses and changes join types as it descends the tree.
 	 */
-	state = reduce_outer_joins_pass1((Node *) root->parse->jointree);
+	state = reduce_outer_joins_pass1((Node *) parse->jointree);
 
 	/* planner.c shouldn't have called me if no outer joins */
 	if (state == NULL || !state->contains_outer)
 		elog(ERROR, "so where are the outer joins?");
 
-	reduce_outer_joins_pass2((Node *) root->parse->jointree,
-							 state, root, NULL, NIL, NIL);
+	reduce_outer_joins_pass2((Node *) parse->jointree,
+							 state, parse, NULL, NIL, NIL);
 }
 
 /*
@@ -2480,7 +2550,7 @@ reduce_outer_joins_pass1(Node *jtnode)
 static void
 reduce_outer_joins_pass2(Node *jtnode,
 						 reduce_outer_joins_state *state,
-						 PlannerInfo *root,
+						 Query *parse,
 						 Relids nonnullable_rels,
 						 List *nonnullable_vars,
 						 List *forced_null_vars)
@@ -2520,7 +2590,7 @@ reduce_outer_joins_pass2(Node *jtnode,
 			reduce_outer_joins_state *sub_state = lfirst(s);
 
 			if (sub_state->contains_outer)
-				reduce_outer_joins_pass2(lfirst(l), sub_state, root,
+				reduce_outer_joins_pass2(lfirst(l), sub_state, parse,
 										 pass_nonnullable_rels,
 										 pass_nonnullable_vars,
 										 pass_forced_null_vars);
@@ -2632,7 +2702,7 @@ reduce_outer_joins_pass2(Node *jtnode,
 		/* Apply the jointype change, if any, to both jointree node and RTE */
 		if (rtindex && jointype != j->jointype)
 		{
-			RangeTblEntry *rte = rt_fetch(rtindex, root->parse->rtable);
+			RangeTblEntry *rte = rt_fetch(rtindex, parse->rtable);
 
 			Assert(rte->rtekind == RTE_JOIN);
 			Assert(rte->jointype == j->jointype);
@@ -2717,7 +2787,7 @@ reduce_outer_joins_pass2(Node *jtnode,
 					pass_nonnullable_vars = NIL;
 					pass_forced_null_vars = NIL;
 				}
-				reduce_outer_joins_pass2(j->larg, left_state, root,
+				reduce_outer_joins_pass2(j->larg, left_state, parse,
 										 pass_nonnullable_rels,
 										 pass_nonnullable_vars,
 										 pass_forced_null_vars);
@@ -2739,7 +2809,7 @@ reduce_outer_joins_pass2(Node *jtnode,
 					pass_nonnullable_vars = NIL;
 					pass_forced_null_vars = NIL;
 				}
-				reduce_outer_joins_pass2(j->rarg, right_state, root,
+				reduce_outer_joins_pass2(j->rarg, right_state, parse,
 										 pass_nonnullable_rels,
 										 pass_nonnullable_vars,
 										 pass_forced_null_vars);
diff --git a/src/backend/optimizer/util/relnode.c b/src/backend/optimizer/util/relnode.c
index 4130514952..96563bb180 100644
--- a/src/backend/optimizer/util/relnode.c
+++ b/src/backend/optimizer/util/relnode.c
@@ -28,6 +28,7 @@
 #include "optimizer/restrictinfo.h"
 #include "optimizer/tlist.h"
 #include "partitioning/partbounds.h"
+#include "parser/parsetree.h"
 #include "utils/hsearch.h"
 
 
@@ -65,6 +66,23 @@ static void build_child_join_reltarget(PlannerInfo *root,
 						   AppendRelInfo **appinfos);
 
 
+/* get RangeTable index from Oid of a relation and the Query Tree */
+Index get_relid_oid(Oid reloid, Query *parse)
+{
+	Index				 rti;
+	RangeTblEntry	*rte;
+
+	for(rti = 1; rti <  list_length(parse->rtable) + 1; rti++)
+	{
+		rte = rt_fetch(rti, parse->rtable);
+		if(rte->relid == reloid)
+		{
+			return rti;
+		}
+	}
+	return 0;
+}
+
 /*
  * setup_simple_rel_arrays
  *	  Prepare the arrays we use for quickly accessing base relations.
diff --git a/src/backend/utils/cache/lsyscache.c b/src/backend/utils/cache/lsyscache.c
index e88c45d268..0b9436628a 100644
--- a/src/backend/utils/cache/lsyscache.c
+++ b/src/backend/utils/cache/lsyscache.c
@@ -878,6 +878,35 @@ get_atttypetypmodcoll(Oid relid, AttrNumber attnum,
 	ReleaseSysCache(tp);
 }
 
+/*
+ * get_attnotnull
+ *		Given the relation id and the attribute number,
+ *		return the "attnotnull" field from the attribute relation.
+ */
+bool
+get_attnotnull(Oid relid, AttrNumber attnum)
+{
+	HeapTuple	  tp;
+	Form_pg_attribute att_tup;
+	bool              result;
+
+	tp = SearchSysCache2(ATTNUM,
+			ObjectIdGetDatum(relid),
+			Int16GetDatum(attnum));
+
+	if (!HeapTupleIsValid(tp))
+	{
+		elog(ERROR, "cache lookup failed for attribute %d of relation %u",
+			 attnum, relid);
+	}
+
+	att_tup = (Form_pg_attribute) GETSTRUCT(tp);
+
+	result = att_tup->attnotnull;
+	ReleaseSysCache(tp);
+	return result;
+}
+
 /*				---------- COLLATION CACHE ----------					 */
 
 /*
diff --git a/src/include/optimizer/pathnode.h b/src/include/optimizer/pathnode.h
index 574bb85b50..5e85be2eca 100644
--- a/src/include/optimizer/pathnode.h
+++ b/src/include/optimizer/pathnode.h
@@ -275,6 +275,7 @@ extern Path *reparameterize_path_by_child(PlannerInfo *root, Path *path,
 /*
  * prototypes for relnode.c
  */
+extern Index get_relid_oid(Oid reloid, Query *parse);
 extern void setup_simple_rel_arrays(PlannerInfo *root);
 extern void setup_append_rel_array(PlannerInfo *root);
 extern RelOptInfo *build_simple_rel(PlannerInfo *root, int relid,
diff --git a/src/include/optimizer/prep.h b/src/include/optimizer/prep.h
index a9b2c9026c..ec8af5abf4 100644
--- a/src/include/optimizer/prep.h
+++ b/src/include/optimizer/prep.h
@@ -26,7 +26,7 @@ extern void pull_up_sublinks(PlannerInfo *root);
 extern void inline_set_returning_functions(PlannerInfo *root);
 extern void pull_up_subqueries(PlannerInfo *root);
 extern void flatten_simple_union_all(PlannerInfo *root);
-extern void reduce_outer_joins(PlannerInfo *root);
+extern void reduce_outer_joins(Query *parse);
 extern void remove_useless_result_rtes(PlannerInfo *root);
 extern Relids get_relids_in_jointree(Node *jtnode, bool include_joins);
 extern Relids get_relids_for_join(Query *query, int joinrelid);
diff --git a/src/include/optimizer/subselect.h b/src/include/optimizer/subselect.h
index 2d2c3bcbc0..ea2abaa131 100644
--- a/src/include/optimizer/subselect.h
+++ b/src/include/optimizer/subselect.h
@@ -19,6 +19,8 @@
 extern void SS_process_ctes(PlannerInfo *root);
 extern JoinExpr *convert_ANY_sublink_to_join(PlannerInfo *root,
 							SubLink *sublink,
+							bool under_not,
+							Node ** pullout,
 							Relids available_rels);
 extern JoinExpr *convert_EXISTS_sublink_to_join(PlannerInfo *root,
 							   SubLink *sublink,
diff --git a/src/include/utils/lsyscache.h b/src/include/utils/lsyscache.h
index 16b0b1d2dc..71370d8877 100644
--- a/src/include/utils/lsyscache.h
+++ b/src/include/utils/lsyscache.h
@@ -89,6 +89,7 @@ extern AttrNumber get_attnum(Oid relid, const char *attname);
 extern Oid	get_atttype(Oid relid, AttrNumber attnum);
 extern void get_atttypetypmodcoll(Oid relid, AttrNumber attnum,
 					  Oid *typid, int32 *typmod, Oid *collid);
+extern bool get_attnotnull(Oid relid, AttrNumber attnum);
 extern char *get_collation_name(Oid colloid);
 extern char *get_constraint_name(Oid conoid);
 extern char *get_language_name(Oid langoid, bool missing_ok);
diff --git a/src/test/regress/expected/subselect.out b/src/test/regress/expected/subselect.out
index cc3f5f3737..1a36d6c150 100644
--- a/src/test/regress/expected/subselect.out
+++ b/src/test/regress/expected/subselect.out
@@ -1310,3 +1310,1232 @@ select * from x for update;
    Output: subselect_tbl.f1, subselect_tbl.f2, subselect_tbl.f3
 (2 rows)
 
+-- test NON IN to ANTI JOIN conversion
+CREATE TABLE s (u INTEGER NOT NULL, n INTEGER NULL, nn INTEGER NOT NULL, p VARCHAR(128) NULL);
+insert into s (u, n, nn, p)
+    select
+    generate_series(1,3) as u,
+	generate_series(1,3) as n,
+	generate_series(1,3) as nn,
+	'foo' as p;
+insert into s values(1000002, 1000002, 1000002, 'foofoo');
+UPDATE s set n = NULL WHERE n = 3;
+CREATE UNIQUE INDEX s_u ON s (u);
+CREATE INDEX s_n ON s (n);
+CREATE INDEX s_nn ON s (nn);
+CREATE TABLE l (u INTEGER NOT NULL, n INTEGER NULL, nn INTEGER NOT NULL, p VARCHAR(128) NULL);
+insert into l (u, n, nn, p)
+	select
+    generate_series(1,10000 ) as u,
+	generate_series(1,10000 ) as n,
+	generate_series(1,10000 ) as nn,
+	'bar' as p;
+UPDATE l set n = NULL WHERE n = 7;
+CREATE UNIQUE INDEX l_u ON l (u);
+CREATE INDEX l_n ON l (n);
+CREATE INDEX l_nn ON l (nn);
+CREATE TABLE s1 (u INTEGER NOT NULL, n INTEGER NULL, nn INTEGER NOT NULL, p VARCHAR(128) NULL);
+insert into s1 (u, n, nn, p)
+    select
+    generate_series(1,3) as u,
+	generate_series(1,3) as n,
+	generate_series(1,3) as nn,
+	'foo' as p;
+insert into s1 values(1000003, 1000003, 1000003, 'foofoo');
+insert into s1 values(1003, 1003, 1003, 'foofoo');
+UPDATE s1 set n = NULL WHERE n = 3;
+-- test single expression
+explain (costs false) select * from s where n not in (select n from l);
+                    QUERY PLAN                    
+--------------------------------------------------
+ Nested Loop Anti Join
+   ->  Seq Scan on s
+         Filter: (n IS NOT NULL)
+   ->  Bitmap Heap Scan on l
+         Recheck Cond: ((s.n = n) OR (n IS NULL))
+         ->  BitmapOr
+               ->  Bitmap Index Scan on l_n
+                     Index Cond: (n = s.n)
+               ->  Bitmap Index Scan on l_n
+                     Index Cond: (n IS NULL)
+(10 rows)
+
+select * from s where n not in (select n from l);
+ u | n | nn | p 
+---+---+----+---
+(0 rows)
+
+explain (costs false) select * from s where u not in (select u from l);
+              QUERY PLAN              
+--------------------------------------
+ Nested Loop Anti Join
+   ->  Seq Scan on s
+   ->  Index Only Scan using l_u on l
+         Index Cond: (u = s.u)
+(4 rows)
+
+select * from s where u not in (select u from l);
+    u    |    n    |   nn    |   p    
+---------+---------+---------+--------
+ 1000002 | 1000002 | 1000002 | foofoo
+(1 row)
+
+explain (costs false) select * from s where 3*n not in (select n from l);
+                       QUERY PLAN                       
+--------------------------------------------------------
+ Nested Loop Anti Join
+   ->  Seq Scan on s
+         Filter: ((3 * n) IS NOT NULL)
+   ->  Bitmap Heap Scan on l
+         Recheck Cond: (((3 * s.n) = n) OR (n IS NULL))
+         ->  BitmapOr
+               ->  Bitmap Index Scan on l_n
+                     Index Cond: (n = (3 * s.n))
+               ->  Bitmap Index Scan on l_n
+                     Index Cond: (n IS NULL)
+(10 rows)
+
+select * from s where 3*n not in (select n from l);
+ u | n | nn | p 
+---+---+----+---
+(0 rows)
+
+explain (costs false) select * from s where n not in (select 3*n from l);
+                        QUERY PLAN                         
+-----------------------------------------------------------
+ Nested Loop Anti Join
+   Join Filter: ((s.n = (3 * l.n)) OR ((3 * l.n) IS NULL))
+   ->  Seq Scan on s
+         Filter: (n IS NOT NULL)
+   ->  Materialize
+         ->  Seq Scan on l
+(6 rows)
+
+select * from s where n not in (select 3*n from l);
+ u | n | nn | p 
+---+---+----+---
+(0 rows)
+
+-- test multi expression
+explain (costs false) select * from s where (n,u) not in (select n,u from l);
+                         QUERY PLAN                          
+-------------------------------------------------------------
+ Nested Loop Anti Join
+   ->  Seq Scan on s
+   ->  Index Scan using l_u on l
+         Index Cond: (u = s.u)
+         Filter: ((s.n = n) OR (n IS NULL) OR (s.n IS NULL))
+(5 rows)
+
+select * from s where (n,u) not in (select n,u from l);
+    u    |    n    |   nn    |   p    
+---------+---------+---------+--------
+ 1000002 | 1000002 | 1000002 | foofoo
+(1 row)
+
+explain (costs false) select * from s where (u, nn) not in (select u, nn from l);
+           QUERY PLAN            
+---------------------------------
+ Nested Loop Anti Join
+   ->  Seq Scan on s
+   ->  Index Scan using l_u on l
+         Index Cond: (u = s.u)
+         Filter: (s.nn = nn)
+(5 rows)
+
+select * from s where (u, nn) not in (select u, nn from l);
+    u    |    n    |   nn    |   p    
+---------+---------+---------+--------
+ 1000002 | 1000002 | 1000002 | foofoo
+(1 row)
+
+explain (costs false) select * from s where (n,u) not in (select u,n from l);
+                    QUERY PLAN                    
+--------------------------------------------------
+ Nested Loop Anti Join
+   ->  Seq Scan on s
+   ->  Bitmap Heap Scan on l
+         Recheck Cond: ((s.u = n) OR (n IS NULL))
+         Filter: ((s.n = u) OR (s.n IS NULL))
+         ->  BitmapOr
+               ->  Bitmap Index Scan on l_n
+                     Index Cond: (n = s.u)
+               ->  Bitmap Index Scan on l_n
+                     Index Cond: (n IS NULL)
+(10 rows)
+
+select * from s where (n,u) not in (select u,n from l);
+    u    |    n    |   nn    |   p    
+---------+---------+---------+--------
+ 1000002 | 1000002 | 1000002 | foofoo
+(1 row)
+
+explain (costs false) select * from s where (n,u,nn) not in (select u,n,nn from l);
+                             QUERY PLAN                             
+--------------------------------------------------------------------
+ Nested Loop Anti Join
+   ->  Seq Scan on s
+   ->  Bitmap Heap Scan on l
+         Recheck Cond: ((s.nn = nn) AND ((s.u = n) OR (n IS NULL)))
+         Filter: ((s.n = u) OR (s.n IS NULL))
+         ->  BitmapAnd
+               ->  Bitmap Index Scan on l_nn
+                     Index Cond: (nn = s.nn)
+               ->  BitmapOr
+                     ->  Bitmap Index Scan on l_n
+                           Index Cond: (n = s.u)
+                     ->  Bitmap Index Scan on l_n
+                           Index Cond: (n IS NULL)
+(13 rows)
+
+select * from s where (n,u,nn) not in (select u,n,nn from l);
+    u    |    n    |   nn    |   p    
+---------+---------+---------+--------
+ 1000002 | 1000002 | 1000002 | foofoo
+(1 row)
+
+explain (costs false) select * from s where (n,u,nn) not in (select u,n,nn from l where u > 1000);
+                             QUERY PLAN                             
+--------------------------------------------------------------------
+ Nested Loop Anti Join
+   ->  Seq Scan on s
+   ->  Bitmap Heap Scan on l
+         Recheck Cond: ((s.nn = nn) AND ((s.u = n) OR (n IS NULL)))
+         Filter: ((u > 1000) AND ((s.n = u) OR (s.n IS NULL)))
+         ->  BitmapAnd
+               ->  Bitmap Index Scan on l_nn
+                     Index Cond: (nn = s.nn)
+               ->  BitmapOr
+                     ->  Bitmap Index Scan on l_n
+                           Index Cond: (n = s.u)
+                     ->  Bitmap Index Scan on l_n
+                           Index Cond: (n IS NULL)
+(13 rows)
+
+select * from s where (n,u,nn) not in (select u,n,nn from l where u > 1000);
+    u    |    n    |   nn    |   p    
+---------+---------+---------+--------
+       1 |       1 |       1 | foo
+       2 |       2 |       2 | foo
+ 1000002 | 1000002 | 1000002 | foofoo
+       3 |         |       3 | foo
+(4 rows)
+
+-- test multi-table
+explain (costs false) select count(*) from s, l where s.n not in (select n from l);
+                             QUERY PLAN                             
+--------------------------------------------------------------------
+ Aggregate
+   ->  Nested Loop
+         ->  Seq Scan on l
+         ->  Materialize
+               ->  Nested Loop Anti Join
+                     ->  Seq Scan on s
+                           Filter: (n IS NOT NULL)
+                     ->  Bitmap Heap Scan on l l_1
+                           Recheck Cond: ((s.n = n) OR (n IS NULL))
+                           ->  BitmapOr
+                                 ->  Bitmap Index Scan on l_n
+                                       Index Cond: (n = s.n)
+                                 ->  Bitmap Index Scan on l_n
+                                       Index Cond: (n IS NULL)
+(14 rows)
+
+select count(*) from s, l where s.n not in (select n from l);
+ count 
+-------
+     0
+(1 row)
+
+explain (costs false) select count(*) from s, l where s.nn not in (select nn from l);
+                         QUERY PLAN                          
+-------------------------------------------------------------
+ Aggregate
+   ->  Nested Loop
+         ->  Seq Scan on l
+         ->  Materialize
+               ->  Nested Loop Anti Join
+                     ->  Seq Scan on s
+                     ->  Index Only Scan using l_nn on l l_1
+                           Index Cond: (nn = s.nn)
+(8 rows)
+
+select count(*) from s, l where s.nn not in (select nn from l);
+ count 
+-------
+ 10000
+(1 row)
+
+-- test null padded results from outer join
+explain (costs false) select * from s where n not in (select s.nn from l left join s on l.nn = s.nn);
+                     QUERY PLAN                      
+-----------------------------------------------------
+ Nested Loop Anti Join
+   Join Filter: ((s.n = s_1.nn) OR (s_1.nn IS NULL))
+   ->  Seq Scan on s
+         Filter: (n IS NOT NULL)
+   ->  Hash Left Join
+         Hash Cond: (l.nn = s_1.nn)
+         ->  Seq Scan on l
+         ->  Hash
+               ->  Seq Scan on s s_1
+(9 rows)
+
+select * from s where n not in (select s.nn from l left join s on l.nn = s.nn);
+ u | n | nn | p 
+---+---+----+---
+(0 rows)
+
+explain (costs false) select * from s where n not in (select s.nn from s right join l on s.nn = l.nn);
+                     QUERY PLAN                      
+-----------------------------------------------------
+ Nested Loop Anti Join
+   Join Filter: ((s.n = s_1.nn) OR (s_1.nn IS NULL))
+   ->  Seq Scan on s
+         Filter: (n IS NOT NULL)
+   ->  Hash Left Join
+         Hash Cond: (l.nn = s_1.nn)
+         ->  Seq Scan on l
+         ->  Hash
+               ->  Seq Scan on s s_1
+(9 rows)
+
+select * from s where n not in (select s.nn from s right join l on s.nn = l.nn);
+ u | n | nn | p 
+---+---+----+---
+(0 rows)
+
+explain (costs false) select count(*) from s right join l on s.nn = l.nn where l.nn not in (select nn from s);
+                QUERY PLAN                 
+-------------------------------------------
+ Aggregate
+   ->  Hash Left Join
+         Hash Cond: (l.nn = s.nn)
+         ->  Hash Anti Join
+               Hash Cond: (l.nn = s_1.nn)
+               ->  Seq Scan on l
+               ->  Hash
+                     ->  Seq Scan on s s_1
+         ->  Hash
+               ->  Seq Scan on s
+(10 rows)
+
+select count(*) from s right join l on s.nn = l.nn where l.nn not in (select nn from s);
+ count 
+-------
+  9997
+(1 row)
+
+explain (costs false) select count(*) from s right join l on s.nn = l.nn where s.nn not in (select nn from s);
+                  QUERY PLAN                  
+----------------------------------------------
+ Aggregate
+   ->  Nested Loop
+         ->  Hash Anti Join
+               Hash Cond: (s.nn = s_1.nn)
+               ->  Seq Scan on s
+                     Filter: (nn IS NOT NULL)
+               ->  Hash
+                     ->  Seq Scan on s s_1
+         ->  Bitmap Heap Scan on l
+               Recheck Cond: (nn = s.nn)
+               ->  Bitmap Index Scan on l_nn
+                     Index Cond: (nn = s.nn)
+(12 rows)
+
+select count(*) from s right join l on s.nn = l.nn where s.nn not in (select nn from s);
+ count 
+-------
+     0
+(1 row)
+
+explain (costs false) select count(*) from s right join l on s.nn=l.nn where l.nn not in (select l.nn from l left join s on l.nn = s.nn);
+                       QUERY PLAN                       
+--------------------------------------------------------
+ Aggregate
+   ->  Hash Left Join
+         Hash Cond: (l.nn = s.nn)
+         ->  Hash Anti Join
+               Hash Cond: (l.nn = l_1.nn)
+               ->  Seq Scan on l
+               ->  Hash
+                     ->  Hash Left Join
+                           Hash Cond: (l_1.nn = s_1.nn)
+                           ->  Seq Scan on l l_1
+                           ->  Hash
+                                 ->  Seq Scan on s s_1
+         ->  Hash
+               ->  Seq Scan on s
+(14 rows)
+
+select count(*) from s right join l on s.nn=l.nn where l.nn not in (select l.nn from l left join s on l.nn = s.nn);
+ count 
+-------
+     0
+(1 row)
+
+explain (costs false) select count(*) from s right join l on s.nn=l.nn where s.nn not in (select s.nn from l left join s on l.nn = s.nn);
+                            QUERY PLAN                            
+------------------------------------------------------------------
+ Aggregate
+   ->  Nested Loop
+         ->  Nested Loop Anti Join
+               Join Filter: ((s.nn = s_1.nn) OR (s_1.nn IS NULL))
+               ->  Seq Scan on s
+                     Filter: (nn IS NOT NULL)
+               ->  Hash Left Join
+                     Hash Cond: (l_1.nn = s_1.nn)
+                     ->  Seq Scan on l l_1
+                     ->  Hash
+                           ->  Seq Scan on s s_1
+         ->  Bitmap Heap Scan on l
+               Recheck Cond: (nn = s.nn)
+               ->  Bitmap Index Scan on l_nn
+                     Index Cond: (nn = s.nn)
+(15 rows)
+
+select count(*) from s right join l on s.nn=l.nn where s.nn not in (select s.nn from l left join s on l.nn = s.nn);
+ count 
+-------
+     0
+(1 row)
+
+explain (costs false) select count(*) from s left join s1 on s.u=s1.u join l on s.u=l.u where s.nn not in (select nn from l);
+                            QUERY PLAN                             
+-------------------------------------------------------------------
+ Aggregate
+   ->  Hash Right Join
+         Hash Cond: (s1.u = s.u)
+         ->  Seq Scan on s1
+         ->  Hash
+               ->  Nested Loop
+                     ->  Nested Loop Anti Join
+                           ->  Seq Scan on s
+                           ->  Index Only Scan using l_nn on l l_1
+                                 Index Cond: (nn = s.nn)
+                     ->  Index Only Scan using l_u on l
+                           Index Cond: (u = s.u)
+(12 rows)
+
+select count(*) from s left join s1 on s.u=s1.u join l on s.u=l.u where s.nn not in (select nn from l);
+ count 
+-------
+     0
+(1 row)
+
+explain (costs false) select count(*) from s left join s1 on s.u=s1.u right join l on s.u=l.u where s.nn not in (select nn from l);
+                            QUERY PLAN                             
+-------------------------------------------------------------------
+ Aggregate
+   ->  Hash Right Join
+         Hash Cond: (s1.u = s.u)
+         ->  Seq Scan on s1
+         ->  Hash
+               ->  Nested Loop
+                     ->  Nested Loop Anti Join
+                           ->  Seq Scan on s
+                                 Filter: (nn IS NOT NULL)
+                           ->  Index Only Scan using l_nn on l l_1
+                                 Index Cond: (nn = s.nn)
+                     ->  Index Only Scan using l_u on l
+                           Index Cond: (u = s.u)
+(13 rows)
+
+select count(*) from s left join s1 on s.u=s1.u right join l on s.u=l.u where s.nn not in (select nn from l);
+ count 
+-------
+     0
+(1 row)
+
+explain (costs false) select count(*) from s left join s1 on s.u=s1.u left join l on s.u=l.u where s.nn not in (select nn from l);
+                       QUERY PLAN                        
+---------------------------------------------------------
+ Aggregate
+   ->  Hash Right Join
+         Hash Cond: (s1.u = s.u)
+         ->  Seq Scan on s1
+         ->  Hash
+               ->  Nested Loop Anti Join
+                     ->  Seq Scan on s
+                     ->  Index Only Scan using l_nn on l
+                           Index Cond: (nn = s.nn)
+(9 rows)
+
+select count(*) from s left join s1 on s.u=s1.u left join l on s.u=l.u where s.nn not in (select nn from l);
+ count 
+-------
+     1
+(1 row)
+
+explain (costs false) select count(*) from s right join s1 on s.u=s1.u join l on s.u=l.u where s.nn not in (select nn from l);
+                            QUERY PLAN                             
+-------------------------------------------------------------------
+ Aggregate
+   ->  Nested Loop
+         Join Filter: (s.u = l.u)
+         ->  Hash Join
+               Hash Cond: (s1.u = s.u)
+               ->  Seq Scan on s1
+               ->  Hash
+                     ->  Nested Loop Anti Join
+                           ->  Seq Scan on s
+                           ->  Index Only Scan using l_nn on l l_1
+                                 Index Cond: (nn = s.nn)
+         ->  Index Only Scan using l_u on l
+               Index Cond: (u = s1.u)
+(13 rows)
+
+select count(*) from s right join s1 on s.u=s1.u join l on s.u=l.u where s.nn not in (select nn from l);
+ count 
+-------
+     0
+(1 row)
+
+explain (costs false) select count(*) from s join s1 on s.u=s1.u right join l on s.u=l.u where s.nn not in (select nn from l);
+                            QUERY PLAN                             
+-------------------------------------------------------------------
+ Aggregate
+   ->  Nested Loop
+         Join Filter: (s.u = l.u)
+         ->  Hash Join
+               Hash Cond: (s1.u = s.u)
+               ->  Seq Scan on s1
+               ->  Hash
+                     ->  Nested Loop Anti Join
+                           ->  Seq Scan on s
+                                 Filter: (nn IS NOT NULL)
+                           ->  Index Only Scan using l_nn on l l_1
+                                 Index Cond: (nn = s.nn)
+         ->  Index Only Scan using l_u on l
+               Index Cond: (u = s1.u)
+(14 rows)
+
+select * from s join s1 on s.u=s1.u right join l on s.u=l.u where s.nn not in (select nn from l);
+ u | n | nn | p | u | n | nn | p | u | n | nn | p 
+---+---+----+---+---+---+----+---+---+---+----+---
+(0 rows)
+
+explain (costs false) select count(*) from s full join s1 on s.u=s1.u join l on s.u=l.u where s.nn not in (select nn from l);
+                            QUERY PLAN                             
+-------------------------------------------------------------------
+ Aggregate
+   ->  Hash Right Join
+         Hash Cond: (s1.u = s.u)
+         ->  Seq Scan on s1
+         ->  Hash
+               ->  Nested Loop
+                     ->  Nested Loop Anti Join
+                           ->  Seq Scan on s
+                           ->  Index Only Scan using l_nn on l l_1
+                                 Index Cond: (nn = s.nn)
+                     ->  Index Only Scan using l_u on l
+                           Index Cond: (u = s.u)
+(12 rows)
+
+select count(*) from s full join s1 on s.u=s1.u join l on s.u=l.u where s.nn not in (select nn from l);
+ count 
+-------
+     0
+(1 row)
+
+explain (costs false) select count(*) from s join s1 on s.u=s1.u full join l on s.u=l.u where s.nn not in (select nn from l);
+                       QUERY PLAN                        
+---------------------------------------------------------
+ Aggregate
+   ->  Hash Join
+         Hash Cond: (s1.u = s.u)
+         ->  Seq Scan on s1
+         ->  Hash
+               ->  Nested Loop Anti Join
+                     ->  Seq Scan on s
+                           Filter: (nn IS NOT NULL)
+                     ->  Index Only Scan using l_nn on l
+                           Index Cond: (nn = s.nn)
+(10 rows)
+
+select count(*) from s join s1 on s.u=s1.u full join l on s.u=l.u where s.nn not in (select nn from l);
+ count 
+-------
+     0
+(1 row)
+
+explain (costs false) select * from s where s.nn not in (select l.nn from l left join s on l.nn=s.nn left join s1 on l.nn=s1.nn);
+                       QUERY PLAN                        
+---------------------------------------------------------
+ Nested Loop Anti Join
+   ->  Seq Scan on s
+   ->  Hash Right Join
+         Hash Cond: (s1.nn = l.nn)
+         ->  Seq Scan on s1
+         ->  Hash
+               ->  Hash Left Join
+                     Hash Cond: (l.nn = s_1.nn)
+                     ->  Bitmap Heap Scan on l
+                           Recheck Cond: (s.nn = nn)
+                           ->  Bitmap Index Scan on l_nn
+                                 Index Cond: (nn = s.nn)
+                     ->  Hash
+                           ->  Seq Scan on s s_1
+(14 rows)
+
+select * from s where s.nn not in (select l.nn from l left join s on l.nn=s.nn left join s1 on l.nn=s1.nn);
+    u    |    n    |   nn    |   p    
+---------+---------+---------+--------
+ 1000002 | 1000002 | 1000002 | foofoo
+(1 row)
+
+explain (costs false) select * from s where s.nn not in (select l.nn from l left join s on l.nn=s.nn right join s1 on l.nn=s1.nn);
+                       QUERY PLAN                        
+---------------------------------------------------------
+ Nested Loop Anti Join
+   Join Filter: ((s.nn = l.nn) OR (l.nn IS NULL))
+   ->  Seq Scan on s
+   ->  Materialize
+         ->  Merge Right Join
+               Merge Cond: (l.nn = s1.nn)
+               ->  Merge Left Join
+                     Merge Cond: (l.nn = s_1.nn)
+                     ->  Index Only Scan using l_nn on l
+                     ->  Sort
+                           Sort Key: s_1.nn
+                           ->  Seq Scan on s s_1
+               ->  Sort
+                     Sort Key: s1.nn
+                     ->  Seq Scan on s1
+(15 rows)
+
+select * from s where s.nn not in (select l.nn from l left join s on l.nn=s.nn right join s1 on l.nn=s1.nn);
+ u | n | nn | p 
+---+---+----+---
+(0 rows)
+
+explain (costs false) select * from s where (n,u,nn) not in (select l.n,l.u,l.nn from l left join s on l.nn = s.nn);
+                                     QUERY PLAN                                      
+-------------------------------------------------------------------------------------
+ Nested Loop Anti Join
+   ->  Seq Scan on s
+   ->  Nested Loop Left Join
+         ->  Index Scan using l_u on l
+               Index Cond: (u = s.u)
+               Filter: (((s.n = n) OR (n IS NULL) OR (s.n IS NULL)) AND (s.nn = nn))
+         ->  Index Only Scan using s_nn on s s_1
+               Index Cond: (nn = l.nn)
+(8 rows)
+
+select * from s where (n,u,nn) not in (select l.n,l.u,l.nn from l left join s on l.nn = s.nn);
+    u    |    n    |   nn    |   p    
+---------+---------+---------+--------
+ 1000002 | 1000002 | 1000002 | foofoo
+(1 row)
+
+explain (costs false) select * from s where (n,u,nn) not in (select l.n,l.u,l.nn from l right join s on l.nn = s.nn);
+                                                                QUERY PLAN                                                                 
+-------------------------------------------------------------------------------------------------------------------------------------------
+ Nested Loop Anti Join
+   Join Filter: (((s.n = l.n) OR (l.n IS NULL) OR (s.n IS NULL)) AND ((s.u = l.u) OR (l.u IS NULL)) AND ((s.nn = l.nn) OR (l.nn IS NULL)))
+   ->  Seq Scan on s
+   ->  Materialize
+         ->  Nested Loop Left Join
+               ->  Seq Scan on s s_1
+               ->  Bitmap Heap Scan on l
+                     Recheck Cond: (nn = s_1.nn)
+                     ->  Bitmap Index Scan on l_nn
+                           Index Cond: (nn = s_1.nn)
+(10 rows)
+
+select * from s where (n,u,nn) not in (select l.n,l.u,l.nn from l left join s on l.nn = s.nn);
+    u    |    n    |   nn    |   p    
+---------+---------+---------+--------
+ 1000002 | 1000002 | 1000002 | foofoo
+(1 row)
+
+--test reduce outer joins from outer query
+explain (costs false) select count(*) from s right join l on s.nn = l.nn where s.nn not in (select nn from l);
+                      QUERY PLAN                       
+-------------------------------------------------------
+ Aggregate
+   ->  Nested Loop
+         ->  Nested Loop Anti Join
+               ->  Seq Scan on s
+                     Filter: (nn IS NOT NULL)
+               ->  Index Only Scan using l_nn on l l_1
+                     Index Cond: (nn = s.nn)
+         ->  Bitmap Heap Scan on l
+               Recheck Cond: (nn = s.nn)
+               ->  Bitmap Index Scan on l_nn
+                     Index Cond: (nn = s.nn)
+(11 rows)
+
+select count(*) from s right join l on s.nn = l.nn where s.nn not in (select nn from l);
+ count 
+-------
+     0
+(1 row)
+
+explain (costs false) select count(*) from s right join l on s.nn = l.nn where s.nn not in (select nn from l) and s.u>0;
+                      QUERY PLAN                       
+-------------------------------------------------------
+ Aggregate
+   ->  Nested Loop
+         ->  Nested Loop Anti Join
+               ->  Seq Scan on s
+                     Filter: (u > 0)
+               ->  Index Only Scan using l_nn on l l_1
+                     Index Cond: (nn = s.nn)
+         ->  Bitmap Heap Scan on l
+               Recheck Cond: (nn = s.nn)
+               ->  Bitmap Index Scan on l_nn
+                     Index Cond: (nn = s.nn)
+(11 rows)
+
+select count(*) from s right join l on s.nn = l.nn where s.nn not in (select nn from l) and s.u>0;
+ count 
+-------
+     0
+(1 row)
+
+explain (costs false) select count(*) from s right join l on s.nn = l.nn join s1 on s.u = s1.u where s.nn not in (select nn from l);
+                         QUERY PLAN                          
+-------------------------------------------------------------
+ Aggregate
+   ->  Hash Join
+         Hash Cond: (s.u = s1.u)
+         ->  Nested Loop
+               ->  Nested Loop Anti Join
+                     ->  Seq Scan on s
+                     ->  Index Only Scan using l_nn on l l_1
+                           Index Cond: (nn = s.nn)
+               ->  Bitmap Heap Scan on l
+                     Recheck Cond: (nn = s.nn)
+                     ->  Bitmap Index Scan on l_nn
+                           Index Cond: (nn = s.nn)
+         ->  Hash
+               ->  Seq Scan on s1
+(14 rows)
+
+select count(*) from s right join l on s.nn = l.nn join s1 on s.u = s1.u where s.nn not in (select nn from l);
+ count 
+-------
+     0
+(1 row)
+
+explain (costs false) select count(*) from s right join l on s.nn = l.nn right join s1 on s.u = s1.u where s.nn not in (select nn from l);
+                         QUERY PLAN                          
+-------------------------------------------------------------
+ Aggregate
+   ->  Hash Join
+         Hash Cond: (s.u = s1.u)
+         ->  Nested Loop
+               ->  Nested Loop Anti Join
+                     ->  Seq Scan on s
+                           Filter: (nn IS NOT NULL)
+                     ->  Index Only Scan using l_nn on l l_1
+                           Index Cond: (nn = s.nn)
+               ->  Bitmap Heap Scan on l
+                     Recheck Cond: (nn = s.nn)
+                     ->  Bitmap Index Scan on l_nn
+                           Index Cond: (nn = s.nn)
+         ->  Hash
+               ->  Seq Scan on s1
+(15 rows)
+
+select count(*) from s right join l on s.nn = l.nn right join s1 on s.u = s1.u where s.nn not in (select nn from l);
+ count 
+-------
+     0
+(1 row)
+
+explain (costs false) select count(*) from s right join l on s.nn = l.nn left join s1 on s.u = s1.u where s.nn not in (select nn from l);
+                         QUERY PLAN                          
+-------------------------------------------------------------
+ Aggregate
+   ->  Hash Left Join
+         Hash Cond: (s.u = s1.u)
+         ->  Nested Loop
+               ->  Nested Loop Anti Join
+                     ->  Seq Scan on s
+                           Filter: (nn IS NOT NULL)
+                     ->  Index Only Scan using l_nn on l l_1
+                           Index Cond: (nn = s.nn)
+               ->  Bitmap Heap Scan on l
+                     Recheck Cond: (nn = s.nn)
+                     ->  Bitmap Index Scan on l_nn
+                           Index Cond: (nn = s.nn)
+         ->  Hash
+               ->  Seq Scan on s1
+(15 rows)
+
+select count(*) from s right join l on s.nn = l.nn left join s1 on s.u = s1.u where s.nn not in (select nn from l);
+ count 
+-------
+     0
+(1 row)
+
+--test reduce outer joins from subquery
+explain (costs false) select * from s where nn not in (select l.nn from l right join s on l.nn = s.nn);
+                     QUERY PLAN                      
+-----------------------------------------------------
+ Nested Loop Anti Join
+   Join Filter: ((s.nn = l.nn) OR (l.nn IS NULL))
+   ->  Seq Scan on s
+   ->  Materialize
+         ->  Nested Loop Left Join
+               ->  Seq Scan on s s_1
+               ->  Bitmap Heap Scan on l
+                     Recheck Cond: (nn = s_1.nn)
+                     ->  Bitmap Index Scan on l_nn
+                           Index Cond: (nn = s_1.nn)
+(10 rows)
+
+select * from s where nn not in (select l.nn from l right join s on l.nn = s.nn);
+ u | n | nn | p 
+---+---+----+---
+(0 rows)
+
+explain (costs false) select * from s where nn not in (select l.nn from l right join s on l.nn = s.nn where l.u > 9);
+                        QUERY PLAN                         
+-----------------------------------------------------------
+ Nested Loop Anti Join
+   ->  Seq Scan on s
+   ->  Nested Loop
+         ->  Seq Scan on s s_1
+         ->  Index Scan using l_nn on l
+               Index Cond: ((nn = s.nn) AND (nn = s_1.nn))
+               Filter: (u > 9)
+(7 rows)
+
+select * from s where nn not in (select l.nn from l right join s on l.nn = s.nn where l.u > 9);
+    u    |    n    |   nn    |   p    
+---------+---------+---------+--------
+       1 |       1 |       1 | foo
+       2 |       2 |       2 | foo
+ 1000002 | 1000002 | 1000002 | foofoo
+       3 |         |       3 | foo
+(4 rows)
+
+explain (costs false) select * from s where nn not in (select l.nn from l right join s on l.nn = s.nn where s.u > 9);
+                     QUERY PLAN                      
+-----------------------------------------------------
+ Nested Loop Anti Join
+   Join Filter: ((s.nn = l.nn) OR (l.nn IS NULL))
+   ->  Seq Scan on s
+   ->  Materialize
+         ->  Nested Loop Left Join
+               ->  Seq Scan on s s_1
+                     Filter: (u > 9)
+               ->  Bitmap Heap Scan on l
+                     Recheck Cond: (nn = s_1.nn)
+                     ->  Bitmap Index Scan on l_nn
+                           Index Cond: (nn = s_1.nn)
+(11 rows)
+
+select * from s where nn not in (select l.nn from l right join s on l.nn = s.nn where s.u > 9);
+ u | n | nn | p 
+---+---+----+---
+(0 rows)
+
+explain (costs false) select * from s where nn not in (select l.nn from l right join s on l.nn = s.nn join s1 on l.n = s1.n);
+                              QUERY PLAN                               
+-----------------------------------------------------------------------
+ Nested Loop Anti Join
+   ->  Seq Scan on s
+   ->  Hash Join
+         Hash Cond: (s1.n = l.n)
+         ->  Seq Scan on s1
+         ->  Hash
+               ->  Nested Loop
+                     ->  Seq Scan on s s_1
+                     ->  Index Scan using l_nn on l
+                           Index Cond: ((nn = s.nn) AND (nn = s_1.nn))
+(10 rows)
+
+select * from s where nn not in (select l.nn from l right join s on l.nn = s.nn join s1 on l.n = s1.n);
+    u    |    n    |   nn    |   p    
+---------+---------+---------+--------
+ 1000002 | 1000002 | 1000002 | foofoo
+       3 |         |       3 | foo
+(2 rows)
+
+explain (costs false) select * from s where nn not in (select l.nn from l right join s on l.nn = s.nn right join s1 on l.n = s1.n);
+                           QUERY PLAN                            
+-----------------------------------------------------------------
+ Nested Loop Anti Join
+   Join Filter: ((s.nn = l.nn) OR (l.nn IS NULL))
+   ->  Seq Scan on s
+   ->  Materialize
+         ->  Hash Left Join
+               Hash Cond: (s1.n = l.n)
+               ->  Seq Scan on s1
+               ->  Hash
+                     ->  Nested Loop
+                           ->  Seq Scan on s s_1
+                           ->  Bitmap Heap Scan on l
+                                 Recheck Cond: (nn = s_1.nn)
+                                 ->  Bitmap Index Scan on l_nn
+                                       Index Cond: (nn = s_1.nn)
+(14 rows)
+
+select * from s where nn not in (select l.nn from l right join s on l.nn = s.nn right join s1 on l.n = s1.n);
+ u | n | nn | p 
+---+---+----+---
+(0 rows)
+
+explain (costs false) select * from s where nn not in (select l.nn from l right join s on l.nn = s.nn left join s1 on l.n = s1.n);
+                           QUERY PLAN                            
+-----------------------------------------------------------------
+ Nested Loop Anti Join
+   Join Filter: ((s.nn = l.nn) OR (l.nn IS NULL))
+   ->  Seq Scan on s
+   ->  Materialize
+         ->  Nested Loop Left Join
+               ->  Seq Scan on s s_1
+               ->  Hash Right Join
+                     Hash Cond: (s1.n = l.n)
+                     ->  Seq Scan on s1
+                     ->  Hash
+                           ->  Bitmap Heap Scan on l
+                                 Recheck Cond: (nn = s_1.nn)
+                                 ->  Bitmap Index Scan on l_nn
+                                       Index Cond: (nn = s_1.nn)
+(14 rows)
+
+select * from s where nn not in (select l.nn from l right join s on l.nn = s.nn left join s1 on l.n = s1.n);
+ u | n | nn | p 
+---+---+----+---
+(0 rows)
+
+--test reduce outer join on outer and sub-query
+explain (costs false) select count(*) from s right join l on s.nn = l.nn join s1 on s.u = s1.u where s.nn not in (select l.nn from l right join s on l.nn = s.nn join s1 on l.n = s1.n);
+                                       QUERY PLAN                                        
+-----------------------------------------------------------------------------------------
+ Aggregate
+   ->  Hash Join
+         Hash Cond: (s.u = s1.u)
+         ->  Nested Loop
+               ->  Nested Loop Anti Join
+                     ->  Seq Scan on s
+                     ->  Hash Join
+                           Hash Cond: (s1_1.n = l_1.n)
+                           ->  Seq Scan on s1 s1_1
+                           ->  Hash
+                                 ->  Nested Loop
+                                       ->  Seq Scan on s s_1
+                                       ->  Index Scan using l_nn on l l_1
+                                             Index Cond: ((nn = s.nn) AND (nn = s_1.nn))
+               ->  Bitmap Heap Scan on l
+                     Recheck Cond: (nn = s.nn)
+                     ->  Bitmap Index Scan on l_nn
+                           Index Cond: (nn = s.nn)
+         ->  Hash
+               ->  Seq Scan on s1
+(20 rows)
+
+select count(*) from s right join l on s.nn = l.nn join s1 on s.u = s1.u where s.nn not in (select l.nn from l right join s on l.nn = s.nn join s1 on l.n = s1.n);
+ count 
+-------
+     1
+(1 row)
+
+explain (costs false) select count(*) from s right join l on s.nn = l.nn left join s1 on s.u = s1.u where s.nn not in (select l.nn from l right join s on l.nn = s.nn left join s1 on l.n = s1.n);
+                                    QUERY PLAN                                     
+-----------------------------------------------------------------------------------
+ Aggregate
+   ->  Hash Left Join
+         Hash Cond: (s.u = s1.u)
+         ->  Nested Loop
+               ->  Nested Loop Anti Join
+                     Join Filter: ((s.nn = l_1.nn) OR (l_1.nn IS NULL))
+                     ->  Seq Scan on s
+                           Filter: (nn IS NOT NULL)
+                     ->  Materialize
+                           ->  Nested Loop Left Join
+                                 ->  Seq Scan on s s_1
+                                 ->  Hash Right Join
+                                       Hash Cond: (s1_1.n = l_1.n)
+                                       ->  Seq Scan on s1 s1_1
+                                       ->  Hash
+                                             ->  Bitmap Heap Scan on l l_1
+                                                   Recheck Cond: (nn = s_1.nn)
+                                                   ->  Bitmap Index Scan on l_nn
+                                                         Index Cond: (nn = s_1.nn)
+               ->  Bitmap Heap Scan on l
+                     Recheck Cond: (nn = s.nn)
+                     ->  Bitmap Index Scan on l_nn
+                           Index Cond: (nn = s.nn)
+         ->  Hash
+               ->  Seq Scan on s1
+(25 rows)
+
+select count(*) from s right join l on s.nn = l.nn left join s1 on s.u = s1.u where s.nn not in (select l.nn from l right join s on l.nn = s.nn left join s1 on l.n = s1.n);
+ count 
+-------
+     0
+(1 row)
+
+-- test union all
+explain (costs false) select * from s as t where not exists
+(select 1 from (select n as y from l union all
+				select u as y from s union all
+				select nn as y from s) as v where t.n=v.y or v.y is null) and n is not null;
+                       QUERY PLAN                       
+--------------------------------------------------------
+ Nested Loop Anti Join
+   ->  Seq Scan on s t
+         Filter: (n IS NOT NULL)
+   ->  Append
+         ->  Bitmap Heap Scan on l
+               Recheck Cond: ((t.n = n) OR (n IS NULL))
+               ->  BitmapOr
+                     ->  Bitmap Index Scan on l_n
+                           Index Cond: (n = t.n)
+                     ->  Bitmap Index Scan on l_n
+                           Index Cond: (n IS NULL)
+         ->  Seq Scan on s
+               Filter: ((t.n = u) OR (u IS NULL))
+         ->  Seq Scan on s s_1
+               Filter: ((t.n = nn) OR (nn IS NULL))
+(15 rows)
+
+select * from s as t where not exists
+(select 1 from (select n as y from l union all
+				select u as y from s union all
+				select nn as y from s) as v where t.n=v.y or v.y is null) and n is not null;
+ u | n | nn | p 
+---+---+----+---
+(0 rows)
+
+explain (costs false) select * from s where n not in
+(select n as y from l union all
+ select u as y from s union all
+ select nn as y from s);
+                       QUERY PLAN                       
+--------------------------------------------------------
+ Nested Loop Anti Join
+   ->  Seq Scan on s
+         Filter: (n IS NOT NULL)
+   ->  Append
+         ->  Bitmap Heap Scan on l
+               Recheck Cond: ((s.n = n) OR (n IS NULL))
+               ->  BitmapOr
+                     ->  Bitmap Index Scan on l_n
+                           Index Cond: (n = s.n)
+                     ->  Bitmap Index Scan on l_n
+                           Index Cond: (n IS NULL)
+         ->  Seq Scan on s s_1
+               Filter: ((s.n = u) OR (u IS NULL))
+         ->  Seq Scan on s s_2
+               Filter: ((s.n = nn) OR (nn IS NULL))
+(15 rows)
+
+ 
+select * from s where n not in
+(select n as y from l union all
+ select u as y from s union all
+ select nn as y from s);
+ u | n | nn | p 
+---+---+----+---
+(0 rows)
+
+-- test multi-levels of NOT IN
+explain (costs false) select * from s where n not in (select n from s where n not in (select n from l));
+                         QUERY PLAN                         
+------------------------------------------------------------
+ Seq Scan on s
+   Filter: (NOT (hashed SubPlan 1))
+   SubPlan 1
+     ->  Nested Loop Anti Join
+           ->  Seq Scan on s s_1
+                 Filter: (n IS NOT NULL)
+           ->  Bitmap Heap Scan on l
+                 Recheck Cond: ((s_1.n = n) OR (n IS NULL))
+                 ->  BitmapOr
+                       ->  Bitmap Index Scan on l_n
+                             Index Cond: (n = s_1.n)
+                       ->  Bitmap Index Scan on l_n
+                             Index Cond: (n IS NULL)
+(13 rows)
+
+select * from s where n not in (select n from s where n not in (select n from l));
+    u    |    n    |   nn    |   p    
+---------+---------+---------+--------
+       1 |       1 |       1 | foo
+       2 |       2 |       2 | foo
+ 1000002 | 1000002 | 1000002 | foofoo
+       3 |         |       3 | foo
+(4 rows)
+
+explain (costs false) select * from s where n not in (select n from s where n not in (select u from l));
+                  QUERY PLAN                  
+----------------------------------------------
+ Seq Scan on s
+   Filter: (NOT (hashed SubPlan 1))
+   SubPlan 1
+     ->  Nested Loop Anti Join
+           ->  Seq Scan on s s_1
+                 Filter: (n IS NOT NULL)
+           ->  Index Only Scan using l_u on l
+                 Index Cond: (u = s_1.n)
+(8 rows)
+
+select * from s where n not in (select n from s where n not in (select u from l));
+ u | n | nn |  p  
+---+---+----+-----
+ 1 | 1 |  1 | foo
+ 2 | 2 |  2 | foo
+(2 rows)
+
+--test COALESCE
+explain (costs false) select * from s where COALESCE(n, -1) not in (select COALESCE(n, -1) from l);
+                                 QUERY PLAN                                 
+----------------------------------------------------------------------------
+ Hash Anti Join
+   Hash Cond: (COALESCE(s.n, '-1'::integer) = COALESCE(l.n, '-1'::integer))
+   ->  Seq Scan on s
+   ->  Hash
+         ->  Seq Scan on l
+(5 rows)
+
+select * from s where COALESCE(n, -1) not in (select COALESCE(n, -1) from l);
+    u    |    n    |   nn    |   p    
+---------+---------+---------+--------
+ 1000002 | 1000002 | 1000002 | foofoo
+(1 row)
+
+explain (costs false) select * from s where COALESCE(n, NULL, -1) not in (select COALESCE(n, NULL, -1) from l);
+                                 QUERY PLAN                                 
+----------------------------------------------------------------------------
+ Hash Anti Join
+   Hash Cond: (COALESCE(s.n, '-1'::integer) = COALESCE(l.n, '-1'::integer))
+   ->  Seq Scan on s
+   ->  Hash
+         ->  Seq Scan on l
+(5 rows)
+
+select * from s where COALESCE(n, NULL, -1) not in (select COALESCE(n, NULL, -1) from l);
+    u    |    n    |   nn    |   p    
+---------+---------+---------+--------
+ 1000002 | 1000002 | 1000002 | foofoo
+(1 row)
+
+explain (costs false) select * from s where COALESCE(n, NULL, NULL) not in (select COALESCE(n, NULL, NULL) from l);
+                                 QUERY PLAN                                  
+-----------------------------------------------------------------------------
+ Nested Loop Anti Join
+   Join Filter: ((COALESCE(s.n) = COALESCE(l.n)) OR (COALESCE(l.n) IS NULL))
+   ->  Seq Scan on s
+         Filter: (COALESCE(n) IS NOT NULL)
+   ->  Materialize
+         ->  Seq Scan on l
+(6 rows)
+
+select * from s where COALESCE(n, NULL, NULL) not in (select COALESCE(n, NULL, NULL) from l);
+ u | n | nn | p 
+---+---+----+---
+(0 rows)
+
+explain (costs false) select * from s where COALESCE(n, nn) not in (select COALESCE(n, nn) from l);
+                        QUERY PLAN                        
+----------------------------------------------------------
+ Hash Anti Join
+   Hash Cond: (COALESCE(s.n, s.nn) = COALESCE(l.n, l.nn))
+   ->  Seq Scan on s
+   ->  Hash
+         ->  Seq Scan on l
+(5 rows)
+
+select * from s where COALESCE(n, nn) not in (select COALESCE(n, nn) from l);
+    u    |    n    |   nn    |   p    
+---------+---------+---------+--------
+ 1000002 | 1000002 | 1000002 | foofoo
+(1 row)
+
+explain (costs false) select * from s where COALESCE(nn, NULL) not in (select COALESCE(nn, NULL) from l);
+                   QUERY PLAN                   
+------------------------------------------------
+ Hash Anti Join
+   Hash Cond: (COALESCE(s.nn) = COALESCE(l.nn))
+   ->  Seq Scan on s
+   ->  Hash
+         ->  Seq Scan on l
+(5 rows)
+
+select * from s where COALESCE(nn, NULL) not in (select COALESCE(nn, NULL) from l);
+    u    |    n    |   nn    |   p    
+---------+---------+---------+--------
+ 1000002 | 1000002 | 1000002 | foofoo
+(1 row)
+
+explain (costs false) select * from s where (COALESCE(n, -1), nn, COALESCE(n, u)) not in (select COALESCE(n, -1), nn, COALESCE(n, u) from l);
+                                                       QUERY PLAN                                                        
+-------------------------------------------------------------------------------------------------------------------------
+ Nested Loop Anti Join
+   ->  Seq Scan on s
+   ->  Bitmap Heap Scan on l
+         Recheck Cond: (s.nn = nn)
+         Filter: ((COALESCE(s.n, '-1'::integer) = COALESCE(n, '-1'::integer)) AND (COALESCE(s.n, s.u) = COALESCE(n, u)))
+         ->  Bitmap Index Scan on l_nn
+               Index Cond: (nn = s.nn)
+(7 rows)
+
+select * from s where (COALESCE(n, -1), nn, COALESCE(n, u)) not in (select COALESCE(n, -1), nn, COALESCE(n, u) from l);
+    u    |    n    |   nn    |   p    
+---------+---------+---------+--------
+ 1000002 | 1000002 | 1000002 | foofoo
+       3 |         |       3 | foo
+(2 rows)
+
+-- non conversion cases
+explain (costs false) select * from s where n not in (select n from l where u > 500);
+             QUERY PLAN             
+------------------------------------
+ Seq Scan on s
+   Filter: (NOT (hashed SubPlan 1))
+   SubPlan 1
+     ->  Seq Scan on l
+           Filter: (u > 500)
+(5 rows)
+
+select * from s where n not in (select n from l where u > 500);
+    u    |    n    |   nn    |   p    
+---------+---------+---------+--------
+       1 |       1 |       1 | foo
+       2 |       2 |       2 | foo
+ 1000002 | 1000002 | 1000002 | foofoo
+(3 rows)
+
+explain (costs false) select * from s where (n,n) not in (select n,n from l);
+             QUERY PLAN             
+------------------------------------
+ Seq Scan on s
+   Filter: (NOT (hashed SubPlan 1))
+   SubPlan 1
+     ->  Seq Scan on l
+(4 rows)
+
+select * from s where (n,n) not in (select n,n from l);
+ u | n | nn | p 
+---+---+----+---
+(0 rows)
+
+explain (costs false) select * from s right join l on s.nn = l.nn where (s.n,s.u,s.nn) not in (select n,u,nn from l);
+             QUERY PLAN             
+------------------------------------
+ Hash Left Join
+   Hash Cond: (l.nn = s.nn)
+   Filter: (NOT (hashed SubPlan 1))
+   ->  Seq Scan on l
+   ->  Hash
+         ->  Seq Scan on s
+   SubPlan 1
+     ->  Seq Scan on l l_1
+(8 rows)
+
+select * from s right join l on s.nn = l.nn where (s.n,s.u,s.nn) not in (select n,u,nn from l);
+ u | n | nn | p | u | n | nn | p 
+---+---+----+---+---+---+----+---
+(0 rows)
+
+-- clean up
+drop table s;
+drop table s1;
+drop table l;
diff --git a/src/test/regress/sql/subselect.sql b/src/test/regress/sql/subselect.sql
index 8bca1f5d55..1368d3af47 100644
--- a/src/test/regress/sql/subselect.sql
+++ b/src/test/regress/sql/subselect.sql
@@ -683,3 +683,287 @@ select * from (with x as (select 2 as y) select * from x) ss;
 explain (verbose, costs off)
 with x as (select * from subselect_tbl)
 select * from x for update;
+
+-- test NON IN to ANTI JOIN conversion
+CREATE TABLE s (u INTEGER NOT NULL, n INTEGER NULL, nn INTEGER NOT NULL, p VARCHAR(128) NULL);
+insert into s (u, n, nn, p)
+    select
+    generate_series(1,3) as u,
+	generate_series(1,3) as n,
+	generate_series(1,3) as nn,
+	'foo' as p;
+insert into s values(1000002, 1000002, 1000002, 'foofoo');
+UPDATE s set n = NULL WHERE n = 3;
+
+CREATE UNIQUE INDEX s_u ON s (u);
+CREATE INDEX s_n ON s (n);
+CREATE INDEX s_nn ON s (nn);
+
+CREATE TABLE l (u INTEGER NOT NULL, n INTEGER NULL, nn INTEGER NOT NULL, p VARCHAR(128) NULL);
+insert into l (u, n, nn, p)
+	select
+    generate_series(1,10000 ) as u,
+	generate_series(1,10000 ) as n,
+	generate_series(1,10000 ) as nn,
+	'bar' as p;
+UPDATE l set n = NULL WHERE n = 7;
+
+CREATE UNIQUE INDEX l_u ON l (u);
+CREATE INDEX l_n ON l (n);
+CREATE INDEX l_nn ON l (nn);
+
+CREATE TABLE s1 (u INTEGER NOT NULL, n INTEGER NULL, nn INTEGER NOT NULL, p VARCHAR(128) NULL);
+insert into s1 (u, n, nn, p)
+    select
+    generate_series(1,3) as u,
+	generate_series(1,3) as n,
+	generate_series(1,3) as nn,
+	'foo' as p;
+insert into s1 values(1000003, 1000003, 1000003, 'foofoo');
+insert into s1 values(1003, 1003, 1003, 'foofoo');
+UPDATE s1 set n = NULL WHERE n = 3;
+
+-- test single expression
+explain (costs false) select * from s where n not in (select n from l);
+
+select * from s where n not in (select n from l);
+
+explain (costs false) select * from s where u not in (select u from l);
+
+select * from s where u not in (select u from l);
+
+explain (costs false) select * from s where 3*n not in (select n from l);
+
+select * from s where 3*n not in (select n from l);
+
+explain (costs false) select * from s where n not in (select 3*n from l);
+
+select * from s where n not in (select 3*n from l);
+
+-- test multi expression
+explain (costs false) select * from s where (n,u) not in (select n,u from l);
+
+select * from s where (n,u) not in (select n,u from l);
+
+explain (costs false) select * from s where (u, nn) not in (select u, nn from l);
+
+select * from s where (u, nn) not in (select u, nn from l);
+
+explain (costs false) select * from s where (n,u) not in (select u,n from l);
+
+select * from s where (n,u) not in (select u,n from l);
+
+explain (costs false) select * from s where (n,u,nn) not in (select u,n,nn from l);
+
+select * from s where (n,u,nn) not in (select u,n,nn from l);
+
+explain (costs false) select * from s where (n,u,nn) not in (select u,n,nn from l where u > 1000);
+
+select * from s where (n,u,nn) not in (select u,n,nn from l where u > 1000);
+
+-- test multi-table
+explain (costs false) select count(*) from s, l where s.n not in (select n from l);
+
+select count(*) from s, l where s.n not in (select n from l);
+
+explain (costs false) select count(*) from s, l where s.nn not in (select nn from l);
+
+select count(*) from s, l where s.nn not in (select nn from l);
+
+-- test null padded results from outer join
+explain (costs false) select * from s where n not in (select s.nn from l left join s on l.nn = s.nn);
+
+select * from s where n not in (select s.nn from l left join s on l.nn = s.nn);
+
+explain (costs false) select * from s where n not in (select s.nn from s right join l on s.nn = l.nn);
+
+select * from s where n not in (select s.nn from s right join l on s.nn = l.nn);
+
+explain (costs false) select count(*) from s right join l on s.nn = l.nn where l.nn not in (select nn from s);
+
+select count(*) from s right join l on s.nn = l.nn where l.nn not in (select nn from s);
+
+explain (costs false) select count(*) from s right join l on s.nn = l.nn where s.nn not in (select nn from s);
+
+select count(*) from s right join l on s.nn = l.nn where s.nn not in (select nn from s);
+
+explain (costs false) select count(*) from s right join l on s.nn=l.nn where l.nn not in (select l.nn from l left join s on l.nn = s.nn);
+
+select count(*) from s right join l on s.nn=l.nn where l.nn not in (select l.nn from l left join s on l.nn = s.nn);
+
+explain (costs false) select count(*) from s right join l on s.nn=l.nn where s.nn not in (select s.nn from l left join s on l.nn = s.nn);
+
+select count(*) from s right join l on s.nn=l.nn where s.nn not in (select s.nn from l left join s on l.nn = s.nn);
+
+explain (costs false) select count(*) from s left join s1 on s.u=s1.u join l on s.u=l.u where s.nn not in (select nn from l);
+
+select count(*) from s left join s1 on s.u=s1.u join l on s.u=l.u where s.nn not in (select nn from l);
+
+explain (costs false) select count(*) from s left join s1 on s.u=s1.u right join l on s.u=l.u where s.nn not in (select nn from l);
+
+select count(*) from s left join s1 on s.u=s1.u right join l on s.u=l.u where s.nn not in (select nn from l);
+
+explain (costs false) select count(*) from s left join s1 on s.u=s1.u left join l on s.u=l.u where s.nn not in (select nn from l);
+
+select count(*) from s left join s1 on s.u=s1.u left join l on s.u=l.u where s.nn not in (select nn from l);
+
+explain (costs false) select count(*) from s right join s1 on s.u=s1.u join l on s.u=l.u where s.nn not in (select nn from l);
+
+select count(*) from s right join s1 on s.u=s1.u join l on s.u=l.u where s.nn not in (select nn from l);
+
+explain (costs false) select count(*) from s join s1 on s.u=s1.u right join l on s.u=l.u where s.nn not in (select nn from l);
+
+select * from s join s1 on s.u=s1.u right join l on s.u=l.u where s.nn not in (select nn from l);
+
+explain (costs false) select count(*) from s full join s1 on s.u=s1.u join l on s.u=l.u where s.nn not in (select nn from l);
+
+select count(*) from s full join s1 on s.u=s1.u join l on s.u=l.u where s.nn not in (select nn from l);
+
+explain (costs false) select count(*) from s join s1 on s.u=s1.u full join l on s.u=l.u where s.nn not in (select nn from l);
+
+select count(*) from s join s1 on s.u=s1.u full join l on s.u=l.u where s.nn not in (select nn from l);
+
+explain (costs false) select * from s where s.nn not in (select l.nn from l left join s on l.nn=s.nn left join s1 on l.nn=s1.nn);
+
+select * from s where s.nn not in (select l.nn from l left join s on l.nn=s.nn left join s1 on l.nn=s1.nn);
+
+explain (costs false) select * from s where s.nn not in (select l.nn from l left join s on l.nn=s.nn right join s1 on l.nn=s1.nn);
+
+select * from s where s.nn not in (select l.nn from l left join s on l.nn=s.nn right join s1 on l.nn=s1.nn);
+
+explain (costs false) select * from s where (n,u,nn) not in (select l.n,l.u,l.nn from l left join s on l.nn = s.nn);
+
+select * from s where (n,u,nn) not in (select l.n,l.u,l.nn from l left join s on l.nn = s.nn);
+
+explain (costs false) select * from s where (n,u,nn) not in (select l.n,l.u,l.nn from l right join s on l.nn = s.nn);
+
+select * from s where (n,u,nn) not in (select l.n,l.u,l.nn from l left join s on l.nn = s.nn);
+
+--test reduce outer joins from outer query
+explain (costs false) select count(*) from s right join l on s.nn = l.nn where s.nn not in (select nn from l);
+
+select count(*) from s right join l on s.nn = l.nn where s.nn not in (select nn from l);
+
+explain (costs false) select count(*) from s right join l on s.nn = l.nn where s.nn not in (select nn from l) and s.u>0;
+
+select count(*) from s right join l on s.nn = l.nn where s.nn not in (select nn from l) and s.u>0;
+
+explain (costs false) select count(*) from s right join l on s.nn = l.nn join s1 on s.u = s1.u where s.nn not in (select nn from l);
+
+select count(*) from s right join l on s.nn = l.nn join s1 on s.u = s1.u where s.nn not in (select nn from l);
+
+explain (costs false) select count(*) from s right join l on s.nn = l.nn right join s1 on s.u = s1.u where s.nn not in (select nn from l);
+
+select count(*) from s right join l on s.nn = l.nn right join s1 on s.u = s1.u where s.nn not in (select nn from l);
+
+explain (costs false) select count(*) from s right join l on s.nn = l.nn left join s1 on s.u = s1.u where s.nn not in (select nn from l);
+
+select count(*) from s right join l on s.nn = l.nn left join s1 on s.u = s1.u where s.nn not in (select nn from l);
+
+--test reduce outer joins from subquery
+explain (costs false) select * from s where nn not in (select l.nn from l right join s on l.nn = s.nn);
+
+select * from s where nn not in (select l.nn from l right join s on l.nn = s.nn);
+
+explain (costs false) select * from s where nn not in (select l.nn from l right join s on l.nn = s.nn where l.u > 9);
+
+select * from s where nn not in (select l.nn from l right join s on l.nn = s.nn where l.u > 9);
+
+explain (costs false) select * from s where nn not in (select l.nn from l right join s on l.nn = s.nn where s.u > 9);
+
+select * from s where nn not in (select l.nn from l right join s on l.nn = s.nn where s.u > 9);
+
+explain (costs false) select * from s where nn not in (select l.nn from l right join s on l.nn = s.nn join s1 on l.n = s1.n);
+
+select * from s where nn not in (select l.nn from l right join s on l.nn = s.nn join s1 on l.n = s1.n);
+
+explain (costs false) select * from s where nn not in (select l.nn from l right join s on l.nn = s.nn right join s1 on l.n = s1.n);
+
+select * from s where nn not in (select l.nn from l right join s on l.nn = s.nn right join s1 on l.n = s1.n);
+
+explain (costs false) select * from s where nn not in (select l.nn from l right join s on l.nn = s.nn left join s1 on l.n = s1.n);
+
+select * from s where nn not in (select l.nn from l right join s on l.nn = s.nn left join s1 on l.n = s1.n);
+
+--test reduce outer join on outer and sub-query
+explain (costs false) select count(*) from s right join l on s.nn = l.nn join s1 on s.u = s1.u where s.nn not in (select l.nn from l right join s on l.nn = s.nn join s1 on l.n = s1.n);
+
+select count(*) from s right join l on s.nn = l.nn join s1 on s.u = s1.u where s.nn not in (select l.nn from l right join s on l.nn = s.nn join s1 on l.n = s1.n);
+
+explain (costs false) select count(*) from s right join l on s.nn = l.nn left join s1 on s.u = s1.u where s.nn not in (select l.nn from l right join s on l.nn = s.nn left join s1 on l.n = s1.n);
+
+select count(*) from s right join l on s.nn = l.nn left join s1 on s.u = s1.u where s.nn not in (select l.nn from l right join s on l.nn = s.nn left join s1 on l.n = s1.n);
+
+-- test union all
+explain (costs false) select * from s as t where not exists
+(select 1 from (select n as y from l union all
+				select u as y from s union all
+				select nn as y from s) as v where t.n=v.y or v.y is null) and n is not null;
+
+select * from s as t where not exists
+(select 1 from (select n as y from l union all
+				select u as y from s union all
+				select nn as y from s) as v where t.n=v.y or v.y is null) and n is not null;
+
+explain (costs false) select * from s where n not in
+(select n as y from l union all
+ select u as y from s union all
+ select nn as y from s);
+ 
+select * from s where n not in
+(select n as y from l union all
+ select u as y from s union all
+ select nn as y from s);
+
+-- test multi-levels of NOT IN
+explain (costs false) select * from s where n not in (select n from s where n not in (select n from l));
+
+select * from s where n not in (select n from s where n not in (select n from l));
+
+explain (costs false) select * from s where n not in (select n from s where n not in (select u from l));
+
+select * from s where n not in (select n from s where n not in (select u from l));
+
+--test COALESCE
+explain (costs false) select * from s where COALESCE(n, -1) not in (select COALESCE(n, -1) from l);
+
+select * from s where COALESCE(n, -1) not in (select COALESCE(n, -1) from l);
+
+explain (costs false) select * from s where COALESCE(n, NULL, -1) not in (select COALESCE(n, NULL, -1) from l);
+
+select * from s where COALESCE(n, NULL, -1) not in (select COALESCE(n, NULL, -1) from l);
+
+explain (costs false) select * from s where COALESCE(n, NULL, NULL) not in (select COALESCE(n, NULL, NULL) from l);
+
+select * from s where COALESCE(n, NULL, NULL) not in (select COALESCE(n, NULL, NULL) from l);
+
+explain (costs false) select * from s where COALESCE(n, nn) not in (select COALESCE(n, nn) from l);
+
+select * from s where COALESCE(n, nn) not in (select COALESCE(n, nn) from l);
+
+explain (costs false) select * from s where COALESCE(nn, NULL) not in (select COALESCE(nn, NULL) from l);
+
+select * from s where COALESCE(nn, NULL) not in (select COALESCE(nn, NULL) from l);
+
+explain (costs false) select * from s where (COALESCE(n, -1), nn, COALESCE(n, u)) not in (select COALESCE(n, -1), nn, COALESCE(n, u) from l);
+
+select * from s where (COALESCE(n, -1), nn, COALESCE(n, u)) not in (select COALESCE(n, -1), nn, COALESCE(n, u) from l);
+
+-- non conversion cases
+explain (costs false) select * from s where n not in (select n from l where u > 500);
+
+select * from s where n not in (select n from l where u > 500);
+
+explain (costs false) select * from s where (n,n) not in (select n,n from l);
+
+select * from s where (n,n) not in (select n,n from l);
+
+explain (costs false) select * from s right join l on s.nn = l.nn where (s.n,s.u,s.nn) not in (select n,u,nn from l);
+
+select * from s right join l on s.nn = l.nn where (s.n,s.u,s.nn) not in (select n,u,nn from l);
+
+-- clean up
+drop table s;
+drop table s1;
+drop table l;
+
#9Li, Zheng
zhelli@amazon.com
In reply to: Li, Zheng (#8)
1 attachment(s)
Re: NOT IN subquery optimization

Resend the patch with a whitespace removed so that "git apply patch" works directly.

---
Zheng Li, AWS, Amazon Aurora PostgreSQL

On 2/25/19, 12:39 PM, "Li, Zheng" <zhelli@amazon.com> wrote:

I'm attaching a working patch following the discussion.

You can also find the following patch description is the commit message:
NOT IN to ANTI JOIN transformation

The semantics of ANTI JOIN were created to match the semantics of NOT
EXISTS, which enables NOT EXISTS subqueries to be efficiently executed
as a kind of join. NOT IN subqueries have different NULL semantics than
NOT EXISTS, but since there is no special join operator for NOT IN it is
generally executed as a nested sub-plan. It is possible, however, to
transform NOT IN to a correlated NOT EXISTS so that it can be executed
it as an ANTI JOIN with additional correlated predicates.

A general transformation from NOT IN to NOT EXISTS for the
single-expression case (the multi-expression case is just ANDs of the
single-expressions) is:
t1.x NOT IN (SELECT t2.y from t2 where p) <=> NOT EXISTS (select 1
from t2 where (y=x or y is NULL or x is NULL) and p).

If x or y is non-nullable, we can safely remove the predicate "x
is NULL" or "y is NULL", and if there is no predicate p,
then "x is NULL" may be factored out of the subquery.
Experiments show that if we can remove one or the other ORed
predicates, or if we can factor out the "x is NULL", then
execution is typically much faster.

Basically, for the single expression case (we also handle the multi
expression case), we try to do the following transformation:
When p does not exist:
t1.x not in (t2.y) => ANTI JOIN t1(Filter: x is not null), t2 on
join condition: t1.x=t2.y or t2.y is null.
When x is non-nullable:
t1.x not in (t2.y where p) => ANTI JOIN t1, t2 on join condition:
(t1.x=t2.y or t2.y is null) and p.

We implemented a nullability test routine is_node_nonnullable().
Currently it handles Var, TargetEntry, CoalesceExpr and Const. Outer
joins are taken into consideration in the nullability test.

We adjust and apply reduce_outer_joins() before the transformation so
that the outer joins have an opportunity to be converted to inner joins
prior to the transformation.

Using this transformation, we measured performance improvements of
two to three orders of magnitude on most queries in a development
environment. In our performance experiments, table s (small) has 11
rows, table l (large) has 1 million rows. s.n and l.n have NULL
values. s.nn and l.nn are NOT NULL.

s.n not in (l.n) 1150 ms -> 0.49 ms
s.nn not in (l.nn) 1120 ms -> 0.45 ms
l.n not in (l.n) over 20 min -> 1700 ms
l.nn not in (l.nn) over 20 min -> 220 ms
l.n not in (s.n) 63 ms -> 750 ms
l.nn not in (s.nn) 58 ms -> 46 ms

For the only case that performance drops - l.n not in (s.n). It is
likely to be resolved by ending the nested loop anti join early as
soon as we find NULL inner tuple entry/entries that satisfies the
join condition during execution. This is still under investigation.

Comments are welcome.

With Regards,
---
Zheng Li, AWS, Amazon Aurora PostgreSQL

On 2/25/19, 7:32 AM, "David Rowley" <david.rowley@2ndquadrant.com> wrote:

On Fri, 22 Feb 2019 at 09:44, David Rowley <david.rowley@2ndquadrant.com> wrote:

I've attached the rebased and still broken version.

I set about trying to make a less broken version of this.

A quick reminder of the semantics of NOT IN:

1. WHERE <nullable_column> NOT IN(SELECT <not null column> FROM table);

If table is non-empty:
will filter out rows where <nullable_column> is NULL
and only show values that are not in <not null column>

If table is empty:
Filters nothing.

2. WHERE <nonnullable_column> NOT IN(SELECT <null column> FROM table);

If table contains NULLs in the <null column> no records will match.

The previous patch handled #2 correctly but neglected to do anything
about #1. For #1 the only way we can implement this as a planner only
change is to insist that the outer side expressions also are not null.
If we could have somehow tested if "table" was non-empty then we could
have added a IS NOT NULL clause to the outer query and converted to an
anti-join, but ... can't know that during planning and can't add the
IS NOT NULL regardless as, if "table" is empty we will filter NULLs
when we shouldn't.

In the attached, I set about fixing #1 by determining if the outer
expressions could be NULL by checking

1. If expression is a Var from an inner joined relation it can't be
NULL if there's a NOT NULL constraint on the column; or
2. If expression is a Var from an inner joined relation and there is a
strict WHERE/ON clause, the expression can't be NULL; or
3. If expression is a Var from an outer joined relation check for
quals that were specified in the same syntactical level as the NOT IN
for proofs that NULL will be filtered.

An example of #3 is:

SELECT * FROM t1 LEFT JOIN t2 on t1.a = t2.a WHERE t2.a IS NOT NULL
AND t2.a NOT IN(SELECT a FROM t3); -- t2 becomes INNER JOINed later in
planning, but...
or;
SELECT * FROM t1 LEFT JOIN t2 on t1.a = t2.a AND t2.a NOT IN(SELECT a FROM t3);

In the latter of the two, the t1.a = t2.a join conditions ensures that
NULLs can't exist where the NOT IN is evaluated.

I implemented #3 by passing the quals down to
pull_up_sublinks_qual_recurse(). At the top level call 'node' and
'notnull_proofs' are the same, but that changes in recursive calls
like the one we make inside the is_andclause() condition.

Comments welcome.

--
David Rowley http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

Attachments:

not_in_anti_join_transformation_v1.patchapplication/octet-stream; name=not_in_anti_join_transformation_v1.patchDownload
commit 47a18cafc5e693fcbafd1902817a934006a96e84
Author: Zheng Li <zhelli@amazon.com>
Date:   Fri Feb 22 21:00:46 2019 +0000

    NOT IN to ANTI JOIN transformation
    
    The semantics of ANTI JOIN were created to match the semantics of NOT
    EXISTS, which enables NOT EXISTS subqueries to be efficiently executed
    as a kind of join.  NOT IN subqueries have different NULL semantics than
    NOT EXISTS, but since there is no special join operator for NOT IN it is
    generally executed as a nested sub-plan.  It is possible, however, to
    transform NOT IN to a correlated NOT EXISTS so that it can be executed
    it as an ANTI JOIN with additional correlated predicates.
    
    A general transformation from NOT IN to NOT EXISTS for the
    single-expression case (the multi-expression case is just ANDs of the
    single-expressions) is:
        t1.x NOT IN (SELECT t2.y from t2 where p) <=> NOT EXISTS (select 1
                from t2 where (y=x or y is NULL or x is NULL) and p).
    
    If x or y is non-nullable, we can safely remove the predicate "x
    is NULL" or "y is NULL", and if there is no predicate p,
    then "x is NULL" may be factored out of the subquery.
    Experiments show that if we can remove one or the other ORed
    predicates, or if we can factor out the "x is NULL", then
    execution is typically much faster.
    
    Basically, for the single expression case (we also handle the multi
    expression case), we try to do the following transformation:
    When p does not exist:
        t1.x not in (t2.y) => ANTI JOIN t1(Filter: x is not null), t2 on
        join condition: t1.x=t2.y or t2.y is null.
    When x is non-nullable:
        t1.x not in (t2.y where p) => ANTI JOIN t1, t2 on join condition:
        (t1.x=t2.y or t2.y is null) and p.
    
    We implemented a nullability test routine is_node_nonnullable().
    Currently it handles Var, TargetEntry, CoalesceExpr and Const. Outer
    joins are taken into consideration in the nullability test.
    
    We apply reduce_outer_joins() before the transformation so that the
    outer joins have an opportunity to be converted to inner joins prior
    to the transformation.
    
    Using this transformation, we measured performance improvements of
    two to three orders of magnitude on most queries in a development
    environment. In our performance experiments, table s (small) has 11
    rows, table l (large) has 1 million rows. s.n and l.n have NULL
    values. s.nn and l.nn are NOT NULL.
    
        s.n not in (l.n)       1150 ms -> 0.49 ms
        s.nn not in (l.nn)     1120 ms -> 0.45 ms
        l.n not in (l.n)   over 20 min -> 1700 ms
        l.nn not in (l.nn) over 20 min -> 220 ms
        l.n not in (s.n)         63 ms -> 750 ms
        l.nn not in (s.nn)       58 ms -> 46 ms
    
    For the only case that performance drops - l.n not in (s.n). It is
    likely to be resolved by ending the nested loop anti join early as
    soon as we find NULL inner tuple entry/entries that satisfies the
    join condition during execution. This is still under investigation.
    
    Authors: Zheng Li, Jim Finnerty
    Discussion:
    https://www.postgresql.org/message-id/1550706289606-0.post%40n3.nabble.com

diff --git a/src/backend/optimizer/plan/planner.c b/src/backend/optimizer/plan/planner.c
index bc81535905..e5091d2595 100644
--- a/src/backend/optimizer/plan/planner.c
+++ b/src/backend/optimizer/plan/planner.c
@@ -976,7 +976,7 @@ subquery_planner(PlannerGlobal *glob, Query *parse,
 	 * preprocessing.
 	 */
 	if (hasOuterJoins)
-		reduce_outer_joins(root);
+		reduce_outer_joins(parse);
 
 	/*
 	 * If we have any RTE_RESULT relations, see if they can be deleted from
diff --git a/src/backend/optimizer/plan/subselect.c b/src/backend/optimizer/plan/subselect.c
index c721054787..d076d89227 100644
--- a/src/backend/optimizer/plan/subselect.c
+++ b/src/backend/optimizer/plan/subselect.c
@@ -33,6 +33,7 @@
 #include "optimizer/prep.h"
 #include "optimizer/subselect.h"
 #include "parser/parse_relation.h"
+#include "parser/parsetree.h"
 #include "rewrite/rewriteManip.h"
 #include "utils/builtins.h"
 #include "utils/lsyscache.h"
@@ -1107,6 +1108,439 @@ inline_cte_walker(Node *node, inline_cte_walker_context *context)
 	return expression_tree_walker(node, inline_cte_walker, context);
 }
 
+/*
+ * check if relid appears on the NULL-padded side of an outer join.
+ */
+static bool
+on_null_sideof_outerjoin(Index relid, Node *jtnode)
+{
+	if(!jtnode)
+	{
+		return false;
+	}
+	if(IsA(jtnode, FromExpr))
+	{
+		ListCell		*lc;
+		FromExpr	*fexpr = (FromExpr *)jtnode;
+		foreach(lc, fexpr->fromlist)
+		{
+			if(on_null_sideof_outerjoin(relid, lfirst(lc)))
+			{
+				return true;
+			}
+		}
+	}
+	else if(IsA(jtnode, JoinExpr))
+	{
+		JoinExpr *expr = (JoinExpr *)jtnode;
+		Relids relids = NULL;
+		Relids relids2 = NULL;
+		switch(expr->jointype)
+		{
+		case JOIN_LEFT:
+			relids = get_relids_in_jointree(expr->rarg, true);
+
+			return bms_is_member(relid, relids) ||
+						on_null_sideof_outerjoin(relid, expr->larg);
+		case JOIN_FULL:
+			relids = get_relids_in_jointree(expr->larg, true);
+			relids2 = get_relids_in_jointree(expr->rarg, true);
+
+			return bms_is_member(relid, relids) ||
+						bms_is_member(relid, relids2);
+		case JOIN_RIGHT:
+			relids = get_relids_in_jointree(expr->larg, true);
+
+			return bms_is_member(relid, relids) ||
+						on_null_sideof_outerjoin(relid, expr->rarg);
+		case JOIN_INNER:
+			return on_null_sideof_outerjoin(relid, expr->larg) ||
+						on_null_sideof_outerjoin(relid, expr->rarg);
+
+		case JOIN_SEMI:
+		case JOIN_ANTI:
+		case JOIN_UNIQUE_OUTER:
+		case JOIN_UNIQUE_INNER:
+			/*
+			 * JOIN_SEMI and JOIN_ANTI should not appear yet since they
+			 * are introduced only by pull_up_sublinks and this routine is called
+			 * before we replace root->parse->jointree in pull_up_sublinks().
+			 * I don't think we will see JOIN_UNIQUE_OUTER or
+			 * JOIN_UNIQUE_INNER since they are only used internally in
+			 * the planner in a much later phase (in standard_join_search).
+			 */
+			break;
+
+		default:
+			elog(ERROR, "unrecognized join type: %d",
+							 (int) expr->jointype);
+			break;
+		}
+	}
+	/* should be RangeTblRef from the following three cases:
+	 * plain / under JOIN_INNER / on inner side of an outer join
+	 */
+	return false;
+}
+
+/* Returns a List of Nodes from the testexpr of an Any SubLink */
+static List *
+getTestExpr(SubLink *sublink)
+{
+	Node * testexpr = sublink->testexpr;
+	Assert(testexpr);
+
+	/* single expression */
+	if(IsA(testexpr, OpExpr))
+	{
+		OpExpr	*opexpr = (OpExpr *) testexpr;
+		Node	*testnode = linitial(opexpr->args);
+
+		return list_make1(testnode);
+	}
+	/* multi-expression */
+	else if(IsA(testexpr, BoolExpr))
+	{
+		BoolExpr	*bexpr = (BoolExpr *) testexpr;
+		ListCell		*lc;
+		Node		*node;
+		List			*result = NULL;
+
+		foreach(lc, bexpr->args)
+		{
+			node = lfirst(lc);
+			if(IsA(node, OpExpr))
+			{
+				OpExpr *expr = (OpExpr *) node;
+				result = lappend(result, linitial(expr->args));
+			}
+			else
+			{
+				elog(ERROR, "unrecognized node type for testexpr: %d",
+						(int) nodeTag(node));
+			}
+		}
+		return result;
+	}
+	else
+	{
+		elog(ERROR, "unrecognized node type for testexpr: %d",
+				(int) nodeTag(testexpr));
+	}
+}
+
+/*
+ * Returns true if the Node passed in is nonnullable. Currently handles Var,
+ * TargetEntry, CoaleseExpr and Const.
+ * TODO: Add more supporting cases.
+ * A Var is nonnullable if:
+ * it does not appear on  the NULL-padded side of an outer join and it has NOT NULL constraint.
+ * A CoalesceExpr is nonnullable if it has a non-null argument
+ */
+static bool
+is_node_nonnullable(Node * node, Query *parse)
+{
+	AttrNumber	attno = InvalidAttrNumber;
+	Oid				reloid;
+	Index			relid = 0;
+
+	switch (nodeTag(node))
+	{
+		case T_Var:
+		{
+			Var 					*var;
+			RangeTblEntry	*rte;
+
+			var = (Var *) node;
+			relid = var->varno;
+			attno = var->varattno;
+			rte = rt_fetch(relid, parse->rtable);
+			reloid = rte->relid;
+			break;
+		}
+		case T_TargetEntry:
+		{
+			TargetEntry	*te = (TargetEntry *)node;
+			switch(nodeTag(te->expr))
+			{
+				case T_Var:
+				{
+					attno = te->resorigcol;
+					reloid = te->resorigtbl;
+					relid = get_relid_oid(reloid, parse);
+					break;
+				}
+				/* recurse into is_node_nonnullable for other types of Node */
+				default:
+					return is_node_nonnullable((Node *)te->expr, parse);
+			}
+			break;
+		}
+		case T_CoalesceExpr:
+		{
+			ListCell   			*arg;
+			CoalesceExpr	*cexpr = (CoalesceExpr *)node;
+
+			/* handle COALESCE Function by looking for non-null argument */
+			foreach(arg, cexpr->args)
+			{
+				Node	*e = lfirst(arg);
+
+				/* recurse into is_node_nonnullable */
+				if (is_node_nonnullable(e, parse))
+				{
+					return true;
+				}
+			}
+			break;
+		}
+		case T_Const:
+		{
+			if(!((Const *) node)->constisnull)
+			{
+				return true;
+			}
+			break;
+		}
+		/*
+		 * TODO: handle more cases to make the nullability test more accurate
+		 * Assume unhandled cases are nullable.
+		 */
+		default:
+			return false;
+	}
+
+	/* check if the Var appears on the NULL-padded side of an outer join */
+	if(relid &&
+		on_null_sideof_outerjoin(relid, (Node *)parse->jointree))
+	{
+		return false;
+	}
+	/* check for NOT NULL Constraint */
+	if(attno != InvalidAttrNumber &&
+			get_attnotnull(reloid, attno))
+	{
+		return true;
+	}
+
+	return false;
+}
+
+/*
+ * Returns true if at least one Node in the input list is non-nullable
+ */
+static bool
+list_hasnonnullable(List * list, Query *parse)
+{
+	ListCell	*lc;
+	Node	*node;
+
+	foreach(lc, list)
+	{
+		node = lfirst(lc);
+		if(is_node_nonnullable(node, parse))
+		{
+			return true;
+		}
+	}
+	return false;
+}
+
+/* Try to reduce outer joins if there is one in the Query */
+static void
+reduce_outer_joins_NOT_IN(Query *parse)
+{
+	ListCell				*lc;
+	RangeTblEntry	*rte;
+
+	foreach(lc, parse->rtable)
+	{
+		rte = (RangeTblEntry *) lfirst(lc);
+		/* try to reduce outer joins if there is one */
+		if (rte->rtekind == RTE_JOIN &&
+				IS_OUTER_JOIN(rte->jointype))
+		{
+			reduce_outer_joins(parse);
+			return;
+		}
+	}
+}
+
+/*
+* Convert x NOT IN (y) to ANTI JOIN. See detailed comments below.
+*/
+static bool
+convert_NOT_IN_to_join(PlannerInfo *root, Node **quals,
+									SubLink *sublink, List *subquery_vars,
+									Node **pullout)
+{
+	Query			*parse = root->parse;
+	NullTest			*nt;
+	TargetEntry	*te;
+	List				*testnodes = getTestExpr(sublink);
+	bool				 isxnonnullable;
+	bool				 isynonnullable;
+	Query			*subselect = (Query *) sublink->subselect;
+	bool				 hassubquals = subselect->jointree->quals ? true: false;
+
+	/*
+	 * Try reduce_outer_joins since outer join affects the nullability test that's coming up next.
+	 * We have to call reduce_outer_joins for outer and inner query separately because we
+	 * don't have a global range table yet.
+	 */
+	reduce_outer_joins_NOT_IN(parse);
+
+	reduce_outer_joins_NOT_IN(subselect);
+
+	Assert(testnodes);
+	isxnonnullable =
+					list_hasnonnullable(testnodes, parse);
+	isynonnullable =
+					list_hasnonnullable(subselect->targetList, subselect);
+	/*
+	 * We consider the conversion: t1.x not in (t2.y where p) => ANTI JOIN
+	 * t1, t2 on join condition (t1.x=t2.y or t2.y is null or t1.x is null) and p.
+	 * This is logically equivalent to NOT EXISTS (select 1 from t2 where
+	 * (t1.x=t2.y or t2.y is null or t1.x is null) and p).
+	 * In general, the above conversion does not result in a faster plan due to the
+	 * multiple ORs in the join condition. However, if we can remove t1.x is null (in
+	 * the case when x is nonnullable) or pull "OR x is null" out as "AND x is not null"
+	 * (in the case where p does not exist and x is not a multi-expr (x1,x2...xn))
+	 * we will have a much faster plan after the conversion because it's eligible to use
+	 * bitmap index scan and bitmap ORs or index scan(if y is nonnullable too) by
+	 * downstream optimizations.
+	 * Basically we aim to do the following conversions:
+	 * when p does not exist:
+	 * t1.x not in (t2.y) => ANTI JOIN t1(Filter: x is not null), t2 on join
+	 * condition: t1.x=t2.y or t2.y is null.
+	 * when x is nonnullable:
+	 * t1.x not in (t2.y where p) => ANTI JOIN t1, t2 on join
+	 * condition: (t1.x=t2.y or t2.y is null) and p.
+	 * So do NOT convert if (p exists and the test expr x is nullable) or
+	 * if x is a multi-expr (x1,x2,...xn) and has no nonnullable
+	 * var. In other words, we only do the conversion when
+	 * x is nonnull or (p does not exist and x is a single test expression
+	 * (i.e. quals is of type OpExpr. If x is a multiple expression,
+	 * qulas will be of type BoolExpr))
+	 */
+	if(!(isxnonnullable ||
+		(!hassubquals && IsA(*quals, OpExpr))))
+		return false;
+
+	/* If there is only one node in the testexpr, do the following conversion:
+	 * when p does not exist:
+	 * t1.x not in (t2.y) => ANTI JOIN t1(Filter: x is not null), t2 on
+	 * join condition: t1.x=t2.y or t2.y is null.
+	 * when x is nonnullable:
+	 * t1.x not in (t2.y where p) => ANTI JOIN t1, t2 on join condition:
+	 * (t1.x=t2.y or t2.y is null) and p.
+	 * Notice the case where p exits and x is nullable has been ruled out earlier.
+	 */
+	if(IsA(*quals, OpExpr))
+	{
+		/* add "OR y IS NULL" if y is nullable */
+		if(!isynonnullable)
+		{
+			/* make expr y IS NULL */
+			nt = makeNode(NullTest);
+			nt->arg = (Expr *)linitial(subquery_vars);
+			nt->nulltesttype = IS_NULL;
+			nt->argisrow = false;
+			nt->location = -1;
+
+			/* make orclause (x = y OR y IS NULL) */
+			*quals = (Node *)make_orclause(list_make2(*quals,
+								(Node *)nt));
+		}
+		/* pullout "AND x IS NOT NULL" if x is nullable */
+		if(!isxnonnullable)
+		{
+			/* make expr x IS NOT NULL */
+			nt = makeNode(NullTest);
+			nt->arg = (Expr *)linitial(testnodes);
+			nt->argisrow = false;
+			nt->nulltesttype = IS_NOT_NULL;
+			nt->location = -1;
+			*pullout = (Node *)nt;
+		}
+	}
+	/*
+	 * When there are multiple nodes in the testexpr:
+	 * i.e. (x1, x2, ... xn) not in (y1, y2, ... yn ) convert it to
+	 * ANTI JOIN t1, t2 on join condition: (t1.x1 = t2.y1 or
+	 * t2.y1 is NULL or t1.x1 is null) and ...(t1.xi = t2.yi or t2.yi is NULL) ...
+	 * and (t1.xn = t2.yn or t2.yn is NULL or t1.xn is NULL). Notice there will
+	 * be at least one xi that is nonnullable due to the nullability test earlier.
+	 * Downstream optimization will pick the shorter clause
+	 * (t1.xi = t2.yi or t2.yi is NULL) for bitmap index scan on t2 and bitmap OR
+	 * (or index scan on t2 if t2.yi is also nonnullable).
+	 */
+	else if(IsA(*quals, BoolExpr))
+	{
+		ListCell    *qualc;
+		ListCell    *xc = list_head(testnodes);
+		ListCell    *yc = list_head(subquery_vars);
+		ListCell    *ytlc = list_head(subselect->targetList);
+		List        *quallist = ((BoolExpr *)*quals)->args;
+		Node		*tempnode = NULL;
+
+		/* Reconstruct quals in the loop */
+		*quals = NULL;
+		foreach(qualc, quallist)
+		{
+			te = (TargetEntry *)lfirst(ytlc);
+			isynonnullable = is_node_nonnullable((Node*)te, subselect);
+			/* x_i = y_i */
+			tempnode = lfirst(qualc);
+			/* append y_i IS NULL if y_i nonnullable */
+			if(!isynonnullable)
+			{
+				/* make expr y_i IS NULL */
+				nt = makeNode(NullTest);
+				nt->arg = (Expr *)lfirst(yc);
+				nt->nulltesttype = IS_NULL;
+				nt->argisrow = false;
+				nt->location = -1;
+
+				/* make orclause (x_i = y_i OR y_i IS NULL) */
+				tempnode = (Node *)make_orclause(list_make2(lfirst(qualc),
+										(Node *)nt));
+			}
+			isxnonnullable = is_node_nonnullable(lfirst(xc), parse);
+			/*
+			 * append x_i IS NULL if x_i nonnullable, can't pull it out as
+			 * x_i is NOT NULL due to multiple AND clauses.
+			 */
+			if(!isxnonnullable)
+			{
+				/* make expr x_i IS NULL */
+				nt = makeNode(NullTest);
+				nt->arg = (Expr *)lfirst(xc);
+				nt->argisrow = false;
+				nt->nulltesttype = IS_NULL;
+				nt->location = -1;
+				/* make orclause (x_i = y_i OR y_i IS NULL OR x_i IS NULL) */
+				tempnode = (Node *)make_orclause(list_make2(tempnode,
+															(Node *)nt));
+			}
+			/*
+			 * Now append tempnode to quals as one andclause.
+			 * (x_i = y_i OR y_i IS NULL OR x_i IS NULL) AND
+			 * (x_j = y_j OR y_j IS NULL OR x_j IS NULL)...
+			 */
+			*quals = (Node *)make_andclause(list_make2(*quals, tempnode));
+			xc = lnext(xc);
+			yc = lnext(yc);
+			ytlc = lnext(ytlc);
+		}
+	}
+	/* quals should be either OpExpr or BoolExpr, otherwise don't convert */
+	else
+	{
+		return false;
+	}
+	return true;
+}
 
 /*
  * convert_ANY_sublink_to_join: try to convert an ANY SubLink to a join
@@ -1144,6 +1578,7 @@ inline_cte_walker(Node *node, inline_cte_walker_context *context)
  */
 JoinExpr *
 convert_ANY_sublink_to_join(PlannerInfo *root, SubLink *sublink,
+							bool under_not, Node **pullout,
 							Relids available_rels)
 {
 	JoinExpr   *result;
@@ -1224,11 +1659,22 @@ convert_ANY_sublink_to_join(PlannerInfo *root, SubLink *sublink,
 	 */
 	quals = convert_testexpr(root, sublink->testexpr, subquery_vars);
 
+	/*
+	 * Try converting x NOT IN (y) to ANTI JOIN.
+	 */
+	if(under_not &&
+			!convert_NOT_IN_to_join(root, &quals,
+					sublink, subquery_vars, pullout))
+	{
+		return NULL;
+	}
+
 	/*
 	 * And finally, build the JoinExpr node.
 	 */
 	result = makeNode(JoinExpr);
-	result->jointype = JOIN_SEMI;
+	/* NOT IN will be converted to ANTI JOIN */
+	result->jointype = under_not ? JOIN_ANTI : JOIN_SEMI;
 	result->isNatural = false;
 	result->larg = NULL;		/* caller must fill this in */
 	result->rarg = (Node *) rtr;
diff --git a/src/backend/optimizer/prep/prepjointree.c b/src/backend/optimizer/prep/prepjointree.c
index aebe162713..184d71cd7c 100644
--- a/src/backend/optimizer/prep/prepjointree.c
+++ b/src/backend/optimizer/prep/prepjointree.c
@@ -104,7 +104,7 @@ static Query *pullup_replace_vars_subquery(Query *query,
 static reduce_outer_joins_state *reduce_outer_joins_pass1(Node *jtnode);
 static void reduce_outer_joins_pass2(Node *jtnode,
 						 reduce_outer_joins_state *state,
-						 PlannerInfo *root,
+						 Query *parse,
 						 Relids nonnullable_rels,
 						 List *nonnullable_vars,
 						 List *forced_null_vars);
@@ -256,6 +256,13 @@ pull_up_sublinks_jointree_recurse(PlannerInfo *root, Node *jtnode,
 		}
 		/* Build the replacement FromExpr; no quals yet */
 		newf = makeFromExpr(newfromlist, NULL);
+		/*
+		 * Replace parse->jointree with newf now because we might modify join types
+		 * during reduce_outer_joins() in convert_NOT_IN_to_join() which is called
+		 * in pull_up_sublinks_qual_recurse() that's coming up next.
+		 */
+		newf->quals = f->quals;
+		root->parse->jointree = newf;
 		/* Set up a link representing the rebuilt jointree */
 		jtlink = (Node *) newf;
 		/* Now process qual --- all children are available for use */
@@ -388,7 +395,7 @@ pull_up_sublinks_qual_recurse(PlannerInfo *root, Node *node,
 		/* Is it a convertible ANY or EXISTS clause? */
 		if (sublink->subLinkType == ANY_SUBLINK)
 		{
-			if ((j = convert_ANY_sublink_to_join(root, sublink,
+			if ((j = convert_ANY_sublink_to_join(root, sublink, false, NULL,
 												 available_rels1)) != NULL)
 			{
 				/* Yes; insert the new join node into the join tree */
@@ -414,7 +421,7 @@ pull_up_sublinks_qual_recurse(PlannerInfo *root, Node *node,
 				return NULL;
 			}
 			if (available_rels2 != NULL &&
-				(j = convert_ANY_sublink_to_join(root, sublink,
+				(j = convert_ANY_sublink_to_join(root, sublink, false, NULL,
 												 available_rels2)) != NULL)
 			{
 				/* Yes; insert the new join node into the join tree */
@@ -560,6 +567,69 @@ pull_up_sublinks_qual_recurse(PlannerInfo *root, Node *node,
 					return NULL;
 				}
 			}
+			else if (sublink->subLinkType == ANY_SUBLINK)
+			{
+				Node *pullout = NULL;
+
+				if ((j = convert_ANY_sublink_to_join(root, sublink, true, &pullout,
+													 available_rels1)) != NULL)
+				{
+					/* Yes; insert the new join node into the join tree */
+					j->larg = *jtlink1;
+					*jtlink1 = (Node *) j;
+					/* Recursively process pulled-up jointree nodes */
+					j->rarg = pull_up_sublinks_jointree_recurse(root,
+																j->rarg,
+																&child_rels);
+
+					/*
+					 * Now recursively process the pulled-up quals.  Any inserted
+					 * joins can get stacked onto either j->larg or j->rarg,
+					 * depending on which rels they reference.
+					 */
+					j->quals = pull_up_sublinks_qual_recurse(root,
+															 j->quals,
+															 &j->larg,
+															 available_rels1,
+															 &j->rarg,
+															 child_rels);
+					/*
+					 * Return pullout predicate (x is NOT NULL) if it's not null,
+					 * otherwise return NULL representing constant TRUE.
+					 */
+					return pullout? pullout : NULL;
+				}
+				if (available_rels2 != NULL &&
+					(j = convert_ANY_sublink_to_join(root, sublink, true, &pullout,
+													 available_rels2)) != NULL)
+				{
+					Node *pullout = NULL;
+					/* Yes; insert the new join node into the join tree */
+					j->larg = *jtlink2;
+					*jtlink2 = (Node *) j;
+					/* Recursively process pulled-up jointree nodes */
+					j->rarg = pull_up_sublinks_jointree_recurse(root,
+																j->rarg,
+																&child_rels);
+
+					/*
+					 * Now recursively process the pulled-up quals.  Any inserted
+					 * joins can get stacked onto either j->larg or j->rarg,
+					 * depending on which rels they reference.
+					 */
+					j->quals = pull_up_sublinks_qual_recurse(root,
+															 j->quals,
+															 &j->larg,
+															 available_rels2,
+															 &j->rarg,
+															 child_rels);
+					/*
+					 * Return pullout predicate (x is NOT NULL) if it's not null,
+					 * otherwise return NULL representing constant TRUE.
+					 */
+					return pullout? pullout : NULL;
+				}
+			}
 		}
 		/* Else return it unmodified */
 		return node;
@@ -2377,7 +2447,7 @@ flatten_simple_union_all(PlannerInfo *root)
  * alias-var expansion).
  */
 void
-reduce_outer_joins(PlannerInfo *root)
+reduce_outer_joins(Query *parse)
 {
 	reduce_outer_joins_state *state;
 
@@ -2390,14 +2460,14 @@ reduce_outer_joins(PlannerInfo *root)
 	 * join(s) below each side of each join clause. The second pass examines
 	 * qual clauses and changes join types as it descends the tree.
 	 */
-	state = reduce_outer_joins_pass1((Node *) root->parse->jointree);
+	state = reduce_outer_joins_pass1((Node *) parse->jointree);
 
 	/* planner.c shouldn't have called me if no outer joins */
 	if (state == NULL || !state->contains_outer)
 		elog(ERROR, "so where are the outer joins?");
 
-	reduce_outer_joins_pass2((Node *) root->parse->jointree,
-							 state, root, NULL, NIL, NIL);
+	reduce_outer_joins_pass2((Node *) parse->jointree,
+							 state, parse, NULL, NIL, NIL);
 }
 
 /*
@@ -2480,7 +2550,7 @@ reduce_outer_joins_pass1(Node *jtnode)
 static void
 reduce_outer_joins_pass2(Node *jtnode,
 						 reduce_outer_joins_state *state,
-						 PlannerInfo *root,
+						 Query *parse,
 						 Relids nonnullable_rels,
 						 List *nonnullable_vars,
 						 List *forced_null_vars)
@@ -2520,7 +2590,7 @@ reduce_outer_joins_pass2(Node *jtnode,
 			reduce_outer_joins_state *sub_state = lfirst(s);
 
 			if (sub_state->contains_outer)
-				reduce_outer_joins_pass2(lfirst(l), sub_state, root,
+				reduce_outer_joins_pass2(lfirst(l), sub_state, parse,
 										 pass_nonnullable_rels,
 										 pass_nonnullable_vars,
 										 pass_forced_null_vars);
@@ -2632,7 +2702,7 @@ reduce_outer_joins_pass2(Node *jtnode,
 		/* Apply the jointype change, if any, to both jointree node and RTE */
 		if (rtindex && jointype != j->jointype)
 		{
-			RangeTblEntry *rte = rt_fetch(rtindex, root->parse->rtable);
+			RangeTblEntry *rte = rt_fetch(rtindex, parse->rtable);
 
 			Assert(rte->rtekind == RTE_JOIN);
 			Assert(rte->jointype == j->jointype);
@@ -2717,7 +2787,7 @@ reduce_outer_joins_pass2(Node *jtnode,
 					pass_nonnullable_vars = NIL;
 					pass_forced_null_vars = NIL;
 				}
-				reduce_outer_joins_pass2(j->larg, left_state, root,
+				reduce_outer_joins_pass2(j->larg, left_state, parse,
 										 pass_nonnullable_rels,
 										 pass_nonnullable_vars,
 										 pass_forced_null_vars);
@@ -2739,7 +2809,7 @@ reduce_outer_joins_pass2(Node *jtnode,
 					pass_nonnullable_vars = NIL;
 					pass_forced_null_vars = NIL;
 				}
-				reduce_outer_joins_pass2(j->rarg, right_state, root,
+				reduce_outer_joins_pass2(j->rarg, right_state, parse,
 										 pass_nonnullable_rels,
 										 pass_nonnullable_vars,
 										 pass_forced_null_vars);
diff --git a/src/backend/optimizer/util/relnode.c b/src/backend/optimizer/util/relnode.c
index 4130514952..96563bb180 100644
--- a/src/backend/optimizer/util/relnode.c
+++ b/src/backend/optimizer/util/relnode.c
@@ -28,6 +28,7 @@
 #include "optimizer/restrictinfo.h"
 #include "optimizer/tlist.h"
 #include "partitioning/partbounds.h"
+#include "parser/parsetree.h"
 #include "utils/hsearch.h"
 
 
@@ -65,6 +66,23 @@ static void build_child_join_reltarget(PlannerInfo *root,
 						   AppendRelInfo **appinfos);
 
 
+/* get RangeTable index from Oid of a relation and the Query Tree */
+Index get_relid_oid(Oid reloid, Query *parse)
+{
+	Index				 rti;
+	RangeTblEntry	*rte;
+
+	for(rti = 1; rti <  list_length(parse->rtable) + 1; rti++)
+	{
+		rte = rt_fetch(rti, parse->rtable);
+		if(rte->relid == reloid)
+		{
+			return rti;
+		}
+	}
+	return 0;
+}
+
 /*
  * setup_simple_rel_arrays
  *	  Prepare the arrays we use for quickly accessing base relations.
diff --git a/src/backend/utils/cache/lsyscache.c b/src/backend/utils/cache/lsyscache.c
index e88c45d268..0b9436628a 100644
--- a/src/backend/utils/cache/lsyscache.c
+++ b/src/backend/utils/cache/lsyscache.c
@@ -878,6 +878,35 @@ get_atttypetypmodcoll(Oid relid, AttrNumber attnum,
 	ReleaseSysCache(tp);
 }
 
+/*
+ * get_attnotnull
+ *		Given the relation id and the attribute number,
+ *		return the "attnotnull" field from the attribute relation.
+ */
+bool
+get_attnotnull(Oid relid, AttrNumber attnum)
+{
+	HeapTuple	  tp;
+	Form_pg_attribute att_tup;
+	bool              result;
+
+	tp = SearchSysCache2(ATTNUM,
+			ObjectIdGetDatum(relid),
+			Int16GetDatum(attnum));
+
+	if (!HeapTupleIsValid(tp))
+	{
+		elog(ERROR, "cache lookup failed for attribute %d of relation %u",
+			 attnum, relid);
+	}
+
+	att_tup = (Form_pg_attribute) GETSTRUCT(tp);
+
+	result = att_tup->attnotnull;
+	ReleaseSysCache(tp);
+	return result;
+}
+
 /*				---------- COLLATION CACHE ----------					 */
 
 /*
diff --git a/src/include/optimizer/pathnode.h b/src/include/optimizer/pathnode.h
index 574bb85b50..5e85be2eca 100644
--- a/src/include/optimizer/pathnode.h
+++ b/src/include/optimizer/pathnode.h
@@ -275,6 +275,7 @@ extern Path *reparameterize_path_by_child(PlannerInfo *root, Path *path,
 /*
  * prototypes for relnode.c
  */
+extern Index get_relid_oid(Oid reloid, Query *parse);
 extern void setup_simple_rel_arrays(PlannerInfo *root);
 extern void setup_append_rel_array(PlannerInfo *root);
 extern RelOptInfo *build_simple_rel(PlannerInfo *root, int relid,
diff --git a/src/include/optimizer/prep.h b/src/include/optimizer/prep.h
index a9b2c9026c..ec8af5abf4 100644
--- a/src/include/optimizer/prep.h
+++ b/src/include/optimizer/prep.h
@@ -26,7 +26,7 @@ extern void pull_up_sublinks(PlannerInfo *root);
 extern void inline_set_returning_functions(PlannerInfo *root);
 extern void pull_up_subqueries(PlannerInfo *root);
 extern void flatten_simple_union_all(PlannerInfo *root);
-extern void reduce_outer_joins(PlannerInfo *root);
+extern void reduce_outer_joins(Query *parse);
 extern void remove_useless_result_rtes(PlannerInfo *root);
 extern Relids get_relids_in_jointree(Node *jtnode, bool include_joins);
 extern Relids get_relids_for_join(Query *query, int joinrelid);
diff --git a/src/include/optimizer/subselect.h b/src/include/optimizer/subselect.h
index 2d2c3bcbc0..ea2abaa131 100644
--- a/src/include/optimizer/subselect.h
+++ b/src/include/optimizer/subselect.h
@@ -19,6 +19,8 @@
 extern void SS_process_ctes(PlannerInfo *root);
 extern JoinExpr *convert_ANY_sublink_to_join(PlannerInfo *root,
 							SubLink *sublink,
+							bool under_not,
+							Node ** pullout,
 							Relids available_rels);
 extern JoinExpr *convert_EXISTS_sublink_to_join(PlannerInfo *root,
 							   SubLink *sublink,
diff --git a/src/include/utils/lsyscache.h b/src/include/utils/lsyscache.h
index 16b0b1d2dc..71370d8877 100644
--- a/src/include/utils/lsyscache.h
+++ b/src/include/utils/lsyscache.h
@@ -89,6 +89,7 @@ extern AttrNumber get_attnum(Oid relid, const char *attname);
 extern Oid	get_atttype(Oid relid, AttrNumber attnum);
 extern void get_atttypetypmodcoll(Oid relid, AttrNumber attnum,
 					  Oid *typid, int32 *typmod, Oid *collid);
+extern bool get_attnotnull(Oid relid, AttrNumber attnum);
 extern char *get_collation_name(Oid colloid);
 extern char *get_constraint_name(Oid conoid);
 extern char *get_language_name(Oid langoid, bool missing_ok);
diff --git a/src/test/regress/expected/subselect.out b/src/test/regress/expected/subselect.out
index cc3f5f3737..b56f6f8040 100644
--- a/src/test/regress/expected/subselect.out
+++ b/src/test/regress/expected/subselect.out
@@ -1310,3 +1310,1231 @@ select * from x for update;
    Output: subselect_tbl.f1, subselect_tbl.f2, subselect_tbl.f3
 (2 rows)
 
+-- test NON IN to ANTI JOIN conversion
+CREATE TABLE s (u INTEGER NOT NULL, n INTEGER NULL, nn INTEGER NOT NULL, p VARCHAR(128) NULL);
+insert into s (u, n, nn, p)
+    select
+    generate_series(1,3) as u,
+	generate_series(1,3) as n,
+	generate_series(1,3) as nn,
+	'foo' as p;
+insert into s values(1000002, 1000002, 1000002, 'foofoo');
+UPDATE s set n = NULL WHERE n = 3;
+CREATE UNIQUE INDEX s_u ON s (u);
+CREATE INDEX s_n ON s (n);
+CREATE INDEX s_nn ON s (nn);
+CREATE TABLE l (u INTEGER NOT NULL, n INTEGER NULL, nn INTEGER NOT NULL, p VARCHAR(128) NULL);
+insert into l (u, n, nn, p)
+	select
+    generate_series(1,10000 ) as u,
+	generate_series(1,10000 ) as n,
+	generate_series(1,10000 ) as nn,
+	'bar' as p;
+UPDATE l set n = NULL WHERE n = 7;
+CREATE UNIQUE INDEX l_u ON l (u);
+CREATE INDEX l_n ON l (n);
+CREATE INDEX l_nn ON l (nn);
+CREATE TABLE s1 (u INTEGER NOT NULL, n INTEGER NULL, nn INTEGER NOT NULL, p VARCHAR(128) NULL);
+insert into s1 (u, n, nn, p)
+    select
+    generate_series(1,3) as u,
+	generate_series(1,3) as n,
+	generate_series(1,3) as nn,
+	'foo' as p;
+insert into s1 values(1000003, 1000003, 1000003, 'foofoo');
+insert into s1 values(1003, 1003, 1003, 'foofoo');
+UPDATE s1 set n = NULL WHERE n = 3;
+-- test single expression
+explain (costs false) select * from s where n not in (select n from l);
+                    QUERY PLAN                    
+--------------------------------------------------
+ Nested Loop Anti Join
+   ->  Seq Scan on s
+         Filter: (n IS NOT NULL)
+   ->  Bitmap Heap Scan on l
+         Recheck Cond: ((s.n = n) OR (n IS NULL))
+         ->  BitmapOr
+               ->  Bitmap Index Scan on l_n
+                     Index Cond: (n = s.n)
+               ->  Bitmap Index Scan on l_n
+                     Index Cond: (n IS NULL)
+(10 rows)
+
+select * from s where n not in (select n from l);
+ u | n | nn | p 
+---+---+----+---
+(0 rows)
+
+explain (costs false) select * from s where u not in (select u from l);
+              QUERY PLAN              
+--------------------------------------
+ Nested Loop Anti Join
+   ->  Seq Scan on s
+   ->  Index Only Scan using l_u on l
+         Index Cond: (u = s.u)
+(4 rows)
+
+select * from s where u not in (select u from l);
+    u    |    n    |   nn    |   p    
+---------+---------+---------+--------
+ 1000002 | 1000002 | 1000002 | foofoo
+(1 row)
+
+explain (costs false) select * from s where 3*n not in (select n from l);
+                       QUERY PLAN                       
+--------------------------------------------------------
+ Nested Loop Anti Join
+   ->  Seq Scan on s
+         Filter: ((3 * n) IS NOT NULL)
+   ->  Bitmap Heap Scan on l
+         Recheck Cond: (((3 * s.n) = n) OR (n IS NULL))
+         ->  BitmapOr
+               ->  Bitmap Index Scan on l_n
+                     Index Cond: (n = (3 * s.n))
+               ->  Bitmap Index Scan on l_n
+                     Index Cond: (n IS NULL)
+(10 rows)
+
+select * from s where 3*n not in (select n from l);
+ u | n | nn | p 
+---+---+----+---
+(0 rows)
+
+explain (costs false) select * from s where n not in (select 3*n from l);
+                        QUERY PLAN                         
+-----------------------------------------------------------
+ Nested Loop Anti Join
+   Join Filter: ((s.n = (3 * l.n)) OR ((3 * l.n) IS NULL))
+   ->  Seq Scan on s
+         Filter: (n IS NOT NULL)
+   ->  Materialize
+         ->  Seq Scan on l
+(6 rows)
+
+select * from s where n not in (select 3*n from l);
+ u | n | nn | p 
+---+---+----+---
+(0 rows)
+
+-- test multi expression
+explain (costs false) select * from s where (n,u) not in (select n,u from l);
+                         QUERY PLAN                          
+-------------------------------------------------------------
+ Nested Loop Anti Join
+   ->  Seq Scan on s
+   ->  Index Scan using l_u on l
+         Index Cond: (u = s.u)
+         Filter: ((s.n = n) OR (n IS NULL) OR (s.n IS NULL))
+(5 rows)
+
+select * from s where (n,u) not in (select n,u from l);
+    u    |    n    |   nn    |   p    
+---------+---------+---------+--------
+ 1000002 | 1000002 | 1000002 | foofoo
+(1 row)
+
+explain (costs false) select * from s where (u, nn) not in (select u, nn from l);
+           QUERY PLAN            
+---------------------------------
+ Nested Loop Anti Join
+   ->  Seq Scan on s
+   ->  Index Scan using l_u on l
+         Index Cond: (u = s.u)
+         Filter: (s.nn = nn)
+(5 rows)
+
+select * from s where (u, nn) not in (select u, nn from l);
+    u    |    n    |   nn    |   p    
+---------+---------+---------+--------
+ 1000002 | 1000002 | 1000002 | foofoo
+(1 row)
+
+explain (costs false) select * from s where (n,u) not in (select u,n from l);
+                    QUERY PLAN                    
+--------------------------------------------------
+ Nested Loop Anti Join
+   ->  Seq Scan on s
+   ->  Bitmap Heap Scan on l
+         Recheck Cond: ((s.u = n) OR (n IS NULL))
+         Filter: ((s.n = u) OR (s.n IS NULL))
+         ->  BitmapOr
+               ->  Bitmap Index Scan on l_n
+                     Index Cond: (n = s.u)
+               ->  Bitmap Index Scan on l_n
+                     Index Cond: (n IS NULL)
+(10 rows)
+
+select * from s where (n,u) not in (select u,n from l);
+    u    |    n    |   nn    |   p    
+---------+---------+---------+--------
+ 1000002 | 1000002 | 1000002 | foofoo
+(1 row)
+
+explain (costs false) select * from s where (n,u,nn) not in (select u,n,nn from l);
+                             QUERY PLAN                             
+--------------------------------------------------------------------
+ Nested Loop Anti Join
+   ->  Seq Scan on s
+   ->  Bitmap Heap Scan on l
+         Recheck Cond: ((s.nn = nn) AND ((s.u = n) OR (n IS NULL)))
+         Filter: ((s.n = u) OR (s.n IS NULL))
+         ->  BitmapAnd
+               ->  Bitmap Index Scan on l_nn
+                     Index Cond: (nn = s.nn)
+               ->  BitmapOr
+                     ->  Bitmap Index Scan on l_n
+                           Index Cond: (n = s.u)
+                     ->  Bitmap Index Scan on l_n
+                           Index Cond: (n IS NULL)
+(13 rows)
+
+select * from s where (n,u,nn) not in (select u,n,nn from l);
+    u    |    n    |   nn    |   p    
+---------+---------+---------+--------
+ 1000002 | 1000002 | 1000002 | foofoo
+(1 row)
+
+explain (costs false) select * from s where (n,u,nn) not in (select u,n,nn from l where u > 1000);
+                             QUERY PLAN                             
+--------------------------------------------------------------------
+ Nested Loop Anti Join
+   ->  Seq Scan on s
+   ->  Bitmap Heap Scan on l
+         Recheck Cond: ((s.nn = nn) AND ((s.u = n) OR (n IS NULL)))
+         Filter: ((u > 1000) AND ((s.n = u) OR (s.n IS NULL)))
+         ->  BitmapAnd
+               ->  Bitmap Index Scan on l_nn
+                     Index Cond: (nn = s.nn)
+               ->  BitmapOr
+                     ->  Bitmap Index Scan on l_n
+                           Index Cond: (n = s.u)
+                     ->  Bitmap Index Scan on l_n
+                           Index Cond: (n IS NULL)
+(13 rows)
+
+select * from s where (n,u,nn) not in (select u,n,nn from l where u > 1000);
+    u    |    n    |   nn    |   p    
+---------+---------+---------+--------
+       1 |       1 |       1 | foo
+       2 |       2 |       2 | foo
+ 1000002 | 1000002 | 1000002 | foofoo
+       3 |         |       3 | foo
+(4 rows)
+
+-- test multi-table
+explain (costs false) select count(*) from s, l where s.n not in (select n from l);
+                             QUERY PLAN                             
+--------------------------------------------------------------------
+ Aggregate
+   ->  Nested Loop
+         ->  Seq Scan on l
+         ->  Materialize
+               ->  Nested Loop Anti Join
+                     ->  Seq Scan on s
+                           Filter: (n IS NOT NULL)
+                     ->  Bitmap Heap Scan on l l_1
+                           Recheck Cond: ((s.n = n) OR (n IS NULL))
+                           ->  BitmapOr
+                                 ->  Bitmap Index Scan on l_n
+                                       Index Cond: (n = s.n)
+                                 ->  Bitmap Index Scan on l_n
+                                       Index Cond: (n IS NULL)
+(14 rows)
+
+select count(*) from s, l where s.n not in (select n from l);
+ count 
+-------
+     0
+(1 row)
+
+explain (costs false) select count(*) from s, l where s.nn not in (select nn from l);
+                         QUERY PLAN                          
+-------------------------------------------------------------
+ Aggregate
+   ->  Nested Loop
+         ->  Seq Scan on l
+         ->  Materialize
+               ->  Nested Loop Anti Join
+                     ->  Seq Scan on s
+                     ->  Index Only Scan using l_nn on l l_1
+                           Index Cond: (nn = s.nn)
+(8 rows)
+
+select count(*) from s, l where s.nn not in (select nn from l);
+ count 
+-------
+ 10000
+(1 row)
+
+-- test null padded results from outer join
+explain (costs false) select * from s where n not in (select s.nn from l left join s on l.nn = s.nn);
+                     QUERY PLAN                      
+-----------------------------------------------------
+ Nested Loop Anti Join
+   Join Filter: ((s.n = s_1.nn) OR (s_1.nn IS NULL))
+   ->  Seq Scan on s
+         Filter: (n IS NOT NULL)
+   ->  Hash Left Join
+         Hash Cond: (l.nn = s_1.nn)
+         ->  Seq Scan on l
+         ->  Hash
+               ->  Seq Scan on s s_1
+(9 rows)
+
+select * from s where n not in (select s.nn from l left join s on l.nn = s.nn);
+ u | n | nn | p 
+---+---+----+---
+(0 rows)
+
+explain (costs false) select * from s where n not in (select s.nn from s right join l on s.nn = l.nn);
+                     QUERY PLAN                      
+-----------------------------------------------------
+ Nested Loop Anti Join
+   Join Filter: ((s.n = s_1.nn) OR (s_1.nn IS NULL))
+   ->  Seq Scan on s
+         Filter: (n IS NOT NULL)
+   ->  Hash Left Join
+         Hash Cond: (l.nn = s_1.nn)
+         ->  Seq Scan on l
+         ->  Hash
+               ->  Seq Scan on s s_1
+(9 rows)
+
+select * from s where n not in (select s.nn from s right join l on s.nn = l.nn);
+ u | n | nn | p 
+---+---+----+---
+(0 rows)
+
+explain (costs false) select count(*) from s right join l on s.nn = l.nn where l.nn not in (select nn from s);
+                QUERY PLAN                 
+-------------------------------------------
+ Aggregate
+   ->  Hash Left Join
+         Hash Cond: (l.nn = s.nn)
+         ->  Hash Anti Join
+               Hash Cond: (l.nn = s_1.nn)
+               ->  Seq Scan on l
+               ->  Hash
+                     ->  Seq Scan on s s_1
+         ->  Hash
+               ->  Seq Scan on s
+(10 rows)
+
+select count(*) from s right join l on s.nn = l.nn where l.nn not in (select nn from s);
+ count 
+-------
+  9997
+(1 row)
+
+explain (costs false) select count(*) from s right join l on s.nn = l.nn where s.nn not in (select nn from s);
+                  QUERY PLAN                  
+----------------------------------------------
+ Aggregate
+   ->  Nested Loop
+         ->  Hash Anti Join
+               Hash Cond: (s.nn = s_1.nn)
+               ->  Seq Scan on s
+                     Filter: (nn IS NOT NULL)
+               ->  Hash
+                     ->  Seq Scan on s s_1
+         ->  Bitmap Heap Scan on l
+               Recheck Cond: (nn = s.nn)
+               ->  Bitmap Index Scan on l_nn
+                     Index Cond: (nn = s.nn)
+(12 rows)
+
+select count(*) from s right join l on s.nn = l.nn where s.nn not in (select nn from s);
+ count 
+-------
+     0
+(1 row)
+
+explain (costs false) select count(*) from s right join l on s.nn=l.nn where l.nn not in (select l.nn from l left join s on l.nn = s.nn);
+                       QUERY PLAN                       
+--------------------------------------------------------
+ Aggregate
+   ->  Hash Left Join
+         Hash Cond: (l.nn = s.nn)
+         ->  Hash Anti Join
+               Hash Cond: (l.nn = l_1.nn)
+               ->  Seq Scan on l
+               ->  Hash
+                     ->  Hash Left Join
+                           Hash Cond: (l_1.nn = s_1.nn)
+                           ->  Seq Scan on l l_1
+                           ->  Hash
+                                 ->  Seq Scan on s s_1
+         ->  Hash
+               ->  Seq Scan on s
+(14 rows)
+
+select count(*) from s right join l on s.nn=l.nn where l.nn not in (select l.nn from l left join s on l.nn = s.nn);
+ count 
+-------
+     0
+(1 row)
+
+explain (costs false) select count(*) from s right join l on s.nn=l.nn where s.nn not in (select s.nn from l left join s on l.nn = s.nn);
+                            QUERY PLAN                            
+------------------------------------------------------------------
+ Aggregate
+   ->  Nested Loop
+         ->  Nested Loop Anti Join
+               Join Filter: ((s.nn = s_1.nn) OR (s_1.nn IS NULL))
+               ->  Seq Scan on s
+                     Filter: (nn IS NOT NULL)
+               ->  Hash Left Join
+                     Hash Cond: (l_1.nn = s_1.nn)
+                     ->  Seq Scan on l l_1
+                     ->  Hash
+                           ->  Seq Scan on s s_1
+         ->  Bitmap Heap Scan on l
+               Recheck Cond: (nn = s.nn)
+               ->  Bitmap Index Scan on l_nn
+                     Index Cond: (nn = s.nn)
+(15 rows)
+
+select count(*) from s right join l on s.nn=l.nn where s.nn not in (select s.nn from l left join s on l.nn = s.nn);
+ count 
+-------
+     0
+(1 row)
+
+explain (costs false) select count(*) from s left join s1 on s.u=s1.u join l on s.u=l.u where s.nn not in (select nn from l);
+                            QUERY PLAN                             
+-------------------------------------------------------------------
+ Aggregate
+   ->  Hash Right Join
+         Hash Cond: (s1.u = s.u)
+         ->  Seq Scan on s1
+         ->  Hash
+               ->  Nested Loop
+                     ->  Nested Loop Anti Join
+                           ->  Seq Scan on s
+                           ->  Index Only Scan using l_nn on l l_1
+                                 Index Cond: (nn = s.nn)
+                     ->  Index Only Scan using l_u on l
+                           Index Cond: (u = s.u)
+(12 rows)
+
+select count(*) from s left join s1 on s.u=s1.u join l on s.u=l.u where s.nn not in (select nn from l);
+ count 
+-------
+     0
+(1 row)
+
+explain (costs false) select count(*) from s left join s1 on s.u=s1.u right join l on s.u=l.u where s.nn not in (select nn from l);
+                            QUERY PLAN                             
+-------------------------------------------------------------------
+ Aggregate
+   ->  Hash Right Join
+         Hash Cond: (s1.u = s.u)
+         ->  Seq Scan on s1
+         ->  Hash
+               ->  Nested Loop
+                     ->  Nested Loop Anti Join
+                           ->  Seq Scan on s
+                                 Filter: (nn IS NOT NULL)
+                           ->  Index Only Scan using l_nn on l l_1
+                                 Index Cond: (nn = s.nn)
+                     ->  Index Only Scan using l_u on l
+                           Index Cond: (u = s.u)
+(13 rows)
+
+select count(*) from s left join s1 on s.u=s1.u right join l on s.u=l.u where s.nn not in (select nn from l);
+ count 
+-------
+     0
+(1 row)
+
+explain (costs false) select count(*) from s left join s1 on s.u=s1.u left join l on s.u=l.u where s.nn not in (select nn from l);
+                       QUERY PLAN                        
+---------------------------------------------------------
+ Aggregate
+   ->  Hash Right Join
+         Hash Cond: (s1.u = s.u)
+         ->  Seq Scan on s1
+         ->  Hash
+               ->  Nested Loop Anti Join
+                     ->  Seq Scan on s
+                     ->  Index Only Scan using l_nn on l
+                           Index Cond: (nn = s.nn)
+(9 rows)
+
+select count(*) from s left join s1 on s.u=s1.u left join l on s.u=l.u where s.nn not in (select nn from l);
+ count 
+-------
+     1
+(1 row)
+
+explain (costs false) select count(*) from s right join s1 on s.u=s1.u join l on s.u=l.u where s.nn not in (select nn from l);
+                            QUERY PLAN                             
+-------------------------------------------------------------------
+ Aggregate
+   ->  Nested Loop
+         Join Filter: (s.u = l.u)
+         ->  Hash Join
+               Hash Cond: (s1.u = s.u)
+               ->  Seq Scan on s1
+               ->  Hash
+                     ->  Nested Loop Anti Join
+                           ->  Seq Scan on s
+                           ->  Index Only Scan using l_nn on l l_1
+                                 Index Cond: (nn = s.nn)
+         ->  Index Only Scan using l_u on l
+               Index Cond: (u = s1.u)
+(13 rows)
+
+select count(*) from s right join s1 on s.u=s1.u join l on s.u=l.u where s.nn not in (select nn from l);
+ count 
+-------
+     0
+(1 row)
+
+explain (costs false) select count(*) from s join s1 on s.u=s1.u right join l on s.u=l.u where s.nn not in (select nn from l);
+                            QUERY PLAN                             
+-------------------------------------------------------------------
+ Aggregate
+   ->  Nested Loop
+         Join Filter: (s.u = l.u)
+         ->  Hash Join
+               Hash Cond: (s1.u = s.u)
+               ->  Seq Scan on s1
+               ->  Hash
+                     ->  Nested Loop Anti Join
+                           ->  Seq Scan on s
+                                 Filter: (nn IS NOT NULL)
+                           ->  Index Only Scan using l_nn on l l_1
+                                 Index Cond: (nn = s.nn)
+         ->  Index Only Scan using l_u on l
+               Index Cond: (u = s1.u)
+(14 rows)
+
+select * from s join s1 on s.u=s1.u right join l on s.u=l.u where s.nn not in (select nn from l);
+ u | n | nn | p | u | n | nn | p | u | n | nn | p 
+---+---+----+---+---+---+----+---+---+---+----+---
+(0 rows)
+
+explain (costs false) select count(*) from s full join s1 on s.u=s1.u join l on s.u=l.u where s.nn not in (select nn from l);
+                            QUERY PLAN                             
+-------------------------------------------------------------------
+ Aggregate
+   ->  Hash Right Join
+         Hash Cond: (s1.u = s.u)
+         ->  Seq Scan on s1
+         ->  Hash
+               ->  Nested Loop
+                     ->  Nested Loop Anti Join
+                           ->  Seq Scan on s
+                           ->  Index Only Scan using l_nn on l l_1
+                                 Index Cond: (nn = s.nn)
+                     ->  Index Only Scan using l_u on l
+                           Index Cond: (u = s.u)
+(12 rows)
+
+select count(*) from s full join s1 on s.u=s1.u join l on s.u=l.u where s.nn not in (select nn from l);
+ count 
+-------
+     0
+(1 row)
+
+explain (costs false) select count(*) from s join s1 on s.u=s1.u full join l on s.u=l.u where s.nn not in (select nn from l);
+                       QUERY PLAN                        
+---------------------------------------------------------
+ Aggregate
+   ->  Hash Join
+         Hash Cond: (s1.u = s.u)
+         ->  Seq Scan on s1
+         ->  Hash
+               ->  Nested Loop Anti Join
+                     ->  Seq Scan on s
+                           Filter: (nn IS NOT NULL)
+                     ->  Index Only Scan using l_nn on l
+                           Index Cond: (nn = s.nn)
+(10 rows)
+
+select count(*) from s join s1 on s.u=s1.u full join l on s.u=l.u where s.nn not in (select nn from l);
+ count 
+-------
+     0
+(1 row)
+
+explain (costs false) select * from s where s.nn not in (select l.nn from l left join s on l.nn=s.nn left join s1 on l.nn=s1.nn);
+                       QUERY PLAN                        
+---------------------------------------------------------
+ Nested Loop Anti Join
+   ->  Seq Scan on s
+   ->  Hash Right Join
+         Hash Cond: (s1.nn = l.nn)
+         ->  Seq Scan on s1
+         ->  Hash
+               ->  Hash Left Join
+                     Hash Cond: (l.nn = s_1.nn)
+                     ->  Bitmap Heap Scan on l
+                           Recheck Cond: (s.nn = nn)
+                           ->  Bitmap Index Scan on l_nn
+                                 Index Cond: (nn = s.nn)
+                     ->  Hash
+                           ->  Seq Scan on s s_1
+(14 rows)
+
+select * from s where s.nn not in (select l.nn from l left join s on l.nn=s.nn left join s1 on l.nn=s1.nn);
+    u    |    n    |   nn    |   p    
+---------+---------+---------+--------
+ 1000002 | 1000002 | 1000002 | foofoo
+(1 row)
+
+explain (costs false) select * from s where s.nn not in (select l.nn from l left join s on l.nn=s.nn right join s1 on l.nn=s1.nn);
+                       QUERY PLAN                        
+---------------------------------------------------------
+ Nested Loop Anti Join
+   Join Filter: ((s.nn = l.nn) OR (l.nn IS NULL))
+   ->  Seq Scan on s
+   ->  Materialize
+         ->  Merge Right Join
+               Merge Cond: (l.nn = s1.nn)
+               ->  Merge Left Join
+                     Merge Cond: (l.nn = s_1.nn)
+                     ->  Index Only Scan using l_nn on l
+                     ->  Sort
+                           Sort Key: s_1.nn
+                           ->  Seq Scan on s s_1
+               ->  Sort
+                     Sort Key: s1.nn
+                     ->  Seq Scan on s1
+(15 rows)
+
+select * from s where s.nn not in (select l.nn from l left join s on l.nn=s.nn right join s1 on l.nn=s1.nn);
+ u | n | nn | p 
+---+---+----+---
+(0 rows)
+
+explain (costs false) select * from s where (n,u,nn) not in (select l.n,l.u,l.nn from l left join s on l.nn = s.nn);
+                                     QUERY PLAN                                      
+-------------------------------------------------------------------------------------
+ Nested Loop Anti Join
+   ->  Seq Scan on s
+   ->  Nested Loop Left Join
+         ->  Index Scan using l_u on l
+               Index Cond: (u = s.u)
+               Filter: (((s.n = n) OR (n IS NULL) OR (s.n IS NULL)) AND (s.nn = nn))
+         ->  Index Only Scan using s_nn on s s_1
+               Index Cond: (nn = l.nn)
+(8 rows)
+
+select * from s where (n,u,nn) not in (select l.n,l.u,l.nn from l left join s on l.nn = s.nn);
+    u    |    n    |   nn    |   p    
+---------+---------+---------+--------
+ 1000002 | 1000002 | 1000002 | foofoo
+(1 row)
+
+explain (costs false) select * from s where (n,u,nn) not in (select l.n,l.u,l.nn from l right join s on l.nn = s.nn);
+                                                                QUERY PLAN                                                                 
+-------------------------------------------------------------------------------------------------------------------------------------------
+ Nested Loop Anti Join
+   Join Filter: (((s.n = l.n) OR (l.n IS NULL) OR (s.n IS NULL)) AND ((s.u = l.u) OR (l.u IS NULL)) AND ((s.nn = l.nn) OR (l.nn IS NULL)))
+   ->  Seq Scan on s
+   ->  Materialize
+         ->  Nested Loop Left Join
+               ->  Seq Scan on s s_1
+               ->  Bitmap Heap Scan on l
+                     Recheck Cond: (nn = s_1.nn)
+                     ->  Bitmap Index Scan on l_nn
+                           Index Cond: (nn = s_1.nn)
+(10 rows)
+
+select * from s where (n,u,nn) not in (select l.n,l.u,l.nn from l left join s on l.nn = s.nn);
+    u    |    n    |   nn    |   p    
+---------+---------+---------+--------
+ 1000002 | 1000002 | 1000002 | foofoo
+(1 row)
+
+--test reduce outer joins from outer query
+explain (costs false) select count(*) from s right join l on s.nn = l.nn where s.nn not in (select nn from l);
+                      QUERY PLAN                       
+-------------------------------------------------------
+ Aggregate
+   ->  Nested Loop
+         ->  Nested Loop Anti Join
+               ->  Seq Scan on s
+                     Filter: (nn IS NOT NULL)
+               ->  Index Only Scan using l_nn on l l_1
+                     Index Cond: (nn = s.nn)
+         ->  Bitmap Heap Scan on l
+               Recheck Cond: (nn = s.nn)
+               ->  Bitmap Index Scan on l_nn
+                     Index Cond: (nn = s.nn)
+(11 rows)
+
+select count(*) from s right join l on s.nn = l.nn where s.nn not in (select nn from l);
+ count 
+-------
+     0
+(1 row)
+
+explain (costs false) select count(*) from s right join l on s.nn = l.nn where s.nn not in (select nn from l) and s.u>0;
+                      QUERY PLAN                       
+-------------------------------------------------------
+ Aggregate
+   ->  Nested Loop
+         ->  Nested Loop Anti Join
+               ->  Seq Scan on s
+                     Filter: (u > 0)
+               ->  Index Only Scan using l_nn on l l_1
+                     Index Cond: (nn = s.nn)
+         ->  Bitmap Heap Scan on l
+               Recheck Cond: (nn = s.nn)
+               ->  Bitmap Index Scan on l_nn
+                     Index Cond: (nn = s.nn)
+(11 rows)
+
+select count(*) from s right join l on s.nn = l.nn where s.nn not in (select nn from l) and s.u>0;
+ count 
+-------
+     0
+(1 row)
+
+explain (costs false) select count(*) from s right join l on s.nn = l.nn join s1 on s.u = s1.u where s.nn not in (select nn from l);
+                         QUERY PLAN                          
+-------------------------------------------------------------
+ Aggregate
+   ->  Hash Join
+         Hash Cond: (s.u = s1.u)
+         ->  Nested Loop
+               ->  Nested Loop Anti Join
+                     ->  Seq Scan on s
+                     ->  Index Only Scan using l_nn on l l_1
+                           Index Cond: (nn = s.nn)
+               ->  Bitmap Heap Scan on l
+                     Recheck Cond: (nn = s.nn)
+                     ->  Bitmap Index Scan on l_nn
+                           Index Cond: (nn = s.nn)
+         ->  Hash
+               ->  Seq Scan on s1
+(14 rows)
+
+select count(*) from s right join l on s.nn = l.nn join s1 on s.u = s1.u where s.nn not in (select nn from l);
+ count 
+-------
+     0
+(1 row)
+
+explain (costs false) select count(*) from s right join l on s.nn = l.nn right join s1 on s.u = s1.u where s.nn not in (select nn from l);
+                         QUERY PLAN                          
+-------------------------------------------------------------
+ Aggregate
+   ->  Hash Join
+         Hash Cond: (s.u = s1.u)
+         ->  Nested Loop
+               ->  Nested Loop Anti Join
+                     ->  Seq Scan on s
+                           Filter: (nn IS NOT NULL)
+                     ->  Index Only Scan using l_nn on l l_1
+                           Index Cond: (nn = s.nn)
+               ->  Bitmap Heap Scan on l
+                     Recheck Cond: (nn = s.nn)
+                     ->  Bitmap Index Scan on l_nn
+                           Index Cond: (nn = s.nn)
+         ->  Hash
+               ->  Seq Scan on s1
+(15 rows)
+
+select count(*) from s right join l on s.nn = l.nn right join s1 on s.u = s1.u where s.nn not in (select nn from l);
+ count 
+-------
+     0
+(1 row)
+
+explain (costs false) select count(*) from s right join l on s.nn = l.nn left join s1 on s.u = s1.u where s.nn not in (select nn from l);
+                         QUERY PLAN                          
+-------------------------------------------------------------
+ Aggregate
+   ->  Hash Left Join
+         Hash Cond: (s.u = s1.u)
+         ->  Nested Loop
+               ->  Nested Loop Anti Join
+                     ->  Seq Scan on s
+                           Filter: (nn IS NOT NULL)
+                     ->  Index Only Scan using l_nn on l l_1
+                           Index Cond: (nn = s.nn)
+               ->  Bitmap Heap Scan on l
+                     Recheck Cond: (nn = s.nn)
+                     ->  Bitmap Index Scan on l_nn
+                           Index Cond: (nn = s.nn)
+         ->  Hash
+               ->  Seq Scan on s1
+(15 rows)
+
+select count(*) from s right join l on s.nn = l.nn left join s1 on s.u = s1.u where s.nn not in (select nn from l);
+ count 
+-------
+     0
+(1 row)
+
+--test reduce outer joins from subquery
+explain (costs false) select * from s where nn not in (select l.nn from l right join s on l.nn = s.nn);
+                     QUERY PLAN                      
+-----------------------------------------------------
+ Nested Loop Anti Join
+   Join Filter: ((s.nn = l.nn) OR (l.nn IS NULL))
+   ->  Seq Scan on s
+   ->  Materialize
+         ->  Nested Loop Left Join
+               ->  Seq Scan on s s_1
+               ->  Bitmap Heap Scan on l
+                     Recheck Cond: (nn = s_1.nn)
+                     ->  Bitmap Index Scan on l_nn
+                           Index Cond: (nn = s_1.nn)
+(10 rows)
+
+select * from s where nn not in (select l.nn from l right join s on l.nn = s.nn);
+ u | n | nn | p 
+---+---+----+---
+(0 rows)
+
+explain (costs false) select * from s where nn not in (select l.nn from l right join s on l.nn = s.nn where l.u > 9);
+                        QUERY PLAN                         
+-----------------------------------------------------------
+ Nested Loop Anti Join
+   ->  Seq Scan on s
+   ->  Nested Loop
+         ->  Seq Scan on s s_1
+         ->  Index Scan using l_nn on l
+               Index Cond: ((nn = s.nn) AND (nn = s_1.nn))
+               Filter: (u > 9)
+(7 rows)
+
+select * from s where nn not in (select l.nn from l right join s on l.nn = s.nn where l.u > 9);
+    u    |    n    |   nn    |   p    
+---------+---------+---------+--------
+       1 |       1 |       1 | foo
+       2 |       2 |       2 | foo
+ 1000002 | 1000002 | 1000002 | foofoo
+       3 |         |       3 | foo
+(4 rows)
+
+explain (costs false) select * from s where nn not in (select l.nn from l right join s on l.nn = s.nn where s.u > 9);
+                     QUERY PLAN                      
+-----------------------------------------------------
+ Nested Loop Anti Join
+   Join Filter: ((s.nn = l.nn) OR (l.nn IS NULL))
+   ->  Seq Scan on s
+   ->  Materialize
+         ->  Nested Loop Left Join
+               ->  Seq Scan on s s_1
+                     Filter: (u > 9)
+               ->  Bitmap Heap Scan on l
+                     Recheck Cond: (nn = s_1.nn)
+                     ->  Bitmap Index Scan on l_nn
+                           Index Cond: (nn = s_1.nn)
+(11 rows)
+
+select * from s where nn not in (select l.nn from l right join s on l.nn = s.nn where s.u > 9);
+ u | n | nn | p 
+---+---+----+---
+(0 rows)
+
+explain (costs false) select * from s where nn not in (select l.nn from l right join s on l.nn = s.nn join s1 on l.n = s1.n);
+                              QUERY PLAN                               
+-----------------------------------------------------------------------
+ Nested Loop Anti Join
+   ->  Seq Scan on s
+   ->  Hash Join
+         Hash Cond: (s1.n = l.n)
+         ->  Seq Scan on s1
+         ->  Hash
+               ->  Nested Loop
+                     ->  Seq Scan on s s_1
+                     ->  Index Scan using l_nn on l
+                           Index Cond: ((nn = s.nn) AND (nn = s_1.nn))
+(10 rows)
+
+select * from s where nn not in (select l.nn from l right join s on l.nn = s.nn join s1 on l.n = s1.n);
+    u    |    n    |   nn    |   p    
+---------+---------+---------+--------
+ 1000002 | 1000002 | 1000002 | foofoo
+       3 |         |       3 | foo
+(2 rows)
+
+explain (costs false) select * from s where nn not in (select l.nn from l right join s on l.nn = s.nn right join s1 on l.n = s1.n);
+                           QUERY PLAN                            
+-----------------------------------------------------------------
+ Nested Loop Anti Join
+   Join Filter: ((s.nn = l.nn) OR (l.nn IS NULL))
+   ->  Seq Scan on s
+   ->  Materialize
+         ->  Hash Left Join
+               Hash Cond: (s1.n = l.n)
+               ->  Seq Scan on s1
+               ->  Hash
+                     ->  Nested Loop
+                           ->  Seq Scan on s s_1
+                           ->  Bitmap Heap Scan on l
+                                 Recheck Cond: (nn = s_1.nn)
+                                 ->  Bitmap Index Scan on l_nn
+                                       Index Cond: (nn = s_1.nn)
+(14 rows)
+
+select * from s where nn not in (select l.nn from l right join s on l.nn = s.nn right join s1 on l.n = s1.n);
+ u | n | nn | p 
+---+---+----+---
+(0 rows)
+
+explain (costs false) select * from s where nn not in (select l.nn from l right join s on l.nn = s.nn left join s1 on l.n = s1.n);
+                           QUERY PLAN                            
+-----------------------------------------------------------------
+ Nested Loop Anti Join
+   Join Filter: ((s.nn = l.nn) OR (l.nn IS NULL))
+   ->  Seq Scan on s
+   ->  Materialize
+         ->  Nested Loop Left Join
+               ->  Seq Scan on s s_1
+               ->  Hash Right Join
+                     Hash Cond: (s1.n = l.n)
+                     ->  Seq Scan on s1
+                     ->  Hash
+                           ->  Bitmap Heap Scan on l
+                                 Recheck Cond: (nn = s_1.nn)
+                                 ->  Bitmap Index Scan on l_nn
+                                       Index Cond: (nn = s_1.nn)
+(14 rows)
+
+select * from s where nn not in (select l.nn from l right join s on l.nn = s.nn left join s1 on l.n = s1.n);
+ u | n | nn | p 
+---+---+----+---
+(0 rows)
+
+--test reduce outer join on outer and sub-query
+explain (costs false) select count(*) from s right join l on s.nn = l.nn join s1 on s.u = s1.u where s.nn not in (select l.nn from l right join s on l.nn = s.nn join s1 on l.n = s1.n);
+                                       QUERY PLAN                                        
+-----------------------------------------------------------------------------------------
+ Aggregate
+   ->  Hash Join
+         Hash Cond: (s.u = s1.u)
+         ->  Nested Loop
+               ->  Nested Loop Anti Join
+                     ->  Seq Scan on s
+                     ->  Hash Join
+                           Hash Cond: (s1_1.n = l_1.n)
+                           ->  Seq Scan on s1 s1_1
+                           ->  Hash
+                                 ->  Nested Loop
+                                       ->  Seq Scan on s s_1
+                                       ->  Index Scan using l_nn on l l_1
+                                             Index Cond: ((nn = s.nn) AND (nn = s_1.nn))
+               ->  Bitmap Heap Scan on l
+                     Recheck Cond: (nn = s.nn)
+                     ->  Bitmap Index Scan on l_nn
+                           Index Cond: (nn = s.nn)
+         ->  Hash
+               ->  Seq Scan on s1
+(20 rows)
+
+select count(*) from s right join l on s.nn = l.nn join s1 on s.u = s1.u where s.nn not in (select l.nn from l right join s on l.nn = s.nn join s1 on l.n = s1.n);
+ count 
+-------
+     1
+(1 row)
+
+explain (costs false) select count(*) from s right join l on s.nn = l.nn left join s1 on s.u = s1.u where s.nn not in (select l.nn from l right join s on l.nn = s.nn left join s1 on l.n = s1.n);
+                                    QUERY PLAN                                     
+-----------------------------------------------------------------------------------
+ Aggregate
+   ->  Hash Left Join
+         Hash Cond: (s.u = s1.u)
+         ->  Nested Loop
+               ->  Nested Loop Anti Join
+                     Join Filter: ((s.nn = l_1.nn) OR (l_1.nn IS NULL))
+                     ->  Seq Scan on s
+                           Filter: (nn IS NOT NULL)
+                     ->  Materialize
+                           ->  Nested Loop Left Join
+                                 ->  Seq Scan on s s_1
+                                 ->  Hash Right Join
+                                       Hash Cond: (s1_1.n = l_1.n)
+                                       ->  Seq Scan on s1 s1_1
+                                       ->  Hash
+                                             ->  Bitmap Heap Scan on l l_1
+                                                   Recheck Cond: (nn = s_1.nn)
+                                                   ->  Bitmap Index Scan on l_nn
+                                                         Index Cond: (nn = s_1.nn)
+               ->  Bitmap Heap Scan on l
+                     Recheck Cond: (nn = s.nn)
+                     ->  Bitmap Index Scan on l_nn
+                           Index Cond: (nn = s.nn)
+         ->  Hash
+               ->  Seq Scan on s1
+(25 rows)
+
+select count(*) from s right join l on s.nn = l.nn left join s1 on s.u = s1.u where s.nn not in (select l.nn from l right join s on l.nn = s.nn left join s1 on l.n = s1.n);
+ count 
+-------
+     0
+(1 row)
+
+-- test union all
+explain (costs false) select * from s as t where not exists
+(select 1 from (select n as y from l union all
+				select u as y from s union all
+				select nn as y from s) as v where t.n=v.y or v.y is null) and n is not null;
+                       QUERY PLAN                       
+--------------------------------------------------------
+ Nested Loop Anti Join
+   ->  Seq Scan on s t
+         Filter: (n IS NOT NULL)
+   ->  Append
+         ->  Bitmap Heap Scan on l
+               Recheck Cond: ((t.n = n) OR (n IS NULL))
+               ->  BitmapOr
+                     ->  Bitmap Index Scan on l_n
+                           Index Cond: (n = t.n)
+                     ->  Bitmap Index Scan on l_n
+                           Index Cond: (n IS NULL)
+         ->  Seq Scan on s
+               Filter: ((t.n = u) OR (u IS NULL))
+         ->  Seq Scan on s s_1
+               Filter: ((t.n = nn) OR (nn IS NULL))
+(15 rows)
+
+select * from s as t where not exists
+(select 1 from (select n as y from l union all
+				select u as y from s union all
+				select nn as y from s) as v where t.n=v.y or v.y is null) and n is not null;
+ u | n | nn | p 
+---+---+----+---
+(0 rows)
+
+explain (costs false) select * from s where n not in
+(select n as y from l union all
+ select u as y from s union all
+ select nn as y from s);
+                       QUERY PLAN                       
+--------------------------------------------------------
+ Nested Loop Anti Join
+   ->  Seq Scan on s
+         Filter: (n IS NOT NULL)
+   ->  Append
+         ->  Bitmap Heap Scan on l
+               Recheck Cond: ((s.n = n) OR (n IS NULL))
+               ->  BitmapOr
+                     ->  Bitmap Index Scan on l_n
+                           Index Cond: (n = s.n)
+                     ->  Bitmap Index Scan on l_n
+                           Index Cond: (n IS NULL)
+         ->  Seq Scan on s s_1
+               Filter: ((s.n = u) OR (u IS NULL))
+         ->  Seq Scan on s s_2
+               Filter: ((s.n = nn) OR (nn IS NULL))
+(15 rows)
+
+select * from s where n not in
+(select n as y from l union all
+ select u as y from s union all
+ select nn as y from s);
+ u | n | nn | p 
+---+---+----+---
+(0 rows)
+
+-- test multi-levels of NOT IN
+explain (costs false) select * from s where n not in (select n from s where n not in (select n from l));
+                         QUERY PLAN                         
+------------------------------------------------------------
+ Seq Scan on s
+   Filter: (NOT (hashed SubPlan 1))
+   SubPlan 1
+     ->  Nested Loop Anti Join
+           ->  Seq Scan on s s_1
+                 Filter: (n IS NOT NULL)
+           ->  Bitmap Heap Scan on l
+                 Recheck Cond: ((s_1.n = n) OR (n IS NULL))
+                 ->  BitmapOr
+                       ->  Bitmap Index Scan on l_n
+                             Index Cond: (n = s_1.n)
+                       ->  Bitmap Index Scan on l_n
+                             Index Cond: (n IS NULL)
+(13 rows)
+
+select * from s where n not in (select n from s where n not in (select n from l));
+    u    |    n    |   nn    |   p    
+---------+---------+---------+--------
+       1 |       1 |       1 | foo
+       2 |       2 |       2 | foo
+ 1000002 | 1000002 | 1000002 | foofoo
+       3 |         |       3 | foo
+(4 rows)
+
+explain (costs false) select * from s where n not in (select n from s where n not in (select u from l));
+                  QUERY PLAN                  
+----------------------------------------------
+ Seq Scan on s
+   Filter: (NOT (hashed SubPlan 1))
+   SubPlan 1
+     ->  Nested Loop Anti Join
+           ->  Seq Scan on s s_1
+                 Filter: (n IS NOT NULL)
+           ->  Index Only Scan using l_u on l
+                 Index Cond: (u = s_1.n)
+(8 rows)
+
+select * from s where n not in (select n from s where n not in (select u from l));
+ u | n | nn |  p  
+---+---+----+-----
+ 1 | 1 |  1 | foo
+ 2 | 2 |  2 | foo
+(2 rows)
+
+--test COALESCE
+explain (costs false) select * from s where COALESCE(n, -1) not in (select COALESCE(n, -1) from l);
+                                 QUERY PLAN                                 
+----------------------------------------------------------------------------
+ Hash Anti Join
+   Hash Cond: (COALESCE(s.n, '-1'::integer) = COALESCE(l.n, '-1'::integer))
+   ->  Seq Scan on s
+   ->  Hash
+         ->  Seq Scan on l
+(5 rows)
+
+select * from s where COALESCE(n, -1) not in (select COALESCE(n, -1) from l);
+    u    |    n    |   nn    |   p    
+---------+---------+---------+--------
+ 1000002 | 1000002 | 1000002 | foofoo
+(1 row)
+
+explain (costs false) select * from s where COALESCE(n, NULL, -1) not in (select COALESCE(n, NULL, -1) from l);
+                                 QUERY PLAN                                 
+----------------------------------------------------------------------------
+ Hash Anti Join
+   Hash Cond: (COALESCE(s.n, '-1'::integer) = COALESCE(l.n, '-1'::integer))
+   ->  Seq Scan on s
+   ->  Hash
+         ->  Seq Scan on l
+(5 rows)
+
+select * from s where COALESCE(n, NULL, -1) not in (select COALESCE(n, NULL, -1) from l);
+    u    |    n    |   nn    |   p    
+---------+---------+---------+--------
+ 1000002 | 1000002 | 1000002 | foofoo
+(1 row)
+
+explain (costs false) select * from s where COALESCE(n, NULL, NULL) not in (select COALESCE(n, NULL, NULL) from l);
+                                 QUERY PLAN                                  
+-----------------------------------------------------------------------------
+ Nested Loop Anti Join
+   Join Filter: ((COALESCE(s.n) = COALESCE(l.n)) OR (COALESCE(l.n) IS NULL))
+   ->  Seq Scan on s
+         Filter: (COALESCE(n) IS NOT NULL)
+   ->  Materialize
+         ->  Seq Scan on l
+(6 rows)
+
+select * from s where COALESCE(n, NULL, NULL) not in (select COALESCE(n, NULL, NULL) from l);
+ u | n | nn | p 
+---+---+----+---
+(0 rows)
+
+explain (costs false) select * from s where COALESCE(n, nn) not in (select COALESCE(n, nn) from l);
+                        QUERY PLAN                        
+----------------------------------------------------------
+ Hash Anti Join
+   Hash Cond: (COALESCE(s.n, s.nn) = COALESCE(l.n, l.nn))
+   ->  Seq Scan on s
+   ->  Hash
+         ->  Seq Scan on l
+(5 rows)
+
+select * from s where COALESCE(n, nn) not in (select COALESCE(n, nn) from l);
+    u    |    n    |   nn    |   p    
+---------+---------+---------+--------
+ 1000002 | 1000002 | 1000002 | foofoo
+(1 row)
+
+explain (costs false) select * from s where COALESCE(nn, NULL) not in (select COALESCE(nn, NULL) from l);
+                   QUERY PLAN                   
+------------------------------------------------
+ Hash Anti Join
+   Hash Cond: (COALESCE(s.nn) = COALESCE(l.nn))
+   ->  Seq Scan on s
+   ->  Hash
+         ->  Seq Scan on l
+(5 rows)
+
+select * from s where COALESCE(nn, NULL) not in (select COALESCE(nn, NULL) from l);
+    u    |    n    |   nn    |   p    
+---------+---------+---------+--------
+ 1000002 | 1000002 | 1000002 | foofoo
+(1 row)
+
+explain (costs false) select * from s where (COALESCE(n, -1), nn, COALESCE(n, u)) not in (select COALESCE(n, -1), nn, COALESCE(n, u) from l);
+                                                       QUERY PLAN                                                        
+-------------------------------------------------------------------------------------------------------------------------
+ Nested Loop Anti Join
+   ->  Seq Scan on s
+   ->  Bitmap Heap Scan on l
+         Recheck Cond: (s.nn = nn)
+         Filter: ((COALESCE(s.n, '-1'::integer) = COALESCE(n, '-1'::integer)) AND (COALESCE(s.n, s.u) = COALESCE(n, u)))
+         ->  Bitmap Index Scan on l_nn
+               Index Cond: (nn = s.nn)
+(7 rows)
+
+select * from s where (COALESCE(n, -1), nn, COALESCE(n, u)) not in (select COALESCE(n, -1), nn, COALESCE(n, u) from l);
+    u    |    n    |   nn    |   p    
+---------+---------+---------+--------
+ 1000002 | 1000002 | 1000002 | foofoo
+       3 |         |       3 | foo
+(2 rows)
+
+-- non conversion cases
+explain (costs false) select * from s where n not in (select n from l where u > 500);
+             QUERY PLAN             
+------------------------------------
+ Seq Scan on s
+   Filter: (NOT (hashed SubPlan 1))
+   SubPlan 1
+     ->  Seq Scan on l
+           Filter: (u > 500)
+(5 rows)
+
+select * from s where n not in (select n from l where u > 500);
+    u    |    n    |   nn    |   p    
+---------+---------+---------+--------
+       1 |       1 |       1 | foo
+       2 |       2 |       2 | foo
+ 1000002 | 1000002 | 1000002 | foofoo
+(3 rows)
+
+explain (costs false) select * from s where (n,n) not in (select n,n from l);
+             QUERY PLAN             
+------------------------------------
+ Seq Scan on s
+   Filter: (NOT (hashed SubPlan 1))
+   SubPlan 1
+     ->  Seq Scan on l
+(4 rows)
+
+select * from s where (n,n) not in (select n,n from l);
+ u | n | nn | p 
+---+---+----+---
+(0 rows)
+
+explain (costs false) select * from s right join l on s.nn = l.nn where (s.n,s.u,s.nn) not in (select n,u,nn from l);
+             QUERY PLAN             
+------------------------------------
+ Hash Left Join
+   Hash Cond: (l.nn = s.nn)
+   Filter: (NOT (hashed SubPlan 1))
+   ->  Seq Scan on l
+   ->  Hash
+         ->  Seq Scan on s
+   SubPlan 1
+     ->  Seq Scan on l l_1
+(8 rows)
+
+select * from s right join l on s.nn = l.nn where (s.n,s.u,s.nn) not in (select n,u,nn from l);
+ u | n | nn | p | u | n | nn | p 
+---+---+----+---+---+---+----+---
+(0 rows)
+
+-- clean up
+drop table s;
+drop table s1;
+drop table l;
diff --git a/src/test/regress/sql/subselect.sql b/src/test/regress/sql/subselect.sql
index 8bca1f5d55..03b4c2a610 100644
--- a/src/test/regress/sql/subselect.sql
+++ b/src/test/regress/sql/subselect.sql
@@ -683,3 +683,286 @@ select * from (with x as (select 2 as y) select * from x) ss;
 explain (verbose, costs off)
 with x as (select * from subselect_tbl)
 select * from x for update;
+
+-- test NON IN to ANTI JOIN conversion
+CREATE TABLE s (u INTEGER NOT NULL, n INTEGER NULL, nn INTEGER NOT NULL, p VARCHAR(128) NULL);
+insert into s (u, n, nn, p)
+    select
+    generate_series(1,3) as u,
+	generate_series(1,3) as n,
+	generate_series(1,3) as nn,
+	'foo' as p;
+insert into s values(1000002, 1000002, 1000002, 'foofoo');
+UPDATE s set n = NULL WHERE n = 3;
+
+CREATE UNIQUE INDEX s_u ON s (u);
+CREATE INDEX s_n ON s (n);
+CREATE INDEX s_nn ON s (nn);
+
+CREATE TABLE l (u INTEGER NOT NULL, n INTEGER NULL, nn INTEGER NOT NULL, p VARCHAR(128) NULL);
+insert into l (u, n, nn, p)
+	select
+    generate_series(1,10000 ) as u,
+	generate_series(1,10000 ) as n,
+	generate_series(1,10000 ) as nn,
+	'bar' as p;
+UPDATE l set n = NULL WHERE n = 7;
+
+CREATE UNIQUE INDEX l_u ON l (u);
+CREATE INDEX l_n ON l (n);
+CREATE INDEX l_nn ON l (nn);
+
+CREATE TABLE s1 (u INTEGER NOT NULL, n INTEGER NULL, nn INTEGER NOT NULL, p VARCHAR(128) NULL);
+insert into s1 (u, n, nn, p)
+    select
+    generate_series(1,3) as u,
+	generate_series(1,3) as n,
+	generate_series(1,3) as nn,
+	'foo' as p;
+insert into s1 values(1000003, 1000003, 1000003, 'foofoo');
+insert into s1 values(1003, 1003, 1003, 'foofoo');
+UPDATE s1 set n = NULL WHERE n = 3;
+
+-- test single expression
+explain (costs false) select * from s where n not in (select n from l);
+
+select * from s where n not in (select n from l);
+
+explain (costs false) select * from s where u not in (select u from l);
+
+select * from s where u not in (select u from l);
+
+explain (costs false) select * from s where 3*n not in (select n from l);
+
+select * from s where 3*n not in (select n from l);
+
+explain (costs false) select * from s where n not in (select 3*n from l);
+
+select * from s where n not in (select 3*n from l);
+
+-- test multi expression
+explain (costs false) select * from s where (n,u) not in (select n,u from l);
+
+select * from s where (n,u) not in (select n,u from l);
+
+explain (costs false) select * from s where (u, nn) not in (select u, nn from l);
+
+select * from s where (u, nn) not in (select u, nn from l);
+
+explain (costs false) select * from s where (n,u) not in (select u,n from l);
+
+select * from s where (n,u) not in (select u,n from l);
+
+explain (costs false) select * from s where (n,u,nn) not in (select u,n,nn from l);
+
+select * from s where (n,u,nn) not in (select u,n,nn from l);
+
+explain (costs false) select * from s where (n,u,nn) not in (select u,n,nn from l where u > 1000);
+
+select * from s where (n,u,nn) not in (select u,n,nn from l where u > 1000);
+
+-- test multi-table
+explain (costs false) select count(*) from s, l where s.n not in (select n from l);
+
+select count(*) from s, l where s.n not in (select n from l);
+
+explain (costs false) select count(*) from s, l where s.nn not in (select nn from l);
+
+select count(*) from s, l where s.nn not in (select nn from l);
+
+-- test null padded results from outer join
+explain (costs false) select * from s where n not in (select s.nn from l left join s on l.nn = s.nn);
+
+select * from s where n not in (select s.nn from l left join s on l.nn = s.nn);
+
+explain (costs false) select * from s where n not in (select s.nn from s right join l on s.nn = l.nn);
+
+select * from s where n not in (select s.nn from s right join l on s.nn = l.nn);
+
+explain (costs false) select count(*) from s right join l on s.nn = l.nn where l.nn not in (select nn from s);
+
+select count(*) from s right join l on s.nn = l.nn where l.nn not in (select nn from s);
+
+explain (costs false) select count(*) from s right join l on s.nn = l.nn where s.nn not in (select nn from s);
+
+select count(*) from s right join l on s.nn = l.nn where s.nn not in (select nn from s);
+
+explain (costs false) select count(*) from s right join l on s.nn=l.nn where l.nn not in (select l.nn from l left join s on l.nn = s.nn);
+
+select count(*) from s right join l on s.nn=l.nn where l.nn not in (select l.nn from l left join s on l.nn = s.nn);
+
+explain (costs false) select count(*) from s right join l on s.nn=l.nn where s.nn not in (select s.nn from l left join s on l.nn = s.nn);
+
+select count(*) from s right join l on s.nn=l.nn where s.nn not in (select s.nn from l left join s on l.nn = s.nn);
+
+explain (costs false) select count(*) from s left join s1 on s.u=s1.u join l on s.u=l.u where s.nn not in (select nn from l);
+
+select count(*) from s left join s1 on s.u=s1.u join l on s.u=l.u where s.nn not in (select nn from l);
+
+explain (costs false) select count(*) from s left join s1 on s.u=s1.u right join l on s.u=l.u where s.nn not in (select nn from l);
+
+select count(*) from s left join s1 on s.u=s1.u right join l on s.u=l.u where s.nn not in (select nn from l);
+
+explain (costs false) select count(*) from s left join s1 on s.u=s1.u left join l on s.u=l.u where s.nn not in (select nn from l);
+
+select count(*) from s left join s1 on s.u=s1.u left join l on s.u=l.u where s.nn not in (select nn from l);
+
+explain (costs false) select count(*) from s right join s1 on s.u=s1.u join l on s.u=l.u where s.nn not in (select nn from l);
+
+select count(*) from s right join s1 on s.u=s1.u join l on s.u=l.u where s.nn not in (select nn from l);
+
+explain (costs false) select count(*) from s join s1 on s.u=s1.u right join l on s.u=l.u where s.nn not in (select nn from l);
+
+select * from s join s1 on s.u=s1.u right join l on s.u=l.u where s.nn not in (select nn from l);
+
+explain (costs false) select count(*) from s full join s1 on s.u=s1.u join l on s.u=l.u where s.nn not in (select nn from l);
+
+select count(*) from s full join s1 on s.u=s1.u join l on s.u=l.u where s.nn not in (select nn from l);
+
+explain (costs false) select count(*) from s join s1 on s.u=s1.u full join l on s.u=l.u where s.nn not in (select nn from l);
+
+select count(*) from s join s1 on s.u=s1.u full join l on s.u=l.u where s.nn not in (select nn from l);
+
+explain (costs false) select * from s where s.nn not in (select l.nn from l left join s on l.nn=s.nn left join s1 on l.nn=s1.nn);
+
+select * from s where s.nn not in (select l.nn from l left join s on l.nn=s.nn left join s1 on l.nn=s1.nn);
+
+explain (costs false) select * from s where s.nn not in (select l.nn from l left join s on l.nn=s.nn right join s1 on l.nn=s1.nn);
+
+select * from s where s.nn not in (select l.nn from l left join s on l.nn=s.nn right join s1 on l.nn=s1.nn);
+
+explain (costs false) select * from s where (n,u,nn) not in (select l.n,l.u,l.nn from l left join s on l.nn = s.nn);
+
+select * from s where (n,u,nn) not in (select l.n,l.u,l.nn from l left join s on l.nn = s.nn);
+
+explain (costs false) select * from s where (n,u,nn) not in (select l.n,l.u,l.nn from l right join s on l.nn = s.nn);
+
+select * from s where (n,u,nn) not in (select l.n,l.u,l.nn from l left join s on l.nn = s.nn);
+
+--test reduce outer joins from outer query
+explain (costs false) select count(*) from s right join l on s.nn = l.nn where s.nn not in (select nn from l);
+
+select count(*) from s right join l on s.nn = l.nn where s.nn not in (select nn from l);
+
+explain (costs false) select count(*) from s right join l on s.nn = l.nn where s.nn not in (select nn from l) and s.u>0;
+
+select count(*) from s right join l on s.nn = l.nn where s.nn not in (select nn from l) and s.u>0;
+
+explain (costs false) select count(*) from s right join l on s.nn = l.nn join s1 on s.u = s1.u where s.nn not in (select nn from l);
+
+select count(*) from s right join l on s.nn = l.nn join s1 on s.u = s1.u where s.nn not in (select nn from l);
+
+explain (costs false) select count(*) from s right join l on s.nn = l.nn right join s1 on s.u = s1.u where s.nn not in (select nn from l);
+
+select count(*) from s right join l on s.nn = l.nn right join s1 on s.u = s1.u where s.nn not in (select nn from l);
+
+explain (costs false) select count(*) from s right join l on s.nn = l.nn left join s1 on s.u = s1.u where s.nn not in (select nn from l);
+
+select count(*) from s right join l on s.nn = l.nn left join s1 on s.u = s1.u where s.nn not in (select nn from l);
+
+--test reduce outer joins from subquery
+explain (costs false) select * from s where nn not in (select l.nn from l right join s on l.nn = s.nn);
+
+select * from s where nn not in (select l.nn from l right join s on l.nn = s.nn);
+
+explain (costs false) select * from s where nn not in (select l.nn from l right join s on l.nn = s.nn where l.u > 9);
+
+select * from s where nn not in (select l.nn from l right join s on l.nn = s.nn where l.u > 9);
+
+explain (costs false) select * from s where nn not in (select l.nn from l right join s on l.nn = s.nn where s.u > 9);
+
+select * from s where nn not in (select l.nn from l right join s on l.nn = s.nn where s.u > 9);
+
+explain (costs false) select * from s where nn not in (select l.nn from l right join s on l.nn = s.nn join s1 on l.n = s1.n);
+
+select * from s where nn not in (select l.nn from l right join s on l.nn = s.nn join s1 on l.n = s1.n);
+
+explain (costs false) select * from s where nn not in (select l.nn from l right join s on l.nn = s.nn right join s1 on l.n = s1.n);
+
+select * from s where nn not in (select l.nn from l right join s on l.nn = s.nn right join s1 on l.n = s1.n);
+
+explain (costs false) select * from s where nn not in (select l.nn from l right join s on l.nn = s.nn left join s1 on l.n = s1.n);
+
+select * from s where nn not in (select l.nn from l right join s on l.nn = s.nn left join s1 on l.n = s1.n);
+
+--test reduce outer join on outer and sub-query
+explain (costs false) select count(*) from s right join l on s.nn = l.nn join s1 on s.u = s1.u where s.nn not in (select l.nn from l right join s on l.nn = s.nn join s1 on l.n = s1.n);
+
+select count(*) from s right join l on s.nn = l.nn join s1 on s.u = s1.u where s.nn not in (select l.nn from l right join s on l.nn = s.nn join s1 on l.n = s1.n);
+
+explain (costs false) select count(*) from s right join l on s.nn = l.nn left join s1 on s.u = s1.u where s.nn not in (select l.nn from l right join s on l.nn = s.nn left join s1 on l.n = s1.n);
+
+select count(*) from s right join l on s.nn = l.nn left join s1 on s.u = s1.u where s.nn not in (select l.nn from l right join s on l.nn = s.nn left join s1 on l.n = s1.n);
+
+-- test union all
+explain (costs false) select * from s as t where not exists
+(select 1 from (select n as y from l union all
+				select u as y from s union all
+				select nn as y from s) as v where t.n=v.y or v.y is null) and n is not null;
+
+select * from s as t where not exists
+(select 1 from (select n as y from l union all
+				select u as y from s union all
+				select nn as y from s) as v where t.n=v.y or v.y is null) and n is not null;
+
+explain (costs false) select * from s where n not in
+(select n as y from l union all
+ select u as y from s union all
+ select nn as y from s);
+
+select * from s where n not in
+(select n as y from l union all
+ select u as y from s union all
+ select nn as y from s);
+
+-- test multi-levels of NOT IN
+explain (costs false) select * from s where n not in (select n from s where n not in (select n from l));
+
+select * from s where n not in (select n from s where n not in (select n from l));
+
+explain (costs false) select * from s where n not in (select n from s where n not in (select u from l));
+
+select * from s where n not in (select n from s where n not in (select u from l));
+
+--test COALESCE
+explain (costs false) select * from s where COALESCE(n, -1) not in (select COALESCE(n, -1) from l);
+
+select * from s where COALESCE(n, -1) not in (select COALESCE(n, -1) from l);
+
+explain (costs false) select * from s where COALESCE(n, NULL, -1) not in (select COALESCE(n, NULL, -1) from l);
+
+select * from s where COALESCE(n, NULL, -1) not in (select COALESCE(n, NULL, -1) from l);
+
+explain (costs false) select * from s where COALESCE(n, NULL, NULL) not in (select COALESCE(n, NULL, NULL) from l);
+
+select * from s where COALESCE(n, NULL, NULL) not in (select COALESCE(n, NULL, NULL) from l);
+
+explain (costs false) select * from s where COALESCE(n, nn) not in (select COALESCE(n, nn) from l);
+
+select * from s where COALESCE(n, nn) not in (select COALESCE(n, nn) from l);
+
+explain (costs false) select * from s where COALESCE(nn, NULL) not in (select COALESCE(nn, NULL) from l);
+
+select * from s where COALESCE(nn, NULL) not in (select COALESCE(nn, NULL) from l);
+
+explain (costs false) select * from s where (COALESCE(n, -1), nn, COALESCE(n, u)) not in (select COALESCE(n, -1), nn, COALESCE(n, u) from l);
+
+select * from s where (COALESCE(n, -1), nn, COALESCE(n, u)) not in (select COALESCE(n, -1), nn, COALESCE(n, u) from l);
+
+-- non conversion cases
+explain (costs false) select * from s where n not in (select n from l where u > 500);
+
+select * from s where n not in (select n from l where u > 500);
+
+explain (costs false) select * from s where (n,n) not in (select n,n from l);
+
+select * from s where (n,n) not in (select n,n from l);
+
+explain (costs false) select * from s right join l on s.nn = l.nn where (s.n,s.u,s.nn) not in (select n,u,nn from l);
+
+select * from s right join l on s.nn = l.nn where (s.n,s.u,s.nn) not in (select n,u,nn from l);
+
+-- clean up
+drop table s;
+drop table s1;
+drop table l;
#10David Rowley
david.rowley@2ndquadrant.com
In reply to: Li, Zheng (#9)
Re: NOT IN subquery optimization

On Tue, 26 Feb 2019 at 11:51, Li, Zheng <zhelli@amazon.com> wrote:

Resend the patch with a whitespace removed so that "git apply patch" works directly.

I had a quick look at this and it seems to be broken for the empty
table case I mentioned up thread.

Quick example:

Setup:

create table t1 (a int);
create table t2 (a int not null);
insert into t1 values(NULL),(1),(2);

select * from t1 where a not in(select a from t2);

Patched:
a
---
1
2
(2 rows)

Master:
a
---

1
2
(3 rows)

This will be due to the fact you're adding an a IS NOT NULL qual to
the scan of a:

postgres=# explain select * from t1 where a not in(select a from t2);
QUERY PLAN
------------------------------------------------------------------
Hash Anti Join (cost=67.38..152.18 rows=1268 width=4)
Hash Cond: (t1.a = t2.a)
-> Seq Scan on t1 (cost=0.00..35.50 rows=2537 width=4)
Filter: (a IS NOT NULL)
-> Hash (cost=35.50..35.50 rows=2550 width=4)
-> Seq Scan on t2 (cost=0.00..35.50 rows=2550 width=4)
(6 rows)

but as I mentioned, you can't do that as t2 might be empty and there's
no way to know that during planning.

--
David Rowley http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

#11Richard Guo
riguo@pivotal.io
In reply to: David Rowley (#10)
Re: NOT IN subquery optimization

Greenplum Database does this optimization. The idea is to use a new join
type, let's call it JOIN_LASJ_NOTIN, and its semantic regarding NULL is
defined as below:

1. If there is a NULL in the outer side, and the inner side is empty, the
NULL should be part of the outputs.

2. If there is a NULL in the outer side, and the inner side is not empty,
the NULL should not be part of the outputs.

3. If there is a NULL in the inner side, no outputs should be produced.

An example plan looks like:

gpadmin=# explain (costs off) select * from t1 where a not in(select a
from t2);
QUERY PLAN
-----------------------------------
Hash Left Anti Semi (Not-In) Join
Hash Cond: (t1.a = t2.a)
-> Seq Scan on t1
-> Hash
-> Seq Scan on t2
(5 rows)

Thanks
Richard

On Tue, Feb 26, 2019 at 7:20 AM David Rowley <david.rowley@2ndquadrant.com>
wrote:

Show quoted text

On Tue, 26 Feb 2019 at 11:51, Li, Zheng <zhelli@amazon.com> wrote:

Resend the patch with a whitespace removed so that "git apply patch"

works directly.

I had a quick look at this and it seems to be broken for the empty
table case I mentioned up thread.

Quick example:

Setup:

create table t1 (a int);
create table t2 (a int not null);
insert into t1 values(NULL),(1),(2);

select * from t1 where a not in(select a from t2);

Patched:
a
---
1
2
(2 rows)

Master:
a
---

1
2
(3 rows)

This will be due to the fact you're adding an a IS NOT NULL qual to
the scan of a:

postgres=# explain select * from t1 where a not in(select a from t2);
QUERY PLAN
------------------------------------------------------------------
Hash Anti Join (cost=67.38..152.18 rows=1268 width=4)
Hash Cond: (t1.a = t2.a)
-> Seq Scan on t1 (cost=0.00..35.50 rows=2537 width=4)
Filter: (a IS NOT NULL)
-> Hash (cost=35.50..35.50 rows=2550 width=4)
-> Seq Scan on t2 (cost=0.00..35.50 rows=2550 width=4)
(6 rows)

but as I mentioned, you can't do that as t2 might be empty and there's
no way to know that during planning.

--
David Rowley
https://urldefense.proofpoint.com/v2/url?u=http-3A__www.2ndQuadrant.com_&amp;d=DwIBaQ&amp;c=lnl9vOaLMzsy2niBC8-h_K-7QJuNJEsFrzdndhuJ3Sw&amp;r=5r3cnfZPUDOHrMiXq8Mq2g&amp;m=dE1nglE17x3nD-oH_BrF0r4SLaFnQKzwwJBJGpDoaaA&amp;s=dshupMomMvkDAd92918cU21AJ1E1s7QwbrxIGSRxZA8&amp;e=
PostgreSQL Development, 24x7 Support, Training & Services

#12Jim Finnerty
jfinnert@amazon.com
In reply to: David Rowley (#10)
Re: NOT IN subquery optimization

The problem is that the special optimization that was proposed for the case
where the subquery has no WHERE clause isn't valid when the subquery returns
no rows. That additional optimization needs to be removed, and preferably
replaced with some sort of efficient run-time test.

-----
Jim Finnerty, AWS, Amazon Aurora PostgreSQL
--
Sent from: http://www.postgresql-archive.org/PostgreSQL-hackers-f1928748.html

#13David Rowley
david.rowley@2ndquadrant.com
In reply to: Jim Finnerty (#12)
Re: NOT IN subquery optimization

On Wed, 27 Feb 2019 at 03:07, Jim Finnerty <jfinnert@amazon.com> wrote:

The problem is that the special optimization that was proposed for the case
where the subquery has no WHERE clause isn't valid when the subquery returns
no rows. That additional optimization needs to be removed, and preferably
replaced with some sort of efficient run-time test.

That is one option, however, the join type that Richard mentions, to
satisfy point #3, surely only can work for Hash joins and perhaps
Merge joins that required a sort, assuming there's some way for the
sort to communicate about if it found NULLs or not. Either way, we
need to have looked at the entire inner side to ensure there are no
nulls. Probably it would be possible to somehow team that up with a
planner check to see if the inner exprs could be NULL then just
implement points #1 and #2 for other join methods.

If you're proposing to do that for this thread then I can take my
planner only patch somewhere else. I only posted my patch as I pretty
much already had what I thought you were originally talking about.
However, please be aware there are current patents around adding
execution time smarts in this area, so it's probably unlikely you'll
find a way to do this in the executor that does not infringe on those.
Probably no committer would want to touch it. I think my patch covers
a good number of use cases and as far as I understand, does not go
near any current patents.

Please let me know if I should move my patch to another thread. I
don't want to hi-jack this if you're going in another direction.

--
David Rowley http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

#14Li, Zheng
zhelli@amazon.com
In reply to: David Rowley (#13)
Re: NOT IN subquery optimization

I agree we will need some runtime smarts (such as a new anti join type as pointed out by Richard) to "ultimately" account for all the cases of NOT IN queries.

However, given that the March CommitFest is imminent and the runtime smarts patent concerns David had pointed out (which I was not aware of before), we would not move that direction at the moment.

I propose that we collaborate to build one patch from the two patches submitted in this thread for the CF. The two patches are for the same purpose and similar. However, they differ in the following ways as far as I can tell:

Nullability Test:
-David's patch uses strict predicates for nullability test.
-Our patch doesn't use strict predicates, but it accounts for COALESCE and null-padded rows from outer join. In addition, we made reduce_outer_joins() work before the transformation which makes the nullability test more accurate.

Anti Join Transformation:
-Dvaid's patch does the transformation when both inner and outer outputs are non-nullable.
-With the latest fix (for the empty table case), our patch does the transformation as long as the outer is non-nullable regardless of the inner nullability, experiments show that the results are always faster.

David, please let me know what you think. If you would like to collaborate, I'll start merging with your code on using strict predicates to make a better Nullability Test.

Thanks,
Zheng

#15Tom Lane
tgl@sss.pgh.pa.us
In reply to: Li, Zheng (#14)
Re: NOT IN subquery optimization

"Li, Zheng" <zhelli@amazon.com> writes:

However, given that the March CommitFest is imminent and the runtime smarts patent concerns David had pointed out (which I was not aware of before), we would not move that direction at the moment.

I propose that we collaborate to build one patch from the two patches submitted in this thread for the CF.

TBH, I think it's very unlikely that any patch for this will be seriously
considered for commit in v12. It would be against project policy, and
spending a lot of time reviewing the patch would be quite unfair to other
patches that have been in the queue longer. Therefore, I'd suggest that
you not bend things out of shape just to have some patch to submit before
March 1. It'd be better to work with the goal of having a coherent patch
ready for the first v13 CF, probably July-ish.

regards, tom lane

#16David Rowley
david.rowley@2ndquadrant.com
In reply to: Li, Zheng (#14)
Re: NOT IN subquery optimization

On Wed, 27 Feb 2019 at 13:05, Li, Zheng <zhelli@amazon.com> wrote:

-With the latest fix (for the empty table case), our patch does the transformation as long as the outer is non-nullable regardless of the inner nullability, experiments show that the results are always faster.

Hi Zheng,

I'm interested to know how this works without testing for inner
nullability. If any of the inner side's join exprs are NULL then no
records can match. What do you propose to work around that?

--
David Rowley http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

#17David Rowley
david.rowley@2ndquadrant.com
In reply to: Tom Lane (#15)
Re: NOT IN subquery optimization

On Wed, 27 Feb 2019 at 13:13, Tom Lane <tgl@sss.pgh.pa.us> wrote:

"Li, Zheng" <zhelli@amazon.com> writes:

However, given that the March CommitFest is imminent and the runtime smarts patent concerns David had pointed out (which I was not aware of before), we would not move that direction at the moment.

I propose that we collaborate to build one patch from the two patches submitted in this thread for the CF.

TBH, I think it's very unlikely that any patch for this will be seriously
considered for commit in v12. It would be against project policy, and
spending a lot of time reviewing the patch would be quite unfair to other
patches that have been in the queue longer. Therefore, I'd suggest that
you not bend things out of shape just to have some patch to submit before
March 1. It'd be better to work with the goal of having a coherent patch
ready for the first v13 CF, probably July-ish.

FWIW, I did add this to the March CF, but I set the target version to
13. I wasn't considering this for PG12. I see Zheng was, but I agree
with you on PG13 being the target for this.

--
David Rowley http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

#18Li, Zheng
zhelli@amazon.com
In reply to: David Rowley (#17)
Re: NOT IN subquery optimization

I'm totally fine with setting the target to PG13.

--
I'm interested to know how this works without testing for inner
nullability. If any of the inner side's join exprs are NULL then no
records can match. What do you propose to work around that?
--

We still check for inner side's nullability, when it is nullable we
append a "var is NULL" to the anti join condition. So every outer
tuple is going to evaluate to true on the join condition when there
is indeed a null entry in the inner.
Actually I think the nested loop anti join can end early in this case,
but I haven't find a way to do it properly, this may be one other reason
why we need a new join type for NOT IN.

e.g.
explain select count(*) from s where u not in (select n from l);
QUERY PLAN
------------------------------------------------------------------------------------
Aggregate (cost=2892.88..2892.89 rows=1 width=8)
-> Nested Loop Anti Join (cost=258.87..2892.88 rows=1 width=0)
-> Seq Scan on s (cost=0.00..1.11 rows=11 width=4)
-> Bitmap Heap Scan on l (cost=258.87..262.88 rows=1 width=4)
Recheck Cond: ((s.u = n) OR (n IS NULL))
-> BitmapOr (cost=258.87..258.87 rows=1 width=0)
-> Bitmap Index Scan on l_n (cost=0.00..4.43 rows=1 width=0)
Index Cond: (s.u = n)
-> Bitmap Index Scan on l_n (cost=0.00..4.43 rows=1 width=0)
Index Cond: (n IS NULL)

Zheng

#19David Rowley
david.rowley@2ndquadrant.com
In reply to: Li, Zheng (#18)
Re: NOT IN subquery optimization

On Wed, 27 Feb 2019 at 13:41, Li, Zheng <zhelli@amazon.com> wrote:

We still check for inner side's nullability, when it is nullable we
append a "var is NULL" to the anti join condition. So every outer
tuple is going to evaluate to true on the join condition when there
is indeed a null entry in the inner.

That's possible, at least providing the var is NULL is an OR
condition, but the problem there is that you force the plan into a
nested loop join. That's unfortunately not going to perform very well
when the number of rows to process is large.

--
David Rowley http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

#20Richard Guo
riguo@pivotal.io
In reply to: David Rowley (#13)
Re: NOT IN subquery optimization

On Wed, Feb 27, 2019 at 4:52 AM David Rowley <david.rowley@2ndquadrant.com>
wrote:

On Wed, 27 Feb 2019 at 03:07, Jim Finnerty <jfinnert@amazon.com> wrote:

If you're proposing to do that for this thread then I can take my
planner only patch somewhere else. I only posted my patch as I pretty
much already had what I thought you were originally talking about.
However, please be aware there are current patents around adding
execution time smarts in this area, so it's probably unlikely you'll
find a way to do this in the executor that does not infringe on those.
Probably no committer would want to touch it. I think my patch covers
a good number of use cases and as far as I understand, does not go
near any current patents.

Thanks for pointing out the patent concerns. I was not aware of that

before.
Could you please provide some clue where I can find more info about the
patents?

Thanks
Richard

#21Richard Guo
riguo@pivotal.io
In reply to: Li, Zheng (#9)
Re: NOT IN subquery optimization

On Tue, Feb 26, 2019 at 6:51 AM Li, Zheng <zhelli@amazon.com> wrote:

Resend the patch with a whitespace removed so that "git apply patch" works
directly.

Hi Zheng,

I have reviewed your patch. Good job except two issues I can find:

1. The patch would give wrong results when the inner side is empty. In this
case, the whole data from outer side should be in the outputs. But with the
patch, we will lose the NULLs from outer side.

2. Because of the new added predicate 'OR (var is NULL)', we cannot use hash
join or merge join to do the ANTI JOIN. Nested loop becomes the only
choice,
which is low-efficency.

Thanks
Richard

#22David Rowley
david.rowley@2ndquadrant.com
In reply to: Richard Guo (#21)
Re: NOT IN subquery optimization

On Fri, 1 Mar 2019 at 15:27, Richard Guo <riguo@pivotal.io> wrote:

I have reviewed your patch. Good job except two issues I can find:

1. The patch would give wrong results when the inner side is empty. In this
case, the whole data from outer side should be in the outputs. But with the
patch, we will lose the NULLs from outer side.

2. Because of the new added predicate 'OR (var is NULL)', we cannot use hash
join or merge join to do the ANTI JOIN. Nested loop becomes the only choice,
which is low-efficency.

Yeah. Both of these seem pretty fundamental, so setting the patch to
waiting on author.

--
David Rowley http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

#23Andres Freund
andres@anarazel.de
In reply to: David Rowley (#22)
Re: NOT IN subquery optimization

Hi,

On March 1, 2019 4:53:03 AM PST, David Rowley <david.rowley@2ndquadrant.com> wrote:

On Fri, 1 Mar 2019 at 15:27, Richard Guo <riguo@pivotal.io> wrote:

I have reviewed your patch. Good job except two issues I can find:

1. The patch would give wrong results when the inner side is empty.

In this

case, the whole data from outer side should be in the outputs. But

with the

patch, we will lose the NULLs from outer side.

2. Because of the new added predicate 'OR (var is NULL)', we cannot

use hash

join or merge join to do the ANTI JOIN. Nested loop becomes the only

choice,

which is low-efficency.

Yeah. Both of these seem pretty fundamental, so setting the patch to
waiting on author.

I've not checked, but could we please make sure these cases are covered in the regression tests today with a single liner? Seems people had to rediscover them a number of times now, and unless this thread results in an integrated feature soonish, it seems likely other people will again.

Andres
--
Sent from my Android device with K-9 Mail. Please excuse my brevity.

#24Tom Lane
tgl@sss.pgh.pa.us
In reply to: Andres Freund (#23)
Re: NOT IN subquery optimization

Andres Freund <andres@anarazel.de> writes:

On March 1, 2019 4:53:03 AM PST, David Rowley <david.rowley@2ndquadrant.com> wrote:

On Fri, 1 Mar 2019 at 15:27, Richard Guo <riguo@pivotal.io> wrote:

1. The patch would give wrong results when the inner side is empty.
2. Because of the new added predicate 'OR (var is NULL)', we cannot
use hash join or merge join to do the ANTI JOIN.

I've not checked, but could we please make sure these cases are covered
in the regression tests today with a single liner?

I'm not sure if the second one is actually a semantics bug or just a
misoptimization? But yeah, +1 for putting in some simple tests for
corner cases right now. Anyone want to propose a specific patch?

regards, tom lane

#25David Rowley
david.rowley@2ndquadrant.com
In reply to: Tom Lane (#24)
1 attachment(s)
Re: NOT IN subquery optimization

On Sat, 2 Mar 2019 at 05:44, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Andres Freund <andres@anarazel.de> writes:

I've not checked, but could we please make sure these cases are covered
in the regression tests today with a single liner?

I'm not sure if the second one is actually a semantics bug or just a
misoptimization? But yeah, +1 for putting in some simple tests for
corner cases right now. Anyone want to propose a specific patch?

The second is just reducing the planner's flexibility to produce a
good plan. The first is a bug. Proposed regression test attached.

--
David Rowley http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

Attachments:

not_in_empty_subquery_test.sqltext/plain; charset=US-ASCII; name=not_in_empty_subquery_test.sqlDownload
diff --git a/src/test/regress/expected/subselect.out b/src/test/regress/expected/subselect.out
index cc3f5f3737..1b09b3e3fd 100644
--- a/src/test/regress/expected/subselect.out
+++ b/src/test/regress/expected/subselect.out
@@ -830,6 +830,20 @@ explain (verbose, costs off)
                  One-Time Filter: ("*VALUES*".column1 = "*VALUES*".column1)
 (8 rows)
 
+--
+-- Check we don't filter NULL outer rows in a NOT IN where the subquery
+-- returns no rows.
+--
+create temp table notinouter (a int);
+create temp table notininner (a int not null);
+insert into notinouter values(null),(1);
+select * from notinouter where a not in(select a from notininner);
+ a 
+---
+  
+ 1
+(2 rows)
+
 --
 -- Check we behave sanely in corner case of empty SELECT list (bug #8648)
 --
diff --git a/src/test/regress/sql/subselect.sql b/src/test/regress/sql/subselect.sql
index 8bca1f5d55..48230d4671 100644
--- a/src/test/regress/sql/subselect.sql
+++ b/src/test/regress/sql/subselect.sql
@@ -466,6 +466,16 @@ explain (verbose, costs off)
   select x, x from
     (select (select random() where y=y) as x from (values(1),(2)) v(y)) ss;
 
+--
+-- Check we don't filter NULL outer rows in a NOT IN where the subquery
+-- returns no rows.
+--
+create temp table notinouter (a int);
+create temp table notininner (a int not null);
+insert into notinouter values(null),(1);
+
+select * from notinouter where a not in(select a from notininner);
+
 --
 -- Check we behave sanely in corner case of empty SELECT list (bug #8648)
 --
#26Tom Lane
tgl@sss.pgh.pa.us
In reply to: David Rowley (#25)
Re: NOT IN subquery optimization

David Rowley <david.rowley@2ndquadrant.com> writes:

On Sat, 2 Mar 2019 at 05:44, Tom Lane <tgl@sss.pgh.pa.us> wrote:

I'm not sure if the second one is actually a semantics bug or just a
misoptimization? But yeah, +1 for putting in some simple tests for
corner cases right now. Anyone want to propose a specific patch?

The second is just reducing the planner's flexibility to produce a
good plan. The first is a bug. Proposed regression test attached.

LGTM, pushed.

regards, tom lane

#27Li, Zheng
zhelli@amazon.com
In reply to: David Rowley (#25)
Re: NOT IN subquery optimization

Thanks all for the feedbacks! I'm working on a refined patch.

Although adding "or var is NULL" to the anti join condition forces the planner to choose nested loop anti join, it is always faster compared to the original plan. In order to enable the transformation from NOT IN to anti join when the inner/outer is nullable, we have to add some NULL test to the join condition.

We could make anti join t1, t2 on (t1.x = t2.y or t2.y IS NULL) eligible for hashjoin, it would require changes in allowing this special join quals for hash join as well as changes in hash join executor to handle NULL accordingly for the case.

Another option of transformation is to add "is not false" on top of the join condition.

Regards,
Zheng
On 3/1/19, 5:28 PM, "David Rowley" <david.rowley@2ndquadrant.com> wrote:

On Sat, 2 Mar 2019 at 05:44, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Andres Freund <andres@anarazel.de> writes:

I've not checked, but could we please make sure these cases are covered
in the regression tests today with a single liner?

I'm not sure if the second one is actually a semantics bug or just a
misoptimization? But yeah, +1 for putting in some simple tests for
corner cases right now. Anyone want to propose a specific patch?

The second is just reducing the planner's flexibility to produce a
good plan. The first is a bug. Proposed regression test attached.

--
David Rowley http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

#28Tom Lane
tgl@sss.pgh.pa.us
In reply to: Li, Zheng (#27)
Re: NOT IN subquery optimization

"Li, Zheng" <zhelli@amazon.com> writes:

Although adding "or var is NULL" to the anti join condition forces the planner to choose nested loop anti join, it is always faster compared to the original plan.

TBH, I am *really* skeptical of sweeping claims like that. The existing
code will typically produce a hashed-subplan plan, which ought not be
that awful as long as the subquery result doesn't blow out memory.
It certainly is going to beat a naive nested loop.

regards, tom lane

#29David Rowley
david.rowley@2ndquadrant.com
In reply to: Tom Lane (#28)
Re: NOT IN subquery optimization

On Sat, 2 Mar 2019 at 12:13, Tom Lane <tgl@sss.pgh.pa.us> wrote:

"Li, Zheng" <zhelli@amazon.com> writes:

Although adding "or var is NULL" to the anti join condition forces the planner to choose nested loop anti join, it is always faster compared to the original plan.

TBH, I am *really* skeptical of sweeping claims like that. The existing
code will typically produce a hashed-subplan plan, which ought not be
that awful as long as the subquery result doesn't blow out memory.
It certainly is going to beat a naive nested loop.

It's pretty easy to show the claim is false using master and NOT EXISTS.

create table small(a int not null);
create table big (a int not null);
insert into small select generate_Series(1,1000);
insert into big select x%1000+1 from generate_Series(1,1000000) x;

select count(*) from big b where not exists(select 1 from small s
where s.a = b.a);
Time: 178.575 ms

select count(*) from big b where not exists(select 1 from small s
where s.a = b.a or s.a is null);
Time: 38049.969 ms (00:38.050)

--
David Rowley http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

#30Li, Zheng
zhelli@amazon.com
In reply to: David Rowley (#29)
Re: NOT IN subquery optimization

The current transformation would not add "or s.a is NULL" in the example provided since it is non-nullable. You will be comparing these two cases in terms of the transformation:

select count(*) from big b where not exists(select 1 from small s
where s.a = b.a);
Time: 51.416 ms

select count(*) from big b where a not in (select a from s);
Time: 890.088 ms

But if s.a is nullable, yes, you have proved my previous statement is false... I should have used almost always.
However, if s.a is nullable, we would do this transformation:
select count(*) from big b where not exists(select 1 from small s
where s.a = b.a or s.a is null);

It's possible to stop the nested loop join early during execution once we find an inner Null entry because every outer tuple is
going to evaluate to true on the join condition.

Zheng

On 3/1/19, 6:17 PM, "David Rowley" <david.rowley@2ndquadrant.com> wrote:

On Sat, 2 Mar 2019 at 12:13, Tom Lane <tgl@sss.pgh.pa.us> wrote:

"Li, Zheng" <zhelli@amazon.com> writes:

Although adding "or var is NULL" to the anti join condition forces the planner to choose nested loop anti join, it is always faster compared to the original plan.

TBH, I am *really* skeptical of sweeping claims like that. The existing
code will typically produce a hashed-subplan plan, which ought not be
that awful as long as the subquery result doesn't blow out memory.
It certainly is going to beat a naive nested loop.

It's pretty easy to show the claim is false using master and NOT EXISTS.

create table small(a int not null);
create table big (a int not null);
insert into small select generate_Series(1,1000);
insert into big select x%1000+1 from generate_Series(1,1000000) x;

select count(*) from big b where not exists(select 1 from small s
where s.a = b.a);
Time: 178.575 ms

select count(*) from big b where not exists(select 1 from small s
where s.a = b.a or s.a is null);
Time: 38049.969 ms (00:38.050)

--
David Rowley http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

#31David Rowley
david.rowley@2ndquadrant.com
In reply to: Li, Zheng (#30)
Re: NOT IN subquery optimization

On Sat, 2 Mar 2019 at 12:39, Li, Zheng <zhelli@amazon.com> wrote:

However, if s.a is nullable, we would do this transformation:
select count(*) from big b where not exists(select 1 from small s
where s.a = b.a or s.a is null);

I understand you're keen to make this work, but you're assuming again
that forcing the planner into a nested loop plan is going to be a win
over the current behaviour. It may well be in some cases, but it's
very simple to show cases where it's a significant regression.

Using the same tables from earlier, and again with master:

alter table small alter column a drop not null;
select * from big where a not in(select a from small);
Time: 430.283 ms

Here's what you're proposing:

select * from big b where not exists(select 1 from small s where s.a =
b.a or s.a is null);
Time: 37419.646 ms (00:37.420)

about 80 times slower. Making "small" a little less small would likely
see that gap grow even further.

I think you're fighting a losing battle here with adding OR quals to
the join condition. This transformation happens so early in planning
that you really can't cost it out either. I think the only way that
could be made to work satisfactorily would be with some execution
level support for it. Short of that, you're left with just adding
checks that either side of the join cannot produce NULL values...
That's what I've proposed in [1]/messages/by-id/CAKJS1f_OA5VeZx8A8H8mkj3uqEgOtmHBGCUA6+xqgmUJ6JQURw@mail.gmail.com.

[1]: /messages/by-id/CAKJS1f_OA5VeZx8A8H8mkj3uqEgOtmHBGCUA6+xqgmUJ6JQURw@mail.gmail.com

--
David Rowley http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

#32Tom Lane
tgl@sss.pgh.pa.us
In reply to: David Rowley (#31)
Re: NOT IN subquery optimization

David Rowley <david.rowley@2ndquadrant.com> writes:

I think you're fighting a losing battle here with adding OR quals to
the join condition.

Yeah --- that has a nontrivial risk of making things significantly worse,
which makes it a hard sell. I think the most reasonable bet here is
simply to not perform the transformation if we can't prove the inner side
NOT NULL. That's going to catch most of the useful cases anyway IMO.

regards, tom lane

#33Jim Finnerty
jfinnert@amazon.com
In reply to: Tom Lane (#32)
Re: NOT IN subquery optimization

Folks - I was away on vacation for the month of February, and can give this
my attention again.

I agree with Tom's comment above - when the cost of the NOT IN is dominated
by the cost of the outer scan (i.e. when the cardinality of the outer
relation is large relative to the cardinality of the subquery), and if the
inner cardinality is small enough to fit in memory, then the current
implementation does an efficient hash lookup into an in-memory structure,
and that's a very fast way to do the NOT IN. It almost achieves the
lower-bound cost of scanning the outer relation. It can also parallelizes
easily, whether or not we currently can do that. In these cases, the
current plan is the preferred plan, and we should keep it.

preferred in-memory hash lookup plan: https://explain.depesz.com/s/I1kN

This is a case that we would want to avoid the transform, because when both
the inner and outer are nullable and the outer is large and the inner is
small, the transformed plan would Scan and Materialize the inner for each
row of the outer row, which is very slow compared to the untransformed plan:

slow case for the transformation: https://explain.depesz.com/s/0CBB

However, if the inner is too large to fit into memory, then the transformed
plan is faster on all of our other test cases, although our test cases are
far from complete. If the current solution supports parallel scan of the
outer, for example, then PQ could have lower elapsed time than the non-PQ
nested loop solution.

Also, remember that the issue with the empty inner was just a bug that was
the result of trying to do an additional optimization in the case where
there is no WHERE clause in the subquery. That bug has been fixed. The
general case transformation described in the base note produces the correct
result in all cases, including the empty subquery case.

-----
Jim Finnerty, AWS, Amazon Aurora PostgreSQL
--
Sent from: http://www.postgresql-archive.org/PostgreSQL-hackers-f1928748.html

#34David Rowley
david.rowley@2ndquadrant.com
In reply to: Tom Lane (#32)
Re: NOT IN subquery optimization

On Sat, 2 Mar 2019 at 13:45, Tom Lane <tgl@sss.pgh.pa.us> wrote:

David Rowley <david.rowley@2ndquadrant.com> writes:

I think you're fighting a losing battle here with adding OR quals to
the join condition.

Yeah --- that has a nontrivial risk of making things significantly worse,
which makes it a hard sell. I think the most reasonable bet here is
simply to not perform the transformation if we can't prove the inner side
NOT NULL. That's going to catch most of the useful cases anyway IMO.

Did you mean outer side NOT NULL? The OR col IS NULL was trying to
solve the outer side nullability problem when the inner side is empty.
Of course, the inner side needs to not produce NULLs either, but
that's due to the fact that if a NULL exists in the inner side then
the anti-join should not produce any records.

--
David Rowley http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

#35Tom Lane
tgl@sss.pgh.pa.us
In reply to: David Rowley (#34)
Re: NOT IN subquery optimization

David Rowley <david.rowley@2ndquadrant.com> writes:

On Sat, 2 Mar 2019 at 13:45, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Yeah --- that has a nontrivial risk of making things significantly worse,
which makes it a hard sell. I think the most reasonable bet here is
simply to not perform the transformation if we can't prove the inner side
NOT NULL. That's going to catch most of the useful cases anyway IMO.

Did you mean outer side NOT NULL?

Sorry, sloppy thinking.

Of course, the inner side needs to not produce NULLs either, but
that's due to the fact that if a NULL exists in the inner side then
the anti-join should not produce any records.

Right. So the path of least resistance is to transform to antijoin
only if we can prove both of those things (and maybe we need to check
that the join operator is strict, too? -ENOCAFFEINE). The question
before us is what is the cost-benefit ratio of trying to cope with
additional cases. I'm skeptical that it's attractive: the cost
certainly seems high, and I don't know that there are very many
real-world cases where we'd get a win.

Hmm ... thinking about the strictness angle some more: what we really
need to optimize NOT IN, IIUC, is an assumption that the join operator
will never return NULL. While not having NULL inputs is certainly a
*necessary* condition for that (assuming a strict operator) it's not a
*sufficient* condition. Any Postgres function/operator is capable
of returning NULL whenever it feels like it. So checking strictness
does not lead to a mathematically correct optimization.

My initial thought about plugging that admittedly-academic point is
to insist that the join operator be both strict and a member of a
btree opclass (hash might be OK too; less sure about other index types).
The system already contains assumptions that btree comparators never
return NULL. I doubt that this costs us any real-world applicability,
because if the join operator can neither merge nor hash, we're screwed
anyway for finding a join plan that's better than nested-loop.

regards, tom lane

#36David Rowley
david.rowley@2ndquadrant.com
In reply to: Tom Lane (#35)
1 attachment(s)
Re: NOT IN subquery optimization

On Sun, 3 Mar 2019 at 05:25, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Hmm ... thinking about the strictness angle some more: what we really
need to optimize NOT IN, IIUC, is an assumption that the join operator
will never return NULL. While not having NULL inputs is certainly a
*necessary* condition for that (assuming a strict operator) it's not a
*sufficient* condition. Any Postgres function/operator is capable
of returning NULL whenever it feels like it. So checking strictness
does not lead to a mathematically correct optimization.

That's something I didn't think of.

My initial thought about plugging that admittedly-academic point is
to insist that the join operator be both strict and a member of a
btree opclass (hash might be OK too; less sure about other index types).
The system already contains assumptions that btree comparators never
return NULL. I doubt that this costs us any real-world applicability,
because if the join operator can neither merge nor hash, we're screwed
anyway for finding a join plan that's better than nested-loop.

Why strict? If both inputs are non-NULL, then what additional
guarantees does strict give us?

I implemented a btree opfamily check in my version of the patch. Not
so sure about hash, can you point me in the direction of a mention of
how this is guarantees for btree?

The attached v1.2 does this adds a regression test using the LINE
type. This has an operator named '=', but no btree opfamily. A few
other types are in this boat too, per:

select typname from pg_type t where not exists(select 1 from pg_amop
where amoplefttype = t.oid and amopmethod=403) and exists (select 1
from pg_operator where oprleft = t.oid and oprname = '=');

The list of builtin types that have a hash opfamily but no btree
opfamily that support NOT IN are not very exciting, so doing the same
for hash might not be worth the extra code.

select typname from pg_type t where exists(select 1 from pg_amop where
amoplefttype = t.oid and amopmethod=405) and exists (select 1 from
pg_operator where oprleft = t.oid and oprname = '=') and not
exists(select 1 from pg_amop where amoplefttype = t.oid and
amopmethod=403);
typname
---------
xid
cid
aclitem
(3 rows)

--
David Rowley http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

Attachments:

not_in_anti_join_v1.2.patchapplication/octet-stream; name=not_in_anti_join_v1.2.patchDownload
diff --git a/src/backend/optimizer/plan/subselect.c b/src/backend/optimizer/plan/subselect.c
index 555c91f61e..d4309d4762 100644
--- a/src/backend/optimizer/plan/subselect.c
+++ b/src/backend/optimizer/plan/subselect.c
@@ -87,6 +87,9 @@ static bool contain_dml(Node *node);
 static bool contain_dml_walker(Node *node, void *context);
 static void inline_cte(PlannerInfo *root, CommonTableExpr *cte);
 static bool inline_cte_walker(Node *node, inline_cte_walker_context *context);
+static bool is_NOTANY_compatible_with_antijoin(Query *outerquery,
+											   SubLink * sublink,
+											   Node *notnull_proofs);
 static bool simplify_EXISTS_query(PlannerInfo *root, Query *query);
 static Query *convert_EXISTS_to_ANY(PlannerInfo *root, Query *subselect,
 					  Node **testexpr, List **paramIds);
@@ -1108,6 +1111,98 @@ inline_cte_walker(Node *node, inline_cte_walker_context *context)
 }
 
 
+/*
+ * is_NOTANY_compatible_with_antijoin
+ *		True if the NOT IN sublink can be safely converted into an ANTI JOIN.
+ *		Per SQL spec, NOT IN is not ordinarily equivalent to an anti-join,
+ *		however, if we can prove that all of the expressions on both sides of
+ *		the, would be, join condition are all certainly not NULL and none of
+ *		the join expressions can produce NULLs, then it's safe to convert NOT
+ *		IN to an anti-join.
+ *
+ * To ensure that the join expressions cannot be NULL, we can't just check for
+ * strict operators since these only guarantee NULL on NULL input, we need a
+ * guarantee of NOT NULL on NOT NULL input.  Fortunately we can just insist
+ * that the operator is a member of a btree opclass.
+ *
+ * 'outerquery' is the parse of the query that the NOT IN is present in.
+ * 'notnull_proofs' are quals from the same syntactical level as the NOT IN.
+ * These will be used to assist in proving the outer query's would be join
+ * expressions cannot be NULL.
+ *
+ * Note:
+ *	This function is quite locked into the NOT IN syntax. Certain assumptions
+ *	are made about the structure of the join conditions:
+ *
+ * 1.	We assume that when more than 1 join condition exists that these are
+ *		AND type conditions, i.e not OR conditions.
+ *
+ * 2.	We assume that each join qual is an OpExpr with 2 arguments and the
+ *		first arguments in each qual is the one that belongs to the outer side
+ *		of the NOT IN clause.
+ */
+static bool
+is_NOTANY_compatible_with_antijoin(Query *outerquery, SubLink *sublink,
+								   Node *notnull_proofs)
+{
+	Node	   *testexpr = sublink->testexpr;
+	List	   *outerexpr;
+	ListCell   *lc;
+
+	/* Extract the outer rel's join condition expressions. */
+
+	/* if it's a single expression */
+	if (IsA(testexpr, OpExpr))
+	{
+		OpExpr *opexpr = (OpExpr *) testexpr;
+
+		Assert(list_length(opexpr->args) == 2);
+
+		/* Reject if op is not part of a btree opfamily */
+		if (!has_mergejoin_opfamilies(opexpr->opno))
+			return false;
+
+		outerexpr = list_make1(linitial(opexpr->args));
+	}
+
+	/* Extract exprs from multiple expressions ANDed together */
+	else if (IsA(testexpr, BoolExpr))
+	{
+		List	 *list = ((BoolExpr *) testexpr)->args;
+
+		outerexpr = NIL;
+
+		/* Build a list containing the lefthand expr from each OpExpr */
+		foreach(lc, list)
+		{
+			OpExpr *opexpr = (OpExpr *) lfirst(lc);
+
+			Assert(list_length(opexpr->args) == 2);
+
+			/* Reject if op is not part of a btree opfamily */
+			if (!has_mergejoin_opfamilies(opexpr->opno))
+				return false;
+
+			outerexpr = lappend(outerexpr, linitial(opexpr->args));
+		}
+	}
+	else
+		elog(ERROR, "unrecognized node type: %d",
+					(int) nodeTag(testexpr));
+
+	Assert(outerexpr != NIL);
+
+	/* Check if any outer expressions can be NULL. */
+	if (!expressions_are_not_nullable(outerquery, outerexpr, notnull_proofs))
+		return false;
+
+	/* Now validate the subquery targetlist to ensure no NULL are possible. */
+	if (!query_outputs_are_not_nullable((Query *) sublink->subselect))
+		return false;
+
+	return true; /* supports ANTI JOIN */
+}
+
 /*
  * convert_ANY_sublink_to_join: try to convert an ANY SubLink to a join
  *
@@ -1117,11 +1212,22 @@ inline_cte_walker(Node *node, inline_cte_walker_context *context)
  * If so, form a JoinExpr and return it.  Return NULL if the SubLink cannot
  * be converted to a join.
  *
- * The only non-obvious input parameter is available_rels: this is the set
- * of query rels that can safely be referenced in the sublink expression.
- * (We must restrict this to avoid changing the semantics when a sublink
- * is present in an outer join's ON qual.)  The conversion must fail if
- * the converted qual would reference any but these parent-query relids.
+ * If under_not is true, the caller actually found NOT (ANY SubLink),
+ * so that what we must try to build is an ANTI not SEMI join.  Per SQL spec,
+ * NOT IN is not ordinarily equivalent to an anti-join, so that by default we
+ * have to fail when under_not.  However, if we can prove that all of the
+ * expressions on both sides of the, would be, join condition are all
+ * certainly not NULL and none of the join expressions can produce NULLs, then
+ * it's safe to convert NOT IN to an anti-join.  To assist with nullability
+ * proofs for the join conditions on the outside of the join, we make use of
+ *'notnull_proofs'. It is the caller's responsibility to ensure these proofs
+ * come from the same syntactical level as the NOT IN sublink.
+ *
+ * available_rels is the set of query rels that can safely be referenced
+ * in the sublink expression.  (We must restrict this to avoid changing the
+ * semantics when a sublink is present in an outer join's ON qual.)
+ * The conversion must fail if the converted qual would reference any but
+ * these parent-query relids.
  *
  * On success, the returned JoinExpr has larg = NULL and rarg = the jointree
  * item representing the pulled-up subquery.  The caller must set larg to
@@ -1144,7 +1250,8 @@ inline_cte_walker(Node *node, inline_cte_walker_context *context)
  */
 JoinExpr *
 convert_ANY_sublink_to_join(PlannerInfo *root, SubLink *sublink,
-							Relids available_rels)
+							bool under_not, Relids available_rels,
+							Node *notnull_proofs)
 {
 	JoinExpr   *result;
 	Query	   *parse = root->parse;
@@ -1159,6 +1266,11 @@ convert_ANY_sublink_to_join(PlannerInfo *root, SubLink *sublink,
 
 	Assert(sublink->subLinkType == ANY_SUBLINK);
 
+	/* Check if NOT IN can be converted to an anti-join. */
+	if (under_not &&
+		!is_NOTANY_compatible_with_antijoin(parse, sublink, notnull_proofs))
+		return NULL;
+
 	/*
 	 * The sub-select must not refer to any Vars of the parent query. (Vars of
 	 * higher levels should be okay, though.)
@@ -1228,7 +1340,7 @@ convert_ANY_sublink_to_join(PlannerInfo *root, SubLink *sublink,
 	 * And finally, build the JoinExpr node.
 	 */
 	result = makeNode(JoinExpr);
-	result->jointype = JOIN_SEMI;
+	result->jointype = under_not ? JOIN_ANTI : JOIN_SEMI;
 	result->isNatural = false;
 	result->larg = NULL;		/* caller must fill this in */
 	result->rarg = (Node *) rtr;
@@ -1243,9 +1355,7 @@ convert_ANY_sublink_to_join(PlannerInfo *root, SubLink *sublink,
 /*
  * convert_EXISTS_sublink_to_join: try to convert an EXISTS SubLink to a join
  *
- * The API of this function is identical to convert_ANY_sublink_to_join's,
- * except that we also support the case where the caller has found NOT EXISTS,
- * so we need an additional input parameter "under_not".
+ * The API of this function is identical to convert_ANY_sublink_to_join's.
  */
 JoinExpr *
 convert_EXISTS_sublink_to_join(PlannerInfo *root, SubLink *sublink,
diff --git a/src/backend/optimizer/prep/prepjointree.c b/src/backend/optimizer/prep/prepjointree.c
index aebe162713..27a7fafcf4 100644
--- a/src/backend/optimizer/prep/prepjointree.c
+++ b/src/backend/optimizer/prep/prepjointree.c
@@ -64,7 +64,8 @@ static Node *pull_up_sublinks_jointree_recurse(PlannerInfo *root, Node *jtnode,
 								  Relids *relids);
 static Node *pull_up_sublinks_qual_recurse(PlannerInfo *root, Node *node,
 							  Node **jtlink1, Relids available_rels1,
-							  Node **jtlink2, Relids available_rels2);
+							  Node **jtlink2, Relids available_rels2,
+							  Node *notnull_proofs);
 static Node *pull_up_subqueries_recurse(PlannerInfo *root, Node *jtnode,
 						   JoinExpr *lowest_outer_join,
 						   JoinExpr *lowest_nulling_outer_join,
@@ -261,7 +262,7 @@ pull_up_sublinks_jointree_recurse(PlannerInfo *root, Node *jtnode,
 		/* Now process qual --- all children are available for use */
 		newf->quals = pull_up_sublinks_qual_recurse(root, f->quals,
 													&jtlink, frelids,
-													NULL, NULL);
+													NULL, NULL, f->quals);
 
 		/*
 		 * Note that the result will be either newf, or a stack of JoinExprs
@@ -315,13 +316,13 @@ pull_up_sublinks_jointree_recurse(PlannerInfo *root, Node *jtnode,
 														 &jtlink,
 														 bms_union(leftrelids,
 																   rightrelids),
-														 NULL, NULL);
+														 NULL, NULL, j->quals);
 				break;
 			case JOIN_LEFT:
 				j->quals = pull_up_sublinks_qual_recurse(root, j->quals,
 														 &j->rarg,
 														 rightrelids,
-														 NULL, NULL);
+														 NULL, NULL, j->quals);
 				break;
 			case JOIN_FULL:
 				/* can't do anything with full-join quals */
@@ -330,7 +331,7 @@ pull_up_sublinks_jointree_recurse(PlannerInfo *root, Node *jtnode,
 				j->quals = pull_up_sublinks_qual_recurse(root, j->quals,
 														 &j->larg,
 														 leftrelids,
-														 NULL, NULL);
+														 NULL, NULL, j->quals);
 				break;
 			default:
 				elog(ERROR, "unrecognized join type: %d",
@@ -370,12 +371,17 @@ pull_up_sublinks_jointree_recurse(PlannerInfo *root, Node *jtnode,
  * and/or jtlink2 in the order we encounter them.  We rely on subsequent
  * optimization to rearrange the stack if appropriate.
  *
+ * notnull_proofs, if not passed as NULL, is used to help prove that the
+ * would be outer rel's join condition exprs cannot be NULL when we attempt
+ * to convert NOT IN to use an antijoin.
+ *
  * Returns the replacement qual node, or NULL if the qual should be removed.
  */
 static Node *
 pull_up_sublinks_qual_recurse(PlannerInfo *root, Node *node,
 							  Node **jtlink1, Relids available_rels1,
-							  Node **jtlink2, Relids available_rels2)
+							  Node **jtlink2, Relids available_rels2,
+							  Node *notnull_proofs)
 {
 	if (node == NULL)
 		return NULL;
@@ -388,8 +394,8 @@ pull_up_sublinks_qual_recurse(PlannerInfo *root, Node *node,
 		/* Is it a convertible ANY or EXISTS clause? */
 		if (sublink->subLinkType == ANY_SUBLINK)
 		{
-			if ((j = convert_ANY_sublink_to_join(root, sublink,
-												 available_rels1)) != NULL)
+			if ((j = convert_ANY_sublink_to_join(root, sublink, false,
+												 available_rels1, NULL)) != NULL)
 			{
 				/* Yes; insert the new join node into the join tree */
 				j->larg = *jtlink1;
@@ -409,13 +415,14 @@ pull_up_sublinks_qual_recurse(PlannerInfo *root, Node *node,
 														 &j->larg,
 														 available_rels1,
 														 &j->rarg,
-														 child_rels);
+														 child_rels,
+														 notnull_proofs);
 				/* Return NULL representing constant TRUE */
 				return NULL;
 			}
 			if (available_rels2 != NULL &&
-				(j = convert_ANY_sublink_to_join(root, sublink,
-												 available_rels2)) != NULL)
+				(j = convert_ANY_sublink_to_join(root, sublink, false,
+												 available_rels2, NULL)) != NULL)
 			{
 				/* Yes; insert the new join node into the join tree */
 				j->larg = *jtlink2;
@@ -435,7 +442,8 @@ pull_up_sublinks_qual_recurse(PlannerInfo *root, Node *node,
 														 &j->larg,
 														 available_rels2,
 														 &j->rarg,
-														 child_rels);
+														 child_rels,
+														 notnull_proofs);
 				/* Return NULL representing constant TRUE */
 				return NULL;
 			}
@@ -463,7 +471,8 @@ pull_up_sublinks_qual_recurse(PlannerInfo *root, Node *node,
 														 &j->larg,
 														 available_rels1,
 														 &j->rarg,
-														 child_rels);
+														 child_rels,
+														 notnull_proofs);
 				/* Return NULL representing constant TRUE */
 				return NULL;
 			}
@@ -489,7 +498,8 @@ pull_up_sublinks_qual_recurse(PlannerInfo *root, Node *node,
 														 &j->larg,
 														 available_rels2,
 														 &j->rarg,
-														 child_rels);
+														 child_rels,
+														 notnull_proofs);
 				/* Return NULL representing constant TRUE */
 				return NULL;
 			}
@@ -499,14 +509,70 @@ pull_up_sublinks_qual_recurse(PlannerInfo *root, Node *node,
 	}
 	if (is_notclause(node))
 	{
-		/* If the immediate argument of NOT is EXISTS, try to convert */
+		/* If the immediate argument of NOT is ANY or EXISTS, try to convert */
 		SubLink    *sublink = (SubLink *) get_notclausearg((Expr *) node);
 		JoinExpr   *j;
 		Relids		child_rels;
 
 		if (sublink && IsA(sublink, SubLink))
 		{
-			if (sublink->subLinkType == EXISTS_SUBLINK)
+			if (sublink->subLinkType == ANY_SUBLINK)
+			{
+				if ((j = convert_ANY_sublink_to_join(root, sublink, true,
+												   available_rels1, notnull_proofs)) != NULL)
+				{
+					/* Yes; insert the new join node into the join tree */
+					j->larg = *jtlink1;
+					*jtlink1 = (Node *) j;
+					/* Recursively process pulled-up jointree nodes */
+					j->rarg = pull_up_sublinks_jointree_recurse(root,
+																j->rarg,
+																&child_rels);
+
+					/*
+					 * Now recursively process the pulled-up quals.  Because
+					 * we are underneath a NOT, we can't pull up sublinks that
+					 * reference the left-hand stuff, but it's still okay to
+					 * pull up sublinks referencing j->rarg.
+					 */
+					j->quals = pull_up_sublinks_qual_recurse(root,
+															 j->quals,
+															 &j->rarg,
+															 child_rels,
+															 NULL, NULL,
+															 notnull_proofs);
+					/* Return NULL representing constant TRUE */
+					return NULL;
+				}
+				if (available_rels2 != NULL &&
+					(j = convert_ANY_sublink_to_join(root, sublink, true,
+												   available_rels2, notnull_proofs)) != NULL)
+				{
+					/* Yes; insert the new join node into the join tree */
+					j->larg = *jtlink2;
+					*jtlink2 = (Node *) j;
+					/* Recursively process pulled-up jointree nodes */
+					j->rarg = pull_up_sublinks_jointree_recurse(root,
+																j->rarg,
+																&child_rels);
+
+					/*
+					 * Now recursively process the pulled-up quals.  Because
+					 * we are underneath a NOT, we can't pull up sublinks that
+					 * reference the left-hand stuff, but it's still okay to
+					 * pull up sublinks referencing j->rarg.
+					 */
+					j->quals = pull_up_sublinks_qual_recurse(root,
+															 j->quals,
+															 &j->rarg,
+															 child_rels,
+															 NULL, NULL,
+															 notnull_proofs);
+					/* Return NULL representing constant TRUE */
+					return NULL;
+				}
+			}
+			else if (sublink->subLinkType == EXISTS_SUBLINK)
 			{
 				if ((j = convert_EXISTS_sublink_to_join(root, sublink, true,
 														available_rels1)) != NULL)
@@ -529,7 +595,8 @@ pull_up_sublinks_qual_recurse(PlannerInfo *root, Node *node,
 															 j->quals,
 															 &j->rarg,
 															 child_rels,
-															 NULL, NULL);
+															 NULL, NULL,
+															 notnull_proofs);
 					/* Return NULL representing constant TRUE */
 					return NULL;
 				}
@@ -555,7 +622,8 @@ pull_up_sublinks_qual_recurse(PlannerInfo *root, Node *node,
 															 j->quals,
 															 &j->rarg,
 															 child_rels,
-															 NULL, NULL);
+															 NULL, NULL,
+															 notnull_proofs);
 					/* Return NULL representing constant TRUE */
 					return NULL;
 				}
@@ -580,7 +648,8 @@ pull_up_sublinks_qual_recurse(PlannerInfo *root, Node *node,
 													  jtlink1,
 													  available_rels1,
 													  jtlink2,
-													  available_rels2);
+													  available_rels2,
+													  notnull_proofs);
 			if (newclause)
 				newclauses = lappend(newclauses, newclause);
 		}
diff --git a/src/backend/optimizer/util/clauses.c b/src/backend/optimizer/util/clauses.c
index 501b0e9e2d..b79a99c885 100644
--- a/src/backend/optimizer/util/clauses.c
+++ b/src/backend/optimizer/util/clauses.c
@@ -42,6 +42,7 @@
 #include "parser/parse_agg.h"
 #include "parser/parse_coerce.h"
 #include "parser/parse_func.h"
+#include "parser/parsetree.h"
 #include "rewrite/rewriteManip.h"
 #include "tcop/tcopprot.h"
 #include "utils/acl.h"
@@ -113,6 +114,8 @@ static bool contain_context_dependent_node_walker(Node *node, int *flags);
 static bool contain_leaked_vars_walker(Node *node, void *context);
 static Relids find_nonnullable_rels_walker(Node *node, bool top_level);
 static List *find_nonnullable_vars_walker(Node *node, bool top_level);
+static void find_innerjoined_rels(Node *jtnode,
+					  Relids *innerjoined_rels, List **usable_quals);
 static bool is_strict_saop(ScalarArrayOpExpr *expr, bool falseOK);
 static Node *eval_const_expressions_mutator(Node *node,
 							   eval_const_expressions_context *context);
@@ -1467,6 +1470,10 @@ contain_leaked_vars_walker(Node *node, void *context)
 								  context);
 }
 
+/*****************************************************************************
+ *		  Nullability analysis
+ *****************************************************************************/
+
 /*
  * find_nonnullable_rels
  *		Determine which base rels are forced nonnullable by given clause.
@@ -1710,7 +1717,7 @@ find_nonnullable_rels_walker(Node *node, bool top_level)
  * but here we assume that the input is a Boolean expression, and wish to
  * see if NULL inputs will provably cause a FALSE-or-NULL result.  We expect
  * the expression to have been AND/OR flattened and converted to implicit-AND
- * format.
+ * format (but the results are still good if it wasn't AND/OR flattened).
  *
  * The result is a palloc'd List, but we have not copied the member Var nodes.
  * Also, we don't bother trying to eliminate duplicate entries.
@@ -2011,6 +2018,393 @@ find_forced_null_var(Node *node)
 	return NULL;
 }
 
+/*
+ * expressions_are_not_nullable
+ *		Return TRUE if all 'exprs' are certainly not NULL.
+ *
+ * The reason this takes a Query, and not just a list of expressions, is so
+ * that we can determine which relations are INNER JOINed and make use of the
+ * query's WHERE/ON clauses to help prove that inner joined rel's Vars cannot
+ * be NULL in the absense of a NOT NULL constraint.
+ *
+ * 'query' is the query that the 'exprs' belong to.
+ * 'notnull_proofs' can be passed to assist in proving the non-nullability of
+ * 'exprs'.  These may be used for outer joined Vars or for Vars that allow
+ * NULLs to help determine if each 'exprs' cannot be NULL.  It is the callers
+ * responsibility to ensure 'notnull_proofs' come from the same syntactical
+ * level as 'exprs'.
+ *
+ * In current usage, the passed exprs haven't yet been through any planner
+ * processing.  This means that applying find_nonnullable_vars() on them isn't
+ * really ideal: for lack of const-simplification, we might be unable to prove
+ * not-nullness in some cases where we could have proved it afterwards.
+ * However, we should not get any false positive results.
+ *
+ * Here we can err on the side of conservatism: if we're not sure then it's
+ * okay to return FALSE.
+ */
+bool
+expressions_are_not_nullable(Query *query, List *exprs, Node *notnull_proofs)
+{
+	Relids		innerjoined_rels = NULL;
+	List	   *innerjoined_useful_quals = NIL;
+	bool		computed_innerjoined_rels = false;
+	List	   *nonnullable_vars = NIL;
+	bool		computed_nonnullable_vars = false;
+	List	   *nonnullable_inner_vars = NIL;
+	bool		computed_nonnullable_inner_vars = false;
+	ListCell   *lc;
+
+	foreach(lc, exprs)
+	{
+		Expr	   *expr = (Expr *) lfirst(lc);
+
+		/*
+		 * For the most part we don't try to deal with anything more complex
+		 * than Consts and Vars; but it seems worthwhile to look through
+		 * binary relabelings, since we know those don't introduce nulls.
+		 */
+		while (expr && IsA(expr, RelabelType))
+			expr = ((RelabelType *) expr)->arg;
+
+		if (expr == NULL)		/* paranoia */
+			return false;
+
+		if (IsA(expr, Const))
+		{
+			/* Consts are easy: they're either null or not. */
+			if (((Const *) expr)->constisnull)
+				return false;
+		}
+		else if (IsA(expr, Var))
+		{
+			Var		   *var = (Var *) expr;
+
+			/* Currently, we punt for any nonlocal Vars */
+			if (var->varlevelsup != 0)
+				return false;
+
+			/*
+			 * Since the subquery hasn't yet been through expression
+			 * preprocessing, we must apply flatten_join_alias_vars to the
+			 * given Var, and to any Vars found by find_nonnullable_vars, to
+			 * avoid being fooled by join aliases.  If we get something other
+			 * than a plain Var out of the substitution, punt.
+			 */
+			var = (Var *) flatten_join_alias_vars(query, (Node *) var);
+
+			if (!IsA(var, Var))
+				return false;
+			Assert(var->varlevelsup == 0);
+
+			/*
+			 * We don't bother to compute innerjoined_rels until we've found a
+			 * Var we must analyze.
+			 */
+			if (!computed_innerjoined_rels)
+			{
+				find_innerjoined_rels((Node *) query->jointree,
+									  &innerjoined_rels,
+									  &innerjoined_useful_quals);
+				computed_innerjoined_rels = true;
+			}
+
+			/* Check if the Var is from an INNER JOINed rel */
+			if (bms_is_member(var->varno, innerjoined_rels))
+			{
+				RangeTblEntry *rte = rt_fetch(var->varno, query->rtable);
+
+				/*
+				 * If Var is from a plain relation and its column is marked
+				 * NOT NULL according to the catalogs, it can't produce NULL.
+				 */
+				if (rte->rtekind == RTE_RELATION &&
+					get_attnotnull(rte->relid, var->varattno))
+					continue;	/* cannot produce NULL */
+
+				/*
+				 * Otherwise check for the existance of quals which filter
+				 * out NULL values for this Var.  We may need to compute the
+				 * nonnullable_inner_vars, if not done already.
+				 */
+				if (!computed_nonnullable_inner_vars)
+				{
+					nonnullable_inner_vars =
+						find_nonnullable_vars((Node *) innerjoined_useful_quals);
+					nonnullable_inner_vars = (List *)
+						flatten_join_alias_vars(query,
+											(Node *) nonnullable_inner_vars);
+					/* We don't bother removing any non-Vars from the result */
+					computed_nonnullable_inner_vars = true;
+				}
+
+				if (list_member(nonnullable_inner_vars, var))
+					continue;	/* cannot produce NULL */
+			}
+
+			/*
+			 * Even if that didn't work, we can conclude that the Var is not
+			 * nullable if find_nonnullable_vars can find a "var IS NOT NULL"
+			 * or similarly strict condition among the notnull_proofs.
+			 * Compute the list of Vars having such quals if we didn't
+			 * already.
+			 */
+			if (!computed_nonnullable_vars)
+			{
+				nonnullable_vars = find_nonnullable_vars(notnull_proofs);
+				nonnullable_vars = (List *)
+					flatten_join_alias_vars(query,
+											(Node *) nonnullable_vars);
+				/* We don't bother removing any non-Vars from the result */
+				computed_nonnullable_vars = true;
+			}
+
+			if (!list_member(nonnullable_vars, var))
+				return false;	/* we failed to prove the Var non-null */
+		}
+		else
+		{
+			/* Not a Const or Var; punt */
+			return false;
+		}
+	}
+
+	return true;				/* exprs cannot emit NULLs */
+}
+
+/*
+ * query_outputs_are_not_nullable
+ *		Returns TRUE if the output values of the Query are certainly not NULL.
+ *		All output columns must return non-NULL to answer TRUE.
+ *
+ * The reason this takes a Query, and not just an individual tlist expression,
+ * is so that we can make use of the query's WHERE/ON clauses to prove it does
+ * not return nulls.
+ *
+ * In current usage, the passed sub-Query hasn't yet been through any planner
+ * processing.  This means that applying find_nonnullable_vars() to its WHERE
+ * clauses isn't really ideal: for lack of const-simplification, we might be
+ * unable to prove not-nullness in some cases where we could have proved it
+ * afterwards.  However, we should not get any false positive results.
+ *
+ * Like the other forms of nullability analysis above, we can err on the
+ * side of conservatism: if we're not sure, it's okay to return FALSE.
+ */
+bool
+query_outputs_are_not_nullable(Query *query)
+{
+	Relids		innerjoined_rels = NULL;
+	bool		computed_innerjoined_rels = false;
+	List	   *usable_quals = NIL;
+	List	   *nonnullable_vars = NIL;
+	bool		computed_nonnullable_vars = false;
+	ListCell   *tl;
+
+	/*
+	 * If the query contains set operations, punt.  The set ops themselves
+	 * couldn't introduce nulls that weren't in their inputs, but the tlist
+	 * present in the top-level query is just dummy and won't give us useful
+	 * info.  We could get an answer by recursing to examine each leaf query,
+	 * but for the moment it doesn't seem worth the extra complication.
+	 *
+	 * Note that we needn't consider other top-level operators such as
+	 * DISTINCT, GROUP BY, etc, as those will not introduce nulls either.
+	 */
+	if (query->setOperations)
+		return false;
+
+	/*
+	 * Examine each targetlist entry to prove that it can't produce NULL.
+	 */
+	foreach(tl, query->targetList)
+	{
+		TargetEntry *tle = (TargetEntry *) lfirst(tl);
+		Expr	   *expr = tle->expr;
+
+		/* Resjunk columns can be ignored: they don't produce output values */
+		if (tle->resjunk)
+			continue;
+
+		/*
+		 * For the most part we don't try to deal with anything more complex
+		 * than Consts and Vars; but it seems worthwhile to look through
+		 * binary relabelings, since we know those don't introduce nulls.
+		 */
+		while (expr && IsA(expr, RelabelType))
+			expr = ((RelabelType *) expr)->arg;
+
+		if (expr == NULL)		/* paranoia */
+			return false;
+
+		if (IsA(expr, Const))
+		{
+			/* Consts are easy: they're either null or not. */
+			if (((Const *) expr)->constisnull)
+				return false;
+		}
+		else if (IsA(expr, Var))
+		{
+			Var		   *var = (Var *) expr;
+
+			/* Currently, we punt for any nonlocal Vars */
+			if (var->varlevelsup != 0)
+				return false;
+
+			/*
+			 * Since the subquery hasn't yet been through expression
+			 * preprocessing, we must apply flatten_join_alias_vars to the
+			 * given Var, and to any Vars found by find_nonnullable_vars, to
+			 * avoid being fooled by join aliases.  If we get something other
+			 * than a plain Var out of the substitution, punt.
+			 */
+			var = (Var *) flatten_join_alias_vars(query, (Node *) var);
+
+			if (!IsA(var, Var))
+				return false;
+			Assert(var->varlevelsup == 0);
+
+			/*
+			 * We don't bother to compute innerjoined_rels and usable_quals
+			 * until we've found a Var we must analyze.
+			 */
+			if (!computed_innerjoined_rels)
+			{
+				find_innerjoined_rels((Node *) query->jointree,
+									  &innerjoined_rels, &usable_quals);
+				computed_innerjoined_rels = true;
+			}
+
+			/*
+			 * If Var is from a plain relation, and that relation is not on
+			 * the nullable side of any outer join, and its column is marked
+			 * NOT NULL according to the catalogs, it can't produce NULL.
+			 */
+			if (bms_is_member(var->varno, innerjoined_rels))
+			{
+				RangeTblEntry *rte = rt_fetch(var->varno, query->rtable);
+
+				if (rte->rtekind == RTE_RELATION &&
+					get_attnotnull(rte->relid, var->varattno))
+					continue;	/* cannot produce NULL */
+			}
+
+			/*
+			 * Even if that didn't work, we can conclude that the Var is not
+			 * nullable if find_nonnullable_vars can find a "var IS NOT NULL"
+			 * or similarly strict condition among the usable_quals.  Compute
+			 * the list of Vars having such quals if we didn't already.
+			 */
+			if (!computed_nonnullable_vars)
+			{
+				nonnullable_vars = find_nonnullable_vars((Node *) usable_quals);
+				nonnullable_vars = (List *)
+					flatten_join_alias_vars(query,
+											(Node *) nonnullable_vars);
+				/* We don't bother removing any non-Vars from the result */
+				computed_nonnullable_vars = true;
+			}
+
+			if (!list_member(nonnullable_vars, var))
+				return false;	/* we failed to prove the Var non-null */
+		}
+		else
+		{
+			/* Not a Const or Var; punt */
+			return false;
+		}
+	}
+
+	return true;				/* query cannot emit NULLs */
+}
+
+/*
+ * find_innerjoined_rels
+ *		Traverse jointree to locate non-outerjoined-rels and quals above them
+ *
+ * We fill innerjoined_rels with the relids of all rels that are not below
+ * the nullable side of any outer join (which would cause their Vars to be
+ * possibly NULL regardless of what's in the catalogs).  In the same scan,
+ * we locate all WHERE and JOIN/ON quals that constrain these rels add them to
+ * the usable_quals list (forming a list with implicit-AND semantics).
+ *
+ * Top-level caller must initialize innerjoined_rels/usable_quals to NULL/NIL.
+ */
+static void
+find_innerjoined_rels(Node *jtnode,
+					  Relids *innerjoined_rels, List **usable_quals)
+{
+	if (jtnode == NULL)
+		return;
+	if (IsA(jtnode, RangeTblRef))
+	{
+		int			varno = ((RangeTblRef *) jtnode)->rtindex;
+
+		*innerjoined_rels = bms_add_member(*innerjoined_rels, varno);
+	}
+	else if (IsA(jtnode, FromExpr))
+	{
+		FromExpr   *f = (FromExpr *) jtnode;
+		ListCell   *lc;
+
+		/* All elements of the FROM list are allowable */
+		foreach(lc, f->fromlist)
+			find_innerjoined_rels((Node *) lfirst(lc),
+								  innerjoined_rels, usable_quals);
+		/* ... and its WHERE quals are too */
+		if (f->quals)
+			*usable_quals = lappend(*usable_quals, f->quals);
+	}
+	else if (IsA(jtnode, JoinExpr))
+	{
+		JoinExpr   *j = (JoinExpr *) jtnode;
+
+		switch (j->jointype)
+		{
+			case JOIN_INNER:
+				/* visit both children */
+				find_innerjoined_rels(j->larg,
+									  innerjoined_rels, usable_quals);
+				find_innerjoined_rels(j->rarg,
+									  innerjoined_rels, usable_quals);
+				/* and grab the ON quals too */
+				if (j->quals)
+					*usable_quals = lappend(*usable_quals, j->quals);
+				break;
+
+			case JOIN_LEFT:
+			case JOIN_SEMI:
+			case JOIN_ANTI:
+
+				/*
+				 * Only the left input is possibly non-nullable; furthermore,
+				 * the quals of this join don't constrain the left input.
+				 * Note: we probably can't see SEMI or ANTI joins at this
+				 * point, but if we do, we can treat them like LEFT joins.
+				 */
+				find_innerjoined_rels(j->larg,
+									  innerjoined_rels, usable_quals);
+				break;
+
+			case JOIN_RIGHT:
+				/* Reverse of the above case */
+				find_innerjoined_rels(j->rarg,
+									  innerjoined_rels, usable_quals);
+				break;
+
+			case JOIN_FULL:
+				/* Neither side is non-nullable, so stop descending */
+				break;
+
+			default:
+				elog(ERROR, "unrecognized join type: %d",
+					 (int) j->jointype);
+		}
+	}
+	else
+		elog(ERROR, "unrecognized node type: %d",
+			 (int) nodeTag(jtnode));
+}
+
 /*
  * Can we treat a ScalarArrayOpExpr as strict?
  *
diff --git a/src/backend/utils/cache/lsyscache.c b/src/backend/utils/cache/lsyscache.c
index e88c45d268..822fb0355f 100644
--- a/src/backend/utils/cache/lsyscache.c
+++ b/src/backend/utils/cache/lsyscache.c
@@ -388,6 +388,44 @@ get_mergejoin_opfamilies(Oid opno)
 	return result;
 }
 
+/*
+ * has_mergejoin_opfamilies
+ *		A lightweight version of get_mergejoin_opfamilies that can be used
+ *		when we only need to know that mergejoin opfamilies exist and we don't
+ *		need an actual list of the matching families.
+ */
+bool
+has_mergejoin_opfamilies(Oid opno)
+{
+	bool		result = false;
+	CatCList   *catlist;
+	int			i;
+
+	/*
+	 * Search pg_amop to see if the target operator is registered as the "="
+	 * operator of any btree opfamily.
+	 */
+	catlist = SearchSysCacheList1(AMOPOPID, ObjectIdGetDatum(opno));
+
+	for (i = 0; i < catlist->n_members; i++)
+	{
+		HeapTuple	tuple = &catlist->members[i]->tuple;
+		Form_pg_amop aform = (Form_pg_amop) GETSTRUCT(tuple);
+
+		/* must be btree equality */
+		if (aform->amopmethod == BTREE_AM_OID &&
+			aform->amopstrategy == BTEqualStrategyNumber)
+		{
+			result = true;
+			break;
+		}
+	}
+
+	ReleaseSysCacheList(catlist);
+
+	return result;
+}
+
 /*
  * get_compatible_hash_operators
  *		Get the OID(s) of hash equality operator(s) compatible with the given
@@ -878,6 +916,35 @@ get_atttypetypmodcoll(Oid relid, AttrNumber attnum,
 	ReleaseSysCache(tp);
 }
 
+/*
+ * get_attnotnull
+ *
+ *		Given the relation id and the attribute number,
+ *		return the "attnotnull" field from the attribute relation.
+ *
+ *		Returns false if the attr doesn't exist (or is dropped).
+ */
+bool
+get_attnotnull(Oid relid, AttrNumber attnum)
+{
+	HeapTuple	tp;
+
+	tp = SearchSysCache2(ATTNUM,
+						 ObjectIdGetDatum(relid),
+						 Int16GetDatum(attnum));
+	if (HeapTupleIsValid(tp))
+	{
+		Form_pg_attribute att_tup = (Form_pg_attribute) GETSTRUCT(tp);
+		bool		result;
+
+		result = att_tup->attnotnull;
+		ReleaseSysCache(tp);
+		return result;
+	}
+	else
+		return false;
+}
+
 /*				---------- COLLATION CACHE ----------					 */
 
 /*
diff --git a/src/include/optimizer/clauses.h b/src/include/optimizer/clauses.h
index 5e10fb1d50..b4d189eabc 100644
--- a/src/include/optimizer/clauses.h
+++ b/src/include/optimizer/clauses.h
@@ -44,6 +44,9 @@ extern Relids find_nonnullable_rels(Node *clause);
 extern List *find_nonnullable_vars(Node *clause);
 extern List *find_forced_null_vars(Node *clause);
 extern Var *find_forced_null_var(Node *clause);
+extern bool expressions_are_not_nullable(Query *query, List *exprs,
+							 Node *notnull_proofs);
+extern bool query_outputs_are_not_nullable(Query *query);
 
 extern bool is_pseudo_constant_clause(Node *clause);
 extern bool is_pseudo_constant_clause_relids(Node *clause, Relids relids);
diff --git a/src/include/optimizer/subselect.h b/src/include/optimizer/subselect.h
index 2d2c3bcbc0..48216a6277 100644
--- a/src/include/optimizer/subselect.h
+++ b/src/include/optimizer/subselect.h
@@ -19,7 +19,9 @@
 extern void SS_process_ctes(PlannerInfo *root);
 extern JoinExpr *convert_ANY_sublink_to_join(PlannerInfo *root,
 							SubLink *sublink,
-							Relids available_rels);
+							bool under_not,
+							Relids available_rels,
+							Node *notnull_proofs);
 extern JoinExpr *convert_EXISTS_sublink_to_join(PlannerInfo *root,
 							   SubLink *sublink,
 							   bool under_not,
diff --git a/src/include/utils/lsyscache.h b/src/include/utils/lsyscache.h
index 16b0b1d2dc..908de931c1 100644
--- a/src/include/utils/lsyscache.h
+++ b/src/include/utils/lsyscache.h
@@ -76,6 +76,7 @@ extern bool get_ordering_op_properties(Oid opno,
 extern Oid	get_equality_op_for_ordering_op(Oid opno, bool *reverse);
 extern Oid	get_ordering_op_for_equality_op(Oid opno, bool use_lhs_type);
 extern List *get_mergejoin_opfamilies(Oid opno);
+extern bool has_mergejoin_opfamilies(Oid opno);
 extern bool get_compatible_hash_operators(Oid opno,
 							  Oid *lhs_opno, Oid *rhs_opno);
 extern bool get_op_hash_functions(Oid opno,
@@ -89,6 +90,7 @@ extern AttrNumber get_attnum(Oid relid, const char *attname);
 extern Oid	get_atttype(Oid relid, AttrNumber attnum);
 extern void get_atttypetypmodcoll(Oid relid, AttrNumber attnum,
 					  Oid *typid, int32 *typmod, Oid *collid);
+extern bool get_attnotnull(Oid relid, AttrNumber attnum);
 extern char *get_collation_name(Oid colloid);
 extern char *get_constraint_name(Oid conoid);
 extern char *get_language_name(Oid langoid, bool missing_ok);
diff --git a/src/test/regress/expected/subselect.out b/src/test/regress/expected/subselect.out
index fe5fc64480..ae01307875 100644
--- a/src/test/regress/expected/subselect.out
+++ b/src/test/regress/expected/subselect.out
@@ -1323,3 +1323,381 @@ select * from x for update;
    Output: subselect_tbl.f1, subselect_tbl.f2, subselect_tbl.f3
 (2 rows)
 
+--
+-- Check NOT IN performs an ANTI JOIN when NULLs are not possible
+-- on either side of the, would be, join condition.
+--
+BEGIN;
+CREATE TEMP TABLE a (id INT PRIMARY KEY);
+CREATE TEMP TABLE b (x INT NOT NULL, y INT);
+CREATE TEMP TABLE c (z INT NOT NULL);
+INSERT INTO b VALUES(1,1),(2,2),(3,NULL);
+-- No ANTI JOIN, b.x is from an outer join
+EXPLAIN (COSTS OFF)
+SELECT * FROM a
+LEFT OUTER JOIN b ON a.id = b.x
+WHERE b.x NOT IN(SELECT z FROM c);
+             QUERY PLAN             
+------------------------------------
+ Hash Right Join
+   Hash Cond: (b.x = a.id)
+   Filter: (NOT (hashed SubPlan 1))
+   ->  Seq Scan on b
+   ->  Hash
+         ->  Seq Scan on a
+   SubPlan 1
+     ->  Seq Scan on c
+(8 rows)
+
+-- ANTI JOIN, b.x is from an outer join but b.x > 100
+-- forces the join not to produce NULL on the righthand
+-- side.
+EXPLAIN (COSTS OFF)
+SELECT * FROM a
+LEFT OUTER JOIN b ON a.id = b.x
+WHERE b.x NOT IN(SELECT z FROM c)
+  AND b.x > 100;
+           QUERY PLAN            
+---------------------------------
+ Hash Join
+   Hash Cond: (b.x = a.id)
+   ->  Hash Anti Join
+         Hash Cond: (b.x = c.z)
+         ->  Seq Scan on b
+               Filter: (x > 100)
+         ->  Hash
+               ->  Seq Scan on c
+   ->  Hash
+         ->  Seq Scan on a
+(10 rows)
+
+-- No ANTI JOIN, b.x is from an outer join
+EXPLAIN (COSTS OFF)
+SELECT * FROM a
+FULL OUTER JOIN b ON a.id = b.x
+WHERE b.x NOT IN(SELECT y FROM c);
+         QUERY PLAN          
+-----------------------------
+ Hash Full Join
+   Hash Cond: (b.x = a.id)
+   Filter: (NOT (SubPlan 1))
+   ->  Seq Scan on b
+   ->  Hash
+         ->  Seq Scan on a
+   SubPlan 1
+     ->  Seq Scan on c
+(8 rows)
+
+-- No ANTI JOIN. y can have NULLs
+EXPLAIN (COSTS OFF)
+SELECT * FROM b WHERE y NOT IN (SELECT z FROM c);
+             QUERY PLAN             
+------------------------------------
+ Seq Scan on b
+   Filter: (NOT (hashed SubPlan 1))
+   SubPlan 1
+     ->  Seq Scan on c
+(4 rows)
+
+-- c is an empty relation so should cause no filtering on b
+SELECT * FROM b WHERE y NOT IN (SELECT z FROM c);
+ x | y 
+---+---
+ 1 | 1
+ 2 | 2
+ 3 |  
+(3 rows)
+
+INSERT INTO c VALUES(1);
+-- Records where y is NULL should be filtered out.
+SELECT * FROM b WHERE y NOT IN (SELECT z FROM c);
+ x | y 
+---+---
+ 2 | 2
+(1 row)
+
+-- No ANTI JOIN, y can be NULL
+EXPLAIN (COSTS OFF)
+SELECT * FROM a WHERE id NOT IN (SELECT y FROM b);
+             QUERY PLAN             
+------------------------------------
+ Seq Scan on a
+   Filter: (NOT (hashed SubPlan 1))
+   SubPlan 1
+     ->  Seq Scan on b
+(4 rows)
+
+-- No ANTI JOIN, x is NOT NULL, but we don't know if + 1 will change that.
+EXPLAIN (COSTS OFF)
+SELECT * FROM a WHERE id NOT IN (SELECT x+1 FROM b);
+             QUERY PLAN             
+------------------------------------
+ Seq Scan on a
+   Filter: (NOT (hashed SubPlan 1))
+   SubPlan 1
+     ->  Seq Scan on b
+(4 rows)
+
+-- ANTI JOIN 1 is a Const that is not null.
+EXPLAIN (COSTS OFF)
+SELECT * FROM a WHERE id NOT IN (SELECT 1 FROM b);
+        QUERY PLAN         
+---------------------------
+ Nested Loop Anti Join
+   Join Filter: (a.id = 1)
+   ->  Seq Scan on a
+   ->  Materialize
+         ->  Seq Scan on b
+(5 rows)
+
+-- No ANTI JOIN, results contain a NULL Const
+EXPLAIN (COSTS OFF)
+SELECT * FROM a WHERE id NOT IN (SELECT NULL::int FROM b);
+             QUERY PLAN             
+------------------------------------
+ Seq Scan on a
+   Filter: (NOT (hashed SubPlan 1))
+   SubPlan 1
+     ->  Seq Scan on b
+(4 rows)
+
+-- ANTI JOIN y = 1 means y can't be NULL
+EXPLAIN (COSTS OFF)
+SELECT * FROM a WHERE id NOT IN (SELECT y FROM b WHERE y = 1);
+          QUERY PLAN           
+-------------------------------
+ Hash Anti Join
+   Hash Cond: (a.id = b.y)
+   ->  Seq Scan on a
+   ->  Hash
+         ->  Seq Scan on b
+               Filter: (y = 1)
+(6 rows)
+
+-- No ANTI JOIN, OR condition does not ensure y = 1
+EXPLAIN (COSTS OFF)
+SELECT * FROM a WHERE id NOT IN (SELECT y FROM b WHERE y = 1 OR x = 1);
+               QUERY PLAN               
+----------------------------------------
+ Seq Scan on a
+   Filter: (NOT (hashed SubPlan 1))
+   SubPlan 1
+     ->  Seq Scan on b
+           Filter: ((y = 1) OR (x = 1))
+(5 rows)
+
+-- No ANTI JOIN, OR condition does not ensure y = 1 or y = 2
+EXPLAIN (COSTS OFF)
+SELECT * FROM a WHERE id NOT IN (SELECT y FROM b WHERE (y = 1 OR x = 1) AND (y = 2 OR x = 2));
+                            QUERY PLAN                             
+-------------------------------------------------------------------
+ Seq Scan on a
+   Filter: (NOT (hashed SubPlan 1))
+   SubPlan 1
+     ->  Seq Scan on b
+           Filter: (((y = 1) OR (x = 1)) AND ((y = 2) OR (x = 2)))
+(5 rows)
+
+-- ANTI JOIN y must be 2, so can't be NULL
+EXPLAIN (COSTS OFF)
+SELECT * FROM a WHERE id NOT IN (SELECT y FROM b WHERE (y = 1 OR x = 1) AND y = 2);
+                        QUERY PLAN                        
+----------------------------------------------------------
+ Hash Anti Join
+   Hash Cond: (a.id = b.y)
+   ->  Seq Scan on a
+   ->  Hash
+         ->  Seq Scan on b
+               Filter: ((y = 2) AND ((y = 1) OR (x = 1)))
+(6 rows)
+
+-- ANTI JOIN y can be 1 or 2, but can't be null.
+EXPLAIN (COSTS OFF)
+SELECT * FROM a WHERE id NOT IN (SELECT y FROM b WHERE (y = 1 OR y = 2));
+                 QUERY PLAN                 
+--------------------------------------------
+ Hash Anti Join
+   Hash Cond: (a.id = b.y)
+   ->  Seq Scan on a
+   ->  Hash
+         ->  Seq Scan on b
+               Filter: ((y = 1) OR (y = 2))
+(6 rows)
+
+-- No ANTI JOIN c.z is from a left outer join so it can have nulls.
+EXPLAIN (COSTS OFF)
+SELECT * FROM a WHERE id NOT IN (SELECT c.z FROM b LEFT JOIN c ON b.x = c.z);
+             QUERY PLAN             
+------------------------------------
+ Seq Scan on a
+   Filter: (NOT (hashed SubPlan 1))
+   SubPlan 1
+     ->  Merge Left Join
+           Merge Cond: (b.x = c.z)
+           ->  Sort
+                 Sort Key: b.x
+                 ->  Seq Scan on b
+           ->  Sort
+                 Sort Key: c.z
+                 ->  Seq Scan on c
+(11 rows)
+
+-- ANTI JOIN, c.z is not from an outer join
+EXPLAIN (COSTS OFF)
+SELECT * FROM a WHERE id NOT IN (SELECT c.z FROM b RIGHT JOIN c ON b.x = c.z);
+               QUERY PLAN                
+-----------------------------------------
+ Merge Anti Join
+   Merge Cond: (a.id = c.z)
+   ->  Index Only Scan using a_pkey on a
+   ->  Merge Left Join
+         Merge Cond: (c.z = b.x)
+         ->  Sort
+               Sort Key: c.z
+               ->  Seq Scan on c
+         ->  Sort
+               Sort Key: b.x
+               ->  Seq Scan on b
+(11 rows)
+
+-- No ANTI JOIN, b.x is from an outer join
+EXPLAIN (COSTS OFF)
+SELECT * FROM a WHERE id NOT IN (SELECT b.x FROM b RIGHT JOIN c ON b.x = c.z);
+             QUERY PLAN             
+------------------------------------
+ Seq Scan on a
+   Filter: (NOT (hashed SubPlan 1))
+   SubPlan 1
+     ->  Merge Right Join
+           Merge Cond: (b.x = c.z)
+           ->  Sort
+                 Sort Key: b.x
+                 ->  Seq Scan on b
+           ->  Sort
+                 Sort Key: c.z
+                 ->  Seq Scan on c
+(11 rows)
+
+-- No ANTI JOIN, c.z is from an outer join
+EXPLAIN (COSTS OFF)
+SELECT * FROM a WHERE id NOT IN (SELECT c.z FROM b FULL JOIN c ON b.x = c.z);
+             QUERY PLAN             
+------------------------------------
+ Seq Scan on a
+   Filter: (NOT (hashed SubPlan 1))
+   SubPlan 1
+     ->  Merge Full Join
+           Merge Cond: (b.x = c.z)
+           ->  Sort
+                 Sort Key: b.x
+                 ->  Seq Scan on b
+           ->  Sort
+                 Sort Key: c.z
+                 ->  Seq Scan on c
+(11 rows)
+
+-- No ANTI JOIN, b.x is from an outer join
+EXPLAIN (COSTS OFF)
+SELECT * FROM a WHERE id NOT IN (SELECT b.x FROM b FULL JOIN c ON b.x = c.z);
+             QUERY PLAN             
+------------------------------------
+ Seq Scan on a
+   Filter: (NOT (hashed SubPlan 1))
+   SubPlan 1
+     ->  Merge Full Join
+           Merge Cond: (b.x = c.z)
+           ->  Sort
+                 Sort Key: b.x
+                 ->  Seq Scan on b
+           ->  Sort
+                 Sort Key: c.z
+                 ->  Seq Scan on c
+(11 rows)
+
+-- ANTI JOIN, c.z is from an inner join and has a NOT NULL constraint.
+EXPLAIN (COSTS OFF)
+SELECT * FROM a WHERE id NOT IN (SELECT c.z FROM b INNER JOIN c ON b.x = c.z);
+               QUERY PLAN                
+-----------------------------------------
+ Merge Anti Join
+   Merge Cond: (a.id = c.z)
+   ->  Index Only Scan using a_pkey on a
+   ->  Merge Join
+         Merge Cond: (b.x = c.z)
+         ->  Sort
+               Sort Key: b.x
+               ->  Seq Scan on b
+         ->  Sort
+               Sort Key: c.z
+               ->  Seq Scan on c
+(11 rows)
+
+-- ANTI JOIN, c.z must be 1
+EXPLAIN (COSTS OFF)
+SELECT * FROM a WHERE id NOT IN (SELECT c.z FROM b LEFT JOIN c ON b.x = c.z WHERE c.z = 1);
+                QUERY PLAN                 
+-------------------------------------------
+ Hash Anti Join
+   Hash Cond: (a.id = c.z)
+   ->  Seq Scan on a
+   ->  Hash
+         ->  Nested Loop
+               ->  Seq Scan on c
+                     Filter: (z = 1)
+               ->  Materialize
+                     ->  Seq Scan on b
+                           Filter: (x = 1)
+(10 rows)
+
+-- ANTI JOIN, c.z can't be NULL
+EXPLAIN (COSTS OFF)
+SELECT * FROM a WHERE id NOT IN (SELECT c.z FROM b LEFT JOIN c ON b.x = c.z WHERE c.z IS NOT NULL);
+                 QUERY PLAN                  
+---------------------------------------------
+ Merge Anti Join
+   Merge Cond: (a.id = c.z)
+   ->  Index Only Scan using a_pkey on a
+   ->  Merge Join
+         Merge Cond: (b.x = c.z)
+         ->  Sort
+               Sort Key: b.x
+               ->  Seq Scan on b
+         ->  Sort
+               Sort Key: c.z
+               ->  Seq Scan on c
+                     Filter: (z IS NOT NULL)
+(12 rows)
+
+-- ANTI JOIN, b.y cannot be NULL due to the join condition b.y = c.z
+EXPLAIN (COSTS OFF)
+SELECT * FROM a WHERE id NOT IN (SELECT b.y FROM b INNER JOIN c ON b.y = c.z);
+               QUERY PLAN                
+-----------------------------------------
+ Merge Anti Join
+   Merge Cond: (a.id = b.y)
+   ->  Index Only Scan using a_pkey on a
+   ->  Merge Join
+         Merge Cond: (b.y = c.z)
+         ->  Sort
+               Sort Key: b.y
+               ->  Seq Scan on b
+         ->  Sort
+               Sort Key: c.z
+               ->  Seq Scan on c
+(11 rows)
+
+-- No ANTI JOIN when the type's '=' operator is not a member of a btree opfamily.
+CREATE TEMP TABLE l1 (a LINE NOT NULL);
+CREATE TEMP TABLE l2 (b LINE NOT NULL);
+EXPLAIN (COSTS OFF)
+SELECT * FROM l1 WHERE a NOT IN(SELECT b FROM l2);
+          QUERY PLAN          
+------------------------------
+ Seq Scan on l1
+   Filter: (NOT (SubPlan 1))
+   SubPlan 1
+     ->  Materialize
+           ->  Seq Scan on l2
+(5 rows)
+
+ROLLBACK;
diff --git a/src/test/regress/sql/subselect.sql b/src/test/regress/sql/subselect.sql
index b5931ee700..9ae4c4ebb3 100644
--- a/src/test/regress/sql/subselect.sql
+++ b/src/test/regress/sql/subselect.sql
@@ -692,3 +692,129 @@ select * from (with x as (select 2 as y) select * from x) ss;
 explain (verbose, costs off)
 with x as (select * from subselect_tbl)
 select * from x for update;
+
+--
+-- Check NOT IN performs an ANTI JOIN when NULLs are not possible
+-- on either side of the, would be, join condition.
+--
+
+BEGIN;
+
+CREATE TEMP TABLE a (id INT PRIMARY KEY);
+CREATE TEMP TABLE b (x INT NOT NULL, y INT);
+CREATE TEMP TABLE c (z INT NOT NULL);
+
+INSERT INTO b VALUES(1,1),(2,2),(3,NULL);
+
+-- No ANTI JOIN, b.x is from an outer join
+EXPLAIN (COSTS OFF)
+SELECT * FROM a
+LEFT OUTER JOIN b ON a.id = b.x
+WHERE b.x NOT IN(SELECT z FROM c);
+
+-- ANTI JOIN, b.x is from an outer join but b.x > 100
+-- forces the join not to produce NULL on the righthand
+-- side.
+EXPLAIN (COSTS OFF)
+SELECT * FROM a
+LEFT OUTER JOIN b ON a.id = b.x
+WHERE b.x NOT IN(SELECT z FROM c)
+  AND b.x > 100;
+
+-- No ANTI JOIN, b.x is from an outer join
+EXPLAIN (COSTS OFF)
+SELECT * FROM a
+FULL OUTER JOIN b ON a.id = b.x
+WHERE b.x NOT IN(SELECT y FROM c);
+
+-- No ANTI JOIN. y can have NULLs
+EXPLAIN (COSTS OFF)
+SELECT * FROM b WHERE y NOT IN (SELECT z FROM c);
+
+-- c is an empty relation so should cause no filtering on b
+SELECT * FROM b WHERE y NOT IN (SELECT z FROM c);
+
+INSERT INTO c VALUES(1);
+
+-- Records where y is NULL should be filtered out.
+SELECT * FROM b WHERE y NOT IN (SELECT z FROM c);
+
+-- No ANTI JOIN, y can be NULL
+EXPLAIN (COSTS OFF)
+SELECT * FROM a WHERE id NOT IN (SELECT y FROM b);
+
+-- No ANTI JOIN, x is NOT NULL, but we don't know if + 1 will change that.
+EXPLAIN (COSTS OFF)
+SELECT * FROM a WHERE id NOT IN (SELECT x+1 FROM b);
+
+-- ANTI JOIN 1 is a Const that is not null.
+EXPLAIN (COSTS OFF)
+SELECT * FROM a WHERE id NOT IN (SELECT 1 FROM b);
+
+-- No ANTI JOIN, results contain a NULL Const
+EXPLAIN (COSTS OFF)
+SELECT * FROM a WHERE id NOT IN (SELECT NULL::int FROM b);
+
+-- ANTI JOIN y = 1 means y can't be NULL
+EXPLAIN (COSTS OFF)
+SELECT * FROM a WHERE id NOT IN (SELECT y FROM b WHERE y = 1);
+
+-- No ANTI JOIN, OR condition does not ensure y = 1
+EXPLAIN (COSTS OFF)
+SELECT * FROM a WHERE id NOT IN (SELECT y FROM b WHERE y = 1 OR x = 1);
+
+-- No ANTI JOIN, OR condition does not ensure y = 1 or y = 2
+EXPLAIN (COSTS OFF)
+SELECT * FROM a WHERE id NOT IN (SELECT y FROM b WHERE (y = 1 OR x = 1) AND (y = 2 OR x = 2));
+
+-- ANTI JOIN y must be 2, so can't be NULL
+EXPLAIN (COSTS OFF)
+SELECT * FROM a WHERE id NOT IN (SELECT y FROM b WHERE (y = 1 OR x = 1) AND y = 2);
+
+-- ANTI JOIN y can be 1 or 2, but can't be null.
+EXPLAIN (COSTS OFF)
+SELECT * FROM a WHERE id NOT IN (SELECT y FROM b WHERE (y = 1 OR y = 2));
+
+-- No ANTI JOIN c.z is from a left outer join so it can have nulls.
+EXPLAIN (COSTS OFF)
+SELECT * FROM a WHERE id NOT IN (SELECT c.z FROM b LEFT JOIN c ON b.x = c.z);
+
+-- ANTI JOIN, c.z is not from an outer join
+EXPLAIN (COSTS OFF)
+SELECT * FROM a WHERE id NOT IN (SELECT c.z FROM b RIGHT JOIN c ON b.x = c.z);
+
+-- No ANTI JOIN, b.x is from an outer join
+EXPLAIN (COSTS OFF)
+SELECT * FROM a WHERE id NOT IN (SELECT b.x FROM b RIGHT JOIN c ON b.x = c.z);
+
+-- No ANTI JOIN, c.z is from an outer join
+EXPLAIN (COSTS OFF)
+SELECT * FROM a WHERE id NOT IN (SELECT c.z FROM b FULL JOIN c ON b.x = c.z);
+
+-- No ANTI JOIN, b.x is from an outer join
+EXPLAIN (COSTS OFF)
+SELECT * FROM a WHERE id NOT IN (SELECT b.x FROM b FULL JOIN c ON b.x = c.z);
+
+-- ANTI JOIN, c.z is from an inner join and has a NOT NULL constraint.
+EXPLAIN (COSTS OFF)
+SELECT * FROM a WHERE id NOT IN (SELECT c.z FROM b INNER JOIN c ON b.x = c.z);
+
+-- ANTI JOIN, c.z must be 1
+EXPLAIN (COSTS OFF)
+SELECT * FROM a WHERE id NOT IN (SELECT c.z FROM b LEFT JOIN c ON b.x = c.z WHERE c.z = 1);
+
+-- ANTI JOIN, c.z can't be NULL
+EXPLAIN (COSTS OFF)
+SELECT * FROM a WHERE id NOT IN (SELECT c.z FROM b LEFT JOIN c ON b.x = c.z WHERE c.z IS NOT NULL);
+
+-- ANTI JOIN, b.y cannot be NULL due to the join condition b.y = c.z
+EXPLAIN (COSTS OFF)
+SELECT * FROM a WHERE id NOT IN (SELECT b.y FROM b INNER JOIN c ON b.y = c.z);
+
+-- No ANTI JOIN when the type's '=' operator is not a member of a btree opfamily.
+CREATE TEMP TABLE l1 (a LINE NOT NULL);
+CREATE TEMP TABLE l2 (b LINE NOT NULL);
+EXPLAIN (COSTS OFF)
+SELECT * FROM l1 WHERE a NOT IN(SELECT b FROM l2);
+
+ROLLBACK;
#37Tom Lane
tgl@sss.pgh.pa.us
In reply to: David Rowley (#36)
Re: NOT IN subquery optimization

David Rowley <david.rowley@2ndquadrant.com> writes:

On Sun, 3 Mar 2019 at 05:25, Tom Lane <tgl@sss.pgh.pa.us> wrote:

My initial thought about plugging that admittedly-academic point is
to insist that the join operator be both strict and a member of a
btree opclass (hash might be OK too; less sure about other index types).

Why strict? If both inputs are non-NULL, then what additional
guarantees does strict give us?

Yeah, if we're verifying the inputs are non-null, I think that probably
doesn't matter.

I implemented a btree opfamily check in my version of the patch. Not
so sure about hash, can you point me in the direction of a mention of
how this is guarantees for btree?

https://www.postgresql.org/docs/devel/btree-support-funcs.html
quoth

The comparison function must take two non-null values A and B and
return an int32 value that is < 0, 0, or > 0 when A < B, A = B, or A >
B, respectively. A null result is disallowed: all values of the data
type must be comparable.

(At the code level, this is implicit in the fact that the comparison
function will be called via FunctionCall2Coll or a sibling, and those
all throw an error if the called function returns NULL.)

Now, it doesn't say in so many words that the comparison operators
have to yield results consistent with the comparison support function,
but I think that's pretty obvious ...

For hash, the equivalent constraint is that the hash function has to
work for every possible input value. I suppose it's possible that
the associated equality operator would sometimes return null, but
it's hard to think of a practical reason for doing so.

I've not dug in the code, but I wouldn't be too surprised if
nodeMergejoin.c or nodeHashjoin.c, or the stuff related to hash
grouping or hash aggregation, also contain assumptions about
the equality operators not returning null.

The list of builtin types that have a hash opfamily but no btree
opfamily that support NOT IN are not very exciting, so doing the same
for hash might not be worth the extra code.

Agreed for builtin types, but there might be some extensions out there
where this doesn't hold. It's not terribly hard to imagine a data type
that hasn't got a linear sort order but is amenable to hashing.
(The in-core xid type is an example, actually.)

regards, tom lane

#38David Rowley
david.rowley@2ndquadrant.com
In reply to: Tom Lane (#37)
1 attachment(s)
Re: NOT IN subquery optimization

On Sun, 3 Mar 2019 at 17:11, Tom Lane <tgl@sss.pgh.pa.us> wrote:

(At the code level, this is implicit in the fact that the comparison
function will be called via FunctionCall2Coll or a sibling, and those
all throw an error if the called function returns NULL.)

Now, it doesn't say in so many words that the comparison operators
have to yield results consistent with the comparison support function,
but I think that's pretty obvious ...

Ah okay. I can get it to misbehave by setting fcinfo->isnull = true in
the debugger from int4eq(). I see the NULL result there is not
verified as that's just translated into "false" by ExecInterpExpr()'s
EEOP_QUAL case. If you're saying something doing that is
fundamentally broken, then I guess we're okay.

David Rowley <david.rowley@2ndquadrant.com> writes:

The list of builtin types that have a hash opfamily but no btree
opfamily that support NOT IN are not very exciting, so doing the same
for hash might not be worth the extra code.

Agreed for builtin types, but there might be some extensions out there
where this doesn't hold. It's not terribly hard to imagine a data type
that hasn't got a linear sort order but is amenable to hashing.

On reflection, it seems pretty easy to add this check, so I've done so
in the attached.

--
David Rowley http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

Attachments:

not_in_anti_join_v1.3.patchapplication/octet-stream; name=not_in_anti_join_v1.3.patchDownload
diff --git a/src/backend/optimizer/plan/subselect.c b/src/backend/optimizer/plan/subselect.c
index 555c91f61e..b82ffc002b 100644
--- a/src/backend/optimizer/plan/subselect.c
+++ b/src/backend/optimizer/plan/subselect.c
@@ -87,6 +87,9 @@ static bool contain_dml(Node *node);
 static bool contain_dml_walker(Node *node, void *context);
 static void inline_cte(PlannerInfo *root, CommonTableExpr *cte);
 static bool inline_cte_walker(Node *node, inline_cte_walker_context *context);
+static bool is_NOTANY_compatible_with_antijoin(Query *outerquery,
+											   SubLink * sublink,
+											   Node *notnull_proofs);
 static bool simplify_EXISTS_query(PlannerInfo *root, Query *query);
 static Query *convert_EXISTS_to_ANY(PlannerInfo *root, Query *subselect,
 					  Node **testexpr, List **paramIds);
@@ -1108,6 +1111,98 @@ inline_cte_walker(Node *node, inline_cte_walker_context *context)
 }
 
 
+/*
+ * is_NOTANY_compatible_with_antijoin
+ *		True if the NOT IN sublink can be safely converted into an ANTI JOIN.
+ *		Per SQL spec, NOT IN is not ordinarily equivalent to an anti-join,
+ *		however, if we can prove that all of the expressions on both sides of
+ *		the, would be, join condition are all certainly not NULL and none of
+ *		the join expressions can produce NULLs, then it's safe to convert NOT
+ *		IN to an anti-join.
+ *
+ * To ensure that the join expressions cannot be NULL, we can't just check for
+ * strict operators since these only guarantee NULL on NULL input, we need a
+ * guarantee of NOT NULL on NOT NULL input.  Fortunately we can just insist
+ * that the operator is a member of a btree or hash opfamily.
+ *
+ * 'outerquery' is the parse of the query that the NOT IN is present in.
+ * 'notnull_proofs' are quals from the same syntactical level as the NOT IN.
+ * These will be used to assist in proving the outer query's would be join
+ * expressions cannot be NULL.
+ *
+ * Note:
+ *	This function is quite locked into the NOT IN syntax. Certain assumptions
+ *	are made about the structure of the join conditions:
+ *
+ * 1.	We assume that when more than 1 join condition exists that these are
+ *		AND type conditions, i.e not OR conditions.
+ *
+ * 2.	We assume that each join qual is an OpExpr with 2 arguments and the
+ *		first arguments in each qual is the one that belongs to the outer side
+ *		of the NOT IN clause.
+ */
+static bool
+is_NOTANY_compatible_with_antijoin(Query *outerquery, SubLink *sublink,
+								   Node *notnull_proofs)
+{
+	Node	   *testexpr = sublink->testexpr;
+	List	   *outerexpr;
+	ListCell   *lc;
+
+	/* Extract the outer rel's join condition expressions. */
+
+	/* if it's a single expression */
+	if (IsA(testexpr, OpExpr))
+	{
+		OpExpr *opexpr = (OpExpr *) testexpr;
+
+		Assert(list_length(opexpr->args) == 2);
+
+		/* Reject if op is not part of a btree or hash opfamily */
+		if (!has_merge_or_hash_join_opfamilies(opexpr->opno))
+			return false;
+
+		outerexpr = list_make1(linitial(opexpr->args));
+	}
+
+	/* Extract exprs from multiple expressions ANDed together */
+	else if (IsA(testexpr, BoolExpr))
+	{
+		List	 *list = ((BoolExpr *) testexpr)->args;
+
+		outerexpr = NIL;
+
+		/* Build a list containing the lefthand expr from each OpExpr */
+		foreach(lc, list)
+		{
+			OpExpr *opexpr = (OpExpr *) lfirst(lc);
+
+			Assert(list_length(opexpr->args) == 2);
+
+			/* Reject if op is not part of a btree or hash opfamily */
+			if (!has_merge_or_hash_join_opfamilies(opexpr->opno))
+				return false;
+
+			outerexpr = lappend(outerexpr, linitial(opexpr->args));
+		}
+	}
+	else
+		elog(ERROR, "unrecognized node type: %d",
+					(int) nodeTag(testexpr));
+
+	Assert(outerexpr != NIL);
+
+	/* Check if any outer expressions can be NULL. */
+	if (!expressions_are_not_nullable(outerquery, outerexpr, notnull_proofs))
+		return false;
+
+	/* Now validate the subquery targetlist to ensure no NULL are possible. */
+	if (!query_outputs_are_not_nullable((Query *) sublink->subselect))
+		return false;
+
+	return true; /* supports ANTI JOIN */
+}
+
 /*
  * convert_ANY_sublink_to_join: try to convert an ANY SubLink to a join
  *
@@ -1117,11 +1212,22 @@ inline_cte_walker(Node *node, inline_cte_walker_context *context)
  * If so, form a JoinExpr and return it.  Return NULL if the SubLink cannot
  * be converted to a join.
  *
- * The only non-obvious input parameter is available_rels: this is the set
- * of query rels that can safely be referenced in the sublink expression.
- * (We must restrict this to avoid changing the semantics when a sublink
- * is present in an outer join's ON qual.)  The conversion must fail if
- * the converted qual would reference any but these parent-query relids.
+ * If under_not is true, the caller actually found NOT (ANY SubLink),
+ * so that what we must try to build is an ANTI not SEMI join.  Per SQL spec,
+ * NOT IN is not ordinarily equivalent to an anti-join, so that by default we
+ * have to fail when under_not.  However, if we can prove that all of the
+ * expressions on both sides of the, would be, join condition are all
+ * certainly not NULL and none of the join expressions can produce NULLs, then
+ * it's safe to convert NOT IN to an anti-join.  To assist with nullability
+ * proofs for the join conditions on the outside of the join, we make use of
+ *'notnull_proofs'. It is the caller's responsibility to ensure these proofs
+ * come from the same syntactical level as the NOT IN sublink.
+ *
+ * available_rels is the set of query rels that can safely be referenced
+ * in the sublink expression.  (We must restrict this to avoid changing the
+ * semantics when a sublink is present in an outer join's ON qual.)
+ * The conversion must fail if the converted qual would reference any but
+ * these parent-query relids.
  *
  * On success, the returned JoinExpr has larg = NULL and rarg = the jointree
  * item representing the pulled-up subquery.  The caller must set larg to
@@ -1144,7 +1250,8 @@ inline_cte_walker(Node *node, inline_cte_walker_context *context)
  */
 JoinExpr *
 convert_ANY_sublink_to_join(PlannerInfo *root, SubLink *sublink,
-							Relids available_rels)
+							bool under_not, Relids available_rels,
+							Node *notnull_proofs)
 {
 	JoinExpr   *result;
 	Query	   *parse = root->parse;
@@ -1159,6 +1266,11 @@ convert_ANY_sublink_to_join(PlannerInfo *root, SubLink *sublink,
 
 	Assert(sublink->subLinkType == ANY_SUBLINK);
 
+	/* Check if NOT IN can be converted to an anti-join. */
+	if (under_not &&
+		!is_NOTANY_compatible_with_antijoin(parse, sublink, notnull_proofs))
+		return NULL;
+
 	/*
 	 * The sub-select must not refer to any Vars of the parent query. (Vars of
 	 * higher levels should be okay, though.)
@@ -1228,7 +1340,7 @@ convert_ANY_sublink_to_join(PlannerInfo *root, SubLink *sublink,
 	 * And finally, build the JoinExpr node.
 	 */
 	result = makeNode(JoinExpr);
-	result->jointype = JOIN_SEMI;
+	result->jointype = under_not ? JOIN_ANTI : JOIN_SEMI;
 	result->isNatural = false;
 	result->larg = NULL;		/* caller must fill this in */
 	result->rarg = (Node *) rtr;
@@ -1243,9 +1355,7 @@ convert_ANY_sublink_to_join(PlannerInfo *root, SubLink *sublink,
 /*
  * convert_EXISTS_sublink_to_join: try to convert an EXISTS SubLink to a join
  *
- * The API of this function is identical to convert_ANY_sublink_to_join's,
- * except that we also support the case where the caller has found NOT EXISTS,
- * so we need an additional input parameter "under_not".
+ * The API of this function is identical to convert_ANY_sublink_to_join's.
  */
 JoinExpr *
 convert_EXISTS_sublink_to_join(PlannerInfo *root, SubLink *sublink,
diff --git a/src/backend/optimizer/prep/prepjointree.c b/src/backend/optimizer/prep/prepjointree.c
index aebe162713..27a7fafcf4 100644
--- a/src/backend/optimizer/prep/prepjointree.c
+++ b/src/backend/optimizer/prep/prepjointree.c
@@ -64,7 +64,8 @@ static Node *pull_up_sublinks_jointree_recurse(PlannerInfo *root, Node *jtnode,
 								  Relids *relids);
 static Node *pull_up_sublinks_qual_recurse(PlannerInfo *root, Node *node,
 							  Node **jtlink1, Relids available_rels1,
-							  Node **jtlink2, Relids available_rels2);
+							  Node **jtlink2, Relids available_rels2,
+							  Node *notnull_proofs);
 static Node *pull_up_subqueries_recurse(PlannerInfo *root, Node *jtnode,
 						   JoinExpr *lowest_outer_join,
 						   JoinExpr *lowest_nulling_outer_join,
@@ -261,7 +262,7 @@ pull_up_sublinks_jointree_recurse(PlannerInfo *root, Node *jtnode,
 		/* Now process qual --- all children are available for use */
 		newf->quals = pull_up_sublinks_qual_recurse(root, f->quals,
 													&jtlink, frelids,
-													NULL, NULL);
+													NULL, NULL, f->quals);
 
 		/*
 		 * Note that the result will be either newf, or a stack of JoinExprs
@@ -315,13 +316,13 @@ pull_up_sublinks_jointree_recurse(PlannerInfo *root, Node *jtnode,
 														 &jtlink,
 														 bms_union(leftrelids,
 																   rightrelids),
-														 NULL, NULL);
+														 NULL, NULL, j->quals);
 				break;
 			case JOIN_LEFT:
 				j->quals = pull_up_sublinks_qual_recurse(root, j->quals,
 														 &j->rarg,
 														 rightrelids,
-														 NULL, NULL);
+														 NULL, NULL, j->quals);
 				break;
 			case JOIN_FULL:
 				/* can't do anything with full-join quals */
@@ -330,7 +331,7 @@ pull_up_sublinks_jointree_recurse(PlannerInfo *root, Node *jtnode,
 				j->quals = pull_up_sublinks_qual_recurse(root, j->quals,
 														 &j->larg,
 														 leftrelids,
-														 NULL, NULL);
+														 NULL, NULL, j->quals);
 				break;
 			default:
 				elog(ERROR, "unrecognized join type: %d",
@@ -370,12 +371,17 @@ pull_up_sublinks_jointree_recurse(PlannerInfo *root, Node *jtnode,
  * and/or jtlink2 in the order we encounter them.  We rely on subsequent
  * optimization to rearrange the stack if appropriate.
  *
+ * notnull_proofs, if not passed as NULL, is used to help prove that the
+ * would be outer rel's join condition exprs cannot be NULL when we attempt
+ * to convert NOT IN to use an antijoin.
+ *
  * Returns the replacement qual node, or NULL if the qual should be removed.
  */
 static Node *
 pull_up_sublinks_qual_recurse(PlannerInfo *root, Node *node,
 							  Node **jtlink1, Relids available_rels1,
-							  Node **jtlink2, Relids available_rels2)
+							  Node **jtlink2, Relids available_rels2,
+							  Node *notnull_proofs)
 {
 	if (node == NULL)
 		return NULL;
@@ -388,8 +394,8 @@ pull_up_sublinks_qual_recurse(PlannerInfo *root, Node *node,
 		/* Is it a convertible ANY or EXISTS clause? */
 		if (sublink->subLinkType == ANY_SUBLINK)
 		{
-			if ((j = convert_ANY_sublink_to_join(root, sublink,
-												 available_rels1)) != NULL)
+			if ((j = convert_ANY_sublink_to_join(root, sublink, false,
+												 available_rels1, NULL)) != NULL)
 			{
 				/* Yes; insert the new join node into the join tree */
 				j->larg = *jtlink1;
@@ -409,13 +415,14 @@ pull_up_sublinks_qual_recurse(PlannerInfo *root, Node *node,
 														 &j->larg,
 														 available_rels1,
 														 &j->rarg,
-														 child_rels);
+														 child_rels,
+														 notnull_proofs);
 				/* Return NULL representing constant TRUE */
 				return NULL;
 			}
 			if (available_rels2 != NULL &&
-				(j = convert_ANY_sublink_to_join(root, sublink,
-												 available_rels2)) != NULL)
+				(j = convert_ANY_sublink_to_join(root, sublink, false,
+												 available_rels2, NULL)) != NULL)
 			{
 				/* Yes; insert the new join node into the join tree */
 				j->larg = *jtlink2;
@@ -435,7 +442,8 @@ pull_up_sublinks_qual_recurse(PlannerInfo *root, Node *node,
 														 &j->larg,
 														 available_rels2,
 														 &j->rarg,
-														 child_rels);
+														 child_rels,
+														 notnull_proofs);
 				/* Return NULL representing constant TRUE */
 				return NULL;
 			}
@@ -463,7 +471,8 @@ pull_up_sublinks_qual_recurse(PlannerInfo *root, Node *node,
 														 &j->larg,
 														 available_rels1,
 														 &j->rarg,
-														 child_rels);
+														 child_rels,
+														 notnull_proofs);
 				/* Return NULL representing constant TRUE */
 				return NULL;
 			}
@@ -489,7 +498,8 @@ pull_up_sublinks_qual_recurse(PlannerInfo *root, Node *node,
 														 &j->larg,
 														 available_rels2,
 														 &j->rarg,
-														 child_rels);
+														 child_rels,
+														 notnull_proofs);
 				/* Return NULL representing constant TRUE */
 				return NULL;
 			}
@@ -499,14 +509,70 @@ pull_up_sublinks_qual_recurse(PlannerInfo *root, Node *node,
 	}
 	if (is_notclause(node))
 	{
-		/* If the immediate argument of NOT is EXISTS, try to convert */
+		/* If the immediate argument of NOT is ANY or EXISTS, try to convert */
 		SubLink    *sublink = (SubLink *) get_notclausearg((Expr *) node);
 		JoinExpr   *j;
 		Relids		child_rels;
 
 		if (sublink && IsA(sublink, SubLink))
 		{
-			if (sublink->subLinkType == EXISTS_SUBLINK)
+			if (sublink->subLinkType == ANY_SUBLINK)
+			{
+				if ((j = convert_ANY_sublink_to_join(root, sublink, true,
+												   available_rels1, notnull_proofs)) != NULL)
+				{
+					/* Yes; insert the new join node into the join tree */
+					j->larg = *jtlink1;
+					*jtlink1 = (Node *) j;
+					/* Recursively process pulled-up jointree nodes */
+					j->rarg = pull_up_sublinks_jointree_recurse(root,
+																j->rarg,
+																&child_rels);
+
+					/*
+					 * Now recursively process the pulled-up quals.  Because
+					 * we are underneath a NOT, we can't pull up sublinks that
+					 * reference the left-hand stuff, but it's still okay to
+					 * pull up sublinks referencing j->rarg.
+					 */
+					j->quals = pull_up_sublinks_qual_recurse(root,
+															 j->quals,
+															 &j->rarg,
+															 child_rels,
+															 NULL, NULL,
+															 notnull_proofs);
+					/* Return NULL representing constant TRUE */
+					return NULL;
+				}
+				if (available_rels2 != NULL &&
+					(j = convert_ANY_sublink_to_join(root, sublink, true,
+												   available_rels2, notnull_proofs)) != NULL)
+				{
+					/* Yes; insert the new join node into the join tree */
+					j->larg = *jtlink2;
+					*jtlink2 = (Node *) j;
+					/* Recursively process pulled-up jointree nodes */
+					j->rarg = pull_up_sublinks_jointree_recurse(root,
+																j->rarg,
+																&child_rels);
+
+					/*
+					 * Now recursively process the pulled-up quals.  Because
+					 * we are underneath a NOT, we can't pull up sublinks that
+					 * reference the left-hand stuff, but it's still okay to
+					 * pull up sublinks referencing j->rarg.
+					 */
+					j->quals = pull_up_sublinks_qual_recurse(root,
+															 j->quals,
+															 &j->rarg,
+															 child_rels,
+															 NULL, NULL,
+															 notnull_proofs);
+					/* Return NULL representing constant TRUE */
+					return NULL;
+				}
+			}
+			else if (sublink->subLinkType == EXISTS_SUBLINK)
 			{
 				if ((j = convert_EXISTS_sublink_to_join(root, sublink, true,
 														available_rels1)) != NULL)
@@ -529,7 +595,8 @@ pull_up_sublinks_qual_recurse(PlannerInfo *root, Node *node,
 															 j->quals,
 															 &j->rarg,
 															 child_rels,
-															 NULL, NULL);
+															 NULL, NULL,
+															 notnull_proofs);
 					/* Return NULL representing constant TRUE */
 					return NULL;
 				}
@@ -555,7 +622,8 @@ pull_up_sublinks_qual_recurse(PlannerInfo *root, Node *node,
 															 j->quals,
 															 &j->rarg,
 															 child_rels,
-															 NULL, NULL);
+															 NULL, NULL,
+															 notnull_proofs);
 					/* Return NULL representing constant TRUE */
 					return NULL;
 				}
@@ -580,7 +648,8 @@ pull_up_sublinks_qual_recurse(PlannerInfo *root, Node *node,
 													  jtlink1,
 													  available_rels1,
 													  jtlink2,
-													  available_rels2);
+													  available_rels2,
+													  notnull_proofs);
 			if (newclause)
 				newclauses = lappend(newclauses, newclause);
 		}
diff --git a/src/backend/optimizer/util/clauses.c b/src/backend/optimizer/util/clauses.c
index 501b0e9e2d..b79a99c885 100644
--- a/src/backend/optimizer/util/clauses.c
+++ b/src/backend/optimizer/util/clauses.c
@@ -42,6 +42,7 @@
 #include "parser/parse_agg.h"
 #include "parser/parse_coerce.h"
 #include "parser/parse_func.h"
+#include "parser/parsetree.h"
 #include "rewrite/rewriteManip.h"
 #include "tcop/tcopprot.h"
 #include "utils/acl.h"
@@ -113,6 +114,8 @@ static bool contain_context_dependent_node_walker(Node *node, int *flags);
 static bool contain_leaked_vars_walker(Node *node, void *context);
 static Relids find_nonnullable_rels_walker(Node *node, bool top_level);
 static List *find_nonnullable_vars_walker(Node *node, bool top_level);
+static void find_innerjoined_rels(Node *jtnode,
+					  Relids *innerjoined_rels, List **usable_quals);
 static bool is_strict_saop(ScalarArrayOpExpr *expr, bool falseOK);
 static Node *eval_const_expressions_mutator(Node *node,
 							   eval_const_expressions_context *context);
@@ -1467,6 +1470,10 @@ contain_leaked_vars_walker(Node *node, void *context)
 								  context);
 }
 
+/*****************************************************************************
+ *		  Nullability analysis
+ *****************************************************************************/
+
 /*
  * find_nonnullable_rels
  *		Determine which base rels are forced nonnullable by given clause.
@@ -1710,7 +1717,7 @@ find_nonnullable_rels_walker(Node *node, bool top_level)
  * but here we assume that the input is a Boolean expression, and wish to
  * see if NULL inputs will provably cause a FALSE-or-NULL result.  We expect
  * the expression to have been AND/OR flattened and converted to implicit-AND
- * format.
+ * format (but the results are still good if it wasn't AND/OR flattened).
  *
  * The result is a palloc'd List, but we have not copied the member Var nodes.
  * Also, we don't bother trying to eliminate duplicate entries.
@@ -2011,6 +2018,393 @@ find_forced_null_var(Node *node)
 	return NULL;
 }
 
+/*
+ * expressions_are_not_nullable
+ *		Return TRUE if all 'exprs' are certainly not NULL.
+ *
+ * The reason this takes a Query, and not just a list of expressions, is so
+ * that we can determine which relations are INNER JOINed and make use of the
+ * query's WHERE/ON clauses to help prove that inner joined rel's Vars cannot
+ * be NULL in the absense of a NOT NULL constraint.
+ *
+ * 'query' is the query that the 'exprs' belong to.
+ * 'notnull_proofs' can be passed to assist in proving the non-nullability of
+ * 'exprs'.  These may be used for outer joined Vars or for Vars that allow
+ * NULLs to help determine if each 'exprs' cannot be NULL.  It is the callers
+ * responsibility to ensure 'notnull_proofs' come from the same syntactical
+ * level as 'exprs'.
+ *
+ * In current usage, the passed exprs haven't yet been through any planner
+ * processing.  This means that applying find_nonnullable_vars() on them isn't
+ * really ideal: for lack of const-simplification, we might be unable to prove
+ * not-nullness in some cases where we could have proved it afterwards.
+ * However, we should not get any false positive results.
+ *
+ * Here we can err on the side of conservatism: if we're not sure then it's
+ * okay to return FALSE.
+ */
+bool
+expressions_are_not_nullable(Query *query, List *exprs, Node *notnull_proofs)
+{
+	Relids		innerjoined_rels = NULL;
+	List	   *innerjoined_useful_quals = NIL;
+	bool		computed_innerjoined_rels = false;
+	List	   *nonnullable_vars = NIL;
+	bool		computed_nonnullable_vars = false;
+	List	   *nonnullable_inner_vars = NIL;
+	bool		computed_nonnullable_inner_vars = false;
+	ListCell   *lc;
+
+	foreach(lc, exprs)
+	{
+		Expr	   *expr = (Expr *) lfirst(lc);
+
+		/*
+		 * For the most part we don't try to deal with anything more complex
+		 * than Consts and Vars; but it seems worthwhile to look through
+		 * binary relabelings, since we know those don't introduce nulls.
+		 */
+		while (expr && IsA(expr, RelabelType))
+			expr = ((RelabelType *) expr)->arg;
+
+		if (expr == NULL)		/* paranoia */
+			return false;
+
+		if (IsA(expr, Const))
+		{
+			/* Consts are easy: they're either null or not. */
+			if (((Const *) expr)->constisnull)
+				return false;
+		}
+		else if (IsA(expr, Var))
+		{
+			Var		   *var = (Var *) expr;
+
+			/* Currently, we punt for any nonlocal Vars */
+			if (var->varlevelsup != 0)
+				return false;
+
+			/*
+			 * Since the subquery hasn't yet been through expression
+			 * preprocessing, we must apply flatten_join_alias_vars to the
+			 * given Var, and to any Vars found by find_nonnullable_vars, to
+			 * avoid being fooled by join aliases.  If we get something other
+			 * than a plain Var out of the substitution, punt.
+			 */
+			var = (Var *) flatten_join_alias_vars(query, (Node *) var);
+
+			if (!IsA(var, Var))
+				return false;
+			Assert(var->varlevelsup == 0);
+
+			/*
+			 * We don't bother to compute innerjoined_rels until we've found a
+			 * Var we must analyze.
+			 */
+			if (!computed_innerjoined_rels)
+			{
+				find_innerjoined_rels((Node *) query->jointree,
+									  &innerjoined_rels,
+									  &innerjoined_useful_quals);
+				computed_innerjoined_rels = true;
+			}
+
+			/* Check if the Var is from an INNER JOINed rel */
+			if (bms_is_member(var->varno, innerjoined_rels))
+			{
+				RangeTblEntry *rte = rt_fetch(var->varno, query->rtable);
+
+				/*
+				 * If Var is from a plain relation and its column is marked
+				 * NOT NULL according to the catalogs, it can't produce NULL.
+				 */
+				if (rte->rtekind == RTE_RELATION &&
+					get_attnotnull(rte->relid, var->varattno))
+					continue;	/* cannot produce NULL */
+
+				/*
+				 * Otherwise check for the existance of quals which filter
+				 * out NULL values for this Var.  We may need to compute the
+				 * nonnullable_inner_vars, if not done already.
+				 */
+				if (!computed_nonnullable_inner_vars)
+				{
+					nonnullable_inner_vars =
+						find_nonnullable_vars((Node *) innerjoined_useful_quals);
+					nonnullable_inner_vars = (List *)
+						flatten_join_alias_vars(query,
+											(Node *) nonnullable_inner_vars);
+					/* We don't bother removing any non-Vars from the result */
+					computed_nonnullable_inner_vars = true;
+				}
+
+				if (list_member(nonnullable_inner_vars, var))
+					continue;	/* cannot produce NULL */
+			}
+
+			/*
+			 * Even if that didn't work, we can conclude that the Var is not
+			 * nullable if find_nonnullable_vars can find a "var IS NOT NULL"
+			 * or similarly strict condition among the notnull_proofs.
+			 * Compute the list of Vars having such quals if we didn't
+			 * already.
+			 */
+			if (!computed_nonnullable_vars)
+			{
+				nonnullable_vars = find_nonnullable_vars(notnull_proofs);
+				nonnullable_vars = (List *)
+					flatten_join_alias_vars(query,
+											(Node *) nonnullable_vars);
+				/* We don't bother removing any non-Vars from the result */
+				computed_nonnullable_vars = true;
+			}
+
+			if (!list_member(nonnullable_vars, var))
+				return false;	/* we failed to prove the Var non-null */
+		}
+		else
+		{
+			/* Not a Const or Var; punt */
+			return false;
+		}
+	}
+
+	return true;				/* exprs cannot emit NULLs */
+}
+
+/*
+ * query_outputs_are_not_nullable
+ *		Returns TRUE if the output values of the Query are certainly not NULL.
+ *		All output columns must return non-NULL to answer TRUE.
+ *
+ * The reason this takes a Query, and not just an individual tlist expression,
+ * is so that we can make use of the query's WHERE/ON clauses to prove it does
+ * not return nulls.
+ *
+ * In current usage, the passed sub-Query hasn't yet been through any planner
+ * processing.  This means that applying find_nonnullable_vars() to its WHERE
+ * clauses isn't really ideal: for lack of const-simplification, we might be
+ * unable to prove not-nullness in some cases where we could have proved it
+ * afterwards.  However, we should not get any false positive results.
+ *
+ * Like the other forms of nullability analysis above, we can err on the
+ * side of conservatism: if we're not sure, it's okay to return FALSE.
+ */
+bool
+query_outputs_are_not_nullable(Query *query)
+{
+	Relids		innerjoined_rels = NULL;
+	bool		computed_innerjoined_rels = false;
+	List	   *usable_quals = NIL;
+	List	   *nonnullable_vars = NIL;
+	bool		computed_nonnullable_vars = false;
+	ListCell   *tl;
+
+	/*
+	 * If the query contains set operations, punt.  The set ops themselves
+	 * couldn't introduce nulls that weren't in their inputs, but the tlist
+	 * present in the top-level query is just dummy and won't give us useful
+	 * info.  We could get an answer by recursing to examine each leaf query,
+	 * but for the moment it doesn't seem worth the extra complication.
+	 *
+	 * Note that we needn't consider other top-level operators such as
+	 * DISTINCT, GROUP BY, etc, as those will not introduce nulls either.
+	 */
+	if (query->setOperations)
+		return false;
+
+	/*
+	 * Examine each targetlist entry to prove that it can't produce NULL.
+	 */
+	foreach(tl, query->targetList)
+	{
+		TargetEntry *tle = (TargetEntry *) lfirst(tl);
+		Expr	   *expr = tle->expr;
+
+		/* Resjunk columns can be ignored: they don't produce output values */
+		if (tle->resjunk)
+			continue;
+
+		/*
+		 * For the most part we don't try to deal with anything more complex
+		 * than Consts and Vars; but it seems worthwhile to look through
+		 * binary relabelings, since we know those don't introduce nulls.
+		 */
+		while (expr && IsA(expr, RelabelType))
+			expr = ((RelabelType *) expr)->arg;
+
+		if (expr == NULL)		/* paranoia */
+			return false;
+
+		if (IsA(expr, Const))
+		{
+			/* Consts are easy: they're either null or not. */
+			if (((Const *) expr)->constisnull)
+				return false;
+		}
+		else if (IsA(expr, Var))
+		{
+			Var		   *var = (Var *) expr;
+
+			/* Currently, we punt for any nonlocal Vars */
+			if (var->varlevelsup != 0)
+				return false;
+
+			/*
+			 * Since the subquery hasn't yet been through expression
+			 * preprocessing, we must apply flatten_join_alias_vars to the
+			 * given Var, and to any Vars found by find_nonnullable_vars, to
+			 * avoid being fooled by join aliases.  If we get something other
+			 * than a plain Var out of the substitution, punt.
+			 */
+			var = (Var *) flatten_join_alias_vars(query, (Node *) var);
+
+			if (!IsA(var, Var))
+				return false;
+			Assert(var->varlevelsup == 0);
+
+			/*
+			 * We don't bother to compute innerjoined_rels and usable_quals
+			 * until we've found a Var we must analyze.
+			 */
+			if (!computed_innerjoined_rels)
+			{
+				find_innerjoined_rels((Node *) query->jointree,
+									  &innerjoined_rels, &usable_quals);
+				computed_innerjoined_rels = true;
+			}
+
+			/*
+			 * If Var is from a plain relation, and that relation is not on
+			 * the nullable side of any outer join, and its column is marked
+			 * NOT NULL according to the catalogs, it can't produce NULL.
+			 */
+			if (bms_is_member(var->varno, innerjoined_rels))
+			{
+				RangeTblEntry *rte = rt_fetch(var->varno, query->rtable);
+
+				if (rte->rtekind == RTE_RELATION &&
+					get_attnotnull(rte->relid, var->varattno))
+					continue;	/* cannot produce NULL */
+			}
+
+			/*
+			 * Even if that didn't work, we can conclude that the Var is not
+			 * nullable if find_nonnullable_vars can find a "var IS NOT NULL"
+			 * or similarly strict condition among the usable_quals.  Compute
+			 * the list of Vars having such quals if we didn't already.
+			 */
+			if (!computed_nonnullable_vars)
+			{
+				nonnullable_vars = find_nonnullable_vars((Node *) usable_quals);
+				nonnullable_vars = (List *)
+					flatten_join_alias_vars(query,
+											(Node *) nonnullable_vars);
+				/* We don't bother removing any non-Vars from the result */
+				computed_nonnullable_vars = true;
+			}
+
+			if (!list_member(nonnullable_vars, var))
+				return false;	/* we failed to prove the Var non-null */
+		}
+		else
+		{
+			/* Not a Const or Var; punt */
+			return false;
+		}
+	}
+
+	return true;				/* query cannot emit NULLs */
+}
+
+/*
+ * find_innerjoined_rels
+ *		Traverse jointree to locate non-outerjoined-rels and quals above them
+ *
+ * We fill innerjoined_rels with the relids of all rels that are not below
+ * the nullable side of any outer join (which would cause their Vars to be
+ * possibly NULL regardless of what's in the catalogs).  In the same scan,
+ * we locate all WHERE and JOIN/ON quals that constrain these rels add them to
+ * the usable_quals list (forming a list with implicit-AND semantics).
+ *
+ * Top-level caller must initialize innerjoined_rels/usable_quals to NULL/NIL.
+ */
+static void
+find_innerjoined_rels(Node *jtnode,
+					  Relids *innerjoined_rels, List **usable_quals)
+{
+	if (jtnode == NULL)
+		return;
+	if (IsA(jtnode, RangeTblRef))
+	{
+		int			varno = ((RangeTblRef *) jtnode)->rtindex;
+
+		*innerjoined_rels = bms_add_member(*innerjoined_rels, varno);
+	}
+	else if (IsA(jtnode, FromExpr))
+	{
+		FromExpr   *f = (FromExpr *) jtnode;
+		ListCell   *lc;
+
+		/* All elements of the FROM list are allowable */
+		foreach(lc, f->fromlist)
+			find_innerjoined_rels((Node *) lfirst(lc),
+								  innerjoined_rels, usable_quals);
+		/* ... and its WHERE quals are too */
+		if (f->quals)
+			*usable_quals = lappend(*usable_quals, f->quals);
+	}
+	else if (IsA(jtnode, JoinExpr))
+	{
+		JoinExpr   *j = (JoinExpr *) jtnode;
+
+		switch (j->jointype)
+		{
+			case JOIN_INNER:
+				/* visit both children */
+				find_innerjoined_rels(j->larg,
+									  innerjoined_rels, usable_quals);
+				find_innerjoined_rels(j->rarg,
+									  innerjoined_rels, usable_quals);
+				/* and grab the ON quals too */
+				if (j->quals)
+					*usable_quals = lappend(*usable_quals, j->quals);
+				break;
+
+			case JOIN_LEFT:
+			case JOIN_SEMI:
+			case JOIN_ANTI:
+
+				/*
+				 * Only the left input is possibly non-nullable; furthermore,
+				 * the quals of this join don't constrain the left input.
+				 * Note: we probably can't see SEMI or ANTI joins at this
+				 * point, but if we do, we can treat them like LEFT joins.
+				 */
+				find_innerjoined_rels(j->larg,
+									  innerjoined_rels, usable_quals);
+				break;
+
+			case JOIN_RIGHT:
+				/* Reverse of the above case */
+				find_innerjoined_rels(j->rarg,
+									  innerjoined_rels, usable_quals);
+				break;
+
+			case JOIN_FULL:
+				/* Neither side is non-nullable, so stop descending */
+				break;
+
+			default:
+				elog(ERROR, "unrecognized join type: %d",
+					 (int) j->jointype);
+		}
+	}
+	else
+		elog(ERROR, "unrecognized node type: %d",
+			 (int) nodeTag(jtnode));
+}
+
 /*
  * Can we treat a ScalarArrayOpExpr as strict?
  *
diff --git a/src/backend/utils/cache/lsyscache.c b/src/backend/utils/cache/lsyscache.c
index e88c45d268..7b665f0afe 100644
--- a/src/backend/utils/cache/lsyscache.c
+++ b/src/backend/utils/cache/lsyscache.c
@@ -388,6 +388,45 @@ get_mergejoin_opfamilies(Oid opno)
 	return result;
 }
 
+/*
+ * has_merge_or_hash_join_opfamilies
+ *		TRUE iif 'opno' belongs to any btree or hash opfamily's equality
+ *		strategy.
+ */
+bool
+has_merge_or_hash_join_opfamilies(Oid opno)
+{
+	bool		result = false;
+	CatCList   *catlist;
+	int			i;
+
+	/*
+	 * Search pg_amop to see if the target operator is registered as the "="
+	 * operator of any btree opfamily.
+	 */
+	catlist = SearchSysCacheList1(AMOPOPID, ObjectIdGetDatum(opno));
+
+	for (i = 0; i < catlist->n_members; i++)
+	{
+		HeapTuple	tuple = &catlist->members[i]->tuple;
+		Form_pg_amop aform = (Form_pg_amop) GETSTRUCT(tuple);
+
+		/* must be btree or hash equality */
+		if ((aform->amopmethod == BTREE_AM_OID &&
+			aform->amopstrategy == BTEqualStrategyNumber) ||
+			(aform->amopmethod == HASH_AM_OID &&
+			aform->amopstrategy == HTEqualStrategyNumber))
+		{
+			result = true;
+			break;
+		}
+	}
+
+	ReleaseSysCacheList(catlist);
+
+	return result;
+}
+
 /*
  * get_compatible_hash_operators
  *		Get the OID(s) of hash equality operator(s) compatible with the given
@@ -878,6 +917,35 @@ get_atttypetypmodcoll(Oid relid, AttrNumber attnum,
 	ReleaseSysCache(tp);
 }
 
+/*
+ * get_attnotnull
+ *
+ *		Given the relation id and the attribute number,
+ *		return the "attnotnull" field from the attribute relation.
+ *
+ *		Returns false if the attr doesn't exist (or is dropped).
+ */
+bool
+get_attnotnull(Oid relid, AttrNumber attnum)
+{
+	HeapTuple	tp;
+
+	tp = SearchSysCache2(ATTNUM,
+						 ObjectIdGetDatum(relid),
+						 Int16GetDatum(attnum));
+	if (HeapTupleIsValid(tp))
+	{
+		Form_pg_attribute att_tup = (Form_pg_attribute) GETSTRUCT(tp);
+		bool		result;
+
+		result = att_tup->attnotnull;
+		ReleaseSysCache(tp);
+		return result;
+	}
+	else
+		return false;
+}
+
 /*				---------- COLLATION CACHE ----------					 */
 
 /*
diff --git a/src/include/optimizer/clauses.h b/src/include/optimizer/clauses.h
index 5e10fb1d50..b4d189eabc 100644
--- a/src/include/optimizer/clauses.h
+++ b/src/include/optimizer/clauses.h
@@ -44,6 +44,9 @@ extern Relids find_nonnullable_rels(Node *clause);
 extern List *find_nonnullable_vars(Node *clause);
 extern List *find_forced_null_vars(Node *clause);
 extern Var *find_forced_null_var(Node *clause);
+extern bool expressions_are_not_nullable(Query *query, List *exprs,
+							 Node *notnull_proofs);
+extern bool query_outputs_are_not_nullable(Query *query);
 
 extern bool is_pseudo_constant_clause(Node *clause);
 extern bool is_pseudo_constant_clause_relids(Node *clause, Relids relids);
diff --git a/src/include/optimizer/subselect.h b/src/include/optimizer/subselect.h
index 2d2c3bcbc0..48216a6277 100644
--- a/src/include/optimizer/subselect.h
+++ b/src/include/optimizer/subselect.h
@@ -19,7 +19,9 @@
 extern void SS_process_ctes(PlannerInfo *root);
 extern JoinExpr *convert_ANY_sublink_to_join(PlannerInfo *root,
 							SubLink *sublink,
-							Relids available_rels);
+							bool under_not,
+							Relids available_rels,
+							Node *notnull_proofs);
 extern JoinExpr *convert_EXISTS_sublink_to_join(PlannerInfo *root,
 							   SubLink *sublink,
 							   bool under_not,
diff --git a/src/include/utils/lsyscache.h b/src/include/utils/lsyscache.h
index 16b0b1d2dc..ad1897962c 100644
--- a/src/include/utils/lsyscache.h
+++ b/src/include/utils/lsyscache.h
@@ -76,6 +76,7 @@ extern bool get_ordering_op_properties(Oid opno,
 extern Oid	get_equality_op_for_ordering_op(Oid opno, bool *reverse);
 extern Oid	get_ordering_op_for_equality_op(Oid opno, bool use_lhs_type);
 extern List *get_mergejoin_opfamilies(Oid opno);
+extern bool has_merge_or_hash_join_opfamilies(Oid opno);
 extern bool get_compatible_hash_operators(Oid opno,
 							  Oid *lhs_opno, Oid *rhs_opno);
 extern bool get_op_hash_functions(Oid opno,
@@ -89,6 +90,7 @@ extern AttrNumber get_attnum(Oid relid, const char *attname);
 extern Oid	get_atttype(Oid relid, AttrNumber attnum);
 extern void get_atttypetypmodcoll(Oid relid, AttrNumber attnum,
 					  Oid *typid, int32 *typmod, Oid *collid);
+extern bool get_attnotnull(Oid relid, AttrNumber attnum);
 extern char *get_collation_name(Oid colloid);
 extern char *get_constraint_name(Oid conoid);
 extern char *get_language_name(Oid langoid, bool missing_ok);
diff --git a/src/test/regress/expected/subselect.out b/src/test/regress/expected/subselect.out
index fe5fc64480..8917ba35d6 100644
--- a/src/test/regress/expected/subselect.out
+++ b/src/test/regress/expected/subselect.out
@@ -1323,3 +1323,382 @@ select * from x for update;
    Output: subselect_tbl.f1, subselect_tbl.f2, subselect_tbl.f3
 (2 rows)
 
+--
+-- Check NOT IN performs an ANTI JOIN when NULLs are not possible
+-- on either side of the, would be, join condition.
+--
+BEGIN;
+CREATE TEMP TABLE a (id INT PRIMARY KEY);
+CREATE TEMP TABLE b (x INT NOT NULL, y INT);
+CREATE TEMP TABLE c (z INT NOT NULL);
+INSERT INTO b VALUES(1,1),(2,2),(3,NULL);
+-- No ANTI JOIN, b.x is from an outer join
+EXPLAIN (COSTS OFF)
+SELECT * FROM a
+LEFT OUTER JOIN b ON a.id = b.x
+WHERE b.x NOT IN(SELECT z FROM c);
+             QUERY PLAN             
+------------------------------------
+ Hash Right Join
+   Hash Cond: (b.x = a.id)
+   Filter: (NOT (hashed SubPlan 1))
+   ->  Seq Scan on b
+   ->  Hash
+         ->  Seq Scan on a
+   SubPlan 1
+     ->  Seq Scan on c
+(8 rows)
+
+-- ANTI JOIN, b.x is from an outer join but b.x > 100
+-- forces the join not to produce NULL on the righthand
+-- side.
+EXPLAIN (COSTS OFF)
+SELECT * FROM a
+LEFT OUTER JOIN b ON a.id = b.x
+WHERE b.x NOT IN(SELECT z FROM c)
+  AND b.x > 100;
+           QUERY PLAN            
+---------------------------------
+ Hash Join
+   Hash Cond: (b.x = a.id)
+   ->  Hash Anti Join
+         Hash Cond: (b.x = c.z)
+         ->  Seq Scan on b
+               Filter: (x > 100)
+         ->  Hash
+               ->  Seq Scan on c
+   ->  Hash
+         ->  Seq Scan on a
+(10 rows)
+
+-- No ANTI JOIN, b.x is from an outer join
+EXPLAIN (COSTS OFF)
+SELECT * FROM a
+FULL OUTER JOIN b ON a.id = b.x
+WHERE b.x NOT IN(SELECT y FROM c);
+         QUERY PLAN          
+-----------------------------
+ Hash Full Join
+   Hash Cond: (b.x = a.id)
+   Filter: (NOT (SubPlan 1))
+   ->  Seq Scan on b
+   ->  Hash
+         ->  Seq Scan on a
+   SubPlan 1
+     ->  Seq Scan on c
+(8 rows)
+
+-- No ANTI JOIN. y can have NULLs
+EXPLAIN (COSTS OFF)
+SELECT * FROM b WHERE y NOT IN (SELECT z FROM c);
+             QUERY PLAN             
+------------------------------------
+ Seq Scan on b
+   Filter: (NOT (hashed SubPlan 1))
+   SubPlan 1
+     ->  Seq Scan on c
+(4 rows)
+
+-- c is an empty relation so should cause no filtering on b
+SELECT * FROM b WHERE y NOT IN (SELECT z FROM c);
+ x | y 
+---+---
+ 1 | 1
+ 2 | 2
+ 3 |  
+(3 rows)
+
+INSERT INTO c VALUES(1);
+-- Records where y is NULL should be filtered out.
+SELECT * FROM b WHERE y NOT IN (SELECT z FROM c);
+ x | y 
+---+---
+ 2 | 2
+(1 row)
+
+-- No ANTI JOIN, y can be NULL
+EXPLAIN (COSTS OFF)
+SELECT * FROM a WHERE id NOT IN (SELECT y FROM b);
+             QUERY PLAN             
+------------------------------------
+ Seq Scan on a
+   Filter: (NOT (hashed SubPlan 1))
+   SubPlan 1
+     ->  Seq Scan on b
+(4 rows)
+
+-- No ANTI JOIN, x is NOT NULL, but we don't know if + 1 will change that.
+EXPLAIN (COSTS OFF)
+SELECT * FROM a WHERE id NOT IN (SELECT x+1 FROM b);
+             QUERY PLAN             
+------------------------------------
+ Seq Scan on a
+   Filter: (NOT (hashed SubPlan 1))
+   SubPlan 1
+     ->  Seq Scan on b
+(4 rows)
+
+-- ANTI JOIN 1 is a Const that is not null.
+EXPLAIN (COSTS OFF)
+SELECT * FROM a WHERE id NOT IN (SELECT 1 FROM b);
+        QUERY PLAN         
+---------------------------
+ Nested Loop Anti Join
+   Join Filter: (a.id = 1)
+   ->  Seq Scan on a
+   ->  Materialize
+         ->  Seq Scan on b
+(5 rows)
+
+-- No ANTI JOIN, results contain a NULL Const
+EXPLAIN (COSTS OFF)
+SELECT * FROM a WHERE id NOT IN (SELECT NULL::int FROM b);
+             QUERY PLAN             
+------------------------------------
+ Seq Scan on a
+   Filter: (NOT (hashed SubPlan 1))
+   SubPlan 1
+     ->  Seq Scan on b
+(4 rows)
+
+-- ANTI JOIN y = 1 means y can't be NULL
+EXPLAIN (COSTS OFF)
+SELECT * FROM a WHERE id NOT IN (SELECT y FROM b WHERE y = 1);
+          QUERY PLAN           
+-------------------------------
+ Hash Anti Join
+   Hash Cond: (a.id = b.y)
+   ->  Seq Scan on a
+   ->  Hash
+         ->  Seq Scan on b
+               Filter: (y = 1)
+(6 rows)
+
+-- No ANTI JOIN, OR condition does not ensure y = 1
+EXPLAIN (COSTS OFF)
+SELECT * FROM a WHERE id NOT IN (SELECT y FROM b WHERE y = 1 OR x = 1);
+               QUERY PLAN               
+----------------------------------------
+ Seq Scan on a
+   Filter: (NOT (hashed SubPlan 1))
+   SubPlan 1
+     ->  Seq Scan on b
+           Filter: ((y = 1) OR (x = 1))
+(5 rows)
+
+-- No ANTI JOIN, OR condition does not ensure y = 1 or y = 2
+EXPLAIN (COSTS OFF)
+SELECT * FROM a WHERE id NOT IN (SELECT y FROM b WHERE (y = 1 OR x = 1) AND (y = 2 OR x = 2));
+                            QUERY PLAN                             
+-------------------------------------------------------------------
+ Seq Scan on a
+   Filter: (NOT (hashed SubPlan 1))
+   SubPlan 1
+     ->  Seq Scan on b
+           Filter: (((y = 1) OR (x = 1)) AND ((y = 2) OR (x = 2)))
+(5 rows)
+
+-- ANTI JOIN y must be 2, so can't be NULL
+EXPLAIN (COSTS OFF)
+SELECT * FROM a WHERE id NOT IN (SELECT y FROM b WHERE (y = 1 OR x = 1) AND y = 2);
+                        QUERY PLAN                        
+----------------------------------------------------------
+ Hash Anti Join
+   Hash Cond: (a.id = b.y)
+   ->  Seq Scan on a
+   ->  Hash
+         ->  Seq Scan on b
+               Filter: ((y = 2) AND ((y = 1) OR (x = 1)))
+(6 rows)
+
+-- ANTI JOIN y can be 1 or 2, but can't be null.
+EXPLAIN (COSTS OFF)
+SELECT * FROM a WHERE id NOT IN (SELECT y FROM b WHERE (y = 1 OR y = 2));
+                 QUERY PLAN                 
+--------------------------------------------
+ Hash Anti Join
+   Hash Cond: (a.id = b.y)
+   ->  Seq Scan on a
+   ->  Hash
+         ->  Seq Scan on b
+               Filter: ((y = 1) OR (y = 2))
+(6 rows)
+
+-- No ANTI JOIN c.z is from a left outer join so it can have nulls.
+EXPLAIN (COSTS OFF)
+SELECT * FROM a WHERE id NOT IN (SELECT c.z FROM b LEFT JOIN c ON b.x = c.z);
+             QUERY PLAN             
+------------------------------------
+ Seq Scan on a
+   Filter: (NOT (hashed SubPlan 1))
+   SubPlan 1
+     ->  Merge Left Join
+           Merge Cond: (b.x = c.z)
+           ->  Sort
+                 Sort Key: b.x
+                 ->  Seq Scan on b
+           ->  Sort
+                 Sort Key: c.z
+                 ->  Seq Scan on c
+(11 rows)
+
+-- ANTI JOIN, c.z is not from an outer join
+EXPLAIN (COSTS OFF)
+SELECT * FROM a WHERE id NOT IN (SELECT c.z FROM b RIGHT JOIN c ON b.x = c.z);
+               QUERY PLAN                
+-----------------------------------------
+ Merge Anti Join
+   Merge Cond: (a.id = c.z)
+   ->  Index Only Scan using a_pkey on a
+   ->  Merge Left Join
+         Merge Cond: (c.z = b.x)
+         ->  Sort
+               Sort Key: c.z
+               ->  Seq Scan on c
+         ->  Sort
+               Sort Key: b.x
+               ->  Seq Scan on b
+(11 rows)
+
+-- No ANTI JOIN, b.x is from an outer join
+EXPLAIN (COSTS OFF)
+SELECT * FROM a WHERE id NOT IN (SELECT b.x FROM b RIGHT JOIN c ON b.x = c.z);
+             QUERY PLAN             
+------------------------------------
+ Seq Scan on a
+   Filter: (NOT (hashed SubPlan 1))
+   SubPlan 1
+     ->  Merge Right Join
+           Merge Cond: (b.x = c.z)
+           ->  Sort
+                 Sort Key: b.x
+                 ->  Seq Scan on b
+           ->  Sort
+                 Sort Key: c.z
+                 ->  Seq Scan on c
+(11 rows)
+
+-- No ANTI JOIN, c.z is from an outer join
+EXPLAIN (COSTS OFF)
+SELECT * FROM a WHERE id NOT IN (SELECT c.z FROM b FULL JOIN c ON b.x = c.z);
+             QUERY PLAN             
+------------------------------------
+ Seq Scan on a
+   Filter: (NOT (hashed SubPlan 1))
+   SubPlan 1
+     ->  Merge Full Join
+           Merge Cond: (b.x = c.z)
+           ->  Sort
+                 Sort Key: b.x
+                 ->  Seq Scan on b
+           ->  Sort
+                 Sort Key: c.z
+                 ->  Seq Scan on c
+(11 rows)
+
+-- No ANTI JOIN, b.x is from an outer join
+EXPLAIN (COSTS OFF)
+SELECT * FROM a WHERE id NOT IN (SELECT b.x FROM b FULL JOIN c ON b.x = c.z);
+             QUERY PLAN             
+------------------------------------
+ Seq Scan on a
+   Filter: (NOT (hashed SubPlan 1))
+   SubPlan 1
+     ->  Merge Full Join
+           Merge Cond: (b.x = c.z)
+           ->  Sort
+                 Sort Key: b.x
+                 ->  Seq Scan on b
+           ->  Sort
+                 Sort Key: c.z
+                 ->  Seq Scan on c
+(11 rows)
+
+-- ANTI JOIN, c.z is from an inner join and has a NOT NULL constraint.
+EXPLAIN (COSTS OFF)
+SELECT * FROM a WHERE id NOT IN (SELECT c.z FROM b INNER JOIN c ON b.x = c.z);
+               QUERY PLAN                
+-----------------------------------------
+ Merge Anti Join
+   Merge Cond: (a.id = c.z)
+   ->  Index Only Scan using a_pkey on a
+   ->  Merge Join
+         Merge Cond: (b.x = c.z)
+         ->  Sort
+               Sort Key: b.x
+               ->  Seq Scan on b
+         ->  Sort
+               Sort Key: c.z
+               ->  Seq Scan on c
+(11 rows)
+
+-- ANTI JOIN, c.z must be 1
+EXPLAIN (COSTS OFF)
+SELECT * FROM a WHERE id NOT IN (SELECT c.z FROM b LEFT JOIN c ON b.x = c.z WHERE c.z = 1);
+                QUERY PLAN                 
+-------------------------------------------
+ Hash Anti Join
+   Hash Cond: (a.id = c.z)
+   ->  Seq Scan on a
+   ->  Hash
+         ->  Nested Loop
+               ->  Seq Scan on c
+                     Filter: (z = 1)
+               ->  Materialize
+                     ->  Seq Scan on b
+                           Filter: (x = 1)
+(10 rows)
+
+-- ANTI JOIN, c.z can't be NULL
+EXPLAIN (COSTS OFF)
+SELECT * FROM a WHERE id NOT IN (SELECT c.z FROM b LEFT JOIN c ON b.x = c.z WHERE c.z IS NOT NULL);
+                 QUERY PLAN                  
+---------------------------------------------
+ Merge Anti Join
+   Merge Cond: (a.id = c.z)
+   ->  Index Only Scan using a_pkey on a
+   ->  Merge Join
+         Merge Cond: (b.x = c.z)
+         ->  Sort
+               Sort Key: b.x
+               ->  Seq Scan on b
+         ->  Sort
+               Sort Key: c.z
+               ->  Seq Scan on c
+                     Filter: (z IS NOT NULL)
+(12 rows)
+
+-- ANTI JOIN, b.y cannot be NULL due to the join condition b.y = c.z
+EXPLAIN (COSTS OFF)
+SELECT * FROM a WHERE id NOT IN (SELECT b.y FROM b INNER JOIN c ON b.y = c.z);
+               QUERY PLAN                
+-----------------------------------------
+ Merge Anti Join
+   Merge Cond: (a.id = b.y)
+   ->  Index Only Scan using a_pkey on a
+   ->  Merge Join
+         Merge Cond: (b.y = c.z)
+         ->  Sort
+               Sort Key: b.y
+               ->  Seq Scan on b
+         ->  Sort
+               Sort Key: c.z
+               ->  Seq Scan on c
+(11 rows)
+
+-- No ANTI JOIN when the type's '=' operator is not a member of a btree or
+-- hash opfamily.
+CREATE TEMP TABLE l1 (a LINE NOT NULL);
+CREATE TEMP TABLE l2 (b LINE NOT NULL);
+EXPLAIN (COSTS OFF)
+SELECT * FROM l1 WHERE a NOT IN(SELECT b FROM l2);
+          QUERY PLAN          
+------------------------------
+ Seq Scan on l1
+   Filter: (NOT (SubPlan 1))
+   SubPlan 1
+     ->  Materialize
+           ->  Seq Scan on l2
+(5 rows)
+
+ROLLBACK;
diff --git a/src/test/regress/sql/subselect.sql b/src/test/regress/sql/subselect.sql
index b5931ee700..3d9a2efa33 100644
--- a/src/test/regress/sql/subselect.sql
+++ b/src/test/regress/sql/subselect.sql
@@ -692,3 +692,130 @@ select * from (with x as (select 2 as y) select * from x) ss;
 explain (verbose, costs off)
 with x as (select * from subselect_tbl)
 select * from x for update;
+
+--
+-- Check NOT IN performs an ANTI JOIN when NULLs are not possible
+-- on either side of the, would be, join condition.
+--
+
+BEGIN;
+
+CREATE TEMP TABLE a (id INT PRIMARY KEY);
+CREATE TEMP TABLE b (x INT NOT NULL, y INT);
+CREATE TEMP TABLE c (z INT NOT NULL);
+
+INSERT INTO b VALUES(1,1),(2,2),(3,NULL);
+
+-- No ANTI JOIN, b.x is from an outer join
+EXPLAIN (COSTS OFF)
+SELECT * FROM a
+LEFT OUTER JOIN b ON a.id = b.x
+WHERE b.x NOT IN(SELECT z FROM c);
+
+-- ANTI JOIN, b.x is from an outer join but b.x > 100
+-- forces the join not to produce NULL on the righthand
+-- side.
+EXPLAIN (COSTS OFF)
+SELECT * FROM a
+LEFT OUTER JOIN b ON a.id = b.x
+WHERE b.x NOT IN(SELECT z FROM c)
+  AND b.x > 100;
+
+-- No ANTI JOIN, b.x is from an outer join
+EXPLAIN (COSTS OFF)
+SELECT * FROM a
+FULL OUTER JOIN b ON a.id = b.x
+WHERE b.x NOT IN(SELECT y FROM c);
+
+-- No ANTI JOIN. y can have NULLs
+EXPLAIN (COSTS OFF)
+SELECT * FROM b WHERE y NOT IN (SELECT z FROM c);
+
+-- c is an empty relation so should cause no filtering on b
+SELECT * FROM b WHERE y NOT IN (SELECT z FROM c);
+
+INSERT INTO c VALUES(1);
+
+-- Records where y is NULL should be filtered out.
+SELECT * FROM b WHERE y NOT IN (SELECT z FROM c);
+
+-- No ANTI JOIN, y can be NULL
+EXPLAIN (COSTS OFF)
+SELECT * FROM a WHERE id NOT IN (SELECT y FROM b);
+
+-- No ANTI JOIN, x is NOT NULL, but we don't know if + 1 will change that.
+EXPLAIN (COSTS OFF)
+SELECT * FROM a WHERE id NOT IN (SELECT x+1 FROM b);
+
+-- ANTI JOIN 1 is a Const that is not null.
+EXPLAIN (COSTS OFF)
+SELECT * FROM a WHERE id NOT IN (SELECT 1 FROM b);
+
+-- No ANTI JOIN, results contain a NULL Const
+EXPLAIN (COSTS OFF)
+SELECT * FROM a WHERE id NOT IN (SELECT NULL::int FROM b);
+
+-- ANTI JOIN y = 1 means y can't be NULL
+EXPLAIN (COSTS OFF)
+SELECT * FROM a WHERE id NOT IN (SELECT y FROM b WHERE y = 1);
+
+-- No ANTI JOIN, OR condition does not ensure y = 1
+EXPLAIN (COSTS OFF)
+SELECT * FROM a WHERE id NOT IN (SELECT y FROM b WHERE y = 1 OR x = 1);
+
+-- No ANTI JOIN, OR condition does not ensure y = 1 or y = 2
+EXPLAIN (COSTS OFF)
+SELECT * FROM a WHERE id NOT IN (SELECT y FROM b WHERE (y = 1 OR x = 1) AND (y = 2 OR x = 2));
+
+-- ANTI JOIN y must be 2, so can't be NULL
+EXPLAIN (COSTS OFF)
+SELECT * FROM a WHERE id NOT IN (SELECT y FROM b WHERE (y = 1 OR x = 1) AND y = 2);
+
+-- ANTI JOIN y can be 1 or 2, but can't be null.
+EXPLAIN (COSTS OFF)
+SELECT * FROM a WHERE id NOT IN (SELECT y FROM b WHERE (y = 1 OR y = 2));
+
+-- No ANTI JOIN c.z is from a left outer join so it can have nulls.
+EXPLAIN (COSTS OFF)
+SELECT * FROM a WHERE id NOT IN (SELECT c.z FROM b LEFT JOIN c ON b.x = c.z);
+
+-- ANTI JOIN, c.z is not from an outer join
+EXPLAIN (COSTS OFF)
+SELECT * FROM a WHERE id NOT IN (SELECT c.z FROM b RIGHT JOIN c ON b.x = c.z);
+
+-- No ANTI JOIN, b.x is from an outer join
+EXPLAIN (COSTS OFF)
+SELECT * FROM a WHERE id NOT IN (SELECT b.x FROM b RIGHT JOIN c ON b.x = c.z);
+
+-- No ANTI JOIN, c.z is from an outer join
+EXPLAIN (COSTS OFF)
+SELECT * FROM a WHERE id NOT IN (SELECT c.z FROM b FULL JOIN c ON b.x = c.z);
+
+-- No ANTI JOIN, b.x is from an outer join
+EXPLAIN (COSTS OFF)
+SELECT * FROM a WHERE id NOT IN (SELECT b.x FROM b FULL JOIN c ON b.x = c.z);
+
+-- ANTI JOIN, c.z is from an inner join and has a NOT NULL constraint.
+EXPLAIN (COSTS OFF)
+SELECT * FROM a WHERE id NOT IN (SELECT c.z FROM b INNER JOIN c ON b.x = c.z);
+
+-- ANTI JOIN, c.z must be 1
+EXPLAIN (COSTS OFF)
+SELECT * FROM a WHERE id NOT IN (SELECT c.z FROM b LEFT JOIN c ON b.x = c.z WHERE c.z = 1);
+
+-- ANTI JOIN, c.z can't be NULL
+EXPLAIN (COSTS OFF)
+SELECT * FROM a WHERE id NOT IN (SELECT c.z FROM b LEFT JOIN c ON b.x = c.z WHERE c.z IS NOT NULL);
+
+-- ANTI JOIN, b.y cannot be NULL due to the join condition b.y = c.z
+EXPLAIN (COSTS OFF)
+SELECT * FROM a WHERE id NOT IN (SELECT b.y FROM b INNER JOIN c ON b.y = c.z);
+
+-- No ANTI JOIN when the type's '=' operator is not a member of a btree or
+-- hash opfamily.
+CREATE TEMP TABLE l1 (a LINE NOT NULL);
+CREATE TEMP TABLE l2 (b LINE NOT NULL);
+EXPLAIN (COSTS OFF)
+SELECT * FROM l1 WHERE a NOT IN(SELECT b FROM l2);
+
+ROLLBACK;
#39Tom Lane
tgl@sss.pgh.pa.us
In reply to: David Rowley (#38)
Re: NOT IN subquery optimization

David Rowley <david.rowley@2ndquadrant.com> writes:

On Sun, 3 Mar 2019 at 17:11, Tom Lane <tgl@sss.pgh.pa.us> wrote:

(At the code level, this is implicit in the fact that the comparison
function will be called via FunctionCall2Coll or a sibling, and those
all throw an error if the called function returns NULL.)

Ah okay. I can get it to misbehave by setting fcinfo->isnull = true in
the debugger from int4eq(). I see the NULL result there is not
verified as that's just translated into "false" by ExecInterpExpr()'s
EEOP_QUAL case. If you're saying something doing that is
fundamentally broken, then I guess we're okay.

No, what I'm thinking of is this bit in _bt_compare:

result = DatumGetInt32(FunctionCall2Coll(&scankey->sk_func,
scankey->sk_collation,
datum,
scankey->sk_argument));

You absolutely will get errors during btree insertions and searches
if a datatype's btree comparison functions ever return NULL (for
non-NULL inputs).

For hash indexes, that kind of restriction only directly applies to
hash-calculation functions, which perhaps are not as tightly tied to the
opclass's user-visible operators as is the case for btree opclasses.
But I think you might be able to find places in hash join or grouping
that are calling the actual equality operator and not allowing for it
to return NULL.

regards, tom lane

#40David Rowley
david.rowley@2ndquadrant.com
In reply to: Tom Lane (#39)
Re: NOT IN subquery optimization

On Mon, 4 Mar 2019 at 04:42, Tom Lane <tgl@sss.pgh.pa.us> wrote:

David Rowley <david.rowley@2ndquadrant.com> writes:

On Sun, 3 Mar 2019 at 17:11, Tom Lane <tgl@sss.pgh.pa.us> wrote:

(At the code level, this is implicit in the fact that the comparison
function will be called via FunctionCall2Coll or a sibling, and those
all throw an error if the called function returns NULL.)

Ah okay. I can get it to misbehave by setting fcinfo->isnull = true in
the debugger from int4eq(). I see the NULL result there is not
verified as that's just translated into "false" by ExecInterpExpr()'s
EEOP_QUAL case. If you're saying something doing that is
fundamentally broken, then I guess we're okay.

No, what I'm thinking of is this bit in _bt_compare:

result = DatumGetInt32(FunctionCall2Coll(&scankey->sk_func,
scankey->sk_collation,
datum,
scankey->sk_argument));

You absolutely will get errors during btree insertions and searches
if a datatype's btree comparison functions ever return NULL (for
non-NULL inputs).

I understand this is the case if an index happens to be used, but
there's no guarantee that's going to be the case. I was looking at the
case where an index was not used.

--
David Rowley http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

#41Tom Lane
tgl@sss.pgh.pa.us
In reply to: David Rowley (#40)
Re: NOT IN subquery optimization

David Rowley <david.rowley@2ndquadrant.com> writes:

On Mon, 4 Mar 2019 at 04:42, Tom Lane <tgl@sss.pgh.pa.us> wrote:

You absolutely will get errors during btree insertions and searches
if a datatype's btree comparison functions ever return NULL (for
non-NULL inputs).

I understand this is the case if an index happens to be used, but
there's no guarantee that's going to be the case. I was looking at the
case where an index was not used.

Not following your point? An index opclass is surely not going to be
designed on the assumption that it can never be used in an index.
Therefore, its support functions can't return NULL unless the index AM
allows that.

regards, tom lane

#42David Rowley
david.rowley@2ndquadrant.com
In reply to: Tom Lane (#41)
Re: NOT IN subquery optimization

On Mon, 4 Mar 2019 at 11:06, Tom Lane <tgl@sss.pgh.pa.us> wrote:

David Rowley <david.rowley@2ndquadrant.com> writes:

On Mon, 4 Mar 2019 at 04:42, Tom Lane <tgl@sss.pgh.pa.us> wrote:

You absolutely will get errors during btree insertions and searches
if a datatype's btree comparison functions ever return NULL (for
non-NULL inputs).

I understand this is the case if an index happens to be used, but
there's no guarantee that's going to be the case. I was looking at the
case where an index was not used.

Not following your point? An index opclass is surely not going to be
designed on the assumption that it can never be used in an index.
Therefore, its support functions can't return NULL unless the index AM
allows that.

I agree that it makes sense that the behaviour of the two match. I was
trying to hint towards that when I said:

If you're saying something doing that is
fundamentally broken, then I guess we're okay.

but likely I didn't make that very clear.

--
David Rowley http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

#43David Steele
david@pgmasters.net
In reply to: David Rowley (#17)
Re: Re: NOT IN subquery optimization

On 2/27/19 2:26 AM, David Rowley wrote:

On Wed, 27 Feb 2019 at 13:13, Tom Lane <tgl@sss.pgh.pa.us> wrote:

"Li, Zheng" <zhelli@amazon.com> writes:

However, given that the March CommitFest is imminent and the runtime smarts patent concerns David had pointed out (which I was not aware of before), we would not move that direction at the moment.

I propose that we collaborate to build one patch from the two patches submitted in this thread for the CF.

TBH, I think it's very unlikely that any patch for this will be seriously
considered for commit in v12. It would be against project policy, and
spending a lot of time reviewing the patch would be quite unfair to other
patches that have been in the queue longer. Therefore, I'd suggest that
you not bend things out of shape just to have some patch to submit before
March 1. It'd be better to work with the goal of having a coherent patch
ready for the first v13 CF, probably July-ish.

FWIW, I did add this to the March CF, but I set the target version to
13. I wasn't considering this for PG12. I see Zheng was, but I agree
with you on PG13 being the target for this.

Looks like the target version of 13 was removed but I have added it back.

Regards,
--
-David
david@pgmasters.net

#44David Rowley
david.rowley@2ndquadrant.com
In reply to: David Steele (#43)
Re: Re: NOT IN subquery optimization

On Tue, 5 Mar 2019 at 21:21, David Steele <david@pgmasters.net> wrote:

On 2/27/19 2:26 AM, David Rowley wrote:

FWIW, I did add this to the March CF, but I set the target version to
13. I wasn't considering this for PG12. I see Zheng was, but I agree
with you on PG13 being the target for this.

Looks like the target version of 13 was removed but I have added it back.

The problem seems to be that there are now 2 CF entries for this
thread. I originally added [1]https://commitfest.postgresql.org/22/2020/, but later Zheng added [2]https://commitfest.postgresql.org/22/2023/. From what
Jim mentioned when he opened this thread I had the idea that no patch
existed yet, so I posted the one I already had written for this 4
years ago thinking that might be useful to base new work on. I guess
Zheng's patch already exists when Jim opened this thread as a patch
appeared fairly quickly afterwards. If that's true then I understand
that they wouldn't want to drop the work they'd already done in favour
of picking mine up.

I'm not all that sure what do to about this. It's going to cause quite
a bit of confusion having two patches in one thread. Part of me feels
that I've hijacked this thread and that I should just drop my patch
altogether and help review Zheng's patch, but I'm struggling a bit to
do that as I've not managed to find problems with my version, but a
few have been pointed out with the other patch (of course, there may
be some yet undiscovered issues with my version too).

Alternatively, I could take my patch to another thread, but there does
not seem to be much sense in that. It might not solve the confusion
problem. The best thing would be that if we could work together to
make this work, however, we both seem to have fairly different ideas
on how it should work. Tom and I both agree that Zheng and Jim's
proposal to add OR x IS NULL clauses to the join condition is most
likely a no go area due to it disallowing hash and merge anti-joins.
The last I can understand from Jim is that they appear to disagree
with that and want to do the transformation based on costs. Perhaps
they're working on some new ideas to make that more feasible. I'm
interested to hear the latest on this.

[1]: https://commitfest.postgresql.org/22/2020/
[2]: https://commitfest.postgresql.org/22/2023/

--
David Rowley http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

#45David Steele
david@pgmasters.net
In reply to: David Rowley (#44)
Re: NOT IN subquery optimization

On 3/5/19 10:53 AM, David Rowley wrote:

On Tue, 5 Mar 2019 at 21:21, David Steele <david@pgmasters.net> wrote:

On 2/27/19 2:26 AM, David Rowley wrote:

FWIW, I did add this to the March CF, but I set the target version to
13. I wasn't considering this for PG12. I see Zheng was, but I agree
with you on PG13 being the target for this.

Looks like the target version of 13 was removed but I have added it back.

The problem seems to be that there are now 2 CF entries for this
thread. I originally added [1], but later Zheng added [2]. From what
Jim mentioned when he opened this thread I had the idea that no patch
existed yet, so I posted the one I already had written for this 4
years ago thinking that might be useful to base new work on.

Yeah, I just figured this out when I got to your patch which was
properly marked as PG13 and then saw they were pointing at the same thread.

At the very least one of the patch entries should be closed, or moved to
a new thread.

I'm not sure if I have an issue with competing patches on the same
thread. I've seen that before and it can lead to a good outcome. It
case, as you say, also lead to confusion.

Regards,
--
-David
david@pgmasters.net

#46David Rowley
david.rowley@2ndquadrant.com
In reply to: Jim Finnerty (#33)
Re: NOT IN subquery optimization

On Sun, 3 Mar 2019 at 01:34, Jim Finnerty <jfinnert@amazon.com> wrote:

I agree with Tom's comment above - when the cost of the NOT IN is dominated
by the cost of the outer scan (i.e. when the cardinality of the outer
relation is large relative to the cardinality of the subquery), and if the
inner cardinality is small enough to fit in memory, then the current
implementation does an efficient hash lookup into an in-memory structure,
and that's a very fast way to do the NOT IN. It almost achieves the
lower-bound cost of scanning the outer relation. It can also parallelizes
easily, whether or not we currently can do that. In these cases, the
current plan is the preferred plan, and we should keep it.

If you do the conversion to anti-join (without hacking at the join
quals and assuming no nulls are possible), then the planner can decide
what's best. The planner may choose a hash join which is not hugely
different from a hashed subplan, however from the testing I've done
the Hash Join is a bit faster. I imagine there's been more motivation
over the years to optimise that over hashed subplans. As far as I
know, there's no parallel query support for hashed subplans, but I
know there is for hash joins. In short, I don't think it makes any
sense to not translate into an anti-join (when possible). I think the
best anti-join plan will always be a win over the subquery. The
planner could make a mistake of course, but that's a different issue.
We certainly don't consider keeping the subquery around for NOT
EXISTS.

This is a case that we would want to avoid the transform, because when both
the inner and outer are nullable and the outer is large and the inner is
small, the transformed plan would Scan and Materialize the inner for each
row of the outer row, which is very slow compared to the untransformed plan:

slow case for the transformation: https://explain.depesz.com/s/0CBB

Well, that's because you're forcing the planner into a corner in
regards to the join condition. It has no choice but to nested loop
that join.

However, if the inner is too large to fit into memory, then the transformed
plan is faster on all of our other test cases, although our test cases are
far from complete. If the current solution supports parallel scan of the
outer, for example, then PQ could have lower elapsed time than the non-PQ
nested loop solution.

I'm having a little trouble understanding this. From what I
understand the code adds an OR .. IS NULL clause to the join
condition. Is this still the case with what you've been testing here?
If so, I'm surprised to hear all your test cases are faster. If
there's an OR clause in the join condition then the planner has no
choice but to use a nested loop join, so it's very surprising that you
would find that faster with larger data sets.

Or does the code your testing implement this a different way? Perhaps
with some execution level support?

Also, remember that the issue with the empty inner was just a bug that was
the result of trying to do an additional optimization in the case where
there is no WHERE clause in the subquery. That bug has been fixed. The
general case transformation described in the base note produces the correct
result in all cases, including the empty subquery case.

I'm not sure why lack of WHERE clause in the subquery counts for
anything here. The results set from the subquery can be empty or not
empty with or without a WHERE clause. The only way you'll know it's
empty during planning is if some gating qual says so, but that's yet
to be determined by the time the transformation should be done.

--
David Rowley http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

#47Tom Lane
tgl@sss.pgh.pa.us
In reply to: David Steele (#45)
Re: NOT IN subquery optimization

David Steele <david@pgmasters.net> writes:

I'm not sure if I have an issue with competing patches on the same
thread. I've seen that before and it can lead to a good outcome. It
case, as you say, also lead to confusion.

It's a bit of a shame that the cfbot will only be testing one of them
at a time if we leave it like this. I kind of lean towards the
two-thread, two-CF-entry approach because of that. The amount of
confusion is a constant.

regards, tom lane

#48David Rowley
david.rowley@2ndquadrant.com
In reply to: Tom Lane (#47)
Re: NOT IN subquery optimization

On Wed, 6 Mar 2019 at 03:37, Tom Lane <tgl@sss.pgh.pa.us> wrote:

David Steele <david@pgmasters.net> writes:

I'm not sure if I have an issue with competing patches on the same
thread. I've seen that before and it can lead to a good outcome. It
case, as you say, also lead to confusion.

It's a bit of a shame that the cfbot will only be testing one of them
at a time if we leave it like this. I kind of lean towards the
two-thread, two-CF-entry approach because of that. The amount of
confusion is a constant.

That sounds fine. I'll take mine elsewhere since I didn't start this thread.

--
David Rowley http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

#49David Rowley
david.rowley@2ndquadrant.com
In reply to: David Rowley (#48)
Re: NOT IN subquery optimization

On Wed, 6 Mar 2019 at 12:25, David Rowley <david.rowley@2ndquadrant.com> wrote:

That sounds fine. I'll take mine elsewhere since I didn't start this thread.

Moved to /messages/by-id/CAKJS1f82pqjqe3WT9_xREmXyG20aOkHc-XqkKZG_yMA7JVJ3Tw@mail.gmail.com

--
David Rowley http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

#50Li, Zheng
zhelli@amazon.com
In reply to: David Rowley (#49)
1 attachment(s)
Re: NOT IN subquery optimization

Hey, here is our latest patch. Major changes in this patch include:
1. Use the original hashed subplan if the inner fits in memory as decided by subplan_is_hashable().
2. Fixed the inner relation empty case by adding an inner relation existence check when we pull x out as a filter on the outer (see details below).
3. Integrate David Rowley's routine to use strict predicates and inner join conditions when checking nullability of a Var.

Detailed description of the patch:

NOT IN to ANTI JOIN transformation

If the NOT IN subquery is not eligible for hashed subplan as decided by
subplan_is_hashable(), do the following NOT IN to ANTI JOIN
transformation:

Single expression:
When x is nullable:
t1.x not in (t2.y where p) =>
ANTI JOIN
t1 (Filter: t1.x IS NOT NULL or NOT EXISTS (select 1 from t2 where p)),
t2 on join condition (t1.x=t2.y or t2.y IS NULL)
and p.
The predicate "t2.y IS NULL" can be removed if y is non-nullable.

When x is non-nullable:
t1.x not in (t2.y where p) =>
ANTI JOIN t1, t2 on join condition (t1.x=t2.y or t2.y IS NULL)
and p.
The predicate "t2.y IS NULL" can be removed if y is non-nullable.

Multi expression:
If all xi's are nullable:
(x1, x2, ... xn) not in (y1, y2, ... yn ) =>
ANTI JOIN t1, t2 on join condition:
((t1.x1 = t2.y1) and ... (t1.xi = t2.yi) ... and
(t1.xn = t2.yn)) IS NOT FALSE.

If at least one xi is non-nuallable:
(x1, x2, ... xn) not in (y1, y2, ... yn ) =>
ANTI JOIN t1, t2 on join condition:
(t1.x1 = t2.y1 or t2.y1 IS NULL or t1.x1 IS NULL) and ...
(t1.xi = t2.yi or t2.yi IS NULL) ... and
(t1.xn = t2.yn or t2.yn IS NULL or t1.xn IS NULL).

Add nullability testing routine is_node_nonnullable(), currently it
handles Var, TargetEntry, CoalesceExpr and Const. It uses strict
predicates, inner join conditions and NOT NULL constraint to check
the nullability of a Var.

Adjust and apply reduce_outer_joins() before the transformation so
that the outer joins have an opportunity to be converted to inner joins
prior to the transformation.

We measured performance improvements of two to five orders of magnitude
on most queries in a development environment. In our performance experiments,
table s (small) has 11 rows, table l (large) has 1 million rows. s.n and l.n
have NULL value. s.nn and l.nn are NOT NULL. Index is created on each column.

Cases using hash anti join:
l.nn not in (l.nn) 21900s -> 235ms
l.nn not in (l.nn where u>0) 22000s -> 240ms
l.n not in (l.nn) 21900s -> 238ms
l.n not in (l.nn where u>0) 22000s -> 248ms

Cases using index nested loop anti join
s.n not in (l.nn) 360ms -> 0.5ms
s.n not in (l.nn where u>0) 390ms -> 0.6ms
s.nn not in (l.nn) 365ms -> 0.5ms
s.nn not in (l.nn where u>0) 390ms -> 0.5ms

Cases using bitmap heap scan on the inner and nested loop anti join:
s.n not in (l.n) 360ms -> 0.7ms
l.n not in (l.n) 21900s -> 1.6s
l.n not in (l.n where u>0) 22000s -> 1680ms
s.nn not in (l.n) 360ms -> 0.5ms
l.nn not in (l.n) 21900s -> 1650ms
l.nn not in (l.n where u>0) 22000s -> 1660ms

Cases using the original hashed subplan:
l.n not in (s.n) 63ms -> 63ms
l.nn not in (s.nn) 63ms -> 63ms
l.n not in (s.n where u>0) 63ms -> 63ms

Comments are welcome.

Regards,
-----------
Zheng Li
AWS, Amazon Aurora PostgreSQL

Attachments:

not_in_to_anti_join_transformation_v2.patchapplication/octet-stream; name=not_in_to_anti_join_transformation_v2.patchDownload
commit dfdb393987d54e2725933b97f57f601804a0b756
Author: Zheng Li <zhelli@amazon.com>
Date:   Sat Mar 16 20:08:11 2019 +0000

    NOT IN to ANTI JOIN transformation
    
    If the NOT IN subquery is not eligible for hashed subplan as decided by
    subplan_is_hashable(), do the following NOT IN to ANTI JOIN
    transformation:
    
    Single expression:
        When x is nullable:
        t1.x not in (t2.y where p) =>
        ANTI JOIN
        t1 (Filter: t1.x IS NOT NULL or NOT EXISTS (select 1 from t2 where p)),
        t2 on join condition (t1.x=t2.y or t2.y IS NULL)
        and p.
        The predicate "t2.y IS NULL" can be removed if y is non-nullable.
    
        When x is non-nullable:
        t1.x not in (t2.y where p) =>
        ANTI JOIN t1, t2 on join condition (t1.x=t2.y or t2.y IS NULL)
        and p.
        The predicate "t2.y IS NULL" can be removed if y is non-nullable.
    
    Multi expression:
        If all xi's are nullable:
        (x1, x2, ... xn) not in (y1, y2, ... yn ) =>
        ANTI JOIN t1, t2 on join condition:
        ((t1.x1 = t2.y1) and ... (t1.xi = t2.yi) ... and
         (t1.xn = t2.yn)) IS NOT FALSE.
    
        If at least one xi is non-nuallable:
        (x1, x2, ... xn) not in (y1, y2, ... yn ) =>
        ANTI JOIN t1, t2 on join condition:
        (t1.x1 = t2.y1 or t2.y1 IS NULL or t1.x1 IS NULL) and ...
        (t1.xi = t2.yi or t2.yi IS NULL) ... and
        (t1.xn = t2.yn or t2.yn IS NULL or t1.xn IS NULL).
    
    Add nullability testing routine is_node_nonnullable(), currently it
    handles Var, TargetEntry, CoalesceExpr and Const. It uses strict
    predicates, inner join conditions and NOT NULL constraint to check
    the nullability of a Var.
    
    Adjust and apply reduce_outer_joins() before the transformation so
    that the outer joins have an opportunity to be converted to inner joins
    prior to the transformation.
    
    We measured performance improvements of two to five orders of magnitude
    on most queries in a development environment. In our performance experiments,
    table s (small) has 11 rows, table l (large) has 1 million rows. s.n and l.n
    have NULL value. s.nn and l.nn are NOT NULL. Index is created on each column.
    
    Cases using hash anti join:
    l.nn not in (l.nn) 21900s -> 235ms
    l.nn not in (l.nn where u>0) 22000s -> 240ms
    l.n not in (l.nn) 21900s -> 238ms
    l.n not in (l.nn where u>0) 22000s -> 248ms
    
    Cases using index nested loop anti join
    s.n not in (l.nn) 360ms -> 0.5ms
    s.n not in (l.nn where u>0) 390ms -> 0.6ms
    s.nn not in (l.nn) 365ms -> 0.5ms
    s.nn not in (l.nn where u>0) 390ms -> 0.5ms
    
    Cases using bitmap heap scan on the inner and nested loop anti join:
    s.n not in (l.n) 360ms -> 0.7ms
    l.n not in (l.n) 21900s ->  1.6s
    l.n not in (l.n where u>0) 22000s -> 1680ms
    s.nn not in (l.n) 360ms -> 0.5ms
    l.nn not in (l.n) 21900s -> 1650ms
    l.nn not in (l.n where u>0) 22000s -> 1660ms
    
    Cases using the original hashed subplan:
    l.n not in (s.n) 63ms -> 63ms
    l.nn not in (s.nn) 63ms -> 63ms
    l.n not in (s.n where u>0) 63ms -> 63ms
    
    Authors: Zheng Li, Jim Finnerty
    Discussion:
    https://www.postgresql.org/message-id/flat/1550706289606-0.post%40n3.nabble.com

diff --git a/src/backend/optimizer/plan/planner.c b/src/backend/optimizer/plan/planner.c
index e408e77d6f..bd412a6b56 100644
--- a/src/backend/optimizer/plan/planner.c
+++ b/src/backend/optimizer/plan/planner.c
@@ -971,7 +971,7 @@ subquery_planner(PlannerGlobal *glob, Query *parse,
 	 * preprocessing.
 	 */
 	if (hasOuterJoins)
-		reduce_outer_joins(root);
+		reduce_outer_joins(parse);
 
 	/*
 	 * If we have any RTE_RESULT relations, see if they can be deleted from
diff --git a/src/backend/optimizer/plan/subselect.c b/src/backend/optimizer/plan/subselect.c
index 555c91f61e..0df238446f 100644
--- a/src/backend/optimizer/plan/subselect.c
+++ b/src/backend/optimizer/plan/subselect.c
@@ -1107,6 +1107,357 @@ inline_cte_walker(Node *node, inline_cte_walker_context *context)
 	return expression_tree_walker(node, inline_cte_walker, context);
 }
 
+/* Returns a List of Nodes from the testexpr of an Any SubLink */
+static List *
+getTestExpr(SubLink *sublink)
+{
+	Node * testexpr = sublink->testexpr;
+	Assert(testexpr);
+
+	/* single expression */
+	if(IsA(testexpr, OpExpr))
+	{
+		OpExpr	*opexpr = (OpExpr *) testexpr;
+		Node	*testnode = linitial(opexpr->args);
+
+		return list_make1(testnode);
+	}
+	/* multi-expression */
+	else if(IsA(testexpr, BoolExpr))
+	{
+		BoolExpr	*bexpr = (BoolExpr *) testexpr;
+		ListCell		*lc;
+		Node		*node;
+		List			*result = NULL;
+
+		foreach(lc, bexpr->args)
+		{
+			node = lfirst(lc);
+			if(IsA(node, OpExpr))
+			{
+				OpExpr *expr = (OpExpr *) node;
+				result = lappend(result, linitial(expr->args));
+			}
+			else
+			{
+				elog(ERROR, "unrecognized node type for testexpr: %d",
+						(int) nodeTag(node));
+			}
+		}
+		return result;
+	}
+	else
+	{
+		elog(ERROR, "unrecognized node type for testexpr: %d",
+				(int) nodeTag(testexpr));
+	}
+}
+
+/* Try to reduce outer joins if there is one in the Query */
+static void
+reduce_outer_joins_NOT_IN(Query *parse)
+{
+	ListCell				*lc;
+	RangeTblEntry	*rte;
+
+	foreach(lc, parse->rtable)
+	{
+		rte = (RangeTblEntry *) lfirst(lc);
+		/* try to reduce outer joins if there is one */
+		if (rte->rtekind == RTE_JOIN &&
+				IS_OUTER_JOIN(rte->jointype))
+		{
+			reduce_outer_joins(parse);
+			return;
+		}
+	}
+}
+
+/*
+ * Make clause NOT EXISTS
+ * (select 1 from t2 where p) for the NOT IN to ANTI JOIN
+ * transformation.
+ */
+static
+Node *
+makeExistsTest_NOT_IN(Query *subselect)
+{
+	BoolExpr *notExpr = makeNode(BoolExpr);
+	SubLink *exists = makeNode(SubLink);
+	Query *selectOne =  copyObject(subselect);
+	Const *oneconst;
+	TargetEntry *dummyte;
+
+	/* modify subselect target list to contain a dummy const 1 */
+	oneconst = makeConst(INT4OID,
+						 -1,
+						 InvalidOid,
+						 sizeof(int32),
+						 Int32GetDatum(1),
+						 false, /* isnull */
+						 true); /* pass by value */
+	dummyte = makeTargetEntry((Expr *) oneconst,
+							  1,
+							  "one",
+							  false /* resjunk */ );
+	selectOne->targetList = list_make1(dummyte);
+
+	/* make EXISTS(select 1 from t2 where p) */
+	exists->subLinkType = EXISTS_SUBLINK;
+	exists->subLinkId = 0;
+	exists->subselect = (Node *) selectOne;
+	exists->location = -1;
+
+	/* make NOT EXISTS(select 1 from t2 where p) */
+	notExpr->boolop = NOT_EXPR;
+	notExpr->args = list_make1(exists);
+	notExpr->location = -1;
+
+	return (Node *) notExpr;
+}
+
+/*
+ * If the NOT IN subquery is not eligible for hashed subplan as
+ * decided by subplan_is_hashable(),
+ * do the following NOT IN to ANTI JOIN conversions:
+ *
+ * When x is non-nullable:
+ * t1.x not in (t2.y where p) => ANTI JOIN
+ * t1, t2 on join condition (t1.x=t2.y or t2.y IS NULL) and p.
+ * the above predicate "t2.y IS NULL" can be removed if y
+ * is also non-nullable.
+ *
+ * When x is nullable:
+ * t1.x not in (t2.y where p) => ANTI JOIN
+ * t1 (Filter: t1.x is not null or not exists (select 1 from t2 where p)),
+ * t2 on join condition (t1.x=t2.y or t2.y is null) and p.
+ * the above predicate "t2.y IS NULL" can be removed if y
+ * is also non-nullable.
+ *
+ * The multi-expression case is just ANDs of the single-
+ * expression case.
+ */
+static bool
+convert_NOT_IN_to_join(PlannerInfo *root, Node **quals,
+									SubLink *sublink, List *subquery_vars,
+									Node **pullout)
+{
+	Query			*parse = root->parse;
+	Query			*subselect = (Query *) sublink->subselect;
+	List				*testnodes = getTestExpr(sublink);
+	bool				 outerNonNull;
+	bool				 innerNonNull;
+	PlannerInfo 	*subroot;
+	double			 tuple_fraction;
+	RelOptInfo		*final_rel;
+	Path				*best_path;
+	Plan				*plan;
+	NullTest 		*nt;
+
+	/* For ALL and ANY subplans, we will be
+	 * able to stop evaluating if the test condition fails or matches, so very
+	 * often not all the tuples will be retrieved; for lack of a better idea,
+	 * specify 50% retrieval.
+	 */
+	tuple_fraction = 0.5;
+	/*
+	 * Generate Paths for the subquery, use a copied version of the subquery
+	 * so that the existing one doesn't get modified.
+	 */
+	subroot = subquery_planner(root->glob, copyObject(subselect),
+								root,
+							   false, tuple_fraction);
+
+	/* Select best Path and turn it into a Plan. */
+	final_rel = fetch_upper_rel(subroot, UPPERREL_FINAL, NULL);
+	best_path = get_cheapest_fractional_path(final_rel, tuple_fraction);
+
+	plan = create_plan(subroot, best_path);
+
+	/*
+	 * Punt if subplan is hashable since using hashed subplan is almost like
+	 * doing a Hash Anti Join, we probably can't do better than that.
+	 */
+	if(subplan_is_hashable(plan))
+	{
+		return false;
+	}
+
+	/*
+	 * Try reduce_outer_joins since outer join affects the nullability test that's coming up next.
+	 * We have to call reduce_outer_joins for outer and inner query separately because we
+	 * don't have a global range table yet.
+	 */
+	reduce_outer_joins_NOT_IN(parse);
+
+	reduce_outer_joins_NOT_IN(subselect);
+
+	Assert(testnodes);
+	outerNonNull =
+					list_hasnonnullable(testnodes, parse);
+	innerNonNull =
+					list_hasnonnullable(subselect->targetList, subselect);
+
+	/* Single-expression case, do the following:
+	 * When x is non-nullable:
+	 * t1.x not in (t2.y where p) => ANTI JOIN
+	 * t1, t2 on join condition (t1.x=t2.y or t2.y IS NULL) and p.
+	 * the above predicate "t2.y IS NULL" can be removed if y
+	 * is also non-nullable.
+	 *
+	 * When x is nullable:
+	 * t1.x not in (t2.y where p) => ANTI JOIN
+	 * t1 (Filter: t1.x is not null or not exists (select 1 from t2 where p)),
+	 * t2 on join condition (t1.x=t2.y or t2.y is null) and p.
+	 * the above predicate "t2.y IS NULL" can be removed if y
+	 * is also non-nullable.
+	 */
+	if(IsA(*quals, OpExpr))
+	{
+		/*  Add "OR y IS NULL" if y is nullable */
+		if(!innerNonNull)
+		{
+			/* make expr y IS NULL */
+			nt = makeNode(NullTest);
+			nt->arg = (Expr *)linitial(subquery_vars);
+			nt->nulltesttype = IS_NULL;
+			nt->argisrow = false;
+			nt->location = -1;
+
+			/* make orclause (x = y OR y IS NULL) */
+			*quals = (Node *)make_orclause(list_make2(*quals,
+								(Node *)nt));
+		}
+
+		/*
+		 * if x is nullable, make the following filter for t1 :
+		 * x IS NOT NULL or NOT EXISTS (select 1 from t2 where p)
+		 */
+		if(!outerNonNull)
+		{
+			Node * existsTest = NULL;
+
+			/* make expr x IS NOT NULL */
+			nt = makeNode(NullTest);
+			nt->arg = (Expr *) linitial(testnodes);
+			nt->nulltesttype = IS_NOT_NULL;
+			nt->argisrow = false;
+			nt->location = -1;
+
+			existsTest = makeExistsTest_NOT_IN(subselect);
+			/* make x IS NOT NULL OR NOT EXISTS (select 1 from t2 where p) */
+			*pullout = (Node *)make_orclause(list_make2(
+									(Node *) nt, existsTest));
+		}
+	}
+	/*
+	 * Multi-expression case:
+	 * If all xi's are nullable:
+	 * (x1, x2, ... xn) not in (y1, y2, ... yn ) =>
+	 * ANTI JOIN t1,
+	 * t2 on join condition:
+	 * ((t1.x1 = t2.y1) and ... (t1.xi = t2.yi) ... and
+	 * (t1.xn = t2.yn)) is NOT FALSE.
+	 *
+	 * If at least one xi is non-nuallable:
+	 * (x1, x2, ... xn) not in (y1, y2, ... yn ) =>
+	 * ANTI JOIN t1,
+	 * t2 on join condition:
+	 * (t1.x1 = t2.y1 or t2.y1 is NULL) and ...
+	 * (t1.xi = t2.yi or t2.yi is NULL or t1.xi is NULL) ... and
+	 * (t1.xn = t2.yn or t2.yn is NULL).
+	 */
+	else if(IsA(*quals, BoolExpr))
+	{
+		/*
+		 * Add IS NOT FALSE on top of the join condition if ALL x_i's are nullable
+		 */
+		if(!outerNonNull)
+		{
+			BooleanTest *btest;
+
+			btest = makeNode(BooleanTest);
+			btest->arg = (Expr *) *quals;
+			btest->booltesttype = IS_NOT_FALSE;
+			*quals = (Node *) btest;
+		}
+		else
+		{
+			ListCell			*qualc;
+			TargetEntry	*te;
+			ListCell			*xc = list_head(testnodes);
+			ListCell			*yc = list_head(subquery_vars);
+			ListCell			*ytlc = list_head(subselect->targetList);
+			List				*quallist = ((BoolExpr *)*quals)->args;
+			Node				*joinCondition = NULL;
+			bool				 xnonNull;
+			bool				 ynonNull;
+
+			/* Reconstruct quals in the loop */
+			*quals = NULL;
+			foreach(qualc, quallist)
+			{
+				te = (TargetEntry *)lfirst(ytlc);
+				/* x_i = y_i */
+				joinCondition = lfirst(qualc);
+				ynonNull = is_node_nonnullable((Node*)te, subselect);
+
+				/* append y_i IS NULL to x_i = y_i if y_i is non-nullable */
+				if(!ynonNull)
+				{
+					/* make expr y_i IS NULL */
+					nt = makeNode(NullTest);
+					nt->arg = (Expr *)lfirst(yc);
+					nt->nulltesttype = IS_NULL;
+					nt->argisrow = false;
+					nt->location = -1;
+
+					/* make orclause (x_i = y_i OR y_i IS NULL) */
+					joinCondition = (Node *)make_orclause(list_make2(joinCondition,
+											(Node *)nt));
+				}
+
+				/*
+				 * Append "OR x_i is null" to the join condition if x_i is nullable.
+				 * Notice at least one x_i should be non-nullable because the all
+				 * x_i's nullable case is handled earlier by adding "IS NOT FALSE"
+				 * on top of the join condition.
+				 */
+				xnonNull = is_node_nonnullable(lfirst(xc), parse);
+				if(!xnonNull)
+				{
+					/* make expr x_i IS NULL */
+					nt = makeNode(NullTest);
+					nt->arg = (Expr *)lfirst(xc);
+					nt->nulltesttype = IS_NULL;
+					nt->argisrow = false;
+					nt->location = -1;
+
+					/* make orclause (x_i = y_i OR y_i IS NULL OR x_i IS NULL) */
+					joinCondition = (Node *)make_orclause(list_make2(joinCondition,
+											(Node *)nt));
+				}
+
+				/*
+				 * Now append joinCondition to quals as one andclause.
+				 * (x_i = y_i OR y_i IS NULL OR x_i IS NULL) AND
+				 * (x_j = y_j OR y_j IS NULL OR x_j IS NULL)...
+				 */
+				*quals = (Node *)make_andclause(list_make2(*quals, joinCondition));
+				xc = lnext(xc);
+				yc = lnext(yc);
+				ytlc = lnext(ytlc);
+			}
+		}
+	}
+	/* quals should be either OpExpr or BoolExpr, otherwise don't convert */
+	else
+	{
+		return false;
+	}
+
+	return true;
+}
 
 /*
  * convert_ANY_sublink_to_join: try to convert an ANY SubLink to a join
@@ -1144,7 +1495,8 @@ inline_cte_walker(Node *node, inline_cte_walker_context *context)
  */
 JoinExpr *
 convert_ANY_sublink_to_join(PlannerInfo *root, SubLink *sublink,
-							Relids available_rels)
+											bool under_not, Node **pullout,
+											Relids available_rels)
 {
 	JoinExpr   *result;
 	Query	   *parse = root->parse;
@@ -1224,11 +1576,22 @@ convert_ANY_sublink_to_join(PlannerInfo *root, SubLink *sublink,
 	 */
 	quals = convert_testexpr(root, sublink->testexpr, subquery_vars);
 
+	/*
+	 * Try converting x NOT IN (y) to ANTI JOIN.
+	 */
+	if(under_not &&
+			!convert_NOT_IN_to_join(root, &quals,
+								sublink, subquery_vars, pullout))
+	{
+		return NULL;
+	}
+
 	/*
 	 * And finally, build the JoinExpr node.
 	 */
 	result = makeNode(JoinExpr);
-	result->jointype = JOIN_SEMI;
+	/* NOT IN will be converted to ANTI JOIN */
+	result->jointype = under_not ? JOIN_ANTI : JOIN_SEMI;
 	result->isNatural = false;
 	result->larg = NULL;		/* caller must fill this in */
 	result->rarg = (Node *) rtr;
diff --git a/src/backend/optimizer/prep/prepjointree.c b/src/backend/optimizer/prep/prepjointree.c
index aebe162713..d2a2149724 100644
--- a/src/backend/optimizer/prep/prepjointree.c
+++ b/src/backend/optimizer/prep/prepjointree.c
@@ -104,7 +104,7 @@ static Query *pullup_replace_vars_subquery(Query *query,
 static reduce_outer_joins_state *reduce_outer_joins_pass1(Node *jtnode);
 static void reduce_outer_joins_pass2(Node *jtnode,
 						 reduce_outer_joins_state *state,
-						 PlannerInfo *root,
+						 Query *parse,
 						 Relids nonnullable_rels,
 						 List *nonnullable_vars,
 						 List *forced_null_vars);
@@ -256,6 +256,13 @@ pull_up_sublinks_jointree_recurse(PlannerInfo *root, Node *jtnode,
 		}
 		/* Build the replacement FromExpr; no quals yet */
 		newf = makeFromExpr(newfromlist, NULL);
+		/*
+		 * Replace parse->jointree with newf now because we might modify join types
+		 * during reduce_outer_joins() in convert_NOT_IN_to_join() which is called
+		 * in pull_up_sublinks_qual_recurse() that's coming up next.
+		 */
+		newf->quals = f->quals;
+		root->parse->jointree = newf;
 		/* Set up a link representing the rebuilt jointree */
 		jtlink = (Node *) newf;
 		/* Now process qual --- all children are available for use */
@@ -388,7 +395,7 @@ pull_up_sublinks_qual_recurse(PlannerInfo *root, Node *node,
 		/* Is it a convertible ANY or EXISTS clause? */
 		if (sublink->subLinkType == ANY_SUBLINK)
 		{
-			if ((j = convert_ANY_sublink_to_join(root, sublink,
+			if ((j = convert_ANY_sublink_to_join(root, sublink, false, NULL,
 												 available_rels1)) != NULL)
 			{
 				/* Yes; insert the new join node into the join tree */
@@ -414,7 +421,7 @@ pull_up_sublinks_qual_recurse(PlannerInfo *root, Node *node,
 				return NULL;
 			}
 			if (available_rels2 != NULL &&
-				(j = convert_ANY_sublink_to_join(root, sublink,
+				(j = convert_ANY_sublink_to_join(root, sublink, false, NULL,
 												 available_rels2)) != NULL)
 			{
 				/* Yes; insert the new join node into the join tree */
@@ -560,6 +567,68 @@ pull_up_sublinks_qual_recurse(PlannerInfo *root, Node *node,
 					return NULL;
 				}
 			}
+			else if (sublink->subLinkType == ANY_SUBLINK)
+			{
+				Node *pullout = NULL;
+
+				if ((j = convert_ANY_sublink_to_join(root, sublink, true, &pullout,
+													 available_rels1)) != NULL)
+				{
+					/* Yes; insert the new join node into the join tree */
+					j->larg = *jtlink1;
+					*jtlink1 = (Node *) j;
+					/* Recursively process pulled-up jointree nodes */
+					j->rarg = pull_up_sublinks_jointree_recurse(root,
+																j->rarg,
+																&child_rels);
+
+					/*
+					 * Now recursively process the pulled-up quals.  Any inserted
+					 * joins can get stacked onto either j->larg or j->rarg,
+					 * depending on which rels they reference.
+					 */
+					j->quals = pull_up_sublinks_qual_recurse(root,
+															 j->quals,
+															 &j->larg,
+															 available_rels1,
+															 &j->rarg,
+															 child_rels);
+					/*
+					 * Return pullout predicate (x is NOT NULL) if it's not null,
+					 * otherwise return NULL representing constant TRUE.
+					 */
+					return pullout? pullout : NULL;
+				}
+				if (available_rels2 != NULL &&
+					(j = convert_ANY_sublink_to_join(root, sublink, true, &pullout,
+													 available_rels2)) != NULL)
+				{
+					/* Yes; insert the new join node into the join tree */
+					j->larg = *jtlink2;
+					*jtlink2 = (Node *) j;
+					/* Recursively process pulled-up jointree nodes */
+					j->rarg = pull_up_sublinks_jointree_recurse(root,
+																j->rarg,
+																&child_rels);
+
+					/*
+					 * Now recursively process the pulled-up quals.  Any inserted
+					 * joins can get stacked onto either j->larg or j->rarg,
+					 * depending on which rels they reference.
+					 */
+					j->quals = pull_up_sublinks_qual_recurse(root,
+															 j->quals,
+															 &j->larg,
+															 available_rels2,
+															 &j->rarg,
+															 child_rels);
+					/*
+					 * Return pullout predicate (x is NOT NULL) if it's not null,
+					 * otherwise return NULL representing constant TRUE.
+					 */
+					return pullout? pullout : NULL;
+				}
+			}
 		}
 		/* Else return it unmodified */
 		return node;
@@ -2377,7 +2446,7 @@ flatten_simple_union_all(PlannerInfo *root)
  * alias-var expansion).
  */
 void
-reduce_outer_joins(PlannerInfo *root)
+reduce_outer_joins(Query *parse)
 {
 	reduce_outer_joins_state *state;
 
@@ -2390,14 +2459,14 @@ reduce_outer_joins(PlannerInfo *root)
 	 * join(s) below each side of each join clause. The second pass examines
 	 * qual clauses and changes join types as it descends the tree.
 	 */
-	state = reduce_outer_joins_pass1((Node *) root->parse->jointree);
+	state = reduce_outer_joins_pass1((Node *) parse->jointree);
 
 	/* planner.c shouldn't have called me if no outer joins */
 	if (state == NULL || !state->contains_outer)
 		elog(ERROR, "so where are the outer joins?");
 
-	reduce_outer_joins_pass2((Node *) root->parse->jointree,
-							 state, root, NULL, NIL, NIL);
+	reduce_outer_joins_pass2((Node *) parse->jointree,
+							 state, parse, NULL, NIL, NIL);
 }
 
 /*
@@ -2480,7 +2549,7 @@ reduce_outer_joins_pass1(Node *jtnode)
 static void
 reduce_outer_joins_pass2(Node *jtnode,
 						 reduce_outer_joins_state *state,
-						 PlannerInfo *root,
+						 Query *parse,
 						 Relids nonnullable_rels,
 						 List *nonnullable_vars,
 						 List *forced_null_vars)
@@ -2520,7 +2589,7 @@ reduce_outer_joins_pass2(Node *jtnode,
 			reduce_outer_joins_state *sub_state = lfirst(s);
 
 			if (sub_state->contains_outer)
-				reduce_outer_joins_pass2(lfirst(l), sub_state, root,
+				reduce_outer_joins_pass2(lfirst(l), sub_state, parse,
 										 pass_nonnullable_rels,
 										 pass_nonnullable_vars,
 										 pass_forced_null_vars);
@@ -2632,7 +2701,7 @@ reduce_outer_joins_pass2(Node *jtnode,
 		/* Apply the jointype change, if any, to both jointree node and RTE */
 		if (rtindex && jointype != j->jointype)
 		{
-			RangeTblEntry *rte = rt_fetch(rtindex, root->parse->rtable);
+			RangeTblEntry *rte = rt_fetch(rtindex, parse->rtable);
 
 			Assert(rte->rtekind == RTE_JOIN);
 			Assert(rte->jointype == j->jointype);
@@ -2717,7 +2786,7 @@ reduce_outer_joins_pass2(Node *jtnode,
 					pass_nonnullable_vars = NIL;
 					pass_forced_null_vars = NIL;
 				}
-				reduce_outer_joins_pass2(j->larg, left_state, root,
+				reduce_outer_joins_pass2(j->larg, left_state, parse,
 										 pass_nonnullable_rels,
 										 pass_nonnullable_vars,
 										 pass_forced_null_vars);
@@ -2739,7 +2808,7 @@ reduce_outer_joins_pass2(Node *jtnode,
 					pass_nonnullable_vars = NIL;
 					pass_forced_null_vars = NIL;
 				}
-				reduce_outer_joins_pass2(j->rarg, right_state, root,
+				reduce_outer_joins_pass2(j->rarg, right_state, parse,
 										 pass_nonnullable_rels,
 										 pass_nonnullable_vars,
 										 pass_forced_null_vars);
diff --git a/src/backend/optimizer/util/clauses.c b/src/backend/optimizer/util/clauses.c
index 501b0e9e2d..88159568a0 100644
--- a/src/backend/optimizer/util/clauses.c
+++ b/src/backend/optimizer/util/clauses.c
@@ -38,10 +38,12 @@
 #include "optimizer/optimizer.h"
 #include "optimizer/plancat.h"
 #include "optimizer/planmain.h"
+#include "optimizer/pathnode.h"
 #include "parser/analyze.h"
 #include "parser/parse_agg.h"
 #include "parser/parse_coerce.h"
 #include "parser/parse_func.h"
+#include "parser/parsetree.h"
 #include "rewrite/rewriteManip.h"
 #include "tcop/tcopprot.h"
 #include "utils/acl.h"
@@ -2011,6 +2013,271 @@ find_forced_null_var(Node *node)
 	return NULL;
 }
 
+/*
+ * find_innerjoined_rels
+ *		Traverse jointree to locate non-outerjoined-rels and quals above them
+ *
+ * We fill innerjoined_rels with the relids of all rels that are not below
+ * the nullable side of any outer join (which would cause their Vars to be
+ * possibly NULL regardless of what's in the catalogs).  In the same scan,
+ * we locate all WHERE and JOIN/ON quals that constrain these rels add them to
+ * the usable_quals list (forming a list with implicit-AND semantics).
+ *
+ * Top-level caller must initialize innerjoined_rels/usable_quals to NULL/NIL.
+ */
+static void
+find_innerjoined_rels(Node *jtnode,
+					  Relids *innerjoined_rels, List **usable_quals)
+{
+	if (jtnode == NULL)
+		return;
+	if (IsA(jtnode, RangeTblRef))
+	{
+		int			varno = ((RangeTblRef *) jtnode)->rtindex;
+
+		*innerjoined_rels = bms_add_member(*innerjoined_rels, varno);
+	}
+	else if (IsA(jtnode, FromExpr))
+	{
+		FromExpr   *f = (FromExpr *) jtnode;
+		ListCell   *lc;
+
+		/* All elements of the FROM list are allowable */
+		foreach(lc, f->fromlist)
+			find_innerjoined_rels((Node *) lfirst(lc),
+								  innerjoined_rels, usable_quals);
+		/* ... and its WHERE quals are too */
+		if (f->quals)
+			*usable_quals = lappend(*usable_quals, f->quals);
+	}
+	else if (IsA(jtnode, JoinExpr))
+	{
+		JoinExpr   *j = (JoinExpr *) jtnode;
+
+		switch (j->jointype)
+		{
+			case JOIN_INNER:
+				/* visit both children */
+				find_innerjoined_rels(j->larg,
+									  innerjoined_rels, usable_quals);
+				find_innerjoined_rels(j->rarg,
+									  innerjoined_rels, usable_quals);
+				/* and grab the ON quals too */
+				if (j->quals)
+					*usable_quals = lappend(*usable_quals, j->quals);
+				break;
+
+			case JOIN_LEFT:
+			case JOIN_SEMI:
+			case JOIN_ANTI:
+
+				/*
+				 * Only the left input is possibly non-nullable; furthermore,
+				 * the quals of this join don't constrain the left input.
+				 * Note: we probably can't see SEMI or ANTI joins at this
+				 * point, but if we do, we can treat them like LEFT joins.
+				 */
+				find_innerjoined_rels(j->larg,
+									  innerjoined_rels, usable_quals);
+				break;
+
+			case JOIN_RIGHT:
+				/* Reverse of the above case */
+				find_innerjoined_rels(j->rarg,
+									  innerjoined_rels, usable_quals);
+				break;
+
+			case JOIN_FULL:
+				/* Neither side is non-nullable, so stop descending */
+				break;
+
+			case JOIN_UNIQUE_OUTER:
+			case JOIN_UNIQUE_INNER:
+				/* Don't think we will see JOIN_UNIQUE_OUTER or
+				 * JOIN_UNIQUE_INNER since they are only used internally in
+				 * the planner in a much later phase (in standard_join_search).
+				*/
+				break;
+
+			default:
+				elog(ERROR, "unrecognized join type: %d",
+					 (int) j->jointype);
+		}
+	}
+	else
+		elog(ERROR, "unrecognized node type: %d",
+			 (int) nodeTag(jtnode));
+}
+
+/*
+ * Returns true if the Node passed in is nonnullable. Currently handles Var,
+ * TargetEntry, CoaleseExpr and Const.
+ * TODO: Add more supporting cases.
+ * A Var is nonnullable if:
+ *	It does not appear on the null-padded side of an outer join and it has NOT NULL constraint,
+ * 	Or if it's forced non-null by inner join or other strict predicates.
+ * A CoalesceExpr is nonnullable if it has a non-null argument.
+ */
+bool
+is_node_nonnullable(Node * node, Query *parse)
+{
+	AttrNumber	 attno = InvalidAttrNumber;
+	Oid				 reloid;
+	Var 				*var = NULL;
+
+	/*
+	 * If the query contains set operations, punt.  The set ops themselves
+	 * couldn't introduce nulls that weren't in their inputs, but the tlist
+	 * present in the top-level query is just dummy and won't give us useful
+	 * info.  We could get an answer by recursing to examine each leaf query,
+	 * but for the moment it doesn't seem worth the extra complication.
+	 *
+	 * Note that we needn't consider other top-level operators such as
+	 * DISTINCT, GROUP BY, etc, as those will not introduce nulls either.
+	 */
+	if (parse->setOperations)
+		return false;
+
+	switch (nodeTag(node))
+	{
+		case T_Var:
+		{
+			RangeTblEntry	*rte;
+
+			var = (Var *) node;
+			attno = var->varattno;
+			rte = rt_fetch(var->varno, parse->rtable);
+			reloid = rte->relid;
+			break;
+		}
+		case T_TargetEntry:
+		{
+			TargetEntry	*te = (TargetEntry *)node;
+			switch(nodeTag(te->expr))
+			{
+				case T_Var:
+				{
+					var = (Var *) te->expr;
+					attno = te->resorigcol;
+					reloid = te->resorigtbl;
+					break;
+				}
+				/* recurse into is_node_nonnullable for other types of Node */
+				default:
+					return is_node_nonnullable((Node *)te->expr, parse);
+			}
+			break;
+		}
+		case T_CoalesceExpr:
+		{
+			ListCell   			*arg;
+			CoalesceExpr	*cexpr = (CoalesceExpr *)node;
+
+			/* handle COALESCE Function by looking for non-null argument */
+			foreach(arg, cexpr->args)
+			{
+				Node	*e = lfirst(arg);
+
+				/* recurse into is_node_nonnullable */
+				if (is_node_nonnullable(e, parse))
+				{
+					return true;
+				}
+			}
+			break;
+		}
+		case T_Const:
+		{
+			if(!((Const *) node)->constisnull)
+			{
+				return true;
+			}
+			break;
+		}
+		/*
+		 * TODO: handle more cases to make the nullability test more accurate
+		 * Assume unhandled cases are nullable.
+		 */
+		default:
+			return false;
+	}
+
+	/*
+	 * If we have found a Var, it is non-null if:
+	 * not on NULL-padded side of an outer join and has NOT NULL constraint
+	 * or forced NOT NULL by inner join conditions or by other strict predicates.
+	 */
+	if(var &&
+			reloid != InvalidOid)
+	{
+		Relids		 innerjoined_rels = NULL;
+		List	   		*innerjoined_useful_quals = NIL;
+		List			*nonnullable_innerjoined_vars = NIL;
+
+		find_innerjoined_rels((Node *) parse->jointree,
+							  &innerjoined_rels,
+							  &innerjoined_useful_quals);
+
+		/*
+		 * Check if the Var is from an INNER JOINed rel, it's also guaranteed
+		 * to not be on the null-padded side of an outer join.
+		 */
+		if (bms_is_member(var->varno, innerjoined_rels))
+		{
+			/*
+			 * If Var is from a plain relation and its column is marked
+			 * NOT NULL according to the catalogs, it can't produce NULL.
+			 */
+			if (get_attnotnull(reloid, attno))
+			{
+				return true;
+			}
+
+			/*
+			 * Otherwise check for the existance of strict predicates which filter
+			 * out NULL values for this Var.
+			 */
+			nonnullable_innerjoined_vars =
+				find_nonnullable_vars((Node *) innerjoined_useful_quals);
+
+			if (list_member(nonnullable_innerjoined_vars, var))
+			{
+				return true;
+			}
+		}
+
+		/*
+		 * If we get here it means -
+		 * The var is on null-padded side of an outer-join, since we've already
+		 * tried reduce_outer_joins(), there isn't any strict predicates to turn this
+		 * outer join to inner join, then there will be no strict predicates to force
+		 * this var non-null as well.
+		 */
+	}
+
+	return false;
+}
+
+/*
+ * Returns true if at least one Node in the input list is non-nullable
+ */
+bool
+list_hasnonnullable(List * list, Query *parse)
+{
+	ListCell	*lc;
+	Node	*node;
+
+	foreach(lc, list)
+	{
+		node = lfirst(lc);
+		if(is_node_nonnullable(node, parse))
+		{
+			return true;
+		}
+	}
+	return false;
+}
+
 /*
  * Can we treat a ScalarArrayOpExpr as strict?
  *
diff --git a/src/backend/utils/cache/lsyscache.c b/src/backend/utils/cache/lsyscache.c
index e88c45d268..b6a012b716 100644
--- a/src/backend/utils/cache/lsyscache.c
+++ b/src/backend/utils/cache/lsyscache.c
@@ -878,6 +878,35 @@ get_atttypetypmodcoll(Oid relid, AttrNumber attnum,
 	ReleaseSysCache(tp);
 }
 
+/*
+ * get_attnotnull
+ *		Given the relation id and the attribute number,
+ *		return the "attnotnull" field from the attribute relation.
+ */
+bool
+get_attnotnull(Oid relid, AttrNumber attnum)
+{
+	HeapTuple	  tp;
+	Form_pg_attribute att_tup;
+
+	tp = SearchSysCache2(ATTNUM,
+			ObjectIdGetDatum(relid),
+			Int16GetDatum(attnum));
+
+	if (HeapTupleIsValid(tp))
+	{
+		bool result;
+
+		att_tup = (Form_pg_attribute) GETSTRUCT(tp);
+		result = att_tup->attnotnull;
+		ReleaseSysCache(tp);
+
+		return result;
+	}
+	/* Assume att is nullable if no valid heap tuple is found */
+	return false;
+}
+
 /*				---------- COLLATION CACHE ----------					 */
 
 /*
diff --git a/src/include/optimizer/clauses.h b/src/include/optimizer/clauses.h
index 5e10fb1d50..8badd954fa 100644
--- a/src/include/optimizer/clauses.h
+++ b/src/include/optimizer/clauses.h
@@ -45,6 +45,9 @@ extern List *find_nonnullable_vars(Node *clause);
 extern List *find_forced_null_vars(Node *clause);
 extern Var *find_forced_null_var(Node *clause);
 
+extern bool is_node_nonnullable(Node * node, Query *parse);
+extern bool list_hasnonnullable(List * list, Query *parse);
+
 extern bool is_pseudo_constant_clause(Node *clause);
 extern bool is_pseudo_constant_clause_relids(Node *clause, Relids relids);
 
diff --git a/src/include/optimizer/prep.h b/src/include/optimizer/prep.h
index a9b2c9026c..ec8af5abf4 100644
--- a/src/include/optimizer/prep.h
+++ b/src/include/optimizer/prep.h
@@ -26,7 +26,7 @@ extern void pull_up_sublinks(PlannerInfo *root);
 extern void inline_set_returning_functions(PlannerInfo *root);
 extern void pull_up_subqueries(PlannerInfo *root);
 extern void flatten_simple_union_all(PlannerInfo *root);
-extern void reduce_outer_joins(PlannerInfo *root);
+extern void reduce_outer_joins(Query *parse);
 extern void remove_useless_result_rtes(PlannerInfo *root);
 extern Relids get_relids_in_jointree(Node *jtnode, bool include_joins);
 extern Relids get_relids_for_join(Query *query, int joinrelid);
diff --git a/src/include/optimizer/subselect.h b/src/include/optimizer/subselect.h
index 2d2c3bcbc0..344cdac41b 100644
--- a/src/include/optimizer/subselect.h
+++ b/src/include/optimizer/subselect.h
@@ -19,6 +19,8 @@
 extern void SS_process_ctes(PlannerInfo *root);
 extern JoinExpr *convert_ANY_sublink_to_join(PlannerInfo *root,
 							SubLink *sublink,
+							bool under_not,
+							Node **pullout,
 							Relids available_rels);
 extern JoinExpr *convert_EXISTS_sublink_to_join(PlannerInfo *root,
 							   SubLink *sublink,
diff --git a/src/include/utils/lsyscache.h b/src/include/utils/lsyscache.h
index 16b0b1d2dc..71370d8877 100644
--- a/src/include/utils/lsyscache.h
+++ b/src/include/utils/lsyscache.h
@@ -89,6 +89,7 @@ extern AttrNumber get_attnum(Oid relid, const char *attname);
 extern Oid	get_atttype(Oid relid, AttrNumber attnum);
 extern void get_atttypetypmodcoll(Oid relid, AttrNumber attnum,
 					  Oid *typid, int32 *typmod, Oid *collid);
+extern bool get_attnotnull(Oid relid, AttrNumber attnum);
 extern char *get_collation_name(Oid colloid);
 extern char *get_constraint_name(Oid conoid);
 extern char *get_language_name(Oid langoid, bool missing_ok);
diff --git a/src/test/regress/expected/subselect.out b/src/test/regress/expected/subselect.out
index fe5fc64480..807f27d4a1 100644
--- a/src/test/regress/expected/subselect.out
+++ b/src/test/regress/expected/subselect.out
@@ -1323,3 +1323,2208 @@ select * from x for update;
    Output: subselect_tbl.f1, subselect_tbl.f2, subselect_tbl.f3
 (2 rows)
 
+-- test NON IN to ANTI JOIN conversion
+CREATE TABLE s (u INTEGER NOT NULL, n INTEGER NULL, nn INTEGER NOT NULL, p VARCHAR(128) NULL);
+insert into s (u, n, nn, p)
+    select
+    generate_series(1,3) as u,
+	generate_series(1,3) as n,
+	generate_series(1,3) as nn,
+	'foo' as p;
+insert into s values(1000002, 1000002, 1000002, 'foofoo');
+UPDATE s set n = NULL WHERE n = 3;
+analyze s;
+CREATE TABLE l (u INTEGER NOT NULL, n INTEGER NULL, nn INTEGER NOT NULL, p VARCHAR(128) NULL);
+insert into l (u, n, nn, p)
+	select
+    generate_series(1,10000 ) as u,
+	generate_series(1,10000 ) as n,
+	generate_series(1,10000 ) as nn,
+	'bar' as p;
+UPDATE l set n = NULL WHERE n = 7;
+CREATE UNIQUE INDEX l_u ON l (u);
+CREATE INDEX l_n ON l (n);
+CREATE INDEX l_nn ON l (nn);
+analyze l;
+CREATE TABLE s1 (u INTEGER NOT NULL, n INTEGER NULL, n1 INTEGER NULL, nn INTEGER NOT NULL, p VARCHAR(128) NULL);
+insert into s1 (u, n, n1, nn, p)
+    select
+    generate_series(1,3) as u,
+	generate_series(1,3) as n,
+	generate_series(1,3) as n1,
+	generate_series(1,3) as nn,
+	'foo' as p;
+insert into s1 values(1000003, 1000003, 1000003, 1000003, 'foofoo');
+insert into s1 values(1003, 1003, 1003, 1003, 'foofoo');
+UPDATE s1 set n = NULL WHERE n = 3;
+UPDATE s1 set n1 = NULL WHERE n = 2;
+UPDATE s1 set n1 = NULL WHERE n1 = 3;
+analyze s1;
+CREATE TABLE empty (u INTEGER NOT NULL, n INTEGER NULL, nn INTEGER NOT NULL, p VARCHAR(128) NULL);
+analyze empty;
+-- set work_mem to 64KB so that NOT IN to ANTI JOIN optimization will kick in
+set work_mem = 64;
+-- correctness test 1: inner empty, return every thing from outer including NULL
+explain (costs false) select * from s where n not in (select n from empty);
+             QUERY PLAN             
+------------------------------------
+ Seq Scan on s
+   Filter: (NOT (hashed SubPlan 1))
+   SubPlan 1
+     ->  Seq Scan on empty
+(4 rows)
+
+select * from s where n not in (select n from empty);
+    u    |    n    |   nn    |   p    
+---------+---------+---------+--------
+       1 |       1 |       1 | foo
+       2 |       2 |       2 | foo
+ 1000002 | 1000002 | 1000002 | foofoo
+       3 |         |       3 | foo
+(4 rows)
+
+-- correctness test 2: inner has NULL, return empty result
+explain (costs false) select * from s where n not in (select n from l);
+                    QUERY PLAN                    
+--------------------------------------------------
+ Nested Loop Anti Join
+   InitPlan 1 (returns $0)
+     ->  Seq Scan on l l_1
+   ->  Seq Scan on s
+         Filter: ((n IS NOT NULL) OR (NOT $0))
+   ->  Bitmap Heap Scan on l
+         Recheck Cond: ((s.n = n) OR (n IS NULL))
+         ->  BitmapOr
+               ->  Bitmap Index Scan on l_n
+                     Index Cond: (n = s.n)
+               ->  Bitmap Index Scan on l_n
+                     Index Cond: (n IS NULL)
+(12 rows)
+
+select * from s where n not in (select n from l);
+ u | n | nn | p 
+---+---+----+---
+(0 rows)
+
+-- correctness test 3: inner non-null, result has no NULL
+explain (costs false) select * from s where n not in (select u from l);
+                  QUERY PLAN                   
+-----------------------------------------------
+ Nested Loop Anti Join
+   InitPlan 1 (returns $0)
+     ->  Seq Scan on l l_1
+   ->  Seq Scan on s
+         Filter: ((n IS NOT NULL) OR (NOT $0))
+   ->  Index Only Scan using l_u on l
+         Index Cond: (u = s.n)
+(7 rows)
+
+select * from s where n not in (select u from l);
+    u    |    n    |   nn    |   p    
+---------+---------+---------+--------
+ 1000002 | 1000002 | 1000002 | foofoo
+(1 row)
+
+-- correctness test 4: inner has predicate
+explain (costs false) select * from s where n not in (select n from l where u > 7);
+                    QUERY PLAN                    
+--------------------------------------------------
+ Nested Loop Anti Join
+   InitPlan 1 (returns $0)
+     ->  Seq Scan on l l_1
+           Filter: (u > 7)
+   ->  Seq Scan on s
+         Filter: ((n IS NOT NULL) OR (NOT $0))
+   ->  Bitmap Heap Scan on l
+         Recheck Cond: ((s.n = n) OR (n IS NULL))
+         Filter: (u > 7)
+         ->  BitmapOr
+               ->  Bitmap Index Scan on l_n
+                     Index Cond: (n = s.n)
+               ->  Bitmap Index Scan on l_n
+                     Index Cond: (n IS NULL)
+(14 rows)
+
+select * from s where n not in (select n from l where u > 7);
+    u    |    n    |   nn    |   p    
+---------+---------+---------+--------
+       1 |       1 |       1 | foo
+       2 |       2 |       2 | foo
+ 1000002 | 1000002 | 1000002 | foofoo
+(3 rows)
+
+-- correctness test 5: multi-expression, (2, 2, null, 2, foo) should be in the result
+explain (costs false) select * from s1 where (n,n1) not in (select u,nn from l where u >= 3);
+                           QUERY PLAN                            
+-----------------------------------------------------------------
+ Nested Loop Anti Join
+   Join Filter: (((s1.n = l.u) AND (s1.n1 = l.nn)) IS NOT FALSE)
+   ->  Seq Scan on s1
+   ->  Materialize
+         ->  Seq Scan on l
+               Filter: (u >= 3)
+(6 rows)
+
+select * from s1 where (n,n1) not in (select u,nn from l where u >= 3);
+    u    |    n    |   n1    |   nn    |   p    
+---------+---------+---------+---------+--------
+       1 |       1 |       1 |       1 | foo
+ 1000003 | 1000003 | 1000003 | 1000003 | foofoo
+       2 |       2 |         |       2 | foo
+(3 rows)
+
+-- correctness test 6: multi-expression, (3, null, null, 3, foo) should not be in the result
+explain (costs false) select * from s1 where (n,n1) not in (select u,nn from l where u > 0);
+                           QUERY PLAN                            
+-----------------------------------------------------------------
+ Nested Loop Anti Join
+   Join Filter: (((s1.n = l.u) AND (s1.n1 = l.nn)) IS NOT FALSE)
+   ->  Seq Scan on s1
+   ->  Materialize
+         ->  Seq Scan on l
+               Filter: (u > 0)
+(6 rows)
+
+select * from s1 where (n,n1) not in (select u,nn from l where u > 0);
+    u    |    n    |   n1    |   nn    |   p    
+---------+---------+---------+---------+--------
+ 1000003 | 1000003 | 1000003 | 1000003 | foofoo
+(1 row)
+
+-- correctness test 6: multi-expression, (3, null, null, 3, foo) should be in the result
+explain (costs false) select * from s1 where (n,n1) not in (select u,nn from l where u < 0);
+             QUERY PLAN             
+------------------------------------
+ Seq Scan on s1
+   Filter: (NOT (hashed SubPlan 1))
+   SubPlan 1
+     ->  Index Scan using l_u on l
+           Index Cond: (u < 0)
+(5 rows)
+
+select * from s1 where (n,n1) not in (select u,nn from l where u < 0);
+    u    |    n    |   n1    |   nn    |   p    
+---------+---------+---------+---------+--------
+       1 |       1 |       1 |       1 | foo
+ 1000003 | 1000003 | 1000003 | 1000003 | foofoo
+    1003 |    1003 |    1003 |    1003 | foofoo
+       2 |       2 |         |       2 | foo
+       3 |         |         |       3 | foo
+(5 rows)
+
+-- test using hashed subplan when inner fits in work_mem
+explain (costs false) select * from l where n not in (select n from s);
+             QUERY PLAN             
+------------------------------------
+ Seq Scan on l
+   Filter: (NOT (hashed SubPlan 1))
+   SubPlan 1
+     ->  Seq Scan on s
+(4 rows)
+
+select * from l where n not in (select n from s);
+ u | n | nn | p 
+---+---+----+---
+(0 rows)
+
+-- test single expression
+explain (costs false) select * from s where n not in (select n from l);
+                    QUERY PLAN                    
+--------------------------------------------------
+ Nested Loop Anti Join
+   InitPlan 1 (returns $0)
+     ->  Seq Scan on l l_1
+   ->  Seq Scan on s
+         Filter: ((n IS NOT NULL) OR (NOT $0))
+   ->  Bitmap Heap Scan on l
+         Recheck Cond: ((s.n = n) OR (n IS NULL))
+         ->  BitmapOr
+               ->  Bitmap Index Scan on l_n
+                     Index Cond: (n = s.n)
+               ->  Bitmap Index Scan on l_n
+                     Index Cond: (n IS NULL)
+(12 rows)
+
+select * from s where n not in (select n from l);
+ u | n | nn | p 
+---+---+----+---
+(0 rows)
+
+explain (costs false) select * from s where u not in (select u from l);
+              QUERY PLAN              
+--------------------------------------
+ Nested Loop Anti Join
+   ->  Seq Scan on s
+   ->  Index Only Scan using l_u on l
+         Index Cond: (u = s.u)
+(4 rows)
+
+select * from s where u not in (select u from l);
+    u    |    n    |   nn    |   p    
+---------+---------+---------+--------
+ 1000002 | 1000002 | 1000002 | foofoo
+(1 row)
+
+explain (costs false) select * from s where 3*n not in (select n from l);
+                       QUERY PLAN                       
+--------------------------------------------------------
+ Nested Loop Anti Join
+   InitPlan 1 (returns $0)
+     ->  Seq Scan on l l_1
+   ->  Seq Scan on s
+         Filter: (((3 * n) IS NOT NULL) OR (NOT $0))
+   ->  Bitmap Heap Scan on l
+         Recheck Cond: (((3 * s.n) = n) OR (n IS NULL))
+         ->  BitmapOr
+               ->  Bitmap Index Scan on l_n
+                     Index Cond: (n = (3 * s.n))
+               ->  Bitmap Index Scan on l_n
+                     Index Cond: (n IS NULL)
+(12 rows)
+
+select * from s where 3*n not in (select n from l);
+ u | n | nn | p 
+---+---+----+---
+(0 rows)
+
+explain (costs false) select * from s where n not in (select 3*n from l);
+                        QUERY PLAN                         
+-----------------------------------------------------------
+ Nested Loop Anti Join
+   Join Filter: ((s.n = (3 * l.n)) OR ((3 * l.n) IS NULL))
+   InitPlan 1 (returns $0)
+     ->  Seq Scan on l l_1
+   ->  Seq Scan on s
+         Filter: ((n IS NOT NULL) OR (NOT $0))
+   ->  Materialize
+         ->  Seq Scan on l
+(8 rows)
+
+select * from s where n not in (select 3*n from l);
+ u | n | nn | p 
+---+---+----+---
+(0 rows)
+
+-- test single expression with predicates
+explain (costs false) select * from s where n not in (select n from l where u > 0);
+                    QUERY PLAN                    
+--------------------------------------------------
+ Nested Loop Anti Join
+   InitPlan 1 (returns $0)
+     ->  Seq Scan on l l_1
+           Filter: (u > 0)
+   ->  Seq Scan on s
+         Filter: ((n IS NOT NULL) OR (NOT $0))
+   ->  Bitmap Heap Scan on l
+         Recheck Cond: ((s.n = n) OR (n IS NULL))
+         Filter: (u > 0)
+         ->  BitmapOr
+               ->  Bitmap Index Scan on l_n
+                     Index Cond: (n = s.n)
+               ->  Bitmap Index Scan on l_n
+                     Index Cond: (n IS NULL)
+(14 rows)
+
+select * from s where n not in (select n from l where u > 0);
+ u | n | nn | p 
+---+---+----+---
+(0 rows)
+
+explain (costs false) select * from s where n not in (select n from l where u > 100);
+                    QUERY PLAN                    
+--------------------------------------------------
+ Nested Loop Anti Join
+   InitPlan 1 (returns $0)
+     ->  Seq Scan on l l_1
+           Filter: (u > 100)
+   ->  Seq Scan on s
+         Filter: ((n IS NOT NULL) OR (NOT $0))
+   ->  Bitmap Heap Scan on l
+         Recheck Cond: ((s.n = n) OR (n IS NULL))
+         Filter: (u > 100)
+         ->  BitmapOr
+               ->  Bitmap Index Scan on l_n
+                     Index Cond: (n = s.n)
+               ->  Bitmap Index Scan on l_n
+                     Index Cond: (n IS NULL)
+(14 rows)
+
+select * from s where n not in (select n from l where u > 100);
+    u    |    n    |   nn    |   p    
+---------+---------+---------+--------
+       1 |       1 |       1 | foo
+       2 |       2 |       2 | foo
+ 1000002 | 1000002 | 1000002 | foofoo
+(3 rows)
+
+-- test multi expression
+explain (costs false) select * from s where (n,u) not in (select n,u from l);
+                         QUERY PLAN                          
+-------------------------------------------------------------
+ Nested Loop Anti Join
+   ->  Seq Scan on s
+   ->  Index Scan using l_u on l
+         Index Cond: (u = s.u)
+         Filter: ((s.n = n) OR (n IS NULL) OR (s.n IS NULL))
+(5 rows)
+
+select * from s where (n,u) not in (select n,u from l);
+    u    |    n    |   nn    |   p    
+---------+---------+---------+--------
+ 1000002 | 1000002 | 1000002 | foofoo
+(1 row)
+
+explain (costs false) select * from s where (u, nn) not in (select u, nn from l);
+            QUERY PLAN            
+----------------------------------
+ Nested Loop Anti Join
+   ->  Seq Scan on s
+   ->  Index Scan using l_nn on l
+         Index Cond: (nn = s.nn)
+         Filter: (s.u = u)
+(5 rows)
+
+select * from s where (u, nn) not in (select u, nn from l);
+    u    |    n    |   nn    |   p    
+---------+---------+---------+--------
+ 1000002 | 1000002 | 1000002 | foofoo
+(1 row)
+
+explain (costs false) select * from s where (n,u) not in (select u,n from l);
+                    QUERY PLAN                    
+--------------------------------------------------
+ Nested Loop Anti Join
+   ->  Seq Scan on s
+   ->  Bitmap Heap Scan on l
+         Recheck Cond: ((s.u = n) OR (n IS NULL))
+         Filter: ((s.n = u) OR (s.n IS NULL))
+         ->  BitmapOr
+               ->  Bitmap Index Scan on l_n
+                     Index Cond: (n = s.u)
+               ->  Bitmap Index Scan on l_n
+                     Index Cond: (n IS NULL)
+(10 rows)
+
+select * from s where (n,u) not in (select u,n from l);
+    u    |    n    |   nn    |   p    
+---------+---------+---------+--------
+ 1000002 | 1000002 | 1000002 | foofoo
+(1 row)
+
+explain (costs false) select * from s where (n,u,nn) not in (select u,n,nn from l);
+                                  QUERY PLAN                                   
+-------------------------------------------------------------------------------
+ Nested Loop Anti Join
+   ->  Seq Scan on s
+   ->  Index Scan using l_nn on l
+         Index Cond: (nn = s.nn)
+         Filter: (((s.n = u) OR (s.n IS NULL)) AND ((s.u = n) OR (n IS NULL)))
+(5 rows)
+
+select * from s where (n,u,nn) not in (select u,n,nn from l);
+    u    |    n    |   nn    |   p    
+---------+---------+---------+--------
+ 1000002 | 1000002 | 1000002 | foofoo
+(1 row)
+
+explain (costs false) select * from s where (n,u,nn) not in (select u,n,nn from l where u > 1000);
+                                          QUERY PLAN                                          
+----------------------------------------------------------------------------------------------
+ Nested Loop Anti Join
+   ->  Seq Scan on s
+   ->  Index Scan using l_nn on l
+         Index Cond: (nn = s.nn)
+         Filter: ((u > 1000) AND ((s.n = u) OR (s.n IS NULL)) AND ((s.u = n) OR (n IS NULL)))
+(5 rows)
+
+select * from s where (n,u,nn) not in (select u,n,nn from l where u > 1000);
+    u    |    n    |   nn    |   p    
+---------+---------+---------+--------
+       1 |       1 |       1 | foo
+       2 |       2 |       2 | foo
+ 1000002 | 1000002 | 1000002 | foofoo
+       3 |         |       3 | foo
+(4 rows)
+
+explain (costs false) select * from s where (n,u,nn) not in (select u,n,nn from l where u > 0);
+                                        QUERY PLAN                                         
+-------------------------------------------------------------------------------------------
+ Nested Loop Anti Join
+   ->  Seq Scan on s
+   ->  Index Scan using l_nn on l
+         Index Cond: (nn = s.nn)
+         Filter: ((u > 0) AND ((s.n = u) OR (s.n IS NULL)) AND ((s.u = n) OR (n IS NULL)))
+(5 rows)
+
+select * from s where (n,u,nn) not in (select u,n,nn from l where u > 0);
+    u    |    n    |   nn    |   p    
+---------+---------+---------+--------
+ 1000002 | 1000002 | 1000002 | foofoo
+(1 row)
+
+explain (costs false) select * from s where (n,u,nn) not in (select u,n,nn from l where u > 1);
+                                        QUERY PLAN                                         
+-------------------------------------------------------------------------------------------
+ Nested Loop Anti Join
+   ->  Seq Scan on s
+   ->  Index Scan using l_nn on l
+         Index Cond: (nn = s.nn)
+         Filter: ((u > 1) AND ((s.n = u) OR (s.n IS NULL)) AND ((s.u = n) OR (n IS NULL)))
+(5 rows)
+
+select * from s where (n,u,nn) not in (select u,n,nn from l where u > 1);
+    u    |    n    |   nn    |   p    
+---------+---------+---------+--------
+       1 |       1 |       1 | foo
+ 1000002 | 1000002 | 1000002 | foofoo
+(2 rows)
+
+-- test multi-table
+explain (costs false) select count(*) from s, l where s.n not in (select n from l);
+                          QUERY PLAN                          
+--------------------------------------------------------------
+ Aggregate
+   InitPlan 1 (returns $0)
+     ->  Seq Scan on l l_2
+   ->  Nested Loop
+         ->  Nested Loop Anti Join
+               ->  Seq Scan on s
+                     Filter: ((n IS NOT NULL) OR (NOT $0))
+               ->  Bitmap Heap Scan on l l_1
+                     Recheck Cond: ((s.n = n) OR (n IS NULL))
+                     ->  BitmapOr
+                           ->  Bitmap Index Scan on l_n
+                                 Index Cond: (n = s.n)
+                           ->  Bitmap Index Scan on l_n
+                                 Index Cond: (n IS NULL)
+         ->  Seq Scan on l
+(15 rows)
+
+select count(*) from s, l where s.n not in (select n from l);
+ count 
+-------
+     0
+(1 row)
+
+explain (costs false) select count(*) from s, l where s.nn not in (select nn from l);
+                      QUERY PLAN                       
+-------------------------------------------------------
+ Aggregate
+   ->  Nested Loop
+         ->  Nested Loop Anti Join
+               ->  Seq Scan on s
+               ->  Index Only Scan using l_nn on l l_1
+                     Index Cond: (nn = s.nn)
+         ->  Seq Scan on l
+(7 rows)
+
+select count(*) from s, l where s.nn not in (select nn from l);
+ count 
+-------
+ 10000
+(1 row)
+
+-- test null padded results from outer join
+explain (costs false) select * from s where n not in (select s.nn from l left join s on l.nn = s.nn);
+                     QUERY PLAN                      
+-----------------------------------------------------
+ Nested Loop Anti Join
+   Join Filter: ((s.n = s_1.nn) OR (s_1.nn IS NULL))
+   InitPlan 1 (returns $0)
+     ->  Nested Loop Left Join
+           Join Filter: (l_1.nn = s_2.nn)
+           ->  Seq Scan on l l_1
+           ->  Materialize
+                 ->  Seq Scan on s s_2
+   ->  Seq Scan on s
+         Filter: ((n IS NOT NULL) OR (NOT $0))
+   ->  Hash Left Join
+         Hash Cond: (l.nn = s_1.nn)
+         ->  Seq Scan on l
+         ->  Hash
+               ->  Seq Scan on s s_1
+(15 rows)
+
+select * from s where n not in (select s.nn from l left join s on l.nn = s.nn);
+ u | n | nn | p 
+---+---+----+---
+(0 rows)
+
+explain (costs false) select * from s where n not in (select s.nn from s right join l on s.nn = l.nn);
+                     QUERY PLAN                      
+-----------------------------------------------------
+ Nested Loop Anti Join
+   Join Filter: ((s.n = s_1.nn) OR (s_1.nn IS NULL))
+   InitPlan 1 (returns $0)
+     ->  Nested Loop Left Join
+           Join Filter: (s_2.nn = l_1.nn)
+           ->  Seq Scan on l l_1
+           ->  Materialize
+                 ->  Seq Scan on s s_2
+   ->  Seq Scan on s
+         Filter: ((n IS NOT NULL) OR (NOT $0))
+   ->  Hash Left Join
+         Hash Cond: (l.nn = s_1.nn)
+         ->  Seq Scan on l
+         ->  Hash
+               ->  Seq Scan on s s_1
+(15 rows)
+
+select * from s where n not in (select s.nn from s right join l on s.nn = l.nn);
+ u | n | nn | p 
+---+---+----+---
+(0 rows)
+
+explain (costs false) select count(*) from s right join l on s.nn = l.nn where l.nn not in (select nn from s);
+                   QUERY PLAN                   
+------------------------------------------------
+ Aggregate
+   ->  Hash Left Join
+         Hash Cond: (l.nn = s.nn)
+         ->  Seq Scan on l
+               Filter: (NOT (hashed SubPlan 1))
+               SubPlan 1
+                 ->  Seq Scan on s s_1
+         ->  Hash
+               ->  Seq Scan on s
+(9 rows)
+
+select count(*) from s right join l on s.nn = l.nn where l.nn not in (select nn from s);
+ count 
+-------
+  9997
+(1 row)
+
+explain (costs false) select count(*) from s right join l on s.nn = l.nn where s.nn not in (select nn from s);
+                QUERY PLAN                
+------------------------------------------
+ Aggregate
+   ->  Hash Left Join
+         Hash Cond: (l.nn = s.nn)
+         Filter: (NOT (hashed SubPlan 1))
+         ->  Seq Scan on l
+         ->  Hash
+               ->  Seq Scan on s
+         SubPlan 1
+           ->  Seq Scan on s s_1
+(9 rows)
+
+select count(*) from s right join l on s.nn = l.nn where s.nn not in (select nn from s);
+ count 
+-------
+     0
+(1 row)
+
+explain (costs false) select count(*) from s right join l on s.nn=l.nn where l.nn not in (select l.nn from l left join s on l.nn = s.nn);
+                       QUERY PLAN                       
+--------------------------------------------------------
+ Aggregate
+   ->  Nested Loop Left Join
+         Join Filter: (s.nn = l.nn)
+         ->  Hash Anti Join
+               Hash Cond: (l.nn = l_1.nn)
+               ->  Seq Scan on l
+               ->  Hash
+                     ->  Hash Left Join
+                           Hash Cond: (l_1.nn = s_1.nn)
+                           ->  Seq Scan on l l_1
+                           ->  Hash
+                                 ->  Seq Scan on s s_1
+         ->  Seq Scan on s
+(13 rows)
+
+select count(*) from s right join l on s.nn=l.nn where l.nn not in (select l.nn from l left join s on l.nn = s.nn);
+ count 
+-------
+     0
+(1 row)
+
+explain (costs false) select count(*) from s right join l on s.nn=l.nn where s.nn not in (select s.nn from l left join s on l.nn = s.nn);
+                         QUERY PLAN                         
+------------------------------------------------------------
+ Aggregate
+   InitPlan 1 (returns $0)
+     ->  Nested Loop Left Join
+           Join Filter: (l_2.nn = s_2.nn)
+           ->  Seq Scan on l l_2
+           ->  Materialize
+                 ->  Seq Scan on s s_2
+   ->  Nested Loop Anti Join
+         Join Filter: ((s.nn = s_1.nn) OR (s_1.nn IS NULL))
+         ->  Hash Left Join
+               Hash Cond: (l.nn = s.nn)
+               Filter: ((s.nn IS NOT NULL) OR (NOT $0))
+               ->  Seq Scan on l
+               ->  Hash
+                     ->  Seq Scan on s
+         ->  Materialize
+               ->  Hash Left Join
+                     Hash Cond: (l_1.nn = s_1.nn)
+                     ->  Seq Scan on l l_1
+                     ->  Hash
+                           ->  Seq Scan on s s_1
+(21 rows)
+
+select count(*) from s right join l on s.nn=l.nn where s.nn not in (select s.nn from l left join s on l.nn = s.nn);
+ count 
+-------
+     0
+(1 row)
+
+explain (costs false) select count(*) from s left join s1 on s.u=s1.u join l on s.u=l.u where s.nn not in (select nn from l);
+                         QUERY PLAN                          
+-------------------------------------------------------------
+ Aggregate
+   ->  Nested Loop
+         ->  Nested Loop Left Join
+               Join Filter: (s.u = s1.u)
+               ->  Nested Loop Anti Join
+                     ->  Seq Scan on s
+                     ->  Index Only Scan using l_nn on l l_1
+                           Index Cond: (nn = s.nn)
+               ->  Seq Scan on s1
+         ->  Index Only Scan using l_u on l
+               Index Cond: (u = s.u)
+(11 rows)
+
+select count(*) from s left join s1 on s.u=s1.u join l on s.u=l.u where s.nn not in (select nn from l);
+ count 
+-------
+     0
+(1 row)
+
+explain (costs false) select count(*) from s left join s1 on s.u=s1.u right join l on s.u=l.u where s.nn not in (select nn from l);
+                       QUERY PLAN                       
+--------------------------------------------------------
+ Aggregate
+   InitPlan 1 (returns $0)
+     ->  Seq Scan on l l_2
+   ->  Hash Anti Join
+         Hash Cond: (s.nn = l_1.nn)
+         ->  Hash Left Join
+               Hash Cond: (l.u = s.u)
+               Filter: ((s.nn IS NOT NULL) OR (NOT $0))
+               ->  Seq Scan on l
+               ->  Hash
+                     ->  Hash Right Join
+                           Hash Cond: (s1.u = s.u)
+                           ->  Seq Scan on s1
+                           ->  Hash
+                                 ->  Seq Scan on s
+         ->  Hash
+               ->  Seq Scan on l l_1
+(17 rows)
+
+select count(*) from s left join s1 on s.u=s1.u right join l on s.u=l.u where s.nn not in (select nn from l);
+ count 
+-------
+     0
+(1 row)
+
+explain (costs false) select count(*) from s left join s1 on s.u=s1.u left join l on s.u=l.u where s.nn not in (select nn from l);
+                    QUERY PLAN                     
+---------------------------------------------------
+ Aggregate
+   ->  Nested Loop Left Join
+         Join Filter: (s.u = s1.u)
+         ->  Nested Loop Anti Join
+               ->  Seq Scan on s
+               ->  Index Only Scan using l_nn on l
+                     Index Cond: (nn = s.nn)
+         ->  Seq Scan on s1
+(8 rows)
+
+select count(*) from s left join s1 on s.u=s1.u left join l on s.u=l.u where s.nn not in (select nn from l);
+ count 
+-------
+     1
+(1 row)
+
+explain (costs false) select count(*) from s right join s1 on s.u=s1.u join l on s.u=l.u where s.nn not in (select nn from l);
+                         QUERY PLAN                          
+-------------------------------------------------------------
+ Aggregate
+   ->  Nested Loop
+         ->  Nested Loop
+               Join Filter: (s.u = s1.u)
+               ->  Nested Loop Anti Join
+                     ->  Seq Scan on s
+                     ->  Index Only Scan using l_nn on l l_1
+                           Index Cond: (nn = s.nn)
+               ->  Seq Scan on s1
+         ->  Index Only Scan using l_u on l
+               Index Cond: (u = s.u)
+(11 rows)
+
+select count(*) from s right join s1 on s.u=s1.u join l on s.u=l.u where s.nn not in (select nn from l);
+ count 
+-------
+     0
+(1 row)
+
+explain (costs false) select count(*) from s join s1 on s.u=s1.u right join l on s.u=l.u where s.nn not in (select nn from l);
+                       QUERY PLAN                       
+--------------------------------------------------------
+ Aggregate
+   InitPlan 1 (returns $0)
+     ->  Seq Scan on l l_2
+   ->  Hash Anti Join
+         Hash Cond: (s.nn = l_1.nn)
+         ->  Hash Left Join
+               Hash Cond: (l.u = s.u)
+               Filter: ((s.nn IS NOT NULL) OR (NOT $0))
+               ->  Seq Scan on l
+               ->  Hash
+                     ->  Hash Join
+                           Hash Cond: (s1.u = s.u)
+                           ->  Seq Scan on s1
+                           ->  Hash
+                                 ->  Seq Scan on s
+         ->  Hash
+               ->  Seq Scan on l l_1
+(17 rows)
+
+select * from s join s1 on s.u=s1.u right join l on s.u=l.u where s.nn not in (select nn from l);
+ u | n | nn | p | u | n | n1 | nn | p | u | n | nn | p 
+---+---+----+---+---+---+----+----+---+---+---+----+---
+(0 rows)
+
+explain (costs false) select count(*) from s full join s1 on s.u=s1.u join l on s.u=l.u where s.nn not in (select nn from l);
+                         QUERY PLAN                          
+-------------------------------------------------------------
+ Aggregate
+   ->  Nested Loop
+         ->  Nested Loop Left Join
+               Join Filter: (s.u = s1.u)
+               ->  Nested Loop Anti Join
+                     ->  Seq Scan on s
+                     ->  Index Only Scan using l_nn on l l_1
+                           Index Cond: (nn = s.nn)
+               ->  Seq Scan on s1
+         ->  Index Only Scan using l_u on l
+               Index Cond: (u = s.u)
+(11 rows)
+
+select count(*) from s full join s1 on s.u=s1.u join l on s.u=l.u where s.nn not in (select nn from l);
+ count 
+-------
+     0
+(1 row)
+
+explain (costs false) select count(*) from s join s1 on s.u=s1.u full join l on s.u=l.u where s.nn not in (select nn from l);
+                       QUERY PLAN                       
+--------------------------------------------------------
+ Aggregate
+   InitPlan 1 (returns $0)
+     ->  Seq Scan on l l_2
+   ->  Hash Anti Join
+         Hash Cond: (s.nn = l_1.nn)
+         ->  Hash Full Join
+               Hash Cond: (l.u = s.u)
+               Filter: ((s.nn IS NOT NULL) OR (NOT $0))
+               ->  Seq Scan on l
+               ->  Hash
+                     ->  Hash Join
+                           Hash Cond: (s1.u = s.u)
+                           ->  Seq Scan on s1
+                           ->  Hash
+                                 ->  Seq Scan on s
+         ->  Hash
+               ->  Seq Scan on l l_1
+(17 rows)
+
+select count(*) from s join s1 on s.u=s1.u full join l on s.u=l.u where s.nn not in (select nn from l);
+ count 
+-------
+     0
+(1 row)
+
+explain (costs false) select * from s where s.nn not in (select l.nn from l left join s on l.nn=s.nn left join s1 on l.nn=s1.nn);
+                    QUERY PLAN                     
+---------------------------------------------------
+ Nested Loop Anti Join
+   ->  Seq Scan on s
+   ->  Nested Loop Left Join
+         Join Filter: (l.nn = s1.nn)
+         ->  Nested Loop Left Join
+               Join Filter: (l.nn = s_1.nn)
+               ->  Index Only Scan using l_nn on l
+                     Index Cond: (nn = s.nn)
+               ->  Seq Scan on s s_1
+         ->  Seq Scan on s1
+(10 rows)
+
+select * from s where s.nn not in (select l.nn from l left join s on l.nn=s.nn left join s1 on l.nn=s1.nn);
+    u    |    n    |   nn    |   p    
+---------+---------+---------+--------
+ 1000002 | 1000002 | 1000002 | foofoo
+(1 row)
+
+explain (costs false) select * from s where s.nn not in (select l.nn from l left join s on l.nn=s.nn right join s1 on l.nn=s1.nn);
+                     QUERY PLAN                      
+-----------------------------------------------------
+ Seq Scan on s
+   Filter: (NOT (hashed SubPlan 1))
+   SubPlan 1
+     ->  Nested Loop Left Join
+           Join Filter: (l.nn = s_1.nn)
+           ->  Nested Loop Left Join
+                 ->  Seq Scan on s1
+                 ->  Index Only Scan using l_nn on l
+                       Index Cond: (nn = s1.nn)
+           ->  Materialize
+                 ->  Seq Scan on s s_1
+(11 rows)
+
+select * from s where s.nn not in (select l.nn from l left join s on l.nn=s.nn right join s1 on l.nn=s1.nn);
+ u | n | nn | p 
+---+---+----+---
+(0 rows)
+
+explain (costs false) select * from s where (n,u,nn) not in (select l.n,l.u,l.nn from l left join s on l.nn = s.nn);
+                                    QUERY PLAN                                     
+-----------------------------------------------------------------------------------
+ Nested Loop Anti Join
+   ->  Seq Scan on s
+   ->  Nested Loop Left Join
+         Join Filter: (l.nn = s_1.nn)
+         ->  Index Scan using l_nn on l
+               Index Cond: (nn = s.nn)
+               Filter: (((s.n = n) OR (n IS NULL) OR (s.n IS NULL)) AND (s.u = u))
+         ->  Seq Scan on s s_1
+(8 rows)
+
+select * from s where (n,u,nn) not in (select l.n,l.u,l.nn from l left join s on l.nn = s.nn);
+    u    |    n    |   nn    |   p    
+---------+---------+---------+--------
+ 1000002 | 1000002 | 1000002 | foofoo
+(1 row)
+
+explain (costs false) select * from s where (n,u,nn) not in (select l.n,l.u,l.nn from l right join s on l.nn = s.nn);
+                QUERY PLAN                 
+-------------------------------------------
+ Seq Scan on s
+   Filter: (NOT (hashed SubPlan 1))
+   SubPlan 1
+     ->  Nested Loop Left Join
+           ->  Seq Scan on s s_1
+           ->  Index Scan using l_nn on l
+                 Index Cond: (nn = s_1.nn)
+(7 rows)
+
+select * from s where (n,u,nn) not in (select l.n,l.u,l.nn from l left join s on l.nn = s.nn);
+    u    |    n    |   nn    |   p    
+---------+---------+---------+--------
+ 1000002 | 1000002 | 1000002 | foofoo
+(1 row)
+
+--test reduce outer joins from outer query
+explain (costs false) select count(*) from s right join l on s.nn = l.nn where s.nn not in (select nn from l);
+                       QUERY PLAN                       
+--------------------------------------------------------
+ Aggregate
+   InitPlan 1 (returns $0)
+     ->  Seq Scan on l l_2
+   ->  Hash Anti Join
+         Hash Cond: (s.nn = l_1.nn)
+         ->  Hash Left Join
+               Hash Cond: (l.nn = s.nn)
+               Filter: ((s.nn IS NOT NULL) OR (NOT $0))
+               ->  Seq Scan on l
+               ->  Hash
+                     ->  Seq Scan on s
+         ->  Hash
+               ->  Seq Scan on l l_1
+(13 rows)
+
+select count(*) from s right join l on s.nn = l.nn where s.nn not in (select nn from l);
+ count 
+-------
+     0
+(1 row)
+
+explain (costs false) select count(*) from s right join l on s.nn = l.nn where s.nn not in (select nn from l) and s.u>0;
+                      QUERY PLAN                       
+-------------------------------------------------------
+ Aggregate
+   ->  Nested Loop
+         ->  Nested Loop Anti Join
+               ->  Seq Scan on s
+                     Filter: (u > 0)
+               ->  Index Only Scan using l_nn on l l_1
+                     Index Cond: (nn = s.nn)
+         ->  Index Only Scan using l_nn on l
+               Index Cond: (nn = s.nn)
+(9 rows)
+
+select count(*) from s right join l on s.nn = l.nn where s.nn not in (select nn from l) and s.u>0;
+ count 
+-------
+     0
+(1 row)
+
+explain (costs false) select count(*) from s right join l on s.nn = l.nn join s1 on s.u = s1.u where s.nn not in (select nn from l);
+                         QUERY PLAN                          
+-------------------------------------------------------------
+ Aggregate
+   ->  Nested Loop
+         Join Filter: (s.u = s1.u)
+         ->  Nested Loop
+               ->  Nested Loop Anti Join
+                     ->  Seq Scan on s
+                     ->  Index Only Scan using l_nn on l l_1
+                           Index Cond: (nn = s.nn)
+               ->  Index Only Scan using l_nn on l
+                     Index Cond: (nn = s.nn)
+         ->  Seq Scan on s1
+(11 rows)
+
+select count(*) from s right join l on s.nn = l.nn join s1 on s.u = s1.u where s.nn not in (select nn from l);
+ count 
+-------
+     0
+(1 row)
+
+explain (costs false) select count(*) from s right join l on s.nn = l.nn right join s1 on s.u = s1.u where s.nn not in (select nn from l);
+                          QUERY PLAN                           
+---------------------------------------------------------------
+ Aggregate
+   InitPlan 1 (returns $0)
+     ->  Seq Scan on l l_2
+   ->  Nested Loop Anti Join
+         ->  Nested Loop Left Join
+               Join Filter: (s.u = s1.u)
+               Filter: ((s.nn IS NOT NULL) OR (NOT $0))
+               ->  Seq Scan on s1
+               ->  Materialize
+                     ->  Nested Loop
+                           ->  Seq Scan on s
+                           ->  Index Only Scan using l_nn on l
+                                 Index Cond: (nn = s.nn)
+         ->  Index Only Scan using l_nn on l l_1
+               Index Cond: (nn = s.nn)
+(15 rows)
+
+select count(*) from s right join l on s.nn = l.nn right join s1 on s.u = s1.u where s.nn not in (select nn from l);
+ count 
+-------
+     0
+(1 row)
+
+explain (costs false) select count(*) from s right join l on s.nn = l.nn left join s1 on s.u = s1.u where s.nn not in (select nn from l);
+                          QUERY PLAN                          
+--------------------------------------------------------------
+ Aggregate
+   InitPlan 1 (returns $0)
+     ->  Seq Scan on l l_2
+   ->  Nested Loop Left Join
+         Join Filter: (s.u = s1.u)
+         ->  Hash Anti Join
+               Hash Cond: (s.nn = l_1.nn)
+               ->  Hash Left Join
+                     Hash Cond: (l.nn = s.nn)
+                     Filter: ((s.nn IS NOT NULL) OR (NOT $0))
+                     ->  Seq Scan on l
+                     ->  Hash
+                           ->  Seq Scan on s
+               ->  Hash
+                     ->  Seq Scan on l l_1
+         ->  Seq Scan on s1
+(16 rows)
+
+select count(*) from s right join l on s.nn = l.nn left join s1 on s.u = s1.u where s.nn not in (select nn from l);
+ count 
+-------
+     0
+(1 row)
+
+--test reduce outer joins from subquery
+explain (costs false) select * from s where nn not in (select l.nn from l right join s on l.nn = s.nn);
+                  QUERY PLAN                   
+-----------------------------------------------
+ Seq Scan on s
+   Filter: (NOT (hashed SubPlan 1))
+   SubPlan 1
+     ->  Nested Loop Left Join
+           ->  Seq Scan on s s_1
+           ->  Index Only Scan using l_nn on l
+                 Index Cond: (nn = s_1.nn)
+(7 rows)
+
+select * from s where nn not in (select l.nn from l right join s on l.nn = s.nn);
+ u | n | nn | p 
+---+---+----+---
+(0 rows)
+
+explain (costs false) select * from s where nn not in (select l.nn from l right join s on l.nn = s.nn where l.u > 9);
+                QUERY PLAN                 
+-------------------------------------------
+ Seq Scan on s
+   Filter: (NOT (hashed SubPlan 1))
+   SubPlan 1
+     ->  Nested Loop
+           ->  Seq Scan on s s_1
+           ->  Index Scan using l_nn on l
+                 Index Cond: (nn = s_1.nn)
+                 Filter: (u > 9)
+(8 rows)
+
+select * from s where nn not in (select l.nn from l right join s on l.nn = s.nn where l.u > 9);
+    u    |    n    |   nn    |   p    
+---------+---------+---------+--------
+       1 |       1 |       1 | foo
+       2 |       2 |       2 | foo
+ 1000002 | 1000002 | 1000002 | foofoo
+       3 |         |       3 | foo
+(4 rows)
+
+explain (costs false) select * from s where nn not in (select l.nn from l right join s on l.nn = s.nn where s.u > 9);
+                  QUERY PLAN                   
+-----------------------------------------------
+ Seq Scan on s
+   Filter: (NOT (hashed SubPlan 1))
+   SubPlan 1
+     ->  Nested Loop Left Join
+           ->  Seq Scan on s s_1
+                 Filter: (u > 9)
+           ->  Index Only Scan using l_nn on l
+                 Index Cond: (nn = s_1.nn)
+(8 rows)
+
+select * from s where nn not in (select l.nn from l right join s on l.nn = s.nn where s.u > 9);
+ u | n | nn | p 
+---+---+----+---
+(0 rows)
+
+explain (costs false) select * from s where nn not in (select l.nn from l right join s on l.nn = s.nn join s1 on l.n = s1.n);
+                      QUERY PLAN                       
+-------------------------------------------------------
+ Seq Scan on s
+   Filter: (NOT (hashed SubPlan 1))
+   SubPlan 1
+     ->  Nested Loop
+           Join Filter: (l.n = s1.n)
+           ->  Seq Scan on s1
+           ->  Materialize
+                 ->  Nested Loop
+                       ->  Seq Scan on s s_1
+                       ->  Index Scan using l_nn on l
+                             Index Cond: (nn = s_1.nn)
+(11 rows)
+
+select * from s where nn not in (select l.nn from l right join s on l.nn = s.nn join s1 on l.n = s1.n);
+    u    |    n    |   nn    |   p    
+---------+---------+---------+--------
+ 1000002 | 1000002 | 1000002 | foofoo
+       3 |         |       3 | foo
+(2 rows)
+
+explain (costs false) select * from s where nn not in (select l.nn from l right join s on l.nn = s.nn right join s1 on l.n = s1.n);
+                      QUERY PLAN                       
+-------------------------------------------------------
+ Seq Scan on s
+   Filter: (NOT (hashed SubPlan 1))
+   SubPlan 1
+     ->  Nested Loop Left Join
+           Join Filter: (l.n = s1.n)
+           ->  Seq Scan on s1
+           ->  Materialize
+                 ->  Nested Loop
+                       ->  Seq Scan on s s_1
+                       ->  Index Scan using l_nn on l
+                             Index Cond: (nn = s_1.nn)
+(11 rows)
+
+select * from s where nn not in (select l.nn from l right join s on l.nn = s.nn right join s1 on l.n = s1.n);
+ u | n | nn | p 
+---+---+----+---
+(0 rows)
+
+explain (costs false) select * from s where nn not in (select l.nn from l right join s on l.nn = s.nn left join s1 on l.n = s1.n);
+                   QUERY PLAN                    
+-------------------------------------------------
+ Seq Scan on s
+   Filter: (NOT (hashed SubPlan 1))
+   SubPlan 1
+     ->  Nested Loop Left Join
+           Join Filter: (l.n = s1.n)
+           ->  Nested Loop Left Join
+                 ->  Seq Scan on s s_1
+                 ->  Index Scan using l_nn on l
+                       Index Cond: (nn = s_1.nn)
+           ->  Materialize
+                 ->  Seq Scan on s1
+(11 rows)
+
+select * from s where nn not in (select l.nn from l right join s on l.nn = s.nn left join s1 on l.n = s1.n);
+ u | n | nn | p 
+---+---+----+---
+(0 rows)
+
+--test reduce outer join on outer and sub-query
+explain (costs false) select count(*) from s right join l on s.nn = l.nn join s1 on s.u = s1.u where s.nn not in (select l.nn from l right join s on l.nn = s.nn join s1 on l.n = s1.n);
+                                    QUERY PLAN                                    
+----------------------------------------------------------------------------------
+ Aggregate
+   ->  Nested Loop
+         Join Filter: (s.u = s1.u)
+         ->  Seq Scan on s1
+         ->  Materialize
+               ->  Nested Loop
+                     ->  Seq Scan on s
+                           Filter: (NOT (hashed SubPlan 1))
+                           SubPlan 1
+                             ->  Nested Loop
+                                   Join Filter: (l_1.n = s1_1.n)
+                                   ->  Seq Scan on s1 s1_1
+                                   ->  Materialize
+                                         ->  Nested Loop
+                                               ->  Seq Scan on s s_1
+                                               ->  Index Scan using l_nn on l l_1
+                                                     Index Cond: (nn = s_1.nn)
+                     ->  Index Only Scan using l_nn on l
+                           Index Cond: (nn = s.nn)
+(19 rows)
+
+select count(*) from s right join l on s.nn = l.nn join s1 on s.u = s1.u where s.nn not in (select l.nn from l right join s on l.nn = s.nn join s1 on l.n = s1.n);
+ count 
+-------
+     1
+(1 row)
+
+explain (costs false) select count(*) from s right join l on s.nn = l.nn left join s1 on s.u = s1.u where s.nn not in (select l.nn from l right join s on l.nn = s.nn left join s1 on l.n = s1.n);
+                        QUERY PLAN                        
+----------------------------------------------------------
+ Aggregate
+   ->  Hash Left Join
+         Hash Cond: (l.nn = s.nn)
+         Filter: (NOT (hashed SubPlan 1))
+         ->  Seq Scan on l
+         ->  Hash
+               ->  Hash Right Join
+                     Hash Cond: (s1.u = s.u)
+                     ->  Seq Scan on s1
+                     ->  Hash
+                           ->  Seq Scan on s
+         SubPlan 1
+           ->  Nested Loop Left Join
+                 Join Filter: (l_1.n = s1_1.n)
+                 ->  Nested Loop Left Join
+                       ->  Seq Scan on s s_1
+                       ->  Index Scan using l_nn on l l_1
+                             Index Cond: (nn = s_1.nn)
+                 ->  Materialize
+                       ->  Seq Scan on s1 s1_1
+(20 rows)
+
+select count(*) from s right join l on s.nn = l.nn left join s1 on s.u = s1.u where s.nn not in (select l.nn from l right join s on l.nn = s.nn left join s1 on l.n = s1.n);
+ count 
+-------
+     0
+(1 row)
+
+-- test union all
+explain (costs false) select * from s as t where not exists
+(select 1 from (select n as y from l union all
+				select u as y from s union all
+				select nn as y from s) as v where t.n=v.y or v.y is null) and n is not null;
+                       QUERY PLAN                       
+--------------------------------------------------------
+ Nested Loop Anti Join
+   ->  Seq Scan on s t
+         Filter: (n IS NOT NULL)
+   ->  Append
+         ->  Bitmap Heap Scan on l
+               Recheck Cond: ((t.n = n) OR (n IS NULL))
+               ->  BitmapOr
+                     ->  Bitmap Index Scan on l_n
+                           Index Cond: (n = t.n)
+                     ->  Bitmap Index Scan on l_n
+                           Index Cond: (n IS NULL)
+         ->  Seq Scan on s
+               Filter: ((t.n = u) OR (u IS NULL))
+         ->  Seq Scan on s s_1
+               Filter: ((t.n = nn) OR (nn IS NULL))
+(15 rows)
+
+select * from s as t where not exists
+(select 1 from (select n as y from l union all
+				select u as y from s union all
+				select nn as y from s) as v where t.n=v.y or v.y is null) and n is not null;
+ u | n | nn | p 
+---+---+----+---
+(0 rows)
+
+explain (costs false) select * from s where n not in
+(select n as y from l union all
+ select u as y from s union all
+ select nn as y from s);
+                       QUERY PLAN                       
+--------------------------------------------------------
+ Nested Loop Anti Join
+   InitPlan 1 (returns $0)
+     ->  Result
+           ->  Append
+                 ->  Seq Scan on l l_1
+                 ->  Seq Scan on s s_3
+                 ->  Seq Scan on s s_4
+   ->  Seq Scan on s
+         Filter: ((n IS NOT NULL) OR (NOT $0))
+   ->  Append
+         ->  Bitmap Heap Scan on l
+               Recheck Cond: ((s.n = n) OR (n IS NULL))
+               ->  BitmapOr
+                     ->  Bitmap Index Scan on l_n
+                           Index Cond: (n = s.n)
+                     ->  Bitmap Index Scan on l_n
+                           Index Cond: (n IS NULL)
+         ->  Seq Scan on s s_1
+               Filter: ((s.n = u) OR (u IS NULL))
+         ->  Seq Scan on s s_2
+               Filter: ((s.n = nn) OR (nn IS NULL))
+(21 rows)
+
+select * from s where n not in
+(select n as y from l union all
+ select u as y from s union all
+ select nn as y from s);
+ u | n | nn | p 
+---+---+----+---
+(0 rows)
+
+explain (costs false) select count(*) from
+(select n as x from s union all select u as x from l) t where t.x not in
+(select nn from l);
+                        QUERY PLAN                         
+-----------------------------------------------------------
+ Aggregate
+   InitPlan 1 (returns $0)
+     ->  Seq Scan on l l_2
+   ->  Hash Anti Join
+         Hash Cond: (s.n = l_1.nn)
+         ->  Append
+               ->  Seq Scan on s
+                     Filter: ((n IS NOT NULL) OR (NOT $0))
+               ->  Seq Scan on l
+                     Filter: ((u IS NOT NULL) OR (NOT $0))
+         ->  Hash
+               ->  Seq Scan on l l_1
+(12 rows)
+
+select count(*) from
+(select n as x from s union all select u as x from l) t where t.x not in
+(select nn from l);
+ count 
+-------
+     1
+(1 row)
+
+explain (costs false) select count(*) from
+(select n as x from s union all select n as x from l) t where t.x not in
+(select nn from empty);
+                   QUERY PLAN                   
+------------------------------------------------
+ Aggregate
+   ->  Append
+         ->  Seq Scan on s
+               Filter: (NOT (hashed SubPlan 1))
+               SubPlan 1
+                 ->  Seq Scan on empty
+         ->  Seq Scan on l
+               Filter: (NOT (hashed SubPlan 1))
+(8 rows)
+
+select count(*) from
+(select n as x from s union all select n as x from l) t where t.x not in
+(select nn from empty);
+ count 
+-------
+ 10004
+(1 row)
+
+explain (costs false) select count(*) from
+(select n as x from s union all select u as x from l) t where t.x not in
+(select n as y from l union all
+ select u as y from s union all
+ select nn as y from s);
+                                QUERY PLAN                                
+--------------------------------------------------------------------------
+ Finalize Aggregate
+   InitPlan 1 (returns $0)
+     ->  Result
+           ->  Append
+                 ->  Seq Scan on l l_2
+                 ->  Seq Scan on s s_3
+                 ->  Seq Scan on s s_4
+   ->  Gather
+         Workers Planned: 2
+         Params Evaluated: $0
+         ->  Partial Aggregate
+               ->  Nested Loop Anti Join
+                     ->  Parallel Append
+                           ->  Parallel Seq Scan on l
+                                 Filter: ((u IS NOT NULL) OR (NOT $0))
+                           ->  Parallel Seq Scan on s
+                                 Filter: ((n IS NOT NULL) OR (NOT $0))
+                     ->  Append
+                           ->  Bitmap Heap Scan on l l_1
+                                 Recheck Cond: ((l.u = n) OR (n IS NULL))
+                                 ->  BitmapOr
+                                       ->  Bitmap Index Scan on l_n
+                                             Index Cond: (n = l.u)
+                                       ->  Bitmap Index Scan on l_n
+                                             Index Cond: (n IS NULL)
+                           ->  Seq Scan on s s_1
+                                 Filter: ((l.u = u) OR (u IS NULL))
+                           ->  Seq Scan on s s_2
+                                 Filter: ((l.u = nn) OR (nn IS NULL))
+(29 rows)
+
+select count(*) from
+(select n as x from s union all select u as x from l) t where t.x not in
+(select n as y from l union all
+ select u as y from s union all
+ select nn as y from s);
+ count 
+-------
+     0
+(1 row)
+
+-- test multi-levels of NOT IN
+explain (costs false) select * from s where n not in (select n from s where n not in (select n from l));
+                         QUERY PLAN                         
+------------------------------------------------------------
+ Seq Scan on s
+   Filter: (NOT (hashed SubPlan 3))
+   SubPlan 3
+     ->  Nested Loop Anti Join
+           InitPlan 2 (returns $2)
+             ->  Seq Scan on l
+           ->  Seq Scan on s s_1
+                 Filter: ((n IS NOT NULL) OR (NOT $2))
+           ->  Bitmap Heap Scan on l l_1
+                 Recheck Cond: ((s_1.n = n) OR (n IS NULL))
+                 ->  BitmapOr
+                       ->  Bitmap Index Scan on l_n
+                             Index Cond: (n = s_1.n)
+                       ->  Bitmap Index Scan on l_n
+                             Index Cond: (n IS NULL)
+(15 rows)
+
+select * from s where n not in (select n from s where n not in (select n from l));
+    u    |    n    |   nn    |   p    
+---------+---------+---------+--------
+       1 |       1 |       1 | foo
+       2 |       2 |       2 | foo
+ 1000002 | 1000002 | 1000002 | foofoo
+       3 |         |       3 | foo
+(4 rows)
+
+explain (costs false) select * from s where n not in (select n from s where n not in (select u from l));
+                      QUERY PLAN                       
+-------------------------------------------------------
+ Seq Scan on s
+   Filter: (NOT (hashed SubPlan 3))
+   SubPlan 3
+     ->  Nested Loop Anti Join
+           InitPlan 2 (returns $2)
+             ->  Seq Scan on l
+           ->  Seq Scan on s s_1
+                 Filter: ((n IS NOT NULL) OR (NOT $2))
+           ->  Index Only Scan using l_u on l l_1
+                 Index Cond: (u = s_1.n)
+(10 rows)
+
+select * from s where n not in (select n from s where n not in (select u from l));
+ u | n | nn |  p  
+---+---+----+-----
+ 1 | 1 |  1 | foo
+ 2 | 2 |  2 | foo
+(2 rows)
+
+explain (costs false) select count(*) from s where u not in
+(select n from s1 where not exists
+ (select 1 from (select n from s1 where u not in (select n from l)) t where t.n = s.n));
+                               QUERY PLAN                                
+-------------------------------------------------------------------------
+ Aggregate
+   ->  Seq Scan on s
+         Filter: (NOT (SubPlan 2))
+         SubPlan 2
+           ->  Result
+                 One-Time Filter: (NOT $2)
+                 InitPlan 1 (returns $2)
+                   ->  Nested Loop Anti Join
+                         ->  Seq Scan on s1
+                               Filter: (n = s.n)
+                         ->  Bitmap Heap Scan on l
+                               Recheck Cond: ((s1.u = n) OR (n IS NULL))
+                               ->  BitmapOr
+                                     ->  Bitmap Index Scan on l_n
+                                           Index Cond: (n = s1.u)
+                                     ->  Bitmap Index Scan on l_n
+                                           Index Cond: (n IS NULL)
+                 ->  Seq Scan on s1 s1_1
+(18 rows)
+
+select count(*) from s where u not in
+(select n from s1 where not exists
+ (select 1 from (select n from s1 where u not in (select n from l)) t where t.n = s.n));
+ count 
+-------
+     0
+(1 row)
+
+explain (costs false) select * from s where n not in (select n from s1) and u not in (select u from s1) and nn not in (select nn from s1);
+                                           QUERY PLAN                                           
+------------------------------------------------------------------------------------------------
+ Seq Scan on s
+   Filter: ((NOT (hashed SubPlan 1)) AND (NOT (hashed SubPlan 2)) AND (NOT (hashed SubPlan 3)))
+   SubPlan 1
+     ->  Seq Scan on s1
+   SubPlan 2
+     ->  Seq Scan on s1 s1_1
+   SubPlan 3
+     ->  Seq Scan on s1 s1_2
+(8 rows)
+
+select * from s where n not in (select n from s1) and u not in (select u from s1) and nn not in (select nn from s1);
+ u | n | nn | p 
+---+---+----+---
+(0 rows)
+
+explain (costs false) select * from s where n not in (select n from s1) and u not in (select u from s1) and nn not in (select nn from l);
+                               QUERY PLAN                                
+-------------------------------------------------------------------------
+ Nested Loop Anti Join
+   ->  Seq Scan on s
+         Filter: ((NOT (hashed SubPlan 1)) AND (NOT (hashed SubPlan 2)))
+         SubPlan 1
+           ->  Seq Scan on s1
+         SubPlan 2
+           ->  Seq Scan on s1 s1_1
+   ->  Index Only Scan using l_nn on l
+         Index Cond: (nn = s.nn)
+(9 rows)
+
+select * from s where n not in (select n from s1) and u not in (select u from s1) and nn not in (select nn from l);
+ u | n | nn | p 
+---+---+----+---
+(0 rows)
+
+explain (costs false) select count(*) from s where u not in
+(select n from s1 where not exists
+ (select 1 from (select n from s1 where u not in (select n from l)) t where t.n = s.n))
+and nn not in
+(select n from s1 where not exists
+ (select 1 from (select n from s1 where u not in (select n from l)) t where t.n = s.n));
+                                QUERY PLAN                                 
+---------------------------------------------------------------------------
+ Aggregate
+   ->  Seq Scan on s
+         Filter: ((NOT (SubPlan 2)) AND (NOT (SubPlan 4)))
+         SubPlan 2
+           ->  Result
+                 One-Time Filter: (NOT $2)
+                 InitPlan 1 (returns $2)
+                   ->  Nested Loop Anti Join
+                         ->  Seq Scan on s1
+                               Filter: (n = s.n)
+                         ->  Bitmap Heap Scan on l
+                               Recheck Cond: ((s1.u = n) OR (n IS NULL))
+                               ->  BitmapOr
+                                     ->  Bitmap Index Scan on l_n
+                                           Index Cond: (n = s1.u)
+                                     ->  Bitmap Index Scan on l_n
+                                           Index Cond: (n IS NULL)
+                 ->  Seq Scan on s1 s1_1
+         SubPlan 4
+           ->  Result
+                 One-Time Filter: (NOT $6)
+                 InitPlan 3 (returns $6)
+                   ->  Nested Loop Anti Join
+                         ->  Seq Scan on s1 s1_2
+                               Filter: (n = s.n)
+                         ->  Bitmap Heap Scan on l l_1
+                               Recheck Cond: ((s1_2.u = n) OR (n IS NULL))
+                               ->  BitmapOr
+                                     ->  Bitmap Index Scan on l_n
+                                           Index Cond: (n = s1_2.u)
+                                     ->  Bitmap Index Scan on l_n
+                                           Index Cond: (n IS NULL)
+                 ->  Seq Scan on s1 s1_3
+(33 rows)
+
+select count(*) from s where u not in
+(select n from s1 where not exists
+ (select 1 from (select n from s1 where u not in (select n from l)) t where t.n = s.n))
+and nn not in
+(select n from s1 where not exists
+ (select 1 from (select n from s1 where u not in (select n from l)) t where t.n = s.n));
+ count 
+-------
+     0
+(1 row)
+
+--test COALESCE
+explain (costs false) select * from s where COALESCE(n, -1) not in (select COALESCE(n, -1) from l);
+                                 QUERY PLAN                                 
+----------------------------------------------------------------------------
+ Hash Anti Join
+   Hash Cond: (COALESCE(s.n, '-1'::integer) = COALESCE(l.n, '-1'::integer))
+   ->  Seq Scan on s
+   ->  Hash
+         ->  Seq Scan on l
+(5 rows)
+
+select * from s where COALESCE(n, -1) not in (select COALESCE(n, -1) from l);
+    u    |    n    |   nn    |   p    
+---------+---------+---------+--------
+ 1000002 | 1000002 | 1000002 | foofoo
+(1 row)
+
+explain (costs false) select * from s where COALESCE(n, NULL, -1) not in (select COALESCE(n, NULL, -1) from l);
+                                 QUERY PLAN                                 
+----------------------------------------------------------------------------
+ Hash Anti Join
+   Hash Cond: (COALESCE(s.n, '-1'::integer) = COALESCE(l.n, '-1'::integer))
+   ->  Seq Scan on s
+   ->  Hash
+         ->  Seq Scan on l
+(5 rows)
+
+select * from s where COALESCE(n, NULL, -1) not in (select COALESCE(n, NULL, -1) from l);
+    u    |    n    |   nn    |   p    
+---------+---------+---------+--------
+ 1000002 | 1000002 | 1000002 | foofoo
+(1 row)
+
+explain (costs false) select * from s where COALESCE(n, NULL, NULL) not in (select COALESCE(n, NULL, NULL) from l);
+                                 QUERY PLAN                                  
+-----------------------------------------------------------------------------
+ Nested Loop Anti Join
+   Join Filter: ((COALESCE(s.n) = COALESCE(l.n)) OR (COALESCE(l.n) IS NULL))
+   InitPlan 1 (returns $0)
+     ->  Seq Scan on l l_1
+   ->  Seq Scan on s
+         Filter: ((COALESCE(n) IS NOT NULL) OR (NOT $0))
+   ->  Materialize
+         ->  Seq Scan on l
+(8 rows)
+
+select * from s where COALESCE(n, NULL, NULL) not in (select COALESCE(n, NULL, NULL) from l);
+ u | n | nn | p 
+---+---+----+---
+(0 rows)
+
+explain (costs false) select * from s where COALESCE(n, nn) not in (select COALESCE(n, nn) from l);
+                        QUERY PLAN                        
+----------------------------------------------------------
+ Hash Anti Join
+   Hash Cond: (COALESCE(s.n, s.nn) = COALESCE(l.n, l.nn))
+   ->  Seq Scan on s
+   ->  Hash
+         ->  Seq Scan on l
+(5 rows)
+
+select * from s where COALESCE(n, nn) not in (select COALESCE(n, nn) from l);
+    u    |    n    |   nn    |   p    
+---------+---------+---------+--------
+ 1000002 | 1000002 | 1000002 | foofoo
+(1 row)
+
+explain (costs false) select * from s where COALESCE(nn, NULL) not in (select COALESCE(nn, NULL) from l);
+                   QUERY PLAN                   
+------------------------------------------------
+ Hash Anti Join
+   Hash Cond: (COALESCE(s.nn) = COALESCE(l.nn))
+   ->  Seq Scan on s
+   ->  Hash
+         ->  Seq Scan on l
+(5 rows)
+
+select * from s where COALESCE(nn, NULL) not in (select COALESCE(nn, NULL) from l);
+    u    |    n    |   nn    |   p    
+---------+---------+---------+--------
+ 1000002 | 1000002 | 1000002 | foofoo
+(1 row)
+
+explain (costs false) select * from s where (COALESCE(n, -1), nn, COALESCE(n, u)) not in (select COALESCE(n, -1), nn, COALESCE(n, u) from l);
+                                                       QUERY PLAN                                                        
+-------------------------------------------------------------------------------------------------------------------------
+ Nested Loop Anti Join
+   ->  Seq Scan on s
+   ->  Index Scan using l_nn on l
+         Index Cond: (nn = s.nn)
+         Filter: ((COALESCE(s.n, '-1'::integer) = COALESCE(n, '-1'::integer)) AND (COALESCE(s.n, s.u) = COALESCE(n, u)))
+(5 rows)
+
+select * from s where (COALESCE(n, -1), nn, COALESCE(n, u)) not in (select COALESCE(n, -1), nn, COALESCE(n, u) from l);
+    u    |    n    |   nn    |   p    
+---------+---------+---------+--------
+ 1000002 | 1000002 | 1000002 | foofoo
+       3 |         |       3 | foo
+(2 rows)
+
+-- test miscellaneous outer nullable cases
+explain (costs false) select * from s where (n,n) not in (select n,n from l);
+                         QUERY PLAN                          
+-------------------------------------------------------------
+ Nested Loop Anti Join
+   Join Filter: (((s.n = l.n) AND (s.n = l.n)) IS NOT FALSE)
+   ->  Seq Scan on s
+   ->  Materialize
+         ->  Seq Scan on l
+(5 rows)
+
+select * from s where (n,n) not in (select n,n from l);
+ u | n | nn | p 
+---+---+----+---
+(0 rows)
+
+explain (costs false) select * from s right join l on s.nn = l.nn where (s.n,s.u,s.nn) not in (select n,u,nn from l);
+                                     QUERY PLAN                                      
+-------------------------------------------------------------------------------------
+ Nested Loop Anti Join
+   Join Filter: (((s.n = l_1.n) AND (s.u = l_1.u) AND (s.nn = l_1.nn)) IS NOT FALSE)
+   ->  Hash Left Join
+         Hash Cond: (l.nn = s.nn)
+         ->  Seq Scan on l
+         ->  Hash
+               ->  Seq Scan on s
+   ->  Materialize
+         ->  Seq Scan on l l_1
+(9 rows)
+
+select * from s right join l on s.nn = l.nn where (s.n,s.u,s.nn) not in (select n,u,nn from l);
+ u | n | nn | p | u | n | nn | p 
+---+---+----+---+---+---+----+---
+(0 rows)
+
+explain (costs false) select count(*) from s right join l on s.nn = l.nn where (s.n,s.u,s.nn) not in (select n,u,nn from l where u < 0);
+                 QUERY PLAN                  
+---------------------------------------------
+ Aggregate
+   ->  Hash Left Join
+         Hash Cond: (l.nn = s.nn)
+         Filter: (NOT (hashed SubPlan 1))
+         ->  Seq Scan on l
+         ->  Hash
+               ->  Seq Scan on s
+         SubPlan 1
+           ->  Index Scan using l_u on l l_1
+                 Index Cond: (u < 0)
+(10 rows)
+
+select count(*) from s right join l on s.nn = l.nn where (s.n,s.u,s.nn) not in (select n,u,nn from l where u < 0);
+ count 
+-------
+ 10000
+(1 row)
+
+explain (costs false) select * from s where (n,n,n) not in (select distinct n,n,n from l where u > 0 limit 3) order by n;
+                     QUERY PLAN                      
+-----------------------------------------------------
+ Sort
+   Sort Key: s.n
+   ->  Seq Scan on s
+         Filter: (NOT (hashed SubPlan 1))
+         SubPlan 1
+           ->  Limit
+                 ->  Unique
+                       ->  Index Scan using l_n on l
+                             Filter: (u > 0)
+(9 rows)
+
+select * from s where (n,n,n) not in (select distinct n,n,n from l where u > 0 limit 3) order by n;
+    u    |    n    |   nn    |   p    
+---------+---------+---------+--------
+ 1000002 | 1000002 | 1000002 | foofoo
+(1 row)
+
+--test outer has strict predicate or inner join
+explain (costs false) select * from s where n not in (select n from l) and n > 0;
+                    QUERY PLAN                    
+--------------------------------------------------
+ Nested Loop Anti Join
+   ->  Seq Scan on s
+         Filter: (n > 0)
+   ->  Bitmap Heap Scan on l
+         Recheck Cond: ((s.n = n) OR (n IS NULL))
+         ->  BitmapOr
+               ->  Bitmap Index Scan on l_n
+                     Index Cond: (n = s.n)
+               ->  Bitmap Index Scan on l_n
+                     Index Cond: (n IS NULL)
+(10 rows)
+
+select * from s where n not in (select n from l) and n > 0;
+ u | n | nn | p 
+---+---+----+---
+(0 rows)
+
+explain (costs false) select * from s where n not in (select n from l) and u > 0;
+                         QUERY PLAN                          
+-------------------------------------------------------------
+ Nested Loop Anti Join
+   InitPlan 1 (returns $0)
+     ->  Seq Scan on l l_1
+   ->  Seq Scan on s
+         Filter: (((n IS NOT NULL) OR (NOT $0)) AND (u > 0))
+   ->  Bitmap Heap Scan on l
+         Recheck Cond: ((s.n = n) OR (n IS NULL))
+         ->  BitmapOr
+               ->  Bitmap Index Scan on l_n
+                     Index Cond: (n = s.n)
+               ->  Bitmap Index Scan on l_n
+                     Index Cond: (n IS NULL)
+(12 rows)
+
+select * from s where n not in (select n from l) and u > 0;
+ u | n | nn | p 
+---+---+----+---
+(0 rows)
+
+explain (costs false) select * from s where n not in (select n from l) and n is not null;
+                    QUERY PLAN                    
+--------------------------------------------------
+ Nested Loop Anti Join
+   ->  Seq Scan on s
+         Filter: (n IS NOT NULL)
+   ->  Bitmap Heap Scan on l
+         Recheck Cond: ((s.n = n) OR (n IS NULL))
+         ->  BitmapOr
+               ->  Bitmap Index Scan on l_n
+                     Index Cond: (n = s.n)
+               ->  Bitmap Index Scan on l_n
+                     Index Cond: (n IS NULL)
+(10 rows)
+
+select * from s where n not in (select n from l) and n is not null;
+ u | n | nn | p 
+---+---+----+---
+(0 rows)
+
+explain (costs false) select * from s join l on s.n = l.n where s.n not in (select n from l);
+                       QUERY PLAN                       
+--------------------------------------------------------
+ Nested Loop
+   ->  Nested Loop Anti Join
+         ->  Seq Scan on s
+         ->  Bitmap Heap Scan on l l_1
+               Recheck Cond: ((s.n = n) OR (n IS NULL))
+               ->  BitmapOr
+                     ->  Bitmap Index Scan on l_n
+                           Index Cond: (n = s.n)
+                     ->  Bitmap Index Scan on l_n
+                           Index Cond: (n IS NULL)
+   ->  Index Scan using l_n on l
+         Index Cond: (n = s.n)
+(12 rows)
+
+select * from s join l on s.n = l.n where s.n not in (select n from l);
+ u | n | nn | p | u | n | nn | p 
+---+---+----+---+---+---+----+---
+(0 rows)
+
+explain (costs false) select count(*) from s right join l on s.n = l.n where s.n not in (select n from l);
+                       QUERY PLAN                       
+--------------------------------------------------------
+ Aggregate
+   InitPlan 1 (returns $0)
+     ->  Seq Scan on l l_2
+   ->  Nested Loop Anti Join
+         ->  Hash Left Join
+               Hash Cond: (l.n = s.n)
+               Filter: ((s.n IS NOT NULL) OR (NOT $0))
+               ->  Seq Scan on l
+               ->  Hash
+                     ->  Seq Scan on s
+         ->  Bitmap Heap Scan on l l_1
+               Recheck Cond: ((s.n = n) OR (n IS NULL))
+               ->  BitmapOr
+                     ->  Bitmap Index Scan on l_n
+                           Index Cond: (n = s.n)
+                     ->  Bitmap Index Scan on l_n
+                           Index Cond: (n IS NULL)
+(17 rows)
+
+select count(*) from s right join l on s.n = l.n where s.n not in (select n from l);
+ count 
+-------
+     0
+(1 row)
+
+explain (costs false) select count(*) from s right join l on s.n = l.n join s1 on s.u = s1.u where s.n not in (select n from l);
+                             QUERY PLAN                             
+--------------------------------------------------------------------
+ Aggregate
+   ->  Nested Loop
+         Join Filter: (s.u = s1.u)
+         ->  Nested Loop
+               ->  Nested Loop Anti Join
+                     ->  Seq Scan on s
+                     ->  Bitmap Heap Scan on l l_1
+                           Recheck Cond: ((s.n = n) OR (n IS NULL))
+                           ->  BitmapOr
+                                 ->  Bitmap Index Scan on l_n
+                                       Index Cond: (n = s.n)
+                                 ->  Bitmap Index Scan on l_n
+                                       Index Cond: (n IS NULL)
+               ->  Index Only Scan using l_n on l
+                     Index Cond: (n = s.n)
+         ->  Seq Scan on s1
+(16 rows)
+
+select count(*) from s right join l on s.n = l.n join s1 on s.u = s1.u where s.n not in (select n from l);
+ count 
+-------
+     0
+(1 row)
+
+explain (costs false) select count(*) from s join l on s.n = l.n right join s1 on s.u = s1.u where s.n not in (select n from l);
+                          QUERY PLAN                          
+--------------------------------------------------------------
+ Aggregate
+   InitPlan 1 (returns $0)
+     ->  Seq Scan on l l_2
+   ->  Nested Loop Anti Join
+         ->  Nested Loop Left Join
+               Join Filter: (s.u = s1.u)
+               Filter: ((s.n IS NOT NULL) OR (NOT $0))
+               ->  Seq Scan on s1
+               ->  Materialize
+                     ->  Nested Loop
+                           ->  Seq Scan on s
+                           ->  Index Only Scan using l_n on l
+                                 Index Cond: (n = s.n)
+         ->  Bitmap Heap Scan on l l_1
+               Recheck Cond: ((s.n = n) OR (n IS NULL))
+               ->  BitmapOr
+                     ->  Bitmap Index Scan on l_n
+                           Index Cond: (n = s.n)
+                     ->  Bitmap Index Scan on l_n
+                           Index Cond: (n IS NULL)
+(20 rows)
+
+select count(*) from s join l on s.n = l.n right join s1 on s.u = s1.u where s.n not in (select n from l);
+ count 
+-------
+     0
+(1 row)
+
+--test inner has strict predicate or inner join
+explain (costs false) select * from s where u not in (select n from l where n > 0);
+                 QUERY PLAN                  
+---------------------------------------------
+ Nested Loop Anti Join
+   ->  Seq Scan on s
+   ->  Index Only Scan using l_n on l
+         Index Cond: ((n = s.u) AND (n > 0))
+(4 rows)
+
+select * from s where u not in (select n from l where n > 0);
+    u    |    n    |   nn    |   p    
+---------+---------+---------+--------
+ 1000002 | 1000002 | 1000002 | foofoo
+(1 row)
+
+explain (costs false) select * from s where u not in (select n from l where u > 0);
+                    QUERY PLAN                    
+--------------------------------------------------
+ Nested Loop Anti Join
+   ->  Seq Scan on s
+   ->  Bitmap Heap Scan on l
+         Recheck Cond: ((s.u = n) OR (n IS NULL))
+         Filter: (u > 0)
+         ->  BitmapOr
+               ->  Bitmap Index Scan on l_n
+                     Index Cond: (n = s.u)
+               ->  Bitmap Index Scan on l_n
+                     Index Cond: (n IS NULL)
+(10 rows)
+
+select * from s where u not in (select n from l where u > 0);
+ u | n | nn | p 
+---+---+----+---
+(0 rows)
+
+explain (costs false) select * from s where u not in (select n from l where n is not null);
+                     QUERY PLAN                      
+-----------------------------------------------------
+ Nested Loop Anti Join
+   ->  Seq Scan on s
+   ->  Index Only Scan using l_n on l
+         Index Cond: ((n = s.u) AND (n IS NOT NULL))
+(4 rows)
+
+select * from s where u not in (select n from l where n is not null);
+    u    |    n    |   nn    |   p    
+---------+---------+---------+--------
+ 1000002 | 1000002 | 1000002 | foofoo
+(1 row)
+
+explain (costs false) select * from s where u not in (select l.n from l join s on l.n=s.n);
+                  QUERY PLAN                  
+----------------------------------------------
+ Seq Scan on s
+   Filter: (NOT (hashed SubPlan 1))
+   SubPlan 1
+     ->  Nested Loop
+           ->  Seq Scan on s s_1
+           ->  Index Only Scan using l_n on l
+                 Index Cond: (n = s_1.n)
+(7 rows)
+
+select * from s where u not in (select l.n from l join s on l.n=s.n);
+    u    |    n    |   nn    |   p    
+---------+---------+---------+--------
+ 1000002 | 1000002 | 1000002 | foofoo
+       3 |         |       3 | foo
+(2 rows)
+
+explain (costs false) select * from s where u not in (select l.n from l join s on l.u=s.u);
+               QUERY PLAN                
+-----------------------------------------
+ Seq Scan on s
+   Filter: (NOT (hashed SubPlan 1))
+   SubPlan 1
+     ->  Nested Loop
+           ->  Seq Scan on s s_1
+           ->  Index Scan using l_u on l
+                 Index Cond: (u = s_1.u)
+(7 rows)
+
+select * from s where u not in (select l.n from l join s on l.u=s.u);
+    u    |    n    |   nn    |   p    
+---------+---------+---------+--------
+ 1000002 | 1000002 | 1000002 | foofoo
+(1 row)
+
+explain (costs false) select * from s where u not in (select l.n from l join s on l.n = s.n);
+                  QUERY PLAN                  
+----------------------------------------------
+ Seq Scan on s
+   Filter: (NOT (hashed SubPlan 1))
+   SubPlan 1
+     ->  Nested Loop
+           ->  Seq Scan on s s_1
+           ->  Index Only Scan using l_n on l
+                 Index Cond: (n = s_1.n)
+(7 rows)
+
+select * from s where u not in (select l.n from l join s on l.n = s.n);
+    u    |    n    |   nn    |   p    
+---------+---------+---------+--------
+ 1000002 | 1000002 | 1000002 | foofoo
+       3 |         |       3 | foo
+(2 rows)
+
+explain (costs false) select * from s where u not in (select l.n from l right join s on l.n = s.n);
+                  QUERY PLAN                  
+----------------------------------------------
+ Seq Scan on s
+   Filter: (NOT (hashed SubPlan 1))
+   SubPlan 1
+     ->  Nested Loop Left Join
+           ->  Seq Scan on s s_1
+           ->  Index Only Scan using l_n on l
+                 Index Cond: (n = s_1.n)
+(7 rows)
+
+select * from s where u not in (select l.n from l right join s on l.n = s.n);
+ u | n | nn | p 
+---+---+----+---
+(0 rows)
+
+explain (costs false) select * from s where u not in (select l.n from l right join s on l.n=s.n join s1 on l.n=s1.n);
+                  QUERY PLAN                  
+----------------------------------------------
+ Seq Scan on s
+   Filter: (NOT (hashed SubPlan 1))
+   SubPlan 1
+     ->  Nested Loop
+           ->  Nested Loop
+                 Join Filter: (s_1.n = s1.n)
+                 ->  Seq Scan on s1
+                 ->  Materialize
+                       ->  Seq Scan on s s_1
+           ->  Index Only Scan using l_n on l
+                 Index Cond: (n = s_1.n)
+(11 rows)
+
+select * from s where u not in (select l.n from l right join s on l.n=s.n join s1 on l.n=s1.n);
+    u    |    n    |   nn    |   p    
+---------+---------+---------+--------
+ 1000002 | 1000002 | 1000002 | foofoo
+       3 |         |       3 | foo
+(2 rows)
+
+explain (costs false) select * from s where u not in (select l.n from l join s on l.n=s.n right join s1 on l.n=s1.n);
+                        QUERY PLAN                        
+----------------------------------------------------------
+ Seq Scan on s
+   Filter: (NOT (hashed SubPlan 1))
+   SubPlan 1
+     ->  Nested Loop Left Join
+           Join Filter: (l.n = s1.n)
+           ->  Seq Scan on s1
+           ->  Materialize
+                 ->  Nested Loop
+                       ->  Seq Scan on s s_1
+                       ->  Index Only Scan using l_n on l
+                             Index Cond: (n = s_1.n)
+(11 rows)
+
+select * from s where u not in (select l.n from l join s on l.n=s.n right join s1 on l.n=s1.n);
+ u | n | nn | p 
+---+---+----+---
+(0 rows)
+
+--test both sides have strict predicate or inner join
+explain (costs false) select * from s where n not in (select n from l where n > 0) and n > 0;
+                 QUERY PLAN                  
+---------------------------------------------
+ Nested Loop Anti Join
+   ->  Seq Scan on s
+         Filter: (n > 0)
+   ->  Index Only Scan using l_n on l
+         Index Cond: ((n = s.n) AND (n > 0))
+(5 rows)
+
+select * from s where n not in (select n from l where n > 0) and n > 0;
+    u    |    n    |   nn    |   p    
+---------+---------+---------+--------
+ 1000002 | 1000002 | 1000002 | foofoo
+(1 row)
+
+explain (costs false) select * from s where n not in (select n from l where u > 0) and n > 0;
+                    QUERY PLAN                    
+--------------------------------------------------
+ Nested Loop Anti Join
+   ->  Seq Scan on s
+         Filter: (n > 0)
+   ->  Bitmap Heap Scan on l
+         Recheck Cond: ((s.n = n) OR (n IS NULL))
+         Filter: (u > 0)
+         ->  BitmapOr
+               ->  Bitmap Index Scan on l_n
+                     Index Cond: (n = s.n)
+               ->  Bitmap Index Scan on l_n
+                     Index Cond: (n IS NULL)
+(11 rows)
+
+select * from s where n not in (select n from l where u > 0) and n > 0;
+ u | n | nn | p 
+---+---+----+---
+(0 rows)
+
+explain (costs false) select * from s where n not in (select n from l where n > 0) and u > 0;
+                         QUERY PLAN                          
+-------------------------------------------------------------
+ Nested Loop Anti Join
+   InitPlan 1 (returns $0)
+     ->  Seq Scan on l l_1
+           Filter: (n > 0)
+   ->  Seq Scan on s
+         Filter: (((n IS NOT NULL) OR (NOT $0)) AND (u > 0))
+   ->  Index Only Scan using l_n on l
+         Index Cond: ((n = s.n) AND (n > 0))
+(8 rows)
+
+select * from s where n not in (select n from l where n > 0) and u > 0;
+    u    |    n    |   nn    |   p    
+---------+---------+---------+--------
+ 1000002 | 1000002 | 1000002 | foofoo
+(1 row)
+
+explain (costs false) select * from s right join l on s.n = l.n join s1 on s.u = s1.u where s.n not in (select l.n from l right join s on l.n=s.n join s s1 on l.n=s1.n);
+                             QUERY PLAN                             
+--------------------------------------------------------------------
+ Nested Loop
+   Join Filter: (s.u = s1.u)
+   ->  Seq Scan on s1
+   ->  Materialize
+         ->  Nested Loop
+               ->  Seq Scan on s
+                     Filter: (NOT (hashed SubPlan 1))
+                     SubPlan 1
+                       ->  Nested Loop
+                             ->  Nested Loop
+                                   Join Filter: (s_1.n = s1_1.n)
+                                   ->  Seq Scan on s s_1
+                                   ->  Materialize
+                                         ->  Seq Scan on s s1_1
+                             ->  Index Only Scan using l_n on l l_1
+                                   Index Cond: (n = s_1.n)
+               ->  Index Scan using l_n on l
+                     Index Cond: (n = s.n)
+(18 rows)
+
+select * from s right join l on s.n = l.n join s1 on s.u = s1.u where s.n not in (select l.n from l right join s on l.n=s.n join s s1 on l.n=s1.n);
+ u | n | nn | p | u | n | nn | p | u | n | n1 | nn | p 
+---+---+----+---+---+---+----+---+---+---+----+----+---
+(0 rows)
+
+explain (costs false) select * from s right join l on s.n = l.n join s1 on s.u = s1.u where s.n not in (select l.n from l join s on l.n=s.n right join s s1 on l.n=s1.n);
+                                   QUERY PLAN                                   
+--------------------------------------------------------------------------------
+ Nested Loop
+   Join Filter: (s.u = s1.u)
+   ->  Seq Scan on s1
+   ->  Materialize
+         ->  Nested Loop
+               ->  Seq Scan on s
+                     Filter: (NOT (hashed SubPlan 1))
+                     SubPlan 1
+                       ->  Nested Loop Left Join
+                             Join Filter: (l_1.n = s1_1.n)
+                             ->  Seq Scan on s s1_1
+                             ->  Materialize
+                                   ->  Nested Loop
+                                         ->  Seq Scan on s s_1
+                                         ->  Index Only Scan using l_n on l l_1
+                                               Index Cond: (n = s_1.n)
+               ->  Index Scan using l_n on l
+                     Index Cond: (n = s.n)
+(18 rows)
+
+select * from s right join l on s.n = l.n join s1 on s.u = s1.u where s.n not in (select l.n from l join s on l.n=s.n right join s s1 on l.n=s1.n);
+ u | n | nn | p | u | n | nn | p | u | n | n1 | nn | p 
+---+---+----+---+---+---+----+---+---+---+----+----+---
+(0 rows)
+
+explain (costs false) select * from s join l on s.n = l.n right join s1 on s.u = s1.u where s.n not in (select l.n from l join s on l.n=s.n right join s s1 on l.n=s1.n);
+                          QUERY PLAN                          
+--------------------------------------------------------------
+ Nested Loop Left Join
+   Join Filter: (s.u = s1.u)
+   Filter: (NOT (hashed SubPlan 1))
+   ->  Seq Scan on s1
+   ->  Materialize
+         ->  Nested Loop
+               ->  Seq Scan on s
+               ->  Index Scan using l_n on l
+                     Index Cond: (n = s.n)
+   SubPlan 1
+     ->  Nested Loop Left Join
+           Join Filter: (l_1.n = s1_1.n)
+           ->  Seq Scan on s s1_1
+           ->  Materialize
+                 ->  Nested Loop
+                       ->  Seq Scan on s s_1
+                       ->  Index Only Scan using l_n on l l_1
+                             Index Cond: (n = s_1.n)
+(18 rows)
+
+select * from s join l on s.n = l.n right join s1 on s.u = s1.u where s.n not in (select l.n from l join s on l.n=s.n right join s s1 on l.n=s1.n);
+ u | n | nn | p | u | n | nn | p | u | n | n1 | nn | p 
+---+---+----+---+---+---+----+---+---+---+----+----+---
+(0 rows)
+
+-- clean up
+set work_mem = 4000;
+drop table s;
+drop table s1;
+drop table l;
+drop table empty;
diff --git a/src/test/regress/sql/subselect.sql b/src/test/regress/sql/subselect.sql
index b5931ee700..f905f65351 100644
--- a/src/test/regress/sql/subselect.sql
+++ b/src/test/regress/sql/subselect.sql
@@ -692,3 +692,507 @@ select * from (with x as (select 2 as y) select * from x) ss;
 explain (verbose, costs off)
 with x as (select * from subselect_tbl)
 select * from x for update;
+
+-- test NON IN to ANTI JOIN conversion
+CREATE TABLE s (u INTEGER NOT NULL, n INTEGER NULL, nn INTEGER NOT NULL, p VARCHAR(128) NULL);
+insert into s (u, n, nn, p)
+    select
+    generate_series(1,3) as u,
+	generate_series(1,3) as n,
+	generate_series(1,3) as nn,
+	'foo' as p;
+insert into s values(1000002, 1000002, 1000002, 'foofoo');
+UPDATE s set n = NULL WHERE n = 3;
+analyze s;
+
+CREATE TABLE l (u INTEGER NOT NULL, n INTEGER NULL, nn INTEGER NOT NULL, p VARCHAR(128) NULL);
+insert into l (u, n, nn, p)
+	select
+    generate_series(1,10000 ) as u,
+	generate_series(1,10000 ) as n,
+	generate_series(1,10000 ) as nn,
+	'bar' as p;
+UPDATE l set n = NULL WHERE n = 7;
+
+CREATE UNIQUE INDEX l_u ON l (u);
+CREATE INDEX l_n ON l (n);
+CREATE INDEX l_nn ON l (nn);
+analyze l;
+
+CREATE TABLE s1 (u INTEGER NOT NULL, n INTEGER NULL, n1 INTEGER NULL, nn INTEGER NOT NULL, p VARCHAR(128) NULL);
+insert into s1 (u, n, n1, nn, p)
+    select
+    generate_series(1,3) as u,
+	generate_series(1,3) as n,
+	generate_series(1,3) as n1,
+	generate_series(1,3) as nn,
+	'foo' as p;
+insert into s1 values(1000003, 1000003, 1000003, 1000003, 'foofoo');
+insert into s1 values(1003, 1003, 1003, 1003, 'foofoo');
+UPDATE s1 set n = NULL WHERE n = 3;
+UPDATE s1 set n1 = NULL WHERE n = 2;
+UPDATE s1 set n1 = NULL WHERE n1 = 3;
+analyze s1;
+
+CREATE TABLE empty (u INTEGER NOT NULL, n INTEGER NULL, nn INTEGER NOT NULL, p VARCHAR(128) NULL);
+analyze empty;
+
+-- set work_mem to 64KB so that NOT IN to ANTI JOIN optimization will kick in
+set work_mem = 64;
+
+-- correctness test 1: inner empty, return every thing from outer including NULL
+explain (costs false) select * from s where n not in (select n from empty);
+
+select * from s where n not in (select n from empty);
+
+-- correctness test 2: inner has NULL, return empty result
+explain (costs false) select * from s where n not in (select n from l);
+
+select * from s where n not in (select n from l);
+
+-- correctness test 3: inner non-null, result has no NULL
+explain (costs false) select * from s where n not in (select u from l);
+
+select * from s where n not in (select u from l);
+
+-- correctness test 4: inner has predicate
+explain (costs false) select * from s where n not in (select n from l where u > 7);
+
+select * from s where n not in (select n from l where u > 7);
+
+-- correctness test 5: multi-expression, (2, 2, null, 2, foo) should be in the result
+explain (costs false) select * from s1 where (n,n1) not in (select u,nn from l where u >= 3);
+
+select * from s1 where (n,n1) not in (select u,nn from l where u >= 3);
+
+-- correctness test 6: multi-expression, (3, null, null, 3, foo) should not be in the result
+explain (costs false) select * from s1 where (n,n1) not in (select u,nn from l where u > 0);
+
+select * from s1 where (n,n1) not in (select u,nn from l where u > 0);
+
+-- correctness test 6: multi-expression, (3, null, null, 3, foo) should be in the result
+explain (costs false) select * from s1 where (n,n1) not in (select u,nn from l where u < 0);
+
+select * from s1 where (n,n1) not in (select u,nn from l where u < 0);
+
+-- test using hashed subplan when inner fits in work_mem
+explain (costs false) select * from l where n not in (select n from s);
+
+select * from l where n not in (select n from s);
+
+-- test single expression
+explain (costs false) select * from s where n not in (select n from l);
+
+select * from s where n not in (select n from l);
+
+explain (costs false) select * from s where u not in (select u from l);
+
+select * from s where u not in (select u from l);
+
+explain (costs false) select * from s where 3*n not in (select n from l);
+
+select * from s where 3*n not in (select n from l);
+
+explain (costs false) select * from s where n not in (select 3*n from l);
+
+select * from s where n not in (select 3*n from l);
+
+-- test single expression with predicates
+explain (costs false) select * from s where n not in (select n from l where u > 0);
+
+select * from s where n not in (select n from l where u > 0);
+
+explain (costs false) select * from s where n not in (select n from l where u > 100);
+
+select * from s where n not in (select n from l where u > 100);
+
+-- test multi expression
+explain (costs false) select * from s where (n,u) not in (select n,u from l);
+
+select * from s where (n,u) not in (select n,u from l);
+
+explain (costs false) select * from s where (u, nn) not in (select u, nn from l);
+
+select * from s where (u, nn) not in (select u, nn from l);
+
+explain (costs false) select * from s where (n,u) not in (select u,n from l);
+
+select * from s where (n,u) not in (select u,n from l);
+
+explain (costs false) select * from s where (n,u,nn) not in (select u,n,nn from l);
+
+select * from s where (n,u,nn) not in (select u,n,nn from l);
+
+explain (costs false) select * from s where (n,u,nn) not in (select u,n,nn from l where u > 1000);
+
+select * from s where (n,u,nn) not in (select u,n,nn from l where u > 1000);
+
+explain (costs false) select * from s where (n,u,nn) not in (select u,n,nn from l where u > 0);
+
+select * from s where (n,u,nn) not in (select u,n,nn from l where u > 0);
+
+explain (costs false) select * from s where (n,u,nn) not in (select u,n,nn from l where u > 1);
+
+select * from s where (n,u,nn) not in (select u,n,nn from l where u > 1);
+
+-- test multi-table
+explain (costs false) select count(*) from s, l where s.n not in (select n from l);
+
+select count(*) from s, l where s.n not in (select n from l);
+
+explain (costs false) select count(*) from s, l where s.nn not in (select nn from l);
+
+select count(*) from s, l where s.nn not in (select nn from l);
+
+-- test null padded results from outer join
+explain (costs false) select * from s where n not in (select s.nn from l left join s on l.nn = s.nn);
+
+select * from s where n not in (select s.nn from l left join s on l.nn = s.nn);
+
+explain (costs false) select * from s where n not in (select s.nn from s right join l on s.nn = l.nn);
+
+select * from s where n not in (select s.nn from s right join l on s.nn = l.nn);
+
+explain (costs false) select count(*) from s right join l on s.nn = l.nn where l.nn not in (select nn from s);
+
+select count(*) from s right join l on s.nn = l.nn where l.nn not in (select nn from s);
+
+explain (costs false) select count(*) from s right join l on s.nn = l.nn where s.nn not in (select nn from s);
+
+select count(*) from s right join l on s.nn = l.nn where s.nn not in (select nn from s);
+
+explain (costs false) select count(*) from s right join l on s.nn=l.nn where l.nn not in (select l.nn from l left join s on l.nn = s.nn);
+
+select count(*) from s right join l on s.nn=l.nn where l.nn not in (select l.nn from l left join s on l.nn = s.nn);
+
+explain (costs false) select count(*) from s right join l on s.nn=l.nn where s.nn not in (select s.nn from l left join s on l.nn = s.nn);
+
+select count(*) from s right join l on s.nn=l.nn where s.nn not in (select s.nn from l left join s on l.nn = s.nn);
+
+explain (costs false) select count(*) from s left join s1 on s.u=s1.u join l on s.u=l.u where s.nn not in (select nn from l);
+
+select count(*) from s left join s1 on s.u=s1.u join l on s.u=l.u where s.nn not in (select nn from l);
+
+explain (costs false) select count(*) from s left join s1 on s.u=s1.u right join l on s.u=l.u where s.nn not in (select nn from l);
+
+select count(*) from s left join s1 on s.u=s1.u right join l on s.u=l.u where s.nn not in (select nn from l);
+
+explain (costs false) select count(*) from s left join s1 on s.u=s1.u left join l on s.u=l.u where s.nn not in (select nn from l);
+
+select count(*) from s left join s1 on s.u=s1.u left join l on s.u=l.u where s.nn not in (select nn from l);
+
+explain (costs false) select count(*) from s right join s1 on s.u=s1.u join l on s.u=l.u where s.nn not in (select nn from l);
+
+select count(*) from s right join s1 on s.u=s1.u join l on s.u=l.u where s.nn not in (select nn from l);
+
+explain (costs false) select count(*) from s join s1 on s.u=s1.u right join l on s.u=l.u where s.nn not in (select nn from l);
+
+select * from s join s1 on s.u=s1.u right join l on s.u=l.u where s.nn not in (select nn from l);
+
+explain (costs false) select count(*) from s full join s1 on s.u=s1.u join l on s.u=l.u where s.nn not in (select nn from l);
+
+select count(*) from s full join s1 on s.u=s1.u join l on s.u=l.u where s.nn not in (select nn from l);
+
+explain (costs false) select count(*) from s join s1 on s.u=s1.u full join l on s.u=l.u where s.nn not in (select nn from l);
+
+select count(*) from s join s1 on s.u=s1.u full join l on s.u=l.u where s.nn not in (select nn from l);
+
+explain (costs false) select * from s where s.nn not in (select l.nn from l left join s on l.nn=s.nn left join s1 on l.nn=s1.nn);
+
+select * from s where s.nn not in (select l.nn from l left join s on l.nn=s.nn left join s1 on l.nn=s1.nn);
+
+explain (costs false) select * from s where s.nn not in (select l.nn from l left join s on l.nn=s.nn right join s1 on l.nn=s1.nn);
+
+select * from s where s.nn not in (select l.nn from l left join s on l.nn=s.nn right join s1 on l.nn=s1.nn);
+
+explain (costs false) select * from s where (n,u,nn) not in (select l.n,l.u,l.nn from l left join s on l.nn = s.nn);
+
+select * from s where (n,u,nn) not in (select l.n,l.u,l.nn from l left join s on l.nn = s.nn);
+
+explain (costs false) select * from s where (n,u,nn) not in (select l.n,l.u,l.nn from l right join s on l.nn = s.nn);
+
+select * from s where (n,u,nn) not in (select l.n,l.u,l.nn from l left join s on l.nn = s.nn);
+
+--test reduce outer joins from outer query
+explain (costs false) select count(*) from s right join l on s.nn = l.nn where s.nn not in (select nn from l);
+
+select count(*) from s right join l on s.nn = l.nn where s.nn not in (select nn from l);
+
+explain (costs false) select count(*) from s right join l on s.nn = l.nn where s.nn not in (select nn from l) and s.u>0;
+
+select count(*) from s right join l on s.nn = l.nn where s.nn not in (select nn from l) and s.u>0;
+
+explain (costs false) select count(*) from s right join l on s.nn = l.nn join s1 on s.u = s1.u where s.nn not in (select nn from l);
+
+select count(*) from s right join l on s.nn = l.nn join s1 on s.u = s1.u where s.nn not in (select nn from l);
+
+explain (costs false) select count(*) from s right join l on s.nn = l.nn right join s1 on s.u = s1.u where s.nn not in (select nn from l);
+
+select count(*) from s right join l on s.nn = l.nn right join s1 on s.u = s1.u where s.nn not in (select nn from l);
+
+explain (costs false) select count(*) from s right join l on s.nn = l.nn left join s1 on s.u = s1.u where s.nn not in (select nn from l);
+
+select count(*) from s right join l on s.nn = l.nn left join s1 on s.u = s1.u where s.nn not in (select nn from l);
+
+--test reduce outer joins from subquery
+explain (costs false) select * from s where nn not in (select l.nn from l right join s on l.nn = s.nn);
+
+select * from s where nn not in (select l.nn from l right join s on l.nn = s.nn);
+
+explain (costs false) select * from s where nn not in (select l.nn from l right join s on l.nn = s.nn where l.u > 9);
+
+select * from s where nn not in (select l.nn from l right join s on l.nn = s.nn where l.u > 9);
+
+explain (costs false) select * from s where nn not in (select l.nn from l right join s on l.nn = s.nn where s.u > 9);
+
+select * from s where nn not in (select l.nn from l right join s on l.nn = s.nn where s.u > 9);
+
+explain (costs false) select * from s where nn not in (select l.nn from l right join s on l.nn = s.nn join s1 on l.n = s1.n);
+
+select * from s where nn not in (select l.nn from l right join s on l.nn = s.nn join s1 on l.n = s1.n);
+
+explain (costs false) select * from s where nn not in (select l.nn from l right join s on l.nn = s.nn right join s1 on l.n = s1.n);
+
+select * from s where nn not in (select l.nn from l right join s on l.nn = s.nn right join s1 on l.n = s1.n);
+
+explain (costs false) select * from s where nn not in (select l.nn from l right join s on l.nn = s.nn left join s1 on l.n = s1.n);
+
+select * from s where nn not in (select l.nn from l right join s on l.nn = s.nn left join s1 on l.n = s1.n);
+
+--test reduce outer join on outer and sub-query
+explain (costs false) select count(*) from s right join l on s.nn = l.nn join s1 on s.u = s1.u where s.nn not in (select l.nn from l right join s on l.nn = s.nn join s1 on l.n = s1.n);
+
+select count(*) from s right join l on s.nn = l.nn join s1 on s.u = s1.u where s.nn not in (select l.nn from l right join s on l.nn = s.nn join s1 on l.n = s1.n);
+
+explain (costs false) select count(*) from s right join l on s.nn = l.nn left join s1 on s.u = s1.u where s.nn not in (select l.nn from l right join s on l.nn = s.nn left join s1 on l.n = s1.n);
+
+select count(*) from s right join l on s.nn = l.nn left join s1 on s.u = s1.u where s.nn not in (select l.nn from l right join s on l.nn = s.nn left join s1 on l.n = s1.n);
+
+-- test union all
+explain (costs false) select * from s as t where not exists
+(select 1 from (select n as y from l union all
+				select u as y from s union all
+				select nn as y from s) as v where t.n=v.y or v.y is null) and n is not null;
+
+select * from s as t where not exists
+(select 1 from (select n as y from l union all
+				select u as y from s union all
+				select nn as y from s) as v where t.n=v.y or v.y is null) and n is not null;
+
+explain (costs false) select * from s where n not in
+(select n as y from l union all
+ select u as y from s union all
+ select nn as y from s);
+
+select * from s where n not in
+(select n as y from l union all
+ select u as y from s union all
+ select nn as y from s);
+
+explain (costs false) select count(*) from
+(select n as x from s union all select u as x from l) t where t.x not in
+(select nn from l);
+
+select count(*) from
+(select n as x from s union all select u as x from l) t where t.x not in
+(select nn from l);
+
+explain (costs false) select count(*) from
+(select n as x from s union all select n as x from l) t where t.x not in
+(select nn from empty);
+
+select count(*) from
+(select n as x from s union all select n as x from l) t where t.x not in
+(select nn from empty);
+
+explain (costs false) select count(*) from
+(select n as x from s union all select u as x from l) t where t.x not in
+(select n as y from l union all
+ select u as y from s union all
+ select nn as y from s);
+
+select count(*) from
+(select n as x from s union all select u as x from l) t where t.x not in
+(select n as y from l union all
+ select u as y from s union all
+ select nn as y from s);
+
+-- test multi-levels of NOT IN
+explain (costs false) select * from s where n not in (select n from s where n not in (select n from l));
+
+select * from s where n not in (select n from s where n not in (select n from l));
+
+explain (costs false) select * from s where n not in (select n from s where n not in (select u from l));
+
+select * from s where n not in (select n from s where n not in (select u from l));
+
+explain (costs false) select count(*) from s where u not in
+(select n from s1 where not exists
+ (select 1 from (select n from s1 where u not in (select n from l)) t where t.n = s.n));
+
+select count(*) from s where u not in
+(select n from s1 where not exists
+ (select 1 from (select n from s1 where u not in (select n from l)) t where t.n = s.n));
+
+explain (costs false) select * from s where n not in (select n from s1) and u not in (select u from s1) and nn not in (select nn from s1);
+
+select * from s where n not in (select n from s1) and u not in (select u from s1) and nn not in (select nn from s1);
+
+explain (costs false) select * from s where n not in (select n from s1) and u not in (select u from s1) and nn not in (select nn from l);
+
+select * from s where n not in (select n from s1) and u not in (select u from s1) and nn not in (select nn from l);
+
+explain (costs false) select count(*) from s where u not in
+(select n from s1 where not exists
+ (select 1 from (select n from s1 where u not in (select n from l)) t where t.n = s.n))
+and nn not in
+(select n from s1 where not exists
+ (select 1 from (select n from s1 where u not in (select n from l)) t where t.n = s.n));
+
+select count(*) from s where u not in
+(select n from s1 where not exists
+ (select 1 from (select n from s1 where u not in (select n from l)) t where t.n = s.n))
+and nn not in
+(select n from s1 where not exists
+ (select 1 from (select n from s1 where u not in (select n from l)) t where t.n = s.n));
+
+--test COALESCE
+explain (costs false) select * from s where COALESCE(n, -1) not in (select COALESCE(n, -1) from l);
+
+select * from s where COALESCE(n, -1) not in (select COALESCE(n, -1) from l);
+
+explain (costs false) select * from s where COALESCE(n, NULL, -1) not in (select COALESCE(n, NULL, -1) from l);
+
+select * from s where COALESCE(n, NULL, -1) not in (select COALESCE(n, NULL, -1) from l);
+
+explain (costs false) select * from s where COALESCE(n, NULL, NULL) not in (select COALESCE(n, NULL, NULL) from l);
+
+select * from s where COALESCE(n, NULL, NULL) not in (select COALESCE(n, NULL, NULL) from l);
+
+explain (costs false) select * from s where COALESCE(n, nn) not in (select COALESCE(n, nn) from l);
+
+select * from s where COALESCE(n, nn) not in (select COALESCE(n, nn) from l);
+
+explain (costs false) select * from s where COALESCE(nn, NULL) not in (select COALESCE(nn, NULL) from l);
+
+select * from s where COALESCE(nn, NULL) not in (select COALESCE(nn, NULL) from l);
+
+explain (costs false) select * from s where (COALESCE(n, -1), nn, COALESCE(n, u)) not in (select COALESCE(n, -1), nn, COALESCE(n, u) from l);
+
+select * from s where (COALESCE(n, -1), nn, COALESCE(n, u)) not in (select COALESCE(n, -1), nn, COALESCE(n, u) from l);
+
+-- test miscellaneous outer nullable cases
+
+explain (costs false) select * from s where (n,n) not in (select n,n from l);
+
+select * from s where (n,n) not in (select n,n from l);
+
+explain (costs false) select * from s right join l on s.nn = l.nn where (s.n,s.u,s.nn) not in (select n,u,nn from l);
+
+select * from s right join l on s.nn = l.nn where (s.n,s.u,s.nn) not in (select n,u,nn from l);
+
+explain (costs false) select count(*) from s right join l on s.nn = l.nn where (s.n,s.u,s.nn) not in (select n,u,nn from l where u < 0);
+
+select count(*) from s right join l on s.nn = l.nn where (s.n,s.u,s.nn) not in (select n,u,nn from l where u < 0);
+
+explain (costs false) select * from s where (n,n,n) not in (select distinct n,n,n from l where u > 0 limit 3) order by n;
+
+select * from s where (n,n,n) not in (select distinct n,n,n from l where u > 0 limit 3) order by n;
+
+--test outer has strict predicate or inner join
+explain (costs false) select * from s where n not in (select n from l) and n > 0;
+
+select * from s where n not in (select n from l) and n > 0;
+
+explain (costs false) select * from s where n not in (select n from l) and u > 0;
+
+select * from s where n not in (select n from l) and u > 0;
+
+explain (costs false) select * from s where n not in (select n from l) and n is not null;
+
+select * from s where n not in (select n from l) and n is not null;
+
+explain (costs false) select * from s join l on s.n = l.n where s.n not in (select n from l);
+
+select * from s join l on s.n = l.n where s.n not in (select n from l);
+
+explain (costs false) select count(*) from s right join l on s.n = l.n where s.n not in (select n from l);
+
+select count(*) from s right join l on s.n = l.n where s.n not in (select n from l);
+
+explain (costs false) select count(*) from s right join l on s.n = l.n join s1 on s.u = s1.u where s.n not in (select n from l);
+
+select count(*) from s right join l on s.n = l.n join s1 on s.u = s1.u where s.n not in (select n from l);
+
+explain (costs false) select count(*) from s join l on s.n = l.n right join s1 on s.u = s1.u where s.n not in (select n from l);
+
+select count(*) from s join l on s.n = l.n right join s1 on s.u = s1.u where s.n not in (select n from l);
+
+--test inner has strict predicate or inner join
+explain (costs false) select * from s where u not in (select n from l where n > 0);
+
+select * from s where u not in (select n from l where n > 0);
+
+explain (costs false) select * from s where u not in (select n from l where u > 0);
+
+select * from s where u not in (select n from l where u > 0);
+
+explain (costs false) select * from s where u not in (select n from l where n is not null);
+
+select * from s where u not in (select n from l where n is not null);
+
+explain (costs false) select * from s where u not in (select l.n from l join s on l.n=s.n);
+
+select * from s where u not in (select l.n from l join s on l.n=s.n);
+
+explain (costs false) select * from s where u not in (select l.n from l join s on l.u=s.u);
+
+select * from s where u not in (select l.n from l join s on l.u=s.u);
+
+explain (costs false) select * from s where u not in (select l.n from l join s on l.n = s.n);
+
+select * from s where u not in (select l.n from l join s on l.n = s.n);
+
+explain (costs false) select * from s where u not in (select l.n from l right join s on l.n = s.n);
+
+select * from s where u not in (select l.n from l right join s on l.n = s.n);
+
+explain (costs false) select * from s where u not in (select l.n from l right join s on l.n=s.n join s1 on l.n=s1.n);
+
+select * from s where u not in (select l.n from l right join s on l.n=s.n join s1 on l.n=s1.n);
+
+explain (costs false) select * from s where u not in (select l.n from l join s on l.n=s.n right join s1 on l.n=s1.n);
+
+select * from s where u not in (select l.n from l join s on l.n=s.n right join s1 on l.n=s1.n);
+
+--test both sides have strict predicate or inner join
+explain (costs false) select * from s where n not in (select n from l where n > 0) and n > 0;
+
+select * from s where n not in (select n from l where n > 0) and n > 0;
+
+explain (costs false) select * from s where n not in (select n from l where u > 0) and n > 0;
+
+select * from s where n not in (select n from l where u > 0) and n > 0;
+
+explain (costs false) select * from s where n not in (select n from l where n > 0) and u > 0;
+
+select * from s where n not in (select n from l where n > 0) and u > 0;
+
+explain (costs false) select * from s right join l on s.n = l.n join s1 on s.u = s1.u where s.n not in (select l.n from l right join s on l.n=s.n join s s1 on l.n=s1.n);
+
+select * from s right join l on s.n = l.n join s1 on s.u = s1.u where s.n not in (select l.n from l right join s on l.n=s.n join s s1 on l.n=s1.n);
+
+explain (costs false) select * from s right join l on s.n = l.n join s1 on s.u = s1.u where s.n not in (select l.n from l join s on l.n=s.n right join s s1 on l.n=s1.n);
+
+select * from s right join l on s.n = l.n join s1 on s.u = s1.u where s.n not in (select l.n from l join s on l.n=s.n right join s s1 on l.n=s1.n);
+
+explain (costs false) select * from s join l on s.n = l.n right join s1 on s.u = s1.u where s.n not in (select l.n from l join s on l.n=s.n right join s s1 on l.n=s1.n);
+
+select * from s join l on s.n = l.n right join s1 on s.u = s1.u where s.n not in (select l.n from l join s on l.n=s.n right join s s1 on l.n=s1.n);
+
+-- clean up
+set work_mem = 4000;
+drop table s;
+drop table s1;
+drop table l;
+drop table empty;
#51zhengli
zhelli@amazon.com
In reply to: Li, Zheng (#50)
Re: NOT IN subquery optimization

In our patch, we only proceed with the ANTI JOIN transformation if
subplan_is_hashable(subplan) is
false, it requires the subquery to be planned at this point.

To avoid planning the subquery again later on, I want to keep a pointer of
the subplan in SubLink so that we can directly reuse the subplan when
needed. However, this change breaks initdb for some reason and I'm trying to
figure it out.

I'll send the rebased patch in the following email since it's been a while.

--
Sent from: http://www.postgresql-archive.org/PostgreSQL-hackers-f1928748.html

#52Li, Zheng
zhelli@amazon.com
In reply to: zhengli (#51)
1 attachment(s)
Re: [UNVERIFIED SENDER] Re: NOT IN subquery optimization

Rebased patch is attached.

Comments are welcome.

-----------
Zheng Li
AWS, Amazon Aurora PostgreSQL

On 6/14/19, 5:39 PM, "zhengli" <zhelli@amazon.com> wrote:

In our patch, we only proceed with the ANTI JOIN transformation if
subplan_is_hashable(subplan) is
false, it requires the subquery to be planned at this point.

To avoid planning the subquery again later on, I want to keep a pointer of
the subplan in SubLink so that we can directly reuse the subplan when
needed. However, this change breaks initdb for some reason and I'm trying to
figure it out.

I'll send the rebased patch in the following email since it's been a while.

--
Sent from: http://www.postgresql-archive.org/PostgreSQL-hackers-f1928748.html

Attachments:

not_in_to_anti_join_transformation_v2.patchapplication/octet-stream; name=not_in_to_anti_join_transformation_v2.patchDownload
commit 3909b88d9e4cbafeac46fa6d46786620f342fca2
Author: Zheng Li <zhelli@amazon.com>
Date:   Sat Mar 16 20:08:11 2019 +0000

    NOT IN to ANTI JOIN transformation
    
    If the NOT IN subquery is not eligible for hashed subplan as decided by
    subplan_is_hashable(), do the following NOT IN to ANTI JOIN
    transformation:
    
    Single expression:
        When x is nullable:
        t1.x not in (t2.y where p) =>
        ANTI JOIN
        t1 (Filter: t1.x IS NOT NULL or NOT EXISTS (select 1 from t2 where p)),
        t2 on join condition (t1.x=t2.y or t2.y IS NULL)
        and p.
        The predicate "t2.y IS NULL" can be removed if y is non-nullable.
    
        When x is non-nullable:
        t1.x not in (t2.y where p) =>
        ANTI JOIN t1, t2 on join condition (t1.x=t2.y or t2.y IS NULL)
        and p.
        The predicate "t2.y IS NULL" can be removed if y is non-nullable.
    
    Multi expression:
        If all xi's are nullable:
        (x1, x2, ... xn) not in (y1, y2, ... yn ) =>
        ANTI JOIN t1, t2 on join condition:
        ((t1.x1 = t2.y1) and ... (t1.xi = t2.yi) ... and
         (t1.xn = t2.yn)) IS NOT FALSE.
    
        If at least one xi is non-nuallable:
        (x1, x2, ... xn) not in (y1, y2, ... yn ) =>
        ANTI JOIN t1, t2 on join condition:
        (t1.x1 = t2.y1 or t2.y1 IS NULL or t1.x1 IS NULL) and ...
        (t1.xi = t2.yi or t2.yi IS NULL) ... and
        (t1.xn = t2.yn or t2.yn IS NULL or t1.xn IS NULL).
    
    Add nullability testing routine is_node_nonnullable(), currently it
    handles Var, TargetEntry, CoalesceExpr and Const. It uses strict
    predicates, inner join conditions and NOT NULL constraint to check
    the nullability of a Var.
    
    Adjust and apply reduce_outer_joins() before the transformation so
    that the outer joins have an opportunity to be converted to inner joins
    prior to the transformation.
    
    We measured performance improvements of two to five orders of magnitude
    on most queries in a development environment. In our performance experiments,
    table s (small) has 11 rows, table l (large) has 1 million rows. s.n and l.n
    have NULL value. s.nn and l.nn are NOT NULL. Index is created on each column.
    
    Cases using hash anti join:
    l.nn not in (l.nn) 21900s -> 235ms
    l.nn not in (l.nn where u>0) 22000s -> 240ms
    l.n not in (l.nn) 21900s -> 238ms
    l.n not in (l.nn where u>0) 22000s -> 248ms
    
    Cases using index nested loop anti join
    s.n not in (l.nn) 360ms -> 0.5ms
    s.n not in (l.nn where u>0) 390ms -> 0.6ms
    s.nn not in (l.nn) 365ms -> 0.5ms
    s.nn not in (l.nn where u>0) 390ms -> 0.5ms
    
    Cases using bitmap heap scan on the inner and nested loop anti join:
    s.n not in (l.n) 360ms -> 0.7ms
    l.n not in (l.n) 21900s ->  1.6s
    l.n not in (l.n where u>0) 22000s -> 1680ms
    s.nn not in (l.n) 360ms -> 0.5ms
    l.nn not in (l.n) 21900s -> 1650ms
    l.nn not in (l.n where u>0) 22000s -> 1660ms
    
    Cases using the original hashed subplan:
    l.n not in (s.n) 63ms -> 63ms
    l.nn not in (s.nn) 63ms -> 63ms
    l.n not in (s.n where u>0) 63ms -> 63ms
    
    Authors: Zheng Li, Jim Finnerty
    Discussion:
    https://www.postgresql.org/message-id/flat/1550706289606-0.post%40n3.nabble.com

diff --git a/src/backend/optimizer/plan/planner.c b/src/backend/optimizer/plan/planner.c
index cb897cc7f4..6edb5f65f5 100644
--- a/src/backend/optimizer/plan/planner.c
+++ b/src/backend/optimizer/plan/planner.c
@@ -991,7 +991,7 @@ subquery_planner(PlannerGlobal *glob, Query *parse,
 	 * preprocessing.
 	 */
 	if (hasOuterJoins)
-		reduce_outer_joins(root);
+		reduce_outer_joins(parse);
 
 	/*
 	 * If we have any RTE_RESULT relations, see if they can be deleted from
diff --git a/src/backend/optimizer/plan/subselect.c b/src/backend/optimizer/plan/subselect.c
index efd0fbc21c..21322d0556 100644
--- a/src/backend/optimizer/plan/subselect.c
+++ b/src/backend/optimizer/plan/subselect.c
@@ -1173,6 +1173,357 @@ inline_cte_walker(Node *node, inline_cte_walker_context *context)
 	return expression_tree_walker(node, inline_cte_walker, context);
 }
 
+/* Returns a List of Nodes from the testexpr of an Any SubLink */
+static List *
+getTestExpr(SubLink *sublink)
+{
+	Node * testexpr = sublink->testexpr;
+	Assert(testexpr);
+
+	/* single expression */
+	if(IsA(testexpr, OpExpr))
+	{
+		OpExpr	*opexpr = (OpExpr *) testexpr;
+		Node	*testnode = linitial(opexpr->args);
+
+		return list_make1(testnode);
+	}
+	/* multi-expression */
+	else if(IsA(testexpr, BoolExpr))
+	{
+		BoolExpr	*bexpr = (BoolExpr *) testexpr;
+		ListCell		*lc;
+		Node		*node;
+		List			*result = NULL;
+
+		foreach(lc, bexpr->args)
+		{
+			node = lfirst(lc);
+			if(IsA(node, OpExpr))
+			{
+				OpExpr *expr = (OpExpr *) node;
+				result = lappend(result, linitial(expr->args));
+			}
+			else
+			{
+				elog(ERROR, "unrecognized node type for testexpr: %d",
+						(int) nodeTag(node));
+			}
+		}
+		return result;
+	}
+	else
+	{
+		elog(ERROR, "unrecognized node type for testexpr: %d",
+				(int) nodeTag(testexpr));
+	}
+}
+
+/* Try to reduce outer joins if there is one in the Query */
+static void
+reduce_outer_joins_NOT_IN(Query *parse)
+{
+	ListCell				*lc;
+	RangeTblEntry	*rte;
+
+	foreach(lc, parse->rtable)
+	{
+		rte = (RangeTblEntry *) lfirst(lc);
+		/* try to reduce outer joins if there is one */
+		if (rte->rtekind == RTE_JOIN &&
+				IS_OUTER_JOIN(rte->jointype))
+		{
+			reduce_outer_joins(parse);
+			return;
+		}
+	}
+}
+
+/*
+ * Make clause NOT EXISTS
+ * (select 1 from t2 where p) for the NOT IN to ANTI JOIN
+ * transformation.
+ */
+static
+Node *
+makeExistsTest_NOT_IN(Query *subselect)
+{
+	BoolExpr *notExpr = makeNode(BoolExpr);
+	SubLink *exists = makeNode(SubLink);
+	Query *selectOne =  copyObject(subselect);
+	Const *oneconst;
+	TargetEntry *dummyte;
+
+	/* modify subselect target list to contain a dummy const 1 */
+	oneconst = makeConst(INT4OID,
+						 -1,
+						 InvalidOid,
+						 sizeof(int32),
+						 Int32GetDatum(1),
+						 false, /* isnull */
+						 true); /* pass by value */
+	dummyte = makeTargetEntry((Expr *) oneconst,
+							  1,
+							  "one",
+							  false /* resjunk */ );
+	selectOne->targetList = list_make1(dummyte);
+
+	/* make EXISTS(select 1 from t2 where p) */
+	exists->subLinkType = EXISTS_SUBLINK;
+	exists->subLinkId = 0;
+	exists->subselect = (Node *) selectOne;
+	exists->location = -1;
+
+	/* make NOT EXISTS(select 1 from t2 where p) */
+	notExpr->boolop = NOT_EXPR;
+	notExpr->args = list_make1(exists);
+	notExpr->location = -1;
+
+	return (Node *) notExpr;
+}
+
+/*
+ * If the NOT IN subquery is not eligible for hashed subplan as
+ * decided by subplan_is_hashable(),
+ * do the following NOT IN to ANTI JOIN conversions:
+ *
+ * When x is non-nullable:
+ * t1.x not in (t2.y where p) => ANTI JOIN
+ * t1, t2 on join condition (t1.x=t2.y or t2.y IS NULL) and p.
+ * the above predicate "t2.y IS NULL" can be removed if y
+ * is also non-nullable.
+ *
+ * When x is nullable:
+ * t1.x not in (t2.y where p) => ANTI JOIN
+ * t1 (Filter: t1.x is not null or not exists (select 1 from t2 where p)),
+ * t2 on join condition (t1.x=t2.y or t2.y is null) and p.
+ * the above predicate "t2.y IS NULL" can be removed if y
+ * is also non-nullable.
+ *
+ * The multi-expression case is just ANDs of the single-
+ * expression case.
+ */
+static bool
+convert_NOT_IN_to_join(PlannerInfo *root, Node **quals,
+									SubLink *sublink, List *subquery_vars,
+									Node **pullout)
+{
+	Query			*parse = root->parse;
+	Query			*subselect = (Query *) sublink->subselect;
+	List				*testnodes = getTestExpr(sublink);
+	bool				 outerNonNull;
+	bool				 innerNonNull;
+	PlannerInfo 	*subroot;
+	double			 tuple_fraction;
+	RelOptInfo		*final_rel;
+	Path				*best_path;
+	Plan				*plan;
+	NullTest 		*nt;
+
+	/* For ALL and ANY subplans, we will be
+	 * able to stop evaluating if the test condition fails or matches, so very
+	 * often not all the tuples will be retrieved; for lack of a better idea,
+	 * specify 50% retrieval.
+	 */
+	tuple_fraction = 0.5;
+	/*
+	 * Generate Paths for the subquery, use a copied version of the subquery
+	 * so that the existing one doesn't get modified.
+	 */
+	subroot = subquery_planner(root->glob, copyObject(subselect),
+								root,
+							   false, tuple_fraction);
+
+	/* Select best Path and turn it into a Plan. */
+	final_rel = fetch_upper_rel(subroot, UPPERREL_FINAL, NULL);
+	best_path = get_cheapest_fractional_path(final_rel, tuple_fraction);
+
+	plan = create_plan(subroot, best_path);
+
+	/*
+	 * Punt if subplan is hashable since using hashed subplan is almost like
+	 * doing a Hash Anti Join, we probably can't do better than that.
+	 */
+	if(subplan_is_hashable(plan))
+	{
+		return false;
+	}
+
+	/*
+	 * Try reduce_outer_joins since outer join affects the nullability test that's coming up next.
+	 * We have to call reduce_outer_joins for outer and inner query separately because we
+	 * don't have a global range table yet.
+	 */
+	reduce_outer_joins_NOT_IN(parse);
+
+	reduce_outer_joins_NOT_IN(subselect);
+
+	Assert(testnodes);
+	outerNonNull =
+					list_hasnonnullable(testnodes, parse);
+	innerNonNull =
+					list_hasnonnullable(subselect->targetList, subselect);
+
+	/* Single-expression case, do the following:
+	 * When x is non-nullable:
+	 * t1.x not in (t2.y where p) => ANTI JOIN
+	 * t1, t2 on join condition (t1.x=t2.y or t2.y IS NULL) and p.
+	 * the above predicate "t2.y IS NULL" can be removed if y
+	 * is also non-nullable.
+	 *
+	 * When x is nullable:
+	 * t1.x not in (t2.y where p) => ANTI JOIN
+	 * t1 (Filter: t1.x is not null or not exists (select 1 from t2 where p)),
+	 * t2 on join condition (t1.x=t2.y or t2.y is null) and p.
+	 * the above predicate "t2.y IS NULL" can be removed if y
+	 * is also non-nullable.
+	 */
+	if(IsA(*quals, OpExpr))
+	{
+		/*  Add "OR y IS NULL" if y is nullable */
+		if(!innerNonNull)
+		{
+			/* make expr y IS NULL */
+			nt = makeNode(NullTest);
+			nt->arg = (Expr *)linitial(subquery_vars);
+			nt->nulltesttype = IS_NULL;
+			nt->argisrow = false;
+			nt->location = -1;
+
+			/* make orclause (x = y OR y IS NULL) */
+			*quals = (Node *)make_orclause(list_make2(*quals,
+								(Node *)nt));
+		}
+
+		/*
+		 * if x is nullable, make the following filter for t1 :
+		 * x IS NOT NULL or NOT EXISTS (select 1 from t2 where p)
+		 */
+		if(!outerNonNull)
+		{
+			Node * existsTest = NULL;
+
+			/* make expr x IS NOT NULL */
+			nt = makeNode(NullTest);
+			nt->arg = (Expr *) linitial(testnodes);
+			nt->nulltesttype = IS_NOT_NULL;
+			nt->argisrow = false;
+			nt->location = -1;
+
+			existsTest = makeExistsTest_NOT_IN(subselect);
+			/* make x IS NOT NULL OR NOT EXISTS (select 1 from t2 where p) */
+			*pullout = (Node *)make_orclause(list_make2(
+									(Node *) nt, existsTest));
+		}
+	}
+	/*
+	 * Multi-expression case:
+	 * If all xi's are nullable:
+	 * (x1, x2, ... xn) not in (y1, y2, ... yn ) =>
+	 * ANTI JOIN t1,
+	 * t2 on join condition:
+	 * ((t1.x1 = t2.y1) and ... (t1.xi = t2.yi) ... and
+	 * (t1.xn = t2.yn)) is NOT FALSE.
+	 *
+	 * If at least one xi is non-nuallable:
+	 * (x1, x2, ... xn) not in (y1, y2, ... yn ) =>
+	 * ANTI JOIN t1,
+	 * t2 on join condition:
+	 * (t1.x1 = t2.y1 or t2.y1 is NULL) and ...
+	 * (t1.xi = t2.yi or t2.yi is NULL or t1.xi is NULL) ... and
+	 * (t1.xn = t2.yn or t2.yn is NULL).
+	 */
+	else if(IsA(*quals, BoolExpr))
+	{
+		/*
+		 * Add IS NOT FALSE on top of the join condition if ALL x_i's are nullable
+		 */
+		if(!outerNonNull)
+		{
+			BooleanTest *btest;
+
+			btest = makeNode(BooleanTest);
+			btest->arg = (Expr *) *quals;
+			btest->booltesttype = IS_NOT_FALSE;
+			*quals = (Node *) btest;
+		}
+		else
+		{
+			ListCell			*qualc;
+			TargetEntry	*te;
+			ListCell			*xc = list_head(testnodes);
+			ListCell			*yc = list_head(subquery_vars);
+			ListCell			*ytlc = list_head(subselect->targetList);
+			List				*quallist = ((BoolExpr *)*quals)->args;
+			Node				*joinCondition = NULL;
+			bool				 xnonNull;
+			bool				 ynonNull;
+
+			/* Reconstruct quals in the loop */
+			*quals = NULL;
+			foreach(qualc, quallist)
+			{
+				te = (TargetEntry *)lfirst(ytlc);
+				/* x_i = y_i */
+				joinCondition = lfirst(qualc);
+				ynonNull = is_node_nonnullable((Node*)te, subselect);
+
+				/* append y_i IS NULL to x_i = y_i if y_i is non-nullable */
+				if(!ynonNull)
+				{
+					/* make expr y_i IS NULL */
+					nt = makeNode(NullTest);
+					nt->arg = (Expr *)lfirst(yc);
+					nt->nulltesttype = IS_NULL;
+					nt->argisrow = false;
+					nt->location = -1;
+
+					/* make orclause (x_i = y_i OR y_i IS NULL) */
+					joinCondition = (Node *)make_orclause(list_make2(joinCondition,
+											(Node *)nt));
+				}
+
+				/*
+				 * Append "OR x_i is null" to the join condition if x_i is nullable.
+				 * Notice at least one x_i should be non-nullable because the all
+				 * x_i's nullable case is handled earlier by adding "IS NOT FALSE"
+				 * on top of the join condition.
+				 */
+				xnonNull = is_node_nonnullable(lfirst(xc), parse);
+				if(!xnonNull)
+				{
+					/* make expr x_i IS NULL */
+					nt = makeNode(NullTest);
+					nt->arg = (Expr *)lfirst(xc);
+					nt->nulltesttype = IS_NULL;
+					nt->argisrow = false;
+					nt->location = -1;
+
+					/* make orclause (x_i = y_i OR y_i IS NULL OR x_i IS NULL) */
+					joinCondition = (Node *)make_orclause(list_make2(joinCondition,
+											(Node *)nt));
+				}
+
+				/*
+				 * Now append joinCondition to quals as one andclause.
+				 * (x_i = y_i OR y_i IS NULL OR x_i IS NULL) AND
+				 * (x_j = y_j OR y_j IS NULL OR x_j IS NULL)...
+				 */
+				*quals = (Node *)make_andclause(list_make2(*quals, joinCondition));
+				xc = lnext(xc);
+				yc = lnext(yc);
+				ytlc = lnext(ytlc);
+			}
+		}
+	}
+	/* quals should be either OpExpr or BoolExpr, otherwise don't convert */
+	else
+	{
+		return false;
+	}
+
+	return true;
+}
 
 /*
  * convert_ANY_sublink_to_join: try to convert an ANY SubLink to a join
@@ -1210,7 +1561,8 @@ inline_cte_walker(Node *node, inline_cte_walker_context *context)
  */
 JoinExpr *
 convert_ANY_sublink_to_join(PlannerInfo *root, SubLink *sublink,
-							Relids available_rels)
+											bool under_not, Node **pullout,
+											Relids available_rels)
 {
 	JoinExpr   *result;
 	Query	   *parse = root->parse;
@@ -1290,11 +1642,22 @@ convert_ANY_sublink_to_join(PlannerInfo *root, SubLink *sublink,
 	 */
 	quals = convert_testexpr(root, sublink->testexpr, subquery_vars);
 
+	/*
+	 * Try converting x NOT IN (y) to ANTI JOIN.
+	 */
+	if(under_not &&
+			!convert_NOT_IN_to_join(root, &quals,
+								sublink, subquery_vars, pullout))
+	{
+		return NULL;
+	}
+
 	/*
 	 * And finally, build the JoinExpr node.
 	 */
 	result = makeNode(JoinExpr);
-	result->jointype = JOIN_SEMI;
+	/* NOT IN will be converted to ANTI JOIN */
+	result->jointype = under_not ? JOIN_ANTI : JOIN_SEMI;
 	result->isNatural = false;
 	result->larg = NULL;		/* caller must fill this in */
 	result->rarg = (Node *) rtr;
diff --git a/src/backend/optimizer/prep/prepjointree.c b/src/backend/optimizer/prep/prepjointree.c
index 67eeba938d..edc716b679 100644
--- a/src/backend/optimizer/prep/prepjointree.c
+++ b/src/backend/optimizer/prep/prepjointree.c
@@ -104,7 +104,7 @@ static Query *pullup_replace_vars_subquery(Query *query,
 static reduce_outer_joins_state *reduce_outer_joins_pass1(Node *jtnode);
 static void reduce_outer_joins_pass2(Node *jtnode,
 									 reduce_outer_joins_state *state,
-									 PlannerInfo *root,
+									 Query *parse,
 									 Relids nonnullable_rels,
 									 List *nonnullable_vars,
 									 List *forced_null_vars);
@@ -256,6 +256,13 @@ pull_up_sublinks_jointree_recurse(PlannerInfo *root, Node *jtnode,
 		}
 		/* Build the replacement FromExpr; no quals yet */
 		newf = makeFromExpr(newfromlist, NULL);
+		/*
+		 * Replace parse->jointree with newf now because we might modify join types
+		 * during reduce_outer_joins() in convert_NOT_IN_to_join() which is called
+		 * in pull_up_sublinks_qual_recurse() that's coming up next.
+		 */
+		newf->quals = f->quals;
+		root->parse->jointree = newf;
 		/* Set up a link representing the rebuilt jointree */
 		jtlink = (Node *) newf;
 		/* Now process qual --- all children are available for use */
@@ -388,7 +395,7 @@ pull_up_sublinks_qual_recurse(PlannerInfo *root, Node *node,
 		/* Is it a convertible ANY or EXISTS clause? */
 		if (sublink->subLinkType == ANY_SUBLINK)
 		{
-			if ((j = convert_ANY_sublink_to_join(root, sublink,
+			if ((j = convert_ANY_sublink_to_join(root, sublink, false, NULL,
 												 available_rels1)) != NULL)
 			{
 				/* Yes; insert the new join node into the join tree */
@@ -414,7 +421,7 @@ pull_up_sublinks_qual_recurse(PlannerInfo *root, Node *node,
 				return NULL;
 			}
 			if (available_rels2 != NULL &&
-				(j = convert_ANY_sublink_to_join(root, sublink,
+				(j = convert_ANY_sublink_to_join(root, sublink, false, NULL,
 												 available_rels2)) != NULL)
 			{
 				/* Yes; insert the new join node into the join tree */
@@ -560,6 +567,68 @@ pull_up_sublinks_qual_recurse(PlannerInfo *root, Node *node,
 					return NULL;
 				}
 			}
+			else if (sublink->subLinkType == ANY_SUBLINK)
+			{
+				Node *pullout = NULL;
+
+				if ((j = convert_ANY_sublink_to_join(root, sublink, true, &pullout,
+													 available_rels1)) != NULL)
+				{
+					/* Yes; insert the new join node into the join tree */
+					j->larg = *jtlink1;
+					*jtlink1 = (Node *) j;
+					/* Recursively process pulled-up jointree nodes */
+					j->rarg = pull_up_sublinks_jointree_recurse(root,
+																j->rarg,
+																&child_rels);
+
+					/*
+					 * Now recursively process the pulled-up quals.  Any inserted
+					 * joins can get stacked onto either j->larg or j->rarg,
+					 * depending on which rels they reference.
+					 */
+					j->quals = pull_up_sublinks_qual_recurse(root,
+															 j->quals,
+															 &j->larg,
+															 available_rels1,
+															 &j->rarg,
+															 child_rels);
+					/*
+					 * Return pullout predicate (x is NOT NULL) if it's not null,
+					 * otherwise return NULL representing constant TRUE.
+					 */
+					return pullout? pullout : NULL;
+				}
+				if (available_rels2 != NULL &&
+					(j = convert_ANY_sublink_to_join(root, sublink, true, &pullout,
+													 available_rels2)) != NULL)
+				{
+					/* Yes; insert the new join node into the join tree */
+					j->larg = *jtlink2;
+					*jtlink2 = (Node *) j;
+					/* Recursively process pulled-up jointree nodes */
+					j->rarg = pull_up_sublinks_jointree_recurse(root,
+																j->rarg,
+																&child_rels);
+
+					/*
+					 * Now recursively process the pulled-up quals.  Any inserted
+					 * joins can get stacked onto either j->larg or j->rarg,
+					 * depending on which rels they reference.
+					 */
+					j->quals = pull_up_sublinks_qual_recurse(root,
+															 j->quals,
+															 &j->larg,
+															 available_rels2,
+															 &j->rarg,
+															 child_rels);
+					/*
+					 * Return pullout predicate (x is NOT NULL) if it's not null,
+					 * otherwise return NULL representing constant TRUE.
+					 */
+					return pullout? pullout : NULL;
+				}
+			}
 		}
 		/* Else return it unmodified */
 		return node;
@@ -2377,7 +2446,7 @@ flatten_simple_union_all(PlannerInfo *root)
  * alias-var expansion).
  */
 void
-reduce_outer_joins(PlannerInfo *root)
+reduce_outer_joins(Query *parse)
 {
 	reduce_outer_joins_state *state;
 
@@ -2390,14 +2459,14 @@ reduce_outer_joins(PlannerInfo *root)
 	 * join(s) below each side of each join clause. The second pass examines
 	 * qual clauses and changes join types as it descends the tree.
 	 */
-	state = reduce_outer_joins_pass1((Node *) root->parse->jointree);
+	state = reduce_outer_joins_pass1((Node *) parse->jointree);
 
 	/* planner.c shouldn't have called me if no outer joins */
 	if (state == NULL || !state->contains_outer)
 		elog(ERROR, "so where are the outer joins?");
 
-	reduce_outer_joins_pass2((Node *) root->parse->jointree,
-							 state, root, NULL, NIL, NIL);
+	reduce_outer_joins_pass2((Node *) parse->jointree,
+							 state, parse, NULL, NIL, NIL);
 }
 
 /*
@@ -2480,7 +2549,7 @@ reduce_outer_joins_pass1(Node *jtnode)
 static void
 reduce_outer_joins_pass2(Node *jtnode,
 						 reduce_outer_joins_state *state,
-						 PlannerInfo *root,
+						 Query *parse,
 						 Relids nonnullable_rels,
 						 List *nonnullable_vars,
 						 List *forced_null_vars)
@@ -2520,7 +2589,7 @@ reduce_outer_joins_pass2(Node *jtnode,
 			reduce_outer_joins_state *sub_state = lfirst(s);
 
 			if (sub_state->contains_outer)
-				reduce_outer_joins_pass2(lfirst(l), sub_state, root,
+				reduce_outer_joins_pass2(lfirst(l), sub_state, parse,
 										 pass_nonnullable_rels,
 										 pass_nonnullable_vars,
 										 pass_forced_null_vars);
@@ -2632,7 +2701,7 @@ reduce_outer_joins_pass2(Node *jtnode,
 		/* Apply the jointype change, if any, to both jointree node and RTE */
 		if (rtindex && jointype != j->jointype)
 		{
-			RangeTblEntry *rte = rt_fetch(rtindex, root->parse->rtable);
+			RangeTblEntry *rte = rt_fetch(rtindex, parse->rtable);
 
 			Assert(rte->rtekind == RTE_JOIN);
 			Assert(rte->jointype == j->jointype);
@@ -2717,7 +2786,7 @@ reduce_outer_joins_pass2(Node *jtnode,
 					pass_nonnullable_vars = NIL;
 					pass_forced_null_vars = NIL;
 				}
-				reduce_outer_joins_pass2(j->larg, left_state, root,
+				reduce_outer_joins_pass2(j->larg, left_state, parse,
 										 pass_nonnullable_rels,
 										 pass_nonnullable_vars,
 										 pass_forced_null_vars);
@@ -2739,7 +2808,7 @@ reduce_outer_joins_pass2(Node *jtnode,
 					pass_nonnullable_vars = NIL;
 					pass_forced_null_vars = NIL;
 				}
-				reduce_outer_joins_pass2(j->rarg, right_state, root,
+				reduce_outer_joins_pass2(j->rarg, right_state, parse,
 										 pass_nonnullable_rels,
 										 pass_nonnullable_vars,
 										 pass_forced_null_vars);
diff --git a/src/backend/optimizer/util/clauses.c b/src/backend/optimizer/util/clauses.c
index 2e84d6b3b4..7835ecf6ca 100644
--- a/src/backend/optimizer/util/clauses.c
+++ b/src/backend/optimizer/util/clauses.c
@@ -38,10 +38,12 @@
 #include "optimizer/optimizer.h"
 #include "optimizer/plancat.h"
 #include "optimizer/planmain.h"
+#include "optimizer/pathnode.h"
 #include "parser/analyze.h"
 #include "parser/parse_agg.h"
 #include "parser/parse_coerce.h"
 #include "parser/parse_func.h"
+#include "parser/parsetree.h"
 #include "rewrite/rewriteManip.h"
 #include "tcop/tcopprot.h"
 #include "utils/acl.h"
@@ -2011,6 +2013,271 @@ find_forced_null_var(Node *node)
 	return NULL;
 }
 
+/*
+ * find_innerjoined_rels
+ *		Traverse jointree to locate non-outerjoined-rels and quals above them
+ *
+ * We fill innerjoined_rels with the relids of all rels that are not below
+ * the nullable side of any outer join (which would cause their Vars to be
+ * possibly NULL regardless of what's in the catalogs).  In the same scan,
+ * we locate all WHERE and JOIN/ON quals that constrain these rels add them to
+ * the usable_quals list (forming a list with implicit-AND semantics).
+ *
+ * Top-level caller must initialize innerjoined_rels/usable_quals to NULL/NIL.
+ */
+static void
+find_innerjoined_rels(Node *jtnode,
+					  Relids *innerjoined_rels, List **usable_quals)
+{
+	if (jtnode == NULL)
+		return;
+	if (IsA(jtnode, RangeTblRef))
+	{
+		int			varno = ((RangeTblRef *) jtnode)->rtindex;
+
+		*innerjoined_rels = bms_add_member(*innerjoined_rels, varno);
+	}
+	else if (IsA(jtnode, FromExpr))
+	{
+		FromExpr   *f = (FromExpr *) jtnode;
+		ListCell   *lc;
+
+		/* All elements of the FROM list are allowable */
+		foreach(lc, f->fromlist)
+			find_innerjoined_rels((Node *) lfirst(lc),
+								  innerjoined_rels, usable_quals);
+		/* ... and its WHERE quals are too */
+		if (f->quals)
+			*usable_quals = lappend(*usable_quals, f->quals);
+	}
+	else if (IsA(jtnode, JoinExpr))
+	{
+		JoinExpr   *j = (JoinExpr *) jtnode;
+
+		switch (j->jointype)
+		{
+			case JOIN_INNER:
+				/* visit both children */
+				find_innerjoined_rels(j->larg,
+									  innerjoined_rels, usable_quals);
+				find_innerjoined_rels(j->rarg,
+									  innerjoined_rels, usable_quals);
+				/* and grab the ON quals too */
+				if (j->quals)
+					*usable_quals = lappend(*usable_quals, j->quals);
+				break;
+
+			case JOIN_LEFT:
+			case JOIN_SEMI:
+			case JOIN_ANTI:
+
+				/*
+				 * Only the left input is possibly non-nullable; furthermore,
+				 * the quals of this join don't constrain the left input.
+				 * Note: we probably can't see SEMI or ANTI joins at this
+				 * point, but if we do, we can treat them like LEFT joins.
+				 */
+				find_innerjoined_rels(j->larg,
+									  innerjoined_rels, usable_quals);
+				break;
+
+			case JOIN_RIGHT:
+				/* Reverse of the above case */
+				find_innerjoined_rels(j->rarg,
+									  innerjoined_rels, usable_quals);
+				break;
+
+			case JOIN_FULL:
+				/* Neither side is non-nullable, so stop descending */
+				break;
+
+			case JOIN_UNIQUE_OUTER:
+			case JOIN_UNIQUE_INNER:
+				/* Don't think we will see JOIN_UNIQUE_OUTER or
+				 * JOIN_UNIQUE_INNER since they are only used internally in
+				 * the planner in a much later phase (in standard_join_search).
+				*/
+				break;
+
+			default:
+				elog(ERROR, "unrecognized join type: %d",
+					 (int) j->jointype);
+		}
+	}
+	else
+		elog(ERROR, "unrecognized node type: %d",
+			 (int) nodeTag(jtnode));
+}
+
+/*
+ * Returns true if the Node passed in is nonnullable. Currently handles Var,
+ * TargetEntry, CoaleseExpr and Const.
+ * TODO: Add more supporting cases.
+ * A Var is nonnullable if:
+ *	It does not appear on the null-padded side of an outer join and it has NOT NULL constraint,
+ * 	Or if it's forced non-null by inner join or other strict predicates.
+ * A CoalesceExpr is nonnullable if it has a non-null argument.
+ */
+bool
+is_node_nonnullable(Node * node, Query *parse)
+{
+	AttrNumber	 attno = InvalidAttrNumber;
+	Oid				 reloid;
+	Var 				*var = NULL;
+
+	/*
+	 * If the query contains set operations, punt.  The set ops themselves
+	 * couldn't introduce nulls that weren't in their inputs, but the tlist
+	 * present in the top-level query is just dummy and won't give us useful
+	 * info.  We could get an answer by recursing to examine each leaf query,
+	 * but for the moment it doesn't seem worth the extra complication.
+	 *
+	 * Note that we needn't consider other top-level operators such as
+	 * DISTINCT, GROUP BY, etc, as those will not introduce nulls either.
+	 */
+	if (parse->setOperations)
+		return false;
+
+	switch (nodeTag(node))
+	{
+		case T_Var:
+		{
+			RangeTblEntry	*rte;
+
+			var = (Var *) node;
+			attno = var->varattno;
+			rte = rt_fetch(var->varno, parse->rtable);
+			reloid = rte->relid;
+			break;
+		}
+		case T_TargetEntry:
+		{
+			TargetEntry	*te = (TargetEntry *)node;
+			switch(nodeTag(te->expr))
+			{
+				case T_Var:
+				{
+					var = (Var *) te->expr;
+					attno = te->resorigcol;
+					reloid = te->resorigtbl;
+					break;
+				}
+				/* recurse into is_node_nonnullable for other types of Node */
+				default:
+					return is_node_nonnullable((Node *)te->expr, parse);
+			}
+			break;
+		}
+		case T_CoalesceExpr:
+		{
+			ListCell   			*arg;
+			CoalesceExpr	*cexpr = (CoalesceExpr *)node;
+
+			/* handle COALESCE Function by looking for non-null argument */
+			foreach(arg, cexpr->args)
+			{
+				Node	*e = lfirst(arg);
+
+				/* recurse into is_node_nonnullable */
+				if (is_node_nonnullable(e, parse))
+				{
+					return true;
+				}
+			}
+			break;
+		}
+		case T_Const:
+		{
+			if(!((Const *) node)->constisnull)
+			{
+				return true;
+			}
+			break;
+		}
+		/*
+		 * TODO: handle more cases to make the nullability test more accurate
+		 * Assume unhandled cases are nullable.
+		 */
+		default:
+			return false;
+	}
+
+	/*
+	 * If we have found a Var, it is non-null if:
+	 * not on NULL-padded side of an outer join and has NOT NULL constraint
+	 * or forced NOT NULL by inner join conditions or by other strict predicates.
+	 */
+	if(var &&
+			reloid != InvalidOid)
+	{
+		Relids		 innerjoined_rels = NULL;
+		List	   		*innerjoined_useful_quals = NIL;
+		List			*nonnullable_innerjoined_vars = NIL;
+
+		find_innerjoined_rels((Node *) parse->jointree,
+							  &innerjoined_rels,
+							  &innerjoined_useful_quals);
+
+		/*
+		 * Check if the Var is from an INNER JOINed rel, it's also guaranteed
+		 * to not be on the null-padded side of an outer join.
+		 */
+		if (bms_is_member(var->varno, innerjoined_rels))
+		{
+			/*
+			 * If Var is from a plain relation and its column is marked
+			 * NOT NULL according to the catalogs, it can't produce NULL.
+			 */
+			if (get_attnotnull(reloid, attno))
+			{
+				return true;
+			}
+
+			/*
+			 * Otherwise check for the existance of strict predicates which filter
+			 * out NULL values for this Var.
+			 */
+			nonnullable_innerjoined_vars =
+				find_nonnullable_vars((Node *) innerjoined_useful_quals);
+
+			if (list_member(nonnullable_innerjoined_vars, var))
+			{
+				return true;
+			}
+		}
+
+		/*
+		 * If we get here it means -
+		 * The var is on null-padded side of an outer-join, since we've already
+		 * tried reduce_outer_joins(), there isn't any strict predicates to turn this
+		 * outer join to inner join, then there will be no strict predicates to force
+		 * this var non-null as well.
+		 */
+	}
+
+	return false;
+}
+
+/*
+ * Returns true if at least one Node in the input list is non-nullable
+ */
+bool
+list_hasnonnullable(List * list, Query *parse)
+{
+	ListCell	*lc;
+	Node	*node;
+
+	foreach(lc, list)
+	{
+		node = lfirst(lc);
+		if(is_node_nonnullable(node, parse))
+		{
+			return true;
+		}
+	}
+	return false;
+}
+
 /*
  * Can we treat a ScalarArrayOpExpr as strict?
  *
diff --git a/src/backend/utils/cache/lsyscache.c b/src/backend/utils/cache/lsyscache.c
index c13c08a97b..59aa736f1e 100644
--- a/src/backend/utils/cache/lsyscache.c
+++ b/src/backend/utils/cache/lsyscache.c
@@ -908,6 +908,35 @@ get_atttypetypmodcoll(Oid relid, AttrNumber attnum,
 	ReleaseSysCache(tp);
 }
 
+/*
+ * get_attnotnull
+ *		Given the relation id and the attribute number,
+ *		return the "attnotnull" field from the attribute relation.
+ */
+bool
+get_attnotnull(Oid relid, AttrNumber attnum)
+{
+	HeapTuple	  tp;
+	Form_pg_attribute att_tup;
+
+	tp = SearchSysCache2(ATTNUM,
+			ObjectIdGetDatum(relid),
+			Int16GetDatum(attnum));
+
+	if (HeapTupleIsValid(tp))
+	{
+		bool result;
+
+		att_tup = (Form_pg_attribute) GETSTRUCT(tp);
+		result = att_tup->attnotnull;
+		ReleaseSysCache(tp);
+
+		return result;
+	}
+	/* Assume att is nullable if no valid heap tuple is found */
+	return false;
+}
+
 /*				---------- COLLATION CACHE ----------					 */
 
 /*
diff --git a/src/include/optimizer/clauses.h b/src/include/optimizer/clauses.h
index 2f9aeec4a7..b3229e96a7 100644
--- a/src/include/optimizer/clauses.h
+++ b/src/include/optimizer/clauses.h
@@ -45,6 +45,9 @@ extern List *find_nonnullable_vars(Node *clause);
 extern List *find_forced_null_vars(Node *clause);
 extern Var *find_forced_null_var(Node *clause);
 
+extern bool is_node_nonnullable(Node * node, Query *parse);
+extern bool list_hasnonnullable(List * list, Query *parse);
+
 extern bool is_pseudo_constant_clause(Node *clause);
 extern bool is_pseudo_constant_clause_relids(Node *clause, Relids relids);
 
diff --git a/src/include/optimizer/prep.h b/src/include/optimizer/prep.h
index a9b2c9026c..ec8af5abf4 100644
--- a/src/include/optimizer/prep.h
+++ b/src/include/optimizer/prep.h
@@ -26,7 +26,7 @@ extern void pull_up_sublinks(PlannerInfo *root);
 extern void inline_set_returning_functions(PlannerInfo *root);
 extern void pull_up_subqueries(PlannerInfo *root);
 extern void flatten_simple_union_all(PlannerInfo *root);
-extern void reduce_outer_joins(PlannerInfo *root);
+extern void reduce_outer_joins(Query *parse);
 extern void remove_useless_result_rtes(PlannerInfo *root);
 extern Relids get_relids_in_jointree(Node *jtnode, bool include_joins);
 extern Relids get_relids_for_join(Query *query, int joinrelid);
diff --git a/src/include/optimizer/subselect.h b/src/include/optimizer/subselect.h
index 71a22eeb64..77c3730e47 100644
--- a/src/include/optimizer/subselect.h
+++ b/src/include/optimizer/subselect.h
@@ -19,6 +19,8 @@
 extern void SS_process_ctes(PlannerInfo *root);
 extern JoinExpr *convert_ANY_sublink_to_join(PlannerInfo *root,
 											 SubLink *sublink,
+											 bool under_not,
+											 Node **pullout,
 											 Relids available_rels);
 extern JoinExpr *convert_EXISTS_sublink_to_join(PlannerInfo *root,
 												SubLink *sublink,
diff --git a/src/include/utils/lsyscache.h b/src/include/utils/lsyscache.h
index c8df5bff9f..d2bf81985d 100644
--- a/src/include/utils/lsyscache.h
+++ b/src/include/utils/lsyscache.h
@@ -90,6 +90,7 @@ extern char get_attgenerated(Oid relid, AttrNumber attnum);
 extern Oid	get_atttype(Oid relid, AttrNumber attnum);
 extern void get_atttypetypmodcoll(Oid relid, AttrNumber attnum,
 								  Oid *typid, int32 *typmod, Oid *collid);
+extern bool get_attnotnull(Oid relid, AttrNumber attnum);
 extern char *get_collation_name(Oid colloid);
 extern bool get_collation_isdeterministic(Oid colloid);
 extern char *get_constraint_name(Oid conoid);
diff --git a/src/test/regress/expected/subselect.out b/src/test/regress/expected/subselect.out
index 2d1963d12a..ca2a88e139 100644
--- a/src/test/regress/expected/subselect.out
+++ b/src/test/regress/expected/subselect.out
@@ -1442,3 +1442,2208 @@ select * from x for update;
    Output: subselect_tbl.f1, subselect_tbl.f2, subselect_tbl.f3
 (2 rows)
 
+-- test NON IN to ANTI JOIN conversion
+CREATE TABLE s (u INTEGER NOT NULL, n INTEGER NULL, nn INTEGER NOT NULL, p VARCHAR(128) NULL);
+insert into s (u, n, nn, p)
+    select
+    generate_series(1,3) as u,
+	generate_series(1,3) as n,
+	generate_series(1,3) as nn,
+	'foo' as p;
+insert into s values(1000002, 1000002, 1000002, 'foofoo');
+UPDATE s set n = NULL WHERE n = 3;
+analyze s;
+CREATE TABLE l (u INTEGER NOT NULL, n INTEGER NULL, nn INTEGER NOT NULL, p VARCHAR(128) NULL);
+insert into l (u, n, nn, p)
+	select
+    generate_series(1,10000 ) as u,
+	generate_series(1,10000 ) as n,
+	generate_series(1,10000 ) as nn,
+	'bar' as p;
+UPDATE l set n = NULL WHERE n = 7;
+CREATE UNIQUE INDEX l_u ON l (u);
+CREATE INDEX l_n ON l (n);
+CREATE INDEX l_nn ON l (nn);
+analyze l;
+CREATE TABLE s1 (u INTEGER NOT NULL, n INTEGER NULL, n1 INTEGER NULL, nn INTEGER NOT NULL, p VARCHAR(128) NULL);
+insert into s1 (u, n, n1, nn, p)
+    select
+    generate_series(1,3) as u,
+	generate_series(1,3) as n,
+	generate_series(1,3) as n1,
+	generate_series(1,3) as nn,
+	'foo' as p;
+insert into s1 values(1000003, 1000003, 1000003, 1000003, 'foofoo');
+insert into s1 values(1003, 1003, 1003, 1003, 'foofoo');
+UPDATE s1 set n = NULL WHERE n = 3;
+UPDATE s1 set n1 = NULL WHERE n = 2;
+UPDATE s1 set n1 = NULL WHERE n1 = 3;
+analyze s1;
+CREATE TABLE empty (u INTEGER NOT NULL, n INTEGER NULL, nn INTEGER NOT NULL, p VARCHAR(128) NULL);
+analyze empty;
+-- set work_mem to 64KB so that NOT IN to ANTI JOIN optimization will kick in
+set work_mem = 64;
+-- correctness test 1: inner empty, return every thing from outer including NULL
+explain (costs false) select * from s where n not in (select n from empty);
+             QUERY PLAN             
+------------------------------------
+ Seq Scan on s
+   Filter: (NOT (hashed SubPlan 1))
+   SubPlan 1
+     ->  Seq Scan on empty
+(4 rows)
+
+select * from s where n not in (select n from empty);
+    u    |    n    |   nn    |   p    
+---------+---------+---------+--------
+       1 |       1 |       1 | foo
+       2 |       2 |       2 | foo
+ 1000002 | 1000002 | 1000002 | foofoo
+       3 |         |       3 | foo
+(4 rows)
+
+-- correctness test 2: inner has NULL, return empty result
+explain (costs false) select * from s where n not in (select n from l);
+                    QUERY PLAN                    
+--------------------------------------------------
+ Nested Loop Anti Join
+   InitPlan 1 (returns $0)
+     ->  Seq Scan on l l_1
+   ->  Seq Scan on s
+         Filter: ((n IS NOT NULL) OR (NOT $0))
+   ->  Bitmap Heap Scan on l
+         Recheck Cond: ((s.n = n) OR (n IS NULL))
+         ->  BitmapOr
+               ->  Bitmap Index Scan on l_n
+                     Index Cond: (n = s.n)
+               ->  Bitmap Index Scan on l_n
+                     Index Cond: (n IS NULL)
+(12 rows)
+
+select * from s where n not in (select n from l);
+ u | n | nn | p 
+---+---+----+---
+(0 rows)
+
+-- correctness test 3: inner non-null, result has no NULL
+explain (costs false) select * from s where n not in (select u from l);
+                  QUERY PLAN                   
+-----------------------------------------------
+ Nested Loop Anti Join
+   InitPlan 1 (returns $0)
+     ->  Seq Scan on l l_1
+   ->  Seq Scan on s
+         Filter: ((n IS NOT NULL) OR (NOT $0))
+   ->  Index Only Scan using l_u on l
+         Index Cond: (u = s.n)
+(7 rows)
+
+select * from s where n not in (select u from l);
+    u    |    n    |   nn    |   p    
+---------+---------+---------+--------
+ 1000002 | 1000002 | 1000002 | foofoo
+(1 row)
+
+-- correctness test 4: inner has predicate
+explain (costs false) select * from s where n not in (select n from l where u > 7);
+                    QUERY PLAN                    
+--------------------------------------------------
+ Nested Loop Anti Join
+   InitPlan 1 (returns $0)
+     ->  Seq Scan on l l_1
+           Filter: (u > 7)
+   ->  Seq Scan on s
+         Filter: ((n IS NOT NULL) OR (NOT $0))
+   ->  Bitmap Heap Scan on l
+         Recheck Cond: ((s.n = n) OR (n IS NULL))
+         Filter: (u > 7)
+         ->  BitmapOr
+               ->  Bitmap Index Scan on l_n
+                     Index Cond: (n = s.n)
+               ->  Bitmap Index Scan on l_n
+                     Index Cond: (n IS NULL)
+(14 rows)
+
+select * from s where n not in (select n from l where u > 7);
+    u    |    n    |   nn    |   p    
+---------+---------+---------+--------
+       1 |       1 |       1 | foo
+       2 |       2 |       2 | foo
+ 1000002 | 1000002 | 1000002 | foofoo
+(3 rows)
+
+-- correctness test 5: multi-expression, (2, 2, null, 2, foo) should be in the result
+explain (costs false) select * from s1 where (n,n1) not in (select u,nn from l where u >= 3);
+                           QUERY PLAN                            
+-----------------------------------------------------------------
+ Nested Loop Anti Join
+   Join Filter: (((s1.n = l.u) AND (s1.n1 = l.nn)) IS NOT FALSE)
+   ->  Seq Scan on s1
+   ->  Materialize
+         ->  Seq Scan on l
+               Filter: (u >= 3)
+(6 rows)
+
+select * from s1 where (n,n1) not in (select u,nn from l where u >= 3);
+    u    |    n    |   n1    |   nn    |   p    
+---------+---------+---------+---------+--------
+       1 |       1 |       1 |       1 | foo
+ 1000003 | 1000003 | 1000003 | 1000003 | foofoo
+       2 |       2 |         |       2 | foo
+(3 rows)
+
+-- correctness test 6: multi-expression, (3, null, null, 3, foo) should not be in the result
+explain (costs false) select * from s1 where (n,n1) not in (select u,nn from l where u > 0);
+                           QUERY PLAN                            
+-----------------------------------------------------------------
+ Nested Loop Anti Join
+   Join Filter: (((s1.n = l.u) AND (s1.n1 = l.nn)) IS NOT FALSE)
+   ->  Seq Scan on s1
+   ->  Materialize
+         ->  Seq Scan on l
+               Filter: (u > 0)
+(6 rows)
+
+select * from s1 where (n,n1) not in (select u,nn from l where u > 0);
+    u    |    n    |   n1    |   nn    |   p    
+---------+---------+---------+---------+--------
+ 1000003 | 1000003 | 1000003 | 1000003 | foofoo
+(1 row)
+
+-- correctness test 6: multi-expression, (3, null, null, 3, foo) should be in the result
+explain (costs false) select * from s1 where (n,n1) not in (select u,nn from l where u < 0);
+             QUERY PLAN             
+------------------------------------
+ Seq Scan on s1
+   Filter: (NOT (hashed SubPlan 1))
+   SubPlan 1
+     ->  Index Scan using l_u on l
+           Index Cond: (u < 0)
+(5 rows)
+
+select * from s1 where (n,n1) not in (select u,nn from l where u < 0);
+    u    |    n    |   n1    |   nn    |   p    
+---------+---------+---------+---------+--------
+       1 |       1 |       1 |       1 | foo
+ 1000003 | 1000003 | 1000003 | 1000003 | foofoo
+    1003 |    1003 |    1003 |    1003 | foofoo
+       2 |       2 |         |       2 | foo
+       3 |         |         |       3 | foo
+(5 rows)
+
+-- test using hashed subplan when inner fits in work_mem
+explain (costs false) select * from l where n not in (select n from s);
+             QUERY PLAN             
+------------------------------------
+ Seq Scan on l
+   Filter: (NOT (hashed SubPlan 1))
+   SubPlan 1
+     ->  Seq Scan on s
+(4 rows)
+
+select * from l where n not in (select n from s);
+ u | n | nn | p 
+---+---+----+---
+(0 rows)
+
+-- test single expression
+explain (costs false) select * from s where n not in (select n from l);
+                    QUERY PLAN                    
+--------------------------------------------------
+ Nested Loop Anti Join
+   InitPlan 1 (returns $0)
+     ->  Seq Scan on l l_1
+   ->  Seq Scan on s
+         Filter: ((n IS NOT NULL) OR (NOT $0))
+   ->  Bitmap Heap Scan on l
+         Recheck Cond: ((s.n = n) OR (n IS NULL))
+         ->  BitmapOr
+               ->  Bitmap Index Scan on l_n
+                     Index Cond: (n = s.n)
+               ->  Bitmap Index Scan on l_n
+                     Index Cond: (n IS NULL)
+(12 rows)
+
+select * from s where n not in (select n from l);
+ u | n | nn | p 
+---+---+----+---
+(0 rows)
+
+explain (costs false) select * from s where u not in (select u from l);
+              QUERY PLAN              
+--------------------------------------
+ Nested Loop Anti Join
+   ->  Seq Scan on s
+   ->  Index Only Scan using l_u on l
+         Index Cond: (u = s.u)
+(4 rows)
+
+select * from s where u not in (select u from l);
+    u    |    n    |   nn    |   p    
+---------+---------+---------+--------
+ 1000002 | 1000002 | 1000002 | foofoo
+(1 row)
+
+explain (costs false) select * from s where 3*n not in (select n from l);
+                       QUERY PLAN                       
+--------------------------------------------------------
+ Nested Loop Anti Join
+   InitPlan 1 (returns $0)
+     ->  Seq Scan on l l_1
+   ->  Seq Scan on s
+         Filter: (((3 * n) IS NOT NULL) OR (NOT $0))
+   ->  Bitmap Heap Scan on l
+         Recheck Cond: (((3 * s.n) = n) OR (n IS NULL))
+         ->  BitmapOr
+               ->  Bitmap Index Scan on l_n
+                     Index Cond: (n = (3 * s.n))
+               ->  Bitmap Index Scan on l_n
+                     Index Cond: (n IS NULL)
+(12 rows)
+
+select * from s where 3*n not in (select n from l);
+ u | n | nn | p 
+---+---+----+---
+(0 rows)
+
+explain (costs false) select * from s where n not in (select 3*n from l);
+                        QUERY PLAN                         
+-----------------------------------------------------------
+ Nested Loop Anti Join
+   Join Filter: ((s.n = (3 * l.n)) OR ((3 * l.n) IS NULL))
+   InitPlan 1 (returns $0)
+     ->  Seq Scan on l l_1
+   ->  Seq Scan on s
+         Filter: ((n IS NOT NULL) OR (NOT $0))
+   ->  Materialize
+         ->  Seq Scan on l
+(8 rows)
+
+select * from s where n not in (select 3*n from l);
+ u | n | nn | p 
+---+---+----+---
+(0 rows)
+
+-- test single expression with predicates
+explain (costs false) select * from s where n not in (select n from l where u > 0);
+                    QUERY PLAN                    
+--------------------------------------------------
+ Nested Loop Anti Join
+   InitPlan 1 (returns $0)
+     ->  Seq Scan on l l_1
+           Filter: (u > 0)
+   ->  Seq Scan on s
+         Filter: ((n IS NOT NULL) OR (NOT $0))
+   ->  Bitmap Heap Scan on l
+         Recheck Cond: ((s.n = n) OR (n IS NULL))
+         Filter: (u > 0)
+         ->  BitmapOr
+               ->  Bitmap Index Scan on l_n
+                     Index Cond: (n = s.n)
+               ->  Bitmap Index Scan on l_n
+                     Index Cond: (n IS NULL)
+(14 rows)
+
+select * from s where n not in (select n from l where u > 0);
+ u | n | nn | p 
+---+---+----+---
+(0 rows)
+
+explain (costs false) select * from s where n not in (select n from l where u > 100);
+                    QUERY PLAN                    
+--------------------------------------------------
+ Nested Loop Anti Join
+   InitPlan 1 (returns $0)
+     ->  Seq Scan on l l_1
+           Filter: (u > 100)
+   ->  Seq Scan on s
+         Filter: ((n IS NOT NULL) OR (NOT $0))
+   ->  Bitmap Heap Scan on l
+         Recheck Cond: ((s.n = n) OR (n IS NULL))
+         Filter: (u > 100)
+         ->  BitmapOr
+               ->  Bitmap Index Scan on l_n
+                     Index Cond: (n = s.n)
+               ->  Bitmap Index Scan on l_n
+                     Index Cond: (n IS NULL)
+(14 rows)
+
+select * from s where n not in (select n from l where u > 100);
+    u    |    n    |   nn    |   p    
+---------+---------+---------+--------
+       1 |       1 |       1 | foo
+       2 |       2 |       2 | foo
+ 1000002 | 1000002 | 1000002 | foofoo
+(3 rows)
+
+-- test multi expression
+explain (costs false) select * from s where (n,u) not in (select n,u from l);
+                         QUERY PLAN                          
+-------------------------------------------------------------
+ Nested Loop Anti Join
+   ->  Seq Scan on s
+   ->  Index Scan using l_u on l
+         Index Cond: (u = s.u)
+         Filter: ((s.n = n) OR (n IS NULL) OR (s.n IS NULL))
+(5 rows)
+
+select * from s where (n,u) not in (select n,u from l);
+    u    |    n    |   nn    |   p    
+---------+---------+---------+--------
+ 1000002 | 1000002 | 1000002 | foofoo
+(1 row)
+
+explain (costs false) select * from s where (u, nn) not in (select u, nn from l);
+            QUERY PLAN            
+----------------------------------
+ Nested Loop Anti Join
+   ->  Seq Scan on s
+   ->  Index Scan using l_nn on l
+         Index Cond: (nn = s.nn)
+         Filter: (s.u = u)
+(5 rows)
+
+select * from s where (u, nn) not in (select u, nn from l);
+    u    |    n    |   nn    |   p    
+---------+---------+---------+--------
+ 1000002 | 1000002 | 1000002 | foofoo
+(1 row)
+
+explain (costs false) select * from s where (n,u) not in (select u,n from l);
+                    QUERY PLAN                    
+--------------------------------------------------
+ Nested Loop Anti Join
+   ->  Seq Scan on s
+   ->  Bitmap Heap Scan on l
+         Recheck Cond: ((s.u = n) OR (n IS NULL))
+         Filter: ((s.n = u) OR (s.n IS NULL))
+         ->  BitmapOr
+               ->  Bitmap Index Scan on l_n
+                     Index Cond: (n = s.u)
+               ->  Bitmap Index Scan on l_n
+                     Index Cond: (n IS NULL)
+(10 rows)
+
+select * from s where (n,u) not in (select u,n from l);
+    u    |    n    |   nn    |   p    
+---------+---------+---------+--------
+ 1000002 | 1000002 | 1000002 | foofoo
+(1 row)
+
+explain (costs false) select * from s where (n,u,nn) not in (select u,n,nn from l);
+                                  QUERY PLAN                                   
+-------------------------------------------------------------------------------
+ Nested Loop Anti Join
+   ->  Seq Scan on s
+   ->  Index Scan using l_nn on l
+         Index Cond: (nn = s.nn)
+         Filter: (((s.n = u) OR (s.n IS NULL)) AND ((s.u = n) OR (n IS NULL)))
+(5 rows)
+
+select * from s where (n,u,nn) not in (select u,n,nn from l);
+    u    |    n    |   nn    |   p    
+---------+---------+---------+--------
+ 1000002 | 1000002 | 1000002 | foofoo
+(1 row)
+
+explain (costs false) select * from s where (n,u,nn) not in (select u,n,nn from l where u > 1000);
+                                          QUERY PLAN                                          
+----------------------------------------------------------------------------------------------
+ Nested Loop Anti Join
+   ->  Seq Scan on s
+   ->  Index Scan using l_nn on l
+         Index Cond: (nn = s.nn)
+         Filter: ((u > 1000) AND ((s.n = u) OR (s.n IS NULL)) AND ((s.u = n) OR (n IS NULL)))
+(5 rows)
+
+select * from s where (n,u,nn) not in (select u,n,nn from l where u > 1000);
+    u    |    n    |   nn    |   p    
+---------+---------+---------+--------
+       1 |       1 |       1 | foo
+       2 |       2 |       2 | foo
+ 1000002 | 1000002 | 1000002 | foofoo
+       3 |         |       3 | foo
+(4 rows)
+
+explain (costs false) select * from s where (n,u,nn) not in (select u,n,nn from l where u > 0);
+                                        QUERY PLAN                                         
+-------------------------------------------------------------------------------------------
+ Nested Loop Anti Join
+   ->  Seq Scan on s
+   ->  Index Scan using l_nn on l
+         Index Cond: (nn = s.nn)
+         Filter: ((u > 0) AND ((s.n = u) OR (s.n IS NULL)) AND ((s.u = n) OR (n IS NULL)))
+(5 rows)
+
+select * from s where (n,u,nn) not in (select u,n,nn from l where u > 0);
+    u    |    n    |   nn    |   p    
+---------+---------+---------+--------
+ 1000002 | 1000002 | 1000002 | foofoo
+(1 row)
+
+explain (costs false) select * from s where (n,u,nn) not in (select u,n,nn from l where u > 1);
+                                        QUERY PLAN                                         
+-------------------------------------------------------------------------------------------
+ Nested Loop Anti Join
+   ->  Seq Scan on s
+   ->  Index Scan using l_nn on l
+         Index Cond: (nn = s.nn)
+         Filter: ((u > 1) AND ((s.n = u) OR (s.n IS NULL)) AND ((s.u = n) OR (n IS NULL)))
+(5 rows)
+
+select * from s where (n,u,nn) not in (select u,n,nn from l where u > 1);
+    u    |    n    |   nn    |   p    
+---------+---------+---------+--------
+       1 |       1 |       1 | foo
+ 1000002 | 1000002 | 1000002 | foofoo
+(2 rows)
+
+-- test multi-table
+explain (costs false) select count(*) from s, l where s.n not in (select n from l);
+                          QUERY PLAN                          
+--------------------------------------------------------------
+ Aggregate
+   InitPlan 1 (returns $0)
+     ->  Seq Scan on l l_2
+   ->  Nested Loop
+         ->  Nested Loop Anti Join
+               ->  Seq Scan on s
+                     Filter: ((n IS NOT NULL) OR (NOT $0))
+               ->  Bitmap Heap Scan on l l_1
+                     Recheck Cond: ((s.n = n) OR (n IS NULL))
+                     ->  BitmapOr
+                           ->  Bitmap Index Scan on l_n
+                                 Index Cond: (n = s.n)
+                           ->  Bitmap Index Scan on l_n
+                                 Index Cond: (n IS NULL)
+         ->  Seq Scan on l
+(15 rows)
+
+select count(*) from s, l where s.n not in (select n from l);
+ count 
+-------
+     0
+(1 row)
+
+explain (costs false) select count(*) from s, l where s.nn not in (select nn from l);
+                      QUERY PLAN                       
+-------------------------------------------------------
+ Aggregate
+   ->  Nested Loop
+         ->  Nested Loop Anti Join
+               ->  Seq Scan on s
+               ->  Index Only Scan using l_nn on l l_1
+                     Index Cond: (nn = s.nn)
+         ->  Seq Scan on l
+(7 rows)
+
+select count(*) from s, l where s.nn not in (select nn from l);
+ count 
+-------
+ 10000
+(1 row)
+
+-- test null padded results from outer join
+explain (costs false) select * from s where n not in (select s.nn from l left join s on l.nn = s.nn);
+                     QUERY PLAN                      
+-----------------------------------------------------
+ Nested Loop Anti Join
+   Join Filter: ((s.n = s_1.nn) OR (s_1.nn IS NULL))
+   InitPlan 1 (returns $0)
+     ->  Nested Loop Left Join
+           Join Filter: (l_1.nn = s_2.nn)
+           ->  Seq Scan on l l_1
+           ->  Materialize
+                 ->  Seq Scan on s s_2
+   ->  Seq Scan on s
+         Filter: ((n IS NOT NULL) OR (NOT $0))
+   ->  Hash Left Join
+         Hash Cond: (l.nn = s_1.nn)
+         ->  Seq Scan on l
+         ->  Hash
+               ->  Seq Scan on s s_1
+(15 rows)
+
+select * from s where n not in (select s.nn from l left join s on l.nn = s.nn);
+ u | n | nn | p 
+---+---+----+---
+(0 rows)
+
+explain (costs false) select * from s where n not in (select s.nn from s right join l on s.nn = l.nn);
+                     QUERY PLAN                      
+-----------------------------------------------------
+ Nested Loop Anti Join
+   Join Filter: ((s.n = s_1.nn) OR (s_1.nn IS NULL))
+   InitPlan 1 (returns $0)
+     ->  Nested Loop Left Join
+           Join Filter: (s_2.nn = l_1.nn)
+           ->  Seq Scan on l l_1
+           ->  Materialize
+                 ->  Seq Scan on s s_2
+   ->  Seq Scan on s
+         Filter: ((n IS NOT NULL) OR (NOT $0))
+   ->  Hash Left Join
+         Hash Cond: (l.nn = s_1.nn)
+         ->  Seq Scan on l
+         ->  Hash
+               ->  Seq Scan on s s_1
+(15 rows)
+
+select * from s where n not in (select s.nn from s right join l on s.nn = l.nn);
+ u | n | nn | p 
+---+---+----+---
+(0 rows)
+
+explain (costs false) select count(*) from s right join l on s.nn = l.nn where l.nn not in (select nn from s);
+                   QUERY PLAN                   
+------------------------------------------------
+ Aggregate
+   ->  Hash Left Join
+         Hash Cond: (l.nn = s.nn)
+         ->  Seq Scan on l
+               Filter: (NOT (hashed SubPlan 1))
+               SubPlan 1
+                 ->  Seq Scan on s s_1
+         ->  Hash
+               ->  Seq Scan on s
+(9 rows)
+
+select count(*) from s right join l on s.nn = l.nn where l.nn not in (select nn from s);
+ count 
+-------
+  9997
+(1 row)
+
+explain (costs false) select count(*) from s right join l on s.nn = l.nn where s.nn not in (select nn from s);
+                QUERY PLAN                
+------------------------------------------
+ Aggregate
+   ->  Hash Left Join
+         Hash Cond: (l.nn = s.nn)
+         Filter: (NOT (hashed SubPlan 1))
+         ->  Seq Scan on l
+         ->  Hash
+               ->  Seq Scan on s
+         SubPlan 1
+           ->  Seq Scan on s s_1
+(9 rows)
+
+select count(*) from s right join l on s.nn = l.nn where s.nn not in (select nn from s);
+ count 
+-------
+     0
+(1 row)
+
+explain (costs false) select count(*) from s right join l on s.nn=l.nn where l.nn not in (select l.nn from l left join s on l.nn = s.nn);
+                       QUERY PLAN                       
+--------------------------------------------------------
+ Aggregate
+   ->  Nested Loop Left Join
+         Join Filter: (s.nn = l.nn)
+         ->  Hash Anti Join
+               Hash Cond: (l.nn = l_1.nn)
+               ->  Seq Scan on l
+               ->  Hash
+                     ->  Hash Left Join
+                           Hash Cond: (l_1.nn = s_1.nn)
+                           ->  Seq Scan on l l_1
+                           ->  Hash
+                                 ->  Seq Scan on s s_1
+         ->  Seq Scan on s
+(13 rows)
+
+select count(*) from s right join l on s.nn=l.nn where l.nn not in (select l.nn from l left join s on l.nn = s.nn);
+ count 
+-------
+     0
+(1 row)
+
+explain (costs false) select count(*) from s right join l on s.nn=l.nn where s.nn not in (select s.nn from l left join s on l.nn = s.nn);
+                         QUERY PLAN                         
+------------------------------------------------------------
+ Aggregate
+   InitPlan 1 (returns $0)
+     ->  Nested Loop Left Join
+           Join Filter: (l_2.nn = s_2.nn)
+           ->  Seq Scan on l l_2
+           ->  Materialize
+                 ->  Seq Scan on s s_2
+   ->  Nested Loop Anti Join
+         Join Filter: ((s.nn = s_1.nn) OR (s_1.nn IS NULL))
+         ->  Hash Left Join
+               Hash Cond: (l.nn = s.nn)
+               Filter: ((s.nn IS NOT NULL) OR (NOT $0))
+               ->  Seq Scan on l
+               ->  Hash
+                     ->  Seq Scan on s
+         ->  Materialize
+               ->  Hash Left Join
+                     Hash Cond: (l_1.nn = s_1.nn)
+                     ->  Seq Scan on l l_1
+                     ->  Hash
+                           ->  Seq Scan on s s_1
+(21 rows)
+
+select count(*) from s right join l on s.nn=l.nn where s.nn not in (select s.nn from l left join s on l.nn = s.nn);
+ count 
+-------
+     0
+(1 row)
+
+explain (costs false) select count(*) from s left join s1 on s.u=s1.u join l on s.u=l.u where s.nn not in (select nn from l);
+                         QUERY PLAN                          
+-------------------------------------------------------------
+ Aggregate
+   ->  Nested Loop
+         ->  Nested Loop Left Join
+               Join Filter: (s.u = s1.u)
+               ->  Nested Loop Anti Join
+                     ->  Seq Scan on s
+                     ->  Index Only Scan using l_nn on l l_1
+                           Index Cond: (nn = s.nn)
+               ->  Seq Scan on s1
+         ->  Index Only Scan using l_u on l
+               Index Cond: (u = s.u)
+(11 rows)
+
+select count(*) from s left join s1 on s.u=s1.u join l on s.u=l.u where s.nn not in (select nn from l);
+ count 
+-------
+     0
+(1 row)
+
+explain (costs false) select count(*) from s left join s1 on s.u=s1.u right join l on s.u=l.u where s.nn not in (select nn from l);
+                       QUERY PLAN                       
+--------------------------------------------------------
+ Aggregate
+   InitPlan 1 (returns $0)
+     ->  Seq Scan on l l_2
+   ->  Hash Anti Join
+         Hash Cond: (s.nn = l_1.nn)
+         ->  Hash Left Join
+               Hash Cond: (l.u = s.u)
+               Filter: ((s.nn IS NOT NULL) OR (NOT $0))
+               ->  Seq Scan on l
+               ->  Hash
+                     ->  Hash Right Join
+                           Hash Cond: (s1.u = s.u)
+                           ->  Seq Scan on s1
+                           ->  Hash
+                                 ->  Seq Scan on s
+         ->  Hash
+               ->  Seq Scan on l l_1
+(17 rows)
+
+select count(*) from s left join s1 on s.u=s1.u right join l on s.u=l.u where s.nn not in (select nn from l);
+ count 
+-------
+     0
+(1 row)
+
+explain (costs false) select count(*) from s left join s1 on s.u=s1.u left join l on s.u=l.u where s.nn not in (select nn from l);
+                    QUERY PLAN                     
+---------------------------------------------------
+ Aggregate
+   ->  Nested Loop Left Join
+         Join Filter: (s.u = s1.u)
+         ->  Nested Loop Anti Join
+               ->  Seq Scan on s
+               ->  Index Only Scan using l_nn on l
+                     Index Cond: (nn = s.nn)
+         ->  Seq Scan on s1
+(8 rows)
+
+select count(*) from s left join s1 on s.u=s1.u left join l on s.u=l.u where s.nn not in (select nn from l);
+ count 
+-------
+     1
+(1 row)
+
+explain (costs false) select count(*) from s right join s1 on s.u=s1.u join l on s.u=l.u where s.nn not in (select nn from l);
+                         QUERY PLAN                          
+-------------------------------------------------------------
+ Aggregate
+   ->  Nested Loop
+         ->  Nested Loop
+               Join Filter: (s.u = s1.u)
+               ->  Nested Loop Anti Join
+                     ->  Seq Scan on s
+                     ->  Index Only Scan using l_nn on l l_1
+                           Index Cond: (nn = s.nn)
+               ->  Seq Scan on s1
+         ->  Index Only Scan using l_u on l
+               Index Cond: (u = s.u)
+(11 rows)
+
+select count(*) from s right join s1 on s.u=s1.u join l on s.u=l.u where s.nn not in (select nn from l);
+ count 
+-------
+     0
+(1 row)
+
+explain (costs false) select count(*) from s join s1 on s.u=s1.u right join l on s.u=l.u where s.nn not in (select nn from l);
+                       QUERY PLAN                       
+--------------------------------------------------------
+ Aggregate
+   InitPlan 1 (returns $0)
+     ->  Seq Scan on l l_2
+   ->  Hash Anti Join
+         Hash Cond: (s.nn = l_1.nn)
+         ->  Hash Left Join
+               Hash Cond: (l.u = s.u)
+               Filter: ((s.nn IS NOT NULL) OR (NOT $0))
+               ->  Seq Scan on l
+               ->  Hash
+                     ->  Hash Join
+                           Hash Cond: (s1.u = s.u)
+                           ->  Seq Scan on s1
+                           ->  Hash
+                                 ->  Seq Scan on s
+         ->  Hash
+               ->  Seq Scan on l l_1
+(17 rows)
+
+select * from s join s1 on s.u=s1.u right join l on s.u=l.u where s.nn not in (select nn from l);
+ u | n | nn | p | u | n | n1 | nn | p | u | n | nn | p 
+---+---+----+---+---+---+----+----+---+---+---+----+---
+(0 rows)
+
+explain (costs false) select count(*) from s full join s1 on s.u=s1.u join l on s.u=l.u where s.nn not in (select nn from l);
+                         QUERY PLAN                          
+-------------------------------------------------------------
+ Aggregate
+   ->  Nested Loop
+         ->  Nested Loop Left Join
+               Join Filter: (s.u = s1.u)
+               ->  Nested Loop Anti Join
+                     ->  Seq Scan on s
+                     ->  Index Only Scan using l_nn on l l_1
+                           Index Cond: (nn = s.nn)
+               ->  Seq Scan on s1
+         ->  Index Only Scan using l_u on l
+               Index Cond: (u = s.u)
+(11 rows)
+
+select count(*) from s full join s1 on s.u=s1.u join l on s.u=l.u where s.nn not in (select nn from l);
+ count 
+-------
+     0
+(1 row)
+
+explain (costs false) select count(*) from s join s1 on s.u=s1.u full join l on s.u=l.u where s.nn not in (select nn from l);
+                       QUERY PLAN                       
+--------------------------------------------------------
+ Aggregate
+   InitPlan 1 (returns $0)
+     ->  Seq Scan on l l_2
+   ->  Hash Anti Join
+         Hash Cond: (s.nn = l_1.nn)
+         ->  Hash Full Join
+               Hash Cond: (l.u = s.u)
+               Filter: ((s.nn IS NOT NULL) OR (NOT $0))
+               ->  Seq Scan on l
+               ->  Hash
+                     ->  Hash Join
+                           Hash Cond: (s1.u = s.u)
+                           ->  Seq Scan on s1
+                           ->  Hash
+                                 ->  Seq Scan on s
+         ->  Hash
+               ->  Seq Scan on l l_1
+(17 rows)
+
+select count(*) from s join s1 on s.u=s1.u full join l on s.u=l.u where s.nn not in (select nn from l);
+ count 
+-------
+     0
+(1 row)
+
+explain (costs false) select * from s where s.nn not in (select l.nn from l left join s on l.nn=s.nn left join s1 on l.nn=s1.nn);
+                    QUERY PLAN                     
+---------------------------------------------------
+ Nested Loop Anti Join
+   ->  Seq Scan on s
+   ->  Nested Loop Left Join
+         Join Filter: (l.nn = s1.nn)
+         ->  Nested Loop Left Join
+               Join Filter: (l.nn = s_1.nn)
+               ->  Index Only Scan using l_nn on l
+                     Index Cond: (nn = s.nn)
+               ->  Seq Scan on s s_1
+         ->  Seq Scan on s1
+(10 rows)
+
+select * from s where s.nn not in (select l.nn from l left join s on l.nn=s.nn left join s1 on l.nn=s1.nn);
+    u    |    n    |   nn    |   p    
+---------+---------+---------+--------
+ 1000002 | 1000002 | 1000002 | foofoo
+(1 row)
+
+explain (costs false) select * from s where s.nn not in (select l.nn from l left join s on l.nn=s.nn right join s1 on l.nn=s1.nn);
+                     QUERY PLAN                      
+-----------------------------------------------------
+ Seq Scan on s
+   Filter: (NOT (hashed SubPlan 1))
+   SubPlan 1
+     ->  Nested Loop Left Join
+           Join Filter: (l.nn = s_1.nn)
+           ->  Nested Loop Left Join
+                 ->  Seq Scan on s1
+                 ->  Index Only Scan using l_nn on l
+                       Index Cond: (nn = s1.nn)
+           ->  Materialize
+                 ->  Seq Scan on s s_1
+(11 rows)
+
+select * from s where s.nn not in (select l.nn from l left join s on l.nn=s.nn right join s1 on l.nn=s1.nn);
+ u | n | nn | p 
+---+---+----+---
+(0 rows)
+
+explain (costs false) select * from s where (n,u,nn) not in (select l.n,l.u,l.nn from l left join s on l.nn = s.nn);
+                                    QUERY PLAN                                     
+-----------------------------------------------------------------------------------
+ Nested Loop Anti Join
+   ->  Seq Scan on s
+   ->  Nested Loop Left Join
+         Join Filter: (l.nn = s_1.nn)
+         ->  Index Scan using l_nn on l
+               Index Cond: (nn = s.nn)
+               Filter: (((s.n = n) OR (n IS NULL) OR (s.n IS NULL)) AND (s.u = u))
+         ->  Seq Scan on s s_1
+(8 rows)
+
+select * from s where (n,u,nn) not in (select l.n,l.u,l.nn from l left join s on l.nn = s.nn);
+    u    |    n    |   nn    |   p    
+---------+---------+---------+--------
+ 1000002 | 1000002 | 1000002 | foofoo
+(1 row)
+
+explain (costs false) select * from s where (n,u,nn) not in (select l.n,l.u,l.nn from l right join s on l.nn = s.nn);
+                QUERY PLAN                 
+-------------------------------------------
+ Seq Scan on s
+   Filter: (NOT (hashed SubPlan 1))
+   SubPlan 1
+     ->  Nested Loop Left Join
+           ->  Seq Scan on s s_1
+           ->  Index Scan using l_nn on l
+                 Index Cond: (nn = s_1.nn)
+(7 rows)
+
+select * from s where (n,u,nn) not in (select l.n,l.u,l.nn from l left join s on l.nn = s.nn);
+    u    |    n    |   nn    |   p    
+---------+---------+---------+--------
+ 1000002 | 1000002 | 1000002 | foofoo
+(1 row)
+
+--test reduce outer joins from outer query
+explain (costs false) select count(*) from s right join l on s.nn = l.nn where s.nn not in (select nn from l);
+                       QUERY PLAN                       
+--------------------------------------------------------
+ Aggregate
+   InitPlan 1 (returns $0)
+     ->  Seq Scan on l l_2
+   ->  Hash Anti Join
+         Hash Cond: (s.nn = l_1.nn)
+         ->  Hash Left Join
+               Hash Cond: (l.nn = s.nn)
+               Filter: ((s.nn IS NOT NULL) OR (NOT $0))
+               ->  Seq Scan on l
+               ->  Hash
+                     ->  Seq Scan on s
+         ->  Hash
+               ->  Seq Scan on l l_1
+(13 rows)
+
+select count(*) from s right join l on s.nn = l.nn where s.nn not in (select nn from l);
+ count 
+-------
+     0
+(1 row)
+
+explain (costs false) select count(*) from s right join l on s.nn = l.nn where s.nn not in (select nn from l) and s.u>0;
+                      QUERY PLAN                       
+-------------------------------------------------------
+ Aggregate
+   ->  Nested Loop
+         ->  Nested Loop Anti Join
+               ->  Seq Scan on s
+                     Filter: (u > 0)
+               ->  Index Only Scan using l_nn on l l_1
+                     Index Cond: (nn = s.nn)
+         ->  Index Only Scan using l_nn on l
+               Index Cond: (nn = s.nn)
+(9 rows)
+
+select count(*) from s right join l on s.nn = l.nn where s.nn not in (select nn from l) and s.u>0;
+ count 
+-------
+     0
+(1 row)
+
+explain (costs false) select count(*) from s right join l on s.nn = l.nn join s1 on s.u = s1.u where s.nn not in (select nn from l);
+                         QUERY PLAN                          
+-------------------------------------------------------------
+ Aggregate
+   ->  Nested Loop
+         Join Filter: (s.u = s1.u)
+         ->  Nested Loop
+               ->  Nested Loop Anti Join
+                     ->  Seq Scan on s
+                     ->  Index Only Scan using l_nn on l l_1
+                           Index Cond: (nn = s.nn)
+               ->  Index Only Scan using l_nn on l
+                     Index Cond: (nn = s.nn)
+         ->  Seq Scan on s1
+(11 rows)
+
+select count(*) from s right join l on s.nn = l.nn join s1 on s.u = s1.u where s.nn not in (select nn from l);
+ count 
+-------
+     0
+(1 row)
+
+explain (costs false) select count(*) from s right join l on s.nn = l.nn right join s1 on s.u = s1.u where s.nn not in (select nn from l);
+                          QUERY PLAN                           
+---------------------------------------------------------------
+ Aggregate
+   InitPlan 1 (returns $0)
+     ->  Seq Scan on l l_2
+   ->  Nested Loop Anti Join
+         ->  Nested Loop Left Join
+               Join Filter: (s.u = s1.u)
+               Filter: ((s.nn IS NOT NULL) OR (NOT $0))
+               ->  Seq Scan on s1
+               ->  Materialize
+                     ->  Nested Loop
+                           ->  Seq Scan on s
+                           ->  Index Only Scan using l_nn on l
+                                 Index Cond: (nn = s.nn)
+         ->  Index Only Scan using l_nn on l l_1
+               Index Cond: (nn = s.nn)
+(15 rows)
+
+select count(*) from s right join l on s.nn = l.nn right join s1 on s.u = s1.u where s.nn not in (select nn from l);
+ count 
+-------
+     0
+(1 row)
+
+explain (costs false) select count(*) from s right join l on s.nn = l.nn left join s1 on s.u = s1.u where s.nn not in (select nn from l);
+                          QUERY PLAN                          
+--------------------------------------------------------------
+ Aggregate
+   InitPlan 1 (returns $0)
+     ->  Seq Scan on l l_2
+   ->  Nested Loop Left Join
+         Join Filter: (s.u = s1.u)
+         ->  Hash Anti Join
+               Hash Cond: (s.nn = l_1.nn)
+               ->  Hash Left Join
+                     Hash Cond: (l.nn = s.nn)
+                     Filter: ((s.nn IS NOT NULL) OR (NOT $0))
+                     ->  Seq Scan on l
+                     ->  Hash
+                           ->  Seq Scan on s
+               ->  Hash
+                     ->  Seq Scan on l l_1
+         ->  Seq Scan on s1
+(16 rows)
+
+select count(*) from s right join l on s.nn = l.nn left join s1 on s.u = s1.u where s.nn not in (select nn from l);
+ count 
+-------
+     0
+(1 row)
+
+--test reduce outer joins from subquery
+explain (costs false) select * from s where nn not in (select l.nn from l right join s on l.nn = s.nn);
+                  QUERY PLAN                   
+-----------------------------------------------
+ Seq Scan on s
+   Filter: (NOT (hashed SubPlan 1))
+   SubPlan 1
+     ->  Nested Loop Left Join
+           ->  Seq Scan on s s_1
+           ->  Index Only Scan using l_nn on l
+                 Index Cond: (nn = s_1.nn)
+(7 rows)
+
+select * from s where nn not in (select l.nn from l right join s on l.nn = s.nn);
+ u | n | nn | p 
+---+---+----+---
+(0 rows)
+
+explain (costs false) select * from s where nn not in (select l.nn from l right join s on l.nn = s.nn where l.u > 9);
+                QUERY PLAN                 
+-------------------------------------------
+ Seq Scan on s
+   Filter: (NOT (hashed SubPlan 1))
+   SubPlan 1
+     ->  Nested Loop
+           ->  Seq Scan on s s_1
+           ->  Index Scan using l_nn on l
+                 Index Cond: (nn = s_1.nn)
+                 Filter: (u > 9)
+(8 rows)
+
+select * from s where nn not in (select l.nn from l right join s on l.nn = s.nn where l.u > 9);
+    u    |    n    |   nn    |   p    
+---------+---------+---------+--------
+       1 |       1 |       1 | foo
+       2 |       2 |       2 | foo
+ 1000002 | 1000002 | 1000002 | foofoo
+       3 |         |       3 | foo
+(4 rows)
+
+explain (costs false) select * from s where nn not in (select l.nn from l right join s on l.nn = s.nn where s.u > 9);
+                  QUERY PLAN                   
+-----------------------------------------------
+ Seq Scan on s
+   Filter: (NOT (hashed SubPlan 1))
+   SubPlan 1
+     ->  Nested Loop Left Join
+           ->  Seq Scan on s s_1
+                 Filter: (u > 9)
+           ->  Index Only Scan using l_nn on l
+                 Index Cond: (nn = s_1.nn)
+(8 rows)
+
+select * from s where nn not in (select l.nn from l right join s on l.nn = s.nn where s.u > 9);
+ u | n | nn | p 
+---+---+----+---
+(0 rows)
+
+explain (costs false) select * from s where nn not in (select l.nn from l right join s on l.nn = s.nn join s1 on l.n = s1.n);
+                      QUERY PLAN                       
+-------------------------------------------------------
+ Seq Scan on s
+   Filter: (NOT (hashed SubPlan 1))
+   SubPlan 1
+     ->  Nested Loop
+           Join Filter: (l.n = s1.n)
+           ->  Seq Scan on s1
+           ->  Materialize
+                 ->  Nested Loop
+                       ->  Seq Scan on s s_1
+                       ->  Index Scan using l_nn on l
+                             Index Cond: (nn = s_1.nn)
+(11 rows)
+
+select * from s where nn not in (select l.nn from l right join s on l.nn = s.nn join s1 on l.n = s1.n);
+    u    |    n    |   nn    |   p    
+---------+---------+---------+--------
+ 1000002 | 1000002 | 1000002 | foofoo
+       3 |         |       3 | foo
+(2 rows)
+
+explain (costs false) select * from s where nn not in (select l.nn from l right join s on l.nn = s.nn right join s1 on l.n = s1.n);
+                      QUERY PLAN                       
+-------------------------------------------------------
+ Seq Scan on s
+   Filter: (NOT (hashed SubPlan 1))
+   SubPlan 1
+     ->  Nested Loop Left Join
+           Join Filter: (l.n = s1.n)
+           ->  Seq Scan on s1
+           ->  Materialize
+                 ->  Nested Loop
+                       ->  Seq Scan on s s_1
+                       ->  Index Scan using l_nn on l
+                             Index Cond: (nn = s_1.nn)
+(11 rows)
+
+select * from s where nn not in (select l.nn from l right join s on l.nn = s.nn right join s1 on l.n = s1.n);
+ u | n | nn | p 
+---+---+----+---
+(0 rows)
+
+explain (costs false) select * from s where nn not in (select l.nn from l right join s on l.nn = s.nn left join s1 on l.n = s1.n);
+                   QUERY PLAN                    
+-------------------------------------------------
+ Seq Scan on s
+   Filter: (NOT (hashed SubPlan 1))
+   SubPlan 1
+     ->  Nested Loop Left Join
+           Join Filter: (l.n = s1.n)
+           ->  Nested Loop Left Join
+                 ->  Seq Scan on s s_1
+                 ->  Index Scan using l_nn on l
+                       Index Cond: (nn = s_1.nn)
+           ->  Materialize
+                 ->  Seq Scan on s1
+(11 rows)
+
+select * from s where nn not in (select l.nn from l right join s on l.nn = s.nn left join s1 on l.n = s1.n);
+ u | n | nn | p 
+---+---+----+---
+(0 rows)
+
+--test reduce outer join on outer and sub-query
+explain (costs false) select count(*) from s right join l on s.nn = l.nn join s1 on s.u = s1.u where s.nn not in (select l.nn from l right join s on l.nn = s.nn join s1 on l.n = s1.n);
+                                    QUERY PLAN                                    
+----------------------------------------------------------------------------------
+ Aggregate
+   ->  Nested Loop
+         Join Filter: (s.u = s1.u)
+         ->  Seq Scan on s1
+         ->  Materialize
+               ->  Nested Loop
+                     ->  Seq Scan on s
+                           Filter: (NOT (hashed SubPlan 1))
+                           SubPlan 1
+                             ->  Nested Loop
+                                   Join Filter: (l_1.n = s1_1.n)
+                                   ->  Seq Scan on s1 s1_1
+                                   ->  Materialize
+                                         ->  Nested Loop
+                                               ->  Seq Scan on s s_1
+                                               ->  Index Scan using l_nn on l l_1
+                                                     Index Cond: (nn = s_1.nn)
+                     ->  Index Only Scan using l_nn on l
+                           Index Cond: (nn = s.nn)
+(19 rows)
+
+select count(*) from s right join l on s.nn = l.nn join s1 on s.u = s1.u where s.nn not in (select l.nn from l right join s on l.nn = s.nn join s1 on l.n = s1.n);
+ count 
+-------
+     1
+(1 row)
+
+explain (costs false) select count(*) from s right join l on s.nn = l.nn left join s1 on s.u = s1.u where s.nn not in (select l.nn from l right join s on l.nn = s.nn left join s1 on l.n = s1.n);
+                        QUERY PLAN                        
+----------------------------------------------------------
+ Aggregate
+   ->  Hash Left Join
+         Hash Cond: (l.nn = s.nn)
+         Filter: (NOT (hashed SubPlan 1))
+         ->  Seq Scan on l
+         ->  Hash
+               ->  Hash Right Join
+                     Hash Cond: (s1.u = s.u)
+                     ->  Seq Scan on s1
+                     ->  Hash
+                           ->  Seq Scan on s
+         SubPlan 1
+           ->  Nested Loop Left Join
+                 Join Filter: (l_1.n = s1_1.n)
+                 ->  Nested Loop Left Join
+                       ->  Seq Scan on s s_1
+                       ->  Index Scan using l_nn on l l_1
+                             Index Cond: (nn = s_1.nn)
+                 ->  Materialize
+                       ->  Seq Scan on s1 s1_1
+(20 rows)
+
+select count(*) from s right join l on s.nn = l.nn left join s1 on s.u = s1.u where s.nn not in (select l.nn from l right join s on l.nn = s.nn left join s1 on l.n = s1.n);
+ count 
+-------
+     0
+(1 row)
+
+-- test union all
+explain (costs false) select * from s as t where not exists
+(select 1 from (select n as y from l union all
+				select u as y from s union all
+				select nn as y from s) as v where t.n=v.y or v.y is null) and n is not null;
+                       QUERY PLAN                       
+--------------------------------------------------------
+ Nested Loop Anti Join
+   ->  Seq Scan on s t
+         Filter: (n IS NOT NULL)
+   ->  Append
+         ->  Bitmap Heap Scan on l
+               Recheck Cond: ((t.n = n) OR (n IS NULL))
+               ->  BitmapOr
+                     ->  Bitmap Index Scan on l_n
+                           Index Cond: (n = t.n)
+                     ->  Bitmap Index Scan on l_n
+                           Index Cond: (n IS NULL)
+         ->  Seq Scan on s
+               Filter: ((t.n = u) OR (u IS NULL))
+         ->  Seq Scan on s s_1
+               Filter: ((t.n = nn) OR (nn IS NULL))
+(15 rows)
+
+select * from s as t where not exists
+(select 1 from (select n as y from l union all
+				select u as y from s union all
+				select nn as y from s) as v where t.n=v.y or v.y is null) and n is not null;
+ u | n | nn | p 
+---+---+----+---
+(0 rows)
+
+explain (costs false) select * from s where n not in
+(select n as y from l union all
+ select u as y from s union all
+ select nn as y from s);
+                       QUERY PLAN                       
+--------------------------------------------------------
+ Nested Loop Anti Join
+   InitPlan 1 (returns $0)
+     ->  Result
+           ->  Append
+                 ->  Seq Scan on l l_1
+                 ->  Seq Scan on s s_3
+                 ->  Seq Scan on s s_4
+   ->  Seq Scan on s
+         Filter: ((n IS NOT NULL) OR (NOT $0))
+   ->  Append
+         ->  Bitmap Heap Scan on l
+               Recheck Cond: ((s.n = n) OR (n IS NULL))
+               ->  BitmapOr
+                     ->  Bitmap Index Scan on l_n
+                           Index Cond: (n = s.n)
+                     ->  Bitmap Index Scan on l_n
+                           Index Cond: (n IS NULL)
+         ->  Seq Scan on s s_1
+               Filter: ((s.n = u) OR (u IS NULL))
+         ->  Seq Scan on s s_2
+               Filter: ((s.n = nn) OR (nn IS NULL))
+(21 rows)
+
+select * from s where n not in
+(select n as y from l union all
+ select u as y from s union all
+ select nn as y from s);
+ u | n | nn | p 
+---+---+----+---
+(0 rows)
+
+explain (costs false) select count(*) from
+(select n as x from s union all select u as x from l) t where t.x not in
+(select nn from l);
+                        QUERY PLAN                         
+-----------------------------------------------------------
+ Aggregate
+   InitPlan 1 (returns $0)
+     ->  Seq Scan on l l_2
+   ->  Hash Anti Join
+         Hash Cond: (s.n = l_1.nn)
+         ->  Append
+               ->  Seq Scan on s
+                     Filter: ((n IS NOT NULL) OR (NOT $0))
+               ->  Seq Scan on l
+                     Filter: ((u IS NOT NULL) OR (NOT $0))
+         ->  Hash
+               ->  Seq Scan on l l_1
+(12 rows)
+
+select count(*) from
+(select n as x from s union all select u as x from l) t where t.x not in
+(select nn from l);
+ count 
+-------
+     1
+(1 row)
+
+explain (costs false) select count(*) from
+(select n as x from s union all select n as x from l) t where t.x not in
+(select nn from empty);
+                   QUERY PLAN                   
+------------------------------------------------
+ Aggregate
+   ->  Append
+         ->  Seq Scan on s
+               Filter: (NOT (hashed SubPlan 1))
+               SubPlan 1
+                 ->  Seq Scan on empty
+         ->  Seq Scan on l
+               Filter: (NOT (hashed SubPlan 1))
+(8 rows)
+
+select count(*) from
+(select n as x from s union all select n as x from l) t where t.x not in
+(select nn from empty);
+ count 
+-------
+ 10004
+(1 row)
+
+explain (costs false) select count(*) from
+(select n as x from s union all select u as x from l) t where t.x not in
+(select n as y from l union all
+ select u as y from s union all
+ select nn as y from s);
+                                QUERY PLAN                                
+--------------------------------------------------------------------------
+ Finalize Aggregate
+   InitPlan 1 (returns $0)
+     ->  Result
+           ->  Append
+                 ->  Seq Scan on l l_2
+                 ->  Seq Scan on s s_3
+                 ->  Seq Scan on s s_4
+   ->  Gather
+         Workers Planned: 2
+         Params Evaluated: $0
+         ->  Partial Aggregate
+               ->  Nested Loop Anti Join
+                     ->  Parallel Append
+                           ->  Parallel Seq Scan on l
+                                 Filter: ((u IS NOT NULL) OR (NOT $0))
+                           ->  Parallel Seq Scan on s
+                                 Filter: ((n IS NOT NULL) OR (NOT $0))
+                     ->  Append
+                           ->  Bitmap Heap Scan on l l_1
+                                 Recheck Cond: ((l.u = n) OR (n IS NULL))
+                                 ->  BitmapOr
+                                       ->  Bitmap Index Scan on l_n
+                                             Index Cond: (n = l.u)
+                                       ->  Bitmap Index Scan on l_n
+                                             Index Cond: (n IS NULL)
+                           ->  Seq Scan on s s_1
+                                 Filter: ((l.u = u) OR (u IS NULL))
+                           ->  Seq Scan on s s_2
+                                 Filter: ((l.u = nn) OR (nn IS NULL))
+(29 rows)
+
+select count(*) from
+(select n as x from s union all select u as x from l) t where t.x not in
+(select n as y from l union all
+ select u as y from s union all
+ select nn as y from s);
+ count 
+-------
+     0
+(1 row)
+
+-- test multi-levels of NOT IN
+explain (costs false) select * from s where n not in (select n from s where n not in (select n from l));
+                         QUERY PLAN                         
+------------------------------------------------------------
+ Seq Scan on s
+   Filter: (NOT (hashed SubPlan 3))
+   SubPlan 3
+     ->  Nested Loop Anti Join
+           InitPlan 2 (returns $2)
+             ->  Seq Scan on l
+           ->  Seq Scan on s s_1
+                 Filter: ((n IS NOT NULL) OR (NOT $2))
+           ->  Bitmap Heap Scan on l l_1
+                 Recheck Cond: ((s_1.n = n) OR (n IS NULL))
+                 ->  BitmapOr
+                       ->  Bitmap Index Scan on l_n
+                             Index Cond: (n = s_1.n)
+                       ->  Bitmap Index Scan on l_n
+                             Index Cond: (n IS NULL)
+(15 rows)
+
+select * from s where n not in (select n from s where n not in (select n from l));
+    u    |    n    |   nn    |   p    
+---------+---------+---------+--------
+       1 |       1 |       1 | foo
+       2 |       2 |       2 | foo
+ 1000002 | 1000002 | 1000002 | foofoo
+       3 |         |       3 | foo
+(4 rows)
+
+explain (costs false) select * from s where n not in (select n from s where n not in (select u from l));
+                      QUERY PLAN                       
+-------------------------------------------------------
+ Seq Scan on s
+   Filter: (NOT (hashed SubPlan 3))
+   SubPlan 3
+     ->  Nested Loop Anti Join
+           InitPlan 2 (returns $2)
+             ->  Seq Scan on l
+           ->  Seq Scan on s s_1
+                 Filter: ((n IS NOT NULL) OR (NOT $2))
+           ->  Index Only Scan using l_u on l l_1
+                 Index Cond: (u = s_1.n)
+(10 rows)
+
+select * from s where n not in (select n from s where n not in (select u from l));
+ u | n | nn |  p  
+---+---+----+-----
+ 1 | 1 |  1 | foo
+ 2 | 2 |  2 | foo
+(2 rows)
+
+explain (costs false) select count(*) from s where u not in
+(select n from s1 where not exists
+ (select 1 from (select n from s1 where u not in (select n from l)) t where t.n = s.n));
+                               QUERY PLAN                                
+-------------------------------------------------------------------------
+ Aggregate
+   ->  Seq Scan on s
+         Filter: (NOT (SubPlan 2))
+         SubPlan 2
+           ->  Result
+                 One-Time Filter: (NOT $2)
+                 InitPlan 1 (returns $2)
+                   ->  Nested Loop Anti Join
+                         ->  Seq Scan on s1
+                               Filter: (n = s.n)
+                         ->  Bitmap Heap Scan on l
+                               Recheck Cond: ((s1.u = n) OR (n IS NULL))
+                               ->  BitmapOr
+                                     ->  Bitmap Index Scan on l_n
+                                           Index Cond: (n = s1.u)
+                                     ->  Bitmap Index Scan on l_n
+                                           Index Cond: (n IS NULL)
+                 ->  Seq Scan on s1 s1_1
+(18 rows)
+
+select count(*) from s where u not in
+(select n from s1 where not exists
+ (select 1 from (select n from s1 where u not in (select n from l)) t where t.n = s.n));
+ count 
+-------
+     0
+(1 row)
+
+explain (costs false) select * from s where n not in (select n from s1) and u not in (select u from s1) and nn not in (select nn from s1);
+                                           QUERY PLAN                                           
+------------------------------------------------------------------------------------------------
+ Seq Scan on s
+   Filter: ((NOT (hashed SubPlan 1)) AND (NOT (hashed SubPlan 2)) AND (NOT (hashed SubPlan 3)))
+   SubPlan 1
+     ->  Seq Scan on s1
+   SubPlan 2
+     ->  Seq Scan on s1 s1_1
+   SubPlan 3
+     ->  Seq Scan on s1 s1_2
+(8 rows)
+
+select * from s where n not in (select n from s1) and u not in (select u from s1) and nn not in (select nn from s1);
+ u | n | nn | p 
+---+---+----+---
+(0 rows)
+
+explain (costs false) select * from s where n not in (select n from s1) and u not in (select u from s1) and nn not in (select nn from l);
+                               QUERY PLAN                                
+-------------------------------------------------------------------------
+ Nested Loop Anti Join
+   ->  Seq Scan on s
+         Filter: ((NOT (hashed SubPlan 1)) AND (NOT (hashed SubPlan 2)))
+         SubPlan 1
+           ->  Seq Scan on s1
+         SubPlan 2
+           ->  Seq Scan on s1 s1_1
+   ->  Index Only Scan using l_nn on l
+         Index Cond: (nn = s.nn)
+(9 rows)
+
+select * from s where n not in (select n from s1) and u not in (select u from s1) and nn not in (select nn from l);
+ u | n | nn | p 
+---+---+----+---
+(0 rows)
+
+explain (costs false) select count(*) from s where u not in
+(select n from s1 where not exists
+ (select 1 from (select n from s1 where u not in (select n from l)) t where t.n = s.n))
+and nn not in
+(select n from s1 where not exists
+ (select 1 from (select n from s1 where u not in (select n from l)) t where t.n = s.n));
+                                QUERY PLAN                                 
+---------------------------------------------------------------------------
+ Aggregate
+   ->  Seq Scan on s
+         Filter: ((NOT (SubPlan 2)) AND (NOT (SubPlan 4)))
+         SubPlan 2
+           ->  Result
+                 One-Time Filter: (NOT $2)
+                 InitPlan 1 (returns $2)
+                   ->  Nested Loop Anti Join
+                         ->  Seq Scan on s1
+                               Filter: (n = s.n)
+                         ->  Bitmap Heap Scan on l
+                               Recheck Cond: ((s1.u = n) OR (n IS NULL))
+                               ->  BitmapOr
+                                     ->  Bitmap Index Scan on l_n
+                                           Index Cond: (n = s1.u)
+                                     ->  Bitmap Index Scan on l_n
+                                           Index Cond: (n IS NULL)
+                 ->  Seq Scan on s1 s1_1
+         SubPlan 4
+           ->  Result
+                 One-Time Filter: (NOT $6)
+                 InitPlan 3 (returns $6)
+                   ->  Nested Loop Anti Join
+                         ->  Seq Scan on s1 s1_2
+                               Filter: (n = s.n)
+                         ->  Bitmap Heap Scan on l l_1
+                               Recheck Cond: ((s1_2.u = n) OR (n IS NULL))
+                               ->  BitmapOr
+                                     ->  Bitmap Index Scan on l_n
+                                           Index Cond: (n = s1_2.u)
+                                     ->  Bitmap Index Scan on l_n
+                                           Index Cond: (n IS NULL)
+                 ->  Seq Scan on s1 s1_3
+(33 rows)
+
+select count(*) from s where u not in
+(select n from s1 where not exists
+ (select 1 from (select n from s1 where u not in (select n from l)) t where t.n = s.n))
+and nn not in
+(select n from s1 where not exists
+ (select 1 from (select n from s1 where u not in (select n from l)) t where t.n = s.n));
+ count 
+-------
+     0
+(1 row)
+
+--test COALESCE
+explain (costs false) select * from s where COALESCE(n, -1) not in (select COALESCE(n, -1) from l);
+                                 QUERY PLAN                                 
+----------------------------------------------------------------------------
+ Hash Anti Join
+   Hash Cond: (COALESCE(s.n, '-1'::integer) = COALESCE(l.n, '-1'::integer))
+   ->  Seq Scan on s
+   ->  Hash
+         ->  Seq Scan on l
+(5 rows)
+
+select * from s where COALESCE(n, -1) not in (select COALESCE(n, -1) from l);
+    u    |    n    |   nn    |   p    
+---------+---------+---------+--------
+ 1000002 | 1000002 | 1000002 | foofoo
+(1 row)
+
+explain (costs false) select * from s where COALESCE(n, NULL, -1) not in (select COALESCE(n, NULL, -1) from l);
+                                 QUERY PLAN                                 
+----------------------------------------------------------------------------
+ Hash Anti Join
+   Hash Cond: (COALESCE(s.n, '-1'::integer) = COALESCE(l.n, '-1'::integer))
+   ->  Seq Scan on s
+   ->  Hash
+         ->  Seq Scan on l
+(5 rows)
+
+select * from s where COALESCE(n, NULL, -1) not in (select COALESCE(n, NULL, -1) from l);
+    u    |    n    |   nn    |   p    
+---------+---------+---------+--------
+ 1000002 | 1000002 | 1000002 | foofoo
+(1 row)
+
+explain (costs false) select * from s where COALESCE(n, NULL, NULL) not in (select COALESCE(n, NULL, NULL) from l);
+                                 QUERY PLAN                                  
+-----------------------------------------------------------------------------
+ Nested Loop Anti Join
+   Join Filter: ((COALESCE(s.n) = COALESCE(l.n)) OR (COALESCE(l.n) IS NULL))
+   InitPlan 1 (returns $0)
+     ->  Seq Scan on l l_1
+   ->  Seq Scan on s
+         Filter: ((COALESCE(n) IS NOT NULL) OR (NOT $0))
+   ->  Materialize
+         ->  Seq Scan on l
+(8 rows)
+
+select * from s where COALESCE(n, NULL, NULL) not in (select COALESCE(n, NULL, NULL) from l);
+ u | n | nn | p 
+---+---+----+---
+(0 rows)
+
+explain (costs false) select * from s where COALESCE(n, nn) not in (select COALESCE(n, nn) from l);
+                        QUERY PLAN                        
+----------------------------------------------------------
+ Hash Anti Join
+   Hash Cond: (COALESCE(s.n, s.nn) = COALESCE(l.n, l.nn))
+   ->  Seq Scan on s
+   ->  Hash
+         ->  Seq Scan on l
+(5 rows)
+
+select * from s where COALESCE(n, nn) not in (select COALESCE(n, nn) from l);
+    u    |    n    |   nn    |   p    
+---------+---------+---------+--------
+ 1000002 | 1000002 | 1000002 | foofoo
+(1 row)
+
+explain (costs false) select * from s where COALESCE(nn, NULL) not in (select COALESCE(nn, NULL) from l);
+                   QUERY PLAN                   
+------------------------------------------------
+ Hash Anti Join
+   Hash Cond: (COALESCE(s.nn) = COALESCE(l.nn))
+   ->  Seq Scan on s
+   ->  Hash
+         ->  Seq Scan on l
+(5 rows)
+
+select * from s where COALESCE(nn, NULL) not in (select COALESCE(nn, NULL) from l);
+    u    |    n    |   nn    |   p    
+---------+---------+---------+--------
+ 1000002 | 1000002 | 1000002 | foofoo
+(1 row)
+
+explain (costs false) select * from s where (COALESCE(n, -1), nn, COALESCE(n, u)) not in (select COALESCE(n, -1), nn, COALESCE(n, u) from l);
+                                                       QUERY PLAN                                                        
+-------------------------------------------------------------------------------------------------------------------------
+ Nested Loop Anti Join
+   ->  Seq Scan on s
+   ->  Index Scan using l_nn on l
+         Index Cond: (nn = s.nn)
+         Filter: ((COALESCE(s.n, '-1'::integer) = COALESCE(n, '-1'::integer)) AND (COALESCE(s.n, s.u) = COALESCE(n, u)))
+(5 rows)
+
+select * from s where (COALESCE(n, -1), nn, COALESCE(n, u)) not in (select COALESCE(n, -1), nn, COALESCE(n, u) from l);
+    u    |    n    |   nn    |   p    
+---------+---------+---------+--------
+ 1000002 | 1000002 | 1000002 | foofoo
+       3 |         |       3 | foo
+(2 rows)
+
+-- test miscellaneous outer nullable cases
+explain (costs false) select * from s where (n,n) not in (select n,n from l);
+                         QUERY PLAN                          
+-------------------------------------------------------------
+ Nested Loop Anti Join
+   Join Filter: (((s.n = l.n) AND (s.n = l.n)) IS NOT FALSE)
+   ->  Seq Scan on s
+   ->  Materialize
+         ->  Seq Scan on l
+(5 rows)
+
+select * from s where (n,n) not in (select n,n from l);
+ u | n | nn | p 
+---+---+----+---
+(0 rows)
+
+explain (costs false) select * from s right join l on s.nn = l.nn where (s.n,s.u,s.nn) not in (select n,u,nn from l);
+                                     QUERY PLAN                                      
+-------------------------------------------------------------------------------------
+ Nested Loop Anti Join
+   Join Filter: (((s.n = l_1.n) AND (s.u = l_1.u) AND (s.nn = l_1.nn)) IS NOT FALSE)
+   ->  Hash Left Join
+         Hash Cond: (l.nn = s.nn)
+         ->  Seq Scan on l
+         ->  Hash
+               ->  Seq Scan on s
+   ->  Materialize
+         ->  Seq Scan on l l_1
+(9 rows)
+
+select * from s right join l on s.nn = l.nn where (s.n,s.u,s.nn) not in (select n,u,nn from l);
+ u | n | nn | p | u | n | nn | p 
+---+---+----+---+---+---+----+---
+(0 rows)
+
+explain (costs false) select count(*) from s right join l on s.nn = l.nn where (s.n,s.u,s.nn) not in (select n,u,nn from l where u < 0);
+                 QUERY PLAN                  
+---------------------------------------------
+ Aggregate
+   ->  Hash Left Join
+         Hash Cond: (l.nn = s.nn)
+         Filter: (NOT (hashed SubPlan 1))
+         ->  Seq Scan on l
+         ->  Hash
+               ->  Seq Scan on s
+         SubPlan 1
+           ->  Index Scan using l_u on l l_1
+                 Index Cond: (u < 0)
+(10 rows)
+
+select count(*) from s right join l on s.nn = l.nn where (s.n,s.u,s.nn) not in (select n,u,nn from l where u < 0);
+ count 
+-------
+ 10000
+(1 row)
+
+explain (costs false) select * from s where (n,n,n) not in (select distinct n,n,n from l where u > 0 limit 3) order by n;
+                     QUERY PLAN                      
+-----------------------------------------------------
+ Sort
+   Sort Key: s.n
+   ->  Seq Scan on s
+         Filter: (NOT (hashed SubPlan 1))
+         SubPlan 1
+           ->  Limit
+                 ->  Unique
+                       ->  Index Scan using l_n on l
+                             Filter: (u > 0)
+(9 rows)
+
+select * from s where (n,n,n) not in (select distinct n,n,n from l where u > 0 limit 3) order by n;
+    u    |    n    |   nn    |   p    
+---------+---------+---------+--------
+ 1000002 | 1000002 | 1000002 | foofoo
+(1 row)
+
+--test outer has strict predicate or inner join
+explain (costs false) select * from s where n not in (select n from l) and n > 0;
+                    QUERY PLAN                    
+--------------------------------------------------
+ Nested Loop Anti Join
+   ->  Seq Scan on s
+         Filter: (n > 0)
+   ->  Bitmap Heap Scan on l
+         Recheck Cond: ((s.n = n) OR (n IS NULL))
+         ->  BitmapOr
+               ->  Bitmap Index Scan on l_n
+                     Index Cond: (n = s.n)
+               ->  Bitmap Index Scan on l_n
+                     Index Cond: (n IS NULL)
+(10 rows)
+
+select * from s where n not in (select n from l) and n > 0;
+ u | n | nn | p 
+---+---+----+---
+(0 rows)
+
+explain (costs false) select * from s where n not in (select n from l) and u > 0;
+                         QUERY PLAN                          
+-------------------------------------------------------------
+ Nested Loop Anti Join
+   InitPlan 1 (returns $0)
+     ->  Seq Scan on l l_1
+   ->  Seq Scan on s
+         Filter: (((n IS NOT NULL) OR (NOT $0)) AND (u > 0))
+   ->  Bitmap Heap Scan on l
+         Recheck Cond: ((s.n = n) OR (n IS NULL))
+         ->  BitmapOr
+               ->  Bitmap Index Scan on l_n
+                     Index Cond: (n = s.n)
+               ->  Bitmap Index Scan on l_n
+                     Index Cond: (n IS NULL)
+(12 rows)
+
+select * from s where n not in (select n from l) and u > 0;
+ u | n | nn | p 
+---+---+----+---
+(0 rows)
+
+explain (costs false) select * from s where n not in (select n from l) and n is not null;
+                    QUERY PLAN                    
+--------------------------------------------------
+ Nested Loop Anti Join
+   ->  Seq Scan on s
+         Filter: (n IS NOT NULL)
+   ->  Bitmap Heap Scan on l
+         Recheck Cond: ((s.n = n) OR (n IS NULL))
+         ->  BitmapOr
+               ->  Bitmap Index Scan on l_n
+                     Index Cond: (n = s.n)
+               ->  Bitmap Index Scan on l_n
+                     Index Cond: (n IS NULL)
+(10 rows)
+
+select * from s where n not in (select n from l) and n is not null;
+ u | n | nn | p 
+---+---+----+---
+(0 rows)
+
+explain (costs false) select * from s join l on s.n = l.n where s.n not in (select n from l);
+                       QUERY PLAN                       
+--------------------------------------------------------
+ Nested Loop
+   ->  Nested Loop Anti Join
+         ->  Seq Scan on s
+         ->  Bitmap Heap Scan on l l_1
+               Recheck Cond: ((s.n = n) OR (n IS NULL))
+               ->  BitmapOr
+                     ->  Bitmap Index Scan on l_n
+                           Index Cond: (n = s.n)
+                     ->  Bitmap Index Scan on l_n
+                           Index Cond: (n IS NULL)
+   ->  Index Scan using l_n on l
+         Index Cond: (n = s.n)
+(12 rows)
+
+select * from s join l on s.n = l.n where s.n not in (select n from l);
+ u | n | nn | p | u | n | nn | p 
+---+---+----+---+---+---+----+---
+(0 rows)
+
+explain (costs false) select count(*) from s right join l on s.n = l.n where s.n not in (select n from l);
+                       QUERY PLAN                       
+--------------------------------------------------------
+ Aggregate
+   InitPlan 1 (returns $0)
+     ->  Seq Scan on l l_2
+   ->  Nested Loop Anti Join
+         ->  Hash Left Join
+               Hash Cond: (l.n = s.n)
+               Filter: ((s.n IS NOT NULL) OR (NOT $0))
+               ->  Seq Scan on l
+               ->  Hash
+                     ->  Seq Scan on s
+         ->  Bitmap Heap Scan on l l_1
+               Recheck Cond: ((s.n = n) OR (n IS NULL))
+               ->  BitmapOr
+                     ->  Bitmap Index Scan on l_n
+                           Index Cond: (n = s.n)
+                     ->  Bitmap Index Scan on l_n
+                           Index Cond: (n IS NULL)
+(17 rows)
+
+select count(*) from s right join l on s.n = l.n where s.n not in (select n from l);
+ count 
+-------
+     0
+(1 row)
+
+explain (costs false) select count(*) from s right join l on s.n = l.n join s1 on s.u = s1.u where s.n not in (select n from l);
+                             QUERY PLAN                             
+--------------------------------------------------------------------
+ Aggregate
+   ->  Nested Loop
+         Join Filter: (s.u = s1.u)
+         ->  Nested Loop
+               ->  Nested Loop Anti Join
+                     ->  Seq Scan on s
+                     ->  Bitmap Heap Scan on l l_1
+                           Recheck Cond: ((s.n = n) OR (n IS NULL))
+                           ->  BitmapOr
+                                 ->  Bitmap Index Scan on l_n
+                                       Index Cond: (n = s.n)
+                                 ->  Bitmap Index Scan on l_n
+                                       Index Cond: (n IS NULL)
+               ->  Index Only Scan using l_n on l
+                     Index Cond: (n = s.n)
+         ->  Seq Scan on s1
+(16 rows)
+
+select count(*) from s right join l on s.n = l.n join s1 on s.u = s1.u where s.n not in (select n from l);
+ count 
+-------
+     0
+(1 row)
+
+explain (costs false) select count(*) from s join l on s.n = l.n right join s1 on s.u = s1.u where s.n not in (select n from l);
+                          QUERY PLAN                          
+--------------------------------------------------------------
+ Aggregate
+   InitPlan 1 (returns $0)
+     ->  Seq Scan on l l_2
+   ->  Nested Loop Anti Join
+         ->  Nested Loop Left Join
+               Join Filter: (s.u = s1.u)
+               Filter: ((s.n IS NOT NULL) OR (NOT $0))
+               ->  Seq Scan on s1
+               ->  Materialize
+                     ->  Nested Loop
+                           ->  Seq Scan on s
+                           ->  Index Only Scan using l_n on l
+                                 Index Cond: (n = s.n)
+         ->  Bitmap Heap Scan on l l_1
+               Recheck Cond: ((s.n = n) OR (n IS NULL))
+               ->  BitmapOr
+                     ->  Bitmap Index Scan on l_n
+                           Index Cond: (n = s.n)
+                     ->  Bitmap Index Scan on l_n
+                           Index Cond: (n IS NULL)
+(20 rows)
+
+select count(*) from s join l on s.n = l.n right join s1 on s.u = s1.u where s.n not in (select n from l);
+ count 
+-------
+     0
+(1 row)
+
+--test inner has strict predicate or inner join
+explain (costs false) select * from s where u not in (select n from l where n > 0);
+                 QUERY PLAN                  
+---------------------------------------------
+ Nested Loop Anti Join
+   ->  Seq Scan on s
+   ->  Index Only Scan using l_n on l
+         Index Cond: ((n = s.u) AND (n > 0))
+(4 rows)
+
+select * from s where u not in (select n from l where n > 0);
+    u    |    n    |   nn    |   p    
+---------+---------+---------+--------
+ 1000002 | 1000002 | 1000002 | foofoo
+(1 row)
+
+explain (costs false) select * from s where u not in (select n from l where u > 0);
+                    QUERY PLAN                    
+--------------------------------------------------
+ Nested Loop Anti Join
+   ->  Seq Scan on s
+   ->  Bitmap Heap Scan on l
+         Recheck Cond: ((s.u = n) OR (n IS NULL))
+         Filter: (u > 0)
+         ->  BitmapOr
+               ->  Bitmap Index Scan on l_n
+                     Index Cond: (n = s.u)
+               ->  Bitmap Index Scan on l_n
+                     Index Cond: (n IS NULL)
+(10 rows)
+
+select * from s where u not in (select n from l where u > 0);
+ u | n | nn | p 
+---+---+----+---
+(0 rows)
+
+explain (costs false) select * from s where u not in (select n from l where n is not null);
+                     QUERY PLAN                      
+-----------------------------------------------------
+ Nested Loop Anti Join
+   ->  Seq Scan on s
+   ->  Index Only Scan using l_n on l
+         Index Cond: ((n = s.u) AND (n IS NOT NULL))
+(4 rows)
+
+select * from s where u not in (select n from l where n is not null);
+    u    |    n    |   nn    |   p    
+---------+---------+---------+--------
+ 1000002 | 1000002 | 1000002 | foofoo
+(1 row)
+
+explain (costs false) select * from s where u not in (select l.n from l join s on l.n=s.n);
+                  QUERY PLAN                  
+----------------------------------------------
+ Seq Scan on s
+   Filter: (NOT (hashed SubPlan 1))
+   SubPlan 1
+     ->  Nested Loop
+           ->  Seq Scan on s s_1
+           ->  Index Only Scan using l_n on l
+                 Index Cond: (n = s_1.n)
+(7 rows)
+
+select * from s where u not in (select l.n from l join s on l.n=s.n);
+    u    |    n    |   nn    |   p    
+---------+---------+---------+--------
+ 1000002 | 1000002 | 1000002 | foofoo
+       3 |         |       3 | foo
+(2 rows)
+
+explain (costs false) select * from s where u not in (select l.n from l join s on l.u=s.u);
+               QUERY PLAN                
+-----------------------------------------
+ Seq Scan on s
+   Filter: (NOT (hashed SubPlan 1))
+   SubPlan 1
+     ->  Nested Loop
+           ->  Seq Scan on s s_1
+           ->  Index Scan using l_u on l
+                 Index Cond: (u = s_1.u)
+(7 rows)
+
+select * from s where u not in (select l.n from l join s on l.u=s.u);
+    u    |    n    |   nn    |   p    
+---------+---------+---------+--------
+ 1000002 | 1000002 | 1000002 | foofoo
+(1 row)
+
+explain (costs false) select * from s where u not in (select l.n from l join s on l.n = s.n);
+                  QUERY PLAN                  
+----------------------------------------------
+ Seq Scan on s
+   Filter: (NOT (hashed SubPlan 1))
+   SubPlan 1
+     ->  Nested Loop
+           ->  Seq Scan on s s_1
+           ->  Index Only Scan using l_n on l
+                 Index Cond: (n = s_1.n)
+(7 rows)
+
+select * from s where u not in (select l.n from l join s on l.n = s.n);
+    u    |    n    |   nn    |   p    
+---------+---------+---------+--------
+ 1000002 | 1000002 | 1000002 | foofoo
+       3 |         |       3 | foo
+(2 rows)
+
+explain (costs false) select * from s where u not in (select l.n from l right join s on l.n = s.n);
+                  QUERY PLAN                  
+----------------------------------------------
+ Seq Scan on s
+   Filter: (NOT (hashed SubPlan 1))
+   SubPlan 1
+     ->  Nested Loop Left Join
+           ->  Seq Scan on s s_1
+           ->  Index Only Scan using l_n on l
+                 Index Cond: (n = s_1.n)
+(7 rows)
+
+select * from s where u not in (select l.n from l right join s on l.n = s.n);
+ u | n | nn | p 
+---+---+----+---
+(0 rows)
+
+explain (costs false) select * from s where u not in (select l.n from l right join s on l.n=s.n join s1 on l.n=s1.n);
+                  QUERY PLAN                  
+----------------------------------------------
+ Seq Scan on s
+   Filter: (NOT (hashed SubPlan 1))
+   SubPlan 1
+     ->  Nested Loop
+           ->  Nested Loop
+                 Join Filter: (s_1.n = s1.n)
+                 ->  Seq Scan on s1
+                 ->  Materialize
+                       ->  Seq Scan on s s_1
+           ->  Index Only Scan using l_n on l
+                 Index Cond: (n = s_1.n)
+(11 rows)
+
+select * from s where u not in (select l.n from l right join s on l.n=s.n join s1 on l.n=s1.n);
+    u    |    n    |   nn    |   p    
+---------+---------+---------+--------
+ 1000002 | 1000002 | 1000002 | foofoo
+       3 |         |       3 | foo
+(2 rows)
+
+explain (costs false) select * from s where u not in (select l.n from l join s on l.n=s.n right join s1 on l.n=s1.n);
+                        QUERY PLAN                        
+----------------------------------------------------------
+ Seq Scan on s
+   Filter: (NOT (hashed SubPlan 1))
+   SubPlan 1
+     ->  Nested Loop Left Join
+           Join Filter: (l.n = s1.n)
+           ->  Seq Scan on s1
+           ->  Materialize
+                 ->  Nested Loop
+                       ->  Seq Scan on s s_1
+                       ->  Index Only Scan using l_n on l
+                             Index Cond: (n = s_1.n)
+(11 rows)
+
+select * from s where u not in (select l.n from l join s on l.n=s.n right join s1 on l.n=s1.n);
+ u | n | nn | p 
+---+---+----+---
+(0 rows)
+
+--test both sides have strict predicate or inner join
+explain (costs false) select * from s where n not in (select n from l where n > 0) and n > 0;
+                 QUERY PLAN                  
+---------------------------------------------
+ Nested Loop Anti Join
+   ->  Seq Scan on s
+         Filter: (n > 0)
+   ->  Index Only Scan using l_n on l
+         Index Cond: ((n = s.n) AND (n > 0))
+(5 rows)
+
+select * from s where n not in (select n from l where n > 0) and n > 0;
+    u    |    n    |   nn    |   p    
+---------+---------+---------+--------
+ 1000002 | 1000002 | 1000002 | foofoo
+(1 row)
+
+explain (costs false) select * from s where n not in (select n from l where u > 0) and n > 0;
+                    QUERY PLAN                    
+--------------------------------------------------
+ Nested Loop Anti Join
+   ->  Seq Scan on s
+         Filter: (n > 0)
+   ->  Bitmap Heap Scan on l
+         Recheck Cond: ((s.n = n) OR (n IS NULL))
+         Filter: (u > 0)
+         ->  BitmapOr
+               ->  Bitmap Index Scan on l_n
+                     Index Cond: (n = s.n)
+               ->  Bitmap Index Scan on l_n
+                     Index Cond: (n IS NULL)
+(11 rows)
+
+select * from s where n not in (select n from l where u > 0) and n > 0;
+ u | n | nn | p 
+---+---+----+---
+(0 rows)
+
+explain (costs false) select * from s where n not in (select n from l where n > 0) and u > 0;
+                         QUERY PLAN                          
+-------------------------------------------------------------
+ Nested Loop Anti Join
+   InitPlan 1 (returns $0)
+     ->  Seq Scan on l l_1
+           Filter: (n > 0)
+   ->  Seq Scan on s
+         Filter: (((n IS NOT NULL) OR (NOT $0)) AND (u > 0))
+   ->  Index Only Scan using l_n on l
+         Index Cond: ((n = s.n) AND (n > 0))
+(8 rows)
+
+select * from s where n not in (select n from l where n > 0) and u > 0;
+    u    |    n    |   nn    |   p    
+---------+---------+---------+--------
+ 1000002 | 1000002 | 1000002 | foofoo
+(1 row)
+
+explain (costs false) select * from s right join l on s.n = l.n join s1 on s.u = s1.u where s.n not in (select l.n from l right join s on l.n=s.n join s s1 on l.n=s1.n);
+                             QUERY PLAN                             
+--------------------------------------------------------------------
+ Nested Loop
+   Join Filter: (s.u = s1.u)
+   ->  Seq Scan on s1
+   ->  Materialize
+         ->  Nested Loop
+               ->  Seq Scan on s
+                     Filter: (NOT (hashed SubPlan 1))
+                     SubPlan 1
+                       ->  Nested Loop
+                             ->  Nested Loop
+                                   Join Filter: (s_1.n = s1_1.n)
+                                   ->  Seq Scan on s s_1
+                                   ->  Materialize
+                                         ->  Seq Scan on s s1_1
+                             ->  Index Only Scan using l_n on l l_1
+                                   Index Cond: (n = s_1.n)
+               ->  Index Scan using l_n on l
+                     Index Cond: (n = s.n)
+(18 rows)
+
+select * from s right join l on s.n = l.n join s1 on s.u = s1.u where s.n not in (select l.n from l right join s on l.n=s.n join s s1 on l.n=s1.n);
+ u | n | nn | p | u | n | nn | p | u | n | n1 | nn | p 
+---+---+----+---+---+---+----+---+---+---+----+----+---
+(0 rows)
+
+explain (costs false) select * from s right join l on s.n = l.n join s1 on s.u = s1.u where s.n not in (select l.n from l join s on l.n=s.n right join s s1 on l.n=s1.n);
+                                   QUERY PLAN                                   
+--------------------------------------------------------------------------------
+ Nested Loop
+   Join Filter: (s.u = s1.u)
+   ->  Seq Scan on s1
+   ->  Materialize
+         ->  Nested Loop
+               ->  Seq Scan on s
+                     Filter: (NOT (hashed SubPlan 1))
+                     SubPlan 1
+                       ->  Nested Loop Left Join
+                             Join Filter: (l_1.n = s1_1.n)
+                             ->  Seq Scan on s s1_1
+                             ->  Materialize
+                                   ->  Nested Loop
+                                         ->  Seq Scan on s s_1
+                                         ->  Index Only Scan using l_n on l l_1
+                                               Index Cond: (n = s_1.n)
+               ->  Index Scan using l_n on l
+                     Index Cond: (n = s.n)
+(18 rows)
+
+select * from s right join l on s.n = l.n join s1 on s.u = s1.u where s.n not in (select l.n from l join s on l.n=s.n right join s s1 on l.n=s1.n);
+ u | n | nn | p | u | n | nn | p | u | n | n1 | nn | p 
+---+---+----+---+---+---+----+---+---+---+----+----+---
+(0 rows)
+
+explain (costs false) select * from s join l on s.n = l.n right join s1 on s.u = s1.u where s.n not in (select l.n from l join s on l.n=s.n right join s s1 on l.n=s1.n);
+                          QUERY PLAN                          
+--------------------------------------------------------------
+ Nested Loop Left Join
+   Join Filter: (s.u = s1.u)
+   Filter: (NOT (hashed SubPlan 1))
+   ->  Seq Scan on s1
+   ->  Materialize
+         ->  Nested Loop
+               ->  Seq Scan on s
+               ->  Index Scan using l_n on l
+                     Index Cond: (n = s.n)
+   SubPlan 1
+     ->  Nested Loop Left Join
+           Join Filter: (l_1.n = s1_1.n)
+           ->  Seq Scan on s s1_1
+           ->  Materialize
+                 ->  Nested Loop
+                       ->  Seq Scan on s s_1
+                       ->  Index Only Scan using l_n on l l_1
+                             Index Cond: (n = s_1.n)
+(18 rows)
+
+select * from s join l on s.n = l.n right join s1 on s.u = s1.u where s.n not in (select l.n from l join s on l.n=s.n right join s s1 on l.n=s1.n);
+ u | n | nn | p | u | n | nn | p | u | n | n1 | nn | p 
+---+---+----+---+---+---+----+---+---+---+----+----+---
+(0 rows)
+
+-- clean up
+set work_mem = 4000;
+drop table s;
+drop table s1;
+drop table l;
+drop table empty;
diff --git a/src/test/regress/sql/subselect.sql b/src/test/regress/sql/subselect.sql
index 99ca69791e..f6ce9b5719 100644
--- a/src/test/regress/sql/subselect.sql
+++ b/src/test/regress/sql/subselect.sql
@@ -744,3 +744,507 @@ select * from (with x as (select 2 as y) select * from x) ss;
 explain (verbose, costs off)
 with x as (select * from subselect_tbl)
 select * from x for update;
+
+-- test NON IN to ANTI JOIN conversion
+CREATE TABLE s (u INTEGER NOT NULL, n INTEGER NULL, nn INTEGER NOT NULL, p VARCHAR(128) NULL);
+insert into s (u, n, nn, p)
+    select
+    generate_series(1,3) as u,
+	generate_series(1,3) as n,
+	generate_series(1,3) as nn,
+	'foo' as p;
+insert into s values(1000002, 1000002, 1000002, 'foofoo');
+UPDATE s set n = NULL WHERE n = 3;
+analyze s;
+
+CREATE TABLE l (u INTEGER NOT NULL, n INTEGER NULL, nn INTEGER NOT NULL, p VARCHAR(128) NULL);
+insert into l (u, n, nn, p)
+	select
+    generate_series(1,10000 ) as u,
+	generate_series(1,10000 ) as n,
+	generate_series(1,10000 ) as nn,
+	'bar' as p;
+UPDATE l set n = NULL WHERE n = 7;
+
+CREATE UNIQUE INDEX l_u ON l (u);
+CREATE INDEX l_n ON l (n);
+CREATE INDEX l_nn ON l (nn);
+analyze l;
+
+CREATE TABLE s1 (u INTEGER NOT NULL, n INTEGER NULL, n1 INTEGER NULL, nn INTEGER NOT NULL, p VARCHAR(128) NULL);
+insert into s1 (u, n, n1, nn, p)
+    select
+    generate_series(1,3) as u,
+	generate_series(1,3) as n,
+	generate_series(1,3) as n1,
+	generate_series(1,3) as nn,
+	'foo' as p;
+insert into s1 values(1000003, 1000003, 1000003, 1000003, 'foofoo');
+insert into s1 values(1003, 1003, 1003, 1003, 'foofoo');
+UPDATE s1 set n = NULL WHERE n = 3;
+UPDATE s1 set n1 = NULL WHERE n = 2;
+UPDATE s1 set n1 = NULL WHERE n1 = 3;
+analyze s1;
+
+CREATE TABLE empty (u INTEGER NOT NULL, n INTEGER NULL, nn INTEGER NOT NULL, p VARCHAR(128) NULL);
+analyze empty;
+
+-- set work_mem to 64KB so that NOT IN to ANTI JOIN optimization will kick in
+set work_mem = 64;
+
+-- correctness test 1: inner empty, return every thing from outer including NULL
+explain (costs false) select * from s where n not in (select n from empty);
+
+select * from s where n not in (select n from empty);
+
+-- correctness test 2: inner has NULL, return empty result
+explain (costs false) select * from s where n not in (select n from l);
+
+select * from s where n not in (select n from l);
+
+-- correctness test 3: inner non-null, result has no NULL
+explain (costs false) select * from s where n not in (select u from l);
+
+select * from s where n not in (select u from l);
+
+-- correctness test 4: inner has predicate
+explain (costs false) select * from s where n not in (select n from l where u > 7);
+
+select * from s where n not in (select n from l where u > 7);
+
+-- correctness test 5: multi-expression, (2, 2, null, 2, foo) should be in the result
+explain (costs false) select * from s1 where (n,n1) not in (select u,nn from l where u >= 3);
+
+select * from s1 where (n,n1) not in (select u,nn from l where u >= 3);
+
+-- correctness test 6: multi-expression, (3, null, null, 3, foo) should not be in the result
+explain (costs false) select * from s1 where (n,n1) not in (select u,nn from l where u > 0);
+
+select * from s1 where (n,n1) not in (select u,nn from l where u > 0);
+
+-- correctness test 6: multi-expression, (3, null, null, 3, foo) should be in the result
+explain (costs false) select * from s1 where (n,n1) not in (select u,nn from l where u < 0);
+
+select * from s1 where (n,n1) not in (select u,nn from l where u < 0);
+
+-- test using hashed subplan when inner fits in work_mem
+explain (costs false) select * from l where n not in (select n from s);
+
+select * from l where n not in (select n from s);
+
+-- test single expression
+explain (costs false) select * from s where n not in (select n from l);
+
+select * from s where n not in (select n from l);
+
+explain (costs false) select * from s where u not in (select u from l);
+
+select * from s where u not in (select u from l);
+
+explain (costs false) select * from s where 3*n not in (select n from l);
+
+select * from s where 3*n not in (select n from l);
+
+explain (costs false) select * from s where n not in (select 3*n from l);
+
+select * from s where n not in (select 3*n from l);
+
+-- test single expression with predicates
+explain (costs false) select * from s where n not in (select n from l where u > 0);
+
+select * from s where n not in (select n from l where u > 0);
+
+explain (costs false) select * from s where n not in (select n from l where u > 100);
+
+select * from s where n not in (select n from l where u > 100);
+
+-- test multi expression
+explain (costs false) select * from s where (n,u) not in (select n,u from l);
+
+select * from s where (n,u) not in (select n,u from l);
+
+explain (costs false) select * from s where (u, nn) not in (select u, nn from l);
+
+select * from s where (u, nn) not in (select u, nn from l);
+
+explain (costs false) select * from s where (n,u) not in (select u,n from l);
+
+select * from s where (n,u) not in (select u,n from l);
+
+explain (costs false) select * from s where (n,u,nn) not in (select u,n,nn from l);
+
+select * from s where (n,u,nn) not in (select u,n,nn from l);
+
+explain (costs false) select * from s where (n,u,nn) not in (select u,n,nn from l where u > 1000);
+
+select * from s where (n,u,nn) not in (select u,n,nn from l where u > 1000);
+
+explain (costs false) select * from s where (n,u,nn) not in (select u,n,nn from l where u > 0);
+
+select * from s where (n,u,nn) not in (select u,n,nn from l where u > 0);
+
+explain (costs false) select * from s where (n,u,nn) not in (select u,n,nn from l where u > 1);
+
+select * from s where (n,u,nn) not in (select u,n,nn from l where u > 1);
+
+-- test multi-table
+explain (costs false) select count(*) from s, l where s.n not in (select n from l);
+
+select count(*) from s, l where s.n not in (select n from l);
+
+explain (costs false) select count(*) from s, l where s.nn not in (select nn from l);
+
+select count(*) from s, l where s.nn not in (select nn from l);
+
+-- test null padded results from outer join
+explain (costs false) select * from s where n not in (select s.nn from l left join s on l.nn = s.nn);
+
+select * from s where n not in (select s.nn from l left join s on l.nn = s.nn);
+
+explain (costs false) select * from s where n not in (select s.nn from s right join l on s.nn = l.nn);
+
+select * from s where n not in (select s.nn from s right join l on s.nn = l.nn);
+
+explain (costs false) select count(*) from s right join l on s.nn = l.nn where l.nn not in (select nn from s);
+
+select count(*) from s right join l on s.nn = l.nn where l.nn not in (select nn from s);
+
+explain (costs false) select count(*) from s right join l on s.nn = l.nn where s.nn not in (select nn from s);
+
+select count(*) from s right join l on s.nn = l.nn where s.nn not in (select nn from s);
+
+explain (costs false) select count(*) from s right join l on s.nn=l.nn where l.nn not in (select l.nn from l left join s on l.nn = s.nn);
+
+select count(*) from s right join l on s.nn=l.nn where l.nn not in (select l.nn from l left join s on l.nn = s.nn);
+
+explain (costs false) select count(*) from s right join l on s.nn=l.nn where s.nn not in (select s.nn from l left join s on l.nn = s.nn);
+
+select count(*) from s right join l on s.nn=l.nn where s.nn not in (select s.nn from l left join s on l.nn = s.nn);
+
+explain (costs false) select count(*) from s left join s1 on s.u=s1.u join l on s.u=l.u where s.nn not in (select nn from l);
+
+select count(*) from s left join s1 on s.u=s1.u join l on s.u=l.u where s.nn not in (select nn from l);
+
+explain (costs false) select count(*) from s left join s1 on s.u=s1.u right join l on s.u=l.u where s.nn not in (select nn from l);
+
+select count(*) from s left join s1 on s.u=s1.u right join l on s.u=l.u where s.nn not in (select nn from l);
+
+explain (costs false) select count(*) from s left join s1 on s.u=s1.u left join l on s.u=l.u where s.nn not in (select nn from l);
+
+select count(*) from s left join s1 on s.u=s1.u left join l on s.u=l.u where s.nn not in (select nn from l);
+
+explain (costs false) select count(*) from s right join s1 on s.u=s1.u join l on s.u=l.u where s.nn not in (select nn from l);
+
+select count(*) from s right join s1 on s.u=s1.u join l on s.u=l.u where s.nn not in (select nn from l);
+
+explain (costs false) select count(*) from s join s1 on s.u=s1.u right join l on s.u=l.u where s.nn not in (select nn from l);
+
+select * from s join s1 on s.u=s1.u right join l on s.u=l.u where s.nn not in (select nn from l);
+
+explain (costs false) select count(*) from s full join s1 on s.u=s1.u join l on s.u=l.u where s.nn not in (select nn from l);
+
+select count(*) from s full join s1 on s.u=s1.u join l on s.u=l.u where s.nn not in (select nn from l);
+
+explain (costs false) select count(*) from s join s1 on s.u=s1.u full join l on s.u=l.u where s.nn not in (select nn from l);
+
+select count(*) from s join s1 on s.u=s1.u full join l on s.u=l.u where s.nn not in (select nn from l);
+
+explain (costs false) select * from s where s.nn not in (select l.nn from l left join s on l.nn=s.nn left join s1 on l.nn=s1.nn);
+
+select * from s where s.nn not in (select l.nn from l left join s on l.nn=s.nn left join s1 on l.nn=s1.nn);
+
+explain (costs false) select * from s where s.nn not in (select l.nn from l left join s on l.nn=s.nn right join s1 on l.nn=s1.nn);
+
+select * from s where s.nn not in (select l.nn from l left join s on l.nn=s.nn right join s1 on l.nn=s1.nn);
+
+explain (costs false) select * from s where (n,u,nn) not in (select l.n,l.u,l.nn from l left join s on l.nn = s.nn);
+
+select * from s where (n,u,nn) not in (select l.n,l.u,l.nn from l left join s on l.nn = s.nn);
+
+explain (costs false) select * from s where (n,u,nn) not in (select l.n,l.u,l.nn from l right join s on l.nn = s.nn);
+
+select * from s where (n,u,nn) not in (select l.n,l.u,l.nn from l left join s on l.nn = s.nn);
+
+--test reduce outer joins from outer query
+explain (costs false) select count(*) from s right join l on s.nn = l.nn where s.nn not in (select nn from l);
+
+select count(*) from s right join l on s.nn = l.nn where s.nn not in (select nn from l);
+
+explain (costs false) select count(*) from s right join l on s.nn = l.nn where s.nn not in (select nn from l) and s.u>0;
+
+select count(*) from s right join l on s.nn = l.nn where s.nn not in (select nn from l) and s.u>0;
+
+explain (costs false) select count(*) from s right join l on s.nn = l.nn join s1 on s.u = s1.u where s.nn not in (select nn from l);
+
+select count(*) from s right join l on s.nn = l.nn join s1 on s.u = s1.u where s.nn not in (select nn from l);
+
+explain (costs false) select count(*) from s right join l on s.nn = l.nn right join s1 on s.u = s1.u where s.nn not in (select nn from l);
+
+select count(*) from s right join l on s.nn = l.nn right join s1 on s.u = s1.u where s.nn not in (select nn from l);
+
+explain (costs false) select count(*) from s right join l on s.nn = l.nn left join s1 on s.u = s1.u where s.nn not in (select nn from l);
+
+select count(*) from s right join l on s.nn = l.nn left join s1 on s.u = s1.u where s.nn not in (select nn from l);
+
+--test reduce outer joins from subquery
+explain (costs false) select * from s where nn not in (select l.nn from l right join s on l.nn = s.nn);
+
+select * from s where nn not in (select l.nn from l right join s on l.nn = s.nn);
+
+explain (costs false) select * from s where nn not in (select l.nn from l right join s on l.nn = s.nn where l.u > 9);
+
+select * from s where nn not in (select l.nn from l right join s on l.nn = s.nn where l.u > 9);
+
+explain (costs false) select * from s where nn not in (select l.nn from l right join s on l.nn = s.nn where s.u > 9);
+
+select * from s where nn not in (select l.nn from l right join s on l.nn = s.nn where s.u > 9);
+
+explain (costs false) select * from s where nn not in (select l.nn from l right join s on l.nn = s.nn join s1 on l.n = s1.n);
+
+select * from s where nn not in (select l.nn from l right join s on l.nn = s.nn join s1 on l.n = s1.n);
+
+explain (costs false) select * from s where nn not in (select l.nn from l right join s on l.nn = s.nn right join s1 on l.n = s1.n);
+
+select * from s where nn not in (select l.nn from l right join s on l.nn = s.nn right join s1 on l.n = s1.n);
+
+explain (costs false) select * from s where nn not in (select l.nn from l right join s on l.nn = s.nn left join s1 on l.n = s1.n);
+
+select * from s where nn not in (select l.nn from l right join s on l.nn = s.nn left join s1 on l.n = s1.n);
+
+--test reduce outer join on outer and sub-query
+explain (costs false) select count(*) from s right join l on s.nn = l.nn join s1 on s.u = s1.u where s.nn not in (select l.nn from l right join s on l.nn = s.nn join s1 on l.n = s1.n);
+
+select count(*) from s right join l on s.nn = l.nn join s1 on s.u = s1.u where s.nn not in (select l.nn from l right join s on l.nn = s.nn join s1 on l.n = s1.n);
+
+explain (costs false) select count(*) from s right join l on s.nn = l.nn left join s1 on s.u = s1.u where s.nn not in (select l.nn from l right join s on l.nn = s.nn left join s1 on l.n = s1.n);
+
+select count(*) from s right join l on s.nn = l.nn left join s1 on s.u = s1.u where s.nn not in (select l.nn from l right join s on l.nn = s.nn left join s1 on l.n = s1.n);
+
+-- test union all
+explain (costs false) select * from s as t where not exists
+(select 1 from (select n as y from l union all
+				select u as y from s union all
+				select nn as y from s) as v where t.n=v.y or v.y is null) and n is not null;
+
+select * from s as t where not exists
+(select 1 from (select n as y from l union all
+				select u as y from s union all
+				select nn as y from s) as v where t.n=v.y or v.y is null) and n is not null;
+
+explain (costs false) select * from s where n not in
+(select n as y from l union all
+ select u as y from s union all
+ select nn as y from s);
+
+select * from s where n not in
+(select n as y from l union all
+ select u as y from s union all
+ select nn as y from s);
+
+explain (costs false) select count(*) from
+(select n as x from s union all select u as x from l) t where t.x not in
+(select nn from l);
+
+select count(*) from
+(select n as x from s union all select u as x from l) t where t.x not in
+(select nn from l);
+
+explain (costs false) select count(*) from
+(select n as x from s union all select n as x from l) t where t.x not in
+(select nn from empty);
+
+select count(*) from
+(select n as x from s union all select n as x from l) t where t.x not in
+(select nn from empty);
+
+explain (costs false) select count(*) from
+(select n as x from s union all select u as x from l) t where t.x not in
+(select n as y from l union all
+ select u as y from s union all
+ select nn as y from s);
+
+select count(*) from
+(select n as x from s union all select u as x from l) t where t.x not in
+(select n as y from l union all
+ select u as y from s union all
+ select nn as y from s);
+
+-- test multi-levels of NOT IN
+explain (costs false) select * from s where n not in (select n from s where n not in (select n from l));
+
+select * from s where n not in (select n from s where n not in (select n from l));
+
+explain (costs false) select * from s where n not in (select n from s where n not in (select u from l));
+
+select * from s where n not in (select n from s where n not in (select u from l));
+
+explain (costs false) select count(*) from s where u not in
+(select n from s1 where not exists
+ (select 1 from (select n from s1 where u not in (select n from l)) t where t.n = s.n));
+
+select count(*) from s where u not in
+(select n from s1 where not exists
+ (select 1 from (select n from s1 where u not in (select n from l)) t where t.n = s.n));
+
+explain (costs false) select * from s where n not in (select n from s1) and u not in (select u from s1) and nn not in (select nn from s1);
+
+select * from s where n not in (select n from s1) and u not in (select u from s1) and nn not in (select nn from s1);
+
+explain (costs false) select * from s where n not in (select n from s1) and u not in (select u from s1) and nn not in (select nn from l);
+
+select * from s where n not in (select n from s1) and u not in (select u from s1) and nn not in (select nn from l);
+
+explain (costs false) select count(*) from s where u not in
+(select n from s1 where not exists
+ (select 1 from (select n from s1 where u not in (select n from l)) t where t.n = s.n))
+and nn not in
+(select n from s1 where not exists
+ (select 1 from (select n from s1 where u not in (select n from l)) t where t.n = s.n));
+
+select count(*) from s where u not in
+(select n from s1 where not exists
+ (select 1 from (select n from s1 where u not in (select n from l)) t where t.n = s.n))
+and nn not in
+(select n from s1 where not exists
+ (select 1 from (select n from s1 where u not in (select n from l)) t where t.n = s.n));
+
+--test COALESCE
+explain (costs false) select * from s where COALESCE(n, -1) not in (select COALESCE(n, -1) from l);
+
+select * from s where COALESCE(n, -1) not in (select COALESCE(n, -1) from l);
+
+explain (costs false) select * from s where COALESCE(n, NULL, -1) not in (select COALESCE(n, NULL, -1) from l);
+
+select * from s where COALESCE(n, NULL, -1) not in (select COALESCE(n, NULL, -1) from l);
+
+explain (costs false) select * from s where COALESCE(n, NULL, NULL) not in (select COALESCE(n, NULL, NULL) from l);
+
+select * from s where COALESCE(n, NULL, NULL) not in (select COALESCE(n, NULL, NULL) from l);
+
+explain (costs false) select * from s where COALESCE(n, nn) not in (select COALESCE(n, nn) from l);
+
+select * from s where COALESCE(n, nn) not in (select COALESCE(n, nn) from l);
+
+explain (costs false) select * from s where COALESCE(nn, NULL) not in (select COALESCE(nn, NULL) from l);
+
+select * from s where COALESCE(nn, NULL) not in (select COALESCE(nn, NULL) from l);
+
+explain (costs false) select * from s where (COALESCE(n, -1), nn, COALESCE(n, u)) not in (select COALESCE(n, -1), nn, COALESCE(n, u) from l);
+
+select * from s where (COALESCE(n, -1), nn, COALESCE(n, u)) not in (select COALESCE(n, -1), nn, COALESCE(n, u) from l);
+
+-- test miscellaneous outer nullable cases
+
+explain (costs false) select * from s where (n,n) not in (select n,n from l);
+
+select * from s where (n,n) not in (select n,n from l);
+
+explain (costs false) select * from s right join l on s.nn = l.nn where (s.n,s.u,s.nn) not in (select n,u,nn from l);
+
+select * from s right join l on s.nn = l.nn where (s.n,s.u,s.nn) not in (select n,u,nn from l);
+
+explain (costs false) select count(*) from s right join l on s.nn = l.nn where (s.n,s.u,s.nn) not in (select n,u,nn from l where u < 0);
+
+select count(*) from s right join l on s.nn = l.nn where (s.n,s.u,s.nn) not in (select n,u,nn from l where u < 0);
+
+explain (costs false) select * from s where (n,n,n) not in (select distinct n,n,n from l where u > 0 limit 3) order by n;
+
+select * from s where (n,n,n) not in (select distinct n,n,n from l where u > 0 limit 3) order by n;
+
+--test outer has strict predicate or inner join
+explain (costs false) select * from s where n not in (select n from l) and n > 0;
+
+select * from s where n not in (select n from l) and n > 0;
+
+explain (costs false) select * from s where n not in (select n from l) and u > 0;
+
+select * from s where n not in (select n from l) and u > 0;
+
+explain (costs false) select * from s where n not in (select n from l) and n is not null;
+
+select * from s where n not in (select n from l) and n is not null;
+
+explain (costs false) select * from s join l on s.n = l.n where s.n not in (select n from l);
+
+select * from s join l on s.n = l.n where s.n not in (select n from l);
+
+explain (costs false) select count(*) from s right join l on s.n = l.n where s.n not in (select n from l);
+
+select count(*) from s right join l on s.n = l.n where s.n not in (select n from l);
+
+explain (costs false) select count(*) from s right join l on s.n = l.n join s1 on s.u = s1.u where s.n not in (select n from l);
+
+select count(*) from s right join l on s.n = l.n join s1 on s.u = s1.u where s.n not in (select n from l);
+
+explain (costs false) select count(*) from s join l on s.n = l.n right join s1 on s.u = s1.u where s.n not in (select n from l);
+
+select count(*) from s join l on s.n = l.n right join s1 on s.u = s1.u where s.n not in (select n from l);
+
+--test inner has strict predicate or inner join
+explain (costs false) select * from s where u not in (select n from l where n > 0);
+
+select * from s where u not in (select n from l where n > 0);
+
+explain (costs false) select * from s where u not in (select n from l where u > 0);
+
+select * from s where u not in (select n from l where u > 0);
+
+explain (costs false) select * from s where u not in (select n from l where n is not null);
+
+select * from s where u not in (select n from l where n is not null);
+
+explain (costs false) select * from s where u not in (select l.n from l join s on l.n=s.n);
+
+select * from s where u not in (select l.n from l join s on l.n=s.n);
+
+explain (costs false) select * from s where u not in (select l.n from l join s on l.u=s.u);
+
+select * from s where u not in (select l.n from l join s on l.u=s.u);
+
+explain (costs false) select * from s where u not in (select l.n from l join s on l.n = s.n);
+
+select * from s where u not in (select l.n from l join s on l.n = s.n);
+
+explain (costs false) select * from s where u not in (select l.n from l right join s on l.n = s.n);
+
+select * from s where u not in (select l.n from l right join s on l.n = s.n);
+
+explain (costs false) select * from s where u not in (select l.n from l right join s on l.n=s.n join s1 on l.n=s1.n);
+
+select * from s where u not in (select l.n from l right join s on l.n=s.n join s1 on l.n=s1.n);
+
+explain (costs false) select * from s where u not in (select l.n from l join s on l.n=s.n right join s1 on l.n=s1.n);
+
+select * from s where u not in (select l.n from l join s on l.n=s.n right join s1 on l.n=s1.n);
+
+--test both sides have strict predicate or inner join
+explain (costs false) select * from s where n not in (select n from l where n > 0) and n > 0;
+
+select * from s where n not in (select n from l where n > 0) and n > 0;
+
+explain (costs false) select * from s where n not in (select n from l where u > 0) and n > 0;
+
+select * from s where n not in (select n from l where u > 0) and n > 0;
+
+explain (costs false) select * from s where n not in (select n from l where n > 0) and u > 0;
+
+select * from s where n not in (select n from l where n > 0) and u > 0;
+
+explain (costs false) select * from s right join l on s.n = l.n join s1 on s.u = s1.u where s.n not in (select l.n from l right join s on l.n=s.n join s s1 on l.n=s1.n);
+
+select * from s right join l on s.n = l.n join s1 on s.u = s1.u where s.n not in (select l.n from l right join s on l.n=s.n join s s1 on l.n=s1.n);
+
+explain (costs false) select * from s right join l on s.n = l.n join s1 on s.u = s1.u where s.n not in (select l.n from l join s on l.n=s.n right join s s1 on l.n=s1.n);
+
+select * from s right join l on s.n = l.n join s1 on s.u = s1.u where s.n not in (select l.n from l join s on l.n=s.n right join s s1 on l.n=s1.n);
+
+explain (costs false) select * from s join l on s.n = l.n right join s1 on s.u = s1.u where s.n not in (select l.n from l join s on l.n=s.n right join s s1 on l.n=s1.n);
+
+select * from s join l on s.n = l.n right join s1 on s.u = s1.u where s.n not in (select l.n from l join s on l.n=s.n right join s s1 on l.n=s1.n);
+
+-- clean up
+set work_mem = 4000;
+drop table s;
+drop table s1;
+drop table l;
+drop table empty;
#53Li, Zheng
zhelli@amazon.com
In reply to: Li, Zheng (#52)
1 attachment(s)
Re: NOT IN subquery optimization

-----
To avoid planning the subquery again later on, I want to keep a pointer of
the subplan in SubLink so that we can directly reuse the subplan when
needed. However, this change breaks initdb for some reason and I'm trying to
figure it out.
-----

"make clean" solved the initdb issue. This new patch keeps a pointer of the subplan
in SubLink so that we can directly reuse the subplan when needed. When the subplan
is not hashable (too big to fit in work_mem), the NOT IN query will be flattened to
an ANTI JOIN and we won't need to use subplan again. However, when the subplan
is hashable, we don't do the conversion and will need to use subplan later, patch v2.1
avoids planning the subquery twice in this case.

-----------
Zheng Li
AWS, Amazon Aurora PostgreSQL

Attachments:

not_in_to_anti_join_transformation_v2_1.patchapplication/octet-stream; name=not_in_to_anti_join_transformation_v2_1.patchDownload
commit 2238dedf511ac9555bb372b91b85cf80a3b3f081
Author: Zheng Li <zhelli@amazon.com>
Date:   Fri Jun 14 17:38:52 2019 +0000

        NOT IN to ANTI JOIN transformation
    
        If the NOT IN subquery is not eligible for hashed subplan as decided by
        subplan_is_hashable(), do the following NOT IN to ANTI JOIN
        transformation:
    
        Single expression:
            When x is nullable:
            t1.x not in (t2.y where p) =>
            ANTI JOIN
            t1 (Filter: t1.x IS NOT NULL or NOT EXISTS (select 1 from t2 where p)),
            t2 on join condition (t1.x=t2.y or t2.y IS NULL)
            and p.
            The predicate "t2.y IS NULL" can be removed if y is non-nullable.
    
            When x is non-nullable:
            t1.x not in (t2.y where p) =>
            ANTI JOIN t1, t2 on join condition (t1.x=t2.y or t2.y IS NULL)
            and p.
            The predicate "t2.y IS NULL" can be removed if y is non-nullable.
    
        Multi expression:
            If all xi's are nullable:
            (x1, x2, ... xn) not in (y1, y2, ... yn ) =>
            ANTI JOIN t1, t2 on join condition:
            ((t1.x1 = t2.y1) and ... (t1.xi = t2.yi) ... and
             (t1.xn = t2.yn)) IS NOT FALSE.
    
            If at least one xi is non-nuallable:
            (x1, x2, ... xn) not in (y1, y2, ... yn ) =>
            ANTI JOIN t1, t2 on join condition:
            (t1.x1 = t2.y1 or t2.y1 IS NULL or t1.x1 IS NULL) and ...
            (t1.xi = t2.yi or t2.yi IS NULL) ... and
            (t1.xn = t2.yn or t2.yn IS NULL or t1.xn IS NULL).
    
        Add nullability testing routine is_node_nonnullable(), currently it
        handles Var, TargetEntry, CoalesceExpr and Const. It uses strict
        predicates, inner join conditions and NOT NULL constraint to check
        the nullability of a Var.
    
        Adjust and apply reduce_outer_joins() before the transformation so
        that the outer joins have an opportunity to be converted to inner joins
        prior to the transformation.
    
        We measured performance improvements of two to five orders of magnitude
        on most queries in a development environment. In our performance experiments,
        table s (small) has 11 rows, table l (large) has 1 million rows. s.n and l.n
        have NULL value. s.nn and l.nn are NOT NULL. Index is created on each column.
    
        Cases using hash anti join:
        l.nn not in (l.nn) 21900s -> 235ms
        l.nn not in (l.nn where u>0) 22000s -> 240ms
        l.n not in (l.nn) 21900s -> 238ms
        l.n not in (l.nn where u>0) 22000s -> 248ms
    
        Cases using index nested loop anti join
        s.n not in (l.nn) 360ms -> 0.5ms
        s.n not in (l.nn where u>0) 390ms -> 0.6ms
        s.nn not in (l.nn) 365ms -> 0.5ms
        s.nn not in (l.nn where u>0) 390ms -> 0.5ms
    
        Cases using bitmap heap scan on the inner and nested loop anti join:
        s.n not in (l.n) 360ms -> 0.7ms
        l.n not in (l.n) 21900s ->  1.6s
        l.n not in (l.n where u>0) 22000s -> 1680ms
        s.nn not in (l.n) 360ms -> 0.5ms
        l.nn not in (l.n) 21900s -> 1650ms
        l.nn not in (l.n where u>0) 22000s -> 1660ms
    
        Cases using the original hashed subplan:
        l.n not in (s.n) 63ms -> 63ms
        l.nn not in (s.nn) 63ms -> 63ms
        l.n not in (s.n where u>0) 63ms -> 63ms
    
        Authors: Zheng Li, Jim Finnerty
        Discussion:
            https://www.postgresql.org/message-id/flat/812F84BC-702F-4063-95CC-A7DBD7C9D66B%40amazon.com#bfbaff1ad4a366ed92d7c9657afdb533

diff --git a/src/backend/optimizer/plan/planner.c b/src/backend/optimizer/plan/planner.c
index cb897cc7f4..6edb5f65f5 100644
--- a/src/backend/optimizer/plan/planner.c
+++ b/src/backend/optimizer/plan/planner.c
@@ -991,7 +991,7 @@ subquery_planner(PlannerGlobal *glob, Query *parse,
 	 * preprocessing.
 	 */
 	if (hasOuterJoins)
-		reduce_outer_joins(root);
+		reduce_outer_joins(parse);
 
 	/*
 	 * If we have any RTE_RESULT relations, see if they can be deleted from
diff --git a/src/backend/optimizer/plan/subselect.c b/src/backend/optimizer/plan/subselect.c
index efd0fbc21c..faeee1a2fc 100644
--- a/src/backend/optimizer/plan/subselect.c
+++ b/src/backend/optimizer/plan/subselect.c
@@ -158,8 +158,7 @@ get_first_col_type(Plan *plan, Oid *coltype, int32 *coltypmod,
  * subquery itself is in a resjunk tlist entry whose value is uninteresting).
  */
 static Node *
-make_subplan(PlannerInfo *root, Query *orig_subquery,
-			 SubLinkType subLinkType, int subLinkId,
+make_subplan(PlannerInfo *root, SubLink *sublink,
 			 Node *testexpr, bool isTopQual)
 {
 	Query	   *subquery;
@@ -171,6 +170,9 @@ make_subplan(PlannerInfo *root, Query *orig_subquery,
 	Plan	   *plan;
 	List	   *plan_params;
 	Node	   *result;
+	Query      *orig_subquery = (Query *) sublink->subselect;
+	SubLinkType subLinkType = sublink->subLinkType;
+	int         subLinkId = sublink->subLinkId;
 
 	/*
 	 * Copy the source Query node.  This is a quick and dirty kluge to resolve
@@ -216,24 +218,33 @@ make_subplan(PlannerInfo *root, Query *orig_subquery,
 	/* plan_params should not be in use in current query level */
 	Assert(root->plan_params == NIL);
 
-	/* Generate Paths for the subquery */
-	subroot = subquery_planner(root->glob, subquery,
-							   root,
-							   false, tuple_fraction);
+	if(sublink->subroot != NULL &&
+		sublink->subplan != NULL)
+	{
+		subroot = (PlannerInfo *) sublink->subroot;
+		plan = (Plan *) sublink->subplan;
+	}
+	else
+	{
+		/* Generate Paths for the subquery */
+		subroot = subquery_planner(root->glob, subquery,
+								   root,
+								   false, tuple_fraction);
+
+		/*
+		 * Select best Path and turn it into a Plan.  At least for now, there
+		 * seems no reason to postpone doing that.
+		 */
+		final_rel = fetch_upper_rel(subroot, UPPERREL_FINAL, NULL);
+		best_path = get_cheapest_fractional_path(final_rel, tuple_fraction);
+
+		plan = create_plan(subroot, best_path);
+	}
 
 	/* Isolate the params needed by this specific subplan */
 	plan_params = root->plan_params;
 	root->plan_params = NIL;
 
-	/*
-	 * Select best Path and turn it into a Plan.  At least for now, there
-	 * seems no reason to postpone doing that.
-	 */
-	final_rel = fetch_upper_rel(subroot, UPPERREL_FINAL, NULL);
-	best_path = get_cheapest_fractional_path(final_rel, tuple_fraction);
-
-	plan = create_plan(subroot, best_path);
-
 	/* And convert to SubPlan or InitPlan format. */
 	result = build_subplan(root, plan, subroot, plan_params,
 						   subLinkType, subLinkId,
@@ -1173,6 +1184,361 @@ inline_cte_walker(Node *node, inline_cte_walker_context *context)
 	return expression_tree_walker(node, inline_cte_walker, context);
 }
 
+/* Returns a List of Nodes from the testexpr of an Any SubLink */
+static List *
+getTestExpr(SubLink *sublink)
+{
+	Node * testexpr = sublink->testexpr;
+	Assert(testexpr);
+
+	/* single expression */
+	if(IsA(testexpr, OpExpr))
+	{
+		OpExpr	*opexpr = (OpExpr *) testexpr;
+		Node	*testnode = linitial(opexpr->args);
+
+		return list_make1(testnode);
+	}
+	/* multi-expression */
+	else if(IsA(testexpr, BoolExpr))
+	{
+		BoolExpr	*bexpr = (BoolExpr *) testexpr;
+		ListCell		*lc;
+		Node		*node;
+		List			*result = NULL;
+
+		foreach(lc, bexpr->args)
+		{
+			node = lfirst(lc);
+			if(IsA(node, OpExpr))
+			{
+				OpExpr *expr = (OpExpr *) node;
+				result = lappend(result, linitial(expr->args));
+			}
+			else
+			{
+				elog(ERROR, "unrecognized node type for testexpr: %d",
+						(int) nodeTag(node));
+			}
+		}
+		return result;
+	}
+	else
+	{
+		elog(ERROR, "unrecognized node type for testexpr: %d",
+				(int) nodeTag(testexpr));
+	}
+}
+
+/* Try to reduce outer joins if there is one in the Query */
+static void
+reduce_outer_joins_NOT_IN(Query *parse)
+{
+	ListCell				*lc;
+	RangeTblEntry	*rte;
+
+	foreach(lc, parse->rtable)
+	{
+		rte = (RangeTblEntry *) lfirst(lc);
+		/* try to reduce outer joins if there is one */
+		if (rte->rtekind == RTE_JOIN &&
+				IS_OUTER_JOIN(rte->jointype))
+		{
+			reduce_outer_joins(parse);
+			return;
+		}
+	}
+}
+
+/*
+ * Make clause NOT EXISTS
+ * (select 1 from t2 where p) for the NOT IN to ANTI JOIN
+ * transformation.
+ */
+static
+Node *
+makeExistsTest_NOT_IN(Query *subselect)
+{
+	BoolExpr *notExpr = makeNode(BoolExpr);
+	SubLink *exists = makeNode(SubLink);
+	Query *selectOne =  copyObject(subselect);
+	Const *oneconst;
+	TargetEntry *dummyte;
+
+	/* modify subselect target list to contain a dummy const 1 */
+	oneconst = makeConst(INT4OID,
+						 -1,
+						 InvalidOid,
+						 sizeof(int32),
+						 Int32GetDatum(1),
+						 false, /* isnull */
+						 true); /* pass by value */
+	dummyte = makeTargetEntry((Expr *) oneconst,
+							  1,
+							  "one",
+							  false /* resjunk */ );
+	selectOne->targetList = list_make1(dummyte);
+
+	/* make EXISTS(select 1 from t2 where p) */
+	exists->subLinkType = EXISTS_SUBLINK;
+	exists->subLinkId = 0;
+	exists->subselect = (Node *) selectOne;
+	exists->location = -1;
+	exists->subroot = NULL;
+	exists->subplan = NULL;
+
+	/* make NOT EXISTS(select 1 from t2 where p) */
+	notExpr->boolop = NOT_EXPR;
+	notExpr->args = list_make1(exists);
+	notExpr->location = -1;
+
+	return (Node *) notExpr;
+}
+
+/*
+ * If the NOT IN subquery is not eligible for hashed subplan as
+ * decided by subplan_is_hashable(),
+ * do the following NOT IN to ANTI JOIN conversions:
+ *
+ * When x is non-nullable:
+ * t1.x not in (t2.y where p) => ANTI JOIN
+ * t1, t2 on join condition (t1.x=t2.y or t2.y IS NULL) and p.
+ * the above predicate "t2.y IS NULL" can be removed if y
+ * is also non-nullable.
+ *
+ * When x is nullable:
+ * t1.x not in (t2.y where p) => ANTI JOIN
+ * t1 (Filter: t1.x is not null or not exists (select 1 from t2 where p)),
+ * t2 on join condition (t1.x=t2.y or t2.y is null) and p.
+ * the above predicate "t2.y IS NULL" can be removed if y
+ * is also non-nullable.
+ *
+ * The multi-expression case is just ANDs of the single-
+ * expression case.
+ */
+static bool
+convert_NOT_IN_to_join(PlannerInfo *root, Node **quals,
+									SubLink *sublink, List *subquery_vars,
+									Node **pullout)
+{
+	Query			*parse = root->parse;
+	Query			*subselect = (Query *) sublink->subselect;
+	List				*testnodes = getTestExpr(sublink);
+	bool				 outerNonNull;
+	bool				 innerNonNull;
+	PlannerInfo 	*subroot;
+	double			 tuple_fraction;
+	RelOptInfo		*final_rel;
+	Path				*best_path;
+	Plan				*plan;
+	NullTest 		*nt;
+
+	/* For ALL and ANY subplans, we will be
+	 * able to stop evaluating if the test condition fails or matches, so very
+	 * often not all the tuples will be retrieved; for lack of a better idea,
+	 * specify 50% retrieval.
+	 */
+	tuple_fraction = 0.5;
+	/*
+	 * Generate Paths for the subquery, use a copied version of the subquery
+	 * so that the existing one doesn't get modified.
+	 */
+	subroot = subquery_planner(root->glob, copyObject(subselect),
+								root,
+							   false, tuple_fraction);
+
+	/* Select best Path and turn it into a Plan. */
+	final_rel = fetch_upper_rel(subroot, UPPERREL_FINAL, NULL);
+	best_path = get_cheapest_fractional_path(final_rel, tuple_fraction);
+
+	plan = create_plan(subroot, best_path);
+
+	sublink->subroot = (Node *) subroot;
+	sublink->subplan = (Node *) plan;
+	/*
+	 * Punt if subplan is hashable since using hashed subplan is almost like
+	 * doing a Hash Anti Join, we probably can't do better than that.
+	 */
+	if(subplan_is_hashable(plan))
+	{
+		return false;
+	}
+
+	/*
+	 * Try reduce_outer_joins since outer join affects the nullability test that's coming up next.
+	 * We have to call reduce_outer_joins for outer and inner query separately because we
+	 * don't have a global range table yet.
+	 */
+	reduce_outer_joins_NOT_IN(parse);
+
+	reduce_outer_joins_NOT_IN(subselect);
+
+	Assert(testnodes);
+	outerNonNull =
+					list_hasnonnullable(testnodes, parse);
+	innerNonNull =
+					list_hasnonnullable(subselect->targetList, subselect);
+
+	/* Single-expression case, do the following:
+	 * When x is non-nullable:
+	 * t1.x not in (t2.y where p) => ANTI JOIN
+	 * t1, t2 on join condition (t1.x=t2.y or t2.y IS NULL) and p.
+	 * the above predicate "t2.y IS NULL" can be removed if y
+	 * is also non-nullable.
+	 *
+	 * When x is nullable:
+	 * t1.x not in (t2.y where p) => ANTI JOIN
+	 * t1 (Filter: t1.x is not null or not exists (select 1 from t2 where p)),
+	 * t2 on join condition (t1.x=t2.y or t2.y is null) and p.
+	 * the above predicate "t2.y IS NULL" can be removed if y
+	 * is also non-nullable.
+	 */
+	if(IsA(*quals, OpExpr))
+	{
+		/*  Add "OR y IS NULL" if y is nullable */
+		if(!innerNonNull)
+		{
+			/* make expr y IS NULL */
+			nt = makeNode(NullTest);
+			nt->arg = (Expr *)linitial(subquery_vars);
+			nt->nulltesttype = IS_NULL;
+			nt->argisrow = false;
+			nt->location = -1;
+
+			/* make orclause (x = y OR y IS NULL) */
+			*quals = (Node *)make_orclause(list_make2(*quals,
+								(Node *)nt));
+		}
+
+		/*
+		 * if x is nullable, make the following filter for t1 :
+		 * x IS NOT NULL or NOT EXISTS (select 1 from t2 where p)
+		 */
+		if(!outerNonNull)
+		{
+			Node * existsTest = NULL;
+
+			/* make expr x IS NOT NULL */
+			nt = makeNode(NullTest);
+			nt->arg = (Expr *) linitial(testnodes);
+			nt->nulltesttype = IS_NOT_NULL;
+			nt->argisrow = false;
+			nt->location = -1;
+
+			existsTest = makeExistsTest_NOT_IN(subselect);
+			/* make x IS NOT NULL OR NOT EXISTS (select 1 from t2 where p) */
+			*pullout = (Node *)make_orclause(list_make2(
+									(Node *) nt, existsTest));
+		}
+	}
+	/*
+	 * Multi-expression case:
+	 * If all xi's are nullable:
+	 * (x1, x2, ... xn) not in (y1, y2, ... yn ) =>
+	 * ANTI JOIN t1,
+	 * t2 on join condition:
+	 * ((t1.x1 = t2.y1) and ... (t1.xi = t2.yi) ... and
+	 * (t1.xn = t2.yn)) is NOT FALSE.
+	 *
+	 * If at least one xi is non-nuallable:
+	 * (x1, x2, ... xn) not in (y1, y2, ... yn ) =>
+	 * ANTI JOIN t1,
+	 * t2 on join condition:
+	 * (t1.x1 = t2.y1 or t2.y1 is NULL) and ...
+	 * (t1.xi = t2.yi or t2.yi is NULL or t1.xi is NULL) ... and
+	 * (t1.xn = t2.yn or t2.yn is NULL).
+	 */
+	else if(IsA(*quals, BoolExpr))
+	{
+		/*
+		 * Add IS NOT FALSE on top of the join condition if ALL x_i's are nullable
+		 */
+		if(!outerNonNull)
+		{
+			BooleanTest *btest;
+
+			btest = makeNode(BooleanTest);
+			btest->arg = (Expr *) *quals;
+			btest->booltesttype = IS_NOT_FALSE;
+			*quals = (Node *) btest;
+		}
+		else
+		{
+			ListCell			*qualc;
+			TargetEntry	*te;
+			ListCell			*xc = list_head(testnodes);
+			ListCell			*yc = list_head(subquery_vars);
+			ListCell			*ytlc = list_head(subselect->targetList);
+			List				*quallist = ((BoolExpr *)*quals)->args;
+			Node				*joinCondition = NULL;
+			bool				 xnonNull;
+			bool				 ynonNull;
+
+			/* Reconstruct quals in the loop */
+			*quals = NULL;
+			foreach(qualc, quallist)
+			{
+				te = (TargetEntry *)lfirst(ytlc);
+				/* x_i = y_i */
+				joinCondition = lfirst(qualc);
+				ynonNull = is_node_nonnullable((Node*)te, subselect);
+
+				/* append y_i IS NULL to x_i = y_i if y_i is non-nullable */
+				if(!ynonNull)
+				{
+					/* make expr y_i IS NULL */
+					nt = makeNode(NullTest);
+					nt->arg = (Expr *)lfirst(yc);
+					nt->nulltesttype = IS_NULL;
+					nt->argisrow = false;
+					nt->location = -1;
+
+					/* make orclause (x_i = y_i OR y_i IS NULL) */
+					joinCondition = (Node *)make_orclause(list_make2(joinCondition,
+											(Node *)nt));
+				}
+
+				/*
+				 * Append "OR x_i is null" to the join condition if x_i is nullable.
+				 * Notice at least one x_i should be non-nullable because the all
+				 * x_i's nullable case is handled earlier by adding "IS NOT FALSE"
+				 * on top of the join condition.
+				 */
+				xnonNull = is_node_nonnullable(lfirst(xc), parse);
+				if(!xnonNull)
+				{
+					/* make expr x_i IS NULL */
+					nt = makeNode(NullTest);
+					nt->arg = (Expr *)lfirst(xc);
+					nt->nulltesttype = IS_NULL;
+					nt->argisrow = false;
+					nt->location = -1;
+
+					/* make orclause (x_i = y_i OR y_i IS NULL OR x_i IS NULL) */
+					joinCondition = (Node *)make_orclause(list_make2(joinCondition,
+											(Node *)nt));
+				}
+
+				/*
+				 * Now append joinCondition to quals as one andclause.
+				 * (x_i = y_i OR y_i IS NULL OR x_i IS NULL) AND
+				 * (x_j = y_j OR y_j IS NULL OR x_j IS NULL)...
+				 */
+				*quals = (Node *)make_andclause(list_make2(*quals, joinCondition));
+				xc = lnext(xc);
+				yc = lnext(yc);
+				ytlc = lnext(ytlc);
+			}
+		}
+	}
+	/* quals should be either OpExpr or BoolExpr, otherwise don't convert */
+	else
+	{
+		return false;
+	}
+
+	return true;
+}
 
 /*
  * convert_ANY_sublink_to_join: try to convert an ANY SubLink to a join
@@ -1210,7 +1576,8 @@ inline_cte_walker(Node *node, inline_cte_walker_context *context)
  */
 JoinExpr *
 convert_ANY_sublink_to_join(PlannerInfo *root, SubLink *sublink,
-							Relids available_rels)
+											bool under_not, Node **pullout,
+											Relids available_rels)
 {
 	JoinExpr   *result;
 	Query	   *parse = root->parse;
@@ -1290,11 +1657,22 @@ convert_ANY_sublink_to_join(PlannerInfo *root, SubLink *sublink,
 	 */
 	quals = convert_testexpr(root, sublink->testexpr, subquery_vars);
 
+	/*
+	 * Try converting x NOT IN (y) to ANTI JOIN.
+	 */
+	if(under_not &&
+			!convert_NOT_IN_to_join(root, &quals,
+								sublink, subquery_vars, pullout))
+	{
+		return NULL;
+	}
+
 	/*
 	 * And finally, build the JoinExpr node.
 	 */
 	result = makeNode(JoinExpr);
-	result->jointype = JOIN_SEMI;
+	/* NOT IN will be converted to ANTI JOIN */
+	result->jointype = under_not ? JOIN_ANTI : JOIN_SEMI;
 	result->isNatural = false;
 	result->larg = NULL;		/* caller must fill this in */
 	result->rarg = (Node *) rtr;
@@ -1883,9 +2261,7 @@ process_sublinks_mutator(Node *node, process_sublinks_context *context)
 		 * Now build the SubPlan node and make the expr to return.
 		 */
 		return make_subplan(context->root,
-							(Query *) sublink->subselect,
-							sublink->subLinkType,
-							sublink->subLinkId,
+							sublink,
 							testexpr,
 							context->isTopQual);
 	}
diff --git a/src/backend/optimizer/prep/prepjointree.c b/src/backend/optimizer/prep/prepjointree.c
index 67eeba938d..edc716b679 100644
--- a/src/backend/optimizer/prep/prepjointree.c
+++ b/src/backend/optimizer/prep/prepjointree.c
@@ -104,7 +104,7 @@ static Query *pullup_replace_vars_subquery(Query *query,
 static reduce_outer_joins_state *reduce_outer_joins_pass1(Node *jtnode);
 static void reduce_outer_joins_pass2(Node *jtnode,
 									 reduce_outer_joins_state *state,
-									 PlannerInfo *root,
+									 Query *parse,
 									 Relids nonnullable_rels,
 									 List *nonnullable_vars,
 									 List *forced_null_vars);
@@ -256,6 +256,13 @@ pull_up_sublinks_jointree_recurse(PlannerInfo *root, Node *jtnode,
 		}
 		/* Build the replacement FromExpr; no quals yet */
 		newf = makeFromExpr(newfromlist, NULL);
+		/*
+		 * Replace parse->jointree with newf now because we might modify join types
+		 * during reduce_outer_joins() in convert_NOT_IN_to_join() which is called
+		 * in pull_up_sublinks_qual_recurse() that's coming up next.
+		 */
+		newf->quals = f->quals;
+		root->parse->jointree = newf;
 		/* Set up a link representing the rebuilt jointree */
 		jtlink = (Node *) newf;
 		/* Now process qual --- all children are available for use */
@@ -388,7 +395,7 @@ pull_up_sublinks_qual_recurse(PlannerInfo *root, Node *node,
 		/* Is it a convertible ANY or EXISTS clause? */
 		if (sublink->subLinkType == ANY_SUBLINK)
 		{
-			if ((j = convert_ANY_sublink_to_join(root, sublink,
+			if ((j = convert_ANY_sublink_to_join(root, sublink, false, NULL,
 												 available_rels1)) != NULL)
 			{
 				/* Yes; insert the new join node into the join tree */
@@ -414,7 +421,7 @@ pull_up_sublinks_qual_recurse(PlannerInfo *root, Node *node,
 				return NULL;
 			}
 			if (available_rels2 != NULL &&
-				(j = convert_ANY_sublink_to_join(root, sublink,
+				(j = convert_ANY_sublink_to_join(root, sublink, false, NULL,
 												 available_rels2)) != NULL)
 			{
 				/* Yes; insert the new join node into the join tree */
@@ -560,6 +567,68 @@ pull_up_sublinks_qual_recurse(PlannerInfo *root, Node *node,
 					return NULL;
 				}
 			}
+			else if (sublink->subLinkType == ANY_SUBLINK)
+			{
+				Node *pullout = NULL;
+
+				if ((j = convert_ANY_sublink_to_join(root, sublink, true, &pullout,
+													 available_rels1)) != NULL)
+				{
+					/* Yes; insert the new join node into the join tree */
+					j->larg = *jtlink1;
+					*jtlink1 = (Node *) j;
+					/* Recursively process pulled-up jointree nodes */
+					j->rarg = pull_up_sublinks_jointree_recurse(root,
+																j->rarg,
+																&child_rels);
+
+					/*
+					 * Now recursively process the pulled-up quals.  Any inserted
+					 * joins can get stacked onto either j->larg or j->rarg,
+					 * depending on which rels they reference.
+					 */
+					j->quals = pull_up_sublinks_qual_recurse(root,
+															 j->quals,
+															 &j->larg,
+															 available_rels1,
+															 &j->rarg,
+															 child_rels);
+					/*
+					 * Return pullout predicate (x is NOT NULL) if it's not null,
+					 * otherwise return NULL representing constant TRUE.
+					 */
+					return pullout? pullout : NULL;
+				}
+				if (available_rels2 != NULL &&
+					(j = convert_ANY_sublink_to_join(root, sublink, true, &pullout,
+													 available_rels2)) != NULL)
+				{
+					/* Yes; insert the new join node into the join tree */
+					j->larg = *jtlink2;
+					*jtlink2 = (Node *) j;
+					/* Recursively process pulled-up jointree nodes */
+					j->rarg = pull_up_sublinks_jointree_recurse(root,
+																j->rarg,
+																&child_rels);
+
+					/*
+					 * Now recursively process the pulled-up quals.  Any inserted
+					 * joins can get stacked onto either j->larg or j->rarg,
+					 * depending on which rels they reference.
+					 */
+					j->quals = pull_up_sublinks_qual_recurse(root,
+															 j->quals,
+															 &j->larg,
+															 available_rels2,
+															 &j->rarg,
+															 child_rels);
+					/*
+					 * Return pullout predicate (x is NOT NULL) if it's not null,
+					 * otherwise return NULL representing constant TRUE.
+					 */
+					return pullout? pullout : NULL;
+				}
+			}
 		}
 		/* Else return it unmodified */
 		return node;
@@ -2377,7 +2446,7 @@ flatten_simple_union_all(PlannerInfo *root)
  * alias-var expansion).
  */
 void
-reduce_outer_joins(PlannerInfo *root)
+reduce_outer_joins(Query *parse)
 {
 	reduce_outer_joins_state *state;
 
@@ -2390,14 +2459,14 @@ reduce_outer_joins(PlannerInfo *root)
 	 * join(s) below each side of each join clause. The second pass examines
 	 * qual clauses and changes join types as it descends the tree.
 	 */
-	state = reduce_outer_joins_pass1((Node *) root->parse->jointree);
+	state = reduce_outer_joins_pass1((Node *) parse->jointree);
 
 	/* planner.c shouldn't have called me if no outer joins */
 	if (state == NULL || !state->contains_outer)
 		elog(ERROR, "so where are the outer joins?");
 
-	reduce_outer_joins_pass2((Node *) root->parse->jointree,
-							 state, root, NULL, NIL, NIL);
+	reduce_outer_joins_pass2((Node *) parse->jointree,
+							 state, parse, NULL, NIL, NIL);
 }
 
 /*
@@ -2480,7 +2549,7 @@ reduce_outer_joins_pass1(Node *jtnode)
 static void
 reduce_outer_joins_pass2(Node *jtnode,
 						 reduce_outer_joins_state *state,
-						 PlannerInfo *root,
+						 Query *parse,
 						 Relids nonnullable_rels,
 						 List *nonnullable_vars,
 						 List *forced_null_vars)
@@ -2520,7 +2589,7 @@ reduce_outer_joins_pass2(Node *jtnode,
 			reduce_outer_joins_state *sub_state = lfirst(s);
 
 			if (sub_state->contains_outer)
-				reduce_outer_joins_pass2(lfirst(l), sub_state, root,
+				reduce_outer_joins_pass2(lfirst(l), sub_state, parse,
 										 pass_nonnullable_rels,
 										 pass_nonnullable_vars,
 										 pass_forced_null_vars);
@@ -2632,7 +2701,7 @@ reduce_outer_joins_pass2(Node *jtnode,
 		/* Apply the jointype change, if any, to both jointree node and RTE */
 		if (rtindex && jointype != j->jointype)
 		{
-			RangeTblEntry *rte = rt_fetch(rtindex, root->parse->rtable);
+			RangeTblEntry *rte = rt_fetch(rtindex, parse->rtable);
 
 			Assert(rte->rtekind == RTE_JOIN);
 			Assert(rte->jointype == j->jointype);
@@ -2717,7 +2786,7 @@ reduce_outer_joins_pass2(Node *jtnode,
 					pass_nonnullable_vars = NIL;
 					pass_forced_null_vars = NIL;
 				}
-				reduce_outer_joins_pass2(j->larg, left_state, root,
+				reduce_outer_joins_pass2(j->larg, left_state, parse,
 										 pass_nonnullable_rels,
 										 pass_nonnullable_vars,
 										 pass_forced_null_vars);
@@ -2739,7 +2808,7 @@ reduce_outer_joins_pass2(Node *jtnode,
 					pass_nonnullable_vars = NIL;
 					pass_forced_null_vars = NIL;
 				}
-				reduce_outer_joins_pass2(j->rarg, right_state, root,
+				reduce_outer_joins_pass2(j->rarg, right_state, parse,
 										 pass_nonnullable_rels,
 										 pass_nonnullable_vars,
 										 pass_forced_null_vars);
diff --git a/src/backend/optimizer/util/clauses.c b/src/backend/optimizer/util/clauses.c
index 2e84d6b3b4..7835ecf6ca 100644
--- a/src/backend/optimizer/util/clauses.c
+++ b/src/backend/optimizer/util/clauses.c
@@ -38,10 +38,12 @@
 #include "optimizer/optimizer.h"
 #include "optimizer/plancat.h"
 #include "optimizer/planmain.h"
+#include "optimizer/pathnode.h"
 #include "parser/analyze.h"
 #include "parser/parse_agg.h"
 #include "parser/parse_coerce.h"
 #include "parser/parse_func.h"
+#include "parser/parsetree.h"
 #include "rewrite/rewriteManip.h"
 #include "tcop/tcopprot.h"
 #include "utils/acl.h"
@@ -2011,6 +2013,271 @@ find_forced_null_var(Node *node)
 	return NULL;
 }
 
+/*
+ * find_innerjoined_rels
+ *		Traverse jointree to locate non-outerjoined-rels and quals above them
+ *
+ * We fill innerjoined_rels with the relids of all rels that are not below
+ * the nullable side of any outer join (which would cause their Vars to be
+ * possibly NULL regardless of what's in the catalogs).  In the same scan,
+ * we locate all WHERE and JOIN/ON quals that constrain these rels add them to
+ * the usable_quals list (forming a list with implicit-AND semantics).
+ *
+ * Top-level caller must initialize innerjoined_rels/usable_quals to NULL/NIL.
+ */
+static void
+find_innerjoined_rels(Node *jtnode,
+					  Relids *innerjoined_rels, List **usable_quals)
+{
+	if (jtnode == NULL)
+		return;
+	if (IsA(jtnode, RangeTblRef))
+	{
+		int			varno = ((RangeTblRef *) jtnode)->rtindex;
+
+		*innerjoined_rels = bms_add_member(*innerjoined_rels, varno);
+	}
+	else if (IsA(jtnode, FromExpr))
+	{
+		FromExpr   *f = (FromExpr *) jtnode;
+		ListCell   *lc;
+
+		/* All elements of the FROM list are allowable */
+		foreach(lc, f->fromlist)
+			find_innerjoined_rels((Node *) lfirst(lc),
+								  innerjoined_rels, usable_quals);
+		/* ... and its WHERE quals are too */
+		if (f->quals)
+			*usable_quals = lappend(*usable_quals, f->quals);
+	}
+	else if (IsA(jtnode, JoinExpr))
+	{
+		JoinExpr   *j = (JoinExpr *) jtnode;
+
+		switch (j->jointype)
+		{
+			case JOIN_INNER:
+				/* visit both children */
+				find_innerjoined_rels(j->larg,
+									  innerjoined_rels, usable_quals);
+				find_innerjoined_rels(j->rarg,
+									  innerjoined_rels, usable_quals);
+				/* and grab the ON quals too */
+				if (j->quals)
+					*usable_quals = lappend(*usable_quals, j->quals);
+				break;
+
+			case JOIN_LEFT:
+			case JOIN_SEMI:
+			case JOIN_ANTI:
+
+				/*
+				 * Only the left input is possibly non-nullable; furthermore,
+				 * the quals of this join don't constrain the left input.
+				 * Note: we probably can't see SEMI or ANTI joins at this
+				 * point, but if we do, we can treat them like LEFT joins.
+				 */
+				find_innerjoined_rels(j->larg,
+									  innerjoined_rels, usable_quals);
+				break;
+
+			case JOIN_RIGHT:
+				/* Reverse of the above case */
+				find_innerjoined_rels(j->rarg,
+									  innerjoined_rels, usable_quals);
+				break;
+
+			case JOIN_FULL:
+				/* Neither side is non-nullable, so stop descending */
+				break;
+
+			case JOIN_UNIQUE_OUTER:
+			case JOIN_UNIQUE_INNER:
+				/* Don't think we will see JOIN_UNIQUE_OUTER or
+				 * JOIN_UNIQUE_INNER since they are only used internally in
+				 * the planner in a much later phase (in standard_join_search).
+				*/
+				break;
+
+			default:
+				elog(ERROR, "unrecognized join type: %d",
+					 (int) j->jointype);
+		}
+	}
+	else
+		elog(ERROR, "unrecognized node type: %d",
+			 (int) nodeTag(jtnode));
+}
+
+/*
+ * Returns true if the Node passed in is nonnullable. Currently handles Var,
+ * TargetEntry, CoaleseExpr and Const.
+ * TODO: Add more supporting cases.
+ * A Var is nonnullable if:
+ *	It does not appear on the null-padded side of an outer join and it has NOT NULL constraint,
+ * 	Or if it's forced non-null by inner join or other strict predicates.
+ * A CoalesceExpr is nonnullable if it has a non-null argument.
+ */
+bool
+is_node_nonnullable(Node * node, Query *parse)
+{
+	AttrNumber	 attno = InvalidAttrNumber;
+	Oid				 reloid;
+	Var 				*var = NULL;
+
+	/*
+	 * If the query contains set operations, punt.  The set ops themselves
+	 * couldn't introduce nulls that weren't in their inputs, but the tlist
+	 * present in the top-level query is just dummy and won't give us useful
+	 * info.  We could get an answer by recursing to examine each leaf query,
+	 * but for the moment it doesn't seem worth the extra complication.
+	 *
+	 * Note that we needn't consider other top-level operators such as
+	 * DISTINCT, GROUP BY, etc, as those will not introduce nulls either.
+	 */
+	if (parse->setOperations)
+		return false;
+
+	switch (nodeTag(node))
+	{
+		case T_Var:
+		{
+			RangeTblEntry	*rte;
+
+			var = (Var *) node;
+			attno = var->varattno;
+			rte = rt_fetch(var->varno, parse->rtable);
+			reloid = rte->relid;
+			break;
+		}
+		case T_TargetEntry:
+		{
+			TargetEntry	*te = (TargetEntry *)node;
+			switch(nodeTag(te->expr))
+			{
+				case T_Var:
+				{
+					var = (Var *) te->expr;
+					attno = te->resorigcol;
+					reloid = te->resorigtbl;
+					break;
+				}
+				/* recurse into is_node_nonnullable for other types of Node */
+				default:
+					return is_node_nonnullable((Node *)te->expr, parse);
+			}
+			break;
+		}
+		case T_CoalesceExpr:
+		{
+			ListCell   			*arg;
+			CoalesceExpr	*cexpr = (CoalesceExpr *)node;
+
+			/* handle COALESCE Function by looking for non-null argument */
+			foreach(arg, cexpr->args)
+			{
+				Node	*e = lfirst(arg);
+
+				/* recurse into is_node_nonnullable */
+				if (is_node_nonnullable(e, parse))
+				{
+					return true;
+				}
+			}
+			break;
+		}
+		case T_Const:
+		{
+			if(!((Const *) node)->constisnull)
+			{
+				return true;
+			}
+			break;
+		}
+		/*
+		 * TODO: handle more cases to make the nullability test more accurate
+		 * Assume unhandled cases are nullable.
+		 */
+		default:
+			return false;
+	}
+
+	/*
+	 * If we have found a Var, it is non-null if:
+	 * not on NULL-padded side of an outer join and has NOT NULL constraint
+	 * or forced NOT NULL by inner join conditions or by other strict predicates.
+	 */
+	if(var &&
+			reloid != InvalidOid)
+	{
+		Relids		 innerjoined_rels = NULL;
+		List	   		*innerjoined_useful_quals = NIL;
+		List			*nonnullable_innerjoined_vars = NIL;
+
+		find_innerjoined_rels((Node *) parse->jointree,
+							  &innerjoined_rels,
+							  &innerjoined_useful_quals);
+
+		/*
+		 * Check if the Var is from an INNER JOINed rel, it's also guaranteed
+		 * to not be on the null-padded side of an outer join.
+		 */
+		if (bms_is_member(var->varno, innerjoined_rels))
+		{
+			/*
+			 * If Var is from a plain relation and its column is marked
+			 * NOT NULL according to the catalogs, it can't produce NULL.
+			 */
+			if (get_attnotnull(reloid, attno))
+			{
+				return true;
+			}
+
+			/*
+			 * Otherwise check for the existance of strict predicates which filter
+			 * out NULL values for this Var.
+			 */
+			nonnullable_innerjoined_vars =
+				find_nonnullable_vars((Node *) innerjoined_useful_quals);
+
+			if (list_member(nonnullable_innerjoined_vars, var))
+			{
+				return true;
+			}
+		}
+
+		/*
+		 * If we get here it means -
+		 * The var is on null-padded side of an outer-join, since we've already
+		 * tried reduce_outer_joins(), there isn't any strict predicates to turn this
+		 * outer join to inner join, then there will be no strict predicates to force
+		 * this var non-null as well.
+		 */
+	}
+
+	return false;
+}
+
+/*
+ * Returns true if at least one Node in the input list is non-nullable
+ */
+bool
+list_hasnonnullable(List * list, Query *parse)
+{
+	ListCell	*lc;
+	Node	*node;
+
+	foreach(lc, list)
+	{
+		node = lfirst(lc);
+		if(is_node_nonnullable(node, parse))
+		{
+			return true;
+		}
+	}
+	return false;
+}
+
 /*
  * Can we treat a ScalarArrayOpExpr as strict?
  *
diff --git a/src/backend/utils/cache/lsyscache.c b/src/backend/utils/cache/lsyscache.c
index c13c08a97b..59aa736f1e 100644
--- a/src/backend/utils/cache/lsyscache.c
+++ b/src/backend/utils/cache/lsyscache.c
@@ -908,6 +908,35 @@ get_atttypetypmodcoll(Oid relid, AttrNumber attnum,
 	ReleaseSysCache(tp);
 }
 
+/*
+ * get_attnotnull
+ *		Given the relation id and the attribute number,
+ *		return the "attnotnull" field from the attribute relation.
+ */
+bool
+get_attnotnull(Oid relid, AttrNumber attnum)
+{
+	HeapTuple	  tp;
+	Form_pg_attribute att_tup;
+
+	tp = SearchSysCache2(ATTNUM,
+			ObjectIdGetDatum(relid),
+			Int16GetDatum(attnum));
+
+	if (HeapTupleIsValid(tp))
+	{
+		bool result;
+
+		att_tup = (Form_pg_attribute) GETSTRUCT(tp);
+		result = att_tup->attnotnull;
+		ReleaseSysCache(tp);
+
+		return result;
+	}
+	/* Assume att is nullable if no valid heap tuple is found */
+	return false;
+}
+
 /*				---------- COLLATION CACHE ----------					 */
 
 /*
diff --git a/src/include/nodes/primnodes.h b/src/include/nodes/primnodes.h
index 7c278c0e56..b8cdd5ad1e 100644
--- a/src/include/nodes/primnodes.h
+++ b/src/include/nodes/primnodes.h
@@ -643,6 +643,8 @@ typedef struct SubLink
 	List	   *operName;		/* originally specified operator name */
 	Node	   *subselect;		/* subselect as Query* or raw parsetree */
 	int			location;		/* token location, or -1 if unknown */
+	Node	   *subroot;		/* PlannerInfo for the subquery */
+	Node	   *subplan;		/* best Plan for the subquery */
 } SubLink;
 
 /*
diff --git a/src/include/optimizer/clauses.h b/src/include/optimizer/clauses.h
index 2f9aeec4a7..b3229e96a7 100644
--- a/src/include/optimizer/clauses.h
+++ b/src/include/optimizer/clauses.h
@@ -45,6 +45,9 @@ extern List *find_nonnullable_vars(Node *clause);
 extern List *find_forced_null_vars(Node *clause);
 extern Var *find_forced_null_var(Node *clause);
 
+extern bool is_node_nonnullable(Node * node, Query *parse);
+extern bool list_hasnonnullable(List * list, Query *parse);
+
 extern bool is_pseudo_constant_clause(Node *clause);
 extern bool is_pseudo_constant_clause_relids(Node *clause, Relids relids);
 
diff --git a/src/include/optimizer/prep.h b/src/include/optimizer/prep.h
index a9b2c9026c..ec8af5abf4 100644
--- a/src/include/optimizer/prep.h
+++ b/src/include/optimizer/prep.h
@@ -26,7 +26,7 @@ extern void pull_up_sublinks(PlannerInfo *root);
 extern void inline_set_returning_functions(PlannerInfo *root);
 extern void pull_up_subqueries(PlannerInfo *root);
 extern void flatten_simple_union_all(PlannerInfo *root);
-extern void reduce_outer_joins(PlannerInfo *root);
+extern void reduce_outer_joins(Query *parse);
 extern void remove_useless_result_rtes(PlannerInfo *root);
 extern Relids get_relids_in_jointree(Node *jtnode, bool include_joins);
 extern Relids get_relids_for_join(Query *query, int joinrelid);
diff --git a/src/include/optimizer/subselect.h b/src/include/optimizer/subselect.h
index 71a22eeb64..77c3730e47 100644
--- a/src/include/optimizer/subselect.h
+++ b/src/include/optimizer/subselect.h
@@ -19,6 +19,8 @@
 extern void SS_process_ctes(PlannerInfo *root);
 extern JoinExpr *convert_ANY_sublink_to_join(PlannerInfo *root,
 											 SubLink *sublink,
+											 bool under_not,
+											 Node **pullout,
 											 Relids available_rels);
 extern JoinExpr *convert_EXISTS_sublink_to_join(PlannerInfo *root,
 												SubLink *sublink,
diff --git a/src/include/utils/lsyscache.h b/src/include/utils/lsyscache.h
index c8df5bff9f..d2bf81985d 100644
--- a/src/include/utils/lsyscache.h
+++ b/src/include/utils/lsyscache.h
@@ -90,6 +90,7 @@ extern char get_attgenerated(Oid relid, AttrNumber attnum);
 extern Oid	get_atttype(Oid relid, AttrNumber attnum);
 extern void get_atttypetypmodcoll(Oid relid, AttrNumber attnum,
 								  Oid *typid, int32 *typmod, Oid *collid);
+extern bool get_attnotnull(Oid relid, AttrNumber attnum);
 extern char *get_collation_name(Oid colloid);
 extern bool get_collation_isdeterministic(Oid colloid);
 extern char *get_constraint_name(Oid conoid);
diff --git a/src/test/regress/expected/subselect.out b/src/test/regress/expected/subselect.out
index 2d1963d12a..8e8b2ee8d1 100644
--- a/src/test/regress/expected/subselect.out
+++ b/src/test/regress/expected/subselect.out
@@ -1442,3 +1442,2208 @@ select * from x for update;
    Output: subselect_tbl.f1, subselect_tbl.f2, subselect_tbl.f3
 (2 rows)
 
+-- test NON IN to ANTI JOIN conversion
+CREATE TABLE s (u INTEGER NOT NULL, n INTEGER NULL, nn INTEGER NOT NULL, p VARCHAR(128) NULL);
+insert into s (u, n, nn, p)
+    select
+    generate_series(1,3) as u,
+	generate_series(1,3) as n,
+	generate_series(1,3) as nn,
+	'foo' as p;
+insert into s values(1000002, 1000002, 1000002, 'foofoo');
+UPDATE s set n = NULL WHERE n = 3;
+analyze s;
+CREATE TABLE l (u INTEGER NOT NULL, n INTEGER NULL, nn INTEGER NOT NULL, p VARCHAR(128) NULL);
+insert into l (u, n, nn, p)
+	select
+    generate_series(1,10000 ) as u,
+	generate_series(1,10000 ) as n,
+	generate_series(1,10000 ) as nn,
+	'bar' as p;
+UPDATE l set n = NULL WHERE n = 7;
+CREATE UNIQUE INDEX l_u ON l (u);
+CREATE INDEX l_n ON l (n);
+CREATE INDEX l_nn ON l (nn);
+analyze l;
+CREATE TABLE s1 (u INTEGER NOT NULL, n INTEGER NULL, n1 INTEGER NULL, nn INTEGER NOT NULL, p VARCHAR(128) NULL);
+insert into s1 (u, n, n1, nn, p)
+    select
+    generate_series(1,3) as u,
+	generate_series(1,3) as n,
+	generate_series(1,3) as n1,
+	generate_series(1,3) as nn,
+	'foo' as p;
+insert into s1 values(1000003, 1000003, 1000003, 1000003, 'foofoo');
+insert into s1 values(1003, 1003, 1003, 1003, 'foofoo');
+UPDATE s1 set n = NULL WHERE n = 3;
+UPDATE s1 set n1 = NULL WHERE n = 2;
+UPDATE s1 set n1 = NULL WHERE n1 = 3;
+analyze s1;
+CREATE TABLE empty (u INTEGER NOT NULL, n INTEGER NULL, nn INTEGER NOT NULL, p VARCHAR(128) NULL);
+analyze empty;
+-- set work_mem to 64KB so that NOT IN to ANTI JOIN optimization will kick in
+set work_mem = 64;
+-- correctness test 1: inner empty, return every thing from outer including NULL
+explain (costs false) select * from s where n not in (select n from empty);
+             QUERY PLAN             
+------------------------------------
+ Seq Scan on s
+   Filter: (NOT (hashed SubPlan 1))
+   SubPlan 1
+     ->  Seq Scan on empty
+(4 rows)
+
+select * from s where n not in (select n from empty);
+    u    |    n    |   nn    |   p    
+---------+---------+---------+--------
+       1 |       1 |       1 | foo
+       2 |       2 |       2 | foo
+ 1000002 | 1000002 | 1000002 | foofoo
+       3 |         |       3 | foo
+(4 rows)
+
+-- correctness test 2: inner has NULL, return empty result
+explain (costs false) select * from s where n not in (select n from l);
+                    QUERY PLAN                    
+--------------------------------------------------
+ Nested Loop Anti Join
+   InitPlan 1 (returns $0)
+     ->  Seq Scan on l l_1
+   ->  Seq Scan on s
+         Filter: ((n IS NOT NULL) OR (NOT $0))
+   ->  Bitmap Heap Scan on l
+         Recheck Cond: ((s.n = n) OR (n IS NULL))
+         ->  BitmapOr
+               ->  Bitmap Index Scan on l_n
+                     Index Cond: (n = s.n)
+               ->  Bitmap Index Scan on l_n
+                     Index Cond: (n IS NULL)
+(12 rows)
+
+select * from s where n not in (select n from l);
+ u | n | nn | p 
+---+---+----+---
+(0 rows)
+
+-- correctness test 3: inner non-null, result has no NULL
+explain (costs false) select * from s where n not in (select u from l);
+                  QUERY PLAN                   
+-----------------------------------------------
+ Nested Loop Anti Join
+   InitPlan 1 (returns $0)
+     ->  Seq Scan on l l_1
+   ->  Seq Scan on s
+         Filter: ((n IS NOT NULL) OR (NOT $0))
+   ->  Index Only Scan using l_u on l
+         Index Cond: (u = s.n)
+(7 rows)
+
+select * from s where n not in (select u from l);
+    u    |    n    |   nn    |   p    
+---------+---------+---------+--------
+ 1000002 | 1000002 | 1000002 | foofoo
+(1 row)
+
+-- correctness test 4: inner has predicate
+explain (costs false) select * from s where n not in (select n from l where u > 7);
+                    QUERY PLAN                    
+--------------------------------------------------
+ Nested Loop Anti Join
+   InitPlan 1 (returns $0)
+     ->  Seq Scan on l l_1
+           Filter: (u > 7)
+   ->  Seq Scan on s
+         Filter: ((n IS NOT NULL) OR (NOT $0))
+   ->  Bitmap Heap Scan on l
+         Recheck Cond: ((s.n = n) OR (n IS NULL))
+         Filter: (u > 7)
+         ->  BitmapOr
+               ->  Bitmap Index Scan on l_n
+                     Index Cond: (n = s.n)
+               ->  Bitmap Index Scan on l_n
+                     Index Cond: (n IS NULL)
+(14 rows)
+
+select * from s where n not in (select n from l where u > 7);
+    u    |    n    |   nn    |   p    
+---------+---------+---------+--------
+       1 |       1 |       1 | foo
+       2 |       2 |       2 | foo
+ 1000002 | 1000002 | 1000002 | foofoo
+(3 rows)
+
+-- correctness test 5: multi-expression, (2, 2, null, 2, foo) should be in the result
+explain (costs false) select * from s1 where (n,n1) not in (select u,nn from l where u >= 3);
+                           QUERY PLAN                            
+-----------------------------------------------------------------
+ Nested Loop Anti Join
+   Join Filter: (((s1.n = l.u) AND (s1.n1 = l.nn)) IS NOT FALSE)
+   ->  Seq Scan on s1
+   ->  Materialize
+         ->  Seq Scan on l
+               Filter: (u >= 3)
+(6 rows)
+
+select * from s1 where (n,n1) not in (select u,nn from l where u >= 3);
+    u    |    n    |   n1    |   nn    |   p    
+---------+---------+---------+---------+--------
+       1 |       1 |       1 |       1 | foo
+ 1000003 | 1000003 | 1000003 | 1000003 | foofoo
+       2 |       2 |         |       2 | foo
+(3 rows)
+
+-- correctness test 6: multi-expression, (3, null, null, 3, foo) should not be in the result
+explain (costs false) select * from s1 where (n,n1) not in (select u,nn from l where u > 0);
+                           QUERY PLAN                            
+-----------------------------------------------------------------
+ Nested Loop Anti Join
+   Join Filter: (((s1.n = l.u) AND (s1.n1 = l.nn)) IS NOT FALSE)
+   ->  Seq Scan on s1
+   ->  Materialize
+         ->  Seq Scan on l
+               Filter: (u > 0)
+(6 rows)
+
+select * from s1 where (n,n1) not in (select u,nn from l where u > 0);
+    u    |    n    |   n1    |   nn    |   p    
+---------+---------+---------+---------+--------
+ 1000003 | 1000003 | 1000003 | 1000003 | foofoo
+(1 row)
+
+-- correctness test 6: multi-expression, (3, null, null, 3, foo) should be in the result
+explain (costs false) select * from s1 where (n,n1) not in (select u,nn from l where u < 0);
+             QUERY PLAN             
+------------------------------------
+ Seq Scan on s1
+   Filter: (NOT (hashed SubPlan 1))
+   SubPlan 1
+     ->  Index Scan using l_u on l
+           Index Cond: (u < 0)
+(5 rows)
+
+select * from s1 where (n,n1) not in (select u,nn from l where u < 0);
+    u    |    n    |   n1    |   nn    |   p    
+---------+---------+---------+---------+--------
+       1 |       1 |       1 |       1 | foo
+ 1000003 | 1000003 | 1000003 | 1000003 | foofoo
+    1003 |    1003 |    1003 |    1003 | foofoo
+       2 |       2 |         |       2 | foo
+       3 |         |         |       3 | foo
+(5 rows)
+
+-- test using hashed subplan when inner fits in work_mem
+explain (costs false) select * from l where n not in (select n from s);
+             QUERY PLAN             
+------------------------------------
+ Seq Scan on l
+   Filter: (NOT (hashed SubPlan 1))
+   SubPlan 1
+     ->  Seq Scan on s
+(4 rows)
+
+select * from l where n not in (select n from s);
+ u | n | nn | p 
+---+---+----+---
+(0 rows)
+
+-- test single expression
+explain (costs false) select * from s where n not in (select n from l);
+                    QUERY PLAN                    
+--------------------------------------------------
+ Nested Loop Anti Join
+   InitPlan 1 (returns $0)
+     ->  Seq Scan on l l_1
+   ->  Seq Scan on s
+         Filter: ((n IS NOT NULL) OR (NOT $0))
+   ->  Bitmap Heap Scan on l
+         Recheck Cond: ((s.n = n) OR (n IS NULL))
+         ->  BitmapOr
+               ->  Bitmap Index Scan on l_n
+                     Index Cond: (n = s.n)
+               ->  Bitmap Index Scan on l_n
+                     Index Cond: (n IS NULL)
+(12 rows)
+
+select * from s where n not in (select n from l);
+ u | n | nn | p 
+---+---+----+---
+(0 rows)
+
+explain (costs false) select * from s where u not in (select u from l);
+              QUERY PLAN              
+--------------------------------------
+ Nested Loop Anti Join
+   ->  Seq Scan on s
+   ->  Index Only Scan using l_u on l
+         Index Cond: (u = s.u)
+(4 rows)
+
+select * from s where u not in (select u from l);
+    u    |    n    |   nn    |   p    
+---------+---------+---------+--------
+ 1000002 | 1000002 | 1000002 | foofoo
+(1 row)
+
+explain (costs false) select * from s where 3*n not in (select n from l);
+                       QUERY PLAN                       
+--------------------------------------------------------
+ Nested Loop Anti Join
+   InitPlan 1 (returns $0)
+     ->  Seq Scan on l l_1
+   ->  Seq Scan on s
+         Filter: (((3 * n) IS NOT NULL) OR (NOT $0))
+   ->  Bitmap Heap Scan on l
+         Recheck Cond: (((3 * s.n) = n) OR (n IS NULL))
+         ->  BitmapOr
+               ->  Bitmap Index Scan on l_n
+                     Index Cond: (n = (3 * s.n))
+               ->  Bitmap Index Scan on l_n
+                     Index Cond: (n IS NULL)
+(12 rows)
+
+select * from s where 3*n not in (select n from l);
+ u | n | nn | p 
+---+---+----+---
+(0 rows)
+
+explain (costs false) select * from s where n not in (select 3*n from l);
+                        QUERY PLAN                         
+-----------------------------------------------------------
+ Nested Loop Anti Join
+   Join Filter: ((s.n = (3 * l.n)) OR ((3 * l.n) IS NULL))
+   InitPlan 1 (returns $0)
+     ->  Seq Scan on l l_1
+   ->  Seq Scan on s
+         Filter: ((n IS NOT NULL) OR (NOT $0))
+   ->  Materialize
+         ->  Seq Scan on l
+(8 rows)
+
+select * from s where n not in (select 3*n from l);
+ u | n | nn | p 
+---+---+----+---
+(0 rows)
+
+-- test single expression with predicates
+explain (costs false) select * from s where n not in (select n from l where u > 0);
+                    QUERY PLAN                    
+--------------------------------------------------
+ Nested Loop Anti Join
+   InitPlan 1 (returns $0)
+     ->  Seq Scan on l l_1
+           Filter: (u > 0)
+   ->  Seq Scan on s
+         Filter: ((n IS NOT NULL) OR (NOT $0))
+   ->  Bitmap Heap Scan on l
+         Recheck Cond: ((s.n = n) OR (n IS NULL))
+         Filter: (u > 0)
+         ->  BitmapOr
+               ->  Bitmap Index Scan on l_n
+                     Index Cond: (n = s.n)
+               ->  Bitmap Index Scan on l_n
+                     Index Cond: (n IS NULL)
+(14 rows)
+
+select * from s where n not in (select n from l where u > 0);
+ u | n | nn | p 
+---+---+----+---
+(0 rows)
+
+explain (costs false) select * from s where n not in (select n from l where u > 100);
+                    QUERY PLAN                    
+--------------------------------------------------
+ Nested Loop Anti Join
+   InitPlan 1 (returns $0)
+     ->  Seq Scan on l l_1
+           Filter: (u > 100)
+   ->  Seq Scan on s
+         Filter: ((n IS NOT NULL) OR (NOT $0))
+   ->  Bitmap Heap Scan on l
+         Recheck Cond: ((s.n = n) OR (n IS NULL))
+         Filter: (u > 100)
+         ->  BitmapOr
+               ->  Bitmap Index Scan on l_n
+                     Index Cond: (n = s.n)
+               ->  Bitmap Index Scan on l_n
+                     Index Cond: (n IS NULL)
+(14 rows)
+
+select * from s where n not in (select n from l where u > 100);
+    u    |    n    |   nn    |   p    
+---------+---------+---------+--------
+       1 |       1 |       1 | foo
+       2 |       2 |       2 | foo
+ 1000002 | 1000002 | 1000002 | foofoo
+(3 rows)
+
+-- test multi expression
+explain (costs false) select * from s where (n,u) not in (select n,u from l);
+                         QUERY PLAN                          
+-------------------------------------------------------------
+ Nested Loop Anti Join
+   ->  Seq Scan on s
+   ->  Index Scan using l_u on l
+         Index Cond: (u = s.u)
+         Filter: ((s.n = n) OR (n IS NULL) OR (s.n IS NULL))
+(5 rows)
+
+select * from s where (n,u) not in (select n,u from l);
+    u    |    n    |   nn    |   p    
+---------+---------+---------+--------
+ 1000002 | 1000002 | 1000002 | foofoo
+(1 row)
+
+explain (costs false) select * from s where (u, nn) not in (select u, nn from l);
+            QUERY PLAN            
+----------------------------------
+ Nested Loop Anti Join
+   ->  Seq Scan on s
+   ->  Index Scan using l_nn on l
+         Index Cond: (nn = s.nn)
+         Filter: (s.u = u)
+(5 rows)
+
+select * from s where (u, nn) not in (select u, nn from l);
+    u    |    n    |   nn    |   p    
+---------+---------+---------+--------
+ 1000002 | 1000002 | 1000002 | foofoo
+(1 row)
+
+explain (costs false) select * from s where (n,u) not in (select u,n from l);
+                    QUERY PLAN                    
+--------------------------------------------------
+ Nested Loop Anti Join
+   ->  Seq Scan on s
+   ->  Bitmap Heap Scan on l
+         Recheck Cond: ((s.u = n) OR (n IS NULL))
+         Filter: ((s.n = u) OR (s.n IS NULL))
+         ->  BitmapOr
+               ->  Bitmap Index Scan on l_n
+                     Index Cond: (n = s.u)
+               ->  Bitmap Index Scan on l_n
+                     Index Cond: (n IS NULL)
+(10 rows)
+
+select * from s where (n,u) not in (select u,n from l);
+    u    |    n    |   nn    |   p    
+---------+---------+---------+--------
+ 1000002 | 1000002 | 1000002 | foofoo
+(1 row)
+
+explain (costs false) select * from s where (n,u,nn) not in (select u,n,nn from l);
+                                  QUERY PLAN                                   
+-------------------------------------------------------------------------------
+ Nested Loop Anti Join
+   ->  Seq Scan on s
+   ->  Index Scan using l_nn on l
+         Index Cond: (nn = s.nn)
+         Filter: (((s.n = u) OR (s.n IS NULL)) AND ((s.u = n) OR (n IS NULL)))
+(5 rows)
+
+select * from s where (n,u,nn) not in (select u,n,nn from l);
+    u    |    n    |   nn    |   p    
+---------+---------+---------+--------
+ 1000002 | 1000002 | 1000002 | foofoo
+(1 row)
+
+explain (costs false) select * from s where (n,u,nn) not in (select u,n,nn from l where u > 1000);
+                                          QUERY PLAN                                          
+----------------------------------------------------------------------------------------------
+ Nested Loop Anti Join
+   ->  Seq Scan on s
+   ->  Index Scan using l_nn on l
+         Index Cond: (nn = s.nn)
+         Filter: ((u > 1000) AND ((s.n = u) OR (s.n IS NULL)) AND ((s.u = n) OR (n IS NULL)))
+(5 rows)
+
+select * from s where (n,u,nn) not in (select u,n,nn from l where u > 1000);
+    u    |    n    |   nn    |   p    
+---------+---------+---------+--------
+       1 |       1 |       1 | foo
+       2 |       2 |       2 | foo
+ 1000002 | 1000002 | 1000002 | foofoo
+       3 |         |       3 | foo
+(4 rows)
+
+explain (costs false) select * from s where (n,u,nn) not in (select u,n,nn from l where u > 0);
+                                        QUERY PLAN                                         
+-------------------------------------------------------------------------------------------
+ Nested Loop Anti Join
+   ->  Seq Scan on s
+   ->  Index Scan using l_nn on l
+         Index Cond: (nn = s.nn)
+         Filter: ((u > 0) AND ((s.n = u) OR (s.n IS NULL)) AND ((s.u = n) OR (n IS NULL)))
+(5 rows)
+
+select * from s where (n,u,nn) not in (select u,n,nn from l where u > 0);
+    u    |    n    |   nn    |   p    
+---------+---------+---------+--------
+ 1000002 | 1000002 | 1000002 | foofoo
+(1 row)
+
+explain (costs false) select * from s where (n,u,nn) not in (select u,n,nn from l where u > 1);
+                                        QUERY PLAN                                         
+-------------------------------------------------------------------------------------------
+ Nested Loop Anti Join
+   ->  Seq Scan on s
+   ->  Index Scan using l_nn on l
+         Index Cond: (nn = s.nn)
+         Filter: ((u > 1) AND ((s.n = u) OR (s.n IS NULL)) AND ((s.u = n) OR (n IS NULL)))
+(5 rows)
+
+select * from s where (n,u,nn) not in (select u,n,nn from l where u > 1);
+    u    |    n    |   nn    |   p    
+---------+---------+---------+--------
+       1 |       1 |       1 | foo
+ 1000002 | 1000002 | 1000002 | foofoo
+(2 rows)
+
+-- test multi-table
+explain (costs false) select count(*) from s, l where s.n not in (select n from l);
+                          QUERY PLAN                          
+--------------------------------------------------------------
+ Aggregate
+   InitPlan 1 (returns $0)
+     ->  Seq Scan on l l_2
+   ->  Nested Loop
+         ->  Nested Loop Anti Join
+               ->  Seq Scan on s
+                     Filter: ((n IS NOT NULL) OR (NOT $0))
+               ->  Bitmap Heap Scan on l l_1
+                     Recheck Cond: ((s.n = n) OR (n IS NULL))
+                     ->  BitmapOr
+                           ->  Bitmap Index Scan on l_n
+                                 Index Cond: (n = s.n)
+                           ->  Bitmap Index Scan on l_n
+                                 Index Cond: (n IS NULL)
+         ->  Seq Scan on l
+(15 rows)
+
+select count(*) from s, l where s.n not in (select n from l);
+ count 
+-------
+     0
+(1 row)
+
+explain (costs false) select count(*) from s, l where s.nn not in (select nn from l);
+                      QUERY PLAN                       
+-------------------------------------------------------
+ Aggregate
+   ->  Nested Loop
+         ->  Nested Loop Anti Join
+               ->  Seq Scan on s
+               ->  Index Only Scan using l_nn on l l_1
+                     Index Cond: (nn = s.nn)
+         ->  Seq Scan on l
+(7 rows)
+
+select count(*) from s, l where s.nn not in (select nn from l);
+ count 
+-------
+ 10000
+(1 row)
+
+-- test null padded results from outer join
+explain (costs false) select * from s where n not in (select s.nn from l left join s on l.nn = s.nn);
+                     QUERY PLAN                      
+-----------------------------------------------------
+ Nested Loop Anti Join
+   Join Filter: ((s.n = s_1.nn) OR (s_1.nn IS NULL))
+   InitPlan 1 (returns $0)
+     ->  Nested Loop Left Join
+           Join Filter: (l_1.nn = s_2.nn)
+           ->  Seq Scan on l l_1
+           ->  Materialize
+                 ->  Seq Scan on s s_2
+   ->  Seq Scan on s
+         Filter: ((n IS NOT NULL) OR (NOT $0))
+   ->  Hash Left Join
+         Hash Cond: (l.nn = s_1.nn)
+         ->  Seq Scan on l
+         ->  Hash
+               ->  Seq Scan on s s_1
+(15 rows)
+
+select * from s where n not in (select s.nn from l left join s on l.nn = s.nn);
+ u | n | nn | p 
+---+---+----+---
+(0 rows)
+
+explain (costs false) select * from s where n not in (select s.nn from s right join l on s.nn = l.nn);
+                     QUERY PLAN                      
+-----------------------------------------------------
+ Nested Loop Anti Join
+   Join Filter: ((s.n = s_1.nn) OR (s_1.nn IS NULL))
+   InitPlan 1 (returns $0)
+     ->  Nested Loop Left Join
+           Join Filter: (s_2.nn = l_1.nn)
+           ->  Seq Scan on l l_1
+           ->  Materialize
+                 ->  Seq Scan on s s_2
+   ->  Seq Scan on s
+         Filter: ((n IS NOT NULL) OR (NOT $0))
+   ->  Hash Left Join
+         Hash Cond: (l.nn = s_1.nn)
+         ->  Seq Scan on l
+         ->  Hash
+               ->  Seq Scan on s s_1
+(15 rows)
+
+select * from s where n not in (select s.nn from s right join l on s.nn = l.nn);
+ u | n | nn | p 
+---+---+----+---
+(0 rows)
+
+explain (costs false) select count(*) from s right join l on s.nn = l.nn where l.nn not in (select nn from s);
+                   QUERY PLAN                   
+------------------------------------------------
+ Aggregate
+   ->  Hash Left Join
+         Hash Cond: (l.nn = s.nn)
+         ->  Seq Scan on l
+               Filter: (NOT (hashed SubPlan 1))
+               SubPlan 1
+                 ->  Seq Scan on s s_1
+         ->  Hash
+               ->  Seq Scan on s
+(9 rows)
+
+select count(*) from s right join l on s.nn = l.nn where l.nn not in (select nn from s);
+ count 
+-------
+  9997
+(1 row)
+
+explain (costs false) select count(*) from s right join l on s.nn = l.nn where s.nn not in (select nn from s);
+                QUERY PLAN                
+------------------------------------------
+ Aggregate
+   ->  Hash Left Join
+         Hash Cond: (l.nn = s.nn)
+         Filter: (NOT (hashed SubPlan 1))
+         ->  Seq Scan on l
+         ->  Hash
+               ->  Seq Scan on s
+         SubPlan 1
+           ->  Seq Scan on s s_1
+(9 rows)
+
+select count(*) from s right join l on s.nn = l.nn where s.nn not in (select nn from s);
+ count 
+-------
+     0
+(1 row)
+
+explain (costs false) select count(*) from s right join l on s.nn=l.nn where l.nn not in (select l.nn from l left join s on l.nn = s.nn);
+                       QUERY PLAN                       
+--------------------------------------------------------
+ Aggregate
+   ->  Nested Loop Left Join
+         Join Filter: (s.nn = l.nn)
+         ->  Hash Anti Join
+               Hash Cond: (l.nn = l_1.nn)
+               ->  Seq Scan on l
+               ->  Hash
+                     ->  Hash Left Join
+                           Hash Cond: (l_1.nn = s_1.nn)
+                           ->  Seq Scan on l l_1
+                           ->  Hash
+                                 ->  Seq Scan on s s_1
+         ->  Seq Scan on s
+(13 rows)
+
+select count(*) from s right join l on s.nn=l.nn where l.nn not in (select l.nn from l left join s on l.nn = s.nn);
+ count 
+-------
+     0
+(1 row)
+
+explain (costs false) select count(*) from s right join l on s.nn=l.nn where s.nn not in (select s.nn from l left join s on l.nn = s.nn);
+                         QUERY PLAN                         
+------------------------------------------------------------
+ Aggregate
+   InitPlan 1 (returns $0)
+     ->  Nested Loop Left Join
+           Join Filter: (l_2.nn = s_2.nn)
+           ->  Seq Scan on l l_2
+           ->  Materialize
+                 ->  Seq Scan on s s_2
+   ->  Nested Loop Anti Join
+         Join Filter: ((s.nn = s_1.nn) OR (s_1.nn IS NULL))
+         ->  Hash Left Join
+               Hash Cond: (l.nn = s.nn)
+               Filter: ((s.nn IS NOT NULL) OR (NOT $0))
+               ->  Seq Scan on l
+               ->  Hash
+                     ->  Seq Scan on s
+         ->  Materialize
+               ->  Hash Left Join
+                     Hash Cond: (l_1.nn = s_1.nn)
+                     ->  Seq Scan on l l_1
+                     ->  Hash
+                           ->  Seq Scan on s s_1
+(21 rows)
+
+select count(*) from s right join l on s.nn=l.nn where s.nn not in (select s.nn from l left join s on l.nn = s.nn);
+ count 
+-------
+     0
+(1 row)
+
+explain (costs false) select count(*) from s left join s1 on s.u=s1.u join l on s.u=l.u where s.nn not in (select nn from l);
+                         QUERY PLAN                          
+-------------------------------------------------------------
+ Aggregate
+   ->  Nested Loop
+         ->  Nested Loop Left Join
+               Join Filter: (s.u = s1.u)
+               ->  Nested Loop Anti Join
+                     ->  Seq Scan on s
+                     ->  Index Only Scan using l_nn on l l_1
+                           Index Cond: (nn = s.nn)
+               ->  Seq Scan on s1
+         ->  Index Only Scan using l_u on l
+               Index Cond: (u = s.u)
+(11 rows)
+
+select count(*) from s left join s1 on s.u=s1.u join l on s.u=l.u where s.nn not in (select nn from l);
+ count 
+-------
+     0
+(1 row)
+
+explain (costs false) select count(*) from s left join s1 on s.u=s1.u right join l on s.u=l.u where s.nn not in (select nn from l);
+                       QUERY PLAN                       
+--------------------------------------------------------
+ Aggregate
+   InitPlan 1 (returns $0)
+     ->  Seq Scan on l l_2
+   ->  Hash Anti Join
+         Hash Cond: (s.nn = l_1.nn)
+         ->  Hash Left Join
+               Hash Cond: (l.u = s.u)
+               Filter: ((s.nn IS NOT NULL) OR (NOT $0))
+               ->  Seq Scan on l
+               ->  Hash
+                     ->  Hash Right Join
+                           Hash Cond: (s1.u = s.u)
+                           ->  Seq Scan on s1
+                           ->  Hash
+                                 ->  Seq Scan on s
+         ->  Hash
+               ->  Seq Scan on l l_1
+(17 rows)
+
+select count(*) from s left join s1 on s.u=s1.u right join l on s.u=l.u where s.nn not in (select nn from l);
+ count 
+-------
+     0
+(1 row)
+
+explain (costs false) select count(*) from s left join s1 on s.u=s1.u left join l on s.u=l.u where s.nn not in (select nn from l);
+                    QUERY PLAN                     
+---------------------------------------------------
+ Aggregate
+   ->  Nested Loop Left Join
+         Join Filter: (s.u = s1.u)
+         ->  Nested Loop Anti Join
+               ->  Seq Scan on s
+               ->  Index Only Scan using l_nn on l
+                     Index Cond: (nn = s.nn)
+         ->  Seq Scan on s1
+(8 rows)
+
+select count(*) from s left join s1 on s.u=s1.u left join l on s.u=l.u where s.nn not in (select nn from l);
+ count 
+-------
+     1
+(1 row)
+
+explain (costs false) select count(*) from s right join s1 on s.u=s1.u join l on s.u=l.u where s.nn not in (select nn from l);
+                         QUERY PLAN                          
+-------------------------------------------------------------
+ Aggregate
+   ->  Nested Loop
+         ->  Nested Loop
+               Join Filter: (s.u = s1.u)
+               ->  Nested Loop Anti Join
+                     ->  Seq Scan on s
+                     ->  Index Only Scan using l_nn on l l_1
+                           Index Cond: (nn = s.nn)
+               ->  Seq Scan on s1
+         ->  Index Only Scan using l_u on l
+               Index Cond: (u = s.u)
+(11 rows)
+
+select count(*) from s right join s1 on s.u=s1.u join l on s.u=l.u where s.nn not in (select nn from l);
+ count 
+-------
+     0
+(1 row)
+
+explain (costs false) select count(*) from s join s1 on s.u=s1.u right join l on s.u=l.u where s.nn not in (select nn from l);
+                       QUERY PLAN                       
+--------------------------------------------------------
+ Aggregate
+   InitPlan 1 (returns $0)
+     ->  Seq Scan on l l_2
+   ->  Hash Anti Join
+         Hash Cond: (s.nn = l_1.nn)
+         ->  Hash Left Join
+               Hash Cond: (l.u = s.u)
+               Filter: ((s.nn IS NOT NULL) OR (NOT $0))
+               ->  Seq Scan on l
+               ->  Hash
+                     ->  Hash Join
+                           Hash Cond: (s1.u = s.u)
+                           ->  Seq Scan on s1
+                           ->  Hash
+                                 ->  Seq Scan on s
+         ->  Hash
+               ->  Seq Scan on l l_1
+(17 rows)
+
+select * from s join s1 on s.u=s1.u right join l on s.u=l.u where s.nn not in (select nn from l);
+ u | n | nn | p | u | n | n1 | nn | p | u | n | nn | p 
+---+---+----+---+---+---+----+----+---+---+---+----+---
+(0 rows)
+
+explain (costs false) select count(*) from s full join s1 on s.u=s1.u join l on s.u=l.u where s.nn not in (select nn from l);
+                         QUERY PLAN                          
+-------------------------------------------------------------
+ Aggregate
+   ->  Nested Loop
+         ->  Nested Loop Left Join
+               Join Filter: (s.u = s1.u)
+               ->  Nested Loop Anti Join
+                     ->  Seq Scan on s
+                     ->  Index Only Scan using l_nn on l l_1
+                           Index Cond: (nn = s.nn)
+               ->  Seq Scan on s1
+         ->  Index Only Scan using l_u on l
+               Index Cond: (u = s.u)
+(11 rows)
+
+select count(*) from s full join s1 on s.u=s1.u join l on s.u=l.u where s.nn not in (select nn from l);
+ count 
+-------
+     0
+(1 row)
+
+explain (costs false) select count(*) from s join s1 on s.u=s1.u full join l on s.u=l.u where s.nn not in (select nn from l);
+                       QUERY PLAN                       
+--------------------------------------------------------
+ Aggregate
+   InitPlan 1 (returns $0)
+     ->  Seq Scan on l l_2
+   ->  Hash Anti Join
+         Hash Cond: (s.nn = l_1.nn)
+         ->  Hash Full Join
+               Hash Cond: (l.u = s.u)
+               Filter: ((s.nn IS NOT NULL) OR (NOT $0))
+               ->  Seq Scan on l
+               ->  Hash
+                     ->  Hash Join
+                           Hash Cond: (s1.u = s.u)
+                           ->  Seq Scan on s1
+                           ->  Hash
+                                 ->  Seq Scan on s
+         ->  Hash
+               ->  Seq Scan on l l_1
+(17 rows)
+
+select count(*) from s join s1 on s.u=s1.u full join l on s.u=l.u where s.nn not in (select nn from l);
+ count 
+-------
+     0
+(1 row)
+
+explain (costs false) select * from s where s.nn not in (select l.nn from l left join s on l.nn=s.nn left join s1 on l.nn=s1.nn);
+                    QUERY PLAN                     
+---------------------------------------------------
+ Nested Loop Anti Join
+   ->  Seq Scan on s
+   ->  Nested Loop Left Join
+         Join Filter: (l.nn = s1.nn)
+         ->  Nested Loop Left Join
+               Join Filter: (l.nn = s_1.nn)
+               ->  Index Only Scan using l_nn on l
+                     Index Cond: (nn = s.nn)
+               ->  Seq Scan on s s_1
+         ->  Seq Scan on s1
+(10 rows)
+
+select * from s where s.nn not in (select l.nn from l left join s on l.nn=s.nn left join s1 on l.nn=s1.nn);
+    u    |    n    |   nn    |   p    
+---------+---------+---------+--------
+ 1000002 | 1000002 | 1000002 | foofoo
+(1 row)
+
+explain (costs false) select * from s where s.nn not in (select l.nn from l left join s on l.nn=s.nn right join s1 on l.nn=s1.nn);
+                     QUERY PLAN                      
+-----------------------------------------------------
+ Seq Scan on s
+   Filter: (NOT (hashed SubPlan 1))
+   SubPlan 1
+     ->  Nested Loop Left Join
+           Join Filter: (l.nn = s_1.nn)
+           ->  Nested Loop Left Join
+                 ->  Seq Scan on s1
+                 ->  Index Only Scan using l_nn on l
+                       Index Cond: (nn = s1.nn)
+           ->  Materialize
+                 ->  Seq Scan on s s_1
+(11 rows)
+
+select * from s where s.nn not in (select l.nn from l left join s on l.nn=s.nn right join s1 on l.nn=s1.nn);
+ u | n | nn | p 
+---+---+----+---
+(0 rows)
+
+explain (costs false) select * from s where (n,u,nn) not in (select l.n,l.u,l.nn from l left join s on l.nn = s.nn);
+                                    QUERY PLAN                                     
+-----------------------------------------------------------------------------------
+ Nested Loop Anti Join
+   ->  Seq Scan on s
+   ->  Nested Loop Left Join
+         Join Filter: (l.nn = s_1.nn)
+         ->  Index Scan using l_nn on l
+               Index Cond: (nn = s.nn)
+               Filter: (((s.n = n) OR (n IS NULL) OR (s.n IS NULL)) AND (s.u = u))
+         ->  Seq Scan on s s_1
+(8 rows)
+
+select * from s where (n,u,nn) not in (select l.n,l.u,l.nn from l left join s on l.nn = s.nn);
+    u    |    n    |   nn    |   p    
+---------+---------+---------+--------
+ 1000002 | 1000002 | 1000002 | foofoo
+(1 row)
+
+explain (costs false) select * from s where (n,u,nn) not in (select l.n,l.u,l.nn from l right join s on l.nn = s.nn);
+                QUERY PLAN                 
+-------------------------------------------
+ Seq Scan on s
+   Filter: (NOT (hashed SubPlan 1))
+   SubPlan 1
+     ->  Nested Loop Left Join
+           ->  Seq Scan on s s_1
+           ->  Index Scan using l_nn on l
+                 Index Cond: (nn = s_1.nn)
+(7 rows)
+
+select * from s where (n,u,nn) not in (select l.n,l.u,l.nn from l left join s on l.nn = s.nn);
+    u    |    n    |   nn    |   p    
+---------+---------+---------+--------
+ 1000002 | 1000002 | 1000002 | foofoo
+(1 row)
+
+--test reduce outer joins from outer query
+explain (costs false) select count(*) from s right join l on s.nn = l.nn where s.nn not in (select nn from l);
+                       QUERY PLAN                       
+--------------------------------------------------------
+ Aggregate
+   InitPlan 1 (returns $0)
+     ->  Seq Scan on l l_2
+   ->  Hash Anti Join
+         Hash Cond: (s.nn = l_1.nn)
+         ->  Hash Left Join
+               Hash Cond: (l.nn = s.nn)
+               Filter: ((s.nn IS NOT NULL) OR (NOT $0))
+               ->  Seq Scan on l
+               ->  Hash
+                     ->  Seq Scan on s
+         ->  Hash
+               ->  Seq Scan on l l_1
+(13 rows)
+
+select count(*) from s right join l on s.nn = l.nn where s.nn not in (select nn from l);
+ count 
+-------
+     0
+(1 row)
+
+explain (costs false) select count(*) from s right join l on s.nn = l.nn where s.nn not in (select nn from l) and s.u>0;
+                      QUERY PLAN                       
+-------------------------------------------------------
+ Aggregate
+   ->  Nested Loop
+         ->  Nested Loop Anti Join
+               ->  Seq Scan on s
+                     Filter: (u > 0)
+               ->  Index Only Scan using l_nn on l l_1
+                     Index Cond: (nn = s.nn)
+         ->  Index Only Scan using l_nn on l
+               Index Cond: (nn = s.nn)
+(9 rows)
+
+select count(*) from s right join l on s.nn = l.nn where s.nn not in (select nn from l) and s.u>0;
+ count 
+-------
+     0
+(1 row)
+
+explain (costs false) select count(*) from s right join l on s.nn = l.nn join s1 on s.u = s1.u where s.nn not in (select nn from l);
+                         QUERY PLAN                          
+-------------------------------------------------------------
+ Aggregate
+   ->  Nested Loop
+         Join Filter: (s.u = s1.u)
+         ->  Nested Loop
+               ->  Nested Loop Anti Join
+                     ->  Seq Scan on s
+                     ->  Index Only Scan using l_nn on l l_1
+                           Index Cond: (nn = s.nn)
+               ->  Index Only Scan using l_nn on l
+                     Index Cond: (nn = s.nn)
+         ->  Seq Scan on s1
+(11 rows)
+
+select count(*) from s right join l on s.nn = l.nn join s1 on s.u = s1.u where s.nn not in (select nn from l);
+ count 
+-------
+     0
+(1 row)
+
+explain (costs false) select count(*) from s right join l on s.nn = l.nn right join s1 on s.u = s1.u where s.nn not in (select nn from l);
+                          QUERY PLAN                           
+---------------------------------------------------------------
+ Aggregate
+   InitPlan 1 (returns $0)
+     ->  Seq Scan on l l_2
+   ->  Nested Loop Anti Join
+         ->  Nested Loop Left Join
+               Join Filter: (s.u = s1.u)
+               Filter: ((s.nn IS NOT NULL) OR (NOT $0))
+               ->  Seq Scan on s1
+               ->  Materialize
+                     ->  Nested Loop
+                           ->  Seq Scan on s
+                           ->  Index Only Scan using l_nn on l
+                                 Index Cond: (nn = s.nn)
+         ->  Index Only Scan using l_nn on l l_1
+               Index Cond: (nn = s.nn)
+(15 rows)
+
+select count(*) from s right join l on s.nn = l.nn right join s1 on s.u = s1.u where s.nn not in (select nn from l);
+ count 
+-------
+     0
+(1 row)
+
+explain (costs false) select count(*) from s right join l on s.nn = l.nn left join s1 on s.u = s1.u where s.nn not in (select nn from l);
+                          QUERY PLAN                          
+--------------------------------------------------------------
+ Aggregate
+   InitPlan 1 (returns $0)
+     ->  Seq Scan on l l_2
+   ->  Nested Loop Left Join
+         Join Filter: (s.u = s1.u)
+         ->  Hash Anti Join
+               Hash Cond: (s.nn = l_1.nn)
+               ->  Hash Left Join
+                     Hash Cond: (l.nn = s.nn)
+                     Filter: ((s.nn IS NOT NULL) OR (NOT $0))
+                     ->  Seq Scan on l
+                     ->  Hash
+                           ->  Seq Scan on s
+               ->  Hash
+                     ->  Seq Scan on l l_1
+         ->  Seq Scan on s1
+(16 rows)
+
+select count(*) from s right join l on s.nn = l.nn left join s1 on s.u = s1.u where s.nn not in (select nn from l);
+ count 
+-------
+     0
+(1 row)
+
+--test reduce outer joins from subquery
+explain (costs false) select * from s where nn not in (select l.nn from l right join s on l.nn = s.nn);
+                  QUERY PLAN                   
+-----------------------------------------------
+ Seq Scan on s
+   Filter: (NOT (hashed SubPlan 1))
+   SubPlan 1
+     ->  Nested Loop Left Join
+           ->  Seq Scan on s s_1
+           ->  Index Only Scan using l_nn on l
+                 Index Cond: (nn = s_1.nn)
+(7 rows)
+
+select * from s where nn not in (select l.nn from l right join s on l.nn = s.nn);
+ u | n | nn | p 
+---+---+----+---
+(0 rows)
+
+explain (costs false) select * from s where nn not in (select l.nn from l right join s on l.nn = s.nn where l.u > 9);
+                QUERY PLAN                 
+-------------------------------------------
+ Seq Scan on s
+   Filter: (NOT (hashed SubPlan 1))
+   SubPlan 1
+     ->  Nested Loop
+           ->  Seq Scan on s s_1
+           ->  Index Scan using l_nn on l
+                 Index Cond: (nn = s_1.nn)
+                 Filter: (u > 9)
+(8 rows)
+
+select * from s where nn not in (select l.nn from l right join s on l.nn = s.nn where l.u > 9);
+    u    |    n    |   nn    |   p    
+---------+---------+---------+--------
+       1 |       1 |       1 | foo
+       2 |       2 |       2 | foo
+ 1000002 | 1000002 | 1000002 | foofoo
+       3 |         |       3 | foo
+(4 rows)
+
+explain (costs false) select * from s where nn not in (select l.nn from l right join s on l.nn = s.nn where s.u > 9);
+                  QUERY PLAN                   
+-----------------------------------------------
+ Seq Scan on s
+   Filter: (NOT (hashed SubPlan 1))
+   SubPlan 1
+     ->  Nested Loop Left Join
+           ->  Seq Scan on s s_1
+                 Filter: (u > 9)
+           ->  Index Only Scan using l_nn on l
+                 Index Cond: (nn = s_1.nn)
+(8 rows)
+
+select * from s where nn not in (select l.nn from l right join s on l.nn = s.nn where s.u > 9);
+ u | n | nn | p 
+---+---+----+---
+(0 rows)
+
+explain (costs false) select * from s where nn not in (select l.nn from l right join s on l.nn = s.nn join s1 on l.n = s1.n);
+                      QUERY PLAN                       
+-------------------------------------------------------
+ Seq Scan on s
+   Filter: (NOT (hashed SubPlan 1))
+   SubPlan 1
+     ->  Nested Loop
+           Join Filter: (l.n = s1.n)
+           ->  Seq Scan on s1
+           ->  Materialize
+                 ->  Nested Loop
+                       ->  Seq Scan on s s_1
+                       ->  Index Scan using l_nn on l
+                             Index Cond: (nn = s_1.nn)
+(11 rows)
+
+select * from s where nn not in (select l.nn from l right join s on l.nn = s.nn join s1 on l.n = s1.n);
+    u    |    n    |   nn    |   p    
+---------+---------+---------+--------
+ 1000002 | 1000002 | 1000002 | foofoo
+       3 |         |       3 | foo
+(2 rows)
+
+explain (costs false) select * from s where nn not in (select l.nn from l right join s on l.nn = s.nn right join s1 on l.n = s1.n);
+                      QUERY PLAN                       
+-------------------------------------------------------
+ Seq Scan on s
+   Filter: (NOT (hashed SubPlan 1))
+   SubPlan 1
+     ->  Nested Loop Left Join
+           Join Filter: (l.n = s1.n)
+           ->  Seq Scan on s1
+           ->  Materialize
+                 ->  Nested Loop
+                       ->  Seq Scan on s s_1
+                       ->  Index Scan using l_nn on l
+                             Index Cond: (nn = s_1.nn)
+(11 rows)
+
+select * from s where nn not in (select l.nn from l right join s on l.nn = s.nn right join s1 on l.n = s1.n);
+ u | n | nn | p 
+---+---+----+---
+(0 rows)
+
+explain (costs false) select * from s where nn not in (select l.nn from l right join s on l.nn = s.nn left join s1 on l.n = s1.n);
+                   QUERY PLAN                    
+-------------------------------------------------
+ Seq Scan on s
+   Filter: (NOT (hashed SubPlan 1))
+   SubPlan 1
+     ->  Nested Loop Left Join
+           Join Filter: (l.n = s1.n)
+           ->  Nested Loop Left Join
+                 ->  Seq Scan on s s_1
+                 ->  Index Scan using l_nn on l
+                       Index Cond: (nn = s_1.nn)
+           ->  Materialize
+                 ->  Seq Scan on s1
+(11 rows)
+
+select * from s where nn not in (select l.nn from l right join s on l.nn = s.nn left join s1 on l.n = s1.n);
+ u | n | nn | p 
+---+---+----+---
+(0 rows)
+
+--test reduce outer join on outer and sub-query
+explain (costs false) select count(*) from s right join l on s.nn = l.nn join s1 on s.u = s1.u where s.nn not in (select l.nn from l right join s on l.nn = s.nn join s1 on l.n = s1.n);
+                                    QUERY PLAN                                    
+----------------------------------------------------------------------------------
+ Aggregate
+   ->  Nested Loop
+         Join Filter: (s.u = s1.u)
+         ->  Seq Scan on s1
+         ->  Materialize
+               ->  Nested Loop
+                     ->  Seq Scan on s
+                           Filter: (NOT (hashed SubPlan 1))
+                           SubPlan 1
+                             ->  Nested Loop
+                                   Join Filter: (l_1.n = s1_1.n)
+                                   ->  Seq Scan on s1 s1_1
+                                   ->  Materialize
+                                         ->  Nested Loop
+                                               ->  Seq Scan on s s_1
+                                               ->  Index Scan using l_nn on l l_1
+                                                     Index Cond: (nn = s_1.nn)
+                     ->  Index Only Scan using l_nn on l
+                           Index Cond: (nn = s.nn)
+(19 rows)
+
+select count(*) from s right join l on s.nn = l.nn join s1 on s.u = s1.u where s.nn not in (select l.nn from l right join s on l.nn = s.nn join s1 on l.n = s1.n);
+ count 
+-------
+     1
+(1 row)
+
+explain (costs false) select count(*) from s right join l on s.nn = l.nn left join s1 on s.u = s1.u where s.nn not in (select l.nn from l right join s on l.nn = s.nn left join s1 on l.n = s1.n);
+                        QUERY PLAN                        
+----------------------------------------------------------
+ Aggregate
+   ->  Hash Left Join
+         Hash Cond: (l.nn = s.nn)
+         Filter: (NOT (hashed SubPlan 1))
+         ->  Seq Scan on l
+         ->  Hash
+               ->  Hash Right Join
+                     Hash Cond: (s1.u = s.u)
+                     ->  Seq Scan on s1
+                     ->  Hash
+                           ->  Seq Scan on s
+         SubPlan 1
+           ->  Nested Loop Left Join
+                 Join Filter: (l_1.n = s1_1.n)
+                 ->  Nested Loop Left Join
+                       ->  Seq Scan on s s_1
+                       ->  Index Scan using l_nn on l l_1
+                             Index Cond: (nn = s_1.nn)
+                 ->  Materialize
+                       ->  Seq Scan on s1 s1_1
+(20 rows)
+
+select count(*) from s right join l on s.nn = l.nn left join s1 on s.u = s1.u where s.nn not in (select l.nn from l right join s on l.nn = s.nn left join s1 on l.n = s1.n);
+ count 
+-------
+     0
+(1 row)
+
+-- test union all
+explain (costs false) select * from s as t where not exists
+(select 1 from (select n as y from l union all
+				select u as y from s union all
+				select nn as y from s) as v where t.n=v.y or v.y is null) and n is not null;
+                       QUERY PLAN                       
+--------------------------------------------------------
+ Nested Loop Anti Join
+   ->  Seq Scan on s t
+         Filter: (n IS NOT NULL)
+   ->  Append
+         ->  Bitmap Heap Scan on l
+               Recheck Cond: ((t.n = n) OR (n IS NULL))
+               ->  BitmapOr
+                     ->  Bitmap Index Scan on l_n
+                           Index Cond: (n = t.n)
+                     ->  Bitmap Index Scan on l_n
+                           Index Cond: (n IS NULL)
+         ->  Seq Scan on s
+               Filter: ((t.n = u) OR (u IS NULL))
+         ->  Seq Scan on s s_1
+               Filter: ((t.n = nn) OR (nn IS NULL))
+(15 rows)
+
+select * from s as t where not exists
+(select 1 from (select n as y from l union all
+				select u as y from s union all
+				select nn as y from s) as v where t.n=v.y or v.y is null) and n is not null;
+ u | n | nn | p 
+---+---+----+---
+(0 rows)
+
+explain (costs false) select * from s where n not in
+(select n as y from l union all
+ select u as y from s union all
+ select nn as y from s);
+                       QUERY PLAN                       
+--------------------------------------------------------
+ Nested Loop Anti Join
+   InitPlan 1 (returns $0)
+     ->  Result
+           ->  Append
+                 ->  Seq Scan on l l_1
+                 ->  Seq Scan on s s_3
+                 ->  Seq Scan on s s_4
+   ->  Seq Scan on s
+         Filter: ((n IS NOT NULL) OR (NOT $0))
+   ->  Append
+         ->  Bitmap Heap Scan on l
+               Recheck Cond: ((s.n = n) OR (n IS NULL))
+               ->  BitmapOr
+                     ->  Bitmap Index Scan on l_n
+                           Index Cond: (n = s.n)
+                     ->  Bitmap Index Scan on l_n
+                           Index Cond: (n IS NULL)
+         ->  Seq Scan on s s_1
+               Filter: ((s.n = u) OR (u IS NULL))
+         ->  Seq Scan on s s_2
+               Filter: ((s.n = nn) OR (nn IS NULL))
+(21 rows)
+
+select * from s where n not in
+(select n as y from l union all
+ select u as y from s union all
+ select nn as y from s);
+ u | n | nn | p 
+---+---+----+---
+(0 rows)
+
+explain (costs false) select count(*) from
+(select n as x from s union all select u as x from l) t where t.x not in
+(select nn from l);
+                        QUERY PLAN                         
+-----------------------------------------------------------
+ Aggregate
+   InitPlan 1 (returns $0)
+     ->  Seq Scan on l l_2
+   ->  Hash Anti Join
+         Hash Cond: (s.n = l_1.nn)
+         ->  Append
+               ->  Seq Scan on s
+                     Filter: ((n IS NOT NULL) OR (NOT $0))
+               ->  Seq Scan on l
+                     Filter: ((u IS NOT NULL) OR (NOT $0))
+         ->  Hash
+               ->  Seq Scan on l l_1
+(12 rows)
+
+select count(*) from
+(select n as x from s union all select u as x from l) t where t.x not in
+(select nn from l);
+ count 
+-------
+     1
+(1 row)
+
+explain (costs false) select count(*) from
+(select n as x from s union all select n as x from l) t where t.x not in
+(select nn from empty);
+                   QUERY PLAN                   
+------------------------------------------------
+ Aggregate
+   ->  Append
+         ->  Seq Scan on s
+               Filter: (NOT (hashed SubPlan 1))
+               SubPlan 1
+                 ->  Seq Scan on empty
+         ->  Seq Scan on l
+               Filter: (NOT (hashed SubPlan 1))
+(8 rows)
+
+select count(*) from
+(select n as x from s union all select n as x from l) t where t.x not in
+(select nn from empty);
+ count 
+-------
+ 10004
+(1 row)
+
+explain (costs false) select count(*) from
+(select n as x from s union all select u as x from l) t where t.x not in
+(select n as y from l union all
+ select u as y from s union all
+ select nn as y from s);
+                                QUERY PLAN                                
+--------------------------------------------------------------------------
+ Finalize Aggregate
+   InitPlan 1 (returns $0)
+     ->  Result
+           ->  Append
+                 ->  Seq Scan on l l_2
+                 ->  Seq Scan on s s_3
+                 ->  Seq Scan on s s_4
+   ->  Gather
+         Workers Planned: 2
+         Params Evaluated: $0
+         ->  Partial Aggregate
+               ->  Nested Loop Anti Join
+                     ->  Parallel Append
+                           ->  Parallel Seq Scan on l
+                                 Filter: ((u IS NOT NULL) OR (NOT $0))
+                           ->  Parallel Seq Scan on s
+                                 Filter: ((n IS NOT NULL) OR (NOT $0))
+                     ->  Append
+                           ->  Bitmap Heap Scan on l l_1
+                                 Recheck Cond: ((l.u = n) OR (n IS NULL))
+                                 ->  BitmapOr
+                                       ->  Bitmap Index Scan on l_n
+                                             Index Cond: (n = l.u)
+                                       ->  Bitmap Index Scan on l_n
+                                             Index Cond: (n IS NULL)
+                           ->  Seq Scan on s s_1
+                                 Filter: ((l.u = u) OR (u IS NULL))
+                           ->  Seq Scan on s s_2
+                                 Filter: ((l.u = nn) OR (nn IS NULL))
+(29 rows)
+
+select count(*) from
+(select n as x from s union all select u as x from l) t where t.x not in
+(select n as y from l union all
+ select u as y from s union all
+ select nn as y from s);
+ count 
+-------
+     0
+(1 row)
+
+-- test multi-levels of NOT IN
+explain (costs false) select * from s where n not in (select n from s where n not in (select n from l));
+                         QUERY PLAN                         
+------------------------------------------------------------
+ Seq Scan on s
+   Filter: (NOT (hashed SubPlan 2))
+   SubPlan 2
+     ->  Nested Loop Anti Join
+           InitPlan 1 (returns $0)
+             ->  Seq Scan on l
+           ->  Seq Scan on s s_1
+                 Filter: ((n IS NOT NULL) OR (NOT $0))
+           ->  Bitmap Heap Scan on l l_1
+                 Recheck Cond: ((s_1.n = n) OR (n IS NULL))
+                 ->  BitmapOr
+                       ->  Bitmap Index Scan on l_n
+                             Index Cond: (n = s_1.n)
+                       ->  Bitmap Index Scan on l_n
+                             Index Cond: (n IS NULL)
+(15 rows)
+
+select * from s where n not in (select n from s where n not in (select n from l));
+    u    |    n    |   nn    |   p    
+---------+---------+---------+--------
+       1 |       1 |       1 | foo
+       2 |       2 |       2 | foo
+ 1000002 | 1000002 | 1000002 | foofoo
+       3 |         |       3 | foo
+(4 rows)
+
+explain (costs false) select * from s where n not in (select n from s where n not in (select u from l));
+                      QUERY PLAN                       
+-------------------------------------------------------
+ Seq Scan on s
+   Filter: (NOT (hashed SubPlan 2))
+   SubPlan 2
+     ->  Nested Loop Anti Join
+           InitPlan 1 (returns $0)
+             ->  Seq Scan on l
+           ->  Seq Scan on s s_1
+                 Filter: ((n IS NOT NULL) OR (NOT $0))
+           ->  Index Only Scan using l_u on l l_1
+                 Index Cond: (u = s_1.n)
+(10 rows)
+
+select * from s where n not in (select n from s where n not in (select u from l));
+ u | n | nn |  p  
+---+---+----+-----
+ 1 | 1 |  1 | foo
+ 2 | 2 |  2 | foo
+(2 rows)
+
+explain (costs false) select count(*) from s where u not in
+(select n from s1 where not exists
+ (select 1 from (select n from s1 where u not in (select n from l)) t where t.n = s.n));
+                               QUERY PLAN                                
+-------------------------------------------------------------------------
+ Aggregate
+   ->  Seq Scan on s
+         Filter: (NOT (SubPlan 2))
+         SubPlan 2
+           ->  Result
+                 One-Time Filter: (NOT $2)
+                 InitPlan 1 (returns $2)
+                   ->  Nested Loop Anti Join
+                         ->  Seq Scan on s1
+                               Filter: (n = s.n)
+                         ->  Bitmap Heap Scan on l
+                               Recheck Cond: ((s1.u = n) OR (n IS NULL))
+                               ->  BitmapOr
+                                     ->  Bitmap Index Scan on l_n
+                                           Index Cond: (n = s1.u)
+                                     ->  Bitmap Index Scan on l_n
+                                           Index Cond: (n IS NULL)
+                 ->  Seq Scan on s1 s1_1
+(18 rows)
+
+select count(*) from s where u not in
+(select n from s1 where not exists
+ (select 1 from (select n from s1 where u not in (select n from l)) t where t.n = s.n));
+ count 
+-------
+     0
+(1 row)
+
+explain (costs false) select * from s where n not in (select n from s1) and u not in (select u from s1) and nn not in (select nn from s1);
+                                           QUERY PLAN                                           
+------------------------------------------------------------------------------------------------
+ Seq Scan on s
+   Filter: ((NOT (hashed SubPlan 1)) AND (NOT (hashed SubPlan 2)) AND (NOT (hashed SubPlan 3)))
+   SubPlan 1
+     ->  Seq Scan on s1
+   SubPlan 2
+     ->  Seq Scan on s1 s1_1
+   SubPlan 3
+     ->  Seq Scan on s1 s1_2
+(8 rows)
+
+select * from s where n not in (select n from s1) and u not in (select u from s1) and nn not in (select nn from s1);
+ u | n | nn | p 
+---+---+----+---
+(0 rows)
+
+explain (costs false) select * from s where n not in (select n from s1) and u not in (select u from s1) and nn not in (select nn from l);
+                               QUERY PLAN                                
+-------------------------------------------------------------------------
+ Nested Loop Anti Join
+   ->  Seq Scan on s
+         Filter: ((NOT (hashed SubPlan 1)) AND (NOT (hashed SubPlan 2)))
+         SubPlan 1
+           ->  Seq Scan on s1
+         SubPlan 2
+           ->  Seq Scan on s1 s1_1
+   ->  Index Only Scan using l_nn on l
+         Index Cond: (nn = s.nn)
+(9 rows)
+
+select * from s where n not in (select n from s1) and u not in (select u from s1) and nn not in (select nn from l);
+ u | n | nn | p 
+---+---+----+---
+(0 rows)
+
+explain (costs false) select count(*) from s where u not in
+(select n from s1 where not exists
+ (select 1 from (select n from s1 where u not in (select n from l)) t where t.n = s.n))
+and nn not in
+(select n from s1 where not exists
+ (select 1 from (select n from s1 where u not in (select n from l)) t where t.n = s.n));
+                                QUERY PLAN                                 
+---------------------------------------------------------------------------
+ Aggregate
+   ->  Seq Scan on s
+         Filter: ((NOT (SubPlan 2)) AND (NOT (SubPlan 4)))
+         SubPlan 2
+           ->  Result
+                 One-Time Filter: (NOT $2)
+                 InitPlan 1 (returns $2)
+                   ->  Nested Loop Anti Join
+                         ->  Seq Scan on s1
+                               Filter: (n = s.n)
+                         ->  Bitmap Heap Scan on l
+                               Recheck Cond: ((s1.u = n) OR (n IS NULL))
+                               ->  BitmapOr
+                                     ->  Bitmap Index Scan on l_n
+                                           Index Cond: (n = s1.u)
+                                     ->  Bitmap Index Scan on l_n
+                                           Index Cond: (n IS NULL)
+                 ->  Seq Scan on s1 s1_1
+         SubPlan 4
+           ->  Result
+                 One-Time Filter: (NOT $6)
+                 InitPlan 3 (returns $6)
+                   ->  Nested Loop Anti Join
+                         ->  Seq Scan on s1 s1_2
+                               Filter: (n = s.n)
+                         ->  Bitmap Heap Scan on l l_1
+                               Recheck Cond: ((s1_2.u = n) OR (n IS NULL))
+                               ->  BitmapOr
+                                     ->  Bitmap Index Scan on l_n
+                                           Index Cond: (n = s1_2.u)
+                                     ->  Bitmap Index Scan on l_n
+                                           Index Cond: (n IS NULL)
+                 ->  Seq Scan on s1 s1_3
+(33 rows)
+
+select count(*) from s where u not in
+(select n from s1 where not exists
+ (select 1 from (select n from s1 where u not in (select n from l)) t where t.n = s.n))
+and nn not in
+(select n from s1 where not exists
+ (select 1 from (select n from s1 where u not in (select n from l)) t where t.n = s.n));
+ count 
+-------
+     0
+(1 row)
+
+--test COALESCE
+explain (costs false) select * from s where COALESCE(n, -1) not in (select COALESCE(n, -1) from l);
+                                 QUERY PLAN                                 
+----------------------------------------------------------------------------
+ Hash Anti Join
+   Hash Cond: (COALESCE(s.n, '-1'::integer) = COALESCE(l.n, '-1'::integer))
+   ->  Seq Scan on s
+   ->  Hash
+         ->  Seq Scan on l
+(5 rows)
+
+select * from s where COALESCE(n, -1) not in (select COALESCE(n, -1) from l);
+    u    |    n    |   nn    |   p    
+---------+---------+---------+--------
+ 1000002 | 1000002 | 1000002 | foofoo
+(1 row)
+
+explain (costs false) select * from s where COALESCE(n, NULL, -1) not in (select COALESCE(n, NULL, -1) from l);
+                                 QUERY PLAN                                 
+----------------------------------------------------------------------------
+ Hash Anti Join
+   Hash Cond: (COALESCE(s.n, '-1'::integer) = COALESCE(l.n, '-1'::integer))
+   ->  Seq Scan on s
+   ->  Hash
+         ->  Seq Scan on l
+(5 rows)
+
+select * from s where COALESCE(n, NULL, -1) not in (select COALESCE(n, NULL, -1) from l);
+    u    |    n    |   nn    |   p    
+---------+---------+---------+--------
+ 1000002 | 1000002 | 1000002 | foofoo
+(1 row)
+
+explain (costs false) select * from s where COALESCE(n, NULL, NULL) not in (select COALESCE(n, NULL, NULL) from l);
+                                 QUERY PLAN                                  
+-----------------------------------------------------------------------------
+ Nested Loop Anti Join
+   Join Filter: ((COALESCE(s.n) = COALESCE(l.n)) OR (COALESCE(l.n) IS NULL))
+   InitPlan 1 (returns $0)
+     ->  Seq Scan on l l_1
+   ->  Seq Scan on s
+         Filter: ((COALESCE(n) IS NOT NULL) OR (NOT $0))
+   ->  Materialize
+         ->  Seq Scan on l
+(8 rows)
+
+select * from s where COALESCE(n, NULL, NULL) not in (select COALESCE(n, NULL, NULL) from l);
+ u | n | nn | p 
+---+---+----+---
+(0 rows)
+
+explain (costs false) select * from s where COALESCE(n, nn) not in (select COALESCE(n, nn) from l);
+                        QUERY PLAN                        
+----------------------------------------------------------
+ Hash Anti Join
+   Hash Cond: (COALESCE(s.n, s.nn) = COALESCE(l.n, l.nn))
+   ->  Seq Scan on s
+   ->  Hash
+         ->  Seq Scan on l
+(5 rows)
+
+select * from s where COALESCE(n, nn) not in (select COALESCE(n, nn) from l);
+    u    |    n    |   nn    |   p    
+---------+---------+---------+--------
+ 1000002 | 1000002 | 1000002 | foofoo
+(1 row)
+
+explain (costs false) select * from s where COALESCE(nn, NULL) not in (select COALESCE(nn, NULL) from l);
+                   QUERY PLAN                   
+------------------------------------------------
+ Hash Anti Join
+   Hash Cond: (COALESCE(s.nn) = COALESCE(l.nn))
+   ->  Seq Scan on s
+   ->  Hash
+         ->  Seq Scan on l
+(5 rows)
+
+select * from s where COALESCE(nn, NULL) not in (select COALESCE(nn, NULL) from l);
+    u    |    n    |   nn    |   p    
+---------+---------+---------+--------
+ 1000002 | 1000002 | 1000002 | foofoo
+(1 row)
+
+explain (costs false) select * from s where (COALESCE(n, -1), nn, COALESCE(n, u)) not in (select COALESCE(n, -1), nn, COALESCE(n, u) from l);
+                                                       QUERY PLAN                                                        
+-------------------------------------------------------------------------------------------------------------------------
+ Nested Loop Anti Join
+   ->  Seq Scan on s
+   ->  Index Scan using l_nn on l
+         Index Cond: (nn = s.nn)
+         Filter: ((COALESCE(s.n, '-1'::integer) = COALESCE(n, '-1'::integer)) AND (COALESCE(s.n, s.u) = COALESCE(n, u)))
+(5 rows)
+
+select * from s where (COALESCE(n, -1), nn, COALESCE(n, u)) not in (select COALESCE(n, -1), nn, COALESCE(n, u) from l);
+    u    |    n    |   nn    |   p    
+---------+---------+---------+--------
+ 1000002 | 1000002 | 1000002 | foofoo
+       3 |         |       3 | foo
+(2 rows)
+
+-- test miscellaneous outer nullable cases
+explain (costs false) select * from s where (n,n) not in (select n,n from l);
+                         QUERY PLAN                          
+-------------------------------------------------------------
+ Nested Loop Anti Join
+   Join Filter: (((s.n = l.n) AND (s.n = l.n)) IS NOT FALSE)
+   ->  Seq Scan on s
+   ->  Materialize
+         ->  Seq Scan on l
+(5 rows)
+
+select * from s where (n,n) not in (select n,n from l);
+ u | n | nn | p 
+---+---+----+---
+(0 rows)
+
+explain (costs false) select * from s right join l on s.nn = l.nn where (s.n,s.u,s.nn) not in (select n,u,nn from l);
+                                     QUERY PLAN                                      
+-------------------------------------------------------------------------------------
+ Nested Loop Anti Join
+   Join Filter: (((s.n = l_1.n) AND (s.u = l_1.u) AND (s.nn = l_1.nn)) IS NOT FALSE)
+   ->  Hash Left Join
+         Hash Cond: (l.nn = s.nn)
+         ->  Seq Scan on l
+         ->  Hash
+               ->  Seq Scan on s
+   ->  Materialize
+         ->  Seq Scan on l l_1
+(9 rows)
+
+select * from s right join l on s.nn = l.nn where (s.n,s.u,s.nn) not in (select n,u,nn from l);
+ u | n | nn | p | u | n | nn | p 
+---+---+----+---+---+---+----+---
+(0 rows)
+
+explain (costs false) select count(*) from s right join l on s.nn = l.nn where (s.n,s.u,s.nn) not in (select n,u,nn from l where u < 0);
+                 QUERY PLAN                  
+---------------------------------------------
+ Aggregate
+   ->  Hash Left Join
+         Hash Cond: (l.nn = s.nn)
+         Filter: (NOT (hashed SubPlan 1))
+         ->  Seq Scan on l
+         ->  Hash
+               ->  Seq Scan on s
+         SubPlan 1
+           ->  Index Scan using l_u on l l_1
+                 Index Cond: (u < 0)
+(10 rows)
+
+select count(*) from s right join l on s.nn = l.nn where (s.n,s.u,s.nn) not in (select n,u,nn from l where u < 0);
+ count 
+-------
+ 10000
+(1 row)
+
+explain (costs false) select * from s where (n,n,n) not in (select distinct n,n,n from l where u > 0 limit 3) order by n;
+                     QUERY PLAN                      
+-----------------------------------------------------
+ Sort
+   Sort Key: s.n
+   ->  Seq Scan on s
+         Filter: (NOT (hashed SubPlan 1))
+         SubPlan 1
+           ->  Limit
+                 ->  Unique
+                       ->  Index Scan using l_n on l
+                             Filter: (u > 0)
+(9 rows)
+
+select * from s where (n,n,n) not in (select distinct n,n,n from l where u > 0 limit 3) order by n;
+    u    |    n    |   nn    |   p    
+---------+---------+---------+--------
+ 1000002 | 1000002 | 1000002 | foofoo
+(1 row)
+
+--test outer has strict predicate or inner join
+explain (costs false) select * from s where n not in (select n from l) and n > 0;
+                    QUERY PLAN                    
+--------------------------------------------------
+ Nested Loop Anti Join
+   ->  Seq Scan on s
+         Filter: (n > 0)
+   ->  Bitmap Heap Scan on l
+         Recheck Cond: ((s.n = n) OR (n IS NULL))
+         ->  BitmapOr
+               ->  Bitmap Index Scan on l_n
+                     Index Cond: (n = s.n)
+               ->  Bitmap Index Scan on l_n
+                     Index Cond: (n IS NULL)
+(10 rows)
+
+select * from s where n not in (select n from l) and n > 0;
+ u | n | nn | p 
+---+---+----+---
+(0 rows)
+
+explain (costs false) select * from s where n not in (select n from l) and u > 0;
+                         QUERY PLAN                          
+-------------------------------------------------------------
+ Nested Loop Anti Join
+   InitPlan 1 (returns $0)
+     ->  Seq Scan on l l_1
+   ->  Seq Scan on s
+         Filter: (((n IS NOT NULL) OR (NOT $0)) AND (u > 0))
+   ->  Bitmap Heap Scan on l
+         Recheck Cond: ((s.n = n) OR (n IS NULL))
+         ->  BitmapOr
+               ->  Bitmap Index Scan on l_n
+                     Index Cond: (n = s.n)
+               ->  Bitmap Index Scan on l_n
+                     Index Cond: (n IS NULL)
+(12 rows)
+
+select * from s where n not in (select n from l) and u > 0;
+ u | n | nn | p 
+---+---+----+---
+(0 rows)
+
+explain (costs false) select * from s where n not in (select n from l) and n is not null;
+                    QUERY PLAN                    
+--------------------------------------------------
+ Nested Loop Anti Join
+   ->  Seq Scan on s
+         Filter: (n IS NOT NULL)
+   ->  Bitmap Heap Scan on l
+         Recheck Cond: ((s.n = n) OR (n IS NULL))
+         ->  BitmapOr
+               ->  Bitmap Index Scan on l_n
+                     Index Cond: (n = s.n)
+               ->  Bitmap Index Scan on l_n
+                     Index Cond: (n IS NULL)
+(10 rows)
+
+select * from s where n not in (select n from l) and n is not null;
+ u | n | nn | p 
+---+---+----+---
+(0 rows)
+
+explain (costs false) select * from s join l on s.n = l.n where s.n not in (select n from l);
+                       QUERY PLAN                       
+--------------------------------------------------------
+ Nested Loop
+   ->  Nested Loop Anti Join
+         ->  Seq Scan on s
+         ->  Bitmap Heap Scan on l l_1
+               Recheck Cond: ((s.n = n) OR (n IS NULL))
+               ->  BitmapOr
+                     ->  Bitmap Index Scan on l_n
+                           Index Cond: (n = s.n)
+                     ->  Bitmap Index Scan on l_n
+                           Index Cond: (n IS NULL)
+   ->  Index Scan using l_n on l
+         Index Cond: (n = s.n)
+(12 rows)
+
+select * from s join l on s.n = l.n where s.n not in (select n from l);
+ u | n | nn | p | u | n | nn | p 
+---+---+----+---+---+---+----+---
+(0 rows)
+
+explain (costs false) select count(*) from s right join l on s.n = l.n where s.n not in (select n from l);
+                       QUERY PLAN                       
+--------------------------------------------------------
+ Aggregate
+   InitPlan 1 (returns $0)
+     ->  Seq Scan on l l_2
+   ->  Nested Loop Anti Join
+         ->  Hash Left Join
+               Hash Cond: (l.n = s.n)
+               Filter: ((s.n IS NOT NULL) OR (NOT $0))
+               ->  Seq Scan on l
+               ->  Hash
+                     ->  Seq Scan on s
+         ->  Bitmap Heap Scan on l l_1
+               Recheck Cond: ((s.n = n) OR (n IS NULL))
+               ->  BitmapOr
+                     ->  Bitmap Index Scan on l_n
+                           Index Cond: (n = s.n)
+                     ->  Bitmap Index Scan on l_n
+                           Index Cond: (n IS NULL)
+(17 rows)
+
+select count(*) from s right join l on s.n = l.n where s.n not in (select n from l);
+ count 
+-------
+     0
+(1 row)
+
+explain (costs false) select count(*) from s right join l on s.n = l.n join s1 on s.u = s1.u where s.n not in (select n from l);
+                             QUERY PLAN                             
+--------------------------------------------------------------------
+ Aggregate
+   ->  Nested Loop
+         Join Filter: (s.u = s1.u)
+         ->  Nested Loop
+               ->  Nested Loop Anti Join
+                     ->  Seq Scan on s
+                     ->  Bitmap Heap Scan on l l_1
+                           Recheck Cond: ((s.n = n) OR (n IS NULL))
+                           ->  BitmapOr
+                                 ->  Bitmap Index Scan on l_n
+                                       Index Cond: (n = s.n)
+                                 ->  Bitmap Index Scan on l_n
+                                       Index Cond: (n IS NULL)
+               ->  Index Only Scan using l_n on l
+                     Index Cond: (n = s.n)
+         ->  Seq Scan on s1
+(16 rows)
+
+select count(*) from s right join l on s.n = l.n join s1 on s.u = s1.u where s.n not in (select n from l);
+ count 
+-------
+     0
+(1 row)
+
+explain (costs false) select count(*) from s join l on s.n = l.n right join s1 on s.u = s1.u where s.n not in (select n from l);
+                          QUERY PLAN                          
+--------------------------------------------------------------
+ Aggregate
+   InitPlan 1 (returns $0)
+     ->  Seq Scan on l l_2
+   ->  Nested Loop Anti Join
+         ->  Nested Loop Left Join
+               Join Filter: (s.u = s1.u)
+               Filter: ((s.n IS NOT NULL) OR (NOT $0))
+               ->  Seq Scan on s1
+               ->  Materialize
+                     ->  Nested Loop
+                           ->  Seq Scan on s
+                           ->  Index Only Scan using l_n on l
+                                 Index Cond: (n = s.n)
+         ->  Bitmap Heap Scan on l l_1
+               Recheck Cond: ((s.n = n) OR (n IS NULL))
+               ->  BitmapOr
+                     ->  Bitmap Index Scan on l_n
+                           Index Cond: (n = s.n)
+                     ->  Bitmap Index Scan on l_n
+                           Index Cond: (n IS NULL)
+(20 rows)
+
+select count(*) from s join l on s.n = l.n right join s1 on s.u = s1.u where s.n not in (select n from l);
+ count 
+-------
+     0
+(1 row)
+
+--test inner has strict predicate or inner join
+explain (costs false) select * from s where u not in (select n from l where n > 0);
+                 QUERY PLAN                  
+---------------------------------------------
+ Nested Loop Anti Join
+   ->  Seq Scan on s
+   ->  Index Only Scan using l_n on l
+         Index Cond: ((n = s.u) AND (n > 0))
+(4 rows)
+
+select * from s where u not in (select n from l where n > 0);
+    u    |    n    |   nn    |   p    
+---------+---------+---------+--------
+ 1000002 | 1000002 | 1000002 | foofoo
+(1 row)
+
+explain (costs false) select * from s where u not in (select n from l where u > 0);
+                    QUERY PLAN                    
+--------------------------------------------------
+ Nested Loop Anti Join
+   ->  Seq Scan on s
+   ->  Bitmap Heap Scan on l
+         Recheck Cond: ((s.u = n) OR (n IS NULL))
+         Filter: (u > 0)
+         ->  BitmapOr
+               ->  Bitmap Index Scan on l_n
+                     Index Cond: (n = s.u)
+               ->  Bitmap Index Scan on l_n
+                     Index Cond: (n IS NULL)
+(10 rows)
+
+select * from s where u not in (select n from l where u > 0);
+ u | n | nn | p 
+---+---+----+---
+(0 rows)
+
+explain (costs false) select * from s where u not in (select n from l where n is not null);
+                     QUERY PLAN                      
+-----------------------------------------------------
+ Nested Loop Anti Join
+   ->  Seq Scan on s
+   ->  Index Only Scan using l_n on l
+         Index Cond: ((n = s.u) AND (n IS NOT NULL))
+(4 rows)
+
+select * from s where u not in (select n from l where n is not null);
+    u    |    n    |   nn    |   p    
+---------+---------+---------+--------
+ 1000002 | 1000002 | 1000002 | foofoo
+(1 row)
+
+explain (costs false) select * from s where u not in (select l.n from l join s on l.n=s.n);
+                  QUERY PLAN                  
+----------------------------------------------
+ Seq Scan on s
+   Filter: (NOT (hashed SubPlan 1))
+   SubPlan 1
+     ->  Nested Loop
+           ->  Seq Scan on s s_1
+           ->  Index Only Scan using l_n on l
+                 Index Cond: (n = s_1.n)
+(7 rows)
+
+select * from s where u not in (select l.n from l join s on l.n=s.n);
+    u    |    n    |   nn    |   p    
+---------+---------+---------+--------
+ 1000002 | 1000002 | 1000002 | foofoo
+       3 |         |       3 | foo
+(2 rows)
+
+explain (costs false) select * from s where u not in (select l.n from l join s on l.u=s.u);
+               QUERY PLAN                
+-----------------------------------------
+ Seq Scan on s
+   Filter: (NOT (hashed SubPlan 1))
+   SubPlan 1
+     ->  Nested Loop
+           ->  Seq Scan on s s_1
+           ->  Index Scan using l_u on l
+                 Index Cond: (u = s_1.u)
+(7 rows)
+
+select * from s where u not in (select l.n from l join s on l.u=s.u);
+    u    |    n    |   nn    |   p    
+---------+---------+---------+--------
+ 1000002 | 1000002 | 1000002 | foofoo
+(1 row)
+
+explain (costs false) select * from s where u not in (select l.n from l join s on l.n = s.n);
+                  QUERY PLAN                  
+----------------------------------------------
+ Seq Scan on s
+   Filter: (NOT (hashed SubPlan 1))
+   SubPlan 1
+     ->  Nested Loop
+           ->  Seq Scan on s s_1
+           ->  Index Only Scan using l_n on l
+                 Index Cond: (n = s_1.n)
+(7 rows)
+
+select * from s where u not in (select l.n from l join s on l.n = s.n);
+    u    |    n    |   nn    |   p    
+---------+---------+---------+--------
+ 1000002 | 1000002 | 1000002 | foofoo
+       3 |         |       3 | foo
+(2 rows)
+
+explain (costs false) select * from s where u not in (select l.n from l right join s on l.n = s.n);
+                  QUERY PLAN                  
+----------------------------------------------
+ Seq Scan on s
+   Filter: (NOT (hashed SubPlan 1))
+   SubPlan 1
+     ->  Nested Loop Left Join
+           ->  Seq Scan on s s_1
+           ->  Index Only Scan using l_n on l
+                 Index Cond: (n = s_1.n)
+(7 rows)
+
+select * from s where u not in (select l.n from l right join s on l.n = s.n);
+ u | n | nn | p 
+---+---+----+---
+(0 rows)
+
+explain (costs false) select * from s where u not in (select l.n from l right join s on l.n=s.n join s1 on l.n=s1.n);
+                  QUERY PLAN                  
+----------------------------------------------
+ Seq Scan on s
+   Filter: (NOT (hashed SubPlan 1))
+   SubPlan 1
+     ->  Nested Loop
+           ->  Nested Loop
+                 Join Filter: (s_1.n = s1.n)
+                 ->  Seq Scan on s1
+                 ->  Materialize
+                       ->  Seq Scan on s s_1
+           ->  Index Only Scan using l_n on l
+                 Index Cond: (n = s_1.n)
+(11 rows)
+
+select * from s where u not in (select l.n from l right join s on l.n=s.n join s1 on l.n=s1.n);
+    u    |    n    |   nn    |   p    
+---------+---------+---------+--------
+ 1000002 | 1000002 | 1000002 | foofoo
+       3 |         |       3 | foo
+(2 rows)
+
+explain (costs false) select * from s where u not in (select l.n from l join s on l.n=s.n right join s1 on l.n=s1.n);
+                        QUERY PLAN                        
+----------------------------------------------------------
+ Seq Scan on s
+   Filter: (NOT (hashed SubPlan 1))
+   SubPlan 1
+     ->  Nested Loop Left Join
+           Join Filter: (l.n = s1.n)
+           ->  Seq Scan on s1
+           ->  Materialize
+                 ->  Nested Loop
+                       ->  Seq Scan on s s_1
+                       ->  Index Only Scan using l_n on l
+                             Index Cond: (n = s_1.n)
+(11 rows)
+
+select * from s where u not in (select l.n from l join s on l.n=s.n right join s1 on l.n=s1.n);
+ u | n | nn | p 
+---+---+----+---
+(0 rows)
+
+--test both sides have strict predicate or inner join
+explain (costs false) select * from s where n not in (select n from l where n > 0) and n > 0;
+                 QUERY PLAN                  
+---------------------------------------------
+ Nested Loop Anti Join
+   ->  Seq Scan on s
+         Filter: (n > 0)
+   ->  Index Only Scan using l_n on l
+         Index Cond: ((n = s.n) AND (n > 0))
+(5 rows)
+
+select * from s where n not in (select n from l where n > 0) and n > 0;
+    u    |    n    |   nn    |   p    
+---------+---------+---------+--------
+ 1000002 | 1000002 | 1000002 | foofoo
+(1 row)
+
+explain (costs false) select * from s where n not in (select n from l where u > 0) and n > 0;
+                    QUERY PLAN                    
+--------------------------------------------------
+ Nested Loop Anti Join
+   ->  Seq Scan on s
+         Filter: (n > 0)
+   ->  Bitmap Heap Scan on l
+         Recheck Cond: ((s.n = n) OR (n IS NULL))
+         Filter: (u > 0)
+         ->  BitmapOr
+               ->  Bitmap Index Scan on l_n
+                     Index Cond: (n = s.n)
+               ->  Bitmap Index Scan on l_n
+                     Index Cond: (n IS NULL)
+(11 rows)
+
+select * from s where n not in (select n from l where u > 0) and n > 0;
+ u | n | nn | p 
+---+---+----+---
+(0 rows)
+
+explain (costs false) select * from s where n not in (select n from l where n > 0) and u > 0;
+                         QUERY PLAN                          
+-------------------------------------------------------------
+ Nested Loop Anti Join
+   InitPlan 1 (returns $0)
+     ->  Seq Scan on l l_1
+           Filter: (n > 0)
+   ->  Seq Scan on s
+         Filter: (((n IS NOT NULL) OR (NOT $0)) AND (u > 0))
+   ->  Index Only Scan using l_n on l
+         Index Cond: ((n = s.n) AND (n > 0))
+(8 rows)
+
+select * from s where n not in (select n from l where n > 0) and u > 0;
+    u    |    n    |   nn    |   p    
+---------+---------+---------+--------
+ 1000002 | 1000002 | 1000002 | foofoo
+(1 row)
+
+explain (costs false) select * from s right join l on s.n = l.n join s1 on s.u = s1.u where s.n not in (select l.n from l right join s on l.n=s.n join s s1 on l.n=s1.n);
+                             QUERY PLAN                             
+--------------------------------------------------------------------
+ Nested Loop
+   Join Filter: (s.u = s1.u)
+   ->  Seq Scan on s1
+   ->  Materialize
+         ->  Nested Loop
+               ->  Seq Scan on s
+                     Filter: (NOT (hashed SubPlan 1))
+                     SubPlan 1
+                       ->  Nested Loop
+                             ->  Nested Loop
+                                   Join Filter: (s_1.n = s1_1.n)
+                                   ->  Seq Scan on s s_1
+                                   ->  Materialize
+                                         ->  Seq Scan on s s1_1
+                             ->  Index Only Scan using l_n on l l_1
+                                   Index Cond: (n = s_1.n)
+               ->  Index Scan using l_n on l
+                     Index Cond: (n = s.n)
+(18 rows)
+
+select * from s right join l on s.n = l.n join s1 on s.u = s1.u where s.n not in (select l.n from l right join s on l.n=s.n join s s1 on l.n=s1.n);
+ u | n | nn | p | u | n | nn | p | u | n | n1 | nn | p 
+---+---+----+---+---+---+----+---+---+---+----+----+---
+(0 rows)
+
+explain (costs false) select * from s right join l on s.n = l.n join s1 on s.u = s1.u where s.n not in (select l.n from l join s on l.n=s.n right join s s1 on l.n=s1.n);
+                                   QUERY PLAN                                   
+--------------------------------------------------------------------------------
+ Nested Loop
+   Join Filter: (s.u = s1.u)
+   ->  Seq Scan on s1
+   ->  Materialize
+         ->  Nested Loop
+               ->  Seq Scan on s
+                     Filter: (NOT (hashed SubPlan 1))
+                     SubPlan 1
+                       ->  Nested Loop Left Join
+                             Join Filter: (l_1.n = s1_1.n)
+                             ->  Seq Scan on s s1_1
+                             ->  Materialize
+                                   ->  Nested Loop
+                                         ->  Seq Scan on s s_1
+                                         ->  Index Only Scan using l_n on l l_1
+                                               Index Cond: (n = s_1.n)
+               ->  Index Scan using l_n on l
+                     Index Cond: (n = s.n)
+(18 rows)
+
+select * from s right join l on s.n = l.n join s1 on s.u = s1.u where s.n not in (select l.n from l join s on l.n=s.n right join s s1 on l.n=s1.n);
+ u | n | nn | p | u | n | nn | p | u | n | n1 | nn | p 
+---+---+----+---+---+---+----+---+---+---+----+----+---
+(0 rows)
+
+explain (costs false) select * from s join l on s.n = l.n right join s1 on s.u = s1.u where s.n not in (select l.n from l join s on l.n=s.n right join s s1 on l.n=s1.n);
+                          QUERY PLAN                          
+--------------------------------------------------------------
+ Nested Loop Left Join
+   Join Filter: (s.u = s1.u)
+   Filter: (NOT (hashed SubPlan 1))
+   ->  Seq Scan on s1
+   ->  Materialize
+         ->  Nested Loop
+               ->  Seq Scan on s
+               ->  Index Scan using l_n on l
+                     Index Cond: (n = s.n)
+   SubPlan 1
+     ->  Nested Loop Left Join
+           Join Filter: (l_1.n = s1_1.n)
+           ->  Seq Scan on s s1_1
+           ->  Materialize
+                 ->  Nested Loop
+                       ->  Seq Scan on s s_1
+                       ->  Index Only Scan using l_n on l l_1
+                             Index Cond: (n = s_1.n)
+(18 rows)
+
+select * from s join l on s.n = l.n right join s1 on s.u = s1.u where s.n not in (select l.n from l join s on l.n=s.n right join s s1 on l.n=s1.n);
+ u | n | nn | p | u | n | nn | p | u | n | n1 | nn | p 
+---+---+----+---+---+---+----+---+---+---+----+----+---
+(0 rows)
+
+-- clean up
+set work_mem = 4000;
+drop table s;
+drop table s1;
+drop table l;
+drop table empty;
diff --git a/src/test/regress/sql/subselect.sql b/src/test/regress/sql/subselect.sql
index 99ca69791e..f6ce9b5719 100644
--- a/src/test/regress/sql/subselect.sql
+++ b/src/test/regress/sql/subselect.sql
@@ -744,3 +744,507 @@ select * from (with x as (select 2 as y) select * from x) ss;
 explain (verbose, costs off)
 with x as (select * from subselect_tbl)
 select * from x for update;
+
+-- test NON IN to ANTI JOIN conversion
+CREATE TABLE s (u INTEGER NOT NULL, n INTEGER NULL, nn INTEGER NOT NULL, p VARCHAR(128) NULL);
+insert into s (u, n, nn, p)
+    select
+    generate_series(1,3) as u,
+	generate_series(1,3) as n,
+	generate_series(1,3) as nn,
+	'foo' as p;
+insert into s values(1000002, 1000002, 1000002, 'foofoo');
+UPDATE s set n = NULL WHERE n = 3;
+analyze s;
+
+CREATE TABLE l (u INTEGER NOT NULL, n INTEGER NULL, nn INTEGER NOT NULL, p VARCHAR(128) NULL);
+insert into l (u, n, nn, p)
+	select
+    generate_series(1,10000 ) as u,
+	generate_series(1,10000 ) as n,
+	generate_series(1,10000 ) as nn,
+	'bar' as p;
+UPDATE l set n = NULL WHERE n = 7;
+
+CREATE UNIQUE INDEX l_u ON l (u);
+CREATE INDEX l_n ON l (n);
+CREATE INDEX l_nn ON l (nn);
+analyze l;
+
+CREATE TABLE s1 (u INTEGER NOT NULL, n INTEGER NULL, n1 INTEGER NULL, nn INTEGER NOT NULL, p VARCHAR(128) NULL);
+insert into s1 (u, n, n1, nn, p)
+    select
+    generate_series(1,3) as u,
+	generate_series(1,3) as n,
+	generate_series(1,3) as n1,
+	generate_series(1,3) as nn,
+	'foo' as p;
+insert into s1 values(1000003, 1000003, 1000003, 1000003, 'foofoo');
+insert into s1 values(1003, 1003, 1003, 1003, 'foofoo');
+UPDATE s1 set n = NULL WHERE n = 3;
+UPDATE s1 set n1 = NULL WHERE n = 2;
+UPDATE s1 set n1 = NULL WHERE n1 = 3;
+analyze s1;
+
+CREATE TABLE empty (u INTEGER NOT NULL, n INTEGER NULL, nn INTEGER NOT NULL, p VARCHAR(128) NULL);
+analyze empty;
+
+-- set work_mem to 64KB so that NOT IN to ANTI JOIN optimization will kick in
+set work_mem = 64;
+
+-- correctness test 1: inner empty, return every thing from outer including NULL
+explain (costs false) select * from s where n not in (select n from empty);
+
+select * from s where n not in (select n from empty);
+
+-- correctness test 2: inner has NULL, return empty result
+explain (costs false) select * from s where n not in (select n from l);
+
+select * from s where n not in (select n from l);
+
+-- correctness test 3: inner non-null, result has no NULL
+explain (costs false) select * from s where n not in (select u from l);
+
+select * from s where n not in (select u from l);
+
+-- correctness test 4: inner has predicate
+explain (costs false) select * from s where n not in (select n from l where u > 7);
+
+select * from s where n not in (select n from l where u > 7);
+
+-- correctness test 5: multi-expression, (2, 2, null, 2, foo) should be in the result
+explain (costs false) select * from s1 where (n,n1) not in (select u,nn from l where u >= 3);
+
+select * from s1 where (n,n1) not in (select u,nn from l where u >= 3);
+
+-- correctness test 6: multi-expression, (3, null, null, 3, foo) should not be in the result
+explain (costs false) select * from s1 where (n,n1) not in (select u,nn from l where u > 0);
+
+select * from s1 where (n,n1) not in (select u,nn from l where u > 0);
+
+-- correctness test 6: multi-expression, (3, null, null, 3, foo) should be in the result
+explain (costs false) select * from s1 where (n,n1) not in (select u,nn from l where u < 0);
+
+select * from s1 where (n,n1) not in (select u,nn from l where u < 0);
+
+-- test using hashed subplan when inner fits in work_mem
+explain (costs false) select * from l where n not in (select n from s);
+
+select * from l where n not in (select n from s);
+
+-- test single expression
+explain (costs false) select * from s where n not in (select n from l);
+
+select * from s where n not in (select n from l);
+
+explain (costs false) select * from s where u not in (select u from l);
+
+select * from s where u not in (select u from l);
+
+explain (costs false) select * from s where 3*n not in (select n from l);
+
+select * from s where 3*n not in (select n from l);
+
+explain (costs false) select * from s where n not in (select 3*n from l);
+
+select * from s where n not in (select 3*n from l);
+
+-- test single expression with predicates
+explain (costs false) select * from s where n not in (select n from l where u > 0);
+
+select * from s where n not in (select n from l where u > 0);
+
+explain (costs false) select * from s where n not in (select n from l where u > 100);
+
+select * from s where n not in (select n from l where u > 100);
+
+-- test multi expression
+explain (costs false) select * from s where (n,u) not in (select n,u from l);
+
+select * from s where (n,u) not in (select n,u from l);
+
+explain (costs false) select * from s where (u, nn) not in (select u, nn from l);
+
+select * from s where (u, nn) not in (select u, nn from l);
+
+explain (costs false) select * from s where (n,u) not in (select u,n from l);
+
+select * from s where (n,u) not in (select u,n from l);
+
+explain (costs false) select * from s where (n,u,nn) not in (select u,n,nn from l);
+
+select * from s where (n,u,nn) not in (select u,n,nn from l);
+
+explain (costs false) select * from s where (n,u,nn) not in (select u,n,nn from l where u > 1000);
+
+select * from s where (n,u,nn) not in (select u,n,nn from l where u > 1000);
+
+explain (costs false) select * from s where (n,u,nn) not in (select u,n,nn from l where u > 0);
+
+select * from s where (n,u,nn) not in (select u,n,nn from l where u > 0);
+
+explain (costs false) select * from s where (n,u,nn) not in (select u,n,nn from l where u > 1);
+
+select * from s where (n,u,nn) not in (select u,n,nn from l where u > 1);
+
+-- test multi-table
+explain (costs false) select count(*) from s, l where s.n not in (select n from l);
+
+select count(*) from s, l where s.n not in (select n from l);
+
+explain (costs false) select count(*) from s, l where s.nn not in (select nn from l);
+
+select count(*) from s, l where s.nn not in (select nn from l);
+
+-- test null padded results from outer join
+explain (costs false) select * from s where n not in (select s.nn from l left join s on l.nn = s.nn);
+
+select * from s where n not in (select s.nn from l left join s on l.nn = s.nn);
+
+explain (costs false) select * from s where n not in (select s.nn from s right join l on s.nn = l.nn);
+
+select * from s where n not in (select s.nn from s right join l on s.nn = l.nn);
+
+explain (costs false) select count(*) from s right join l on s.nn = l.nn where l.nn not in (select nn from s);
+
+select count(*) from s right join l on s.nn = l.nn where l.nn not in (select nn from s);
+
+explain (costs false) select count(*) from s right join l on s.nn = l.nn where s.nn not in (select nn from s);
+
+select count(*) from s right join l on s.nn = l.nn where s.nn not in (select nn from s);
+
+explain (costs false) select count(*) from s right join l on s.nn=l.nn where l.nn not in (select l.nn from l left join s on l.nn = s.nn);
+
+select count(*) from s right join l on s.nn=l.nn where l.nn not in (select l.nn from l left join s on l.nn = s.nn);
+
+explain (costs false) select count(*) from s right join l on s.nn=l.nn where s.nn not in (select s.nn from l left join s on l.nn = s.nn);
+
+select count(*) from s right join l on s.nn=l.nn where s.nn not in (select s.nn from l left join s on l.nn = s.nn);
+
+explain (costs false) select count(*) from s left join s1 on s.u=s1.u join l on s.u=l.u where s.nn not in (select nn from l);
+
+select count(*) from s left join s1 on s.u=s1.u join l on s.u=l.u where s.nn not in (select nn from l);
+
+explain (costs false) select count(*) from s left join s1 on s.u=s1.u right join l on s.u=l.u where s.nn not in (select nn from l);
+
+select count(*) from s left join s1 on s.u=s1.u right join l on s.u=l.u where s.nn not in (select nn from l);
+
+explain (costs false) select count(*) from s left join s1 on s.u=s1.u left join l on s.u=l.u where s.nn not in (select nn from l);
+
+select count(*) from s left join s1 on s.u=s1.u left join l on s.u=l.u where s.nn not in (select nn from l);
+
+explain (costs false) select count(*) from s right join s1 on s.u=s1.u join l on s.u=l.u where s.nn not in (select nn from l);
+
+select count(*) from s right join s1 on s.u=s1.u join l on s.u=l.u where s.nn not in (select nn from l);
+
+explain (costs false) select count(*) from s join s1 on s.u=s1.u right join l on s.u=l.u where s.nn not in (select nn from l);
+
+select * from s join s1 on s.u=s1.u right join l on s.u=l.u where s.nn not in (select nn from l);
+
+explain (costs false) select count(*) from s full join s1 on s.u=s1.u join l on s.u=l.u where s.nn not in (select nn from l);
+
+select count(*) from s full join s1 on s.u=s1.u join l on s.u=l.u where s.nn not in (select nn from l);
+
+explain (costs false) select count(*) from s join s1 on s.u=s1.u full join l on s.u=l.u where s.nn not in (select nn from l);
+
+select count(*) from s join s1 on s.u=s1.u full join l on s.u=l.u where s.nn not in (select nn from l);
+
+explain (costs false) select * from s where s.nn not in (select l.nn from l left join s on l.nn=s.nn left join s1 on l.nn=s1.nn);
+
+select * from s where s.nn not in (select l.nn from l left join s on l.nn=s.nn left join s1 on l.nn=s1.nn);
+
+explain (costs false) select * from s where s.nn not in (select l.nn from l left join s on l.nn=s.nn right join s1 on l.nn=s1.nn);
+
+select * from s where s.nn not in (select l.nn from l left join s on l.nn=s.nn right join s1 on l.nn=s1.nn);
+
+explain (costs false) select * from s where (n,u,nn) not in (select l.n,l.u,l.nn from l left join s on l.nn = s.nn);
+
+select * from s where (n,u,nn) not in (select l.n,l.u,l.nn from l left join s on l.nn = s.nn);
+
+explain (costs false) select * from s where (n,u,nn) not in (select l.n,l.u,l.nn from l right join s on l.nn = s.nn);
+
+select * from s where (n,u,nn) not in (select l.n,l.u,l.nn from l left join s on l.nn = s.nn);
+
+--test reduce outer joins from outer query
+explain (costs false) select count(*) from s right join l on s.nn = l.nn where s.nn not in (select nn from l);
+
+select count(*) from s right join l on s.nn = l.nn where s.nn not in (select nn from l);
+
+explain (costs false) select count(*) from s right join l on s.nn = l.nn where s.nn not in (select nn from l) and s.u>0;
+
+select count(*) from s right join l on s.nn = l.nn where s.nn not in (select nn from l) and s.u>0;
+
+explain (costs false) select count(*) from s right join l on s.nn = l.nn join s1 on s.u = s1.u where s.nn not in (select nn from l);
+
+select count(*) from s right join l on s.nn = l.nn join s1 on s.u = s1.u where s.nn not in (select nn from l);
+
+explain (costs false) select count(*) from s right join l on s.nn = l.nn right join s1 on s.u = s1.u where s.nn not in (select nn from l);
+
+select count(*) from s right join l on s.nn = l.nn right join s1 on s.u = s1.u where s.nn not in (select nn from l);
+
+explain (costs false) select count(*) from s right join l on s.nn = l.nn left join s1 on s.u = s1.u where s.nn not in (select nn from l);
+
+select count(*) from s right join l on s.nn = l.nn left join s1 on s.u = s1.u where s.nn not in (select nn from l);
+
+--test reduce outer joins from subquery
+explain (costs false) select * from s where nn not in (select l.nn from l right join s on l.nn = s.nn);
+
+select * from s where nn not in (select l.nn from l right join s on l.nn = s.nn);
+
+explain (costs false) select * from s where nn not in (select l.nn from l right join s on l.nn = s.nn where l.u > 9);
+
+select * from s where nn not in (select l.nn from l right join s on l.nn = s.nn where l.u > 9);
+
+explain (costs false) select * from s where nn not in (select l.nn from l right join s on l.nn = s.nn where s.u > 9);
+
+select * from s where nn not in (select l.nn from l right join s on l.nn = s.nn where s.u > 9);
+
+explain (costs false) select * from s where nn not in (select l.nn from l right join s on l.nn = s.nn join s1 on l.n = s1.n);
+
+select * from s where nn not in (select l.nn from l right join s on l.nn = s.nn join s1 on l.n = s1.n);
+
+explain (costs false) select * from s where nn not in (select l.nn from l right join s on l.nn = s.nn right join s1 on l.n = s1.n);
+
+select * from s where nn not in (select l.nn from l right join s on l.nn = s.nn right join s1 on l.n = s1.n);
+
+explain (costs false) select * from s where nn not in (select l.nn from l right join s on l.nn = s.nn left join s1 on l.n = s1.n);
+
+select * from s where nn not in (select l.nn from l right join s on l.nn = s.nn left join s1 on l.n = s1.n);
+
+--test reduce outer join on outer and sub-query
+explain (costs false) select count(*) from s right join l on s.nn = l.nn join s1 on s.u = s1.u where s.nn not in (select l.nn from l right join s on l.nn = s.nn join s1 on l.n = s1.n);
+
+select count(*) from s right join l on s.nn = l.nn join s1 on s.u = s1.u where s.nn not in (select l.nn from l right join s on l.nn = s.nn join s1 on l.n = s1.n);
+
+explain (costs false) select count(*) from s right join l on s.nn = l.nn left join s1 on s.u = s1.u where s.nn not in (select l.nn from l right join s on l.nn = s.nn left join s1 on l.n = s1.n);
+
+select count(*) from s right join l on s.nn = l.nn left join s1 on s.u = s1.u where s.nn not in (select l.nn from l right join s on l.nn = s.nn left join s1 on l.n = s1.n);
+
+-- test union all
+explain (costs false) select * from s as t where not exists
+(select 1 from (select n as y from l union all
+				select u as y from s union all
+				select nn as y from s) as v where t.n=v.y or v.y is null) and n is not null;
+
+select * from s as t where not exists
+(select 1 from (select n as y from l union all
+				select u as y from s union all
+				select nn as y from s) as v where t.n=v.y or v.y is null) and n is not null;
+
+explain (costs false) select * from s where n not in
+(select n as y from l union all
+ select u as y from s union all
+ select nn as y from s);
+
+select * from s where n not in
+(select n as y from l union all
+ select u as y from s union all
+ select nn as y from s);
+
+explain (costs false) select count(*) from
+(select n as x from s union all select u as x from l) t where t.x not in
+(select nn from l);
+
+select count(*) from
+(select n as x from s union all select u as x from l) t where t.x not in
+(select nn from l);
+
+explain (costs false) select count(*) from
+(select n as x from s union all select n as x from l) t where t.x not in
+(select nn from empty);
+
+select count(*) from
+(select n as x from s union all select n as x from l) t where t.x not in
+(select nn from empty);
+
+explain (costs false) select count(*) from
+(select n as x from s union all select u as x from l) t where t.x not in
+(select n as y from l union all
+ select u as y from s union all
+ select nn as y from s);
+
+select count(*) from
+(select n as x from s union all select u as x from l) t where t.x not in
+(select n as y from l union all
+ select u as y from s union all
+ select nn as y from s);
+
+-- test multi-levels of NOT IN
+explain (costs false) select * from s where n not in (select n from s where n not in (select n from l));
+
+select * from s where n not in (select n from s where n not in (select n from l));
+
+explain (costs false) select * from s where n not in (select n from s where n not in (select u from l));
+
+select * from s where n not in (select n from s where n not in (select u from l));
+
+explain (costs false) select count(*) from s where u not in
+(select n from s1 where not exists
+ (select 1 from (select n from s1 where u not in (select n from l)) t where t.n = s.n));
+
+select count(*) from s where u not in
+(select n from s1 where not exists
+ (select 1 from (select n from s1 where u not in (select n from l)) t where t.n = s.n));
+
+explain (costs false) select * from s where n not in (select n from s1) and u not in (select u from s1) and nn not in (select nn from s1);
+
+select * from s where n not in (select n from s1) and u not in (select u from s1) and nn not in (select nn from s1);
+
+explain (costs false) select * from s where n not in (select n from s1) and u not in (select u from s1) and nn not in (select nn from l);
+
+select * from s where n not in (select n from s1) and u not in (select u from s1) and nn not in (select nn from l);
+
+explain (costs false) select count(*) from s where u not in
+(select n from s1 where not exists
+ (select 1 from (select n from s1 where u not in (select n from l)) t where t.n = s.n))
+and nn not in
+(select n from s1 where not exists
+ (select 1 from (select n from s1 where u not in (select n from l)) t where t.n = s.n));
+
+select count(*) from s where u not in
+(select n from s1 where not exists
+ (select 1 from (select n from s1 where u not in (select n from l)) t where t.n = s.n))
+and nn not in
+(select n from s1 where not exists
+ (select 1 from (select n from s1 where u not in (select n from l)) t where t.n = s.n));
+
+--test COALESCE
+explain (costs false) select * from s where COALESCE(n, -1) not in (select COALESCE(n, -1) from l);
+
+select * from s where COALESCE(n, -1) not in (select COALESCE(n, -1) from l);
+
+explain (costs false) select * from s where COALESCE(n, NULL, -1) not in (select COALESCE(n, NULL, -1) from l);
+
+select * from s where COALESCE(n, NULL, -1) not in (select COALESCE(n, NULL, -1) from l);
+
+explain (costs false) select * from s where COALESCE(n, NULL, NULL) not in (select COALESCE(n, NULL, NULL) from l);
+
+select * from s where COALESCE(n, NULL, NULL) not in (select COALESCE(n, NULL, NULL) from l);
+
+explain (costs false) select * from s where COALESCE(n, nn) not in (select COALESCE(n, nn) from l);
+
+select * from s where COALESCE(n, nn) not in (select COALESCE(n, nn) from l);
+
+explain (costs false) select * from s where COALESCE(nn, NULL) not in (select COALESCE(nn, NULL) from l);
+
+select * from s where COALESCE(nn, NULL) not in (select COALESCE(nn, NULL) from l);
+
+explain (costs false) select * from s where (COALESCE(n, -1), nn, COALESCE(n, u)) not in (select COALESCE(n, -1), nn, COALESCE(n, u) from l);
+
+select * from s where (COALESCE(n, -1), nn, COALESCE(n, u)) not in (select COALESCE(n, -1), nn, COALESCE(n, u) from l);
+
+-- test miscellaneous outer nullable cases
+
+explain (costs false) select * from s where (n,n) not in (select n,n from l);
+
+select * from s where (n,n) not in (select n,n from l);
+
+explain (costs false) select * from s right join l on s.nn = l.nn where (s.n,s.u,s.nn) not in (select n,u,nn from l);
+
+select * from s right join l on s.nn = l.nn where (s.n,s.u,s.nn) not in (select n,u,nn from l);
+
+explain (costs false) select count(*) from s right join l on s.nn = l.nn where (s.n,s.u,s.nn) not in (select n,u,nn from l where u < 0);
+
+select count(*) from s right join l on s.nn = l.nn where (s.n,s.u,s.nn) not in (select n,u,nn from l where u < 0);
+
+explain (costs false) select * from s where (n,n,n) not in (select distinct n,n,n from l where u > 0 limit 3) order by n;
+
+select * from s where (n,n,n) not in (select distinct n,n,n from l where u > 0 limit 3) order by n;
+
+--test outer has strict predicate or inner join
+explain (costs false) select * from s where n not in (select n from l) and n > 0;
+
+select * from s where n not in (select n from l) and n > 0;
+
+explain (costs false) select * from s where n not in (select n from l) and u > 0;
+
+select * from s where n not in (select n from l) and u > 0;
+
+explain (costs false) select * from s where n not in (select n from l) and n is not null;
+
+select * from s where n not in (select n from l) and n is not null;
+
+explain (costs false) select * from s join l on s.n = l.n where s.n not in (select n from l);
+
+select * from s join l on s.n = l.n where s.n not in (select n from l);
+
+explain (costs false) select count(*) from s right join l on s.n = l.n where s.n not in (select n from l);
+
+select count(*) from s right join l on s.n = l.n where s.n not in (select n from l);
+
+explain (costs false) select count(*) from s right join l on s.n = l.n join s1 on s.u = s1.u where s.n not in (select n from l);
+
+select count(*) from s right join l on s.n = l.n join s1 on s.u = s1.u where s.n not in (select n from l);
+
+explain (costs false) select count(*) from s join l on s.n = l.n right join s1 on s.u = s1.u where s.n not in (select n from l);
+
+select count(*) from s join l on s.n = l.n right join s1 on s.u = s1.u where s.n not in (select n from l);
+
+--test inner has strict predicate or inner join
+explain (costs false) select * from s where u not in (select n from l where n > 0);
+
+select * from s where u not in (select n from l where n > 0);
+
+explain (costs false) select * from s where u not in (select n from l where u > 0);
+
+select * from s where u not in (select n from l where u > 0);
+
+explain (costs false) select * from s where u not in (select n from l where n is not null);
+
+select * from s where u not in (select n from l where n is not null);
+
+explain (costs false) select * from s where u not in (select l.n from l join s on l.n=s.n);
+
+select * from s where u not in (select l.n from l join s on l.n=s.n);
+
+explain (costs false) select * from s where u not in (select l.n from l join s on l.u=s.u);
+
+select * from s where u not in (select l.n from l join s on l.u=s.u);
+
+explain (costs false) select * from s where u not in (select l.n from l join s on l.n = s.n);
+
+select * from s where u not in (select l.n from l join s on l.n = s.n);
+
+explain (costs false) select * from s where u not in (select l.n from l right join s on l.n = s.n);
+
+select * from s where u not in (select l.n from l right join s on l.n = s.n);
+
+explain (costs false) select * from s where u not in (select l.n from l right join s on l.n=s.n join s1 on l.n=s1.n);
+
+select * from s where u not in (select l.n from l right join s on l.n=s.n join s1 on l.n=s1.n);
+
+explain (costs false) select * from s where u not in (select l.n from l join s on l.n=s.n right join s1 on l.n=s1.n);
+
+select * from s where u not in (select l.n from l join s on l.n=s.n right join s1 on l.n=s1.n);
+
+--test both sides have strict predicate or inner join
+explain (costs false) select * from s where n not in (select n from l where n > 0) and n > 0;
+
+select * from s where n not in (select n from l where n > 0) and n > 0;
+
+explain (costs false) select * from s where n not in (select n from l where u > 0) and n > 0;
+
+select * from s where n not in (select n from l where u > 0) and n > 0;
+
+explain (costs false) select * from s where n not in (select n from l where n > 0) and u > 0;
+
+select * from s where n not in (select n from l where n > 0) and u > 0;
+
+explain (costs false) select * from s right join l on s.n = l.n join s1 on s.u = s1.u where s.n not in (select l.n from l right join s on l.n=s.n join s s1 on l.n=s1.n);
+
+select * from s right join l on s.n = l.n join s1 on s.u = s1.u where s.n not in (select l.n from l right join s on l.n=s.n join s s1 on l.n=s1.n);
+
+explain (costs false) select * from s right join l on s.n = l.n join s1 on s.u = s1.u where s.n not in (select l.n from l join s on l.n=s.n right join s s1 on l.n=s1.n);
+
+select * from s right join l on s.n = l.n join s1 on s.u = s1.u where s.n not in (select l.n from l join s on l.n=s.n right join s s1 on l.n=s1.n);
+
+explain (costs false) select * from s join l on s.n = l.n right join s1 on s.u = s1.u where s.n not in (select l.n from l join s on l.n=s.n right join s s1 on l.n=s1.n);
+
+select * from s join l on s.n = l.n right join s1 on s.u = s1.u where s.n not in (select l.n from l join s on l.n=s.n right join s s1 on l.n=s1.n);
+
+-- clean up
+set work_mem = 4000;
+drop table s;
+drop table s1;
+drop table l;
+drop table empty;
#54Li, Zheng
zhelli@amazon.com
In reply to: Li, Zheng (#53)
1 attachment(s)
Re: NOT IN subquery optimization

Hi,

I'm submitting patch v2.2.

This version fixed an issue that involves CTE. Because we call subquery_planner before deciding whether to proceed with the transformation, we need to setup access to upper level CTEs at this point if the subquery contains any CTE RangeTblEntry.

Also added more test cases of NOT IN accessing CTEs, including recursive CTE. It's nice that CTE can use index now!

Let me know if you have any comments.

Regards,
-----------
Zheng Li
AWS, Amazon Aurora PostgreSQL

Attachments:

not_in_to_anti_join_transformation_v2_2.patchapplication/octet-stream; name=not_in_to_anti_join_transformation_v2_2.patchDownload
commit 106418a7bd1902aa6c2ec400f7c6ca6589fcb95e
Author: Zheng Li <zhelli@amazon.com>
Date:   Fri Jun 14 17:38:52 2019 +0000

        NOT IN to ANTI JOIN transformation
    
        If the NOT IN subquery is not eligible for hashed subplan as decided by
        subplan_is_hashable(), do the following NOT IN to ANTI JOIN
        transformation:
    
        Single expression:
            When x is nullable:
            t1.x not in (t2.y where p) =>
            ANTI JOIN
            t1 (Filter: t1.x IS NOT NULL or NOT EXISTS (select 1 from t2 where p)),
            t2 on join condition (t1.x=t2.y or t2.y IS NULL)
            and p.
            The predicate "t2.y IS NULL" can be removed if y is non-nullable.
    
            When x is non-nullable:
            t1.x not in (t2.y where p) =>
            ANTI JOIN t1, t2 on join condition (t1.x=t2.y or t2.y IS NULL)
            and p.
            The predicate "t2.y IS NULL" can be removed if y is non-nullable.
    
        Multi expression:
            If all xi's are nullable:
            (x1, x2, ... xn) not in (y1, y2, ... yn ) =>
            ANTI JOIN t1, t2 on join condition:
            ((t1.x1 = t2.y1) and ... (t1.xi = t2.yi) ... and
             (t1.xn = t2.yn)) IS NOT FALSE.
    
            If at least one xi is non-nuallable:
            (x1, x2, ... xn) not in (y1, y2, ... yn ) =>
            ANTI JOIN t1, t2 on join condition:
            (t1.x1 = t2.y1 or t2.y1 IS NULL or t1.x1 IS NULL) and ...
            (t1.xi = t2.yi or t2.yi IS NULL) ... and
            (t1.xn = t2.yn or t2.yn IS NULL or t1.xn IS NULL).
    
        Add nullability testing routine is_node_nonnullable(), currently it
        handles Var, TargetEntry, CoalesceExpr and Const. It uses strict
        predicates, inner join conditions and NOT NULL constraint to check
        the nullability of a Var.
    
        Adjust and apply reduce_outer_joins() before the transformation so
        that the outer joins have an opportunity to be converted to inner joins
        prior to the transformation.
    
        We measured performance improvements of two to five orders of magnitude
        on most queries in a development environment. In our performance experiments,
        table s (small) has 11 rows, table l (large) has 1 million rows. s.n and l.n
        have NULL value. s.nn and l.nn are NOT NULL. Index is created on each column.
    
        Cases using hash anti join:
        l.nn not in (l.nn) 21900s -> 235ms
        l.nn not in (l.nn where u>0) 22000s -> 240ms
        l.n not in (l.nn) 21900s -> 238ms
        l.n not in (l.nn where u>0) 22000s -> 248ms
    
        Cases using index nested loop anti join
        s.n not in (l.nn) 360ms -> 0.5ms
        s.n not in (l.nn where u>0) 390ms -> 0.6ms
        s.nn not in (l.nn) 365ms -> 0.5ms
        s.nn not in (l.nn where u>0) 390ms -> 0.5ms
    
        Cases using bitmap heap scan on the inner and nested loop anti join:
        s.n not in (l.n) 360ms -> 0.7ms
        l.n not in (l.n) 21900s ->  1.6s
        l.n not in (l.n where u>0) 22000s -> 1680ms
        s.nn not in (l.n) 360ms -> 0.5ms
        l.nn not in (l.n) 21900s -> 1650ms
        l.nn not in (l.n where u>0) 22000s -> 1660ms
    
        Cases using the original hashed subplan:
        l.n not in (s.n) 63ms -> 63ms
        l.nn not in (s.nn) 63ms -> 63ms
        l.n not in (s.n where u>0) 63ms -> 63ms
    
        Authors: Zheng Li, Jim Finnerty
        Discussion:
            https://www.postgresql.org/message-id/flat/812F84BC-702F-4063-95CC-A7DBD7C9D66B%40amazon.com#bfbaff1ad4a366ed92d7c9657afdb533
    
    Make NOT IN to ANTI JOIN transformation work with CTE

diff --git a/src/backend/optimizer/plan/planner.c b/src/backend/optimizer/plan/planner.c
index cb897cc7f4..6edb5f65f5 100644
--- a/src/backend/optimizer/plan/planner.c
+++ b/src/backend/optimizer/plan/planner.c
@@ -991,7 +991,7 @@ subquery_planner(PlannerGlobal *glob, Query *parse,
 	 * preprocessing.
 	 */
 	if (hasOuterJoins)
-		reduce_outer_joins(root);
+		reduce_outer_joins(parse);
 
 	/*
 	 * If we have any RTE_RESULT relations, see if they can be deleted from
diff --git a/src/backend/optimizer/plan/subselect.c b/src/backend/optimizer/plan/subselect.c
index efd0fbc21c..25f9494611 100644
--- a/src/backend/optimizer/plan/subselect.c
+++ b/src/backend/optimizer/plan/subselect.c
@@ -158,8 +158,7 @@ get_first_col_type(Plan *plan, Oid *coltype, int32 *coltypmod,
  * subquery itself is in a resjunk tlist entry whose value is uninteresting).
  */
 static Node *
-make_subplan(PlannerInfo *root, Query *orig_subquery,
-			 SubLinkType subLinkType, int subLinkId,
+make_subplan(PlannerInfo *root, SubLink *sublink,
 			 Node *testexpr, bool isTopQual)
 {
 	Query	   *subquery;
@@ -171,6 +170,9 @@ make_subplan(PlannerInfo *root, Query *orig_subquery,
 	Plan	   *plan;
 	List	   *plan_params;
 	Node	   *result;
+	Query      *orig_subquery = (Query *) sublink->subselect;
+	SubLinkType subLinkType = sublink->subLinkType;
+	int         subLinkId = sublink->subLinkId;
 
 	/*
 	 * Copy the source Query node.  This is a quick and dirty kluge to resolve
@@ -216,24 +218,33 @@ make_subplan(PlannerInfo *root, Query *orig_subquery,
 	/* plan_params should not be in use in current query level */
 	Assert(root->plan_params == NIL);
 
-	/* Generate Paths for the subquery */
-	subroot = subquery_planner(root->glob, subquery,
-							   root,
-							   false, tuple_fraction);
+	if(sublink->subroot != NULL &&
+		sublink->subplan != NULL)
+	{
+		subroot = (PlannerInfo *) sublink->subroot;
+		plan = (Plan *) sublink->subplan;
+	}
+	else
+	{
+		/* Generate Paths for the subquery */
+		subroot = subquery_planner(root->glob, subquery,
+								   root,
+								   false, tuple_fraction);
+
+		/*
+		 * Select best Path and turn it into a Plan.  At least for now, there
+		 * seems no reason to postpone doing that.
+		 */
+		final_rel = fetch_upper_rel(subroot, UPPERREL_FINAL, NULL);
+		best_path = get_cheapest_fractional_path(final_rel, tuple_fraction);
+
+		plan = create_plan(subroot, best_path);
+	}
 
 	/* Isolate the params needed by this specific subplan */
 	plan_params = root->plan_params;
 	root->plan_params = NIL;
 
-	/*
-	 * Select best Path and turn it into a Plan.  At least for now, there
-	 * seems no reason to postpone doing that.
-	 */
-	final_rel = fetch_upper_rel(subroot, UPPERREL_FINAL, NULL);
-	best_path = get_cheapest_fractional_path(final_rel, tuple_fraction);
-
-	plan = create_plan(subroot, best_path);
-
 	/* And convert to SubPlan or InitPlan format. */
 	result = build_subplan(root, plan, subroot, plan_params,
 						   subLinkType, subLinkId,
@@ -1173,6 +1184,371 @@ inline_cte_walker(Node *node, inline_cte_walker_context *context)
 	return expression_tree_walker(node, inline_cte_walker, context);
 }
 
+/* Returns a List of Nodes from the testexpr of an Any SubLink */
+static List *
+getTestExpr(SubLink *sublink)
+{
+	Node * testexpr = sublink->testexpr;
+	Assert(testexpr);
+
+	/* single expression */
+	if(IsA(testexpr, OpExpr))
+	{
+		OpExpr	*opexpr = (OpExpr *) testexpr;
+		Node	*testnode = linitial(opexpr->args);
+
+		return list_make1(testnode);
+	}
+	/* multi-expression */
+	else if(IsA(testexpr, BoolExpr))
+	{
+		BoolExpr	*bexpr = (BoolExpr *) testexpr;
+		ListCell		*lc;
+		Node		*node;
+		List			*result = NULL;
+
+		foreach(lc, bexpr->args)
+		{
+			node = lfirst(lc);
+			if(IsA(node, OpExpr))
+			{
+				OpExpr *expr = (OpExpr *) node;
+				result = lappend(result, linitial(expr->args));
+			}
+			else
+			{
+				elog(ERROR, "unrecognized node type for testexpr: %d",
+						(int) nodeTag(node));
+			}
+		}
+		return result;
+	}
+	else
+	{
+		elog(ERROR, "unrecognized node type for testexpr: %d",
+				(int) nodeTag(testexpr));
+	}
+}
+
+/* Try to reduce outer joins if there is one in the Query */
+static void
+reduce_outer_joins_NOT_IN(Query *parse)
+{
+	ListCell				*lc;
+	RangeTblEntry	*rte;
+
+	foreach(lc, parse->rtable)
+	{
+		rte = (RangeTblEntry *) lfirst(lc);
+		/* try to reduce outer joins if there is one */
+		if (rte->rtekind == RTE_JOIN &&
+				IS_OUTER_JOIN(rte->jointype))
+		{
+			reduce_outer_joins(parse);
+			return;
+		}
+	}
+}
+
+/*
+ * Make clause NOT EXISTS
+ * (select 1 from t2 where p) for the NOT IN to ANTI JOIN
+ * transformation.
+ */
+static
+Node *
+makeExistsTest_NOT_IN(Query *subselect)
+{
+	BoolExpr *notExpr = makeNode(BoolExpr);
+	SubLink *exists = makeNode(SubLink);
+	Query *selectOne =  copyObject(subselect);
+	Const *oneconst;
+	TargetEntry *dummyte;
+
+	/* modify subselect target list to contain a dummy const 1 */
+	oneconst = makeConst(INT4OID,
+						 -1,
+						 InvalidOid,
+						 sizeof(int32),
+						 Int32GetDatum(1),
+						 false, /* isnull */
+						 true); /* pass by value */
+	dummyte = makeTargetEntry((Expr *) oneconst,
+							  1,
+							  "one",
+							  false /* resjunk */ );
+	selectOne->targetList = list_make1(dummyte);
+
+	/* make EXISTS(select 1 from t2 where p) */
+	exists->subLinkType = EXISTS_SUBLINK;
+	exists->subLinkId = 0;
+	exists->subselect = (Node *) selectOne;
+	exists->location = -1;
+	exists->subroot = NULL;
+	exists->subplan = NULL;
+
+	/* make NOT EXISTS(select 1 from t2 where p) */
+	notExpr->boolop = NOT_EXPR;
+	notExpr->args = list_make1(exists);
+	notExpr->location = -1;
+
+	return (Node *) notExpr;
+}
+
+/*
+ * If the NOT IN subquery is not eligible for hashed subplan as
+ * decided by subplan_is_hashable(),
+ * do the following NOT IN to ANTI JOIN conversions:
+ *
+ * When x is non-nullable:
+ * t1.x not in (t2.y where p) => ANTI JOIN
+ * t1, t2 on join condition (t1.x=t2.y or t2.y IS NULL) and p.
+ * the above predicate "t2.y IS NULL" can be removed if y
+ * is also non-nullable.
+ *
+ * When x is nullable:
+ * t1.x not in (t2.y where p) => ANTI JOIN
+ * t1 (Filter: t1.x is not null or not exists (select 1 from t2 where p)),
+ * t2 on join condition (t1.x=t2.y or t2.y is null) and p.
+ * the above predicate "t2.y IS NULL" can be removed if y
+ * is also non-nullable.
+ *
+ * The multi-expression case is just ANDs of the single-
+ * expression case.
+ */
+static bool
+convert_NOT_IN_to_join(PlannerInfo *root, Node **quals,
+									SubLink *sublink, List *subquery_vars,
+									Node **pullout)
+{
+	Query			*parse = root->parse;
+	Query			*subselect = (Query *) sublink->subselect;
+	List			*testnodes = getTestExpr(sublink);
+	bool			 outerNonNull;
+	bool			 innerNonNull;
+	PlannerInfo 	*subroot;
+	double			 tuple_fraction;
+	RelOptInfo		*final_rel;
+	Path			*best_path;
+	Plan			*plan;
+	NullTest 		*nt;
+
+	/*
+	 * Can't flatten if it contains WITH.  (We could arrange to pull up the
+	 * WITH into the parent query's cteList, but that risks changing the
+	 * semantics, since a WITH ought to be executed once per associated query
+	 * call.)  Note that convert_ANY_sublink_to_join doesn't have to reject
+	 * this case, since it just produces a subquery RTE that doesn't have to
+	 * get flattened into the parent query.
+	 */
+	if(subselect->cteList)
+		return false;
+
+	/* For ALL and ANY subplans, we will be
+	 * able to stop evaluating if the test condition fails or matches, so very
+	 * often not all the tuples will be retrieved; for lack of a better idea,
+	 * specify 50% retrieval.
+	 */
+	tuple_fraction = 0.5;
+	/*
+	 * Generate Paths for the subquery, use a copied version of the subquery
+	 * so that the existing one doesn't get modified.
+	 */
+	subroot = subquery_planner(root->glob, copyObject(subselect),
+													root, false, tuple_fraction);
+
+	/* Select best Path and turn it into a Plan. */
+	final_rel = fetch_upper_rel(subroot, UPPERREL_FINAL, NULL);
+	best_path = get_cheapest_fractional_path(final_rel, tuple_fraction);
+
+	plan = create_plan(subroot, best_path);
+
+	sublink->subroot = (Node *) subroot;
+	sublink->subplan = (Node *) plan;
+	/*
+	 * Punt if subplan is hashable since using hashed subplan is almost like
+	 * doing a Hash Anti Join, we probably can't do better than that.
+	 */
+	if(subplan_is_hashable(plan))
+	{
+		return false;
+	}
+
+	/*
+	 * Try reduce_outer_joins since outer join affects the nullability test that's coming up next.
+	 * We have to call reduce_outer_joins for outer and inner query separately because we
+	 * don't have a global range table yet.
+	 */
+	reduce_outer_joins_NOT_IN(parse);
+
+	reduce_outer_joins_NOT_IN(subselect);
+
+	Assert(testnodes);
+	outerNonNull =
+					list_hasnonnullable(testnodes, parse);
+	innerNonNull =
+					list_hasnonnullable(subselect->targetList, subselect);
+
+	/* Single-expression case, do the following:
+	 * When x is non-nullable:
+	 * t1.x not in (t2.y where p) => ANTI JOIN
+	 * t1, t2 on join condition (t1.x=t2.y or t2.y IS NULL) and p.
+	 * the above predicate "t2.y IS NULL" can be removed if y
+	 * is also non-nullable.
+	 *
+	 * When x is nullable:
+	 * t1.x not in (t2.y where p) => ANTI JOIN
+	 * t1 (Filter: t1.x is not null or not exists (select 1 from t2 where p)),
+	 * t2 on join condition (t1.x=t2.y or t2.y is null) and p.
+	 * the above predicate "t2.y IS NULL" can be removed if y
+	 * is also non-nullable.
+	 */
+	if(IsA(*quals, OpExpr))
+	{
+		/*  Add "OR y IS NULL" if y is nullable */
+		if(!innerNonNull)
+		{
+			/* make expr y IS NULL */
+			nt = makeNode(NullTest);
+			nt->arg = (Expr *)linitial(subquery_vars);
+			nt->nulltesttype = IS_NULL;
+			nt->argisrow = false;
+			nt->location = -1;
+
+			/* make orclause (x = y OR y IS NULL) */
+			*quals = (Node *)make_orclause(list_make2(*quals,
+								(Node *)nt));
+		}
+
+		/*
+		 * if x is nullable, make the following filter for t1 :
+		 * x IS NOT NULL or NOT EXISTS (select 1 from t2 where p)
+		 */
+		if(!outerNonNull)
+		{
+			Node * existsTest = NULL;
+
+			/* make expr x IS NOT NULL */
+			nt = makeNode(NullTest);
+			nt->arg = (Expr *) linitial(testnodes);
+			nt->nulltesttype = IS_NOT_NULL;
+			nt->argisrow = false;
+			nt->location = -1;
+
+			existsTest = makeExistsTest_NOT_IN(subselect);
+			/* make x IS NOT NULL OR NOT EXISTS (select 1 from t2 where p) */
+			*pullout = (Node *)make_orclause(list_make2(
+									(Node *) nt, existsTest));
+		}
+	}
+	/*
+	 * Multi-expression case:
+	 * If all xi's are nullable:
+	 * (x1, x2, ... xn) not in (y1, y2, ... yn ) =>
+	 * ANTI JOIN t1,
+	 * t2 on join condition:
+	 * ((t1.x1 = t2.y1) and ... (t1.xi = t2.yi) ... and
+	 * (t1.xn = t2.yn)) is NOT FALSE.
+	 *
+	 * If at least one xi is non-nuallable:
+	 * (x1, x2, ... xn) not in (y1, y2, ... yn ) =>
+	 * ANTI JOIN t1,
+	 * t2 on join condition:
+	 * (t1.x1 = t2.y1 or t2.y1 is NULL) and ...
+	 * (t1.xi = t2.yi or t2.yi is NULL or t1.xi is NULL) ... and
+	 * (t1.xn = t2.yn or t2.yn is NULL).
+	 */
+	else if(IsA(*quals, BoolExpr))
+	{
+		/*
+		 * Add IS NOT FALSE on top of the join condition if ALL x_i's are nullable
+		 */
+		if(!outerNonNull)
+		{
+			BooleanTest *btest;
+
+			btest = makeNode(BooleanTest);
+			btest->arg = (Expr *) *quals;
+			btest->booltesttype = IS_NOT_FALSE;
+			*quals = (Node *) btest;
+		}
+		else
+		{
+			ListCell			*qualc;
+			TargetEntry	*te;
+			ListCell			*xc = list_head(testnodes);
+			ListCell			*yc = list_head(subquery_vars);
+			ListCell			*ytlc = list_head(subselect->targetList);
+			List				*quallist = ((BoolExpr *)*quals)->args;
+			Node				*joinCondition = NULL;
+			bool				 xnonNull;
+			bool				 ynonNull;
+
+			/* Reconstruct quals in the loop */
+			*quals = NULL;
+			foreach(qualc, quallist)
+			{
+				te = (TargetEntry *)lfirst(ytlc);
+				/* x_i = y_i */
+				joinCondition = lfirst(qualc);
+				ynonNull = is_node_nonnullable((Node*)te, subselect);
+
+				/* append y_i IS NULL to x_i = y_i if y_i is non-nullable */
+				if(!ynonNull)
+				{
+					/* make expr y_i IS NULL */
+					nt = makeNode(NullTest);
+					nt->arg = (Expr *)lfirst(yc);
+					nt->nulltesttype = IS_NULL;
+					nt->argisrow = false;
+					nt->location = -1;
+
+					/* make orclause (x_i = y_i OR y_i IS NULL) */
+					joinCondition = (Node *)make_orclause(list_make2(joinCondition,
+											(Node *)nt));
+				}
+
+				/*
+				 * Append "OR x_i is null" to the join condition if x_i is nullable.
+				 * Notice at least one x_i should be non-nullable because the all
+				 * x_i's nullable case is handled earlier by adding "IS NOT FALSE"
+				 * on top of the join condition.
+				 */
+				xnonNull = is_node_nonnullable(lfirst(xc), parse);
+				if(!xnonNull)
+				{
+					/* make expr x_i IS NULL */
+					nt = makeNode(NullTest);
+					nt->arg = (Expr *)lfirst(xc);
+					nt->nulltesttype = IS_NULL;
+					nt->argisrow = false;
+					nt->location = -1;
+
+					/* make orclause (x_i = y_i OR y_i IS NULL OR x_i IS NULL) */
+					joinCondition = (Node *)make_orclause(list_make2(joinCondition,
+											(Node *)nt));
+				}
+
+				/*
+				 * Now append joinCondition to quals as one andclause.
+				 * (x_i = y_i OR y_i IS NULL OR x_i IS NULL) AND
+				 * (x_j = y_j OR y_j IS NULL OR x_j IS NULL)...
+				 */
+				*quals = (Node *)make_andclause(list_make2(*quals, joinCondition));
+				xc = lnext(xc);
+				yc = lnext(yc);
+				ytlc = lnext(ytlc);
+			}
+		}
+	}
+	/* quals should be either OpExpr or BoolExpr, otherwise don't convert */
+	else
+	{
+		return false;
+	}
+
+	return true;
+}
 
 /*
  * convert_ANY_sublink_to_join: try to convert an ANY SubLink to a join
@@ -1210,7 +1586,8 @@ inline_cte_walker(Node *node, inline_cte_walker_context *context)
  */
 JoinExpr *
 convert_ANY_sublink_to_join(PlannerInfo *root, SubLink *sublink,
-							Relids available_rels)
+											bool under_not, Node **pullout,
+											Relids available_rels)
 {
 	JoinExpr   *result;
 	Query	   *parse = root->parse;
@@ -1290,11 +1667,22 @@ convert_ANY_sublink_to_join(PlannerInfo *root, SubLink *sublink,
 	 */
 	quals = convert_testexpr(root, sublink->testexpr, subquery_vars);
 
+	/*
+	 * Try converting x NOT IN (y) to ANTI JOIN.
+	 */
+	if(under_not &&
+			!convert_NOT_IN_to_join(root, &quals,
+								sublink, subquery_vars, pullout))
+	{
+		return NULL;
+	}
+
 	/*
 	 * And finally, build the JoinExpr node.
 	 */
 	result = makeNode(JoinExpr);
-	result->jointype = JOIN_SEMI;
+	/* NOT IN will be converted to ANTI JOIN */
+	result->jointype = under_not ? JOIN_ANTI : JOIN_SEMI;
 	result->isNatural = false;
 	result->larg = NULL;		/* caller must fill this in */
 	result->rarg = (Node *) rtr;
@@ -1883,9 +2271,7 @@ process_sublinks_mutator(Node *node, process_sublinks_context *context)
 		 * Now build the SubPlan node and make the expr to return.
 		 */
 		return make_subplan(context->root,
-							(Query *) sublink->subselect,
-							sublink->subLinkType,
-							sublink->subLinkId,
+							sublink,
 							testexpr,
 							context->isTopQual);
 	}
diff --git a/src/backend/optimizer/prep/prepjointree.c b/src/backend/optimizer/prep/prepjointree.c
index 67eeba938d..1469c7cb65 100644
--- a/src/backend/optimizer/prep/prepjointree.c
+++ b/src/backend/optimizer/prep/prepjointree.c
@@ -104,7 +104,7 @@ static Query *pullup_replace_vars_subquery(Query *query,
 static reduce_outer_joins_state *reduce_outer_joins_pass1(Node *jtnode);
 static void reduce_outer_joins_pass2(Node *jtnode,
 									 reduce_outer_joins_state *state,
-									 PlannerInfo *root,
+									 Query *parse,
 									 Relids nonnullable_rels,
 									 List *nonnullable_vars,
 									 List *forced_null_vars);
@@ -256,6 +256,13 @@ pull_up_sublinks_jointree_recurse(PlannerInfo *root, Node *jtnode,
 		}
 		/* Build the replacement FromExpr; no quals yet */
 		newf = makeFromExpr(newfromlist, NULL);
+		/*
+		 * Replace parse->jointree with newf now because we might modify join types
+		 * during reduce_outer_joins() in convert_NOT_IN_to_join() which is called
+		 * in pull_up_sublinks_qual_recurse() that's coming up next.
+		 */
+		newf->quals = f->quals;
+		root->parse->jointree = newf;
 		/* Set up a link representing the rebuilt jointree */
 		jtlink = (Node *) newf;
 		/* Now process qual --- all children are available for use */
@@ -388,7 +395,7 @@ pull_up_sublinks_qual_recurse(PlannerInfo *root, Node *node,
 		/* Is it a convertible ANY or EXISTS clause? */
 		if (sublink->subLinkType == ANY_SUBLINK)
 		{
-			if ((j = convert_ANY_sublink_to_join(root, sublink,
+			if ((j = convert_ANY_sublink_to_join(root, sublink, false, NULL,
 												 available_rels1)) != NULL)
 			{
 				/* Yes; insert the new join node into the join tree */
@@ -414,7 +421,7 @@ pull_up_sublinks_qual_recurse(PlannerInfo *root, Node *node,
 				return NULL;
 			}
 			if (available_rels2 != NULL &&
-				(j = convert_ANY_sublink_to_join(root, sublink,
+				(j = convert_ANY_sublink_to_join(root, sublink, false, NULL,
 												 available_rels2)) != NULL)
 			{
 				/* Yes; insert the new join node into the join tree */
@@ -560,6 +567,68 @@ pull_up_sublinks_qual_recurse(PlannerInfo *root, Node *node,
 					return NULL;
 				}
 			}
+			else if (sublink->subLinkType == ANY_SUBLINK)
+			{
+				Node *pullout = NULL;
+
+				if ((j = convert_ANY_sublink_to_join(root, sublink, true, &pullout,
+													 available_rels1)) != NULL)
+				{
+					/* Yes; insert the new join node into the join tree */
+					j->larg = *jtlink1;
+					*jtlink1 = (Node *) j;
+					/* Recursively process pulled-up jointree nodes */
+					j->rarg = pull_up_sublinks_jointree_recurse(root,
+																j->rarg,
+																&child_rels);
+
+					/*
+					 * Now recursively process the pulled-up quals.  Any inserted
+					 * joins can get stacked onto either j->larg or j->rarg,
+					 * depending on which rels they reference.
+					 */
+					j->quals = pull_up_sublinks_qual_recurse(root,
+															 j->quals,
+															 &j->larg,
+															 available_rels1,
+															 &j->rarg,
+															 child_rels);
+					/*
+					 * Return pullout predicate (x is NOT NULL) if it's not null,
+					 * otherwise return NULL representing constant TRUE.
+					 */
+					return pullout? pullout : NULL;
+				}
+				if (available_rels2 != NULL &&
+					(j = convert_ANY_sublink_to_join(root, sublink, true, &pullout,
+													 available_rels2)) != NULL)
+				{
+					/* Yes; insert the new join node into the join tree */
+					j->larg = *jtlink2;
+					*jtlink2 = (Node *) j;
+					/* Recursively process pulled-up jointree nodes */
+					j->rarg = pull_up_sublinks_jointree_recurse(root,
+																j->rarg,
+																&child_rels);
+
+					/*
+					 * Now recursively process the pulled-up quals.  Any inserted
+					 * joins can get stacked onto either j->larg or j->rarg,
+					 * depending on which rels they reference.
+					 */
+					j->quals = pull_up_sublinks_qual_recurse(root,
+															 j->quals,
+															 &j->larg,
+															 available_rels2,
+															 &j->rarg,
+															 child_rels);
+					/*
+					 * Return pullout predicate (x is NOT NULL) if it's not null,
+					 * otherwise return NULL representing constant TRUE.
+					 */
+					return pullout? pullout : NULL;
+				}
+			}
 		}
 		/* Else return it unmodified */
 		return node;
@@ -878,7 +947,13 @@ pull_up_simple_subquery(PlannerInfo *root, Node *jtnode, RangeTblEntry *rte,
 	subroot->parse = subquery;
 	subroot->glob = root->glob;
 	subroot->query_level = root->query_level;
-	subroot->parent_root = root->parent_root;
+	/*
+	 * Keep a path to the top level root so that we can recursively access top level
+	 * CTEs in root->parse->cteList, and CTE plans in root->init_plans. This hack
+	 * won't change the original PlannerInfo tree structure because subroot is just
+	 * a auxiliary PlannerInfo to help pulling up subquery.
+	 */
+	subroot->parent_root = root;
 	subroot->plan_params = NIL;
 	subroot->outer_params = NULL;
 	subroot->planner_cxt = CurrentMemoryContext;
@@ -2377,7 +2452,7 @@ flatten_simple_union_all(PlannerInfo *root)
  * alias-var expansion).
  */
 void
-reduce_outer_joins(PlannerInfo *root)
+reduce_outer_joins(Query *parse)
 {
 	reduce_outer_joins_state *state;
 
@@ -2390,14 +2465,14 @@ reduce_outer_joins(PlannerInfo *root)
 	 * join(s) below each side of each join clause. The second pass examines
 	 * qual clauses and changes join types as it descends the tree.
 	 */
-	state = reduce_outer_joins_pass1((Node *) root->parse->jointree);
+	state = reduce_outer_joins_pass1((Node *) parse->jointree);
 
 	/* planner.c shouldn't have called me if no outer joins */
 	if (state == NULL || !state->contains_outer)
 		elog(ERROR, "so where are the outer joins?");
 
-	reduce_outer_joins_pass2((Node *) root->parse->jointree,
-							 state, root, NULL, NIL, NIL);
+	reduce_outer_joins_pass2((Node *) parse->jointree,
+							 state, parse, NULL, NIL, NIL);
 }
 
 /*
@@ -2480,7 +2555,7 @@ reduce_outer_joins_pass1(Node *jtnode)
 static void
 reduce_outer_joins_pass2(Node *jtnode,
 						 reduce_outer_joins_state *state,
-						 PlannerInfo *root,
+						 Query *parse,
 						 Relids nonnullable_rels,
 						 List *nonnullable_vars,
 						 List *forced_null_vars)
@@ -2520,7 +2595,7 @@ reduce_outer_joins_pass2(Node *jtnode,
 			reduce_outer_joins_state *sub_state = lfirst(s);
 
 			if (sub_state->contains_outer)
-				reduce_outer_joins_pass2(lfirst(l), sub_state, root,
+				reduce_outer_joins_pass2(lfirst(l), sub_state, parse,
 										 pass_nonnullable_rels,
 										 pass_nonnullable_vars,
 										 pass_forced_null_vars);
@@ -2632,7 +2707,7 @@ reduce_outer_joins_pass2(Node *jtnode,
 		/* Apply the jointype change, if any, to both jointree node and RTE */
 		if (rtindex && jointype != j->jointype)
 		{
-			RangeTblEntry *rte = rt_fetch(rtindex, root->parse->rtable);
+			RangeTblEntry *rte = rt_fetch(rtindex, parse->rtable);
 
 			Assert(rte->rtekind == RTE_JOIN);
 			Assert(rte->jointype == j->jointype);
@@ -2717,7 +2792,7 @@ reduce_outer_joins_pass2(Node *jtnode,
 					pass_nonnullable_vars = NIL;
 					pass_forced_null_vars = NIL;
 				}
-				reduce_outer_joins_pass2(j->larg, left_state, root,
+				reduce_outer_joins_pass2(j->larg, left_state, parse,
 										 pass_nonnullable_rels,
 										 pass_nonnullable_vars,
 										 pass_forced_null_vars);
@@ -2739,7 +2814,7 @@ reduce_outer_joins_pass2(Node *jtnode,
 					pass_nonnullable_vars = NIL;
 					pass_forced_null_vars = NIL;
 				}
-				reduce_outer_joins_pass2(j->rarg, right_state, root,
+				reduce_outer_joins_pass2(j->rarg, right_state, parse,
 										 pass_nonnullable_rels,
 										 pass_nonnullable_vars,
 										 pass_forced_null_vars);
diff --git a/src/backend/optimizer/util/clauses.c b/src/backend/optimizer/util/clauses.c
index 2e84d6b3b4..7835ecf6ca 100644
--- a/src/backend/optimizer/util/clauses.c
+++ b/src/backend/optimizer/util/clauses.c
@@ -38,10 +38,12 @@
 #include "optimizer/optimizer.h"
 #include "optimizer/plancat.h"
 #include "optimizer/planmain.h"
+#include "optimizer/pathnode.h"
 #include "parser/analyze.h"
 #include "parser/parse_agg.h"
 #include "parser/parse_coerce.h"
 #include "parser/parse_func.h"
+#include "parser/parsetree.h"
 #include "rewrite/rewriteManip.h"
 #include "tcop/tcopprot.h"
 #include "utils/acl.h"
@@ -2011,6 +2013,271 @@ find_forced_null_var(Node *node)
 	return NULL;
 }
 
+/*
+ * find_innerjoined_rels
+ *		Traverse jointree to locate non-outerjoined-rels and quals above them
+ *
+ * We fill innerjoined_rels with the relids of all rels that are not below
+ * the nullable side of any outer join (which would cause their Vars to be
+ * possibly NULL regardless of what's in the catalogs).  In the same scan,
+ * we locate all WHERE and JOIN/ON quals that constrain these rels add them to
+ * the usable_quals list (forming a list with implicit-AND semantics).
+ *
+ * Top-level caller must initialize innerjoined_rels/usable_quals to NULL/NIL.
+ */
+static void
+find_innerjoined_rels(Node *jtnode,
+					  Relids *innerjoined_rels, List **usable_quals)
+{
+	if (jtnode == NULL)
+		return;
+	if (IsA(jtnode, RangeTblRef))
+	{
+		int			varno = ((RangeTblRef *) jtnode)->rtindex;
+
+		*innerjoined_rels = bms_add_member(*innerjoined_rels, varno);
+	}
+	else if (IsA(jtnode, FromExpr))
+	{
+		FromExpr   *f = (FromExpr *) jtnode;
+		ListCell   *lc;
+
+		/* All elements of the FROM list are allowable */
+		foreach(lc, f->fromlist)
+			find_innerjoined_rels((Node *) lfirst(lc),
+								  innerjoined_rels, usable_quals);
+		/* ... and its WHERE quals are too */
+		if (f->quals)
+			*usable_quals = lappend(*usable_quals, f->quals);
+	}
+	else if (IsA(jtnode, JoinExpr))
+	{
+		JoinExpr   *j = (JoinExpr *) jtnode;
+
+		switch (j->jointype)
+		{
+			case JOIN_INNER:
+				/* visit both children */
+				find_innerjoined_rels(j->larg,
+									  innerjoined_rels, usable_quals);
+				find_innerjoined_rels(j->rarg,
+									  innerjoined_rels, usable_quals);
+				/* and grab the ON quals too */
+				if (j->quals)
+					*usable_quals = lappend(*usable_quals, j->quals);
+				break;
+
+			case JOIN_LEFT:
+			case JOIN_SEMI:
+			case JOIN_ANTI:
+
+				/*
+				 * Only the left input is possibly non-nullable; furthermore,
+				 * the quals of this join don't constrain the left input.
+				 * Note: we probably can't see SEMI or ANTI joins at this
+				 * point, but if we do, we can treat them like LEFT joins.
+				 */
+				find_innerjoined_rels(j->larg,
+									  innerjoined_rels, usable_quals);
+				break;
+
+			case JOIN_RIGHT:
+				/* Reverse of the above case */
+				find_innerjoined_rels(j->rarg,
+									  innerjoined_rels, usable_quals);
+				break;
+
+			case JOIN_FULL:
+				/* Neither side is non-nullable, so stop descending */
+				break;
+
+			case JOIN_UNIQUE_OUTER:
+			case JOIN_UNIQUE_INNER:
+				/* Don't think we will see JOIN_UNIQUE_OUTER or
+				 * JOIN_UNIQUE_INNER since they are only used internally in
+				 * the planner in a much later phase (in standard_join_search).
+				*/
+				break;
+
+			default:
+				elog(ERROR, "unrecognized join type: %d",
+					 (int) j->jointype);
+		}
+	}
+	else
+		elog(ERROR, "unrecognized node type: %d",
+			 (int) nodeTag(jtnode));
+}
+
+/*
+ * Returns true if the Node passed in is nonnullable. Currently handles Var,
+ * TargetEntry, CoaleseExpr and Const.
+ * TODO: Add more supporting cases.
+ * A Var is nonnullable if:
+ *	It does not appear on the null-padded side of an outer join and it has NOT NULL constraint,
+ * 	Or if it's forced non-null by inner join or other strict predicates.
+ * A CoalesceExpr is nonnullable if it has a non-null argument.
+ */
+bool
+is_node_nonnullable(Node * node, Query *parse)
+{
+	AttrNumber	 attno = InvalidAttrNumber;
+	Oid				 reloid;
+	Var 				*var = NULL;
+
+	/*
+	 * If the query contains set operations, punt.  The set ops themselves
+	 * couldn't introduce nulls that weren't in their inputs, but the tlist
+	 * present in the top-level query is just dummy and won't give us useful
+	 * info.  We could get an answer by recursing to examine each leaf query,
+	 * but for the moment it doesn't seem worth the extra complication.
+	 *
+	 * Note that we needn't consider other top-level operators such as
+	 * DISTINCT, GROUP BY, etc, as those will not introduce nulls either.
+	 */
+	if (parse->setOperations)
+		return false;
+
+	switch (nodeTag(node))
+	{
+		case T_Var:
+		{
+			RangeTblEntry	*rte;
+
+			var = (Var *) node;
+			attno = var->varattno;
+			rte = rt_fetch(var->varno, parse->rtable);
+			reloid = rte->relid;
+			break;
+		}
+		case T_TargetEntry:
+		{
+			TargetEntry	*te = (TargetEntry *)node;
+			switch(nodeTag(te->expr))
+			{
+				case T_Var:
+				{
+					var = (Var *) te->expr;
+					attno = te->resorigcol;
+					reloid = te->resorigtbl;
+					break;
+				}
+				/* recurse into is_node_nonnullable for other types of Node */
+				default:
+					return is_node_nonnullable((Node *)te->expr, parse);
+			}
+			break;
+		}
+		case T_CoalesceExpr:
+		{
+			ListCell   			*arg;
+			CoalesceExpr	*cexpr = (CoalesceExpr *)node;
+
+			/* handle COALESCE Function by looking for non-null argument */
+			foreach(arg, cexpr->args)
+			{
+				Node	*e = lfirst(arg);
+
+				/* recurse into is_node_nonnullable */
+				if (is_node_nonnullable(e, parse))
+				{
+					return true;
+				}
+			}
+			break;
+		}
+		case T_Const:
+		{
+			if(!((Const *) node)->constisnull)
+			{
+				return true;
+			}
+			break;
+		}
+		/*
+		 * TODO: handle more cases to make the nullability test more accurate
+		 * Assume unhandled cases are nullable.
+		 */
+		default:
+			return false;
+	}
+
+	/*
+	 * If we have found a Var, it is non-null if:
+	 * not on NULL-padded side of an outer join and has NOT NULL constraint
+	 * or forced NOT NULL by inner join conditions or by other strict predicates.
+	 */
+	if(var &&
+			reloid != InvalidOid)
+	{
+		Relids		 innerjoined_rels = NULL;
+		List	   		*innerjoined_useful_quals = NIL;
+		List			*nonnullable_innerjoined_vars = NIL;
+
+		find_innerjoined_rels((Node *) parse->jointree,
+							  &innerjoined_rels,
+							  &innerjoined_useful_quals);
+
+		/*
+		 * Check if the Var is from an INNER JOINed rel, it's also guaranteed
+		 * to not be on the null-padded side of an outer join.
+		 */
+		if (bms_is_member(var->varno, innerjoined_rels))
+		{
+			/*
+			 * If Var is from a plain relation and its column is marked
+			 * NOT NULL according to the catalogs, it can't produce NULL.
+			 */
+			if (get_attnotnull(reloid, attno))
+			{
+				return true;
+			}
+
+			/*
+			 * Otherwise check for the existance of strict predicates which filter
+			 * out NULL values for this Var.
+			 */
+			nonnullable_innerjoined_vars =
+				find_nonnullable_vars((Node *) innerjoined_useful_quals);
+
+			if (list_member(nonnullable_innerjoined_vars, var))
+			{
+				return true;
+			}
+		}
+
+		/*
+		 * If we get here it means -
+		 * The var is on null-padded side of an outer-join, since we've already
+		 * tried reduce_outer_joins(), there isn't any strict predicates to turn this
+		 * outer join to inner join, then there will be no strict predicates to force
+		 * this var non-null as well.
+		 */
+	}
+
+	return false;
+}
+
+/*
+ * Returns true if at least one Node in the input list is non-nullable
+ */
+bool
+list_hasnonnullable(List * list, Query *parse)
+{
+	ListCell	*lc;
+	Node	*node;
+
+	foreach(lc, list)
+	{
+		node = lfirst(lc);
+		if(is_node_nonnullable(node, parse))
+		{
+			return true;
+		}
+	}
+	return false;
+}
+
 /*
  * Can we treat a ScalarArrayOpExpr as strict?
  *
diff --git a/src/backend/utils/cache/lsyscache.c b/src/backend/utils/cache/lsyscache.c
index c13c08a97b..59aa736f1e 100644
--- a/src/backend/utils/cache/lsyscache.c
+++ b/src/backend/utils/cache/lsyscache.c
@@ -908,6 +908,35 @@ get_atttypetypmodcoll(Oid relid, AttrNumber attnum,
 	ReleaseSysCache(tp);
 }
 
+/*
+ * get_attnotnull
+ *		Given the relation id and the attribute number,
+ *		return the "attnotnull" field from the attribute relation.
+ */
+bool
+get_attnotnull(Oid relid, AttrNumber attnum)
+{
+	HeapTuple	  tp;
+	Form_pg_attribute att_tup;
+
+	tp = SearchSysCache2(ATTNUM,
+			ObjectIdGetDatum(relid),
+			Int16GetDatum(attnum));
+
+	if (HeapTupleIsValid(tp))
+	{
+		bool result;
+
+		att_tup = (Form_pg_attribute) GETSTRUCT(tp);
+		result = att_tup->attnotnull;
+		ReleaseSysCache(tp);
+
+		return result;
+	}
+	/* Assume att is nullable if no valid heap tuple is found */
+	return false;
+}
+
 /*				---------- COLLATION CACHE ----------					 */
 
 /*
diff --git a/src/include/nodes/primnodes.h b/src/include/nodes/primnodes.h
index 7c278c0e56..b8cdd5ad1e 100644
--- a/src/include/nodes/primnodes.h
+++ b/src/include/nodes/primnodes.h
@@ -643,6 +643,8 @@ typedef struct SubLink
 	List	   *operName;		/* originally specified operator name */
 	Node	   *subselect;		/* subselect as Query* or raw parsetree */
 	int			location;		/* token location, or -1 if unknown */
+	Node	   *subroot;		/* PlannerInfo for the subquery */
+	Node	   *subplan;		/* best Plan for the subquery */
 } SubLink;
 
 /*
diff --git a/src/include/optimizer/clauses.h b/src/include/optimizer/clauses.h
index 2f9aeec4a7..b3229e96a7 100644
--- a/src/include/optimizer/clauses.h
+++ b/src/include/optimizer/clauses.h
@@ -45,6 +45,9 @@ extern List *find_nonnullable_vars(Node *clause);
 extern List *find_forced_null_vars(Node *clause);
 extern Var *find_forced_null_var(Node *clause);
 
+extern bool is_node_nonnullable(Node * node, Query *parse);
+extern bool list_hasnonnullable(List * list, Query *parse);
+
 extern bool is_pseudo_constant_clause(Node *clause);
 extern bool is_pseudo_constant_clause_relids(Node *clause, Relids relids);
 
diff --git a/src/include/optimizer/prep.h b/src/include/optimizer/prep.h
index a9b2c9026c..ec8af5abf4 100644
--- a/src/include/optimizer/prep.h
+++ b/src/include/optimizer/prep.h
@@ -26,7 +26,7 @@ extern void pull_up_sublinks(PlannerInfo *root);
 extern void inline_set_returning_functions(PlannerInfo *root);
 extern void pull_up_subqueries(PlannerInfo *root);
 extern void flatten_simple_union_all(PlannerInfo *root);
-extern void reduce_outer_joins(PlannerInfo *root);
+extern void reduce_outer_joins(Query *parse);
 extern void remove_useless_result_rtes(PlannerInfo *root);
 extern Relids get_relids_in_jointree(Node *jtnode, bool include_joins);
 extern Relids get_relids_for_join(Query *query, int joinrelid);
diff --git a/src/include/optimizer/subselect.h b/src/include/optimizer/subselect.h
index 71a22eeb64..77c3730e47 100644
--- a/src/include/optimizer/subselect.h
+++ b/src/include/optimizer/subselect.h
@@ -19,6 +19,8 @@
 extern void SS_process_ctes(PlannerInfo *root);
 extern JoinExpr *convert_ANY_sublink_to_join(PlannerInfo *root,
 											 SubLink *sublink,
+											 bool under_not,
+											 Node **pullout,
 											 Relids available_rels);
 extern JoinExpr *convert_EXISTS_sublink_to_join(PlannerInfo *root,
 												SubLink *sublink,
diff --git a/src/include/utils/lsyscache.h b/src/include/utils/lsyscache.h
index c8df5bff9f..d2bf81985d 100644
--- a/src/include/utils/lsyscache.h
+++ b/src/include/utils/lsyscache.h
@@ -90,6 +90,7 @@ extern char get_attgenerated(Oid relid, AttrNumber attnum);
 extern Oid	get_atttype(Oid relid, AttrNumber attnum);
 extern void get_atttypetypmodcoll(Oid relid, AttrNumber attnum,
 								  Oid *typid, int32 *typmod, Oid *collid);
+extern bool get_attnotnull(Oid relid, AttrNumber attnum);
 extern char *get_collation_name(Oid colloid);
 extern bool get_collation_isdeterministic(Oid colloid);
 extern char *get_constraint_name(Oid conoid);
diff --git a/src/test/regress/expected/subselect.out b/src/test/regress/expected/subselect.out
index 2d1963d12a..46fb9e8d9a 100644
--- a/src/test/regress/expected/subselect.out
+++ b/src/test/regress/expected/subselect.out
@@ -1442,3 +1442,2680 @@ select * from x for update;
    Output: subselect_tbl.f1, subselect_tbl.f2, subselect_tbl.f3
 (2 rows)
 
+-- test NON IN to ANTI JOIN conversion
+CREATE TABLE s (u INTEGER NOT NULL, n INTEGER NULL, nn INTEGER NOT NULL, p VARCHAR(128) NULL);
+insert into s (u, n, nn, p)
+    select
+    generate_series(1,3) as u,
+	generate_series(1,3) as n,
+	generate_series(1,3) as nn,
+	'foo' as p;
+insert into s values(1000002, 1000002, 1000002, 'foofoo');
+UPDATE s set n = NULL WHERE n = 3;
+analyze s;
+CREATE TABLE l (u INTEGER NOT NULL, n INTEGER NULL, nn INTEGER NOT NULL, p VARCHAR(128) NULL);
+insert into l (u, n, nn, p)
+	select
+    generate_series(1,10000 ) as u,
+	generate_series(1,10000 ) as n,
+	generate_series(1,10000 ) as nn,
+	'bar' as p;
+UPDATE l set n = NULL WHERE n = 7;
+CREATE UNIQUE INDEX l_u ON l (u);
+CREATE INDEX l_n ON l (n);
+CREATE INDEX l_nn ON l (nn);
+analyze l;
+CREATE TABLE s1 (u INTEGER NOT NULL, n INTEGER NULL, n1 INTEGER NULL, nn INTEGER NOT NULL, p VARCHAR(128) NULL);
+insert into s1 (u, n, n1, nn, p)
+    select
+    generate_series(1,3) as u,
+	generate_series(1,3) as n,
+	generate_series(1,3) as n1,
+	generate_series(1,3) as nn,
+	'foo' as p;
+insert into s1 values(1000003, 1000003, 1000003, 1000003, 'foofoo');
+insert into s1 values(1003, 1003, 1003, 1003, 'foofoo');
+UPDATE s1 set n = NULL WHERE n = 3;
+UPDATE s1 set n1 = NULL WHERE n = 2;
+UPDATE s1 set n1 = NULL WHERE n1 = 3;
+analyze s1;
+CREATE TABLE empty (u INTEGER NOT NULL, n INTEGER NULL, nn INTEGER NOT NULL, p VARCHAR(128) NULL);
+analyze empty;
+-- set work_mem to 64KB so that NOT IN to ANTI JOIN optimization will kick in
+set work_mem = 64;
+-- correctness test 1: inner empty, return every thing from outer including NULL
+explain (costs false) select * from s where n not in (select n from empty);
+             QUERY PLAN             
+------------------------------------
+ Seq Scan on s
+   Filter: (NOT (hashed SubPlan 1))
+   SubPlan 1
+     ->  Seq Scan on empty
+(4 rows)
+
+select * from s where n not in (select n from empty);
+    u    |    n    |   nn    |   p    
+---------+---------+---------+--------
+       1 |       1 |       1 | foo
+       2 |       2 |       2 | foo
+ 1000002 | 1000002 | 1000002 | foofoo
+       3 |         |       3 | foo
+(4 rows)
+
+-- correctness test 2: inner has NULL, return empty result
+explain (costs false) select * from s where n not in (select n from l);
+                    QUERY PLAN                    
+--------------------------------------------------
+ Nested Loop Anti Join
+   InitPlan 1 (returns $0)
+     ->  Seq Scan on l l_1
+   ->  Seq Scan on s
+         Filter: ((n IS NOT NULL) OR (NOT $0))
+   ->  Bitmap Heap Scan on l
+         Recheck Cond: ((s.n = n) OR (n IS NULL))
+         ->  BitmapOr
+               ->  Bitmap Index Scan on l_n
+                     Index Cond: (n = s.n)
+               ->  Bitmap Index Scan on l_n
+                     Index Cond: (n IS NULL)
+(12 rows)
+
+select * from s where n not in (select n from l);
+ u | n | nn | p 
+---+---+----+---
+(0 rows)
+
+-- correctness test 3: inner non-null, result has no NULL
+explain (costs false) select * from s where n not in (select u from l);
+                  QUERY PLAN                   
+-----------------------------------------------
+ Nested Loop Anti Join
+   InitPlan 1 (returns $0)
+     ->  Seq Scan on l l_1
+   ->  Seq Scan on s
+         Filter: ((n IS NOT NULL) OR (NOT $0))
+   ->  Index Only Scan using l_u on l
+         Index Cond: (u = s.n)
+(7 rows)
+
+select * from s where n not in (select u from l);
+    u    |    n    |   nn    |   p    
+---------+---------+---------+--------
+ 1000002 | 1000002 | 1000002 | foofoo
+(1 row)
+
+-- correctness test 4: inner has predicate
+explain (costs false) select * from s where n not in (select n from l where u > 7);
+                    QUERY PLAN                    
+--------------------------------------------------
+ Nested Loop Anti Join
+   InitPlan 1 (returns $0)
+     ->  Seq Scan on l l_1
+           Filter: (u > 7)
+   ->  Seq Scan on s
+         Filter: ((n IS NOT NULL) OR (NOT $0))
+   ->  Bitmap Heap Scan on l
+         Recheck Cond: ((s.n = n) OR (n IS NULL))
+         Filter: (u > 7)
+         ->  BitmapOr
+               ->  Bitmap Index Scan on l_n
+                     Index Cond: (n = s.n)
+               ->  Bitmap Index Scan on l_n
+                     Index Cond: (n IS NULL)
+(14 rows)
+
+select * from s where n not in (select n from l where u > 7);
+    u    |    n    |   nn    |   p    
+---------+---------+---------+--------
+       1 |       1 |       1 | foo
+       2 |       2 |       2 | foo
+ 1000002 | 1000002 | 1000002 | foofoo
+(3 rows)
+
+-- correctness test 5: multi-expression, (2, 2, null, 2, foo) should be in the result
+explain (costs false) select * from s1 where (n,n1) not in (select u,nn from l where u >= 3);
+                           QUERY PLAN                            
+-----------------------------------------------------------------
+ Nested Loop Anti Join
+   Join Filter: (((s1.n = l.u) AND (s1.n1 = l.nn)) IS NOT FALSE)
+   ->  Seq Scan on s1
+   ->  Materialize
+         ->  Seq Scan on l
+               Filter: (u >= 3)
+(6 rows)
+
+select * from s1 where (n,n1) not in (select u,nn from l where u >= 3);
+    u    |    n    |   n1    |   nn    |   p    
+---------+---------+---------+---------+--------
+       1 |       1 |       1 |       1 | foo
+ 1000003 | 1000003 | 1000003 | 1000003 | foofoo
+       2 |       2 |         |       2 | foo
+(3 rows)
+
+-- correctness test 6: multi-expression, (3, null, null, 3, foo) should not be in the result
+explain (costs false) select * from s1 where (n,n1) not in (select u,nn from l where u > 0);
+                           QUERY PLAN                            
+-----------------------------------------------------------------
+ Nested Loop Anti Join
+   Join Filter: (((s1.n = l.u) AND (s1.n1 = l.nn)) IS NOT FALSE)
+   ->  Seq Scan on s1
+   ->  Materialize
+         ->  Seq Scan on l
+               Filter: (u > 0)
+(6 rows)
+
+select * from s1 where (n,n1) not in (select u,nn from l where u > 0);
+    u    |    n    |   n1    |   nn    |   p    
+---------+---------+---------+---------+--------
+ 1000003 | 1000003 | 1000003 | 1000003 | foofoo
+(1 row)
+
+-- correctness test 6: multi-expression, (3, null, null, 3, foo) should be in the result
+explain (costs false) select * from s1 where (n,n1) not in (select u,nn from l where u < 0);
+             QUERY PLAN             
+------------------------------------
+ Seq Scan on s1
+   Filter: (NOT (hashed SubPlan 1))
+   SubPlan 1
+     ->  Index Scan using l_u on l
+           Index Cond: (u < 0)
+(5 rows)
+
+select * from s1 where (n,n1) not in (select u,nn from l where u < 0);
+    u    |    n    |   n1    |   nn    |   p    
+---------+---------+---------+---------+--------
+       1 |       1 |       1 |       1 | foo
+ 1000003 | 1000003 | 1000003 | 1000003 | foofoo
+    1003 |    1003 |    1003 |    1003 | foofoo
+       2 |       2 |         |       2 | foo
+       3 |         |         |       3 | foo
+(5 rows)
+
+-- test using hashed subplan when inner fits in work_mem
+explain (costs false) select * from l where n not in (select n from s);
+             QUERY PLAN             
+------------------------------------
+ Seq Scan on l
+   Filter: (NOT (hashed SubPlan 1))
+   SubPlan 1
+     ->  Seq Scan on s
+(4 rows)
+
+select * from l where n not in (select n from s);
+ u | n | nn | p 
+---+---+----+---
+(0 rows)
+
+-- test single expression
+explain (costs false) select * from s where n not in (select n from l);
+                    QUERY PLAN                    
+--------------------------------------------------
+ Nested Loop Anti Join
+   InitPlan 1 (returns $0)
+     ->  Seq Scan on l l_1
+   ->  Seq Scan on s
+         Filter: ((n IS NOT NULL) OR (NOT $0))
+   ->  Bitmap Heap Scan on l
+         Recheck Cond: ((s.n = n) OR (n IS NULL))
+         ->  BitmapOr
+               ->  Bitmap Index Scan on l_n
+                     Index Cond: (n = s.n)
+               ->  Bitmap Index Scan on l_n
+                     Index Cond: (n IS NULL)
+(12 rows)
+
+select * from s where n not in (select n from l);
+ u | n | nn | p 
+---+---+----+---
+(0 rows)
+
+explain (costs false) select * from s where u not in (select u from l);
+              QUERY PLAN              
+--------------------------------------
+ Nested Loop Anti Join
+   ->  Seq Scan on s
+   ->  Index Only Scan using l_u on l
+         Index Cond: (u = s.u)
+(4 rows)
+
+select * from s where u not in (select u from l);
+    u    |    n    |   nn    |   p    
+---------+---------+---------+--------
+ 1000002 | 1000002 | 1000002 | foofoo
+(1 row)
+
+explain (costs false) select * from s where 3*n not in (select n from l);
+                       QUERY PLAN                       
+--------------------------------------------------------
+ Nested Loop Anti Join
+   InitPlan 1 (returns $0)
+     ->  Seq Scan on l l_1
+   ->  Seq Scan on s
+         Filter: (((3 * n) IS NOT NULL) OR (NOT $0))
+   ->  Bitmap Heap Scan on l
+         Recheck Cond: (((3 * s.n) = n) OR (n IS NULL))
+         ->  BitmapOr
+               ->  Bitmap Index Scan on l_n
+                     Index Cond: (n = (3 * s.n))
+               ->  Bitmap Index Scan on l_n
+                     Index Cond: (n IS NULL)
+(12 rows)
+
+select * from s where 3*n not in (select n from l);
+ u | n | nn | p 
+---+---+----+---
+(0 rows)
+
+explain (costs false) select * from s where n not in (select 3*n from l);
+                        QUERY PLAN                         
+-----------------------------------------------------------
+ Nested Loop Anti Join
+   Join Filter: ((s.n = (3 * l.n)) OR ((3 * l.n) IS NULL))
+   InitPlan 1 (returns $0)
+     ->  Seq Scan on l l_1
+   ->  Seq Scan on s
+         Filter: ((n IS NOT NULL) OR (NOT $0))
+   ->  Materialize
+         ->  Seq Scan on l
+(8 rows)
+
+select * from s where n not in (select 3*n from l);
+ u | n | nn | p 
+---+---+----+---
+(0 rows)
+
+-- test single expression with predicates
+explain (costs false) select * from s where n not in (select n from l where u > 0);
+                    QUERY PLAN                    
+--------------------------------------------------
+ Nested Loop Anti Join
+   InitPlan 1 (returns $0)
+     ->  Seq Scan on l l_1
+           Filter: (u > 0)
+   ->  Seq Scan on s
+         Filter: ((n IS NOT NULL) OR (NOT $0))
+   ->  Bitmap Heap Scan on l
+         Recheck Cond: ((s.n = n) OR (n IS NULL))
+         Filter: (u > 0)
+         ->  BitmapOr
+               ->  Bitmap Index Scan on l_n
+                     Index Cond: (n = s.n)
+               ->  Bitmap Index Scan on l_n
+                     Index Cond: (n IS NULL)
+(14 rows)
+
+select * from s where n not in (select n from l where u > 0);
+ u | n | nn | p 
+---+---+----+---
+(0 rows)
+
+explain (costs false) select * from s where n not in (select n from l where u > 100);
+                    QUERY PLAN                    
+--------------------------------------------------
+ Nested Loop Anti Join
+   InitPlan 1 (returns $0)
+     ->  Seq Scan on l l_1
+           Filter: (u > 100)
+   ->  Seq Scan on s
+         Filter: ((n IS NOT NULL) OR (NOT $0))
+   ->  Bitmap Heap Scan on l
+         Recheck Cond: ((s.n = n) OR (n IS NULL))
+         Filter: (u > 100)
+         ->  BitmapOr
+               ->  Bitmap Index Scan on l_n
+                     Index Cond: (n = s.n)
+               ->  Bitmap Index Scan on l_n
+                     Index Cond: (n IS NULL)
+(14 rows)
+
+select * from s where n not in (select n from l where u > 100);
+    u    |    n    |   nn    |   p    
+---------+---------+---------+--------
+       1 |       1 |       1 | foo
+       2 |       2 |       2 | foo
+ 1000002 | 1000002 | 1000002 | foofoo
+(3 rows)
+
+-- test multi expression
+explain (costs false) select * from s where (n,u) not in (select n,u from l);
+                         QUERY PLAN                          
+-------------------------------------------------------------
+ Nested Loop Anti Join
+   ->  Seq Scan on s
+   ->  Index Scan using l_u on l
+         Index Cond: (u = s.u)
+         Filter: ((s.n = n) OR (n IS NULL) OR (s.n IS NULL))
+(5 rows)
+
+select * from s where (n,u) not in (select n,u from l);
+    u    |    n    |   nn    |   p    
+---------+---------+---------+--------
+ 1000002 | 1000002 | 1000002 | foofoo
+(1 row)
+
+explain (costs false) select * from s where (u, nn) not in (select u, nn from l);
+            QUERY PLAN            
+----------------------------------
+ Nested Loop Anti Join
+   ->  Seq Scan on s
+   ->  Index Scan using l_nn on l
+         Index Cond: (nn = s.nn)
+         Filter: (s.u = u)
+(5 rows)
+
+select * from s where (u, nn) not in (select u, nn from l);
+    u    |    n    |   nn    |   p    
+---------+---------+---------+--------
+ 1000002 | 1000002 | 1000002 | foofoo
+(1 row)
+
+explain (costs false) select * from s where (n,u) not in (select u,n from l);
+                    QUERY PLAN                    
+--------------------------------------------------
+ Nested Loop Anti Join
+   ->  Seq Scan on s
+   ->  Bitmap Heap Scan on l
+         Recheck Cond: ((s.u = n) OR (n IS NULL))
+         Filter: ((s.n = u) OR (s.n IS NULL))
+         ->  BitmapOr
+               ->  Bitmap Index Scan on l_n
+                     Index Cond: (n = s.u)
+               ->  Bitmap Index Scan on l_n
+                     Index Cond: (n IS NULL)
+(10 rows)
+
+select * from s where (n,u) not in (select u,n from l);
+    u    |    n    |   nn    |   p    
+---------+---------+---------+--------
+ 1000002 | 1000002 | 1000002 | foofoo
+(1 row)
+
+explain (costs false) select * from s where (n,u,nn) not in (select u,n,nn from l);
+                                  QUERY PLAN                                   
+-------------------------------------------------------------------------------
+ Nested Loop Anti Join
+   ->  Seq Scan on s
+   ->  Index Scan using l_nn on l
+         Index Cond: (nn = s.nn)
+         Filter: (((s.n = u) OR (s.n IS NULL)) AND ((s.u = n) OR (n IS NULL)))
+(5 rows)
+
+select * from s where (n,u,nn) not in (select u,n,nn from l);
+    u    |    n    |   nn    |   p    
+---------+---------+---------+--------
+ 1000002 | 1000002 | 1000002 | foofoo
+(1 row)
+
+explain (costs false) select * from s where (n,u,nn) not in (select u,n,nn from l where u > 1000);
+                                          QUERY PLAN                                          
+----------------------------------------------------------------------------------------------
+ Nested Loop Anti Join
+   ->  Seq Scan on s
+   ->  Index Scan using l_nn on l
+         Index Cond: (nn = s.nn)
+         Filter: ((u > 1000) AND ((s.n = u) OR (s.n IS NULL)) AND ((s.u = n) OR (n IS NULL)))
+(5 rows)
+
+select * from s where (n,u,nn) not in (select u,n,nn from l where u > 1000);
+    u    |    n    |   nn    |   p    
+---------+---------+---------+--------
+       1 |       1 |       1 | foo
+       2 |       2 |       2 | foo
+ 1000002 | 1000002 | 1000002 | foofoo
+       3 |         |       3 | foo
+(4 rows)
+
+explain (costs false) select * from s where (n,u,nn) not in (select u,n,nn from l where u > 0);
+                                        QUERY PLAN                                         
+-------------------------------------------------------------------------------------------
+ Nested Loop Anti Join
+   ->  Seq Scan on s
+   ->  Index Scan using l_nn on l
+         Index Cond: (nn = s.nn)
+         Filter: ((u > 0) AND ((s.n = u) OR (s.n IS NULL)) AND ((s.u = n) OR (n IS NULL)))
+(5 rows)
+
+select * from s where (n,u,nn) not in (select u,n,nn from l where u > 0);
+    u    |    n    |   nn    |   p    
+---------+---------+---------+--------
+ 1000002 | 1000002 | 1000002 | foofoo
+(1 row)
+
+explain (costs false) select * from s where (n,u,nn) not in (select u,n,nn from l where u > 1);
+                                        QUERY PLAN                                         
+-------------------------------------------------------------------------------------------
+ Nested Loop Anti Join
+   ->  Seq Scan on s
+   ->  Index Scan using l_nn on l
+         Index Cond: (nn = s.nn)
+         Filter: ((u > 1) AND ((s.n = u) OR (s.n IS NULL)) AND ((s.u = n) OR (n IS NULL)))
+(5 rows)
+
+select * from s where (n,u,nn) not in (select u,n,nn from l where u > 1);
+    u    |    n    |   nn    |   p    
+---------+---------+---------+--------
+       1 |       1 |       1 | foo
+ 1000002 | 1000002 | 1000002 | foofoo
+(2 rows)
+
+-- test multi-table
+explain (costs false) select count(*) from s, l where s.n not in (select n from l);
+                          QUERY PLAN                          
+--------------------------------------------------------------
+ Aggregate
+   InitPlan 1 (returns $0)
+     ->  Seq Scan on l l_2
+   ->  Nested Loop
+         ->  Nested Loop Anti Join
+               ->  Seq Scan on s
+                     Filter: ((n IS NOT NULL) OR (NOT $0))
+               ->  Bitmap Heap Scan on l l_1
+                     Recheck Cond: ((s.n = n) OR (n IS NULL))
+                     ->  BitmapOr
+                           ->  Bitmap Index Scan on l_n
+                                 Index Cond: (n = s.n)
+                           ->  Bitmap Index Scan on l_n
+                                 Index Cond: (n IS NULL)
+         ->  Seq Scan on l
+(15 rows)
+
+select count(*) from s, l where s.n not in (select n from l);
+ count 
+-------
+     0
+(1 row)
+
+explain (costs false) select count(*) from s, l where s.nn not in (select nn from l);
+                      QUERY PLAN                       
+-------------------------------------------------------
+ Aggregate
+   ->  Nested Loop
+         ->  Nested Loop Anti Join
+               ->  Seq Scan on s
+               ->  Index Only Scan using l_nn on l l_1
+                     Index Cond: (nn = s.nn)
+         ->  Seq Scan on l
+(7 rows)
+
+select count(*) from s, l where s.nn not in (select nn from l);
+ count 
+-------
+ 10000
+(1 row)
+
+-- test null padded results from outer join
+explain (costs false) select * from s where n not in (select s.nn from l left join s on l.nn = s.nn);
+                     QUERY PLAN                      
+-----------------------------------------------------
+ Nested Loop Anti Join
+   Join Filter: ((s.n = s_1.nn) OR (s_1.nn IS NULL))
+   InitPlan 1 (returns $0)
+     ->  Nested Loop Left Join
+           Join Filter: (l_1.nn = s_2.nn)
+           ->  Seq Scan on l l_1
+           ->  Materialize
+                 ->  Seq Scan on s s_2
+   ->  Seq Scan on s
+         Filter: ((n IS NOT NULL) OR (NOT $0))
+   ->  Hash Left Join
+         Hash Cond: (l.nn = s_1.nn)
+         ->  Seq Scan on l
+         ->  Hash
+               ->  Seq Scan on s s_1
+(15 rows)
+
+select * from s where n not in (select s.nn from l left join s on l.nn = s.nn);
+ u | n | nn | p 
+---+---+----+---
+(0 rows)
+
+explain (costs false) select * from s where n not in (select s.nn from s right join l on s.nn = l.nn);
+                     QUERY PLAN                      
+-----------------------------------------------------
+ Nested Loop Anti Join
+   Join Filter: ((s.n = s_1.nn) OR (s_1.nn IS NULL))
+   InitPlan 1 (returns $0)
+     ->  Nested Loop Left Join
+           Join Filter: (s_2.nn = l_1.nn)
+           ->  Seq Scan on l l_1
+           ->  Materialize
+                 ->  Seq Scan on s s_2
+   ->  Seq Scan on s
+         Filter: ((n IS NOT NULL) OR (NOT $0))
+   ->  Hash Left Join
+         Hash Cond: (l.nn = s_1.nn)
+         ->  Seq Scan on l
+         ->  Hash
+               ->  Seq Scan on s s_1
+(15 rows)
+
+select * from s where n not in (select s.nn from s right join l on s.nn = l.nn);
+ u | n | nn | p 
+---+---+----+---
+(0 rows)
+
+explain (costs false) select count(*) from s right join l on s.nn = l.nn where l.nn not in (select nn from s);
+                   QUERY PLAN                   
+------------------------------------------------
+ Aggregate
+   ->  Hash Left Join
+         Hash Cond: (l.nn = s.nn)
+         ->  Seq Scan on l
+               Filter: (NOT (hashed SubPlan 1))
+               SubPlan 1
+                 ->  Seq Scan on s s_1
+         ->  Hash
+               ->  Seq Scan on s
+(9 rows)
+
+select count(*) from s right join l on s.nn = l.nn where l.nn not in (select nn from s);
+ count 
+-------
+  9997
+(1 row)
+
+explain (costs false) select count(*) from s right join l on s.nn = l.nn where s.nn not in (select nn from s);
+                QUERY PLAN                
+------------------------------------------
+ Aggregate
+   ->  Hash Left Join
+         Hash Cond: (l.nn = s.nn)
+         Filter: (NOT (hashed SubPlan 1))
+         ->  Seq Scan on l
+         ->  Hash
+               ->  Seq Scan on s
+         SubPlan 1
+           ->  Seq Scan on s s_1
+(9 rows)
+
+select count(*) from s right join l on s.nn = l.nn where s.nn not in (select nn from s);
+ count 
+-------
+     0
+(1 row)
+
+explain (costs false) select count(*) from s right join l on s.nn=l.nn where l.nn not in (select l.nn from l left join s on l.nn = s.nn);
+                       QUERY PLAN                       
+--------------------------------------------------------
+ Aggregate
+   ->  Nested Loop Left Join
+         Join Filter: (s.nn = l.nn)
+         ->  Hash Anti Join
+               Hash Cond: (l.nn = l_1.nn)
+               ->  Seq Scan on l
+               ->  Hash
+                     ->  Hash Left Join
+                           Hash Cond: (l_1.nn = s_1.nn)
+                           ->  Seq Scan on l l_1
+                           ->  Hash
+                                 ->  Seq Scan on s s_1
+         ->  Seq Scan on s
+(13 rows)
+
+select count(*) from s right join l on s.nn=l.nn where l.nn not in (select l.nn from l left join s on l.nn = s.nn);
+ count 
+-------
+     0
+(1 row)
+
+explain (costs false) select count(*) from s right join l on s.nn=l.nn where s.nn not in (select s.nn from l left join s on l.nn = s.nn);
+                         QUERY PLAN                         
+------------------------------------------------------------
+ Aggregate
+   InitPlan 1 (returns $0)
+     ->  Nested Loop Left Join
+           Join Filter: (l_2.nn = s_2.nn)
+           ->  Seq Scan on l l_2
+           ->  Materialize
+                 ->  Seq Scan on s s_2
+   ->  Nested Loop Anti Join
+         Join Filter: ((s.nn = s_1.nn) OR (s_1.nn IS NULL))
+         ->  Hash Left Join
+               Hash Cond: (l.nn = s.nn)
+               Filter: ((s.nn IS NOT NULL) OR (NOT $0))
+               ->  Seq Scan on l
+               ->  Hash
+                     ->  Seq Scan on s
+         ->  Materialize
+               ->  Hash Left Join
+                     Hash Cond: (l_1.nn = s_1.nn)
+                     ->  Seq Scan on l l_1
+                     ->  Hash
+                           ->  Seq Scan on s s_1
+(21 rows)
+
+select count(*) from s right join l on s.nn=l.nn where s.nn not in (select s.nn from l left join s on l.nn = s.nn);
+ count 
+-------
+     0
+(1 row)
+
+explain (costs false) select count(*) from s left join s1 on s.u=s1.u join l on s.u=l.u where s.nn not in (select nn from l);
+                         QUERY PLAN                          
+-------------------------------------------------------------
+ Aggregate
+   ->  Nested Loop
+         ->  Nested Loop Left Join
+               Join Filter: (s.u = s1.u)
+               ->  Nested Loop Anti Join
+                     ->  Seq Scan on s
+                     ->  Index Only Scan using l_nn on l l_1
+                           Index Cond: (nn = s.nn)
+               ->  Seq Scan on s1
+         ->  Index Only Scan using l_u on l
+               Index Cond: (u = s.u)
+(11 rows)
+
+select count(*) from s left join s1 on s.u=s1.u join l on s.u=l.u where s.nn not in (select nn from l);
+ count 
+-------
+     0
+(1 row)
+
+explain (costs false) select count(*) from s left join s1 on s.u=s1.u right join l on s.u=l.u where s.nn not in (select nn from l);
+                       QUERY PLAN                       
+--------------------------------------------------------
+ Aggregate
+   InitPlan 1 (returns $0)
+     ->  Seq Scan on l l_2
+   ->  Hash Anti Join
+         Hash Cond: (s.nn = l_1.nn)
+         ->  Hash Left Join
+               Hash Cond: (l.u = s.u)
+               Filter: ((s.nn IS NOT NULL) OR (NOT $0))
+               ->  Seq Scan on l
+               ->  Hash
+                     ->  Hash Right Join
+                           Hash Cond: (s1.u = s.u)
+                           ->  Seq Scan on s1
+                           ->  Hash
+                                 ->  Seq Scan on s
+         ->  Hash
+               ->  Seq Scan on l l_1
+(17 rows)
+
+select count(*) from s left join s1 on s.u=s1.u right join l on s.u=l.u where s.nn not in (select nn from l);
+ count 
+-------
+     0
+(1 row)
+
+explain (costs false) select count(*) from s left join s1 on s.u=s1.u left join l on s.u=l.u where s.nn not in (select nn from l);
+                    QUERY PLAN                     
+---------------------------------------------------
+ Aggregate
+   ->  Nested Loop Left Join
+         Join Filter: (s.u = s1.u)
+         ->  Nested Loop Anti Join
+               ->  Seq Scan on s
+               ->  Index Only Scan using l_nn on l
+                     Index Cond: (nn = s.nn)
+         ->  Seq Scan on s1
+(8 rows)
+
+select count(*) from s left join s1 on s.u=s1.u left join l on s.u=l.u where s.nn not in (select nn from l);
+ count 
+-------
+     1
+(1 row)
+
+explain (costs false) select count(*) from s right join s1 on s.u=s1.u join l on s.u=l.u where s.nn not in (select nn from l);
+                         QUERY PLAN                          
+-------------------------------------------------------------
+ Aggregate
+   ->  Nested Loop
+         ->  Nested Loop
+               Join Filter: (s.u = s1.u)
+               ->  Nested Loop Anti Join
+                     ->  Seq Scan on s
+                     ->  Index Only Scan using l_nn on l l_1
+                           Index Cond: (nn = s.nn)
+               ->  Seq Scan on s1
+         ->  Index Only Scan using l_u on l
+               Index Cond: (u = s.u)
+(11 rows)
+
+select count(*) from s right join s1 on s.u=s1.u join l on s.u=l.u where s.nn not in (select nn from l);
+ count 
+-------
+     0
+(1 row)
+
+explain (costs false) select count(*) from s join s1 on s.u=s1.u right join l on s.u=l.u where s.nn not in (select nn from l);
+                       QUERY PLAN                       
+--------------------------------------------------------
+ Aggregate
+   InitPlan 1 (returns $0)
+     ->  Seq Scan on l l_2
+   ->  Hash Anti Join
+         Hash Cond: (s.nn = l_1.nn)
+         ->  Hash Left Join
+               Hash Cond: (l.u = s.u)
+               Filter: ((s.nn IS NOT NULL) OR (NOT $0))
+               ->  Seq Scan on l
+               ->  Hash
+                     ->  Hash Join
+                           Hash Cond: (s1.u = s.u)
+                           ->  Seq Scan on s1
+                           ->  Hash
+                                 ->  Seq Scan on s
+         ->  Hash
+               ->  Seq Scan on l l_1
+(17 rows)
+
+select * from s join s1 on s.u=s1.u right join l on s.u=l.u where s.nn not in (select nn from l);
+ u | n | nn | p | u | n | n1 | nn | p | u | n | nn | p 
+---+---+----+---+---+---+----+----+---+---+---+----+---
+(0 rows)
+
+explain (costs false) select count(*) from s full join s1 on s.u=s1.u join l on s.u=l.u where s.nn not in (select nn from l);
+                         QUERY PLAN                          
+-------------------------------------------------------------
+ Aggregate
+   ->  Nested Loop
+         ->  Nested Loop Left Join
+               Join Filter: (s.u = s1.u)
+               ->  Nested Loop Anti Join
+                     ->  Seq Scan on s
+                     ->  Index Only Scan using l_nn on l l_1
+                           Index Cond: (nn = s.nn)
+               ->  Seq Scan on s1
+         ->  Index Only Scan using l_u on l
+               Index Cond: (u = s.u)
+(11 rows)
+
+select count(*) from s full join s1 on s.u=s1.u join l on s.u=l.u where s.nn not in (select nn from l);
+ count 
+-------
+     0
+(1 row)
+
+explain (costs false) select count(*) from s join s1 on s.u=s1.u full join l on s.u=l.u where s.nn not in (select nn from l);
+                       QUERY PLAN                       
+--------------------------------------------------------
+ Aggregate
+   InitPlan 1 (returns $0)
+     ->  Seq Scan on l l_2
+   ->  Hash Anti Join
+         Hash Cond: (s.nn = l_1.nn)
+         ->  Hash Full Join
+               Hash Cond: (l.u = s.u)
+               Filter: ((s.nn IS NOT NULL) OR (NOT $0))
+               ->  Seq Scan on l
+               ->  Hash
+                     ->  Hash Join
+                           Hash Cond: (s1.u = s.u)
+                           ->  Seq Scan on s1
+                           ->  Hash
+                                 ->  Seq Scan on s
+         ->  Hash
+               ->  Seq Scan on l l_1
+(17 rows)
+
+select count(*) from s join s1 on s.u=s1.u full join l on s.u=l.u where s.nn not in (select nn from l);
+ count 
+-------
+     0
+(1 row)
+
+explain (costs false) select * from s where s.nn not in (select l.nn from l left join s on l.nn=s.nn left join s1 on l.nn=s1.nn);
+                    QUERY PLAN                     
+---------------------------------------------------
+ Nested Loop Anti Join
+   ->  Seq Scan on s
+   ->  Nested Loop Left Join
+         Join Filter: (l.nn = s1.nn)
+         ->  Nested Loop Left Join
+               Join Filter: (l.nn = s_1.nn)
+               ->  Index Only Scan using l_nn on l
+                     Index Cond: (nn = s.nn)
+               ->  Seq Scan on s s_1
+         ->  Seq Scan on s1
+(10 rows)
+
+select * from s where s.nn not in (select l.nn from l left join s on l.nn=s.nn left join s1 on l.nn=s1.nn);
+    u    |    n    |   nn    |   p    
+---------+---------+---------+--------
+ 1000002 | 1000002 | 1000002 | foofoo
+(1 row)
+
+explain (costs false) select * from s where s.nn not in (select l.nn from l left join s on l.nn=s.nn right join s1 on l.nn=s1.nn);
+                     QUERY PLAN                      
+-----------------------------------------------------
+ Seq Scan on s
+   Filter: (NOT (hashed SubPlan 1))
+   SubPlan 1
+     ->  Nested Loop Left Join
+           Join Filter: (l.nn = s_1.nn)
+           ->  Nested Loop Left Join
+                 ->  Seq Scan on s1
+                 ->  Index Only Scan using l_nn on l
+                       Index Cond: (nn = s1.nn)
+           ->  Materialize
+                 ->  Seq Scan on s s_1
+(11 rows)
+
+select * from s where s.nn not in (select l.nn from l left join s on l.nn=s.nn right join s1 on l.nn=s1.nn);
+ u | n | nn | p 
+---+---+----+---
+(0 rows)
+
+explain (costs false) select * from s where (n,u,nn) not in (select l.n,l.u,l.nn from l left join s on l.nn = s.nn);
+                                    QUERY PLAN                                     
+-----------------------------------------------------------------------------------
+ Nested Loop Anti Join
+   ->  Seq Scan on s
+   ->  Nested Loop Left Join
+         Join Filter: (l.nn = s_1.nn)
+         ->  Index Scan using l_nn on l
+               Index Cond: (nn = s.nn)
+               Filter: (((s.n = n) OR (n IS NULL) OR (s.n IS NULL)) AND (s.u = u))
+         ->  Seq Scan on s s_1
+(8 rows)
+
+select * from s where (n,u,nn) not in (select l.n,l.u,l.nn from l left join s on l.nn = s.nn);
+    u    |    n    |   nn    |   p    
+---------+---------+---------+--------
+ 1000002 | 1000002 | 1000002 | foofoo
+(1 row)
+
+explain (costs false) select * from s where (n,u,nn) not in (select l.n,l.u,l.nn from l right join s on l.nn = s.nn);
+                QUERY PLAN                 
+-------------------------------------------
+ Seq Scan on s
+   Filter: (NOT (hashed SubPlan 1))
+   SubPlan 1
+     ->  Nested Loop Left Join
+           ->  Seq Scan on s s_1
+           ->  Index Scan using l_nn on l
+                 Index Cond: (nn = s_1.nn)
+(7 rows)
+
+select * from s where (n,u,nn) not in (select l.n,l.u,l.nn from l left join s on l.nn = s.nn);
+    u    |    n    |   nn    |   p    
+---------+---------+---------+--------
+ 1000002 | 1000002 | 1000002 | foofoo
+(1 row)
+
+--test reduce outer joins from outer query
+explain (costs false) select count(*) from s right join l on s.nn = l.nn where s.nn not in (select nn from l);
+                       QUERY PLAN                       
+--------------------------------------------------------
+ Aggregate
+   InitPlan 1 (returns $0)
+     ->  Seq Scan on l l_2
+   ->  Hash Anti Join
+         Hash Cond: (s.nn = l_1.nn)
+         ->  Hash Left Join
+               Hash Cond: (l.nn = s.nn)
+               Filter: ((s.nn IS NOT NULL) OR (NOT $0))
+               ->  Seq Scan on l
+               ->  Hash
+                     ->  Seq Scan on s
+         ->  Hash
+               ->  Seq Scan on l l_1
+(13 rows)
+
+select count(*) from s right join l on s.nn = l.nn where s.nn not in (select nn from l);
+ count 
+-------
+     0
+(1 row)
+
+explain (costs false) select count(*) from s right join l on s.nn = l.nn where s.nn not in (select nn from l) and s.u>0;
+                      QUERY PLAN                       
+-------------------------------------------------------
+ Aggregate
+   ->  Nested Loop
+         ->  Nested Loop Anti Join
+               ->  Seq Scan on s
+                     Filter: (u > 0)
+               ->  Index Only Scan using l_nn on l l_1
+                     Index Cond: (nn = s.nn)
+         ->  Index Only Scan using l_nn on l
+               Index Cond: (nn = s.nn)
+(9 rows)
+
+select count(*) from s right join l on s.nn = l.nn where s.nn not in (select nn from l) and s.u>0;
+ count 
+-------
+     0
+(1 row)
+
+explain (costs false) select count(*) from s right join l on s.nn = l.nn join s1 on s.u = s1.u where s.nn not in (select nn from l);
+                         QUERY PLAN                          
+-------------------------------------------------------------
+ Aggregate
+   ->  Nested Loop
+         Join Filter: (s.u = s1.u)
+         ->  Nested Loop
+               ->  Nested Loop Anti Join
+                     ->  Seq Scan on s
+                     ->  Index Only Scan using l_nn on l l_1
+                           Index Cond: (nn = s.nn)
+               ->  Index Only Scan using l_nn on l
+                     Index Cond: (nn = s.nn)
+         ->  Seq Scan on s1
+(11 rows)
+
+select count(*) from s right join l on s.nn = l.nn join s1 on s.u = s1.u where s.nn not in (select nn from l);
+ count 
+-------
+     0
+(1 row)
+
+explain (costs false) select count(*) from s right join l on s.nn = l.nn right join s1 on s.u = s1.u where s.nn not in (select nn from l);
+                          QUERY PLAN                           
+---------------------------------------------------------------
+ Aggregate
+   InitPlan 1 (returns $0)
+     ->  Seq Scan on l l_2
+   ->  Nested Loop Anti Join
+         ->  Nested Loop Left Join
+               Join Filter: (s.u = s1.u)
+               Filter: ((s.nn IS NOT NULL) OR (NOT $0))
+               ->  Seq Scan on s1
+               ->  Materialize
+                     ->  Nested Loop
+                           ->  Seq Scan on s
+                           ->  Index Only Scan using l_nn on l
+                                 Index Cond: (nn = s.nn)
+         ->  Index Only Scan using l_nn on l l_1
+               Index Cond: (nn = s.nn)
+(15 rows)
+
+select count(*) from s right join l on s.nn = l.nn right join s1 on s.u = s1.u where s.nn not in (select nn from l);
+ count 
+-------
+     0
+(1 row)
+
+explain (costs false) select count(*) from s right join l on s.nn = l.nn left join s1 on s.u = s1.u where s.nn not in (select nn from l);
+                          QUERY PLAN                          
+--------------------------------------------------------------
+ Aggregate
+   InitPlan 1 (returns $0)
+     ->  Seq Scan on l l_2
+   ->  Nested Loop Left Join
+         Join Filter: (s.u = s1.u)
+         ->  Hash Anti Join
+               Hash Cond: (s.nn = l_1.nn)
+               ->  Hash Left Join
+                     Hash Cond: (l.nn = s.nn)
+                     Filter: ((s.nn IS NOT NULL) OR (NOT $0))
+                     ->  Seq Scan on l
+                     ->  Hash
+                           ->  Seq Scan on s
+               ->  Hash
+                     ->  Seq Scan on l l_1
+         ->  Seq Scan on s1
+(16 rows)
+
+select count(*) from s right join l on s.nn = l.nn left join s1 on s.u = s1.u where s.nn not in (select nn from l);
+ count 
+-------
+     0
+(1 row)
+
+--test reduce outer joins from subquery
+explain (costs false) select * from s where nn not in (select l.nn from l right join s on l.nn = s.nn);
+                  QUERY PLAN                   
+-----------------------------------------------
+ Seq Scan on s
+   Filter: (NOT (hashed SubPlan 1))
+   SubPlan 1
+     ->  Nested Loop Left Join
+           ->  Seq Scan on s s_1
+           ->  Index Only Scan using l_nn on l
+                 Index Cond: (nn = s_1.nn)
+(7 rows)
+
+select * from s where nn not in (select l.nn from l right join s on l.nn = s.nn);
+ u | n | nn | p 
+---+---+----+---
+(0 rows)
+
+explain (costs false) select * from s where nn not in (select l.nn from l right join s on l.nn = s.nn where l.u > 9);
+                QUERY PLAN                 
+-------------------------------------------
+ Seq Scan on s
+   Filter: (NOT (hashed SubPlan 1))
+   SubPlan 1
+     ->  Nested Loop
+           ->  Seq Scan on s s_1
+           ->  Index Scan using l_nn on l
+                 Index Cond: (nn = s_1.nn)
+                 Filter: (u > 9)
+(8 rows)
+
+select * from s where nn not in (select l.nn from l right join s on l.nn = s.nn where l.u > 9);
+    u    |    n    |   nn    |   p    
+---------+---------+---------+--------
+       1 |       1 |       1 | foo
+       2 |       2 |       2 | foo
+ 1000002 | 1000002 | 1000002 | foofoo
+       3 |         |       3 | foo
+(4 rows)
+
+explain (costs false) select * from s where nn not in (select l.nn from l right join s on l.nn = s.nn where s.u > 9);
+                  QUERY PLAN                   
+-----------------------------------------------
+ Seq Scan on s
+   Filter: (NOT (hashed SubPlan 1))
+   SubPlan 1
+     ->  Nested Loop Left Join
+           ->  Seq Scan on s s_1
+                 Filter: (u > 9)
+           ->  Index Only Scan using l_nn on l
+                 Index Cond: (nn = s_1.nn)
+(8 rows)
+
+select * from s where nn not in (select l.nn from l right join s on l.nn = s.nn where s.u > 9);
+ u | n | nn | p 
+---+---+----+---
+(0 rows)
+
+explain (costs false) select * from s where nn not in (select l.nn from l right join s on l.nn = s.nn join s1 on l.n = s1.n);
+                      QUERY PLAN                       
+-------------------------------------------------------
+ Seq Scan on s
+   Filter: (NOT (hashed SubPlan 1))
+   SubPlan 1
+     ->  Nested Loop
+           Join Filter: (l.n = s1.n)
+           ->  Seq Scan on s1
+           ->  Materialize
+                 ->  Nested Loop
+                       ->  Seq Scan on s s_1
+                       ->  Index Scan using l_nn on l
+                             Index Cond: (nn = s_1.nn)
+(11 rows)
+
+select * from s where nn not in (select l.nn from l right join s on l.nn = s.nn join s1 on l.n = s1.n);
+    u    |    n    |   nn    |   p    
+---------+---------+---------+--------
+ 1000002 | 1000002 | 1000002 | foofoo
+       3 |         |       3 | foo
+(2 rows)
+
+explain (costs false) select * from s where nn not in (select l.nn from l right join s on l.nn = s.nn right join s1 on l.n = s1.n);
+                      QUERY PLAN                       
+-------------------------------------------------------
+ Seq Scan on s
+   Filter: (NOT (hashed SubPlan 1))
+   SubPlan 1
+     ->  Nested Loop Left Join
+           Join Filter: (l.n = s1.n)
+           ->  Seq Scan on s1
+           ->  Materialize
+                 ->  Nested Loop
+                       ->  Seq Scan on s s_1
+                       ->  Index Scan using l_nn on l
+                             Index Cond: (nn = s_1.nn)
+(11 rows)
+
+select * from s where nn not in (select l.nn from l right join s on l.nn = s.nn right join s1 on l.n = s1.n);
+ u | n | nn | p 
+---+---+----+---
+(0 rows)
+
+explain (costs false) select * from s where nn not in (select l.nn from l right join s on l.nn = s.nn left join s1 on l.n = s1.n);
+                   QUERY PLAN                    
+-------------------------------------------------
+ Seq Scan on s
+   Filter: (NOT (hashed SubPlan 1))
+   SubPlan 1
+     ->  Nested Loop Left Join
+           Join Filter: (l.n = s1.n)
+           ->  Nested Loop Left Join
+                 ->  Seq Scan on s s_1
+                 ->  Index Scan using l_nn on l
+                       Index Cond: (nn = s_1.nn)
+           ->  Materialize
+                 ->  Seq Scan on s1
+(11 rows)
+
+select * from s where nn not in (select l.nn from l right join s on l.nn = s.nn left join s1 on l.n = s1.n);
+ u | n | nn | p 
+---+---+----+---
+(0 rows)
+
+--test reduce outer join on outer and sub-query
+explain (costs false) select count(*) from s right join l on s.nn = l.nn join s1 on s.u = s1.u where s.nn not in (select l.nn from l right join s on l.nn = s.nn join s1 on l.n = s1.n);
+                                    QUERY PLAN                                    
+----------------------------------------------------------------------------------
+ Aggregate
+   ->  Nested Loop
+         Join Filter: (s.u = s1.u)
+         ->  Seq Scan on s1
+         ->  Materialize
+               ->  Nested Loop
+                     ->  Seq Scan on s
+                           Filter: (NOT (hashed SubPlan 1))
+                           SubPlan 1
+                             ->  Nested Loop
+                                   Join Filter: (l_1.n = s1_1.n)
+                                   ->  Seq Scan on s1 s1_1
+                                   ->  Materialize
+                                         ->  Nested Loop
+                                               ->  Seq Scan on s s_1
+                                               ->  Index Scan using l_nn on l l_1
+                                                     Index Cond: (nn = s_1.nn)
+                     ->  Index Only Scan using l_nn on l
+                           Index Cond: (nn = s.nn)
+(19 rows)
+
+select count(*) from s right join l on s.nn = l.nn join s1 on s.u = s1.u where s.nn not in (select l.nn from l right join s on l.nn = s.nn join s1 on l.n = s1.n);
+ count 
+-------
+     1
+(1 row)
+
+explain (costs false) select count(*) from s right join l on s.nn = l.nn left join s1 on s.u = s1.u where s.nn not in (select l.nn from l right join s on l.nn = s.nn left join s1 on l.n = s1.n);
+                        QUERY PLAN                        
+----------------------------------------------------------
+ Aggregate
+   ->  Hash Left Join
+         Hash Cond: (l.nn = s.nn)
+         Filter: (NOT (hashed SubPlan 1))
+         ->  Seq Scan on l
+         ->  Hash
+               ->  Hash Right Join
+                     Hash Cond: (s1.u = s.u)
+                     ->  Seq Scan on s1
+                     ->  Hash
+                           ->  Seq Scan on s
+         SubPlan 1
+           ->  Nested Loop Left Join
+                 Join Filter: (l_1.n = s1_1.n)
+                 ->  Nested Loop Left Join
+                       ->  Seq Scan on s s_1
+                       ->  Index Scan using l_nn on l l_1
+                             Index Cond: (nn = s_1.nn)
+                 ->  Materialize
+                       ->  Seq Scan on s1 s1_1
+(20 rows)
+
+select count(*) from s right join l on s.nn = l.nn left join s1 on s.u = s1.u where s.nn not in (select l.nn from l right join s on l.nn = s.nn left join s1 on l.n = s1.n);
+ count 
+-------
+     0
+(1 row)
+
+-- test union all
+explain (costs false) select * from s as t where not exists
+(select 1 from (select n as y from l union all
+				select u as y from s union all
+				select nn as y from s) as v where t.n=v.y or v.y is null) and n is not null;
+                       QUERY PLAN                       
+--------------------------------------------------------
+ Nested Loop Anti Join
+   ->  Seq Scan on s t
+         Filter: (n IS NOT NULL)
+   ->  Append
+         ->  Bitmap Heap Scan on l
+               Recheck Cond: ((t.n = n) OR (n IS NULL))
+               ->  BitmapOr
+                     ->  Bitmap Index Scan on l_n
+                           Index Cond: (n = t.n)
+                     ->  Bitmap Index Scan on l_n
+                           Index Cond: (n IS NULL)
+         ->  Seq Scan on s
+               Filter: ((t.n = u) OR (u IS NULL))
+         ->  Seq Scan on s s_1
+               Filter: ((t.n = nn) OR (nn IS NULL))
+(15 rows)
+
+select * from s as t where not exists
+(select 1 from (select n as y from l union all
+				select u as y from s union all
+				select nn as y from s) as v where t.n=v.y or v.y is null) and n is not null;
+ u | n | nn | p 
+---+---+----+---
+(0 rows)
+
+explain (costs false) select * from s where n not in
+(select n as y from l union all
+ select u as y from s union all
+ select nn as y from s);
+                       QUERY PLAN                       
+--------------------------------------------------------
+ Nested Loop Anti Join
+   InitPlan 1 (returns $0)
+     ->  Result
+           ->  Append
+                 ->  Seq Scan on l l_1
+                 ->  Seq Scan on s s_3
+                 ->  Seq Scan on s s_4
+   ->  Seq Scan on s
+         Filter: ((n IS NOT NULL) OR (NOT $0))
+   ->  Append
+         ->  Bitmap Heap Scan on l
+               Recheck Cond: ((s.n = n) OR (n IS NULL))
+               ->  BitmapOr
+                     ->  Bitmap Index Scan on l_n
+                           Index Cond: (n = s.n)
+                     ->  Bitmap Index Scan on l_n
+                           Index Cond: (n IS NULL)
+         ->  Seq Scan on s s_1
+               Filter: ((s.n = u) OR (u IS NULL))
+         ->  Seq Scan on s s_2
+               Filter: ((s.n = nn) OR (nn IS NULL))
+(21 rows)
+
+select * from s where n not in
+(select n as y from l union all
+ select u as y from s union all
+ select nn as y from s);
+ u | n | nn | p 
+---+---+----+---
+(0 rows)
+
+explain (costs false) select count(*) from
+(select n as x from s union all select u as x from l) t where t.x not in
+(select nn from l);
+                        QUERY PLAN                         
+-----------------------------------------------------------
+ Aggregate
+   InitPlan 1 (returns $0)
+     ->  Seq Scan on l l_2
+   ->  Hash Anti Join
+         Hash Cond: (s.n = l_1.nn)
+         ->  Append
+               ->  Seq Scan on s
+                     Filter: ((n IS NOT NULL) OR (NOT $0))
+               ->  Seq Scan on l
+                     Filter: ((u IS NOT NULL) OR (NOT $0))
+         ->  Hash
+               ->  Seq Scan on l l_1
+(12 rows)
+
+select count(*) from
+(select n as x from s union all select u as x from l) t where t.x not in
+(select nn from l);
+ count 
+-------
+     1
+(1 row)
+
+explain (costs false) select count(*) from
+(select n as x from s union all select n as x from l) t where t.x not in
+(select nn from empty);
+                   QUERY PLAN                   
+------------------------------------------------
+ Aggregate
+   ->  Append
+         ->  Seq Scan on s
+               Filter: (NOT (hashed SubPlan 1))
+               SubPlan 1
+                 ->  Seq Scan on empty
+         ->  Seq Scan on l
+               Filter: (NOT (hashed SubPlan 1))
+(8 rows)
+
+select count(*) from
+(select n as x from s union all select n as x from l) t where t.x not in
+(select nn from empty);
+ count 
+-------
+ 10004
+(1 row)
+
+explain (costs false) select count(*) from
+(select n as x from s union all select u as x from l) t where t.x not in
+(select n as y from l union all
+ select u as y from s union all
+ select nn as y from s);
+                                QUERY PLAN                                
+--------------------------------------------------------------------------
+ Finalize Aggregate
+   InitPlan 1 (returns $0)
+     ->  Result
+           ->  Append
+                 ->  Seq Scan on l l_2
+                 ->  Seq Scan on s s_3
+                 ->  Seq Scan on s s_4
+   ->  Gather
+         Workers Planned: 2
+         Params Evaluated: $0
+         ->  Partial Aggregate
+               ->  Nested Loop Anti Join
+                     ->  Parallel Append
+                           ->  Parallel Seq Scan on l
+                                 Filter: ((u IS NOT NULL) OR (NOT $0))
+                           ->  Parallel Seq Scan on s
+                                 Filter: ((n IS NOT NULL) OR (NOT $0))
+                     ->  Append
+                           ->  Bitmap Heap Scan on l l_1
+                                 Recheck Cond: ((l.u = n) OR (n IS NULL))
+                                 ->  BitmapOr
+                                       ->  Bitmap Index Scan on l_n
+                                             Index Cond: (n = l.u)
+                                       ->  Bitmap Index Scan on l_n
+                                             Index Cond: (n IS NULL)
+                           ->  Seq Scan on s s_1
+                                 Filter: ((l.u = u) OR (u IS NULL))
+                           ->  Seq Scan on s s_2
+                                 Filter: ((l.u = nn) OR (nn IS NULL))
+(29 rows)
+
+select count(*) from
+(select n as x from s union all select u as x from l) t where t.x not in
+(select n as y from l union all
+ select u as y from s union all
+ select nn as y from s);
+ count 
+-------
+     0
+(1 row)
+
+-- test multi-levels of NOT IN
+explain (costs false) select * from s where n not in (select n from s where n not in (select n from l));
+                         QUERY PLAN                         
+------------------------------------------------------------
+ Seq Scan on s
+   Filter: (NOT (hashed SubPlan 2))
+   SubPlan 2
+     ->  Nested Loop Anti Join
+           InitPlan 1 (returns $0)
+             ->  Seq Scan on l
+           ->  Seq Scan on s s_1
+                 Filter: ((n IS NOT NULL) OR (NOT $0))
+           ->  Bitmap Heap Scan on l l_1
+                 Recheck Cond: ((s_1.n = n) OR (n IS NULL))
+                 ->  BitmapOr
+                       ->  Bitmap Index Scan on l_n
+                             Index Cond: (n = s_1.n)
+                       ->  Bitmap Index Scan on l_n
+                             Index Cond: (n IS NULL)
+(15 rows)
+
+select * from s where n not in (select n from s where n not in (select n from l));
+    u    |    n    |   nn    |   p    
+---------+---------+---------+--------
+       1 |       1 |       1 | foo
+       2 |       2 |       2 | foo
+ 1000002 | 1000002 | 1000002 | foofoo
+       3 |         |       3 | foo
+(4 rows)
+
+explain (costs false) select * from s where n not in (select n from s where n not in (select u from l));
+                      QUERY PLAN                       
+-------------------------------------------------------
+ Seq Scan on s
+   Filter: (NOT (hashed SubPlan 2))
+   SubPlan 2
+     ->  Nested Loop Anti Join
+           InitPlan 1 (returns $0)
+             ->  Seq Scan on l
+           ->  Seq Scan on s s_1
+                 Filter: ((n IS NOT NULL) OR (NOT $0))
+           ->  Index Only Scan using l_u on l l_1
+                 Index Cond: (u = s_1.n)
+(10 rows)
+
+select * from s where n not in (select n from s where n not in (select u from l));
+ u | n | nn |  p  
+---+---+----+-----
+ 1 | 1 |  1 | foo
+ 2 | 2 |  2 | foo
+(2 rows)
+
+explain (costs false) select count(*) from s where u not in
+(select n from s1 where not exists
+ (select 1 from (select n from s1 where u not in (select n from l)) t where t.n = s.n));
+                               QUERY PLAN                                
+-------------------------------------------------------------------------
+ Aggregate
+   ->  Seq Scan on s
+         Filter: (NOT (SubPlan 2))
+         SubPlan 2
+           ->  Result
+                 One-Time Filter: (NOT $2)
+                 InitPlan 1 (returns $2)
+                   ->  Nested Loop Anti Join
+                         ->  Seq Scan on s1
+                               Filter: (n = s.n)
+                         ->  Bitmap Heap Scan on l
+                               Recheck Cond: ((s1.u = n) OR (n IS NULL))
+                               ->  BitmapOr
+                                     ->  Bitmap Index Scan on l_n
+                                           Index Cond: (n = s1.u)
+                                     ->  Bitmap Index Scan on l_n
+                                           Index Cond: (n IS NULL)
+                 ->  Seq Scan on s1 s1_1
+(18 rows)
+
+select count(*) from s where u not in
+(select n from s1 where not exists
+ (select 1 from (select n from s1 where u not in (select n from l)) t where t.n = s.n));
+ count 
+-------
+     0
+(1 row)
+
+explain (costs false) select * from s where n not in (select n from s1) and u not in (select u from s1) and nn not in (select nn from s1);
+                                           QUERY PLAN                                           
+------------------------------------------------------------------------------------------------
+ Seq Scan on s
+   Filter: ((NOT (hashed SubPlan 1)) AND (NOT (hashed SubPlan 2)) AND (NOT (hashed SubPlan 3)))
+   SubPlan 1
+     ->  Seq Scan on s1
+   SubPlan 2
+     ->  Seq Scan on s1 s1_1
+   SubPlan 3
+     ->  Seq Scan on s1 s1_2
+(8 rows)
+
+select * from s where n not in (select n from s1) and u not in (select u from s1) and nn not in (select nn from s1);
+ u | n | nn | p 
+---+---+----+---
+(0 rows)
+
+explain (costs false) select * from s where n not in (select n from s1) and u not in (select u from s1) and nn not in (select nn from l);
+                               QUERY PLAN                                
+-------------------------------------------------------------------------
+ Nested Loop Anti Join
+   ->  Seq Scan on s
+         Filter: ((NOT (hashed SubPlan 1)) AND (NOT (hashed SubPlan 2)))
+         SubPlan 1
+           ->  Seq Scan on s1
+         SubPlan 2
+           ->  Seq Scan on s1 s1_1
+   ->  Index Only Scan using l_nn on l
+         Index Cond: (nn = s.nn)
+(9 rows)
+
+select * from s where n not in (select n from s1) and u not in (select u from s1) and nn not in (select nn from l);
+ u | n | nn | p 
+---+---+----+---
+(0 rows)
+
+explain (costs false) select count(*) from s where u not in
+(select n from s1 where not exists
+ (select 1 from (select n from s1 where u not in (select n from l)) t where t.n = s.n))
+and nn not in
+(select n from s1 where not exists
+ (select 1 from (select n from s1 where u not in (select n from l)) t where t.n = s.n));
+                                QUERY PLAN                                 
+---------------------------------------------------------------------------
+ Aggregate
+   ->  Seq Scan on s
+         Filter: ((NOT (SubPlan 2)) AND (NOT (SubPlan 4)))
+         SubPlan 2
+           ->  Result
+                 One-Time Filter: (NOT $2)
+                 InitPlan 1 (returns $2)
+                   ->  Nested Loop Anti Join
+                         ->  Seq Scan on s1
+                               Filter: (n = s.n)
+                         ->  Bitmap Heap Scan on l
+                               Recheck Cond: ((s1.u = n) OR (n IS NULL))
+                               ->  BitmapOr
+                                     ->  Bitmap Index Scan on l_n
+                                           Index Cond: (n = s1.u)
+                                     ->  Bitmap Index Scan on l_n
+                                           Index Cond: (n IS NULL)
+                 ->  Seq Scan on s1 s1_1
+         SubPlan 4
+           ->  Result
+                 One-Time Filter: (NOT $6)
+                 InitPlan 3 (returns $6)
+                   ->  Nested Loop Anti Join
+                         ->  Seq Scan on s1 s1_2
+                               Filter: (n = s.n)
+                         ->  Bitmap Heap Scan on l l_1
+                               Recheck Cond: ((s1_2.u = n) OR (n IS NULL))
+                               ->  BitmapOr
+                                     ->  Bitmap Index Scan on l_n
+                                           Index Cond: (n = s1_2.u)
+                                     ->  Bitmap Index Scan on l_n
+                                           Index Cond: (n IS NULL)
+                 ->  Seq Scan on s1 s1_3
+(33 rows)
+
+select count(*) from s where u not in
+(select n from s1 where not exists
+ (select 1 from (select n from s1 where u not in (select n from l)) t where t.n = s.n))
+and nn not in
+(select n from s1 where not exists
+ (select 1 from (select n from s1 where u not in (select n from l)) t where t.n = s.n));
+ count 
+-------
+     0
+(1 row)
+
+--test COALESCE
+explain (costs false) select * from s where COALESCE(n, -1) not in (select COALESCE(n, -1) from l);
+                                 QUERY PLAN                                 
+----------------------------------------------------------------------------
+ Hash Anti Join
+   Hash Cond: (COALESCE(s.n, '-1'::integer) = COALESCE(l.n, '-1'::integer))
+   ->  Seq Scan on s
+   ->  Hash
+         ->  Seq Scan on l
+(5 rows)
+
+select * from s where COALESCE(n, -1) not in (select COALESCE(n, -1) from l);
+    u    |    n    |   nn    |   p    
+---------+---------+---------+--------
+ 1000002 | 1000002 | 1000002 | foofoo
+(1 row)
+
+explain (costs false) select * from s where COALESCE(n, NULL, -1) not in (select COALESCE(n, NULL, -1) from l);
+                                 QUERY PLAN                                 
+----------------------------------------------------------------------------
+ Hash Anti Join
+   Hash Cond: (COALESCE(s.n, '-1'::integer) = COALESCE(l.n, '-1'::integer))
+   ->  Seq Scan on s
+   ->  Hash
+         ->  Seq Scan on l
+(5 rows)
+
+select * from s where COALESCE(n, NULL, -1) not in (select COALESCE(n, NULL, -1) from l);
+    u    |    n    |   nn    |   p    
+---------+---------+---------+--------
+ 1000002 | 1000002 | 1000002 | foofoo
+(1 row)
+
+explain (costs false) select * from s where COALESCE(n, NULL, NULL) not in (select COALESCE(n, NULL, NULL) from l);
+                                 QUERY PLAN                                  
+-----------------------------------------------------------------------------
+ Nested Loop Anti Join
+   Join Filter: ((COALESCE(s.n) = COALESCE(l.n)) OR (COALESCE(l.n) IS NULL))
+   InitPlan 1 (returns $0)
+     ->  Seq Scan on l l_1
+   ->  Seq Scan on s
+         Filter: ((COALESCE(n) IS NOT NULL) OR (NOT $0))
+   ->  Materialize
+         ->  Seq Scan on l
+(8 rows)
+
+select * from s where COALESCE(n, NULL, NULL) not in (select COALESCE(n, NULL, NULL) from l);
+ u | n | nn | p 
+---+---+----+---
+(0 rows)
+
+explain (costs false) select * from s where COALESCE(n, nn) not in (select COALESCE(n, nn) from l);
+                        QUERY PLAN                        
+----------------------------------------------------------
+ Hash Anti Join
+   Hash Cond: (COALESCE(s.n, s.nn) = COALESCE(l.n, l.nn))
+   ->  Seq Scan on s
+   ->  Hash
+         ->  Seq Scan on l
+(5 rows)
+
+select * from s where COALESCE(n, nn) not in (select COALESCE(n, nn) from l);
+    u    |    n    |   nn    |   p    
+---------+---------+---------+--------
+ 1000002 | 1000002 | 1000002 | foofoo
+(1 row)
+
+explain (costs false) select * from s where COALESCE(nn, NULL) not in (select COALESCE(nn, NULL) from l);
+                   QUERY PLAN                   
+------------------------------------------------
+ Hash Anti Join
+   Hash Cond: (COALESCE(s.nn) = COALESCE(l.nn))
+   ->  Seq Scan on s
+   ->  Hash
+         ->  Seq Scan on l
+(5 rows)
+
+select * from s where COALESCE(nn, NULL) not in (select COALESCE(nn, NULL) from l);
+    u    |    n    |   nn    |   p    
+---------+---------+---------+--------
+ 1000002 | 1000002 | 1000002 | foofoo
+(1 row)
+
+explain (costs false) select * from s where (COALESCE(n, -1), nn, COALESCE(n, u)) not in (select COALESCE(n, -1), nn, COALESCE(n, u) from l);
+                                                       QUERY PLAN                                                        
+-------------------------------------------------------------------------------------------------------------------------
+ Nested Loop Anti Join
+   ->  Seq Scan on s
+   ->  Index Scan using l_nn on l
+         Index Cond: (nn = s.nn)
+         Filter: ((COALESCE(s.n, '-1'::integer) = COALESCE(n, '-1'::integer)) AND (COALESCE(s.n, s.u) = COALESCE(n, u)))
+(5 rows)
+
+select * from s where (COALESCE(n, -1), nn, COALESCE(n, u)) not in (select COALESCE(n, -1), nn, COALESCE(n, u) from l);
+    u    |    n    |   nn    |   p    
+---------+---------+---------+--------
+ 1000002 | 1000002 | 1000002 | foofoo
+       3 |         |       3 | foo
+(2 rows)
+
+-- test miscellaneous outer nullable cases
+explain (costs false) select * from s where (n,n) not in (select n,n from l);
+                         QUERY PLAN                          
+-------------------------------------------------------------
+ Nested Loop Anti Join
+   Join Filter: (((s.n = l.n) AND (s.n = l.n)) IS NOT FALSE)
+   ->  Seq Scan on s
+   ->  Materialize
+         ->  Seq Scan on l
+(5 rows)
+
+select * from s where (n,n) not in (select n,n from l);
+ u | n | nn | p 
+---+---+----+---
+(0 rows)
+
+explain (costs false) select * from s right join l on s.nn = l.nn where (s.n,s.u,s.nn) not in (select n,u,nn from l);
+                                     QUERY PLAN                                      
+-------------------------------------------------------------------------------------
+ Nested Loop Anti Join
+   Join Filter: (((s.n = l_1.n) AND (s.u = l_1.u) AND (s.nn = l_1.nn)) IS NOT FALSE)
+   ->  Hash Left Join
+         Hash Cond: (l.nn = s.nn)
+         ->  Seq Scan on l
+         ->  Hash
+               ->  Seq Scan on s
+   ->  Materialize
+         ->  Seq Scan on l l_1
+(9 rows)
+
+select * from s right join l on s.nn = l.nn where (s.n,s.u,s.nn) not in (select n,u,nn from l);
+ u | n | nn | p | u | n | nn | p 
+---+---+----+---+---+---+----+---
+(0 rows)
+
+explain (costs false) select count(*) from s right join l on s.nn = l.nn where (s.n,s.u,s.nn) not in (select n,u,nn from l where u < 0);
+                 QUERY PLAN                  
+---------------------------------------------
+ Aggregate
+   ->  Hash Left Join
+         Hash Cond: (l.nn = s.nn)
+         Filter: (NOT (hashed SubPlan 1))
+         ->  Seq Scan on l
+         ->  Hash
+               ->  Seq Scan on s
+         SubPlan 1
+           ->  Index Scan using l_u on l l_1
+                 Index Cond: (u < 0)
+(10 rows)
+
+select count(*) from s right join l on s.nn = l.nn where (s.n,s.u,s.nn) not in (select n,u,nn from l where u < 0);
+ count 
+-------
+ 10000
+(1 row)
+
+explain (costs false) select * from s where (n,n,n) not in (select distinct n,n,n from l where u > 0 limit 3) order by n;
+                     QUERY PLAN                      
+-----------------------------------------------------
+ Sort
+   Sort Key: s.n
+   ->  Seq Scan on s
+         Filter: (NOT (hashed SubPlan 1))
+         SubPlan 1
+           ->  Limit
+                 ->  Unique
+                       ->  Index Scan using l_n on l
+                             Filter: (u > 0)
+(9 rows)
+
+select * from s where (n,n,n) not in (select distinct n,n,n from l where u > 0 limit 3) order by n;
+    u    |    n    |   nn    |   p    
+---------+---------+---------+--------
+ 1000002 | 1000002 | 1000002 | foofoo
+(1 row)
+
+--test outer has strict predicate or inner join
+explain (costs false) select * from s where n not in (select n from l) and n > 0;
+                    QUERY PLAN                    
+--------------------------------------------------
+ Nested Loop Anti Join
+   ->  Seq Scan on s
+         Filter: (n > 0)
+   ->  Bitmap Heap Scan on l
+         Recheck Cond: ((s.n = n) OR (n IS NULL))
+         ->  BitmapOr
+               ->  Bitmap Index Scan on l_n
+                     Index Cond: (n = s.n)
+               ->  Bitmap Index Scan on l_n
+                     Index Cond: (n IS NULL)
+(10 rows)
+
+select * from s where n not in (select n from l) and n > 0;
+ u | n | nn | p 
+---+---+----+---
+(0 rows)
+
+explain (costs false) select * from s where n not in (select n from l) and u > 0;
+                         QUERY PLAN                          
+-------------------------------------------------------------
+ Nested Loop Anti Join
+   InitPlan 1 (returns $0)
+     ->  Seq Scan on l l_1
+   ->  Seq Scan on s
+         Filter: (((n IS NOT NULL) OR (NOT $0)) AND (u > 0))
+   ->  Bitmap Heap Scan on l
+         Recheck Cond: ((s.n = n) OR (n IS NULL))
+         ->  BitmapOr
+               ->  Bitmap Index Scan on l_n
+                     Index Cond: (n = s.n)
+               ->  Bitmap Index Scan on l_n
+                     Index Cond: (n IS NULL)
+(12 rows)
+
+select * from s where n not in (select n from l) and u > 0;
+ u | n | nn | p 
+---+---+----+---
+(0 rows)
+
+explain (costs false) select * from s where n not in (select n from l) and n is not null;
+                    QUERY PLAN                    
+--------------------------------------------------
+ Nested Loop Anti Join
+   ->  Seq Scan on s
+         Filter: (n IS NOT NULL)
+   ->  Bitmap Heap Scan on l
+         Recheck Cond: ((s.n = n) OR (n IS NULL))
+         ->  BitmapOr
+               ->  Bitmap Index Scan on l_n
+                     Index Cond: (n = s.n)
+               ->  Bitmap Index Scan on l_n
+                     Index Cond: (n IS NULL)
+(10 rows)
+
+select * from s where n not in (select n from l) and n is not null;
+ u | n | nn | p 
+---+---+----+---
+(0 rows)
+
+explain (costs false) select * from s join l on s.n = l.n where s.n not in (select n from l);
+                       QUERY PLAN                       
+--------------------------------------------------------
+ Nested Loop
+   ->  Nested Loop Anti Join
+         ->  Seq Scan on s
+         ->  Bitmap Heap Scan on l l_1
+               Recheck Cond: ((s.n = n) OR (n IS NULL))
+               ->  BitmapOr
+                     ->  Bitmap Index Scan on l_n
+                           Index Cond: (n = s.n)
+                     ->  Bitmap Index Scan on l_n
+                           Index Cond: (n IS NULL)
+   ->  Index Scan using l_n on l
+         Index Cond: (n = s.n)
+(12 rows)
+
+select * from s join l on s.n = l.n where s.n not in (select n from l);
+ u | n | nn | p | u | n | nn | p 
+---+---+----+---+---+---+----+---
+(0 rows)
+
+explain (costs false) select count(*) from s right join l on s.n = l.n where s.n not in (select n from l);
+                       QUERY PLAN                       
+--------------------------------------------------------
+ Aggregate
+   InitPlan 1 (returns $0)
+     ->  Seq Scan on l l_2
+   ->  Nested Loop Anti Join
+         ->  Hash Left Join
+               Hash Cond: (l.n = s.n)
+               Filter: ((s.n IS NOT NULL) OR (NOT $0))
+               ->  Seq Scan on l
+               ->  Hash
+                     ->  Seq Scan on s
+         ->  Bitmap Heap Scan on l l_1
+               Recheck Cond: ((s.n = n) OR (n IS NULL))
+               ->  BitmapOr
+                     ->  Bitmap Index Scan on l_n
+                           Index Cond: (n = s.n)
+                     ->  Bitmap Index Scan on l_n
+                           Index Cond: (n IS NULL)
+(17 rows)
+
+select count(*) from s right join l on s.n = l.n where s.n not in (select n from l);
+ count 
+-------
+     0
+(1 row)
+
+explain (costs false) select count(*) from s right join l on s.n = l.n join s1 on s.u = s1.u where s.n not in (select n from l);
+                             QUERY PLAN                             
+--------------------------------------------------------------------
+ Aggregate
+   ->  Nested Loop
+         Join Filter: (s.u = s1.u)
+         ->  Nested Loop
+               ->  Nested Loop Anti Join
+                     ->  Seq Scan on s
+                     ->  Bitmap Heap Scan on l l_1
+                           Recheck Cond: ((s.n = n) OR (n IS NULL))
+                           ->  BitmapOr
+                                 ->  Bitmap Index Scan on l_n
+                                       Index Cond: (n = s.n)
+                                 ->  Bitmap Index Scan on l_n
+                                       Index Cond: (n IS NULL)
+               ->  Index Only Scan using l_n on l
+                     Index Cond: (n = s.n)
+         ->  Seq Scan on s1
+(16 rows)
+
+select count(*) from s right join l on s.n = l.n join s1 on s.u = s1.u where s.n not in (select n from l);
+ count 
+-------
+     0
+(1 row)
+
+explain (costs false) select count(*) from s join l on s.n = l.n right join s1 on s.u = s1.u where s.n not in (select n from l);
+                          QUERY PLAN                          
+--------------------------------------------------------------
+ Aggregate
+   InitPlan 1 (returns $0)
+     ->  Seq Scan on l l_2
+   ->  Nested Loop Anti Join
+         ->  Nested Loop Left Join
+               Join Filter: (s.u = s1.u)
+               Filter: ((s.n IS NOT NULL) OR (NOT $0))
+               ->  Seq Scan on s1
+               ->  Materialize
+                     ->  Nested Loop
+                           ->  Seq Scan on s
+                           ->  Index Only Scan using l_n on l
+                                 Index Cond: (n = s.n)
+         ->  Bitmap Heap Scan on l l_1
+               Recheck Cond: ((s.n = n) OR (n IS NULL))
+               ->  BitmapOr
+                     ->  Bitmap Index Scan on l_n
+                           Index Cond: (n = s.n)
+                     ->  Bitmap Index Scan on l_n
+                           Index Cond: (n IS NULL)
+(20 rows)
+
+select count(*) from s join l on s.n = l.n right join s1 on s.u = s1.u where s.n not in (select n from l);
+ count 
+-------
+     0
+(1 row)
+
+--test inner has strict predicate or inner join
+explain (costs false) select * from s where u not in (select n from l where n > 0);
+                 QUERY PLAN                  
+---------------------------------------------
+ Nested Loop Anti Join
+   ->  Seq Scan on s
+   ->  Index Only Scan using l_n on l
+         Index Cond: ((n = s.u) AND (n > 0))
+(4 rows)
+
+select * from s where u not in (select n from l where n > 0);
+    u    |    n    |   nn    |   p    
+---------+---------+---------+--------
+ 1000002 | 1000002 | 1000002 | foofoo
+(1 row)
+
+explain (costs false) select * from s where u not in (select n from l where u > 0);
+                    QUERY PLAN                    
+--------------------------------------------------
+ Nested Loop Anti Join
+   ->  Seq Scan on s
+   ->  Bitmap Heap Scan on l
+         Recheck Cond: ((s.u = n) OR (n IS NULL))
+         Filter: (u > 0)
+         ->  BitmapOr
+               ->  Bitmap Index Scan on l_n
+                     Index Cond: (n = s.u)
+               ->  Bitmap Index Scan on l_n
+                     Index Cond: (n IS NULL)
+(10 rows)
+
+select * from s where u not in (select n from l where u > 0);
+ u | n | nn | p 
+---+---+----+---
+(0 rows)
+
+explain (costs false) select * from s where u not in (select n from l where n is not null);
+                     QUERY PLAN                      
+-----------------------------------------------------
+ Nested Loop Anti Join
+   ->  Seq Scan on s
+   ->  Index Only Scan using l_n on l
+         Index Cond: ((n = s.u) AND (n IS NOT NULL))
+(4 rows)
+
+select * from s where u not in (select n from l where n is not null);
+    u    |    n    |   nn    |   p    
+---------+---------+---------+--------
+ 1000002 | 1000002 | 1000002 | foofoo
+(1 row)
+
+explain (costs false) select * from s where u not in (select l.n from l join s on l.n=s.n);
+                  QUERY PLAN                  
+----------------------------------------------
+ Seq Scan on s
+   Filter: (NOT (hashed SubPlan 1))
+   SubPlan 1
+     ->  Nested Loop
+           ->  Seq Scan on s s_1
+           ->  Index Only Scan using l_n on l
+                 Index Cond: (n = s_1.n)
+(7 rows)
+
+select * from s where u not in (select l.n from l join s on l.n=s.n);
+    u    |    n    |   nn    |   p    
+---------+---------+---------+--------
+ 1000002 | 1000002 | 1000002 | foofoo
+       3 |         |       3 | foo
+(2 rows)
+
+explain (costs false) select * from s where u not in (select l.n from l join s on l.u=s.u);
+               QUERY PLAN                
+-----------------------------------------
+ Seq Scan on s
+   Filter: (NOT (hashed SubPlan 1))
+   SubPlan 1
+     ->  Nested Loop
+           ->  Seq Scan on s s_1
+           ->  Index Scan using l_u on l
+                 Index Cond: (u = s_1.u)
+(7 rows)
+
+select * from s where u not in (select l.n from l join s on l.u=s.u);
+    u    |    n    |   nn    |   p    
+---------+---------+---------+--------
+ 1000002 | 1000002 | 1000002 | foofoo
+(1 row)
+
+explain (costs false) select * from s where u not in (select l.n from l join s on l.n = s.n);
+                  QUERY PLAN                  
+----------------------------------------------
+ Seq Scan on s
+   Filter: (NOT (hashed SubPlan 1))
+   SubPlan 1
+     ->  Nested Loop
+           ->  Seq Scan on s s_1
+           ->  Index Only Scan using l_n on l
+                 Index Cond: (n = s_1.n)
+(7 rows)
+
+select * from s where u not in (select l.n from l join s on l.n = s.n);
+    u    |    n    |   nn    |   p    
+---------+---------+---------+--------
+ 1000002 | 1000002 | 1000002 | foofoo
+       3 |         |       3 | foo
+(2 rows)
+
+explain (costs false) select * from s where u not in (select l.n from l right join s on l.n = s.n);
+                  QUERY PLAN                  
+----------------------------------------------
+ Seq Scan on s
+   Filter: (NOT (hashed SubPlan 1))
+   SubPlan 1
+     ->  Nested Loop Left Join
+           ->  Seq Scan on s s_1
+           ->  Index Only Scan using l_n on l
+                 Index Cond: (n = s_1.n)
+(7 rows)
+
+select * from s where u not in (select l.n from l right join s on l.n = s.n);
+ u | n | nn | p 
+---+---+----+---
+(0 rows)
+
+explain (costs false) select * from s where u not in (select l.n from l right join s on l.n=s.n join s1 on l.n=s1.n);
+                  QUERY PLAN                  
+----------------------------------------------
+ Seq Scan on s
+   Filter: (NOT (hashed SubPlan 1))
+   SubPlan 1
+     ->  Nested Loop
+           ->  Nested Loop
+                 Join Filter: (s_1.n = s1.n)
+                 ->  Seq Scan on s1
+                 ->  Materialize
+                       ->  Seq Scan on s s_1
+           ->  Index Only Scan using l_n on l
+                 Index Cond: (n = s_1.n)
+(11 rows)
+
+select * from s where u not in (select l.n from l right join s on l.n=s.n join s1 on l.n=s1.n);
+    u    |    n    |   nn    |   p    
+---------+---------+---------+--------
+ 1000002 | 1000002 | 1000002 | foofoo
+       3 |         |       3 | foo
+(2 rows)
+
+explain (costs false) select * from s where u not in (select l.n from l join s on l.n=s.n right join s1 on l.n=s1.n);
+                        QUERY PLAN                        
+----------------------------------------------------------
+ Seq Scan on s
+   Filter: (NOT (hashed SubPlan 1))
+   SubPlan 1
+     ->  Nested Loop Left Join
+           Join Filter: (l.n = s1.n)
+           ->  Seq Scan on s1
+           ->  Materialize
+                 ->  Nested Loop
+                       ->  Seq Scan on s s_1
+                       ->  Index Only Scan using l_n on l
+                             Index Cond: (n = s_1.n)
+(11 rows)
+
+select * from s where u not in (select l.n from l join s on l.n=s.n right join s1 on l.n=s1.n);
+ u | n | nn | p 
+---+---+----+---
+(0 rows)
+
+--test both sides have strict predicate or inner join
+explain (costs false) select * from s where n not in (select n from l where n > 0) and n > 0;
+                 QUERY PLAN                  
+---------------------------------------------
+ Nested Loop Anti Join
+   ->  Seq Scan on s
+         Filter: (n > 0)
+   ->  Index Only Scan using l_n on l
+         Index Cond: ((n = s.n) AND (n > 0))
+(5 rows)
+
+select * from s where n not in (select n from l where n > 0) and n > 0;
+    u    |    n    |   nn    |   p    
+---------+---------+---------+--------
+ 1000002 | 1000002 | 1000002 | foofoo
+(1 row)
+
+explain (costs false) select * from s where n not in (select n from l where u > 0) and n > 0;
+                    QUERY PLAN                    
+--------------------------------------------------
+ Nested Loop Anti Join
+   ->  Seq Scan on s
+         Filter: (n > 0)
+   ->  Bitmap Heap Scan on l
+         Recheck Cond: ((s.n = n) OR (n IS NULL))
+         Filter: (u > 0)
+         ->  BitmapOr
+               ->  Bitmap Index Scan on l_n
+                     Index Cond: (n = s.n)
+               ->  Bitmap Index Scan on l_n
+                     Index Cond: (n IS NULL)
+(11 rows)
+
+select * from s where n not in (select n from l where u > 0) and n > 0;
+ u | n | nn | p 
+---+---+----+---
+(0 rows)
+
+explain (costs false) select * from s where n not in (select n from l where n > 0) and u > 0;
+                         QUERY PLAN                          
+-------------------------------------------------------------
+ Nested Loop Anti Join
+   InitPlan 1 (returns $0)
+     ->  Seq Scan on l l_1
+           Filter: (n > 0)
+   ->  Seq Scan on s
+         Filter: (((n IS NOT NULL) OR (NOT $0)) AND (u > 0))
+   ->  Index Only Scan using l_n on l
+         Index Cond: ((n = s.n) AND (n > 0))
+(8 rows)
+
+select * from s where n not in (select n from l where n > 0) and u > 0;
+    u    |    n    |   nn    |   p    
+---------+---------+---------+--------
+ 1000002 | 1000002 | 1000002 | foofoo
+(1 row)
+
+explain (costs false) select * from s right join l on s.n = l.n join s1 on s.u = s1.u where s.n not in (select l.n from l right join s on l.n=s.n join s s1 on l.n=s1.n);
+                             QUERY PLAN                             
+--------------------------------------------------------------------
+ Nested Loop
+   Join Filter: (s.u = s1.u)
+   ->  Seq Scan on s1
+   ->  Materialize
+         ->  Nested Loop
+               ->  Seq Scan on s
+                     Filter: (NOT (hashed SubPlan 1))
+                     SubPlan 1
+                       ->  Nested Loop
+                             ->  Nested Loop
+                                   Join Filter: (s_1.n = s1_1.n)
+                                   ->  Seq Scan on s s_1
+                                   ->  Materialize
+                                         ->  Seq Scan on s s1_1
+                             ->  Index Only Scan using l_n on l l_1
+                                   Index Cond: (n = s_1.n)
+               ->  Index Scan using l_n on l
+                     Index Cond: (n = s.n)
+(18 rows)
+
+select * from s right join l on s.n = l.n join s1 on s.u = s1.u where s.n not in (select l.n from l right join s on l.n=s.n join s s1 on l.n=s1.n);
+ u | n | nn | p | u | n | nn | p | u | n | n1 | nn | p 
+---+---+----+---+---+---+----+---+---+---+----+----+---
+(0 rows)
+
+explain (costs false) select * from s right join l on s.n = l.n join s1 on s.u = s1.u where s.n not in (select l.n from l join s on l.n=s.n right join s s1 on l.n=s1.n);
+                                   QUERY PLAN                                   
+--------------------------------------------------------------------------------
+ Nested Loop
+   Join Filter: (s.u = s1.u)
+   ->  Seq Scan on s1
+   ->  Materialize
+         ->  Nested Loop
+               ->  Seq Scan on s
+                     Filter: (NOT (hashed SubPlan 1))
+                     SubPlan 1
+                       ->  Nested Loop Left Join
+                             Join Filter: (l_1.n = s1_1.n)
+                             ->  Seq Scan on s s1_1
+                             ->  Materialize
+                                   ->  Nested Loop
+                                         ->  Seq Scan on s s_1
+                                         ->  Index Only Scan using l_n on l l_1
+                                               Index Cond: (n = s_1.n)
+               ->  Index Scan using l_n on l
+                     Index Cond: (n = s.n)
+(18 rows)
+
+select * from s right join l on s.n = l.n join s1 on s.u = s1.u where s.n not in (select l.n from l join s on l.n=s.n right join s s1 on l.n=s1.n);
+ u | n | nn | p | u | n | nn | p | u | n | n1 | nn | p 
+---+---+----+---+---+---+----+---+---+---+----+----+---
+(0 rows)
+
+explain (costs false) select * from s join l on s.n = l.n right join s1 on s.u = s1.u where s.n not in (select l.n from l join s on l.n=s.n right join s s1 on l.n=s1.n);
+                          QUERY PLAN                          
+--------------------------------------------------------------
+ Nested Loop Left Join
+   Join Filter: (s.u = s1.u)
+   Filter: (NOT (hashed SubPlan 1))
+   ->  Seq Scan on s1
+   ->  Materialize
+         ->  Nested Loop
+               ->  Seq Scan on s
+               ->  Index Scan using l_n on l
+                     Index Cond: (n = s.n)
+   SubPlan 1
+     ->  Nested Loop Left Join
+           Join Filter: (l_1.n = s1_1.n)
+           ->  Seq Scan on s s1_1
+           ->  Materialize
+                 ->  Nested Loop
+                       ->  Seq Scan on s s_1
+                       ->  Index Only Scan using l_n on l l_1
+                             Index Cond: (n = s_1.n)
+(18 rows)
+
+select * from s join l on s.n = l.n right join s1 on s.u = s1.u where s.n not in (select l.n from l join s on l.n=s.n right join s s1 on l.n=s1.n);
+ u | n | nn | p | u | n | nn | p | u | n | n1 | nn | p 
+---+---+----+---+---+---+----+---+---+---+----+----+---
+(0 rows)
+
+--JIRA-7279 CTE with NOT IN
+create table public.testing
+(
+a integer,
+b integer,
+c integer
+);
+explain (costs false) with
+selected(a,b,c) as (values(1,2,3)),
+updated(d,e,f) as (values(4,5,6))
+insert into public.testing
+select * from selected
+where (a,b,c) not in (select d,e,f from updated);
+                    QUERY PLAN                     
+---------------------------------------------------
+ Insert on testing
+   ->  Result
+         One-Time Filter: (NOT (hashed SubPlan 1))
+         SubPlan 1
+           ->  Result
+(5 rows)
+
+with
+selected(a,b,c) as (values(1,2,3)),
+updated(d,e,f) as (values(4,5,6))
+insert into public.testing
+select * from selected
+where (a,b,c) not in (select d,e,f from updated);
+select * from public.testing;
+ a | b | c 
+---+---+---
+ 1 | 2 | 3
+(1 row)
+
+explain (costs false) with
+selected(a,b,c) as (select u, n, nn from s),
+updated(d,e,f) as (select u, n, nn from l)
+insert into public.testing
+select * from selected
+where (a,b,c) not in (select d,e,f from updated);
+                                     QUERY PLAN                                      
+-------------------------------------------------------------------------------------
+ Insert on testing
+   ->  Nested Loop Anti Join
+         Join Filter: (((s.u = l.u) AND (s.n = l.n) AND (s.nn = l.nn)) IS NOT FALSE)
+         ->  Seq Scan on s
+         ->  Materialize
+               ->  Seq Scan on l
+(6 rows)
+
+with
+selected(a,b,c) as (select u, n, nn from s),
+updated(d,e,f) as (select u, n, nn from l)
+insert into public.testing
+select * from selected
+where (a,b,c) not in (select d,e,f from updated);
+select * from public.testing;
+    a    |    b    |    c    
+---------+---------+---------
+       1 |       2 |       3
+ 1000002 | 1000002 | 1000002
+(2 rows)
+
+-- expect to get Hash Anti Join
+explain (costs false) with
+selected(a,b,c) as (select u, n, nn from s),
+updated(d,e,f) as (select u, n, nn from l)
+insert into public.testing
+select * from selected
+where a not in (select d from updated);
+                     QUERY PLAN                      
+-----------------------------------------------------
+ Insert on testing
+   InitPlan 1 (returns $0)
+     ->  Seq Scan on l l_1
+   ->  Nested Loop Anti Join
+         ->  Seq Scan on s
+               Filter: ((u IS NOT NULL) OR (NOT $0))
+         ->  Index Only Scan using l_u on l
+               Index Cond: (u = s.u)
+(8 rows)
+
+with
+selected(a,b,c) as (select u, n, nn from s),
+updated(d,e,f) as (select u, n, nn from l)
+insert into public.testing
+select * from selected
+where a not in (select d from updated);
+select * from public.testing;
+    a    |    b    |    c    
+---------+---------+---------
+       1 |       2 |       3
+ 1000002 | 1000002 | 1000002
+ 1000002 | 1000002 | 1000002
+(3 rows)
+
+explain (costs false) with
+selected(a,b,c) as (select u, n, nn from s),
+updated(d,e,f) as (select u, n, nn from l)
+insert into public.testing
+select * from selected
+where b not in (select e from updated);
+                       QUERY PLAN                       
+--------------------------------------------------------
+ Insert on testing
+   InitPlan 1 (returns $0)
+     ->  Seq Scan on l l_1
+   ->  Nested Loop Anti Join
+         ->  Seq Scan on s
+               Filter: ((n IS NOT NULL) OR (NOT $0))
+         ->  Bitmap Heap Scan on l
+               Recheck Cond: ((s.n = n) OR (n IS NULL))
+               ->  BitmapOr
+                     ->  Bitmap Index Scan on l_n
+                           Index Cond: (n = s.n)
+                     ->  Bitmap Index Scan on l_n
+                           Index Cond: (n IS NULL)
+(13 rows)
+
+with
+selected(a,b,c) as (select u, n, nn from s),
+updated(d,e,f) as (select u, n, nn from l)
+insert into public.testing
+select * from selected
+where b not in (select e from updated);
+select * from public.testing;
+    a    |    b    |    c    
+---------+---------+---------
+       1 |       2 |       3
+ 1000002 | 1000002 | 1000002
+ 1000002 | 1000002 | 1000002
+(3 rows)
+
+explain (costs false) with
+selected(a,b,c) as (select u, n, nn from s),
+updated(d,e,f) as (select u, n, nn from l)
+insert into public.testing
+select * from selected
+where b not in (select d from updated);
+                     QUERY PLAN                      
+-----------------------------------------------------
+ Insert on testing
+   InitPlan 1 (returns $0)
+     ->  Seq Scan on l l_1
+   ->  Nested Loop Anti Join
+         ->  Seq Scan on s
+               Filter: ((n IS NOT NULL) OR (NOT $0))
+         ->  Index Only Scan using l_u on l
+               Index Cond: (u = s.n)
+(8 rows)
+
+with
+selected(a,b,c) as (select u, n, nn from s),
+updated(d,e,f) as (select u, n, nn from l)
+insert into public.testing
+select * from selected
+where b not in (select d from updated);
+select * from public.testing;
+    a    |    b    |    c    
+---------+---------+---------
+       1 |       2 |       3
+ 1000002 | 1000002 | 1000002
+ 1000002 | 1000002 | 1000002
+ 1000002 | 1000002 | 1000002
+(4 rows)
+
+-- two levels of NOT IN with CTE, 2nd NOT IN
+-- subquery access CTE two levels above
+explain (costs false) with
+selected(a,b,c) as (select u, n, nn from s),
+updated(d,e,f) as (select u, n, nn from l)
+insert into public.testing
+select * from selected
+where (a,b,c) not in (select d,e,f from updated
+      where d not in (select a from selected));
+                                               QUERY PLAN                                                
+---------------------------------------------------------------------------------------------------------
+ Insert on testing
+   CTE selected
+     ->  Seq Scan on s
+   ->  Nested Loop Anti Join
+         Join Filter: (((selected.a = l.u) AND (selected.b = l.n) AND (selected.c = l.nn)) IS NOT FALSE)
+         ->  CTE Scan on selected
+         ->  Materialize
+               ->  Seq Scan on l
+                     Filter: (NOT (hashed SubPlan 3))
+                     SubPlan 3
+                       ->  CTE Scan on selected selected_1
+(11 rows)
+
+with
+selected(a,b,c) as (select u, n, nn from s),
+updated(d,e,f) as (select u, n, nn from l)
+insert into public.testing
+select * from selected
+where (a,b,c) not in (select d,e,f from updated
+      where d not in (select a from selected));
+select * from public.testing;
+    a    |    b    |    c    
+---------+---------+---------
+       1 |       2 |       3
+ 1000002 | 1000002 | 1000002
+ 1000002 | 1000002 | 1000002
+ 1000002 | 1000002 | 1000002
+       1 |       1 |       1
+       2 |       2 |       2
+ 1000002 | 1000002 | 1000002
+       3 |         |       3
+(8 rows)
+
+-- With clause inside a query block
+explain select count(distinct t.a) from
+(with
+selected(a,b,c) as (select u, n, nn from s),
+updated(d,e,f) as (select u, n, nn from l)
+select * from selected where (a,b,c) not in
+(select d,e,f from updated
+ where d not in (select a from selected))) as t;
+                                               QUERY PLAN                                                
+---------------------------------------------------------------------------------------------------------
+ Aggregate  (cost=693.75..693.76 rows=1 width=8)
+   ->  Nested Loop Anti Join  (cost=1.13..693.71 rows=3 width=12)
+         Join Filter: (((selected.a = l.u) AND (selected.b = l.n) AND (selected.c = l.nn)) IS NOT FALSE)
+         CTE selected
+           ->  Seq Scan on s  (cost=0.00..1.04 rows=4 width=12)
+         ->  CTE Scan on selected  (cost=0.00..0.08 rows=4 width=12)
+         ->  Materialize  (cost=0.09..230.09 rows=5000 width=12)
+               ->  Seq Scan on l  (cost=0.09..180.09 rows=5000 width=12)
+                     Filter: (NOT (hashed SubPlan 3))
+                     SubPlan 3
+                       ->  CTE Scan on selected selected_1  (cost=0.00..0.08 rows=4 width=4)
+(11 rows)
+
+select count(distinct t.a) from
+(with
+selected(a,b,c) as (select u, n, nn from s),
+updated(d,e,f) as (select u, n, nn from l)
+select * from selected where (a,b,c) not in
+(select d,e,f from updated
+ where d not in (select a from selected))) as t;
+ count 
+-------
+     4
+(1 row)
+
+-- With clause in subquery, can't flatten subquery to anti join
+explain (costs false) with
+selected(a,b,c) as (select u, n, nn from s)
+insert into public.testing
+select * from selected where (a,b,c) not in
+(with updated (d,e,f) as (select u, n, nn from l)
+select d,e,f from updated);
+            QUERY PLAN             
+-----------------------------------
+ Insert on testing
+   ->  Seq Scan on s
+         Filter: (NOT (SubPlan 1))
+         SubPlan 1
+           ->  Materialize
+                 ->  Seq Scan on l
+(6 rows)
+
+with
+selected(a,b,c) as (select u, n, nn from s)
+insert into public.testing
+select * from selected where (a,b,c) not in
+(with updated (d,e,f) as (select u, n, nn from l)
+select d,e,f from updated);
+select * from public.testing;
+    a    |    b    |    c    
+---------+---------+---------
+       1 |       2 |       3
+ 1000002 | 1000002 | 1000002
+ 1000002 | 1000002 | 1000002
+ 1000002 | 1000002 | 1000002
+       1 |       1 |       1
+       2 |       2 |       2
+ 1000002 | 1000002 | 1000002
+       3 |         |       3
+ 1000002 | 1000002 | 1000002
+(9 rows)
+
+-- With clause in subquery, subsubquery access CTE in subquery
+explain (costs false) with
+selected(a,b,c) as (select u, n, nn from s)
+insert into public.testing
+select * from selected where (a,b,c) not in
+(
+with updated(d,e,f) as (select u, n, nn from l)
+select d,e,f from updated where d not in (select d from updated)
+);
+                            QUERY PLAN                             
+-------------------------------------------------------------------
+ Insert on testing
+   ->  Seq Scan on s
+         Filter: (NOT (SubPlan 3))
+         SubPlan 3
+           ->  Materialize
+                 CTE updated
+                   ->  Seq Scan on l
+                 InitPlan 2 (returns $1)
+                   ->  CTE Scan on updated
+                 ->  Hash Anti Join
+                       Hash Cond: (updated_1.d = updated_2.d)
+                       ->  CTE Scan on updated updated_1
+                             Filter: ((d IS NOT NULL) OR (NOT $1))
+                       ->  Hash
+                             ->  CTE Scan on updated updated_2
+(15 rows)
+
+with
+selected(a,b,c) as (select u, n, nn from s)
+insert into public.testing
+select * from selected where (a,b,c) not in
+(
+with updated(d,e,f) as (select u, n, nn from l)
+select d,e,f from updated where d not in (select d from updated)
+);
+select * from public.testing;
+    a    |    b    |    c    
+---------+---------+---------
+       1 |       2 |       3
+ 1000002 | 1000002 | 1000002
+ 1000002 | 1000002 | 1000002
+ 1000002 | 1000002 | 1000002
+       1 |       1 |       1
+       2 |       2 |       2
+ 1000002 | 1000002 | 1000002
+       3 |         |       3
+ 1000002 | 1000002 | 1000002
+       1 |       1 |       1
+       2 |       2 |       2
+ 1000002 | 1000002 | 1000002
+       3 |         |       3
+(13 rows)
+
+-- Recursive CTE
+CREATE TABLE employees (
+  id serial,
+  name varchar(255),
+  manager_id int
+);
+INSERT INTO employees VALUES (1, 'Mark', null);
+INSERT INTO employees VALUES (2, 'John', 1);
+INSERT INTO employees VALUES (3, 'Dan', 2);
+INSERT INTO employees VALUES (4, 'Clark', 1);
+INSERT INTO employees VALUES (5, 'Linda', 2);
+INSERT INTO employees VALUES (6, 'Willy', 2);
+INSERT INTO employees VALUES (7, 'Barack', 2);
+INSERT INTO employees VALUES (8, 'Elen', 2);
+INSERT INTO employees VALUES (9, 'Kate', 3);
+INSERT INTO employees VALUES (10, 'Terry', 4);
+WITH RECURSIVE managertree AS (
+  SELECT id, name, manager_id
+  FROM employees
+  WHERE id = 2
+  UNION ALL
+  SELECT e.id, e.name, e.manager_id
+  FROM employees e
+  INNER JOIN managertree mtree ON mtree.id = e.manager_id
+)
+SELECT *
+FROM managertree;
+ id |  name  | manager_id 
+----+--------+------------
+  2 | John   |          1
+  3 | Dan    |          2
+  5 | Linda  |          2
+  6 | Willy  |          2
+  7 | Barack |          2
+  8 | Elen   |          2
+  9 | Kate   |          3
+(7 rows)
+
+-- NOT IN subquery access Recursive CTE
+EXPLAIN (COSTS FALSE) WITH RECURSIVE managertree AS (
+  SELECT id, name, manager_id
+  FROM employees
+  WHERE id = 2
+  UNION ALL
+  SELECT e.id, e.name, e.manager_id
+  FROM employees e
+  INNER JOIN managertree mtree ON mtree.id = e.manager_id
+)
+SELECT *
+FROM managertree mt WHERE mt.manager_id NOT IN (SELECT id FROM managertree);
+                          QUERY PLAN                           
+---------------------------------------------------------------
+ CTE Scan on managertree mt
+   Filter: (NOT (hashed SubPlan 2))
+   CTE managertree
+     ->  Recursive Union
+           ->  Seq Scan on employees
+                 Filter: (id = 2)
+           ->  Hash Join
+                 Hash Cond: (e.manager_id = mtree.id)
+                 ->  Seq Scan on employees e
+                 ->  Hash
+                       ->  WorkTable Scan on managertree mtree
+   SubPlan 2
+     ->  CTE Scan on managertree
+(13 rows)
+
+WITH RECURSIVE managertree AS (
+  SELECT id, name, manager_id
+  FROM employees
+  WHERE id = 2
+  UNION ALL
+  SELECT e.id, e.name, e.manager_id
+  FROM employees e
+  INNER JOIN managertree mtree ON mtree.id = e.manager_id
+)
+SELECT *
+FROM managertree mt WHERE mt.manager_id NOT IN (SELECT id FROM managertree);
+ id | name | manager_id 
+----+------+------------
+  2 | John |          1
+(1 row)
+
+-- NOT IN under UNION ALL inside Recursive CTE
+EXPLAIN (COSTS FALSE) WITH RECURSIVE managertree AS (
+  SELECT id, name, manager_id
+  FROM employees
+  WHERE id = 2
+  UNION ALL
+  SELECT e.id, e.name, e.manager_id
+  FROM employees e
+  INNER JOIN managertree mtree ON
+  (mtree.id = e.manager_id AND
+  mtree.manager_id NOT IN (SELECT manager_id FROM employees)
+  )
+)
+SELECT *
+FROM managertree;
+                          QUERY PLAN                           
+---------------------------------------------------------------
+ CTE Scan on managertree
+   CTE managertree
+     ->  Recursive Union
+           ->  Seq Scan on employees employees_1
+                 Filter: (id = 2)
+           ->  Hash Join
+                 Hash Cond: (e.manager_id = mtree.id)
+                 ->  Seq Scan on employees e
+                 ->  Hash
+                       ->  WorkTable Scan on managertree mtree
+                             Filter: (NOT (hashed SubPlan 1))
+                             SubPlan 1
+                               ->  Seq Scan on employees
+(13 rows)
+
+WITH RECURSIVE managertree AS (
+  SELECT id, name, manager_id
+  FROM employees
+  WHERE id = 2
+  UNION ALL
+  SELECT e.id, e.name, e.manager_id
+  FROM employees e
+  INNER JOIN managertree mtree ON
+  (mtree.id = e.manager_id AND
+  mtree.manager_id NOT IN (SELECT manager_id FROM employees)
+  )
+)
+SELECT *
+FROM managertree;
+ id | name | manager_id 
+----+------+------------
+  2 | John |          1
+(1 row)
+
+-- clean up
+reset work_mem;
+drop table s;
+drop table s1;
+drop table l;
+drop table empty;
+drop table public.testing;
+drop table employees;
diff --git a/src/test/regress/sql/subselect.sql b/src/test/regress/sql/subselect.sql
index 99ca69791e..8941c20b80 100644
--- a/src/test/regress/sql/subselect.sql
+++ b/src/test/regress/sql/subselect.sql
@@ -744,3 +744,759 @@ select * from (with x as (select 2 as y) select * from x) ss;
 explain (verbose, costs off)
 with x as (select * from subselect_tbl)
 select * from x for update;
+
+-- test NON IN to ANTI JOIN conversion
+CREATE TABLE s (u INTEGER NOT NULL, n INTEGER NULL, nn INTEGER NOT NULL, p VARCHAR(128) NULL);
+insert into s (u, n, nn, p)
+    select
+    generate_series(1,3) as u,
+	generate_series(1,3) as n,
+	generate_series(1,3) as nn,
+	'foo' as p;
+insert into s values(1000002, 1000002, 1000002, 'foofoo');
+UPDATE s set n = NULL WHERE n = 3;
+analyze s;
+
+CREATE TABLE l (u INTEGER NOT NULL, n INTEGER NULL, nn INTEGER NOT NULL, p VARCHAR(128) NULL);
+insert into l (u, n, nn, p)
+	select
+    generate_series(1,10000 ) as u,
+	generate_series(1,10000 ) as n,
+	generate_series(1,10000 ) as nn,
+	'bar' as p;
+UPDATE l set n = NULL WHERE n = 7;
+
+CREATE UNIQUE INDEX l_u ON l (u);
+CREATE INDEX l_n ON l (n);
+CREATE INDEX l_nn ON l (nn);
+analyze l;
+
+CREATE TABLE s1 (u INTEGER NOT NULL, n INTEGER NULL, n1 INTEGER NULL, nn INTEGER NOT NULL, p VARCHAR(128) NULL);
+insert into s1 (u, n, n1, nn, p)
+    select
+    generate_series(1,3) as u,
+	generate_series(1,3) as n,
+	generate_series(1,3) as n1,
+	generate_series(1,3) as nn,
+	'foo' as p;
+insert into s1 values(1000003, 1000003, 1000003, 1000003, 'foofoo');
+insert into s1 values(1003, 1003, 1003, 1003, 'foofoo');
+UPDATE s1 set n = NULL WHERE n = 3;
+UPDATE s1 set n1 = NULL WHERE n = 2;
+UPDATE s1 set n1 = NULL WHERE n1 = 3;
+analyze s1;
+
+CREATE TABLE empty (u INTEGER NOT NULL, n INTEGER NULL, nn INTEGER NOT NULL, p VARCHAR(128) NULL);
+analyze empty;
+
+-- set work_mem to 64KB so that NOT IN to ANTI JOIN optimization will kick in
+set work_mem = 64;
+
+-- correctness test 1: inner empty, return every thing from outer including NULL
+explain (costs false) select * from s where n not in (select n from empty);
+
+select * from s where n not in (select n from empty);
+
+-- correctness test 2: inner has NULL, return empty result
+explain (costs false) select * from s where n not in (select n from l);
+
+select * from s where n not in (select n from l);
+
+-- correctness test 3: inner non-null, result has no NULL
+explain (costs false) select * from s where n not in (select u from l);
+
+select * from s where n not in (select u from l);
+
+-- correctness test 4: inner has predicate
+explain (costs false) select * from s where n not in (select n from l where u > 7);
+
+select * from s where n not in (select n from l where u > 7);
+
+-- correctness test 5: multi-expression, (2, 2, null, 2, foo) should be in the result
+explain (costs false) select * from s1 where (n,n1) not in (select u,nn from l where u >= 3);
+
+select * from s1 where (n,n1) not in (select u,nn from l where u >= 3);
+
+-- correctness test 6: multi-expression, (3, null, null, 3, foo) should not be in the result
+explain (costs false) select * from s1 where (n,n1) not in (select u,nn from l where u > 0);
+
+select * from s1 where (n,n1) not in (select u,nn from l where u > 0);
+
+-- correctness test 6: multi-expression, (3, null, null, 3, foo) should be in the result
+explain (costs false) select * from s1 where (n,n1) not in (select u,nn from l where u < 0);
+
+select * from s1 where (n,n1) not in (select u,nn from l where u < 0);
+
+-- test using hashed subplan when inner fits in work_mem
+explain (costs false) select * from l where n not in (select n from s);
+
+select * from l where n not in (select n from s);
+
+-- test single expression
+explain (costs false) select * from s where n not in (select n from l);
+
+select * from s where n not in (select n from l);
+
+explain (costs false) select * from s where u not in (select u from l);
+
+select * from s where u not in (select u from l);
+
+explain (costs false) select * from s where 3*n not in (select n from l);
+
+select * from s where 3*n not in (select n from l);
+
+explain (costs false) select * from s where n not in (select 3*n from l);
+
+select * from s where n not in (select 3*n from l);
+
+-- test single expression with predicates
+explain (costs false) select * from s where n not in (select n from l where u > 0);
+
+select * from s where n not in (select n from l where u > 0);
+
+explain (costs false) select * from s where n not in (select n from l where u > 100);
+
+select * from s where n not in (select n from l where u > 100);
+
+-- test multi expression
+explain (costs false) select * from s where (n,u) not in (select n,u from l);
+
+select * from s where (n,u) not in (select n,u from l);
+
+explain (costs false) select * from s where (u, nn) not in (select u, nn from l);
+
+select * from s where (u, nn) not in (select u, nn from l);
+
+explain (costs false) select * from s where (n,u) not in (select u,n from l);
+
+select * from s where (n,u) not in (select u,n from l);
+
+explain (costs false) select * from s where (n,u,nn) not in (select u,n,nn from l);
+
+select * from s where (n,u,nn) not in (select u,n,nn from l);
+
+explain (costs false) select * from s where (n,u,nn) not in (select u,n,nn from l where u > 1000);
+
+select * from s where (n,u,nn) not in (select u,n,nn from l where u > 1000);
+
+explain (costs false) select * from s where (n,u,nn) not in (select u,n,nn from l where u > 0);
+
+select * from s where (n,u,nn) not in (select u,n,nn from l where u > 0);
+
+explain (costs false) select * from s where (n,u,nn) not in (select u,n,nn from l where u > 1);
+
+select * from s where (n,u,nn) not in (select u,n,nn from l where u > 1);
+
+-- test multi-table
+explain (costs false) select count(*) from s, l where s.n not in (select n from l);
+
+select count(*) from s, l where s.n not in (select n from l);
+
+explain (costs false) select count(*) from s, l where s.nn not in (select nn from l);
+
+select count(*) from s, l where s.nn not in (select nn from l);
+
+-- test null padded results from outer join
+explain (costs false) select * from s where n not in (select s.nn from l left join s on l.nn = s.nn);
+
+select * from s where n not in (select s.nn from l left join s on l.nn = s.nn);
+
+explain (costs false) select * from s where n not in (select s.nn from s right join l on s.nn = l.nn);
+
+select * from s where n not in (select s.nn from s right join l on s.nn = l.nn);
+
+explain (costs false) select count(*) from s right join l on s.nn = l.nn where l.nn not in (select nn from s);
+
+select count(*) from s right join l on s.nn = l.nn where l.nn not in (select nn from s);
+
+explain (costs false) select count(*) from s right join l on s.nn = l.nn where s.nn not in (select nn from s);
+
+select count(*) from s right join l on s.nn = l.nn where s.nn not in (select nn from s);
+
+explain (costs false) select count(*) from s right join l on s.nn=l.nn where l.nn not in (select l.nn from l left join s on l.nn = s.nn);
+
+select count(*) from s right join l on s.nn=l.nn where l.nn not in (select l.nn from l left join s on l.nn = s.nn);
+
+explain (costs false) select count(*) from s right join l on s.nn=l.nn where s.nn not in (select s.nn from l left join s on l.nn = s.nn);
+
+select count(*) from s right join l on s.nn=l.nn where s.nn not in (select s.nn from l left join s on l.nn = s.nn);
+
+explain (costs false) select count(*) from s left join s1 on s.u=s1.u join l on s.u=l.u where s.nn not in (select nn from l);
+
+select count(*) from s left join s1 on s.u=s1.u join l on s.u=l.u where s.nn not in (select nn from l);
+
+explain (costs false) select count(*) from s left join s1 on s.u=s1.u right join l on s.u=l.u where s.nn not in (select nn from l);
+
+select count(*) from s left join s1 on s.u=s1.u right join l on s.u=l.u where s.nn not in (select nn from l);
+
+explain (costs false) select count(*) from s left join s1 on s.u=s1.u left join l on s.u=l.u where s.nn not in (select nn from l);
+
+select count(*) from s left join s1 on s.u=s1.u left join l on s.u=l.u where s.nn not in (select nn from l);
+
+explain (costs false) select count(*) from s right join s1 on s.u=s1.u join l on s.u=l.u where s.nn not in (select nn from l);
+
+select count(*) from s right join s1 on s.u=s1.u join l on s.u=l.u where s.nn not in (select nn from l);
+
+explain (costs false) select count(*) from s join s1 on s.u=s1.u right join l on s.u=l.u where s.nn not in (select nn from l);
+
+select * from s join s1 on s.u=s1.u right join l on s.u=l.u where s.nn not in (select nn from l);
+
+explain (costs false) select count(*) from s full join s1 on s.u=s1.u join l on s.u=l.u where s.nn not in (select nn from l);
+
+select count(*) from s full join s1 on s.u=s1.u join l on s.u=l.u where s.nn not in (select nn from l);
+
+explain (costs false) select count(*) from s join s1 on s.u=s1.u full join l on s.u=l.u where s.nn not in (select nn from l);
+
+select count(*) from s join s1 on s.u=s1.u full join l on s.u=l.u where s.nn not in (select nn from l);
+
+explain (costs false) select * from s where s.nn not in (select l.nn from l left join s on l.nn=s.nn left join s1 on l.nn=s1.nn);
+
+select * from s where s.nn not in (select l.nn from l left join s on l.nn=s.nn left join s1 on l.nn=s1.nn);
+
+explain (costs false) select * from s where s.nn not in (select l.nn from l left join s on l.nn=s.nn right join s1 on l.nn=s1.nn);
+
+select * from s where s.nn not in (select l.nn from l left join s on l.nn=s.nn right join s1 on l.nn=s1.nn);
+
+explain (costs false) select * from s where (n,u,nn) not in (select l.n,l.u,l.nn from l left join s on l.nn = s.nn);
+
+select * from s where (n,u,nn) not in (select l.n,l.u,l.nn from l left join s on l.nn = s.nn);
+
+explain (costs false) select * from s where (n,u,nn) not in (select l.n,l.u,l.nn from l right join s on l.nn = s.nn);
+
+select * from s where (n,u,nn) not in (select l.n,l.u,l.nn from l left join s on l.nn = s.nn);
+
+--test reduce outer joins from outer query
+explain (costs false) select count(*) from s right join l on s.nn = l.nn where s.nn not in (select nn from l);
+
+select count(*) from s right join l on s.nn = l.nn where s.nn not in (select nn from l);
+
+explain (costs false) select count(*) from s right join l on s.nn = l.nn where s.nn not in (select nn from l) and s.u>0;
+
+select count(*) from s right join l on s.nn = l.nn where s.nn not in (select nn from l) and s.u>0;
+
+explain (costs false) select count(*) from s right join l on s.nn = l.nn join s1 on s.u = s1.u where s.nn not in (select nn from l);
+
+select count(*) from s right join l on s.nn = l.nn join s1 on s.u = s1.u where s.nn not in (select nn from l);
+
+explain (costs false) select count(*) from s right join l on s.nn = l.nn right join s1 on s.u = s1.u where s.nn not in (select nn from l);
+
+select count(*) from s right join l on s.nn = l.nn right join s1 on s.u = s1.u where s.nn not in (select nn from l);
+
+explain (costs false) select count(*) from s right join l on s.nn = l.nn left join s1 on s.u = s1.u where s.nn not in (select nn from l);
+
+select count(*) from s right join l on s.nn = l.nn left join s1 on s.u = s1.u where s.nn not in (select nn from l);
+
+--test reduce outer joins from subquery
+explain (costs false) select * from s where nn not in (select l.nn from l right join s on l.nn = s.nn);
+
+select * from s where nn not in (select l.nn from l right join s on l.nn = s.nn);
+
+explain (costs false) select * from s where nn not in (select l.nn from l right join s on l.nn = s.nn where l.u > 9);
+
+select * from s where nn not in (select l.nn from l right join s on l.nn = s.nn where l.u > 9);
+
+explain (costs false) select * from s where nn not in (select l.nn from l right join s on l.nn = s.nn where s.u > 9);
+
+select * from s where nn not in (select l.nn from l right join s on l.nn = s.nn where s.u > 9);
+
+explain (costs false) select * from s where nn not in (select l.nn from l right join s on l.nn = s.nn join s1 on l.n = s1.n);
+
+select * from s where nn not in (select l.nn from l right join s on l.nn = s.nn join s1 on l.n = s1.n);
+
+explain (costs false) select * from s where nn not in (select l.nn from l right join s on l.nn = s.nn right join s1 on l.n = s1.n);
+
+select * from s where nn not in (select l.nn from l right join s on l.nn = s.nn right join s1 on l.n = s1.n);
+
+explain (costs false) select * from s where nn not in (select l.nn from l right join s on l.nn = s.nn left join s1 on l.n = s1.n);
+
+select * from s where nn not in (select l.nn from l right join s on l.nn = s.nn left join s1 on l.n = s1.n);
+
+--test reduce outer join on outer and sub-query
+explain (costs false) select count(*) from s right join l on s.nn = l.nn join s1 on s.u = s1.u where s.nn not in (select l.nn from l right join s on l.nn = s.nn join s1 on l.n = s1.n);
+
+select count(*) from s right join l on s.nn = l.nn join s1 on s.u = s1.u where s.nn not in (select l.nn from l right join s on l.nn = s.nn join s1 on l.n = s1.n);
+
+explain (costs false) select count(*) from s right join l on s.nn = l.nn left join s1 on s.u = s1.u where s.nn not in (select l.nn from l right join s on l.nn = s.nn left join s1 on l.n = s1.n);
+
+select count(*) from s right join l on s.nn = l.nn left join s1 on s.u = s1.u where s.nn not in (select l.nn from l right join s on l.nn = s.nn left join s1 on l.n = s1.n);
+
+-- test union all
+explain (costs false) select * from s as t where not exists
+(select 1 from (select n as y from l union all
+				select u as y from s union all
+				select nn as y from s) as v where t.n=v.y or v.y is null) and n is not null;
+
+select * from s as t where not exists
+(select 1 from (select n as y from l union all
+				select u as y from s union all
+				select nn as y from s) as v where t.n=v.y or v.y is null) and n is not null;
+
+explain (costs false) select * from s where n not in
+(select n as y from l union all
+ select u as y from s union all
+ select nn as y from s);
+
+select * from s where n not in
+(select n as y from l union all
+ select u as y from s union all
+ select nn as y from s);
+
+explain (costs false) select count(*) from
+(select n as x from s union all select u as x from l) t where t.x not in
+(select nn from l);
+
+select count(*) from
+(select n as x from s union all select u as x from l) t where t.x not in
+(select nn from l);
+
+explain (costs false) select count(*) from
+(select n as x from s union all select n as x from l) t where t.x not in
+(select nn from empty);
+
+select count(*) from
+(select n as x from s union all select n as x from l) t where t.x not in
+(select nn from empty);
+
+explain (costs false) select count(*) from
+(select n as x from s union all select u as x from l) t where t.x not in
+(select n as y from l union all
+ select u as y from s union all
+ select nn as y from s);
+
+select count(*) from
+(select n as x from s union all select u as x from l) t where t.x not in
+(select n as y from l union all
+ select u as y from s union all
+ select nn as y from s);
+
+-- test multi-levels of NOT IN
+explain (costs false) select * from s where n not in (select n from s where n not in (select n from l));
+
+select * from s where n not in (select n from s where n not in (select n from l));
+
+explain (costs false) select * from s where n not in (select n from s where n not in (select u from l));
+
+select * from s where n not in (select n from s where n not in (select u from l));
+
+explain (costs false) select count(*) from s where u not in
+(select n from s1 where not exists
+ (select 1 from (select n from s1 where u not in (select n from l)) t where t.n = s.n));
+
+select count(*) from s where u not in
+(select n from s1 where not exists
+ (select 1 from (select n from s1 where u not in (select n from l)) t where t.n = s.n));
+
+explain (costs false) select * from s where n not in (select n from s1) and u not in (select u from s1) and nn not in (select nn from s1);
+
+select * from s where n not in (select n from s1) and u not in (select u from s1) and nn not in (select nn from s1);
+
+explain (costs false) select * from s where n not in (select n from s1) and u not in (select u from s1) and nn not in (select nn from l);
+
+select * from s where n not in (select n from s1) and u not in (select u from s1) and nn not in (select nn from l);
+
+explain (costs false) select count(*) from s where u not in
+(select n from s1 where not exists
+ (select 1 from (select n from s1 where u not in (select n from l)) t where t.n = s.n))
+and nn not in
+(select n from s1 where not exists
+ (select 1 from (select n from s1 where u not in (select n from l)) t where t.n = s.n));
+
+select count(*) from s where u not in
+(select n from s1 where not exists
+ (select 1 from (select n from s1 where u not in (select n from l)) t where t.n = s.n))
+and nn not in
+(select n from s1 where not exists
+ (select 1 from (select n from s1 where u not in (select n from l)) t where t.n = s.n));
+
+--test COALESCE
+explain (costs false) select * from s where COALESCE(n, -1) not in (select COALESCE(n, -1) from l);
+
+select * from s where COALESCE(n, -1) not in (select COALESCE(n, -1) from l);
+
+explain (costs false) select * from s where COALESCE(n, NULL, -1) not in (select COALESCE(n, NULL, -1) from l);
+
+select * from s where COALESCE(n, NULL, -1) not in (select COALESCE(n, NULL, -1) from l);
+
+explain (costs false) select * from s where COALESCE(n, NULL, NULL) not in (select COALESCE(n, NULL, NULL) from l);
+
+select * from s where COALESCE(n, NULL, NULL) not in (select COALESCE(n, NULL, NULL) from l);
+
+explain (costs false) select * from s where COALESCE(n, nn) not in (select COALESCE(n, nn) from l);
+
+select * from s where COALESCE(n, nn) not in (select COALESCE(n, nn) from l);
+
+explain (costs false) select * from s where COALESCE(nn, NULL) not in (select COALESCE(nn, NULL) from l);
+
+select * from s where COALESCE(nn, NULL) not in (select COALESCE(nn, NULL) from l);
+
+explain (costs false) select * from s where (COALESCE(n, -1), nn, COALESCE(n, u)) not in (select COALESCE(n, -1), nn, COALESCE(n, u) from l);
+
+select * from s where (COALESCE(n, -1), nn, COALESCE(n, u)) not in (select COALESCE(n, -1), nn, COALESCE(n, u) from l);
+
+-- test miscellaneous outer nullable cases
+
+explain (costs false) select * from s where (n,n) not in (select n,n from l);
+
+select * from s where (n,n) not in (select n,n from l);
+
+explain (costs false) select * from s right join l on s.nn = l.nn where (s.n,s.u,s.nn) not in (select n,u,nn from l);
+
+select * from s right join l on s.nn = l.nn where (s.n,s.u,s.nn) not in (select n,u,nn from l);
+
+explain (costs false) select count(*) from s right join l on s.nn = l.nn where (s.n,s.u,s.nn) not in (select n,u,nn from l where u < 0);
+
+select count(*) from s right join l on s.nn = l.nn where (s.n,s.u,s.nn) not in (select n,u,nn from l where u < 0);
+
+explain (costs false) select * from s where (n,n,n) not in (select distinct n,n,n from l where u > 0 limit 3) order by n;
+
+select * from s where (n,n,n) not in (select distinct n,n,n from l where u > 0 limit 3) order by n;
+
+--test outer has strict predicate or inner join
+explain (costs false) select * from s where n not in (select n from l) and n > 0;
+
+select * from s where n not in (select n from l) and n > 0;
+
+explain (costs false) select * from s where n not in (select n from l) and u > 0;
+
+select * from s where n not in (select n from l) and u > 0;
+
+explain (costs false) select * from s where n not in (select n from l) and n is not null;
+
+select * from s where n not in (select n from l) and n is not null;
+
+explain (costs false) select * from s join l on s.n = l.n where s.n not in (select n from l);
+
+select * from s join l on s.n = l.n where s.n not in (select n from l);
+
+explain (costs false) select count(*) from s right join l on s.n = l.n where s.n not in (select n from l);
+
+select count(*) from s right join l on s.n = l.n where s.n not in (select n from l);
+
+explain (costs false) select count(*) from s right join l on s.n = l.n join s1 on s.u = s1.u where s.n not in (select n from l);
+
+select count(*) from s right join l on s.n = l.n join s1 on s.u = s1.u where s.n not in (select n from l);
+
+explain (costs false) select count(*) from s join l on s.n = l.n right join s1 on s.u = s1.u where s.n not in (select n from l);
+
+select count(*) from s join l on s.n = l.n right join s1 on s.u = s1.u where s.n not in (select n from l);
+
+--test inner has strict predicate or inner join
+explain (costs false) select * from s where u not in (select n from l where n > 0);
+
+select * from s where u not in (select n from l where n > 0);
+
+explain (costs false) select * from s where u not in (select n from l where u > 0);
+
+select * from s where u not in (select n from l where u > 0);
+
+explain (costs false) select * from s where u not in (select n from l where n is not null);
+
+select * from s where u not in (select n from l where n is not null);
+
+explain (costs false) select * from s where u not in (select l.n from l join s on l.n=s.n);
+
+select * from s where u not in (select l.n from l join s on l.n=s.n);
+
+explain (costs false) select * from s where u not in (select l.n from l join s on l.u=s.u);
+
+select * from s where u not in (select l.n from l join s on l.u=s.u);
+
+explain (costs false) select * from s where u not in (select l.n from l join s on l.n = s.n);
+
+select * from s where u not in (select l.n from l join s on l.n = s.n);
+
+explain (costs false) select * from s where u not in (select l.n from l right join s on l.n = s.n);
+
+select * from s where u not in (select l.n from l right join s on l.n = s.n);
+
+explain (costs false) select * from s where u not in (select l.n from l right join s on l.n=s.n join s1 on l.n=s1.n);
+
+select * from s where u not in (select l.n from l right join s on l.n=s.n join s1 on l.n=s1.n);
+
+explain (costs false) select * from s where u not in (select l.n from l join s on l.n=s.n right join s1 on l.n=s1.n);
+
+select * from s where u not in (select l.n from l join s on l.n=s.n right join s1 on l.n=s1.n);
+
+--test both sides have strict predicate or inner join
+explain (costs false) select * from s where n not in (select n from l where n > 0) and n > 0;
+
+select * from s where n not in (select n from l where n > 0) and n > 0;
+
+explain (costs false) select * from s where n not in (select n from l where u > 0) and n > 0;
+
+select * from s where n not in (select n from l where u > 0) and n > 0;
+
+explain (costs false) select * from s where n not in (select n from l where n > 0) and u > 0;
+
+select * from s where n not in (select n from l where n > 0) and u > 0;
+
+explain (costs false) select * from s right join l on s.n = l.n join s1 on s.u = s1.u where s.n not in (select l.n from l right join s on l.n=s.n join s s1 on l.n=s1.n);
+
+select * from s right join l on s.n = l.n join s1 on s.u = s1.u where s.n not in (select l.n from l right join s on l.n=s.n join s s1 on l.n=s1.n);
+
+explain (costs false) select * from s right join l on s.n = l.n join s1 on s.u = s1.u where s.n not in (select l.n from l join s on l.n=s.n right join s s1 on l.n=s1.n);
+
+select * from s right join l on s.n = l.n join s1 on s.u = s1.u where s.n not in (select l.n from l join s on l.n=s.n right join s s1 on l.n=s1.n);
+
+explain (costs false) select * from s join l on s.n = l.n right join s1 on s.u = s1.u where s.n not in (select l.n from l join s on l.n=s.n right join s s1 on l.n=s1.n);
+
+select * from s join l on s.n = l.n right join s1 on s.u = s1.u where s.n not in (select l.n from l join s on l.n=s.n right join s s1 on l.n=s1.n);
+
+--JIRA-7279 CTE with NOT IN
+create table public.testing
+(
+a integer,
+b integer,
+c integer
+);
+
+explain (costs false) with
+selected(a,b,c) as (values(1,2,3)),
+updated(d,e,f) as (values(4,5,6))
+insert into public.testing
+select * from selected
+where (a,b,c) not in (select d,e,f from updated);
+
+with
+selected(a,b,c) as (values(1,2,3)),
+updated(d,e,f) as (values(4,5,6))
+insert into public.testing
+select * from selected
+where (a,b,c) not in (select d,e,f from updated);
+
+select * from public.testing;
+
+explain (costs false) with
+selected(a,b,c) as (select u, n, nn from s),
+updated(d,e,f) as (select u, n, nn from l)
+insert into public.testing
+select * from selected
+where (a,b,c) not in (select d,e,f from updated);
+
+with
+selected(a,b,c) as (select u, n, nn from s),
+updated(d,e,f) as (select u, n, nn from l)
+insert into public.testing
+select * from selected
+where (a,b,c) not in (select d,e,f from updated);
+
+select * from public.testing;
+
+-- expect to get Hash Anti Join
+explain (costs false) with
+selected(a,b,c) as (select u, n, nn from s),
+updated(d,e,f) as (select u, n, nn from l)
+insert into public.testing
+select * from selected
+where a not in (select d from updated);
+
+with
+selected(a,b,c) as (select u, n, nn from s),
+updated(d,e,f) as (select u, n, nn from l)
+insert into public.testing
+select * from selected
+where a not in (select d from updated);
+
+select * from public.testing;
+
+explain (costs false) with
+selected(a,b,c) as (select u, n, nn from s),
+updated(d,e,f) as (select u, n, nn from l)
+insert into public.testing
+select * from selected
+where b not in (select e from updated);
+
+with
+selected(a,b,c) as (select u, n, nn from s),
+updated(d,e,f) as (select u, n, nn from l)
+insert into public.testing
+select * from selected
+where b not in (select e from updated);
+
+select * from public.testing;
+
+explain (costs false) with
+selected(a,b,c) as (select u, n, nn from s),
+updated(d,e,f) as (select u, n, nn from l)
+insert into public.testing
+select * from selected
+where b not in (select d from updated);
+
+with
+selected(a,b,c) as (select u, n, nn from s),
+updated(d,e,f) as (select u, n, nn from l)
+insert into public.testing
+select * from selected
+where b not in (select d from updated);
+
+select * from public.testing;
+
+-- two levels of NOT IN with CTE, 2nd NOT IN
+-- subquery access CTE two levels above
+explain (costs false) with
+selected(a,b,c) as (select u, n, nn from s),
+updated(d,e,f) as (select u, n, nn from l)
+insert into public.testing
+select * from selected
+where (a,b,c) not in (select d,e,f from updated
+      where d not in (select a from selected));
+
+with
+selected(a,b,c) as (select u, n, nn from s),
+updated(d,e,f) as (select u, n, nn from l)
+insert into public.testing
+select * from selected
+where (a,b,c) not in (select d,e,f from updated
+      where d not in (select a from selected));
+
+select * from public.testing;
+
+-- With clause inside a query block
+explain select count(distinct t.a) from
+(with
+selected(a,b,c) as (select u, n, nn from s),
+updated(d,e,f) as (select u, n, nn from l)
+select * from selected where (a,b,c) not in
+(select d,e,f from updated
+ where d not in (select a from selected))) as t;
+
+select count(distinct t.a) from
+(with
+selected(a,b,c) as (select u, n, nn from s),
+updated(d,e,f) as (select u, n, nn from l)
+select * from selected where (a,b,c) not in
+(select d,e,f from updated
+ where d not in (select a from selected))) as t;
+
+-- With clause in subquery, can't flatten subquery to anti join
+explain (costs false) with
+selected(a,b,c) as (select u, n, nn from s)
+insert into public.testing
+select * from selected where (a,b,c) not in
+(with updated (d,e,f) as (select u, n, nn from l)
+select d,e,f from updated);
+
+with
+selected(a,b,c) as (select u, n, nn from s)
+insert into public.testing
+select * from selected where (a,b,c) not in
+(with updated (d,e,f) as (select u, n, nn from l)
+select d,e,f from updated);
+
+select * from public.testing;
+
+-- With clause in subquery, subsubquery access CTE in subquery
+explain (costs false) with
+selected(a,b,c) as (select u, n, nn from s)
+insert into public.testing
+select * from selected where (a,b,c) not in
+(
+with updated(d,e,f) as (select u, n, nn from l)
+select d,e,f from updated where d not in (select d from updated)
+);
+
+with
+selected(a,b,c) as (select u, n, nn from s)
+insert into public.testing
+select * from selected where (a,b,c) not in
+(
+with updated(d,e,f) as (select u, n, nn from l)
+select d,e,f from updated where d not in (select d from updated)
+);
+
+select * from public.testing;
+
+-- Recursive CTE
+CREATE TABLE employees (
+  id serial,
+  name varchar(255),
+  manager_id int
+);
+
+INSERT INTO employees VALUES (1, 'Mark', null);
+INSERT INTO employees VALUES (2, 'John', 1);
+INSERT INTO employees VALUES (3, 'Dan', 2);
+INSERT INTO employees VALUES (4, 'Clark', 1);
+INSERT INTO employees VALUES (5, 'Linda', 2);
+INSERT INTO employees VALUES (6, 'Willy', 2);
+INSERT INTO employees VALUES (7, 'Barack', 2);
+INSERT INTO employees VALUES (8, 'Elen', 2);
+INSERT INTO employees VALUES (9, 'Kate', 3);
+INSERT INTO employees VALUES (10, 'Terry', 4);
+
+WITH RECURSIVE managertree AS (
+  SELECT id, name, manager_id
+  FROM employees
+  WHERE id = 2
+  UNION ALL
+  SELECT e.id, e.name, e.manager_id
+  FROM employees e
+  INNER JOIN managertree mtree ON mtree.id = e.manager_id
+)
+SELECT *
+FROM managertree;
+
+-- NOT IN subquery access Recursive CTE
+EXPLAIN (COSTS FALSE) WITH RECURSIVE managertree AS (
+  SELECT id, name, manager_id
+  FROM employees
+  WHERE id = 2
+  UNION ALL
+  SELECT e.id, e.name, e.manager_id
+  FROM employees e
+  INNER JOIN managertree mtree ON mtree.id = e.manager_id
+)
+SELECT *
+FROM managertree mt WHERE mt.manager_id NOT IN (SELECT id FROM managertree);
+
+WITH RECURSIVE managertree AS (
+  SELECT id, name, manager_id
+  FROM employees
+  WHERE id = 2
+  UNION ALL
+  SELECT e.id, e.name, e.manager_id
+  FROM employees e
+  INNER JOIN managertree mtree ON mtree.id = e.manager_id
+)
+SELECT *
+FROM managertree mt WHERE mt.manager_id NOT IN (SELECT id FROM managertree);
+
+-- NOT IN under UNION ALL inside Recursive CTE
+EXPLAIN (COSTS FALSE) WITH RECURSIVE managertree AS (
+  SELECT id, name, manager_id
+  FROM employees
+  WHERE id = 2
+  UNION ALL
+  SELECT e.id, e.name, e.manager_id
+  FROM employees e
+  INNER JOIN managertree mtree ON
+  (mtree.id = e.manager_id AND
+  mtree.manager_id NOT IN (SELECT manager_id FROM employees)
+  )
+)
+SELECT *
+FROM managertree;
+
+WITH RECURSIVE managertree AS (
+  SELECT id, name, manager_id
+  FROM employees
+  WHERE id = 2
+  UNION ALL
+  SELECT e.id, e.name, e.manager_id
+  FROM employees e
+  INNER JOIN managertree mtree ON
+  (mtree.id = e.manager_id AND
+  mtree.manager_id NOT IN (SELECT manager_id FROM employees)
+  )
+)
+SELECT *
+FROM managertree;
+
+-- clean up
+reset work_mem;
+drop table s;
+drop table s1;
+drop table l;
+drop table empty;
+drop table public.testing;
+drop table employees;
#55Michael Paquier
michael@paquier.xyz
In reply to: Li, Zheng (#54)
Re: NOT IN subquery optimization

On Wed, Jun 26, 2019 at 09:26:16PM +0000, Li, Zheng wrote:

Let me know if you have any comments.

I have one: the latest patch visibly applies, but fails to build
because of the recent API changes around lists in the backend code.
So a rebase is in order. The discussion has not moved a iota in the
last couple of months, still as the latest patch has not received
reviews, I have moved it to next CF waiting on author.
--
Michael

#56Li, Zheng
zhelli@amazon.com
In reply to: Michael Paquier (#55)
1 attachment(s)
Re: NOT IN subquery optimization

Hi Michael,

Here is the latest rebased patch.

Regards,
-----------
Zheng Li
AWS, Amazon Aurora PostgreSQL

On 11/30/19, 10:43 PM, "Michael Paquier" <michael@paquier.xyz> wrote:

On Wed, Jun 26, 2019 at 09:26:16PM +0000, Li, Zheng wrote:

Let me know if you have any comments.

I have one: the latest patch visibly applies, but fails to build
because of the recent API changes around lists in the backend code.
So a rebase is in order. The discussion has not moved a iota in the
last couple of months, still as the latest patch has not received
reviews, I have moved it to next CF waiting on author.
--
Michael

Attachments:

not_in_to_anti_join_transformation_v2_3.patchapplication/octet-stream; name=not_in_to_anti_join_transformation_v2_3.patchDownload
commit ad0d9b11dec579cdc1cea1587e4a05afe060c49a
Author: Zheng Li <zhelli@amazon.com>
Date:   Fri Jun 14 17:38:52 2019 +0000

        NOT IN to ANTI JOIN transformation
    
        If the NOT IN subquery is not eligible for hashed subplan as decided by
        subplan_is_hashable(), do the following NOT IN to ANTI JOIN
        transformation:
    
        Single expression:
            When x is nullable:
            t1.x not in (t2.y where p) =>
            ANTI JOIN
            t1 (Filter: t1.x IS NOT NULL or NOT EXISTS (select 1 from t2 where p)),
            t2 on join condition (t1.x=t2.y or t2.y IS NULL)
            and p.
            The predicate "t2.y IS NULL" can be removed if y is non-nullable.
    
            When x is non-nullable:
            t1.x not in (t2.y where p) =>
            ANTI JOIN t1, t2 on join condition (t1.x=t2.y or t2.y IS NULL)
            and p.
            The predicate "t2.y IS NULL" can be removed if y is non-nullable.
    
        Multi expression:
            If all xi's are nullable:
            (x1, x2, ... xn) not in (y1, y2, ... yn ) =>
            ANTI JOIN t1, t2 on join condition:
            ((t1.x1 = t2.y1) and ... (t1.xi = t2.yi) ... and
             (t1.xn = t2.yn)) IS NOT FALSE.
    
            If at least one xi is non-nuallable:
            (x1, x2, ... xn) not in (y1, y2, ... yn ) =>
            ANTI JOIN t1, t2 on join condition:
            (t1.x1 = t2.y1 or t2.y1 IS NULL or t1.x1 IS NULL) and ...
            (t1.xi = t2.yi or t2.yi IS NULL) ... and
            (t1.xn = t2.yn or t2.yn IS NULL or t1.xn IS NULL).
    
        Add nullability testing routine is_node_nonnullable(), currently it
        handles Var, TargetEntry, CoalesceExpr and Const. It uses strict
        predicates, inner join conditions and NOT NULL constraint to check
        the nullability of a Var.
    
        Adjust and apply reduce_outer_joins() before the transformation so
        that the outer joins have an opportunity to be converted to inner joins
        prior to the transformation.
    
        We measured performance improvements of two to five orders of magnitude
        on most queries in a development environment. In our performance experiments,
        table s (small) has 11 rows, table l (large) has 1 million rows. s.n and l.n
        have NULL value. s.nn and l.nn are NOT NULL. Index is created on each column.
    
        Cases using hash anti join:
        l.nn not in (l.nn) 21900s -> 235ms
        l.nn not in (l.nn where u>0) 22000s -> 240ms
        l.n not in (l.nn) 21900s -> 238ms
        l.n not in (l.nn where u>0) 22000s -> 248ms
    
        Cases using index nested loop anti join
        s.n not in (l.nn) 360ms -> 0.5ms
        s.n not in (l.nn where u>0) 390ms -> 0.6ms
        s.nn not in (l.nn) 365ms -> 0.5ms
        s.nn not in (l.nn where u>0) 390ms -> 0.5ms
    
        Cases using bitmap heap scan on the inner and nested loop anti join:
        s.n not in (l.n) 360ms -> 0.7ms
        l.n not in (l.n) 21900s ->  1.6s
        l.n not in (l.n where u>0) 22000s -> 1680ms
        s.nn not in (l.n) 360ms -> 0.5ms
        l.nn not in (l.n) 21900s -> 1650ms
        l.nn not in (l.n where u>0) 22000s -> 1660ms
    
        Cases using the original hashed subplan:
        l.n not in (s.n) 63ms -> 63ms
        l.nn not in (s.nn) 63ms -> 63ms
        l.n not in (s.n where u>0) 63ms -> 63ms
    
        Authors: Zheng Li, Jim Finnerty
        Discussion:
            https://www.postgresql.org/message-id/flat/812F84BC-702F-4063-95CC-A7DBD7C9D66B%40amazon.com#bfbaff1ad4a366ed92d7c9657afdb533
    
    Make NOT IN to ANTI JOIN transformation work with CTE

diff --git a/src/backend/optimizer/plan/planner.c b/src/backend/optimizer/plan/planner.c
index 7fe11b59a0..41db9b4dba 100644
--- a/src/backend/optimizer/plan/planner.c
+++ b/src/backend/optimizer/plan/planner.c
@@ -992,7 +992,7 @@ subquery_planner(PlannerGlobal *glob, Query *parse,
 	 * preprocessing.
 	 */
 	if (hasOuterJoins)
-		reduce_outer_joins(root);
+		reduce_outer_joins(parse);
 
 	/*
 	 * If we have any RTE_RESULT relations, see if they can be deleted from
diff --git a/src/backend/optimizer/plan/subselect.c b/src/backend/optimizer/plan/subselect.c
index 48b62a55de..b628634fc2 100644
--- a/src/backend/optimizer/plan/subselect.c
+++ b/src/backend/optimizer/plan/subselect.c
@@ -39,6 +39,8 @@
 #include "utils/syscache.h"
 
 
+bool enable_not_in_transform;
+
 typedef struct convert_testexpr_context
 {
 	PlannerInfo *root;
@@ -158,8 +160,7 @@ get_first_col_type(Plan *plan, Oid *coltype, int32 *coltypmod,
  * subquery itself is in a resjunk tlist entry whose value is uninteresting).
  */
 static Node *
-make_subplan(PlannerInfo *root, Query *orig_subquery,
-			 SubLinkType subLinkType, int subLinkId,
+make_subplan(PlannerInfo *root, SubLink *sublink,
 			 Node *testexpr, bool isTopQual)
 {
 	Query	   *subquery;
@@ -171,6 +172,9 @@ make_subplan(PlannerInfo *root, Query *orig_subquery,
 	Plan	   *plan;
 	List	   *plan_params;
 	Node	   *result;
+	Query      *orig_subquery = (Query *) sublink->subselect;
+	SubLinkType subLinkType = sublink->subLinkType;
+	int         subLinkId = sublink->subLinkId;
 
 	/*
 	 * Copy the source Query node.  This is a quick and dirty kluge to resolve
@@ -216,24 +220,33 @@ make_subplan(PlannerInfo *root, Query *orig_subquery,
 	/* plan_params should not be in use in current query level */
 	Assert(root->plan_params == NIL);
 
-	/* Generate Paths for the subquery */
-	subroot = subquery_planner(root->glob, subquery,
-							   root,
-							   false, tuple_fraction);
+	if(sublink->subroot != NULL &&
+		sublink->subplan != NULL)
+	{
+		subroot = (PlannerInfo *) sublink->subroot;
+		plan = (Plan *) sublink->subplan;
+	}
+	else
+	{
+		/* Generate Paths for the subquery */
+		subroot = subquery_planner(root->glob, subquery,
+								   root,
+								   false, tuple_fraction);
+
+		/*
+		 * Select best Path and turn it into a Plan.  At least for now, there
+		 * seems no reason to postpone doing that.
+		 */
+		final_rel = fetch_upper_rel(subroot, UPPERREL_FINAL, NULL);
+		best_path = get_cheapest_fractional_path(final_rel, tuple_fraction);
+
+		plan = create_plan(subroot, best_path);
+	}
 
 	/* Isolate the params needed by this specific subplan */
 	plan_params = root->plan_params;
 	root->plan_params = NIL;
 
-	/*
-	 * Select best Path and turn it into a Plan.  At least for now, there
-	 * seems no reason to postpone doing that.
-	 */
-	final_rel = fetch_upper_rel(subroot, UPPERREL_FINAL, NULL);
-	best_path = get_cheapest_fractional_path(final_rel, tuple_fraction);
-
-	plan = create_plan(subroot, best_path);
-
 	/* And convert to SubPlan or InitPlan format. */
 	result = build_subplan(root, plan, subroot, plan_params,
 						   subLinkType, subLinkId,
@@ -1173,6 +1186,389 @@ inline_cte_walker(Node *node, inline_cte_walker_context *context)
 	return expression_tree_walker(node, inline_cte_walker, context);
 }
 
+/* Returns a List of Nodes from the testexpr of an Any SubLink */
+static List *
+getTestExpr(SubLink *sublink)
+{
+	Node * testexpr = sublink->testexpr;
+	Assert(testexpr);
+
+	/* single expression */
+	if(IsA(testexpr, OpExpr))
+	{
+		OpExpr	*opexpr = (OpExpr *) testexpr;
+		Node	*testnode = linitial(opexpr->args);
+
+		return list_make1(testnode);
+	}
+	/* multi-expression */
+	else if(IsA(testexpr, BoolExpr))
+	{
+		BoolExpr	*bexpr = (BoolExpr *) testexpr;
+		ListCell		*lc;
+		Node		*node;
+		List			*result = NULL;
+
+		foreach(lc, bexpr->args)
+		{
+			node = lfirst(lc);
+			if(IsA(node, OpExpr))
+			{
+				OpExpr *expr = (OpExpr *) node;
+				result = lappend(result, linitial(expr->args));
+			}
+			else
+			{
+				elog(ERROR, "unrecognized node type for testexpr: %d",
+						(int) nodeTag(node));
+			}
+		}
+		return result;
+	}
+	else
+	{
+		elog(ERROR, "unrecognized node type for testexpr: %d",
+				(int) nodeTag(testexpr));
+	}
+}
+
+/* Try to reduce outer joins if there is one in the Query */
+static void
+reduce_outer_joins_NOT_IN(Query *parse)
+{
+	ListCell				*lc;
+	RangeTblEntry	*rte;
+
+	foreach(lc, parse->rtable)
+	{
+		rte = (RangeTblEntry *) lfirst(lc);
+		/* try to reduce outer joins if there is one */
+		if (rte->rtekind == RTE_JOIN &&
+				IS_OUTER_JOIN(rte->jointype))
+		{
+			reduce_outer_joins(parse);
+			return;
+		}
+	}
+}
+
+/*
+ * Make clause NOT EXISTS
+ * (select 1 from t2 where p) for the NOT IN to ANTI JOIN
+ * transformation.
+ */
+static
+Node *
+makeExistsTest_NOT_IN(Query *subselect)
+{
+	BoolExpr *notExpr = makeNode(BoolExpr);
+	SubLink *exists = makeNode(SubLink);
+	Query *selectOne =  copyObject(subselect);
+	Const *oneconst;
+	TargetEntry *dummyte;
+
+	/* modify subselect target list to contain a dummy const 1 */
+	oneconst = makeConst(INT4OID,
+						 -1,
+						 InvalidOid,
+						 sizeof(int32),
+						 Int32GetDatum(1),
+						 false, /* isnull */
+						 true); /* pass by value */
+	dummyte = makeTargetEntry((Expr *) oneconst,
+							  1,
+							  "one",
+							  false /* resjunk */ );
+	selectOne->targetList = list_make1(dummyte);
+
+	/* make EXISTS(select 1 from t2 where p) */
+	exists->subLinkType = EXISTS_SUBLINK;
+	exists->subLinkId = 0;
+	exists->subselect = (Node *) selectOne;
+	exists->location = -1;
+	exists->subroot = NULL;
+	exists->subplan = NULL;
+
+	/* make NOT EXISTS(select 1 from t2 where p) */
+	notExpr->boolop = NOT_EXPR;
+	notExpr->args = list_make1(exists);
+	notExpr->location = -1;
+
+	return (Node *) notExpr;
+}
+
+/*
+ *Allow transformation from NOT IN query to ANTI JOIN if ALL of the
+ * following conditions are true:
+ * 1. The GUC enable_not_in_transform is set to true.
+ * 2. the NOT IN subquery is not hashable, in which case an expensive
+ *		subplan will be generated if we don't transform.
+ * 3.. the subquery does not define any CTE.
+ */
+static bool
+allow_NOT_IN_transformation(PlannerInfo *root,
+													  SubLink *sublink)
+{
+	Query			*subselect = (Query *) sublink->subselect;
+	PlannerInfo 	*subroot;
+	double			 tuple_fraction;
+	RelOptInfo		*final_rel;
+	Path			*best_path;
+	Plan			*plan;
+
+	if(! enable_not_in_transform)
+		return false;
+
+	/*
+	 * Can't flatten if it contains WITH.  (We could arrange to pull up the
+	 * WITH into the parent query's cteList, but that risks changing the
+	 * semantics, since a WITH ought to be executed once per associated query
+	 * call.)  Note that convert_ANY_sublink_to_join doesn't have to reject
+	 * this case, since it just produces a subquery RTE that doesn't have to
+	 * get flattened into the parent query.
+	 */
+	if(subselect->cteList)
+		return false;
+
+	/* For ALL and ANY subplans, we will be
+	 * able to stop evaluating if the test condition fails or matches, so very
+	 * often not all the tuples will be retrieved; for lack of a better idea,
+	 * specify 50% retrieval.
+	 */
+	tuple_fraction = 0.5;
+	/*
+	 * Generate Paths for the subquery, use a copied version of the subquery
+	 * so that the existing one doesn't get modified.
+	 */
+	subroot = subquery_planner(root->glob, copyObject(subselect),
+													root, false, tuple_fraction);
+
+	/* Select best Path and turn it into a Plan. */
+	final_rel = fetch_upper_rel(subroot, UPPERREL_FINAL, NULL);
+	best_path = get_cheapest_fractional_path(final_rel, tuple_fraction);
+
+	plan = create_plan(subroot, best_path);
+
+	sublink->subroot = (Node *) subroot;
+	sublink->subplan = (Node *) plan;
+	/*
+	 * Punt if subplan is hashable since using hashed subplan is almost like
+	 * doing a Hash Anti Join, we probably can't do better than that.
+	 */
+	if(subplan_is_hashable(plan))
+	{
+		return false;
+	}
+
+	return true;
+}
+
+/*
+ * do the following NOT IN to ANTI JOIN conversions:
+ *
+ * When x is non-nullable:
+ * t1.x not in (t2.y where p) => ANTI JOIN
+ * t1, t2 on join condition (t1.x=t2.y or t2.y IS NULL) and p.
+ * the above predicate "t2.y IS NULL" can be removed if y
+ * is also non-nullable.
+ *
+ * When x is nullable:
+ * t1.x not in (t2.y where p) => ANTI JOIN
+ * t1 (Filter: t1.x is not null or not exists (select 1 from t2 where p)),
+ * t2 on join condition (t1.x=t2.y or t2.y is null) and p.
+ * the above predicate "t2.y IS NULL" can be removed if y
+ * is also non-nullable.
+ *
+ * The multi-expression case is just ANDs of the single-
+ * expression case.
+ */
+static bool
+convert_NOT_IN_to_join(PlannerInfo *root, Node **quals,
+									SubLink *sublink, List *subquery_vars,
+									Node **pullout)
+{
+	Query			*parse = root->parse;
+	Query			*subselect = (Query *) sublink->subselect;
+	List			*testnodes = getTestExpr(sublink);
+	bool			 outerNonNull;
+	bool			 innerNonNull;
+	NullTest 		*nt;
+
+	/*
+	 * Try reduce_outer_joins since outer join affects the nullability test that's coming up next.
+	 * We have to call reduce_outer_joins for outer and inner query separately because we
+	 * don't have a global range table yet.
+	 */
+	reduce_outer_joins_NOT_IN(parse);
+
+	reduce_outer_joins_NOT_IN(subselect);
+
+	Assert(testnodes);
+	outerNonNull =
+					list_hasnonnullable(testnodes, parse);
+	innerNonNull =
+					list_hasnonnullable(subselect->targetList, subselect);
+
+	/* Single-expression case, do the following:
+	 * When x is non-nullable:
+	 * t1.x not in (t2.y where p) => ANTI JOIN
+	 * t1, t2 on join condition (t1.x=t2.y or t2.y IS NULL) and p.
+	 * the above predicate "t2.y IS NULL" can be removed if y
+	 * is also non-nullable.
+	 *
+	 * When x is nullable:
+	 * t1.x not in (t2.y where p) => ANTI JOIN
+	 * t1 (Filter: t1.x is not null or not exists (select 1 from t2 where p)),
+	 * t2 on join condition (t1.x=t2.y or t2.y is null) and p.
+	 * the above predicate "t2.y IS NULL" can be removed if y
+	 * is also non-nullable.
+	 */
+	if(IsA(*quals, OpExpr))
+	{
+		/*  Add "OR y IS NULL" if y is nullable */
+		if(!innerNonNull)
+		{
+			/* make expr y IS NULL */
+			nt = makeNode(NullTest);
+			nt->arg = (Expr *)linitial(subquery_vars);
+			nt->nulltesttype = IS_NULL;
+			nt->argisrow = false;
+			nt->location = -1;
+
+			/* make orclause (x = y OR y IS NULL) */
+			*quals = (Node *)make_orclause(list_make2(*quals,
+								(Node *)nt));
+		}
+
+		/*
+		 * if x is nullable, make the following filter for t1 :
+		 * x IS NOT NULL or NOT EXISTS (select 1 from t2 where p)
+		 */
+		if(!outerNonNull)
+		{
+			Node * existsTest = NULL;
+
+			/* make expr x IS NOT NULL */
+			nt = makeNode(NullTest);
+			nt->arg = (Expr *) linitial(testnodes);
+			nt->nulltesttype = IS_NOT_NULL;
+			nt->argisrow = false;
+			nt->location = -1;
+
+			existsTest = makeExistsTest_NOT_IN(subselect);
+			/* make x IS NOT NULL OR NOT EXISTS (select 1 from t2 where p) */
+			*pullout = (Node *)make_orclause(list_make2(
+									(Node *) nt, existsTest));
+		}
+	}
+	/*
+	 * Multi-expression case:
+	 * If all xi's are nullable:
+	 * (x1, x2, ... xn) not in (y1, y2, ... yn ) =>
+	 * ANTI JOIN t1,
+	 * t2 on join condition:
+	 * ((t1.x1 = t2.y1) and ... (t1.xi = t2.yi) ... and
+	 * (t1.xn = t2.yn)) is NOT FALSE.
+	 *
+	 * If at least one xi is non-nuallable:
+	 * (x1, x2, ... xn) not in (y1, y2, ... yn ) =>
+	 * ANTI JOIN t1,
+	 * t2 on join condition:
+	 * (t1.x1 = t2.y1 or t2.y1 is NULL) and ...
+	 * (t1.xi = t2.yi or t2.yi is NULL or t1.xi is NULL) ... and
+	 * (t1.xn = t2.yn or t2.yn is NULL).
+	 */
+	else if(IsA(*quals, BoolExpr))
+	{
+		/*
+		 * Add IS NOT FALSE on top of the join condition if ALL x_i's are nullable
+		 */
+		if(!outerNonNull)
+		{
+			BooleanTest *btest;
+
+			btest = makeNode(BooleanTest);
+			btest->arg = (Expr *) *quals;
+			btest->booltesttype = IS_NOT_FALSE;
+			*quals = (Node *) btest;
+		}
+		else
+		{
+			ListCell			*qualc;
+			TargetEntry	*te;
+			ListCell			*xc = list_head(testnodes);
+			ListCell			*yc = list_head(subquery_vars);
+			ListCell			*ytlc = list_head(subselect->targetList);
+			List				*quallist = ((BoolExpr *)*quals)->args;
+			Node				*joinCondition = NULL;
+			bool				 xnonNull;
+			bool				 ynonNull;
+
+			/* Reconstruct quals in the loop */
+			*quals = NULL;
+			foreach(qualc, quallist)
+			{
+				te = (TargetEntry *)lfirst(ytlc);
+				/* x_i = y_i */
+				joinCondition = lfirst(qualc);
+				ynonNull = is_node_nonnullable((Node*)te, subselect);
+
+				/* append y_i IS NULL to x_i = y_i if y_i is non-nullable */
+				if(!ynonNull)
+				{
+					/* make expr y_i IS NULL */
+					nt = makeNode(NullTest);
+					nt->arg = (Expr *)lfirst(yc);
+					nt->nulltesttype = IS_NULL;
+					nt->argisrow = false;
+					nt->location = -1;
+
+					/* make orclause (x_i = y_i OR y_i IS NULL) */
+					joinCondition = (Node *)make_orclause(list_make2(joinCondition,
+											(Node *)nt));
+				}
+
+				/*
+				 * Append "OR x_i is null" to the join condition if x_i is nullable.
+				 * Notice at least one x_i should be non-nullable because the all
+				 * x_i's nullable case is handled earlier by adding "IS NOT FALSE"
+				 * on top of the join condition.
+				 */
+				xnonNull = is_node_nonnullable(lfirst(xc), parse);
+				if(!xnonNull)
+				{
+					/* make expr x_i IS NULL */
+					nt = makeNode(NullTest);
+					nt->arg = (Expr *)lfirst(xc);
+					nt->nulltesttype = IS_NULL;
+					nt->argisrow = false;
+					nt->location = -1;
+
+					/* make orclause (x_i = y_i OR y_i IS NULL OR x_i IS NULL) */
+					joinCondition = (Node *)make_orclause(list_make2(joinCondition,
+											(Node *)nt));
+				}
+
+				/*
+				 * Now append joinCondition to quals as one andclause.
+				 * (x_i = y_i OR y_i IS NULL OR x_i IS NULL) AND
+				 * (x_j = y_j OR y_j IS NULL OR x_j IS NULL)...
+				 */
+				*quals = (Node *)make_andclause(list_make2(*quals, joinCondition));
+				xc = lnext(testnodes, xc);
+				yc = lnext(subquery_vars, yc);
+				ytlc = lnext(subselect->targetList, ytlc);
+			}
+		}
+	}
+	/* quals should be either OpExpr or BoolExpr, otherwise don't convert */
+	else
+	{
+		return false;
+	}
+
+	return true;
+}
 
 /*
  * convert_ANY_sublink_to_join: try to convert an ANY SubLink to a join
@@ -1210,7 +1606,8 @@ inline_cte_walker(Node *node, inline_cte_walker_context *context)
  */
 JoinExpr *
 convert_ANY_sublink_to_join(PlannerInfo *root, SubLink *sublink,
-							Relids available_rels)
+											bool under_not, Node **pullout,
+											Relids available_rels)
 {
 	JoinExpr   *result;
 	Query	   *parse = root->parse;
@@ -1253,6 +1650,12 @@ convert_ANY_sublink_to_join(PlannerInfo *root, SubLink *sublink,
 	if (contain_volatile_functions(sublink->testexpr))
 		return NULL;
 
+	if (under_not &&
+		! allow_NOT_IN_transformation(root, sublink))
+	{
+		return NULL;
+	}
+
 	/* Create a dummy ParseState for addRangeTableEntryForSubquery */
 	pstate = make_parsestate(NULL);
 
@@ -1290,11 +1693,29 @@ convert_ANY_sublink_to_join(PlannerInfo *root, SubLink *sublink,
 	 */
 	quals = convert_testexpr(root, sublink->testexpr, subquery_vars);
 
+	/*
+	 * Try converting x NOT IN (y) to ANTI JOIN.
+	 */
+	if(under_not &&
+			!convert_NOT_IN_to_join(root, &quals,
+								sublink, subquery_vars, pullout))
+	{
+		/*
+		 * In theory, we shouldn't get here since allow_NOT_IN_transformation()
+		 * has already ruled out cases that shouldn't be transformed. In other words,
+		 * I expect convert_NOT_IN_to_join to always return true, but just in case
+		 * it fails, reset parse->rtable which has been changed a few lines above.
+		 */
+		parse->rtable = list_delete(parse->rtable, rte);
+		return NULL;
+	}
+
 	/*
 	 * And finally, build the JoinExpr node.
 	 */
 	result = makeNode(JoinExpr);
-	result->jointype = JOIN_SEMI;
+	/* NOT IN will be converted to ANTI JOIN */
+	result->jointype = under_not ? JOIN_ANTI : JOIN_SEMI;
 	result->isNatural = false;
 	result->larg = NULL;		/* caller must fill this in */
 	result->rarg = (Node *) rtr;
@@ -1883,9 +2304,7 @@ process_sublinks_mutator(Node *node, process_sublinks_context *context)
 		 * Now build the SubPlan node and make the expr to return.
 		 */
 		return make_subplan(context->root,
-							(Query *) sublink->subselect,
-							sublink->subLinkType,
-							sublink->subLinkId,
+							sublink,
 							testexpr,
 							context->isTopQual);
 	}
diff --git a/src/backend/optimizer/prep/prepjointree.c b/src/backend/optimizer/prep/prepjointree.c
index f489f140e3..774ee5e11e 100644
--- a/src/backend/optimizer/prep/prepjointree.c
+++ b/src/backend/optimizer/prep/prepjointree.c
@@ -113,7 +113,7 @@ static Query *pullup_replace_vars_subquery(Query *query,
 static reduce_outer_joins_state *reduce_outer_joins_pass1(Node *jtnode);
 static void reduce_outer_joins_pass2(Node *jtnode,
 									 reduce_outer_joins_state *state,
-									 PlannerInfo *root,
+									 Query *parse,
 									 Relids nonnullable_rels,
 									 List *nonnullable_vars,
 									 List *forced_null_vars);
@@ -265,6 +265,13 @@ pull_up_sublinks_jointree_recurse(PlannerInfo *root, Node *jtnode,
 		}
 		/* Build the replacement FromExpr; no quals yet */
 		newf = makeFromExpr(newfromlist, NULL);
+		/*
+		 * Replace parse->jointree with newf now because we might modify join types
+		 * during reduce_outer_joins() in convert_NOT_IN_to_join() which is called
+		 * in pull_up_sublinks_qual_recurse() that's coming up next.
+		 */
+		newf->quals = f->quals;
+		root->parse->jointree = newf;
 		/* Set up a link representing the rebuilt jointree */
 		jtlink = (Node *) newf;
 		/* Now process qual --- all children are available for use */
@@ -397,7 +404,7 @@ pull_up_sublinks_qual_recurse(PlannerInfo *root, Node *node,
 		/* Is it a convertible ANY or EXISTS clause? */
 		if (sublink->subLinkType == ANY_SUBLINK)
 		{
-			if ((j = convert_ANY_sublink_to_join(root, sublink,
+			if ((j = convert_ANY_sublink_to_join(root, sublink, false, NULL,
 												 available_rels1)) != NULL)
 			{
 				/* Yes; insert the new join node into the join tree */
@@ -423,7 +430,7 @@ pull_up_sublinks_qual_recurse(PlannerInfo *root, Node *node,
 				return NULL;
 			}
 			if (available_rels2 != NULL &&
-				(j = convert_ANY_sublink_to_join(root, sublink,
+				(j = convert_ANY_sublink_to_join(root, sublink, false, NULL,
 												 available_rels2)) != NULL)
 			{
 				/* Yes; insert the new join node into the join tree */
@@ -569,6 +576,68 @@ pull_up_sublinks_qual_recurse(PlannerInfo *root, Node *node,
 					return NULL;
 				}
 			}
+			else if (sublink->subLinkType == ANY_SUBLINK)
+			{
+				Node *pullout = NULL;
+
+				if ((j = convert_ANY_sublink_to_join(root, sublink, true, &pullout,
+													 available_rels1)) != NULL)
+				{
+					/* Yes; insert the new join node into the join tree */
+					j->larg = *jtlink1;
+					*jtlink1 = (Node *) j;
+					/* Recursively process pulled-up jointree nodes */
+					j->rarg = pull_up_sublinks_jointree_recurse(root,
+																j->rarg,
+																&child_rels);
+
+					/*
+					 * Now recursively process the pulled-up quals.  Any inserted
+					 * joins can get stacked onto either j->larg or j->rarg,
+					 * depending on which rels they reference.
+					 */
+					j->quals = pull_up_sublinks_qual_recurse(root,
+															 j->quals,
+															 &j->larg,
+															 available_rels1,
+															 &j->rarg,
+															 child_rels);
+					/*
+					 * Return pullout predicate (x is NOT NULL) if it's not null,
+					 * otherwise return NULL representing constant TRUE.
+					 */
+					return pullout? pullout : NULL;
+				}
+				if (available_rels2 != NULL &&
+					(j = convert_ANY_sublink_to_join(root, sublink, true, &pullout,
+													 available_rels2)) != NULL)
+				{
+					/* Yes; insert the new join node into the join tree */
+					j->larg = *jtlink2;
+					*jtlink2 = (Node *) j;
+					/* Recursively process pulled-up jointree nodes */
+					j->rarg = pull_up_sublinks_jointree_recurse(root,
+																j->rarg,
+																&child_rels);
+
+					/*
+					 * Now recursively process the pulled-up quals.  Any inserted
+					 * joins can get stacked onto either j->larg or j->rarg,
+					 * depending on which rels they reference.
+					 */
+					j->quals = pull_up_sublinks_qual_recurse(root,
+															 j->quals,
+															 &j->larg,
+															 available_rels2,
+															 &j->rarg,
+															 child_rels);
+					/*
+					 * Return pullout predicate (x is NOT NULL) if it's not null,
+					 * otherwise return NULL representing constant TRUE.
+					 */
+					return pullout? pullout : NULL;
+				}
+			}
 		}
 		/* Else return it unmodified */
 		return node;
@@ -907,7 +976,13 @@ pull_up_simple_subquery(PlannerInfo *root, Node *jtnode, RangeTblEntry *rte,
 	subroot->parse = subquery;
 	subroot->glob = root->glob;
 	subroot->query_level = root->query_level;
-	subroot->parent_root = root->parent_root;
+	/*
+	 * Keep a path to the top level root so that we can recursively access top level
+	 * CTEs in root->parse->cteList, and CTE plans in root->init_plans. This hack
+	 * won't change the original PlannerInfo tree structure because subroot is just
+	 * a auxiliary PlannerInfo to help pulling up subquery.
+	 */
+	subroot->parent_root = root;
 	subroot->plan_params = NIL;
 	subroot->outer_params = NULL;
 	subroot->planner_cxt = CurrentMemoryContext;
@@ -2548,7 +2623,7 @@ flatten_simple_union_all(PlannerInfo *root)
  * alias-var expansion).
  */
 void
-reduce_outer_joins(PlannerInfo *root)
+reduce_outer_joins(Query *parse)
 {
 	reduce_outer_joins_state *state;
 
@@ -2561,14 +2636,14 @@ reduce_outer_joins(PlannerInfo *root)
 	 * join(s) below each side of each join clause. The second pass examines
 	 * qual clauses and changes join types as it descends the tree.
 	 */
-	state = reduce_outer_joins_pass1((Node *) root->parse->jointree);
+	state = reduce_outer_joins_pass1((Node *) parse->jointree);
 
 	/* planner.c shouldn't have called me if no outer joins */
 	if (state == NULL || !state->contains_outer)
 		elog(ERROR, "so where are the outer joins?");
 
-	reduce_outer_joins_pass2((Node *) root->parse->jointree,
-							 state, root, NULL, NIL, NIL);
+	reduce_outer_joins_pass2((Node *) parse->jointree,
+							 state, parse, NULL, NIL, NIL);
 }
 
 /*
@@ -2651,7 +2726,7 @@ reduce_outer_joins_pass1(Node *jtnode)
 static void
 reduce_outer_joins_pass2(Node *jtnode,
 						 reduce_outer_joins_state *state,
-						 PlannerInfo *root,
+						 Query *parse,
 						 Relids nonnullable_rels,
 						 List *nonnullable_vars,
 						 List *forced_null_vars)
@@ -2690,7 +2765,7 @@ reduce_outer_joins_pass2(Node *jtnode,
 			reduce_outer_joins_state *sub_state = lfirst(s);
 
 			if (sub_state->contains_outer)
-				reduce_outer_joins_pass2(lfirst(l), sub_state, root,
+				reduce_outer_joins_pass2(lfirst(l), sub_state, parse,
 										 pass_nonnullable_rels,
 										 pass_nonnullable_vars,
 										 pass_forced_null_vars);
@@ -2802,7 +2877,7 @@ reduce_outer_joins_pass2(Node *jtnode,
 		/* Apply the jointype change, if any, to both jointree node and RTE */
 		if (rtindex && jointype != j->jointype)
 		{
-			RangeTblEntry *rte = rt_fetch(rtindex, root->parse->rtable);
+			RangeTblEntry *rte = rt_fetch(rtindex, parse->rtable);
 
 			Assert(rte->rtekind == RTE_JOIN);
 			Assert(rte->jointype == j->jointype);
@@ -2887,7 +2962,7 @@ reduce_outer_joins_pass2(Node *jtnode,
 					pass_nonnullable_vars = NIL;
 					pass_forced_null_vars = NIL;
 				}
-				reduce_outer_joins_pass2(j->larg, left_state, root,
+				reduce_outer_joins_pass2(j->larg, left_state, parse,
 										 pass_nonnullable_rels,
 										 pass_nonnullable_vars,
 										 pass_forced_null_vars);
@@ -2909,7 +2984,7 @@ reduce_outer_joins_pass2(Node *jtnode,
 					pass_nonnullable_vars = NIL;
 					pass_forced_null_vars = NIL;
 				}
-				reduce_outer_joins_pass2(j->rarg, right_state, root,
+				reduce_outer_joins_pass2(j->rarg, right_state, parse,
 										 pass_nonnullable_rels,
 										 pass_nonnullable_vars,
 										 pass_forced_null_vars);
diff --git a/src/backend/optimizer/util/clauses.c b/src/backend/optimizer/util/clauses.c
index a04b62274d..b31cd6b1ed 100644
--- a/src/backend/optimizer/util/clauses.c
+++ b/src/backend/optimizer/util/clauses.c
@@ -38,10 +38,12 @@
 #include "optimizer/optimizer.h"
 #include "optimizer/plancat.h"
 #include "optimizer/planmain.h"
+#include "optimizer/pathnode.h"
 #include "parser/analyze.h"
 #include "parser/parse_agg.h"
 #include "parser/parse_coerce.h"
 #include "parser/parse_func.h"
+#include "parser/parsetree.h"
 #include "rewrite/rewriteManip.h"
 #include "tcop/tcopprot.h"
 #include "utils/acl.h"
@@ -2010,6 +2012,271 @@ find_forced_null_var(Node *node)
 	return NULL;
 }
 
+/*
+ * find_innerjoined_rels
+ *		Traverse jointree to locate non-outerjoined-rels and quals above them
+ *
+ * We fill innerjoined_rels with the relids of all rels that are not below
+ * the nullable side of any outer join (which would cause their Vars to be
+ * possibly NULL regardless of what's in the catalogs).  In the same scan,
+ * we locate all WHERE and JOIN/ON quals that constrain these rels add them to
+ * the usable_quals list (forming a list with implicit-AND semantics).
+ *
+ * Top-level caller must initialize innerjoined_rels/usable_quals to NULL/NIL.
+ */
+static void
+find_innerjoined_rels(Node *jtnode,
+					  Relids *innerjoined_rels, List **usable_quals)
+{
+	if (jtnode == NULL)
+		return;
+	if (IsA(jtnode, RangeTblRef))
+	{
+		int			varno = ((RangeTblRef *) jtnode)->rtindex;
+
+		*innerjoined_rels = bms_add_member(*innerjoined_rels, varno);
+	}
+	else if (IsA(jtnode, FromExpr))
+	{
+		FromExpr   *f = (FromExpr *) jtnode;
+		ListCell   *lc;
+
+		/* All elements of the FROM list are allowable */
+		foreach(lc, f->fromlist)
+			find_innerjoined_rels((Node *) lfirst(lc),
+								  innerjoined_rels, usable_quals);
+		/* ... and its WHERE quals are too */
+		if (f->quals)
+			*usable_quals = lappend(*usable_quals, f->quals);
+	}
+	else if (IsA(jtnode, JoinExpr))
+	{
+		JoinExpr   *j = (JoinExpr *) jtnode;
+
+		switch (j->jointype)
+		{
+			case JOIN_INNER:
+				/* visit both children */
+				find_innerjoined_rels(j->larg,
+									  innerjoined_rels, usable_quals);
+				find_innerjoined_rels(j->rarg,
+									  innerjoined_rels, usable_quals);
+				/* and grab the ON quals too */
+				if (j->quals)
+					*usable_quals = lappend(*usable_quals, j->quals);
+				break;
+
+			case JOIN_LEFT:
+			case JOIN_SEMI:
+			case JOIN_ANTI:
+
+				/*
+				 * Only the left input is possibly non-nullable; furthermore,
+				 * the quals of this join don't constrain the left input.
+				 * Note: we probably can't see SEMI or ANTI joins at this
+				 * point, but if we do, we can treat them like LEFT joins.
+				 */
+				find_innerjoined_rels(j->larg,
+									  innerjoined_rels, usable_quals);
+				break;
+
+			case JOIN_RIGHT:
+				/* Reverse of the above case */
+				find_innerjoined_rels(j->rarg,
+									  innerjoined_rels, usable_quals);
+				break;
+
+			case JOIN_FULL:
+				/* Neither side is non-nullable, so stop descending */
+				break;
+
+			case JOIN_UNIQUE_OUTER:
+			case JOIN_UNIQUE_INNER:
+				/* Don't think we will see JOIN_UNIQUE_OUTER or
+				 * JOIN_UNIQUE_INNER since they are only used internally in
+				 * the planner in a much later phase (in standard_join_search).
+				*/
+				break;
+
+			default:
+				elog(ERROR, "unrecognized join type: %d",
+					 (int) j->jointype);
+		}
+	}
+	else
+		elog(ERROR, "unrecognized node type: %d",
+			 (int) nodeTag(jtnode));
+}
+
+/*
+ * Returns true if the Node passed in is nonnullable. Currently handles Var,
+ * TargetEntry, CoaleseExpr and Const.
+ * TODO: Add more supporting cases.
+ * A Var is nonnullable if:
+ *	It does not appear on the null-padded side of an outer join and it has NOT NULL constraint,
+ * 	Or if it's forced non-null by inner join or other strict predicates.
+ * A CoalesceExpr is nonnullable if it has a non-null argument.
+ */
+bool
+is_node_nonnullable(Node * node, Query *parse)
+{
+	AttrNumber	 attno = InvalidAttrNumber;
+	Oid				 reloid;
+	Var 				*var = NULL;
+
+	/*
+	 * If the query contains set operations, punt.  The set ops themselves
+	 * couldn't introduce nulls that weren't in their inputs, but the tlist
+	 * present in the top-level query is just dummy and won't give us useful
+	 * info.  We could get an answer by recursing to examine each leaf query,
+	 * but for the moment it doesn't seem worth the extra complication.
+	 *
+	 * Note that we needn't consider other top-level operators such as
+	 * DISTINCT, GROUP BY, etc, as those will not introduce nulls either.
+	 */
+	if (parse->setOperations)
+		return false;
+
+	switch (nodeTag(node))
+	{
+		case T_Var:
+		{
+			RangeTblEntry	*rte;
+
+			var = (Var *) node;
+			attno = var->varattno;
+			rte = rt_fetch(var->varno, parse->rtable);
+			reloid = rte->relid;
+			break;
+		}
+		case T_TargetEntry:
+		{
+			TargetEntry	*te = (TargetEntry *)node;
+			switch(nodeTag(te->expr))
+			{
+				case T_Var:
+				{
+					var = (Var *) te->expr;
+					attno = te->resorigcol;
+					reloid = te->resorigtbl;
+					break;
+				}
+				/* recurse into is_node_nonnullable for other types of Node */
+				default:
+					return is_node_nonnullable((Node *)te->expr, parse);
+			}
+			break;
+		}
+		case T_CoalesceExpr:
+		{
+			ListCell   			*arg;
+			CoalesceExpr	*cexpr = (CoalesceExpr *)node;
+
+			/* handle COALESCE Function by looking for non-null argument */
+			foreach(arg, cexpr->args)
+			{
+				Node	*e = lfirst(arg);
+
+				/* recurse into is_node_nonnullable */
+				if (is_node_nonnullable(e, parse))
+				{
+					return true;
+				}
+			}
+			break;
+		}
+		case T_Const:
+		{
+			if(!((Const *) node)->constisnull)
+			{
+				return true;
+			}
+			break;
+		}
+		/*
+		 * TODO: handle more cases to make the nullability test more accurate
+		 * Assume unhandled cases are nullable.
+		 */
+		default:
+			return false;
+	}
+
+	/*
+	 * If we have found a Var, it is non-null if:
+	 * not on NULL-padded side of an outer join and has NOT NULL constraint
+	 * or forced NOT NULL by inner join conditions or by other strict predicates.
+	 */
+	if(var &&
+			reloid != InvalidOid)
+	{
+		Relids		 innerjoined_rels = NULL;
+		List	   		*innerjoined_useful_quals = NIL;
+		List			*nonnullable_innerjoined_vars = NIL;
+
+		find_innerjoined_rels((Node *) parse->jointree,
+							  &innerjoined_rels,
+							  &innerjoined_useful_quals);
+
+		/*
+		 * Check if the Var is from an INNER JOINed rel, it's also guaranteed
+		 * to not be on the null-padded side of an outer join.
+		 */
+		if (bms_is_member(var->varno, innerjoined_rels))
+		{
+			/*
+			 * If Var is from a plain relation and its column is marked
+			 * NOT NULL according to the catalogs, it can't produce NULL.
+			 */
+			if (get_attnotnull(reloid, attno))
+			{
+				return true;
+			}
+
+			/*
+			 * Otherwise check for the existance of strict predicates which filter
+			 * out NULL values for this Var.
+			 */
+			nonnullable_innerjoined_vars =
+				find_nonnullable_vars((Node *) innerjoined_useful_quals);
+
+			if (list_member(nonnullable_innerjoined_vars, var))
+			{
+				return true;
+			}
+		}
+
+		/*
+		 * If we get here it means -
+		 * The var is on null-padded side of an outer-join, since we've already
+		 * tried reduce_outer_joins(), there isn't any strict predicates to turn this
+		 * outer join to inner join, then there will be no strict predicates to force
+		 * this var non-null as well.
+		 */
+	}
+
+	return false;
+}
+
+/*
+ * Returns true if at least one Node in the input list is non-nullable
+ */
+bool
+list_hasnonnullable(List * list, Query *parse)
+{
+	ListCell	*lc;
+	Node	*node;
+
+	foreach(lc, list)
+	{
+		node = lfirst(lc);
+		if(is_node_nonnullable(node, parse))
+		{
+			return true;
+		}
+	}
+	return false;
+}
+
 /*
  * Can we treat a ScalarArrayOpExpr as strict?
  *
diff --git a/src/backend/utils/cache/lsyscache.c b/src/backend/utils/cache/lsyscache.c
index 27602fa49c..8cd75101e5 100644
--- a/src/backend/utils/cache/lsyscache.c
+++ b/src/backend/utils/cache/lsyscache.c
@@ -908,6 +908,35 @@ get_atttypetypmodcoll(Oid relid, AttrNumber attnum,
 	ReleaseSysCache(tp);
 }
 
+/*
+ * get_attnotnull
+ *		Given the relation id and the attribute number,
+ *		return the "attnotnull" field from the attribute relation.
+ */
+bool
+get_attnotnull(Oid relid, AttrNumber attnum)
+{
+	HeapTuple	  tp;
+	Form_pg_attribute att_tup;
+
+	tp = SearchSysCache2(ATTNUM,
+			ObjectIdGetDatum(relid),
+			Int16GetDatum(attnum));
+
+	if (HeapTupleIsValid(tp))
+	{
+		bool result;
+
+		att_tup = (Form_pg_attribute) GETSTRUCT(tp);
+		result = att_tup->attnotnull;
+		ReleaseSysCache(tp);
+
+		return result;
+	}
+	/* Assume att is nullable if no valid heap tuple is found */
+	return false;
+}
+
 /*				---------- COLLATION CACHE ----------					 */
 
 /*
diff --git a/src/backend/utils/misc/guc.c b/src/backend/utils/misc/guc.c
index 5fccc9683e..0c7a436641 100644
--- a/src/backend/utils/misc/guc.c
+++ b/src/backend/utils/misc/guc.c
@@ -1059,6 +1059,15 @@ static struct config_bool ConfigureNamesBool[] =
 		true,
 		NULL, NULL, NULL
 	},
+	{
+		{"enable_not_in_transform", PGC_USERSET, QUERY_TUNING_METHOD,
+			gettext_noop("Enables the planner to transform NOT IN subquery to ANTI JOIN when possible."),
+			NULL
+		},
+		&enable_not_in_transform,
+		true,
+		NULL, NULL, NULL
+	},
 	{
 		{"geqo", PGC_USERSET, QUERY_TUNING_GEQO,
 			gettext_noop("Enables genetic query optimization."),
diff --git a/src/include/nodes/primnodes.h b/src/include/nodes/primnodes.h
index 860a84de7c..02bf2318a9 100644
--- a/src/include/nodes/primnodes.h
+++ b/src/include/nodes/primnodes.h
@@ -643,6 +643,8 @@ typedef struct SubLink
 	List	   *operName;		/* originally specified operator name */
 	Node	   *subselect;		/* subselect as Query* or raw parsetree */
 	int			location;		/* token location, or -1 if unknown */
+	Node	   *subroot;		/* PlannerInfo for the subquery */
+	Node	   *subplan;		/* best Plan for the subquery */
 } SubLink;
 
 /*
diff --git a/src/include/optimizer/clauses.h b/src/include/optimizer/clauses.h
index 2f9aeec4a7..b3229e96a7 100644
--- a/src/include/optimizer/clauses.h
+++ b/src/include/optimizer/clauses.h
@@ -45,6 +45,9 @@ extern List *find_nonnullable_vars(Node *clause);
 extern List *find_forced_null_vars(Node *clause);
 extern Var *find_forced_null_var(Node *clause);
 
+extern bool is_node_nonnullable(Node * node, Query *parse);
+extern bool list_hasnonnullable(List * list, Query *parse);
+
 extern bool is_pseudo_constant_clause(Node *clause);
 extern bool is_pseudo_constant_clause_relids(Node *clause, Relids relids);
 
diff --git a/src/include/optimizer/cost.h b/src/include/optimizer/cost.h
index b3d0b4f6fb..9a5191cd9a 100644
--- a/src/include/optimizer/cost.h
+++ b/src/include/optimizer/cost.h
@@ -58,6 +58,7 @@ extern PGDLLIMPORT bool enable_nestloop;
 extern PGDLLIMPORT bool enable_material;
 extern PGDLLIMPORT bool enable_mergejoin;
 extern PGDLLIMPORT bool enable_hashjoin;
+extern PGDLLIMPORT bool enable_not_in_transform;
 extern PGDLLIMPORT bool enable_gathermerge;
 extern PGDLLIMPORT bool enable_partitionwise_join;
 extern PGDLLIMPORT bool enable_partitionwise_aggregate;
diff --git a/src/include/optimizer/prep.h b/src/include/optimizer/prep.h
index 9018160d80..6518417d8f 100644
--- a/src/include/optimizer/prep.h
+++ b/src/include/optimizer/prep.h
@@ -26,7 +26,7 @@ extern void pull_up_sublinks(PlannerInfo *root);
 extern void preprocess_function_rtes(PlannerInfo *root);
 extern void pull_up_subqueries(PlannerInfo *root);
 extern void flatten_simple_union_all(PlannerInfo *root);
-extern void reduce_outer_joins(PlannerInfo *root);
+extern void reduce_outer_joins(Query *parse);
 extern void remove_useless_result_rtes(PlannerInfo *root);
 extern Relids get_relids_in_jointree(Node *jtnode, bool include_joins);
 extern Relids get_relids_for_join(Query *query, int joinrelid);
diff --git a/src/include/optimizer/subselect.h b/src/include/optimizer/subselect.h
index 71a22eeb64..77c3730e47 100644
--- a/src/include/optimizer/subselect.h
+++ b/src/include/optimizer/subselect.h
@@ -19,6 +19,8 @@
 extern void SS_process_ctes(PlannerInfo *root);
 extern JoinExpr *convert_ANY_sublink_to_join(PlannerInfo *root,
 											 SubLink *sublink,
+											 bool under_not,
+											 Node **pullout,
 											 Relids available_rels);
 extern JoinExpr *convert_EXISTS_sublink_to_join(PlannerInfo *root,
 												SubLink *sublink,
diff --git a/src/include/utils/lsyscache.h b/src/include/utils/lsyscache.h
index c8df5bff9f..d2bf81985d 100644
--- a/src/include/utils/lsyscache.h
+++ b/src/include/utils/lsyscache.h
@@ -90,6 +90,7 @@ extern char get_attgenerated(Oid relid, AttrNumber attnum);
 extern Oid	get_atttype(Oid relid, AttrNumber attnum);
 extern void get_atttypetypmodcoll(Oid relid, AttrNumber attnum,
 								  Oid *typid, int32 *typmod, Oid *collid);
+extern bool get_attnotnull(Oid relid, AttrNumber attnum);
 extern char *get_collation_name(Oid colloid);
 extern bool get_collation_isdeterministic(Oid colloid);
 extern char *get_constraint_name(Oid conoid);
diff --git a/src/test/regress/expected/subselect.out b/src/test/regress/expected/subselect.out
index ee9c5db0d5..07b53473b9 100644
--- a/src/test/regress/expected/subselect.out
+++ b/src/test/regress/expected/subselect.out
@@ -1466,3 +1466,2789 @@ select * from x for update;
    Output: subselect_tbl.f1, subselect_tbl.f2, subselect_tbl.f3
 (2 rows)
 
+-- test NON IN to ANTI JOIN conversion
+CREATE TABLE s (u INTEGER NOT NULL, n INTEGER NULL, nn INTEGER NOT NULL, p VARCHAR(128) NULL);
+insert into s (u, n, nn, p)
+    select
+    generate_series(1,3) as u,
+	generate_series(1,3) as n,
+	generate_series(1,3) as nn,
+	'foo' as p;
+insert into s values(1000002, 1000002, 1000002, 'foofoo');
+UPDATE s set n = NULL WHERE n = 3;
+analyze s;
+CREATE TABLE l (u INTEGER NOT NULL, n INTEGER NULL, nn INTEGER NOT NULL, p VARCHAR(128) NULL);
+insert into l (u, n, nn, p)
+	select
+    generate_series(1,10000 ) as u,
+	generate_series(1,10000 ) as n,
+	generate_series(1,10000 ) as nn,
+	'bar' as p;
+UPDATE l set n = NULL WHERE n = 7;
+CREATE UNIQUE INDEX l_u ON l (u);
+CREATE INDEX l_n ON l (n);
+CREATE INDEX l_nn ON l (nn);
+analyze l;
+CREATE TABLE s1 (u INTEGER NOT NULL, n INTEGER NULL, n1 INTEGER NULL, nn INTEGER NOT NULL, p VARCHAR(128) NULL);
+insert into s1 (u, n, n1, nn, p)
+    select
+    generate_series(1,3) as u,
+	generate_series(1,3) as n,
+	generate_series(1,3) as n1,
+	generate_series(1,3) as nn,
+	'foo' as p;
+insert into s1 values(1000003, 1000003, 1000003, 1000003, 'foofoo');
+insert into s1 values(1003, 1003, 1003, 1003, 'foofoo');
+UPDATE s1 set n = NULL WHERE n = 3;
+UPDATE s1 set n1 = NULL WHERE n = 2;
+UPDATE s1 set n1 = NULL WHERE n1 = 3;
+analyze s1;
+CREATE TABLE empty (u INTEGER NOT NULL, n INTEGER NULL, nn INTEGER NOT NULL, p VARCHAR(128) NULL);
+analyze empty;
+-- set work_mem to 64KB so that NOT IN to ANTI JOIN optimization will kick in
+set work_mem = 64;
+-- correctness test 1: inner empty, return every thing from outer including NULL
+explain (costs false) select * from s where n not in (select n from empty);
+             QUERY PLAN             
+------------------------------------
+ Seq Scan on s
+   Filter: (NOT (hashed SubPlan 1))
+   SubPlan 1
+     ->  Seq Scan on empty
+(4 rows)
+
+select * from s where n not in (select n from empty);
+    u    |    n    |   nn    |   p    
+---------+---------+---------+--------
+       1 |       1 |       1 | foo
+       2 |       2 |       2 | foo
+ 1000002 | 1000002 | 1000002 | foofoo
+       3 |         |       3 | foo
+(4 rows)
+
+-- correctness test 2: inner has NULL, return empty result
+explain (costs false) select * from s where n not in (select n from l);
+                    QUERY PLAN                    
+--------------------------------------------------
+ Nested Loop Anti Join
+   InitPlan 1 (returns $0)
+     ->  Seq Scan on l l_1
+   ->  Seq Scan on s
+         Filter: ((n IS NOT NULL) OR (NOT $0))
+   ->  Bitmap Heap Scan on l
+         Recheck Cond: ((s.n = n) OR (n IS NULL))
+         ->  BitmapOr
+               ->  Bitmap Index Scan on l_n
+                     Index Cond: (n = s.n)
+               ->  Bitmap Index Scan on l_n
+                     Index Cond: (n IS NULL)
+(12 rows)
+
+select * from s where n not in (select n from l);
+ u | n | nn | p 
+---+---+----+---
+(0 rows)
+
+-- correctness test 3: inner non-null, result has no NULL
+explain (costs false) select * from s where n not in (select u from l);
+                  QUERY PLAN                   
+-----------------------------------------------
+ Nested Loop Anti Join
+   InitPlan 1 (returns $0)
+     ->  Seq Scan on l l_1
+   ->  Seq Scan on s
+         Filter: ((n IS NOT NULL) OR (NOT $0))
+   ->  Index Only Scan using l_u on l
+         Index Cond: (u = s.n)
+(7 rows)
+
+select * from s where n not in (select u from l);
+    u    |    n    |   nn    |   p    
+---------+---------+---------+--------
+ 1000002 | 1000002 | 1000002 | foofoo
+(1 row)
+
+-- correctness test 4: inner has predicate
+explain (costs false) select * from s where n not in (select n from l where u > 7);
+                    QUERY PLAN                    
+--------------------------------------------------
+ Nested Loop Anti Join
+   InitPlan 1 (returns $0)
+     ->  Seq Scan on l l_1
+           Filter: (u > 7)
+   ->  Seq Scan on s
+         Filter: ((n IS NOT NULL) OR (NOT $0))
+   ->  Bitmap Heap Scan on l
+         Recheck Cond: ((s.n = n) OR (n IS NULL))
+         Filter: (u > 7)
+         ->  BitmapOr
+               ->  Bitmap Index Scan on l_n
+                     Index Cond: (n = s.n)
+               ->  Bitmap Index Scan on l_n
+                     Index Cond: (n IS NULL)
+(14 rows)
+
+select * from s where n not in (select n from l where u > 7);
+    u    |    n    |   nn    |   p    
+---------+---------+---------+--------
+       1 |       1 |       1 | foo
+       2 |       2 |       2 | foo
+ 1000002 | 1000002 | 1000002 | foofoo
+(3 rows)
+
+-- correctness test 5: multi-expression, (2, 2, null, 2, foo) should be in the result
+explain (costs false) select * from s1 where (n,n1) not in (select u,nn from l where u >= 3);
+                           QUERY PLAN                            
+-----------------------------------------------------------------
+ Nested Loop Anti Join
+   Join Filter: (((s1.n = l.u) AND (s1.n1 = l.nn)) IS NOT FALSE)
+   ->  Seq Scan on s1
+   ->  Materialize
+         ->  Seq Scan on l
+               Filter: (u >= 3)
+(6 rows)
+
+select * from s1 where (n,n1) not in (select u,nn from l where u >= 3);
+    u    |    n    |   n1    |   nn    |   p    
+---------+---------+---------+---------+--------
+       1 |       1 |       1 |       1 | foo
+ 1000003 | 1000003 | 1000003 | 1000003 | foofoo
+       2 |       2 |         |       2 | foo
+(3 rows)
+
+-- correctness test 6: multi-expression, (3, null, null, 3, foo) should not be in the result
+explain (costs false) select * from s1 where (n,n1) not in (select u,nn from l where u > 0);
+                           QUERY PLAN                            
+-----------------------------------------------------------------
+ Nested Loop Anti Join
+   Join Filter: (((s1.n = l.u) AND (s1.n1 = l.nn)) IS NOT FALSE)
+   ->  Seq Scan on s1
+   ->  Materialize
+         ->  Seq Scan on l
+               Filter: (u > 0)
+(6 rows)
+
+select * from s1 where (n,n1) not in (select u,nn from l where u > 0);
+    u    |    n    |   n1    |   nn    |   p    
+---------+---------+---------+---------+--------
+ 1000003 | 1000003 | 1000003 | 1000003 | foofoo
+(1 row)
+
+-- correctness test 6: multi-expression, (3, null, null, 3, foo) should be in the result
+explain (costs false) select * from s1 where (n,n1) not in (select u,nn from l where u < 0);
+             QUERY PLAN             
+------------------------------------
+ Seq Scan on s1
+   Filter: (NOT (hashed SubPlan 1))
+   SubPlan 1
+     ->  Index Scan using l_u on l
+           Index Cond: (u < 0)
+(5 rows)
+
+select * from s1 where (n,n1) not in (select u,nn from l where u < 0);
+    u    |    n    |   n1    |   nn    |   p    
+---------+---------+---------+---------+--------
+       1 |       1 |       1 |       1 | foo
+ 1000003 | 1000003 | 1000003 | 1000003 | foofoo
+    1003 |    1003 |    1003 |    1003 | foofoo
+       2 |       2 |         |       2 | foo
+       3 |         |         |       3 | foo
+(5 rows)
+
+-- test using hashed subplan when inner fits in work_mem
+explain (costs false) select * from l where n not in (select n from s);
+             QUERY PLAN             
+------------------------------------
+ Seq Scan on l
+   Filter: (NOT (hashed SubPlan 1))
+   SubPlan 1
+     ->  Seq Scan on s
+(4 rows)
+
+select * from l where n not in (select n from s);
+ u | n | nn | p 
+---+---+----+---
+(0 rows)
+
+-- test single expression
+explain (costs false) select * from s where n not in (select n from l);
+                    QUERY PLAN                    
+--------------------------------------------------
+ Nested Loop Anti Join
+   InitPlan 1 (returns $0)
+     ->  Seq Scan on l l_1
+   ->  Seq Scan on s
+         Filter: ((n IS NOT NULL) OR (NOT $0))
+   ->  Bitmap Heap Scan on l
+         Recheck Cond: ((s.n = n) OR (n IS NULL))
+         ->  BitmapOr
+               ->  Bitmap Index Scan on l_n
+                     Index Cond: (n = s.n)
+               ->  Bitmap Index Scan on l_n
+                     Index Cond: (n IS NULL)
+(12 rows)
+
+select * from s where n not in (select n from l);
+ u | n | nn | p 
+---+---+----+---
+(0 rows)
+
+explain (costs false) select * from s where u not in (select u from l);
+              QUERY PLAN              
+--------------------------------------
+ Nested Loop Anti Join
+   ->  Seq Scan on s
+   ->  Index Only Scan using l_u on l
+         Index Cond: (u = s.u)
+(4 rows)
+
+select * from s where u not in (select u from l);
+    u    |    n    |   nn    |   p    
+---------+---------+---------+--------
+ 1000002 | 1000002 | 1000002 | foofoo
+(1 row)
+
+explain (costs false) select * from s where 3*n not in (select n from l);
+                       QUERY PLAN                       
+--------------------------------------------------------
+ Nested Loop Anti Join
+   InitPlan 1 (returns $0)
+     ->  Seq Scan on l l_1
+   ->  Seq Scan on s
+         Filter: (((3 * n) IS NOT NULL) OR (NOT $0))
+   ->  Bitmap Heap Scan on l
+         Recheck Cond: (((3 * s.n) = n) OR (n IS NULL))
+         ->  BitmapOr
+               ->  Bitmap Index Scan on l_n
+                     Index Cond: (n = (3 * s.n))
+               ->  Bitmap Index Scan on l_n
+                     Index Cond: (n IS NULL)
+(12 rows)
+
+select * from s where 3*n not in (select n from l);
+ u | n | nn | p 
+---+---+----+---
+(0 rows)
+
+explain (costs false) select * from s where n not in (select 3*n from l);
+                        QUERY PLAN                         
+-----------------------------------------------------------
+ Nested Loop Anti Join
+   Join Filter: ((s.n = (3 * l.n)) OR ((3 * l.n) IS NULL))
+   InitPlan 1 (returns $0)
+     ->  Seq Scan on l l_1
+   ->  Seq Scan on s
+         Filter: ((n IS NOT NULL) OR (NOT $0))
+   ->  Materialize
+         ->  Seq Scan on l
+(8 rows)
+
+select * from s where n not in (select 3*n from l);
+ u | n | nn | p 
+---+---+----+---
+(0 rows)
+
+-- test single expression with predicates
+explain (costs false) select * from s where n not in (select n from l where u > 0);
+                    QUERY PLAN                    
+--------------------------------------------------
+ Nested Loop Anti Join
+   InitPlan 1 (returns $0)
+     ->  Seq Scan on l l_1
+           Filter: (u > 0)
+   ->  Seq Scan on s
+         Filter: ((n IS NOT NULL) OR (NOT $0))
+   ->  Bitmap Heap Scan on l
+         Recheck Cond: ((s.n = n) OR (n IS NULL))
+         Filter: (u > 0)
+         ->  BitmapOr
+               ->  Bitmap Index Scan on l_n
+                     Index Cond: (n = s.n)
+               ->  Bitmap Index Scan on l_n
+                     Index Cond: (n IS NULL)
+(14 rows)
+
+select * from s where n not in (select n from l where u > 0);
+ u | n | nn | p 
+---+---+----+---
+(0 rows)
+
+explain (costs false) select * from s where n not in (select n from l where u > 100);
+                    QUERY PLAN                    
+--------------------------------------------------
+ Nested Loop Anti Join
+   InitPlan 1 (returns $0)
+     ->  Seq Scan on l l_1
+           Filter: (u > 100)
+   ->  Seq Scan on s
+         Filter: ((n IS NOT NULL) OR (NOT $0))
+   ->  Bitmap Heap Scan on l
+         Recheck Cond: ((s.n = n) OR (n IS NULL))
+         Filter: (u > 100)
+         ->  BitmapOr
+               ->  Bitmap Index Scan on l_n
+                     Index Cond: (n = s.n)
+               ->  Bitmap Index Scan on l_n
+                     Index Cond: (n IS NULL)
+(14 rows)
+
+select * from s where n not in (select n from l where u > 100);
+    u    |    n    |   nn    |   p    
+---------+---------+---------+--------
+       1 |       1 |       1 | foo
+       2 |       2 |       2 | foo
+ 1000002 | 1000002 | 1000002 | foofoo
+(3 rows)
+
+-- test multi expression
+explain (costs false) select * from s where (n,u) not in (select n,u from l);
+                         QUERY PLAN                          
+-------------------------------------------------------------
+ Nested Loop Anti Join
+   ->  Seq Scan on s
+   ->  Index Scan using l_u on l
+         Index Cond: (u = s.u)
+         Filter: ((s.n = n) OR (n IS NULL) OR (s.n IS NULL))
+(5 rows)
+
+select * from s where (n,u) not in (select n,u from l);
+    u    |    n    |   nn    |   p    
+---------+---------+---------+--------
+ 1000002 | 1000002 | 1000002 | foofoo
+(1 row)
+
+explain (costs false) select * from s where (u, nn) not in (select u, nn from l);
+            QUERY PLAN            
+----------------------------------
+ Nested Loop Anti Join
+   ->  Seq Scan on s
+   ->  Index Scan using l_nn on l
+         Index Cond: (nn = s.nn)
+         Filter: (s.u = u)
+(5 rows)
+
+select * from s where (u, nn) not in (select u, nn from l);
+    u    |    n    |   nn    |   p    
+---------+---------+---------+--------
+ 1000002 | 1000002 | 1000002 | foofoo
+(1 row)
+
+explain (costs false) select * from s where (n,u) not in (select u,n from l);
+                    QUERY PLAN                    
+--------------------------------------------------
+ Nested Loop Anti Join
+   ->  Seq Scan on s
+   ->  Bitmap Heap Scan on l
+         Recheck Cond: ((s.u = n) OR (n IS NULL))
+         Filter: ((s.n = u) OR (s.n IS NULL))
+         ->  BitmapOr
+               ->  Bitmap Index Scan on l_n
+                     Index Cond: (n = s.u)
+               ->  Bitmap Index Scan on l_n
+                     Index Cond: (n IS NULL)
+(10 rows)
+
+select * from s where (n,u) not in (select u,n from l);
+    u    |    n    |   nn    |   p    
+---------+---------+---------+--------
+ 1000002 | 1000002 | 1000002 | foofoo
+(1 row)
+
+explain (costs false) select * from s where (n,u,nn) not in (select u,n,nn from l);
+                                  QUERY PLAN                                   
+-------------------------------------------------------------------------------
+ Nested Loop Anti Join
+   ->  Seq Scan on s
+   ->  Index Scan using l_nn on l
+         Index Cond: (nn = s.nn)
+         Filter: (((s.n = u) OR (s.n IS NULL)) AND ((s.u = n) OR (n IS NULL)))
+(5 rows)
+
+select * from s where (n,u,nn) not in (select u,n,nn from l);
+    u    |    n    |   nn    |   p    
+---------+---------+---------+--------
+ 1000002 | 1000002 | 1000002 | foofoo
+(1 row)
+
+explain (costs false) select * from s where (n,u,nn) not in (select u,n,nn from l where u > 1000);
+                                          QUERY PLAN                                          
+----------------------------------------------------------------------------------------------
+ Nested Loop Anti Join
+   ->  Seq Scan on s
+   ->  Index Scan using l_nn on l
+         Index Cond: (nn = s.nn)
+         Filter: ((u > 1000) AND ((s.n = u) OR (s.n IS NULL)) AND ((s.u = n) OR (n IS NULL)))
+(5 rows)
+
+select * from s where (n,u,nn) not in (select u,n,nn from l where u > 1000);
+    u    |    n    |   nn    |   p    
+---------+---------+---------+--------
+       1 |       1 |       1 | foo
+       2 |       2 |       2 | foo
+ 1000002 | 1000002 | 1000002 | foofoo
+       3 |         |       3 | foo
+(4 rows)
+
+explain (costs false) select * from s where (n,u,nn) not in (select u,n,nn from l where u > 0);
+                                        QUERY PLAN                                         
+-------------------------------------------------------------------------------------------
+ Nested Loop Anti Join
+   ->  Seq Scan on s
+   ->  Index Scan using l_nn on l
+         Index Cond: (nn = s.nn)
+         Filter: ((u > 0) AND ((s.n = u) OR (s.n IS NULL)) AND ((s.u = n) OR (n IS NULL)))
+(5 rows)
+
+select * from s where (n,u,nn) not in (select u,n,nn from l where u > 0);
+    u    |    n    |   nn    |   p    
+---------+---------+---------+--------
+ 1000002 | 1000002 | 1000002 | foofoo
+(1 row)
+
+explain (costs false) select * from s where (n,u,nn) not in (select u,n,nn from l where u > 1);
+                                        QUERY PLAN                                         
+-------------------------------------------------------------------------------------------
+ Nested Loop Anti Join
+   ->  Seq Scan on s
+   ->  Index Scan using l_nn on l
+         Index Cond: (nn = s.nn)
+         Filter: ((u > 1) AND ((s.n = u) OR (s.n IS NULL)) AND ((s.u = n) OR (n IS NULL)))
+(5 rows)
+
+select * from s where (n,u,nn) not in (select u,n,nn from l where u > 1);
+    u    |    n    |   nn    |   p    
+---------+---------+---------+--------
+       1 |       1 |       1 | foo
+ 1000002 | 1000002 | 1000002 | foofoo
+(2 rows)
+
+-- test multi-table
+explain (costs false) select count(*) from s, l where s.n not in (select n from l);
+                          QUERY PLAN                          
+--------------------------------------------------------------
+ Aggregate
+   InitPlan 1 (returns $0)
+     ->  Seq Scan on l l_2
+   ->  Nested Loop
+         ->  Nested Loop Anti Join
+               ->  Seq Scan on s
+                     Filter: ((n IS NOT NULL) OR (NOT $0))
+               ->  Bitmap Heap Scan on l l_1
+                     Recheck Cond: ((s.n = n) OR (n IS NULL))
+                     ->  BitmapOr
+                           ->  Bitmap Index Scan on l_n
+                                 Index Cond: (n = s.n)
+                           ->  Bitmap Index Scan on l_n
+                                 Index Cond: (n IS NULL)
+         ->  Seq Scan on l
+(15 rows)
+
+select count(*) from s, l where s.n not in (select n from l);
+ count 
+-------
+     0
+(1 row)
+
+explain (costs false) select count(*) from s, l where s.nn not in (select nn from l);
+                      QUERY PLAN                       
+-------------------------------------------------------
+ Aggregate
+   ->  Nested Loop
+         ->  Nested Loop Anti Join
+               ->  Seq Scan on s
+               ->  Index Only Scan using l_nn on l l_1
+                     Index Cond: (nn = s.nn)
+         ->  Seq Scan on l
+(7 rows)
+
+select count(*) from s, l where s.nn not in (select nn from l);
+ count 
+-------
+ 10000
+(1 row)
+
+-- test null padded results from outer join
+explain (costs false) select * from s where n not in (select s.nn from l left join s on l.nn = s.nn);
+                     QUERY PLAN                      
+-----------------------------------------------------
+ Nested Loop Anti Join
+   Join Filter: ((s.n = s_1.nn) OR (s_1.nn IS NULL))
+   InitPlan 1 (returns $0)
+     ->  Nested Loop Left Join
+           Join Filter: (l_1.nn = s_2.nn)
+           ->  Seq Scan on l l_1
+           ->  Materialize
+                 ->  Seq Scan on s s_2
+   ->  Seq Scan on s
+         Filter: ((n IS NOT NULL) OR (NOT $0))
+   ->  Hash Left Join
+         Hash Cond: (l.nn = s_1.nn)
+         ->  Seq Scan on l
+         ->  Hash
+               ->  Seq Scan on s s_1
+(15 rows)
+
+select * from s where n not in (select s.nn from l left join s on l.nn = s.nn);
+ u | n | nn | p 
+---+---+----+---
+(0 rows)
+
+explain (costs false) select * from s where n not in (select s.nn from s right join l on s.nn = l.nn);
+                     QUERY PLAN                      
+-----------------------------------------------------
+ Nested Loop Anti Join
+   Join Filter: ((s.n = s_1.nn) OR (s_1.nn IS NULL))
+   InitPlan 1 (returns $0)
+     ->  Nested Loop Left Join
+           Join Filter: (s_2.nn = l_1.nn)
+           ->  Seq Scan on l l_1
+           ->  Materialize
+                 ->  Seq Scan on s s_2
+   ->  Seq Scan on s
+         Filter: ((n IS NOT NULL) OR (NOT $0))
+   ->  Hash Left Join
+         Hash Cond: (l.nn = s_1.nn)
+         ->  Seq Scan on l
+         ->  Hash
+               ->  Seq Scan on s s_1
+(15 rows)
+
+select * from s where n not in (select s.nn from s right join l on s.nn = l.nn);
+ u | n | nn | p 
+---+---+----+---
+(0 rows)
+
+explain (costs false) select count(*) from s right join l on s.nn = l.nn where l.nn not in (select nn from s);
+                   QUERY PLAN                   
+------------------------------------------------
+ Aggregate
+   ->  Hash Left Join
+         Hash Cond: (l.nn = s.nn)
+         ->  Seq Scan on l
+               Filter: (NOT (hashed SubPlan 1))
+               SubPlan 1
+                 ->  Seq Scan on s s_1
+         ->  Hash
+               ->  Seq Scan on s
+(9 rows)
+
+select count(*) from s right join l on s.nn = l.nn where l.nn not in (select nn from s);
+ count 
+-------
+  9997
+(1 row)
+
+explain (costs false) select count(*) from s right join l on s.nn = l.nn where s.nn not in (select nn from s);
+                QUERY PLAN                
+------------------------------------------
+ Aggregate
+   ->  Hash Left Join
+         Hash Cond: (l.nn = s.nn)
+         Filter: (NOT (hashed SubPlan 1))
+         ->  Seq Scan on l
+         ->  Hash
+               ->  Seq Scan on s
+         SubPlan 1
+           ->  Seq Scan on s s_1
+(9 rows)
+
+select count(*) from s right join l on s.nn = l.nn where s.nn not in (select nn from s);
+ count 
+-------
+     0
+(1 row)
+
+explain (costs false) select count(*) from s right join l on s.nn=l.nn where l.nn not in (select l.nn from l left join s on l.nn = s.nn);
+                       QUERY PLAN                       
+--------------------------------------------------------
+ Aggregate
+   ->  Nested Loop Left Join
+         Join Filter: (s.nn = l.nn)
+         ->  Hash Anti Join
+               Hash Cond: (l.nn = l_1.nn)
+               ->  Seq Scan on l
+               ->  Hash
+                     ->  Hash Left Join
+                           Hash Cond: (l_1.nn = s_1.nn)
+                           ->  Seq Scan on l l_1
+                           ->  Hash
+                                 ->  Seq Scan on s s_1
+         ->  Seq Scan on s
+(13 rows)
+
+select count(*) from s right join l on s.nn=l.nn where l.nn not in (select l.nn from l left join s on l.nn = s.nn);
+ count 
+-------
+     0
+(1 row)
+
+explain (costs false) select count(*) from s right join l on s.nn=l.nn where s.nn not in (select s.nn from l left join s on l.nn = s.nn);
+                         QUERY PLAN                         
+------------------------------------------------------------
+ Aggregate
+   InitPlan 1 (returns $0)
+     ->  Nested Loop Left Join
+           Join Filter: (l_2.nn = s_2.nn)
+           ->  Seq Scan on l l_2
+           ->  Materialize
+                 ->  Seq Scan on s s_2
+   ->  Nested Loop Anti Join
+         Join Filter: ((s.nn = s_1.nn) OR (s_1.nn IS NULL))
+         ->  Hash Left Join
+               Hash Cond: (l.nn = s.nn)
+               Filter: ((s.nn IS NOT NULL) OR (NOT $0))
+               ->  Seq Scan on l
+               ->  Hash
+                     ->  Seq Scan on s
+         ->  Materialize
+               ->  Hash Left Join
+                     Hash Cond: (l_1.nn = s_1.nn)
+                     ->  Seq Scan on l l_1
+                     ->  Hash
+                           ->  Seq Scan on s s_1
+(21 rows)
+
+select count(*) from s right join l on s.nn=l.nn where s.nn not in (select s.nn from l left join s on l.nn = s.nn);
+ count 
+-------
+     0
+(1 row)
+
+explain (costs false) select count(*) from s left join s1 on s.u=s1.u join l on s.u=l.u where s.nn not in (select nn from l);
+                         QUERY PLAN                          
+-------------------------------------------------------------
+ Aggregate
+   ->  Nested Loop
+         ->  Nested Loop Left Join
+               Join Filter: (s.u = s1.u)
+               ->  Nested Loop Anti Join
+                     ->  Seq Scan on s
+                     ->  Index Only Scan using l_nn on l l_1
+                           Index Cond: (nn = s.nn)
+               ->  Seq Scan on s1
+         ->  Index Only Scan using l_u on l
+               Index Cond: (u = s.u)
+(11 rows)
+
+select count(*) from s left join s1 on s.u=s1.u join l on s.u=l.u where s.nn not in (select nn from l);
+ count 
+-------
+     0
+(1 row)
+
+explain (costs false) select count(*) from s left join s1 on s.u=s1.u right join l on s.u=l.u where s.nn not in (select nn from l);
+                       QUERY PLAN                       
+--------------------------------------------------------
+ Aggregate
+   InitPlan 1 (returns $0)
+     ->  Seq Scan on l l_2
+   ->  Hash Anti Join
+         Hash Cond: (s.nn = l_1.nn)
+         ->  Hash Left Join
+               Hash Cond: (l.u = s.u)
+               Filter: ((s.nn IS NOT NULL) OR (NOT $0))
+               ->  Seq Scan on l
+               ->  Hash
+                     ->  Hash Right Join
+                           Hash Cond: (s1.u = s.u)
+                           ->  Seq Scan on s1
+                           ->  Hash
+                                 ->  Seq Scan on s
+         ->  Hash
+               ->  Seq Scan on l l_1
+(17 rows)
+
+select count(*) from s left join s1 on s.u=s1.u right join l on s.u=l.u where s.nn not in (select nn from l);
+ count 
+-------
+     0
+(1 row)
+
+explain (costs false) select count(*) from s left join s1 on s.u=s1.u left join l on s.u=l.u where s.nn not in (select nn from l);
+                    QUERY PLAN                     
+---------------------------------------------------
+ Aggregate
+   ->  Nested Loop Left Join
+         Join Filter: (s.u = s1.u)
+         ->  Nested Loop Anti Join
+               ->  Seq Scan on s
+               ->  Index Only Scan using l_nn on l
+                     Index Cond: (nn = s.nn)
+         ->  Seq Scan on s1
+(8 rows)
+
+select count(*) from s left join s1 on s.u=s1.u left join l on s.u=l.u where s.nn not in (select nn from l);
+ count 
+-------
+     1
+(1 row)
+
+explain (costs false) select count(*) from s right join s1 on s.u=s1.u join l on s.u=l.u where s.nn not in (select nn from l);
+                         QUERY PLAN                          
+-------------------------------------------------------------
+ Aggregate
+   ->  Nested Loop
+         ->  Nested Loop
+               Join Filter: (s.u = s1.u)
+               ->  Nested Loop Anti Join
+                     ->  Seq Scan on s
+                     ->  Index Only Scan using l_nn on l l_1
+                           Index Cond: (nn = s.nn)
+               ->  Seq Scan on s1
+         ->  Index Only Scan using l_u on l
+               Index Cond: (u = s.u)
+(11 rows)
+
+select count(*) from s right join s1 on s.u=s1.u join l on s.u=l.u where s.nn not in (select nn from l);
+ count 
+-------
+     0
+(1 row)
+
+explain (costs false) select count(*) from s join s1 on s.u=s1.u right join l on s.u=l.u where s.nn not in (select nn from l);
+                       QUERY PLAN                       
+--------------------------------------------------------
+ Aggregate
+   InitPlan 1 (returns $0)
+     ->  Seq Scan on l l_2
+   ->  Hash Anti Join
+         Hash Cond: (s.nn = l_1.nn)
+         ->  Hash Left Join
+               Hash Cond: (l.u = s.u)
+               Filter: ((s.nn IS NOT NULL) OR (NOT $0))
+               ->  Seq Scan on l
+               ->  Hash
+                     ->  Hash Join
+                           Hash Cond: (s1.u = s.u)
+                           ->  Seq Scan on s1
+                           ->  Hash
+                                 ->  Seq Scan on s
+         ->  Hash
+               ->  Seq Scan on l l_1
+(17 rows)
+
+select * from s join s1 on s.u=s1.u right join l on s.u=l.u where s.nn not in (select nn from l);
+ u | n | nn | p | u | n | n1 | nn | p | u | n | nn | p 
+---+---+----+---+---+---+----+----+---+---+---+----+---
+(0 rows)
+
+explain (costs false) select count(*) from s full join s1 on s.u=s1.u join l on s.u=l.u where s.nn not in (select nn from l);
+                         QUERY PLAN                          
+-------------------------------------------------------------
+ Aggregate
+   ->  Nested Loop
+         ->  Nested Loop Left Join
+               Join Filter: (s.u = s1.u)
+               ->  Nested Loop Anti Join
+                     ->  Seq Scan on s
+                     ->  Index Only Scan using l_nn on l l_1
+                           Index Cond: (nn = s.nn)
+               ->  Seq Scan on s1
+         ->  Index Only Scan using l_u on l
+               Index Cond: (u = s.u)
+(11 rows)
+
+select count(*) from s full join s1 on s.u=s1.u join l on s.u=l.u where s.nn not in (select nn from l);
+ count 
+-------
+     0
+(1 row)
+
+explain (costs false) select count(*) from s join s1 on s.u=s1.u full join l on s.u=l.u where s.nn not in (select nn from l);
+                       QUERY PLAN                       
+--------------------------------------------------------
+ Aggregate
+   InitPlan 1 (returns $0)
+     ->  Seq Scan on l l_2
+   ->  Hash Anti Join
+         Hash Cond: (s.nn = l_1.nn)
+         ->  Hash Full Join
+               Hash Cond: (l.u = s.u)
+               Filter: ((s.nn IS NOT NULL) OR (NOT $0))
+               ->  Seq Scan on l
+               ->  Hash
+                     ->  Hash Join
+                           Hash Cond: (s1.u = s.u)
+                           ->  Seq Scan on s1
+                           ->  Hash
+                                 ->  Seq Scan on s
+         ->  Hash
+               ->  Seq Scan on l l_1
+(17 rows)
+
+select count(*) from s join s1 on s.u=s1.u full join l on s.u=l.u where s.nn not in (select nn from l);
+ count 
+-------
+     0
+(1 row)
+
+explain (costs false) select * from s where s.nn not in (select l.nn from l left join s on l.nn=s.nn left join s1 on l.nn=s1.nn);
+                    QUERY PLAN                     
+---------------------------------------------------
+ Nested Loop Anti Join
+   ->  Seq Scan on s
+   ->  Nested Loop Left Join
+         Join Filter: (l.nn = s1.nn)
+         ->  Nested Loop Left Join
+               Join Filter: (l.nn = s_1.nn)
+               ->  Index Only Scan using l_nn on l
+                     Index Cond: (nn = s.nn)
+               ->  Seq Scan on s s_1
+         ->  Seq Scan on s1
+(10 rows)
+
+select * from s where s.nn not in (select l.nn from l left join s on l.nn=s.nn left join s1 on l.nn=s1.nn);
+    u    |    n    |   nn    |   p    
+---------+---------+---------+--------
+ 1000002 | 1000002 | 1000002 | foofoo
+(1 row)
+
+explain (costs false) select * from s where s.nn not in (select l.nn from l left join s on l.nn=s.nn right join s1 on l.nn=s1.nn);
+                     QUERY PLAN                      
+-----------------------------------------------------
+ Seq Scan on s
+   Filter: (NOT (hashed SubPlan 1))
+   SubPlan 1
+     ->  Nested Loop Left Join
+           Join Filter: (l.nn = s_1.nn)
+           ->  Nested Loop Left Join
+                 ->  Seq Scan on s1
+                 ->  Index Only Scan using l_nn on l
+                       Index Cond: (nn = s1.nn)
+           ->  Materialize
+                 ->  Seq Scan on s s_1
+(11 rows)
+
+select * from s where s.nn not in (select l.nn from l left join s on l.nn=s.nn right join s1 on l.nn=s1.nn);
+ u | n | nn | p 
+---+---+----+---
+(0 rows)
+
+explain (costs false) select * from s where (n,u,nn) not in (select l.n,l.u,l.nn from l left join s on l.nn = s.nn);
+                                    QUERY PLAN                                     
+-----------------------------------------------------------------------------------
+ Nested Loop Anti Join
+   ->  Seq Scan on s
+   ->  Nested Loop Left Join
+         Join Filter: (l.nn = s_1.nn)
+         ->  Index Scan using l_nn on l
+               Index Cond: (nn = s.nn)
+               Filter: (((s.n = n) OR (n IS NULL) OR (s.n IS NULL)) AND (s.u = u))
+         ->  Seq Scan on s s_1
+(8 rows)
+
+select * from s where (n,u,nn) not in (select l.n,l.u,l.nn from l left join s on l.nn = s.nn);
+    u    |    n    |   nn    |   p    
+---------+---------+---------+--------
+ 1000002 | 1000002 | 1000002 | foofoo
+(1 row)
+
+explain (costs false) select * from s where (n,u,nn) not in (select l.n,l.u,l.nn from l right join s on l.nn = s.nn);
+                QUERY PLAN                 
+-------------------------------------------
+ Seq Scan on s
+   Filter: (NOT (hashed SubPlan 1))
+   SubPlan 1
+     ->  Nested Loop Left Join
+           ->  Seq Scan on s s_1
+           ->  Index Scan using l_nn on l
+                 Index Cond: (nn = s_1.nn)
+(7 rows)
+
+select * from s where (n,u,nn) not in (select l.n,l.u,l.nn from l left join s on l.nn = s.nn);
+    u    |    n    |   nn    |   p    
+---------+---------+---------+--------
+ 1000002 | 1000002 | 1000002 | foofoo
+(1 row)
+
+--test reduce outer joins from outer query
+explain (costs false) select count(*) from s right join l on s.nn = l.nn where s.nn not in (select nn from l);
+                       QUERY PLAN                       
+--------------------------------------------------------
+ Aggregate
+   InitPlan 1 (returns $0)
+     ->  Seq Scan on l l_2
+   ->  Hash Anti Join
+         Hash Cond: (s.nn = l_1.nn)
+         ->  Hash Left Join
+               Hash Cond: (l.nn = s.nn)
+               Filter: ((s.nn IS NOT NULL) OR (NOT $0))
+               ->  Seq Scan on l
+               ->  Hash
+                     ->  Seq Scan on s
+         ->  Hash
+               ->  Seq Scan on l l_1
+(13 rows)
+
+select count(*) from s right join l on s.nn = l.nn where s.nn not in (select nn from l);
+ count 
+-------
+     0
+(1 row)
+
+explain (costs false) select count(*) from s right join l on s.nn = l.nn where s.nn not in (select nn from l) and s.u>0;
+                      QUERY PLAN                       
+-------------------------------------------------------
+ Aggregate
+   ->  Nested Loop
+         ->  Nested Loop Anti Join
+               ->  Seq Scan on s
+                     Filter: (u > 0)
+               ->  Index Only Scan using l_nn on l l_1
+                     Index Cond: (nn = s.nn)
+         ->  Index Only Scan using l_nn on l
+               Index Cond: (nn = s.nn)
+(9 rows)
+
+select count(*) from s right join l on s.nn = l.nn where s.nn not in (select nn from l) and s.u>0;
+ count 
+-------
+     0
+(1 row)
+
+explain (costs false) select count(*) from s right join l on s.nn = l.nn join s1 on s.u = s1.u where s.nn not in (select nn from l);
+                         QUERY PLAN                          
+-------------------------------------------------------------
+ Aggregate
+   ->  Nested Loop
+         Join Filter: (s.u = s1.u)
+         ->  Nested Loop
+               ->  Nested Loop Anti Join
+                     ->  Seq Scan on s
+                     ->  Index Only Scan using l_nn on l l_1
+                           Index Cond: (nn = s.nn)
+               ->  Index Only Scan using l_nn on l
+                     Index Cond: (nn = s.nn)
+         ->  Seq Scan on s1
+(11 rows)
+
+select count(*) from s right join l on s.nn = l.nn join s1 on s.u = s1.u where s.nn not in (select nn from l);
+ count 
+-------
+     0
+(1 row)
+
+explain (costs false) select count(*) from s right join l on s.nn = l.nn right join s1 on s.u = s1.u where s.nn not in (select nn from l);
+                          QUERY PLAN                           
+---------------------------------------------------------------
+ Aggregate
+   InitPlan 1 (returns $0)
+     ->  Seq Scan on l l_2
+   ->  Nested Loop Anti Join
+         ->  Nested Loop Left Join
+               Join Filter: (s.u = s1.u)
+               Filter: ((s.nn IS NOT NULL) OR (NOT $0))
+               ->  Seq Scan on s1
+               ->  Materialize
+                     ->  Nested Loop
+                           ->  Seq Scan on s
+                           ->  Index Only Scan using l_nn on l
+                                 Index Cond: (nn = s.nn)
+         ->  Index Only Scan using l_nn on l l_1
+               Index Cond: (nn = s.nn)
+(15 rows)
+
+select count(*) from s right join l on s.nn = l.nn right join s1 on s.u = s1.u where s.nn not in (select nn from l);
+ count 
+-------
+     0
+(1 row)
+
+explain (costs false) select count(*) from s right join l on s.nn = l.nn left join s1 on s.u = s1.u where s.nn not in (select nn from l);
+                          QUERY PLAN                          
+--------------------------------------------------------------
+ Aggregate
+   InitPlan 1 (returns $0)
+     ->  Seq Scan on l l_2
+   ->  Nested Loop Left Join
+         Join Filter: (s.u = s1.u)
+         ->  Hash Anti Join
+               Hash Cond: (s.nn = l_1.nn)
+               ->  Hash Left Join
+                     Hash Cond: (l.nn = s.nn)
+                     Filter: ((s.nn IS NOT NULL) OR (NOT $0))
+                     ->  Seq Scan on l
+                     ->  Hash
+                           ->  Seq Scan on s
+               ->  Hash
+                     ->  Seq Scan on l l_1
+         ->  Seq Scan on s1
+(16 rows)
+
+select count(*) from s right join l on s.nn = l.nn left join s1 on s.u = s1.u where s.nn not in (select nn from l);
+ count 
+-------
+     0
+(1 row)
+
+--test reduce outer joins from subquery
+explain (costs false) select * from s where nn not in (select l.nn from l right join s on l.nn = s.nn);
+                  QUERY PLAN                   
+-----------------------------------------------
+ Seq Scan on s
+   Filter: (NOT (hashed SubPlan 1))
+   SubPlan 1
+     ->  Nested Loop Left Join
+           ->  Seq Scan on s s_1
+           ->  Index Only Scan using l_nn on l
+                 Index Cond: (nn = s_1.nn)
+(7 rows)
+
+select * from s where nn not in (select l.nn from l right join s on l.nn = s.nn);
+ u | n | nn | p 
+---+---+----+---
+(0 rows)
+
+explain (costs false) select * from s where nn not in (select l.nn from l right join s on l.nn = s.nn where l.u > 9);
+                QUERY PLAN                 
+-------------------------------------------
+ Seq Scan on s
+   Filter: (NOT (hashed SubPlan 1))
+   SubPlan 1
+     ->  Nested Loop
+           ->  Seq Scan on s s_1
+           ->  Index Scan using l_nn on l
+                 Index Cond: (nn = s_1.nn)
+                 Filter: (u > 9)
+(8 rows)
+
+select * from s where nn not in (select l.nn from l right join s on l.nn = s.nn where l.u > 9);
+    u    |    n    |   nn    |   p    
+---------+---------+---------+--------
+       1 |       1 |       1 | foo
+       2 |       2 |       2 | foo
+ 1000002 | 1000002 | 1000002 | foofoo
+       3 |         |       3 | foo
+(4 rows)
+
+explain (costs false) select * from s where nn not in (select l.nn from l right join s on l.nn = s.nn where s.u > 9);
+                  QUERY PLAN                   
+-----------------------------------------------
+ Seq Scan on s
+   Filter: (NOT (hashed SubPlan 1))
+   SubPlan 1
+     ->  Nested Loop Left Join
+           ->  Seq Scan on s s_1
+                 Filter: (u > 9)
+           ->  Index Only Scan using l_nn on l
+                 Index Cond: (nn = s_1.nn)
+(8 rows)
+
+select * from s where nn not in (select l.nn from l right join s on l.nn = s.nn where s.u > 9);
+ u | n | nn | p 
+---+---+----+---
+(0 rows)
+
+explain (costs false) select * from s where nn not in (select l.nn from l right join s on l.nn = s.nn join s1 on l.n = s1.n);
+                      QUERY PLAN                       
+-------------------------------------------------------
+ Seq Scan on s
+   Filter: (NOT (hashed SubPlan 1))
+   SubPlan 1
+     ->  Nested Loop
+           Join Filter: (l.n = s1.n)
+           ->  Seq Scan on s1
+           ->  Materialize
+                 ->  Nested Loop
+                       ->  Seq Scan on s s_1
+                       ->  Index Scan using l_nn on l
+                             Index Cond: (nn = s_1.nn)
+(11 rows)
+
+select * from s where nn not in (select l.nn from l right join s on l.nn = s.nn join s1 on l.n = s1.n);
+    u    |    n    |   nn    |   p    
+---------+---------+---------+--------
+ 1000002 | 1000002 | 1000002 | foofoo
+       3 |         |       3 | foo
+(2 rows)
+
+explain (costs false) select * from s where nn not in (select l.nn from l right join s on l.nn = s.nn right join s1 on l.n = s1.n);
+                      QUERY PLAN                       
+-------------------------------------------------------
+ Seq Scan on s
+   Filter: (NOT (hashed SubPlan 1))
+   SubPlan 1
+     ->  Nested Loop Left Join
+           Join Filter: (l.n = s1.n)
+           ->  Seq Scan on s1
+           ->  Materialize
+                 ->  Nested Loop
+                       ->  Seq Scan on s s_1
+                       ->  Index Scan using l_nn on l
+                             Index Cond: (nn = s_1.nn)
+(11 rows)
+
+select * from s where nn not in (select l.nn from l right join s on l.nn = s.nn right join s1 on l.n = s1.n);
+ u | n | nn | p 
+---+---+----+---
+(0 rows)
+
+explain (costs false) select * from s where nn not in (select l.nn from l right join s on l.nn = s.nn left join s1 on l.n = s1.n);
+                   QUERY PLAN                    
+-------------------------------------------------
+ Seq Scan on s
+   Filter: (NOT (hashed SubPlan 1))
+   SubPlan 1
+     ->  Nested Loop Left Join
+           Join Filter: (l.n = s1.n)
+           ->  Nested Loop Left Join
+                 ->  Seq Scan on s s_1
+                 ->  Index Scan using l_nn on l
+                       Index Cond: (nn = s_1.nn)
+           ->  Materialize
+                 ->  Seq Scan on s1
+(11 rows)
+
+select * from s where nn not in (select l.nn from l right join s on l.nn = s.nn left join s1 on l.n = s1.n);
+ u | n | nn | p 
+---+---+----+---
+(0 rows)
+
+--test reduce outer join on outer and sub-query
+explain (costs false) select count(*) from s right join l on s.nn = l.nn join s1 on s.u = s1.u where s.nn not in (select l.nn from l right join s on l.nn = s.nn join s1 on l.n = s1.n);
+                                    QUERY PLAN                                    
+----------------------------------------------------------------------------------
+ Aggregate
+   ->  Nested Loop
+         Join Filter: (s.u = s1.u)
+         ->  Seq Scan on s1
+         ->  Materialize
+               ->  Nested Loop
+                     ->  Seq Scan on s
+                           Filter: (NOT (hashed SubPlan 1))
+                           SubPlan 1
+                             ->  Nested Loop
+                                   Join Filter: (l_1.n = s1_1.n)
+                                   ->  Seq Scan on s1 s1_1
+                                   ->  Materialize
+                                         ->  Nested Loop
+                                               ->  Seq Scan on s s_1
+                                               ->  Index Scan using l_nn on l l_1
+                                                     Index Cond: (nn = s_1.nn)
+                     ->  Index Only Scan using l_nn on l
+                           Index Cond: (nn = s.nn)
+(19 rows)
+
+select count(*) from s right join l on s.nn = l.nn join s1 on s.u = s1.u where s.nn not in (select l.nn from l right join s on l.nn = s.nn join s1 on l.n = s1.n);
+ count 
+-------
+     1
+(1 row)
+
+explain (costs false) select count(*) from s right join l on s.nn = l.nn left join s1 on s.u = s1.u where s.nn not in (select l.nn from l right join s on l.nn = s.nn left join s1 on l.n = s1.n);
+                        QUERY PLAN                        
+----------------------------------------------------------
+ Aggregate
+   ->  Hash Left Join
+         Hash Cond: (l.nn = s.nn)
+         Filter: (NOT (hashed SubPlan 1))
+         ->  Seq Scan on l
+         ->  Hash
+               ->  Hash Right Join
+                     Hash Cond: (s1.u = s.u)
+                     ->  Seq Scan on s1
+                     ->  Hash
+                           ->  Seq Scan on s
+         SubPlan 1
+           ->  Nested Loop Left Join
+                 Join Filter: (l_1.n = s1_1.n)
+                 ->  Nested Loop Left Join
+                       ->  Seq Scan on s s_1
+                       ->  Index Scan using l_nn on l l_1
+                             Index Cond: (nn = s_1.nn)
+                 ->  Materialize
+                       ->  Seq Scan on s1 s1_1
+(20 rows)
+
+select count(*) from s right join l on s.nn = l.nn left join s1 on s.u = s1.u where s.nn not in (select l.nn from l right join s on l.nn = s.nn left join s1 on l.n = s1.n);
+ count 
+-------
+     0
+(1 row)
+
+-- test union all
+explain (costs false) select * from s as t where not exists
+(select 1 from (select n as y from l union all
+				select u as y from s union all
+				select nn as y from s) as v where t.n=v.y or v.y is null) and n is not null;
+                       QUERY PLAN                       
+--------------------------------------------------------
+ Nested Loop Anti Join
+   ->  Seq Scan on s t
+         Filter: (n IS NOT NULL)
+   ->  Append
+         ->  Bitmap Heap Scan on l
+               Recheck Cond: ((t.n = n) OR (n IS NULL))
+               ->  BitmapOr
+                     ->  Bitmap Index Scan on l_n
+                           Index Cond: (n = t.n)
+                     ->  Bitmap Index Scan on l_n
+                           Index Cond: (n IS NULL)
+         ->  Seq Scan on s
+               Filter: ((t.n = u) OR (u IS NULL))
+         ->  Seq Scan on s s_1
+               Filter: ((t.n = nn) OR (nn IS NULL))
+(15 rows)
+
+select * from s as t where not exists
+(select 1 from (select n as y from l union all
+				select u as y from s union all
+				select nn as y from s) as v where t.n=v.y or v.y is null) and n is not null;
+ u | n | nn | p 
+---+---+----+---
+(0 rows)
+
+explain (costs false) select * from s where n not in
+(select n as y from l union all
+ select u as y from s union all
+ select nn as y from s);
+                       QUERY PLAN                       
+--------------------------------------------------------
+ Nested Loop Anti Join
+   InitPlan 1 (returns $0)
+     ->  Result
+           ->  Append
+                 ->  Seq Scan on l l_1
+                 ->  Seq Scan on s s_3
+                 ->  Seq Scan on s s_4
+   ->  Seq Scan on s
+         Filter: ((n IS NOT NULL) OR (NOT $0))
+   ->  Append
+         ->  Bitmap Heap Scan on l
+               Recheck Cond: ((s.n = n) OR (n IS NULL))
+               ->  BitmapOr
+                     ->  Bitmap Index Scan on l_n
+                           Index Cond: (n = s.n)
+                     ->  Bitmap Index Scan on l_n
+                           Index Cond: (n IS NULL)
+         ->  Seq Scan on s s_1
+               Filter: ((s.n = u) OR (u IS NULL))
+         ->  Seq Scan on s s_2
+               Filter: ((s.n = nn) OR (nn IS NULL))
+(21 rows)
+
+select * from s where n not in
+(select n as y from l union all
+ select u as y from s union all
+ select nn as y from s);
+ u | n | nn | p 
+---+---+----+---
+(0 rows)
+
+explain (costs false) select count(*) from
+(select n as x from s union all select u as x from l) t where t.x not in
+(select nn from l);
+                        QUERY PLAN                         
+-----------------------------------------------------------
+ Aggregate
+   InitPlan 1 (returns $0)
+     ->  Seq Scan on l l_2
+   ->  Hash Anti Join
+         Hash Cond: (s.n = l_1.nn)
+         ->  Append
+               ->  Seq Scan on s
+                     Filter: ((n IS NOT NULL) OR (NOT $0))
+               ->  Seq Scan on l
+                     Filter: ((u IS NOT NULL) OR (NOT $0))
+         ->  Hash
+               ->  Seq Scan on l l_1
+(12 rows)
+
+select count(*) from
+(select n as x from s union all select u as x from l) t where t.x not in
+(select nn from l);
+ count 
+-------
+     1
+(1 row)
+
+explain (costs false) select count(*) from
+(select n as x from s union all select n as x from l) t where t.x not in
+(select nn from empty);
+                   QUERY PLAN                   
+------------------------------------------------
+ Aggregate
+   ->  Append
+         ->  Seq Scan on s
+               Filter: (NOT (hashed SubPlan 1))
+               SubPlan 1
+                 ->  Seq Scan on empty
+         ->  Seq Scan on l
+               Filter: (NOT (hashed SubPlan 1))
+(8 rows)
+
+select count(*) from
+(select n as x from s union all select n as x from l) t where t.x not in
+(select nn from empty);
+ count 
+-------
+ 10004
+(1 row)
+
+explain (costs false) select count(*) from
+(select n as x from s union all select u as x from l) t where t.x not in
+(select n as y from l union all
+ select u as y from s union all
+ select nn as y from s);
+                                QUERY PLAN                                
+--------------------------------------------------------------------------
+ Finalize Aggregate
+   InitPlan 1 (returns $0)
+     ->  Result
+           ->  Append
+                 ->  Seq Scan on l l_2
+                 ->  Seq Scan on s s_3
+                 ->  Seq Scan on s s_4
+   ->  Gather
+         Workers Planned: 2
+         Params Evaluated: $0
+         ->  Partial Aggregate
+               ->  Nested Loop Anti Join
+                     ->  Parallel Append
+                           ->  Parallel Seq Scan on l
+                                 Filter: ((u IS NOT NULL) OR (NOT $0))
+                           ->  Parallel Seq Scan on s
+                                 Filter: ((n IS NOT NULL) OR (NOT $0))
+                     ->  Append
+                           ->  Bitmap Heap Scan on l l_1
+                                 Recheck Cond: ((l.u = n) OR (n IS NULL))
+                                 ->  BitmapOr
+                                       ->  Bitmap Index Scan on l_n
+                                             Index Cond: (n = l.u)
+                                       ->  Bitmap Index Scan on l_n
+                                             Index Cond: (n IS NULL)
+                           ->  Seq Scan on s s_1
+                                 Filter: ((l.u = u) OR (u IS NULL))
+                           ->  Seq Scan on s s_2
+                                 Filter: ((l.u = nn) OR (nn IS NULL))
+(29 rows)
+
+select count(*) from
+(select n as x from s union all select u as x from l) t where t.x not in
+(select n as y from l union all
+ select u as y from s union all
+ select nn as y from s);
+ count 
+-------
+     0
+(1 row)
+
+-- test multi-levels of NOT IN
+explain (costs false) select * from s where n not in (select n from s where n not in (select n from l));
+                         QUERY PLAN                         
+------------------------------------------------------------
+ Seq Scan on s
+   Filter: (NOT (hashed SubPlan 2))
+   SubPlan 2
+     ->  Nested Loop Anti Join
+           InitPlan 1 (returns $0)
+             ->  Seq Scan on l
+           ->  Seq Scan on s s_1
+                 Filter: ((n IS NOT NULL) OR (NOT $0))
+           ->  Bitmap Heap Scan on l l_1
+                 Recheck Cond: ((s_1.n = n) OR (n IS NULL))
+                 ->  BitmapOr
+                       ->  Bitmap Index Scan on l_n
+                             Index Cond: (n = s_1.n)
+                       ->  Bitmap Index Scan on l_n
+                             Index Cond: (n IS NULL)
+(15 rows)
+
+select * from s where n not in (select n from s where n not in (select n from l));
+    u    |    n    |   nn    |   p    
+---------+---------+---------+--------
+       1 |       1 |       1 | foo
+       2 |       2 |       2 | foo
+ 1000002 | 1000002 | 1000002 | foofoo
+       3 |         |       3 | foo
+(4 rows)
+
+explain (costs false) select * from s where n not in (select n from s where n not in (select u from l));
+                      QUERY PLAN                       
+-------------------------------------------------------
+ Seq Scan on s
+   Filter: (NOT (hashed SubPlan 2))
+   SubPlan 2
+     ->  Nested Loop Anti Join
+           InitPlan 1 (returns $0)
+             ->  Seq Scan on l
+           ->  Seq Scan on s s_1
+                 Filter: ((n IS NOT NULL) OR (NOT $0))
+           ->  Index Only Scan using l_u on l l_1
+                 Index Cond: (u = s_1.n)
+(10 rows)
+
+select * from s where n not in (select n from s where n not in (select u from l));
+ u | n | nn |  p  
+---+---+----+-----
+ 1 | 1 |  1 | foo
+ 2 | 2 |  2 | foo
+(2 rows)
+
+explain (costs false) select count(*) from s where u not in
+(select n from s1 where not exists
+ (select 1 from (select n from s1 where u not in (select n from l)) t where t.n = s.n));
+                               QUERY PLAN                                
+-------------------------------------------------------------------------
+ Aggregate
+   ->  Seq Scan on s
+         Filter: (NOT (SubPlan 2))
+         SubPlan 2
+           ->  Result
+                 One-Time Filter: (NOT $2)
+                 InitPlan 1 (returns $2)
+                   ->  Nested Loop Anti Join
+                         ->  Seq Scan on s1
+                               Filter: (n = s.n)
+                         ->  Bitmap Heap Scan on l
+                               Recheck Cond: ((s1.u = n) OR (n IS NULL))
+                               ->  BitmapOr
+                                     ->  Bitmap Index Scan on l_n
+                                           Index Cond: (n = s1.u)
+                                     ->  Bitmap Index Scan on l_n
+                                           Index Cond: (n IS NULL)
+                 ->  Seq Scan on s1 s1_1
+(18 rows)
+
+select count(*) from s where u not in
+(select n from s1 where not exists
+ (select 1 from (select n from s1 where u not in (select n from l)) t where t.n = s.n));
+ count 
+-------
+     0
+(1 row)
+
+explain (costs false) select * from s where n not in (select n from s1) and u not in (select u from s1) and nn not in (select nn from s1);
+                                           QUERY PLAN                                           
+------------------------------------------------------------------------------------------------
+ Seq Scan on s
+   Filter: ((NOT (hashed SubPlan 1)) AND (NOT (hashed SubPlan 2)) AND (NOT (hashed SubPlan 3)))
+   SubPlan 1
+     ->  Seq Scan on s1
+   SubPlan 2
+     ->  Seq Scan on s1 s1_1
+   SubPlan 3
+     ->  Seq Scan on s1 s1_2
+(8 rows)
+
+select * from s where n not in (select n from s1) and u not in (select u from s1) and nn not in (select nn from s1);
+ u | n | nn | p 
+---+---+----+---
+(0 rows)
+
+explain (costs false) select * from s where n not in (select n from s1) and u not in (select u from s1) and nn not in (select nn from l);
+                               QUERY PLAN                                
+-------------------------------------------------------------------------
+ Nested Loop Anti Join
+   ->  Seq Scan on s
+         Filter: ((NOT (hashed SubPlan 1)) AND (NOT (hashed SubPlan 2)))
+         SubPlan 1
+           ->  Seq Scan on s1
+         SubPlan 2
+           ->  Seq Scan on s1 s1_1
+   ->  Index Only Scan using l_nn on l
+         Index Cond: (nn = s.nn)
+(9 rows)
+
+select * from s where n not in (select n from s1) and u not in (select u from s1) and nn not in (select nn from l);
+ u | n | nn | p 
+---+---+----+---
+(0 rows)
+
+explain (costs false) select count(*) from s where u not in
+(select n from s1 where not exists
+ (select 1 from (select n from s1 where u not in (select n from l)) t where t.n = s.n))
+and nn not in
+(select n from s1 where not exists
+ (select 1 from (select n from s1 where u not in (select n from l)) t where t.n = s.n));
+                                QUERY PLAN                                 
+---------------------------------------------------------------------------
+ Aggregate
+   ->  Seq Scan on s
+         Filter: ((NOT (SubPlan 2)) AND (NOT (SubPlan 4)))
+         SubPlan 2
+           ->  Result
+                 One-Time Filter: (NOT $2)
+                 InitPlan 1 (returns $2)
+                   ->  Nested Loop Anti Join
+                         ->  Seq Scan on s1
+                               Filter: (n = s.n)
+                         ->  Bitmap Heap Scan on l
+                               Recheck Cond: ((s1.u = n) OR (n IS NULL))
+                               ->  BitmapOr
+                                     ->  Bitmap Index Scan on l_n
+                                           Index Cond: (n = s1.u)
+                                     ->  Bitmap Index Scan on l_n
+                                           Index Cond: (n IS NULL)
+                 ->  Seq Scan on s1 s1_1
+         SubPlan 4
+           ->  Result
+                 One-Time Filter: (NOT $6)
+                 InitPlan 3 (returns $6)
+                   ->  Nested Loop Anti Join
+                         ->  Seq Scan on s1 s1_2
+                               Filter: (n = s.n)
+                         ->  Bitmap Heap Scan on l l_1
+                               Recheck Cond: ((s1_2.u = n) OR (n IS NULL))
+                               ->  BitmapOr
+                                     ->  Bitmap Index Scan on l_n
+                                           Index Cond: (n = s1_2.u)
+                                     ->  Bitmap Index Scan on l_n
+                                           Index Cond: (n IS NULL)
+                 ->  Seq Scan on s1 s1_3
+(33 rows)
+
+select count(*) from s where u not in
+(select n from s1 where not exists
+ (select 1 from (select n from s1 where u not in (select n from l)) t where t.n = s.n))
+and nn not in
+(select n from s1 where not exists
+ (select 1 from (select n from s1 where u not in (select n from l)) t where t.n = s.n));
+ count 
+-------
+     0
+(1 row)
+
+--test COALESCE
+explain (costs false) select * from s where COALESCE(n, -1) not in (select COALESCE(n, -1) from l);
+                                 QUERY PLAN                                 
+----------------------------------------------------------------------------
+ Hash Anti Join
+   Hash Cond: (COALESCE(s.n, '-1'::integer) = COALESCE(l.n, '-1'::integer))
+   ->  Seq Scan on s
+   ->  Hash
+         ->  Seq Scan on l
+(5 rows)
+
+select * from s where COALESCE(n, -1) not in (select COALESCE(n, -1) from l);
+    u    |    n    |   nn    |   p    
+---------+---------+---------+--------
+ 1000002 | 1000002 | 1000002 | foofoo
+(1 row)
+
+explain (costs false) select * from s where COALESCE(n, NULL, -1) not in (select COALESCE(n, NULL, -1) from l);
+                                 QUERY PLAN                                 
+----------------------------------------------------------------------------
+ Hash Anti Join
+   Hash Cond: (COALESCE(s.n, '-1'::integer) = COALESCE(l.n, '-1'::integer))
+   ->  Seq Scan on s
+   ->  Hash
+         ->  Seq Scan on l
+(5 rows)
+
+select * from s where COALESCE(n, NULL, -1) not in (select COALESCE(n, NULL, -1) from l);
+    u    |    n    |   nn    |   p    
+---------+---------+---------+--------
+ 1000002 | 1000002 | 1000002 | foofoo
+(1 row)
+
+explain (costs false) select * from s where COALESCE(n, NULL, NULL) not in (select COALESCE(n, NULL, NULL) from l);
+                                 QUERY PLAN                                  
+-----------------------------------------------------------------------------
+ Nested Loop Anti Join
+   Join Filter: ((COALESCE(s.n) = COALESCE(l.n)) OR (COALESCE(l.n) IS NULL))
+   InitPlan 1 (returns $0)
+     ->  Seq Scan on l l_1
+   ->  Seq Scan on s
+         Filter: ((COALESCE(n) IS NOT NULL) OR (NOT $0))
+   ->  Materialize
+         ->  Seq Scan on l
+(8 rows)
+
+select * from s where COALESCE(n, NULL, NULL) not in (select COALESCE(n, NULL, NULL) from l);
+ u | n | nn | p 
+---+---+----+---
+(0 rows)
+
+explain (costs false) select * from s where COALESCE(n, nn) not in (select COALESCE(n, nn) from l);
+                        QUERY PLAN                        
+----------------------------------------------------------
+ Hash Anti Join
+   Hash Cond: (COALESCE(s.n, s.nn) = COALESCE(l.n, l.nn))
+   ->  Seq Scan on s
+   ->  Hash
+         ->  Seq Scan on l
+(5 rows)
+
+select * from s where COALESCE(n, nn) not in (select COALESCE(n, nn) from l);
+    u    |    n    |   nn    |   p    
+---------+---------+---------+--------
+ 1000002 | 1000002 | 1000002 | foofoo
+(1 row)
+
+explain (costs false) select * from s where COALESCE(nn, NULL) not in (select COALESCE(nn, NULL) from l);
+                   QUERY PLAN                   
+------------------------------------------------
+ Hash Anti Join
+   Hash Cond: (COALESCE(s.nn) = COALESCE(l.nn))
+   ->  Seq Scan on s
+   ->  Hash
+         ->  Seq Scan on l
+(5 rows)
+
+select * from s where COALESCE(nn, NULL) not in (select COALESCE(nn, NULL) from l);
+    u    |    n    |   nn    |   p    
+---------+---------+---------+--------
+ 1000002 | 1000002 | 1000002 | foofoo
+(1 row)
+
+explain (costs false) select * from s where (COALESCE(n, -1), nn, COALESCE(n, u)) not in (select COALESCE(n, -1), nn, COALESCE(n, u) from l);
+                                                       QUERY PLAN                                                        
+-------------------------------------------------------------------------------------------------------------------------
+ Nested Loop Anti Join
+   ->  Seq Scan on s
+   ->  Index Scan using l_nn on l
+         Index Cond: (nn = s.nn)
+         Filter: ((COALESCE(s.n, '-1'::integer) = COALESCE(n, '-1'::integer)) AND (COALESCE(s.n, s.u) = COALESCE(n, u)))
+(5 rows)
+
+select * from s where (COALESCE(n, -1), nn, COALESCE(n, u)) not in (select COALESCE(n, -1), nn, COALESCE(n, u) from l);
+    u    |    n    |   nn    |   p    
+---------+---------+---------+--------
+ 1000002 | 1000002 | 1000002 | foofoo
+       3 |         |       3 | foo
+(2 rows)
+
+-- test miscellaneous outer nullable cases
+explain (costs false) select * from s where (n,n) not in (select n,n from l);
+                         QUERY PLAN                          
+-------------------------------------------------------------
+ Nested Loop Anti Join
+   Join Filter: (((s.n = l.n) AND (s.n = l.n)) IS NOT FALSE)
+   ->  Seq Scan on s
+   ->  Materialize
+         ->  Seq Scan on l
+(5 rows)
+
+select * from s where (n,n) not in (select n,n from l);
+ u | n | nn | p 
+---+---+----+---
+(0 rows)
+
+explain (costs false) select * from s right join l on s.nn = l.nn where (s.n,s.u,s.nn) not in (select n,u,nn from l);
+                                     QUERY PLAN                                      
+-------------------------------------------------------------------------------------
+ Nested Loop Anti Join
+   Join Filter: (((s.n = l_1.n) AND (s.u = l_1.u) AND (s.nn = l_1.nn)) IS NOT FALSE)
+   ->  Hash Left Join
+         Hash Cond: (l.nn = s.nn)
+         ->  Seq Scan on l
+         ->  Hash
+               ->  Seq Scan on s
+   ->  Materialize
+         ->  Seq Scan on l l_1
+(9 rows)
+
+select * from s right join l on s.nn = l.nn where (s.n,s.u,s.nn) not in (select n,u,nn from l);
+ u | n | nn | p | u | n | nn | p 
+---+---+----+---+---+---+----+---
+(0 rows)
+
+explain (costs false) select count(*) from s right join l on s.nn = l.nn where (s.n,s.u,s.nn) not in (select n,u,nn from l where u < 0);
+                 QUERY PLAN                  
+---------------------------------------------
+ Aggregate
+   ->  Hash Left Join
+         Hash Cond: (l.nn = s.nn)
+         Filter: (NOT (hashed SubPlan 1))
+         ->  Seq Scan on l
+         ->  Hash
+               ->  Seq Scan on s
+         SubPlan 1
+           ->  Index Scan using l_u on l l_1
+                 Index Cond: (u < 0)
+(10 rows)
+
+select count(*) from s right join l on s.nn = l.nn where (s.n,s.u,s.nn) not in (select n,u,nn from l where u < 0);
+ count 
+-------
+ 10000
+(1 row)
+
+explain (costs false) select * from s where (n,n,n) not in (select distinct n,n,n from l where u > 0 limit 3) order by n;
+                     QUERY PLAN                      
+-----------------------------------------------------
+ Sort
+   Sort Key: s.n
+   ->  Seq Scan on s
+         Filter: (NOT (hashed SubPlan 1))
+         SubPlan 1
+           ->  Limit
+                 ->  Unique
+                       ->  Index Scan using l_n on l
+                             Filter: (u > 0)
+(9 rows)
+
+select * from s where (n,n,n) not in (select distinct n,n,n from l where u > 0 limit 3) order by n;
+    u    |    n    |   nn    |   p    
+---------+---------+---------+--------
+ 1000002 | 1000002 | 1000002 | foofoo
+(1 row)
+
+--test outer has strict predicate or inner join
+explain (costs false) select * from s where n not in (select n from l) and n > 0;
+                    QUERY PLAN                    
+--------------------------------------------------
+ Nested Loop Anti Join
+   ->  Seq Scan on s
+         Filter: (n > 0)
+   ->  Bitmap Heap Scan on l
+         Recheck Cond: ((s.n = n) OR (n IS NULL))
+         ->  BitmapOr
+               ->  Bitmap Index Scan on l_n
+                     Index Cond: (n = s.n)
+               ->  Bitmap Index Scan on l_n
+                     Index Cond: (n IS NULL)
+(10 rows)
+
+select * from s where n not in (select n from l) and n > 0;
+ u | n | nn | p 
+---+---+----+---
+(0 rows)
+
+explain (costs false) select * from s where n not in (select n from l) and u > 0;
+                         QUERY PLAN                          
+-------------------------------------------------------------
+ Nested Loop Anti Join
+   InitPlan 1 (returns $0)
+     ->  Seq Scan on l l_1
+   ->  Seq Scan on s
+         Filter: (((n IS NOT NULL) OR (NOT $0)) AND (u > 0))
+   ->  Bitmap Heap Scan on l
+         Recheck Cond: ((s.n = n) OR (n IS NULL))
+         ->  BitmapOr
+               ->  Bitmap Index Scan on l_n
+                     Index Cond: (n = s.n)
+               ->  Bitmap Index Scan on l_n
+                     Index Cond: (n IS NULL)
+(12 rows)
+
+select * from s where n not in (select n from l) and u > 0;
+ u | n | nn | p 
+---+---+----+---
+(0 rows)
+
+explain (costs false) select * from s where n not in (select n from l) and n is not null;
+                    QUERY PLAN                    
+--------------------------------------------------
+ Nested Loop Anti Join
+   ->  Seq Scan on s
+         Filter: (n IS NOT NULL)
+   ->  Bitmap Heap Scan on l
+         Recheck Cond: ((s.n = n) OR (n IS NULL))
+         ->  BitmapOr
+               ->  Bitmap Index Scan on l_n
+                     Index Cond: (n = s.n)
+               ->  Bitmap Index Scan on l_n
+                     Index Cond: (n IS NULL)
+(10 rows)
+
+select * from s where n not in (select n from l) and n is not null;
+ u | n | nn | p 
+---+---+----+---
+(0 rows)
+
+explain (costs false) select * from s join l on s.n = l.n where s.n not in (select n from l);
+                       QUERY PLAN                       
+--------------------------------------------------------
+ Nested Loop
+   ->  Nested Loop Anti Join
+         ->  Seq Scan on s
+         ->  Bitmap Heap Scan on l l_1
+               Recheck Cond: ((s.n = n) OR (n IS NULL))
+               ->  BitmapOr
+                     ->  Bitmap Index Scan on l_n
+                           Index Cond: (n = s.n)
+                     ->  Bitmap Index Scan on l_n
+                           Index Cond: (n IS NULL)
+   ->  Index Scan using l_n on l
+         Index Cond: (n = s.n)
+(12 rows)
+
+select * from s join l on s.n = l.n where s.n not in (select n from l);
+ u | n | nn | p | u | n | nn | p 
+---+---+----+---+---+---+----+---
+(0 rows)
+
+explain (costs false) select count(*) from s right join l on s.n = l.n where s.n not in (select n from l);
+                       QUERY PLAN                       
+--------------------------------------------------------
+ Aggregate
+   InitPlan 1 (returns $0)
+     ->  Seq Scan on l l_2
+   ->  Nested Loop Anti Join
+         ->  Hash Left Join
+               Hash Cond: (l.n = s.n)
+               Filter: ((s.n IS NOT NULL) OR (NOT $0))
+               ->  Seq Scan on l
+               ->  Hash
+                     ->  Seq Scan on s
+         ->  Bitmap Heap Scan on l l_1
+               Recheck Cond: ((s.n = n) OR (n IS NULL))
+               ->  BitmapOr
+                     ->  Bitmap Index Scan on l_n
+                           Index Cond: (n = s.n)
+                     ->  Bitmap Index Scan on l_n
+                           Index Cond: (n IS NULL)
+(17 rows)
+
+select count(*) from s right join l on s.n = l.n where s.n not in (select n from l);
+ count 
+-------
+     0
+(1 row)
+
+explain (costs false) select count(*) from s right join l on s.n = l.n join s1 on s.u = s1.u where s.n not in (select n from l);
+                             QUERY PLAN                             
+--------------------------------------------------------------------
+ Aggregate
+   ->  Nested Loop
+         Join Filter: (s.u = s1.u)
+         ->  Nested Loop
+               ->  Nested Loop Anti Join
+                     ->  Seq Scan on s
+                     ->  Bitmap Heap Scan on l l_1
+                           Recheck Cond: ((s.n = n) OR (n IS NULL))
+                           ->  BitmapOr
+                                 ->  Bitmap Index Scan on l_n
+                                       Index Cond: (n = s.n)
+                                 ->  Bitmap Index Scan on l_n
+                                       Index Cond: (n IS NULL)
+               ->  Index Only Scan using l_n on l
+                     Index Cond: (n = s.n)
+         ->  Seq Scan on s1
+(16 rows)
+
+select count(*) from s right join l on s.n = l.n join s1 on s.u = s1.u where s.n not in (select n from l);
+ count 
+-------
+     0
+(1 row)
+
+explain (costs false) select count(*) from s join l on s.n = l.n right join s1 on s.u = s1.u where s.n not in (select n from l);
+                          QUERY PLAN                          
+--------------------------------------------------------------
+ Aggregate
+   InitPlan 1 (returns $0)
+     ->  Seq Scan on l l_2
+   ->  Nested Loop Anti Join
+         ->  Nested Loop Left Join
+               Join Filter: (s.u = s1.u)
+               Filter: ((s.n IS NOT NULL) OR (NOT $0))
+               ->  Seq Scan on s1
+               ->  Materialize
+                     ->  Nested Loop
+                           ->  Seq Scan on s
+                           ->  Index Only Scan using l_n on l
+                                 Index Cond: (n = s.n)
+         ->  Bitmap Heap Scan on l l_1
+               Recheck Cond: ((s.n = n) OR (n IS NULL))
+               ->  BitmapOr
+                     ->  Bitmap Index Scan on l_n
+                           Index Cond: (n = s.n)
+                     ->  Bitmap Index Scan on l_n
+                           Index Cond: (n IS NULL)
+(20 rows)
+
+select count(*) from s join l on s.n = l.n right join s1 on s.u = s1.u where s.n not in (select n from l);
+ count 
+-------
+     0
+(1 row)
+
+--test inner has strict predicate or inner join
+explain (costs false) select * from s where u not in (select n from l where n > 0);
+                 QUERY PLAN                  
+---------------------------------------------
+ Nested Loop Anti Join
+   ->  Seq Scan on s
+   ->  Index Only Scan using l_n on l
+         Index Cond: ((n = s.u) AND (n > 0))
+(4 rows)
+
+select * from s where u not in (select n from l where n > 0);
+    u    |    n    |   nn    |   p    
+---------+---------+---------+--------
+ 1000002 | 1000002 | 1000002 | foofoo
+(1 row)
+
+explain (costs false) select * from s where u not in (select n from l where u > 0);
+                    QUERY PLAN                    
+--------------------------------------------------
+ Nested Loop Anti Join
+   ->  Seq Scan on s
+   ->  Bitmap Heap Scan on l
+         Recheck Cond: ((s.u = n) OR (n IS NULL))
+         Filter: (u > 0)
+         ->  BitmapOr
+               ->  Bitmap Index Scan on l_n
+                     Index Cond: (n = s.u)
+               ->  Bitmap Index Scan on l_n
+                     Index Cond: (n IS NULL)
+(10 rows)
+
+select * from s where u not in (select n from l where u > 0);
+ u | n | nn | p 
+---+---+----+---
+(0 rows)
+
+explain (costs false) select * from s where u not in (select n from l where n is not null);
+                     QUERY PLAN                      
+-----------------------------------------------------
+ Nested Loop Anti Join
+   ->  Seq Scan on s
+   ->  Index Only Scan using l_n on l
+         Index Cond: ((n = s.u) AND (n IS NOT NULL))
+(4 rows)
+
+select * from s where u not in (select n from l where n is not null);
+    u    |    n    |   nn    |   p    
+---------+---------+---------+--------
+ 1000002 | 1000002 | 1000002 | foofoo
+(1 row)
+
+explain (costs false) select * from s where u not in (select l.n from l join s on l.n=s.n);
+                  QUERY PLAN                  
+----------------------------------------------
+ Seq Scan on s
+   Filter: (NOT (hashed SubPlan 1))
+   SubPlan 1
+     ->  Nested Loop
+           ->  Seq Scan on s s_1
+           ->  Index Only Scan using l_n on l
+                 Index Cond: (n = s_1.n)
+(7 rows)
+
+select * from s where u not in (select l.n from l join s on l.n=s.n);
+    u    |    n    |   nn    |   p    
+---------+---------+---------+--------
+ 1000002 | 1000002 | 1000002 | foofoo
+       3 |         |       3 | foo
+(2 rows)
+
+explain (costs false) select * from s where u not in (select l.n from l join s on l.u=s.u);
+               QUERY PLAN                
+-----------------------------------------
+ Seq Scan on s
+   Filter: (NOT (hashed SubPlan 1))
+   SubPlan 1
+     ->  Nested Loop
+           ->  Seq Scan on s s_1
+           ->  Index Scan using l_u on l
+                 Index Cond: (u = s_1.u)
+(7 rows)
+
+select * from s where u not in (select l.n from l join s on l.u=s.u);
+    u    |    n    |   nn    |   p    
+---------+---------+---------+--------
+ 1000002 | 1000002 | 1000002 | foofoo
+(1 row)
+
+explain (costs false) select * from s where u not in (select l.n from l join s on l.n = s.n);
+                  QUERY PLAN                  
+----------------------------------------------
+ Seq Scan on s
+   Filter: (NOT (hashed SubPlan 1))
+   SubPlan 1
+     ->  Nested Loop
+           ->  Seq Scan on s s_1
+           ->  Index Only Scan using l_n on l
+                 Index Cond: (n = s_1.n)
+(7 rows)
+
+select * from s where u not in (select l.n from l join s on l.n = s.n);
+    u    |    n    |   nn    |   p    
+---------+---------+---------+--------
+ 1000002 | 1000002 | 1000002 | foofoo
+       3 |         |       3 | foo
+(2 rows)
+
+explain (costs false) select * from s where u not in (select l.n from l right join s on l.n = s.n);
+                  QUERY PLAN                  
+----------------------------------------------
+ Seq Scan on s
+   Filter: (NOT (hashed SubPlan 1))
+   SubPlan 1
+     ->  Nested Loop Left Join
+           ->  Seq Scan on s s_1
+           ->  Index Only Scan using l_n on l
+                 Index Cond: (n = s_1.n)
+(7 rows)
+
+select * from s where u not in (select l.n from l right join s on l.n = s.n);
+ u | n | nn | p 
+---+---+----+---
+(0 rows)
+
+explain (costs false) select * from s where u not in (select l.n from l right join s on l.n=s.n join s1 on l.n=s1.n);
+                  QUERY PLAN                  
+----------------------------------------------
+ Seq Scan on s
+   Filter: (NOT (hashed SubPlan 1))
+   SubPlan 1
+     ->  Nested Loop
+           ->  Nested Loop
+                 Join Filter: (s_1.n = s1.n)
+                 ->  Seq Scan on s1
+                 ->  Materialize
+                       ->  Seq Scan on s s_1
+           ->  Index Only Scan using l_n on l
+                 Index Cond: (n = s_1.n)
+(11 rows)
+
+select * from s where u not in (select l.n from l right join s on l.n=s.n join s1 on l.n=s1.n);
+    u    |    n    |   nn    |   p    
+---------+---------+---------+--------
+ 1000002 | 1000002 | 1000002 | foofoo
+       3 |         |       3 | foo
+(2 rows)
+
+explain (costs false) select * from s where u not in (select l.n from l join s on l.n=s.n right join s1 on l.n=s1.n);
+                        QUERY PLAN                        
+----------------------------------------------------------
+ Seq Scan on s
+   Filter: (NOT (hashed SubPlan 1))
+   SubPlan 1
+     ->  Nested Loop Left Join
+           Join Filter: (l.n = s1.n)
+           ->  Seq Scan on s1
+           ->  Materialize
+                 ->  Nested Loop
+                       ->  Seq Scan on s s_1
+                       ->  Index Only Scan using l_n on l
+                             Index Cond: (n = s_1.n)
+(11 rows)
+
+select * from s where u not in (select l.n from l join s on l.n=s.n right join s1 on l.n=s1.n);
+ u | n | nn | p 
+---+---+----+---
+(0 rows)
+
+--test both sides have strict predicate or inner join
+explain (costs false) select * from s where n not in (select n from l where n > 0) and n > 0;
+                 QUERY PLAN                  
+---------------------------------------------
+ Nested Loop Anti Join
+   ->  Seq Scan on s
+         Filter: (n > 0)
+   ->  Index Only Scan using l_n on l
+         Index Cond: ((n = s.n) AND (n > 0))
+(5 rows)
+
+select * from s where n not in (select n from l where n > 0) and n > 0;
+    u    |    n    |   nn    |   p    
+---------+---------+---------+--------
+ 1000002 | 1000002 | 1000002 | foofoo
+(1 row)
+
+explain (costs false) select * from s where n not in (select n from l where u > 0) and n > 0;
+                    QUERY PLAN                    
+--------------------------------------------------
+ Nested Loop Anti Join
+   ->  Seq Scan on s
+         Filter: (n > 0)
+   ->  Bitmap Heap Scan on l
+         Recheck Cond: ((s.n = n) OR (n IS NULL))
+         Filter: (u > 0)
+         ->  BitmapOr
+               ->  Bitmap Index Scan on l_n
+                     Index Cond: (n = s.n)
+               ->  Bitmap Index Scan on l_n
+                     Index Cond: (n IS NULL)
+(11 rows)
+
+select * from s where n not in (select n from l where u > 0) and n > 0;
+ u | n | nn | p 
+---+---+----+---
+(0 rows)
+
+explain (costs false) select * from s where n not in (select n from l where n > 0) and u > 0;
+                         QUERY PLAN                          
+-------------------------------------------------------------
+ Nested Loop Anti Join
+   InitPlan 1 (returns $0)
+     ->  Seq Scan on l l_1
+           Filter: (n > 0)
+   ->  Seq Scan on s
+         Filter: (((n IS NOT NULL) OR (NOT $0)) AND (u > 0))
+   ->  Index Only Scan using l_n on l
+         Index Cond: ((n = s.n) AND (n > 0))
+(8 rows)
+
+select * from s where n not in (select n from l where n > 0) and u > 0;
+    u    |    n    |   nn    |   p    
+---------+---------+---------+--------
+ 1000002 | 1000002 | 1000002 | foofoo
+(1 row)
+
+explain (costs false) select * from s right join l on s.n = l.n join s1 on s.u = s1.u where s.n not in (select l.n from l right join s on l.n=s.n join s s1 on l.n=s1.n);
+                             QUERY PLAN                             
+--------------------------------------------------------------------
+ Nested Loop
+   Join Filter: (s.u = s1.u)
+   ->  Seq Scan on s1
+   ->  Materialize
+         ->  Nested Loop
+               ->  Seq Scan on s
+                     Filter: (NOT (hashed SubPlan 1))
+                     SubPlan 1
+                       ->  Nested Loop
+                             ->  Nested Loop
+                                   Join Filter: (s_1.n = s1_1.n)
+                                   ->  Seq Scan on s s_1
+                                   ->  Materialize
+                                         ->  Seq Scan on s s1_1
+                             ->  Index Only Scan using l_n on l l_1
+                                   Index Cond: (n = s_1.n)
+               ->  Index Scan using l_n on l
+                     Index Cond: (n = s.n)
+(18 rows)
+
+select * from s right join l on s.n = l.n join s1 on s.u = s1.u where s.n not in (select l.n from l right join s on l.n=s.n join s s1 on l.n=s1.n);
+ u | n | nn | p | u | n | nn | p | u | n | n1 | nn | p 
+---+---+----+---+---+---+----+---+---+---+----+----+---
+(0 rows)
+
+explain (costs false) select * from s right join l on s.n = l.n join s1 on s.u = s1.u where s.n not in (select l.n from l join s on l.n=s.n right join s s1 on l.n=s1.n);
+                                   QUERY PLAN                                   
+--------------------------------------------------------------------------------
+ Nested Loop
+   Join Filter: (s.u = s1.u)
+   ->  Seq Scan on s1
+   ->  Materialize
+         ->  Nested Loop
+               ->  Seq Scan on s
+                     Filter: (NOT (hashed SubPlan 1))
+                     SubPlan 1
+                       ->  Nested Loop Left Join
+                             Join Filter: (l_1.n = s1_1.n)
+                             ->  Seq Scan on s s1_1
+                             ->  Materialize
+                                   ->  Nested Loop
+                                         ->  Seq Scan on s s_1
+                                         ->  Index Only Scan using l_n on l l_1
+                                               Index Cond: (n = s_1.n)
+               ->  Index Scan using l_n on l
+                     Index Cond: (n = s.n)
+(18 rows)
+
+select * from s right join l on s.n = l.n join s1 on s.u = s1.u where s.n not in (select l.n from l join s on l.n=s.n right join s s1 on l.n=s1.n);
+ u | n | nn | p | u | n | nn | p | u | n | n1 | nn | p 
+---+---+----+---+---+---+----+---+---+---+----+----+---
+(0 rows)
+
+explain (costs false) select * from s join l on s.n = l.n right join s1 on s.u = s1.u where s.n not in (select l.n from l join s on l.n=s.n right join s s1 on l.n=s1.n);
+                          QUERY PLAN                          
+--------------------------------------------------------------
+ Nested Loop Left Join
+   Join Filter: (s.u = s1.u)
+   Filter: (NOT (hashed SubPlan 1))
+   ->  Seq Scan on s1
+   ->  Materialize
+         ->  Nested Loop
+               ->  Seq Scan on s
+               ->  Index Scan using l_n on l
+                     Index Cond: (n = s.n)
+   SubPlan 1
+     ->  Nested Loop Left Join
+           Join Filter: (l_1.n = s1_1.n)
+           ->  Seq Scan on s s1_1
+           ->  Materialize
+                 ->  Nested Loop
+                       ->  Seq Scan on s s_1
+                       ->  Index Only Scan using l_n on l l_1
+                             Index Cond: (n = s_1.n)
+(18 rows)
+
+select * from s join l on s.n = l.n right join s1 on s.u = s1.u where s.n not in (select l.n from l join s on l.n=s.n right join s s1 on l.n=s1.n);
+ u | n | nn | p | u | n | nn | p | u | n | n1 | nn | p 
+---+---+----+---+---+---+----+---+---+---+----+----+---
+(0 rows)
+
+--JIRA-7279 CTE with NOT IN
+create table public.testing
+(
+a integer,
+b integer,
+c integer
+);
+explain (costs false) with
+selected(a,b,c) as (values(1,2,3)),
+updated(d,e,f) as (values(4,5,6))
+insert into public.testing
+select * from selected
+where (a,b,c) not in (select d,e,f from updated);
+                    QUERY PLAN                     
+---------------------------------------------------
+ Insert on testing
+   ->  Result
+         One-Time Filter: (NOT (hashed SubPlan 1))
+         SubPlan 1
+           ->  Result
+(5 rows)
+
+with
+selected(a,b,c) as (values(1,2,3)),
+updated(d,e,f) as (values(4,5,6))
+insert into public.testing
+select * from selected
+where (a,b,c) not in (select d,e,f from updated);
+select * from public.testing;
+ a | b | c 
+---+---+---
+ 1 | 2 | 3
+(1 row)
+
+explain (costs false) with
+selected(a,b,c) as (select u, n, nn from s),
+updated(d,e,f) as (select u, n, nn from l)
+insert into public.testing
+select * from selected
+where (a,b,c) not in (select d,e,f from updated);
+                                     QUERY PLAN                                      
+-------------------------------------------------------------------------------------
+ Insert on testing
+   ->  Nested Loop Anti Join
+         Join Filter: (((s.u = l.u) AND (s.n = l.n) AND (s.nn = l.nn)) IS NOT FALSE)
+         ->  Seq Scan on s
+         ->  Materialize
+               ->  Seq Scan on l
+(6 rows)
+
+with
+selected(a,b,c) as (select u, n, nn from s),
+updated(d,e,f) as (select u, n, nn from l)
+insert into public.testing
+select * from selected
+where (a,b,c) not in (select d,e,f from updated);
+select * from public.testing;
+    a    |    b    |    c    
+---------+---------+---------
+       1 |       2 |       3
+ 1000002 | 1000002 | 1000002
+(2 rows)
+
+-- expect to get Hash Anti Join
+explain (costs false) with
+selected(a,b,c) as (select u, n, nn from s),
+updated(d,e,f) as (select u, n, nn from l)
+insert into public.testing
+select * from selected
+where a not in (select d from updated);
+                     QUERY PLAN                      
+-----------------------------------------------------
+ Insert on testing
+   InitPlan 1 (returns $0)
+     ->  Seq Scan on l l_1
+   ->  Nested Loop Anti Join
+         ->  Seq Scan on s
+               Filter: ((u IS NOT NULL) OR (NOT $0))
+         ->  Index Only Scan using l_u on l
+               Index Cond: (u = s.u)
+(8 rows)
+
+with
+selected(a,b,c) as (select u, n, nn from s),
+updated(d,e,f) as (select u, n, nn from l)
+insert into public.testing
+select * from selected
+where a not in (select d from updated);
+select * from public.testing;
+    a    |    b    |    c    
+---------+---------+---------
+       1 |       2 |       3
+ 1000002 | 1000002 | 1000002
+ 1000002 | 1000002 | 1000002
+(3 rows)
+
+explain (costs false) with
+selected(a,b,c) as (select u, n, nn from s),
+updated(d,e,f) as (select u, n, nn from l)
+insert into public.testing
+select * from selected
+where b not in (select e from updated);
+                       QUERY PLAN                       
+--------------------------------------------------------
+ Insert on testing
+   InitPlan 1 (returns $0)
+     ->  Seq Scan on l l_1
+   ->  Nested Loop Anti Join
+         ->  Seq Scan on s
+               Filter: ((n IS NOT NULL) OR (NOT $0))
+         ->  Bitmap Heap Scan on l
+               Recheck Cond: ((s.n = n) OR (n IS NULL))
+               ->  BitmapOr
+                     ->  Bitmap Index Scan on l_n
+                           Index Cond: (n = s.n)
+                     ->  Bitmap Index Scan on l_n
+                           Index Cond: (n IS NULL)
+(13 rows)
+
+with
+selected(a,b,c) as (select u, n, nn from s),
+updated(d,e,f) as (select u, n, nn from l)
+insert into public.testing
+select * from selected
+where b not in (select e from updated);
+select * from public.testing;
+    a    |    b    |    c    
+---------+---------+---------
+       1 |       2 |       3
+ 1000002 | 1000002 | 1000002
+ 1000002 | 1000002 | 1000002
+(3 rows)
+
+explain (costs false) with
+selected(a,b,c) as (select u, n, nn from s),
+updated(d,e,f) as (select u, n, nn from l)
+insert into public.testing
+select * from selected
+where b not in (select d from updated);
+                     QUERY PLAN                      
+-----------------------------------------------------
+ Insert on testing
+   InitPlan 1 (returns $0)
+     ->  Seq Scan on l l_1
+   ->  Nested Loop Anti Join
+         ->  Seq Scan on s
+               Filter: ((n IS NOT NULL) OR (NOT $0))
+         ->  Index Only Scan using l_u on l
+               Index Cond: (u = s.n)
+(8 rows)
+
+with
+selected(a,b,c) as (select u, n, nn from s),
+updated(d,e,f) as (select u, n, nn from l)
+insert into public.testing
+select * from selected
+where b not in (select d from updated);
+select * from public.testing;
+    a    |    b    |    c    
+---------+---------+---------
+       1 |       2 |       3
+ 1000002 | 1000002 | 1000002
+ 1000002 | 1000002 | 1000002
+ 1000002 | 1000002 | 1000002
+(4 rows)
+
+-- two levels of NOT IN with CTE, 2nd NOT IN
+-- subquery access CTE two levels above
+explain (costs false) with
+selected(a,b,c) as (select u, n, nn from s),
+updated(d,e,f) as (select u, n, nn from l)
+insert into public.testing
+select * from selected
+where (a,b,c) not in (select d,e,f from updated
+      where d not in (select a from selected));
+                                               QUERY PLAN                                                
+---------------------------------------------------------------------------------------------------------
+ Insert on testing
+   CTE selected
+     ->  Seq Scan on s
+   ->  Nested Loop Anti Join
+         Join Filter: (((selected.a = l.u) AND (selected.b = l.n) AND (selected.c = l.nn)) IS NOT FALSE)
+         ->  CTE Scan on selected
+         ->  Materialize
+               ->  Seq Scan on l
+                     Filter: (NOT (hashed SubPlan 3))
+                     SubPlan 3
+                       ->  CTE Scan on selected selected_1
+(11 rows)
+
+with
+selected(a,b,c) as (select u, n, nn from s),
+updated(d,e,f) as (select u, n, nn from l)
+insert into public.testing
+select * from selected
+where (a,b,c) not in (select d,e,f from updated
+      where d not in (select a from selected));
+select * from public.testing;
+    a    |    b    |    c    
+---------+---------+---------
+       1 |       2 |       3
+ 1000002 | 1000002 | 1000002
+ 1000002 | 1000002 | 1000002
+ 1000002 | 1000002 | 1000002
+       1 |       1 |       1
+       2 |       2 |       2
+ 1000002 | 1000002 | 1000002
+       3 |         |       3
+(8 rows)
+
+-- With clause inside a query block
+explain select count(distinct t.a) from
+(with
+selected(a,b,c) as (select u, n, nn from s),
+updated(d,e,f) as (select u, n, nn from l)
+select * from selected where (a,b,c) not in
+(select d,e,f from updated
+ where d not in (select a from selected))) as t;
+                                               QUERY PLAN                                                
+---------------------------------------------------------------------------------------------------------
+ Aggregate  (cost=693.75..693.76 rows=1 width=8)
+   ->  Nested Loop Anti Join  (cost=1.13..693.71 rows=3 width=12)
+         Join Filter: (((selected.a = l.u) AND (selected.b = l.n) AND (selected.c = l.nn)) IS NOT FALSE)
+         CTE selected
+           ->  Seq Scan on s  (cost=0.00..1.04 rows=4 width=12)
+         ->  CTE Scan on selected  (cost=0.00..0.08 rows=4 width=12)
+         ->  Materialize  (cost=0.09..230.09 rows=5000 width=12)
+               ->  Seq Scan on l  (cost=0.09..180.09 rows=5000 width=12)
+                     Filter: (NOT (hashed SubPlan 3))
+                     SubPlan 3
+                       ->  CTE Scan on selected selected_1  (cost=0.00..0.08 rows=4 width=4)
+(11 rows)
+
+select count(distinct t.a) from
+(with
+selected(a,b,c) as (select u, n, nn from s),
+updated(d,e,f) as (select u, n, nn from l)
+select * from selected where (a,b,c) not in
+(select d,e,f from updated
+ where d not in (select a from selected))) as t;
+ count 
+-------
+     4
+(1 row)
+
+-- With clause in subquery, can't flatten subquery to anti join
+explain (costs false) with
+selected(a,b,c) as (select u, n, nn from s)
+insert into public.testing
+select * from selected where (a,b,c) not in
+(with updated (d,e,f) as (select u, n, nn from l)
+select d,e,f from updated);
+            QUERY PLAN             
+-----------------------------------
+ Insert on testing
+   ->  Seq Scan on s
+         Filter: (NOT (SubPlan 1))
+         SubPlan 1
+           ->  Materialize
+                 ->  Seq Scan on l
+(6 rows)
+
+with
+selected(a,b,c) as (select u, n, nn from s)
+insert into public.testing
+select * from selected where (a,b,c) not in
+(with updated (d,e,f) as (select u, n, nn from l)
+select d,e,f from updated);
+select * from public.testing;
+    a    |    b    |    c    
+---------+---------+---------
+       1 |       2 |       3
+ 1000002 | 1000002 | 1000002
+ 1000002 | 1000002 | 1000002
+ 1000002 | 1000002 | 1000002
+       1 |       1 |       1
+       2 |       2 |       2
+ 1000002 | 1000002 | 1000002
+       3 |         |       3
+ 1000002 | 1000002 | 1000002
+(9 rows)
+
+-- With clause in subquery, subsubquery access CTE in subquery
+explain (costs false) with
+selected(a,b,c) as (select u, n, nn from s)
+insert into public.testing
+select * from selected where (a,b,c) not in
+(
+with updated(d,e,f) as (select u, n, nn from l)
+select d,e,f from updated where d not in (select d from updated)
+);
+                            QUERY PLAN                             
+-------------------------------------------------------------------
+ Insert on testing
+   ->  Seq Scan on s
+         Filter: (NOT (SubPlan 3))
+         SubPlan 3
+           ->  Materialize
+                 CTE updated
+                   ->  Seq Scan on l
+                 InitPlan 2 (returns $1)
+                   ->  CTE Scan on updated
+                 ->  Hash Anti Join
+                       Hash Cond: (updated_1.d = updated_2.d)
+                       ->  CTE Scan on updated updated_1
+                             Filter: ((d IS NOT NULL) OR (NOT $1))
+                       ->  Hash
+                             ->  CTE Scan on updated updated_2
+(15 rows)
+
+with
+selected(a,b,c) as (select u, n, nn from s)
+insert into public.testing
+select * from selected where (a,b,c) not in
+(
+with updated(d,e,f) as (select u, n, nn from l)
+select d,e,f from updated where d not in (select d from updated)
+);
+select * from public.testing;
+    a    |    b    |    c    
+---------+---------+---------
+       1 |       2 |       3
+ 1000002 | 1000002 | 1000002
+ 1000002 | 1000002 | 1000002
+ 1000002 | 1000002 | 1000002
+       1 |       1 |       1
+       2 |       2 |       2
+ 1000002 | 1000002 | 1000002
+       3 |         |       3
+ 1000002 | 1000002 | 1000002
+       1 |       1 |       1
+       2 |       2 |       2
+ 1000002 | 1000002 | 1000002
+       3 |         |       3
+(13 rows)
+
+-- Recursive CTE
+CREATE TABLE employees (
+  id serial,
+  name varchar(255),
+  manager_id int
+);
+INSERT INTO employees VALUES (1, 'Mark', null);
+INSERT INTO employees VALUES (2, 'John', 1);
+INSERT INTO employees VALUES (3, 'Dan', 2);
+INSERT INTO employees VALUES (4, 'Clark', 1);
+INSERT INTO employees VALUES (5, 'Linda', 2);
+INSERT INTO employees VALUES (6, 'Willy', 2);
+INSERT INTO employees VALUES (7, 'Barack', 2);
+INSERT INTO employees VALUES (8, 'Elen', 2);
+INSERT INTO employees VALUES (9, 'Kate', 3);
+INSERT INTO employees VALUES (10, 'Terry', 4);
+WITH RECURSIVE managertree AS (
+  SELECT id, name, manager_id
+  FROM employees
+  WHERE id = 2
+  UNION ALL
+  SELECT e.id, e.name, e.manager_id
+  FROM employees e
+  INNER JOIN managertree mtree ON mtree.id = e.manager_id
+)
+SELECT *
+FROM managertree;
+ id |  name  | manager_id 
+----+--------+------------
+  2 | John   |          1
+  3 | Dan    |          2
+  5 | Linda  |          2
+  6 | Willy  |          2
+  7 | Barack |          2
+  8 | Elen   |          2
+  9 | Kate   |          3
+(7 rows)
+
+-- NOT IN subquery access Recursive CTE
+EXPLAIN (COSTS FALSE) WITH RECURSIVE managertree AS (
+  SELECT id, name, manager_id
+  FROM employees
+  WHERE id = 2
+  UNION ALL
+  SELECT e.id, e.name, e.manager_id
+  FROM employees e
+  INNER JOIN managertree mtree ON mtree.id = e.manager_id
+)
+SELECT *
+FROM managertree mt WHERE mt.manager_id NOT IN (SELECT id FROM managertree);
+                          QUERY PLAN                           
+---------------------------------------------------------------
+ CTE Scan on managertree mt
+   Filter: (NOT (hashed SubPlan 2))
+   CTE managertree
+     ->  Recursive Union
+           ->  Seq Scan on employees
+                 Filter: (id = 2)
+           ->  Hash Join
+                 Hash Cond: (e.manager_id = mtree.id)
+                 ->  Seq Scan on employees e
+                 ->  Hash
+                       ->  WorkTable Scan on managertree mtree
+   SubPlan 2
+     ->  CTE Scan on managertree
+(13 rows)
+
+WITH RECURSIVE managertree AS (
+  SELECT id, name, manager_id
+  FROM employees
+  WHERE id = 2
+  UNION ALL
+  SELECT e.id, e.name, e.manager_id
+  FROM employees e
+  INNER JOIN managertree mtree ON mtree.id = e.manager_id
+)
+SELECT *
+FROM managertree mt WHERE mt.manager_id NOT IN (SELECT id FROM managertree);
+ id | name | manager_id 
+----+------+------------
+  2 | John |          1
+(1 row)
+
+-- NOT IN under UNION ALL inside Recursive CTE
+EXPLAIN (COSTS FALSE) WITH RECURSIVE managertree AS (
+  SELECT id, name, manager_id
+  FROM employees
+  WHERE id = 2
+  UNION ALL
+  SELECT e.id, e.name, e.manager_id
+  FROM employees e
+  INNER JOIN managertree mtree ON
+  (mtree.id = e.manager_id AND
+  mtree.manager_id NOT IN (SELECT manager_id FROM employees)
+  )
+)
+SELECT *
+FROM managertree;
+                          QUERY PLAN                           
+---------------------------------------------------------------
+ CTE Scan on managertree
+   CTE managertree
+     ->  Recursive Union
+           ->  Seq Scan on employees employees_1
+                 Filter: (id = 2)
+           ->  Hash Join
+                 Hash Cond: (e.manager_id = mtree.id)
+                 ->  Seq Scan on employees e
+                 ->  Hash
+                       ->  WorkTable Scan on managertree mtree
+                             Filter: (NOT (hashed SubPlan 1))
+                             SubPlan 1
+                               ->  Seq Scan on employees
+(13 rows)
+
+WITH RECURSIVE managertree AS (
+  SELECT id, name, manager_id
+  FROM employees
+  WHERE id = 2
+  UNION ALL
+  SELECT e.id, e.name, e.manager_id
+  FROM employees e
+  INNER JOIN managertree mtree ON
+  (mtree.id = e.manager_id AND
+  mtree.manager_id NOT IN (SELECT manager_id FROM employees)
+  )
+)
+SELECT *
+FROM managertree;
+ id | name | manager_id 
+----+------+------------
+  2 | John |          1
+(1 row)
+
+--Manfred-7613 CTE NOT IN with Union All
+create table cocotero as (
+	select * from(
+	values(1,2,3)) as data(a,b,c)
+);
+explain (costs off) with selected as (
+	select *
+	from cocotero
+),
+updated as (
+	update cocotero
+	set a = 3
+	from selected
+	where cocotero.a = selected.a
+	returning selected.a,selected.b,selected.c
+),
+inserted as (
+	insert into cocotero
+	select *
+	from selected
+	where a not in (select a from updated)
+	returning *
+)
+select 'updated' as action, count(*) as lines from updated
+union all
+select 'inserted' as action, count(*) as lines from inserted;
+                                      QUERY PLAN                                      
+--------------------------------------------------------------------------------------
+ Append
+   CTE selected
+     ->  Seq Scan on cocotero
+   CTE updated
+     ->  Update on cocotero cocotero_1
+           ->  Merge Join
+                 Merge Cond: (cocotero_1.a = selected.a)
+                 ->  Sort
+                       Sort Key: cocotero_1.a
+                       ->  Seq Scan on cocotero cocotero_1
+                 ->  Materialize
+                       ->  Sort
+                             Sort Key: selected.a
+                             ->  CTE Scan on selected
+   CTE inserted
+     ->  Insert on cocotero cocotero_2
+           InitPlan 3 (returns $3)
+             ->  CTE Scan on updated updated_1
+           ->  Nested Loop Anti Join
+                 Join Filter: ((selected_1.a = updated_2.a) OR (updated_2.a IS NULL))
+                 ->  CTE Scan on selected selected_1
+                       Filter: ((a IS NOT NULL) OR (NOT $3))
+                 ->  CTE Scan on updated updated_2
+   ->  Aggregate
+         ->  CTE Scan on updated
+   ->  Aggregate
+         ->  CTE Scan on inserted
+(27 rows)
+
+with selected as (
+	select *
+	from cocotero
+),
+updated as (
+	update cocotero
+	set a = 3
+	from selected
+	where cocotero.a = selected.a
+	returning selected.a,selected.b,selected.c
+),
+inserted as (
+	insert into cocotero
+	select *
+	from selected
+	where a not in (select a from updated)
+	returning *
+)
+select 'updated' as action, count(*) as lines from updated
+union all
+select 'inserted' as action, count(*) as lines from inserted;
+  action  | lines 
+----------+-------
+ updated  |     1
+ inserted |     0
+(2 rows)
+
+--test enable_not_in_transform
+explain (costs off) select count(*) from s where s.u not in (select l.u from l);
+                 QUERY PLAN                 
+--------------------------------------------
+ Aggregate
+   ->  Nested Loop Anti Join
+         ->  Seq Scan on s
+         ->  Index Only Scan using l_u on l
+               Index Cond: (u = s.u)
+(5 rows)
+
+set enable_not_in_transform = off;
+explain (costs off) select count(*) from s where s.u not in (select l.u from l);
+            QUERY PLAN             
+-----------------------------------
+ Aggregate
+   ->  Seq Scan on s
+         Filter: (NOT (SubPlan 1))
+         SubPlan 1
+           ->  Materialize
+                 ->  Seq Scan on l
+(6 rows)
+
+-- clean up
+reset work_mem;
+reset enable_not_in_transform;
+drop table s;
+drop table s1;
+drop table l;
+drop table empty;
+drop table public.testing;
+drop table employees;
+drop table cocotero;
diff --git a/src/test/regress/expected/sysviews.out b/src/test/regress/expected/sysviews.out
index a1c90eb905..b6cf19f020 100644
--- a/src/test/regress/expected/sysviews.out
+++ b/src/test/regress/expected/sysviews.out
@@ -81,6 +81,7 @@ select name, setting from pg_settings where name like 'enable%';
  enable_material                | on
  enable_mergejoin               | on
  enable_nestloop                | on
+ enable_not_in_transform        | on
  enable_parallel_append         | on
  enable_parallel_hash           | on
  enable_partition_pruning       | on
@@ -89,7 +90,7 @@ select name, setting from pg_settings where name like 'enable%';
  enable_seqscan                 | on
  enable_sort                    | on
  enable_tidscan                 | on
-(17 rows)
+(18 rows)
 
 -- Test that the pg_timezone_names and pg_timezone_abbrevs views are
 -- more-or-less working.  We can't test their contents in any great detail
diff --git a/src/test/regress/sql/subselect.sql b/src/test/regress/sql/subselect.sql
index 20c4f99c9c..147ca94089 100644
--- a/src/test/regress/sql/subselect.sql
+++ b/src/test/regress/sql/subselect.sql
@@ -754,3 +754,818 @@ select * from (with x as (select 2 as y) select * from x) ss;
 explain (verbose, costs off)
 with x as (select * from subselect_tbl)
 select * from x for update;
+
+-- test NON IN to ANTI JOIN conversion
+CREATE TABLE s (u INTEGER NOT NULL, n INTEGER NULL, nn INTEGER NOT NULL, p VARCHAR(128) NULL);
+insert into s (u, n, nn, p)
+    select
+    generate_series(1,3) as u,
+	generate_series(1,3) as n,
+	generate_series(1,3) as nn,
+	'foo' as p;
+insert into s values(1000002, 1000002, 1000002, 'foofoo');
+UPDATE s set n = NULL WHERE n = 3;
+analyze s;
+
+CREATE TABLE l (u INTEGER NOT NULL, n INTEGER NULL, nn INTEGER NOT NULL, p VARCHAR(128) NULL);
+insert into l (u, n, nn, p)
+	select
+    generate_series(1,10000 ) as u,
+	generate_series(1,10000 ) as n,
+	generate_series(1,10000 ) as nn,
+	'bar' as p;
+UPDATE l set n = NULL WHERE n = 7;
+
+CREATE UNIQUE INDEX l_u ON l (u);
+CREATE INDEX l_n ON l (n);
+CREATE INDEX l_nn ON l (nn);
+analyze l;
+
+CREATE TABLE s1 (u INTEGER NOT NULL, n INTEGER NULL, n1 INTEGER NULL, nn INTEGER NOT NULL, p VARCHAR(128) NULL);
+insert into s1 (u, n, n1, nn, p)
+    select
+    generate_series(1,3) as u,
+	generate_series(1,3) as n,
+	generate_series(1,3) as n1,
+	generate_series(1,3) as nn,
+	'foo' as p;
+insert into s1 values(1000003, 1000003, 1000003, 1000003, 'foofoo');
+insert into s1 values(1003, 1003, 1003, 1003, 'foofoo');
+UPDATE s1 set n = NULL WHERE n = 3;
+UPDATE s1 set n1 = NULL WHERE n = 2;
+UPDATE s1 set n1 = NULL WHERE n1 = 3;
+analyze s1;
+
+CREATE TABLE empty (u INTEGER NOT NULL, n INTEGER NULL, nn INTEGER NOT NULL, p VARCHAR(128) NULL);
+analyze empty;
+
+-- set work_mem to 64KB so that NOT IN to ANTI JOIN optimization will kick in
+set work_mem = 64;
+
+-- correctness test 1: inner empty, return every thing from outer including NULL
+explain (costs false) select * from s where n not in (select n from empty);
+
+select * from s where n not in (select n from empty);
+
+-- correctness test 2: inner has NULL, return empty result
+explain (costs false) select * from s where n not in (select n from l);
+
+select * from s where n not in (select n from l);
+
+-- correctness test 3: inner non-null, result has no NULL
+explain (costs false) select * from s where n not in (select u from l);
+
+select * from s where n not in (select u from l);
+
+-- correctness test 4: inner has predicate
+explain (costs false) select * from s where n not in (select n from l where u > 7);
+
+select * from s where n not in (select n from l where u > 7);
+
+-- correctness test 5: multi-expression, (2, 2, null, 2, foo) should be in the result
+explain (costs false) select * from s1 where (n,n1) not in (select u,nn from l where u >= 3);
+
+select * from s1 where (n,n1) not in (select u,nn from l where u >= 3);
+
+-- correctness test 6: multi-expression, (3, null, null, 3, foo) should not be in the result
+explain (costs false) select * from s1 where (n,n1) not in (select u,nn from l where u > 0);
+
+select * from s1 where (n,n1) not in (select u,nn from l where u > 0);
+
+-- correctness test 6: multi-expression, (3, null, null, 3, foo) should be in the result
+explain (costs false) select * from s1 where (n,n1) not in (select u,nn from l where u < 0);
+
+select * from s1 where (n,n1) not in (select u,nn from l where u < 0);
+
+-- test using hashed subplan when inner fits in work_mem
+explain (costs false) select * from l where n not in (select n from s);
+
+select * from l where n not in (select n from s);
+
+-- test single expression
+explain (costs false) select * from s where n not in (select n from l);
+
+select * from s where n not in (select n from l);
+
+explain (costs false) select * from s where u not in (select u from l);
+
+select * from s where u not in (select u from l);
+
+explain (costs false) select * from s where 3*n not in (select n from l);
+
+select * from s where 3*n not in (select n from l);
+
+explain (costs false) select * from s where n not in (select 3*n from l);
+
+select * from s where n not in (select 3*n from l);
+
+-- test single expression with predicates
+explain (costs false) select * from s where n not in (select n from l where u > 0);
+
+select * from s where n not in (select n from l where u > 0);
+
+explain (costs false) select * from s where n not in (select n from l where u > 100);
+
+select * from s where n not in (select n from l where u > 100);
+
+-- test multi expression
+explain (costs false) select * from s where (n,u) not in (select n,u from l);
+
+select * from s where (n,u) not in (select n,u from l);
+
+explain (costs false) select * from s where (u, nn) not in (select u, nn from l);
+
+select * from s where (u, nn) not in (select u, nn from l);
+
+explain (costs false) select * from s where (n,u) not in (select u,n from l);
+
+select * from s where (n,u) not in (select u,n from l);
+
+explain (costs false) select * from s where (n,u,nn) not in (select u,n,nn from l);
+
+select * from s where (n,u,nn) not in (select u,n,nn from l);
+
+explain (costs false) select * from s where (n,u,nn) not in (select u,n,nn from l where u > 1000);
+
+select * from s where (n,u,nn) not in (select u,n,nn from l where u > 1000);
+
+explain (costs false) select * from s where (n,u,nn) not in (select u,n,nn from l where u > 0);
+
+select * from s where (n,u,nn) not in (select u,n,nn from l where u > 0);
+
+explain (costs false) select * from s where (n,u,nn) not in (select u,n,nn from l where u > 1);
+
+select * from s where (n,u,nn) not in (select u,n,nn from l where u > 1);
+
+-- test multi-table
+explain (costs false) select count(*) from s, l where s.n not in (select n from l);
+
+select count(*) from s, l where s.n not in (select n from l);
+
+explain (costs false) select count(*) from s, l where s.nn not in (select nn from l);
+
+select count(*) from s, l where s.nn not in (select nn from l);
+
+-- test null padded results from outer join
+explain (costs false) select * from s where n not in (select s.nn from l left join s on l.nn = s.nn);
+
+select * from s where n not in (select s.nn from l left join s on l.nn = s.nn);
+
+explain (costs false) select * from s where n not in (select s.nn from s right join l on s.nn = l.nn);
+
+select * from s where n not in (select s.nn from s right join l on s.nn = l.nn);
+
+explain (costs false) select count(*) from s right join l on s.nn = l.nn where l.nn not in (select nn from s);
+
+select count(*) from s right join l on s.nn = l.nn where l.nn not in (select nn from s);
+
+explain (costs false) select count(*) from s right join l on s.nn = l.nn where s.nn not in (select nn from s);
+
+select count(*) from s right join l on s.nn = l.nn where s.nn not in (select nn from s);
+
+explain (costs false) select count(*) from s right join l on s.nn=l.nn where l.nn not in (select l.nn from l left join s on l.nn = s.nn);
+
+select count(*) from s right join l on s.nn=l.nn where l.nn not in (select l.nn from l left join s on l.nn = s.nn);
+
+explain (costs false) select count(*) from s right join l on s.nn=l.nn where s.nn not in (select s.nn from l left join s on l.nn = s.nn);
+
+select count(*) from s right join l on s.nn=l.nn where s.nn not in (select s.nn from l left join s on l.nn = s.nn);
+
+explain (costs false) select count(*) from s left join s1 on s.u=s1.u join l on s.u=l.u where s.nn not in (select nn from l);
+
+select count(*) from s left join s1 on s.u=s1.u join l on s.u=l.u where s.nn not in (select nn from l);
+
+explain (costs false) select count(*) from s left join s1 on s.u=s1.u right join l on s.u=l.u where s.nn not in (select nn from l);
+
+select count(*) from s left join s1 on s.u=s1.u right join l on s.u=l.u where s.nn not in (select nn from l);
+
+explain (costs false) select count(*) from s left join s1 on s.u=s1.u left join l on s.u=l.u where s.nn not in (select nn from l);
+
+select count(*) from s left join s1 on s.u=s1.u left join l on s.u=l.u where s.nn not in (select nn from l);
+
+explain (costs false) select count(*) from s right join s1 on s.u=s1.u join l on s.u=l.u where s.nn not in (select nn from l);
+
+select count(*) from s right join s1 on s.u=s1.u join l on s.u=l.u where s.nn not in (select nn from l);
+
+explain (costs false) select count(*) from s join s1 on s.u=s1.u right join l on s.u=l.u where s.nn not in (select nn from l);
+
+select * from s join s1 on s.u=s1.u right join l on s.u=l.u where s.nn not in (select nn from l);
+
+explain (costs false) select count(*) from s full join s1 on s.u=s1.u join l on s.u=l.u where s.nn not in (select nn from l);
+
+select count(*) from s full join s1 on s.u=s1.u join l on s.u=l.u where s.nn not in (select nn from l);
+
+explain (costs false) select count(*) from s join s1 on s.u=s1.u full join l on s.u=l.u where s.nn not in (select nn from l);
+
+select count(*) from s join s1 on s.u=s1.u full join l on s.u=l.u where s.nn not in (select nn from l);
+
+explain (costs false) select * from s where s.nn not in (select l.nn from l left join s on l.nn=s.nn left join s1 on l.nn=s1.nn);
+
+select * from s where s.nn not in (select l.nn from l left join s on l.nn=s.nn left join s1 on l.nn=s1.nn);
+
+explain (costs false) select * from s where s.nn not in (select l.nn from l left join s on l.nn=s.nn right join s1 on l.nn=s1.nn);
+
+select * from s where s.nn not in (select l.nn from l left join s on l.nn=s.nn right join s1 on l.nn=s1.nn);
+
+explain (costs false) select * from s where (n,u,nn) not in (select l.n,l.u,l.nn from l left join s on l.nn = s.nn);
+
+select * from s where (n,u,nn) not in (select l.n,l.u,l.nn from l left join s on l.nn = s.nn);
+
+explain (costs false) select * from s where (n,u,nn) not in (select l.n,l.u,l.nn from l right join s on l.nn = s.nn);
+
+select * from s where (n,u,nn) not in (select l.n,l.u,l.nn from l left join s on l.nn = s.nn);
+
+--test reduce outer joins from outer query
+explain (costs false) select count(*) from s right join l on s.nn = l.nn where s.nn not in (select nn from l);
+
+select count(*) from s right join l on s.nn = l.nn where s.nn not in (select nn from l);
+
+explain (costs false) select count(*) from s right join l on s.nn = l.nn where s.nn not in (select nn from l) and s.u>0;
+
+select count(*) from s right join l on s.nn = l.nn where s.nn not in (select nn from l) and s.u>0;
+
+explain (costs false) select count(*) from s right join l on s.nn = l.nn join s1 on s.u = s1.u where s.nn not in (select nn from l);
+
+select count(*) from s right join l on s.nn = l.nn join s1 on s.u = s1.u where s.nn not in (select nn from l);
+
+explain (costs false) select count(*) from s right join l on s.nn = l.nn right join s1 on s.u = s1.u where s.nn not in (select nn from l);
+
+select count(*) from s right join l on s.nn = l.nn right join s1 on s.u = s1.u where s.nn not in (select nn from l);
+
+explain (costs false) select count(*) from s right join l on s.nn = l.nn left join s1 on s.u = s1.u where s.nn not in (select nn from l);
+
+select count(*) from s right join l on s.nn = l.nn left join s1 on s.u = s1.u where s.nn not in (select nn from l);
+
+--test reduce outer joins from subquery
+explain (costs false) select * from s where nn not in (select l.nn from l right join s on l.nn = s.nn);
+
+select * from s where nn not in (select l.nn from l right join s on l.nn = s.nn);
+
+explain (costs false) select * from s where nn not in (select l.nn from l right join s on l.nn = s.nn where l.u > 9);
+
+select * from s where nn not in (select l.nn from l right join s on l.nn = s.nn where l.u > 9);
+
+explain (costs false) select * from s where nn not in (select l.nn from l right join s on l.nn = s.nn where s.u > 9);
+
+select * from s where nn not in (select l.nn from l right join s on l.nn = s.nn where s.u > 9);
+
+explain (costs false) select * from s where nn not in (select l.nn from l right join s on l.nn = s.nn join s1 on l.n = s1.n);
+
+select * from s where nn not in (select l.nn from l right join s on l.nn = s.nn join s1 on l.n = s1.n);
+
+explain (costs false) select * from s where nn not in (select l.nn from l right join s on l.nn = s.nn right join s1 on l.n = s1.n);
+
+select * from s where nn not in (select l.nn from l right join s on l.nn = s.nn right join s1 on l.n = s1.n);
+
+explain (costs false) select * from s where nn not in (select l.nn from l right join s on l.nn = s.nn left join s1 on l.n = s1.n);
+
+select * from s where nn not in (select l.nn from l right join s on l.nn = s.nn left join s1 on l.n = s1.n);
+
+--test reduce outer join on outer and sub-query
+explain (costs false) select count(*) from s right join l on s.nn = l.nn join s1 on s.u = s1.u where s.nn not in (select l.nn from l right join s on l.nn = s.nn join s1 on l.n = s1.n);
+
+select count(*) from s right join l on s.nn = l.nn join s1 on s.u = s1.u where s.nn not in (select l.nn from l right join s on l.nn = s.nn join s1 on l.n = s1.n);
+
+explain (costs false) select count(*) from s right join l on s.nn = l.nn left join s1 on s.u = s1.u where s.nn not in (select l.nn from l right join s on l.nn = s.nn left join s1 on l.n = s1.n);
+
+select count(*) from s right join l on s.nn = l.nn left join s1 on s.u = s1.u where s.nn not in (select l.nn from l right join s on l.nn = s.nn left join s1 on l.n = s1.n);
+
+-- test union all
+explain (costs false) select * from s as t where not exists
+(select 1 from (select n as y from l union all
+				select u as y from s union all
+				select nn as y from s) as v where t.n=v.y or v.y is null) and n is not null;
+
+select * from s as t where not exists
+(select 1 from (select n as y from l union all
+				select u as y from s union all
+				select nn as y from s) as v where t.n=v.y or v.y is null) and n is not null;
+
+explain (costs false) select * from s where n not in
+(select n as y from l union all
+ select u as y from s union all
+ select nn as y from s);
+
+select * from s where n not in
+(select n as y from l union all
+ select u as y from s union all
+ select nn as y from s);
+
+explain (costs false) select count(*) from
+(select n as x from s union all select u as x from l) t where t.x not in
+(select nn from l);
+
+select count(*) from
+(select n as x from s union all select u as x from l) t where t.x not in
+(select nn from l);
+
+explain (costs false) select count(*) from
+(select n as x from s union all select n as x from l) t where t.x not in
+(select nn from empty);
+
+select count(*) from
+(select n as x from s union all select n as x from l) t where t.x not in
+(select nn from empty);
+
+explain (costs false) select count(*) from
+(select n as x from s union all select u as x from l) t where t.x not in
+(select n as y from l union all
+ select u as y from s union all
+ select nn as y from s);
+
+select count(*) from
+(select n as x from s union all select u as x from l) t where t.x not in
+(select n as y from l union all
+ select u as y from s union all
+ select nn as y from s);
+
+-- test multi-levels of NOT IN
+explain (costs false) select * from s where n not in (select n from s where n not in (select n from l));
+
+select * from s where n not in (select n from s where n not in (select n from l));
+
+explain (costs false) select * from s where n not in (select n from s where n not in (select u from l));
+
+select * from s where n not in (select n from s where n not in (select u from l));
+
+explain (costs false) select count(*) from s where u not in
+(select n from s1 where not exists
+ (select 1 from (select n from s1 where u not in (select n from l)) t where t.n = s.n));
+
+select count(*) from s where u not in
+(select n from s1 where not exists
+ (select 1 from (select n from s1 where u not in (select n from l)) t where t.n = s.n));
+
+explain (costs false) select * from s where n not in (select n from s1) and u not in (select u from s1) and nn not in (select nn from s1);
+
+select * from s where n not in (select n from s1) and u not in (select u from s1) and nn not in (select nn from s1);
+
+explain (costs false) select * from s where n not in (select n from s1) and u not in (select u from s1) and nn not in (select nn from l);
+
+select * from s where n not in (select n from s1) and u not in (select u from s1) and nn not in (select nn from l);
+
+explain (costs false) select count(*) from s where u not in
+(select n from s1 where not exists
+ (select 1 from (select n from s1 where u not in (select n from l)) t where t.n = s.n))
+and nn not in
+(select n from s1 where not exists
+ (select 1 from (select n from s1 where u not in (select n from l)) t where t.n = s.n));
+
+select count(*) from s where u not in
+(select n from s1 where not exists
+ (select 1 from (select n from s1 where u not in (select n from l)) t where t.n = s.n))
+and nn not in
+(select n from s1 where not exists
+ (select 1 from (select n from s1 where u not in (select n from l)) t where t.n = s.n));
+
+--test COALESCE
+explain (costs false) select * from s where COALESCE(n, -1) not in (select COALESCE(n, -1) from l);
+
+select * from s where COALESCE(n, -1) not in (select COALESCE(n, -1) from l);
+
+explain (costs false) select * from s where COALESCE(n, NULL, -1) not in (select COALESCE(n, NULL, -1) from l);
+
+select * from s where COALESCE(n, NULL, -1) not in (select COALESCE(n, NULL, -1) from l);
+
+explain (costs false) select * from s where COALESCE(n, NULL, NULL) not in (select COALESCE(n, NULL, NULL) from l);
+
+select * from s where COALESCE(n, NULL, NULL) not in (select COALESCE(n, NULL, NULL) from l);
+
+explain (costs false) select * from s where COALESCE(n, nn) not in (select COALESCE(n, nn) from l);
+
+select * from s where COALESCE(n, nn) not in (select COALESCE(n, nn) from l);
+
+explain (costs false) select * from s where COALESCE(nn, NULL) not in (select COALESCE(nn, NULL) from l);
+
+select * from s where COALESCE(nn, NULL) not in (select COALESCE(nn, NULL) from l);
+
+explain (costs false) select * from s where (COALESCE(n, -1), nn, COALESCE(n, u)) not in (select COALESCE(n, -1), nn, COALESCE(n, u) from l);
+
+select * from s where (COALESCE(n, -1), nn, COALESCE(n, u)) not in (select COALESCE(n, -1), nn, COALESCE(n, u) from l);
+
+-- test miscellaneous outer nullable cases
+
+explain (costs false) select * from s where (n,n) not in (select n,n from l);
+
+select * from s where (n,n) not in (select n,n from l);
+
+explain (costs false) select * from s right join l on s.nn = l.nn where (s.n,s.u,s.nn) not in (select n,u,nn from l);
+
+select * from s right join l on s.nn = l.nn where (s.n,s.u,s.nn) not in (select n,u,nn from l);
+
+explain (costs false) select count(*) from s right join l on s.nn = l.nn where (s.n,s.u,s.nn) not in (select n,u,nn from l where u < 0);
+
+select count(*) from s right join l on s.nn = l.nn where (s.n,s.u,s.nn) not in (select n,u,nn from l where u < 0);
+
+explain (costs false) select * from s where (n,n,n) not in (select distinct n,n,n from l where u > 0 limit 3) order by n;
+
+select * from s where (n,n,n) not in (select distinct n,n,n from l where u > 0 limit 3) order by n;
+
+--test outer has strict predicate or inner join
+explain (costs false) select * from s where n not in (select n from l) and n > 0;
+
+select * from s where n not in (select n from l) and n > 0;
+
+explain (costs false) select * from s where n not in (select n from l) and u > 0;
+
+select * from s where n not in (select n from l) and u > 0;
+
+explain (costs false) select * from s where n not in (select n from l) and n is not null;
+
+select * from s where n not in (select n from l) and n is not null;
+
+explain (costs false) select * from s join l on s.n = l.n where s.n not in (select n from l);
+
+select * from s join l on s.n = l.n where s.n not in (select n from l);
+
+explain (costs false) select count(*) from s right join l on s.n = l.n where s.n not in (select n from l);
+
+select count(*) from s right join l on s.n = l.n where s.n not in (select n from l);
+
+explain (costs false) select count(*) from s right join l on s.n = l.n join s1 on s.u = s1.u where s.n not in (select n from l);
+
+select count(*) from s right join l on s.n = l.n join s1 on s.u = s1.u where s.n not in (select n from l);
+
+explain (costs false) select count(*) from s join l on s.n = l.n right join s1 on s.u = s1.u where s.n not in (select n from l);
+
+select count(*) from s join l on s.n = l.n right join s1 on s.u = s1.u where s.n not in (select n from l);
+
+--test inner has strict predicate or inner join
+explain (costs false) select * from s where u not in (select n from l where n > 0);
+
+select * from s where u not in (select n from l where n > 0);
+
+explain (costs false) select * from s where u not in (select n from l where u > 0);
+
+select * from s where u not in (select n from l where u > 0);
+
+explain (costs false) select * from s where u not in (select n from l where n is not null);
+
+select * from s where u not in (select n from l where n is not null);
+
+explain (costs false) select * from s where u not in (select l.n from l join s on l.n=s.n);
+
+select * from s where u not in (select l.n from l join s on l.n=s.n);
+
+explain (costs false) select * from s where u not in (select l.n from l join s on l.u=s.u);
+
+select * from s where u not in (select l.n from l join s on l.u=s.u);
+
+explain (costs false) select * from s where u not in (select l.n from l join s on l.n = s.n);
+
+select * from s where u not in (select l.n from l join s on l.n = s.n);
+
+explain (costs false) select * from s where u not in (select l.n from l right join s on l.n = s.n);
+
+select * from s where u not in (select l.n from l right join s on l.n = s.n);
+
+explain (costs false) select * from s where u not in (select l.n from l right join s on l.n=s.n join s1 on l.n=s1.n);
+
+select * from s where u not in (select l.n from l right join s on l.n=s.n join s1 on l.n=s1.n);
+
+explain (costs false) select * from s where u not in (select l.n from l join s on l.n=s.n right join s1 on l.n=s1.n);
+
+select * from s where u not in (select l.n from l join s on l.n=s.n right join s1 on l.n=s1.n);
+
+--test both sides have strict predicate or inner join
+explain (costs false) select * from s where n not in (select n from l where n > 0) and n > 0;
+
+select * from s where n not in (select n from l where n > 0) and n > 0;
+
+explain (costs false) select * from s where n not in (select n from l where u > 0) and n > 0;
+
+select * from s where n not in (select n from l where u > 0) and n > 0;
+
+explain (costs false) select * from s where n not in (select n from l where n > 0) and u > 0;
+
+select * from s where n not in (select n from l where n > 0) and u > 0;
+
+explain (costs false) select * from s right join l on s.n = l.n join s1 on s.u = s1.u where s.n not in (select l.n from l right join s on l.n=s.n join s s1 on l.n=s1.n);
+
+select * from s right join l on s.n = l.n join s1 on s.u = s1.u where s.n not in (select l.n from l right join s on l.n=s.n join s s1 on l.n=s1.n);
+
+explain (costs false) select * from s right join l on s.n = l.n join s1 on s.u = s1.u where s.n not in (select l.n from l join s on l.n=s.n right join s s1 on l.n=s1.n);
+
+select * from s right join l on s.n = l.n join s1 on s.u = s1.u where s.n not in (select l.n from l join s on l.n=s.n right join s s1 on l.n=s1.n);
+
+explain (costs false) select * from s join l on s.n = l.n right join s1 on s.u = s1.u where s.n not in (select l.n from l join s on l.n=s.n right join s s1 on l.n=s1.n);
+
+select * from s join l on s.n = l.n right join s1 on s.u = s1.u where s.n not in (select l.n from l join s on l.n=s.n right join s s1 on l.n=s1.n);
+
+--JIRA-7279 CTE with NOT IN
+create table public.testing
+(
+a integer,
+b integer,
+c integer
+);
+
+explain (costs false) with
+selected(a,b,c) as (values(1,2,3)),
+updated(d,e,f) as (values(4,5,6))
+insert into public.testing
+select * from selected
+where (a,b,c) not in (select d,e,f from updated);
+
+with
+selected(a,b,c) as (values(1,2,3)),
+updated(d,e,f) as (values(4,5,6))
+insert into public.testing
+select * from selected
+where (a,b,c) not in (select d,e,f from updated);
+
+select * from public.testing;
+
+explain (costs false) with
+selected(a,b,c) as (select u, n, nn from s),
+updated(d,e,f) as (select u, n, nn from l)
+insert into public.testing
+select * from selected
+where (a,b,c) not in (select d,e,f from updated);
+
+with
+selected(a,b,c) as (select u, n, nn from s),
+updated(d,e,f) as (select u, n, nn from l)
+insert into public.testing
+select * from selected
+where (a,b,c) not in (select d,e,f from updated);
+
+select * from public.testing;
+
+-- expect to get Hash Anti Join
+explain (costs false) with
+selected(a,b,c) as (select u, n, nn from s),
+updated(d,e,f) as (select u, n, nn from l)
+insert into public.testing
+select * from selected
+where a not in (select d from updated);
+
+with
+selected(a,b,c) as (select u, n, nn from s),
+updated(d,e,f) as (select u, n, nn from l)
+insert into public.testing
+select * from selected
+where a not in (select d from updated);
+
+select * from public.testing;
+
+explain (costs false) with
+selected(a,b,c) as (select u, n, nn from s),
+updated(d,e,f) as (select u, n, nn from l)
+insert into public.testing
+select * from selected
+where b not in (select e from updated);
+
+with
+selected(a,b,c) as (select u, n, nn from s),
+updated(d,e,f) as (select u, n, nn from l)
+insert into public.testing
+select * from selected
+where b not in (select e from updated);
+
+select * from public.testing;
+
+explain (costs false) with
+selected(a,b,c) as (select u, n, nn from s),
+updated(d,e,f) as (select u, n, nn from l)
+insert into public.testing
+select * from selected
+where b not in (select d from updated);
+
+with
+selected(a,b,c) as (select u, n, nn from s),
+updated(d,e,f) as (select u, n, nn from l)
+insert into public.testing
+select * from selected
+where b not in (select d from updated);
+
+select * from public.testing;
+
+-- two levels of NOT IN with CTE, 2nd NOT IN
+-- subquery access CTE two levels above
+explain (costs false) with
+selected(a,b,c) as (select u, n, nn from s),
+updated(d,e,f) as (select u, n, nn from l)
+insert into public.testing
+select * from selected
+where (a,b,c) not in (select d,e,f from updated
+      where d not in (select a from selected));
+
+with
+selected(a,b,c) as (select u, n, nn from s),
+updated(d,e,f) as (select u, n, nn from l)
+insert into public.testing
+select * from selected
+where (a,b,c) not in (select d,e,f from updated
+      where d not in (select a from selected));
+
+select * from public.testing;
+
+-- With clause inside a query block
+explain select count(distinct t.a) from
+(with
+selected(a,b,c) as (select u, n, nn from s),
+updated(d,e,f) as (select u, n, nn from l)
+select * from selected where (a,b,c) not in
+(select d,e,f from updated
+ where d not in (select a from selected))) as t;
+
+select count(distinct t.a) from
+(with
+selected(a,b,c) as (select u, n, nn from s),
+updated(d,e,f) as (select u, n, nn from l)
+select * from selected where (a,b,c) not in
+(select d,e,f from updated
+ where d not in (select a from selected))) as t;
+
+-- With clause in subquery, can't flatten subquery to anti join
+explain (costs false) with
+selected(a,b,c) as (select u, n, nn from s)
+insert into public.testing
+select * from selected where (a,b,c) not in
+(with updated (d,e,f) as (select u, n, nn from l)
+select d,e,f from updated);
+
+with
+selected(a,b,c) as (select u, n, nn from s)
+insert into public.testing
+select * from selected where (a,b,c) not in
+(with updated (d,e,f) as (select u, n, nn from l)
+select d,e,f from updated);
+
+select * from public.testing;
+
+-- With clause in subquery, subsubquery access CTE in subquery
+explain (costs false) with
+selected(a,b,c) as (select u, n, nn from s)
+insert into public.testing
+select * from selected where (a,b,c) not in
+(
+with updated(d,e,f) as (select u, n, nn from l)
+select d,e,f from updated where d not in (select d from updated)
+);
+
+with
+selected(a,b,c) as (select u, n, nn from s)
+insert into public.testing
+select * from selected where (a,b,c) not in
+(
+with updated(d,e,f) as (select u, n, nn from l)
+select d,e,f from updated where d not in (select d from updated)
+);
+
+select * from public.testing;
+
+-- Recursive CTE
+CREATE TABLE employees (
+  id serial,
+  name varchar(255),
+  manager_id int
+);
+
+INSERT INTO employees VALUES (1, 'Mark', null);
+INSERT INTO employees VALUES (2, 'John', 1);
+INSERT INTO employees VALUES (3, 'Dan', 2);
+INSERT INTO employees VALUES (4, 'Clark', 1);
+INSERT INTO employees VALUES (5, 'Linda', 2);
+INSERT INTO employees VALUES (6, 'Willy', 2);
+INSERT INTO employees VALUES (7, 'Barack', 2);
+INSERT INTO employees VALUES (8, 'Elen', 2);
+INSERT INTO employees VALUES (9, 'Kate', 3);
+INSERT INTO employees VALUES (10, 'Terry', 4);
+
+WITH RECURSIVE managertree AS (
+  SELECT id, name, manager_id
+  FROM employees
+  WHERE id = 2
+  UNION ALL
+  SELECT e.id, e.name, e.manager_id
+  FROM employees e
+  INNER JOIN managertree mtree ON mtree.id = e.manager_id
+)
+SELECT *
+FROM managertree;
+
+-- NOT IN subquery access Recursive CTE
+EXPLAIN (COSTS FALSE) WITH RECURSIVE managertree AS (
+  SELECT id, name, manager_id
+  FROM employees
+  WHERE id = 2
+  UNION ALL
+  SELECT e.id, e.name, e.manager_id
+  FROM employees e
+  INNER JOIN managertree mtree ON mtree.id = e.manager_id
+)
+SELECT *
+FROM managertree mt WHERE mt.manager_id NOT IN (SELECT id FROM managertree);
+
+WITH RECURSIVE managertree AS (
+  SELECT id, name, manager_id
+  FROM employees
+  WHERE id = 2
+  UNION ALL
+  SELECT e.id, e.name, e.manager_id
+  FROM employees e
+  INNER JOIN managertree mtree ON mtree.id = e.manager_id
+)
+SELECT *
+FROM managertree mt WHERE mt.manager_id NOT IN (SELECT id FROM managertree);
+
+-- NOT IN under UNION ALL inside Recursive CTE
+EXPLAIN (COSTS FALSE) WITH RECURSIVE managertree AS (
+  SELECT id, name, manager_id
+  FROM employees
+  WHERE id = 2
+  UNION ALL
+  SELECT e.id, e.name, e.manager_id
+  FROM employees e
+  INNER JOIN managertree mtree ON
+  (mtree.id = e.manager_id AND
+  mtree.manager_id NOT IN (SELECT manager_id FROM employees)
+  )
+)
+SELECT *
+FROM managertree;
+
+WITH RECURSIVE managertree AS (
+  SELECT id, name, manager_id
+  FROM employees
+  WHERE id = 2
+  UNION ALL
+  SELECT e.id, e.name, e.manager_id
+  FROM employees e
+  INNER JOIN managertree mtree ON
+  (mtree.id = e.manager_id AND
+  mtree.manager_id NOT IN (SELECT manager_id FROM employees)
+  )
+)
+SELECT *
+FROM managertree;
+
+--Manfred-7613 CTE NOT IN with Union All
+create table cocotero as (
+	select * from(
+	values(1,2,3)) as data(a,b,c)
+);
+
+explain (costs off) with selected as (
+	select *
+	from cocotero
+),
+updated as (
+	update cocotero
+	set a = 3
+	from selected
+	where cocotero.a = selected.a
+	returning selected.a,selected.b,selected.c
+),
+inserted as (
+	insert into cocotero
+	select *
+	from selected
+	where a not in (select a from updated)
+	returning *
+)
+select 'updated' as action, count(*) as lines from updated
+union all
+select 'inserted' as action, count(*) as lines from inserted;
+
+with selected as (
+	select *
+	from cocotero
+),
+updated as (
+	update cocotero
+	set a = 3
+	from selected
+	where cocotero.a = selected.a
+	returning selected.a,selected.b,selected.c
+),
+inserted as (
+	insert into cocotero
+	select *
+	from selected
+	where a not in (select a from updated)
+	returning *
+)
+select 'updated' as action, count(*) as lines from updated
+union all
+select 'inserted' as action, count(*) as lines from inserted;
+
+--test enable_not_in_transform
+explain (costs off) select count(*) from s where s.u not in (select l.u from l);
+
+set enable_not_in_transform = off;
+
+explain (costs off) select count(*) from s where s.u not in (select l.u from l);
+
+-- clean up
+reset work_mem;
+reset enable_not_in_transform;
+drop table s;
+drop table s1;
+drop table l;
+drop table empty;
+drop table public.testing;
+drop table employees;
+drop table cocotero;
#57Andrey Lepikhov
a.lepikhov@postgrespro.ru
In reply to: Li, Zheng (#56)
Re: NOT IN subquery optimization

At the top of the thread your co-author argued the beginning of this
work with "findings about the performance of PostgreSQL, MySQL, and
Oracle on various subqueries:"

https://antognini.ch/2017/12/how-well-a-query-optimizer-handles-subqueries/

I launched this test set with your "not_in ..." patch. Your optimization
improves only results of D10-D13 queries. Nothing has changed for bad
plans of the E20-E27 and F20-F27 queries.

For example, we can replace E20 query:
SELECT * FROM large WHERE n IN (SELECT n FROM small WHERE small.u =
large.u); - execution time: 1370 ms, by
SELECT * FROM large WHERE EXISTS (SELECT n,u FROM small WHERE (small.u =
large.u) AND (large.n = small.n
)) AND n IS NOT NULL; - execution time: 0.112 ms

E21 query:
SELECT * FROM large WHERE n IN (SELECT nn FROM small WHERE small.u =
large.u); - 1553 ms, by
SELECT * FROM large WHERE EXISTS (SELECT nn FROM small WHERE (small.u =
large.u) AND (small.nn = large.n)); - 0.194 ms

F27 query:
SELECT * FROM large WHERE nn NOT IN (SELECT nn FROM small WHERE small.nu
= large.u); - 1653.048 ms, by
SELECT * FROM large WHERE NOT EXISTS (SELECT nn,nu FROM small WHERE
(small.nu = large.u) AND (small.nn = large.nn)); - 274.019 ms

Are you planning to make another patch for these cases?

Also i tried to increase work_mem up to 2GB: may be hashed subqueries
can improve situation? But hashing is not improved execution time of the
queries significantly.

On your test cases (from the comments of the patch) the subquery hashing
has the same execution time with queries No.13-17. At the queries
No.1-12 it is not so slow as without hashing, but works more slowly (up
to 3 orders) than NOT IN optimization.

On 12/2/19 9:25 PM, Li, Zheng wrote:

Here is the latest rebased patch.

--
Andrey Lepikhov
Postgres Professional
https://postgrespro.com
The Russian Postgres Company

#58Li, Zheng
zhelli@amazon.com
In reply to: Andrey Lepikhov (#57)
Re: NOT IN subquery optimization

Hi Andrey,

Thanks for the comment!

The unimproved cases you mentioned all fall into the category “correlated subquery”. This category is explicitly disallowed by existing code to convert to join in convert_ANY_sublink_to_join:
/*
* The sub-select must not refer to any Vars of the parent query. (Vars of
* higher levels should be okay, though.)
*/
if (contain_vars_of_level((Node *) subselect, 1))
return NULL;

I think this is also the reason why hashed subplan is not used for such subqueries.

It's probably not always safe to convert a correlated subquery to join. We need to find out/prove when it’s safe/unsafe to convert such ANY subquery if we were to do so.

Regards,
-----------
Zheng Li
AWS, Amazon Aurora PostgreSQL

On 1/5/20, 1:12 AM, "Andrey Lepikhov" <a.lepikhov@postgrespro.ru> wrote:

At the top of the thread your co-author argued the beginning of this
work with "findings about the performance of PostgreSQL, MySQL, and
Oracle on various subqueries:"

https://antognini.ch/2017/12/how-well-a-query-optimizer-handles-subqueries/

I launched this test set with your "not_in ..." patch. Your optimization
improves only results of D10-D13 queries. Nothing has changed for bad
plans of the E20-E27 and F20-F27 queries.

For example, we can replace E20 query:
SELECT * FROM large WHERE n IN (SELECT n FROM small WHERE small.u =
large.u); - execution time: 1370 ms, by
SELECT * FROM large WHERE EXISTS (SELECT n,u FROM small WHERE (small.u =
large.u) AND (large.n = small.n
)) AND n IS NOT NULL; - execution time: 0.112 ms

E21 query:
SELECT * FROM large WHERE n IN (SELECT nn FROM small WHERE small.u =
large.u); - 1553 ms, by
SELECT * FROM large WHERE EXISTS (SELECT nn FROM small WHERE (small.u =
large.u) AND (small.nn = large.n)); - 0.194 ms

F27 query:
SELECT * FROM large WHERE nn NOT IN (SELECT nn FROM small WHERE small.nu
= large.u); - 1653.048 ms, by
SELECT * FROM large WHERE NOT EXISTS (SELECT nn,nu FROM small WHERE
(small.nu = large.u) AND (small.nn = large.nn)); - 274.019 ms

Are you planning to make another patch for these cases?

Also i tried to increase work_mem up to 2GB: may be hashed subqueries
can improve situation? But hashing is not improved execution time of the
queries significantly.

On your test cases (from the comments of the patch) the subquery hashing
has the same execution time with queries No.13-17. At the queries
No.1-12 it is not so slow as without hashing, but works more slowly (up
to 3 orders) than NOT IN optimization.

On 12/2/19 9:25 PM, Li, Zheng wrote:

Here is the latest rebased patch.

--
Andrey Lepikhov
Postgres Professional
https://postgrespro.com
The Russian Postgres Company

#59Andrey Lepikhov
a.lepikhov@postgrespro.ru
In reply to: Li, Zheng (#58)
Re: NOT IN subquery optimization

On 1/7/20 12:34 AM, Li, Zheng wrote:

Hi Andrey,

Thanks for the comment!

The unimproved cases you mentioned all fall into the category “correlated subquery”. This category is explicitly disallowed by existing code to convert to join in convert_ANY_sublink_to_join:
/*
* The sub-select must not refer to any Vars of the parent query. (Vars of
* higher levels should be okay, though.)
*/
if (contain_vars_of_level((Node *) subselect, 1))
return NULL;

I think this is also the reason why hashed subplan is not used for such subqueries.

It's probably not always safe to convert a correlated subquery to join. We need to find out/prove when it’s safe/unsafe to convert such ANY subquery if we were to do so.

Maybe this part of code contains logical error?
You optimize only the special case of the "NOT IN" expression, equal to
NOT EXISTS. The convert_EXISTS_sublink_to_join() routine can contain
vars of the parent query.
May be you give an trivial example for this problem?

--
Andrey Lepikhov
Postgres Professional
https://postgrespro.com
The Russian Postgres Company

#60Tom Lane
tgl@sss.pgh.pa.us
In reply to: Li, Zheng (#56)
1 attachment(s)
Re: NOT IN subquery optimization

"Li, Zheng" <zhelli@amazon.com> writes:

Here is the latest rebased patch.

I noticed that the cfbot is failing to test this because of some trivial
merge conflicts, so here's a re-rebased version.

I haven't reviewed this in any detail, but here's a couple of notes
from having quickly looked through the patch:

* I find it entirely unacceptable to stick some planner temporary
fields into struct SubLink. If you need that storage you'll have
to find some other place to put it. But in point of fact I don't
think you need it; it doesn't look to me to be critical to generate
the subquery's plan any earlier than make_subplan would have done it.
Moreover, you should really strive to *not* do that, because it's
likely to get in the way of other future optimizations. As the
existing comment in make_subplan already suggests, we might want to
delay subplan planning even further than that in future.

* I'm also not too happy with the (undocumented) rearrangement of
reduce_outer_joins. There's a specific sequence of processing that
that's involved in, as documented at the top of prepjointree.c, and
I doubt that you can just randomly call it from other places and expect
good results. In particular, since JOIN alias var flattening won't have
happened yet when this code is being run from pull_up_sublinks, it's
unlikely that reduce_outer_joins will reliably get the same answers it
would get normally. I also wonder whether it's safe to make the
parsetree changes it makes earlier than normal, and whether it will be
problematic to run it twice on the same tree, and whether its rather
indirect connection to distribute_qual_to_rels is going to misbehave.

* The proposed test additions seem to about triple the runtime of
subselect.sql. This seems excessive. I also wonder why it's necessary
for this test to build its own large tables; couldn't it re-use ones
that already exist in the regression database?

* Not really sure that we need a new planner GUC for this, but if we
do, it needs to be documented.

regards, tom lane

Attachments:

not_in_to_anti_join_transformation_v2_4.patchtext/x-diff; charset=us-ascii; name=not_in_to_anti_join_transformation_v2_4.patchDownload
diff --git a/src/backend/optimizer/plan/planner.c b/src/backend/optimizer/plan/planner.c
index 5da0528..16ce707 100644
--- a/src/backend/optimizer/plan/planner.c
+++ b/src/backend/optimizer/plan/planner.c
@@ -992,7 +992,7 @@ subquery_planner(PlannerGlobal *glob, Query *parse,
 	 * preprocessing.
 	 */
 	if (hasOuterJoins)
-		reduce_outer_joins(root);
+		reduce_outer_joins(parse);
 
 	/*
 	 * If we have any RTE_RESULT relations, see if they can be deleted from
diff --git a/src/backend/optimizer/plan/subselect.c b/src/backend/optimizer/plan/subselect.c
index 3650e83..fa64281 100644
--- a/src/backend/optimizer/plan/subselect.c
+++ b/src/backend/optimizer/plan/subselect.c
@@ -39,6 +39,8 @@
 #include "utils/syscache.h"
 
 
+bool enable_not_in_transform;
+
 typedef struct convert_testexpr_context
 {
 	PlannerInfo *root;
@@ -158,8 +160,7 @@ get_first_col_type(Plan *plan, Oid *coltype, int32 *coltypmod,
  * subquery itself is in a resjunk tlist entry whose value is uninteresting).
  */
 static Node *
-make_subplan(PlannerInfo *root, Query *orig_subquery,
-			 SubLinkType subLinkType, int subLinkId,
+make_subplan(PlannerInfo *root, SubLink *sublink,
 			 Node *testexpr, bool isTopQual)
 {
 	Query	   *subquery;
@@ -171,6 +172,9 @@ make_subplan(PlannerInfo *root, Query *orig_subquery,
 	Plan	   *plan;
 	List	   *plan_params;
 	Node	   *result;
+	Query      *orig_subquery = (Query *) sublink->subselect;
+	SubLinkType subLinkType = sublink->subLinkType;
+	int         subLinkId = sublink->subLinkId;
 
 	/*
 	 * Copy the source Query node.  This is a quick and dirty kluge to resolve
@@ -216,24 +220,33 @@ make_subplan(PlannerInfo *root, Query *orig_subquery,
 	/* plan_params should not be in use in current query level */
 	Assert(root->plan_params == NIL);
 
-	/* Generate Paths for the subquery */
-	subroot = subquery_planner(root->glob, subquery,
-							   root,
-							   false, tuple_fraction);
+	if(sublink->subroot != NULL &&
+		sublink->subplan != NULL)
+	{
+		subroot = (PlannerInfo *) sublink->subroot;
+		plan = (Plan *) sublink->subplan;
+	}
+	else
+	{
+		/* Generate Paths for the subquery */
+		subroot = subquery_planner(root->glob, subquery,
+								   root,
+								   false, tuple_fraction);
+
+		/*
+		 * Select best Path and turn it into a Plan.  At least for now, there
+		 * seems no reason to postpone doing that.
+		 */
+		final_rel = fetch_upper_rel(subroot, UPPERREL_FINAL, NULL);
+		best_path = get_cheapest_fractional_path(final_rel, tuple_fraction);
+
+		plan = create_plan(subroot, best_path);
+	}
 
 	/* Isolate the params needed by this specific subplan */
 	plan_params = root->plan_params;
 	root->plan_params = NIL;
 
-	/*
-	 * Select best Path and turn it into a Plan.  At least for now, there
-	 * seems no reason to postpone doing that.
-	 */
-	final_rel = fetch_upper_rel(subroot, UPPERREL_FINAL, NULL);
-	best_path = get_cheapest_fractional_path(final_rel, tuple_fraction);
-
-	plan = create_plan(subroot, best_path);
-
 	/* And convert to SubPlan or InitPlan format. */
 	result = build_subplan(root, plan, subroot, plan_params,
 						   subLinkType, subLinkId,
@@ -1173,6 +1186,389 @@ inline_cte_walker(Node *node, inline_cte_walker_context *context)
 	return expression_tree_walker(node, inline_cte_walker, context);
 }
 
+/* Returns a List of Nodes from the testexpr of an Any SubLink */
+static List *
+getTestExpr(SubLink *sublink)
+{
+	Node * testexpr = sublink->testexpr;
+	Assert(testexpr);
+
+	/* single expression */
+	if(IsA(testexpr, OpExpr))
+	{
+		OpExpr	*opexpr = (OpExpr *) testexpr;
+		Node	*testnode = linitial(opexpr->args);
+
+		return list_make1(testnode);
+	}
+	/* multi-expression */
+	else if(IsA(testexpr, BoolExpr))
+	{
+		BoolExpr	*bexpr = (BoolExpr *) testexpr;
+		ListCell		*lc;
+		Node		*node;
+		List			*result = NULL;
+
+		foreach(lc, bexpr->args)
+		{
+			node = lfirst(lc);
+			if(IsA(node, OpExpr))
+			{
+				OpExpr *expr = (OpExpr *) node;
+				result = lappend(result, linitial(expr->args));
+			}
+			else
+			{
+				elog(ERROR, "unrecognized node type for testexpr: %d",
+						(int) nodeTag(node));
+			}
+		}
+		return result;
+	}
+	else
+	{
+		elog(ERROR, "unrecognized node type for testexpr: %d",
+				(int) nodeTag(testexpr));
+	}
+}
+
+/* Try to reduce outer joins if there is one in the Query */
+static void
+reduce_outer_joins_NOT_IN(Query *parse)
+{
+	ListCell				*lc;
+	RangeTblEntry	*rte;
+
+	foreach(lc, parse->rtable)
+	{
+		rte = (RangeTblEntry *) lfirst(lc);
+		/* try to reduce outer joins if there is one */
+		if (rte->rtekind == RTE_JOIN &&
+				IS_OUTER_JOIN(rte->jointype))
+		{
+			reduce_outer_joins(parse);
+			return;
+		}
+	}
+}
+
+/*
+ * Make clause NOT EXISTS
+ * (select 1 from t2 where p) for the NOT IN to ANTI JOIN
+ * transformation.
+ */
+static
+Node *
+makeExistsTest_NOT_IN(Query *subselect)
+{
+	BoolExpr *notExpr = makeNode(BoolExpr);
+	SubLink *exists = makeNode(SubLink);
+	Query *selectOne =  copyObject(subselect);
+	Const *oneconst;
+	TargetEntry *dummyte;
+
+	/* modify subselect target list to contain a dummy const 1 */
+	oneconst = makeConst(INT4OID,
+						 -1,
+						 InvalidOid,
+						 sizeof(int32),
+						 Int32GetDatum(1),
+						 false, /* isnull */
+						 true); /* pass by value */
+	dummyte = makeTargetEntry((Expr *) oneconst,
+							  1,
+							  "one",
+							  false /* resjunk */ );
+	selectOne->targetList = list_make1(dummyte);
+
+	/* make EXISTS(select 1 from t2 where p) */
+	exists->subLinkType = EXISTS_SUBLINK;
+	exists->subLinkId = 0;
+	exists->subselect = (Node *) selectOne;
+	exists->location = -1;
+	exists->subroot = NULL;
+	exists->subplan = NULL;
+
+	/* make NOT EXISTS(select 1 from t2 where p) */
+	notExpr->boolop = NOT_EXPR;
+	notExpr->args = list_make1(exists);
+	notExpr->location = -1;
+
+	return (Node *) notExpr;
+}
+
+/*
+ *Allow transformation from NOT IN query to ANTI JOIN if ALL of the
+ * following conditions are true:
+ * 1. The GUC enable_not_in_transform is set to true.
+ * 2. the NOT IN subquery is not hashable, in which case an expensive
+ *		subplan will be generated if we don't transform.
+ * 3.. the subquery does not define any CTE.
+ */
+static bool
+allow_NOT_IN_transformation(PlannerInfo *root,
+													  SubLink *sublink)
+{
+	Query			*subselect = (Query *) sublink->subselect;
+	PlannerInfo 	*subroot;
+	double			 tuple_fraction;
+	RelOptInfo		*final_rel;
+	Path			*best_path;
+	Plan			*plan;
+
+	if(! enable_not_in_transform)
+		return false;
+
+	/*
+	 * Can't flatten if it contains WITH.  (We could arrange to pull up the
+	 * WITH into the parent query's cteList, but that risks changing the
+	 * semantics, since a WITH ought to be executed once per associated query
+	 * call.)  Note that convert_ANY_sublink_to_join doesn't have to reject
+	 * this case, since it just produces a subquery RTE that doesn't have to
+	 * get flattened into the parent query.
+	 */
+	if(subselect->cteList)
+		return false;
+
+	/* For ALL and ANY subplans, we will be
+	 * able to stop evaluating if the test condition fails or matches, so very
+	 * often not all the tuples will be retrieved; for lack of a better idea,
+	 * specify 50% retrieval.
+	 */
+	tuple_fraction = 0.5;
+	/*
+	 * Generate Paths for the subquery, use a copied version of the subquery
+	 * so that the existing one doesn't get modified.
+	 */
+	subroot = subquery_planner(root->glob, copyObject(subselect),
+													root, false, tuple_fraction);
+
+	/* Select best Path and turn it into a Plan. */
+	final_rel = fetch_upper_rel(subroot, UPPERREL_FINAL, NULL);
+	best_path = get_cheapest_fractional_path(final_rel, tuple_fraction);
+
+	plan = create_plan(subroot, best_path);
+
+	sublink->subroot = (Node *) subroot;
+	sublink->subplan = (Node *) plan;
+	/*
+	 * Punt if subplan is hashable since using hashed subplan is almost like
+	 * doing a Hash Anti Join, we probably can't do better than that.
+	 */
+	if(subplan_is_hashable(plan))
+	{
+		return false;
+	}
+
+	return true;
+}
+
+/*
+ * do the following NOT IN to ANTI JOIN conversions:
+ *
+ * When x is non-nullable:
+ * t1.x not in (t2.y where p) => ANTI JOIN
+ * t1, t2 on join condition (t1.x=t2.y or t2.y IS NULL) and p.
+ * the above predicate "t2.y IS NULL" can be removed if y
+ * is also non-nullable.
+ *
+ * When x is nullable:
+ * t1.x not in (t2.y where p) => ANTI JOIN
+ * t1 (Filter: t1.x is not null or not exists (select 1 from t2 where p)),
+ * t2 on join condition (t1.x=t2.y or t2.y is null) and p.
+ * the above predicate "t2.y IS NULL" can be removed if y
+ * is also non-nullable.
+ *
+ * The multi-expression case is just ANDs of the single-
+ * expression case.
+ */
+static bool
+convert_NOT_IN_to_join(PlannerInfo *root, Node **quals,
+									SubLink *sublink, List *subquery_vars,
+									Node **pullout)
+{
+	Query			*parse = root->parse;
+	Query			*subselect = (Query *) sublink->subselect;
+	List			*testnodes = getTestExpr(sublink);
+	bool			 outerNonNull;
+	bool			 innerNonNull;
+	NullTest 		*nt;
+
+	/*
+	 * Try reduce_outer_joins since outer join affects the nullability test that's coming up next.
+	 * We have to call reduce_outer_joins for outer and inner query separately because we
+	 * don't have a global range table yet.
+	 */
+	reduce_outer_joins_NOT_IN(parse);
+
+	reduce_outer_joins_NOT_IN(subselect);
+
+	Assert(testnodes);
+	outerNonNull =
+					list_hasnonnullable(testnodes, parse);
+	innerNonNull =
+					list_hasnonnullable(subselect->targetList, subselect);
+
+	/* Single-expression case, do the following:
+	 * When x is non-nullable:
+	 * t1.x not in (t2.y where p) => ANTI JOIN
+	 * t1, t2 on join condition (t1.x=t2.y or t2.y IS NULL) and p.
+	 * the above predicate "t2.y IS NULL" can be removed if y
+	 * is also non-nullable.
+	 *
+	 * When x is nullable:
+	 * t1.x not in (t2.y where p) => ANTI JOIN
+	 * t1 (Filter: t1.x is not null or not exists (select 1 from t2 where p)),
+	 * t2 on join condition (t1.x=t2.y or t2.y is null) and p.
+	 * the above predicate "t2.y IS NULL" can be removed if y
+	 * is also non-nullable.
+	 */
+	if(IsA(*quals, OpExpr))
+	{
+		/*  Add "OR y IS NULL" if y is nullable */
+		if(!innerNonNull)
+		{
+			/* make expr y IS NULL */
+			nt = makeNode(NullTest);
+			nt->arg = (Expr *)linitial(subquery_vars);
+			nt->nulltesttype = IS_NULL;
+			nt->argisrow = false;
+			nt->location = -1;
+
+			/* make orclause (x = y OR y IS NULL) */
+			*quals = (Node *)make_orclause(list_make2(*quals,
+								(Node *)nt));
+		}
+
+		/*
+		 * if x is nullable, make the following filter for t1 :
+		 * x IS NOT NULL or NOT EXISTS (select 1 from t2 where p)
+		 */
+		if(!outerNonNull)
+		{
+			Node * existsTest = NULL;
+
+			/* make expr x IS NOT NULL */
+			nt = makeNode(NullTest);
+			nt->arg = (Expr *) linitial(testnodes);
+			nt->nulltesttype = IS_NOT_NULL;
+			nt->argisrow = false;
+			nt->location = -1;
+
+			existsTest = makeExistsTest_NOT_IN(subselect);
+			/* make x IS NOT NULL OR NOT EXISTS (select 1 from t2 where p) */
+			*pullout = (Node *)make_orclause(list_make2(
+									(Node *) nt, existsTest));
+		}
+	}
+	/*
+	 * Multi-expression case:
+	 * If all xi's are nullable:
+	 * (x1, x2, ... xn) not in (y1, y2, ... yn ) =>
+	 * ANTI JOIN t1,
+	 * t2 on join condition:
+	 * ((t1.x1 = t2.y1) and ... (t1.xi = t2.yi) ... and
+	 * (t1.xn = t2.yn)) is NOT FALSE.
+	 *
+	 * If at least one xi is non-nuallable:
+	 * (x1, x2, ... xn) not in (y1, y2, ... yn ) =>
+	 * ANTI JOIN t1,
+	 * t2 on join condition:
+	 * (t1.x1 = t2.y1 or t2.y1 is NULL) and ...
+	 * (t1.xi = t2.yi or t2.yi is NULL or t1.xi is NULL) ... and
+	 * (t1.xn = t2.yn or t2.yn is NULL).
+	 */
+	else if(IsA(*quals, BoolExpr))
+	{
+		/*
+		 * Add IS NOT FALSE on top of the join condition if ALL x_i's are nullable
+		 */
+		if(!outerNonNull)
+		{
+			BooleanTest *btest;
+
+			btest = makeNode(BooleanTest);
+			btest->arg = (Expr *) *quals;
+			btest->booltesttype = IS_NOT_FALSE;
+			*quals = (Node *) btest;
+		}
+		else
+		{
+			ListCell			*qualc;
+			TargetEntry	*te;
+			ListCell			*xc = list_head(testnodes);
+			ListCell			*yc = list_head(subquery_vars);
+			ListCell			*ytlc = list_head(subselect->targetList);
+			List				*quallist = ((BoolExpr *)*quals)->args;
+			Node				*joinCondition = NULL;
+			bool				 xnonNull;
+			bool				 ynonNull;
+
+			/* Reconstruct quals in the loop */
+			*quals = NULL;
+			foreach(qualc, quallist)
+			{
+				te = (TargetEntry *)lfirst(ytlc);
+				/* x_i = y_i */
+				joinCondition = lfirst(qualc);
+				ynonNull = is_node_nonnullable((Node*)te, subselect);
+
+				/* append y_i IS NULL to x_i = y_i if y_i is non-nullable */
+				if(!ynonNull)
+				{
+					/* make expr y_i IS NULL */
+					nt = makeNode(NullTest);
+					nt->arg = (Expr *)lfirst(yc);
+					nt->nulltesttype = IS_NULL;
+					nt->argisrow = false;
+					nt->location = -1;
+
+					/* make orclause (x_i = y_i OR y_i IS NULL) */
+					joinCondition = (Node *)make_orclause(list_make2(joinCondition,
+											(Node *)nt));
+				}
+
+				/*
+				 * Append "OR x_i is null" to the join condition if x_i is nullable.
+				 * Notice at least one x_i should be non-nullable because the all
+				 * x_i's nullable case is handled earlier by adding "IS NOT FALSE"
+				 * on top of the join condition.
+				 */
+				xnonNull = is_node_nonnullable(lfirst(xc), parse);
+				if(!xnonNull)
+				{
+					/* make expr x_i IS NULL */
+					nt = makeNode(NullTest);
+					nt->arg = (Expr *)lfirst(xc);
+					nt->nulltesttype = IS_NULL;
+					nt->argisrow = false;
+					nt->location = -1;
+
+					/* make orclause (x_i = y_i OR y_i IS NULL OR x_i IS NULL) */
+					joinCondition = (Node *)make_orclause(list_make2(joinCondition,
+											(Node *)nt));
+				}
+
+				/*
+				 * Now append joinCondition to quals as one andclause.
+				 * (x_i = y_i OR y_i IS NULL OR x_i IS NULL) AND
+				 * (x_j = y_j OR y_j IS NULL OR x_j IS NULL)...
+				 */
+				*quals = (Node *)make_andclause(list_make2(*quals, joinCondition));
+				xc = lnext(testnodes, xc);
+				yc = lnext(subquery_vars, yc);
+				ytlc = lnext(subselect->targetList, ytlc);
+			}
+		}
+	}
+	/* quals should be either OpExpr or BoolExpr, otherwise don't convert */
+	else
+	{
+		return false;
+	}
+
+	return true;
+}
 
 /*
  * convert_ANY_sublink_to_join: try to convert an ANY SubLink to a join
@@ -1210,7 +1606,8 @@ inline_cte_walker(Node *node, inline_cte_walker_context *context)
  */
 JoinExpr *
 convert_ANY_sublink_to_join(PlannerInfo *root, SubLink *sublink,
-							Relids available_rels)
+											bool under_not, Node **pullout,
+											Relids available_rels)
 {
 	JoinExpr   *result;
 	Query	   *parse = root->parse;
@@ -1254,6 +1651,12 @@ convert_ANY_sublink_to_join(PlannerInfo *root, SubLink *sublink,
 	if (contain_volatile_functions(sublink->testexpr))
 		return NULL;
 
+	if (under_not &&
+		! allow_NOT_IN_transformation(root, sublink))
+	{
+		return NULL;
+	}
+
 	/* Create a dummy ParseState for addRangeTableEntryForSubquery */
 	pstate = make_parsestate(NULL);
 
@@ -1293,10 +1696,28 @@ convert_ANY_sublink_to_join(PlannerInfo *root, SubLink *sublink,
 	quals = convert_testexpr(root, sublink->testexpr, subquery_vars);
 
 	/*
+	 * Try converting x NOT IN (y) to ANTI JOIN.
+	 */
+	if(under_not &&
+			!convert_NOT_IN_to_join(root, &quals,
+								sublink, subquery_vars, pullout))
+	{
+		/*
+		 * In theory, we shouldn't get here since allow_NOT_IN_transformation()
+		 * has already ruled out cases that shouldn't be transformed. In other words,
+		 * I expect convert_NOT_IN_to_join to always return true, but just in case
+		 * it fails, reset parse->rtable which has been changed a few lines above.
+		 */
+		parse->rtable = list_delete(parse->rtable, rte);
+		return NULL;
+	}
+
+	/*
 	 * And finally, build the JoinExpr node.
 	 */
 	result = makeNode(JoinExpr);
-	result->jointype = JOIN_SEMI;
+	/* NOT IN will be converted to ANTI JOIN */
+	result->jointype = under_not ? JOIN_ANTI : JOIN_SEMI;
 	result->isNatural = false;
 	result->larg = NULL;		/* caller must fill this in */
 	result->rarg = (Node *) rtr;
@@ -1885,9 +2306,7 @@ process_sublinks_mutator(Node *node, process_sublinks_context *context)
 		 * Now build the SubPlan node and make the expr to return.
 		 */
 		return make_subplan(context->root,
-							(Query *) sublink->subselect,
-							sublink->subLinkType,
-							sublink->subLinkId,
+							sublink,
 							testexpr,
 							context->isTopQual);
 	}
diff --git a/src/backend/optimizer/prep/prepjointree.c b/src/backend/optimizer/prep/prepjointree.c
index 1452172..cd1557c 100644
--- a/src/backend/optimizer/prep/prepjointree.c
+++ b/src/backend/optimizer/prep/prepjointree.c
@@ -113,7 +113,7 @@ static Query *pullup_replace_vars_subquery(Query *query,
 static reduce_outer_joins_state *reduce_outer_joins_pass1(Node *jtnode);
 static void reduce_outer_joins_pass2(Node *jtnode,
 									 reduce_outer_joins_state *state,
-									 PlannerInfo *root,
+									 Query *parse,
 									 Relids nonnullable_rels,
 									 List *nonnullable_vars,
 									 List *forced_null_vars);
@@ -267,6 +267,13 @@ pull_up_sublinks_jointree_recurse(PlannerInfo *root, Node *jtnode,
 		}
 		/* Build the replacement FromExpr; no quals yet */
 		newf = makeFromExpr(newfromlist, NULL);
+		/*
+		 * Replace parse->jointree with newf now because we might modify join types
+		 * during reduce_outer_joins() in convert_NOT_IN_to_join() which is called
+		 * in pull_up_sublinks_qual_recurse() that's coming up next.
+		 */
+		newf->quals = f->quals;
+		root->parse->jointree = newf;
 		/* Set up a link representing the rebuilt jointree */
 		jtlink = (Node *) newf;
 		/* Now process qual --- all children are available for use */
@@ -399,7 +406,7 @@ pull_up_sublinks_qual_recurse(PlannerInfo *root, Node *node,
 		/* Is it a convertible ANY or EXISTS clause? */
 		if (sublink->subLinkType == ANY_SUBLINK)
 		{
-			if ((j = convert_ANY_sublink_to_join(root, sublink,
+			if ((j = convert_ANY_sublink_to_join(root, sublink, false, NULL,
 												 available_rels1)) != NULL)
 			{
 				/* Yes; insert the new join node into the join tree */
@@ -425,7 +432,7 @@ pull_up_sublinks_qual_recurse(PlannerInfo *root, Node *node,
 				return NULL;
 			}
 			if (available_rels2 != NULL &&
-				(j = convert_ANY_sublink_to_join(root, sublink,
+				(j = convert_ANY_sublink_to_join(root, sublink, false, NULL,
 												 available_rels2)) != NULL)
 			{
 				/* Yes; insert the new join node into the join tree */
@@ -571,6 +578,68 @@ pull_up_sublinks_qual_recurse(PlannerInfo *root, Node *node,
 					return NULL;
 				}
 			}
+			else if (sublink->subLinkType == ANY_SUBLINK)
+			{
+				Node *pullout = NULL;
+
+				if ((j = convert_ANY_sublink_to_join(root, sublink, true, &pullout,
+													 available_rels1)) != NULL)
+				{
+					/* Yes; insert the new join node into the join tree */
+					j->larg = *jtlink1;
+					*jtlink1 = (Node *) j;
+					/* Recursively process pulled-up jointree nodes */
+					j->rarg = pull_up_sublinks_jointree_recurse(root,
+																j->rarg,
+																&child_rels);
+
+					/*
+					 * Now recursively process the pulled-up quals.  Any inserted
+					 * joins can get stacked onto either j->larg or j->rarg,
+					 * depending on which rels they reference.
+					 */
+					j->quals = pull_up_sublinks_qual_recurse(root,
+															 j->quals,
+															 &j->larg,
+															 available_rels1,
+															 &j->rarg,
+															 child_rels);
+					/*
+					 * Return pullout predicate (x is NOT NULL) if it's not null,
+					 * otherwise return NULL representing constant TRUE.
+					 */
+					return pullout? pullout : NULL;
+				}
+				if (available_rels2 != NULL &&
+					(j = convert_ANY_sublink_to_join(root, sublink, true, &pullout,
+													 available_rels2)) != NULL)
+				{
+					/* Yes; insert the new join node into the join tree */
+					j->larg = *jtlink2;
+					*jtlink2 = (Node *) j;
+					/* Recursively process pulled-up jointree nodes */
+					j->rarg = pull_up_sublinks_jointree_recurse(root,
+																j->rarg,
+																&child_rels);
+
+					/*
+					 * Now recursively process the pulled-up quals.  Any inserted
+					 * joins can get stacked onto either j->larg or j->rarg,
+					 * depending on which rels they reference.
+					 */
+					j->quals = pull_up_sublinks_qual_recurse(root,
+															 j->quals,
+															 &j->larg,
+															 available_rels2,
+															 &j->rarg,
+															 child_rels);
+					/*
+					 * Return pullout predicate (x is NOT NULL) if it's not null,
+					 * otherwise return NULL representing constant TRUE.
+					 */
+					return pullout? pullout : NULL;
+				}
+			}
 		}
 		/* Else return it unmodified */
 		return node;
@@ -909,7 +978,13 @@ pull_up_simple_subquery(PlannerInfo *root, Node *jtnode, RangeTblEntry *rte,
 	subroot->parse = subquery;
 	subroot->glob = root->glob;
 	subroot->query_level = root->query_level;
-	subroot->parent_root = root->parent_root;
+	/*
+	 * Keep a path to the top level root so that we can recursively access top level
+	 * CTEs in root->parse->cteList, and CTE plans in root->init_plans. This hack
+	 * won't change the original PlannerInfo tree structure because subroot is just
+	 * a auxiliary PlannerInfo to help pulling up subquery.
+	 */
+	subroot->parent_root = root;
 	subroot->plan_params = NIL;
 	subroot->outer_params = NULL;
 	subroot->planner_cxt = CurrentMemoryContext;
@@ -2558,7 +2633,7 @@ flatten_simple_union_all(PlannerInfo *root)
  * alias-var expansion).
  */
 void
-reduce_outer_joins(PlannerInfo *root)
+reduce_outer_joins(Query *parse)
 {
 	reduce_outer_joins_state *state;
 
@@ -2571,14 +2646,14 @@ reduce_outer_joins(PlannerInfo *root)
 	 * join(s) below each side of each join clause. The second pass examines
 	 * qual clauses and changes join types as it descends the tree.
 	 */
-	state = reduce_outer_joins_pass1((Node *) root->parse->jointree);
+	state = reduce_outer_joins_pass1((Node *) parse->jointree);
 
 	/* planner.c shouldn't have called me if no outer joins */
 	if (state == NULL || !state->contains_outer)
 		elog(ERROR, "so where are the outer joins?");
 
-	reduce_outer_joins_pass2((Node *) root->parse->jointree,
-							 state, root, NULL, NIL, NIL);
+	reduce_outer_joins_pass2((Node *) parse->jointree,
+							 state, parse, NULL, NIL, NIL);
 }
 
 /*
@@ -2661,7 +2736,7 @@ reduce_outer_joins_pass1(Node *jtnode)
 static void
 reduce_outer_joins_pass2(Node *jtnode,
 						 reduce_outer_joins_state *state,
-						 PlannerInfo *root,
+						 Query *parse,
 						 Relids nonnullable_rels,
 						 List *nonnullable_vars,
 						 List *forced_null_vars)
@@ -2700,7 +2775,7 @@ reduce_outer_joins_pass2(Node *jtnode,
 			reduce_outer_joins_state *sub_state = lfirst(s);
 
 			if (sub_state->contains_outer)
-				reduce_outer_joins_pass2(lfirst(l), sub_state, root,
+				reduce_outer_joins_pass2(lfirst(l), sub_state, parse,
 										 pass_nonnullable_rels,
 										 pass_nonnullable_vars,
 										 pass_forced_null_vars);
@@ -2812,7 +2887,7 @@ reduce_outer_joins_pass2(Node *jtnode,
 		/* Apply the jointype change, if any, to both jointree node and RTE */
 		if (rtindex && jointype != j->jointype)
 		{
-			RangeTblEntry *rte = rt_fetch(rtindex, root->parse->rtable);
+			RangeTblEntry *rte = rt_fetch(rtindex, parse->rtable);
 
 			Assert(rte->rtekind == RTE_JOIN);
 			Assert(rte->jointype == j->jointype);
@@ -2897,7 +2972,7 @@ reduce_outer_joins_pass2(Node *jtnode,
 					pass_nonnullable_vars = NIL;
 					pass_forced_null_vars = NIL;
 				}
-				reduce_outer_joins_pass2(j->larg, left_state, root,
+				reduce_outer_joins_pass2(j->larg, left_state, parse,
 										 pass_nonnullable_rels,
 										 pass_nonnullable_vars,
 										 pass_forced_null_vars);
@@ -2919,7 +2994,7 @@ reduce_outer_joins_pass2(Node *jtnode,
 					pass_nonnullable_vars = NIL;
 					pass_forced_null_vars = NIL;
 				}
-				reduce_outer_joins_pass2(j->rarg, right_state, root,
+				reduce_outer_joins_pass2(j->rarg, right_state, parse,
 										 pass_nonnullable_rels,
 										 pass_nonnullable_vars,
 										 pass_forced_null_vars);
diff --git a/src/backend/optimizer/util/clauses.c b/src/backend/optimizer/util/clauses.c
index 0c6fe01..46a9877 100644
--- a/src/backend/optimizer/util/clauses.c
+++ b/src/backend/optimizer/util/clauses.c
@@ -38,10 +38,12 @@
 #include "optimizer/optimizer.h"
 #include "optimizer/plancat.h"
 #include "optimizer/planmain.h"
+#include "optimizer/pathnode.h"
 #include "parser/analyze.h"
 #include "parser/parse_agg.h"
 #include "parser/parse_coerce.h"
 #include "parser/parse_func.h"
+#include "parser/parsetree.h"
 #include "rewrite/rewriteManip.h"
 #include "tcop/tcopprot.h"
 #include "utils/acl.h"
@@ -2013,6 +2015,271 @@ find_forced_null_var(Node *node)
 }
 
 /*
+ * find_innerjoined_rels
+ *		Traverse jointree to locate non-outerjoined-rels and quals above them
+ *
+ * We fill innerjoined_rels with the relids of all rels that are not below
+ * the nullable side of any outer join (which would cause their Vars to be
+ * possibly NULL regardless of what's in the catalogs).  In the same scan,
+ * we locate all WHERE and JOIN/ON quals that constrain these rels add them to
+ * the usable_quals list (forming a list with implicit-AND semantics).
+ *
+ * Top-level caller must initialize innerjoined_rels/usable_quals to NULL/NIL.
+ */
+static void
+find_innerjoined_rels(Node *jtnode,
+					  Relids *innerjoined_rels, List **usable_quals)
+{
+	if (jtnode == NULL)
+		return;
+	if (IsA(jtnode, RangeTblRef))
+	{
+		int			varno = ((RangeTblRef *) jtnode)->rtindex;
+
+		*innerjoined_rels = bms_add_member(*innerjoined_rels, varno);
+	}
+	else if (IsA(jtnode, FromExpr))
+	{
+		FromExpr   *f = (FromExpr *) jtnode;
+		ListCell   *lc;
+
+		/* All elements of the FROM list are allowable */
+		foreach(lc, f->fromlist)
+			find_innerjoined_rels((Node *) lfirst(lc),
+								  innerjoined_rels, usable_quals);
+		/* ... and its WHERE quals are too */
+		if (f->quals)
+			*usable_quals = lappend(*usable_quals, f->quals);
+	}
+	else if (IsA(jtnode, JoinExpr))
+	{
+		JoinExpr   *j = (JoinExpr *) jtnode;
+
+		switch (j->jointype)
+		{
+			case JOIN_INNER:
+				/* visit both children */
+				find_innerjoined_rels(j->larg,
+									  innerjoined_rels, usable_quals);
+				find_innerjoined_rels(j->rarg,
+									  innerjoined_rels, usable_quals);
+				/* and grab the ON quals too */
+				if (j->quals)
+					*usable_quals = lappend(*usable_quals, j->quals);
+				break;
+
+			case JOIN_LEFT:
+			case JOIN_SEMI:
+			case JOIN_ANTI:
+
+				/*
+				 * Only the left input is possibly non-nullable; furthermore,
+				 * the quals of this join don't constrain the left input.
+				 * Note: we probably can't see SEMI or ANTI joins at this
+				 * point, but if we do, we can treat them like LEFT joins.
+				 */
+				find_innerjoined_rels(j->larg,
+									  innerjoined_rels, usable_quals);
+				break;
+
+			case JOIN_RIGHT:
+				/* Reverse of the above case */
+				find_innerjoined_rels(j->rarg,
+									  innerjoined_rels, usable_quals);
+				break;
+
+			case JOIN_FULL:
+				/* Neither side is non-nullable, so stop descending */
+				break;
+
+			case JOIN_UNIQUE_OUTER:
+			case JOIN_UNIQUE_INNER:
+				/* Don't think we will see JOIN_UNIQUE_OUTER or
+				 * JOIN_UNIQUE_INNER since they are only used internally in
+				 * the planner in a much later phase (in standard_join_search).
+				*/
+				break;
+
+			default:
+				elog(ERROR, "unrecognized join type: %d",
+					 (int) j->jointype);
+		}
+	}
+	else
+		elog(ERROR, "unrecognized node type: %d",
+			 (int) nodeTag(jtnode));
+}
+
+/*
+ * Returns true if the Node passed in is nonnullable. Currently handles Var,
+ * TargetEntry, CoaleseExpr and Const.
+ * TODO: Add more supporting cases.
+ * A Var is nonnullable if:
+ *	It does not appear on the null-padded side of an outer join and it has NOT NULL constraint,
+ * 	Or if it's forced non-null by inner join or other strict predicates.
+ * A CoalesceExpr is nonnullable if it has a non-null argument.
+ */
+bool
+is_node_nonnullable(Node * node, Query *parse)
+{
+	AttrNumber	 attno = InvalidAttrNumber;
+	Oid				 reloid;
+	Var 				*var = NULL;
+
+	/*
+	 * If the query contains set operations, punt.  The set ops themselves
+	 * couldn't introduce nulls that weren't in their inputs, but the tlist
+	 * present in the top-level query is just dummy and won't give us useful
+	 * info.  We could get an answer by recursing to examine each leaf query,
+	 * but for the moment it doesn't seem worth the extra complication.
+	 *
+	 * Note that we needn't consider other top-level operators such as
+	 * DISTINCT, GROUP BY, etc, as those will not introduce nulls either.
+	 */
+	if (parse->setOperations)
+		return false;
+
+	switch (nodeTag(node))
+	{
+		case T_Var:
+		{
+			RangeTblEntry	*rte;
+
+			var = (Var *) node;
+			attno = var->varattno;
+			rte = rt_fetch(var->varno, parse->rtable);
+			reloid = rte->relid;
+			break;
+		}
+		case T_TargetEntry:
+		{
+			TargetEntry	*te = (TargetEntry *)node;
+			switch(nodeTag(te->expr))
+			{
+				case T_Var:
+				{
+					var = (Var *) te->expr;
+					attno = te->resorigcol;
+					reloid = te->resorigtbl;
+					break;
+				}
+				/* recurse into is_node_nonnullable for other types of Node */
+				default:
+					return is_node_nonnullable((Node *)te->expr, parse);
+			}
+			break;
+		}
+		case T_CoalesceExpr:
+		{
+			ListCell   			*arg;
+			CoalesceExpr	*cexpr = (CoalesceExpr *)node;
+
+			/* handle COALESCE Function by looking for non-null argument */
+			foreach(arg, cexpr->args)
+			{
+				Node	*e = lfirst(arg);
+
+				/* recurse into is_node_nonnullable */
+				if (is_node_nonnullable(e, parse))
+				{
+					return true;
+				}
+			}
+			break;
+		}
+		case T_Const:
+		{
+			if(!((Const *) node)->constisnull)
+			{
+				return true;
+			}
+			break;
+		}
+		/*
+		 * TODO: handle more cases to make the nullability test more accurate
+		 * Assume unhandled cases are nullable.
+		 */
+		default:
+			return false;
+	}
+
+	/*
+	 * If we have found a Var, it is non-null if:
+	 * not on NULL-padded side of an outer join and has NOT NULL constraint
+	 * or forced NOT NULL by inner join conditions or by other strict predicates.
+	 */
+	if(var &&
+			reloid != InvalidOid)
+	{
+		Relids		 innerjoined_rels = NULL;
+		List	   		*innerjoined_useful_quals = NIL;
+		List			*nonnullable_innerjoined_vars = NIL;
+
+		find_innerjoined_rels((Node *) parse->jointree,
+							  &innerjoined_rels,
+							  &innerjoined_useful_quals);
+
+		/*
+		 * Check if the Var is from an INNER JOINed rel, it's also guaranteed
+		 * to not be on the null-padded side of an outer join.
+		 */
+		if (bms_is_member(var->varno, innerjoined_rels))
+		{
+			/*
+			 * If Var is from a plain relation and its column is marked
+			 * NOT NULL according to the catalogs, it can't produce NULL.
+			 */
+			if (get_attnotnull(reloid, attno))
+			{
+				return true;
+			}
+
+			/*
+			 * Otherwise check for the existance of strict predicates which filter
+			 * out NULL values for this Var.
+			 */
+			nonnullable_innerjoined_vars =
+				find_nonnullable_vars((Node *) innerjoined_useful_quals);
+
+			if (list_member(nonnullable_innerjoined_vars, var))
+			{
+				return true;
+			}
+		}
+
+		/*
+		 * If we get here it means -
+		 * The var is on null-padded side of an outer-join, since we've already
+		 * tried reduce_outer_joins(), there isn't any strict predicates to turn this
+		 * outer join to inner join, then there will be no strict predicates to force
+		 * this var non-null as well.
+		 */
+	}
+
+	return false;
+}
+
+/*
+ * Returns true if at least one Node in the input list is non-nullable
+ */
+bool
+list_hasnonnullable(List * list, Query *parse)
+{
+	ListCell	*lc;
+	Node	*node;
+
+	foreach(lc, list)
+	{
+		node = lfirst(lc);
+		if(is_node_nonnullable(node, parse))
+		{
+			return true;
+		}
+	}
+	return false;
+}
+
+/*
  * Can we treat a ScalarArrayOpExpr as strict?
  *
  * If "falseOK" is true, then a "false" result can be considered strict,
diff --git a/src/backend/utils/cache/lsyscache.c b/src/backend/utils/cache/lsyscache.c
index 27bbb58..61ecac7 100644
--- a/src/backend/utils/cache/lsyscache.c
+++ b/src/backend/utils/cache/lsyscache.c
@@ -934,6 +934,35 @@ get_cast_oid(Oid sourcetypeid, Oid targettypeid, bool missing_ok)
 	return oid;
 }
 
+/*
+ * get_attnotnull
+ *		Given the relation id and the attribute number,
+ *		return the "attnotnull" field from the attribute relation.
+ */
+bool
+get_attnotnull(Oid relid, AttrNumber attnum)
+{
+	HeapTuple	  tp;
+	Form_pg_attribute att_tup;
+
+	tp = SearchSysCache2(ATTNUM,
+			ObjectIdGetDatum(relid),
+			Int16GetDatum(attnum));
+
+	if (HeapTupleIsValid(tp))
+	{
+		bool result;
+
+		att_tup = (Form_pg_attribute) GETSTRUCT(tp);
+		result = att_tup->attnotnull;
+		ReleaseSysCache(tp);
+
+		return result;
+	}
+	/* Assume att is nullable if no valid heap tuple is found */
+	return false;
+}
+
 /*				---------- COLLATION CACHE ----------					 */
 
 /*
diff --git a/src/backend/utils/misc/guc.c b/src/backend/utils/misc/guc.c
index af876d1..29e3fac 100644
--- a/src/backend/utils/misc/guc.c
+++ b/src/backend/utils/misc/guc.c
@@ -1122,6 +1122,15 @@ static struct config_bool ConfigureNamesBool[] =
 		NULL, NULL, NULL
 	},
 	{
+		{"enable_not_in_transform", PGC_USERSET, QUERY_TUNING_METHOD,
+			gettext_noop("Enables the planner to transform NOT IN subquery to ANTI JOIN when possible."),
+			NULL
+		},
+		&enable_not_in_transform,
+		true,
+		NULL, NULL, NULL
+	},
+	{
 		{"geqo", PGC_USERSET, QUERY_TUNING_GEQO,
 			gettext_noop("Enables genetic query optimization."),
 			gettext_noop("This algorithm attempts to do planning without "
diff --git a/src/include/nodes/primnodes.h b/src/include/nodes/primnodes.h
index d73be2a..12ee3a7 100644
--- a/src/include/nodes/primnodes.h
+++ b/src/include/nodes/primnodes.h
@@ -657,6 +657,8 @@ typedef struct SubLink
 	List	   *operName;		/* originally specified operator name */
 	Node	   *subselect;		/* subselect as Query* or raw parsetree */
 	int			location;		/* token location, or -1 if unknown */
+	Node	   *subroot;		/* PlannerInfo for the subquery */
+	Node	   *subplan;		/* best Plan for the subquery */
 } SubLink;
 
 /*
diff --git a/src/include/optimizer/clauses.h b/src/include/optimizer/clauses.h
index b7456e3..5078cdd 100644
--- a/src/include/optimizer/clauses.h
+++ b/src/include/optimizer/clauses.h
@@ -45,6 +45,9 @@ extern List *find_nonnullable_vars(Node *clause);
 extern List *find_forced_null_vars(Node *clause);
 extern Var *find_forced_null_var(Node *clause);
 
+extern bool is_node_nonnullable(Node * node, Query *parse);
+extern bool list_hasnonnullable(List * list, Query *parse);
+
 extern bool is_pseudo_constant_clause(Node *clause);
 extern bool is_pseudo_constant_clause_relids(Node *clause, Relids relids);
 
diff --git a/src/include/optimizer/cost.h b/src/include/optimizer/cost.h
index 735ba09..d103415 100644
--- a/src/include/optimizer/cost.h
+++ b/src/include/optimizer/cost.h
@@ -60,6 +60,7 @@ extern PGDLLIMPORT bool enable_nestloop;
 extern PGDLLIMPORT bool enable_material;
 extern PGDLLIMPORT bool enable_mergejoin;
 extern PGDLLIMPORT bool enable_hashjoin;
+extern PGDLLIMPORT bool enable_not_in_transform;
 extern PGDLLIMPORT bool enable_gathermerge;
 extern PGDLLIMPORT bool enable_partitionwise_join;
 extern PGDLLIMPORT bool enable_partitionwise_aggregate;
diff --git a/src/include/optimizer/prep.h b/src/include/optimizer/prep.h
index 19c9230..7f807c4 100644
--- a/src/include/optimizer/prep.h
+++ b/src/include/optimizer/prep.h
@@ -26,7 +26,7 @@ extern void pull_up_sublinks(PlannerInfo *root);
 extern void preprocess_function_rtes(PlannerInfo *root);
 extern void pull_up_subqueries(PlannerInfo *root);
 extern void flatten_simple_union_all(PlannerInfo *root);
-extern void reduce_outer_joins(PlannerInfo *root);
+extern void reduce_outer_joins(Query *parse);
 extern void remove_useless_result_rtes(PlannerInfo *root);
 extern Relids get_relids_in_jointree(Node *jtnode, bool include_joins);
 extern Relids get_relids_for_join(Query *query, int joinrelid);
diff --git a/src/include/optimizer/subselect.h b/src/include/optimizer/subselect.h
index d6a872b..a4d309e 100644
--- a/src/include/optimizer/subselect.h
+++ b/src/include/optimizer/subselect.h
@@ -19,6 +19,8 @@
 extern void SS_process_ctes(PlannerInfo *root);
 extern JoinExpr *convert_ANY_sublink_to_join(PlannerInfo *root,
 											 SubLink *sublink,
+											 bool under_not,
+											 Node **pullout,
 											 Relids available_rels);
 extern JoinExpr *convert_EXISTS_sublink_to_join(PlannerInfo *root,
 												SubLink *sublink,
diff --git a/src/include/utils/lsyscache.h b/src/include/utils/lsyscache.h
index 4e646c5..1e585aa 100644
--- a/src/include/utils/lsyscache.h
+++ b/src/include/utils/lsyscache.h
@@ -90,6 +90,7 @@ extern char get_attgenerated(Oid relid, AttrNumber attnum);
 extern Oid	get_atttype(Oid relid, AttrNumber attnum);
 extern void get_atttypetypmodcoll(Oid relid, AttrNumber attnum,
 								  Oid *typid, int32 *typmod, Oid *collid);
+extern bool get_attnotnull(Oid relid, AttrNumber attnum);
 extern Oid	get_cast_oid(Oid sourcetypeid, Oid targettypeid, bool missing_ok);
 extern char *get_collation_name(Oid colloid);
 extern bool get_collation_isdeterministic(Oid colloid);
diff --git a/src/test/regress/expected/subselect.out b/src/test/regress/expected/subselect.out
index 4c6cd5f..9a54f35 100644
--- a/src/test/regress/expected/subselect.out
+++ b/src/test/regress/expected/subselect.out
@@ -1614,3 +1614,2789 @@ select * from x for update;
    Output: subselect_tbl.f1, subselect_tbl.f2, subselect_tbl.f3
 (2 rows)
 
+-- test NON IN to ANTI JOIN conversion
+CREATE TABLE s (u INTEGER NOT NULL, n INTEGER NULL, nn INTEGER NOT NULL, p VARCHAR(128) NULL);
+insert into s (u, n, nn, p)
+    select
+    generate_series(1,3) as u,
+	generate_series(1,3) as n,
+	generate_series(1,3) as nn,
+	'foo' as p;
+insert into s values(1000002, 1000002, 1000002, 'foofoo');
+UPDATE s set n = NULL WHERE n = 3;
+analyze s;
+CREATE TABLE l (u INTEGER NOT NULL, n INTEGER NULL, nn INTEGER NOT NULL, p VARCHAR(128) NULL);
+insert into l (u, n, nn, p)
+	select
+    generate_series(1,10000 ) as u,
+	generate_series(1,10000 ) as n,
+	generate_series(1,10000 ) as nn,
+	'bar' as p;
+UPDATE l set n = NULL WHERE n = 7;
+CREATE UNIQUE INDEX l_u ON l (u);
+CREATE INDEX l_n ON l (n);
+CREATE INDEX l_nn ON l (nn);
+analyze l;
+CREATE TABLE s1 (u INTEGER NOT NULL, n INTEGER NULL, n1 INTEGER NULL, nn INTEGER NOT NULL, p VARCHAR(128) NULL);
+insert into s1 (u, n, n1, nn, p)
+    select
+    generate_series(1,3) as u,
+	generate_series(1,3) as n,
+	generate_series(1,3) as n1,
+	generate_series(1,3) as nn,
+	'foo' as p;
+insert into s1 values(1000003, 1000003, 1000003, 1000003, 'foofoo');
+insert into s1 values(1003, 1003, 1003, 1003, 'foofoo');
+UPDATE s1 set n = NULL WHERE n = 3;
+UPDATE s1 set n1 = NULL WHERE n = 2;
+UPDATE s1 set n1 = NULL WHERE n1 = 3;
+analyze s1;
+CREATE TABLE empty (u INTEGER NOT NULL, n INTEGER NULL, nn INTEGER NOT NULL, p VARCHAR(128) NULL);
+analyze empty;
+-- set work_mem to 64KB so that NOT IN to ANTI JOIN optimization will kick in
+set work_mem = 64;
+-- correctness test 1: inner empty, return every thing from outer including NULL
+explain (costs false) select * from s where n not in (select n from empty);
+             QUERY PLAN             
+------------------------------------
+ Seq Scan on s
+   Filter: (NOT (hashed SubPlan 1))
+   SubPlan 1
+     ->  Seq Scan on empty
+(4 rows)
+
+select * from s where n not in (select n from empty);
+    u    |    n    |   nn    |   p    
+---------+---------+---------+--------
+       1 |       1 |       1 | foo
+       2 |       2 |       2 | foo
+ 1000002 | 1000002 | 1000002 | foofoo
+       3 |         |       3 | foo
+(4 rows)
+
+-- correctness test 2: inner has NULL, return empty result
+explain (costs false) select * from s where n not in (select n from l);
+                    QUERY PLAN                    
+--------------------------------------------------
+ Nested Loop Anti Join
+   InitPlan 1 (returns $0)
+     ->  Seq Scan on l l_1
+   ->  Seq Scan on s
+         Filter: ((n IS NOT NULL) OR (NOT $0))
+   ->  Bitmap Heap Scan on l
+         Recheck Cond: ((s.n = n) OR (n IS NULL))
+         ->  BitmapOr
+               ->  Bitmap Index Scan on l_n
+                     Index Cond: (n = s.n)
+               ->  Bitmap Index Scan on l_n
+                     Index Cond: (n IS NULL)
+(12 rows)
+
+select * from s where n not in (select n from l);
+ u | n | nn | p 
+---+---+----+---
+(0 rows)
+
+-- correctness test 3: inner non-null, result has no NULL
+explain (costs false) select * from s where n not in (select u from l);
+                  QUERY PLAN                   
+-----------------------------------------------
+ Nested Loop Anti Join
+   InitPlan 1 (returns $0)
+     ->  Seq Scan on l l_1
+   ->  Seq Scan on s
+         Filter: ((n IS NOT NULL) OR (NOT $0))
+   ->  Index Only Scan using l_u on l
+         Index Cond: (u = s.n)
+(7 rows)
+
+select * from s where n not in (select u from l);
+    u    |    n    |   nn    |   p    
+---------+---------+---------+--------
+ 1000002 | 1000002 | 1000002 | foofoo
+(1 row)
+
+-- correctness test 4: inner has predicate
+explain (costs false) select * from s where n not in (select n from l where u > 7);
+                    QUERY PLAN                    
+--------------------------------------------------
+ Nested Loop Anti Join
+   InitPlan 1 (returns $0)
+     ->  Seq Scan on l l_1
+           Filter: (u > 7)
+   ->  Seq Scan on s
+         Filter: ((n IS NOT NULL) OR (NOT $0))
+   ->  Bitmap Heap Scan on l
+         Recheck Cond: ((s.n = n) OR (n IS NULL))
+         Filter: (u > 7)
+         ->  BitmapOr
+               ->  Bitmap Index Scan on l_n
+                     Index Cond: (n = s.n)
+               ->  Bitmap Index Scan on l_n
+                     Index Cond: (n IS NULL)
+(14 rows)
+
+select * from s where n not in (select n from l where u > 7);
+    u    |    n    |   nn    |   p    
+---------+---------+---------+--------
+       1 |       1 |       1 | foo
+       2 |       2 |       2 | foo
+ 1000002 | 1000002 | 1000002 | foofoo
+(3 rows)
+
+-- correctness test 5: multi-expression, (2, 2, null, 2, foo) should be in the result
+explain (costs false) select * from s1 where (n,n1) not in (select u,nn from l where u >= 3);
+                           QUERY PLAN                            
+-----------------------------------------------------------------
+ Nested Loop Anti Join
+   Join Filter: (((s1.n = l.u) AND (s1.n1 = l.nn)) IS NOT FALSE)
+   ->  Seq Scan on s1
+   ->  Materialize
+         ->  Seq Scan on l
+               Filter: (u >= 3)
+(6 rows)
+
+select * from s1 where (n,n1) not in (select u,nn from l where u >= 3);
+    u    |    n    |   n1    |   nn    |   p    
+---------+---------+---------+---------+--------
+       1 |       1 |       1 |       1 | foo
+ 1000003 | 1000003 | 1000003 | 1000003 | foofoo
+       2 |       2 |         |       2 | foo
+(3 rows)
+
+-- correctness test 6: multi-expression, (3, null, null, 3, foo) should not be in the result
+explain (costs false) select * from s1 where (n,n1) not in (select u,nn from l where u > 0);
+                           QUERY PLAN                            
+-----------------------------------------------------------------
+ Nested Loop Anti Join
+   Join Filter: (((s1.n = l.u) AND (s1.n1 = l.nn)) IS NOT FALSE)
+   ->  Seq Scan on s1
+   ->  Materialize
+         ->  Seq Scan on l
+               Filter: (u > 0)
+(6 rows)
+
+select * from s1 where (n,n1) not in (select u,nn from l where u > 0);
+    u    |    n    |   n1    |   nn    |   p    
+---------+---------+---------+---------+--------
+ 1000003 | 1000003 | 1000003 | 1000003 | foofoo
+(1 row)
+
+-- correctness test 6: multi-expression, (3, null, null, 3, foo) should be in the result
+explain (costs false) select * from s1 where (n,n1) not in (select u,nn from l where u < 0);
+             QUERY PLAN             
+------------------------------------
+ Seq Scan on s1
+   Filter: (NOT (hashed SubPlan 1))
+   SubPlan 1
+     ->  Index Scan using l_u on l
+           Index Cond: (u < 0)
+(5 rows)
+
+select * from s1 where (n,n1) not in (select u,nn from l where u < 0);
+    u    |    n    |   n1    |   nn    |   p    
+---------+---------+---------+---------+--------
+       1 |       1 |       1 |       1 | foo
+ 1000003 | 1000003 | 1000003 | 1000003 | foofoo
+    1003 |    1003 |    1003 |    1003 | foofoo
+       2 |       2 |         |       2 | foo
+       3 |         |         |       3 | foo
+(5 rows)
+
+-- test using hashed subplan when inner fits in work_mem
+explain (costs false) select * from l where n not in (select n from s);
+             QUERY PLAN             
+------------------------------------
+ Seq Scan on l
+   Filter: (NOT (hashed SubPlan 1))
+   SubPlan 1
+     ->  Seq Scan on s
+(4 rows)
+
+select * from l where n not in (select n from s);
+ u | n | nn | p 
+---+---+----+---
+(0 rows)
+
+-- test single expression
+explain (costs false) select * from s where n not in (select n from l);
+                    QUERY PLAN                    
+--------------------------------------------------
+ Nested Loop Anti Join
+   InitPlan 1 (returns $0)
+     ->  Seq Scan on l l_1
+   ->  Seq Scan on s
+         Filter: ((n IS NOT NULL) OR (NOT $0))
+   ->  Bitmap Heap Scan on l
+         Recheck Cond: ((s.n = n) OR (n IS NULL))
+         ->  BitmapOr
+               ->  Bitmap Index Scan on l_n
+                     Index Cond: (n = s.n)
+               ->  Bitmap Index Scan on l_n
+                     Index Cond: (n IS NULL)
+(12 rows)
+
+select * from s where n not in (select n from l);
+ u | n | nn | p 
+---+---+----+---
+(0 rows)
+
+explain (costs false) select * from s where u not in (select u from l);
+              QUERY PLAN              
+--------------------------------------
+ Nested Loop Anti Join
+   ->  Seq Scan on s
+   ->  Index Only Scan using l_u on l
+         Index Cond: (u = s.u)
+(4 rows)
+
+select * from s where u not in (select u from l);
+    u    |    n    |   nn    |   p    
+---------+---------+---------+--------
+ 1000002 | 1000002 | 1000002 | foofoo
+(1 row)
+
+explain (costs false) select * from s where 3*n not in (select n from l);
+                       QUERY PLAN                       
+--------------------------------------------------------
+ Nested Loop Anti Join
+   InitPlan 1 (returns $0)
+     ->  Seq Scan on l l_1
+   ->  Seq Scan on s
+         Filter: (((3 * n) IS NOT NULL) OR (NOT $0))
+   ->  Bitmap Heap Scan on l
+         Recheck Cond: (((3 * s.n) = n) OR (n IS NULL))
+         ->  BitmapOr
+               ->  Bitmap Index Scan on l_n
+                     Index Cond: (n = (3 * s.n))
+               ->  Bitmap Index Scan on l_n
+                     Index Cond: (n IS NULL)
+(12 rows)
+
+select * from s where 3*n not in (select n from l);
+ u | n | nn | p 
+---+---+----+---
+(0 rows)
+
+explain (costs false) select * from s where n not in (select 3*n from l);
+                        QUERY PLAN                         
+-----------------------------------------------------------
+ Nested Loop Anti Join
+   Join Filter: ((s.n = (3 * l.n)) OR ((3 * l.n) IS NULL))
+   InitPlan 1 (returns $0)
+     ->  Seq Scan on l l_1
+   ->  Seq Scan on s
+         Filter: ((n IS NOT NULL) OR (NOT $0))
+   ->  Materialize
+         ->  Seq Scan on l
+(8 rows)
+
+select * from s where n not in (select 3*n from l);
+ u | n | nn | p 
+---+---+----+---
+(0 rows)
+
+-- test single expression with predicates
+explain (costs false) select * from s where n not in (select n from l where u > 0);
+                    QUERY PLAN                    
+--------------------------------------------------
+ Nested Loop Anti Join
+   InitPlan 1 (returns $0)
+     ->  Seq Scan on l l_1
+           Filter: (u > 0)
+   ->  Seq Scan on s
+         Filter: ((n IS NOT NULL) OR (NOT $0))
+   ->  Bitmap Heap Scan on l
+         Recheck Cond: ((s.n = n) OR (n IS NULL))
+         Filter: (u > 0)
+         ->  BitmapOr
+               ->  Bitmap Index Scan on l_n
+                     Index Cond: (n = s.n)
+               ->  Bitmap Index Scan on l_n
+                     Index Cond: (n IS NULL)
+(14 rows)
+
+select * from s where n not in (select n from l where u > 0);
+ u | n | nn | p 
+---+---+----+---
+(0 rows)
+
+explain (costs false) select * from s where n not in (select n from l where u > 100);
+                    QUERY PLAN                    
+--------------------------------------------------
+ Nested Loop Anti Join
+   InitPlan 1 (returns $0)
+     ->  Seq Scan on l l_1
+           Filter: (u > 100)
+   ->  Seq Scan on s
+         Filter: ((n IS NOT NULL) OR (NOT $0))
+   ->  Bitmap Heap Scan on l
+         Recheck Cond: ((s.n = n) OR (n IS NULL))
+         Filter: (u > 100)
+         ->  BitmapOr
+               ->  Bitmap Index Scan on l_n
+                     Index Cond: (n = s.n)
+               ->  Bitmap Index Scan on l_n
+                     Index Cond: (n IS NULL)
+(14 rows)
+
+select * from s where n not in (select n from l where u > 100);
+    u    |    n    |   nn    |   p    
+---------+---------+---------+--------
+       1 |       1 |       1 | foo
+       2 |       2 |       2 | foo
+ 1000002 | 1000002 | 1000002 | foofoo
+(3 rows)
+
+-- test multi expression
+explain (costs false) select * from s where (n,u) not in (select n,u from l);
+                         QUERY PLAN                          
+-------------------------------------------------------------
+ Nested Loop Anti Join
+   ->  Seq Scan on s
+   ->  Index Scan using l_u on l
+         Index Cond: (u = s.u)
+         Filter: ((s.n = n) OR (n IS NULL) OR (s.n IS NULL))
+(5 rows)
+
+select * from s where (n,u) not in (select n,u from l);
+    u    |    n    |   nn    |   p    
+---------+---------+---------+--------
+ 1000002 | 1000002 | 1000002 | foofoo
+(1 row)
+
+explain (costs false) select * from s where (u, nn) not in (select u, nn from l);
+            QUERY PLAN            
+----------------------------------
+ Nested Loop Anti Join
+   ->  Seq Scan on s
+   ->  Index Scan using l_nn on l
+         Index Cond: (nn = s.nn)
+         Filter: (s.u = u)
+(5 rows)
+
+select * from s where (u, nn) not in (select u, nn from l);
+    u    |    n    |   nn    |   p    
+---------+---------+---------+--------
+ 1000002 | 1000002 | 1000002 | foofoo
+(1 row)
+
+explain (costs false) select * from s where (n,u) not in (select u,n from l);
+                    QUERY PLAN                    
+--------------------------------------------------
+ Nested Loop Anti Join
+   ->  Seq Scan on s
+   ->  Bitmap Heap Scan on l
+         Recheck Cond: ((s.u = n) OR (n IS NULL))
+         Filter: ((s.n = u) OR (s.n IS NULL))
+         ->  BitmapOr
+               ->  Bitmap Index Scan on l_n
+                     Index Cond: (n = s.u)
+               ->  Bitmap Index Scan on l_n
+                     Index Cond: (n IS NULL)
+(10 rows)
+
+select * from s where (n,u) not in (select u,n from l);
+    u    |    n    |   nn    |   p    
+---------+---------+---------+--------
+ 1000002 | 1000002 | 1000002 | foofoo
+(1 row)
+
+explain (costs false) select * from s where (n,u,nn) not in (select u,n,nn from l);
+                                  QUERY PLAN                                   
+-------------------------------------------------------------------------------
+ Nested Loop Anti Join
+   ->  Seq Scan on s
+   ->  Index Scan using l_nn on l
+         Index Cond: (nn = s.nn)
+         Filter: (((s.n = u) OR (s.n IS NULL)) AND ((s.u = n) OR (n IS NULL)))
+(5 rows)
+
+select * from s where (n,u,nn) not in (select u,n,nn from l);
+    u    |    n    |   nn    |   p    
+---------+---------+---------+--------
+ 1000002 | 1000002 | 1000002 | foofoo
+(1 row)
+
+explain (costs false) select * from s where (n,u,nn) not in (select u,n,nn from l where u > 1000);
+                                          QUERY PLAN                                          
+----------------------------------------------------------------------------------------------
+ Nested Loop Anti Join
+   ->  Seq Scan on s
+   ->  Index Scan using l_nn on l
+         Index Cond: (nn = s.nn)
+         Filter: ((u > 1000) AND ((s.n = u) OR (s.n IS NULL)) AND ((s.u = n) OR (n IS NULL)))
+(5 rows)
+
+select * from s where (n,u,nn) not in (select u,n,nn from l where u > 1000);
+    u    |    n    |   nn    |   p    
+---------+---------+---------+--------
+       1 |       1 |       1 | foo
+       2 |       2 |       2 | foo
+ 1000002 | 1000002 | 1000002 | foofoo
+       3 |         |       3 | foo
+(4 rows)
+
+explain (costs false) select * from s where (n,u,nn) not in (select u,n,nn from l where u > 0);
+                                        QUERY PLAN                                         
+-------------------------------------------------------------------------------------------
+ Nested Loop Anti Join
+   ->  Seq Scan on s
+   ->  Index Scan using l_nn on l
+         Index Cond: (nn = s.nn)
+         Filter: ((u > 0) AND ((s.n = u) OR (s.n IS NULL)) AND ((s.u = n) OR (n IS NULL)))
+(5 rows)
+
+select * from s where (n,u,nn) not in (select u,n,nn from l where u > 0);
+    u    |    n    |   nn    |   p    
+---------+---------+---------+--------
+ 1000002 | 1000002 | 1000002 | foofoo
+(1 row)
+
+explain (costs false) select * from s where (n,u,nn) not in (select u,n,nn from l where u > 1);
+                                        QUERY PLAN                                         
+-------------------------------------------------------------------------------------------
+ Nested Loop Anti Join
+   ->  Seq Scan on s
+   ->  Index Scan using l_nn on l
+         Index Cond: (nn = s.nn)
+         Filter: ((u > 1) AND ((s.n = u) OR (s.n IS NULL)) AND ((s.u = n) OR (n IS NULL)))
+(5 rows)
+
+select * from s where (n,u,nn) not in (select u,n,nn from l where u > 1);
+    u    |    n    |   nn    |   p    
+---------+---------+---------+--------
+       1 |       1 |       1 | foo
+ 1000002 | 1000002 | 1000002 | foofoo
+(2 rows)
+
+-- test multi-table
+explain (costs false) select count(*) from s, l where s.n not in (select n from l);
+                          QUERY PLAN                          
+--------------------------------------------------------------
+ Aggregate
+   InitPlan 1 (returns $0)
+     ->  Seq Scan on l l_2
+   ->  Nested Loop
+         ->  Nested Loop Anti Join
+               ->  Seq Scan on s
+                     Filter: ((n IS NOT NULL) OR (NOT $0))
+               ->  Bitmap Heap Scan on l l_1
+                     Recheck Cond: ((s.n = n) OR (n IS NULL))
+                     ->  BitmapOr
+                           ->  Bitmap Index Scan on l_n
+                                 Index Cond: (n = s.n)
+                           ->  Bitmap Index Scan on l_n
+                                 Index Cond: (n IS NULL)
+         ->  Seq Scan on l
+(15 rows)
+
+select count(*) from s, l where s.n not in (select n from l);
+ count 
+-------
+     0
+(1 row)
+
+explain (costs false) select count(*) from s, l where s.nn not in (select nn from l);
+                      QUERY PLAN                       
+-------------------------------------------------------
+ Aggregate
+   ->  Nested Loop
+         ->  Nested Loop Anti Join
+               ->  Seq Scan on s
+               ->  Index Only Scan using l_nn on l l_1
+                     Index Cond: (nn = s.nn)
+         ->  Seq Scan on l
+(7 rows)
+
+select count(*) from s, l where s.nn not in (select nn from l);
+ count 
+-------
+ 10000
+(1 row)
+
+-- test null padded results from outer join
+explain (costs false) select * from s where n not in (select s.nn from l left join s on l.nn = s.nn);
+                     QUERY PLAN                      
+-----------------------------------------------------
+ Nested Loop Anti Join
+   Join Filter: ((s.n = s_1.nn) OR (s_1.nn IS NULL))
+   InitPlan 1 (returns $0)
+     ->  Nested Loop Left Join
+           Join Filter: (l_1.nn = s_2.nn)
+           ->  Seq Scan on l l_1
+           ->  Materialize
+                 ->  Seq Scan on s s_2
+   ->  Seq Scan on s
+         Filter: ((n IS NOT NULL) OR (NOT $0))
+   ->  Hash Left Join
+         Hash Cond: (l.nn = s_1.nn)
+         ->  Seq Scan on l
+         ->  Hash
+               ->  Seq Scan on s s_1
+(15 rows)
+
+select * from s where n not in (select s.nn from l left join s on l.nn = s.nn);
+ u | n | nn | p 
+---+---+----+---
+(0 rows)
+
+explain (costs false) select * from s where n not in (select s.nn from s right join l on s.nn = l.nn);
+                     QUERY PLAN                      
+-----------------------------------------------------
+ Nested Loop Anti Join
+   Join Filter: ((s.n = s_1.nn) OR (s_1.nn IS NULL))
+   InitPlan 1 (returns $0)
+     ->  Nested Loop Left Join
+           Join Filter: (s_2.nn = l_1.nn)
+           ->  Seq Scan on l l_1
+           ->  Materialize
+                 ->  Seq Scan on s s_2
+   ->  Seq Scan on s
+         Filter: ((n IS NOT NULL) OR (NOT $0))
+   ->  Hash Left Join
+         Hash Cond: (l.nn = s_1.nn)
+         ->  Seq Scan on l
+         ->  Hash
+               ->  Seq Scan on s s_1
+(15 rows)
+
+select * from s where n not in (select s.nn from s right join l on s.nn = l.nn);
+ u | n | nn | p 
+---+---+----+---
+(0 rows)
+
+explain (costs false) select count(*) from s right join l on s.nn = l.nn where l.nn not in (select nn from s);
+                   QUERY PLAN                   
+------------------------------------------------
+ Aggregate
+   ->  Hash Left Join
+         Hash Cond: (l.nn = s.nn)
+         ->  Seq Scan on l
+               Filter: (NOT (hashed SubPlan 1))
+               SubPlan 1
+                 ->  Seq Scan on s s_1
+         ->  Hash
+               ->  Seq Scan on s
+(9 rows)
+
+select count(*) from s right join l on s.nn = l.nn where l.nn not in (select nn from s);
+ count 
+-------
+  9997
+(1 row)
+
+explain (costs false) select count(*) from s right join l on s.nn = l.nn where s.nn not in (select nn from s);
+                QUERY PLAN                
+------------------------------------------
+ Aggregate
+   ->  Hash Left Join
+         Hash Cond: (l.nn = s.nn)
+         Filter: (NOT (hashed SubPlan 1))
+         ->  Seq Scan on l
+         ->  Hash
+               ->  Seq Scan on s
+         SubPlan 1
+           ->  Seq Scan on s s_1
+(9 rows)
+
+select count(*) from s right join l on s.nn = l.nn where s.nn not in (select nn from s);
+ count 
+-------
+     0
+(1 row)
+
+explain (costs false) select count(*) from s right join l on s.nn=l.nn where l.nn not in (select l.nn from l left join s on l.nn = s.nn);
+                       QUERY PLAN                       
+--------------------------------------------------------
+ Aggregate
+   ->  Nested Loop Left Join
+         Join Filter: (s.nn = l.nn)
+         ->  Hash Anti Join
+               Hash Cond: (l.nn = l_1.nn)
+               ->  Seq Scan on l
+               ->  Hash
+                     ->  Hash Left Join
+                           Hash Cond: (l_1.nn = s_1.nn)
+                           ->  Seq Scan on l l_1
+                           ->  Hash
+                                 ->  Seq Scan on s s_1
+         ->  Seq Scan on s
+(13 rows)
+
+select count(*) from s right join l on s.nn=l.nn where l.nn not in (select l.nn from l left join s on l.nn = s.nn);
+ count 
+-------
+     0
+(1 row)
+
+explain (costs false) select count(*) from s right join l on s.nn=l.nn where s.nn not in (select s.nn from l left join s on l.nn = s.nn);
+                         QUERY PLAN                         
+------------------------------------------------------------
+ Aggregate
+   InitPlan 1 (returns $0)
+     ->  Nested Loop Left Join
+           Join Filter: (l_2.nn = s_2.nn)
+           ->  Seq Scan on l l_2
+           ->  Materialize
+                 ->  Seq Scan on s s_2
+   ->  Nested Loop Anti Join
+         Join Filter: ((s.nn = s_1.nn) OR (s_1.nn IS NULL))
+         ->  Hash Left Join
+               Hash Cond: (l.nn = s.nn)
+               Filter: ((s.nn IS NOT NULL) OR (NOT $0))
+               ->  Seq Scan on l
+               ->  Hash
+                     ->  Seq Scan on s
+         ->  Materialize
+               ->  Hash Left Join
+                     Hash Cond: (l_1.nn = s_1.nn)
+                     ->  Seq Scan on l l_1
+                     ->  Hash
+                           ->  Seq Scan on s s_1
+(21 rows)
+
+select count(*) from s right join l on s.nn=l.nn where s.nn not in (select s.nn from l left join s on l.nn = s.nn);
+ count 
+-------
+     0
+(1 row)
+
+explain (costs false) select count(*) from s left join s1 on s.u=s1.u join l on s.u=l.u where s.nn not in (select nn from l);
+                         QUERY PLAN                          
+-------------------------------------------------------------
+ Aggregate
+   ->  Nested Loop
+         ->  Nested Loop Left Join
+               Join Filter: (s.u = s1.u)
+               ->  Nested Loop Anti Join
+                     ->  Seq Scan on s
+                     ->  Index Only Scan using l_nn on l l_1
+                           Index Cond: (nn = s.nn)
+               ->  Seq Scan on s1
+         ->  Index Only Scan using l_u on l
+               Index Cond: (u = s.u)
+(11 rows)
+
+select count(*) from s left join s1 on s.u=s1.u join l on s.u=l.u where s.nn not in (select nn from l);
+ count 
+-------
+     0
+(1 row)
+
+explain (costs false) select count(*) from s left join s1 on s.u=s1.u right join l on s.u=l.u where s.nn not in (select nn from l);
+                       QUERY PLAN                       
+--------------------------------------------------------
+ Aggregate
+   InitPlan 1 (returns $0)
+     ->  Seq Scan on l l_2
+   ->  Hash Anti Join
+         Hash Cond: (s.nn = l_1.nn)
+         ->  Hash Left Join
+               Hash Cond: (l.u = s.u)
+               Filter: ((s.nn IS NOT NULL) OR (NOT $0))
+               ->  Seq Scan on l
+               ->  Hash
+                     ->  Hash Right Join
+                           Hash Cond: (s1.u = s.u)
+                           ->  Seq Scan on s1
+                           ->  Hash
+                                 ->  Seq Scan on s
+         ->  Hash
+               ->  Seq Scan on l l_1
+(17 rows)
+
+select count(*) from s left join s1 on s.u=s1.u right join l on s.u=l.u where s.nn not in (select nn from l);
+ count 
+-------
+     0
+(1 row)
+
+explain (costs false) select count(*) from s left join s1 on s.u=s1.u left join l on s.u=l.u where s.nn not in (select nn from l);
+                    QUERY PLAN                     
+---------------------------------------------------
+ Aggregate
+   ->  Nested Loop Left Join
+         Join Filter: (s.u = s1.u)
+         ->  Nested Loop Anti Join
+               ->  Seq Scan on s
+               ->  Index Only Scan using l_nn on l
+                     Index Cond: (nn = s.nn)
+         ->  Seq Scan on s1
+(8 rows)
+
+select count(*) from s left join s1 on s.u=s1.u left join l on s.u=l.u where s.nn not in (select nn from l);
+ count 
+-------
+     1
+(1 row)
+
+explain (costs false) select count(*) from s right join s1 on s.u=s1.u join l on s.u=l.u where s.nn not in (select nn from l);
+                         QUERY PLAN                          
+-------------------------------------------------------------
+ Aggregate
+   ->  Nested Loop
+         ->  Nested Loop
+               Join Filter: (s.u = s1.u)
+               ->  Nested Loop Anti Join
+                     ->  Seq Scan on s
+                     ->  Index Only Scan using l_nn on l l_1
+                           Index Cond: (nn = s.nn)
+               ->  Seq Scan on s1
+         ->  Index Only Scan using l_u on l
+               Index Cond: (u = s.u)
+(11 rows)
+
+select count(*) from s right join s1 on s.u=s1.u join l on s.u=l.u where s.nn not in (select nn from l);
+ count 
+-------
+     0
+(1 row)
+
+explain (costs false) select count(*) from s join s1 on s.u=s1.u right join l on s.u=l.u where s.nn not in (select nn from l);
+                       QUERY PLAN                       
+--------------------------------------------------------
+ Aggregate
+   InitPlan 1 (returns $0)
+     ->  Seq Scan on l l_2
+   ->  Hash Anti Join
+         Hash Cond: (s.nn = l_1.nn)
+         ->  Hash Left Join
+               Hash Cond: (l.u = s.u)
+               Filter: ((s.nn IS NOT NULL) OR (NOT $0))
+               ->  Seq Scan on l
+               ->  Hash
+                     ->  Hash Join
+                           Hash Cond: (s1.u = s.u)
+                           ->  Seq Scan on s1
+                           ->  Hash
+                                 ->  Seq Scan on s
+         ->  Hash
+               ->  Seq Scan on l l_1
+(17 rows)
+
+select * from s join s1 on s.u=s1.u right join l on s.u=l.u where s.nn not in (select nn from l);
+ u | n | nn | p | u | n | n1 | nn | p | u | n | nn | p 
+---+---+----+---+---+---+----+----+---+---+---+----+---
+(0 rows)
+
+explain (costs false) select count(*) from s full join s1 on s.u=s1.u join l on s.u=l.u where s.nn not in (select nn from l);
+                         QUERY PLAN                          
+-------------------------------------------------------------
+ Aggregate
+   ->  Nested Loop
+         ->  Nested Loop Left Join
+               Join Filter: (s.u = s1.u)
+               ->  Nested Loop Anti Join
+                     ->  Seq Scan on s
+                     ->  Index Only Scan using l_nn on l l_1
+                           Index Cond: (nn = s.nn)
+               ->  Seq Scan on s1
+         ->  Index Only Scan using l_u on l
+               Index Cond: (u = s.u)
+(11 rows)
+
+select count(*) from s full join s1 on s.u=s1.u join l on s.u=l.u where s.nn not in (select nn from l);
+ count 
+-------
+     0
+(1 row)
+
+explain (costs false) select count(*) from s join s1 on s.u=s1.u full join l on s.u=l.u where s.nn not in (select nn from l);
+                       QUERY PLAN                       
+--------------------------------------------------------
+ Aggregate
+   InitPlan 1 (returns $0)
+     ->  Seq Scan on l l_2
+   ->  Hash Anti Join
+         Hash Cond: (s.nn = l_1.nn)
+         ->  Hash Full Join
+               Hash Cond: (l.u = s.u)
+               Filter: ((s.nn IS NOT NULL) OR (NOT $0))
+               ->  Seq Scan on l
+               ->  Hash
+                     ->  Hash Join
+                           Hash Cond: (s1.u = s.u)
+                           ->  Seq Scan on s1
+                           ->  Hash
+                                 ->  Seq Scan on s
+         ->  Hash
+               ->  Seq Scan on l l_1
+(17 rows)
+
+select count(*) from s join s1 on s.u=s1.u full join l on s.u=l.u where s.nn not in (select nn from l);
+ count 
+-------
+     0
+(1 row)
+
+explain (costs false) select * from s where s.nn not in (select l.nn from l left join s on l.nn=s.nn left join s1 on l.nn=s1.nn);
+                    QUERY PLAN                     
+---------------------------------------------------
+ Nested Loop Anti Join
+   ->  Seq Scan on s
+   ->  Nested Loop Left Join
+         Join Filter: (l.nn = s1.nn)
+         ->  Nested Loop Left Join
+               Join Filter: (l.nn = s_1.nn)
+               ->  Index Only Scan using l_nn on l
+                     Index Cond: (nn = s.nn)
+               ->  Seq Scan on s s_1
+         ->  Seq Scan on s1
+(10 rows)
+
+select * from s where s.nn not in (select l.nn from l left join s on l.nn=s.nn left join s1 on l.nn=s1.nn);
+    u    |    n    |   nn    |   p    
+---------+---------+---------+--------
+ 1000002 | 1000002 | 1000002 | foofoo
+(1 row)
+
+explain (costs false) select * from s where s.nn not in (select l.nn from l left join s on l.nn=s.nn right join s1 on l.nn=s1.nn);
+                     QUERY PLAN                      
+-----------------------------------------------------
+ Seq Scan on s
+   Filter: (NOT (hashed SubPlan 1))
+   SubPlan 1
+     ->  Nested Loop Left Join
+           Join Filter: (l.nn = s_1.nn)
+           ->  Nested Loop Left Join
+                 ->  Seq Scan on s1
+                 ->  Index Only Scan using l_nn on l
+                       Index Cond: (nn = s1.nn)
+           ->  Materialize
+                 ->  Seq Scan on s s_1
+(11 rows)
+
+select * from s where s.nn not in (select l.nn from l left join s on l.nn=s.nn right join s1 on l.nn=s1.nn);
+ u | n | nn | p 
+---+---+----+---
+(0 rows)
+
+explain (costs false) select * from s where (n,u,nn) not in (select l.n,l.u,l.nn from l left join s on l.nn = s.nn);
+                                    QUERY PLAN                                     
+-----------------------------------------------------------------------------------
+ Nested Loop Anti Join
+   ->  Seq Scan on s
+   ->  Nested Loop Left Join
+         Join Filter: (l.nn = s_1.nn)
+         ->  Index Scan using l_nn on l
+               Index Cond: (nn = s.nn)
+               Filter: (((s.n = n) OR (n IS NULL) OR (s.n IS NULL)) AND (s.u = u))
+         ->  Seq Scan on s s_1
+(8 rows)
+
+select * from s where (n,u,nn) not in (select l.n,l.u,l.nn from l left join s on l.nn = s.nn);
+    u    |    n    |   nn    |   p    
+---------+---------+---------+--------
+ 1000002 | 1000002 | 1000002 | foofoo
+(1 row)
+
+explain (costs false) select * from s where (n,u,nn) not in (select l.n,l.u,l.nn from l right join s on l.nn = s.nn);
+                QUERY PLAN                 
+-------------------------------------------
+ Seq Scan on s
+   Filter: (NOT (hashed SubPlan 1))
+   SubPlan 1
+     ->  Nested Loop Left Join
+           ->  Seq Scan on s s_1
+           ->  Index Scan using l_nn on l
+                 Index Cond: (nn = s_1.nn)
+(7 rows)
+
+select * from s where (n,u,nn) not in (select l.n,l.u,l.nn from l left join s on l.nn = s.nn);
+    u    |    n    |   nn    |   p    
+---------+---------+---------+--------
+ 1000002 | 1000002 | 1000002 | foofoo
+(1 row)
+
+--test reduce outer joins from outer query
+explain (costs false) select count(*) from s right join l on s.nn = l.nn where s.nn not in (select nn from l);
+                       QUERY PLAN                       
+--------------------------------------------------------
+ Aggregate
+   InitPlan 1 (returns $0)
+     ->  Seq Scan on l l_2
+   ->  Hash Anti Join
+         Hash Cond: (s.nn = l_1.nn)
+         ->  Hash Left Join
+               Hash Cond: (l.nn = s.nn)
+               Filter: ((s.nn IS NOT NULL) OR (NOT $0))
+               ->  Seq Scan on l
+               ->  Hash
+                     ->  Seq Scan on s
+         ->  Hash
+               ->  Seq Scan on l l_1
+(13 rows)
+
+select count(*) from s right join l on s.nn = l.nn where s.nn not in (select nn from l);
+ count 
+-------
+     0
+(1 row)
+
+explain (costs false) select count(*) from s right join l on s.nn = l.nn where s.nn not in (select nn from l) and s.u>0;
+                      QUERY PLAN                       
+-------------------------------------------------------
+ Aggregate
+   ->  Nested Loop
+         ->  Nested Loop Anti Join
+               ->  Seq Scan on s
+                     Filter: (u > 0)
+               ->  Index Only Scan using l_nn on l l_1
+                     Index Cond: (nn = s.nn)
+         ->  Index Only Scan using l_nn on l
+               Index Cond: (nn = s.nn)
+(9 rows)
+
+select count(*) from s right join l on s.nn = l.nn where s.nn not in (select nn from l) and s.u>0;
+ count 
+-------
+     0
+(1 row)
+
+explain (costs false) select count(*) from s right join l on s.nn = l.nn join s1 on s.u = s1.u where s.nn not in (select nn from l);
+                         QUERY PLAN                          
+-------------------------------------------------------------
+ Aggregate
+   ->  Nested Loop
+         Join Filter: (s.u = s1.u)
+         ->  Nested Loop
+               ->  Nested Loop Anti Join
+                     ->  Seq Scan on s
+                     ->  Index Only Scan using l_nn on l l_1
+                           Index Cond: (nn = s.nn)
+               ->  Index Only Scan using l_nn on l
+                     Index Cond: (nn = s.nn)
+         ->  Seq Scan on s1
+(11 rows)
+
+select count(*) from s right join l on s.nn = l.nn join s1 on s.u = s1.u where s.nn not in (select nn from l);
+ count 
+-------
+     0
+(1 row)
+
+explain (costs false) select count(*) from s right join l on s.nn = l.nn right join s1 on s.u = s1.u where s.nn not in (select nn from l);
+                          QUERY PLAN                           
+---------------------------------------------------------------
+ Aggregate
+   InitPlan 1 (returns $0)
+     ->  Seq Scan on l l_2
+   ->  Nested Loop Anti Join
+         ->  Nested Loop Left Join
+               Join Filter: (s.u = s1.u)
+               Filter: ((s.nn IS NOT NULL) OR (NOT $0))
+               ->  Seq Scan on s1
+               ->  Materialize
+                     ->  Nested Loop
+                           ->  Seq Scan on s
+                           ->  Index Only Scan using l_nn on l
+                                 Index Cond: (nn = s.nn)
+         ->  Index Only Scan using l_nn on l l_1
+               Index Cond: (nn = s.nn)
+(15 rows)
+
+select count(*) from s right join l on s.nn = l.nn right join s1 on s.u = s1.u where s.nn not in (select nn from l);
+ count 
+-------
+     0
+(1 row)
+
+explain (costs false) select count(*) from s right join l on s.nn = l.nn left join s1 on s.u = s1.u where s.nn not in (select nn from l);
+                          QUERY PLAN                          
+--------------------------------------------------------------
+ Aggregate
+   InitPlan 1 (returns $0)
+     ->  Seq Scan on l l_2
+   ->  Nested Loop Left Join
+         Join Filter: (s.u = s1.u)
+         ->  Hash Anti Join
+               Hash Cond: (s.nn = l_1.nn)
+               ->  Hash Left Join
+                     Hash Cond: (l.nn = s.nn)
+                     Filter: ((s.nn IS NOT NULL) OR (NOT $0))
+                     ->  Seq Scan on l
+                     ->  Hash
+                           ->  Seq Scan on s
+               ->  Hash
+                     ->  Seq Scan on l l_1
+         ->  Seq Scan on s1
+(16 rows)
+
+select count(*) from s right join l on s.nn = l.nn left join s1 on s.u = s1.u where s.nn not in (select nn from l);
+ count 
+-------
+     0
+(1 row)
+
+--test reduce outer joins from subquery
+explain (costs false) select * from s where nn not in (select l.nn from l right join s on l.nn = s.nn);
+                  QUERY PLAN                   
+-----------------------------------------------
+ Seq Scan on s
+   Filter: (NOT (hashed SubPlan 1))
+   SubPlan 1
+     ->  Nested Loop Left Join
+           ->  Seq Scan on s s_1
+           ->  Index Only Scan using l_nn on l
+                 Index Cond: (nn = s_1.nn)
+(7 rows)
+
+select * from s where nn not in (select l.nn from l right join s on l.nn = s.nn);
+ u | n | nn | p 
+---+---+----+---
+(0 rows)
+
+explain (costs false) select * from s where nn not in (select l.nn from l right join s on l.nn = s.nn where l.u > 9);
+                QUERY PLAN                 
+-------------------------------------------
+ Seq Scan on s
+   Filter: (NOT (hashed SubPlan 1))
+   SubPlan 1
+     ->  Nested Loop
+           ->  Seq Scan on s s_1
+           ->  Index Scan using l_nn on l
+                 Index Cond: (nn = s_1.nn)
+                 Filter: (u > 9)
+(8 rows)
+
+select * from s where nn not in (select l.nn from l right join s on l.nn = s.nn where l.u > 9);
+    u    |    n    |   nn    |   p    
+---------+---------+---------+--------
+       1 |       1 |       1 | foo
+       2 |       2 |       2 | foo
+ 1000002 | 1000002 | 1000002 | foofoo
+       3 |         |       3 | foo
+(4 rows)
+
+explain (costs false) select * from s where nn not in (select l.nn from l right join s on l.nn = s.nn where s.u > 9);
+                  QUERY PLAN                   
+-----------------------------------------------
+ Seq Scan on s
+   Filter: (NOT (hashed SubPlan 1))
+   SubPlan 1
+     ->  Nested Loop Left Join
+           ->  Seq Scan on s s_1
+                 Filter: (u > 9)
+           ->  Index Only Scan using l_nn on l
+                 Index Cond: (nn = s_1.nn)
+(8 rows)
+
+select * from s where nn not in (select l.nn from l right join s on l.nn = s.nn where s.u > 9);
+ u | n | nn | p 
+---+---+----+---
+(0 rows)
+
+explain (costs false) select * from s where nn not in (select l.nn from l right join s on l.nn = s.nn join s1 on l.n = s1.n);
+                      QUERY PLAN                       
+-------------------------------------------------------
+ Seq Scan on s
+   Filter: (NOT (hashed SubPlan 1))
+   SubPlan 1
+     ->  Nested Loop
+           Join Filter: (l.n = s1.n)
+           ->  Seq Scan on s1
+           ->  Materialize
+                 ->  Nested Loop
+                       ->  Seq Scan on s s_1
+                       ->  Index Scan using l_nn on l
+                             Index Cond: (nn = s_1.nn)
+(11 rows)
+
+select * from s where nn not in (select l.nn from l right join s on l.nn = s.nn join s1 on l.n = s1.n);
+    u    |    n    |   nn    |   p    
+---------+---------+---------+--------
+ 1000002 | 1000002 | 1000002 | foofoo
+       3 |         |       3 | foo
+(2 rows)
+
+explain (costs false) select * from s where nn not in (select l.nn from l right join s on l.nn = s.nn right join s1 on l.n = s1.n);
+                      QUERY PLAN                       
+-------------------------------------------------------
+ Seq Scan on s
+   Filter: (NOT (hashed SubPlan 1))
+   SubPlan 1
+     ->  Nested Loop Left Join
+           Join Filter: (l.n = s1.n)
+           ->  Seq Scan on s1
+           ->  Materialize
+                 ->  Nested Loop
+                       ->  Seq Scan on s s_1
+                       ->  Index Scan using l_nn on l
+                             Index Cond: (nn = s_1.nn)
+(11 rows)
+
+select * from s where nn not in (select l.nn from l right join s on l.nn = s.nn right join s1 on l.n = s1.n);
+ u | n | nn | p 
+---+---+----+---
+(0 rows)
+
+explain (costs false) select * from s where nn not in (select l.nn from l right join s on l.nn = s.nn left join s1 on l.n = s1.n);
+                   QUERY PLAN                    
+-------------------------------------------------
+ Seq Scan on s
+   Filter: (NOT (hashed SubPlan 1))
+   SubPlan 1
+     ->  Nested Loop Left Join
+           Join Filter: (l.n = s1.n)
+           ->  Nested Loop Left Join
+                 ->  Seq Scan on s s_1
+                 ->  Index Scan using l_nn on l
+                       Index Cond: (nn = s_1.nn)
+           ->  Materialize
+                 ->  Seq Scan on s1
+(11 rows)
+
+select * from s where nn not in (select l.nn from l right join s on l.nn = s.nn left join s1 on l.n = s1.n);
+ u | n | nn | p 
+---+---+----+---
+(0 rows)
+
+--test reduce outer join on outer and sub-query
+explain (costs false) select count(*) from s right join l on s.nn = l.nn join s1 on s.u = s1.u where s.nn not in (select l.nn from l right join s on l.nn = s.nn join s1 on l.n = s1.n);
+                                    QUERY PLAN                                    
+----------------------------------------------------------------------------------
+ Aggregate
+   ->  Nested Loop
+         Join Filter: (s.u = s1.u)
+         ->  Seq Scan on s1
+         ->  Materialize
+               ->  Nested Loop
+                     ->  Seq Scan on s
+                           Filter: (NOT (hashed SubPlan 1))
+                           SubPlan 1
+                             ->  Nested Loop
+                                   Join Filter: (l_1.n = s1_1.n)
+                                   ->  Seq Scan on s1 s1_1
+                                   ->  Materialize
+                                         ->  Nested Loop
+                                               ->  Seq Scan on s s_1
+                                               ->  Index Scan using l_nn on l l_1
+                                                     Index Cond: (nn = s_1.nn)
+                     ->  Index Only Scan using l_nn on l
+                           Index Cond: (nn = s.nn)
+(19 rows)
+
+select count(*) from s right join l on s.nn = l.nn join s1 on s.u = s1.u where s.nn not in (select l.nn from l right join s on l.nn = s.nn join s1 on l.n = s1.n);
+ count 
+-------
+     1
+(1 row)
+
+explain (costs false) select count(*) from s right join l on s.nn = l.nn left join s1 on s.u = s1.u where s.nn not in (select l.nn from l right join s on l.nn = s.nn left join s1 on l.n = s1.n);
+                        QUERY PLAN                        
+----------------------------------------------------------
+ Aggregate
+   ->  Hash Left Join
+         Hash Cond: (l.nn = s.nn)
+         Filter: (NOT (hashed SubPlan 1))
+         ->  Seq Scan on l
+         ->  Hash
+               ->  Hash Right Join
+                     Hash Cond: (s1.u = s.u)
+                     ->  Seq Scan on s1
+                     ->  Hash
+                           ->  Seq Scan on s
+         SubPlan 1
+           ->  Nested Loop Left Join
+                 Join Filter: (l_1.n = s1_1.n)
+                 ->  Nested Loop Left Join
+                       ->  Seq Scan on s s_1
+                       ->  Index Scan using l_nn on l l_1
+                             Index Cond: (nn = s_1.nn)
+                 ->  Materialize
+                       ->  Seq Scan on s1 s1_1
+(20 rows)
+
+select count(*) from s right join l on s.nn = l.nn left join s1 on s.u = s1.u where s.nn not in (select l.nn from l right join s on l.nn = s.nn left join s1 on l.n = s1.n);
+ count 
+-------
+     0
+(1 row)
+
+-- test union all
+explain (costs false) select * from s as t where not exists
+(select 1 from (select n as y from l union all
+				select u as y from s union all
+				select nn as y from s) as v where t.n=v.y or v.y is null) and n is not null;
+                       QUERY PLAN                       
+--------------------------------------------------------
+ Nested Loop Anti Join
+   ->  Seq Scan on s t
+         Filter: (n IS NOT NULL)
+   ->  Append
+         ->  Bitmap Heap Scan on l
+               Recheck Cond: ((t.n = n) OR (n IS NULL))
+               ->  BitmapOr
+                     ->  Bitmap Index Scan on l_n
+                           Index Cond: (n = t.n)
+                     ->  Bitmap Index Scan on l_n
+                           Index Cond: (n IS NULL)
+         ->  Seq Scan on s
+               Filter: ((t.n = u) OR (u IS NULL))
+         ->  Seq Scan on s s_1
+               Filter: ((t.n = nn) OR (nn IS NULL))
+(15 rows)
+
+select * from s as t where not exists
+(select 1 from (select n as y from l union all
+				select u as y from s union all
+				select nn as y from s) as v where t.n=v.y or v.y is null) and n is not null;
+ u | n | nn | p 
+---+---+----+---
+(0 rows)
+
+explain (costs false) select * from s where n not in
+(select n as y from l union all
+ select u as y from s union all
+ select nn as y from s);
+                       QUERY PLAN                       
+--------------------------------------------------------
+ Nested Loop Anti Join
+   InitPlan 1 (returns $0)
+     ->  Result
+           ->  Append
+                 ->  Seq Scan on l l_1
+                 ->  Seq Scan on s s_3
+                 ->  Seq Scan on s s_4
+   ->  Seq Scan on s
+         Filter: ((n IS NOT NULL) OR (NOT $0))
+   ->  Append
+         ->  Bitmap Heap Scan on l
+               Recheck Cond: ((s.n = n) OR (n IS NULL))
+               ->  BitmapOr
+                     ->  Bitmap Index Scan on l_n
+                           Index Cond: (n = s.n)
+                     ->  Bitmap Index Scan on l_n
+                           Index Cond: (n IS NULL)
+         ->  Seq Scan on s s_1
+               Filter: ((s.n = u) OR (u IS NULL))
+         ->  Seq Scan on s s_2
+               Filter: ((s.n = nn) OR (nn IS NULL))
+(21 rows)
+
+select * from s where n not in
+(select n as y from l union all
+ select u as y from s union all
+ select nn as y from s);
+ u | n | nn | p 
+---+---+----+---
+(0 rows)
+
+explain (costs false) select count(*) from
+(select n as x from s union all select u as x from l) t where t.x not in
+(select nn from l);
+                        QUERY PLAN                         
+-----------------------------------------------------------
+ Aggregate
+   InitPlan 1 (returns $0)
+     ->  Seq Scan on l l_2
+   ->  Hash Anti Join
+         Hash Cond: (s.n = l_1.nn)
+         ->  Append
+               ->  Seq Scan on s
+                     Filter: ((n IS NOT NULL) OR (NOT $0))
+               ->  Seq Scan on l
+                     Filter: ((u IS NOT NULL) OR (NOT $0))
+         ->  Hash
+               ->  Seq Scan on l l_1
+(12 rows)
+
+select count(*) from
+(select n as x from s union all select u as x from l) t where t.x not in
+(select nn from l);
+ count 
+-------
+     1
+(1 row)
+
+explain (costs false) select count(*) from
+(select n as x from s union all select n as x from l) t where t.x not in
+(select nn from empty);
+                   QUERY PLAN                   
+------------------------------------------------
+ Aggregate
+   ->  Append
+         ->  Seq Scan on s
+               Filter: (NOT (hashed SubPlan 1))
+               SubPlan 1
+                 ->  Seq Scan on empty
+         ->  Seq Scan on l
+               Filter: (NOT (hashed SubPlan 1))
+(8 rows)
+
+select count(*) from
+(select n as x from s union all select n as x from l) t where t.x not in
+(select nn from empty);
+ count 
+-------
+ 10004
+(1 row)
+
+explain (costs false) select count(*) from
+(select n as x from s union all select u as x from l) t where t.x not in
+(select n as y from l union all
+ select u as y from s union all
+ select nn as y from s);
+                                QUERY PLAN                                
+--------------------------------------------------------------------------
+ Finalize Aggregate
+   InitPlan 1 (returns $0)
+     ->  Result
+           ->  Append
+                 ->  Seq Scan on l l_2
+                 ->  Seq Scan on s s_3
+                 ->  Seq Scan on s s_4
+   ->  Gather
+         Workers Planned: 2
+         Params Evaluated: $0
+         ->  Partial Aggregate
+               ->  Nested Loop Anti Join
+                     ->  Parallel Append
+                           ->  Parallel Seq Scan on l
+                                 Filter: ((u IS NOT NULL) OR (NOT $0))
+                           ->  Parallel Seq Scan on s
+                                 Filter: ((n IS NOT NULL) OR (NOT $0))
+                     ->  Append
+                           ->  Bitmap Heap Scan on l l_1
+                                 Recheck Cond: ((l.u = n) OR (n IS NULL))
+                                 ->  BitmapOr
+                                       ->  Bitmap Index Scan on l_n
+                                             Index Cond: (n = l.u)
+                                       ->  Bitmap Index Scan on l_n
+                                             Index Cond: (n IS NULL)
+                           ->  Seq Scan on s s_1
+                                 Filter: ((l.u = u) OR (u IS NULL))
+                           ->  Seq Scan on s s_2
+                                 Filter: ((l.u = nn) OR (nn IS NULL))
+(29 rows)
+
+select count(*) from
+(select n as x from s union all select u as x from l) t where t.x not in
+(select n as y from l union all
+ select u as y from s union all
+ select nn as y from s);
+ count 
+-------
+     0
+(1 row)
+
+-- test multi-levels of NOT IN
+explain (costs false) select * from s where n not in (select n from s where n not in (select n from l));
+                         QUERY PLAN                         
+------------------------------------------------------------
+ Seq Scan on s
+   Filter: (NOT (hashed SubPlan 2))
+   SubPlan 2
+     ->  Nested Loop Anti Join
+           InitPlan 1 (returns $0)
+             ->  Seq Scan on l
+           ->  Seq Scan on s s_1
+                 Filter: ((n IS NOT NULL) OR (NOT $0))
+           ->  Bitmap Heap Scan on l l_1
+                 Recheck Cond: ((s_1.n = n) OR (n IS NULL))
+                 ->  BitmapOr
+                       ->  Bitmap Index Scan on l_n
+                             Index Cond: (n = s_1.n)
+                       ->  Bitmap Index Scan on l_n
+                             Index Cond: (n IS NULL)
+(15 rows)
+
+select * from s where n not in (select n from s where n not in (select n from l));
+    u    |    n    |   nn    |   p    
+---------+---------+---------+--------
+       1 |       1 |       1 | foo
+       2 |       2 |       2 | foo
+ 1000002 | 1000002 | 1000002 | foofoo
+       3 |         |       3 | foo
+(4 rows)
+
+explain (costs false) select * from s where n not in (select n from s where n not in (select u from l));
+                      QUERY PLAN                       
+-------------------------------------------------------
+ Seq Scan on s
+   Filter: (NOT (hashed SubPlan 2))
+   SubPlan 2
+     ->  Nested Loop Anti Join
+           InitPlan 1 (returns $0)
+             ->  Seq Scan on l
+           ->  Seq Scan on s s_1
+                 Filter: ((n IS NOT NULL) OR (NOT $0))
+           ->  Index Only Scan using l_u on l l_1
+                 Index Cond: (u = s_1.n)
+(10 rows)
+
+select * from s where n not in (select n from s where n not in (select u from l));
+ u | n | nn |  p  
+---+---+----+-----
+ 1 | 1 |  1 | foo
+ 2 | 2 |  2 | foo
+(2 rows)
+
+explain (costs false) select count(*) from s where u not in
+(select n from s1 where not exists
+ (select 1 from (select n from s1 where u not in (select n from l)) t where t.n = s.n));
+                               QUERY PLAN                                
+-------------------------------------------------------------------------
+ Aggregate
+   ->  Seq Scan on s
+         Filter: (NOT (SubPlan 2))
+         SubPlan 2
+           ->  Result
+                 One-Time Filter: (NOT $2)
+                 InitPlan 1 (returns $2)
+                   ->  Nested Loop Anti Join
+                         ->  Seq Scan on s1
+                               Filter: (n = s.n)
+                         ->  Bitmap Heap Scan on l
+                               Recheck Cond: ((s1.u = n) OR (n IS NULL))
+                               ->  BitmapOr
+                                     ->  Bitmap Index Scan on l_n
+                                           Index Cond: (n = s1.u)
+                                     ->  Bitmap Index Scan on l_n
+                                           Index Cond: (n IS NULL)
+                 ->  Seq Scan on s1 s1_1
+(18 rows)
+
+select count(*) from s where u not in
+(select n from s1 where not exists
+ (select 1 from (select n from s1 where u not in (select n from l)) t where t.n = s.n));
+ count 
+-------
+     0
+(1 row)
+
+explain (costs false) select * from s where n not in (select n from s1) and u not in (select u from s1) and nn not in (select nn from s1);
+                                           QUERY PLAN                                           
+------------------------------------------------------------------------------------------------
+ Seq Scan on s
+   Filter: ((NOT (hashed SubPlan 1)) AND (NOT (hashed SubPlan 2)) AND (NOT (hashed SubPlan 3)))
+   SubPlan 1
+     ->  Seq Scan on s1
+   SubPlan 2
+     ->  Seq Scan on s1 s1_1
+   SubPlan 3
+     ->  Seq Scan on s1 s1_2
+(8 rows)
+
+select * from s where n not in (select n from s1) and u not in (select u from s1) and nn not in (select nn from s1);
+ u | n | nn | p 
+---+---+----+---
+(0 rows)
+
+explain (costs false) select * from s where n not in (select n from s1) and u not in (select u from s1) and nn not in (select nn from l);
+                               QUERY PLAN                                
+-------------------------------------------------------------------------
+ Nested Loop Anti Join
+   ->  Seq Scan on s
+         Filter: ((NOT (hashed SubPlan 1)) AND (NOT (hashed SubPlan 2)))
+         SubPlan 1
+           ->  Seq Scan on s1
+         SubPlan 2
+           ->  Seq Scan on s1 s1_1
+   ->  Index Only Scan using l_nn on l
+         Index Cond: (nn = s.nn)
+(9 rows)
+
+select * from s where n not in (select n from s1) and u not in (select u from s1) and nn not in (select nn from l);
+ u | n | nn | p 
+---+---+----+---
+(0 rows)
+
+explain (costs false) select count(*) from s where u not in
+(select n from s1 where not exists
+ (select 1 from (select n from s1 where u not in (select n from l)) t where t.n = s.n))
+and nn not in
+(select n from s1 where not exists
+ (select 1 from (select n from s1 where u not in (select n from l)) t where t.n = s.n));
+                                QUERY PLAN                                 
+---------------------------------------------------------------------------
+ Aggregate
+   ->  Seq Scan on s
+         Filter: ((NOT (SubPlan 2)) AND (NOT (SubPlan 4)))
+         SubPlan 2
+           ->  Result
+                 One-Time Filter: (NOT $2)
+                 InitPlan 1 (returns $2)
+                   ->  Nested Loop Anti Join
+                         ->  Seq Scan on s1
+                               Filter: (n = s.n)
+                         ->  Bitmap Heap Scan on l
+                               Recheck Cond: ((s1.u = n) OR (n IS NULL))
+                               ->  BitmapOr
+                                     ->  Bitmap Index Scan on l_n
+                                           Index Cond: (n = s1.u)
+                                     ->  Bitmap Index Scan on l_n
+                                           Index Cond: (n IS NULL)
+                 ->  Seq Scan on s1 s1_1
+         SubPlan 4
+           ->  Result
+                 One-Time Filter: (NOT $6)
+                 InitPlan 3 (returns $6)
+                   ->  Nested Loop Anti Join
+                         ->  Seq Scan on s1 s1_2
+                               Filter: (n = s.n)
+                         ->  Bitmap Heap Scan on l l_1
+                               Recheck Cond: ((s1_2.u = n) OR (n IS NULL))
+                               ->  BitmapOr
+                                     ->  Bitmap Index Scan on l_n
+                                           Index Cond: (n = s1_2.u)
+                                     ->  Bitmap Index Scan on l_n
+                                           Index Cond: (n IS NULL)
+                 ->  Seq Scan on s1 s1_3
+(33 rows)
+
+select count(*) from s where u not in
+(select n from s1 where not exists
+ (select 1 from (select n from s1 where u not in (select n from l)) t where t.n = s.n))
+and nn not in
+(select n from s1 where not exists
+ (select 1 from (select n from s1 where u not in (select n from l)) t where t.n = s.n));
+ count 
+-------
+     0
+(1 row)
+
+--test COALESCE
+explain (costs false) select * from s where COALESCE(n, -1) not in (select COALESCE(n, -1) from l);
+                                 QUERY PLAN                                 
+----------------------------------------------------------------------------
+ Hash Anti Join
+   Hash Cond: (COALESCE(s.n, '-1'::integer) = COALESCE(l.n, '-1'::integer))
+   ->  Seq Scan on s
+   ->  Hash
+         ->  Seq Scan on l
+(5 rows)
+
+select * from s where COALESCE(n, -1) not in (select COALESCE(n, -1) from l);
+    u    |    n    |   nn    |   p    
+---------+---------+---------+--------
+ 1000002 | 1000002 | 1000002 | foofoo
+(1 row)
+
+explain (costs false) select * from s where COALESCE(n, NULL, -1) not in (select COALESCE(n, NULL, -1) from l);
+                                 QUERY PLAN                                 
+----------------------------------------------------------------------------
+ Hash Anti Join
+   Hash Cond: (COALESCE(s.n, '-1'::integer) = COALESCE(l.n, '-1'::integer))
+   ->  Seq Scan on s
+   ->  Hash
+         ->  Seq Scan on l
+(5 rows)
+
+select * from s where COALESCE(n, NULL, -1) not in (select COALESCE(n, NULL, -1) from l);
+    u    |    n    |   nn    |   p    
+---------+---------+---------+--------
+ 1000002 | 1000002 | 1000002 | foofoo
+(1 row)
+
+explain (costs false) select * from s where COALESCE(n, NULL, NULL) not in (select COALESCE(n, NULL, NULL) from l);
+                                 QUERY PLAN                                  
+-----------------------------------------------------------------------------
+ Nested Loop Anti Join
+   Join Filter: ((COALESCE(s.n) = COALESCE(l.n)) OR (COALESCE(l.n) IS NULL))
+   InitPlan 1 (returns $0)
+     ->  Seq Scan on l l_1
+   ->  Seq Scan on s
+         Filter: ((COALESCE(n) IS NOT NULL) OR (NOT $0))
+   ->  Materialize
+         ->  Seq Scan on l
+(8 rows)
+
+select * from s where COALESCE(n, NULL, NULL) not in (select COALESCE(n, NULL, NULL) from l);
+ u | n | nn | p 
+---+---+----+---
+(0 rows)
+
+explain (costs false) select * from s where COALESCE(n, nn) not in (select COALESCE(n, nn) from l);
+                        QUERY PLAN                        
+----------------------------------------------------------
+ Hash Anti Join
+   Hash Cond: (COALESCE(s.n, s.nn) = COALESCE(l.n, l.nn))
+   ->  Seq Scan on s
+   ->  Hash
+         ->  Seq Scan on l
+(5 rows)
+
+select * from s where COALESCE(n, nn) not in (select COALESCE(n, nn) from l);
+    u    |    n    |   nn    |   p    
+---------+---------+---------+--------
+ 1000002 | 1000002 | 1000002 | foofoo
+(1 row)
+
+explain (costs false) select * from s where COALESCE(nn, NULL) not in (select COALESCE(nn, NULL) from l);
+                   QUERY PLAN                   
+------------------------------------------------
+ Hash Anti Join
+   Hash Cond: (COALESCE(s.nn) = COALESCE(l.nn))
+   ->  Seq Scan on s
+   ->  Hash
+         ->  Seq Scan on l
+(5 rows)
+
+select * from s where COALESCE(nn, NULL) not in (select COALESCE(nn, NULL) from l);
+    u    |    n    |   nn    |   p    
+---------+---------+---------+--------
+ 1000002 | 1000002 | 1000002 | foofoo
+(1 row)
+
+explain (costs false) select * from s where (COALESCE(n, -1), nn, COALESCE(n, u)) not in (select COALESCE(n, -1), nn, COALESCE(n, u) from l);
+                                                       QUERY PLAN                                                        
+-------------------------------------------------------------------------------------------------------------------------
+ Nested Loop Anti Join
+   ->  Seq Scan on s
+   ->  Index Scan using l_nn on l
+         Index Cond: (nn = s.nn)
+         Filter: ((COALESCE(s.n, '-1'::integer) = COALESCE(n, '-1'::integer)) AND (COALESCE(s.n, s.u) = COALESCE(n, u)))
+(5 rows)
+
+select * from s where (COALESCE(n, -1), nn, COALESCE(n, u)) not in (select COALESCE(n, -1), nn, COALESCE(n, u) from l);
+    u    |    n    |   nn    |   p    
+---------+---------+---------+--------
+ 1000002 | 1000002 | 1000002 | foofoo
+       3 |         |       3 | foo
+(2 rows)
+
+-- test miscellaneous outer nullable cases
+explain (costs false) select * from s where (n,n) not in (select n,n from l);
+                         QUERY PLAN                          
+-------------------------------------------------------------
+ Nested Loop Anti Join
+   Join Filter: (((s.n = l.n) AND (s.n = l.n)) IS NOT FALSE)
+   ->  Seq Scan on s
+   ->  Materialize
+         ->  Seq Scan on l
+(5 rows)
+
+select * from s where (n,n) not in (select n,n from l);
+ u | n | nn | p 
+---+---+----+---
+(0 rows)
+
+explain (costs false) select * from s right join l on s.nn = l.nn where (s.n,s.u,s.nn) not in (select n,u,nn from l);
+                                     QUERY PLAN                                      
+-------------------------------------------------------------------------------------
+ Nested Loop Anti Join
+   Join Filter: (((s.n = l_1.n) AND (s.u = l_1.u) AND (s.nn = l_1.nn)) IS NOT FALSE)
+   ->  Hash Left Join
+         Hash Cond: (l.nn = s.nn)
+         ->  Seq Scan on l
+         ->  Hash
+               ->  Seq Scan on s
+   ->  Materialize
+         ->  Seq Scan on l l_1
+(9 rows)
+
+select * from s right join l on s.nn = l.nn where (s.n,s.u,s.nn) not in (select n,u,nn from l);
+ u | n | nn | p | u | n | nn | p 
+---+---+----+---+---+---+----+---
+(0 rows)
+
+explain (costs false) select count(*) from s right join l on s.nn = l.nn where (s.n,s.u,s.nn) not in (select n,u,nn from l where u < 0);
+                 QUERY PLAN                  
+---------------------------------------------
+ Aggregate
+   ->  Hash Left Join
+         Hash Cond: (l.nn = s.nn)
+         Filter: (NOT (hashed SubPlan 1))
+         ->  Seq Scan on l
+         ->  Hash
+               ->  Seq Scan on s
+         SubPlan 1
+           ->  Index Scan using l_u on l l_1
+                 Index Cond: (u < 0)
+(10 rows)
+
+select count(*) from s right join l on s.nn = l.nn where (s.n,s.u,s.nn) not in (select n,u,nn from l where u < 0);
+ count 
+-------
+ 10000
+(1 row)
+
+explain (costs false) select * from s where (n,n,n) not in (select distinct n,n,n from l where u > 0 limit 3) order by n;
+                     QUERY PLAN                      
+-----------------------------------------------------
+ Sort
+   Sort Key: s.n
+   ->  Seq Scan on s
+         Filter: (NOT (hashed SubPlan 1))
+         SubPlan 1
+           ->  Limit
+                 ->  Unique
+                       ->  Index Scan using l_n on l
+                             Filter: (u > 0)
+(9 rows)
+
+select * from s where (n,n,n) not in (select distinct n,n,n from l where u > 0 limit 3) order by n;
+    u    |    n    |   nn    |   p    
+---------+---------+---------+--------
+ 1000002 | 1000002 | 1000002 | foofoo
+(1 row)
+
+--test outer has strict predicate or inner join
+explain (costs false) select * from s where n not in (select n from l) and n > 0;
+                    QUERY PLAN                    
+--------------------------------------------------
+ Nested Loop Anti Join
+   ->  Seq Scan on s
+         Filter: (n > 0)
+   ->  Bitmap Heap Scan on l
+         Recheck Cond: ((s.n = n) OR (n IS NULL))
+         ->  BitmapOr
+               ->  Bitmap Index Scan on l_n
+                     Index Cond: (n = s.n)
+               ->  Bitmap Index Scan on l_n
+                     Index Cond: (n IS NULL)
+(10 rows)
+
+select * from s where n not in (select n from l) and n > 0;
+ u | n | nn | p 
+---+---+----+---
+(0 rows)
+
+explain (costs false) select * from s where n not in (select n from l) and u > 0;
+                         QUERY PLAN                          
+-------------------------------------------------------------
+ Nested Loop Anti Join
+   InitPlan 1 (returns $0)
+     ->  Seq Scan on l l_1
+   ->  Seq Scan on s
+         Filter: (((n IS NOT NULL) OR (NOT $0)) AND (u > 0))
+   ->  Bitmap Heap Scan on l
+         Recheck Cond: ((s.n = n) OR (n IS NULL))
+         ->  BitmapOr
+               ->  Bitmap Index Scan on l_n
+                     Index Cond: (n = s.n)
+               ->  Bitmap Index Scan on l_n
+                     Index Cond: (n IS NULL)
+(12 rows)
+
+select * from s where n not in (select n from l) and u > 0;
+ u | n | nn | p 
+---+---+----+---
+(0 rows)
+
+explain (costs false) select * from s where n not in (select n from l) and n is not null;
+                    QUERY PLAN                    
+--------------------------------------------------
+ Nested Loop Anti Join
+   ->  Seq Scan on s
+         Filter: (n IS NOT NULL)
+   ->  Bitmap Heap Scan on l
+         Recheck Cond: ((s.n = n) OR (n IS NULL))
+         ->  BitmapOr
+               ->  Bitmap Index Scan on l_n
+                     Index Cond: (n = s.n)
+               ->  Bitmap Index Scan on l_n
+                     Index Cond: (n IS NULL)
+(10 rows)
+
+select * from s where n not in (select n from l) and n is not null;
+ u | n | nn | p 
+---+---+----+---
+(0 rows)
+
+explain (costs false) select * from s join l on s.n = l.n where s.n not in (select n from l);
+                       QUERY PLAN                       
+--------------------------------------------------------
+ Nested Loop
+   ->  Nested Loop Anti Join
+         ->  Seq Scan on s
+         ->  Bitmap Heap Scan on l l_1
+               Recheck Cond: ((s.n = n) OR (n IS NULL))
+               ->  BitmapOr
+                     ->  Bitmap Index Scan on l_n
+                           Index Cond: (n = s.n)
+                     ->  Bitmap Index Scan on l_n
+                           Index Cond: (n IS NULL)
+   ->  Index Scan using l_n on l
+         Index Cond: (n = s.n)
+(12 rows)
+
+select * from s join l on s.n = l.n where s.n not in (select n from l);
+ u | n | nn | p | u | n | nn | p 
+---+---+----+---+---+---+----+---
+(0 rows)
+
+explain (costs false) select count(*) from s right join l on s.n = l.n where s.n not in (select n from l);
+                       QUERY PLAN                       
+--------------------------------------------------------
+ Aggregate
+   InitPlan 1 (returns $0)
+     ->  Seq Scan on l l_2
+   ->  Nested Loop Anti Join
+         ->  Hash Left Join
+               Hash Cond: (l.n = s.n)
+               Filter: ((s.n IS NOT NULL) OR (NOT $0))
+               ->  Seq Scan on l
+               ->  Hash
+                     ->  Seq Scan on s
+         ->  Bitmap Heap Scan on l l_1
+               Recheck Cond: ((s.n = n) OR (n IS NULL))
+               ->  BitmapOr
+                     ->  Bitmap Index Scan on l_n
+                           Index Cond: (n = s.n)
+                     ->  Bitmap Index Scan on l_n
+                           Index Cond: (n IS NULL)
+(17 rows)
+
+select count(*) from s right join l on s.n = l.n where s.n not in (select n from l);
+ count 
+-------
+     0
+(1 row)
+
+explain (costs false) select count(*) from s right join l on s.n = l.n join s1 on s.u = s1.u where s.n not in (select n from l);
+                             QUERY PLAN                             
+--------------------------------------------------------------------
+ Aggregate
+   ->  Nested Loop
+         Join Filter: (s.u = s1.u)
+         ->  Nested Loop
+               ->  Nested Loop Anti Join
+                     ->  Seq Scan on s
+                     ->  Bitmap Heap Scan on l l_1
+                           Recheck Cond: ((s.n = n) OR (n IS NULL))
+                           ->  BitmapOr
+                                 ->  Bitmap Index Scan on l_n
+                                       Index Cond: (n = s.n)
+                                 ->  Bitmap Index Scan on l_n
+                                       Index Cond: (n IS NULL)
+               ->  Index Only Scan using l_n on l
+                     Index Cond: (n = s.n)
+         ->  Seq Scan on s1
+(16 rows)
+
+select count(*) from s right join l on s.n = l.n join s1 on s.u = s1.u where s.n not in (select n from l);
+ count 
+-------
+     0
+(1 row)
+
+explain (costs false) select count(*) from s join l on s.n = l.n right join s1 on s.u = s1.u where s.n not in (select n from l);
+                          QUERY PLAN                          
+--------------------------------------------------------------
+ Aggregate
+   InitPlan 1 (returns $0)
+     ->  Seq Scan on l l_2
+   ->  Nested Loop Anti Join
+         ->  Nested Loop Left Join
+               Join Filter: (s.u = s1.u)
+               Filter: ((s.n IS NOT NULL) OR (NOT $0))
+               ->  Seq Scan on s1
+               ->  Materialize
+                     ->  Nested Loop
+                           ->  Seq Scan on s
+                           ->  Index Only Scan using l_n on l
+                                 Index Cond: (n = s.n)
+         ->  Bitmap Heap Scan on l l_1
+               Recheck Cond: ((s.n = n) OR (n IS NULL))
+               ->  BitmapOr
+                     ->  Bitmap Index Scan on l_n
+                           Index Cond: (n = s.n)
+                     ->  Bitmap Index Scan on l_n
+                           Index Cond: (n IS NULL)
+(20 rows)
+
+select count(*) from s join l on s.n = l.n right join s1 on s.u = s1.u where s.n not in (select n from l);
+ count 
+-------
+     0
+(1 row)
+
+--test inner has strict predicate or inner join
+explain (costs false) select * from s where u not in (select n from l where n > 0);
+                 QUERY PLAN                  
+---------------------------------------------
+ Nested Loop Anti Join
+   ->  Seq Scan on s
+   ->  Index Only Scan using l_n on l
+         Index Cond: ((n = s.u) AND (n > 0))
+(4 rows)
+
+select * from s where u not in (select n from l where n > 0);
+    u    |    n    |   nn    |   p    
+---------+---------+---------+--------
+ 1000002 | 1000002 | 1000002 | foofoo
+(1 row)
+
+explain (costs false) select * from s where u not in (select n from l where u > 0);
+                    QUERY PLAN                    
+--------------------------------------------------
+ Nested Loop Anti Join
+   ->  Seq Scan on s
+   ->  Bitmap Heap Scan on l
+         Recheck Cond: ((s.u = n) OR (n IS NULL))
+         Filter: (u > 0)
+         ->  BitmapOr
+               ->  Bitmap Index Scan on l_n
+                     Index Cond: (n = s.u)
+               ->  Bitmap Index Scan on l_n
+                     Index Cond: (n IS NULL)
+(10 rows)
+
+select * from s where u not in (select n from l where u > 0);
+ u | n | nn | p 
+---+---+----+---
+(0 rows)
+
+explain (costs false) select * from s where u not in (select n from l where n is not null);
+                     QUERY PLAN                      
+-----------------------------------------------------
+ Nested Loop Anti Join
+   ->  Seq Scan on s
+   ->  Index Only Scan using l_n on l
+         Index Cond: ((n = s.u) AND (n IS NOT NULL))
+(4 rows)
+
+select * from s where u not in (select n from l where n is not null);
+    u    |    n    |   nn    |   p    
+---------+---------+---------+--------
+ 1000002 | 1000002 | 1000002 | foofoo
+(1 row)
+
+explain (costs false) select * from s where u not in (select l.n from l join s on l.n=s.n);
+                  QUERY PLAN                  
+----------------------------------------------
+ Seq Scan on s
+   Filter: (NOT (hashed SubPlan 1))
+   SubPlan 1
+     ->  Nested Loop
+           ->  Seq Scan on s s_1
+           ->  Index Only Scan using l_n on l
+                 Index Cond: (n = s_1.n)
+(7 rows)
+
+select * from s where u not in (select l.n from l join s on l.n=s.n);
+    u    |    n    |   nn    |   p    
+---------+---------+---------+--------
+ 1000002 | 1000002 | 1000002 | foofoo
+       3 |         |       3 | foo
+(2 rows)
+
+explain (costs false) select * from s where u not in (select l.n from l join s on l.u=s.u);
+               QUERY PLAN                
+-----------------------------------------
+ Seq Scan on s
+   Filter: (NOT (hashed SubPlan 1))
+   SubPlan 1
+     ->  Nested Loop
+           ->  Seq Scan on s s_1
+           ->  Index Scan using l_u on l
+                 Index Cond: (u = s_1.u)
+(7 rows)
+
+select * from s where u not in (select l.n from l join s on l.u=s.u);
+    u    |    n    |   nn    |   p    
+---------+---------+---------+--------
+ 1000002 | 1000002 | 1000002 | foofoo
+(1 row)
+
+explain (costs false) select * from s where u not in (select l.n from l join s on l.n = s.n);
+                  QUERY PLAN                  
+----------------------------------------------
+ Seq Scan on s
+   Filter: (NOT (hashed SubPlan 1))
+   SubPlan 1
+     ->  Nested Loop
+           ->  Seq Scan on s s_1
+           ->  Index Only Scan using l_n on l
+                 Index Cond: (n = s_1.n)
+(7 rows)
+
+select * from s where u not in (select l.n from l join s on l.n = s.n);
+    u    |    n    |   nn    |   p    
+---------+---------+---------+--------
+ 1000002 | 1000002 | 1000002 | foofoo
+       3 |         |       3 | foo
+(2 rows)
+
+explain (costs false) select * from s where u not in (select l.n from l right join s on l.n = s.n);
+                  QUERY PLAN                  
+----------------------------------------------
+ Seq Scan on s
+   Filter: (NOT (hashed SubPlan 1))
+   SubPlan 1
+     ->  Nested Loop Left Join
+           ->  Seq Scan on s s_1
+           ->  Index Only Scan using l_n on l
+                 Index Cond: (n = s_1.n)
+(7 rows)
+
+select * from s where u not in (select l.n from l right join s on l.n = s.n);
+ u | n | nn | p 
+---+---+----+---
+(0 rows)
+
+explain (costs false) select * from s where u not in (select l.n from l right join s on l.n=s.n join s1 on l.n=s1.n);
+                  QUERY PLAN                  
+----------------------------------------------
+ Seq Scan on s
+   Filter: (NOT (hashed SubPlan 1))
+   SubPlan 1
+     ->  Nested Loop
+           ->  Nested Loop
+                 Join Filter: (s_1.n = s1.n)
+                 ->  Seq Scan on s1
+                 ->  Materialize
+                       ->  Seq Scan on s s_1
+           ->  Index Only Scan using l_n on l
+                 Index Cond: (n = s_1.n)
+(11 rows)
+
+select * from s where u not in (select l.n from l right join s on l.n=s.n join s1 on l.n=s1.n);
+    u    |    n    |   nn    |   p    
+---------+---------+---------+--------
+ 1000002 | 1000002 | 1000002 | foofoo
+       3 |         |       3 | foo
+(2 rows)
+
+explain (costs false) select * from s where u not in (select l.n from l join s on l.n=s.n right join s1 on l.n=s1.n);
+                        QUERY PLAN                        
+----------------------------------------------------------
+ Seq Scan on s
+   Filter: (NOT (hashed SubPlan 1))
+   SubPlan 1
+     ->  Nested Loop Left Join
+           Join Filter: (l.n = s1.n)
+           ->  Seq Scan on s1
+           ->  Materialize
+                 ->  Nested Loop
+                       ->  Seq Scan on s s_1
+                       ->  Index Only Scan using l_n on l
+                             Index Cond: (n = s_1.n)
+(11 rows)
+
+select * from s where u not in (select l.n from l join s on l.n=s.n right join s1 on l.n=s1.n);
+ u | n | nn | p 
+---+---+----+---
+(0 rows)
+
+--test both sides have strict predicate or inner join
+explain (costs false) select * from s where n not in (select n from l where n > 0) and n > 0;
+                 QUERY PLAN                  
+---------------------------------------------
+ Nested Loop Anti Join
+   ->  Seq Scan on s
+         Filter: (n > 0)
+   ->  Index Only Scan using l_n on l
+         Index Cond: ((n = s.n) AND (n > 0))
+(5 rows)
+
+select * from s where n not in (select n from l where n > 0) and n > 0;
+    u    |    n    |   nn    |   p    
+---------+---------+---------+--------
+ 1000002 | 1000002 | 1000002 | foofoo
+(1 row)
+
+explain (costs false) select * from s where n not in (select n from l where u > 0) and n > 0;
+                    QUERY PLAN                    
+--------------------------------------------------
+ Nested Loop Anti Join
+   ->  Seq Scan on s
+         Filter: (n > 0)
+   ->  Bitmap Heap Scan on l
+         Recheck Cond: ((s.n = n) OR (n IS NULL))
+         Filter: (u > 0)
+         ->  BitmapOr
+               ->  Bitmap Index Scan on l_n
+                     Index Cond: (n = s.n)
+               ->  Bitmap Index Scan on l_n
+                     Index Cond: (n IS NULL)
+(11 rows)
+
+select * from s where n not in (select n from l where u > 0) and n > 0;
+ u | n | nn | p 
+---+---+----+---
+(0 rows)
+
+explain (costs false) select * from s where n not in (select n from l where n > 0) and u > 0;
+                         QUERY PLAN                          
+-------------------------------------------------------------
+ Nested Loop Anti Join
+   InitPlan 1 (returns $0)
+     ->  Seq Scan on l l_1
+           Filter: (n > 0)
+   ->  Seq Scan on s
+         Filter: (((n IS NOT NULL) OR (NOT $0)) AND (u > 0))
+   ->  Index Only Scan using l_n on l
+         Index Cond: ((n = s.n) AND (n > 0))
+(8 rows)
+
+select * from s where n not in (select n from l where n > 0) and u > 0;
+    u    |    n    |   nn    |   p    
+---------+---------+---------+--------
+ 1000002 | 1000002 | 1000002 | foofoo
+(1 row)
+
+explain (costs false) select * from s right join l on s.n = l.n join s1 on s.u = s1.u where s.n not in (select l.n from l right join s on l.n=s.n join s s1 on l.n=s1.n);
+                             QUERY PLAN                             
+--------------------------------------------------------------------
+ Nested Loop
+   Join Filter: (s.u = s1.u)
+   ->  Seq Scan on s1
+   ->  Materialize
+         ->  Nested Loop
+               ->  Seq Scan on s
+                     Filter: (NOT (hashed SubPlan 1))
+                     SubPlan 1
+                       ->  Nested Loop
+                             ->  Nested Loop
+                                   Join Filter: (s_1.n = s1_1.n)
+                                   ->  Seq Scan on s s_1
+                                   ->  Materialize
+                                         ->  Seq Scan on s s1_1
+                             ->  Index Only Scan using l_n on l l_1
+                                   Index Cond: (n = s_1.n)
+               ->  Index Scan using l_n on l
+                     Index Cond: (n = s.n)
+(18 rows)
+
+select * from s right join l on s.n = l.n join s1 on s.u = s1.u where s.n not in (select l.n from l right join s on l.n=s.n join s s1 on l.n=s1.n);
+ u | n | nn | p | u | n | nn | p | u | n | n1 | nn | p 
+---+---+----+---+---+---+----+---+---+---+----+----+---
+(0 rows)
+
+explain (costs false) select * from s right join l on s.n = l.n join s1 on s.u = s1.u where s.n not in (select l.n from l join s on l.n=s.n right join s s1 on l.n=s1.n);
+                                   QUERY PLAN                                   
+--------------------------------------------------------------------------------
+ Nested Loop
+   Join Filter: (s.u = s1.u)
+   ->  Seq Scan on s1
+   ->  Materialize
+         ->  Nested Loop
+               ->  Seq Scan on s
+                     Filter: (NOT (hashed SubPlan 1))
+                     SubPlan 1
+                       ->  Nested Loop Left Join
+                             Join Filter: (l_1.n = s1_1.n)
+                             ->  Seq Scan on s s1_1
+                             ->  Materialize
+                                   ->  Nested Loop
+                                         ->  Seq Scan on s s_1
+                                         ->  Index Only Scan using l_n on l l_1
+                                               Index Cond: (n = s_1.n)
+               ->  Index Scan using l_n on l
+                     Index Cond: (n = s.n)
+(18 rows)
+
+select * from s right join l on s.n = l.n join s1 on s.u = s1.u where s.n not in (select l.n from l join s on l.n=s.n right join s s1 on l.n=s1.n);
+ u | n | nn | p | u | n | nn | p | u | n | n1 | nn | p 
+---+---+----+---+---+---+----+---+---+---+----+----+---
+(0 rows)
+
+explain (costs false) select * from s join l on s.n = l.n right join s1 on s.u = s1.u where s.n not in (select l.n from l join s on l.n=s.n right join s s1 on l.n=s1.n);
+                          QUERY PLAN                          
+--------------------------------------------------------------
+ Nested Loop Left Join
+   Join Filter: (s.u = s1.u)
+   Filter: (NOT (hashed SubPlan 1))
+   ->  Seq Scan on s1
+   ->  Materialize
+         ->  Nested Loop
+               ->  Seq Scan on s
+               ->  Index Scan using l_n on l
+                     Index Cond: (n = s.n)
+   SubPlan 1
+     ->  Nested Loop Left Join
+           Join Filter: (l_1.n = s1_1.n)
+           ->  Seq Scan on s s1_1
+           ->  Materialize
+                 ->  Nested Loop
+                       ->  Seq Scan on s s_1
+                       ->  Index Only Scan using l_n on l l_1
+                             Index Cond: (n = s_1.n)
+(18 rows)
+
+select * from s join l on s.n = l.n right join s1 on s.u = s1.u where s.n not in (select l.n from l join s on l.n=s.n right join s s1 on l.n=s1.n);
+ u | n | nn | p | u | n | nn | p | u | n | n1 | nn | p 
+---+---+----+---+---+---+----+---+---+---+----+----+---
+(0 rows)
+
+--JIRA-7279 CTE with NOT IN
+create table public.testing
+(
+a integer,
+b integer,
+c integer
+);
+explain (costs false) with
+selected(a,b,c) as (values(1,2,3)),
+updated(d,e,f) as (values(4,5,6))
+insert into public.testing
+select * from selected
+where (a,b,c) not in (select d,e,f from updated);
+                    QUERY PLAN                     
+---------------------------------------------------
+ Insert on testing
+   ->  Result
+         One-Time Filter: (NOT (hashed SubPlan 1))
+         SubPlan 1
+           ->  Result
+(5 rows)
+
+with
+selected(a,b,c) as (values(1,2,3)),
+updated(d,e,f) as (values(4,5,6))
+insert into public.testing
+select * from selected
+where (a,b,c) not in (select d,e,f from updated);
+select * from public.testing;
+ a | b | c 
+---+---+---
+ 1 | 2 | 3
+(1 row)
+
+explain (costs false) with
+selected(a,b,c) as (select u, n, nn from s),
+updated(d,e,f) as (select u, n, nn from l)
+insert into public.testing
+select * from selected
+where (a,b,c) not in (select d,e,f from updated);
+                                     QUERY PLAN                                      
+-------------------------------------------------------------------------------------
+ Insert on testing
+   ->  Nested Loop Anti Join
+         Join Filter: (((s.u = l.u) AND (s.n = l.n) AND (s.nn = l.nn)) IS NOT FALSE)
+         ->  Seq Scan on s
+         ->  Materialize
+               ->  Seq Scan on l
+(6 rows)
+
+with
+selected(a,b,c) as (select u, n, nn from s),
+updated(d,e,f) as (select u, n, nn from l)
+insert into public.testing
+select * from selected
+where (a,b,c) not in (select d,e,f from updated);
+select * from public.testing;
+    a    |    b    |    c    
+---------+---------+---------
+       1 |       2 |       3
+ 1000002 | 1000002 | 1000002
+(2 rows)
+
+-- expect to get Hash Anti Join
+explain (costs false) with
+selected(a,b,c) as (select u, n, nn from s),
+updated(d,e,f) as (select u, n, nn from l)
+insert into public.testing
+select * from selected
+where a not in (select d from updated);
+                     QUERY PLAN                      
+-----------------------------------------------------
+ Insert on testing
+   InitPlan 1 (returns $0)
+     ->  Seq Scan on l l_1
+   ->  Nested Loop Anti Join
+         ->  Seq Scan on s
+               Filter: ((u IS NOT NULL) OR (NOT $0))
+         ->  Index Only Scan using l_u on l
+               Index Cond: (u = s.u)
+(8 rows)
+
+with
+selected(a,b,c) as (select u, n, nn from s),
+updated(d,e,f) as (select u, n, nn from l)
+insert into public.testing
+select * from selected
+where a not in (select d from updated);
+select * from public.testing;
+    a    |    b    |    c    
+---------+---------+---------
+       1 |       2 |       3
+ 1000002 | 1000002 | 1000002
+ 1000002 | 1000002 | 1000002
+(3 rows)
+
+explain (costs false) with
+selected(a,b,c) as (select u, n, nn from s),
+updated(d,e,f) as (select u, n, nn from l)
+insert into public.testing
+select * from selected
+where b not in (select e from updated);
+                       QUERY PLAN                       
+--------------------------------------------------------
+ Insert on testing
+   InitPlan 1 (returns $0)
+     ->  Seq Scan on l l_1
+   ->  Nested Loop Anti Join
+         ->  Seq Scan on s
+               Filter: ((n IS NOT NULL) OR (NOT $0))
+         ->  Bitmap Heap Scan on l
+               Recheck Cond: ((s.n = n) OR (n IS NULL))
+               ->  BitmapOr
+                     ->  Bitmap Index Scan on l_n
+                           Index Cond: (n = s.n)
+                     ->  Bitmap Index Scan on l_n
+                           Index Cond: (n IS NULL)
+(13 rows)
+
+with
+selected(a,b,c) as (select u, n, nn from s),
+updated(d,e,f) as (select u, n, nn from l)
+insert into public.testing
+select * from selected
+where b not in (select e from updated);
+select * from public.testing;
+    a    |    b    |    c    
+---------+---------+---------
+       1 |       2 |       3
+ 1000002 | 1000002 | 1000002
+ 1000002 | 1000002 | 1000002
+(3 rows)
+
+explain (costs false) with
+selected(a,b,c) as (select u, n, nn from s),
+updated(d,e,f) as (select u, n, nn from l)
+insert into public.testing
+select * from selected
+where b not in (select d from updated);
+                     QUERY PLAN                      
+-----------------------------------------------------
+ Insert on testing
+   InitPlan 1 (returns $0)
+     ->  Seq Scan on l l_1
+   ->  Nested Loop Anti Join
+         ->  Seq Scan on s
+               Filter: ((n IS NOT NULL) OR (NOT $0))
+         ->  Index Only Scan using l_u on l
+               Index Cond: (u = s.n)
+(8 rows)
+
+with
+selected(a,b,c) as (select u, n, nn from s),
+updated(d,e,f) as (select u, n, nn from l)
+insert into public.testing
+select * from selected
+where b not in (select d from updated);
+select * from public.testing;
+    a    |    b    |    c    
+---------+---------+---------
+       1 |       2 |       3
+ 1000002 | 1000002 | 1000002
+ 1000002 | 1000002 | 1000002
+ 1000002 | 1000002 | 1000002
+(4 rows)
+
+-- two levels of NOT IN with CTE, 2nd NOT IN
+-- subquery access CTE two levels above
+explain (costs false) with
+selected(a,b,c) as (select u, n, nn from s),
+updated(d,e,f) as (select u, n, nn from l)
+insert into public.testing
+select * from selected
+where (a,b,c) not in (select d,e,f from updated
+      where d not in (select a from selected));
+                                               QUERY PLAN                                                
+---------------------------------------------------------------------------------------------------------
+ Insert on testing
+   CTE selected
+     ->  Seq Scan on s
+   ->  Nested Loop Anti Join
+         Join Filter: (((selected.a = l.u) AND (selected.b = l.n) AND (selected.c = l.nn)) IS NOT FALSE)
+         ->  CTE Scan on selected
+         ->  Materialize
+               ->  Seq Scan on l
+                     Filter: (NOT (hashed SubPlan 3))
+                     SubPlan 3
+                       ->  CTE Scan on selected selected_1
+(11 rows)
+
+with
+selected(a,b,c) as (select u, n, nn from s),
+updated(d,e,f) as (select u, n, nn from l)
+insert into public.testing
+select * from selected
+where (a,b,c) not in (select d,e,f from updated
+      where d not in (select a from selected));
+select * from public.testing;
+    a    |    b    |    c    
+---------+---------+---------
+       1 |       2 |       3
+ 1000002 | 1000002 | 1000002
+ 1000002 | 1000002 | 1000002
+ 1000002 | 1000002 | 1000002
+       1 |       1 |       1
+       2 |       2 |       2
+ 1000002 | 1000002 | 1000002
+       3 |         |       3
+(8 rows)
+
+-- With clause inside a query block
+explain select count(distinct t.a) from
+(with
+selected(a,b,c) as (select u, n, nn from s),
+updated(d,e,f) as (select u, n, nn from l)
+select * from selected where (a,b,c) not in
+(select d,e,f from updated
+ where d not in (select a from selected))) as t;
+                                               QUERY PLAN                                                
+---------------------------------------------------------------------------------------------------------
+ Aggregate  (cost=693.75..693.76 rows=1 width=8)
+   ->  Nested Loop Anti Join  (cost=1.13..693.71 rows=3 width=12)
+         Join Filter: (((selected.a = l.u) AND (selected.b = l.n) AND (selected.c = l.nn)) IS NOT FALSE)
+         CTE selected
+           ->  Seq Scan on s  (cost=0.00..1.04 rows=4 width=12)
+         ->  CTE Scan on selected  (cost=0.00..0.08 rows=4 width=12)
+         ->  Materialize  (cost=0.09..230.09 rows=5000 width=12)
+               ->  Seq Scan on l  (cost=0.09..180.09 rows=5000 width=12)
+                     Filter: (NOT (hashed SubPlan 3))
+                     SubPlan 3
+                       ->  CTE Scan on selected selected_1  (cost=0.00..0.08 rows=4 width=4)
+(11 rows)
+
+select count(distinct t.a) from
+(with
+selected(a,b,c) as (select u, n, nn from s),
+updated(d,e,f) as (select u, n, nn from l)
+select * from selected where (a,b,c) not in
+(select d,e,f from updated
+ where d not in (select a from selected))) as t;
+ count 
+-------
+     4
+(1 row)
+
+-- With clause in subquery, can't flatten subquery to anti join
+explain (costs false) with
+selected(a,b,c) as (select u, n, nn from s)
+insert into public.testing
+select * from selected where (a,b,c) not in
+(with updated (d,e,f) as (select u, n, nn from l)
+select d,e,f from updated);
+            QUERY PLAN             
+-----------------------------------
+ Insert on testing
+   ->  Seq Scan on s
+         Filter: (NOT (SubPlan 1))
+         SubPlan 1
+           ->  Materialize
+                 ->  Seq Scan on l
+(6 rows)
+
+with
+selected(a,b,c) as (select u, n, nn from s)
+insert into public.testing
+select * from selected where (a,b,c) not in
+(with updated (d,e,f) as (select u, n, nn from l)
+select d,e,f from updated);
+select * from public.testing;
+    a    |    b    |    c    
+---------+---------+---------
+       1 |       2 |       3
+ 1000002 | 1000002 | 1000002
+ 1000002 | 1000002 | 1000002
+ 1000002 | 1000002 | 1000002
+       1 |       1 |       1
+       2 |       2 |       2
+ 1000002 | 1000002 | 1000002
+       3 |         |       3
+ 1000002 | 1000002 | 1000002
+(9 rows)
+
+-- With clause in subquery, subsubquery access CTE in subquery
+explain (costs false) with
+selected(a,b,c) as (select u, n, nn from s)
+insert into public.testing
+select * from selected where (a,b,c) not in
+(
+with updated(d,e,f) as (select u, n, nn from l)
+select d,e,f from updated where d not in (select d from updated)
+);
+                            QUERY PLAN                             
+-------------------------------------------------------------------
+ Insert on testing
+   ->  Seq Scan on s
+         Filter: (NOT (SubPlan 3))
+         SubPlan 3
+           ->  Materialize
+                 CTE updated
+                   ->  Seq Scan on l
+                 InitPlan 2 (returns $1)
+                   ->  CTE Scan on updated
+                 ->  Hash Anti Join
+                       Hash Cond: (updated_1.d = updated_2.d)
+                       ->  CTE Scan on updated updated_1
+                             Filter: ((d IS NOT NULL) OR (NOT $1))
+                       ->  Hash
+                             ->  CTE Scan on updated updated_2
+(15 rows)
+
+with
+selected(a,b,c) as (select u, n, nn from s)
+insert into public.testing
+select * from selected where (a,b,c) not in
+(
+with updated(d,e,f) as (select u, n, nn from l)
+select d,e,f from updated where d not in (select d from updated)
+);
+select * from public.testing;
+    a    |    b    |    c    
+---------+---------+---------
+       1 |       2 |       3
+ 1000002 | 1000002 | 1000002
+ 1000002 | 1000002 | 1000002
+ 1000002 | 1000002 | 1000002
+       1 |       1 |       1
+       2 |       2 |       2
+ 1000002 | 1000002 | 1000002
+       3 |         |       3
+ 1000002 | 1000002 | 1000002
+       1 |       1 |       1
+       2 |       2 |       2
+ 1000002 | 1000002 | 1000002
+       3 |         |       3
+(13 rows)
+
+-- Recursive CTE
+CREATE TABLE employees (
+  id serial,
+  name varchar(255),
+  manager_id int
+);
+INSERT INTO employees VALUES (1, 'Mark', null);
+INSERT INTO employees VALUES (2, 'John', 1);
+INSERT INTO employees VALUES (3, 'Dan', 2);
+INSERT INTO employees VALUES (4, 'Clark', 1);
+INSERT INTO employees VALUES (5, 'Linda', 2);
+INSERT INTO employees VALUES (6, 'Willy', 2);
+INSERT INTO employees VALUES (7, 'Barack', 2);
+INSERT INTO employees VALUES (8, 'Elen', 2);
+INSERT INTO employees VALUES (9, 'Kate', 3);
+INSERT INTO employees VALUES (10, 'Terry', 4);
+WITH RECURSIVE managertree AS (
+  SELECT id, name, manager_id
+  FROM employees
+  WHERE id = 2
+  UNION ALL
+  SELECT e.id, e.name, e.manager_id
+  FROM employees e
+  INNER JOIN managertree mtree ON mtree.id = e.manager_id
+)
+SELECT *
+FROM managertree;
+ id |  name  | manager_id 
+----+--------+------------
+  2 | John   |          1
+  3 | Dan    |          2
+  5 | Linda  |          2
+  6 | Willy  |          2
+  7 | Barack |          2
+  8 | Elen   |          2
+  9 | Kate   |          3
+(7 rows)
+
+-- NOT IN subquery access Recursive CTE
+EXPLAIN (COSTS FALSE) WITH RECURSIVE managertree AS (
+  SELECT id, name, manager_id
+  FROM employees
+  WHERE id = 2
+  UNION ALL
+  SELECT e.id, e.name, e.manager_id
+  FROM employees e
+  INNER JOIN managertree mtree ON mtree.id = e.manager_id
+)
+SELECT *
+FROM managertree mt WHERE mt.manager_id NOT IN (SELECT id FROM managertree);
+                          QUERY PLAN                           
+---------------------------------------------------------------
+ CTE Scan on managertree mt
+   Filter: (NOT (hashed SubPlan 2))
+   CTE managertree
+     ->  Recursive Union
+           ->  Seq Scan on employees
+                 Filter: (id = 2)
+           ->  Hash Join
+                 Hash Cond: (e.manager_id = mtree.id)
+                 ->  Seq Scan on employees e
+                 ->  Hash
+                       ->  WorkTable Scan on managertree mtree
+   SubPlan 2
+     ->  CTE Scan on managertree
+(13 rows)
+
+WITH RECURSIVE managertree AS (
+  SELECT id, name, manager_id
+  FROM employees
+  WHERE id = 2
+  UNION ALL
+  SELECT e.id, e.name, e.manager_id
+  FROM employees e
+  INNER JOIN managertree mtree ON mtree.id = e.manager_id
+)
+SELECT *
+FROM managertree mt WHERE mt.manager_id NOT IN (SELECT id FROM managertree);
+ id | name | manager_id 
+----+------+------------
+  2 | John |          1
+(1 row)
+
+-- NOT IN under UNION ALL inside Recursive CTE
+EXPLAIN (COSTS FALSE) WITH RECURSIVE managertree AS (
+  SELECT id, name, manager_id
+  FROM employees
+  WHERE id = 2
+  UNION ALL
+  SELECT e.id, e.name, e.manager_id
+  FROM employees e
+  INNER JOIN managertree mtree ON
+  (mtree.id = e.manager_id AND
+  mtree.manager_id NOT IN (SELECT manager_id FROM employees)
+  )
+)
+SELECT *
+FROM managertree;
+                          QUERY PLAN                           
+---------------------------------------------------------------
+ CTE Scan on managertree
+   CTE managertree
+     ->  Recursive Union
+           ->  Seq Scan on employees employees_1
+                 Filter: (id = 2)
+           ->  Hash Join
+                 Hash Cond: (e.manager_id = mtree.id)
+                 ->  Seq Scan on employees e
+                 ->  Hash
+                       ->  WorkTable Scan on managertree mtree
+                             Filter: (NOT (hashed SubPlan 1))
+                             SubPlan 1
+                               ->  Seq Scan on employees
+(13 rows)
+
+WITH RECURSIVE managertree AS (
+  SELECT id, name, manager_id
+  FROM employees
+  WHERE id = 2
+  UNION ALL
+  SELECT e.id, e.name, e.manager_id
+  FROM employees e
+  INNER JOIN managertree mtree ON
+  (mtree.id = e.manager_id AND
+  mtree.manager_id NOT IN (SELECT manager_id FROM employees)
+  )
+)
+SELECT *
+FROM managertree;
+ id | name | manager_id 
+----+------+------------
+  2 | John |          1
+(1 row)
+
+--Manfred-7613 CTE NOT IN with Union All
+create table cocotero as (
+	select * from(
+	values(1,2,3)) as data(a,b,c)
+);
+explain (costs off) with selected as (
+	select *
+	from cocotero
+),
+updated as (
+	update cocotero
+	set a = 3
+	from selected
+	where cocotero.a = selected.a
+	returning selected.a,selected.b,selected.c
+),
+inserted as (
+	insert into cocotero
+	select *
+	from selected
+	where a not in (select a from updated)
+	returning *
+)
+select 'updated' as action, count(*) as lines from updated
+union all
+select 'inserted' as action, count(*) as lines from inserted;
+                                      QUERY PLAN                                      
+--------------------------------------------------------------------------------------
+ Append
+   CTE selected
+     ->  Seq Scan on cocotero
+   CTE updated
+     ->  Update on cocotero cocotero_1
+           ->  Merge Join
+                 Merge Cond: (cocotero_1.a = selected.a)
+                 ->  Sort
+                       Sort Key: cocotero_1.a
+                       ->  Seq Scan on cocotero cocotero_1
+                 ->  Materialize
+                       ->  Sort
+                             Sort Key: selected.a
+                             ->  CTE Scan on selected
+   CTE inserted
+     ->  Insert on cocotero cocotero_2
+           InitPlan 3 (returns $3)
+             ->  CTE Scan on updated updated_1
+           ->  Nested Loop Anti Join
+                 Join Filter: ((selected_1.a = updated_2.a) OR (updated_2.a IS NULL))
+                 ->  CTE Scan on selected selected_1
+                       Filter: ((a IS NOT NULL) OR (NOT $3))
+                 ->  CTE Scan on updated updated_2
+   ->  Aggregate
+         ->  CTE Scan on updated
+   ->  Aggregate
+         ->  CTE Scan on inserted
+(27 rows)
+
+with selected as (
+	select *
+	from cocotero
+),
+updated as (
+	update cocotero
+	set a = 3
+	from selected
+	where cocotero.a = selected.a
+	returning selected.a,selected.b,selected.c
+),
+inserted as (
+	insert into cocotero
+	select *
+	from selected
+	where a not in (select a from updated)
+	returning *
+)
+select 'updated' as action, count(*) as lines from updated
+union all
+select 'inserted' as action, count(*) as lines from inserted;
+  action  | lines 
+----------+-------
+ updated  |     1
+ inserted |     0
+(2 rows)
+
+--test enable_not_in_transform
+explain (costs off) select count(*) from s where s.u not in (select l.u from l);
+                 QUERY PLAN                 
+--------------------------------------------
+ Aggregate
+   ->  Nested Loop Anti Join
+         ->  Seq Scan on s
+         ->  Index Only Scan using l_u on l
+               Index Cond: (u = s.u)
+(5 rows)
+
+set enable_not_in_transform = off;
+explain (costs off) select count(*) from s where s.u not in (select l.u from l);
+            QUERY PLAN             
+-----------------------------------
+ Aggregate
+   ->  Seq Scan on s
+         Filter: (NOT (SubPlan 1))
+         SubPlan 1
+           ->  Materialize
+                 ->  Seq Scan on l
+(6 rows)
+
+-- clean up
+reset work_mem;
+reset enable_not_in_transform;
+drop table s;
+drop table s1;
+drop table l;
+drop table empty;
+drop table public.testing;
+drop table employees;
+drop table cocotero;
diff --git a/src/test/regress/expected/sysviews.out b/src/test/regress/expected/sysviews.out
index 715842b..6e3aee8 100644
--- a/src/test/regress/expected/sysviews.out
+++ b/src/test/regress/expected/sysviews.out
@@ -83,6 +83,7 @@ select name, setting from pg_settings where name like 'enable%';
  enable_material                | on
  enable_mergejoin               | on
  enable_nestloop                | on
+ enable_not_in_transform        | on
  enable_parallel_append         | on
  enable_parallel_hash           | on
  enable_partition_pruning       | on
@@ -91,7 +92,7 @@ select name, setting from pg_settings where name like 'enable%';
  enable_seqscan                 | on
  enable_sort                    | on
  enable_tidscan                 | on
-(19 rows)
+(20 rows)
 
 -- Test that the pg_timezone_names and pg_timezone_abbrevs views are
 -- more-or-less working.  We can't test their contents in any great detail
diff --git a/src/test/regress/sql/subselect.sql b/src/test/regress/sql/subselect.sql
index 893d8d0..4553a9f 100644
--- a/src/test/regress/sql/subselect.sql
+++ b/src/test/regress/sql/subselect.sql
@@ -831,3 +831,818 @@ select * from (with x as (select 2 as y) select * from x) ss;
 explain (verbose, costs off)
 with x as (select * from subselect_tbl)
 select * from x for update;
+
+-- test NON IN to ANTI JOIN conversion
+CREATE TABLE s (u INTEGER NOT NULL, n INTEGER NULL, nn INTEGER NOT NULL, p VARCHAR(128) NULL);
+insert into s (u, n, nn, p)
+    select
+    generate_series(1,3) as u,
+	generate_series(1,3) as n,
+	generate_series(1,3) as nn,
+	'foo' as p;
+insert into s values(1000002, 1000002, 1000002, 'foofoo');
+UPDATE s set n = NULL WHERE n = 3;
+analyze s;
+
+CREATE TABLE l (u INTEGER NOT NULL, n INTEGER NULL, nn INTEGER NOT NULL, p VARCHAR(128) NULL);
+insert into l (u, n, nn, p)
+	select
+    generate_series(1,10000 ) as u,
+	generate_series(1,10000 ) as n,
+	generate_series(1,10000 ) as nn,
+	'bar' as p;
+UPDATE l set n = NULL WHERE n = 7;
+
+CREATE UNIQUE INDEX l_u ON l (u);
+CREATE INDEX l_n ON l (n);
+CREATE INDEX l_nn ON l (nn);
+analyze l;
+
+CREATE TABLE s1 (u INTEGER NOT NULL, n INTEGER NULL, n1 INTEGER NULL, nn INTEGER NOT NULL, p VARCHAR(128) NULL);
+insert into s1 (u, n, n1, nn, p)
+    select
+    generate_series(1,3) as u,
+	generate_series(1,3) as n,
+	generate_series(1,3) as n1,
+	generate_series(1,3) as nn,
+	'foo' as p;
+insert into s1 values(1000003, 1000003, 1000003, 1000003, 'foofoo');
+insert into s1 values(1003, 1003, 1003, 1003, 'foofoo');
+UPDATE s1 set n = NULL WHERE n = 3;
+UPDATE s1 set n1 = NULL WHERE n = 2;
+UPDATE s1 set n1 = NULL WHERE n1 = 3;
+analyze s1;
+
+CREATE TABLE empty (u INTEGER NOT NULL, n INTEGER NULL, nn INTEGER NOT NULL, p VARCHAR(128) NULL);
+analyze empty;
+
+-- set work_mem to 64KB so that NOT IN to ANTI JOIN optimization will kick in
+set work_mem = 64;
+
+-- correctness test 1: inner empty, return every thing from outer including NULL
+explain (costs false) select * from s where n not in (select n from empty);
+
+select * from s where n not in (select n from empty);
+
+-- correctness test 2: inner has NULL, return empty result
+explain (costs false) select * from s where n not in (select n from l);
+
+select * from s where n not in (select n from l);
+
+-- correctness test 3: inner non-null, result has no NULL
+explain (costs false) select * from s where n not in (select u from l);
+
+select * from s where n not in (select u from l);
+
+-- correctness test 4: inner has predicate
+explain (costs false) select * from s where n not in (select n from l where u > 7);
+
+select * from s where n not in (select n from l where u > 7);
+
+-- correctness test 5: multi-expression, (2, 2, null, 2, foo) should be in the result
+explain (costs false) select * from s1 where (n,n1) not in (select u,nn from l where u >= 3);
+
+select * from s1 where (n,n1) not in (select u,nn from l where u >= 3);
+
+-- correctness test 6: multi-expression, (3, null, null, 3, foo) should not be in the result
+explain (costs false) select * from s1 where (n,n1) not in (select u,nn from l where u > 0);
+
+select * from s1 where (n,n1) not in (select u,nn from l where u > 0);
+
+-- correctness test 6: multi-expression, (3, null, null, 3, foo) should be in the result
+explain (costs false) select * from s1 where (n,n1) not in (select u,nn from l where u < 0);
+
+select * from s1 where (n,n1) not in (select u,nn from l where u < 0);
+
+-- test using hashed subplan when inner fits in work_mem
+explain (costs false) select * from l where n not in (select n from s);
+
+select * from l where n not in (select n from s);
+
+-- test single expression
+explain (costs false) select * from s where n not in (select n from l);
+
+select * from s where n not in (select n from l);
+
+explain (costs false) select * from s where u not in (select u from l);
+
+select * from s where u not in (select u from l);
+
+explain (costs false) select * from s where 3*n not in (select n from l);
+
+select * from s where 3*n not in (select n from l);
+
+explain (costs false) select * from s where n not in (select 3*n from l);
+
+select * from s where n not in (select 3*n from l);
+
+-- test single expression with predicates
+explain (costs false) select * from s where n not in (select n from l where u > 0);
+
+select * from s where n not in (select n from l where u > 0);
+
+explain (costs false) select * from s where n not in (select n from l where u > 100);
+
+select * from s where n not in (select n from l where u > 100);
+
+-- test multi expression
+explain (costs false) select * from s where (n,u) not in (select n,u from l);
+
+select * from s where (n,u) not in (select n,u from l);
+
+explain (costs false) select * from s where (u, nn) not in (select u, nn from l);
+
+select * from s where (u, nn) not in (select u, nn from l);
+
+explain (costs false) select * from s where (n,u) not in (select u,n from l);
+
+select * from s where (n,u) not in (select u,n from l);
+
+explain (costs false) select * from s where (n,u,nn) not in (select u,n,nn from l);
+
+select * from s where (n,u,nn) not in (select u,n,nn from l);
+
+explain (costs false) select * from s where (n,u,nn) not in (select u,n,nn from l where u > 1000);
+
+select * from s where (n,u,nn) not in (select u,n,nn from l where u > 1000);
+
+explain (costs false) select * from s where (n,u,nn) not in (select u,n,nn from l where u > 0);
+
+select * from s where (n,u,nn) not in (select u,n,nn from l where u > 0);
+
+explain (costs false) select * from s where (n,u,nn) not in (select u,n,nn from l where u > 1);
+
+select * from s where (n,u,nn) not in (select u,n,nn from l where u > 1);
+
+-- test multi-table
+explain (costs false) select count(*) from s, l where s.n not in (select n from l);
+
+select count(*) from s, l where s.n not in (select n from l);
+
+explain (costs false) select count(*) from s, l where s.nn not in (select nn from l);
+
+select count(*) from s, l where s.nn not in (select nn from l);
+
+-- test null padded results from outer join
+explain (costs false) select * from s where n not in (select s.nn from l left join s on l.nn = s.nn);
+
+select * from s where n not in (select s.nn from l left join s on l.nn = s.nn);
+
+explain (costs false) select * from s where n not in (select s.nn from s right join l on s.nn = l.nn);
+
+select * from s where n not in (select s.nn from s right join l on s.nn = l.nn);
+
+explain (costs false) select count(*) from s right join l on s.nn = l.nn where l.nn not in (select nn from s);
+
+select count(*) from s right join l on s.nn = l.nn where l.nn not in (select nn from s);
+
+explain (costs false) select count(*) from s right join l on s.nn = l.nn where s.nn not in (select nn from s);
+
+select count(*) from s right join l on s.nn = l.nn where s.nn not in (select nn from s);
+
+explain (costs false) select count(*) from s right join l on s.nn=l.nn where l.nn not in (select l.nn from l left join s on l.nn = s.nn);
+
+select count(*) from s right join l on s.nn=l.nn where l.nn not in (select l.nn from l left join s on l.nn = s.nn);
+
+explain (costs false) select count(*) from s right join l on s.nn=l.nn where s.nn not in (select s.nn from l left join s on l.nn = s.nn);
+
+select count(*) from s right join l on s.nn=l.nn where s.nn not in (select s.nn from l left join s on l.nn = s.nn);
+
+explain (costs false) select count(*) from s left join s1 on s.u=s1.u join l on s.u=l.u where s.nn not in (select nn from l);
+
+select count(*) from s left join s1 on s.u=s1.u join l on s.u=l.u where s.nn not in (select nn from l);
+
+explain (costs false) select count(*) from s left join s1 on s.u=s1.u right join l on s.u=l.u where s.nn not in (select nn from l);
+
+select count(*) from s left join s1 on s.u=s1.u right join l on s.u=l.u where s.nn not in (select nn from l);
+
+explain (costs false) select count(*) from s left join s1 on s.u=s1.u left join l on s.u=l.u where s.nn not in (select nn from l);
+
+select count(*) from s left join s1 on s.u=s1.u left join l on s.u=l.u where s.nn not in (select nn from l);
+
+explain (costs false) select count(*) from s right join s1 on s.u=s1.u join l on s.u=l.u where s.nn not in (select nn from l);
+
+select count(*) from s right join s1 on s.u=s1.u join l on s.u=l.u where s.nn not in (select nn from l);
+
+explain (costs false) select count(*) from s join s1 on s.u=s1.u right join l on s.u=l.u where s.nn not in (select nn from l);
+
+select * from s join s1 on s.u=s1.u right join l on s.u=l.u where s.nn not in (select nn from l);
+
+explain (costs false) select count(*) from s full join s1 on s.u=s1.u join l on s.u=l.u where s.nn not in (select nn from l);
+
+select count(*) from s full join s1 on s.u=s1.u join l on s.u=l.u where s.nn not in (select nn from l);
+
+explain (costs false) select count(*) from s join s1 on s.u=s1.u full join l on s.u=l.u where s.nn not in (select nn from l);
+
+select count(*) from s join s1 on s.u=s1.u full join l on s.u=l.u where s.nn not in (select nn from l);
+
+explain (costs false) select * from s where s.nn not in (select l.nn from l left join s on l.nn=s.nn left join s1 on l.nn=s1.nn);
+
+select * from s where s.nn not in (select l.nn from l left join s on l.nn=s.nn left join s1 on l.nn=s1.nn);
+
+explain (costs false) select * from s where s.nn not in (select l.nn from l left join s on l.nn=s.nn right join s1 on l.nn=s1.nn);
+
+select * from s where s.nn not in (select l.nn from l left join s on l.nn=s.nn right join s1 on l.nn=s1.nn);
+
+explain (costs false) select * from s where (n,u,nn) not in (select l.n,l.u,l.nn from l left join s on l.nn = s.nn);
+
+select * from s where (n,u,nn) not in (select l.n,l.u,l.nn from l left join s on l.nn = s.nn);
+
+explain (costs false) select * from s where (n,u,nn) not in (select l.n,l.u,l.nn from l right join s on l.nn = s.nn);
+
+select * from s where (n,u,nn) not in (select l.n,l.u,l.nn from l left join s on l.nn = s.nn);
+
+--test reduce outer joins from outer query
+explain (costs false) select count(*) from s right join l on s.nn = l.nn where s.nn not in (select nn from l);
+
+select count(*) from s right join l on s.nn = l.nn where s.nn not in (select nn from l);
+
+explain (costs false) select count(*) from s right join l on s.nn = l.nn where s.nn not in (select nn from l) and s.u>0;
+
+select count(*) from s right join l on s.nn = l.nn where s.nn not in (select nn from l) and s.u>0;
+
+explain (costs false) select count(*) from s right join l on s.nn = l.nn join s1 on s.u = s1.u where s.nn not in (select nn from l);
+
+select count(*) from s right join l on s.nn = l.nn join s1 on s.u = s1.u where s.nn not in (select nn from l);
+
+explain (costs false) select count(*) from s right join l on s.nn = l.nn right join s1 on s.u = s1.u where s.nn not in (select nn from l);
+
+select count(*) from s right join l on s.nn = l.nn right join s1 on s.u = s1.u where s.nn not in (select nn from l);
+
+explain (costs false) select count(*) from s right join l on s.nn = l.nn left join s1 on s.u = s1.u where s.nn not in (select nn from l);
+
+select count(*) from s right join l on s.nn = l.nn left join s1 on s.u = s1.u where s.nn not in (select nn from l);
+
+--test reduce outer joins from subquery
+explain (costs false) select * from s where nn not in (select l.nn from l right join s on l.nn = s.nn);
+
+select * from s where nn not in (select l.nn from l right join s on l.nn = s.nn);
+
+explain (costs false) select * from s where nn not in (select l.nn from l right join s on l.nn = s.nn where l.u > 9);
+
+select * from s where nn not in (select l.nn from l right join s on l.nn = s.nn where l.u > 9);
+
+explain (costs false) select * from s where nn not in (select l.nn from l right join s on l.nn = s.nn where s.u > 9);
+
+select * from s where nn not in (select l.nn from l right join s on l.nn = s.nn where s.u > 9);
+
+explain (costs false) select * from s where nn not in (select l.nn from l right join s on l.nn = s.nn join s1 on l.n = s1.n);
+
+select * from s where nn not in (select l.nn from l right join s on l.nn = s.nn join s1 on l.n = s1.n);
+
+explain (costs false) select * from s where nn not in (select l.nn from l right join s on l.nn = s.nn right join s1 on l.n = s1.n);
+
+select * from s where nn not in (select l.nn from l right join s on l.nn = s.nn right join s1 on l.n = s1.n);
+
+explain (costs false) select * from s where nn not in (select l.nn from l right join s on l.nn = s.nn left join s1 on l.n = s1.n);
+
+select * from s where nn not in (select l.nn from l right join s on l.nn = s.nn left join s1 on l.n = s1.n);
+
+--test reduce outer join on outer and sub-query
+explain (costs false) select count(*) from s right join l on s.nn = l.nn join s1 on s.u = s1.u where s.nn not in (select l.nn from l right join s on l.nn = s.nn join s1 on l.n = s1.n);
+
+select count(*) from s right join l on s.nn = l.nn join s1 on s.u = s1.u where s.nn not in (select l.nn from l right join s on l.nn = s.nn join s1 on l.n = s1.n);
+
+explain (costs false) select count(*) from s right join l on s.nn = l.nn left join s1 on s.u = s1.u where s.nn not in (select l.nn from l right join s on l.nn = s.nn left join s1 on l.n = s1.n);
+
+select count(*) from s right join l on s.nn = l.nn left join s1 on s.u = s1.u where s.nn not in (select l.nn from l right join s on l.nn = s.nn left join s1 on l.n = s1.n);
+
+-- test union all
+explain (costs false) select * from s as t where not exists
+(select 1 from (select n as y from l union all
+				select u as y from s union all
+				select nn as y from s) as v where t.n=v.y or v.y is null) and n is not null;
+
+select * from s as t where not exists
+(select 1 from (select n as y from l union all
+				select u as y from s union all
+				select nn as y from s) as v where t.n=v.y or v.y is null) and n is not null;
+
+explain (costs false) select * from s where n not in
+(select n as y from l union all
+ select u as y from s union all
+ select nn as y from s);
+
+select * from s where n not in
+(select n as y from l union all
+ select u as y from s union all
+ select nn as y from s);
+
+explain (costs false) select count(*) from
+(select n as x from s union all select u as x from l) t where t.x not in
+(select nn from l);
+
+select count(*) from
+(select n as x from s union all select u as x from l) t where t.x not in
+(select nn from l);
+
+explain (costs false) select count(*) from
+(select n as x from s union all select n as x from l) t where t.x not in
+(select nn from empty);
+
+select count(*) from
+(select n as x from s union all select n as x from l) t where t.x not in
+(select nn from empty);
+
+explain (costs false) select count(*) from
+(select n as x from s union all select u as x from l) t where t.x not in
+(select n as y from l union all
+ select u as y from s union all
+ select nn as y from s);
+
+select count(*) from
+(select n as x from s union all select u as x from l) t where t.x not in
+(select n as y from l union all
+ select u as y from s union all
+ select nn as y from s);
+
+-- test multi-levels of NOT IN
+explain (costs false) select * from s where n not in (select n from s where n not in (select n from l));
+
+select * from s where n not in (select n from s where n not in (select n from l));
+
+explain (costs false) select * from s where n not in (select n from s where n not in (select u from l));
+
+select * from s where n not in (select n from s where n not in (select u from l));
+
+explain (costs false) select count(*) from s where u not in
+(select n from s1 where not exists
+ (select 1 from (select n from s1 where u not in (select n from l)) t where t.n = s.n));
+
+select count(*) from s where u not in
+(select n from s1 where not exists
+ (select 1 from (select n from s1 where u not in (select n from l)) t where t.n = s.n));
+
+explain (costs false) select * from s where n not in (select n from s1) and u not in (select u from s1) and nn not in (select nn from s1);
+
+select * from s where n not in (select n from s1) and u not in (select u from s1) and nn not in (select nn from s1);
+
+explain (costs false) select * from s where n not in (select n from s1) and u not in (select u from s1) and nn not in (select nn from l);
+
+select * from s where n not in (select n from s1) and u not in (select u from s1) and nn not in (select nn from l);
+
+explain (costs false) select count(*) from s where u not in
+(select n from s1 where not exists
+ (select 1 from (select n from s1 where u not in (select n from l)) t where t.n = s.n))
+and nn not in
+(select n from s1 where not exists
+ (select 1 from (select n from s1 where u not in (select n from l)) t where t.n = s.n));
+
+select count(*) from s where u not in
+(select n from s1 where not exists
+ (select 1 from (select n from s1 where u not in (select n from l)) t where t.n = s.n))
+and nn not in
+(select n from s1 where not exists
+ (select 1 from (select n from s1 where u not in (select n from l)) t where t.n = s.n));
+
+--test COALESCE
+explain (costs false) select * from s where COALESCE(n, -1) not in (select COALESCE(n, -1) from l);
+
+select * from s where COALESCE(n, -1) not in (select COALESCE(n, -1) from l);
+
+explain (costs false) select * from s where COALESCE(n, NULL, -1) not in (select COALESCE(n, NULL, -1) from l);
+
+select * from s where COALESCE(n, NULL, -1) not in (select COALESCE(n, NULL, -1) from l);
+
+explain (costs false) select * from s where COALESCE(n, NULL, NULL) not in (select COALESCE(n, NULL, NULL) from l);
+
+select * from s where COALESCE(n, NULL, NULL) not in (select COALESCE(n, NULL, NULL) from l);
+
+explain (costs false) select * from s where COALESCE(n, nn) not in (select COALESCE(n, nn) from l);
+
+select * from s where COALESCE(n, nn) not in (select COALESCE(n, nn) from l);
+
+explain (costs false) select * from s where COALESCE(nn, NULL) not in (select COALESCE(nn, NULL) from l);
+
+select * from s where COALESCE(nn, NULL) not in (select COALESCE(nn, NULL) from l);
+
+explain (costs false) select * from s where (COALESCE(n, -1), nn, COALESCE(n, u)) not in (select COALESCE(n, -1), nn, COALESCE(n, u) from l);
+
+select * from s where (COALESCE(n, -1), nn, COALESCE(n, u)) not in (select COALESCE(n, -1), nn, COALESCE(n, u) from l);
+
+-- test miscellaneous outer nullable cases
+
+explain (costs false) select * from s where (n,n) not in (select n,n from l);
+
+select * from s where (n,n) not in (select n,n from l);
+
+explain (costs false) select * from s right join l on s.nn = l.nn where (s.n,s.u,s.nn) not in (select n,u,nn from l);
+
+select * from s right join l on s.nn = l.nn where (s.n,s.u,s.nn) not in (select n,u,nn from l);
+
+explain (costs false) select count(*) from s right join l on s.nn = l.nn where (s.n,s.u,s.nn) not in (select n,u,nn from l where u < 0);
+
+select count(*) from s right join l on s.nn = l.nn where (s.n,s.u,s.nn) not in (select n,u,nn from l where u < 0);
+
+explain (costs false) select * from s where (n,n,n) not in (select distinct n,n,n from l where u > 0 limit 3) order by n;
+
+select * from s where (n,n,n) not in (select distinct n,n,n from l where u > 0 limit 3) order by n;
+
+--test outer has strict predicate or inner join
+explain (costs false) select * from s where n not in (select n from l) and n > 0;
+
+select * from s where n not in (select n from l) and n > 0;
+
+explain (costs false) select * from s where n not in (select n from l) and u > 0;
+
+select * from s where n not in (select n from l) and u > 0;
+
+explain (costs false) select * from s where n not in (select n from l) and n is not null;
+
+select * from s where n not in (select n from l) and n is not null;
+
+explain (costs false) select * from s join l on s.n = l.n where s.n not in (select n from l);
+
+select * from s join l on s.n = l.n where s.n not in (select n from l);
+
+explain (costs false) select count(*) from s right join l on s.n = l.n where s.n not in (select n from l);
+
+select count(*) from s right join l on s.n = l.n where s.n not in (select n from l);
+
+explain (costs false) select count(*) from s right join l on s.n = l.n join s1 on s.u = s1.u where s.n not in (select n from l);
+
+select count(*) from s right join l on s.n = l.n join s1 on s.u = s1.u where s.n not in (select n from l);
+
+explain (costs false) select count(*) from s join l on s.n = l.n right join s1 on s.u = s1.u where s.n not in (select n from l);
+
+select count(*) from s join l on s.n = l.n right join s1 on s.u = s1.u where s.n not in (select n from l);
+
+--test inner has strict predicate or inner join
+explain (costs false) select * from s where u not in (select n from l where n > 0);
+
+select * from s where u not in (select n from l where n > 0);
+
+explain (costs false) select * from s where u not in (select n from l where u > 0);
+
+select * from s where u not in (select n from l where u > 0);
+
+explain (costs false) select * from s where u not in (select n from l where n is not null);
+
+select * from s where u not in (select n from l where n is not null);
+
+explain (costs false) select * from s where u not in (select l.n from l join s on l.n=s.n);
+
+select * from s where u not in (select l.n from l join s on l.n=s.n);
+
+explain (costs false) select * from s where u not in (select l.n from l join s on l.u=s.u);
+
+select * from s where u not in (select l.n from l join s on l.u=s.u);
+
+explain (costs false) select * from s where u not in (select l.n from l join s on l.n = s.n);
+
+select * from s where u not in (select l.n from l join s on l.n = s.n);
+
+explain (costs false) select * from s where u not in (select l.n from l right join s on l.n = s.n);
+
+select * from s where u not in (select l.n from l right join s on l.n = s.n);
+
+explain (costs false) select * from s where u not in (select l.n from l right join s on l.n=s.n join s1 on l.n=s1.n);
+
+select * from s where u not in (select l.n from l right join s on l.n=s.n join s1 on l.n=s1.n);
+
+explain (costs false) select * from s where u not in (select l.n from l join s on l.n=s.n right join s1 on l.n=s1.n);
+
+select * from s where u not in (select l.n from l join s on l.n=s.n right join s1 on l.n=s1.n);
+
+--test both sides have strict predicate or inner join
+explain (costs false) select * from s where n not in (select n from l where n > 0) and n > 0;
+
+select * from s where n not in (select n from l where n > 0) and n > 0;
+
+explain (costs false) select * from s where n not in (select n from l where u > 0) and n > 0;
+
+select * from s where n not in (select n from l where u > 0) and n > 0;
+
+explain (costs false) select * from s where n not in (select n from l where n > 0) and u > 0;
+
+select * from s where n not in (select n from l where n > 0) and u > 0;
+
+explain (costs false) select * from s right join l on s.n = l.n join s1 on s.u = s1.u where s.n not in (select l.n from l right join s on l.n=s.n join s s1 on l.n=s1.n);
+
+select * from s right join l on s.n = l.n join s1 on s.u = s1.u where s.n not in (select l.n from l right join s on l.n=s.n join s s1 on l.n=s1.n);
+
+explain (costs false) select * from s right join l on s.n = l.n join s1 on s.u = s1.u where s.n not in (select l.n from l join s on l.n=s.n right join s s1 on l.n=s1.n);
+
+select * from s right join l on s.n = l.n join s1 on s.u = s1.u where s.n not in (select l.n from l join s on l.n=s.n right join s s1 on l.n=s1.n);
+
+explain (costs false) select * from s join l on s.n = l.n right join s1 on s.u = s1.u where s.n not in (select l.n from l join s on l.n=s.n right join s s1 on l.n=s1.n);
+
+select * from s join l on s.n = l.n right join s1 on s.u = s1.u where s.n not in (select l.n from l join s on l.n=s.n right join s s1 on l.n=s1.n);
+
+--JIRA-7279 CTE with NOT IN
+create table public.testing
+(
+a integer,
+b integer,
+c integer
+);
+
+explain (costs false) with
+selected(a,b,c) as (values(1,2,3)),
+updated(d,e,f) as (values(4,5,6))
+insert into public.testing
+select * from selected
+where (a,b,c) not in (select d,e,f from updated);
+
+with
+selected(a,b,c) as (values(1,2,3)),
+updated(d,e,f) as (values(4,5,6))
+insert into public.testing
+select * from selected
+where (a,b,c) not in (select d,e,f from updated);
+
+select * from public.testing;
+
+explain (costs false) with
+selected(a,b,c) as (select u, n, nn from s),
+updated(d,e,f) as (select u, n, nn from l)
+insert into public.testing
+select * from selected
+where (a,b,c) not in (select d,e,f from updated);
+
+with
+selected(a,b,c) as (select u, n, nn from s),
+updated(d,e,f) as (select u, n, nn from l)
+insert into public.testing
+select * from selected
+where (a,b,c) not in (select d,e,f from updated);
+
+select * from public.testing;
+
+-- expect to get Hash Anti Join
+explain (costs false) with
+selected(a,b,c) as (select u, n, nn from s),
+updated(d,e,f) as (select u, n, nn from l)
+insert into public.testing
+select * from selected
+where a not in (select d from updated);
+
+with
+selected(a,b,c) as (select u, n, nn from s),
+updated(d,e,f) as (select u, n, nn from l)
+insert into public.testing
+select * from selected
+where a not in (select d from updated);
+
+select * from public.testing;
+
+explain (costs false) with
+selected(a,b,c) as (select u, n, nn from s),
+updated(d,e,f) as (select u, n, nn from l)
+insert into public.testing
+select * from selected
+where b not in (select e from updated);
+
+with
+selected(a,b,c) as (select u, n, nn from s),
+updated(d,e,f) as (select u, n, nn from l)
+insert into public.testing
+select * from selected
+where b not in (select e from updated);
+
+select * from public.testing;
+
+explain (costs false) with
+selected(a,b,c) as (select u, n, nn from s),
+updated(d,e,f) as (select u, n, nn from l)
+insert into public.testing
+select * from selected
+where b not in (select d from updated);
+
+with
+selected(a,b,c) as (select u, n, nn from s),
+updated(d,e,f) as (select u, n, nn from l)
+insert into public.testing
+select * from selected
+where b not in (select d from updated);
+
+select * from public.testing;
+
+-- two levels of NOT IN with CTE, 2nd NOT IN
+-- subquery access CTE two levels above
+explain (costs false) with
+selected(a,b,c) as (select u, n, nn from s),
+updated(d,e,f) as (select u, n, nn from l)
+insert into public.testing
+select * from selected
+where (a,b,c) not in (select d,e,f from updated
+      where d not in (select a from selected));
+
+with
+selected(a,b,c) as (select u, n, nn from s),
+updated(d,e,f) as (select u, n, nn from l)
+insert into public.testing
+select * from selected
+where (a,b,c) not in (select d,e,f from updated
+      where d not in (select a from selected));
+
+select * from public.testing;
+
+-- With clause inside a query block
+explain select count(distinct t.a) from
+(with
+selected(a,b,c) as (select u, n, nn from s),
+updated(d,e,f) as (select u, n, nn from l)
+select * from selected where (a,b,c) not in
+(select d,e,f from updated
+ where d not in (select a from selected))) as t;
+
+select count(distinct t.a) from
+(with
+selected(a,b,c) as (select u, n, nn from s),
+updated(d,e,f) as (select u, n, nn from l)
+select * from selected where (a,b,c) not in
+(select d,e,f from updated
+ where d not in (select a from selected))) as t;
+
+-- With clause in subquery, can't flatten subquery to anti join
+explain (costs false) with
+selected(a,b,c) as (select u, n, nn from s)
+insert into public.testing
+select * from selected where (a,b,c) not in
+(with updated (d,e,f) as (select u, n, nn from l)
+select d,e,f from updated);
+
+with
+selected(a,b,c) as (select u, n, nn from s)
+insert into public.testing
+select * from selected where (a,b,c) not in
+(with updated (d,e,f) as (select u, n, nn from l)
+select d,e,f from updated);
+
+select * from public.testing;
+
+-- With clause in subquery, subsubquery access CTE in subquery
+explain (costs false) with
+selected(a,b,c) as (select u, n, nn from s)
+insert into public.testing
+select * from selected where (a,b,c) not in
+(
+with updated(d,e,f) as (select u, n, nn from l)
+select d,e,f from updated where d not in (select d from updated)
+);
+
+with
+selected(a,b,c) as (select u, n, nn from s)
+insert into public.testing
+select * from selected where (a,b,c) not in
+(
+with updated(d,e,f) as (select u, n, nn from l)
+select d,e,f from updated where d not in (select d from updated)
+);
+
+select * from public.testing;
+
+-- Recursive CTE
+CREATE TABLE employees (
+  id serial,
+  name varchar(255),
+  manager_id int
+);
+
+INSERT INTO employees VALUES (1, 'Mark', null);
+INSERT INTO employees VALUES (2, 'John', 1);
+INSERT INTO employees VALUES (3, 'Dan', 2);
+INSERT INTO employees VALUES (4, 'Clark', 1);
+INSERT INTO employees VALUES (5, 'Linda', 2);
+INSERT INTO employees VALUES (6, 'Willy', 2);
+INSERT INTO employees VALUES (7, 'Barack', 2);
+INSERT INTO employees VALUES (8, 'Elen', 2);
+INSERT INTO employees VALUES (9, 'Kate', 3);
+INSERT INTO employees VALUES (10, 'Terry', 4);
+
+WITH RECURSIVE managertree AS (
+  SELECT id, name, manager_id
+  FROM employees
+  WHERE id = 2
+  UNION ALL
+  SELECT e.id, e.name, e.manager_id
+  FROM employees e
+  INNER JOIN managertree mtree ON mtree.id = e.manager_id
+)
+SELECT *
+FROM managertree;
+
+-- NOT IN subquery access Recursive CTE
+EXPLAIN (COSTS FALSE) WITH RECURSIVE managertree AS (
+  SELECT id, name, manager_id
+  FROM employees
+  WHERE id = 2
+  UNION ALL
+  SELECT e.id, e.name, e.manager_id
+  FROM employees e
+  INNER JOIN managertree mtree ON mtree.id = e.manager_id
+)
+SELECT *
+FROM managertree mt WHERE mt.manager_id NOT IN (SELECT id FROM managertree);
+
+WITH RECURSIVE managertree AS (
+  SELECT id, name, manager_id
+  FROM employees
+  WHERE id = 2
+  UNION ALL
+  SELECT e.id, e.name, e.manager_id
+  FROM employees e
+  INNER JOIN managertree mtree ON mtree.id = e.manager_id
+)
+SELECT *
+FROM managertree mt WHERE mt.manager_id NOT IN (SELECT id FROM managertree);
+
+-- NOT IN under UNION ALL inside Recursive CTE
+EXPLAIN (COSTS FALSE) WITH RECURSIVE managertree AS (
+  SELECT id, name, manager_id
+  FROM employees
+  WHERE id = 2
+  UNION ALL
+  SELECT e.id, e.name, e.manager_id
+  FROM employees e
+  INNER JOIN managertree mtree ON
+  (mtree.id = e.manager_id AND
+  mtree.manager_id NOT IN (SELECT manager_id FROM employees)
+  )
+)
+SELECT *
+FROM managertree;
+
+WITH RECURSIVE managertree AS (
+  SELECT id, name, manager_id
+  FROM employees
+  WHERE id = 2
+  UNION ALL
+  SELECT e.id, e.name, e.manager_id
+  FROM employees e
+  INNER JOIN managertree mtree ON
+  (mtree.id = e.manager_id AND
+  mtree.manager_id NOT IN (SELECT manager_id FROM employees)
+  )
+)
+SELECT *
+FROM managertree;
+
+--Manfred-7613 CTE NOT IN with Union All
+create table cocotero as (
+	select * from(
+	values(1,2,3)) as data(a,b,c)
+);
+
+explain (costs off) with selected as (
+	select *
+	from cocotero
+),
+updated as (
+	update cocotero
+	set a = 3
+	from selected
+	where cocotero.a = selected.a
+	returning selected.a,selected.b,selected.c
+),
+inserted as (
+	insert into cocotero
+	select *
+	from selected
+	where a not in (select a from updated)
+	returning *
+)
+select 'updated' as action, count(*) as lines from updated
+union all
+select 'inserted' as action, count(*) as lines from inserted;
+
+with selected as (
+	select *
+	from cocotero
+),
+updated as (
+	update cocotero
+	set a = 3
+	from selected
+	where cocotero.a = selected.a
+	returning selected.a,selected.b,selected.c
+),
+inserted as (
+	insert into cocotero
+	select *
+	from selected
+	where a not in (select a from updated)
+	returning *
+)
+select 'updated' as action, count(*) as lines from updated
+union all
+select 'inserted' as action, count(*) as lines from inserted;
+
+--test enable_not_in_transform
+explain (costs off) select count(*) from s where s.u not in (select l.u from l);
+
+set enable_not_in_transform = off;
+
+explain (costs off) select count(*) from s where s.u not in (select l.u from l);
+
+-- clean up
+reset work_mem;
+reset enable_not_in_transform;
+drop table s;
+drop table s1;
+drop table l;
+drop table empty;
+drop table public.testing;
+drop table employees;
+drop table cocotero;
#61Li, Zheng
zhelli@amazon.com
In reply to: Tom Lane (#60)
Re: NOT IN subquery optimization

Hi Tom,

Thanks for the feedback.

* I find it entirely unacceptable to stick some planner temporary
fields into struct SubLink. If you need that storage you'll have
to find some other place to put it. But in point of fact I don't
think you need it; it doesn't look to me to be critical to generate
the subquery's plan any earlier than make_subplan would have done it.
Moreover, you should really strive to *not* do that, because it's
likely to get in the way of other future optimizations. As the
existing comment in make_subplan already suggests, we might want to
delay subplan planning even further than that in future.

The reason for calling make_subplan this early is that we want to
Call subplan_is_hashable(plan), to decide whether to proceed with the proposed
transformation. We try to stick with the fast hashed subplan when possible to avoid
potential performance degradation from the transformation which may restrict the
planner to choose Nested Loop Anti Join in order to handle Null properly,
the following is an example from subselect.out:
explain (costs false) select * from s where n not in (select u from l);
QUERY PLAN
-----------------------------------------------
Nested Loop Anti Join
InitPlan 1 (returns $0)
-> Seq Scan on l l_1
-> Seq Scan on s
Filter: ((n IS NOT NULL) OR (NOT $0))
-> Index Only Scan using l_u on l
Index Cond: (u = s.n)

However, if the subplan is not hashable, the above Nested Loop Anti Join is
actually faster.

* I'm also not too happy with the (undocumented) rearrangement of
reduce_outer_joins. There's a specific sequence of processing that
that's involved in, as documented at the top of prepjointree.c, and
I doubt that you can just randomly call it from other places and expect
good results. In particular, since JOIN alias var flattening won't have
happened yet when this code is being run from pull_up_sublinks, it's
unlikely that reduce_outer_joins will reliably get the same answers it
would get normally. I also wonder whether it's safe to make the
parsetree changes it makes earlier than normal, and whether it will be
problematic to run it twice on the same tree, and whether its rather
indirect connection to distribute_qual_to_rels is going to misbehave.

The rearrangement of reduce_outer_joins was to make the null test function
is_node_nonnullable() more accurate. Later we added strict predicates logic in
is_node_nonnullable(), so I think we can get rid of the rearrangement of
reduce_outer_joins now without losing accuracy.

* The proposed test additions seem to about triple the runtime of
subselect.sql. This seems excessive. I also wonder why it's necessary
for this test to build its own large tables; couldn't it re-use ones
that already exist in the regression database?

I added a lot of test cases. But yes, I can reuse the existing large table if
there is one that doesn't fit in 64KB work_mem.

* Not really sure that we need a new planner GUC for this, but if we
do, it needs to be documented.

The new GUC is just in case if anything goes wrong, the user can
easily turn it off.

Regards,
Zheng

#62Tom Lane
tgl@sss.pgh.pa.us
In reply to: Li, Zheng (#61)
Re: NOT IN subquery optimization

"Li, Zheng" <zhelli@amazon.com> writes:

* I find it entirely unacceptable to stick some planner temporary
fields into struct SubLink. If you need that storage you'll have
to find some other place to put it. But in point of fact I don't
think you need it; it doesn't look to me to be critical to generate
the subquery's plan any earlier than make_subplan would have done it.
Moreover, you should really strive to *not* do that, because it's
likely to get in the way of other future optimizations. As the
existing comment in make_subplan already suggests, we might want to
delay subplan planning even further than that in future.

The reason for calling make_subplan this early is that we want to
Call subplan_is_hashable(plan), to decide whether to proceed with the proposed
transformation.

Well, you're going to have to find another way, because this one won't do.

If you really need to get whacked over the head with a counterexample for
this approach, consider what happens if some part of the planner decides
to pass the SubLink through copyObject, expression_tree_mutator, etc
in between where you've done the planning and where make_subplan looks
at it. Since you haven't taught copyfuncs.c about these fields, they'll
semi-accidentally wind up as NULL afterwards, meaning you lost the
information anyway. (In fact, I wouldn't be surprised if that's happening
already in some cases; you couldn't really tell, since make_subplan will
just repeat the work.) On the other hand, you can't have copyfuncs.c
copying such fields either --- we don't have copyfuncs support for
PlannerInfo, and if we did, the case would end up as infinite recursion.
Nor would it be particularly cool to try to fake things out by copying the
pointers as scalars; that will lead to dangling pointers later on.

BTW, so far as I can see, the only reason you're bothering with the whole
thing is to compare the size of the subquery output with work_mem, because
that's all that subplan_is_hashable does. I wonder whether that
consideration is even still necessary in the wake of 1f39bce02. If it is,
I wonder whether there isn't a cheaper way to figure it out. (Note
similar comment in make_subplan.)

Also ...

We try to stick with the fast hashed subplan when possible to avoid
potential performance degradation from the transformation which may
restrict the planner to choose Nested Loop Anti Join in order to handle
Null properly,

But can't you detect that case directly? It seems like you'd need to
figure out the NULL situation anyway to know whether the transformation
to antijoin is valid in the first place.

regards, tom lane

#63Li, Zheng
zhelli@amazon.com
In reply to: Tom Lane (#62)
Re: NOT IN subquery optimization

BTW, so far as I can see, the only reason you're bothering with the whole

thing is to compare the size of the subquery output with work_mem, because
that's all that subplan_is_hashable does. I wonder whether that
consideration is even still necessary in the wake of 1f39bce02. If it is,
I wonder whether there isn't a cheaper way to figure it out. (Note
similar comment in make_subplan.)

The comment in make_subplan says there is no cheaper way to figure out:
/* At present, however, we can only check hashability after
* we've made the subplan :-(. (Determining whether it'll fit in work_mem
* is the really hard part.)
*/

I don't see why commit 1f39bce02 is related to this problem. Can you expand on this?

But can't you detect that case directly? It seems like you'd need to

figure out the NULL situation anyway to know whether the transformation
to antijoin is valid in the first place.

Yes, we do need to figure out the NULL situation, and there is always valid transformation
to antijoin, it's just in the NULL case we need to stuff additional clause to the anti join
condition, and in these cases the transformation actually outperforms Subplan (non-hashed),
but underperforms the hashed Subplan. The unmodified anti hash join has similar performance
compared to hashed Subplan.

#64Andrey Lepikhov
a.lepikhov@postgrespro.ru
In reply to: Li, Zheng (#63)
Re: NOT IN subquery optimization

You should do small rebase (conflict with 911e7020770) and pgindent of
the patch to repair problems with long lines and backspaces.

I am reviewing your patch in small steps. Questions:
1. In the find_innerjoined_rels() routine you stop descending on
JOIN_FULL node type. I think it is wrong because if var has NOT NULL
constraint, full join can't change it to NULL.
2. The convert_NOT_IN_to_join() routine is ok, but its name is
misleading. May be you can use something like make_NOT_IN_to_join_quals()?
3. pull_up_sublinks_qual_recurse(). Comment:
"Return pullout predicate (x is NOT NULL)..."
may be change to
"Return pullout predicate (x is NOT NULL or NOT EXISTS...)"?
4. is_node_nonnullable():
I think one more case of non-nullable var may be foreign key constraint.

--
Andrey Lepikhov
Postgres Professional
https://postgrespro.com
The Russian Postgres Company

#65David Steele
david@pgmasters.net
In reply to: Li, Zheng (#63)
Re: NOT IN subquery optimization

On 3/26/20 4:58 PM, Li, Zheng wrote:

BTW, so far as I can see, the only reason you're bothering with the whole

thing is to compare the size of the subquery output with work_mem, because
that's all that subplan_is_hashable does. I wonder whether that
consideration is even still necessary in the wake of 1f39bce02. If it is,
I wonder whether there isn't a cheaper way to figure it out. (Note
similar comment in make_subplan.)

The comment in make_subplan says there is no cheaper way to figure out:
/* At present, however, we can only check hashability after
* we've made the subplan :-(. (Determining whether it'll fit in work_mem
* is the really hard part.)
*/

I don't see why commit 1f39bce02 is related to this problem. Can you expand on this?

But can't you detect that case directly? It seems like you'd need to

figure out the NULL situation anyway to know whether the transformation
to antijoin is valid in the first place.

Yes, we do need to figure out the NULL situation, and there is always valid transformation
to antijoin, it's just in the NULL case we need to stuff additional clause to the anti join
condition, and in these cases the transformation actually outperforms Subplan (non-hashed),
but underperforms the hashed Subplan. The unmodified anti hash join has similar performance
compared to hashed Subplan.

There seem to be enough questions about this implementation that I think
it makes sense to mark this patch Returned with Feedback.

Feel free to resubmit it to a future CF when there is more of a
consensus on the implementation.

Regards,
--
-David
david@pgmasters.net