Patch to support SEMI and ANTI join removal

Started by David Rowley over 11 years ago, 47 messages

dgrowleyml@gmail.com

over 11 years ago

1 attachment(s)

I've been working away at allowing semi and anti joins to be added to the
list of join types that our join removal code supports.

The basic idea is that we can remove a semi or anti join if the left hand
relation references the relation that's being semi/anti joined and the join
condition matches a foreign key definition on the left hand relation.

To give an example:

Given the 2 tables:
create table t2 (t1_id int primary key);
create table t1 (value int references t2);

The join to t2 would not be required in:

select * from t1 where value in(select t1_id from t2);

Neither would it be here:

select * from t1 where not exists(select 1 from t2 where t1_id=value);
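
The reasoning behind both examples can be checked with a small standalone
model. This is only a sketch and not planner code: the Row type and all
function names are made up for illustration. It treats t1.value as a nullable
int and t2.t1_id as an array of keys. Provided every non-NULL value has a
match in t2 (which is what the foreign key guarantees), the IN test reduces
to "value IS NOT NULL" and the NOT EXISTS test reduces to "value IS NULL".

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of one t1 row: a nullable "value" column. */
typedef struct
{
	bool		isnull;
	int			value;
} Row;

/* True if key is present in the referenced column (t2.t1_id). */
static bool
key_exists(const int *keys, size_t nkeys, int key)
{
	size_t		i;

	for (i = 0; i < nkeys; i++)
	{
		if (keys[i] == key)
			return true;
	}
	return false;
}

/* value IN (SELECT t1_id FROM t2): a NULL value never matches. */
bool
semijoin_keeps(const Row *row, const int *keys, size_t nkeys)
{
	return !row->isnull && key_exists(keys, nkeys, row->value);
}

/* NOT EXISTS (SELECT 1 FROM t2 WHERE t1_id = value) */
bool
antijoin_keeps(const Row *row, const int *keys, size_t nkeys)
{
	return row->isnull || !key_exists(keys, nkeys, row->value);
}

/*
 * True if, for every row, the join tests agree with the plain NULL tests
 * that would replace them once the join is removed.
 */
bool
removal_is_equivalent(const Row *rows, size_t nrows,
					  const int *keys, size_t nkeys)
{
	size_t		i;

	assert(rows != NULL || nrows == 0);

	for (i = 0; i < nrows; i++)
	{
		if (semijoin_keeps(&rows[i], keys, nkeys) != !rows[i].isnull)
			return false;
		if (antijoin_keeps(&rows[i], keys, nkeys) != rows[i].isnull)
			return false;
	}
	return true;
}
```

The check only holds while the foreign key holds. A row whose non-NULL value
has no match in the key array makes removal_is_equivalent() return false,
which is the situation the pending-trigger discussion further down is about.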

To give a bit of background, I initially proposed the idea here:

/messages/by-id/CAApHDvq0NAi8cEqTNNdqG6mhFH__7_A6Tn9XU4V0cut9wab4gA@mail.gmail.com

And some issues were raised around the fact that updates to the referenced
relation would only flush out changes to the referencing tables on
completion of the command, and if we happened to be planning a query that
was located inside a volatile function then we wouldn't know that the
parent query hadn't updated some of these referenced tables.

Noah raised this concern here:
/messages/by-id/20140603235053.GA351732@tornado.leadboat.com
But proposed a solution here:
/messages/by-id/20140605000407.GA390318@tornado.leadboat.com

In the attached I've used Noah's solution to the problem, and it seems to
work just fine. (See regression test in the attached patch)
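
Going by the comment in the attached semiorantijoin_is_removable(), the
approach amounts to refusing the removal whenever the after-trigger queue is
not empty. Below is a toy model of that guard. The type and function names
are hypothetical, and the only detail taken from the patch is the
"query_depth == -1" test used by AfterTriggerQueueIsEmpty(); the real
backend's trigger bookkeeping is considerably more involved.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the backend's after-trigger state. */
typedef struct
{
	int			query_depth;	/* -1 when no query is executing */
} TriggerQueueModel;

/* A query starts executing; any planning done from inside it is nested. */
void
begin_query(TriggerQueueModel *q)
{
	q->query_depth++;
}

/* The query finished and its deferred foreign key checks have run. */
void
end_query(TriggerQueueModel *q)
{
	assert(q->query_depth >= 0);
	q->query_depth--;
}

bool
queue_is_empty(const TriggerQueueModel *q)
{
	return q->query_depth == -1;
}

/*
 * The join may only be removed if the foreign key matches the join
 * condition and no outer query could have left the key unchecked.
 */
bool
may_remove_join(const TriggerQueueModel *q, bool fk_matches_join)
{
	return fk_matches_join && queue_is_empty(q);
}
```

In this model a query planned from inside a volatile function sees
query_depth >= 0, so the removal is skipped even when the foreign key matches.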

Tom raised a point here:
/messages/by-id/19326.1401891282@sss.pgh.pa.us

Where he mentioned that it may be possible that the foreign key trigger
queue gets added to after planning has taken place.
I've spent some time looking into this and I've not yet managed to find a
case where this matters as it seems that updates made in 1 command are not
visible to that same command. I've tested various different test cases in
all transaction isolation levels and also tested update commands which call
volatile functions that perform updates in the same table that the outer
update will reach later in the command.

The patch (attached) is also now able to detect when a NOT EXISTS clause
cannot produce any records at all.

If I make a simple change to the tables I defined above:

ALTER TABLE t1 ALTER COLUMN value SET NOT NULL;

Then the following will be produced:

explain (costs off) select * from t1 where not exists(select 1 from t2
where t1_id=value);
        QUERY PLAN
--------------------------
 Result
   One-Time Filter: false
   ->  Seq Scan on t1
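
The decision behind that plan can be sketched outside the planner. The
following is an illustration only, with a hypothetical FkColumn type and
function name: it builds the text of the qual that would stand in for a
removed anti join. Columns with a NOT NULL constraint are skipped, the rest
are ORed together as "col IS NULL", and if nothing is left the qual collapses
to a constant false, which is what shows up as the One-Time Filter.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical description of one referencing (foreign key) column. */
typedef struct
{
	const char *name;
	bool		attnotnull;		/* column has a NOT NULL constraint */
} FkColumn;

/*
 * Write the qual that would replace a removed anti join into buf.
 * Returns false if buf is too small; its contents are then unspecified.
 */
bool
antijoin_replacement_qual(const FkColumn *cols, size_t ncols,
						  char *buf, size_t buflen)
{
	size_t		used = 0;
	size_t		i;

	assert(buf != NULL);

	if (buflen == 0)
		return false;
	buf[0] = '\0';

	for (i = 0; i < ncols; i++)
	{
		int			n;

		/* A NOT NULL column can never satisfy "col IS NULL"; skip it. */
		if (cols[i].attnotnull)
			continue;

		n = snprintf(buf + used, buflen - used, "%s%s IS NULL",
					 used > 0 ? " OR " : "", cols[i].name);
		if (n < 0 || (size_t) n >= buflen - used)
			return false;
		used += (size_t) n;
	}

	/* Every column is NOT NULL: no row can pass, so the qual is constant. */
	if (used == 0)
	{
		if (buflen < sizeof("false"))
			return false;
		memcpy(buf, "false", sizeof("false"));
	}

	return true;
}
```

For the single NOT NULL column "value" this yields "false". For a
hypothetical two-column key where both columns are nullable it would yield
the two IS NULL tests joined with OR.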

A small note on my intentions with this patch:

I don't expect the use case for this to be massive. I'm more interested in
using this patch as a stepping stone towards implementing INNER JOIN
removals, which would use foreign keys in a similar way to attempt to prove
that the join is not required. I decided to tackle semi and anti joins first
as these are a far simpler case, and the patch also adds quite a bit of the
infrastructure that would be required for inner join removal. Plus, if
nobody manages to poke holes in my ideas with this, then I should have good
grounds to begin the work on the inner join removal code.
I also think if we're bothering to load foreign key constraints at planning
time, then only using them for inner join removals wouldn't be making full
use of them, so likely this patch would be a good idea anyway.

Currently most of my changes are in analyzejoins.c, but I did also have to
make changes to load the foreign key constraints so that they were
available to the planner. One thing that is currently lacking, which would
likely be needed before the finished patch is ready, is a "relhasfkeys"
column in pg_class. Such a column would make it possible to skip scanning
pg_constraint for foreign keys when there are none to find. I'll delay
implementing that until I get a bit more feedback on whether this patch
would be a welcome addition to the existing join removal code or not.
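
The foreign key data loaded for the planner (the conkey and confkey
attribute-number arrays in the patch) is what the join condition is compared
against. Here is a rough standalone sketch of that comparison, under some
loud assumptions: FkModel, JoinQual and the function names are hypothetical,
operator compatibility is ignored, and the sketch checks both directions
(every key column has a qual, and every qual belongs to the key), as the
header comment of semiorantijoin_is_removable() describes.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, trimmed-down view of one foreign key constraint. */
typedef struct
{
	int			ncols;
	const short *conkey;		/* attnums on the referencing side */
	const short *confkey;		/* attnums on the referenced side */
} FkModel;

/* Hypothetical "left.col = right.col" join qual, reduced to attnums. */
typedef struct
{
	short		left_attno;
	short		right_attno;
} JoinQual;

static bool
qual_matches_column(const JoinQual *qual, const FkModel *fk, int col)
{
	return qual->left_attno == fk->conkey[col] &&
		qual->right_attno == fk->confkey[col];
}

bool
quals_match_foreign_key(const FkModel *fk, const JoinQual *quals,
						size_t nquals)
{
	int			col;
	size_t		i;

	assert(fk != NULL);

	if (fk->ncols <= 0 || nquals == 0)
		return false;

	/* Every foreign key column needs a join qual ... */
	for (col = 0; col < fk->ncols; col++)
	{
		bool		matched = false;

		for (i = 0; i < nquals && !matched; i++)
			matched = qual_matches_column(&quals[i], fk, col);
		if (!matched)
			return false;
	}

	/* ... and every join qual needs a foreign key column. */
	for (i = 0; i < nquals; i++)
	{
		bool		matched = false;

		for (col = 0; col < fk->ncols && !matched; col++)
			matched = qual_matches_column(&quals[i], fk, col);
		if (!matched)
			return false;
	}

	return true;
}
```

Duplicated quals are accepted here because each one still maps to a key
column, which is why the sketch does not simply compare counts.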

I'm submitting this (a little early) for the August commitfest, but if
anyone has any time to glance at it before then, that would be a really
big help.

Regards

David Rowley

Attachments:

semianti_join_removal_2410c7c_2014-08-05.patch (application/octet-stream)

diff --git a/src/backend/commands/trigger.c b/src/backend/commands/trigger.c
index 9bf0098..88c8d98 100644
--- a/src/backend/commands/trigger.c
+++ b/src/backend/commands/trigger.c
@@ -3887,6 +3887,17 @@ afterTriggerInvokeEvents(AfterTriggerEventList *events,
 	return all_fired;
 }
 
+/* ----------
+ * AfterTriggerQueueIsEmpty()
+ *
+ *	True if there are no pending triggers in the queue.
+ * ----------
+ */
+bool
+AfterTriggerQueueIsEmpty(void)
+{
+	return (afterTriggers->query_depth == -1);
+}
 
 /* ----------
  * AfterTriggerBeginXact()
diff --git a/src/backend/optimizer/path/equivclass.c b/src/backend/optimizer/path/equivclass.c
index b7aff37..63dbc1b 100644
--- a/src/backend/optimizer/path/equivclass.c
+++ b/src/backend/optimizer/path/equivclass.c
@@ -32,6 +32,7 @@
 static EquivalenceMember *add_eq_member(EquivalenceClass *ec,
 			  Expr *expr, Relids relids, Relids nullable_relids,
 			  bool is_child, Oid datatype);
+static void update_rel_class_joins(PlannerInfo *root);
 static void generate_base_implied_equalities_const(PlannerInfo *root,
 									   EquivalenceClass *ec);
 static void generate_base_implied_equalities_no_const(PlannerInfo *root,
@@ -725,7 +726,6 @@ void
 generate_base_implied_equalities(PlannerInfo *root)
 {
 	ListCell   *lc;
-	Index		rti;
 
 	foreach(lc, root->eq_classes)
 	{
@@ -752,6 +752,19 @@ generate_base_implied_equalities(PlannerInfo *root)
 	 * This is also a handy place to mark base rels (which should all exist by
 	 * now) with flags showing whether they have pending eclass joins.
 	 */
+	update_rel_class_joins(root);
+}
+
+/*
+ * update_rel_class_joins
+ *		Process each relation in the PlannerInfo to update the
+ *		has_eclass_joins flag
+ */
+static void
+update_rel_class_joins(PlannerInfo *root)
+{
+	Index		rti;
+
 	for (rti = 1; rti < root->simple_rel_array_size; rti++)
 	{
 		RelOptInfo *brel = root->simple_rel_array[rti];
@@ -764,6 +777,63 @@ generate_base_implied_equalities(PlannerInfo *root)
 }
 
 /*
+ * remove_rel_from_eclass
+ *		Remove all eclass members that belong to relid and also any classes
+ *		which have been left empty as a result of removing a member.
+ */
+void
+remove_rel_from_eclass(PlannerInfo *root, int relid)
+{
+	ListCell	*l,
+				*nextl,
+				*eqm,
+				*eqmnext;
+
+	bool removedany = false;
+
+	/* Strip all traces of this relation out of the eclasses */
+	for (l = list_head(root->eq_classes); l != NULL; l = nextl)
+	{
+		EquivalenceClass *ec = (EquivalenceClass *) lfirst(l);
+
+		nextl = lnext(l);
+
+		for (eqm = list_head(ec->ec_members); eqm != NULL; eqm = eqmnext)
+		{
+			EquivalenceMember *em = (EquivalenceMember *) lfirst(eqm);
+
+			eqmnext = lnext(eqm);
+
+			if (IsA(em->em_expr, Var))
+			{
+				Var *var = (Var *) em->em_expr;
+
+				if (var->varno == relid)
+				{
+					ec->ec_members = list_delete_ptr(ec->ec_members, em);
+					removedany = true;
+				}
+			}
+		}
+
+		/*
+		 * If we've removed the last member from the EquivalenceClass then we'd
+		 * better delete the entire entry.
+		 */
+		if (list_length(ec->ec_members) == 0)
+			root->eq_classes = list_delete_ptr(root->eq_classes, ec);
+	}
+
+	/*
+	 * If we removed any eclass members then this may have changed if a
+	 * relation has an eclass join or not, we'd better force an update
+	 * of this
+	 */
+	if (removedany)
+		update_rel_class_joins(root);
+}
+
+/*
  * generate_base_implied_equalities when EC contains pseudoconstant(s)
  */
 static void
diff --git a/src/backend/optimizer/plan/analyzejoins.c b/src/backend/optimizer/plan/analyzejoins.c
index 773f8a4..569580f 100644
--- a/src/backend/optimizer/plan/analyzejoins.c
+++ b/src/backend/optimizer/plan/analyzejoins.c
@@ -22,17 +22,34 @@
  */
 #include "postgres.h"
 
+#include "commands/trigger.h"
+#include "nodes/makefuncs.h"
 #include "nodes/nodeFuncs.h"
+#include "nodes/relation.h"
 #include "optimizer/clauses.h"
 #include "optimizer/joininfo.h"
 #include "optimizer/pathnode.h"
 #include "optimizer/paths.h"
 #include "optimizer/planmain.h"
+#include "optimizer/restrictinfo.h"
 #include "optimizer/tlist.h"
 #include "utils/lsyscache.h"
 
 /* local functions */
-static bool join_is_removable(PlannerInfo *root, SpecialJoinInfo *sjinfo);
+static bool leftjoin_is_removable(PlannerInfo *root, SpecialJoinInfo *sjinfo);
+static bool semiorantijoin_is_removable(PlannerInfo *root,
+							SpecialJoinInfo *sjinfo, List **leftrelcolumns,
+							RelOptInfo **leftrel);
+void convert_semijoin_to_isnotnull_quals(PlannerInfo *root, RelOptInfo *rel,
+									List *columnlist);
+void convert_antijoin_to_isnull_quals(PlannerInfo *root, RelOptInfo *rel,
+									List *columnlist);
+static bool relation_has_foreign_key_for(PlannerInfo *root, JoinType jointype,
+								RelOptInfo *rel, RelOptInfo *referencedrel,
+								List *referencing_vars, List *index_vars,
+								List *operator_list);
+static bool expressions_match_foreign_key(ForeignKeyInfo *fk, List *fkvars,
+								List *indexvars, List *operators);
 static void remove_rel_from_query(PlannerInfo *root, int relid,
 					  Relids joinrelids);
 static List *remove_rel_from_joinlist(List *joinlist, int relid, int *nremoved);
@@ -53,8 +70,8 @@ remove_useless_joins(PlannerInfo *root, List *joinlist)
 	ListCell   *lc;
 
 	/*
-	 * We are only interested in relations that are left-joined to, so we can
-	 * scan the join_info_list to find them easily.
+	 * We are only interested in relations that are left, semi or anti-joined
+	 * to, so we can scan the join_info_list to find them easily.
 	 */
 restart:
 	foreach(lc, root->join_info_list)
@@ -63,9 +80,37 @@ restart:
 		int			innerrelid;
 		int			nremoved;
 
-		/* Skip if not removable */
-		if (!join_is_removable(root, sjinfo))
-			continue;
+		if (sjinfo->jointype == JOIN_LEFT)
+		{
+			/* Skip if not removable */
+			if (!leftjoin_is_removable(root, sjinfo))
+				continue;
+		}
+		else if (sjinfo->jointype == JOIN_SEMI)
+		{
+			List	   *columnlist;
+			RelOptInfo *rel;
+
+			/* Skip if not removable */
+			if (!semiorantijoin_is_removable(root, sjinfo, &columnlist, &rel))
+				continue;
+
+			Assert(columnlist != NIL);
+			convert_semijoin_to_isnotnull_quals(root, rel, columnlist);
+		}
+		else if (sjinfo->jointype == JOIN_ANTI)
+		{
+			List	   *columnlist;
+			RelOptInfo *rel;
+
+			if (!semiorantijoin_is_removable(root, sjinfo, &columnlist, &rel))
+				continue;
+
+			Assert(columnlist != NIL);
+			convert_antijoin_to_isnull_quals(root, rel, columnlist);
+		}
+		else
+			continue; /* we don't support this join type */
 
 		/*
 		 * Currently, join_is_removable can only succeed when the sjinfo's
@@ -136,7 +181,7 @@ clause_sides_match_join(RestrictInfo *rinfo, Relids outerrelids,
 }
 
 /*
- * join_is_removable
+ * leftjoin_is_removable
  *	  Check whether we need not perform this special join at all, because
  *	  it will just duplicate its left input.
  *
@@ -147,7 +192,7 @@ clause_sides_match_join(RestrictInfo *rinfo, Relids outerrelids,
  * above the join.
  */
 static bool
-join_is_removable(PlannerInfo *root, SpecialJoinInfo *sjinfo)
+leftjoin_is_removable(PlannerInfo *root, SpecialJoinInfo *sjinfo)
 {
 	int			innerrelid;
 	RelOptInfo *innerrel;
@@ -157,12 +202,13 @@ join_is_removable(PlannerInfo *root, SpecialJoinInfo *sjinfo)
 	ListCell   *l;
 	int			attroff;
 
+	Assert(sjinfo->jointype == JOIN_LEFT);
+
 	/*
-	 * Must be a non-delaying left join to a single baserel, else we aren't
+	 * Must be a non-delaying join to a single baserel, else we aren't
 	 * going to be able to do anything with it.
 	 */
-	if (sjinfo->jointype != JOIN_LEFT ||
-		sjinfo->delay_upper_joins ||
+	if (sjinfo->delay_upper_joins ||
 		bms_membership(sjinfo->min_righthand) != BMS_SINGLETON)
 		return false;
 
@@ -367,6 +413,503 @@ join_is_removable(PlannerInfo *root, SpecialJoinInfo *sjinfo)
 	return false;
 }
 
+/*
+ * semiorantijoin_is_removable
+ *	  True if we can remove this semi or anti join.
+ *
+ * Detecting if a SEMI or ANTI join may be removed is quite different to the
+ * detection code for left joins. For these we have no need to check if vars
+ * from the join are used in the query as the EXISTS and IN() syntax disallow
+ * this. In order to prove that a semi or anti join is redundant we must ensure
+ * that a foreign key exists on the left side of the join which references the
+ * table on the right side of the join. This means that we can only support a
+ * single table on either side of the join. We must also ensure that the join
+ * condition matches all the foreign key columns to each index column on the
+ * referenced table. If any columns are missing then we cannot be sure we'll
+ * get at most 1 record back, and if there are any extra conditions that are
+ * not in the foreign key then we cannot be sure that the join condition will
+ * produce at least 1 matching row.
+ *
+ * If we manage to find a foreign key which will allow the join to be removed
+ * then the caller may have to add NULL checking to the query in place of the
+ * join. For example if we determine that the join to the table b is not needed
+ * due to the existence of a foreign key on a.b_id referencing b.id in the
+ * following query:
+ *
+ * SELECT * FROM a WHERE EXISTS(SELECT 1 FROM b WHERE a.b_id = b.id);
+ *
+ * Then the only possible records that could be returned from a are the ones
+ * WHERE b_id IS NOT NULL.
+ *
+ * If this function returns True, then leftrelcolumns will be populated with
+ * the list of columns from the left relation which exist in the join
+ * condition, leftrel will be set to the RelOptInfo of the left hand relation.
+ */
+static bool
+semiorantijoin_is_removable(PlannerInfo *root, SpecialJoinInfo *sjinfo,
+		List **leftrelcolumns, RelOptInfo **leftrel)
+{
+	int			innerrelid;
+	int			outerrelid;
+	RelOptInfo *innerrel;
+	RelOptInfo *outerrel;
+	ListCell   *lc;
+	List	   *referencing_vars;
+	List	   *index_vars;
+	List	   *operator_list;
+
+	Assert(sjinfo->jointype == JOIN_SEMI || sjinfo->jointype == JOIN_ANTI);
+
+	/*
+	 * We mustn't allow semi or anti joins to be removed if there are any
+	 * pending foreign key triggers in the queue. This could happen if we
+	 * are planning a query that has been executed from within a volatile
+	 * function and the query which called this volatile function has made some
+	 * changes to a table referenced by a foreign key. The reason for this is
+	 * that any updates to a table which is referenced by a foreign key
+	 * constraint will only have the referencing tables updated after the
+	 * command is complete, so there is a window of time where records may
+	 * violate the foreign key constraint. The following code intends to
+	 * maintain correct results by disabling the join removal if there's a
+	 * possibility that there are records which violate the foreign key, though
+	 * this code is quite naive and we will simply just disallow removal of
+	 * semi and anti joins if there's anything in the foreign key trigger
+	 * queue. A more complete solution would be able to check if the relation
+	 * in question has pending triggers, but this will do for now.
+	 */
+	if (!AfterTriggerQueueIsEmpty())
+		return false;
+
+	/*
+	 * We'll start by checking that the left hand relation is a singleton
+	 * and that it has at least 1 foreign key.  A lack of foreign key seems
+	 * like a more likely possibility to allow us to exit early than checking
+	 * the right hand rel has any indexes.
+	 */
+	if (sjinfo->delay_upper_joins ||
+		bms_membership(sjinfo->min_lefthand) != BMS_SINGLETON)
+		return false;
+
+	outerrelid = bms_singleton_member(sjinfo->min_lefthand);
+	outerrel = find_base_rel(root, outerrelid);
+
+	/*
+	 * There's no possibility to remove the join if the outer rel is not a
+	 * baserel or the baserel has no foreign keys defined.
+	 */
+	if (outerrel->reloptkind != RELOPT_BASEREL ||
+		outerrel->rtekind != RTE_RELATION ||
+		outerrel->fklist == NIL)
+		return false;
+
+	if (bms_membership(sjinfo->min_righthand) != BMS_SINGLETON)
+		return false;
+
+	innerrelid = bms_singleton_member(sjinfo->min_righthand);
+	innerrel = find_base_rel(root, innerrelid);
+
+	/*
+	 * If the right hand relation is not a base rel then it can't possibly be
+	 * referenced by a foreign key. The same goes if there's no unique indexes
+	 * on the relation, however, to keep it simple here we'll make do with
+	 * checking if there's any indexes, as if there's no indexes then there's
+	 * certainly no unique indexes.
+	 */
+	if (innerrel->reloptkind != RELOPT_BASEREL ||
+		innerrel->rtekind != RTE_RELATION ||
+		innerrel->indexlist == NIL)
+		return false;
+
+	referencing_vars = NIL;
+	index_vars = NIL;
+	operator_list = NIL;
+
+	/*
+	 * Pre-process the join quals into lists that contain the vars from either
+	 * side of the joins and also a list which contains the operators from the
+	 * join conditions. At this stage we may still discover that the join
+	 * cannot be removed if, for example we find a qual that does not reference
+	 * both sides of the join.
+	 *
+	 * referencing_vars will contain a list of Vars from the left hand
+	 * relation, these are the expressions that we'll check against the
+	 * referencing side of the foreign key.
+	 *
+	 * index_vars will contain a list of Vars from the right hand relation,
+	 * these are the expressions that we'll check on the referenced side of the
+	 * foreign key.
+	 *
+	 * operator_list is a list of operator Oids that we'll need to ensure
+	 * are compatible with the operator specified in the foreign key.
+	 */
+	foreach(lc, sjinfo->join_quals)
+	{
+		OpExpr	   *opexpr = (OpExpr *) lfirst(lc);
+		Oid			opno;
+		Node	   *left_expr;
+		Node	   *right_expr;
+		Relids		left_varnos;
+		Relids		right_varnos;
+		Relids		all_varnos;
+		Oid			opinputtype;
+
+		/* Is it a binary opclause? */
+		if (!IsA(opexpr, OpExpr) ||
+			list_length(opexpr->args) != 2)
+		{
+			/* We only accept quals which reference both sides of the join. */
+			return false;
+		}
+
+		left_expr = linitial(opexpr->args);
+
+		/* Punt if it's anything apart from a Var */
+		if (!IsA(left_expr, Var))
+			return false;
+
+		right_expr = lsecond(opexpr->args);
+
+		/* Punt if it's anything apart from a Var */
+		if (!IsA(right_expr, Var))
+			return false;
+
+		opinputtype = exprType(left_expr);
+		opno = opexpr->opno;
+
+		/*
+		 * FIXME: it would be nice to fast path out if the
+		 * operator couldn't possibly be used in a foreign
+		 * key, but what to use to detect this?
+		 */
+		if (!op_mergejoinable(opno, opinputtype))
+			return false;
+
+		left_varnos = pull_varnos(left_expr);
+		right_varnos = pull_varnos(right_expr);
+		all_varnos = bms_union(left_varnos, right_varnos);
+
+		/*
+		 * Check if the clause matches both sides of the join. If only 1 side
+		 * is matched then, since we're dealing with a SEMI or ANTI join then
+		 * it must be from the inner side. So this qual could restrict the
+		 * results so that we can't be sure the foreign key will cause us to
+		 * match at least 1 record in the relation. In this case we must punt.
+		 */
+		if (!bms_overlap(all_varnos, sjinfo->syn_righthand) ||
+			bms_is_subset(all_varnos, sjinfo->syn_righthand))
+			return false;
+
+		/* check rel membership of arguments */
+		if (!bms_is_empty(right_varnos) &&
+			bms_is_subset(right_varnos, sjinfo->syn_righthand) &&
+			!bms_overlap(left_varnos, sjinfo->syn_righthand))
+		{
+			/* typical case, right_expr is RHS variable */
+		}
+		else if (!bms_is_empty(left_varnos) &&
+				 bms_is_subset(left_varnos, sjinfo->syn_righthand) &&
+				 !bms_overlap(right_varnos, sjinfo->syn_righthand))
+		{
+			Node *tmp;
+			/* flipped case, left_expr is RHS variable */
+			opno = get_commutator(opno);
+			if (!OidIsValid(opno))
+				return false;
+
+			/* swap the operands */
+			tmp = left_expr;
+			left_expr = right_expr;
+			right_expr = tmp;
+		}
+		else
+			return false;
+
+		/* so far so good, keep building lists */
+		referencing_vars = lappend(referencing_vars, left_expr);
+		operator_list = lappend_oid(operator_list, opno);
+		index_vars = lappend(index_vars, right_expr);
+	}
+
+	/* no suitable join condition items? Then we can't remove the join */
+	if (referencing_vars == NIL)
+		return false;
+
+	/*
+	 * Now that we've built the join Var lists we can now check if there are
+	 * any foreign keys that will support us removing this join.
+	 */
+	if (relation_has_foreign_key_for(root, sjinfo->jointype, outerrel,
+				innerrel, referencing_vars, index_vars, operator_list))
+	{
+		*leftrel = outerrel;
+		*leftrelcolumns = referencing_vars;
+		return true;
+	}
+
+	return false;
+}
+
+/*
+ * convert_semijoin_to_isnotnull_quals
+ *		Adds any required "col IS NOT NULL" quals which are required to ensure
+ *		that the query remains equivalent to what it was before the semi join
+ *		was removed.
+ */
+void
+convert_semijoin_to_isnotnull_quals(PlannerInfo *root, RelOptInfo *rel, List *columnlist)
+{
+	ListCell	*l;
+
+	/*
+	 * If a semi join has been successfully removed by the join removal code,
+	 * then a foreign key must exist that proves the join to not be required.
+	 * For us this means that we can be sure each column that is part of that
+	 * foreign key, which has a non-null value mustn't be filtered out as there
+	 * must be a record in the foreign relation which these records reference.
+	 * This is not the case for columns which are part of the foreign key but
+	 * have a NULL value. These records obviously aren't referencing any
+	 * foreign tuple, so in order to get the query to produce the same
+	 * result, we'll just filter these NULLs out. We do this by adding items to
+	 * the WHERE clause, such as:
+	 * "WHERE fkeycol1 IS NOT NULL AND fkeycol2 IS NOT NULL", though we needn't
+	 * bother doing this if the column has a NOT NULL constraint.
+	 */
+
+	foreach(l, columnlist)
+	{
+		Var			  *var = (Var *) lfirst(l);
+		RangeTblEntry *rte;
+
+		/* should be a var if it came from a foreign key */
+		Assert(IsA(var, Var));
+
+		rte = root->simple_rte_array[var->varno];
+
+		/*
+		 * No point in adding a col IS NOT NULL if the column
+		 * has a NOT NULL constraint defined for it.
+		 */
+		if (!get_attnotnull(rte->relid, var->varattno))
+		{
+			RestrictInfo *rinfo;
+			NullTest *ntest = makeNode(NullTest);
+
+			ntest->nulltesttype = IS_NOT_NULL;
+			ntest->arg = (Expr *) var;
+			ntest->argisrow = false;
+
+			rinfo = make_restrictinfo((Expr *)ntest, false, false, false,
+						NULL, NULL, NULL);
+			rel->baserestrictinfo = lappend(rel->baserestrictinfo, rinfo);
+		}
+	}
+}
+
+/*
+ * convert_antijoin_to_isnull_quals
+ *		Adds any required "col IS NULL" quals which are required to ensure
+ *		that the query remains equivalent to what it was before the anti join
+ *		was removed.
+ */
+void
+convert_antijoin_to_isnull_quals(PlannerInfo *root, RelOptInfo *rel, List *columnlist)
+{
+	ListCell	 *l;
+	RestrictInfo *rinfo;
+	Expr		 *expr;
+	List		 *isnulltests = NIL;
+
+	/*
+	 * If an anti join has been successfully removed by the join removal code,
+	 * then a foreign key must exist that proves the join to not be required.
+	 * For us this means that in order to make the query produce the same
+	 * result as if the anti join had not been removed then we should only be
+	 * allowing tuples where any of the foreign key columns has a NULL value to
+	 * make it through. We do this simply by adding items to the WHERE clause
+	 * of the query, such as "WHERE fkeycol1 IS NULL OR fkeycol2 IS NULL",
+	 * though we can skip any columns that have a NOT NULL constraint. If all
+	 * of the columns happen to have NOT NULL constraints defined, then it's
+	 * not possible for the query to produce any records at all. In that case
+	 * we add "WHERE false" to the WHERE clause.
+	 */
+
+	foreach(l, columnlist)
+	{
+		Var			  *var = (Var *) lfirst(l);
+		RangeTblEntry *rte;
+
+		/* should be a var if it came from a foreign key */
+		Assert(IsA(var, Var));
+
+		rte = root->simple_rte_array[var->varno];
+
+		/*
+		 * No point in adding a col IS NULL if the column
+		 * has a NOT NULL constraint defined for it.
+		 */
+		if (!get_attnotnull(rte->relid, var->varattno))
+		{
+			NullTest *ntest = makeNode(NullTest);
+			ntest->nulltesttype = IS_NULL;
+			ntest->arg = (Expr *) var;
+			ntest->argisrow = false;
+
+			isnulltests = lappend(isnulltests, ntest);
+		}
+	}
+
+	/*
+	 * If we still have an empty list by the time we get to here then it would
+	 * appear that each column has a NOT NULL constraint. In this case then
+	 * it's not possible for the query to return any records, so we can simply
+	 * add a "WHERE false" constant expression and tell the planner to check
+	 * for gating quals.
+	 */
+	if (isnulltests == NIL)
+	{
+		expr = (Expr *) makeBoolConst(false, false);
+		rinfo = make_restrictinfo(expr, false, false, true, NULL, NULL, NULL);
+
+		/* tell createplan.c to check for gating quals */
+		root->hasPseudoConstantQuals = true;
+	}
+	else
+	{
+		/*
+		 * Now we can build a RestrictInfo for the newly created IS NULL tests.
+		 * If there's only 1 test expression then we can just make the
+		 * RestrictInfo use that expression, if there's more than 1 we'll need
+		 * to "OR" all of these together.
+		 */
+		if (list_length(isnulltests) == 1)
+			expr = (Expr *) linitial(isnulltests);
+		else
+			expr = make_orclause(isnulltests);
+
+		rinfo = make_restrictinfo(expr, false, false, false, NULL, NULL, NULL);
+	}
+
+	rel->baserestrictinfo = lappend(rel->baserestrictinfo, rinfo);
+}
+
+/*
+ * relation_has_foreign_key_for
+ *	  Checks if rel has a foreign key which references referencedrel with the
+ *	  given list of expressions.
+ *
+ *	For the match to succeed:
+ *	  referencing_vars must match the columns defined in the foreign key.
+ *	  index_vars must match the columns defined in the index for the foreign key.
+ */
+static bool
+relation_has_foreign_key_for(PlannerInfo *root, JoinType jointype,
+			RelOptInfo *rel, RelOptInfo *referencedrel,
+			List *referencing_vars, List *index_vars, List *operator_list)
+{
+	ListCell *lc;
+	Oid		  refreloid;
+
+	/*
+	 * Look up the Oid of the referenced relation. We'll need to see if
+	 * the referencing relation has a foreign key which references this
+	 */
+	refreloid = root->simple_rte_array[referencedrel->relid]->relid;
+
+	Assert(list_length(referencing_vars) > 0);
+	Assert(list_length(referencing_vars) == list_length(index_vars));
+	Assert(list_length(referencing_vars) == list_length(operator_list));
+
+	/*
+	 * Search through each foreign key on the referencing relation and try
+	 * to find one which references the referenced relation. If we find one
+	 * then we'll see if the join condition matches the foreign key definition.
+	 */
+	foreach(lc, rel->fklist)
+	{
+		ForeignKeyInfo *fk = (ForeignKeyInfo *) lfirst(lc);
+
+		if (fk->confrelid == refreloid)
+		{
+			if (expressions_match_foreign_key(fk, referencing_vars,
+				index_vars, operator_list))
+				return true;
+		}
+	}
+
+	return false;
+}
+
+/*
+ * expressions_match_foreign_key
+ *		True if the given fkvars, indexvars and operators will match
+ *		exactly 1 record in the referenced relation of the foreign key.
+ *
+ * Note: This function expects fkvars and indexvars to only contain Var types.
+ *       Expression indexes are not supported by foreign keys.
+ */
+static bool
+expressions_match_foreign_key(ForeignKeyInfo *fk, List *fkvars,
+					List *indexvars, List *operators)
+{
+	ListCell *lc;
+	ListCell *lc2;
+	ListCell *lc3;
+	int		  col;
+
+	Assert(list_length(fkvars) == list_length(indexvars));
+	Assert(list_length(fkvars) == list_length(operators));
+
+	/*
+	 * Fast path out if there's not enough conditions to match each column in
+	 * the foreign key. Note that we cannot check that the number of
+	 * expressions are equal here since it would cause any expressions which
+	 * are duplicated not to match.
+	 */
+	if (list_length(fkvars) < fk->conncols)
+		return false;
+
+	/*
+	 * For each column defined in the foreign key we must ensure that we find
+	 * a matching var in fkvars for the referencing side of the foreign key and
+	 * also a matching indexvar on the referenced side of the foreign key.
+	 */
+	for (col = 0; col < fk->conncols; col++)
+	{
+		bool  matched = false;
+
+		forthree(lc, fkvars, lc2, indexvars, lc3, operators)
+		{
+			Var *expr = (Var *) lfirst(lc);
+			Var *idxexpr = (Var *) lfirst(lc2);
+			Oid  opr = lfirst_oid(lc3);
+
+			Assert(IsA(expr, Var));
+			Assert(IsA(idxexpr, Var));
+
+			/*
+			 * Search for a column in the foreign key which matches the current
+			 * join condition expression.
+			 */
+			if (fk->conkey[col] == expr->varattno &&
+				fk->confkey[col] == idxexpr->varattno &&
+				equality_ops_are_compatible(opr, fk->conpfeqop[col]))
+			{
+				matched = true;
+				break;
+			}
+		}
+
+		/*
+		 * Did we find anything matching the fk col? If not then we'll
+		 * return no match.
+		 */
+		if (!matched)
+			return false;
+	}
+
+	return true; /* matched */
+}
+
 
 /*
  * Remove the target relid from the planner's data structures, having
@@ -393,6 +936,9 @@ remove_rel_from_query(PlannerInfo *root, int relid, Relids joinrelids)
 	 */
 	rel->reloptkind = RELOPT_DEADREL;
 
+	/* Strip out any eclass members that belong to this rel */
+	remove_rel_from_eclass(root, relid);
+
 	/*
 	 * Remove references to the rel from other baserels' attr_needed arrays.
 	 */
diff --git a/src/backend/optimizer/util/plancat.c b/src/backend/optimizer/util/plancat.c
index b2becfa..0b1c1a6 100644
--- a/src/backend/optimizer/util/plancat.c
+++ b/src/backend/optimizer/util/plancat.c
@@ -25,7 +25,9 @@
 #include "access/transam.h"
 #include "access/xlog.h"
 #include "catalog/catalog.h"
+#include "catalog/pg_constraint.h"
 #include "catalog/heap.h"
+#include "catalog/pg_type.h"
 #include "foreign/fdwapi.h"
 #include "miscadmin.h"
 #include "nodes/makefuncs.h"
@@ -38,6 +40,7 @@
 #include "parser/parsetree.h"
 #include "rewrite/rewriteManip.h"
 #include "storage/bufmgr.h"
+#include "utils/fmgroids.h"
 #include "utils/lsyscache.h"
 #include "utils/rel.h"
 #include "utils/snapmgr.h"
@@ -89,6 +92,13 @@ get_relation_info(PlannerInfo *root, Oid relationObjectId, bool inhparent,
 	Relation	relation;
 	bool		hasindex;
 	List	   *indexinfos = NIL;
+	List	   *fkinfos = NIL;
+	Relation	fkeyRel;
+	Relation	fkeyRelIdx;
+	ScanKeyData fkeyScankey;
+	SysScanDesc fkeyScan;
+	HeapTuple	tuple;
+
 
 	/*
 	 * We need not lock the relation since it was already locked, either by
@@ -384,6 +394,111 @@ get_relation_info(PlannerInfo *root, Oid relationObjectId, bool inhparent,
 
 	heap_close(relation, NoLock);
 
+	ScanKeyInit(&fkeyScankey,
+				Anum_pg_constraint_conrelid,
+				BTEqualStrategyNumber, F_OIDEQ,
+				ObjectIdGetDatum(relationObjectId));
+
+	fkeyRel = heap_open(ConstraintRelationId, AccessShareLock);
+	fkeyRelIdx = index_open(ConstraintRelidIndexId, AccessShareLock);
+	fkeyScan = systable_beginscan_ordered(fkeyRel, fkeyRelIdx, NULL, 1, &fkeyScankey);
+
+	while ((tuple = systable_getnext_ordered(fkeyScan, ForwardScanDirection)) != NULL)
+	{
+		Form_pg_constraint con = (Form_pg_constraint) GETSTRUCT(tuple);
+		ForeignKeyInfo *fkinfo;
+		Datum		adatum;
+		bool		isNull;
+		ArrayType  *arr;
+		int			numkeys;
+
+		/* Not a foreign key */
+		if (con->contype != CONSTRAINT_FOREIGN)
+			continue;
+
+		/* we're not interested unless the fk has been validated */
+		if (!con->convalidated)
+			continue;
+
+		fkinfo = (ForeignKeyInfo *) palloc(sizeof(ForeignKeyInfo));
+		fkinfo->conindid = con->conindid;
+		fkinfo->confrelid = con->confrelid;
+		fkinfo->convalidated = con->convalidated;
+		fkinfo->conrelid = con->conrelid;
+		fkinfo->confupdtype = con->confupdtype;
+		fkinfo->confdeltype = con->confdeltype;
+		fkinfo->confmatchtype = con->confmatchtype;
+
+		adatum = heap_getattr(tuple, Anum_pg_constraint_conkey,
+							RelationGetDescr(fkeyRel), &isNull);
+
+		if (isNull)
+			elog(ERROR, "null conkey for constraint %u",
+				HeapTupleGetOid(tuple));
+
+		arr = DatumGetArrayTypeP(adatum);		/* ensure not toasted */
+		numkeys = ARR_DIMS(arr)[0];
+		if (ARR_NDIM(arr) != 1 ||
+			numkeys < 0 ||
+			ARR_HASNULL(arr) ||
+			ARR_ELEMTYPE(arr) != INT2OID)
+			elog(ERROR, "conkey is not a 1-D smallint array");
+
+		fkinfo->conkey = (int16 *) ARR_DATA_PTR(arr);
+		fkinfo->conncols = numkeys;
+
+		adatum = heap_getattr(tuple, Anum_pg_constraint_confkey,
+							RelationGetDescr(fkeyRel), &isNull);
+
+		if (isNull)
+			elog(ERROR, "null confkey for constraint %u",
+				HeapTupleGetOid(tuple));
+
+		arr = DatumGetArrayTypeP(adatum);		/* ensure not toasted */
+		numkeys = ARR_DIMS(arr)[0];
+
+		/* sanity check */
+		if (numkeys != fkinfo->conncols)
+			elog(ERROR, "number of confkey elements does not equal conkey elements");
+
+		if (ARR_NDIM(arr) != 1 ||
+			numkeys < 0 ||
+			ARR_HASNULL(arr) ||
+			ARR_ELEMTYPE(arr) != INT2OID)
+			elog(ERROR, "confkey is not a 1-D smallint array");
+
+		fkinfo->confkey = (int16 *) ARR_DATA_PTR(arr);
+		adatum = heap_getattr(tuple, Anum_pg_constraint_conpfeqop,
+							RelationGetDescr(fkeyRel), &isNull);
+
+		if (isNull)
+			elog(ERROR, "null conpfeqop for constraint %u",
+				HeapTupleGetOid(tuple));
+
+		arr = DatumGetArrayTypeP(adatum);		/* ensure not toasted */
+		numkeys = ARR_DIMS(arr)[0];
+
+		/* sanity check */
+		if (numkeys != fkinfo->conncols)
+			elog(ERROR, "number of conpfeqop elements does not equal conkey elements");
+
+		if (ARR_NDIM(arr) != 1 ||
+			numkeys < 0 ||
+			ARR_HASNULL(arr) ||
+			ARR_ELEMTYPE(arr) != OIDOID)
+			elog(ERROR, "conpfeqop is not a 1-D Oid array");
+
+		fkinfo->conpfeqop = (Oid *) ARR_DATA_PTR(arr);
+
+		fkinfos = lappend(fkinfos, fkinfo);
+	}
+
+	rel->fklist = fkinfos;
+	systable_endscan_ordered(fkeyScan);
+	index_close(fkeyRelIdx, AccessShareLock);
+	heap_close(fkeyRel, AccessShareLock);
+
+
 	/*
 	 * Allow a plugin to editorialize on the info we obtained from the
 	 * catalogs.  Actions might include altering the assumed relation size,
diff --git a/src/backend/optimizer/util/relnode.c b/src/backend/optimizer/util/relnode.c
index c938c27..a0fb8eb 100644
--- a/src/backend/optimizer/util/relnode.c
+++ b/src/backend/optimizer/util/relnode.c
@@ -115,6 +115,7 @@ build_simple_rel(PlannerInfo *root, int relid, RelOptKind reloptkind)
 	rel->lateral_relids = NULL;
 	rel->lateral_referencers = NULL;
 	rel->indexlist = NIL;
+	rel->fklist = NIL;
 	rel->pages = 0;
 	rel->tuples = 0;
 	rel->allvisfrac = 0;
@@ -377,6 +378,7 @@ build_join_rel(PlannerInfo *root,
 	joinrel->lateral_relids = NULL;
 	joinrel->lateral_referencers = NULL;
 	joinrel->indexlist = NIL;
+	joinrel->fklist = NIL;
 	joinrel->pages = 0;
 	joinrel->tuples = 0;
 	joinrel->allvisfrac = 0;
diff --git a/src/backend/utils/cache/lsyscache.c b/src/backend/utils/cache/lsyscache.c
index 4b5ef99..1d581a8 100644
--- a/src/backend/utils/cache/lsyscache.c
+++ b/src/backend/utils/cache/lsyscache.c
@@ -972,6 +972,33 @@ get_atttypetypmodcoll(Oid relid, AttrNumber attnum,
 	ReleaseSysCache(tp);
 }
 
+/*
+ * get_attnotnull
+ *
+ *		Given the relation id and the attribute number,
+ *		return the "attnotnull" field from the attribute relation.
+ */
+bool
+get_attnotnull(Oid relid, AttrNumber attnum)
+{
+	HeapTuple	tp;
+
+	tp = SearchSysCache2(ATTNUM,
+						 ObjectIdGetDatum(relid),
+						 Int16GetDatum(attnum));
+	if (HeapTupleIsValid(tp))
+	{
+		Form_pg_attribute att_tup = (Form_pg_attribute) GETSTRUCT(tp);
+		bool		result;
+
+		result = att_tup->attnotnull;
+		ReleaseSysCache(tp);
+		return result;
+	}
+	else
+		return false;
+}
+
 /*				---------- COLLATION CACHE ----------					 */
 
 /*
diff --git a/src/include/commands/trigger.h b/src/include/commands/trigger.h
index d0b0356..34a75e4 100644
--- a/src/include/commands/trigger.h
+++ b/src/include/commands/trigger.h
@@ -181,6 +181,7 @@ extern void ExecBSTruncateTriggers(EState *estate,
 extern void ExecASTruncateTriggers(EState *estate,
 					   ResultRelInfo *relinfo);
 
+extern bool AfterTriggerQueueIsEmpty(void);
 extern void AfterTriggerBeginXact(void);
 extern void AfterTriggerBeginQuery(void);
 extern void AfterTriggerEndQuery(EState *estate);
diff --git a/src/include/nodes/relation.h b/src/include/nodes/relation.h
index dacbe9c..8cf5c28 100644
--- a/src/include/nodes/relation.h
+++ b/src/include/nodes/relation.h
@@ -355,6 +355,8 @@ typedef struct PlannerInfo
  *		lateral_referencers - relids of rels that reference this one laterally
  *		indexlist - list of IndexOptInfo nodes for relation's indexes
  *					(always NIL if it's not a table)
+ *		fklist - list of ForeignKeyInfo's for relation's foreign key
+ *					constraints. (always NIL if it's not a table)
  *		pages - number of disk pages in relation (zero if not a table)
  *		tuples - number of tuples in relation (not considering restrictions)
  *		allvisfrac - fraction of disk pages that are marked all-visible
@@ -448,6 +450,7 @@ typedef struct RelOptInfo
 	Relids		lateral_relids; /* minimum parameterization of rel */
 	Relids		lateral_referencers;	/* rels that reference me laterally */
 	List	   *indexlist;		/* list of IndexOptInfo */
+	List	   *fklist;			/* list of ForeignKeyInfo */
 	BlockNumber pages;			/* size estimates derived from pg_class */
 	double		tuples;
 	double		allvisfrac;
@@ -538,6 +541,20 @@ typedef struct IndexOptInfo
 	bool		amhasgetbitmap; /* does AM have amgetbitmap interface? */
 } IndexOptInfo;
 
+typedef struct ForeignKeyInfo
+{
+	Oid			conindid;		/* index supporting this constraint */
+	Oid			confrelid;		/* relation referenced by foreign key */
+	bool		convalidated;	/* constraint has been validated? */
+	Oid			conrelid;		/* relation this constraint constrains */
+	char		confupdtype;	/* foreign key's ON UPDATE action */
+	char		confdeltype;	/* foreign key's ON DELETE action */
+	char		confmatchtype;	/* foreign key's match type */
+	int			conncols;		/* number of columns in the foreign key */
+	int16	   *conkey;			/* columns of conrelid that the constraint applies to */
+	int16	   *confkey;		/* columns of confrelid that the foreign key references */
+	Oid		   *conpfeqop;		/* equality operators for comparing PK to FK columns */
+} ForeignKeyInfo;
 
 /*
  * EquivalenceClasses
diff --git a/src/include/optimizer/paths.h b/src/include/optimizer/paths.h
index 9b22fda..00716c9 100644
--- a/src/include/optimizer/paths.h
+++ b/src/include/optimizer/paths.h
@@ -108,6 +108,7 @@ extern EquivalenceClass *get_eclass_for_sort_expr(PlannerInfo *root,
 						 Relids rel,
 						 bool create_it);
 extern void generate_base_implied_equalities(PlannerInfo *root);
+extern void remove_rel_from_eclass(PlannerInfo *root, int relid);
 extern List *generate_join_implied_equalities(PlannerInfo *root,
 								 Relids join_relids,
 								 Relids outer_relids,
diff --git a/src/include/utils/lsyscache.h b/src/include/utils/lsyscache.h
index f46460a..3ec200a 100644
--- a/src/include/utils/lsyscache.h
+++ b/src/include/utils/lsyscache.h
@@ -70,6 +70,7 @@ extern Oid	get_atttype(Oid relid, AttrNumber attnum);
 extern int32 get_atttypmod(Oid relid, AttrNumber attnum);
 extern void get_atttypetypmodcoll(Oid relid, AttrNumber attnum,
 					  Oid *typid, int32 *typmod, Oid *collid);
+extern bool get_attnotnull(Oid relid, AttrNumber attnum);
 extern char *get_collation_name(Oid colloid);
 extern char *get_constraint_name(Oid conoid);
 extern Oid	get_opclass_family(Oid opclass);
diff --git a/src/test/regress/expected/join.out b/src/test/regress/expected/join.out
index 1cb1c51..8530733 100644
--- a/src/test/regress/expected/join.out
+++ b/src/test/regress/expected/join.out
@@ -3249,6 +3249,317 @@ select i8.* from int8_tbl i8 left join (select f1 from int4_tbl group by f1) i4
 (1 row)
 
 rollback;
+BEGIN;
+-- Test join removals for semi and anti joins
+CREATE TEMP TABLE b (id INT NOT NULL PRIMARY KEY);
+CREATE TEMP TABLE a (id INT NOT NULL PRIMARY KEY, b_id INT REFERENCES b(id));
+-- should remove semi join to b
+EXPLAIN (COSTS OFF)
+SELECT id FROM a WHERE b_id IN(SELECT id FROM b);
+          QUERY PLAN          
+------------------------------
+ Seq Scan on a
+   Filter: (b_id IS NOT NULL)
+(2 rows)
+
+-- should remove semi join to b
+EXPLAIN (COSTS OFF)
+SELECT id FROM a WHERE EXISTS(SELECT 1 FROM b WHERE a.b_id = id);
+          QUERY PLAN          
+------------------------------
+ Seq Scan on a
+   Filter: (b_id IS NOT NULL)
+(2 rows)
+
+-- should remove anti join to b
+EXPLAIN (COSTS OFF)
+SELECT id FROM a WHERE NOT EXISTS(SELECT 1 FROM b WHERE a.b_id = id);
+        QUERY PLAN        
+--------------------------
+ Seq Scan on a
+   Filter: (b_id IS NULL)
+(2 rows)
+
+-- should not remove anti join as id > 100 will void
+-- the foreign key's guarantee that 1 will exist.
+EXPLAIN (COSTS OFF)
+SELECT id FROM a WHERE NOT EXISTS(SELECT 1 FROM b WHERE a.b_id = id AND id > 100);
+                  QUERY PLAN                   
+-----------------------------------------------
+ Hash Anti Join
+   Hash Cond: (a.b_id = b.id)
+   ->  Seq Scan on a
+   ->  Hash
+         ->  Bitmap Heap Scan on b
+               Recheck Cond: (id > 100)
+               ->  Bitmap Index Scan on b_pkey
+                     Index Cond: (id > 100)
+(8 rows)
+
+-- should remove semi join to b (swapped condition order)
+EXPLAIN (COSTS OFF)
+SELECT id FROM a WHERE EXISTS(SELECT 1 FROM b WHERE id = a.b_id);
+          QUERY PLAN          
+------------------------------
+ Seq Scan on a
+   Filter: (b_id IS NOT NULL)
+(2 rows)
+
+-- should not remove semi join (since not using equals)
+EXPLAIN (COSTS OFF)
+SELECT id FROM a WHERE EXISTS(SELECT 1 FROM b WHERE id >= a.b_id);
+               QUERY PLAN                
+-----------------------------------------
+ Nested Loop Semi Join
+   ->  Seq Scan on a
+   ->  Index Only Scan using b_pkey on b
+         Index Cond: (id >= a.b_id)
+(4 rows)
+
+-- should not remove semi join
+EXPLAIN (COSTS OFF)
+SELECT id FROM a WHERE b_id+0 IN(SELECT id FROM b);
+             QUERY PLAN             
+------------------------------------
+ Hash Semi Join
+   Hash Cond: ((a.b_id + 0) = b.id)
+   ->  Seq Scan on a
+   ->  Hash
+         ->  Seq Scan on b
+(5 rows)
+
+-- should not remove semi join
+EXPLAIN (COSTS OFF)
+SELECT id FROM a WHERE b_id IN(SELECT id+0 FROM b);
+             QUERY PLAN             
+------------------------------------
+ Hash Semi Join
+   Hash Cond: (a.b_id = (b.id + 0))
+   ->  Seq Scan on a
+   ->  Hash
+         ->  Seq Scan on b
+(5 rows)
+
+-- should not remove semi join (wrong column)
+EXPLAIN (COSTS OFF)
+SELECT id FROM a WHERE id IN(SELECT id FROM b);
+         QUERY PLAN         
+----------------------------
+ Hash Join
+   Hash Cond: (b.id = a.id)
+   ->  Seq Scan on b
+   ->  Hash
+         ->  Seq Scan on a
+(5 rows)
+
+ROLLBACK;
+BEGIN;
+-- Semi join removal code with 2 column foreign keys
+CREATE TEMP TABLE b (id1 INT NOT NULL, id2 INT NOT NULL, PRIMARY KEY(id1,id2));
+CREATE TEMP TABLE a (id INT NOT NULL PRIMARY KEY, b_id1 INT, b_id2 INT);
+ALTER TABLE a ADD CONSTRAINT a_b_id1_b_id2_fkey FOREIGN KEY (b_id1,b_id2) REFERENCES b(id1,id2) MATCH SIMPLE;
+-- should remove semi join to b
+EXPLAIN (COSTS OFF)
+SELECT id FROM a WHERE EXISTS(SELECT 1 FROM b WHERE a.b_id1 = id1 AND a.b_id2 = id2);
+                       QUERY PLAN                        
+---------------------------------------------------------
+ Seq Scan on a
+   Filter: ((b_id1 IS NOT NULL) AND (b_id2 IS NOT NULL))
+(2 rows)
+
+-- should remove anti join to b
+EXPLAIN (COSTS OFF)
+SELECT id FROM a WHERE NOT EXISTS(SELECT 1 FROM b WHERE a.b_id1 = id1 AND a.b_id2 = id2);
+                   QUERY PLAN                   
+------------------------------------------------
+ Seq Scan on a
+   Filter: ((b_id1 IS NULL) OR (b_id2 IS NULL))
+(2 rows)
+
+-- should not remove semi join to b (extra condition)
+EXPLAIN (COSTS OFF)
+SELECT id FROM a WHERE EXISTS(SELECT 1 FROM b WHERE a.b_id1 = id1 AND a.b_id2 = id2 AND a.b_id2 >= id2);
+                       QUERY PLAN                       
+--------------------------------------------------------
+ Hash Semi Join
+   Hash Cond: ((a.b_id1 = b.id1) AND (a.b_id2 = b.id2))
+   Join Filter: (a.b_id2 >= b.id2)
+   ->  Seq Scan on a
+   ->  Hash
+         ->  Seq Scan on b
+(6 rows)
+
+-- should not remove semi join to b (wrong operator)
+EXPLAIN (COSTS OFF)
+SELECT id FROM a WHERE EXISTS(SELECT 1 FROM b WHERE a.b_id1 > id1 AND a.b_id2 < id2);
+                        QUERY PLAN                         
+-----------------------------------------------------------
+ Nested Loop Semi Join
+   ->  Seq Scan on a
+   ->  Index Only Scan using b_pkey on b
+         Index Cond: ((id1 < a.b_id1) AND (id2 > a.b_id2))
+(4 rows)
+
+-- should not remove semi join (only checking id1)
+EXPLAIN (COSTS OFF)
+SELECT id FROM a WHERE EXISTS(SELECT 1 FROM b WHERE a.b_id1 = id1);
+           QUERY PLAN            
+---------------------------------
+ Hash Join
+   Hash Cond: (a.b_id1 = b.id1)
+   ->  Seq Scan on a
+   ->  Hash
+         ->  HashAggregate
+               Group Key: b.id1
+               ->  Seq Scan on b
+(7 rows)
+
+-- should not remove semi join (only checking id2)
+EXPLAIN (COSTS OFF)
+SELECT id FROM a WHERE EXISTS(SELECT 1 FROM b WHERE a.b_id2 = id2);
+           QUERY PLAN            
+---------------------------------
+ Hash Join
+   Hash Cond: (a.b_id2 = b.id2)
+   ->  Seq Scan on a
+   ->  Hash
+         ->  HashAggregate
+               Group Key: b.id2
+               ->  Seq Scan on b
+(7 rows)
+
+-- should not remove semi join (checking wrong columns)
+EXPLAIN (COSTS OFF)
+SELECT id FROM a WHERE EXISTS(SELECT 1 FROM b WHERE a.b_id2 = id1 AND a.b_id1 = id2);
+                       QUERY PLAN                       
+--------------------------------------------------------
+ Hash Join
+   Hash Cond: ((a.b_id2 = b.id1) AND (a.b_id1 = b.id2))
+   ->  Seq Scan on a
+   ->  Hash
+         ->  Seq Scan on b
+(5 rows)
+
+-- should not remove semi join (no check for id2)
+EXPLAIN (COSTS OFF)
+SELECT id FROM a WHERE EXISTS(SELECT 1 FROM b WHERE a.b_id2 = id1 AND a.b_id1 = id1);
+               QUERY PLAN                
+-----------------------------------------
+ Nested Loop Semi Join
+   ->  Seq Scan on a
+         Filter: (b_id2 = b_id1)
+   ->  Index Only Scan using b_pkey on b
+         Index Cond: (id1 = a.b_id2)
+(5 rows)
+
+-- should not remove semi join (no check for b_id2)
+EXPLAIN (COSTS OFF)
+SELECT id FROM a WHERE EXISTS(SELECT 1 FROM b WHERE a.b_id1 = id1 AND a.b_id1 = id2);
+            QUERY PLAN             
+-----------------------------------
+ Hash Join
+   Hash Cond: (a.b_id1 = b.id1)
+   ->  Seq Scan on a
+   ->  Hash
+         ->  Seq Scan on b
+               Filter: (id1 = id2)
+(6 rows)
+
+-- should not remove semi join (no check for b_id2)
+EXPLAIN (COSTS OFF)
+SELECT id FROM a WHERE EXISTS(SELECT 1 FROM b WHERE a.b_id1 = id1 AND a.b_id1 = id1);
+              QUERY PLAN               
+---------------------------------------
+ Hash Join
+   Hash Cond: (a.b_id1 = b.id1)
+   ->  Seq Scan on a
+   ->  Hash
+         ->  HashAggregate
+               Group Key: b.id1, b.id1
+               ->  Seq Scan on b
+(7 rows)
+
+-- Check that the IS NULL and IS NOT NULL filters are not added
+-- for columns which have a NOT NULL constraint.
+ALTER TABLE a ALTER COLUMN b_id1 SET NOT NULL;
+-- Should only filter on b_id2 IS NOT NULL
+EXPLAIN (COSTS OFF)
+SELECT id FROM a WHERE EXISTS(SELECT 1 FROM b WHERE a.b_id1 = id1 AND a.b_id2 = id2);
+          QUERY PLAN           
+-------------------------------
+ Seq Scan on a
+   Filter: (b_id2 IS NOT NULL)
+(2 rows)
+
+-- Should only filter on b_id2 IS NULL
+EXPLAIN (COSTS OFF)
+SELECT id FROM a WHERE NOT EXISTS(SELECT 1 FROM b WHERE a.b_id1 = id1 AND a.b_id2 = id2);
+        QUERY PLAN         
+---------------------------
+ Seq Scan on a
+   Filter: (b_id2 IS NULL)
+(2 rows)
+
+ALTER TABLE a ALTER COLUMN b_id2 SET NOT NULL;
+-- No IS NOT NULL filters should be added.
+EXPLAIN (COSTS OFF)
+SELECT id FROM a WHERE EXISTS(SELECT 1 FROM b WHERE a.b_id1 = id1 AND a.b_id2 = id2);
+  QUERY PLAN   
+---------------
+ Seq Scan on a
+(1 row)
+
+-- Since now neither b_id1 or b_id2 can be NULL this query can't
+-- produce any records. Check that we get a One-Time Filter: false
+EXPLAIN (COSTS OFF)
+SELECT id FROM a WHERE NOT EXISTS(SELECT 1 FROM b WHERE a.b_id1 = id1 AND a.b_id2 = id2);
+        QUERY PLAN        
+--------------------------
+ Result
+   One-Time Filter: false
+   ->  Seq Scan on a
+(3 rows)
+
+ROLLBACK;
+BEGIN WORK;
+-- In this test we want to ensure that ANTI JOIN removal does not
+-- occur when there are pending foreign key triggers.
+-- We test this by updating a relation which is referenced by a foreign key
+-- and then executing another query which would normally allow the anti
+-- join to be removed. If the anti join was removed then the table
+-- records_violating_fkey would be empty, but here we'll ensure that
+-- the record that we update ends up violating the foreign key.
+CREATE TABLE j2 (id INT NOT NULL PRIMARY KEY);
+CREATE TABLE j1 (
+  id INT PRIMARY KEY,
+  j2_id INT NOT NULL REFERENCES j2 (id) MATCH FULL ON DELETE CASCADE ON UPDATE CASCADE
+);
+INSERT INTO j2 VALUES(10),(20);
+INSERT INTO j1 VALUES(1,10),(2,20);
+-- create a table to store records that 'violate' the fkey.
+CREATE TABLE records_violating_fkey (j2_id INT NOT NULL);
+CREATE OR REPLACE FUNCTION j1_update() RETURNS TRIGGER AS $$
+BEGIN
+  INSERT INTO records_violating_fkey SELECT j2_id FROM j1 WHERE NOT EXISTS(SELECT 1 FROM j2 WHERE j2_id = j2.id);
+  RETURN NEW;
+  END;
+$$ LANGUAGE plpgsql;
+CREATE TRIGGER j1_update_trigger BEFORE UPDATE ON j2 FOR EACH ROW EXECUTE PROCEDURE j1_update();
+-- This update statement will cause some foreign key triggers to be queued.
+-- The trigger defined above will fire which will cause all records which
+-- currently violate the foreign key to be inserted into the records_violating_fkey
+-- table. The intended behaviour of this is that we'll see records violating the
+-- foreign key, however if we incorrectly performed an ANTI JOIN removal, then
+-- we wouldn't see this violation record, as we'd wrongly assume that the query
+-- could not produce any records.
+UPDATE j2 SET id = id + 1;
+SELECT * FROM records_violating_fkey;
+ j2_id 
+-------
+    10
+(1 row)
+
+ROLLBACK;
 create temp table parent (k int primary key, pd int);
 create temp table child (k int unique, cd int);
 insert into parent values (1, 10), (2, 20), (3, 30);
diff --git a/src/test/regress/sql/join.sql b/src/test/regress/sql/join.sql
index fa3e068..66b02ea 100644
--- a/src/test/regress/sql/join.sql
+++ b/src/test/regress/sql/join.sql
@@ -973,6 +973,171 @@ select i8.* from int8_tbl i8 left join (select f1 from int4_tbl group by f1) i4
 
 rollback;
 
+BEGIN;
+
+-- Test join removals for semi and anti joins
+CREATE TEMP TABLE b (id INT NOT NULL PRIMARY KEY);
+CREATE TEMP TABLE a (id INT NOT NULL PRIMARY KEY, b_id INT REFERENCES b(id));
+
+-- should remove semi join to b
+EXPLAIN (COSTS OFF)
+SELECT id FROM a WHERE b_id IN(SELECT id FROM b);
+
+-- should remove semi join to b
+EXPLAIN (COSTS OFF)
+SELECT id FROM a WHERE EXISTS(SELECT 1 FROM b WHERE a.b_id = id);
+
+-- should remove anti join to b
+EXPLAIN (COSTS OFF)
+SELECT id FROM a WHERE NOT EXISTS(SELECT 1 FROM b WHERE a.b_id = id);
+
+-- should not remove anti join as id > 100 will void
+-- the foreign key's guarantee that 1 will exist.
+EXPLAIN (COSTS OFF)
+SELECT id FROM a WHERE NOT EXISTS(SELECT 1 FROM b WHERE a.b_id = id AND id > 100);
+
+-- should remove semi join to b (swapped condition order)
+EXPLAIN (COSTS OFF)
+SELECT id FROM a WHERE EXISTS(SELECT 1 FROM b WHERE id = a.b_id);
+
+-- should not remove semi join (since not using equals)
+EXPLAIN (COSTS OFF)
+SELECT id FROM a WHERE EXISTS(SELECT 1 FROM b WHERE id >= a.b_id);
+
+-- should not remove semi join
+EXPLAIN (COSTS OFF)
+SELECT id FROM a WHERE b_id+0 IN(SELECT id FROM b);
+
+-- should not remove semi join
+EXPLAIN (COSTS OFF)
+SELECT id FROM a WHERE b_id IN(SELECT id+0 FROM b);
+
+-- should not remove semi join (wrong column)
+EXPLAIN (COSTS OFF)
+SELECT id FROM a WHERE id IN(SELECT id FROM b);
+
+ROLLBACK;
+
+BEGIN;
+
+-- Semi join removal code with 2 column foreign keys
+
+CREATE TEMP TABLE b (id1 INT NOT NULL, id2 INT NOT NULL, PRIMARY KEY(id1,id2));
+CREATE TEMP TABLE a (id INT NOT NULL PRIMARY KEY, b_id1 INT, b_id2 INT);
+
+ALTER TABLE a ADD CONSTRAINT a_b_id1_b_id2_fkey FOREIGN KEY (b_id1,b_id2) REFERENCES b(id1,id2) MATCH SIMPLE;
+
+-- should remove semi join to b
+EXPLAIN (COSTS OFF)
+SELECT id FROM a WHERE EXISTS(SELECT 1 FROM b WHERE a.b_id1 = id1 AND a.b_id2 = id2);
+
+-- should remove anti join to b
+EXPLAIN (COSTS OFF)
+SELECT id FROM a WHERE NOT EXISTS(SELECT 1 FROM b WHERE a.b_id1 = id1 AND a.b_id2 = id2);
+
+-- should not remove semi join to b (extra condition)
+EXPLAIN (COSTS OFF)
+SELECT id FROM a WHERE EXISTS(SELECT 1 FROM b WHERE a.b_id1 = id1 AND a.b_id2 = id2 AND a.b_id2 >= id2);
+
+-- should not remove semi join to b (wrong operator)
+EXPLAIN (COSTS OFF)
+SELECT id FROM a WHERE EXISTS(SELECT 1 FROM b WHERE a.b_id1 > id1 AND a.b_id2 < id2);
+
+-- should not remove semi join (only checking id1)
+EXPLAIN (COSTS OFF)
+SELECT id FROM a WHERE EXISTS(SELECT 1 FROM b WHERE a.b_id1 = id1);
+
+-- should not remove semi join (only checking id2)
+EXPLAIN (COSTS OFF)
+SELECT id FROM a WHERE EXISTS(SELECT 1 FROM b WHERE a.b_id2 = id2);
+
+-- should not remove semi join (checking wrong columns)
+EXPLAIN (COSTS OFF)
+SELECT id FROM a WHERE EXISTS(SELECT 1 FROM b WHERE a.b_id2 = id1 AND a.b_id1 = id2);
+
+-- should not remove semi join (no check for id2)
+EXPLAIN (COSTS OFF)
+SELECT id FROM a WHERE EXISTS(SELECT 1 FROM b WHERE a.b_id2 = id1 AND a.b_id1 = id1);
+
+-- should not remove semi join (no check for b_id2)
+EXPLAIN (COSTS OFF)
+SELECT id FROM a WHERE EXISTS(SELECT 1 FROM b WHERE a.b_id1 = id1 AND a.b_id1 = id2);
+
+-- should not remove semi join (no check for b_id2)
+EXPLAIN (COSTS OFF)
+SELECT id FROM a WHERE EXISTS(SELECT 1 FROM b WHERE a.b_id1 = id1 AND a.b_id1 = id1);
+
+
+-- Check that the IS NULL and IS NOT NULL filters are not added
+-- for columns which have a NOT NULL constraint.
+ALTER TABLE a ALTER COLUMN b_id1 SET NOT NULL;
+
+-- Should only filter on b_id2 IS NOT NULL
+EXPLAIN (COSTS OFF)
+SELECT id FROM a WHERE EXISTS(SELECT 1 FROM b WHERE a.b_id1 = id1 AND a.b_id2 = id2);
+
+-- Should only filter on b_id2 IS NULL
+EXPLAIN (COSTS OFF)
+SELECT id FROM a WHERE NOT EXISTS(SELECT 1 FROM b WHERE a.b_id1 = id1 AND a.b_id2 = id2);
+
+ALTER TABLE a ALTER COLUMN b_id2 SET NOT NULL;
+
+-- No IS NOT NULL filters should be added.
+EXPLAIN (COSTS OFF)
+SELECT id FROM a WHERE EXISTS(SELECT 1 FROM b WHERE a.b_id1 = id1 AND a.b_id2 = id2);
+
+-- Since now neither b_id1 or b_id2 can be NULL this query can't
+-- produce any records. Check that we get a One-Time Filter: false
+EXPLAIN (COSTS OFF)
+SELECT id FROM a WHERE NOT EXISTS(SELECT 1 FROM b WHERE a.b_id1 = id1 AND a.b_id2 = id2);
+
+ROLLBACK;
+
+BEGIN WORK;
+
+-- In this test we want to ensure that ANTI JOIN removal does not
+-- occur when there are pending foreign key triggers.
+-- We test this by updating a relation which is referenced by a foreign key
+-- and then executing another query which would normally allow the anti
+-- join to be removed. If the anti join was removed then the table
+-- records_violating_fkey would be empty, but here we'll ensure that
+-- the record that we update ends up violating the foreign key.
+
+CREATE TABLE j2 (id INT NOT NULL PRIMARY KEY);
+CREATE TABLE j1 (
+  id INT PRIMARY KEY,
+  j2_id INT NOT NULL REFERENCES j2 (id) MATCH FULL ON DELETE CASCADE ON UPDATE CASCADE
+);
+
+INSERT INTO j2 VALUES(10),(20);
+INSERT INTO j1 VALUES(1,10),(2,20);
+
+-- create a table to store records that 'violate' the fkey.
+CREATE TABLE records_violating_fkey (j2_id INT NOT NULL);
+
+CREATE OR REPLACE FUNCTION j1_update() RETURNS TRIGGER AS $$
+BEGIN
+  INSERT INTO records_violating_fkey SELECT j2_id FROM j1 WHERE NOT EXISTS(SELECT 1 FROM j2 WHERE j2_id = j2.id);
+  RETURN NEW;
+  END;
+$$ LANGUAGE plpgsql;
+
+CREATE TRIGGER j1_update_trigger BEFORE UPDATE ON j2 FOR EACH ROW EXECUTE PROCEDURE j1_update();
+
+-- This update statement will cause some foreign key triggers to be queued.
+-- The trigger defined above will fire which will cause all records which
+-- currently violate the foreign key to be inserted into the records_violating_fkey
+-- table. The intended behaviour of this is that we'll see records violating the
+-- foreign key, however if we incorrectly performed an ANTI JOIN removal, then
+-- we wouldn't see this violation record, as we'd wrongly assume that the query
+-- could not produce any records.
+
+UPDATE j2 SET id = id + 1;
+
+SELECT * FROM records_violating_fkey;
+
+ROLLBACK;
+
 create temp table parent (k int primary key, pd int);
 create temp table child (k int unique, cd int);
 insert into parent values (1, 10), (2, 20), (3, 30);
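
The two-column regression tests in the patch above suggest a matching rule: the join can only be removed when the join clauses cover every (referencing column, referenced column) pair of the foreign key with the key's own equality operator, and nothing else. The following is a rough standalone sketch of that rule, not the patch's actual matching code. FkModel, JoinClauseModel, clauses_match_fk, MAX_FK_COLS and the EQ_OP/GE_OP constants are hypothetical names made up for illustration; the real code works on ForeignKeyInfo, Vars and operator OIDs.

```c
#include <stdbool.h>
#include <stddef.h>

#define MAX_FK_COLS 8			/* hypothetical limit for this sketch */

/* Hypothetical stand-ins for operator OIDs. */
enum { EQ_OP = 1, GE_OP = 2 };

/* Simplified model of the patch's ForeignKeyInfo. */
typedef struct FkModel
{
	int		ncols;					/* like conncols */
	int		conkey[MAX_FK_COLS];	/* referencing column numbers */
	int		confkey[MAX_FK_COLS];	/* referenced column numbers */
	int		eqop[MAX_FK_COLS];		/* like conpfeqop */
} FkModel;

/* Simplified model of a join clause "referencing.col OP referenced.col". */
typedef struct JoinClauseModel
{
	int		fkcol;		/* column of the referencing relation */
	int		pkcol;		/* column of the referenced relation */
	int		opno;		/* operator used by the clause */
} JoinClauseModel;

/*
 * Return true only if every clause corresponds to one column pair of the
 * foreign key (using that pair's equality operator) and every column pair
 * of the foreign key is covered by at least one clause.
 */
static bool
clauses_match_fk(const FkModel *fk, const JoinClauseModel *clauses,
				 size_t nclauses)
{
	bool	seen[MAX_FK_COLS] = {false};
	int		nmatched = 0;

	if (fk->ncols <= 0 || fk->ncols > MAX_FK_COLS)
		return false;

	for (size_t i = 0; i < nclauses; i++)
	{
		bool	found = false;

		for (int k = 0; k < fk->ncols; k++)
		{
			if (clauses[i].fkcol == fk->conkey[k] &&
				clauses[i].pkcol == fk->confkey[k] &&
				clauses[i].opno == fk->eqop[k])
			{
				if (!seen[k])
				{
					seen[k] = true;
					nmatched++;
				}
				found = true;
				break;
			}
		}

		/* A clause outside the foreign key voids the guarantee. */
		if (!found)
			return false;
	}

	return nmatched == fk->ncols;
}
```

In this model, table a's columns (id, b_id1, b_id2) are numbered 1 to 3 and b's columns (id1, id2) are numbered 1 and 2, so the test foreign key has conkey {2, 3} and confkey {1, 2}. The "only checking id1", "checking wrong columns" and "extra condition" cases from the tests are all rejected, while clause order does not matter.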

David Rowley

dgrowleyml@gmail.com

over 11 years ago

In reply to: David Rowley (#1)

1 attachment(s)

Re: Patch to support SEMI and ANTI join removal

On Tue, Aug 5, 2014 at 10:35 PM, David Rowley <dgrowleyml@gmail.com> wrote:

> The patch (attached) is also now able to detect when a NOT EXISTS clause
> cannot produce any records at all.

I've attached an updated version of the patch which fixes up some
incorrect logic in the foreign key matching code, plus various comment
improvements.
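
As a rough illustration of what replaces a removed join, going by the expected plans in the regression tests: a removed semi join becomes an AND of IS NOT NULL tests, a removed anti join becomes an OR of IS NULL tests, columns declared NOT NULL are skipped, and an anti join over columns that are all NOT NULL cannot return any rows. The sketch below only models that decision and is not the patch's code. RemovedJoinKind, ReplacementFilter and choose_replacement_filter are hypothetical names, and it assumes the foreign key has already been matched and no foreign key triggers are pending.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical: which kind of join was proven removable. */
typedef enum
{
	REMOVED_SEMI_JOIN,
	REMOVED_ANTI_JOIN
} RemovedJoinKind;

/* Hypothetical: the shape of the filter left on the referencing relation. */
typedef enum
{
	FILTER_NONE,			/* every row qualifies, no filter needed */
	FILTER_ALL_NOT_NULL,	/* AND of "col IS NOT NULL" over nullable columns */
	FILTER_ANY_NULL,		/* OR of "col IS NULL" over nullable columns */
	FILTER_CONST_FALSE		/* no row can qualify */
} ReplacementFilter;

/*
 * attnotnull[i] is true when the i'th foreign key column has a NOT NULL
 * constraint (what get_attnotnull() reports in the patch).  The number of
 * columns that still need a NULL test is returned in *nullable_cols.
 */
static ReplacementFilter
choose_replacement_filter(RemovedJoinKind kind, const bool *attnotnull,
						  size_t ncols, size_t *nullable_cols)
{
	size_t	n = 0;

	for (size_t i = 0; i < ncols; i++)
	{
		if (!attnotnull[i])
			n++;
	}
	*nullable_cols = n;

	if (kind == REMOVED_SEMI_JOIN)
		return (n == 0) ? FILTER_NONE : FILTER_ALL_NOT_NULL;

	/* Anti join: a row survives only if some key column is NULL. */
	return (n == 0) ? FILTER_CONST_FALSE : FILTER_ANY_NULL;
}
```

With both columns nullable this corresponds to the "(b_id1 IS NOT NULL) AND (b_id2 IS NOT NULL)" and "(b_id1 IS NULL) OR (b_id2 IS NULL)" plans in the tests, and with both columns NOT NULL the anti join case corresponds to the "One-Time Filter: false" plan.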

Regards

David Rowley

Attachments:

semianti_join_removal_f92541e_2014-08-10.patch (application/octet-stream)

diff --git a/src/backend/commands/trigger.c b/src/backend/commands/trigger.c
index 9bf0098..88c8d98 100644
--- a/src/backend/commands/trigger.c
+++ b/src/backend/commands/trigger.c
@@ -3887,6 +3887,17 @@ afterTriggerInvokeEvents(AfterTriggerEventList *events,
 	return all_fired;
 }
 
+/* ----------
+ * AfterTriggerQueueIsEmpty()
+ *
+ *	True if there are no pending triggers in the queue.
+ * ----------
+ */
+bool
+AfterTriggerQueueIsEmpty(void)
+{
+	return (afterTriggers->query_depth == -1);
+}
 
 /* ----------
  * AfterTriggerBeginXact()
diff --git a/src/backend/optimizer/path/equivclass.c b/src/backend/optimizer/path/equivclass.c
index b7aff37..63dbc1b 100644
--- a/src/backend/optimizer/path/equivclass.c
+++ b/src/backend/optimizer/path/equivclass.c
@@ -32,6 +32,7 @@
 static EquivalenceMember *add_eq_member(EquivalenceClass *ec,
 			  Expr *expr, Relids relids, Relids nullable_relids,
 			  bool is_child, Oid datatype);
+static void update_rel_class_joins(PlannerInfo *root);
 static void generate_base_implied_equalities_const(PlannerInfo *root,
 									   EquivalenceClass *ec);
 static void generate_base_implied_equalities_no_const(PlannerInfo *root,
@@ -725,7 +726,6 @@ void
 generate_base_implied_equalities(PlannerInfo *root)
 {
 	ListCell   *lc;
-	Index		rti;
 
 	foreach(lc, root->eq_classes)
 	{
@@ -752,6 +752,19 @@ generate_base_implied_equalities(PlannerInfo *root)
 	 * This is also a handy place to mark base rels (which should all exist by
 	 * now) with flags showing whether they have pending eclass joins.
 	 */
+	update_rel_class_joins(root);
+}
+
+/*
+ * update_rel_class_joins
+ *		Process each relation in the PlannerInfo to update the
+ *		has_eclass_joins flag
+ */
+static void
+update_rel_class_joins(PlannerInfo *root)
+{
+	Index		rti;
+
 	for (rti = 1; rti < root->simple_rel_array_size; rti++)
 	{
 		RelOptInfo *brel = root->simple_rel_array[rti];
@@ -764,6 +777,63 @@ generate_base_implied_equalities(PlannerInfo *root)
 }
 
 /*
+ * remove_rel_from_eclass
+ *		Remove all eclass members that belong to relid and also any classes
+ *		which have been left empty as a result of removing a member.
+ */
+void
+remove_rel_from_eclass(PlannerInfo *root, int relid)
+{
+	ListCell	*l,
+				*nextl,
+				*eqm,
+				*eqmnext;
+
+	bool removedany = false;
+
+	/* Strip all traces of this relation out of the eclasses */
+	for (l = list_head(root->eq_classes); l != NULL; l = nextl)
+	{
+		EquivalenceClass *ec = (EquivalenceClass *) lfirst(l);
+
+		nextl = lnext(l);
+
+		for (eqm = list_head(ec->ec_members); eqm != NULL; eqm = eqmnext)
+		{
+			EquivalenceMember *em = (EquivalenceMember *) lfirst(eqm);
+
+			eqmnext = lnext(eqm);
+
+			if (IsA(em->em_expr, Var))
+			{
+				Var *var = (Var *) em->em_expr;
+
+				if (var->varno == relid)
+				{
+					ec->ec_members = list_delete_ptr(ec->ec_members, em);
+					removedany = true;
+				}
+			}
+		}
+
+		/*
+		 * If we've removed the last member from the EquivalenceClass then we'd
+		 * better delete the entire entry.
+		 */
+		if (list_length(ec->ec_members) == 0)
+			root->eq_classes = list_delete_ptr(root->eq_classes, ec);
+	}
+
+	/*
+	 * If we removed any eclass members, then whether a relation has an
+	 * eclass join may have changed, so force the has_eclass_joins flags
+	 * to be recomputed.
+	 */
+	if (removedany)
+		update_rel_class_joins(root);
+}
+
+/*
  * generate_base_implied_equalities when EC contains pseudoconstant(s)
  */
 static void
diff --git a/src/backend/optimizer/plan/analyzejoins.c b/src/backend/optimizer/plan/analyzejoins.c
index 773f8a4..4e910a3 100644
--- a/src/backend/optimizer/plan/analyzejoins.c
+++ b/src/backend/optimizer/plan/analyzejoins.c
@@ -22,17 +22,33 @@
  */
 #include "postgres.h"
 
+#include "commands/trigger.h"
+#include "nodes/makefuncs.h"
 #include "nodes/nodeFuncs.h"
+#include "nodes/relation.h"
 #include "optimizer/clauses.h"
 #include "optimizer/joininfo.h"
 #include "optimizer/pathnode.h"
 #include "optimizer/paths.h"
 #include "optimizer/planmain.h"
+#include "optimizer/restrictinfo.h"
 #include "optimizer/tlist.h"
 #include "utils/lsyscache.h"
 
 /* local functions */
-static bool join_is_removable(PlannerInfo *root, SpecialJoinInfo *sjinfo);
+static bool leftjoin_is_removable(PlannerInfo *root, SpecialJoinInfo *sjinfo);
+static bool semiorantijoin_is_removable(PlannerInfo *root,
+					  SpecialJoinInfo *sjinfo, List **leftrelcolumns,
+					  RelOptInfo **leftrel);
+static void convert_semijoin_to_isnotnull_quals(PlannerInfo *root,
+					  RelOptInfo *rel, List *columnlist);
+static void convert_antijoin_to_isnull_quals(PlannerInfo *root,
+					  RelOptInfo *rel, List *columnlist);
+static bool relation_has_foreign_key_for(PlannerInfo *root, RelOptInfo *rel,
+					  RelOptInfo *referencedrel, List *referencing_vars,
+					  List *index_vars, List *operator_list);
+static bool expressions_match_foreign_key(ForeignKeyInfo *fk, List *fkvars,
+					  List *indexvars, List *operators);
 static void remove_rel_from_query(PlannerInfo *root, int relid,
 					  Relids joinrelids);
 static List *remove_rel_from_joinlist(List *joinlist, int relid, int *nremoved);
@@ -53,8 +69,8 @@ remove_useless_joins(PlannerInfo *root, List *joinlist)
 	ListCell   *lc;
 
 	/*
-	 * We are only interested in relations that are left-joined to, so we can
-	 * scan the join_info_list to find them easily.
+	 * We are only interested in relations that are left, semi or anti-joined
+	 * to, so we can scan the join_info_list to find them easily.
 	 */
 restart:
 	foreach(lc, root->join_info_list)
@@ -63,14 +79,41 @@ restart:
 		int			innerrelid;
 		int			nremoved;
 
-		/* Skip if not removable */
-		if (!join_is_removable(root, sjinfo))
-			continue;
+		if (sjinfo->jointype == JOIN_LEFT)
+		{
+			/* Skip if not removable */
+			if (!leftjoin_is_removable(root, sjinfo))
+				continue;
+		}
+		else if (sjinfo->jointype == JOIN_SEMI)
+		{
+			List	   *columnlist;
+			RelOptInfo *rel;
+
+			if (!semiorantijoin_is_removable(root, sjinfo, &columnlist, &rel))
+				continue;
+
+			Assert(columnlist != NIL);
+			convert_semijoin_to_isnotnull_quals(root, rel, columnlist);
+		}
+		else if (sjinfo->jointype == JOIN_ANTI)
+		{
+			List	   *columnlist;
+			RelOptInfo *rel;
+
+			if (!semiorantijoin_is_removable(root, sjinfo, &columnlist, &rel))
+				continue;
+
+			Assert(columnlist != NIL);
+			convert_antijoin_to_isnull_quals(root, rel, columnlist);
+		}
+		else
+			continue; /* we don't support this join type */
 
 		/*
-		 * Currently, join_is_removable can only succeed when the sjinfo's
-		 * righthand is a single baserel.  Remove that rel from the query and
-		 * joinlist.
+		 * Currently, all of the functions which test if join removals are
+		 * possible can only succeed when the sjinfo's righthand is a single
+		 * baserel.  Remove that rel from the query and joinlist.
 		 */
 		innerrelid = bms_singleton_member(sjinfo->min_righthand);
 
@@ -136,8 +179,8 @@ clause_sides_match_join(RestrictInfo *rinfo, Relids outerrelids,
 }
 
 /*
- * join_is_removable
- *	  Check whether we need not perform this special join at all, because
+ * leftjoin_is_removable
+ *	  Check whether we need not perform this left join at all, because
  *	  it will just duplicate its left input.
  *
  * This is true for a left join for which the join condition cannot match
@@ -147,7 +190,7 @@ clause_sides_match_join(RestrictInfo *rinfo, Relids outerrelids,
  * above the join.
  */
 static bool
-join_is_removable(PlannerInfo *root, SpecialJoinInfo *sjinfo)
+leftjoin_is_removable(PlannerInfo *root, SpecialJoinInfo *sjinfo)
 {
 	int			innerrelid;
 	RelOptInfo *innerrel;
@@ -157,12 +200,13 @@ join_is_removable(PlannerInfo *root, SpecialJoinInfo *sjinfo)
 	ListCell   *l;
 	int			attroff;
 
+	Assert(sjinfo->jointype == JOIN_LEFT);
+
 	/*
-	 * Must be a non-delaying left join to a single baserel, else we aren't
+	 * Must be a non-delaying join to a single baserel, else we aren't
 	 * going to be able to do anything with it.
 	 */
-	if (sjinfo->jointype != JOIN_LEFT ||
-		sjinfo->delay_upper_joins ||
+	if (sjinfo->delay_upper_joins ||
 		bms_membership(sjinfo->min_righthand) != BMS_SINGLETON)
 		return false;
 
@@ -367,6 +411,557 @@ join_is_removable(PlannerInfo *root, SpecialJoinInfo *sjinfo)
 	return false;
 }
 
+/*
+ * semiorantijoin_is_removable
+ *	  True if we can prove that the semi or anti join is redundant due to the
+ *	  existence of a foreign key constraint.
+ *
+ * Detecting if a SEMI or ANTI join may be removed is quite different to the
+ * detection code for left joins. For these we have no need to check if vars
+ * from the join are used in the query as the EXISTS and IN() syntax disallow
+ * this. In order to prove that a semi or anti join is redundant we must ensure
+ * that a foreign key exists on the left side of the join which references the
+ * table on the right side of the join. This means that we can only support a
+ * single table on either side of the join. We must also ensure that the join
+ * condition matches all the foreign key columns to each index column on the
+ * referenced table. If any columns are missing then we cannot be sure we'll
+ * get at most 1 record back, and if there are any extra conditions that are
+ * not in the foreign key then we cannot be sure that the join condition will
+ * produce at least 1 matching row.
+ *
+ * If we manage to find a foreign key which will allow the join to be removed
+ * then the caller may have to add NULL checking to the query in place of the
+ * join. For example if we determine that the join to the table b is not needed
+ * due to the existence of a foreign key on a.b_id referencing b.id in the
+ * following query:
+ *
+ * SELECT * FROM a WHERE EXISTS(SELECT 1 FROM b WHERE a.b_id = b.id);
+ *
+ * Then the only possible records that could be returned from a are the ones
+ * WHERE b_id IS NOT NULL.
+ *
+ * If this function returns True, then leftrelcolumns will be populated with
+ * the list of columns from the left relation which exist in the join
+ * condition, leftrel will be set to the RelOptInfo of the left hand relation.
+ */
+static bool
+semiorantijoin_is_removable(PlannerInfo *root, SpecialJoinInfo *sjinfo,
+		List **leftrelcolumns, RelOptInfo **leftrel)
+{
+	int			innerrelid;
+	int			outerrelid;
+	RelOptInfo *innerrel;
+	RelOptInfo *outerrel;
+	ListCell   *lc;
+	List	   *referencing_vars;
+	List	   *index_vars;
+	List	   *operator_list;
+
+	Assert(sjinfo->jointype == JOIN_SEMI || sjinfo->jointype == JOIN_ANTI);
+
+	/*
+	 * We mustn't allow semi or anti joins to be removed if there are any
+	 * pending foreign key triggers in the queue. This could happen if we
+	 * are planning a query that has been executed from within a volatile
+	 * function and the query which called this volatile function has made some
+	 * changes to a table referenced by a foreign key. The reason for this is
+	 * that any updates to a table which is referenced by a foreign key
+	 * constraint will only have the referencing tables updated after the
+	 * command is complete, so there is a window of time where records may
+	 * violate the foreign key constraint.
+	 *
+	 * Currently this code is quite naive, as we won't even attempt to remove
+	 * the join if there are any pending foreign key triggers. It may be
+	 * worthwhile to improve this to check if there are any pending triggers for
+	 * the referencing relation in the join, but to keep it simple, this will
+	 * do for now.
+	 */
+	if (!AfterTriggerQueueIsEmpty())
+		return false;
+
+	/*
+	 * We'll start by checking that the left hand relation is a singleton
+	 * and that it has at least 1 foreign key constraint.  A lack of foreign
+	 * key seems like a more likely possibility to allow us to exit early than
+	 * checking the right hand rel has any indexes.
+	 */
+	if (sjinfo->delay_upper_joins ||
+		bms_membership(sjinfo->min_lefthand) != BMS_SINGLETON)
+		return false;
+
+	outerrelid = bms_singleton_member(sjinfo->min_lefthand);
+	outerrel = find_base_rel(root, outerrelid);
+
+	/*
+	 * There's no possibility to remove the join if the outer rel is not a
+	 * baserel or the baserel has no foreign keys defined.
+	 */
+	if (outerrel->reloptkind != RELOPT_BASEREL ||
+		outerrel->rtekind != RTE_RELATION ||
+		outerrel->fklist == NIL)
+		return false;
+
+	if (bms_membership(sjinfo->min_righthand) != BMS_SINGLETON)
+		return false;
+
+	innerrelid = bms_singleton_member(sjinfo->min_righthand);
+	innerrel = find_base_rel(root, innerrelid);
+
+	/*
+	 * If the right hand relation is not a base rel then it can't possibly be
+	 * referenced by a foreign key. The same goes if there are no unique
+	 * indexes on the relation; however, to keep it simple here we'll make do
+	 * with checking if there are any indexes at all, since if there are none
+	 * then there are certainly no unique indexes.
+	 */
+	if (innerrel->reloptkind != RELOPT_BASEREL ||
+		innerrel->rtekind != RTE_RELATION ||
+		innerrel->indexlist == NIL)
+		return false;
+
+	referencing_vars = NIL;
+	index_vars = NIL;
+	operator_list = NIL;
+
+	/*
+	 * We now pre-process the join quals into lists that contain the vars from
+	 * either side of the joins and also a list which contains the operators
+	 * from the join conditions. At this stage we may still discover that the
+	 * join cannot be removed if, for example we find a qual that does not
+	 * reference both sides of the join.
+	 *
+	 * referencing_vars will contain a list of Vars from the left hand
+	 * relation, these are the expressions that we'll check against the
+	 * referencing side of the foreign key.
+	 *
+	 * index_vars will contain a list of Vars from the right hand relation,
+	 * these are the expressions that we'll check on the referenced side of the
+	 * foreign key.
+	 *
+	 * operator_list will contain a list of operator Oids that we'll need to
+	 * ensure are compatible with the operator specified in the foreign key.
+	 */
+	foreach(lc, sjinfo->join_quals)
+	{
+		OpExpr	   *opexpr = (OpExpr *) lfirst(lc);
+		Oid			opno;
+		Node	   *left_expr;
+		Node	   *right_expr;
+		Relids		left_varnos;
+		Relids		right_varnos;
+		Relids		all_varnos;
+		Oid			opinputtype;
+
+		/* Is it a binary opclause? */
+		if (!IsA(opexpr, OpExpr) ||
+			list_length(opexpr->args) != 2)
+		{
+			/* We only accept binary operator clauses. */
+			return false;
+		}
+
+		left_expr = linitial(opexpr->args);
+
+		/* Punt if the left operand is anything apart from a Var */
+		if (!IsA(left_expr, Var))
+			return false;
+
+		right_expr = lsecond(opexpr->args);
+
+		/* Punt if the right operand is anything apart from a Var */
+		if (!IsA(right_expr, Var))
+			return false;
+
+		opinputtype = exprType(left_expr);
+		opno = opexpr->opno;
+
+		/*
+		 * FIXME: it would be nice to fast path out if the
+		 * operator couldn't possibly be used in a foreign
+		 * key, but what to use to detect this?
+		 */
+		if (!op_mergejoinable(opno, opinputtype))
+			return false;
+
+		left_varnos = pull_varnos(left_expr);
+		right_varnos = pull_varnos(right_expr);
+		all_varnos = bms_union(left_varnos, right_varnos);
+
+		/*
+		 * Check if the clause matches both sides of the join. If only 1 side
+		 * is matched then, since we're dealing with a SEMI or ANTI join, it
+		 * must be from the inner side, so this qual could restrict the
+		 * results. We must disallow this case as any additional quals that
+		 * exist void the proof that the foreign key gives us that we'll match
+		 * exactly 1 record on the referenced relation.
+		 */
+		if (!bms_overlap(all_varnos, sjinfo->syn_righthand) ||
+			bms_is_subset(all_varnos, sjinfo->syn_righthand))
+			return false;
+
+		/* check rel membership of arguments */
+		if (!bms_is_empty(right_varnos) &&
+			bms_is_subset(right_varnos, sjinfo->syn_righthand) &&
+			!bms_overlap(left_varnos, sjinfo->syn_righthand))
+		{
+			/* typical case, right_expr is RHS variable */
+		}
+		else if (!bms_is_empty(left_varnos) &&
+				 bms_is_subset(left_varnos, sjinfo->syn_righthand) &&
+				 !bms_overlap(right_varnos, sjinfo->syn_righthand))
+		{
+			Node *tmp;
+			/* flipped case, left_expr is RHS variable */
+			opno = get_commutator(opno);
+			if (!OidIsValid(opno))
+				return false;
+
+			/* swap the operands */
+			tmp = left_expr;
+			left_expr = right_expr;
+			right_expr = tmp;
+		}
+		else
+			return false;
+
+		/* so far so good, keep building lists */
+		referencing_vars = lappend(referencing_vars, left_expr);
+		operator_list = lappend_oid(operator_list, opno);
+		index_vars = lappend(index_vars, right_expr);
+	}
+
+	/* no suitable join condition items? Then we can't remove the join */
+	if (referencing_vars == NIL)
+		return false;
+
+	/*
+	 * Now that we've built the join Var lists we can check if there are
+	 * any foreign keys that will support removing the join.
+	 */
+	if (relation_has_foreign_key_for(root, outerrel, innerrel,
+				referencing_vars, index_vars, operator_list))
+	{
+		*leftrel = outerrel;
+		*leftrelcolumns = referencing_vars;
+		return true;
+	}
+
+	return false;
+}
+
+/*
+ * convert_semijoin_to_isnotnull_quals
+ *		Adds any "col IS NOT NULL" quals which are required to ensure
+ *		that the query remains equivalent to what it was before the semi join
+ *		was removed.
+ */
+void
+convert_semijoin_to_isnotnull_quals(PlannerInfo *root, RelOptInfo *rel, List *columnlist)
+{
+	ListCell	*l;
+	Bitmapset	*handledcols = NULL;
+	Oid			 reloid;
+
+	reloid = root->simple_rte_array[rel->relid]->relid;
+
+	/*
+	 * If a semi join has been successfully removed by the join removal code,
+	 * then a foreign key must exist that proves the join to not be required.
+	 *
+	 * The semi join would have never allowed NULL values for any of the
+	 * columns seen in the join condition, as these would have matched up to a
+	 * record in the joined table. Now that we've proved the join to be
+	 * redundant, we must maintain that behavior of not having NULLs by adding
+	 * IS NOT NULL quals to the WHERE clause, although we may skip this if the
+	 * column in question happens to have a NOT NULL constraint.
+	 */
+	foreach(l, columnlist)
+	{
+		Var *var = (Var *) lfirst(l);
+
+		/* should be a var if it came from a foreign key */
+		Assert(IsA(var, Var));
+		Assert(var->varno == rel->relid);
+
+		/*
+		 * Skip this column if it's a duplicate of one we've previously
+		 * handled.
+		 */
+		if (bms_is_member(var->varattno, handledcols))
+			continue;
+
+		/* mark this column as handled */
+		handledcols = bms_add_member(handledcols, var->varattno);
+
+		/* add the IS NOT NULL qual, but only if the column allows NULLs */
+		if (!get_attnotnull(reloid, var->varattno))
+		{
+			RestrictInfo *rinfo;
+			NullTest *ntest = makeNode(NullTest);
+
+			ntest->nulltesttype = IS_NOT_NULL;
+			ntest->arg = (Expr *) var;
+			ntest->argisrow = false;
+
+			rinfo = make_restrictinfo((Expr *)ntest, false, false, false,
+						NULL, NULL, NULL);
+			rel->baserestrictinfo = lappend(rel->baserestrictinfo, rinfo);
+		}
+	}
+}
+
+/*
+ * convert_antijoin_to_isnull_quals
+ *		Adds any "col IS NULL" quals which are required to ensure
+ *		that the query remains equivalent to what it was before the anti join
+ *		was removed.
+ */
+void
+convert_antijoin_to_isnull_quals(PlannerInfo *root, RelOptInfo *rel, List *columnlist)
+{
+	ListCell	 *l;
+	RestrictInfo *rinfo;
+	Expr		 *expr;
+	List		 *isnulltests = NIL;
+	Bitmapset	 *handledcols = NULL;
+	Oid			 reloid;
+
+	reloid = root->simple_rte_array[rel->relid]->relid;
+
+	/*
+	 * If an anti join has been successfully removed by the join removal code,
+	 * then a foreign key must exist that proves the join to not be required.
+	 *
+	 * The foreign key which proved this join redundant would ensure that each
+	 * record in the referencing rel has a matching record in the referenced
+	 * rel. Though this is not quite true when it comes to NULL valued columns,
+	 * as these won't reference any record. So here, in order to make the query
+	 * produce equivalent results as it would have done with the anti join,
+	 * we'll just ensure that only these NULL valued columns can make their way
+	 * into the final result set. There is also a special case here, if all of
+	 * the columns in the foreign key happen to have a NOT NULL constraint then
+	 * no records can match, so in this case we'll add a "WHERE false" in order
+	 * to save the executor from wasting any time.
+	 */
+	foreach(l, columnlist)
+	{
+		Var			  *var = (Var *) lfirst(l);
+
+		/* should be a var if it came from a foreign key */
+		Assert(IsA(var, Var));
+
+		/*
+		 * Skip this column if it's a duplicate of one we've previously
+		 * handled.
+		 */
+		if (bms_is_member(var->varattno, handledcols))
+			continue;
+
+		/* mark this column as handled */
+		handledcols = bms_add_member(handledcols, var->varattno);
+
+		/*
+		 * No point in adding a col IS NULL if the column
+		 * has a NOT NULL constraint defined for it.
+		 */
+		if (!get_attnotnull(reloid, var->varattno))
+		{
+			NullTest *ntest = makeNode(NullTest);
+			ntest->nulltesttype = IS_NULL;
+			ntest->arg = (Expr *) var;
+			ntest->argisrow = false;
+
+			isnulltests = lappend(isnulltests, ntest);
+		}
+	}
+
+	/*
+	 * If we still have an empty list by the time we get to here then it would
+	 * appear that each column has a NOT NULL constraint. In this case then
+	 * it's not possible for the query to return any records, so we can simply
+	 * add a "WHERE false" constant expression and tell the planner to check
+	 * for gating quals.
+	 */
+	if (isnulltests == NIL)
+	{
+		expr = (Expr *) makeBoolConst(false, false);
+		rinfo = make_restrictinfo(expr, false, false, true, NULL, NULL, NULL);
+
+		/* tell createplan.c to check for gating quals */
+		root->hasPseudoConstantQuals = true;
+	}
+	else
+	{
+		/*
+		 * Now we can build a RestrictInfo for the newly created IS NULL tests.
+		 * If there's only 1 test expression then we can just make the
+		 * RestrictInfo use that expression, if there's more than 1 we'll need
+		 * to "OR" all of these together.
+		 */
+		if (list_length(isnulltests) == 1)
+			expr = (Expr *) linitial(isnulltests);
+		else
+			expr = make_orclause(isnulltests);
+
+		rinfo = make_restrictinfo(expr, false, false, false, NULL, NULL, NULL);
+	}
+
+	rel->baserestrictinfo = lappend(rel->baserestrictinfo, rinfo);
+}
+
+/*
+ * relation_has_foreign_key_for
+ *	  Checks if rel has a foreign key which references referencedrel with the
+ *	  given list of expressions.
+ *
+ *	For the match to succeed:
+ *	  referencing_vars must match the columns defined in the foreign key.
+ *	  index_vars must match the columns defined in the index for the foreign key.
+ */
+static bool
+relation_has_foreign_key_for(PlannerInfo *root, RelOptInfo *rel,
+			RelOptInfo *referencedrel, List *referencing_vars,
+			List *index_vars, List *operator_list)
+{
+	ListCell *lc;
+	Oid		  refreloid;
+
+	/*
+	 * Look up the Oid of the referenced relation. We only want to look at
+	 * foreign keys on the referencing relation which reference this relation.
+	 */
+	refreloid = root->simple_rte_array[referencedrel->relid]->relid;
+
+	Assert(list_length(referencing_vars) > 0);
+	Assert(list_length(referencing_vars) == list_length(index_vars));
+	Assert(list_length(referencing_vars) == list_length(operator_list));
+
+	/*
+	 * Search through each foreign key on the referencing relation and try
+	 * to find one which references the relation in the join condition. If we
+	 * find one then we'll send the join conditions off to
+	 * expressions_match_foreign_key() to see if they match the foreign key.
+	 */
+	foreach(lc, rel->fklist)
+	{
+		ForeignKeyInfo *fk = (ForeignKeyInfo *) lfirst(lc);
+
+		if (fk->confrelid == refreloid)
+		{
+			if (expressions_match_foreign_key(fk, referencing_vars,
+				index_vars, operator_list))
+				return true;
+		}
+	}
+
+	return false;
+}
+
+/*
+ * expressions_match_foreign_key
+ *		True if the given fkvars, indexvars and operators will match
+ *		exactly 1 record in the referenced relation of the foreign key.
+ *
+ * Note: This function expects fkvars and indexvars to only contain Var types.
+ *       Expression indexes are not supported by foreign keys.
+ */
+static bool
+expressions_match_foreign_key(ForeignKeyInfo *fk, List *fkvars,
+					List *indexvars, List *operators)
+{
+	ListCell  *lc;
+	ListCell  *lc2;
+	ListCell  *lc3;
+	int		   col;
+	Bitmapset *allitems;
+	Bitmapset *matcheditems;
+	int		   lstidx;
+
+	Assert(list_length(fkvars) == list_length(indexvars));
+	Assert(list_length(fkvars) == list_length(operators));
+
+	/*
+	 * Fast path out if there are not enough conditions to match each column
+	 * in the foreign key. Note that we cannot check that the number of
+	 * expressions is equal here since it would cause any expressions which
+	 * are duplicated not to match.
+	 */
+	if (list_length(fkvars) < fk->conncols)
+		return false;
+
+	/*
+	 * We need to ensure that each foreign key column can be matched to a list
+	 * item, and we need to ensure that each list item can be matched to a
+	 * foreign key column. We do this by looping over each foreign key column
+	 * and checking that we can find an item in the list which matches the
+	 * current column, however this method does not allow us to ensure that no
+	 * additional items exist in the list. We could solve that by performing
+	 * another loop over each list item and check that it matches a foreign
+	 * key column, but that's a bit wasteful. Instead we'll use 2 bitmapsets,
+	 * one to store the 0 based index of each list item, and with the other
+	 * we'll store each list index that we've managed to match. After we're
+	 * done matching we'll just make sure that both bitmapsets are equal.
+	 */
+	allitems = NULL;
+	matcheditems = NULL;
+
+	/*
+	 * Build a bitmapset which contains each 0 based list index. It seems more
+	 * efficient to do this in reverse so that we allocate enough memory for
+	 * the bitmapset on first loop rather than reallocating each time we find
+	 * we need a bit more space.
+	 */
+	for (lstidx = list_length(fkvars) - 1; lstidx >= 0; lstidx--)
+		allitems = bms_add_member(allitems, lstidx);
+
+	for (col = 0; col < fk->conncols; col++)
+	{
+		bool  matched = false;
+
+		lstidx = 0;
+
+		forthree(lc, fkvars, lc2, indexvars, lc3, operators)
+		{
+			Var *expr = (Var *) lfirst(lc);
+			Var *idxexpr = (Var *) lfirst(lc2);
+			Oid  opr = lfirst_oid(lc3);
+
+			Assert(IsA(expr, Var));
+			Assert(IsA(idxexpr, Var));
+
+			/* Does this join qual match up to the current fkey column? */
+			if (fk->conkey[col] == expr->varattno &&
+				fk->confkey[col] == idxexpr->varattno &&
+				equality_ops_are_compatible(opr, fk->conpfeqop[col]))
+			{
+				matched = true;
+
+				/* mark this list item as matched */
+				matcheditems = bms_add_member(matcheditems, lstidx);
+
+				/*
+				 * Don't break here as there may be duplicate expressions
+				 * that we also need to match against.
+				 */
+			}
+			lstidx++;
+		}
+
+		/* punt if there's no match. */
+		if (!matched)
+			return false;
+	}
+
+	/*
+	 * Ensure that we managed to match every item in the list to a foreign key
+	 * column.
+	 */
+	if (!bms_equal(allitems, matcheditems))
+		return false;
+
+	return true; /* matched */
+}
+
 
 /*
  * Remove the target relid from the planner's data structures, having
@@ -393,6 +988,9 @@ remove_rel_from_query(PlannerInfo *root, int relid, Relids joinrelids)
 	 */
 	rel->reloptkind = RELOPT_DEADREL;
 
+	/* Strip out any eclass members that belong to this rel */
+	remove_rel_from_eclass(root, relid);
+
 	/*
 	 * Remove references to the rel from other baserels' attr_needed arrays.
 	 */
diff --git a/src/backend/optimizer/util/plancat.c b/src/backend/optimizer/util/plancat.c
index b2becfa..0b1c1a6 100644
--- a/src/backend/optimizer/util/plancat.c
+++ b/src/backend/optimizer/util/plancat.c
@@ -25,7 +25,9 @@
 #include "access/transam.h"
 #include "access/xlog.h"
 #include "catalog/catalog.h"
+#include "catalog/pg_constraint.h"
 #include "catalog/heap.h"
+#include "catalog/pg_type.h"
 #include "foreign/fdwapi.h"
 #include "miscadmin.h"
 #include "nodes/makefuncs.h"
@@ -38,6 +40,7 @@
 #include "parser/parsetree.h"
 #include "rewrite/rewriteManip.h"
 #include "storage/bufmgr.h"
+#include "utils/fmgroids.h"
 #include "utils/lsyscache.h"
 #include "utils/rel.h"
 #include "utils/snapmgr.h"
@@ -89,6 +92,13 @@ get_relation_info(PlannerInfo *root, Oid relationObjectId, bool inhparent,
 	Relation	relation;
 	bool		hasindex;
 	List	   *indexinfos = NIL;
+	List	   *fkinfos = NIL;
+	Relation	fkeyRel;
+	Relation	fkeyRelIdx;
+	ScanKeyData fkeyScankey;
+	SysScanDesc fkeyScan;
+	HeapTuple	tuple;
+
 
 	/*
 	 * We need not lock the relation since it was already locked, either by
@@ -384,6 +394,111 @@ get_relation_info(PlannerInfo *root, Oid relationObjectId, bool inhparent,
 
 	heap_close(relation, NoLock);
 
+	ScanKeyInit(&fkeyScankey,
+				Anum_pg_constraint_conrelid,
+				BTEqualStrategyNumber, F_OIDEQ,
+				ObjectIdGetDatum(relationObjectId));
+
+	fkeyRel = heap_open(ConstraintRelationId, AccessShareLock);
+	fkeyRelIdx = index_open(ConstraintRelidIndexId, AccessShareLock);
+	fkeyScan = systable_beginscan_ordered(fkeyRel, fkeyRelIdx, NULL, 1, &fkeyScankey);
+
+	while ((tuple = systable_getnext_ordered(fkeyScan, ForwardScanDirection)) != NULL)
+	{
+		Form_pg_constraint con = (Form_pg_constraint) GETSTRUCT(tuple);
+		ForeignKeyInfo *fkinfo;
+		Datum		adatum;
+		bool		isNull;
+		ArrayType  *arr;
+		int			numkeys;
+
+		/* Not a foreign key */
+		if (con->contype != CONSTRAINT_FOREIGN)
+			continue;
+
+		/* we're not interested unless the fk has been validated */
+		if (!con->convalidated)
+			continue;
+
+		fkinfo = (ForeignKeyInfo *) palloc(sizeof(ForeignKeyInfo));
+		fkinfo->conindid = con->conindid;
+		fkinfo->confrelid = con->confrelid;
+		fkinfo->convalidated = con->convalidated;
+		fkinfo->conrelid = con->conrelid;
+		fkinfo->confupdtype = con->confupdtype;
+		fkinfo->confdeltype = con->confdeltype;
+		fkinfo->confmatchtype = con->confmatchtype;
+
+		adatum = heap_getattr(tuple, Anum_pg_constraint_conkey,
+							RelationGetDescr(fkeyRel), &isNull);
+
+		if (isNull)
+			elog(ERROR, "null conkey for constraint %u",
+				HeapTupleGetOid(tuple));
+
+		arr = DatumGetArrayTypeP(adatum);		/* ensure not toasted */
+		numkeys = ARR_DIMS(arr)[0];
+		if (ARR_NDIM(arr) != 1 ||
+			numkeys < 0 ||
+			ARR_HASNULL(arr) ||
+			ARR_ELEMTYPE(arr) != INT2OID)
+			elog(ERROR, "conkey is not a 1-D smallint array");
+
+		fkinfo->conkey = (int16 *) ARR_DATA_PTR(arr);
+		fkinfo->conncols = numkeys;
+
+		adatum = heap_getattr(tuple, Anum_pg_constraint_confkey,
+							RelationGetDescr(fkeyRel), &isNull);
+
+		if (isNull)
+			elog(ERROR, "null confkey for constraint %u",
+				HeapTupleGetOid(tuple));
+
+		arr = DatumGetArrayTypeP(adatum);		/* ensure not toasted */
+		numkeys = ARR_DIMS(arr)[0];
+
+		/* sanity check */
+		if (numkeys != fkinfo->conncols)
+			elog(ERROR, "number of confkey elements does not equal conkey elements");
+
+		if (ARR_NDIM(arr) != 1 ||
+			numkeys < 0 ||
+			ARR_HASNULL(arr) ||
+			ARR_ELEMTYPE(arr) != INT2OID)
+			elog(ERROR, "confkey is not a 1-D smallint array");
+
+		fkinfo->confkey = (int16 *) ARR_DATA_PTR(arr);
+		adatum = heap_getattr(tuple, Anum_pg_constraint_conpfeqop,
+							RelationGetDescr(fkeyRel), &isNull);
+
+		if (isNull)
+			elog(ERROR, "null conpfeqop for constraint %u",
+				HeapTupleGetOid(tuple));
+
+		arr = DatumGetArrayTypeP(adatum);		/* ensure not toasted */
+		numkeys = ARR_DIMS(arr)[0];
+
+		/* sanity check */
+		if (numkeys != fkinfo->conncols)
+			elog(ERROR, "number of conpfeqop elements does not equal conkey elements");
+
+		if (ARR_NDIM(arr) != 1 ||
+			numkeys < 0 ||
+			ARR_HASNULL(arr) ||
+			ARR_ELEMTYPE(arr) != OIDOID)
+			elog(ERROR, "conpfeqop is not a 1-D Oid array");
+
+		fkinfo->conpfeqop = (Oid *) ARR_DATA_PTR(arr);
+
+		fkinfos = lappend(fkinfos, fkinfo);
+	}
+
+	rel->fklist = fkinfos;
+	systable_endscan_ordered(fkeyScan);
+	index_close(fkeyRelIdx, AccessShareLock);
+	heap_close(fkeyRel, AccessShareLock);
+
+
 	/*
 	 * Allow a plugin to editorialize on the info we obtained from the
 	 * catalogs.  Actions might include altering the assumed relation size,
diff --git a/src/backend/optimizer/util/relnode.c b/src/backend/optimizer/util/relnode.c
index c938c27..a0fb8eb 100644
--- a/src/backend/optimizer/util/relnode.c
+++ b/src/backend/optimizer/util/relnode.c
@@ -115,6 +115,7 @@ build_simple_rel(PlannerInfo *root, int relid, RelOptKind reloptkind)
 	rel->lateral_relids = NULL;
 	rel->lateral_referencers = NULL;
 	rel->indexlist = NIL;
+	rel->fklist = NIL;
 	rel->pages = 0;
 	rel->tuples = 0;
 	rel->allvisfrac = 0;
@@ -377,6 +378,7 @@ build_join_rel(PlannerInfo *root,
 	joinrel->lateral_relids = NULL;
 	joinrel->lateral_referencers = NULL;
 	joinrel->indexlist = NIL;
+	joinrel->fklist = NIL;
 	joinrel->pages = 0;
 	joinrel->tuples = 0;
 	joinrel->allvisfrac = 0;
diff --git a/src/backend/utils/cache/lsyscache.c b/src/backend/utils/cache/lsyscache.c
index 552e498..aa81c7c 100644
--- a/src/backend/utils/cache/lsyscache.c
+++ b/src/backend/utils/cache/lsyscache.c
@@ -916,6 +916,33 @@ get_atttypetypmodcoll(Oid relid, AttrNumber attnum,
 	ReleaseSysCache(tp);
 }
 
+/*
+ * get_attnotnull
+ *
+ *		Given the relation id and the attribute number,
+ *		return the "attnotnull" field from the attribute relation.
+ */
+bool
+get_attnotnull(Oid relid, AttrNumber attnum)
+{
+	HeapTuple	tp;
+
+	tp = SearchSysCache2(ATTNUM,
+						 ObjectIdGetDatum(relid),
+						 Int16GetDatum(attnum));
+	if (HeapTupleIsValid(tp))
+	{
+		Form_pg_attribute att_tup = (Form_pg_attribute) GETSTRUCT(tp);
+		bool		result;
+
+		result = att_tup->attnotnull;
+		ReleaseSysCache(tp);
+		return result;
+	}
+	else
+		return false;
+}
+
 /*				---------- COLLATION CACHE ----------					 */
 
 /*
diff --git a/src/include/commands/trigger.h b/src/include/commands/trigger.h
index d0b0356..34a75e4 100644
--- a/src/include/commands/trigger.h
+++ b/src/include/commands/trigger.h
@@ -181,6 +181,7 @@ extern void ExecBSTruncateTriggers(EState *estate,
 extern void ExecASTruncateTriggers(EState *estate,
 					   ResultRelInfo *relinfo);
 
+extern bool AfterTriggerQueueIsEmpty(void);
 extern void AfterTriggerBeginXact(void);
 extern void AfterTriggerBeginQuery(void);
 extern void AfterTriggerEndQuery(EState *estate);
diff --git a/src/include/nodes/relation.h b/src/include/nodes/relation.h
index dacbe9c..f69df09 100644
--- a/src/include/nodes/relation.h
+++ b/src/include/nodes/relation.h
@@ -355,6 +355,8 @@ typedef struct PlannerInfo
  *		lateral_referencers - relids of rels that reference this one laterally
  *		indexlist - list of IndexOptInfo nodes for relation's indexes
  *					(always NIL if it's not a table)
+ *		fklist - list of ForeignKeyInfo's for relation's foreign key
+ *					constraints. (always NIL if it's not a table)
  *		pages - number of disk pages in relation (zero if not a table)
  *		tuples - number of tuples in relation (not considering restrictions)
  *		allvisfrac - fraction of disk pages that are marked all-visible
@@ -448,6 +450,7 @@ typedef struct RelOptInfo
 	Relids		lateral_relids; /* minimum parameterization of rel */
 	Relids		lateral_referencers;	/* rels that reference me laterally */
 	List	   *indexlist;		/* list of IndexOptInfo */
+	List	   *fklist;			/* list of ForeignKeyInfo */
 	BlockNumber pages;			/* size estimates derived from pg_class */
 	double		tuples;
 	double		allvisfrac;
@@ -538,6 +541,51 @@ typedef struct IndexOptInfo
 	bool		amhasgetbitmap; /* does AM have amgetbitmap interface? */
 } IndexOptInfo;
 
+/*
+ * ForeignKeyInfo
+ *		Used to store pg_constraint records for foreign key constraints for use
+ *		by the planner.
+ *
+ *		conindid - The index which supports the foreign key
+ *
+ *		confrelid - The relation that is referenced by this foreign key
+ *
+ *		convalidated - True if the foreign key has been validated.
+ *
+ *		conrelid - The Oid of the relation that the foreign key belongs to
+ *
+ *		confupdtype - ON UPDATE action for when the referenced table is updated
+ *
+ *		confdeltype - ON DELETE action, controls what to do when a record is
+ *					deleted from the referenced table.
+ *
+ *		confmatchtype - foreign key match type, e.g. MATCH FULL, MATCH PARTIAL
+ *
+ *		conncols - Number of columns defined in the foreign key
+ *
+ *		conkey - An array of conncols elements to store the varattno of the
+ *					columns on the referencing side of the foreign key
+ *
+ *		confkey - An array of conncols elements to store the varattno of the
+ *					columns on the referenced side of the foreign key
+ *
+ *		conpfeqop - An array of conncols elements to store the operators for
+ *					PK = FK comparisons
+ */
+typedef struct ForeignKeyInfo
+{
+	Oid			conindid;		/* index supporting this constraint */
+	Oid			confrelid;		/* relation referenced by foreign key */
+	bool		convalidated;	/* constraint has been validated? */
+	Oid			conrelid;		/* relation this constraint constrains */
+	char		confupdtype;	/* foreign key's ON UPDATE action */
+	char		confdeltype;	/* foreign key's ON DELETE action */
+	char		confmatchtype;	/* foreign key's match type */
+	int			conncols;		/* number of columns in the foreign key */
+	int16	   *conkey;			/* Columns of conrelid that the constraint applies to */
+	int16	   *confkey;		/* columns of confrelid that foreign key references */
+	Oid		   *conpfeqop;		/* Operator list for comparing PK to FK */
+} ForeignKeyInfo;
 
 /*
  * EquivalenceClasses
diff --git a/src/include/optimizer/paths.h b/src/include/optimizer/paths.h
index 9b22fda..00716c9 100644
--- a/src/include/optimizer/paths.h
+++ b/src/include/optimizer/paths.h
@@ -108,6 +108,7 @@ extern EquivalenceClass *get_eclass_for_sort_expr(PlannerInfo *root,
 						 Relids rel,
 						 bool create_it);
 extern void generate_base_implied_equalities(PlannerInfo *root);
+extern void remove_rel_from_eclass(PlannerInfo *root, int relid);
 extern List *generate_join_implied_equalities(PlannerInfo *root,
 								 Relids join_relids,
 								 Relids outer_relids,
diff --git a/src/include/utils/lsyscache.h b/src/include/utils/lsyscache.h
index 07d24d4..910190d 100644
--- a/src/include/utils/lsyscache.h
+++ b/src/include/utils/lsyscache.h
@@ -68,6 +68,7 @@ extern Oid	get_atttype(Oid relid, AttrNumber attnum);
 extern int32 get_atttypmod(Oid relid, AttrNumber attnum);
 extern void get_atttypetypmodcoll(Oid relid, AttrNumber attnum,
 					  Oid *typid, int32 *typmod, Oid *collid);
+extern bool get_attnotnull(Oid relid, AttrNumber attnum);
 extern char *get_collation_name(Oid colloid);
 extern char *get_constraint_name(Oid conoid);
 extern Oid	get_opclass_family(Oid opclass);
diff --git a/src/test/regress/expected/join.out b/src/test/regress/expected/join.out
index 1cb1c51..d9252c1 100644
--- a/src/test/regress/expected/join.out
+++ b/src/test/regress/expected/join.out
@@ -3249,6 +3249,330 @@ select i8.* from int8_tbl i8 left join (select f1 from int4_tbl group by f1) i4
 (1 row)
 
 rollback;
+BEGIN;
+-- Test join removals for semi and anti joins
+CREATE TEMP TABLE b (id INT NOT NULL PRIMARY KEY, val INT);
+CREATE TEMP TABLE a (id INT NOT NULL PRIMARY KEY, b_id INT REFERENCES b(id));
+-- should remove semi join to b
+EXPLAIN (COSTS OFF)
+SELECT id FROM a WHERE b_id IN(SELECT id FROM b);
+          QUERY PLAN          
+------------------------------
+ Seq Scan on a
+   Filter: (b_id IS NOT NULL)
+(2 rows)
+
+-- should remove semi join to b
+EXPLAIN (COSTS OFF)
+SELECT id FROM a WHERE EXISTS(SELECT 1 FROM b WHERE a.b_id = id);
+          QUERY PLAN          
+------------------------------
+ Seq Scan on a
+   Filter: (b_id IS NOT NULL)
+(2 rows)
+
+-- should remove anti join to b
+EXPLAIN (COSTS OFF)
+SELECT id FROM a WHERE NOT EXISTS(SELECT 1 FROM b WHERE a.b_id = id);
+        QUERY PLAN        
+--------------------------
+ Seq Scan on a
+   Filter: (b_id IS NULL)
+(2 rows)
+
+-- should not remove anti join as id > 100 will void
+-- the foreign key's guarantee that 1 will exist.
+EXPLAIN (COSTS OFF)
+SELECT id FROM a WHERE NOT EXISTS(SELECT 1 FROM b WHERE a.b_id = id AND id > 100);
+                  QUERY PLAN                   
+-----------------------------------------------
+ Hash Anti Join
+   Hash Cond: (a.b_id = b.id)
+   ->  Seq Scan on a
+   ->  Hash
+         ->  Bitmap Heap Scan on b
+               Recheck Cond: (id > 100)
+               ->  Bitmap Index Scan on b_pkey
+                     Index Cond: (id > 100)
+(8 rows)
+
+-- should not remove anti join as val is not part of the foreign key.
+EXPLAIN (COSTS OFF)
+SELECT id FROM a WHERE NOT EXISTS(SELECT 1 FROM b WHERE a.b_id = id AND val = id);
+            QUERY PLAN            
+----------------------------------
+ Hash Anti Join
+   Hash Cond: (a.b_id = b.id)
+   ->  Seq Scan on a
+   ->  Hash
+         ->  Seq Scan on b
+               Filter: (id = val)
+(6 rows)
+
+-- should remove semi join to b (swapped condition order)
+EXPLAIN (COSTS OFF)
+SELECT id FROM a WHERE EXISTS(SELECT 1 FROM b WHERE id = a.b_id);
+          QUERY PLAN          
+------------------------------
+ Seq Scan on a
+   Filter: (b_id IS NOT NULL)
+(2 rows)
+
+-- should not remove semi join (since not using equals)
+EXPLAIN (COSTS OFF)
+SELECT id FROM a WHERE EXISTS(SELECT 1 FROM b WHERE id >= a.b_id);
+               QUERY PLAN                
+-----------------------------------------
+ Nested Loop Semi Join
+   ->  Seq Scan on a
+   ->  Index Only Scan using b_pkey on b
+         Index Cond: (id >= a.b_id)
+(4 rows)
+
+-- should not remove semi join
+EXPLAIN (COSTS OFF)
+SELECT id FROM a WHERE b_id+0 IN(SELECT id FROM b);
+             QUERY PLAN             
+------------------------------------
+ Hash Semi Join
+   Hash Cond: ((a.b_id + 0) = b.id)
+   ->  Seq Scan on a
+   ->  Hash
+         ->  Seq Scan on b
+(5 rows)
+
+-- should not remove semi join
+EXPLAIN (COSTS OFF)
+SELECT id FROM a WHERE b_id IN(SELECT id+0 FROM b);
+             QUERY PLAN             
+------------------------------------
+ Hash Semi Join
+   Hash Cond: (a.b_id = (b.id + 0))
+   ->  Seq Scan on a
+   ->  Hash
+         ->  Seq Scan on b
+(5 rows)
+
+-- should not remove semi join (wrong column)
+EXPLAIN (COSTS OFF)
+SELECT id FROM a WHERE id IN(SELECT id FROM b);
+         QUERY PLAN         
+----------------------------
+ Hash Semi Join
+   Hash Cond: (a.id = b.id)
+   ->  Seq Scan on a
+   ->  Hash
+         ->  Seq Scan on b
+(5 rows)
+
+ROLLBACK;
+BEGIN;
+-- Semi join removal code with 2 column foreign keys
+CREATE TEMP TABLE b (id1 INT NOT NULL, id2 INT NOT NULL, PRIMARY KEY(id1,id2));
+CREATE TEMP TABLE a (id INT NOT NULL PRIMARY KEY, b_id1 INT, b_id2 INT);
+ALTER TABLE a ADD CONSTRAINT a_b_id1_b_id2_fkey FOREIGN KEY (b_id1,b_id2) REFERENCES b(id1,id2) MATCH SIMPLE;
+-- should remove semi join to b
+EXPLAIN (COSTS OFF)
+SELECT id FROM a WHERE EXISTS(SELECT 1 FROM b WHERE a.b_id1 = id1 AND a.b_id2 = id2);
+                       QUERY PLAN                        
+---------------------------------------------------------
+ Seq Scan on a
+   Filter: ((b_id1 IS NOT NULL) AND (b_id2 IS NOT NULL))
+(2 rows)
+
+-- should remove anti join to b
+EXPLAIN (COSTS OFF)
+SELECT id FROM a WHERE NOT EXISTS(SELECT 1 FROM b WHERE a.b_id1 = id1 AND a.b_id2 = id2);
+                   QUERY PLAN                   
+------------------------------------------------
+ Seq Scan on a
+   Filter: ((b_id1 IS NULL) OR (b_id2 IS NULL))
+(2 rows)
+
+-- should not remove semi join to b (extra condition)
+EXPLAIN (COSTS OFF)
+SELECT id FROM a WHERE EXISTS(SELECT 1 FROM b WHERE a.b_id1 = id1 AND a.b_id2 = id2 AND a.b_id2 >= id2);
+                       QUERY PLAN                       
+--------------------------------------------------------
+ Hash Semi Join
+   Hash Cond: ((a.b_id1 = b.id1) AND (a.b_id2 = b.id2))
+   Join Filter: (a.b_id2 >= b.id2)
+   ->  Seq Scan on a
+   ->  Hash
+         ->  Seq Scan on b
+(6 rows)
+
+-- should not remove semi join to b (wrong operator)
+EXPLAIN (COSTS OFF)
+SELECT id FROM a WHERE EXISTS(SELECT 1 FROM b WHERE a.b_id1 > id1 AND a.b_id2 < id2);
+                        QUERY PLAN                         
+-----------------------------------------------------------
+ Nested Loop Semi Join
+   ->  Seq Scan on a
+   ->  Index Only Scan using b_pkey on b
+         Index Cond: ((id1 < a.b_id1) AND (id2 > a.b_id2))
+(4 rows)
+
+-- should not remove semi join (only checking id1)
+EXPLAIN (COSTS OFF)
+SELECT id FROM a WHERE EXISTS(SELECT 1 FROM b WHERE a.b_id1 = id1);
+           QUERY PLAN            
+---------------------------------
+ Hash Join
+   Hash Cond: (a.b_id1 = b.id1)
+   ->  Seq Scan on a
+   ->  Hash
+         ->  HashAggregate
+               Group Key: b.id1
+               ->  Seq Scan on b
+(7 rows)
+
+-- should not remove semi join (only checking id2)
+EXPLAIN (COSTS OFF)
+SELECT id FROM a WHERE EXISTS(SELECT 1 FROM b WHERE a.b_id2 = id2);
+           QUERY PLAN            
+---------------------------------
+ Hash Join
+   Hash Cond: (a.b_id2 = b.id2)
+   ->  Seq Scan on a
+   ->  Hash
+         ->  HashAggregate
+               Group Key: b.id2
+               ->  Seq Scan on b
+(7 rows)
+
+-- should not remove semi join (checking wrong columns)
+EXPLAIN (COSTS OFF)
+SELECT id FROM a WHERE EXISTS(SELECT 1 FROM b WHERE a.b_id2 = id1 AND a.b_id1 = id2);
+                       QUERY PLAN                       
+--------------------------------------------------------
+ Hash Join
+   Hash Cond: ((a.b_id2 = b.id1) AND (a.b_id1 = b.id2))
+   ->  Seq Scan on a
+   ->  Hash
+         ->  Seq Scan on b
+(5 rows)
+
+-- should not remove semi join (no check for id2)
+EXPLAIN (COSTS OFF)
+SELECT id FROM a WHERE EXISTS(SELECT 1 FROM b WHERE a.b_id2 = id1 AND a.b_id1 = id1);
+               QUERY PLAN                
+-----------------------------------------
+ Nested Loop Semi Join
+   ->  Seq Scan on a
+         Filter: (b_id2 = b_id1)
+   ->  Index Only Scan using b_pkey on b
+         Index Cond: (id1 = a.b_id2)
+(5 rows)
+
+-- should not remove semi join (no check for b_id2)
+EXPLAIN (COSTS OFF)
+SELECT id FROM a WHERE EXISTS(SELECT 1 FROM b WHERE a.b_id1 = id1 AND a.b_id1 = id2);
+            QUERY PLAN             
+-----------------------------------
+ Hash Join
+   Hash Cond: (a.b_id1 = b.id1)
+   ->  Seq Scan on a
+   ->  Hash
+         ->  Seq Scan on b
+               Filter: (id1 = id2)
+(6 rows)
+
+-- should not remove semi join (no check for b_id2)
+EXPLAIN (COSTS OFF)
+SELECT id FROM a WHERE EXISTS(SELECT 1 FROM b WHERE a.b_id1 = id1 AND a.b_id1 = id1);
+              QUERY PLAN               
+---------------------------------------
+ Hash Join
+   Hash Cond: (a.b_id1 = b.id1)
+   ->  Seq Scan on a
+   ->  Hash
+         ->  HashAggregate
+               Group Key: b.id1, b.id1
+               ->  Seq Scan on b
+(7 rows)
+
+-- Check that the IS NULL and IS NOT NULL filters are not added
+-- for columns which have a NOT NULL constraint.
+ALTER TABLE a ALTER COLUMN b_id1 SET NOT NULL;
+-- Should only filter on b_id2 IS NOT NULL
+EXPLAIN (COSTS OFF)
+SELECT id FROM a WHERE EXISTS(SELECT 1 FROM b WHERE a.b_id1 = id1 AND a.b_id2 = id2);
+          QUERY PLAN           
+-------------------------------
+ Seq Scan on a
+   Filter: (b_id2 IS NOT NULL)
+(2 rows)
+
+-- Should only filter on b_id2 IS NULL
+EXPLAIN (COSTS OFF)
+SELECT id FROM a WHERE NOT EXISTS(SELECT 1 FROM b WHERE a.b_id1 = id1 AND a.b_id2 = id2);
+        QUERY PLAN         
+---------------------------
+ Seq Scan on a
+   Filter: (b_id2 IS NULL)
+(2 rows)
+
+ALTER TABLE a ALTER COLUMN b_id2 SET NOT NULL;
+-- No IS NOT NULL filters should be added.
+EXPLAIN (COSTS OFF)
+SELECT id FROM a WHERE EXISTS(SELECT 1 FROM b WHERE a.b_id1 = id1 AND a.b_id2 = id2);
+  QUERY PLAN   
+---------------
+ Seq Scan on a
+(1 row)
+
+-- Since now neither b_id1 or b_id2 can be NULL this query can't
+-- produce any records. Check that we get a One-Time Filter: false
+EXPLAIN (COSTS OFF)
+SELECT id FROM a WHERE NOT EXISTS(SELECT 1 FROM b WHERE a.b_id1 = id1 AND a.b_id2 = id2);
+        QUERY PLAN        
+--------------------------
+ Result
+   One-Time Filter: false
+   ->  Seq Scan on a
+(3 rows)
+
+ROLLBACK;
+BEGIN WORK;
+-- In this test we want to ensure that ANTI JOIN removal does not
+-- occur when there are pending foreign key triggers.
+-- We test this by updating a relation which is referenced by a foreign key
+-- and then executing another query which would normally allow the anti
+-- join to be removed. If the anti join was removed then the table
+-- records_violating_fkey would be empty, but here we'll ensure that
+-- the record that we update ends up violating the foreign key.
+CREATE TABLE j2 (id INT NOT NULL PRIMARY KEY);
+CREATE TABLE j1 (
+  id INT PRIMARY KEY,
+  j2_id INT NOT NULL REFERENCES j2 (id) MATCH FULL ON DELETE CASCADE ON UPDATE CASCADE
+);
+INSERT INTO j2 VALUES(10),(20);
+INSERT INTO j1 VALUES(1,10),(2,20);
+-- create a table to store records that 'violate' the fkey.
+CREATE TABLE records_violating_fkey (j2_id INT NOT NULL);
+CREATE OR REPLACE FUNCTION j1_update() RETURNS TRIGGER AS $$
+BEGIN
+  INSERT INTO records_violating_fkey SELECT j2_id FROM j1 WHERE NOT EXISTS(SELECT 1 FROM j2 WHERE j2_id = j2.id);
+  RETURN NEW;
+  END;
+$$ LANGUAGE plpgsql;
+CREATE TRIGGER j1_update_trigger BEFORE UPDATE ON j2 FOR EACH ROW EXECUTE PROCEDURE j1_update();
+-- This update statement will cause some foreign key triggers to be queued.
+-- The trigger defined above will fire which will cause all records which
+-- currently violate the foreign key to be inserted into the records_violating_fkey
+-- table. The intended behaviour of this is that we'll see records violating the
+-- foreign key, however if we incorrectly performed an ANTI JOIN removal, then
+-- we wouldn't see this violation record, as we'd wrongly assume that the query
+-- could not produce any records.
+UPDATE j2 SET id = id + 1;
+SELECT * FROM records_violating_fkey;
+ j2_id 
+-------
+    10
+(1 row)
+
+ROLLBACK;
 create temp table parent (k int primary key, pd int);
 create temp table child (k int unique, cd int);
 insert into parent values (1, 10), (2, 20), (3, 30);
diff --git a/src/test/regress/sql/join.sql b/src/test/regress/sql/join.sql
index fa3e068..5ec5016 100644
--- a/src/test/regress/sql/join.sql
+++ b/src/test/regress/sql/join.sql
@@ -973,6 +973,175 @@ select i8.* from int8_tbl i8 left join (select f1 from int4_tbl group by f1) i4
 
 rollback;
 
+BEGIN;
+
+-- Test join removals for semi and anti joins
+CREATE TEMP TABLE b (id INT NOT NULL PRIMARY KEY, val INT);
+CREATE TEMP TABLE a (id INT NOT NULL PRIMARY KEY, b_id INT REFERENCES b(id));
+
+-- should remove semi join to b
+EXPLAIN (COSTS OFF)
+SELECT id FROM a WHERE b_id IN(SELECT id FROM b);
+
+-- should remove semi join to b
+EXPLAIN (COSTS OFF)
+SELECT id FROM a WHERE EXISTS(SELECT 1 FROM b WHERE a.b_id = id);
+
+-- should remove anti join to b
+EXPLAIN (COSTS OFF)
+SELECT id FROM a WHERE NOT EXISTS(SELECT 1 FROM b WHERE a.b_id = id);
+
+-- should not remove anti join as id > 100 will void
+-- the foreign key's guarantee that 1 will exist.
+EXPLAIN (COSTS OFF)
+SELECT id FROM a WHERE NOT EXISTS(SELECT 1 FROM b WHERE a.b_id = id AND id > 100);
+
+-- should not remove anti join as val is not part of the foreign key.
+EXPLAIN (COSTS OFF)
+SELECT id FROM a WHERE NOT EXISTS(SELECT 1 FROM b WHERE a.b_id = id AND val = id);
+
+-- should remove semi join to b (swapped condition order)
+EXPLAIN (COSTS OFF)
+SELECT id FROM a WHERE EXISTS(SELECT 1 FROM b WHERE id = a.b_id);
+
+-- should not remove semi join (since not using equals)
+EXPLAIN (COSTS OFF)
+SELECT id FROM a WHERE EXISTS(SELECT 1 FROM b WHERE id >= a.b_id);
+
+-- should not remove semi join
+EXPLAIN (COSTS OFF)
+SELECT id FROM a WHERE b_id+0 IN(SELECT id FROM b);
+
+-- should not remove semi join
+EXPLAIN (COSTS OFF)
+SELECT id FROM a WHERE b_id IN(SELECT id+0 FROM b);
+
+-- should not remove semi join (wrong column)
+EXPLAIN (COSTS OFF)
+SELECT id FROM a WHERE id IN(SELECT id FROM b);
+
+ROLLBACK;
+
+BEGIN;
+
+-- Semi join removal code with 2 column foreign keys
+
+CREATE TEMP TABLE b (id1 INT NOT NULL, id2 INT NOT NULL, PRIMARY KEY(id1,id2));
+CREATE TEMP TABLE a (id INT NOT NULL PRIMARY KEY, b_id1 INT, b_id2 INT);
+
+ALTER TABLE a ADD CONSTRAINT a_b_id1_b_id2_fkey FOREIGN KEY (b_id1,b_id2) REFERENCES b(id1,id2) MATCH SIMPLE;
+
+-- should remove semi join to b
+EXPLAIN (COSTS OFF)
+SELECT id FROM a WHERE EXISTS(SELECT 1 FROM b WHERE a.b_id1 = id1 AND a.b_id2 = id2);
+
+-- should remove anti join to b
+EXPLAIN (COSTS OFF)
+SELECT id FROM a WHERE NOT EXISTS(SELECT 1 FROM b WHERE a.b_id1 = id1 AND a.b_id2 = id2);
+
+-- should not remove semi join to b (extra condition)
+EXPLAIN (COSTS OFF)
+SELECT id FROM a WHERE EXISTS(SELECT 1 FROM b WHERE a.b_id1 = id1 AND a.b_id2 = id2 AND a.b_id2 >= id2);
+
+-- should not remove semi join to b (wrong operator)
+EXPLAIN (COSTS OFF)
+SELECT id FROM a WHERE EXISTS(SELECT 1 FROM b WHERE a.b_id1 > id1 AND a.b_id2 < id2);
+
+-- should not remove semi join (only checking id1)
+EXPLAIN (COSTS OFF)
+SELECT id FROM a WHERE EXISTS(SELECT 1 FROM b WHERE a.b_id1 = id1);
+
+-- should not remove semi join (only checking id2)
+EXPLAIN (COSTS OFF)
+SELECT id FROM a WHERE EXISTS(SELECT 1 FROM b WHERE a.b_id2 = id2);
+
+-- should not remove semi join (checking wrong columns)
+EXPLAIN (COSTS OFF)
+SELECT id FROM a WHERE EXISTS(SELECT 1 FROM b WHERE a.b_id2 = id1 AND a.b_id1 = id2);
+
+-- should not remove semi join (no check for id2)
+EXPLAIN (COSTS OFF)
+SELECT id FROM a WHERE EXISTS(SELECT 1 FROM b WHERE a.b_id2 = id1 AND a.b_id1 = id1);
+
+-- should not remove semi join (no check for b_id2)
+EXPLAIN (COSTS OFF)
+SELECT id FROM a WHERE EXISTS(SELECT 1 FROM b WHERE a.b_id1 = id1 AND a.b_id1 = id2);
+
+-- should not remove semi join (no check for b_id2)
+EXPLAIN (COSTS OFF)
+SELECT id FROM a WHERE EXISTS(SELECT 1 FROM b WHERE a.b_id1 = id1 AND a.b_id1 = id1);
+
+
+-- Check that the IS NULL and IS NOT NULL filters are not added
+-- for columns which have a NOT NULL constraint.
+ALTER TABLE a ALTER COLUMN b_id1 SET NOT NULL;
+
+-- Should only filter on b_id2 IS NOT NULL
+EXPLAIN (COSTS OFF)
+SELECT id FROM a WHERE EXISTS(SELECT 1 FROM b WHERE a.b_id1 = id1 AND a.b_id2 = id2);
+
+-- Should only filter on b_id2 IS NULL
+EXPLAIN (COSTS OFF)
+SELECT id FROM a WHERE NOT EXISTS(SELECT 1 FROM b WHERE a.b_id1 = id1 AND a.b_id2 = id2);
+
+ALTER TABLE a ALTER COLUMN b_id2 SET NOT NULL;
+
+-- No IS NOT NULL filters should be added.
+EXPLAIN (COSTS OFF)
+SELECT id FROM a WHERE EXISTS(SELECT 1 FROM b WHERE a.b_id1 = id1 AND a.b_id2 = id2);
+
+-- Since now neither b_id1 or b_id2 can be NULL this query can't
+-- produce any records. Check that we get a One-Time Filter: false
+EXPLAIN (COSTS OFF)
+SELECT id FROM a WHERE NOT EXISTS(SELECT 1 FROM b WHERE a.b_id1 = id1 AND a.b_id2 = id2);
+
+ROLLBACK;
+
+BEGIN WORK;
+
+-- In this test we want to ensure that ANTI JOIN removal does not
+-- occur when there are pending foreign key triggers.
+-- We test this by updating a relation which is referenced by a foreign key
+-- and then executing another query which would normally allow the anti
+-- join to be removed. If the anti join was removed then the table
+-- records_violating_fkey would be empty, but here we'll ensure that
+-- the record that we update ends up violating the foreign key.
+
+CREATE TABLE j2 (id INT NOT NULL PRIMARY KEY);
+CREATE TABLE j1 (
+  id INT PRIMARY KEY,
+  j2_id INT NOT NULL REFERENCES j2 (id) MATCH FULL ON DELETE CASCADE ON UPDATE CASCADE
+);
+
+INSERT INTO j2 VALUES(10),(20);
+INSERT INTO j1 VALUES(1,10),(2,20);
+
+-- create a table to store records that 'violate' the fkey.
+CREATE TABLE records_violating_fkey (j2_id INT NOT NULL);
+
+CREATE OR REPLACE FUNCTION j1_update() RETURNS TRIGGER AS $$
+BEGIN
+  INSERT INTO records_violating_fkey SELECT j2_id FROM j1 WHERE NOT EXISTS(SELECT 1 FROM j2 WHERE j2_id = j2.id);
+  RETURN NEW;
+  END;
+$$ LANGUAGE plpgsql;
+
+CREATE TRIGGER j1_update_trigger BEFORE UPDATE ON j2 FOR EACH ROW EXECUTE PROCEDURE j1_update();
+
+-- This update statement will cause some foreign key triggers to be queued.
+-- The trigger defined above will fire which will cause all records which
+-- currently violate the foreign key to be inserted into the records_violating_fkey
+-- table. The intended behaviour of this is that we'll see records violating the
+-- foreign key, however if we incorrectly performed an ANTI JOIN removal, then
+-- we wouldn't see this violation record, as we'd wrongly assume that the query
+-- could not produce any records.
+
+UPDATE j2 SET id = id + 1;
+
+SELECT * FROM records_violating_fkey;
+
+ROLLBACK;
+
 create temp table parent (k int primary key, pd int);
 create temp table child (k int unique, cd int);
 insert into parent values (1, 10), (2, 20), (3, 30);

David Rowley

dgrowleyml@gmail.com

over 11 years ago

In reply to: David Rowley (#2)

4 attachment(s)

Re: Patch to support SEMI and ANTI join removal

On Sun, Aug 10, 2014 at 11:48 PM, David Rowley <dgrowleyml@gmail.com> wrote:

I've attached an updated version of the patch which fixes up some
incorrect logic in the foreign key matching code, plus various comment
improvements.

I've made a few updates to the patch to simplify some logic in the code
which analyses the join condition. The result is slightly faster code for
detecting either successful or unsuccessful join removal.

I've also been doing a little benchmarking of the overhead that this adds
to planning time for a handful of different queries.
With the queries I tested, the overhead was between ~20 and ~423 nanoseconds
per SEMI or ANTI join. The 20 was for the earliest fast path out on an
unsuccessful removal and the 423 was for a successful removal (tests done
on a 4 year old Intel i5 laptop). This accounted for between 0.01% and 0.2%
of planning time for the tested queries. I was quite happy with this, but I
did manage to knock it down a little more with the
bms_get_singleton_v1.patch, which I've attached. This reduces the range to
between ~15 and ~409 nanoseconds, but this is probably going into
micro-benchmark territory, so perhaps it's not worth the extra code.

For the benchmarks I just put semiorantijoin_is_removable() in a tight 1
million iteration loop and grabbed the total planning time for that. I then
divided the time reported for the 1 million removals version by 1 million
and compared the result to an unpatched master's planning time.
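
A minimal sketch of this kind of measurement, assuming a POSIX system with
clock_gettime(). dummy_check() is a hypothetical stand-in for
semiorantijoin_is_removable(), which cannot be called outside the planner,
and none of the helper names below come from the patch:

```c
#define _POSIX_C_SOURCE 199309L
#include <assert.h>
#include <stdint.h>
#include <time.h>

/* Signature of the code under test (an assumption made for this sketch). */
typedef int (*bench_fn) (void *arg);

/* Hypothetical stand-in for the function being benchmarked. */
static int
dummy_check(void *arg)
{
	return *(int *) arg > 0;
}

/* Monotonic clock reading in nanoseconds. */
static int64_t
now_ns(void)
{
	struct timespec ts;

	clock_gettime(CLOCK_MONOTONIC, &ts);
	return (int64_t) ts.tv_sec * 1000000000LL + ts.tv_nsec;
}

/*
 * Call fn(arg) 'iterations' times and return the total elapsed nanoseconds.
 * Results are accumulated into a volatile sink so that the compiler cannot
 * discard the calls.
 */
static int64_t
time_loop_ns(bench_fn fn, void *arg, long iterations)
{
	volatile unsigned sink = 0;
	int64_t		start;
	long		i;

	assert(iterations > 0);
	start = now_ns();
	for (i = 0; i < iterations; i++)
		sink += (unsigned) fn(arg);
	(void) sink;
	return now_ns() - start;
}

/* Average cost of one call: total loop time divided by iteration count. */
static double
per_call_ns(int64_t total_ns, long iterations)
{
	return (double) total_ns / (double) iterations;
}

/* Per-call cost as a percentage of a separately measured baseline time. */
static double
overhead_percent(double per_call, double baseline_ns)
{
	return 100.0 * per_call / baseline_ns;
}
```

For example, a loop total of 423,000,000 ns over 1,000,000 iterations works
out to 423 ns per call; against a hypothetical baseline planning time of
423,000 ns that would be 0.1%.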

I didn't really find a good way to measure the extra overhead in actually
loading the foreign key constraints in get_relation_info().

Regards

David Rowley

Attachments:

anti_join_removal_benchmark.txt (text/plain; charset=US-ASCII)

semianti_join_removal_7be0c95_2014-08-17.patch (application/octet-stream)

diff --git a/src/backend/commands/trigger.c b/src/backend/commands/trigger.c
index 9bf0098..88c8d98 100644
--- a/src/backend/commands/trigger.c
+++ b/src/backend/commands/trigger.c
@@ -3887,6 +3887,17 @@ afterTriggerInvokeEvents(AfterTriggerEventList *events,
 	return all_fired;
 }
 
+/* ----------
+ * AfterTriggerQueueIsEmpty()
+ *
+ *	True if there are no pending triggers in the queue.
+ * ----------
+ */
+bool
+AfterTriggerQueueIsEmpty(void)
+{
+	return (afterTriggers->query_depth == -1);
+}
 
 /* ----------
  * AfterTriggerBeginXact()
diff --git a/src/backend/optimizer/path/equivclass.c b/src/backend/optimizer/path/equivclass.c
index b7aff37..63dbc1b 100644
--- a/src/backend/optimizer/path/equivclass.c
+++ b/src/backend/optimizer/path/equivclass.c
@@ -32,6 +32,7 @@
 static EquivalenceMember *add_eq_member(EquivalenceClass *ec,
 			  Expr *expr, Relids relids, Relids nullable_relids,
 			  bool is_child, Oid datatype);
+static void update_rel_class_joins(PlannerInfo *root);
 static void generate_base_implied_equalities_const(PlannerInfo *root,
 									   EquivalenceClass *ec);
 static void generate_base_implied_equalities_no_const(PlannerInfo *root,
@@ -725,7 +726,6 @@ void
 generate_base_implied_equalities(PlannerInfo *root)
 {
 	ListCell   *lc;
-	Index		rti;
 
 	foreach(lc, root->eq_classes)
 	{
@@ -752,6 +752,19 @@ generate_base_implied_equalities(PlannerInfo *root)
 	 * This is also a handy place to mark base rels (which should all exist by
 	 * now) with flags showing whether they have pending eclass joins.
 	 */
+	update_rel_class_joins(root);
+}
+
+/*
+ * update_rel_class_joins
+ *		Process each relation in the PlannerInfo to update the
+ *		has_eclass_joins flag
+ */
+static void
+update_rel_class_joins(PlannerInfo *root)
+{
+	Index		rti;
+
 	for (rti = 1; rti < root->simple_rel_array_size; rti++)
 	{
 		RelOptInfo *brel = root->simple_rel_array[rti];
@@ -764,6 +777,63 @@ generate_base_implied_equalities(PlannerInfo *root)
 }
 
 /*
+ * remove_rel_from_eclass
+ *		Remove all eclass members that belong to relid and also any classes
+ *		which have been left empty as a result of removing a member.
+ */
+void
+remove_rel_from_eclass(PlannerInfo *root, int relid)
+{
+	ListCell	*l,
+				*nextl,
+				*eqm,
+				*eqmnext;
+
+	bool removedany = false;
+
+	/* Strip all traces of this relation out of the eclasses */
+	for (l = list_head(root->eq_classes); l != NULL; l = nextl)
+	{
+		EquivalenceClass *ec = (EquivalenceClass *) lfirst(l);
+
+		nextl = lnext(l);
+
+		for (eqm = list_head(ec->ec_members); eqm != NULL; eqm = eqmnext)
+		{
+			EquivalenceMember *em = (EquivalenceMember *) lfirst(eqm);
+
+			eqmnext = lnext(eqm);
+
+			if (IsA(em->em_expr, Var))
+			{
+				Var *var = (Var *) em->em_expr;
+
+				if (var->varno == relid)
+				{
+					ec->ec_members = list_delete_ptr(ec->ec_members, em);
+					removedany = true;
+				}
+			}
+		}
+
+		/*
+		 * If we've removed the last member from the EquivalenceClass then we'd
+		 * better delete the entire entry.
+		 */
+		if (list_length(ec->ec_members) == 0)
+			root->eq_classes = list_delete_ptr(root->eq_classes, ec);
+	}
+
+	/*
+	 * If we removed any eclass members then whether a relation has an eclass
+	 * join may have changed, so we'd better force an update of the
+	 * has_eclass_joins flags.
+	 */
+	if (removedany)
+		update_rel_class_joins(root);
+}
+
+/*
  * generate_base_implied_equalities when EC contains pseudoconstant(s)
  */
 static void
diff --git a/src/backend/optimizer/plan/analyzejoins.c b/src/backend/optimizer/plan/analyzejoins.c
index 773f8a4..b0a11d7 100644
--- a/src/backend/optimizer/plan/analyzejoins.c
+++ b/src/backend/optimizer/plan/analyzejoins.c
@@ -22,17 +22,33 @@
  */
 #include "postgres.h"
 
+#include "commands/trigger.h"
+#include "nodes/makefuncs.h"
 #include "nodes/nodeFuncs.h"
+#include "nodes/relation.h"
 #include "optimizer/clauses.h"
 #include "optimizer/joininfo.h"
 #include "optimizer/pathnode.h"
 #include "optimizer/paths.h"
 #include "optimizer/planmain.h"
+#include "optimizer/restrictinfo.h"
 #include "optimizer/tlist.h"
 #include "utils/lsyscache.h"
 
 /* local functions */
-static bool join_is_removable(PlannerInfo *root, SpecialJoinInfo *sjinfo);
+static bool leftjoin_is_removable(PlannerInfo *root, SpecialJoinInfo *sjinfo);
+static bool semiorantijoin_is_removable(PlannerInfo *root,
+					  SpecialJoinInfo *sjinfo, List **leftrelcolumns,
+					  RelOptInfo **leftrel);
+void convert_semijoin_to_isnotnull_quals(PlannerInfo *root, RelOptInfo *rel,
+					  List *columnlist);
+void convert_antijoin_to_isnull_quals(PlannerInfo *root, RelOptInfo *rel,
+					  List *columnlist);
+static bool relation_has_foreign_key_for(PlannerInfo *root, RelOptInfo *rel,
+					  RelOptInfo *referencedrel, List *referencing_vars,
+					  List *index_vars, List *operator_list);
+static bool expressions_match_foreign_key(ForeignKeyInfo *fk, List *fkvars,
+					  List *indexvars, List *operators);
 static void remove_rel_from_query(PlannerInfo *root, int relid,
 					  Relids joinrelids);
 static List *remove_rel_from_joinlist(List *joinlist, int relid, int *nremoved);
@@ -53,8 +69,8 @@ remove_useless_joins(PlannerInfo *root, List *joinlist)
 	ListCell   *lc;
 
 	/*
-	 * We are only interested in relations that are left-joined to, so we can
-	 * scan the join_info_list to find them easily.
+	 * We are only interested in relations that are left, semi or anti-joined
+	 * to, so we can scan the join_info_list to find them easily.
 	 */
 restart:
 	foreach(lc, root->join_info_list)
@@ -63,14 +79,41 @@ restart:
 		int			innerrelid;
 		int			nremoved;
 
-		/* Skip if not removable */
-		if (!join_is_removable(root, sjinfo))
-			continue;
+		if (sjinfo->jointype == JOIN_LEFT)
+		{
+			/* Skip if not removable */
+			if (!leftjoin_is_removable(root, sjinfo))
+				continue;
+		}
+		else if (sjinfo->jointype == JOIN_SEMI)
+		{
+			List	   *columnlist;
+			RelOptInfo *rel;
+
+			if (!semiorantijoin_is_removable(root, sjinfo, &columnlist, &rel))
+				continue;
+
+			Assert(columnlist != NIL);
+			convert_semijoin_to_isnotnull_quals(root, rel, columnlist);
+		}
+		else if (sjinfo->jointype == JOIN_ANTI)
+		{
+			List	   *columnlist;
+			RelOptInfo *rel;
+
+			if (!semiorantijoin_is_removable(root, sjinfo, &columnlist, &rel))
+				continue;
+
+			Assert(columnlist != NIL);
+			convert_antijoin_to_isnull_quals(root, rel, columnlist);
+		}
+		else
+			continue; /* we don't support this join type */
 
 		/*
-		 * Currently, join_is_removable can only succeed when the sjinfo's
-		 * righthand is a single baserel.  Remove that rel from the query and
-		 * joinlist.
+		 * Currently, all of the functions which test if join removals are
+		 * possible can only succeed when the sjinfo's righthand is a single
+		 * baserel.  Remove that rel from the query and joinlist.
 		 */
 		innerrelid = bms_singleton_member(sjinfo->min_righthand);
 
@@ -136,8 +179,8 @@ clause_sides_match_join(RestrictInfo *rinfo, Relids outerrelids,
 }
 
 /*
- * join_is_removable
- *	  Check whether we need not perform this special join at all, because
+ * leftjoin_is_removable
+ *	  Check whether we need not perform this left join at all, because
  *	  it will just duplicate its left input.
  *
  * This is true for a left join for which the join condition cannot match
@@ -147,7 +190,7 @@ clause_sides_match_join(RestrictInfo *rinfo, Relids outerrelids,
  * above the join.
  */
 static bool
-join_is_removable(PlannerInfo *root, SpecialJoinInfo *sjinfo)
+leftjoin_is_removable(PlannerInfo *root, SpecialJoinInfo *sjinfo)
 {
 	int			innerrelid;
 	RelOptInfo *innerrel;
@@ -157,12 +200,13 @@ join_is_removable(PlannerInfo *root, SpecialJoinInfo *sjinfo)
 	ListCell   *l;
 	int			attroff;
 
+	Assert(sjinfo->jointype == JOIN_LEFT);
+
 	/*
-	 * Must be a non-delaying left join to a single baserel, else we aren't
+	 * Must be a non-delaying join to a single baserel, else we aren't
 	 * going to be able to do anything with it.
 	 */
-	if (sjinfo->jointype != JOIN_LEFT ||
-		sjinfo->delay_upper_joins ||
+	if (sjinfo->delay_upper_joins ||
 		bms_membership(sjinfo->min_righthand) != BMS_SINGLETON)
 		return false;
 
@@ -367,6 +411,533 @@ join_is_removable(PlannerInfo *root, SpecialJoinInfo *sjinfo)
 	return false;
 }
 
+/*
+ * semiorantijoin_is_removable
+ *	  True if we can prove that the semi or anti join is redundant due to the
+ *	  existence of a foreign key constraint.
+ *
+ * Detecting if a SEMI or ANTI join may be removed is quite different to the
+ * detection code for left joins. For these we have no need to check if vars
+ * from the join are used in the query as the EXISTS and IN() syntax disallow
+ * this. In order to prove that a semi or anti join is redundant we must ensure
+ * that a foreign key exists on the left side of the join which references the
+ * table on the right side of the join. This means that we can only support a
+ * single table on either side of the join. We must also ensure that the join
+ * condition matches all the foreign key columns to each index column on the
+ * referenced table. If any columns are missing then we cannot be sure we'll
+ * get at most 1 record back, and if there are any extra conditions that don't
+ * exist in the foreign key then we cannot be sure that the join condition will
+ * match at least 1 row.
+ *
+ * If we manage to find a foreign key which will allow the join to be removed
+ * then the caller may have to add NULL checking to the query in place of the
+ * join. For example if we determine that the join to the table b is not needed
+ * due to the existence of a foreign key on a.b_id referencing b.id in the
+ * following query:
+ *
+ * SELECT * FROM a WHERE EXISTS(SELECT 1 FROM b WHERE a.b_id = b.id);
+ *
+ * Then the only possible records that could be returned from 'a' are the ones
+ * WHERE b_id IS NOT NULL.
+ *
+ * If this function returns True, then leftrelcolumns will be populated with
+ * the list of columns from the left relation which exist in the join
+ * condition, leftrel will be set to the RelOptInfo of the left hand relation.
+ *
+ * Note: The chance of a join being removed here is probably fairly small, so
+ * we must try hard to fast path out early at the first hint that the join
+ * cannot be removed.
+ */
+static bool
+semiorantijoin_is_removable(PlannerInfo *root, SpecialJoinInfo *sjinfo,
+		List **leftrelcolumns, RelOptInfo **leftrel)
+{
+	int			innerrelid;
+	int			outerrelid;
+	RelOptInfo *innerrel;
+	RelOptInfo *outerrel;
+	ListCell   *lc;
+	List	   *referencing_vars;
+	List	   *index_vars;
+	List	   *operator_list;
+
+	Assert(sjinfo->jointype == JOIN_SEMI || sjinfo->jointype == JOIN_ANTI);
+
+	/*
+	 * We mustn't allow semi or anti joins to be removed if there are any
+	 * pending foreign key triggers in the queue. This could happen if we
+	 * are planning a query that has been executed from within a volatile
+	 * function and the query which called this volatile function has made some
+	 * changes to a table referenced by a foreign key. The reason for this is
+	 * that any updates to a table which is referenced by a foreign key
+	 * constraint will only have the referencing tables updated after the
+	 * command is complete, so there is a window of time where records may
+	 * violate the foreign key constraint.
+	 *
+	 * Currently this code is quite naive, as we won't even attempt to remove
+	 * the join if there are *any* pending foreign key triggers, on any
+	 * relation. It may be worthwhile to improve this to check if there's any
+	 * pending triggers for the referencing relation in the join.
+	 */
+	if (!AfterTriggerQueueIsEmpty())
+		return false;
+
+	/*
+	 * We'll start by checking that the left hand relation is a singleton
+	 * and that it has at least 1 foreign key constraint.  A lack of foreign
+	 * key seems like a more likely possibility to allow us to exit early than
+	 * checking the right hand rel has any indexes.
+	 */
+	if (sjinfo->delay_upper_joins ||
+		bms_membership(sjinfo->min_lefthand) != BMS_SINGLETON)
+		return false;
+
+	outerrelid = bms_singleton_member(sjinfo->min_lefthand);
+	outerrel = find_base_rel(root, outerrelid);
+
+	/*
+	 * There's no possibility to remove the join if the outer rel is not a
+	 * baserel or the baserel has no foreign keys defined.
+	 */
+	if (outerrel->reloptkind != RELOPT_BASEREL ||
+		outerrel->rtekind != RTE_RELATION ||
+		outerrel->fklist == NIL)
+		return false;
+
+	if (bms_membership(sjinfo->min_righthand) != BMS_SINGLETON)
+		return false;
+
+	innerrelid = bms_singleton_member(sjinfo->min_righthand);
+	innerrel = find_base_rel(root, innerrelid);
+
+	/*
+	 * If the right hand relation is not a base rel then it can't possibly be
+	 * referenced by a foreign key. The same goes if there are no unique indexes
+	 * on the relation, however, to keep it simple here we'll make do with
+	 * checking if there are any indexes, as if there are no indexes then there
+	 * are certainly no unique indexes.
+	 */
+	if (innerrel->reloptkind != RELOPT_BASEREL ||
+		innerrel->rtekind != RTE_RELATION ||
+		innerrel->indexlist == NIL)
+		return false;
+
+	referencing_vars = NIL;
+	index_vars = NIL;
+	operator_list = NIL;
+
+	/*
+	 * We now pre-process the join quals into lists that contain the Vars from
+	 * either side of the joins and also a list which contains the operators
+	 * from the join conditions. At this stage we may still discover that the
+	 * join cannot be removed if, for example we find a qual that does not
+	 * reference both sides of the join. Note that we'll reject any operand
+	 * that's not a Var here, as a foreign key cannot reference an expression
+	 * index.
+	 *
+	 * referencing_vars will contain a list of Vars from the left hand
+	 * relation, these are the Vars that we'll check against the referencing
+	 * side of the foreign key.
+	 *
+	 * index_vars will contain a list of Vars from the right hand relation,
+	 * these are the Vars that we'll check on the referenced side of the
+	 * foreign key.
+	 *
+	 * operator_list will contain a list of operator Oids that we'll need to
+	 * ensure are compatible with the operator specified in the foreign key.
+	 */
+	foreach(lc, sjinfo->join_quals)
+	{
+		OpExpr *opexpr = (OpExpr *) lfirst(lc);
+		Oid		opno;
+		Var	   *left_var;
+		Var	   *right_var;
+
+		/* Is it a binary opclause? */
+		if (!IsA(opexpr, OpExpr) ||
+			list_length(opexpr->args) != 2)
+		{
+			/* We only accept binary operator clauses. */
+			return false;
+		}
+
+		left_var = (Var *) linitial(opexpr->args);
+
+		/* Punt if the left operand is anything apart from a Var */
+		if (!IsA(left_var, Var) || left_var->varlevelsup > 0)
+			return false;
+
+		right_var = (Var *) lsecond(opexpr->args);
+
+		/* Punt if the right operand is anything apart from a Var */
+		if (!IsA(right_var, Var) || right_var->varlevelsup > 0)
+			return false;
+
+		opno = opexpr->opno;
+
+		/*
+		 * FIXME: it would be nice to fast path out if the
+		 * operator couldn't possibly be used in a foreign
+		 * key, but what to use to detect this?
+		 */
+		if (!op_mergejoinable(opno, left_var->vartype))
+			return false;
+
+		/* detect which operand belongs to which relation */
+		if (bms_is_member(right_var->varno, sjinfo->syn_righthand) &&
+			bms_is_member(left_var->varno, sjinfo->syn_lefthand))
+		{
+			/* typical case, right_var belongs to RHS */
+			referencing_vars = lappend(referencing_vars, left_var);
+			index_vars = lappend(index_vars, right_var);
+		}
+		else if (bms_is_member(left_var->varno, sjinfo->syn_righthand) &&
+				 bms_is_member(right_var->varno, sjinfo->syn_lefthand))
+		{
+			/* swapped case, right_var belongs to LHS */
+			referencing_vars = lappend(referencing_vars, right_var);
+			index_vars = lappend(index_vars, left_var);
+		}
+		else
+		{
+			/* qual does not reference both sides of the join, punt */
+			return false;
+		}
+		operator_list = lappend_oid(operator_list, opno);
+	}
+
+	/* no suitable join condition items? Then we can't remove the join */
+	if (referencing_vars == NIL)
+		return false;
+
+	/*
+	 * Now that we've built the join Var lists we can check if there are any
+	 * foreign keys that will support removing the join.
+	 */
+	if (relation_has_foreign_key_for(root, outerrel, innerrel,
+				referencing_vars, index_vars, operator_list))
+	{
+		*leftrel = outerrel;
+		*leftrelcolumns = referencing_vars;
+		return true;
+	}
+
+	return false;
+}
+
+/*
+ * convert_semijoin_to_isnotnull_quals
+ *		Adds any "col IS NOT NULL" quals which are required to ensure
+ *		that the query remains equivalent to what it was before the semi join
+ *		was removed.
+ */
+void
+convert_semijoin_to_isnotnull_quals(PlannerInfo *root, RelOptInfo *rel, List *columnlist)
+{
+	ListCell	*l;
+	Bitmapset	*handledcols = NULL;
+	Oid			 reloid;
+
+	reloid = root->simple_rte_array[rel->relid]->relid;
+
+	/*
+	 * If a semi join has been successfully removed by the join removal code,
+	 * then a foreign key must exist that proves the join to not be required.
+	 *
+	 * The semi join would have never allowed NULL values for any of the
+	 * columns seen in the join condition, as a NULL could never have matched
+	 * a record in the joined table. Now that we've proved the join to be
+	 * redundant, we must maintain that behavior of not having NULLs by adding
+	 * IS NOT NULL quals to the WHERE clause, although we may skip this if the
+	 * column in question happens to have a NOT NULL constraint.
+	 */
+	foreach(l, columnlist)
+	{
+		Var *var = (Var *) lfirst(l);
+
+		/* should be a var if it came from a foreign key */
+		Assert(IsA(var, Var));
+		Assert(var->varno == rel->relid);
+
+		/*
+		 * Skip this column if it's a duplicate of one we've previously
+		 * handled.
+		 */
+		if (bms_is_member(var->varattno, handledcols))
+			continue;
+
+		/* mark this column as handled */
+		handledcols = bms_add_member(handledcols, var->varattno);
+
+		/* add the IS NOT NULL qual, but only if the column allows NULLs */
+		if (!get_attnotnull(reloid, var->varattno))
+		{
+			RestrictInfo *rinfo;
+			NullTest *ntest = makeNode(NullTest);
+
+			ntest->nulltesttype = IS_NOT_NULL;
+			ntest->arg = (Expr *) var;
+			ntest->argisrow = false;
+
+			rinfo = make_restrictinfo((Expr *)ntest, false, false, false,
+						NULL, NULL, NULL);
+			rel->baserestrictinfo = lappend(rel->baserestrictinfo, rinfo);
+		}
+	}
+}
+
+/*
+ * convert_antijoin_to_isnull_quals
+ *		Adds any "col IS NULL" quals which are required to ensure
+ *		that the query remains equivalent to what it was before the anti join
+ *		was removed.
+ */
+void
+convert_antijoin_to_isnull_quals(PlannerInfo *root, RelOptInfo *rel, List *columnlist)
+{
+	ListCell	 *l;
+	RestrictInfo *rinfo;
+	Expr		 *expr;
+	List		 *isnulltests = NIL;
+	Bitmapset	 *handledcols = NULL;
+	Oid			 reloid;
+
+	reloid = root->simple_rte_array[rel->relid]->relid;
+
+	/*
+	 * If an anti join has been successfully removed by the join removal code,
+	 * then a foreign key must exist that proves the join to not be required.
+	 *
+	 * The foreign key which proved this join redundant would ensure that each
+	 * record in the referencing rel has a matching record in the referenced
+	 * rel. Though this is not quite true when it comes to NULL valued columns,
+	 * as these won't reference any record. So here, in order to make the query
+	 * produce equivalent results as it would have done with the anti join,
+	 * we'll just ensure that only these NULL valued columns can make their way
+	 * into the final result set. There is also a special case here, if all of
+	 * the columns in the foreign key happen to have a NOT NULL constraint then
+	 * no records can match, so in this case we'll add a "WHERE false" in order
+	 * to save the executor from wasting any time.
+	 */
+	foreach(l, columnlist)
+	{
+		Var			  *var = (Var *) lfirst(l);
+
+		/* should be a var if it came from a foreign key */
+		Assert(IsA(var, Var));
+
+		/*
+		 * Skip this column if it's a duplicate of one we've previously
+		 * handled.
+		 */
+		if (bms_is_member(var->varattno, handledcols))
+			continue;
+
+		/* mark this column as handled */
+		handledcols = bms_add_member(handledcols, var->varattno);
+
+		/*
+		 * No point in adding a col IS NULL if the column
+		 * has a NOT NULL constraint defined for it.
+		 */
+		if (!get_attnotnull(reloid, var->varattno))
+		{
+			NullTest *ntest = makeNode(NullTest);
+			ntest->nulltesttype = IS_NULL;
+			ntest->arg = (Expr *) var;
+			ntest->argisrow = false;
+
+			isnulltests = lappend(isnulltests, ntest);
+		}
+	}
+
+	/*
+	 * If we still have an empty list by the time we get to here then it would
+	 * appear that each column has a NOT NULL constraint. In this case then
+	 * it's not possible for the query to return any records, so we can simply
+	 * add a "WHERE false" constant expression and tell the planner to check
+	 * for gating quals.
+	 */
+	if (isnulltests == NIL)
+	{
+		expr = (Expr *) makeBoolConst(false, false);
+		rinfo = make_restrictinfo(expr, false, false, true, NULL, NULL, NULL);
+
+		/* tell createplan.c to check for gating quals */
+		root->hasPseudoConstantQuals = true;
+	}
+	else
+	{
+		/*
+		 * Now we can build a RestrictInfo for the newly created IS NULL tests.
+		 * If there's only 1 test expression then we can just make the
+		 * RestrictInfo use that expression, if there's more than 1 we'll need
+		 * to "OR" all of these together.
+		 */
+		if (list_length(isnulltests) == 1)
+			expr = (Expr *) linitial(isnulltests);
+		else
+			expr = make_orclause(isnulltests);
+
+		rinfo = make_restrictinfo(expr, false, false, false, NULL, NULL, NULL);
+	}
+
+	rel->baserestrictinfo = lappend(rel->baserestrictinfo, rinfo);
+}
+
+/*
+ * relation_has_foreign_key_for
+ *	  Checks if rel has a foreign key which references referencedrel with the
+ *	  given list of expressions.
+ *
+ *	For the match to succeed:
+ *	  referencing_vars must match the columns defined in the foreign key.
+ *	  index_vars must match the columns defined in the index for the foreign key.
+ */
+static bool
+relation_has_foreign_key_for(PlannerInfo *root, RelOptInfo *rel,
+			RelOptInfo *referencedrel, List *referencing_vars,
+			List *index_vars, List *operator_list)
+{
+	ListCell *lc;
+	Oid		  refreloid;
+
+	/*
+	 * Look up the Oid of the referenced relation. We only want to look at
+	 * foreign keys on the referencing relation which reference this relation.
+	 */
+	refreloid = root->simple_rte_array[referencedrel->relid]->relid;
+
+	Assert(list_length(referencing_vars) > 0);
+	Assert(list_length(referencing_vars) == list_length(index_vars));
+	Assert(list_length(referencing_vars) == list_length(operator_list));
+
+	/*
+	 * Search through each foreign key on the referencing relation and try
+	 * to find one which references the relation in the join condition. If we
+	 * find one then we'll send the join conditions off to
+	 * expressions_match_foreign_key() to see if they match the foreign key.
+	 */
+	foreach(lc, rel->fklist)
+	{
+		ForeignKeyInfo *fk = (ForeignKeyInfo *) lfirst(lc);
+
+		if (fk->confrelid == refreloid)
+		{
+			if (expressions_match_foreign_key(fk, referencing_vars,
+				index_vars, operator_list))
+				return true;
+		}
+	}
+
+	return false;
+}
+
+/*
+ * expressions_match_foreign_key
+ *		True if the given fkvars, indexvars and operators will match
+ *		exactly 1 record in the referenced relation of the foreign key.
+ *
+ * Note: This function expects fkvars and indexvars to only contain Var types.
+ *       Expression indexes are not supported by foreign keys.
+ */
+static bool
+expressions_match_foreign_key(ForeignKeyInfo *fk, List *fkvars,
+					List *indexvars, List *operators)
+{
+	ListCell  *lc;
+	ListCell  *lc2;
+	ListCell  *lc3;
+	int		   col;
+	Bitmapset *allitems;
+	Bitmapset *matcheditems;
+	int		   lstidx;
+
+	Assert(list_length(fkvars) == list_length(indexvars));
+	Assert(list_length(fkvars) == list_length(operators));
+
+	/*
+	 * Fast path out if there are not enough conditions to match each column in
+	 * the foreign key. Note that we cannot check that the number of
+	 * expressions are equal here since it would cause any expressions which
+	 * are duplicated not to match.
+	 */
+	if (list_length(fkvars) < fk->conncols)
+		return false;
+
+	/*
+	 * We need to ensure that each foreign key column can be matched to a list
+	 * item, and we need to ensure that each list item can be matched to a
+	 * foreign key column. We do this by looping over each foreign key column
+	 * and checking that we can find an item in the list which matches the
+	 * current column, however this method does not allow us to ensure that no
+	 * additional items exist in the list. We could solve that by performing
+	 * another loop over each list item and check that it matches a foreign
+	 * key column, but that's a bit wasteful. Instead we'll use 2 bitmapsets,
+	 * one to store the 0 based index of each list item, and with the other
+	 * we'll store each list index that we've managed to match. After we're
+	 * done matching we'll just make sure that both bitmapsets are equal.
+	 */
+	allitems = NULL;
+	matcheditems = NULL;
+
+	/*
+	 * Build a bitmapset which contains each 0 based list index. It seems more
+	 * efficient to do this in reverse so that we allocate enough memory for
+	 * the bitmapset on first loop rather than reallocating each time we find
+	 * we need a bit more space.
+	 */
+	for (lstidx = list_length(fkvars) - 1; lstidx >= 0; lstidx--)
+		allitems = bms_add_member(allitems, lstidx);
+
+	for (col = 0; col < fk->conncols; col++)
+	{
+		bool  matched = false;
+
+		lstidx = 0;
+
+		forthree(lc, fkvars, lc2, indexvars, lc3, operators)
+		{
+			Var *expr = (Var *) lfirst(lc);
+			Var *idxexpr = (Var *) lfirst(lc2);
+			Oid  opr = lfirst_oid(lc3);
+
+			Assert(IsA(expr, Var));
+			Assert(IsA(idxexpr, Var));
+
+			/* Does this join qual match up to the current fkey column? */
+			if (fk->conkey[col] == expr->varattno &&
+				fk->confkey[col] == idxexpr->varattno &&
+				equality_ops_are_compatible(opr, fk->conpfeqop[col]))
+			{
+				matched = true;
+
+				/* mark this list item as matched */
+				matcheditems = bms_add_member(matcheditems, lstidx);
+
+				/*
+				 * Don't break here as there may be duplicate expressions
+				 * that we also need to match against.
+				 */
+			}
+			lstidx++;
+		}
+
+		/* punt if there's no match. */
+		if (!matched)
+			return false;
+	}
+
+	/*
+	 * Ensure that we managed to match every item in the list to a foreign key
+	 * column.
+	 */
+	if (!bms_equal(allitems, matcheditems))
+		return false;
+
+	return true; /* matched */
+}
+
 
 /*
  * Remove the target relid from the planner's data structures, having
@@ -393,6 +964,9 @@ remove_rel_from_query(PlannerInfo *root, int relid, Relids joinrelids)
 	 */
 	rel->reloptkind = RELOPT_DEADREL;
 
+	/* Strip out any eclass members that belong to this rel */
+	remove_rel_from_eclass(root, relid);
+
 	/*
 	 * Remove references to the rel from other baserels' attr_needed arrays.
 	 */
diff --git a/src/backend/optimizer/util/plancat.c b/src/backend/optimizer/util/plancat.c
index b2becfa..0b1c1a6 100644
--- a/src/backend/optimizer/util/plancat.c
+++ b/src/backend/optimizer/util/plancat.c
@@ -25,7 +25,9 @@
 #include "access/transam.h"
 #include "access/xlog.h"
 #include "catalog/catalog.h"
+#include "catalog/pg_constraint.h"
 #include "catalog/heap.h"
+#include "catalog/pg_type.h"
 #include "foreign/fdwapi.h"
 #include "miscadmin.h"
 #include "nodes/makefuncs.h"
@@ -38,6 +40,7 @@
 #include "parser/parsetree.h"
 #include "rewrite/rewriteManip.h"
 #include "storage/bufmgr.h"
+#include "utils/fmgroids.h"
 #include "utils/lsyscache.h"
 #include "utils/rel.h"
 #include "utils/snapmgr.h"
@@ -89,6 +92,13 @@ get_relation_info(PlannerInfo *root, Oid relationObjectId, bool inhparent,
 	Relation	relation;
 	bool		hasindex;
 	List	   *indexinfos = NIL;
+	List	   *fkinfos = NIL;
+	Relation	fkeyRel;
+	Relation	fkeyRelIdx;
+	ScanKeyData fkeyScankey;
+	SysScanDesc fkeyScan;
+	HeapTuple	tuple;
+
 
 	/*
 	 * We need not lock the relation since it was already locked, either by
@@ -384,6 +394,111 @@ get_relation_info(PlannerInfo *root, Oid relationObjectId, bool inhparent,
 
 	heap_close(relation, NoLock);
 
+	ScanKeyInit(&fkeyScankey,
+				Anum_pg_constraint_conrelid,
+				BTEqualStrategyNumber, F_OIDEQ,
+				ObjectIdGetDatum(relationObjectId));
+
+	fkeyRel = heap_open(ConstraintRelationId, AccessShareLock);
+	fkeyRelIdx = index_open(ConstraintRelidIndexId, AccessShareLock);
+	fkeyScan = systable_beginscan_ordered(fkeyRel, fkeyRelIdx, NULL, 1, &fkeyScankey);
+
+	while ((tuple = systable_getnext_ordered(fkeyScan, ForwardScanDirection)) != NULL)
+	{
+		Form_pg_constraint con = (Form_pg_constraint) GETSTRUCT(tuple);
+		ForeignKeyInfo *fkinfo;
+		Datum		adatum;
+		bool		isNull;
+		ArrayType  *arr;
+		int			numkeys;
+
+		/* Not a foreign key */
+		if (con->contype != CONSTRAINT_FOREIGN)
+			continue;
+
+		/* we're not interested unless the fk has been validated */
+		if (!con->convalidated)
+			continue;
+
+		fkinfo = (ForeignKeyInfo *) palloc(sizeof(ForeignKeyInfo));
+		fkinfo->conindid = con->conindid;
+		fkinfo->confrelid = con->confrelid;
+		fkinfo->convalidated = con->convalidated;
+		fkinfo->conrelid = con->conrelid;
+		fkinfo->confupdtype = con->confupdtype;
+		fkinfo->confdeltype = con->confdeltype;
+		fkinfo->confmatchtype = con->confmatchtype;
+
+		adatum = heap_getattr(tuple, Anum_pg_constraint_conkey,
+							RelationGetDescr(fkeyRel), &isNull);
+
+		if (isNull)
+			elog(ERROR, "null conkey for constraint %u",
+				HeapTupleGetOid(tuple));
+
+		arr = DatumGetArrayTypeP(adatum);		/* ensure not toasted */
+		numkeys = ARR_DIMS(arr)[0];
+		if (ARR_NDIM(arr) != 1 ||
+			numkeys < 0 ||
+			ARR_HASNULL(arr) ||
+			ARR_ELEMTYPE(arr) != INT2OID)
+			elog(ERROR, "conkey is not a 1-D smallint array");
+
+		fkinfo->conkey = (int16 *) ARR_DATA_PTR(arr);
+		fkinfo->conncols = numkeys;
+
+		adatum = heap_getattr(tuple, Anum_pg_constraint_confkey,
+							RelationGetDescr(fkeyRel), &isNull);
+
+		if (isNull)
+			elog(ERROR, "null confkey for constraint %u",
+				HeapTupleGetOid(tuple));
+
+		arr = DatumGetArrayTypeP(adatum);		/* ensure not toasted */
+		numkeys = ARR_DIMS(arr)[0];
+
+		/* sanity check */
+		if (numkeys != fkinfo->conncols)
+			elog(ERROR, "number of confkey elements does not equal conkey elements");
+
+		if (ARR_NDIM(arr) != 1 ||
+			numkeys < 0 ||
+			ARR_HASNULL(arr) ||
+			ARR_ELEMTYPE(arr) != INT2OID)
+			elog(ERROR, "confkey is not a 1-D smallint array");
+
+		fkinfo->confkey = (int16 *) ARR_DATA_PTR(arr);
+		adatum = heap_getattr(tuple, Anum_pg_constraint_conpfeqop,
+							RelationGetDescr(fkeyRel), &isNull);
+
+		if (isNull)
+			elog(ERROR, "null conpfeqop for constraint %u",
+				HeapTupleGetOid(tuple));
+
+		arr = DatumGetArrayTypeP(adatum);		/* ensure not toasted */
+		numkeys = ARR_DIMS(arr)[0];
+
+		/* sanity check */
+		if (numkeys != fkinfo->conncols)
+			elog(ERROR, "number of conpfeqop elements does not equal conkey elements");
+
+		if (ARR_NDIM(arr) != 1 ||
+			numkeys < 0 ||
+			ARR_HASNULL(arr) ||
+			ARR_ELEMTYPE(arr) != OIDOID)
+			elog(ERROR, "conpfeqop is not a 1-D Oid array");
+
+		fkinfo->conpfeqop = (Oid *) ARR_DATA_PTR(arr);
+
+		fkinfos = lappend(fkinfos, fkinfo);
+	}
+
+	rel->fklist = fkinfos;
+	systable_endscan_ordered(fkeyScan);
+	index_close(fkeyRelIdx, AccessShareLock);
+	heap_close(fkeyRel, AccessShareLock);
+
+
 	/*
 	 * Allow a plugin to editorialize on the info we obtained from the
 	 * catalogs.  Actions might include altering the assumed relation size,
diff --git a/src/backend/optimizer/util/relnode.c b/src/backend/optimizer/util/relnode.c
index c938c27..a0fb8eb 100644
--- a/src/backend/optimizer/util/relnode.c
+++ b/src/backend/optimizer/util/relnode.c
@@ -115,6 +115,7 @@ build_simple_rel(PlannerInfo *root, int relid, RelOptKind reloptkind)
 	rel->lateral_relids = NULL;
 	rel->lateral_referencers = NULL;
 	rel->indexlist = NIL;
+	rel->fklist = NIL;
 	rel->pages = 0;
 	rel->tuples = 0;
 	rel->allvisfrac = 0;
@@ -377,6 +378,7 @@ build_join_rel(PlannerInfo *root,
 	joinrel->lateral_relids = NULL;
 	joinrel->lateral_referencers = NULL;
 	joinrel->indexlist = NIL;
+	joinrel->fklist = NIL;
 	joinrel->pages = 0;
 	joinrel->tuples = 0;
 	joinrel->allvisfrac = 0;
diff --git a/src/backend/utils/cache/lsyscache.c b/src/backend/utils/cache/lsyscache.c
index 552e498..aa81c7c 100644
--- a/src/backend/utils/cache/lsyscache.c
+++ b/src/backend/utils/cache/lsyscache.c
@@ -916,6 +916,33 @@ get_atttypetypmodcoll(Oid relid, AttrNumber attnum,
 	ReleaseSysCache(tp);
 }
 
+/*
+ * get_attnotnull
+ *
+ *		Given the relation id and the attribute number,
+ *		return the "attnotnull" field from the attribute relation.
+ */
+bool
+get_attnotnull(Oid relid, AttrNumber attnum)
+{
+	HeapTuple	tp;
+
+	tp = SearchSysCache2(ATTNUM,
+						 ObjectIdGetDatum(relid),
+						 Int16GetDatum(attnum));
+	if (HeapTupleIsValid(tp))
+	{
+		Form_pg_attribute att_tup = (Form_pg_attribute) GETSTRUCT(tp);
+		bool		result;
+
+		result = att_tup->attnotnull;
+		ReleaseSysCache(tp);
+		return result;
+	}
+	else
+		return false;
+}
+
 /*				---------- COLLATION CACHE ----------					 */
 
 /*
diff --git a/src/include/commands/trigger.h b/src/include/commands/trigger.h
index d0b0356..34a75e4 100644
--- a/src/include/commands/trigger.h
+++ b/src/include/commands/trigger.h
@@ -181,6 +181,7 @@ extern void ExecBSTruncateTriggers(EState *estate,
 extern void ExecASTruncateTriggers(EState *estate,
 					   ResultRelInfo *relinfo);
 
+extern bool AfterTriggerQueueIsEmpty(void);
 extern void AfterTriggerBeginXact(void);
 extern void AfterTriggerBeginQuery(void);
 extern void AfterTriggerEndQuery(EState *estate);
diff --git a/src/include/nodes/relation.h b/src/include/nodes/relation.h
index dacbe9c..f69df09 100644
--- a/src/include/nodes/relation.h
+++ b/src/include/nodes/relation.h
@@ -355,6 +355,8 @@ typedef struct PlannerInfo
  *		lateral_referencers - relids of rels that reference this one laterally
  *		indexlist - list of IndexOptInfo nodes for relation's indexes
  *					(always NIL if it's not a table)
+ *		fklist - list of ForeignKeyInfo's for relation's foreign key
+ *					constraints. (always NIL if it's not a table)
  *		pages - number of disk pages in relation (zero if not a table)
  *		tuples - number of tuples in relation (not considering restrictions)
  *		allvisfrac - fraction of disk pages that are marked all-visible
@@ -448,6 +450,7 @@ typedef struct RelOptInfo
 	Relids		lateral_relids; /* minimum parameterization of rel */
 	Relids		lateral_referencers;	/* rels that reference me laterally */
 	List	   *indexlist;		/* list of IndexOptInfo */
+	List	   *fklist;			/* list of ForeignKeyInfo */
 	BlockNumber pages;			/* size estimates derived from pg_class */
 	double		tuples;
 	double		allvisfrac;
@@ -538,6 +541,51 @@ typedef struct IndexOptInfo
 	bool		amhasgetbitmap; /* does AM have amgetbitmap interface? */
 } IndexOptInfo;
 
+/*
+ * ForeignKeyInfo
+ *		Used to store pg_constraint records for foreign key constraints for use
+ *		by the planner.
+ *
+ *		conindid - The index which supports the foreign key
+ *
+ *		confrelid - The relation that is referenced by this foreign key
+ *
+ *		convalidated - True if the foreign key has been validated.
+ *
+ *		conrelid - The Oid of the relation that the foreign key belongs to
+ *
+ *		confupdtype - ON UPDATE action for when the referenced table is updated
+ *
+ *		confdeltype - ON DELETE action, controls what to do when a record is
+ *					deleted from the referenced table.
+ *
+ *		confmatchtype - foreign key match type, e.g. MATCH FULL, MATCH PARTIAL
+ *
+ *		conncols - Number of columns defined in the foreign key
+ *
+ *		conkey - An array of conncols elements to store the varattno of the
+ *					columns on the referencing side of the foreign key
+ *
+ *		confkey - An array of conncols elements to store the varattno of the
+ *					columns on the referenced side of the foreign key
+ *
+ *		conpfeqop - An array of conncols elements to store the operators for
+ *					PK = FK comparisons
+ */
+typedef struct ForeignKeyInfo
+{
+	Oid			conindid;		/* index supporting this constraint */
+	Oid			confrelid;		/* relation referenced by foreign key */
+	bool		convalidated;	/* constraint has been validated? */
+	Oid			conrelid;		/* relation this constraint constrains */
+	char		confupdtype;	/* foreign key's ON UPDATE action */
+	char		confdeltype;	/* foreign key's ON DELETE action */
+	char		confmatchtype;	/* foreign key's match type */
+	int			conncols;		/* number of columns in the foreign key */
+	int16	   *conkey;			/* Columns of conrelid that the constraint applies to */
+	int16	   *confkey;		/* columns of confrelid that foreign key references */
+	Oid		   *conpfeqop;		/* Operator list for comparing PK to FK */
+} ForeignKeyInfo;
 
 /*
  * EquivalenceClasses
diff --git a/src/include/optimizer/paths.h b/src/include/optimizer/paths.h
index 9b22fda..00716c9 100644
--- a/src/include/optimizer/paths.h
+++ b/src/include/optimizer/paths.h
@@ -108,6 +108,7 @@ extern EquivalenceClass *get_eclass_for_sort_expr(PlannerInfo *root,
 						 Relids rel,
 						 bool create_it);
 extern void generate_base_implied_equalities(PlannerInfo *root);
+extern void remove_rel_from_eclass(PlannerInfo *root, int relid);
 extern List *generate_join_implied_equalities(PlannerInfo *root,
 								 Relids join_relids,
 								 Relids outer_relids,
diff --git a/src/include/utils/lsyscache.h b/src/include/utils/lsyscache.h
index 07d24d4..910190d 100644
--- a/src/include/utils/lsyscache.h
+++ b/src/include/utils/lsyscache.h
@@ -68,6 +68,7 @@ extern Oid	get_atttype(Oid relid, AttrNumber attnum);
 extern int32 get_atttypmod(Oid relid, AttrNumber attnum);
 extern void get_atttypetypmodcoll(Oid relid, AttrNumber attnum,
 					  Oid *typid, int32 *typmod, Oid *collid);
+extern bool get_attnotnull(Oid relid, AttrNumber attnum);
 extern char *get_collation_name(Oid colloid);
 extern char *get_constraint_name(Oid conoid);
 extern Oid	get_opclass_family(Oid opclass);
diff --git a/src/test/regress/expected/join.out b/src/test/regress/expected/join.out
index 1cb1c51..d9252c1 100644
--- a/src/test/regress/expected/join.out
+++ b/src/test/regress/expected/join.out
@@ -3249,6 +3249,330 @@ select i8.* from int8_tbl i8 left join (select f1 from int4_tbl group by f1) i4
 (1 row)
 
 rollback;
+BEGIN;
+-- Test join removals for semi and anti joins
+CREATE TEMP TABLE b (id INT NOT NULL PRIMARY KEY, val INT);
+CREATE TEMP TABLE a (id INT NOT NULL PRIMARY KEY, b_id INT REFERENCES b(id));
+-- should remove semi join to b
+EXPLAIN (COSTS OFF)
+SELECT id FROM a WHERE b_id IN(SELECT id FROM b);
+          QUERY PLAN          
+------------------------------
+ Seq Scan on a
+   Filter: (b_id IS NOT NULL)
+(2 rows)
+
+-- should remove semi join to b
+EXPLAIN (COSTS OFF)
+SELECT id FROM a WHERE EXISTS(SELECT 1 FROM b WHERE a.b_id = id);
+          QUERY PLAN          
+------------------------------
+ Seq Scan on a
+   Filter: (b_id IS NOT NULL)
+(2 rows)
+
+-- should remove anti join to b
+EXPLAIN (COSTS OFF)
+SELECT id FROM a WHERE NOT EXISTS(SELECT 1 FROM b WHERE a.b_id = id);
+        QUERY PLAN        
+--------------------------
+ Seq Scan on a
+   Filter: (b_id IS NULL)
+(2 rows)
+
+-- should not remove anti join as id > 100 will void
+-- the foreign key's guarantee that 1 will exist.
+EXPLAIN (COSTS OFF)
+SELECT id FROM a WHERE NOT EXISTS(SELECT 1 FROM b WHERE a.b_id = id AND id > 100);
+                  QUERY PLAN                   
+-----------------------------------------------
+ Hash Anti Join
+   Hash Cond: (a.b_id = b.id)
+   ->  Seq Scan on a
+   ->  Hash
+         ->  Bitmap Heap Scan on b
+               Recheck Cond: (id > 100)
+               ->  Bitmap Index Scan on b_pkey
+                     Index Cond: (id > 100)
+(8 rows)
+
+-- should not remove anti join as val is not part of the foreign key.
+EXPLAIN (COSTS OFF)
+SELECT id FROM a WHERE NOT EXISTS(SELECT 1 FROM b WHERE a.b_id = id AND val = id);
+            QUERY PLAN            
+----------------------------------
+ Hash Anti Join
+   Hash Cond: (a.b_id = b.id)
+   ->  Seq Scan on a
+   ->  Hash
+         ->  Seq Scan on b
+               Filter: (id = val)
+(6 rows)
+
+-- should remove semi join to b (swapped condition order)
+EXPLAIN (COSTS OFF)
+SELECT id FROM a WHERE EXISTS(SELECT 1 FROM b WHERE id = a.b_id);
+          QUERY PLAN          
+------------------------------
+ Seq Scan on a
+   Filter: (b_id IS NOT NULL)
+(2 rows)
+
+-- should not remove semi join (since not using equals)
+EXPLAIN (COSTS OFF)
+SELECT id FROM a WHERE EXISTS(SELECT 1 FROM b WHERE id >= a.b_id);
+               QUERY PLAN                
+-----------------------------------------
+ Nested Loop Semi Join
+   ->  Seq Scan on a
+   ->  Index Only Scan using b_pkey on b
+         Index Cond: (id >= a.b_id)
+(4 rows)
+
+-- should not remove semi join
+EXPLAIN (COSTS OFF)
+SELECT id FROM a WHERE b_id+0 IN(SELECT id FROM b);
+             QUERY PLAN             
+------------------------------------
+ Hash Semi Join
+   Hash Cond: ((a.b_id + 0) = b.id)
+   ->  Seq Scan on a
+   ->  Hash
+         ->  Seq Scan on b
+(5 rows)
+
+-- should not remove semi join
+EXPLAIN (COSTS OFF)
+SELECT id FROM a WHERE b_id IN(SELECT id+0 FROM b);
+             QUERY PLAN             
+------------------------------------
+ Hash Semi Join
+   Hash Cond: (a.b_id = (b.id + 0))
+   ->  Seq Scan on a
+   ->  Hash
+         ->  Seq Scan on b
+(5 rows)
+
+-- should not remove semi join (wrong column)
+EXPLAIN (COSTS OFF)
+SELECT id FROM a WHERE id IN(SELECT id FROM b);
+         QUERY PLAN         
+----------------------------
+ Hash Semi Join
+   Hash Cond: (a.id = b.id)
+   ->  Seq Scan on a
+   ->  Hash
+         ->  Seq Scan on b
+(5 rows)
+
+ROLLBACK;
+BEGIN;
+-- Semi join removal code with 2 column foreign keys
+CREATE TEMP TABLE b (id1 INT NOT NULL, id2 INT NOT NULL, PRIMARY KEY(id1,id2));
+CREATE TEMP TABLE a (id INT NOT NULL PRIMARY KEY, b_id1 INT, b_id2 INT);
+ALTER TABLE a ADD CONSTRAINT a_b_id1_b_id2_fkey FOREIGN KEY (b_id1,b_id2) REFERENCES b(id1,id2) MATCH SIMPLE;
+-- should remove semi join to b
+EXPLAIN (COSTS OFF)
+SELECT id FROM a WHERE EXISTS(SELECT 1 FROM b WHERE a.b_id1 = id1 AND a.b_id2 = id2);
+                       QUERY PLAN                        
+---------------------------------------------------------
+ Seq Scan on a
+   Filter: ((b_id1 IS NOT NULL) AND (b_id2 IS NOT NULL))
+(2 rows)
+
+-- should remove anti join to b
+EXPLAIN (COSTS OFF)
+SELECT id FROM a WHERE NOT EXISTS(SELECT 1 FROM b WHERE a.b_id1 = id1 AND a.b_id2 = id2);
+                   QUERY PLAN                   
+------------------------------------------------
+ Seq Scan on a
+   Filter: ((b_id1 IS NULL) OR (b_id2 IS NULL))
+(2 rows)
+
+-- should not remove semi join to b (extra condition)
+EXPLAIN (COSTS OFF)
+SELECT id FROM a WHERE EXISTS(SELECT 1 FROM b WHERE a.b_id1 = id1 AND a.b_id2 = id2 AND a.b_id2 >= id2);
+                       QUERY PLAN                       
+--------------------------------------------------------
+ Hash Semi Join
+   Hash Cond: ((a.b_id1 = b.id1) AND (a.b_id2 = b.id2))
+   Join Filter: (a.b_id2 >= b.id2)
+   ->  Seq Scan on a
+   ->  Hash
+         ->  Seq Scan on b
+(6 rows)
+
+-- should not remove semi join to b (wrong operator)
+EXPLAIN (COSTS OFF)
+SELECT id FROM a WHERE EXISTS(SELECT 1 FROM b WHERE a.b_id1 > id1 AND a.b_id2 < id2);
+                        QUERY PLAN                         
+-----------------------------------------------------------
+ Nested Loop Semi Join
+   ->  Seq Scan on a
+   ->  Index Only Scan using b_pkey on b
+         Index Cond: ((id1 < a.b_id1) AND (id2 > a.b_id2))
+(4 rows)
+
+-- should not remove semi join (only checking id1)
+EXPLAIN (COSTS OFF)
+SELECT id FROM a WHERE EXISTS(SELECT 1 FROM b WHERE a.b_id1 = id1);
+           QUERY PLAN            
+---------------------------------
+ Hash Join
+   Hash Cond: (a.b_id1 = b.id1)
+   ->  Seq Scan on a
+   ->  Hash
+         ->  HashAggregate
+               Group Key: b.id1
+               ->  Seq Scan on b
+(7 rows)
+
+-- should not remove semi join (only checking id2)
+EXPLAIN (COSTS OFF)
+SELECT id FROM a WHERE EXISTS(SELECT 1 FROM b WHERE a.b_id2 = id2);
+           QUERY PLAN            
+---------------------------------
+ Hash Join
+   Hash Cond: (a.b_id2 = b.id2)
+   ->  Seq Scan on a
+   ->  Hash
+         ->  HashAggregate
+               Group Key: b.id2
+               ->  Seq Scan on b
+(7 rows)
+
+-- should not remove semi join (checking wrong columns)
+EXPLAIN (COSTS OFF)
+SELECT id FROM a WHERE EXISTS(SELECT 1 FROM b WHERE a.b_id2 = id1 AND a.b_id1 = id2);
+                       QUERY PLAN                       
+--------------------------------------------------------
+ Hash Join
+   Hash Cond: ((a.b_id2 = b.id1) AND (a.b_id1 = b.id2))
+   ->  Seq Scan on a
+   ->  Hash
+         ->  Seq Scan on b
+(5 rows)
+
+-- should not remove semi join (no check for id2)
+EXPLAIN (COSTS OFF)
+SELECT id FROM a WHERE EXISTS(SELECT 1 FROM b WHERE a.b_id2 = id1 AND a.b_id1 = id1);
+               QUERY PLAN                
+-----------------------------------------
+ Nested Loop Semi Join
+   ->  Seq Scan on a
+         Filter: (b_id2 = b_id1)
+   ->  Index Only Scan using b_pkey on b
+         Index Cond: (id1 = a.b_id2)
+(5 rows)
+
+-- should not remove semi join (no check for b_id2)
+EXPLAIN (COSTS OFF)
+SELECT id FROM a WHERE EXISTS(SELECT 1 FROM b WHERE a.b_id1 = id1 AND a.b_id1 = id2);
+            QUERY PLAN             
+-----------------------------------
+ Hash Join
+   Hash Cond: (a.b_id1 = b.id1)
+   ->  Seq Scan on a
+   ->  Hash
+         ->  Seq Scan on b
+               Filter: (id1 = id2)
+(6 rows)
+
+-- should not remove semi join (no check for b_id2)
+EXPLAIN (COSTS OFF)
+SELECT id FROM a WHERE EXISTS(SELECT 1 FROM b WHERE a.b_id1 = id1 AND a.b_id1 = id1);
+              QUERY PLAN               
+---------------------------------------
+ Hash Join
+   Hash Cond: (a.b_id1 = b.id1)
+   ->  Seq Scan on a
+   ->  Hash
+         ->  HashAggregate
+               Group Key: b.id1, b.id1
+               ->  Seq Scan on b
+(7 rows)
+
+-- Check that the IS NULL and IS NOT NULL filters are not added
+-- for columns which have a NOT NULL constraint.
+ALTER TABLE a ALTER COLUMN b_id1 SET NOT NULL;
+-- Should only filter on b_id2 IS NOT NULL
+EXPLAIN (COSTS OFF)
+SELECT id FROM a WHERE EXISTS(SELECT 1 FROM b WHERE a.b_id1 = id1 AND a.b_id2 = id2);
+          QUERY PLAN           
+-------------------------------
+ Seq Scan on a
+   Filter: (b_id2 IS NOT NULL)
+(2 rows)
+
+-- Should only filter on b_id2 IS NULL
+EXPLAIN (COSTS OFF)
+SELECT id FROM a WHERE NOT EXISTS(SELECT 1 FROM b WHERE a.b_id1 = id1 AND a.b_id2 = id2);
+        QUERY PLAN         
+---------------------------
+ Seq Scan on a
+   Filter: (b_id2 IS NULL)
+(2 rows)
+
+ALTER TABLE a ALTER COLUMN b_id2 SET NOT NULL;
+-- No IS NOT NULL filters should be added.
+EXPLAIN (COSTS OFF)
+SELECT id FROM a WHERE EXISTS(SELECT 1 FROM b WHERE a.b_id1 = id1 AND a.b_id2 = id2);
+  QUERY PLAN   
+---------------
+ Seq Scan on a
+(1 row)
+
+-- Since now neither b_id1 or b_id2 can be NULL this query can't
+-- produce any records. Check that we get a One-Time Filter: false
+EXPLAIN (COSTS OFF)
+SELECT id FROM a WHERE NOT EXISTS(SELECT 1 FROM b WHERE a.b_id1 = id1 AND a.b_id2 = id2);
+        QUERY PLAN        
+--------------------------
+ Result
+   One-Time Filter: false
+   ->  Seq Scan on a
+(3 rows)
+
+ROLLBACK;
+BEGIN WORK;
+-- In this test we want to ensure that ANTI JOIN removal does not
+-- occur when there are pending foreign key triggers.
+-- We test this by updating a relation which is referenced by a foreign key
+-- and then executing another query which would normally allow the anti
+-- join to be removed. If the anti join was removed then the table
+-- records_violating_fkey would be empty, but here we'll ensure that
+-- the record that we update ends up violating the foreign key.
+CREATE TABLE j2 (id INT NOT NULL PRIMARY KEY);
+CREATE TABLE j1 (
+  id INT PRIMARY KEY,
+  j2_id INT NOT NULL REFERENCES j2 (id) MATCH FULL ON DELETE CASCADE ON UPDATE CASCADE
+);
+INSERT INTO j2 VALUES(10),(20);
+INSERT INTO j1 VALUES(1,10),(2,20);
+-- create a table to store records that 'violate' the fkey.
+CREATE TABLE records_violating_fkey (j2_id INT NOT NULL);
+CREATE OR REPLACE FUNCTION j1_update() RETURNS TRIGGER AS $$
+BEGIN
+  INSERT INTO records_violating_fkey SELECT j2_id FROM j1 WHERE NOT EXISTS(SELECT 1 FROM j2 WHERE j2_id = j2.id);
+  RETURN NEW;
+  END;
+$$ LANGUAGE plpgsql;
+CREATE TRIGGER j1_update_trigger BEFORE UPDATE ON j2 FOR EACH ROW EXECUTE PROCEDURE j1_update();
+-- This update statement will cause some foreign key triggers to be queued.
+-- The trigger defined above will fire which will cause all records which
+-- currently violate the foreign key to be inserted into the records_violating_fkey
+-- table. The intended behaviour of this is that we'll see records violating the
+-- foreign key, however if we incorrectly performed an ANTI JOIN removal, then
+-- we wouldn't see this violation record, as we'd wrongly assume that the query
+-- could not produce any records.
+UPDATE j2 SET id = id + 1;
+SELECT * FROM records_violating_fkey;
+ j2_id 
+-------
+    10
+(1 row)
+
+ROLLBACK;
 create temp table parent (k int primary key, pd int);
 create temp table child (k int unique, cd int);
 insert into parent values (1, 10), (2, 20), (3, 30);
diff --git a/src/test/regress/sql/join.sql b/src/test/regress/sql/join.sql
index fa3e068..5ec5016 100644
--- a/src/test/regress/sql/join.sql
+++ b/src/test/regress/sql/join.sql
@@ -973,6 +973,175 @@ select i8.* from int8_tbl i8 left join (select f1 from int4_tbl group by f1) i4
 
 rollback;
 
+BEGIN;
+
+-- Test join removals for semi and anti joins
+CREATE TEMP TABLE b (id INT NOT NULL PRIMARY KEY, val INT);
+CREATE TEMP TABLE a (id INT NOT NULL PRIMARY KEY, b_id INT REFERENCES b(id));
+
+-- should remove semi join to b
+EXPLAIN (COSTS OFF)
+SELECT id FROM a WHERE b_id IN(SELECT id FROM b);
+
+-- should remove semi join to b
+EXPLAIN (COSTS OFF)
+SELECT id FROM a WHERE EXISTS(SELECT 1 FROM b WHERE a.b_id = id);
+
+-- should remove anti join to b
+EXPLAIN (COSTS OFF)
+SELECT id FROM a WHERE NOT EXISTS(SELECT 1 FROM b WHERE a.b_id = id);
+
+-- should not remove anti join as id > 100 will void
+-- the foreign key's guarantee that 1 will exist.
+EXPLAIN (COSTS OFF)
+SELECT id FROM a WHERE NOT EXISTS(SELECT 1 FROM b WHERE a.b_id = id AND id > 100);
+
+-- should not remove anti join as val is not part of the foreign key.
+EXPLAIN (COSTS OFF)
+SELECT id FROM a WHERE NOT EXISTS(SELECT 1 FROM b WHERE a.b_id = id AND val = id);
+
+-- should remove semi join to b (swapped condition order)
+EXPLAIN (COSTS OFF)
+SELECT id FROM a WHERE EXISTS(SELECT 1 FROM b WHERE id = a.b_id);
+
+-- should not remove semi join (since not using equals)
+EXPLAIN (COSTS OFF)
+SELECT id FROM a WHERE EXISTS(SELECT 1 FROM b WHERE id >= a.b_id);
+
+-- should not remove semi join
+EXPLAIN (COSTS OFF)
+SELECT id FROM a WHERE b_id+0 IN(SELECT id FROM b);
+
+-- should not remove semi join
+EXPLAIN (COSTS OFF)
+SELECT id FROM a WHERE b_id IN(SELECT id+0 FROM b);
+
+-- should not remove semi join (wrong column)
+EXPLAIN (COSTS OFF)
+SELECT id FROM a WHERE id IN(SELECT id FROM b);
+
+ROLLBACK;
+
+BEGIN;
+
+-- Semi join removal code with 2 column foreign keys
+
+CREATE TEMP TABLE b (id1 INT NOT NULL, id2 INT NOT NULL, PRIMARY KEY(id1,id2));
+CREATE TEMP TABLE a (id INT NOT NULL PRIMARY KEY, b_id1 INT, b_id2 INT);
+
+ALTER TABLE a ADD CONSTRAINT a_b_id1_b_id2_fkey FOREIGN KEY (b_id1,b_id2) REFERENCES b(id1,id2) MATCH SIMPLE;
+
+-- should remove semi join to b
+EXPLAIN (COSTS OFF)
+SELECT id FROM a WHERE EXISTS(SELECT 1 FROM b WHERE a.b_id1 = id1 AND a.b_id2 = id2);
+
+-- should remove anti join to b
+EXPLAIN (COSTS OFF)
+SELECT id FROM a WHERE NOT EXISTS(SELECT 1 FROM b WHERE a.b_id1 = id1 AND a.b_id2 = id2);
+
+-- should not remove semi join to b (extra condition)
+EXPLAIN (COSTS OFF)
+SELECT id FROM a WHERE EXISTS(SELECT 1 FROM b WHERE a.b_id1 = id1 AND a.b_id2 = id2 AND a.b_id2 >= id2);
+
+-- should not remove semi join to b (wrong operator)
+EXPLAIN (COSTS OFF)
+SELECT id FROM a WHERE EXISTS(SELECT 1 FROM b WHERE a.b_id1 > id1 AND a.b_id2 < id2);
+
+-- should not remove semi join (only checking id1)
+EXPLAIN (COSTS OFF)
+SELECT id FROM a WHERE EXISTS(SELECT 1 FROM b WHERE a.b_id1 = id1);
+
+-- should not remove semi join (only checking id2)
+EXPLAIN (COSTS OFF)
+SELECT id FROM a WHERE EXISTS(SELECT 1 FROM b WHERE a.b_id2 = id2);
+
+-- should not remove semi join (checking wrong columns)
+EXPLAIN (COSTS OFF)
+SELECT id FROM a WHERE EXISTS(SELECT 1 FROM b WHERE a.b_id2 = id1 AND a.b_id1 = id2);
+
+-- should not remove semi join (no check for id2)
+EXPLAIN (COSTS OFF)
+SELECT id FROM a WHERE EXISTS(SELECT 1 FROM b WHERE a.b_id2 = id1 AND a.b_id1 = id1);
+
+-- should not remove semi join (no check for b_id2)
+EXPLAIN (COSTS OFF)
+SELECT id FROM a WHERE EXISTS(SELECT 1 FROM b WHERE a.b_id1 = id1 AND a.b_id1 = id2);
+
+-- should not remove semi join (no check for b_id2)
+EXPLAIN (COSTS OFF)
+SELECT id FROM a WHERE EXISTS(SELECT 1 FROM b WHERE a.b_id1 = id1 AND a.b_id1 = id1);
+
+
+-- Check that the IS NULL and IS NOT NULL filters are not added
+-- for columns which have a NOT NULL constraint.
+ALTER TABLE a ALTER COLUMN b_id1 SET NOT NULL;
+
+-- Should only filter on b_id2 IS NOT NULL
+EXPLAIN (COSTS OFF)
+SELECT id FROM a WHERE EXISTS(SELECT 1 FROM b WHERE a.b_id1 = id1 AND a.b_id2 = id2);
+
+-- Should only filter on b_id2 IS NULL
+EXPLAIN (COSTS OFF)
+SELECT id FROM a WHERE NOT EXISTS(SELECT 1 FROM b WHERE a.b_id1 = id1 AND a.b_id2 = id2);
+
+ALTER TABLE a ALTER COLUMN b_id2 SET NOT NULL;
+
+-- No IS NOT NULL filters should be added.
+EXPLAIN (COSTS OFF)
+SELECT id FROM a WHERE EXISTS(SELECT 1 FROM b WHERE a.b_id1 = id1 AND a.b_id2 = id2);
+
+-- Since now neither b_id1 or b_id2 can be NULL this query can't
+-- produce any records. Check that we get a One-Time Filter: false
+EXPLAIN (COSTS OFF)
+SELECT id FROM a WHERE NOT EXISTS(SELECT 1 FROM b WHERE a.b_id1 = id1 AND a.b_id2 = id2);
+
+ROLLBACK;
+
+BEGIN WORK;
+
+-- In this test we want to ensure that ANTI JOIN removal does not
+-- occur when there are pending foreign key triggers.
+-- We test this by updating a relation which is referenced by a foreign key
+-- and then executing another query which would normally allow the anti
+-- join to be removed. If the anti join was removed then the table
+-- records_violating_fkey would be empty, but here we'll ensure that
+-- the record that we update ends up violating the foreign key.
+
+CREATE TABLE j2 (id INT NOT NULL PRIMARY KEY);
+CREATE TABLE j1 (
+  id INT PRIMARY KEY,
+  j2_id INT NOT NULL REFERENCES j2 (id) MATCH FULL ON DELETE CASCADE ON UPDATE CASCADE
+);
+
+INSERT INTO j2 VALUES(10),(20);
+INSERT INTO j1 VALUES(1,10),(2,20);
+
+-- create a table to store records that 'violate' the fkey.
+CREATE TABLE records_violating_fkey (j2_id INT NOT NULL);
+
+CREATE OR REPLACE FUNCTION j1_update() RETURNS TRIGGER AS $$
+BEGIN
+  INSERT INTO records_violating_fkey SELECT j2_id FROM j1 WHERE NOT EXISTS(SELECT 1 FROM j2 WHERE j2_id = j2.id);
+  RETURN NEW;
+  END;
+$$ LANGUAGE plpgsql;
+
+CREATE TRIGGER j1_update_trigger BEFORE UPDATE ON j2 FOR EACH ROW EXECUTE PROCEDURE j1_update();
+
+-- This update statement will cause some foreign key triggers to be queued.
+-- The trigger defined above will fire which will cause all records which
+-- currently violate the foreign key to be inserted into the records_violating_fkey
+-- table. The intended behaviour of this is that we'll see records violating the
+-- foreign key, however if we incorrectly performed an ANTI JOIN removal, then
+-- we wouldn't see this violation record, as we'd wrongly assume that the query
+-- could not produce any records.
+
+UPDATE j2 SET id = id + 1;
+
+SELECT * FROM records_violating_fkey;
+
+ROLLBACK;
+
 create temp table parent (k int primary key, pd int);
 create temp table child (k int unique, cd int);
 insert into parent values (1, 10), (2, 20), (3, 30);

Attachment: bms_get_singleton_v1.patch (application/octet-stream)

diff --git a/src/backend/nodes/bitmapset.c b/src/backend/nodes/bitmapset.c
index c927b78..daec9e6 100644
--- a/src/backend/nodes/bitmapset.c
+++ b/src/backend/nodes/bitmapset.c
@@ -525,6 +525,61 @@ bms_singleton_member(const Bitmapset *a)
 }
 
 /*
+ * bms_get_singleton
+ *
+ * Returns True and sets singleton to the value of the singleton, if the
+ * bitmapset is a singleton, otherwise, if the bitmapset is NULL, empty or has
+ * multiple values, False is returned.
+ *
+ * This function can be useful if some processing only needs to take place
+ * when a Bitmapset is a singleton and that singleton value is required for
+ * that processing, for example, you could do:
+ *
+ * if (bms_membership(a) != BMS_SINGLETON)
+ *		return; // nothing to do
+ * singleton = bms_singleton_member(a);
+ *
+ * But it would be more efficiently processed by doing:
+ *
+ * if (!bms_get_singleton(a, &singleton))
+ *		return; // nothing to do
+ */
+bool
+bms_get_singleton(const Bitmapset *a, int *singleton)
+{
+	int			result = -1;
+	int			nwords;
+	int			wordnum;
+
+	if (a == NULL)
+		return false;
+	nwords = a->nwords;
+	for (wordnum = 0; wordnum < nwords; wordnum++)
+	{
+		bitmapword	w = a->words[wordnum];
+
+		if (w != 0)
+		{
+			if (result >= 0 || HAS_MULTIPLE_ONES(w))
+				return false;
+			result = wordnum * BITS_PER_BITMAPWORD;
+			while ((w & 255) == 0)
+			{
+				w >>= 8;
+				result += 8;
+			}
+			result += rightmost_one_pos[w & 255];
+		}
+	}
+	if (result < 0)
+		return false;
+
+	*singleton = result;
+	return true;
+}
+
+
+/*
  * bms_num_members - count members of set
  */
 int
diff --git a/src/backend/optimizer/path/equivclass.c b/src/backend/optimizer/path/equivclass.c
index 63dbc1b..0d7722b 100644
--- a/src/backend/optimizer/path/equivclass.c
+++ b/src/backend/optimizer/path/equivclass.c
@@ -937,9 +937,9 @@ generate_base_implied_equalities_no_const(PlannerInfo *root,
 		int			relid;
 
 		Assert(!cur_em->em_is_child);	/* no children yet */
-		if (bms_membership(cur_em->em_relids) != BMS_SINGLETON)
+		if (!bms_get_singleton(cur_em->em_relids, &relid))
 			continue;
-		relid = bms_singleton_member(cur_em->em_relids);
+
 		Assert(relid < root->simple_rel_array_size);
 
 		if (prev_ems[relid] != NULL)
diff --git a/src/backend/optimizer/plan/analyzejoins.c b/src/backend/optimizer/plan/analyzejoins.c
index b0a11d7..e906cd3 100644
--- a/src/backend/optimizer/plan/analyzejoins.c
+++ b/src/backend/optimizer/plan/analyzejoins.c
@@ -207,10 +207,9 @@ leftjoin_is_removable(PlannerInfo *root, SpecialJoinInfo *sjinfo)
 	 * going to be able to do anything with it.
 	 */
 	if (sjinfo->delay_upper_joins ||
-		bms_membership(sjinfo->min_righthand) != BMS_SINGLETON)
+		!bms_get_singleton(sjinfo->min_righthand, &innerrelid))
 		return false;
 
-	innerrelid = bms_singleton_member(sjinfo->min_righthand);
 	innerrel = find_base_rel(root, innerrelid);
 
 	if (innerrel->reloptkind != RELOPT_BASEREL)
@@ -489,10 +488,9 @@ semiorantijoin_is_removable(PlannerInfo *root, SpecialJoinInfo *sjinfo,
 	 * checking the right hand rel has any indexes.
 	 */
 	if (sjinfo->delay_upper_joins ||
-		bms_membership(sjinfo->min_lefthand) != BMS_SINGLETON)
+		!bms_get_singleton(sjinfo->min_lefthand, &outerrelid))
 		return false;
 
-	outerrelid = bms_singleton_member(sjinfo->min_lefthand);
 	outerrel = find_base_rel(root, outerrelid);
 
 	/*
@@ -504,10 +502,9 @@ semiorantijoin_is_removable(PlannerInfo *root, SpecialJoinInfo *sjinfo,
 		outerrel->fklist == NIL)
 		return false;
 
-	if (bms_membership(sjinfo->min_righthand) != BMS_SINGLETON)
+	if (!bms_get_singleton(sjinfo->min_righthand, &innerrelid))
 		return false;
 
-	innerrelid = bms_singleton_member(sjinfo->min_righthand);
 	innerrel = find_base_rel(root, innerrelid);
 
 	/*
diff --git a/src/backend/optimizer/util/placeholder.c b/src/backend/optimizer/util/placeholder.c
index 8d7c4fe..d9d1c6a 100644
--- a/src/backend/optimizer/util/placeholder.c
+++ b/src/backend/optimizer/util/placeholder.c
@@ -383,10 +383,10 @@ add_placeholders_to_base_rels(PlannerInfo *root)
 	{
 		PlaceHolderInfo *phinfo = (PlaceHolderInfo *) lfirst(lc);
 		Relids		eval_at = phinfo->ph_eval_at;
+		int			varno;
 
-		if (bms_membership(eval_at) == BMS_SINGLETON)
+		if (bms_get_singleton(eval_at, &varno))
 		{
-			int			varno = bms_singleton_member(eval_at);
 			RelOptInfo *rel = find_base_rel(root, varno);
 
 			/* add it to reltargetlist if needed above the rel scan level */
diff --git a/src/include/nodes/bitmapset.h b/src/include/nodes/bitmapset.h
index f770608..6db7270 100644
--- a/src/include/nodes/bitmapset.h
+++ b/src/include/nodes/bitmapset.h
@@ -72,6 +72,8 @@ extern bool bms_is_member(int x, const Bitmapset *a);
 extern bool bms_overlap(const Bitmapset *a, const Bitmapset *b);
 extern bool bms_nonempty_difference(const Bitmapset *a, const Bitmapset *b);
 extern int	bms_singleton_member(const Bitmapset *a);
+extern bool bms_get_singleton(const Bitmapset *a, int *singleton);
+
 extern int	bms_num_members(const Bitmapset *a);
 
 /* optimized tests when we don't need to know exact membership count: */
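
The single-pass idea behind bms_get_singleton() can be tried outside the PostgreSQL tree with a minimal standalone sketch. It assumes a simplified set type (ToySet, using 64-bit words) and a hypothetical helper toy_get_singleton(); neither is part of PostgreSQL's Bitmapset API, and the bit position is found with a plain shift loop rather than the rightmost_one_pos lookup table used in the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for Bitmapset: a small array of 64-bit words. */
typedef struct ToySet
{
	int			nwords;
	uint64_t	words[4];
} ToySet;

/*
 * Returns true and stores the only member in *member when the set holds
 * exactly one member; returns false for NULL, empty or multi-member sets.
 * The words are visited once, instead of once to classify the set and a
 * second time to fetch the member.
 */
static bool
toy_get_singleton(const ToySet *a, int *member)
{
	int			result = -1;

	if (a == NULL)
		return false;

	for (int wordnum = 0; wordnum < a->nwords; wordnum++)
	{
		uint64_t	w = a->words[wordnum];

		if (w == 0)
			continue;

		/* a second non-empty word, or two bits set in this word */
		if (result >= 0 || (w & (w - 1)) != 0)
			return false;

		/* locate the position of the single set bit */
		result = wordnum * 64;
		while ((w & 1) == 0)
		{
			w >>= 1;
			result++;
		}
	}

	if (result < 0)
		return false;

	*member = result;
	return true;
}
```

As in the patch, the caller only reads the output parameter when the function returns true.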

Attachment: anti_join_removal_benchmark.xlsx (application/vnd.openxmlformats-officedocument.spreadsheetml.sheet)

David Rowley

dgrowleyml@gmail.com

over 11 years ago

In reply to: David Rowley (#1)

1 attachment(s)

Re: Patch to support SEMI and ANTI join removal

On Tue, Aug 5, 2014 at 10:35 PM, David Rowley <dgrowleyml@gmail.com> wrote:

Currently most of my changes are in analyzejoins.c, but I did also have to
make changes to load the foreign key constraints so that they were
available to the planner. One thing that is currently lacking, which would
likely be needed before the finished patch is ready, would be a
"relhasfkeys" column in pg_class. Such a column would mean that it would be
possible to skip scanning pg_constraint for foreign keys when there are none
to find. I'll delay implementing that until I get a bit more feedback as to
whether this patch would be a welcome addition to the existing join removal
code or not.

I've modified this patch to include a new "relhasfkey" column in pg_class,
and then only attempt to load the foreign keys in get_relation_info() if
the pg_class flag is true.

Currently what I'm not quite sure about is the best place to clear this
flag. I see that relhaspkey is cleared during vacuum, but only if there are
no indexes at all on the relation. It seems that it will remain set to
"true" after vacuum if the primary key is dropped and there are still other
indexes on the relation. My guess here is that this is done so that
pg_constraint does not have to be checked to see if a PK exists, which is
why I'm not sure if vacuum would be the correct place for me to do the same
in order to determine if there are any FKs on the relation. Though I can't
quite think where else I might clear this flag.

Any ideas or feedback on this would be welcome.

Regards

David Rowley

Heikki Linnakangas

hlinnakangas@vmware.com

over 11 years ago

In reply to: David Rowley (#4)

Re: Patch to support SEMI and ANTI join removal

On 08/26/2014 03:28 PM, David Rowley wrote:

Any ideas or feedback on this would be welcome

Before someone spends time reviewing this patch, are you sure this is
worth the effort? It seems like a very narrow use case to me. I understand
removing LEFT and INNER joins, but the case for SEMI and ANTI joins
seems a lot thinner. Unnecessary LEFT and INNER joins can easily creep
into a query when views are used, for example, but I can't imagine that
happening for a SEMI or ANTI join. Maybe I'm lacking imagination. If
someone has run into a query in the wild that would benefit from this,
please raise your hand.

If I understood correctly, you're planning to work on INNER join removal
too. How much of the code in this patch is also required for INNER join
removal, and how much is specific to SEMI and ANTI joins?

Just so everyone is on the same page on what kind of queries this helps
with, here are some examples from the added regression tests:

-- Test join removals for semi and anti joins
CREATE TEMP TABLE b (id INT NOT NULL PRIMARY KEY, val INT);
CREATE TEMP TABLE a (id INT NOT NULL PRIMARY KEY, b_id INT REFERENCES b(id));
-- should remove semi join to b
EXPLAIN (COSTS OFF)
SELECT id FROM a WHERE b_id IN(SELECT id FROM b);
          QUERY PLAN
------------------------------
 Seq Scan on a
   Filter: (b_id IS NOT NULL)
(2 rows)

-- should remove semi join to b
EXPLAIN (COSTS OFF)
SELECT id FROM a WHERE EXISTS(SELECT 1 FROM b WHERE a.b_id = id);
          QUERY PLAN
------------------------------
 Seq Scan on a
   Filter: (b_id IS NOT NULL)
(2 rows)

-- should remove anti join to b
EXPLAIN (COSTS OFF)
SELECT id FROM a WHERE NOT EXISTS(SELECT 1 FROM b WHERE a.b_id = id);
        QUERY PLAN
--------------------------
 Seq Scan on a
   Filter: (b_id IS NULL)
(2 rows)
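
The reason these rewrites are valid can be checked with a small simulation. The sketch below is only an illustration under stated assumptions: the tables are plain in-memory arrays, and NullableInt, fk_holds() and removal_is_safe() are hypothetical names, not planner code. It shows that when every non-NULL a.b_id has a match in b.id, the semi join keeps exactly the rows where b_id IS NOT NULL (and so the anti join keeps exactly the rows where b_id IS NULL); when the foreign key does not hold, the shortcut gives a different answer.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical nullable integer, standing in for the column a.b_id. */
typedef struct NullableInt
{
	bool		isnull;
	int			value;
} NullableInt;

/* true if key appears in b_ids, which stands in for the column b.id */
static bool
b_contains(const int *b_ids, size_t nb, int key)
{
	for (size_t i = 0; i < nb; i++)
		if (b_ids[i] == key)
			return true;
	return false;
}

/* Semi join test for one row of a; a NULL b_id never matches. */
static bool
semi_join_matches(NullableInt b_id, const int *b_ids, size_t nb)
{
	return !b_id.isnull && b_contains(b_ids, nb, b_id.value);
}

/* The data satisfies the foreign key if every non-NULL b_id exists in b. */
static bool
fk_holds(const NullableInt *a, size_t na, const int *b_ids, size_t nb)
{
	for (size_t i = 0; i < na; i++)
		if (!a[i].isnull && !b_contains(b_ids, nb, a[i].value))
			return false;
	return true;
}

/*
 * true when, for every row of a, the semi join result equals
 * "b_id IS NOT NULL" (equivalently, the anti join equals "b_id IS NULL").
 */
static bool
removal_is_safe(const NullableInt *a, size_t na, const int *b_ids, size_t nb)
{
	for (size_t i = 0; i < na; i++)
		if (semi_join_matches(a[i], b_ids, nb) != !a[i].isnull)
			return false;
	return true;
}
```

The failing case corresponds to a row that temporarily violates the foreign key, which is the situation the pending-trigger regression test in the patch guards against.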

- Heikki

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

David Rowley

dgrowleyml@gmail.com

over 11 years ago

In reply to: Heikki Linnakangas (#5)

Re: Patch to support SEMI and ANTI join removal

On Wed, Aug 27, 2014 at 1:40 AM, Heikki Linnakangas <hlinnakangas@vmware.com> wrote:

On 08/26/2014 03:28 PM, David Rowley wrote:

Any ideas or feedback on this would be welcome

Before someone spends time reviewing this patch, are you sure this is
worth the effort? It seems like a very narrow use case to me. I understand
removing LEFT and INNER joins, but the case for SEMI and ANTI joins seems a
lot thinner. Unnecessary LEFT and INNER joins can easily creep into a query
when views are used, for example, but I can't imagine that happening for a
SEMI or ANTI join. Maybe I'm lacking imagination. If someone has run into a
query in the wild that would benefit from this, please raise your hand.

I agree that the use case for removal of SEMI and ANTI joins is a lot
thinner than for LEFT and INNER joins. My longer term goal here is to add
join removal support for INNER joins. In order to do this I need the foreign
key infrastructure which is included in this patch. I held back from just
going ahead and writing the INNER JOIN removal patch as I didn't want to
waste the extra effort in doing that if someone was to find a show stopper
problem with using foreign keys the way I am with this patch. I was kind of
hoping someone would be able to look at this patch a bit more and confirm
to me whether it's safe to do this before I go ahead and write the
inner join version.

If I understood correctly, you're planning to work on INNER join removal
too. How much of the code in this patch is also required for INNER join
removal, and how much is specific to SEMI and ANTI joins?

Apart from the extra lines of code in remove_useless_joins(), there are 3
functions added here which won't be needed at all for INNER JOINs:
semiorantijoin_is_removable(), convert_semijoin_to_isnotnull_quals() and
convert_antijoin_to_isnull_quals(). Not including the regression tests,
this is 396 lines with comments and 220 lines without. All of these
functions are static and in analyzejoins.c.

The benchmarks I posted a few weeks back show that the overhead of
performing the semi/anti join removal checks is quite low. I measured an
extra 400 or so nanoseconds for a successful removal on my i5 laptop. Or
just 15 nanoseconds on the earliest fast path for a non-removal. This
accounted for between 0.008% and 0.2% of planning time for the queries I
tested.

Regards

David Rowley

Jim Nasby

jim@nasby.net

over 11 years ago

In reply to: Heikki Linnakangas (#5)

Re: Patch to support SEMI and ANTI join removal

On 8/26/14, 8:40 AM, Heikki Linnakangas wrote:

Just so everyone is on the same page on what kind of queries this helps with, here are some examples from the added regression tests:

-- Test join removals for semi and anti joins
CREATE TEMP TABLE b (id INT NOT NULL PRIMARY KEY, val INT);
CREATE TEMP TABLE a (id INT NOT NULL PRIMARY KEY, b_id INT REFERENCES b(id));
-- should remove semi join to b
EXPLAIN (COSTS OFF)
SELECT id FROM a WHERE b_id IN(SELECT id FROM b);

<snip>

SELECT id FROM a WHERE EXISTS(SELECT 1 FROM b WHERE a.b_id = id);

I also fail to see a use for examples that are that silly *unless* we're talking machine-generated SQL, but I suspect that normally uses JOINS.

Where I would expect this to be useful is in cases where we can pre-evaluate some other condition in the subqueries to make the subqueries useless (ie: SELECT id FROM b WHERE 1=1), or where the condition could be passed through (ie: SELECT id FROM b WHERE id=42). Another possibility would be if there's a condition in the subquery that could trigger constraint elimination.

Those are the real world cases I'd expect to see from anything reasonably sane (an adjective that doesn't always apply to some of the users I have to support...) My $0.01 on the burden of carrying the "useless" tests and code around is that it doesn't seem like all that much overhead...
--
Jim C. Nasby, Data Architect jim@nasby.net
512.569.9461 (cell) http://jim.nasby.net


Tom Lane

tgl@sss.pgh.pa.us

over 11 years ago

In reply to: Jim Nasby (#7)

Re: Patch to support SEMI and ANTI join removal

Jim Nasby <jim@nasby.net> writes:

On 8/26/14, 8:40 AM, Heikki Linnakangas wrote:

Just so everyone is on the same page on what kind of queries this helps with, here are some examples from the added regression tests:

-- Test join removals for semi and anti joins
CREATE TEMP TABLE b (id INT NOT NULL PRIMARY KEY, val INT);
CREATE TEMP TABLE a (id INT NOT NULL PRIMARY KEY, b_id INT REFERENCES b(id));
-- should remove semi join to b
EXPLAIN (COSTS OFF)
SELECT id FROM a WHERE b_id IN(SELECT id FROM b);
<snip>
SELECT id FROM a WHERE EXISTS(SELECT 1 FROM b WHERE a.b_id = id);

I also fail to see a use for examples that are that silly *unless* we're talking machine-generated SQL, but I suspect that normally uses JOINS.

Where I would expect this to be useful is in cases where we can pre-evaluate some other condition in the subqueries to make the subqueries useless (ie: SELECT id FROM b WHERE 1=1), or where the condition could be passed through (ie: SELECT id FROM b WHERE id=42). Another possibility would be if there's a condition in the subquery that could trigger constraint elimination.

Unless I'm misunderstanding something, pretty much *any* WHERE restriction
in the subquery would defeat this optimization, since it would no longer
be certain that there was a match to an arbitrary outer-query row. So
it seems unlikely to me that this would fire in enough real-world cases
to be worth including. I am definitely not a fan of carrying around
deadwood in the planner.

If the majority of the added code is code that will be needed for
less-bogus optimizations, it might be all right; but I'd kind of want to
see the less-bogus optimizations working first.

regards, tom lane


David Rowley

dgrowleyml@gmail.com

over 11 years ago

In reply to: Tom Lane (#8)

1 attachment(s)

Re: Patch to support SEMI and ANTI join removal

On Thu, Aug 28, 2014 at 6:23 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

If the majority of the added code is code that will be needed for
less-bogus optimizations, it might be all right; but I'd kind of want to
see the less-bogus optimizations working first.

That seems fair. There would likely not be a great deal of value in semi and
anti join removal alone. I was mostly just trying to spread the weight of an
inner join removal patch...

So in response to this, I've gone off and written an inner join removal
support patch (attached), which turned out to be a bit less complex than I
thought.

Here's a quick demo of the patch at work:

test=# create table c (id int primary key);
CREATE TABLE
test=# create table b (id int primary key, c_id int not null references c(id));
CREATE TABLE
test=# create table a (id int primary key, b_id int not null references b(id));
CREATE TABLE
test=#
test=# explain select a.* from a inner join b on a.b_id = b.id inner join c on b.c_id = c.id;
                     QUERY PLAN
-----------------------------------------------------
 Seq Scan on a  (cost=0.00..31.40 rows=2140 width=8)
 Planning time: 1.061 ms
(2 rows)

Perhaps not a greatly useful example, but if you can imagine the joins are
hidden in a view and the user is just requesting a small subset of columns,
then it does seem quite powerful.

There are currently a few things with the patch that I'll list below, which
may raise a few questions:

1. I don't think that I'm currently handling removing eclass members
properly. So far the code just removes the Vars that belong to the relation
being removed. I likely should also be doing bms_del_member(ec->ec_relids,
relid); on the eclass, but perhaps I should just be marking the whole class
as "ec_broken = true" and adding another eclass with everything that the
broken one has minus the parts from the removed relation?

2. Currently the inner join removal is disallowed if the (would be)
removed relation has *any* baserestrictinfo items. The reason for this is
that we must ensure that the inner join gives us exactly 1 row match on the
join condition, but a baserestrictinfo can void the proof, which a foreign
key would otherwise give us, that a matching row does exist. However there
is an exception to this that could allow that restriction to be relaxed.
That is if the qual in baserestrictinfo uses vars that are in an eclass,
where the same eclass also has member vars that belong to the rel whose
foreign key we're using to prove the relation is not needed.... umm.. that's
probably better described by example:

Assume there's a foreign key a(x) references b(x)

SELECT a.* FROM a INNER JOIN b ON a.x = b.x WHERE b.x = 1

Relation b should be removable because an eclass will contain {a.x, b.x}
and therefore a baserestrictinfo for a.x = 1 should also exist on relation
a. Therefore removing relation b should produce equivalent results, i.e.
everything that gets filtered out on relation b will also be filtered out
on relation a anyway.

I think the patch without this is still worth it, but if someone feels
strongly about it I'll take a bash at supporting it.

3. Currently the inner join support does not allow removals using foreign
keys which contain duplicate columns on the referencing side, e.g. (a,a)
references (x,y). This is basically because of the point I made in item 2:
in this case a baserestrictinfo would exist on the referenced relation to
say WHERE x = y. I'd have to remove the restriction described in item 2 and
make a small change to the code that extracts the join condition from the
eclass for this to work. But it's likely a corner case that's not worth too
much trouble to support. If I saw an FK like that in the field, I'd
probably scratch my head for a while trying to understand why they
bothered.

4. The patch currently only allows removals for eclass join types. If the
rel has any joininfo items, then the join removal is disallowed. From what
I can see, equality type inner join conditions get described in eclasses,
and only non-equality join conditions make it into the joininfo list. Since
foreign keys only support equality operators, I thought this was a valid
restriction. However, if someone can show me a flaw in my assumption then I
may need to improve this.

5. I've added a flag to pg_class called relhasfkey. Currently this gets set
to true when a foreign key is added, though I've added nothing to set it
back to false again. I notice that relhasindex gets set back to false
during vacuum, if vacuum happens to find that there are no indexes on the
rel. I didn't put my logic there as I wasn't too sure that scanning
pg_constraint during a vacuum would be very correct, so I just left out the
"setting it to false" logic, based on the fact that I noticed that
relhaspkey gets away with quite lazy setting-back-to-false logic (only when
there are no indexes of any kind left on the relation at all).
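
Going back to the example in item 2, the claim that removing b gives
equivalent results can be checked with a small standalone sketch. This is
not code from the patch: the function names and the toy int arrays are made
up for illustration, and it assumes that there are no NULLs, that b.x is
unique, and that every a.x value exists in b.x (i.e. the foreign key holds).

```c
#include <assert.h>
#include <stddef.h>

/*
 * Toy model of:
 *   SELECT a.* FROM a INNER JOIN b ON a.x = b.x WHERE b.x = k
 * Each array holds the values of column x; returns the number of result rows.
 */
size_t
count_join_with_filter(const int *a, size_t na,
					   const int *b, size_t nb, int k)
{
	size_t		count = 0;

	for (size_t i = 0; i < na; i++)
		for (size_t j = 0; j < nb; j++)
			if (a[i] == b[j] && b[j] == k)
				count++;
	return count;
}

/*
 * Toy model of the same query with b removed and the filter applied to a
 * instead (a.x = k is implied because a.x and b.x share an eclass):
 *   SELECT a.* FROM a WHERE a.x = k
 */
size_t
count_filter_only(const int *a, size_t na, int k)
{
	size_t		count = 0;

	for (size_t i = 0; i < na; i++)
		if (a[i] == k)
			count++;
	return count;
}
```

With b = {1, 2, 3} and a = {1, 1, 2, 3, 3, 3}, both functions return the
same count for any k, which is the property the relaxed check would rely on.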

The only other thing I can think of is perhaps optimising a little. I was
thinking that most queries likely won't benefit from this too much, so I was
thinking of adding some logic to skip join removal attempts cheaply, by
pulling out the varnos from the target list entries and not even attempting
a join removal for a relation that has its varno in the targetlist of the
query. Though perhaps a few benchmarks will determine if this is worth it
or not.
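
The pre-check described above could look roughly like the following
standalone sketch. It is not code from the patch: ToyVar,
collect_tlist_varnos() and worth_trying_removal() are hypothetical names,
and a 64-bit mask stands in for the planner's Bitmapset, so varnos are
assumed to be below 64. A real version would presumably also have to look
inside expressions rather than only at plain Vars.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for a Var found in a target list entry */
typedef struct ToyVar
{
	int			varno;		/* range table index of the relation */
	int			varattno;	/* attribute number within that relation */
} ToyVar;

/* Build a bitmask with one bit set per varno seen in the target list */
uint64_t
collect_tlist_varnos(const ToyVar *tlist, size_t n)
{
	uint64_t	mask = 0;

	for (size_t i = 0; i < n; i++)
	{
		assert(tlist[i].varno >= 0 && tlist[i].varno < 64);
		mask |= UINT64_C(1) << tlist[i].varno;
	}
	return mask;
}

/*
 * A relation whose varno appears in the target list can never be removed,
 * so only relations absent from the mask are worth the full removal checks.
 */
bool
worth_trying_removal(uint64_t tlist_varnos, int varno)
{
	assert(varno >= 0 && varno < 64);
	return (tlist_varnos & (UINT64_C(1) << varno)) == 0;
}
```

The mask would be computed once per query, so the per-relation test is a
single bit check before any of the more expensive foreign key matching.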

Comments are welcome. I'm really hoping this patch generates a bit more
interest than the SEMI/ANTI join removal one!

Regards

David Rowley

Attachments:

inner_join_removals_2014-09-11_38cf71c.patch (application/octet-stream)

diff --git a/doc/src/sgml/catalogs.sgml b/doc/src/sgml/catalogs.sgml
index 68f8434..6d2e5c0 100644
--- a/doc/src/sgml/catalogs.sgml
+++ b/doc/src/sgml/catalogs.sgml
@@ -1916,6 +1916,15 @@
      </row>
 
      <row>
+      <entry><structfield>relhasfkey</structfield></entry>
+      <entry><type>bool</type></entry>
+      <entry></entry>
+      <entry>
+       True if the table has (or once had) a foreign key constraint
+      </entry>
+     </row>
+ 
+     <row>
       <entry><structfield>relhasrules</structfield></entry>
       <entry><type>bool</type></entry>
       <entry></entry>
diff --git a/src/backend/catalog/heap.c b/src/backend/catalog/heap.c
index c346eda..93433ec 100644
--- a/src/backend/catalog/heap.c
+++ b/src/backend/catalog/heap.c
@@ -797,6 +797,7 @@ InsertPgClassTuple(Relation pg_class_desc,
 	values[Anum_pg_class_relchecks - 1] = Int16GetDatum(rd_rel->relchecks);
 	values[Anum_pg_class_relhasoids - 1] = BoolGetDatum(rd_rel->relhasoids);
 	values[Anum_pg_class_relhaspkey - 1] = BoolGetDatum(rd_rel->relhaspkey);
+	values[Anum_pg_class_relhasfkey - 1] = BoolGetDatum(rd_rel->relhasfkey);
 	values[Anum_pg_class_relhasrules - 1] = BoolGetDatum(rd_rel->relhasrules);
 	values[Anum_pg_class_relhastriggers - 1] = BoolGetDatum(rd_rel->relhastriggers);
 	values[Anum_pg_class_relhassubclass - 1] = BoolGetDatum(rd_rel->relhassubclass);
diff --git a/src/backend/commands/tablecmds.c b/src/backend/commands/tablecmds.c
index 7bc579b..60a5857 100644
--- a/src/backend/commands/tablecmds.c
+++ b/src/backend/commands/tablecmds.c
@@ -361,6 +361,7 @@ static void ATAddCheckConstraint(List **wqueue,
 					 LOCKMODE lockmode);
 static void ATAddForeignKeyConstraint(AlteredTableInfo *tab, Relation rel,
 						  Constraint *fkconstraint, LOCKMODE lockmode);
+static void SetRelationHasForeignKey(Relation rel, bool hasfkey);
 static void ATExecDropConstraint(Relation rel, const char *constrName,
 					 DropBehavior behavior,
 					 bool recurse, bool recursing,
@@ -6353,6 +6354,12 @@ ATAddForeignKeyConstraint(AlteredTableInfo *tab, Relation rel,
 							 constrOid, indexOid);
 
 	/*
+	 * Ensure that the relation is marked as having foreign key constraints in
+	 * pg_class
+	 */
+	SetRelationHasForeignKey(rel, true);
+
+	/*
 	 * Tell Phase 3 to check that the constraint is satisfied by existing
 	 * rows. We can skip this during table creation, when requested explicitly
 	 * by specifying NOT VALID in an ADD FOREIGN KEY command, and when we're
@@ -6381,6 +6388,50 @@ ATAddForeignKeyConstraint(AlteredTableInfo *tab, Relation rel,
 }
 
 /*
+ * Update the relhasfkey column in the relation's pg_class tuple.
+ *
+ * Caller had better hold exclusive lock on the relation.
+ *
+ * An important side effect is that a SI update message will be sent out for
+ * the pg_class tuple, which will force other backends to rebuild their
+ * relcache entries for the rel.  Also, this backend will rebuild its
+ * own relcache entry at the next CommandCounterIncrement.
+ */
+static void
+SetRelationHasForeignKey(Relation rel, bool hasfkey)
+{
+	Relation	relrel;
+	HeapTuple	reltup;
+	Form_pg_class relStruct;
+
+	relrel = heap_open(RelationRelationId, RowExclusiveLock);
+	reltup = SearchSysCacheCopy1(RELOID,
+								 ObjectIdGetDatum(RelationGetRelid(rel)));
+	if (!HeapTupleIsValid(reltup))
+		elog(ERROR, "cache lookup failed for relation %u",
+			 RelationGetRelid(rel));
+	relStruct = (Form_pg_class) GETSTRUCT(reltup);
+
+	if (relStruct->relhasfkey != hasfkey)
+	{
+		relStruct->relhasfkey = hasfkey;
+
+		simple_heap_update(relrel, &reltup->t_self, reltup);
+
+		/* keep catalog indexes current */
+		CatalogUpdateIndexes(relrel, reltup);
+	}
+	else
+	{
+		/* Skip the disk update, but force relcache inval anyway */
+		CacheInvalidateRelcache(rel);
+	}
+
+	heap_freetuple(reltup);
+	heap_close(relrel, RowExclusiveLock);
+}
+
+/*
  * ALTER TABLE ALTER CONSTRAINT
  *
  * Update the attributes of a constraint.
diff --git a/src/backend/commands/trigger.c b/src/backend/commands/trigger.c
index 9bf0098..88c8d98 100644
--- a/src/backend/commands/trigger.c
+++ b/src/backend/commands/trigger.c
@@ -3887,6 +3887,17 @@ afterTriggerInvokeEvents(AfterTriggerEventList *events,
 	return all_fired;
 }
 
+/* ----------
+ * AfterTriggerQueueIsEmpty()
+ *
+ *	True if there are no pending triggers in the queue.
+ * ----------
+ */
+bool
+AfterTriggerQueueIsEmpty(void)
+{
+	return (afterTriggers->query_depth == -1);
+}
 
 /* ----------
  * AfterTriggerBeginXact()
diff --git a/src/backend/optimizer/path/equivclass.c b/src/backend/optimizer/path/equivclass.c
index b7aff37..29d9eb3 100644
--- a/src/backend/optimizer/path/equivclass.c
+++ b/src/backend/optimizer/path/equivclass.c
@@ -32,6 +32,7 @@
 static EquivalenceMember *add_eq_member(EquivalenceClass *ec,
 			  Expr *expr, Relids relids, Relids nullable_relids,
 			  bool is_child, Oid datatype);
+static void update_rel_class_joins(PlannerInfo *root);
 static void generate_base_implied_equalities_const(PlannerInfo *root,
 									   EquivalenceClass *ec);
 static void generate_base_implied_equalities_no_const(PlannerInfo *root,
@@ -49,8 +50,6 @@ static List *generate_join_implied_equalities_broken(PlannerInfo *root,
 										Relids outer_relids,
 										Relids nominal_inner_relids,
 										AppendRelInfo *inner_appinfo);
-static Oid select_equality_operator(EquivalenceClass *ec,
-						 Oid lefttype, Oid righttype);
 static RestrictInfo *create_join_clause(PlannerInfo *root,
 				   EquivalenceClass *ec, Oid opno,
 				   EquivalenceMember *leftem,
@@ -725,7 +724,6 @@ void
 generate_base_implied_equalities(PlannerInfo *root)
 {
 	ListCell   *lc;
-	Index		rti;
 
 	foreach(lc, root->eq_classes)
 	{
@@ -752,6 +750,19 @@ generate_base_implied_equalities(PlannerInfo *root)
 	 * This is also a handy place to mark base rels (which should all exist by
 	 * now) with flags showing whether they have pending eclass joins.
 	 */
+	update_rel_class_joins(root);
+}
+
+/*
+ * update_rel_class_joins
+ *		Process each relation in the PlannerInfo to update the
+ *		has_eclass_joins flag
+ */
+static void
+update_rel_class_joins(PlannerInfo *root)
+{
+	Index		rti;
+
 	for (rti = 1; rti < root->simple_rel_array_size; rti++)
 	{
 		RelOptInfo *brel = root->simple_rel_array[rti];
@@ -764,6 +775,63 @@ generate_base_implied_equalities(PlannerInfo *root)
 }
 
 /*
+ * remove_rel_from_eclass
+ *		Remove all eclass members that belong to relid and also any classes
+ *		which have been left empty as a result of removing a member.
+ */
+void
+remove_rel_from_eclass(PlannerInfo *root, int relid)
+{
+	ListCell	*l,
+				*nextl,
+				*eqm,
+				*eqmnext;
+
+	bool removedany = false;
+
+	/* Strip all traces of this relation out of the eclasses */
+	for (l = list_head(root->eq_classes); l != NULL; l = nextl)
+	{
+		EquivalenceClass *ec = (EquivalenceClass *) lfirst(l);
+
+		nextl = lnext(l);
+
+		for (eqm = list_head(ec->ec_members); eqm != NULL; eqm = eqmnext)
+		{
+			EquivalenceMember *em = (EquivalenceMember *) lfirst(eqm);
+
+			eqmnext = lnext(eqm);
+
+			if (IsA(em->em_expr, Var))
+			{
+				Var *var = (Var *) em->em_expr;
+
+				if (var->varno == relid)
+				{
+					ec->ec_members = list_delete_ptr(ec->ec_members, em);
+					removedany = true;
+				}
+			}
+		}
+
+		/*
+		 * If we've removed the last member from the EquivalenceClass then we'd
+		 * better delete the entire entry.
+		 */
+		if (list_length(ec->ec_members) == 0)
+			root->eq_classes = list_delete_ptr(root->eq_classes, ec);
+	}
+
+	/*
+	 * If we removed any eclass members then this may have changed whether a
+	 * relation has an eclass join or not, so we'd better force an update
+	 * of the has_eclass_joins flags.
+	 */
+	if (removedany)
+		update_rel_class_joins(root);
+}
+
+/*
  * generate_base_implied_equalities when EC contains pseudoconstant(s)
  */
 static void
@@ -1281,7 +1349,7 @@ generate_join_implied_equalities_broken(PlannerInfo *root,
  *
  * Returns InvalidOid if no operator can be found for this datatype combination
  */
-static Oid
+Oid
 select_equality_operator(EquivalenceClass *ec, Oid lefttype, Oid righttype)
 {
 	ListCell   *lc;
diff --git a/src/backend/optimizer/plan/analyzejoins.c b/src/backend/optimizer/plan/analyzejoins.c
index 773f8a4..4cb5c98 100644
--- a/src/backend/optimizer/plan/analyzejoins.c
+++ b/src/backend/optimizer/plan/analyzejoins.c
@@ -22,23 +22,38 @@
  */
 #include "postgres.h"
 
+#include "commands/trigger.h"
+#include "nodes/makefuncs.h"
 #include "nodes/nodeFuncs.h"
+#include "nodes/relation.h"
 #include "optimizer/clauses.h"
 #include "optimizer/joininfo.h"
 #include "optimizer/pathnode.h"
 #include "optimizer/paths.h"
 #include "optimizer/planmain.h"
+#include "optimizer/restrictinfo.h"
 #include "optimizer/tlist.h"
 #include "utils/lsyscache.h"
 
 /* local functions */
-static bool join_is_removable(PlannerInfo *root, SpecialJoinInfo *sjinfo);
+static bool innerjoin_is_removable(PlannerInfo *root, List *joinlist,
+					  RangeTblRef *removalrtr, RelOptInfo **removerrel,
+					  List **columnlist);
+static bool leftjoin_is_removable(PlannerInfo *root, SpecialJoinInfo *sjinfo);
+static bool relation_is_needed(PlannerInfo *root, Relids joinrelids,
+					  RelOptInfo *rel);
+static void convert_join_to_isnotnull_quals(PlannerInfo *root, RelOptInfo *rel,
+					  List *columnlist);
+static bool relation_has_foreign_key_for(PlannerInfo *root, RelOptInfo *rel,
+					  RelOptInfo *referencedrel, List *referencing_vars,
+					  List *index_vars, List *operator_list);
+static bool expressions_match_foreign_key(ForeignKeyInfo *fk, List *fkvars,
+					  List *indexvars, List *operators);
 static void remove_rel_from_query(PlannerInfo *root, int relid,
 					  Relids joinrelids);
 static List *remove_rel_from_joinlist(List *joinlist, int relid, int *nremoved);
 static Oid	distinct_col_search(int colno, List *colnos, List *opids);
 
-
 /*
  * remove_useless_joins
  *		Check for relations that don't actually need to be joined at all,
@@ -51,21 +66,76 @@ List *
 remove_useless_joins(PlannerInfo *root, List *joinlist)
 {
 	ListCell   *lc;
+	int			nremoved;
 
-	/*
-	 * We are only interested in relations that are left-joined to, so we can
-	 * scan the join_info_list to find them easily.
-	 */
 restart:
+
+	/* start with trying to remove needless inner joins */
+	foreach(lc, joinlist)
+	{
+		RangeTblRef *rtr = (RangeTblRef *) lfirst(lc);
+		RelOptInfo	*rel;
+		RelOptInfo *removerrel;
+		List		*columnlist;
+
+		if (!IsA(rtr, RangeTblRef))
+			continue;
+
+		/* skip if the join can't be removed */
+		if (!innerjoin_is_removable(root, joinlist, rtr, &removerrel, &columnlist))
+			continue;
+
+		rel = find_base_rel(root, rtr->rtindex);
+
+		/*
+		 * If any of the columns on the join condition are NULLable then since
+		 * we've removed the join, there's now a possibility that null valued
+		 * rows could make it into the results. To ensure this does not happen
+		 * we'll add IS NOT NULL quals to the rel that allowed the join to be
+		 * removed, though we need only do this if the columns are actually
+		 * NULLable.
+		 */
+		convert_join_to_isnotnull_quals(root, removerrel, columnlist);
+
+		remove_rel_from_query(root, rtr->rtindex,
+				bms_union(rel->relids, removerrel->relids));
+
+		/* We verify that exactly one reference gets removed from joinlist */
+		nremoved = 0;
+		joinlist = remove_rel_from_joinlist(joinlist, rtr->rtindex, &nremoved);
+		if (nremoved != 1)
+			elog(ERROR, "failed to find relation %d in joinlist", rtr->rtindex);
+
+		/*
+		 * We can delete this RangeTblRef from the list too, since it's no
+		 * longer of interest.
+		 */
+		joinlist = list_delete_ptr(joinlist, rtr);
+
+		/*
+		 * Restart the scan.  This is necessary to ensure we find all removable
+		 * joins independently of their ordering. (note that removal of
+		 * attr_needed bits may make a join, inner or outer, appear removable
+		 * that did not before).   Also, since we just deleted the current list
+		 * cell, we'd have to have some kluge to continue the list scan anyway.
+		 */
+		goto restart;
+	}
+
+	/* now process special joins. Currently only left joins are supported */
 	foreach(lc, root->join_info_list)
 	{
 		SpecialJoinInfo *sjinfo = (SpecialJoinInfo *) lfirst(lc);
 		int			innerrelid;
-		int			nremoved;
 
-		/* Skip if not removable */
-		if (!join_is_removable(root, sjinfo))
-			continue;
+		if (sjinfo->jointype == JOIN_LEFT)
+		{
+			/* Skip if not removable */
+			if (!leftjoin_is_removable(root, sjinfo))
+				continue;
+		}
+		else
+			continue; /* we don't support this join type */
 
 		/*
 		 * Currently, join_is_removable can only succeed when the sjinfo's
@@ -91,12 +161,11 @@ restart:
 		root->join_info_list = list_delete_ptr(root->join_info_list, sjinfo);
 
 		/*
-		 * Restart the scan.  This is necessary to ensure we find all
-		 * removable joins independently of ordering of the join_info_list
-		 * (note that removal of attr_needed bits may make a join appear
-		 * removable that did not before).  Also, since we just deleted the
-		 * current list cell, we'd have to have some kluge to continue the
-		 * list scan anyway.
+		 * Restart the scan.  This is necessary to ensure we find all removable
+		 * joins independently of their ordering. (note that removal of
+		 * attr_needed bits may make a join, inner or outer, appear removable
+		 * that did not before).   Also, since we just deleted the current list
+		 * cell, we'd have to have some kluge to continue the list scan anyway.
 		 */
 		goto restart;
 	}
@@ -136,8 +205,203 @@ clause_sides_match_join(RestrictInfo *rinfo, Relids outerrelids,
 }
 
 /*
- * join_is_removable
- *	  Check whether we need not perform this special join at all, because
+ * innerjoin_is_removable
+ *		True if removalrtr's join can be removed due to the existence of
+ *		a foreign key which proves that we'll get exactly 1 matching row
+ *		from the join.
+ *
+ * We are able to prove the join is not required under the following
+ * conditions:
+ * 1. No Vars from the join are needed anywhere in the query.
+ * 2. We have another relation in the joinlist which references this relation
+ *    with a condition which matches the join condition exactly. The exact
+ *    match on the join condition means that the join will neither duplicate
+ *    nor restrict the results on the left side of the join.
+ */
+static bool
+innerjoin_is_removable(PlannerInfo *root, List *joinlist,
+					   RangeTblRef *removalrtr, RelOptInfo **removerrel,
+					   List **columnlist)
+{
+	ListCell   *lc;
+	RelOptInfo *removalrel;
+
+	removalrel = find_base_rel(root, removalrtr->rtindex);
+
+	/*
+	 * If removalrel has no indexes, then it mustn't have any unique indexes,
+	 * so therefore won't support being referenced by a foreign key.
+	 */
+	if (removalrel->indexlist == NIL)
+		return false;
+
+	/*
+	 * Currently we disallow the removal if we find any baserestrictinfo quals
+	 * on the relation. The reason for this is that these could filter out
+	 * rows and make it so the foreign key cannot prove that we'll match
+	 * exactly 1 row on the join condition. However, this check is currently
+	 * probably a bit overly strict, as we should be able to allow quals that
+	 * are present in the join condition. e.g:
+	 * SELECT a.* FROM a INNER JOIN b ON a.x = b.x WHERE b.x = 1
+	 */
+	if (removalrel->baserestrictinfo != NIL)
+		return false;
+
+	/*
+	 * Currently only eclass joins are supported, so if there are any non
+	 * eclass join quals then we'll report the join is non-removable.
+	 */
+	if (removalrel->joininfo != NIL)
+		return false;
+
+	/*
+	 * We mustn't allow any joins to be removed if there are any pending
+	 * foreign key triggers in the queue. This could happen if we are planning
+	 * a query that has been executed from within a volatile function and the
+	 * query which called this volatile function has made some changes to a
+	 * table referenced by a foreign key. The reason for this is that any
+	 * updates to a table which is referenced by a foreign key constraint will
+	 * only have the referencing tables updated after the command is complete,
+	 * so there is a window of time where records may violate the foreign key
+	 * constraint.
+	 *
+	 * Currently this code is quite naive, as we won't even attempt to remove
+	 * the join if there are *any* pending foreign key triggers, on any
+	 * relation. It may be worthwhile to improve this to check if there's any
+	 * pending triggers for the referencing relation in the join.
+	 */
+	if (!AfterTriggerQueueIsEmpty())
+		return false;
+
+	/*
+	 * Now we'll search through each relation in the joinlist to see if we can
+	 * find a relation which has a foreign key which references removalrel on
+	 * the join condition. If we find a rel with a foreign key which matches
+	 * the join condition exactly, then we can be sure that exactly 1 row will
+	 * be matched on the join, if we also see that no Vars from the relation
+	 * are needed, then we can report the join as removable.
+	 */
+	foreach (lc, joinlist)
+	{
+		RangeTblRef	*rtr = (RangeTblRef *) lfirst(lc);
+		RelOptInfo	*rel;
+		ListCell	*lc2;
+		List		*referencing_vars;
+		List		*index_vars;
+		List		*operator_list;
+		Relids		 joinrelids;
+
+		/* we can't remove ourself, or anything other than RangeTblRefs */
+		if (rtr == removalrtr || !IsA(rtr, RangeTblRef))
+			continue;
+
+		rel = find_base_rel(root, rtr->rtindex);
+
+		/* a rel without foreign keys won't help us, so skip it */
+		if (rel->fklist == NIL)
+			continue;
+
+		/*
+		 * If there's no join condition between the 2 rels, we can't use it to
+		 * prove the join is redundant.
+		 */
+		if (!have_relevant_eclass_joinclause(root, rel, removalrel))
+			continue;
+
+		joinrelids = bms_union(rel->relids, removalrel->relids);
+
+		if (relation_is_needed(root, joinrelids, removalrel))
+			return false;
+
+		referencing_vars = NIL;
+		index_vars = NIL;
+		operator_list = NIL;
+
+		foreach(lc2, root->eq_classes)
+		{
+			EquivalenceClass *ec = (EquivalenceClass *) lfirst(lc2);
+
+			if (list_length(ec->ec_members) <= 1)
+				continue;
+
+			if (bms_overlap(rel->relids, ec->ec_relids) &&
+				bms_overlap(removalrel->relids, ec->ec_relids))
+			{
+				ListCell *lc3;
+				Var *refvar = NULL;
+				Var *idxvar = NULL;
+
+				/*
+				 * Look at each member of the eclass and try to find a Var from
+				 * each side of the join that we can append to the list of
+				 * columns that should be checked against each foreign key.
+				 *
+				 * The following logic does not allow for join removals to take
+				 * place for foreign keys that have duplicate columns on the
+				 * referencing side of the foreign key, such as:
+				 * (a,a) references (x,y)
+				 * The use case for such a foreign key is likely small enough
+				 * that we needn't bother making this code any more complex to
+				 * solve. If we find more than 1 var from any of the rels then
+				 * we'll bail out.
+				 */
+				foreach (lc3, ec->ec_members)
+				{
+					EquivalenceMember *ecm = (EquivalenceMember *) lfirst(lc3);
+
+					Var *var = (Var *) ecm->em_expr;
+
+					if (!IsA(var, Var))
+						continue; /* Ignore Consts */
+
+					if (var->varno == rtr->rtindex)
+					{
+						if (refvar != NULL)
+							return false;
+						refvar = var;
+					}
+
+					else if (var->varno == removalrtr->rtindex)
+					{
+						if (idxvar != NULL)
+							return false;
+						idxvar = var;
+					}
+				}
+
+				if (refvar != NULL && idxvar != NULL)
+				{
+					/* grab the correct equality operator for these two vars */
+					Oid opno = select_equality_operator(ec, refvar->vartype, idxvar->vartype);
+
+					if (!OidIsValid(opno))
+						return false;
+
+					referencing_vars = lappend(referencing_vars, refvar);
+					index_vars = lappend(index_vars, idxvar);
+					operator_list = lappend_oid(operator_list, opno);
+				}
+			}
+		}
+
+		if (referencing_vars != NULL)
+		{
+			if (relation_has_foreign_key_for(root, rel, removalrel,
+				referencing_vars, index_vars, operator_list))
+			{
+				*removerrel = rel;
+				*columnlist = referencing_vars;
+				return true; /* removalrel can be removed */
+			}
+		}
+	}
+
+	return false; /* can't remove join */
+}
+
+/*
+ * leftjoin_is_removable
+ *	  Check whether we need not perform this left join at all, because
  *	  it will just duplicate its left input.
  *
  * This is true for a left join for which the join condition cannot match
@@ -147,7 +411,7 @@ clause_sides_match_join(RestrictInfo *rinfo, Relids outerrelids,
  * above the join.
  */
 static bool
-join_is_removable(PlannerInfo *root, SpecialJoinInfo *sjinfo)
+leftjoin_is_removable(PlannerInfo *root, SpecialJoinInfo *sjinfo)
 {
 	int			innerrelid;
 	RelOptInfo *innerrel;
@@ -155,14 +419,14 @@ join_is_removable(PlannerInfo *root, SpecialJoinInfo *sjinfo)
 	Relids		joinrelids;
 	List	   *clause_list = NIL;
 	ListCell   *l;
-	int			attroff;
+
+	Assert(sjinfo->jointype == JOIN_LEFT);
 
 	/*
-	 * Must be a non-delaying left join to a single baserel, else we aren't
+	 * Must be a non-delaying join to a single baserel, else we aren't
 	 * going to be able to do anything with it.
 	 */
-	if (sjinfo->jointype != JOIN_LEFT ||
-		sjinfo->delay_upper_joins ||
+	if (sjinfo->delay_upper_joins ||
 		bms_membership(sjinfo->min_righthand) != BMS_SINGLETON)
 		return false;
 
@@ -205,52 +469,9 @@ join_is_removable(PlannerInfo *root, SpecialJoinInfo *sjinfo)
 	/* Compute the relid set for the join we are considering */
 	joinrelids = bms_union(sjinfo->min_lefthand, sjinfo->min_righthand);
 
-	/*
-	 * We can't remove the join if any inner-rel attributes are used above the
-	 * join.
-	 *
-	 * Note that this test only detects use of inner-rel attributes in higher
-	 * join conditions and the target list.  There might be such attributes in
-	 * pushed-down conditions at this join, too.  We check that case below.
-	 *
-	 * As a micro-optimization, it seems better to start with max_attr and
-	 * count down rather than starting with min_attr and counting up, on the
-	 * theory that the system attributes are somewhat less likely to be wanted
-	 * and should be tested last.
-	 */
-	for (attroff = innerrel->max_attr - innerrel->min_attr;
-		 attroff >= 0;
-		 attroff--)
-	{
-		if (!bms_is_subset(innerrel->attr_needed[attroff], joinrelids))
-			return false;
-	}
-
-	/*
-	 * Similarly check that the inner rel isn't needed by any PlaceHolderVars
-	 * that will be used above the join.  We only need to fail if such a PHV
-	 * actually references some inner-rel attributes; but the correct check
-	 * for that is relatively expensive, so we first check against ph_eval_at,
-	 * which must mention the inner rel if the PHV uses any inner-rel attrs as
-	 * non-lateral references.  Note that if the PHV's syntactic scope is just
-	 * the inner rel, we can't drop the rel even if the PHV is variable-free.
-	 */
-	foreach(l, root->placeholder_list)
-	{
-		PlaceHolderInfo *phinfo = (PlaceHolderInfo *) lfirst(l);
-
-		if (bms_is_subset(phinfo->ph_needed, joinrelids))
-			continue;			/* PHV is not used above the join */
-		if (bms_overlap(phinfo->ph_lateral, innerrel->relids))
-			return false;		/* it references innerrel laterally */
-		if (!bms_overlap(phinfo->ph_eval_at, innerrel->relids))
-			continue;			/* it definitely doesn't reference innerrel */
-		if (bms_is_subset(phinfo->ph_eval_at, innerrel->relids))
-			return false;		/* there isn't any other place to eval PHV */
-		if (bms_overlap(pull_varnos((Node *) phinfo->ph_var->phexpr),
-						innerrel->relids))
-			return false;		/* it does reference innerrel */
-	}
+	/* if the relation is referenced in the query then it cannot be removed */
+	if (relation_is_needed(root, joinrelids, innerrel))
+		return false;
 
 	/*
 	 * Search for mergejoinable clauses that constrain the inner rel against
@@ -367,6 +588,279 @@ join_is_removable(PlannerInfo *root, SpecialJoinInfo *sjinfo)
 	return false;
 }
 
+/*
+ * relation_is_needed
+ *		True if any of the Vars from this relation are required in the query
+ */
+static inline bool
+relation_is_needed(PlannerInfo *root, Relids joinrelids, RelOptInfo *rel)
+{
+	int		  attroff;
+	ListCell *l;
+
+	/*
+	 * rel is referenced if any of its attributes are used above the join.
+	 *
+	 * Note that this test only detects use of rel's attributes in higher
+	 * join conditions and the target list.  There might be such attributes in
+	 * pushed-down conditions at this join, too.  We check that case below.
+	 *
+	 * As a micro-optimization, it seems better to start with max_attr and
+	 * count down rather than starting with min_attr and counting up, on the
+	 * theory that the system attributes are somewhat less likely to be wanted
+	 * and should be tested last.
+	 */
+	for (attroff = rel->max_attr - rel->min_attr;
+		 attroff >= 0;
+		 attroff--)
+	{
+		if (!bms_is_subset(rel->attr_needed[attroff], joinrelids))
+			return true;
+	}
+
+	/*
+	 * Similarly check that rel isn't needed by any PlaceHolderVars that will
+	 * be used above the join.  We only need to fail if such a PHV actually
+	 * references some of rel's attributes; but the correct check for that is
+	 * relatively expensive, so we first check against ph_eval_at, which must
+	 * mention rel if the PHV uses any of rel's attrs as non-lateral
+	 * references.  Note that if the PHV's syntactic scope is just rel, we
+	 * must return true even if the PHV is variable-free.
+	 */
+	foreach(l, root->placeholder_list)
+	{
+		PlaceHolderInfo *phinfo = (PlaceHolderInfo *) lfirst(l);
+
+		if (bms_is_subset(phinfo->ph_needed, joinrelids))
+			continue;			/* PHV is not used above the join */
+		if (bms_overlap(phinfo->ph_lateral, rel->relids))
+			return true;		/* it references rel laterally */
+		if (!bms_overlap(phinfo->ph_eval_at, rel->relids))
+			continue;			/* it definitely doesn't reference rel */
+		if (bms_is_subset(phinfo->ph_eval_at, rel->relids))
+			return true;		/* there isn't any other place to eval PHV */
+		if (bms_overlap(pull_varnos((Node *) phinfo->ph_var->phexpr),
+						rel->relids))
+			return true;		/* it does reference rel */
+	}
+
+	return false; /* it does not reference rel */
+}
+
+/*
+ * convert_join_to_isnotnull_quals
+ *		Adds any required "col IS NOT NULL" quals which are required to ensure
+ *		that the query remains equivalent to what it was before the join
+ *		was removed.
+ */
+static void
+convert_join_to_isnotnull_quals(PlannerInfo *root, RelOptInfo *rel, List *columnlist)
+{
+	ListCell	*l;
+	Bitmapset	*handledcols = NULL;
+	Oid			 reloid;
+
+	reloid = root->simple_rte_array[rel->relid]->relid;
+
+	/*
+	 * If a join has been successfully removed by the join removal code,
+	 * then a foreign key must exist that proves the join to not be required.
+	 *
+	 * The join would have never allowed NULL values for any of the columns
+	 * seen in the join condition, as these would have matched up to a record
+	 * in the joined table. Now that we've proved the join to be redundant, we
+	 * must maintain that behavior of not having NULLs by adding IS NOT NULL
+	 * quals to the WHERE clause, although we may skip this if the column in
+	 * question happens to have a NOT NULL constraint.
+	 */
+	foreach(l, columnlist)
+	{
+		Var *var = (Var *) lfirst(l);
+
+		/* should be a var if it came from a foreign key */
+		Assert(IsA(var, Var));
+		Assert(var->varno == rel->relid);
+
+		/*
+		 * Skip this column if it's a duplicate of one we've previously
+		 * handled.
+		 */
+		if (bms_is_member(var->varattno, handledcols))
+			continue;
+
+		/* mark this column as handled */
+		handledcols = bms_add_member(handledcols, var->varattno);
+
+		/* add the IS NOT NULL qual, but only if the column allows NULLs */
+		if (!get_attnotnull(reloid, var->varattno))
+		{
+			RestrictInfo *rinfo;
+			NullTest *ntest = makeNode(NullTest);
+
+			ntest->nulltesttype = IS_NOT_NULL;
+			ntest->arg = (Expr *) var;
+			ntest->argisrow = false;
+
+			rinfo = make_restrictinfo((Expr *)ntest, false, false, false,
+						NULL, NULL, NULL);
+			rel->baserestrictinfo = lappend(rel->baserestrictinfo, rinfo);
+		}
+	}
+}
+
+/*
+ * relation_has_foreign_key_for
+ *	  Checks if rel has a foreign key which references referencedrel with the
+ *	  given list of expressions.
+ *
+ *	For the match to succeed:
+ *	  referencing_vars must match the columns defined in the foreign key.
+ *	  index_vars must match the columns defined in the index for the foreign key.
+ */
+static bool
+relation_has_foreign_key_for(PlannerInfo *root, RelOptInfo *rel,
+			RelOptInfo *referencedrel, List *referencing_vars,
+			List *index_vars, List *operator_list)
+{
+	ListCell *lc;
+	Oid		  refreloid;
+
+	/*
+	 * Look up the Oid of the referenced relation. We only want to look at
+	 * foreign keys on the referencing relation which reference this relation.
+	 */
+	refreloid = root->simple_rte_array[referencedrel->relid]->relid;
+
+	Assert(list_length(referencing_vars) > 0);
+	Assert(list_length(referencing_vars) == list_length(index_vars));
+	Assert(list_length(referencing_vars) == list_length(operator_list));
+
+	/*
+	 * Search through each foreign key on the referencing relation and try
+	 * to find one which references the relation in the join condition. If we
+	 * find one then we'll send the join conditions off to
+	 * expressions_match_foreign_key() to see if they match the foreign key.
+	 */
+	foreach(lc, rel->fklist)
+	{
+		ForeignKeyInfo *fk = (ForeignKeyInfo *) lfirst(lc);
+
+		if (fk->confrelid == refreloid)
+		{
+			if (expressions_match_foreign_key(fk, referencing_vars,
+				index_vars, operator_list))
+				return true;
+		}
+	}
+
+	return false;
+}
+
+/*
+ * expressions_match_foreign_key
+ *		True if the given fkvars, indexvars and operators will match
+ *		exactly 1 record in the referenced relation of the foreign key.
+ *
+ * Note: This function expects fkvars and indexvars to only contain Var types.
+ *       Expression indexes are not supported by foreign keys.
+ */
+static bool
+expressions_match_foreign_key(ForeignKeyInfo *fk, List *fkvars,
+					List *indexvars, List *operators)
+{
+	ListCell  *lc;
+	ListCell  *lc2;
+	ListCell  *lc3;
+	int		   col;
+	Bitmapset *allitems;
+	Bitmapset *matcheditems;
+	int		   lstidx;
+
+	Assert(list_length(fkvars) == list_length(indexvars));
+	Assert(list_length(fkvars) == list_length(operators));
+
+	/*
+	 * Fast path out if there are not enough conditions to match each column
+	 * in the foreign key. Note that we cannot require the number of
+	 * expressions to equal the number of columns here, since that would
+	 * cause duplicated expressions not to match.
+	 */
+	if (list_length(fkvars) < fk->conncols)
+		return false;
+
+	/*
+	 * We need to ensure that each foreign key column can be matched to a list
+	 * item, and we need to ensure that each list item can be matched to a
+	 * foreign key column. We do this by looping over each foreign key column
+	 * and checking that we can find an item in the list which matches the
+	 * current column. However, this method does not allow us to ensure that
+	 * no additional items exist in the list. We could fix that by performing
+	 * another loop over each list item to check that it matches a foreign
+	 * key column, but that's a bit wasteful. Instead we'll use 2 bitmapsets,
+	 * one to store the 0 based index of each list item, and with the other
+	 * we'll store each list index that we've managed to match. After we're
+	 * done matching we'll just make sure that both bitmapsets are equal.
+	 */
+	allitems = NULL;
+	matcheditems = NULL;
+
+	/*
+	 * Build a bitmapset which contains each 0 based list index. It seems more
+	 * efficient to do this in reverse so that we allocate enough memory for
+	 * the bitmapset on first loop rather than reallocating each time we find
+	 * we need a bit more space.
+	 */
+	for (lstidx = list_length(fkvars) - 1; lstidx >= 0; lstidx--)
+		allitems = bms_add_member(allitems, lstidx);
+
+	for (col = 0; col < fk->conncols; col++)
+	{
+		bool  matched = false;
+
+		lstidx = 0;
+
+		forthree(lc, fkvars, lc2, indexvars, lc3, operators)
+		{
+			Var *expr = (Var *) lfirst(lc);
+			Var *idxexpr = (Var *) lfirst(lc2);
+			Oid  opr = lfirst_oid(lc3);
+
+			Assert(IsA(expr, Var));
+			Assert(IsA(idxexpr, Var));
+
+			/* Does this join qual match up to the current fkey column? */
+			if (fk->conkey[col] == expr->varattno &&
+				fk->confkey[col] == idxexpr->varattno &&
+				equality_ops_are_compatible(opr, fk->conpfeqop[col]))
+			{
+				matched = true;
+
+				/* mark this list item as matched */
+				matcheditems = bms_add_member(matcheditems, lstidx);
+
+				/*
+				 * Don't break here as there may be duplicate expressions
+				 * that we also need to match against.
+				 */
+			}
+			lstidx++;
+		}
+
+		/* punt if there's no match. */
+		if (!matched)
+			return false;
+	}
+
+	/*
+	 * Ensure that we managed to match every item in the list to a foreign key
+	 * column.
+	 */
+	if (!bms_equal(allitems, matcheditems))
+		return false;
+
+	return true; /* matched */
+}
+
 
 /*
  * Remove the target relid from the planner's data structures, having
@@ -393,6 +887,9 @@ remove_rel_from_query(PlannerInfo *root, int relid, Relids joinrelids)
 	 */
 	rel->reloptkind = RELOPT_DEADREL;
 
+	/* Strip out any eclass members that belong to this rel */
+	remove_rel_from_eclass(root, relid);
+
 	/*
 	 * Remove references to the rel from other baserels' attr_needed arrays.
 	 */
diff --git a/src/backend/optimizer/util/plancat.c b/src/backend/optimizer/util/plancat.c
index b2becfa..8e98d4b 100644
--- a/src/backend/optimizer/util/plancat.c
+++ b/src/backend/optimizer/util/plancat.c
@@ -25,7 +25,9 @@
 #include "access/transam.h"
 #include "access/xlog.h"
 #include "catalog/catalog.h"
+#include "catalog/pg_constraint.h"
 #include "catalog/heap.h"
+#include "catalog/pg_type.h"
 #include "foreign/fdwapi.h"
 #include "miscadmin.h"
 #include "nodes/makefuncs.h"
@@ -38,6 +40,7 @@
 #include "parser/parsetree.h"
 #include "rewrite/rewriteManip.h"
 #include "storage/bufmgr.h"
+#include "utils/fmgroids.h"
 #include "utils/lsyscache.h"
 #include "utils/rel.h"
 #include "utils/snapmgr.h"
@@ -384,6 +387,121 @@ get_relation_info(PlannerInfo *root, Oid relationObjectId, bool inhparent,
 
 	heap_close(relation, NoLock);
 
+	/* load the foreign key constraints, if there are any */
+	if (relation->rd_rel->relhasfkey)
+	{
+		List	   *fkinfos = NIL;
+		Relation	fkeyRel;
+		Relation	fkeyRelIdx;
+		ScanKeyData fkeyScankey;
+		SysScanDesc fkeyScan;
+		HeapTuple	tuple;
+
+		ScanKeyInit(&fkeyScankey,
+					Anum_pg_constraint_conrelid,
+					BTEqualStrategyNumber, F_OIDEQ,
+					ObjectIdGetDatum(relationObjectId));
+
+		fkeyRel = heap_open(ConstraintRelationId, AccessShareLock);
+		fkeyRelIdx = index_open(ConstraintRelidIndexId, AccessShareLock);
+		fkeyScan = systable_beginscan_ordered(fkeyRel, fkeyRelIdx, NULL, 1, &fkeyScankey);
+
+		while ((tuple = systable_getnext_ordered(fkeyScan, ForwardScanDirection)) != NULL)
+		{
+			Form_pg_constraint con = (Form_pg_constraint) GETSTRUCT(tuple);
+			ForeignKeyInfo *fkinfo;
+			Datum		adatum;
+			bool		isNull;
+			ArrayType  *arr;
+			int			numkeys;
+
+			/* Not a foreign key */
+			if (con->contype != CONSTRAINT_FOREIGN)
+				continue;
+
+			/* we're not interested unless the fk has been validated */
+			if (!con->convalidated)
+				continue;
+
+			fkinfo = (ForeignKeyInfo *) palloc(sizeof(ForeignKeyInfo));
+			fkinfo->conindid = con->conindid;
+			fkinfo->confrelid = con->confrelid;
+			fkinfo->convalidated = con->convalidated;
+			fkinfo->conrelid = con->conrelid;
+			fkinfo->confupdtype = con->confupdtype;
+			fkinfo->confdeltype = con->confdeltype;
+			fkinfo->confmatchtype = con->confmatchtype;
+
+			adatum = heap_getattr(tuple, Anum_pg_constraint_conkey,
+								RelationGetDescr(fkeyRel), &isNull);
+
+			if (isNull)
+				elog(ERROR, "null conkey for constraint %u",
+					HeapTupleGetOid(tuple));
+
+			arr = DatumGetArrayTypeP(adatum);		/* ensure not toasted */
+			numkeys = ARR_DIMS(arr)[0];
+			if (ARR_NDIM(arr) != 1 ||
+				numkeys < 0 ||
+				ARR_HASNULL(arr) ||
+				ARR_ELEMTYPE(arr) != INT2OID)
+				elog(ERROR, "conkey is not a 1-D smallint array");
+
+			fkinfo->conkey = (int16 *) ARR_DATA_PTR(arr);
+			fkinfo->conncols = numkeys;
+
+			adatum = heap_getattr(tuple, Anum_pg_constraint_confkey,
+								RelationGetDescr(fkeyRel), &isNull);
+
+			if (isNull)
+				elog(ERROR, "null confkey for constraint %u",
+					HeapTupleGetOid(tuple));
+
+			arr = DatumGetArrayTypeP(adatum);		/* ensure not toasted */
+			numkeys = ARR_DIMS(arr)[0];
+
+			/* sanity check */
+			if (numkeys != fkinfo->conncols)
+				elog(ERROR, "number of confkey elements does not equal conkey elements");
+
+			if (ARR_NDIM(arr) != 1 ||
+				numkeys < 0 ||
+				ARR_HASNULL(arr) ||
+				ARR_ELEMTYPE(arr) != INT2OID)
+				elog(ERROR, "confkey is not a 1-D smallint array");
+
+			fkinfo->confkey = (int16 *) ARR_DATA_PTR(arr);
+			adatum = heap_getattr(tuple, Anum_pg_constraint_conpfeqop,
+								RelationGetDescr(fkeyRel), &isNull);
+
+			if (isNull)
+				elog(ERROR, "null conpfeqop for constraint %u",
+					HeapTupleGetOid(tuple));
+
+			arr = DatumGetArrayTypeP(adatum);		/* ensure not toasted */
+			numkeys = ARR_DIMS(arr)[0];
+
+			/* sanity check */
+			if (numkeys != fkinfo->conncols)
+				elog(ERROR, "number of conpfeqop elements does not equal conkey elements");
+
+			if (ARR_NDIM(arr) != 1 ||
+				numkeys < 0 ||
+				ARR_HASNULL(arr) ||
+				ARR_ELEMTYPE(arr) != OIDOID)
+				elog(ERROR, "conpfeqop is not a 1-D Oid array");
+
+			fkinfo->conpfeqop = (Oid *) ARR_DATA_PTR(arr);
+
+			fkinfos = lappend(fkinfos, fkinfo);
+		}
+
+		rel->fklist = fkinfos;
+		systable_endscan_ordered(fkeyScan);
+		index_close(fkeyRelIdx, AccessShareLock);
+		heap_close(fkeyRel, AccessShareLock);
+	}
+
 	/*
 	 * Allow a plugin to editorialize on the info we obtained from the
 	 * catalogs.  Actions might include altering the assumed relation size,
diff --git a/src/backend/optimizer/util/relnode.c b/src/backend/optimizer/util/relnode.c
index c938c27..a0fb8eb 100644
--- a/src/backend/optimizer/util/relnode.c
+++ b/src/backend/optimizer/util/relnode.c
@@ -115,6 +115,7 @@ build_simple_rel(PlannerInfo *root, int relid, RelOptKind reloptkind)
 	rel->lateral_relids = NULL;
 	rel->lateral_referencers = NULL;
 	rel->indexlist = NIL;
+	rel->fklist = NIL;
 	rel->pages = 0;
 	rel->tuples = 0;
 	rel->allvisfrac = 0;
@@ -377,6 +378,7 @@ build_join_rel(PlannerInfo *root,
 	joinrel->lateral_relids = NULL;
 	joinrel->lateral_referencers = NULL;
 	joinrel->indexlist = NIL;
+	joinrel->fklist = NIL;
 	joinrel->pages = 0;
 	joinrel->tuples = 0;
 	joinrel->allvisfrac = 0;
diff --git a/src/backend/utils/cache/lsyscache.c b/src/backend/utils/cache/lsyscache.c
index 552e498..aa81c7c 100644
--- a/src/backend/utils/cache/lsyscache.c
+++ b/src/backend/utils/cache/lsyscache.c
@@ -916,6 +916,33 @@ get_atttypetypmodcoll(Oid relid, AttrNumber attnum,
 	ReleaseSysCache(tp);
 }
 
+/*
+ * get_attnotnull
+ *
+ *		Given the relation id and the attribute number,
+ *		return the "attnotnull" field from the attribute relation.
+ */
+bool
+get_attnotnull(Oid relid, AttrNumber attnum)
+{
+	HeapTuple	tp;
+
+	tp = SearchSysCache2(ATTNUM,
+						 ObjectIdGetDatum(relid),
+						 Int16GetDatum(attnum));
+	if (HeapTupleIsValid(tp))
+	{
+		Form_pg_attribute att_tup = (Form_pg_attribute) GETSTRUCT(tp);
+		bool		result;
+
+		result = att_tup->attnotnull;
+		ReleaseSysCache(tp);
+		return result;
+	}
+	else
+		return false;
+}
+
 /*				---------- COLLATION CACHE ----------					 */
 
 /*
diff --git a/src/include/catalog/pg_class.h b/src/include/catalog/pg_class.h
index f2fb317..55f2155 100644
--- a/src/include/catalog/pg_class.h
+++ b/src/include/catalog/pg_class.h
@@ -62,6 +62,7 @@ CATALOG(pg_class,1259) BKI_BOOTSTRAP BKI_ROWTYPE_OID(83) BKI_SCHEMA_MACRO
 	int16		relchecks;		/* # of CHECK constraints for class */
 	bool		relhasoids;		/* T if we generate OIDs for rows of rel */
 	bool		relhaspkey;		/* has (or has had) PRIMARY KEY index */
+	bool		relhasfkey;		/* has (or has had) a FOREIGN KEY constraint */
 	bool		relhasrules;	/* has (or has had) any rules */
 	bool		relhastriggers; /* has (or has had) any TRIGGERs */
 	bool		relhassubclass; /* has (or has had) derived classes */
@@ -94,7 +95,7 @@ typedef FormData_pg_class *Form_pg_class;
  * ----------------
  */
 
-#define Natts_pg_class					29
+#define Natts_pg_class					30
 #define Anum_pg_class_relname			1
 #define Anum_pg_class_relnamespace		2
 #define Anum_pg_class_reltype			3
@@ -115,15 +116,16 @@ typedef FormData_pg_class *Form_pg_class;
 #define Anum_pg_class_relchecks			18
 #define Anum_pg_class_relhasoids		19
 #define Anum_pg_class_relhaspkey		20
-#define Anum_pg_class_relhasrules		21
-#define Anum_pg_class_relhastriggers	22
-#define Anum_pg_class_relhassubclass	23
-#define Anum_pg_class_relispopulated	24
-#define Anum_pg_class_relreplident		25
-#define Anum_pg_class_relfrozenxid		26
-#define Anum_pg_class_relminmxid		27
-#define Anum_pg_class_relacl			28
-#define Anum_pg_class_reloptions		29
+#define Anum_pg_class_relhasfkey		21
+#define Anum_pg_class_relhasrules		22
+#define Anum_pg_class_relhastriggers	23
+#define Anum_pg_class_relhassubclass	24
+#define Anum_pg_class_relispopulated	25
+#define Anum_pg_class_relreplident		26
+#define Anum_pg_class_relfrozenxid		27
+#define Anum_pg_class_relminmxid		28
+#define Anum_pg_class_relacl			29
+#define Anum_pg_class_reloptions		30
 
 /* ----------------
  *		initial contents of pg_class
@@ -138,13 +140,13 @@ typedef FormData_pg_class *Form_pg_class;
  * Note: "3" in the relfrozenxid column stands for FirstNormalTransactionId;
  * similarly, "1" in relminmxid stands for FirstMultiXactId
  */
-DATA(insert OID = 1247 (  pg_type		PGNSP 71 0 PGUID 0 0 0 0 0 0 0 f f p r 30 0 t f f f f t n 3 1 _null_ _null_ ));
+DATA(insert OID = 1247 (  pg_type		PGNSP 71 0 PGUID 0 0 0 0 0 0 0 f f p r 30 0 t f f f f f t n 3 1 _null_ _null_ ));
 DESCR("");
-DATA(insert OID = 1249 (  pg_attribute	PGNSP 75 0 PGUID 0 0 0 0 0 0 0 f f p r 21 0 f f f f f t n 3 1 _null_ _null_ ));
+DATA(insert OID = 1249 (  pg_attribute	PGNSP 75 0 PGUID 0 0 0 0 0 0 0 f f p r 21 0 f f f f f f t n 3 1 _null_ _null_ ));
 DESCR("");
-DATA(insert OID = 1255 (  pg_proc		PGNSP 81 0 PGUID 0 0 0 0 0 0 0 f f p r 27 0 t f f f f t n 3 1 _null_ _null_ ));
+DATA(insert OID = 1255 (  pg_proc		PGNSP 81 0 PGUID 0 0 0 0 0 0 0 f f p r 27 0 t f f f f f t n 3 1 _null_ _null_ ));
 DESCR("");
-DATA(insert OID = 1259 (  pg_class		PGNSP 83 0 PGUID 0 0 0 0 0 0 0 f f p r 29 0 t f f f f t n 3 1 _null_ _null_ ));
+DATA(insert OID = 1259 (  pg_class		PGNSP 83 0 PGUID 0 0 0 0 0 0 0 f f p r 30 0 t f f f f f t n 3 1 _null_ _null_ ));
 DESCR("");
 
 
diff --git a/src/include/commands/trigger.h b/src/include/commands/trigger.h
index d0b0356..34a75e4 100644
--- a/src/include/commands/trigger.h
+++ b/src/include/commands/trigger.h
@@ -181,6 +181,7 @@ extern void ExecBSTruncateTriggers(EState *estate,
 extern void ExecASTruncateTriggers(EState *estate,
 					   ResultRelInfo *relinfo);
 
+extern bool AfterTriggerQueueIsEmpty(void);
 extern void AfterTriggerBeginXact(void);
 extern void AfterTriggerBeginQuery(void);
 extern void AfterTriggerEndQuery(EState *estate);
diff --git a/src/include/nodes/relation.h b/src/include/nodes/relation.h
index dacbe9c..f69df09 100644
--- a/src/include/nodes/relation.h
+++ b/src/include/nodes/relation.h
@@ -355,6 +355,8 @@ typedef struct PlannerInfo
  *		lateral_referencers - relids of rels that reference this one laterally
  *		indexlist - list of IndexOptInfo nodes for relation's indexes
  *					(always NIL if it's not a table)
+ *		fklist - list of ForeignKeyInfo's for relation's foreign key
+ *					constraints. (always NIL if it's not a table)
  *		pages - number of disk pages in relation (zero if not a table)
  *		tuples - number of tuples in relation (not considering restrictions)
  *		allvisfrac - fraction of disk pages that are marked all-visible
@@ -448,6 +450,7 @@ typedef struct RelOptInfo
 	Relids		lateral_relids; /* minimum parameterization of rel */
 	Relids		lateral_referencers;	/* rels that reference me laterally */
 	List	   *indexlist;		/* list of IndexOptInfo */
+	List	   *fklist;			/* list of ForeignKeyInfo */
 	BlockNumber pages;			/* size estimates derived from pg_class */
 	double		tuples;
 	double		allvisfrac;
@@ -538,6 +541,51 @@ typedef struct IndexOptInfo
 	bool		amhasgetbitmap; /* does AM have amgetbitmap interface? */
 } IndexOptInfo;
 
+/*
+ * ForeignKeyInfo
+ *		Used to store pg_constraint records for foreign key constraints for use
+ *		by the planner.
+ *
+ *		conindid - The index which supports the foreign key
+ *
+ *		confrelid - The relation that is referenced by this foreign key
+ *
+ *		convalidated - True if the foreign key has been validated.
+ *
+ *		conrelid - The Oid of the relation that the foreign key belongs to
+ *
+ *		confupdtype - ON UPDATE action for when the referenced table is updated
+ *
+ *		confdeltype - ON DELETE action, controls what to do when a record is
+ *					deleted from the referenced table.
+ *
+ *		confmatchtype - foreign key match type, e.g. MATCH FULL, MATCH PARTIAL
+ *
+ *		conncols - Number of columns defined in the foreign key
+ *
+ *		conkey - An array of conncols elements to store the varattno of the
+ *					columns on the referencing side of the foreign key
+ *
+ *		confkey - An array of conncols elements to store the varattno of the
+ *					columns on the referenced side of the foreign key
+ *
+ *		conpfeqop - An array of conncols elements to store the operators for
+ *					PK = FK comparisons
+ */
+typedef struct ForeignKeyInfo
+{
+	Oid			conindid;		/* index supporting this constraint */
+	Oid			confrelid;		/* relation referenced by foreign key */
+	bool		convalidated;	/* constraint has been validated? */
+	Oid			conrelid;		/* relation this constraint constrains */
+	char		confupdtype;	/* foreign key's ON UPDATE action */
+	char		confdeltype;	/* foreign key's ON DELETE action */
+	char		confmatchtype;	/* foreign key's match type */
+	int			conncols;		/* number of columns in the foreign key */
+	int16	   *conkey;			/* columns of conrelid that the constraint applies to */
+	int16	   *confkey;		/* columns of confrelid that foreign key references */
+	Oid		   *conpfeqop;		/* Operator list for comparing PK to FK */
+} ForeignKeyInfo;
 
 /*
  * EquivalenceClasses
diff --git a/src/include/optimizer/paths.h b/src/include/optimizer/paths.h
index 9b22fda..b11ae78 100644
--- a/src/include/optimizer/paths.h
+++ b/src/include/optimizer/paths.h
@@ -108,10 +108,13 @@ extern EquivalenceClass *get_eclass_for_sort_expr(PlannerInfo *root,
 						 Relids rel,
 						 bool create_it);
 extern void generate_base_implied_equalities(PlannerInfo *root);
+extern void remove_rel_from_eclass(PlannerInfo *root, int relid);
 extern List *generate_join_implied_equalities(PlannerInfo *root,
 								 Relids join_relids,
 								 Relids outer_relids,
 								 RelOptInfo *inner_rel);
+extern Oid select_equality_operator(EquivalenceClass *ec, Oid lefttype,
+								 Oid righttype);
 extern bool exprs_known_equal(PlannerInfo *root, Node *item1, Node *item2);
 extern void add_child_rel_equivalences(PlannerInfo *root,
 						   AppendRelInfo *appinfo,
diff --git a/src/include/utils/lsyscache.h b/src/include/utils/lsyscache.h
index 07d24d4..910190d 100644
--- a/src/include/utils/lsyscache.h
+++ b/src/include/utils/lsyscache.h
@@ -68,6 +68,7 @@ extern Oid	get_atttype(Oid relid, AttrNumber attnum);
 extern int32 get_atttypmod(Oid relid, AttrNumber attnum);
 extern void get_atttypetypmodcoll(Oid relid, AttrNumber attnum,
 					  Oid *typid, int32 *typmod, Oid *collid);
+extern bool get_attnotnull(Oid relid, AttrNumber attnum);
 extern char *get_collation_name(Oid colloid);
 extern char *get_constraint_name(Oid conoid);
 extern Oid	get_opclass_family(Oid opclass);
diff --git a/src/test/regress/expected/join.out b/src/test/regress/expected/join.out
index 2501184..251d44f 100644
--- a/src/test/regress/expected/join.out
+++ b/src/test/regress/expected/join.out
@@ -3276,6 +3276,260 @@ select i8.* from int8_tbl i8 left join (select f1 from int4_tbl group by f1) i4
 (1 row)
 
 rollback;
+BEGIN;
+-- Test join removals for inner joins
+CREATE TEMP TABLE c (id INT NOT NULL PRIMARY KEY);
+CREATE TEMP TABLE b (id INT NOT NULL PRIMARY KEY, c_id INT NOT NULL REFERENCES c(id), val INT);
+CREATE TEMP TABLE a (id INT NOT NULL PRIMARY KEY, b_id INT REFERENCES b(id));
+-- this should remove inner join to b
+EXPLAIN (COSTS OFF)
+SELECT a.* FROM a INNER JOIN b ON a.b_id = b.id;
+          QUERY PLAN          
+------------------------------
+ Seq Scan on a
+   Filter: (b_id IS NOT NULL)
+(2 rows)
+
+-- this should remove inner join to b and c
+EXPLAIN (COSTS OFF)
+SELECT a.* FROM a INNER JOIN b ON a.b_id = b.id INNER JOIN c ON b.c_id = c.id;
+          QUERY PLAN          
+------------------------------
+ Seq Scan on a
+   Filter: (b_id IS NOT NULL)
+(2 rows)
+
+-- this should generate the same plan as above.
+EXPLAIN (COSTS OFF)
+SELECT a.* FROM c INNER JOIN b ON c.id = b.c_id INNER JOIN a ON a.b_id = b.id;
+          QUERY PLAN          
+------------------------------
+ Seq Scan on a
+   Filter: (b_id IS NOT NULL)
+(2 rows)
+
+-- inner join can't be removed due to b columns in the target list
+EXPLAIN (COSTS OFF)
+SELECT * FROM a INNER JOIN b ON a.b_id = b.id;
+          QUERY PLAN          
+------------------------------
+ Hash Join
+   Hash Cond: (a.b_id = b.id)
+   ->  Seq Scan on a
+   ->  Hash
+         ->  Seq Scan on b
+(5 rows)
+
+-- this should not remove inner join to b due to quals restricting results from b
+EXPLAIN (COSTS OFF)
+SELECT a.* FROM a INNER JOIN b on a.b_id = b.id WHERE b.val = 10;
+            QUERY PLAN            
+----------------------------------
+ Hash Join
+   Hash Cond: (a.b_id = b.id)
+   ->  Seq Scan on a
+   ->  Hash
+         ->  Seq Scan on b
+               Filter: (val = 10)
+(6 rows)
+
+-- this should remove joins to b and c.
+EXPLAIN (COSTS OFF)
+SELECT COUNT(*) FROM a INNER JOIN b ON a.b_id = b.id LEFT OUTER JOIN c ON a.id = c.id;
+             QUERY PLAN             
+------------------------------------
+ Aggregate
+   ->  Seq Scan on a
+         Filter: (b_id IS NOT NULL)
+(3 rows)
+
+-- this should remove joins to b and c, however it b will only be removed on
+-- 2nd attempt after c is removed by the left join removal code.
+EXPLAIN (COSTS OFF)
+SELECT COUNT(*) FROM a INNER JOIN b ON a.b_id = b.id LEFT OUTER JOIN c ON b.id = c.id;
+             QUERY PLAN             
+------------------------------------
+ Aggregate
+   ->  Seq Scan on a
+         Filter: (b_id IS NOT NULL)
+(3 rows)
+
+-- this should not remove join to b
+EXPLAIN (COSTS OFF)
+SELECT a.* FROM a INNER JOIN b on a.b_id = b.id WHERE b.val = b.id;
+            QUERY PLAN            
+----------------------------------
+ Hash Join
+   Hash Cond: (a.b_id = b.id)
+   ->  Seq Scan on a
+   ->  Hash
+         ->  Seq Scan on b
+               Filter: (id = val)
+(6 rows)
+
+-- this should not remove the join, no foreign key exists between a.id and b.id
+EXPLAIN (COSTS OFF)
+SELECT a.* FROM a INNER JOIN b ON a.id = b.id;
+         QUERY PLAN         
+----------------------------
+ Hash Join
+   Hash Cond: (a.id = b.id)
+   ->  Seq Scan on a
+   ->  Hash
+         ->  Seq Scan on b
+(5 rows)
+
+ALTER TABLE a ALTER COLUMN b_id SET NOT NULL;
+-- Ensure the join gets removed, but an IS NOT NULL qual is not added for b_id
+EXPLAIN (COSTS OFF)
+SELECT a.* FROM a INNER JOIN b ON a.b_id = b.id;
+  QUERY PLAN   
+---------------
+ Seq Scan on a
+(1 row)
+
+ROLLBACK;
+BEGIN;
+-- inner join removal code with 2 column foreign keys
+CREATE TEMP TABLE b (id1 INT NOT NULL, id2 INT NOT NULL, PRIMARY KEY(id1,id2));
+CREATE TEMP TABLE a (id INT NOT NULL PRIMARY KEY, b_id1 INT, b_id2 INT);
+ALTER TABLE a ADD CONSTRAINT a_b_id1_b_id2_fkey FOREIGN KEY (b_id1,b_id2) REFERENCES b(id1,id2) MATCH SIMPLE;
+-- this should remove inner join to b
+EXPLAIN (COSTS OFF)
+SELECT a.id FROM a INNER JOIN b ON a.b_id1 = b.id1 AND a.b_id2 = b.id2;
+                       QUERY PLAN                        
+---------------------------------------------------------
+ Seq Scan on a
+   Filter: ((b_id1 IS NOT NULL) AND (b_id2 IS NOT NULL))
+(2 rows)
+
+-- should not remove inner join to b (extra condition)
+EXPLAIN (COSTS OFF)
+SELECT a.id FROM a INNER JOIN b ON a.b_id1 = b.id1 AND a.b_id2 = b.id2 AND a.b_id2 >= b.id2;
+                       QUERY PLAN                        
+---------------------------------------------------------
+ Merge Join
+   Merge Cond: ((b.id1 = a.b_id1) AND (b.id2 = a.b_id2))
+   Join Filter: (a.b_id2 >= b.id2)
+   ->  Index Only Scan using b_pkey on b
+   ->  Sort
+         Sort Key: a.b_id1, a.b_id2
+         ->  Seq Scan on a
+(7 rows)
+
+-- should not remove inner join to b (wrong operator)
+EXPLAIN (COSTS OFF)
+SELECT a.id FROM a INNER JOIN b ON a.b_id1 > b.id1 AND a.b_id2 < b.id2;
+                        QUERY PLAN                         
+-----------------------------------------------------------
+ Nested Loop
+   ->  Seq Scan on a
+   ->  Index Only Scan using b_pkey on b
+         Index Cond: ((id1 < a.b_id1) AND (id2 > a.b_id2))
+(4 rows)
+
+-- should not remove inner join (only checking id1)
+EXPLAIN (COSTS OFF)
+SELECT a.id FROM a INNER JOIN b ON a.b_id1 = b.id1;
+               QUERY PLAN                
+-----------------------------------------
+ Merge Join
+   Merge Cond: (b.id1 = a.b_id1)
+   ->  Index Only Scan using b_pkey on b
+   ->  Sort
+         Sort Key: a.b_id1
+         ->  Seq Scan on a
+(6 rows)
+
+-- should not remove inner join (checking wrong columns)
+EXPLAIN (COSTS OFF)
+SELECT a.id FROM a INNER JOIN b ON a.b_id2 = b.id1 AND a.b_id1 = b.id2;
+                       QUERY PLAN                        
+---------------------------------------------------------
+ Merge Join
+   Merge Cond: ((b.id1 = a.b_id2) AND (b.id2 = a.b_id1))
+   ->  Index Only Scan using b_pkey on b
+   ->  Sort
+         Sort Key: a.b_id2, a.b_id1
+         ->  Seq Scan on a
+(6 rows)
+
+-- should not remove inner join (no check for id2)
+EXPLAIN (COSTS OFF)
+SELECT a.id FROM a INNER JOIN b ON a.b_id2 = b.id1 AND a.b_id1 = b.id1;
+              QUERY PLAN               
+---------------------------------------
+ Hash Join
+   Hash Cond: (b.id1 = a.b_id2)
+   ->  Seq Scan on b
+   ->  Hash
+         ->  Seq Scan on a
+               Filter: (b_id2 = b_id1)
+(6 rows)
+
+-- should not remove inner join (no check for b_id2)
+EXPLAIN (COSTS OFF)
+SELECT a.id FROM a INNER JOIN b ON a.b_id1 = b.id1 AND a.b_id1 = b.id2;
+            QUERY PLAN             
+-----------------------------------
+ Hash Join
+   Hash Cond: (a.b_id1 = b.id1)
+   ->  Seq Scan on a
+   ->  Hash
+         ->  Seq Scan on b
+               Filter: (id1 = id2)
+(6 rows)
+
+-- should not remove inner join (no check for b_id2)
+EXPLAIN (COSTS OFF)
+SELECT id FROM a INNER JOIN b ON a.b_id1 = b.id1 AND a.b_id1 = b.id1;
+               QUERY PLAN                
+-----------------------------------------
+ Merge Join
+   Merge Cond: (b.id1 = a.b_id1)
+   ->  Index Only Scan using b_pkey on b
+   ->  Sort
+         Sort Key: a.b_id1
+         ->  Seq Scan on a
+(6 rows)
+
+ROLLBACK;
+-- In this test we want to ensure that INNER JOIN removal does not
+-- occur when there are pending foreign key triggers.
+-- We test this by updating a relation which is referenced by a foreign key
+-- and then executing another query which would normally allow the inner
+-- join to be removed.
+CREATE TABLE j2 (id INT NOT NULL PRIMARY KEY);
+CREATE TABLE j1 (
+  id INT PRIMARY KEY,
+  j2_id INT NOT NULL REFERENCES j2 (id) MATCH FULL ON DELETE CASCADE ON UPDATE CASCADE
+);
+INSERT INTO j2 VALUES(10),(20);
+INSERT INTO j1 VALUES(1,10),(2,20);
+-- create a table to store records that 'violate' the fkey.
+CREATE TABLE results (j2_id INT NOT NULL);
+CREATE OR REPLACE FUNCTION j1_update() RETURNS TRIGGER AS $$
+BEGIN
+  INSERT INTO results SELECT j2_id FROM j1 INNER JOIN j2 ON j1.j2_id = j2.id;
+  RETURN NEW;
+  END;
+$$ LANGUAGE plpgsql;
+CREATE TRIGGER j1_update_trigger BEFORE UPDATE ON j2 FOR EACH ROW EXECUTE PROCEDURE j1_update();
+UPDATE j2 SET id = id + 1;
+-- results should only contain 3 records. If we blindly removed the join despite the
+-- foreign key not having updated the referenced records yet, we'd get 4 rows in results.
+SELECT * FROM results;
+ j2_id 
+-------
+    10
+    20
+    20
+(3 rows)
+
+DROP TABLE j1;
+DROP TABLE j2;
+DROP TABLE results;
+DROP FUNCTION j1_update();
 create temp table parent (k int primary key, pd int);
 create temp table child (k int unique, cd int);
 insert into parent values (1, 10), (2, 20), (3, 30);
diff --git a/src/test/regress/sql/join.sql b/src/test/regress/sql/join.sql
index 718e1d9..b94f12d 100644
--- a/src/test/regress/sql/join.sql
+++ b/src/test/regress/sql/join.sql
@@ -977,6 +977,140 @@ select i8.* from int8_tbl i8 left join (select f1 from int4_tbl group by f1) i4
 
 rollback;
 
+BEGIN;
+
+-- Test join removals for inner joins
+CREATE TEMP TABLE c (id INT NOT NULL PRIMARY KEY);
+CREATE TEMP TABLE b (id INT NOT NULL PRIMARY KEY, c_id INT NOT NULL REFERENCES c(id), val INT);
+CREATE TEMP TABLE a (id INT NOT NULL PRIMARY KEY, b_id INT REFERENCES b(id));
+
+-- this should remove inner join to b
+EXPLAIN (COSTS OFF)
+SELECT a.* FROM a INNER JOIN b ON a.b_id = b.id;
+
+-- this should remove inner join to b and c
+EXPLAIN (COSTS OFF)
+SELECT a.* FROM a INNER JOIN b ON a.b_id = b.id INNER JOIN c ON b.c_id = c.id;
+
+-- this should generate the same plan as above.
+EXPLAIN (COSTS OFF)
+SELECT a.* FROM c INNER JOIN b ON c.id = b.c_id INNER JOIN a ON a.b_id = b.id;
+
+-- inner join can't be removed due to b columns in the target list
+EXPLAIN (COSTS OFF)
+SELECT * FROM a INNER JOIN b ON a.b_id = b.id;
+
+-- this should not remove inner join to b due to quals restricting results from b
+EXPLAIN (COSTS OFF)
+SELECT a.* FROM a INNER JOIN b on a.b_id = b.id WHERE b.val = 10;
+
+-- this should remove joins to b and c.
+EXPLAIN (COSTS OFF)
+SELECT COUNT(*) FROM a INNER JOIN b ON a.b_id = b.id LEFT OUTER JOIN c ON a.id = c.id;
+
+-- this should remove joins to b and c, however it b will only be removed on
+-- 2nd attempt after c is removed by the left join removal code.
+EXPLAIN (COSTS OFF)
+SELECT COUNT(*) FROM a INNER JOIN b ON a.b_id = b.id LEFT OUTER JOIN c ON b.id = c.id;
+
+-- this should not remove join to b
+EXPLAIN (COSTS OFF)
+SELECT a.* FROM a INNER JOIN b on a.b_id = b.id WHERE b.val = b.id;
+
+-- this should not remove the join, no foreign key exists between a.id and b.id
+EXPLAIN (COSTS OFF)
+SELECT a.* FROM a INNER JOIN b ON a.id = b.id;
+
+ALTER TABLE a ALTER COLUMN b_id SET NOT NULL;
+
+-- Ensure the join gets removed, but an IS NOT NULL qual is not added for b_id
+EXPLAIN (COSTS OFF)
+SELECT a.* FROM a INNER JOIN b ON a.b_id = b.id;
+
+ROLLBACK;
+
+BEGIN;
+
+-- inner join removal code with 2 column foreign keys
+
+CREATE TEMP TABLE b (id1 INT NOT NULL, id2 INT NOT NULL, PRIMARY KEY(id1,id2));
+CREATE TEMP TABLE a (id INT NOT NULL PRIMARY KEY, b_id1 INT, b_id2 INT);
+
+ALTER TABLE a ADD CONSTRAINT a_b_id1_b_id2_fkey FOREIGN KEY (b_id1,b_id2) REFERENCES b(id1,id2) MATCH SIMPLE;
+
+-- this should remove inner join to b
+EXPLAIN (COSTS OFF)
+SELECT a.id FROM a INNER JOIN b ON a.b_id1 = b.id1 AND a.b_id2 = b.id2;
+
+-- should not remove inner join to b (extra condition)
+EXPLAIN (COSTS OFF)
+SELECT a.id FROM a INNER JOIN b ON a.b_id1 = b.id1 AND a.b_id2 = b.id2 AND a.b_id2 >= b.id2;
+
+-- should not remove inner join to b (wrong operator)
+EXPLAIN (COSTS OFF)
+SELECT a.id FROM a INNER JOIN b ON a.b_id1 > b.id1 AND a.b_id2 < b.id2;
+
+-- should not remove inner join (only checking id1)
+EXPLAIN (COSTS OFF)
+SELECT a.id FROM a INNER JOIN b ON a.b_id1 = b.id1;
+
+-- should not remove inner join (checking wrong columns)
+EXPLAIN (COSTS OFF)
+SELECT a.id FROM a INNER JOIN b ON a.b_id2 = b.id1 AND a.b_id1 = b.id2;
+
+-- should not remove inner join (no check for id2)
+EXPLAIN (COSTS OFF)
+SELECT a.id FROM a INNER JOIN b ON a.b_id2 = b.id1 AND a.b_id1 = b.id1;
+
+-- should not remove inner join (no check for b_id2)
+EXPLAIN (COSTS OFF)
+SELECT a.id FROM a INNER JOIN b ON a.b_id1 = b.id1 AND a.b_id1 = b.id2;
+
+-- should not remove inner join (no check for b_id2)
+EXPLAIN (COSTS OFF)
+SELECT id FROM a INNER JOIN b ON a.b_id1 = b.id1 AND a.b_id1 = b.id1;
+
+ROLLBACK;
+
+-- In this test we want to ensure that INNER JOIN removal does not
+-- occur when there are pending foreign key triggers.
+-- We test this by updating a relation which is referenced by a foreign key
+-- and then executing another query which would normally allow the inner
+-- join to be removed.
+
+CREATE TABLE j2 (id INT NOT NULL PRIMARY KEY);
+CREATE TABLE j1 (
+  id INT PRIMARY KEY,
+  j2_id INT NOT NULL REFERENCES j2 (id) MATCH FULL ON DELETE CASCADE ON UPDATE CASCADE
+);
+
+INSERT INTO j2 VALUES(10),(20);
+INSERT INTO j1 VALUES(1,10),(2,20);
+
+-- create a table to store records that 'violate' the fkey.
+CREATE TABLE results (j2_id INT NOT NULL);
+
+CREATE OR REPLACE FUNCTION j1_update() RETURNS TRIGGER AS $$
+BEGIN
+  INSERT INTO results SELECT j2_id FROM j1 INNER JOIN j2 ON j1.j2_id = j2.id;
+  RETURN NEW;
+  END;
+$$ LANGUAGE plpgsql;
+
+CREATE TRIGGER j1_update_trigger BEFORE UPDATE ON j2 FOR EACH ROW EXECUTE PROCEDURE j1_update();
+
+UPDATE j2 SET id = id + 1;
+
+-- results should only contain 3 records. If we blindly removed the join despite the
+-- foreign key not having updated the referenced records yet, we'd get 4 rows in results.
+SELECT * FROM results;
+
+DROP TABLE j1;
+DROP TABLE j2;
+DROP TABLE results;
+DROP FUNCTION j1_update();
+
+
 create temp table parent (k int primary key, pd int);
 create temp table child (k int unique, cd int);
 insert into parent values (1, 10), (2, 20), (3, 30);
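
As a standalone illustration of the matching rule in expressions_match_foreign_key() from the patch above, here is a minimal sketch in plain C. It is not the patch's code: a uint64_t bitmask stands in for Bitmapset, and JoinQual, ToyForeignKey and quals_match_foreign_key are hypothetical names invented for this example. It assumes at most 63 join quals and models the operator check as a simple "is this an equality" flag.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for one join qual: referencing col <op> referenced col */
typedef struct JoinQual
{
	int		fk_attno;		/* column on the referencing side */
	int		pk_attno;		/* column on the referenced side */
	bool	is_equality;	/* true if the operator is a compatible "=" */
} JoinQual;

/* Hypothetical, cut-down stand-in for the patch's ForeignKeyInfo */
typedef struct ToyForeignKey
{
	int			ncols;		/* number of columns in the foreign key */
	const int  *conkey;		/* referencing columns */
	const int  *confkey;	/* referenced columns */
} ToyForeignKey;

/*
 * True if every foreign key column is covered by at least one qual, and
 * every qual matches some foreign key column. Duplicate quals are allowed.
 */
bool
quals_match_foreign_key(const ToyForeignKey *fk, const JoinQual *quals,
						int nquals)
{
	uint64_t	allitems;
	uint64_t	matcheditems = 0;
	int			col;
	int			i;

	/* need at least one qual per column; 63 is this sketch's own limit */
	if (nquals <= 0 || nquals < fk->ncols || nquals > 63)
		return false;

	/* one bit per 0 based qual index */
	allitems = (UINT64_C(1) << nquals) - 1;

	for (col = 0; col < fk->ncols; col++)
	{
		bool		matched = false;

		for (i = 0; i < nquals; i++)
		{
			if (quals[i].is_equality &&
				quals[i].fk_attno == fk->conkey[col] &&
				quals[i].pk_attno == fk->confkey[col])
			{
				matched = true;
				/* keep scanning: duplicates must be marked as well */
				matcheditems |= UINT64_C(1) << i;
			}
		}

		/* this foreign key column has no matching qual */
		if (!matched)
			return false;
	}

	/* any unmatched qual is an extra condition, so the join must stay */
	return matcheditems == allitems;
}
```

Under these assumptions the sketch accepts an exact match and a match with a duplicated qual, and rejects an extra non-equality qual, a partial match and swapped columns, which mirrors the 2 column foreign key cases in the regression tests above.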

#10

Robert Haas

robertmhaas@gmail.com

over 11 years ago

In reply to: David Rowley (#9)

Re: Patch to support SEMI and ANTI join removal

On Thu, Sep 11, 2014 at 7:14 AM, David Rowley <dgrowleyml@gmail.com> wrote:

Here's a quick demo, of the patch at work:

test=# create table c (id int primary key);
CREATE TABLE
test=# create table b (id int primary key, c_id int not null references c(id));
CREATE TABLE
test=# create table a (id int primary key, b_id int not null references b(id));
CREATE TABLE
test=#
test=# explain select a.* from a inner join b on a.b_id = b.id inner join c on b.c_id = c.id;
QUERY PLAN
-----------------------------------------------------
Seq Scan on a (cost=0.00..31.40 rows=2140 width=8)
Planning time: 1.061 ms
(2 rows)

That is just awesome. You are my new hero.

1. I don't think that I'm currently handling removing eclass members
properly. So far the code just removes the Vars that belong to the relation
being removed. I likely should also be doing bms_del_member(ec->ec_relids,
relid); on the eclass, but perhaps I should just be marking the whole class
as "ec_broken = true" and adding another eclass with everything that the
broken one has minus the parts from the removed relation?

I haven't read the patch, but I think the question is why this needs
to be different than what we do for left join removal.

Assume there's a foreign key a (x) references b(x)

SELECT a.* FROM a INNER JOIN b ON a.x = b.x WHERE b.x = 1

relation b should be removable because an eclass will contain {a.x, b.x} and
therefore a baserestrictinfo for a.x = 1 should also exist on relation a.
Therefore removing relation b should produce equivalent results, i.e.
everything that gets filtered out on relation b will also be filtered out on
relation a anyway.

I think the patch without this is still worth it, but if someone feels
strongly about it I'll take a bash at supporting it.

That'd be nice to fix, but IMHO not essential.
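
A rough sketch of the implied-equality argument above, assuming a much-simplified model of an equivalence class. The ImpliedEc* types and implied_const_restriction() are hypothetical names for illustration only, not planner APIs. The idea shown is just that when a class such as {a.x, b.x, 1} contains a constant, every member relation effectively gets its own "col = const" restriction, so whatever is filtered on b is also filtered on a.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for an equivalence class member. */
typedef struct ImpliedEcMember
{
	int			relid;			/* which relation the column belongs to */
	int			attno;			/* column number within that relation */
} ImpliedEcMember;

/* Hypothetical, simplified stand-in for an equivalence class. */
typedef struct ImpliedEc
{
	const ImpliedEcMember *members;
	size_t		nmembers;
	bool		has_const;		/* does the class also contain a constant? */
	int			const_value;	/* the constant, if has_const */
} ImpliedEc;

/*
 * If the class holds a constant, each member column is implicitly equal to
 * it.  Returns true and fills *attno / *value when relation "relid" has a
 * member in the class, i.e. when "relid.attno = value" is implied.
 */
bool
implied_const_restriction(const ImpliedEc *ec, int relid,
						  int *attno, int *value)
{
	size_t		i;

	assert(ec != NULL && attno != NULL && value != NULL);

	if (!ec->has_const)
		return false;

	for (i = 0; i < ec->nmembers; i++)
	{
		if (ec->members[i].relid == relid)
		{
			*attno = ec->members[i].attno;
			*value = ec->const_value;
			return true;
		}
	}
	return false;
}
```

With members {a.x, b.x} and the constant 1, asking about relation a reports an implied restriction on a.x = 1, which is why the filter on b would not be lost by dropping b in this toy model.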

3. Currently the inner join support does not allow removals using foreign
keys which contain duplicate columns on the referencing side. e.g (a,a)
references (x,y), this is basically because of the point I made in item 2.
In this case a baserestrictinfo would exist on the referenced relation to
say WHERE x = y.

I think it's fine to not bother with this case. Who cares?
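
To make the duplicate-column restriction concrete, here is a minimal sketch, assuming a toy model in which an equivalence class is just an array of (relid, attno) pairs. PairMember, PairResult and pick_join_pair() are hypothetical names, not taken from the patch. It follows the rule described above: pick at most one column from each side of the join, and give up if either side contributes more than one.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified equivalence class member: a column of a rel. */
typedef struct PairMember
{
	int			relid;
	int			attno;
} PairMember;

typedef enum PairResult
{
	PAIR_NONE,					/* class does not link the two rels */
	PAIR_FOUND,					/* exactly one column found on each side */
	PAIR_AMBIGUOUS				/* more than one column from the same rel */
} PairResult;

/*
 * Scan the members of one class looking for a single column from the
 * referencing rel and a single column from the referenced rel.
 */
PairResult
pick_join_pair(const PairMember *members, size_t nmembers,
			   int ref_relid, int idx_relid,
			   int *ref_attno, int *idx_attno)
{
	int			have_ref = 0;
	int			have_idx = 0;
	size_t		i;

	assert(members != NULL || nmembers == 0);
	assert(ref_attno != NULL && idx_attno != NULL);

	for (i = 0; i < nmembers; i++)
	{
		if (members[i].relid == ref_relid)
		{
			if (have_ref)
				return PAIR_AMBIGUOUS;	/* second column from same rel */
			have_ref = 1;
			*ref_attno = members[i].attno;
		}
		else if (members[i].relid == idx_relid)
		{
			if (have_idx)
				return PAIR_AMBIGUOUS;
			have_idx = 1;
			*idx_attno = members[i].attno;
		}
	}
	return (have_ref && have_idx) ? PAIR_FOUND : PAIR_NONE;
}
```

For a key like (a,a) references (x,y), the join conditions put a.a, b.x and b.y into one class, so the referenced side contributes two columns and the sketch reports PAIR_AMBIGUOUS, which corresponds to refusing the removal.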

4. The patch currently only allows removals for eclass join types. If the
rel has any joininfo items, then the join removal is disallowed. From what I
can see equality type inner join conditions get described in eclasses, and
only non-equality join conditions make it into the joininfo list, and since
foreign keys only support equality operators, then I thought this was a
valid restriction, however, if someone can show me a flaw in my assumption
then I may need to improve this.

Seems OK.

5. I've added a flag to pg_class called relhasfkey. Currently this gets set
to true when a foreign key is added, though I've added nothing to set it
back to false again. I notice that relhasindex gets set back to false during
vacuum, if vacuum happens to find there to not be any indexes on the rel. I
didn't put my logic here as I wasn't too sure if scanning pg_constraint
during a vacuum seemed very correct, so I just left out the "setting it to
false" logic based on the fact that I noticed that relhaspkey gets away
with quite lazy setting back to false logic (only when there's no indexes of
any kind left on the relation at all).

The alternative to resetting the flag somehow is not having it in the
first place. Would that be terribly expensive?

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#11

Tom Lane

tgl@sss.pgh.pa.us

over 11 years ago

In reply to: Robert Haas (#10)

Re: Patch to support SEMI and ANTI join removal

Robert Haas <robertmhaas@gmail.com> writes:

On Thu, Sep 11, 2014 at 7:14 AM, David Rowley <dgrowleyml@gmail.com> wrote:

5. I've added a flag to pg_class called relhasfkey. Currently this gets set
to true when a foreign key is added, though I've added nothing to set it
back to false again. I notice that relhasindex gets set back to false during
vacuum, if vacuum happens to find there to not be any indexes on the rel. I
didn't put my logic here as I wasn't too sure if scanning pg_constraint
during a vacuum seemed very correct, so I just left out the "setting it to
false" logic based on the fact that I noticed that relhaspkey gets away
with quite lazy setting back to false logic (only when there's no indexes of
any kind left on the relation at all).

The alternative to resetting the flag somehow is not having it in the
first place. Would that be terribly expensive?

The behavior of relhaspkey is a legacy thing that we've tolerated only
because nothing whatsoever in the backend depends on it at all. I'm not
eager to add more equally-ill-defined pg_class columns.

regards, tom lane


#12

David Rowley

dgrowleyml@gmail.com

over 11 years ago

In reply to: Robert Haas (#10)

Re: Patch to support SEMI and ANTI join removal

On Fri, Sep 12, 2014 at 3:35 AM, Robert Haas <robertmhaas@gmail.com> wrote:

On Thu, Sep 11, 2014 at 7:14 AM, David Rowley <dgrowleyml@gmail.com> wrote:

1. I don't think that I'm currently handling removing eclass members
properly. So far the code just removes the Vars that belong to the relation
being removed. I likely should also be doing bms_del_member(ec->ec_relids,
relid); on the eclass, but perhaps I should just be marking the whole class
as "ec_broken = true" and adding another eclass with everything that the
broken one has minus the parts from the removed relation?

I haven't read the patch, but I think the question is why this needs
to be different than what we do for left join removal.

I discovered over here ->
/messages/by-id/CAApHDvo5wCRk7uHBuMHJaDpbW-b_oGKQOuisCZzHC25=H3__fA@mail.gmail.com
during the early days of the semi and anti join removal code that the
planner was trying to generate paths to the dead rel. I managed to track
the problem down to eclass members still existing for the dead rel. I guess
we must not have eclass members for outer rels? or we'd likely have seen
some troubles with left join removals already.

In the meantime I'll fix up the inner join removal code to properly delete
the ec_relids member for the dead rel. I guess probably the restrict info
should come out too.

I know it's late in the commitfest, but since there was next to no interest
in semi and anti join removals, can I rename the patch in the commitfest
app to be "Inner join removals"? It's either that or I'd mark that patch as
rejected and submit this one in October.

Regards

David Rowley

#13

David Rowley

dgrowleyml@gmail.com

over 11 years ago

In reply to: Tom Lane (#11)

Re: Patch to support SEMI and ANTI join removal

On Fri, Sep 12, 2014 at 3:47 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Robert Haas <robertmhaas@gmail.com> writes:

On Thu, Sep 11, 2014 at 7:14 AM, David Rowley <dgrowleyml@gmail.com> wrote:

5. I've added a flag to pg_class called relhasfkey. Currently this gets set
to true when a foreign key is added, though I've added nothing to set it
back to false again. I notice that relhasindex gets set back to false during
vacuum, if vacuum happens to find there to not be any indexes on the rel. I
didn't put my logic here as I wasn't too sure if scanning pg_constraint
during a vacuum seemed very correct, so I just left out the "setting it to
false" logic based on the fact that I noticed that relhaspkey gets away
with quite lazy setting back to false logic (only when there's no indexes of
any kind left on the relation at all).

The alternative to resetting the flag somehow is not having it in the
first place. Would that be terribly expensive?

I'd imagine not really expensive. I guess I just thought that it would be a
good idea to avoid having to look in pg_constraint for foreign keys when
none exist. The scan uses pg_constraint_conrelid_index, so it would only
ever see the constraints for the rel being cached/loaded.

The behavior of relhaspkey is a legacy thing that we've tolerated only
because nothing whatsoever in the backend depends on it at all. I'm not
eager to add more equally-ill-defined pg_class columns.

I guess it's certainly not required. It would be easier to add it later if
we decided it was a good idea, rather than having to keep it forever and a
day if it's next to useless.

I'll remove it from the patch.

Regards

David Rowley

#14

Tom Lane

tgl@sss.pgh.pa.us

over 11 years ago

In reply to: David Rowley (#12)

Re: Patch to support SEMI and ANTI join removal

David Rowley <dgrowleyml@gmail.com> writes:

On Fri, Sep 12, 2014 at 3:35 AM, Robert Haas <robertmhaas@gmail.com> wrote:

I haven't read the patch, but I think the question is why this needs
to be different than what we do for left join removal.

I discovered over here ->
/messages/by-id/CAApHDvo5wCRk7uHBuMHJaDpbW-b_oGKQOuisCZzHC25=H3__fA@mail.gmail.com
during the early days of the semi and anti join removal code that the
planner was trying to generate paths to the dead rel. I managed to track
the problem down to eclass members still existing for the dead rel. I guess
we must not have eclass members for outer rels? or we'd likely have seen
some troubles with left join removals already.

Mere existence of an eclass entry ought not cause paths to get built.
It'd be worth looking a bit harder into what's happening there.

regards, tom lane


#15

David Rowley

dgrowleyml@gmail.com

over 11 years ago

In reply to: Tom Lane (#14)

1 attachment(s)

Re: Patch to support SEMI and ANTI join removal

On Sat, Sep 13, 2014 at 1:38 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

David Rowley <dgrowleyml@gmail.com> writes:

On Fri, Sep 12, 2014 at 3:35 AM, Robert Haas <robertmhaas@gmail.com> wrote:

I haven't read the patch, but I think the question is why this needs
to be different than what we do for left join removal.

I discovered over here ->
/messages/by-id/CAApHDvo5wCRk7uHBuMHJaDpbW-b_oGKQOuisCZzHC25=H3__fA@mail.gmail.com
during the early days of the semi and anti join removal code that the
planner was trying to generate paths to the dead rel. I managed to track
the problem down to eclass members still existing for the dead rel. I guess
we must not have eclass members for outer rels? or we'd likely have seen
some troubles with left join removals already.

Mere existence of an eclass entry ought not cause paths to get built.
It'd be worth looking a bit harder into what's happening there.

It took me a bit of time to create this problem again as I didn't record
the actual query where I hit the assert failure the first time. Though, now
I have managed to recreate the problem again by removing the code that I
had added which removes eclass members for dead rels.

Using the attached patch, the failing test case is:

create table b (id int primary key);
create table a (id int primary key, b_id int references b);
create index on a (b_id); -- add index to create alternative path

explain select a.* from a inner join b on b.id=a.b_id;

What seems to be happening is that generate_implied_equalities_for_column
generates a RestrictInfo for the dead rel due to the eclass member still
existing. This new rinfo gets matched to the index by
match_clauses_to_index().

The code then later fails in get_loop_count:

TRAP: FailedAssertion("!(outer_rel->rows > 0)", File: "src\backend\optimizer\path\indxpath.c", Line: 1861)

The call stack looks like:

postgres.exe!get_loop_count(PlannerInfo * root, Bitmapset * outer_relids) Line 1861 C
postgres.exe!build_index_paths(PlannerInfo * root, RelOptInfo * rel, IndexOptInfo * index, IndexClauseSet * clauses, char useful_predicate, SaOpControl saop_control, ScanTypeControl scantype) Line 938 C
postgres.exe!get_index_paths(PlannerInfo * root, RelOptInfo * rel, IndexOptInfo * index, IndexClauseSet * clauses, List * * bitindexpaths) Line 745 C
postgres.exe!get_join_index_paths(PlannerInfo * root, RelOptInfo * rel, IndexOptInfo * index, IndexClauseSet * rclauseset, IndexClauseSet * jclauseset, IndexClauseSet * eclauseset, List * * bitindexpaths, Bitmapset * relids, List * * considered_relids) Line 672 C
postgres.exe!consider_index_join_outer_rels(PlannerInfo * root, RelOptInfo * rel, IndexOptInfo * index, IndexClauseSet * rclauseset, IndexClauseSet * jclauseset, IndexClauseSet * eclauseset, List * * bitindexpaths, List * indexjoinclauses, int considered_clauses, List * * considered_relids) Line 585 C
postgres.exe!consider_index_join_clauses(PlannerInfo * root, RelOptInfo * rel, IndexOptInfo * index, IndexClauseSet * rclauseset, IndexClauseSet * jclauseset, IndexClauseSet * eclauseset, List * * bitindexpaths) Line 485 C
postgres.exe!create_index_paths(PlannerInfo * root, RelOptInfo * rel) Line 308 C
postgres.exe!set_plain_rel_pathlist(PlannerInfo * root, RelOptInfo * rel, RangeTblEntry * rte) Line 403 C
postgres.exe!set_rel_pathlist(PlannerInfo * root, RelOptInfo * rel, unsigned int rti, RangeTblEntry * rte) Line 337 C
postgres.exe!set_base_rel_pathlists(PlannerInfo * root) Line 223 C
postgres.exe!make_one_rel(PlannerInfo * root, List * joinlist) Line 157 C
postgres.exe!query_planner(PlannerInfo * root, List * tlist, void (PlannerInfo *, void *) * qp_callback, void * qp_extra) Line 236 C
postgres.exe!grouping_planner(PlannerInfo * root, double tuple_fraction) Line 1289 C
postgres.exe!subquery_planner(PlannerGlobal * glob, Query * parse, PlannerInfo * parent_root, char hasRecursion, double tuple_fraction, PlannerInfo * * subroot) Line 573 C
postgres.exe!standard_planner(Query * parse, int cursorOptions, ParamListInfoData * boundParams) Line 211 C
postgres.exe!planner(Query * parse, int cursorOptions, ParamListInfoData * boundParams) Line 139 C
postgres.exe!pg_plan_query(Query * querytree, int cursorOptions, ParamListInfoData * boundParams) Line 750 C
postgres.exe!ExplainOneQuery(Query * query, IntoClause * into, ExplainState * es, const char * queryString, ParamListInfoData * params) Line 330 C
postgres.exe!ExplainQuery(ExplainStmt * stmt, const char * queryString, ParamListInfoData * params, _DestReceiver * dest) Line 231 C
postgres.exe!standard_ProcessUtility(Node * parsetree, const char * queryString, ProcessUtilityContext context, ParamListInfoData * params, _DestReceiver * dest, char * completionTag) Line 647 C
postgres.exe!ProcessUtility(Node * parsetree, const char * queryString, ProcessUtilityContext context, ParamListInfoData * params, _DestReceiver * dest, char * completionTag) Line 314 C
postgres.exe!PortalRunUtility(PortalData * portal, Node * utilityStmt, char isTopLevel, _DestReceiver * dest, char * completionTag) Line 1195 C
postgres.exe!FillPortalStore(PortalData * portal, char isTopLevel) Line 1063 C
postgres.exe!PortalRun(PortalData * portal, long count, char isTopLevel, _DestReceiver * dest, _DestReceiver * altdest, char * completionTag) Line 790 C
postgres.exe!exec_simple_query(const char * query_string) Line 1052 C
postgres.exe!PostgresMain(int argc, char * * argv, const char * dbname, const char * username) Line 4012 C
postgres.exe!BackendRun(Port * port) Line 4113 C
postgres.exe!SubPostmasterMain(int argc, char * * argv) Line 4618 C
postgres.exe!main(int argc, char * * argv) Line 207 C

So my solution to this was just to add the code that strips out the eclass
members that belong to the newly dead rel in join removal.
The only other solution I can think of would be to add a bitmap set of dead
rels to the PlannerInfo struct, or perhaps to just generate one and pass it
as prohibited_rels to generate_implied_equalities_for_column(). I don't
really care for this solution very much, as it seems better to make the join
removal code pay for this extra processing rather than (probably) most
queries.

Of course this is my problem as I'm unable to create the same situation
with the existing left join removals. The point here is more to justify why
I added code to strip eclass members of dead rels.
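
As a rough illustration of what "stripping eclass members of a dead rel" amounts to, here is a minimal sketch under the assumption of a toy equivalence class with a fixed-size member array and a plain bitmask in place of a Bitmapset. DeadRelEc, DeadRelMember and strip_dead_rel() are hypothetical names and this is not the code in the attached patch.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified equivalence class member. */
typedef struct DeadRelMember
{
	int			relid;
	int			attno;
} DeadRelMember;

/* Hypothetical, simplified equivalence class. */
typedef struct DeadRelEc
{
	DeadRelMember members[8];
	size_t		nmembers;
	unsigned int relids;		/* bit i set => rel i has a member here */
} DeadRelEc;

/*
 * Remove every member that belongs to dead_relid, compacting the array in
 * place, and clear that rel's bit in the relids mask (the toy equivalent of
 * bms_del_member on ec_relids).  Returns the number of members removed.
 */
size_t
strip_dead_rel(DeadRelEc *ec, int dead_relid)
{
	size_t		kept = 0;
	size_t		removed;
	size_t		i;

	assert(ec != NULL);
	assert(dead_relid >= 0 && dead_relid < 32);

	for (i = 0; i < ec->nmembers; i++)
	{
		if (ec->members[i].relid != dead_relid)
			ec->members[kept++] = ec->members[i];
	}

	removed = ec->nmembers - kept;
	ec->nmembers = kept;
	ec->relids &= ~(1u << dead_relid);
	return removed;
}
```

After the call, nothing that later walks the class can derive a clause mentioning the removed rel, which is the property the failing test case above depends on. The cost is paid only when a join is actually removed.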

Any thoughts? Or arguments against me keeping the code that strips out the
eclass members of dead rels?

Regards

David Rowley

Attachments:

inner_join_removals_2014-09-16_2bcc9ba.patch (application/octet-stream)

diff --git a/src/backend/commands/trigger.c b/src/backend/commands/trigger.c
index 9bf0098..88c8d98 100644
--- a/src/backend/commands/trigger.c
+++ b/src/backend/commands/trigger.c
@@ -3887,6 +3887,17 @@ afterTriggerInvokeEvents(AfterTriggerEventList *events,
 	return all_fired;
 }
 
+/* ----------
+ * AfterTriggerQueueIsEmpty()
+ *
+ *	True if there are no pending triggers in the queue.
+ * ----------
+ */
+bool
+AfterTriggerQueueIsEmpty(void)
+{
+	return (afterTriggers->query_depth == -1);
+}
 
 /* ----------
  * AfterTriggerBeginXact()
diff --git a/src/backend/optimizer/path/equivclass.c b/src/backend/optimizer/path/equivclass.c
index b7aff37..5f6826a 100644
--- a/src/backend/optimizer/path/equivclass.c
+++ b/src/backend/optimizer/path/equivclass.c
@@ -49,8 +49,6 @@ static List *generate_join_implied_equalities_broken(PlannerInfo *root,
 										Relids outer_relids,
 										Relids nominal_inner_relids,
 										AppendRelInfo *inner_appinfo);
-static Oid select_equality_operator(EquivalenceClass *ec,
-						 Oid lefttype, Oid righttype);
 static RestrictInfo *create_join_clause(PlannerInfo *root,
 				   EquivalenceClass *ec, Oid opno,
 				   EquivalenceMember *leftem,
@@ -1281,7 +1279,7 @@ generate_join_implied_equalities_broken(PlannerInfo *root,
  *
  * Returns InvalidOid if no operator can be found for this datatype combination
  */
-static Oid
+Oid
 select_equality_operator(EquivalenceClass *ec, Oid lefttype, Oid righttype)
 {
 	ListCell   *lc;
diff --git a/src/backend/optimizer/plan/analyzejoins.c b/src/backend/optimizer/plan/analyzejoins.c
index 773f8a4..5e31fd5 100644
--- a/src/backend/optimizer/plan/analyzejoins.c
+++ b/src/backend/optimizer/plan/analyzejoins.c
@@ -22,23 +22,38 @@
  */
 #include "postgres.h"
 
+#include "commands/trigger.h"
+#include "nodes/makefuncs.h"
 #include "nodes/nodeFuncs.h"
+#include "nodes/relation.h"
 #include "optimizer/clauses.h"
 #include "optimizer/joininfo.h"
 #include "optimizer/pathnode.h"
 #include "optimizer/paths.h"
 #include "optimizer/planmain.h"
+#include "optimizer/restrictinfo.h"
 #include "optimizer/tlist.h"
 #include "utils/lsyscache.h"
 
 /* local functions */
-static bool join_is_removable(PlannerInfo *root, SpecialJoinInfo *sjinfo);
+static bool innerjoin_is_removable(PlannerInfo *root, List *joinlist,
+					  RangeTblRef *removalrtr, RelOptInfo **removerrel,
+					  List **columnlist);
+static bool leftjoin_is_removable(PlannerInfo *root, SpecialJoinInfo *sjinfo);
+static bool relation_is_needed(PlannerInfo *root, Relids joinrelids,
+					  RelOptInfo *rel);
+static void convert_join_to_isnotnull_quals(PlannerInfo *root, RelOptInfo *rel,
+					  List *columnlist);
+static bool relation_has_foreign_key_for(PlannerInfo *root, RelOptInfo *rel,
+					  RelOptInfo *referencedrel, List *referencing_vars,
+					  List *index_vars, List *operator_list);
+static bool expressions_match_foreign_key(ForeignKeyInfo *fk, List *fkvars,
+					  List *indexvars, List *operators);
 static void remove_rel_from_query(PlannerInfo *root, int relid,
 					  Relids joinrelids);
 static List *remove_rel_from_joinlist(List *joinlist, int relid, int *nremoved);
 static Oid	distinct_col_search(int colno, List *colnos, List *opids);
 
-
 /*
  * remove_useless_joins
  *		Check for relations that don't actually need to be joined at all,
@@ -51,21 +66,76 @@ List *
 remove_useless_joins(PlannerInfo *root, List *joinlist)
 {
 	ListCell   *lc;
+	int			nremoved;
 
-	/*
-	 * We are only interested in relations that are left-joined to, so we can
-	 * scan the join_info_list to find them easily.
-	 */
 restart:
+
+	/* start with trying to remove needless inner joins */
+	foreach(lc, joinlist)
+	{
+		RangeTblRef *rtr = (RangeTblRef *) lfirst(lc);
+		RelOptInfo	*rel;
+		RelOptInfo *removerrel;
+		List		*columnlist;
+
+		if (!IsA(rtr, RangeTblRef))
+			continue;
+
+		/* skip if the join can't be removed */
+		if (!innerjoin_is_removable(root, joinlist, rtr, &removerrel, &columnlist))
+			continue;
+
+		rel = find_base_rel(root, rtr->rtindex);
+
+		/*
+		 * If any of the columns on the join condition are NULLable then since
+		 * we've removed the join, there's now a possibility that null valued
+		 * rows could make it into the results. To ensure this does not happen
+		 * we'll add IS NOT NULL quals to the rel that allowed the join to be
+		 * removed, though we need only do this if the columns are actually
+		 * NULLable.
+		 */
+		convert_join_to_isnotnull_quals(root, removerrel, columnlist);
+
+		remove_rel_from_query(root, rtr->rtindex,
+				bms_union(rel->relids, removerrel->relids));
+
+		/* We verify that exactly one reference gets removed from joinlist */
+		nremoved = 0;
+		joinlist = remove_rel_from_joinlist(joinlist, rtr->rtindex, &nremoved);
+		if (nremoved != 1)
+			elog(ERROR, "failed to find relation %d in joinlist", rtr->rtindex);
+
+		/*
+		 * We can delete this RangeTblRef from the list too, since it's no
+		 * longer of interest.
+		 */
+		joinlist = list_delete_ptr(joinlist, rtr);
+
+		/*
+		 * Restart the scan.  This is necessary to ensure we find all removable
+		 * joins independently of their ordering. (note that removal of
+		 * attr_needed bits may make a join, inner or outer, appear removable
+		 * that did not before).   Also, since we just deleted the current list
+		 * cell, we'd have to have some kluge to continue the list scan anyway.
+		 */
+		goto restart;
+	}
+
+	/* now process special joins. Currently only left joins are supported */
 	foreach(lc, root->join_info_list)
 	{
 		SpecialJoinInfo *sjinfo = (SpecialJoinInfo *) lfirst(lc);
 		int			innerrelid;
-		int			nremoved;
 
-		/* Skip if not removable */
-		if (!join_is_removable(root, sjinfo))
-			continue;
+		if (sjinfo->jointype == JOIN_LEFT)
+		{
+			/* Skip if not removable */
+			if (!leftjoin_is_removable(root, sjinfo))
+				continue;
+		}
+		else
+			continue; /* we don't support this join type */
 
 		/*
 		 * Currently, join_is_removable can only succeed when the sjinfo's
@@ -91,12 +161,11 @@ restart:
 		root->join_info_list = list_delete_ptr(root->join_info_list, sjinfo);
 
 		/*
-		 * Restart the scan.  This is necessary to ensure we find all
-		 * removable joins independently of ordering of the join_info_list
-		 * (note that removal of attr_needed bits may make a join appear
-		 * removable that did not before).  Also, since we just deleted the
-		 * current list cell, we'd have to have some kluge to continue the
-		 * list scan anyway.
+		 * Restart the scan.  This is necessary to ensure we find all removable
+		 * joins independently of their ordering. (note that removal of
+		 * attr_needed bits may make a join, inner or outer, appear removable
+		 * that did not before).   Also, since we just deleted the current list
+		 * cell, we'd have to have some kluge to continue the list scan anyway.
 		 */
 		goto restart;
 	}
@@ -136,8 +205,231 @@ clause_sides_match_join(RestrictInfo *rinfo, Relids outerrelids,
 }
 
 /*
- * join_is_removable
- *	  Check whether we need not perform this special join at all, because
+ * innerjoin_is_removable
+ *		True if the join to removalrtr can be removed.
+ *
+ * In order to prove a relation which is inner joined is not required we must
+ * be sure that the join would emit exactly 1 row on the join condition. This
+ * differs from the logic which is used for proving LEFT JOINs can be removed,
+ * where it's possible to just check that a unique index exists on the relation
+ * being removed which has a set of columns that is a subset of the columns
+ * seen in the join condition. With INNER JOINs that's no good as we need to
+ * ensure that we get exactly 1 matching row on the join condition, so here we
+ * use foreign keys to prove that we'd get a 1 to 1 row match on the join
+ * condition.
+ */
+static bool
+innerjoin_is_removable(PlannerInfo *root, List *joinlist,
+					   RangeTblRef *removalrtr, RelOptInfo **removerrel,
+					   List **columnlist)
+{
+	ListCell   *lc;
+	RelOptInfo *removalrel;
+
+	removalrel = find_base_rel(root, removalrtr->rtindex);
+
+	/*
+	 * As foreign keys may only reference base rels which have unique indexes,
+	 * we needn't go any further if we're not dealing with a base rel, or if
+	 * the base rel has no unique indexes. We'd also better abort if the
+	 * rtekind is anything but a relation, as things like sub-queries may have
+	 * grouping or distinct clauses that would cause us not to be able to use
+	 * the foreign key to prove the existence of a row matching the join
+	 * condition. We also abort if the rel has no eclass joins as such a rel
+	 * could well be joined using some operator which is not an equality
+	 * operator, or the rel may not even be inner joined at all.
+	 *
+	 * Here we actually only check if the rel has any indexes, ideally we'd be
+	 * checking for unique indexes, but we could only determine that by looping
+	 * over the indexlist, and this is likely too expensive a check to be worth
+	 * it here.
+	 */
+	if (removalrel->reloptkind != RELOPT_BASEREL ||
+		removalrel->rtekind != RTE_RELATION ||
+		removalrel->has_eclass_joins == false ||
+		removalrel->indexlist == NIL)
+		return false;
+
+	/*
+	 * Currently we disallow the removal if we find any baserestrictinfo items
+	 * on the relation being removed. The reason for this is that these would
+	 * filter out rows and make it so the foreign key cannot prove that we'll
+	 * match exactly 1 row on the join condition. However, this check is
+	 * currently probably a bit overly strict as it should be possible to just
+	 * check and ensure that each Var seen in the baserestrictinfo is also
+	 * present in an eclass and if so, just translate and move the whole
+	 * baserestrictinfo over to the relation which has the foreign key to prove
+	 * that this join is not needed. e.g:
+	 * SELECT a.* FROM a INNER JOIN b ON a.b_id = b.id WHERE b.id = 1;
+	 * could become: SELECT a.* FROM a WHERE a.b_id = 1;
+	 */
+	if (removalrel->baserestrictinfo != NIL)
+		return false;
+
+	/*
+	 * Currently only eclass joins are supported, so if there are any non
+	 * eclass join quals then we'll report the join is non-removable.
+	 */
+	if (removalrel->joininfo != NIL)
+		return false;
+
+	/*
+	 * We mustn't allow any joins to be removed if there are any pending
+	 * foreign key triggers in the queue. This could happen if we are planning
+	 * a query that has been executed from within a volatile function and the
+	 * query which called this volatile function has made some changes to a
+	 * table referenced by a foreign key. The reason for this is that any
+	 * updates to a table which is referenced by a foreign key constraint will
+	 * only have the referencing tables updated after the command is complete,
+	 * so there is a window of time where records may violate the foreign key
+	 * constraint.
+	 *
+	 * Currently this code is quite naive, as we won't even attempt to remove
+	 * the join if there are *any* pending foreign key triggers, on any
+	 * relation. It may be worthwhile to improve this to check if there's any
+	 * pending triggers for the referencing relation in the join.
+	 */
+	if (!AfterTriggerQueueIsEmpty())
+		return false;
+
+	/*
+	 * Now we'll search through each relation in the joinlist to see if we can
+	 * find a relation which has a foreign key which references removalrel on
+	 * the join condition. If we find a rel with a foreign key which matches
+	 * the join condition exactly, then we can be sure that exactly 1 row will
+	 * be matched on the join, if we also see that no Vars from the relation
+	 * are needed, then we can report the join as removable.
+	 */
+	foreach (lc, joinlist)
+	{
+		RangeTblRef	*rtr = (RangeTblRef *) lfirst(lc);
+		RelOptInfo	*rel;
+		ListCell	*lc2;
+		List		*referencing_vars;
+		List		*index_vars;
+		List		*operator_list;
+		Relids		 joinrelids;
+
+		/* we can't remove ourself, or anything other than RangeTblRefs */
+		if (rtr == removalrtr || !IsA(rtr, RangeTblRef))
+			continue;
+
+		rel = find_base_rel(root, rtr->rtindex);
+
+		/*
+		 * The only relation type that can help us is a base rel with at least
+		 * one foreign key defined, if there's no eclass joins then this rel
+		 * is not going to help us prove the removalrel is not needed.
+		 */
+		if (rel->reloptkind != RELOPT_BASEREL ||
+			rel->rtekind != RTE_RELATION ||
+			rel->has_eclass_joins == false ||
+			rel->fklist == NIL)
+			continue;
+
+		/*
+		 * Both rels have eclass joins, but do they have eclass joins to each
+		 * other? Skip this rel if it does not.
+		 */
+		if (!have_relevant_eclass_joinclause(root, rel, removalrel))
+			continue;
+
+		joinrelids = bms_union(rel->relids, removalrel->relids);
+
+		/* if any of the Vars from the relation are needed then abort */
+		if (relation_is_needed(root, joinrelids, removalrel))
+			return false;
+
+		referencing_vars = NIL;
+		index_vars = NIL;
+		operator_list = NIL;
+
+		/* now populate the lists with the join condition Vars */
+		foreach(lc2, root->eq_classes)
+		{
+			EquivalenceClass *ec = (EquivalenceClass *) lfirst(lc2);
+
+			if (list_length(ec->ec_members) <= 1)
+				continue;
+
+			if (bms_overlap(removalrel->relids, ec->ec_relids) &&
+				bms_overlap(rel->relids, ec->ec_relids))
+			{
+				ListCell *lc3;
+				Var *refvar = NULL;
+				Var *idxvar = NULL;
+
+				/*
+				 * Look at each member of the eclass and try to find a Var from
+				 * each side of the join that we can append to the list of
+				 * columns that should be checked against each foreign key.
+				 *
+				 * The following logic does not allow for join removals to take
+				 * place for foreign keys that have duplicate columns on the
+				 * referencing side of the foreign key, such as:
+				 * (a,a) references (x,y)
+				 * The use case for such a foreign key is likely small enough
+				 * that we needn't bother making this code anymore complex to
+				 * solve. If we find more than 1 Var from any of the rels then
+				 * we'll bail out.
+				 */
+				foreach (lc3, ec->ec_members)
+				{
+					EquivalenceMember *ecm = (EquivalenceMember *) lfirst(lc3);
+
+					Var *var = (Var *) ecm->em_expr;
+
+					if (!IsA(var, Var))
+						continue; /* Ignore Consts */
+
+					if (var->varno == rel->relid)
+					{
+						if (refvar != NULL)
+							return false;
+						refvar = var;
+					}
+
+					else if (var->varno == removalrel->relid)
+					{
+						if (idxvar != NULL)
+							return false;
+						idxvar = var;
+					}
+				}
+
+				if (refvar != NULL && idxvar != NULL)
+				{
+					/* grab the correct equality operator for these two vars */
+					Oid opno = select_equality_operator(ec, refvar->vartype, idxvar->vartype);
+
+					if (!OidIsValid(opno))
+						return false;
+
+					referencing_vars = lappend(referencing_vars, refvar);
+					index_vars = lappend(index_vars, idxvar);
+					operator_list = lappend_oid(operator_list, opno);
+				}
+			}
+		}
+
+		if (referencing_vars != NULL)
+		{
+			if (relation_has_foreign_key_for(root, rel, removalrel,
+				referencing_vars, index_vars, operator_list))
+			{
+				*removerrel = rel;
+				*columnlist = referencing_vars;
+				return true; /* removalrel can be removed */
+			}
+		}
+	}
+
+	return false; /* can't remove join */
+}
+
+/*
+ * leftjoin_is_removable
+ *	  Check whether we need not perform this left join at all, because
  *	  it will just duplicate its left input.
  *
  * This is true for a left join for which the join condition cannot match
@@ -147,7 +439,7 @@ clause_sides_match_join(RestrictInfo *rinfo, Relids outerrelids,
  * above the join.
  */
 static bool
-join_is_removable(PlannerInfo *root, SpecialJoinInfo *sjinfo)
+leftjoin_is_removable(PlannerInfo *root, SpecialJoinInfo *sjinfo)
 {
 	int			innerrelid;
 	RelOptInfo *innerrel;
@@ -155,14 +447,14 @@ join_is_removable(PlannerInfo *root, SpecialJoinInfo *sjinfo)
 	Relids		joinrelids;
 	List	   *clause_list = NIL;
 	ListCell   *l;
-	int			attroff;
+
+	Assert(sjinfo->jointype == JOIN_LEFT);
 
 	/*
-	 * Must be a non-delaying left join to a single baserel, else we aren't
+	 * Must be a non-delaying join to a single baserel, else we aren't
 	 * going to be able to do anything with it.
 	 */
-	if (sjinfo->jointype != JOIN_LEFT ||
-		sjinfo->delay_upper_joins ||
+	if (sjinfo->delay_upper_joins ||
 		bms_membership(sjinfo->min_righthand) != BMS_SINGLETON)
 		return false;
 
@@ -205,52 +497,9 @@ join_is_removable(PlannerInfo *root, SpecialJoinInfo *sjinfo)
 	/* Compute the relid set for the join we are considering */
 	joinrelids = bms_union(sjinfo->min_lefthand, sjinfo->min_righthand);
 
-	/*
-	 * We can't remove the join if any inner-rel attributes are used above the
-	 * join.
-	 *
-	 * Note that this test only detects use of inner-rel attributes in higher
-	 * join conditions and the target list.  There might be such attributes in
-	 * pushed-down conditions at this join, too.  We check that case below.
-	 *
-	 * As a micro-optimization, it seems better to start with max_attr and
-	 * count down rather than starting with min_attr and counting up, on the
-	 * theory that the system attributes are somewhat less likely to be wanted
-	 * and should be tested last.
-	 */
-	for (attroff = innerrel->max_attr - innerrel->min_attr;
-		 attroff >= 0;
-		 attroff--)
-	{
-		if (!bms_is_subset(innerrel->attr_needed[attroff], joinrelids))
-			return false;
-	}
-
-	/*
-	 * Similarly check that the inner rel isn't needed by any PlaceHolderVars
-	 * that will be used above the join.  We only need to fail if such a PHV
-	 * actually references some inner-rel attributes; but the correct check
-	 * for that is relatively expensive, so we first check against ph_eval_at,
-	 * which must mention the inner rel if the PHV uses any inner-rel attrs as
-	 * non-lateral references.  Note that if the PHV's syntactic scope is just
-	 * the inner rel, we can't drop the rel even if the PHV is variable-free.
-	 */
-	foreach(l, root->placeholder_list)
-	{
-		PlaceHolderInfo *phinfo = (PlaceHolderInfo *) lfirst(l);
-
-		if (bms_is_subset(phinfo->ph_needed, joinrelids))
-			continue;			/* PHV is not used above the join */
-		if (bms_overlap(phinfo->ph_lateral, innerrel->relids))
-			return false;		/* it references innerrel laterally */
-		if (!bms_overlap(phinfo->ph_eval_at, innerrel->relids))
-			continue;			/* it definitely doesn't reference innerrel */
-		if (bms_is_subset(phinfo->ph_eval_at, innerrel->relids))
-			return false;		/* there isn't any other place to eval PHV */
-		if (bms_overlap(pull_varnos((Node *) phinfo->ph_var->phexpr),
-						innerrel->relids))
-			return false;		/* it does reference innerrel */
-	}
+	/* if the relation is referenced in the query then it cannot be removed */
+	if (relation_is_needed(root, joinrelids, innerrel))
+		return false;
 
 	/*
 	 * Search for mergejoinable clauses that constrain the inner rel against
@@ -367,6 +616,271 @@ join_is_removable(PlannerInfo *root, SpecialJoinInfo *sjinfo)
 	return false;
 }
 
+/*
+ * relation_is_needed
+ *		True if any of the Vars from this relation are required in the query
+ */
+static inline bool
+relation_is_needed(PlannerInfo *root, Relids joinrelids, RelOptInfo *rel)
+{
+	int		  attroff;
+	ListCell *l;
+
+	/*
+	 * rel is referenced if any of its attributes are used above the join.
+	 *
+	 * Note that this test only detects use of rel's attributes in higher
+	 * join conditions and the target list.  There might be such attributes in
+	 * pushed-down conditions at this join, too.  The caller must check that case.
+	 *
+	 * As a micro-optimization, it seems better to start with max_attr and
+	 * count down rather than starting with min_attr and counting up, on the
+	 * theory that the system attributes are somewhat less likely to be wanted
+	 * and should be tested last.
+	 */
+	for (attroff = rel->max_attr - rel->min_attr;
+		 attroff >= 0;
+		 attroff--)
+	{
+		if (!bms_is_subset(rel->attr_needed[attroff], joinrelids))
+			return true;
+	}
+
+	/*
+	 * Similarly check whether rel is needed by any PlaceHolderVars that will
+	 * be used above the join.  We only need to return true if such a PHV
+	 * actually references some of rel's attributes; but the correct check for
+	 * that is relatively expensive, so we first check against ph_eval_at,
+	 * which must mention rel if the PHV uses any of rel's attrs as non-lateral
+	 * references.  Note that if the PHV's syntactic scope is just rel, we
+	 * must return true even if the PHV is variable-free.
+	 */
+	foreach(l, root->placeholder_list)
+	{
+		PlaceHolderInfo *phinfo = (PlaceHolderInfo *) lfirst(l);
+
+		if (bms_is_subset(phinfo->ph_needed, joinrelids))
+			continue;			/* PHV is not used above the join */
+		if (bms_overlap(phinfo->ph_lateral, rel->relids))
+			return true;		/* it references rel laterally */
+		if (!bms_overlap(phinfo->ph_eval_at, rel->relids))
+			continue;			/* it definitely doesn't reference rel */
+		if (bms_is_subset(phinfo->ph_eval_at, rel->relids))
+			return true;		/* there isn't any other place to eval PHV */
+		if (bms_overlap(pull_varnos((Node *) phinfo->ph_var->phexpr),
+						rel->relids))
+			return true;		/* it does reference rel */
+	}
+
+	return false; /* it does not reference rel */
+}
+
+/*
+ * convert_join_to_isnotnull_quals
+ *		Adds any "col IS NOT NULL" quals which are required to ensure
+ *		that the query remains equivalent to what it was before the join
+ *		was removed.
+ */
+static void
+convert_join_to_isnotnull_quals(PlannerInfo *root, RelOptInfo *rel, List *columnlist)
+{
+	ListCell	*lc;
+	Oid			 reloid;
+
+	reloid = root->simple_rte_array[rel->relid]->relid;
+
+	/*
+	 * If a join has been successfully removed by the join removal code,
+	 * then a foreign key must exist that proves the join to not be required.
+	 *
+	 * The join would never have returned rows with NULL values in any of the
+	 * columns seen in the join condition, as a NULL cannot match any record
+	 * in the joined table. Now that we've proved the join to be redundant, we
+	 * must maintain that behavior of not having NULLs by adding IS NOT NULL
+	 * quals to the WHERE clause, although we may skip this if the column in
+	 * question happens to have a NOT NULL constraint.
+	 */
+	foreach(lc, columnlist)
+	{
+		Var *var = (Var *) lfirst(lc);
+
+		/* should be a Var if it came from a foreign key */
+		Assert(IsA(var, Var));
+		Assert(var->varno == rel->relid);
+
+		/* add the IS NOT NULL qual, but only if the column allows NULLs */
+		if (!get_attnotnull(reloid, var->varattno))
+		{
+			RestrictInfo *rinfo;
+			NullTest *ntest = makeNode(NullTest);
+
+			ntest->nulltesttype = IS_NOT_NULL;
+			ntest->arg = (Expr *) var;
+			ntest->argisrow = false;
+
+			rinfo = make_restrictinfo((Expr *) ntest, true, false, false,
+				NULL, NULL, NULL);
+
+			/* skip adding the IS NOT NULL qual if one already exists */
+			if (!list_member(rel->baserestrictinfo, rinfo))
+				rel->baserestrictinfo = lappend(rel->baserestrictinfo, rinfo);
+		}
+	}
+}
+
+/*
+ * relation_has_foreign_key_for
+ *	  Checks if rel has a foreign key which references referencedrel with the
+ *	  given list of expressions.
+ *
+ *	For the match to succeed:
+ *	  referencing_vars must match the columns defined in the foreign key.
+ *	  index_vars must match the columns defined in the index for the foreign key.
+ */
+static bool
+relation_has_foreign_key_for(PlannerInfo *root, RelOptInfo *rel,
+			RelOptInfo *referencedrel, List *referencing_vars,
+			List *index_vars, List *operator_list)
+{
+	ListCell *lc;
+	Oid		  refreloid;
+
+	/*
+	 * Look up the Oid of the referenced relation. We only want to look at
+	 * foreign keys on the referencing relation which reference this relation.
+	 */
+	refreloid = root->simple_rte_array[referencedrel->relid]->relid;
+
+	Assert(list_length(referencing_vars) > 0);
+	Assert(list_length(referencing_vars) == list_length(index_vars));
+	Assert(list_length(referencing_vars) == list_length(operator_list));
+
+	/*
+	 * Search through each foreign key on the referencing relation and try
+	 * to find one which references the relation in the join condition. If we
+	 * find one then we'll send the join conditions off to
+	 * expressions_match_foreign_key() to see if they match the foreign key.
+	 */
+	foreach(lc, rel->fklist)
+	{
+		ForeignKeyInfo *fk = (ForeignKeyInfo *) lfirst(lc);
+
+		if (fk->confrelid == refreloid)
+		{
+			if (expressions_match_foreign_key(fk, referencing_vars,
+				index_vars, operator_list))
+				return true;
+		}
+	}
+
+	return false;
+}
+
+/*
+ * expressions_match_foreign_key
+ *		True if the given fkvars, indexvars and operators will match
+ *		exactly 1 record in the referenced relation of the foreign key.
+ *
+ * Note: This function expects fkvars and indexvars to only contain Var types.
+ *		 Expression indexes are not supported by foreign keys.
+ */
+static bool
+expressions_match_foreign_key(ForeignKeyInfo *fk, List *fkvars,
+					List *indexvars, List *operators)
+{
+	ListCell  *lc;
+	ListCell  *lc2;
+	ListCell  *lc3;
+	Bitmapset *allitems;
+	Bitmapset *matcheditems;
+	int		   lstidx;
+	int		   col;
+
+	Assert(list_length(fkvars) == list_length(indexvars));
+	Assert(list_length(fkvars) == list_length(operators));
+
+	/*
+	 * Fast path out if there aren't enough conditions to match each column in
+	 * the foreign key. Note that we cannot check that the number of
+	 * expressions is equal here since it would cause any expressions which
+	 * are duplicated not to match.
+	 */
+	if (list_length(fkvars) < fk->conncols)
+		return false;
+
+	/*
+	 * We need to ensure that each foreign key column can be matched to a list
+	 * item, and we need to ensure that each list item can be matched to a
+	 * foreign key column. We do this by looping over each foreign key column
+	 * and checking that we can find an item in the list which matches the
+	 * current column; however, this method does not allow us to ensure that no
+	 * additional items exist in the list. We could solve that by performing
+	 * another loop over each list item and checking that it matches a foreign key
+	 * column, but that's a bit wasteful. Instead we'll use 2 bitmapsets, one
+	 * to store the 0 based index of each list item, and with the other we'll
+	 * store each list index that we've managed to match. After we're done
+	 * matching we'll just make sure that both bitmapsets are equal.
+	 */
+	allitems = NULL;
+	matcheditems = NULL;
+
+	/*
+	 * Build a bitmapset which contains each 0 based list index. It seems more
+	 * efficient to do this in reverse so that we allocate enough memory for
+	 * the bitmapset on first loop rather than reallocating each time we find
+	 * we need a bit more space.
+	 */
+	for (lstidx = list_length(fkvars) - 1; lstidx >= 0; lstidx--)
+		allitems = bms_add_member(allitems, lstidx);
+
+	for (col = 0; col < fk->conncols; col++)
+	{
+		bool  matched = false;
+
+		lstidx = 0;
+
+		forthree(lc, fkvars, lc2, indexvars, lc3, operators)
+		{
+			Var *expr = (Var *) lfirst(lc);
+			Var *idxexpr = (Var *) lfirst(lc2);
+			Oid  opr = lfirst_oid(lc3);
+
+			Assert(IsA(expr, Var));
+			Assert(IsA(idxexpr, Var));
+
+			/* Does this join qual match up to the current fkey column? */
+			if (fk->conkey[col] == expr->varattno &&
+				fk->confkey[col] == idxexpr->varattno &&
+				equality_ops_are_compatible(opr, fk->conpfeqop[col]))
+			{
+				matched = true;
+
+				/* mark this list item as matched */
+				matcheditems = bms_add_member(matcheditems, lstidx);
+
+				/*
+				 * Don't break here as there may be duplicate expressions
+				 * that we also need to match against.
+				 */
+			}
+			lstidx++;
+		}
+
+		/* punt if there's no match. */
+		if (!matched)
+			return false;
+	}
+
+	/*
+	 * Ensure that we managed to match every item in the list to a foreign key
+	 * column.
+	 */
+	if (!bms_equal(allitems, matcheditems))
+		return false;
+
+	return true; /* matched */
+}
+
 
 /*
  * Remove the target relid from the planner's data structures, having
diff --git a/src/backend/optimizer/util/plancat.c b/src/backend/optimizer/util/plancat.c
index b2becfa..fea198e 100644
--- a/src/backend/optimizer/util/plancat.c
+++ b/src/backend/optimizer/util/plancat.c
@@ -25,7 +25,9 @@
 #include "access/transam.h"
 #include "access/xlog.h"
 #include "catalog/catalog.h"
+#include "catalog/pg_constraint.h"
 #include "catalog/heap.h"
+#include "catalog/pg_type.h"
 #include "foreign/fdwapi.h"
 #include "miscadmin.h"
 #include "nodes/makefuncs.h"
@@ -38,6 +40,7 @@
 #include "parser/parsetree.h"
 #include "rewrite/rewriteManip.h"
 #include "storage/bufmgr.h"
+#include "utils/fmgroids.h"
 #include "utils/lsyscache.h"
 #include "utils/rel.h"
 #include "utils/snapmgr.h"
@@ -89,6 +92,12 @@ get_relation_info(PlannerInfo *root, Oid relationObjectId, bool inhparent,
 	Relation	relation;
 	bool		hasindex;
 	List	   *indexinfos = NIL;
+	List	   *fkinfos = NIL;
+	Relation	fkeyRel;
+	Relation	fkeyRelIdx;
+	ScanKeyData fkeyScankey;
+	SysScanDesc fkeyScan;
+	HeapTuple	tuple;
 
 	/*
 	 * We need not lock the relation since it was already locked, either by
@@ -384,6 +393,111 @@ get_relation_info(PlannerInfo *root, Oid relationObjectId, bool inhparent,
 
 	heap_close(relation, NoLock);
 
+	/* load foreign key constraints */
+	ScanKeyInit(&fkeyScankey,
+				Anum_pg_constraint_conrelid,
+				BTEqualStrategyNumber, F_OIDEQ,
+				ObjectIdGetDatum(relationObjectId));
+
+	fkeyRel = heap_open(ConstraintRelationId, AccessShareLock);
+	fkeyRelIdx = index_open(ConstraintRelidIndexId, AccessShareLock);
+	fkeyScan = systable_beginscan_ordered(fkeyRel, fkeyRelIdx, NULL, 1, &fkeyScankey);
+
+	while ((tuple = systable_getnext_ordered(fkeyScan, ForwardScanDirection)) != NULL)
+	{
+		Form_pg_constraint con = (Form_pg_constraint) GETSTRUCT(tuple);
+		ForeignKeyInfo *fkinfo;
+		Datum		adatum;
+		bool		isNull;
+		ArrayType  *arr;
+		int			nelements;
+
+		/* skip if not a foreign key */
+		if (con->contype != CONSTRAINT_FOREIGN)
+			continue;
+
+		/* we're not interested unless the fkey has been validated */
+		if (!con->convalidated)
+			continue;
+
+		fkinfo = (ForeignKeyInfo *) palloc(sizeof(ForeignKeyInfo));
+		fkinfo->conindid = con->conindid;
+		fkinfo->confrelid = con->confrelid;
+		fkinfo->convalidated = con->convalidated;
+		fkinfo->conrelid = con->conrelid;
+		fkinfo->confupdtype = con->confupdtype;
+		fkinfo->confdeltype = con->confdeltype;
+		fkinfo->confmatchtype = con->confmatchtype;
+
+		adatum = heap_getattr(tuple, Anum_pg_constraint_conkey,
+							RelationGetDescr(fkeyRel), &isNull);
+
+		if (isNull)
+			elog(ERROR, "null conkey for constraint %u",
+				HeapTupleGetOid(tuple));
+
+		arr = DatumGetArrayTypeP(adatum);		/* ensure not toasted */
+		nelements = ARR_DIMS(arr)[0];
+		if (ARR_NDIM(arr) != 1 ||
+			nelements < 0 ||
+			ARR_HASNULL(arr) ||
+			ARR_ELEMTYPE(arr) != INT2OID)
+			elog(ERROR, "conkey is not a 1-D smallint array");
+
+		fkinfo->conkey = (int16 *) ARR_DATA_PTR(arr);
+		fkinfo->conncols = nelements;
+
+		adatum = heap_getattr(tuple, Anum_pg_constraint_confkey,
+							RelationGetDescr(fkeyRel), &isNull);
+
+		if (isNull)
+			elog(ERROR, "null confkey for constraint %u",
+				HeapTupleGetOid(tuple));
+
+		arr = DatumGetArrayTypeP(adatum);		/* ensure not toasted */
+		nelements = ARR_DIMS(arr)[0];
+
+		if (ARR_NDIM(arr) != 1 ||
+			nelements < 0 ||
+			ARR_HASNULL(arr) ||
+			ARR_ELEMTYPE(arr) != INT2OID)
+			elog(ERROR, "confkey is not a 1-D smallint array");
+
+		/* sanity check */
+		if (nelements != fkinfo->conncols)
+			elog(ERROR, "number of confkey elements does not equal conkey elements");
+
+		fkinfo->confkey = (int16 *) ARR_DATA_PTR(arr);
+		adatum = heap_getattr(tuple, Anum_pg_constraint_conpfeqop,
+							RelationGetDescr(fkeyRel), &isNull);
+
+		if (isNull)
+			elog(ERROR, "null conpfeqop for constraint %u",
+				HeapTupleGetOid(tuple));
+
+		arr = DatumGetArrayTypeP(adatum);		/* ensure not toasted */
+		nelements = ARR_DIMS(arr)[0];
+
+		if (ARR_NDIM(arr) != 1 ||
+			nelements < 0 ||
+			ARR_HASNULL(arr) ||
+			ARR_ELEMTYPE(arr) != OIDOID)
+			elog(ERROR, "conpfeqop is not a 1-D Oid array");
+
+		/* sanity check */
+		if (nelements != fkinfo->conncols)
+			elog(ERROR, "number of conpfeqop elements does not equal conkey elements");
+
+		fkinfo->conpfeqop = (Oid *) ARR_DATA_PTR(arr);
+
+		fkinfos = lappend(fkinfos, fkinfo);
+	}
+
+	rel->fklist = fkinfos;
+	systable_endscan_ordered(fkeyScan);
+	index_close(fkeyRelIdx, AccessShareLock);
+	heap_close(fkeyRel, AccessShareLock);
+
 	/*
 	 * Allow a plugin to editorialize on the info we obtained from the
 	 * catalogs.  Actions might include altering the assumed relation size,
diff --git a/src/backend/optimizer/util/relnode.c b/src/backend/optimizer/util/relnode.c
index c938c27..a0fb8eb 100644
--- a/src/backend/optimizer/util/relnode.c
+++ b/src/backend/optimizer/util/relnode.c
@@ -115,6 +115,7 @@ build_simple_rel(PlannerInfo *root, int relid, RelOptKind reloptkind)
 	rel->lateral_relids = NULL;
 	rel->lateral_referencers = NULL;
 	rel->indexlist = NIL;
+	rel->fklist = NIL;
 	rel->pages = 0;
 	rel->tuples = 0;
 	rel->allvisfrac = 0;
@@ -377,6 +378,7 @@ build_join_rel(PlannerInfo *root,
 	joinrel->lateral_relids = NULL;
 	joinrel->lateral_referencers = NULL;
 	joinrel->indexlist = NIL;
+	joinrel->fklist = NIL;
 	joinrel->pages = 0;
 	joinrel->tuples = 0;
 	joinrel->allvisfrac = 0;
diff --git a/src/backend/utils/cache/lsyscache.c b/src/backend/utils/cache/lsyscache.c
index 552e498..aa81c7c 100644
--- a/src/backend/utils/cache/lsyscache.c
+++ b/src/backend/utils/cache/lsyscache.c
@@ -916,6 +916,33 @@ get_atttypetypmodcoll(Oid relid, AttrNumber attnum,
 	ReleaseSysCache(tp);
 }
 
+/*
+ * get_attnotnull
+ *
+ *		Given the relation id and the attribute number,
+ *		return the "attnotnull" field from the attribute relation.
+ */
+bool
+get_attnotnull(Oid relid, AttrNumber attnum)
+{
+	HeapTuple	tp;
+
+	tp = SearchSysCache2(ATTNUM,
+						 ObjectIdGetDatum(relid),
+						 Int16GetDatum(attnum));
+	if (HeapTupleIsValid(tp))
+	{
+		Form_pg_attribute att_tup = (Form_pg_attribute) GETSTRUCT(tp);
+		bool		result;
+
+		result = att_tup->attnotnull;
+		ReleaseSysCache(tp);
+		return result;
+	}
+	else
+		return false;
+}
+
 /*				---------- COLLATION CACHE ----------					 */
 
 /*
diff --git a/src/include/commands/trigger.h b/src/include/commands/trigger.h
index d0b0356..34a75e4 100644
--- a/src/include/commands/trigger.h
+++ b/src/include/commands/trigger.h
@@ -181,6 +181,7 @@ extern void ExecBSTruncateTriggers(EState *estate,
 extern void ExecASTruncateTriggers(EState *estate,
 					   ResultRelInfo *relinfo);
 
+extern bool AfterTriggerQueueIsEmpty(void);
 extern void AfterTriggerBeginXact(void);
 extern void AfterTriggerBeginQuery(void);
 extern void AfterTriggerEndQuery(EState *estate);
diff --git a/src/include/nodes/relation.h b/src/include/nodes/relation.h
index dacbe9c..f69df09 100644
--- a/src/include/nodes/relation.h
+++ b/src/include/nodes/relation.h
@@ -355,6 +355,8 @@ typedef struct PlannerInfo
  *		lateral_referencers - relids of rels that reference this one laterally
  *		indexlist - list of IndexOptInfo nodes for relation's indexes
  *					(always NIL if it's not a table)
+ *		fklist - list of ForeignKeyInfo structs for relation's foreign key
+ *					constraints (always NIL if it's not a table)
  *		pages - number of disk pages in relation (zero if not a table)
  *		tuples - number of tuples in relation (not considering restrictions)
  *		allvisfrac - fraction of disk pages that are marked all-visible
@@ -448,6 +450,7 @@ typedef struct RelOptInfo
 	Relids		lateral_relids; /* minimum parameterization of rel */
 	Relids		lateral_referencers;	/* rels that reference me laterally */
 	List	   *indexlist;		/* list of IndexOptInfo */
+	List	   *fklist;			/* list of ForeignKeyInfo */
 	BlockNumber pages;			/* size estimates derived from pg_class */
 	double		tuples;
 	double		allvisfrac;
@@ -538,6 +541,51 @@ typedef struct IndexOptInfo
 	bool		amhasgetbitmap; /* does AM have amgetbitmap interface? */
 } IndexOptInfo;
 
+/*
+ * ForeignKeyInfo
+ *		Used to store pg_constraint records for foreign key constraints for use
+ *		by the planner.
+ *
+ *		conindid - The index which supports the foreign key
+ *
+ *		confrelid - The relation that is referenced by this foreign key
+ *
+ *		convalidated - True if the foreign key has been validated.
+ *
+ *		conrelid - The Oid of the relation that the foreign key belongs to
+ *
+ *		confupdtype - ON UPDATE action for when the referenced table is updated
+ *
+ *		confdeltype - ON DELETE action, controls what to do when a record is
+ *					deleted from the referenced table.
+ *
+ *		confmatchtype - foreign key match type, e.g. MATCH FULL, MATCH PARTIAL
+ *
+ *		conncols - Number of columns defined in the foreign key
+ *
+ *		conkey - An array of conncols elements to store the varattno of the
+ *					columns on the referencing side of the foreign key
+ *
+ *		confkey - An array of conncols elements to store the varattno of the
+ *					columns on the referenced side of the foreign key
+ *
+ *		conpfeqop - An array of conncols elements to store the operators for
+ *					PK = FK comparisons
+ */
+typedef struct ForeignKeyInfo
+{
+	Oid			conindid;		/* index supporting this constraint */
+	Oid			confrelid;		/* relation referenced by foreign key */
+	bool		convalidated;	/* constraint has been validated? */
+	Oid			conrelid;		/* relation this constraint constrains */
+	char		confupdtype;	/* foreign key's ON UPDATE action */
+	char		confdeltype;	/* foreign key's ON DELETE action */
+	char		confmatchtype;	/* foreign key's match type */
+	int			conncols;		/* number of columns in the foreign key */
+	int16	   *conkey;			/* columns of conrelid that the constraint applies to */
+	int16	   *confkey;		/* columns of confrelid that foreign key references */
+	Oid		   *conpfeqop;		/* operator list for comparing PK to FK */
+} ForeignKeyInfo;
 
 /*
  * EquivalenceClasses
diff --git a/src/include/optimizer/paths.h b/src/include/optimizer/paths.h
index 9b22fda..56a1703 100644
--- a/src/include/optimizer/paths.h
+++ b/src/include/optimizer/paths.h
@@ -112,6 +112,8 @@ extern List *generate_join_implied_equalities(PlannerInfo *root,
 								 Relids join_relids,
 								 Relids outer_relids,
 								 RelOptInfo *inner_rel);
+extern Oid select_equality_operator(EquivalenceClass *ec, Oid lefttype,
+								 Oid righttype);
 extern bool exprs_known_equal(PlannerInfo *root, Node *item1, Node *item2);
 extern void add_child_rel_equivalences(PlannerInfo *root,
 						   AppendRelInfo *appinfo,
diff --git a/src/include/utils/lsyscache.h b/src/include/utils/lsyscache.h
index 07d24d4..910190d 100644
--- a/src/include/utils/lsyscache.h
+++ b/src/include/utils/lsyscache.h
@@ -68,6 +68,7 @@ extern Oid	get_atttype(Oid relid, AttrNumber attnum);
 extern int32 get_atttypmod(Oid relid, AttrNumber attnum);
 extern void get_atttypetypmodcoll(Oid relid, AttrNumber attnum,
 					  Oid *typid, int32 *typmod, Oid *collid);
+extern bool get_attnotnull(Oid relid, AttrNumber attnum);
 extern char *get_collation_name(Oid colloid);
 extern char *get_constraint_name(Oid conoid);
 extern Oid	get_opclass_family(Oid opclass);
diff --git a/src/test/regress/expected/join.out b/src/test/regress/expected/join.out
index 2501184..8903491 100644
--- a/src/test/regress/expected/join.out
+++ b/src/test/regress/expected/join.out
@@ -3276,6 +3276,298 @@ select i8.* from int8_tbl i8 left join (select f1 from int4_tbl group by f1) i4
 (1 row)
 
 rollback;
+BEGIN;
+-- Test join removals for inner joins
+CREATE TEMP TABLE c (id INT NOT NULL PRIMARY KEY);
+CREATE TEMP TABLE b (id INT NOT NULL PRIMARY KEY, c_id INT NOT NULL REFERENCES c(id), val INT);
+CREATE TEMP TABLE a (id INT NOT NULL PRIMARY KEY, b_id INT REFERENCES b(id));
+-- this should remove inner join to b
+EXPLAIN (COSTS OFF)
+SELECT a.* FROM a INNER JOIN b ON a.b_id = b.id;
+          QUERY PLAN          
+------------------------------
+ Seq Scan on a
+   Filter: (b_id IS NOT NULL)
+(2 rows)
+
+-- this should remove inner join to b and c
+EXPLAIN (COSTS OFF)
+SELECT a.* FROM a INNER JOIN b ON a.b_id = b.id INNER JOIN c ON b.c_id = c.id;
+          QUERY PLAN          
+------------------------------
+ Seq Scan on a
+   Filter: (b_id IS NOT NULL)
+(2 rows)
+
+-- change order of tables in query, this should generate the same plan as above.
+EXPLAIN (COSTS OFF)
+SELECT a.* FROM c INNER JOIN b ON c.id = b.c_id INNER JOIN a ON a.b_id = b.id;
+          QUERY PLAN          
+------------------------------
+ Seq Scan on a
+   Filter: (b_id IS NOT NULL)
+(2 rows)
+
+-- inner join can't be removed due to b columns in the target list
+EXPLAIN (COSTS OFF)
+SELECT * FROM a INNER JOIN b ON a.b_id = b.id;
+          QUERY PLAN          
+------------------------------
+ Hash Join
+   Hash Cond: (a.b_id = b.id)
+   ->  Seq Scan on a
+   ->  Hash
+         ->  Seq Scan on b
+(5 rows)
+
+-- this should not remove inner join to b due to quals restricting results from b
+EXPLAIN (COSTS OFF)
+SELECT a.* FROM a INNER JOIN b on a.b_id = b.id WHERE b.val = 10;
+            QUERY PLAN            
+----------------------------------
+ Hash Join
+   Hash Cond: (a.b_id = b.id)
+   ->  Seq Scan on a
+   ->  Hash
+         ->  Seq Scan on b
+               Filter: (val = 10)
+(6 rows)
+
+-- this should remove joins to b and c.
+EXPLAIN (COSTS OFF)
+SELECT COUNT(*) FROM a INNER JOIN b ON a.b_id = b.id LEFT OUTER JOIN c ON a.id = c.id;
+             QUERY PLAN             
+------------------------------------
+ Aggregate
+   ->  Seq Scan on a
+         Filter: (b_id IS NOT NULL)
+(3 rows)
+
+-- this should remove joins to b and c, however it b will only be removed on
+-- 2nd attempt after c is removed by the left join removal code.
+EXPLAIN (COSTS OFF)
+SELECT COUNT(*) FROM a INNER JOIN b ON a.b_id = b.id LEFT OUTER JOIN c ON b.id = c.id;
+             QUERY PLAN             
+------------------------------------
+ Aggregate
+   ->  Seq Scan on a
+         Filter: (b_id IS NOT NULL)
+(3 rows)
+
+-- this should not remove join to b
+EXPLAIN (COSTS OFF)
+SELECT a.* FROM a INNER JOIN b on a.b_id = b.id WHERE b.val = b.id;
+            QUERY PLAN            
+----------------------------------
+ Hash Join
+   Hash Cond: (a.b_id = b.id)
+   ->  Seq Scan on a
+   ->  Hash
+         ->  Seq Scan on b
+               Filter: (id = val)
+(6 rows)
+
+-- this should not remove the join, no foreign key exists between a.id and b.id
+EXPLAIN (COSTS OFF)
+SELECT a.* FROM a INNER JOIN b ON a.id = b.id;
+         QUERY PLAN         
+----------------------------
+ Hash Join
+   Hash Cond: (a.id = b.id)
+   ->  Seq Scan on a
+   ->  Hash
+         ->  Seq Scan on b
+(5 rows)
+
+-- ensure a left joined rel can't remove an inner joined rel
+EXPLAIN (COSTS OFF)
+SELECT a.* FROM b LEFT JOIN a ON b.id = a.b_id;
+          QUERY PLAN          
+------------------------------
+ Hash Right Join
+   Hash Cond: (a.b_id = b.id)
+   ->  Seq Scan on a
+   ->  Hash
+         ->  Seq Scan on b
+(5 rows)
+
+-- Ensure we remove b, but don't try and remove c. c has no join condition.
+EXPLAIN (COSTS OFF)
+SELECT a.* FROM a INNER JOIN b ON a.b_id = b.id CROSS JOIN c;
+                QUERY PLAN                
+------------------------------------------
+ Nested Loop
+   ->  Seq Scan on c
+   ->  Materialize
+         ->  Seq Scan on a
+               Filter: (b_id IS NOT NULL)
+(5 rows)
+
+ALTER TABLE b ALTER COLUMN c_id DROP NOT NULL;
+-- only c should be removed here because the join to b must remain in order to
+-- filter out rows where b.c_id is null
+EXPLAIN (COSTS OFF)
+SELECT a.* FROM a INNER JOIN b ON a.b_id = b.id INNER JOIN c ON b.c_id = c.id;
+                QUERY PLAN                
+------------------------------------------
+ Hash Join
+   Hash Cond: (a.b_id = b.id)
+   ->  Seq Scan on a
+   ->  Hash
+         ->  Seq Scan on b
+               Filter: (c_id IS NOT NULL)
+(6 rows)
+
+ALTER TABLE a ALTER COLUMN b_id SET NOT NULL;
+-- Ensure the join gets removed, but an IS NOT NULL qual is not added for b_id
+EXPLAIN (COSTS OFF)
+SELECT a.* FROM a INNER JOIN b ON a.b_id = b.id;
+  QUERY PLAN   
+---------------
+ Seq Scan on a
+(1 row)
+
+ROLLBACK;
+BEGIN;
+-- inner join removal code with 2 column foreign keys
+CREATE TEMP TABLE b (id1 INT NOT NULL, id2 INT NOT NULL, PRIMARY KEY(id1,id2));
+CREATE TEMP TABLE a (id INT NOT NULL PRIMARY KEY, b_id1 INT, b_id2 INT);
+ALTER TABLE a ADD CONSTRAINT a_b_id1_b_id2_fkey FOREIGN KEY (b_id1,b_id2) REFERENCES b(id1,id2) MATCH SIMPLE;
+-- this should remove inner join to b
+EXPLAIN (COSTS OFF)
+SELECT a.id FROM a INNER JOIN b ON a.b_id1 = b.id1 AND a.b_id2 = b.id2;
+                       QUERY PLAN                        
+---------------------------------------------------------
+ Seq Scan on a
+   Filter: ((b_id1 IS NOT NULL) AND (b_id2 IS NOT NULL))
+(2 rows)
+
+-- should not remove inner join to b (extra condition)
+EXPLAIN (COSTS OFF)
+SELECT a.id FROM a INNER JOIN b ON a.b_id1 = b.id1 AND a.b_id2 = b.id2 AND a.b_id2 >= b.id2;
+                       QUERY PLAN                        
+---------------------------------------------------------
+ Merge Join
+   Merge Cond: ((b.id1 = a.b_id1) AND (b.id2 = a.b_id2))
+   Join Filter: (a.b_id2 >= b.id2)
+   ->  Index Only Scan using b_pkey on b
+   ->  Sort
+         Sort Key: a.b_id1, a.b_id2
+         ->  Seq Scan on a
+(7 rows)
+
+-- should not remove inner join to b (wrong operator)
+EXPLAIN (COSTS OFF)
+SELECT a.id FROM a INNER JOIN b ON a.b_id1 > b.id1 AND a.b_id2 < b.id2;
+                        QUERY PLAN                         
+-----------------------------------------------------------
+ Nested Loop
+   ->  Seq Scan on a
+   ->  Index Only Scan using b_pkey on b
+         Index Cond: ((id1 < a.b_id1) AND (id2 > a.b_id2))
+(4 rows)
+
+-- should not remove inner join (only checking id1)
+EXPLAIN (COSTS OFF)
+SELECT a.id FROM a INNER JOIN b ON a.b_id1 = b.id1;
+               QUERY PLAN                
+-----------------------------------------
+ Merge Join
+   Merge Cond: (b.id1 = a.b_id1)
+   ->  Index Only Scan using b_pkey on b
+   ->  Sort
+         Sort Key: a.b_id1
+         ->  Seq Scan on a
+(6 rows)
+
+-- should not remove inner join (checking wrong columns)
+EXPLAIN (COSTS OFF)
+SELECT a.id FROM a INNER JOIN b ON a.b_id2 = b.id1 AND a.b_id1 = b.id2;
+                       QUERY PLAN                        
+---------------------------------------------------------
+ Merge Join
+   Merge Cond: ((b.id1 = a.b_id2) AND (b.id2 = a.b_id1))
+   ->  Index Only Scan using b_pkey on b
+   ->  Sort
+         Sort Key: a.b_id2, a.b_id1
+         ->  Seq Scan on a
+(6 rows)
+
+-- should not remove inner join (no check for id2)
+EXPLAIN (COSTS OFF)
+SELECT a.id FROM a INNER JOIN b ON a.b_id2 = b.id1 AND a.b_id1 = b.id1;
+              QUERY PLAN               
+---------------------------------------
+ Hash Join
+   Hash Cond: (b.id1 = a.b_id2)
+   ->  Seq Scan on b
+   ->  Hash
+         ->  Seq Scan on a
+               Filter: (b_id2 = b_id1)
+(6 rows)
+
+-- should not remove inner join (no check for b_id2)
+EXPLAIN (COSTS OFF)
+SELECT a.id FROM a INNER JOIN b ON a.b_id1 = b.id1 AND a.b_id1 = b.id2;
+            QUERY PLAN             
+-----------------------------------
+ Hash Join
+   Hash Cond: (a.b_id1 = b.id1)
+   ->  Seq Scan on a
+   ->  Hash
+         ->  Seq Scan on b
+               Filter: (id1 = id2)
+(6 rows)
+
+-- should not remove inner join (no check for b_id2)
+EXPLAIN (COSTS OFF)
+SELECT id FROM a INNER JOIN b ON a.b_id1 = b.id1 AND a.b_id1 = b.id1;
+               QUERY PLAN                
+-----------------------------------------
+ Merge Join
+   Merge Cond: (b.id1 = a.b_id1)
+   ->  Index Only Scan using b_pkey on b
+   ->  Sort
+         Sort Key: a.b_id1
+         ->  Seq Scan on a
+(6 rows)
+
+ROLLBACK;
+-- In this test we want to ensure that INNER JOIN removal does not
+-- occur when there are pending foreign key triggers.
+-- We test this by updating a relation which is referenced by a foreign key
+-- and then executing another query which would normally allow the inner
+-- join to be removed.
+CREATE TABLE j2 (id INT NOT NULL PRIMARY KEY);
+CREATE TABLE j1 (
+  id INT PRIMARY KEY,
+  j2_id INT NOT NULL REFERENCES j2 (id) MATCH FULL ON DELETE CASCADE ON UPDATE CASCADE
+);
+INSERT INTO j2 VALUES(10),(20);
+INSERT INTO j1 VALUES(1,10),(2,20);
+CREATE TABLE results (j2_id INT NOT NULL);
+CREATE OR REPLACE FUNCTION j1_update() RETURNS TRIGGER AS $$
+BEGIN
+  INSERT INTO results SELECT j2_id FROM j1 INNER JOIN j2 ON j1.j2_id = j2.id;
+  RETURN NEW;
+  END;
+$$ LANGUAGE plpgsql;
+CREATE TRIGGER j1_update_trigger BEFORE UPDATE ON j2 FOR EACH ROW EXECUTE PROCEDURE j1_update();
+UPDATE j2 SET id = id + 1;
+-- results should only contain 3 records. If we blindly removed the join despite the
+-- foreign key not having updated the referenced records yet, we'd get 4 rows in results.
+SELECT * FROM results;
+ j2_id 
+-------
+    10
+    20
+    20
+(3 rows)
+
+DROP TABLE j1;
+DROP TABLE j2;
+DROP TABLE results;
+DROP FUNCTION j1_update();
 create temp table parent (k int primary key, pd int);
 create temp table child (k int unique, cd int);
 insert into parent values (1, 10), (2, 20), (3, 30);
diff --git a/src/test/regress/sql/join.sql b/src/test/regress/sql/join.sql
index 718e1d9..8002e4b 100644
--- a/src/test/regress/sql/join.sql
+++ b/src/test/regress/sql/join.sql
@@ -977,6 +977,154 @@ select i8.* from int8_tbl i8 left join (select f1 from int4_tbl group by f1) i4
 
 rollback;
 
+BEGIN;
+
+-- Test join removals for inner joins
+CREATE TEMP TABLE c (id INT NOT NULL PRIMARY KEY);
+CREATE TEMP TABLE b (id INT NOT NULL PRIMARY KEY, c_id INT NOT NULL REFERENCES c(id), val INT);
+CREATE TEMP TABLE a (id INT NOT NULL PRIMARY KEY, b_id INT REFERENCES b(id));
+
+-- this should remove inner join to b
+EXPLAIN (COSTS OFF)
+SELECT a.* FROM a INNER JOIN b ON a.b_id = b.id;
+
+-- this should remove inner join to b and c
+EXPLAIN (COSTS OFF)
+SELECT a.* FROM a INNER JOIN b ON a.b_id = b.id INNER JOIN c ON b.c_id = c.id;
+
+-- change order of tables in query, this should generate the same plan as above.
+EXPLAIN (COSTS OFF)
+SELECT a.* FROM c INNER JOIN b ON c.id = b.c_id INNER JOIN a ON a.b_id = b.id;
+
+-- inner join can't be removed due to b columns in the target list
+EXPLAIN (COSTS OFF)
+SELECT * FROM a INNER JOIN b ON a.b_id = b.id;
+
+-- this should not remove inner join to b due to quals restricting results from b
+EXPLAIN (COSTS OFF)
+SELECT a.* FROM a INNER JOIN b on a.b_id = b.id WHERE b.val = 10;
+
+-- this should remove joins to b and c.
+EXPLAIN (COSTS OFF)
+SELECT COUNT(*) FROM a INNER JOIN b ON a.b_id = b.id LEFT OUTER JOIN c ON a.id = c.id;
+
+-- this should remove joins to b and c, however it b will only be removed on
+-- 2nd attempt after c is removed by the left join removal code.
+EXPLAIN (COSTS OFF)
+SELECT COUNT(*) FROM a INNER JOIN b ON a.b_id = b.id LEFT OUTER JOIN c ON b.id = c.id;
+
+-- this should not remove join to b
+EXPLAIN (COSTS OFF)
+SELECT a.* FROM a INNER JOIN b on a.b_id = b.id WHERE b.val = b.id;
+
+-- this should not remove the join, no foreign key exists between a.id and b.id
+EXPLAIN (COSTS OFF)
+SELECT a.* FROM a INNER JOIN b ON a.id = b.id;
+
+-- ensure a left joined rel can't remove an inner joined rel
+EXPLAIN (COSTS OFF)
+SELECT a.* FROM b LEFT JOIN a ON b.id = a.b_id;
+
+-- Ensure we remove b, but don't try and remove c. c has no join condition.
+EXPLAIN (COSTS OFF)
+SELECT a.* FROM a INNER JOIN b ON a.b_id = b.id CROSS JOIN c;
+
+ALTER TABLE b ALTER COLUMN c_id DROP NOT NULL;
+
+-- only c should be removed here because the join to b must remain in order to
+-- filter out rows where b.c_id is null
+EXPLAIN (COSTS OFF)
+SELECT a.* FROM a INNER JOIN b ON a.b_id = b.id INNER JOIN c ON b.c_id = c.id;
+
+
+ALTER TABLE a ALTER COLUMN b_id SET NOT NULL;
+
+-- Ensure the join gets removed, but an IS NOT NULL qual is not added for b_id
+EXPLAIN (COSTS OFF)
+SELECT a.* FROM a INNER JOIN b ON a.b_id = b.id;
+
+ROLLBACK;
+
+BEGIN;
+
+-- inner join removal code with 2 column foreign keys
+CREATE TEMP TABLE b (id1 INT NOT NULL, id2 INT NOT NULL, PRIMARY KEY(id1,id2));
+CREATE TEMP TABLE a (id INT NOT NULL PRIMARY KEY, b_id1 INT, b_id2 INT);
+
+ALTER TABLE a ADD CONSTRAINT a_b_id1_b_id2_fkey FOREIGN KEY (b_id1,b_id2) REFERENCES b(id1,id2) MATCH SIMPLE;
+
+-- this should remove inner join to b
+EXPLAIN (COSTS OFF)
+SELECT a.id FROM a INNER JOIN b ON a.b_id1 = b.id1 AND a.b_id2 = b.id2;
+
+-- should not remove inner join to b (extra condition)
+EXPLAIN (COSTS OFF)
+SELECT a.id FROM a INNER JOIN b ON a.b_id1 = b.id1 AND a.b_id2 = b.id2 AND a.b_id2 >= b.id2;
+
+-- should not remove inner join to b (wrong operator)
+EXPLAIN (COSTS OFF)
+SELECT a.id FROM a INNER JOIN b ON a.b_id1 > b.id1 AND a.b_id2 < b.id2;
+
+-- should not remove inner join (only checking id1)
+EXPLAIN (COSTS OFF)
+SELECT a.id FROM a INNER JOIN b ON a.b_id1 = b.id1;
+
+-- should not remove inner join (checking wrong columns)
+EXPLAIN (COSTS OFF)
+SELECT a.id FROM a INNER JOIN b ON a.b_id2 = b.id1 AND a.b_id1 = b.id2;
+
+-- should not remove inner join (no check for id2)
+EXPLAIN (COSTS OFF)
+SELECT a.id FROM a INNER JOIN b ON a.b_id2 = b.id1 AND a.b_id1 = b.id1;
+
+-- should not remove inner join (no check for b_id2)
+EXPLAIN (COSTS OFF)
+SELECT a.id FROM a INNER JOIN b ON a.b_id1 = b.id1 AND a.b_id1 = b.id2;
+
+-- should not remove inner join (no check for b_id2)
+EXPLAIN (COSTS OFF)
+SELECT id FROM a INNER JOIN b ON a.b_id1 = b.id1 AND a.b_id1 = b.id1;
+
+ROLLBACK;
+
+-- In this test we want to ensure that INNER JOIN removal does not
+-- occur when there are pending foreign key triggers.
+-- We test this by updating a relation which is referenced by a foreign key
+-- and then executing another query which would normally allow the inner
+-- join to be removed.
+
+CREATE TABLE j2 (id INT NOT NULL PRIMARY KEY);
+CREATE TABLE j1 (
+  id INT PRIMARY KEY,
+  j2_id INT NOT NULL REFERENCES j2 (id) MATCH FULL ON DELETE CASCADE ON UPDATE CASCADE
+);
+
+INSERT INTO j2 VALUES(10),(20);
+INSERT INTO j1 VALUES(1,10),(2,20);
+
+CREATE TABLE results (j2_id INT NOT NULL);
+
+CREATE OR REPLACE FUNCTION j1_update() RETURNS TRIGGER AS $$
+BEGIN
+  INSERT INTO results SELECT j2_id FROM j1 INNER JOIN j2 ON j1.j2_id = j2.id;
+  RETURN NEW;
+  END;
+$$ LANGUAGE plpgsql;
+
+CREATE TRIGGER j1_update_trigger BEFORE UPDATE ON j2 FOR EACH ROW EXECUTE PROCEDURE j1_update();
+
+UPDATE j2 SET id = id + 1;
+
+-- results should only contain 3 records. If we blindly removed the join despite the
+-- foreign key not having updated the referenced records yet, we'd get 4 rows in results.
+SELECT * FROM results;
+
+DROP TABLE j1;
+DROP TABLE j2;
+DROP TABLE results;
+DROP FUNCTION j1_update();
+
+
 create temp table parent (k int primary key, pd int);
 create temp table child (k int unique, cd int);
 insert into parent values (1, 10), (2, 20), (3, 30);

#16

Heikki Linnakangas

hlinnakangas@vmware.com

over 11 years ago

In reply to: David Rowley (#15)

Re: Patch to support SEMI and ANTI join removal

On 09/16/2014 01:20 PM, David Rowley wrote:

+	/*
+	 * We mustn't allow any joins to be removed if there are any pending
+	 * foreign key triggers in the queue. This could happen if we are planning
+	 * a query that has been executed from within a volatile function and the
+	 * query which called this volatile function has made some changes to a
+	 * table referenced by a foreign key. The reason for this is that any
+	 * updates to a table which is referenced by a foreign key constraint will
+	 * only have the referencing tables updated after the command is complete,
+	 * so there is a window of time where records may violate the foreign key
+	 * constraint.
+	 *
+	 * Currently this code is quite naive, as we won't even attempt to remove
+	 * the join if there are *any* pending foreign key triggers, on any
+	 * relation. It may be worthwhile to improve this to check if there's any
+	 * pending triggers for the referencing relation in the join.
+	 */
+	if (!AfterTriggerQueueIsEmpty())
+		return false;

Hmm. This code runs when the query is planned. There is no guarantee
that there won't be after triggers pending when the query is later
*executed*.

- Heikki

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#17

David Rowley

dgrowleyml@gmail.com

over 11 years ago

In reply to: Heikki Linnakangas (#16)

Re: Patch to support SEMI and ANTI join removal

On Fri, Sep 26, 2014 at 12:36 AM, Heikki Linnakangas <
hlinnakangas@vmware.com> wrote:

On 09/16/2014 01:20 PM, David Rowley wrote:

+       /*
+        * We mustn't allow any joins to be removed if there are any pending
+        * foreign key triggers in the queue. This could happen if we are planning
+        * a query that has been executed from within a volatile function and the
+        * query which called this volatile function has made some changes to a
+        * table referenced by a foreign key. The reason for this is that any
+        * updates to a table which is referenced by a foreign key constraint will
+        * only have the referencing tables updated after the command is complete,
+        * so there is a window of time where records may violate the foreign key
+        * constraint.
+        *
+        * Currently this code is quite naive, as we won't even attempt to remove
+        * the join if there are *any* pending foreign key triggers, on any
+        * relation. It may be worthwhile to improve this to check if there's any
+        * pending triggers for the referencing relation in the join.
+        */
+       if (!AfterTriggerQueueIsEmpty())
+               return false;

Hi Heikki,

Thanks for having a look at the patch.

Hmm. This code runs when the query is planned. There is no guarantee that
there won't be after triggers pending when the query is later *executed*.

Please correct anything that sounds wrong here, but my understanding is
that we'll always plan a query right before we execute it, with the
exception of PREPARE statements where PostgreSQL will cache the query plan
when the prepare statement is first executed. So I think you may have a
point here regarding PREPARE'd statements, but I think that it is isolated
to those.

I think in all other cases we'll plan right before we execute. So if we
happen to be planning an UPDATE statement which has a sub-query that
performs some INNER JOINs, I think we're safe to remove INNER JOINs when
possible, as the UPDATE statement won't get visibility of its own changes.

We can see that here:

create table updatetest (id int primary key, value int, value2 int);

create or replace function getvalue2(p_id int) returns int
as $$select value2 from updatetest where id = p_id$$
language sql volatile;

insert into updatetest values(0,0,0);
insert into updatetest values(1,10,10);
insert into updatetest values(2,20,20);
insert into updatetest values(3,30,30);

update updatetest set value = COALESCE((select value from updatetest u2
where updatetest.id - 1 = u2.id) + 1,0);

update updatetest set value2 = COALESCE(getvalue2(id - 1) + 1,0);

select * from updatetest;
 id | value | value2
----+-------+--------
  0 |     0 |      0
  1 |     1 |      1
  2 |    11 |      2
  3 |    21 |      3

The value column appears to have been set based on the value that was
previously in the value column, and has not come from the newly set value.
The behaviour is different for the value2 column as the value for this has
been fetched from another query, which *does* see the newly updated value
stored in the value2 column.

My understanding of foreign keys is that any pending foreign key triggers
will be executed just before the query completes, so we should only ever
encounter pending foreign key triggers during planning when we're planning
a query that's being executed from somewhere like a volatile function or
trigger function, if the outer query has updated or deleted some records
which are referenced by a foreign key.

So I think with the check for pending triggers at planning time this is
safe at least for queries being planned right before they're executed, but
you've caused me to realise that I'll probably need to do some more work on
this for when it comes to PREPARE'd queries, as it looks like if we
executed a prepared query from inside a volatile function or trigger
function that was called from a DELETE or UPDATE statement that caused
foreign key triggers to be queued, and we'd happened to have removed some
INNER JOINs when we originally planned that prepare statement, then that
would be wrong.

The only thing that comes to mind to fix that right now is to tag something
maybe in PlannerInfo to say if we've removed any INNER JOINs in planning,
then when we execute a prepared statement we could void the cached plan if we
see that some INNER JOINs were removed, but only do this if the foreign key
trigger queue has pending triggers. (which will hopefully not be very
often).
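
To make the idea above concrete, here is a minimal sketch in C. It is not
PostgreSQL's plan cache code: SketchCachedPlan, its fields and
sketch_plan_is_reusable() are hypothetical names invented for illustration,
and the state of the trigger queue is passed in as a plain boolean rather
than read from the real after-trigger machinery.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a cached plan; not PostgreSQL's CachedPlan. */
typedef struct SketchCachedPlan
{
	bool		valid;					/* false once the plan must be rebuilt */
	bool		removed_inner_joins;	/* tagged at plan time if a join was removed */
} SketchCachedPlan;

/*
 * Decide whether a cached plan may be reused for this execution.
 * pending_fk_triggers models whether the foreign key trigger queue is
 * non-empty at the moment the prepared statement is executed.
 */
bool
sketch_plan_is_reusable(SketchCachedPlan *plan, bool pending_fk_triggers)
{
	assert(plan != NULL);

	if (!plan->valid)
		return false;

	/* A plan that relied on the foreign key is unsafe while triggers are pending. */
	if (plan->removed_inner_joins && pending_fk_triggers)
	{
		plan->valid = false;	/* void the plan so the caller replans */
		return false;
	}

	return true;
}
```

A plan that removed no joins stays reusable regardless of the queue state, so
only the (hopefully rare) combination of a removed join and pending triggers
forces a replan.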

Another thing that comes to mind which may be similar is how we handle
something like:

PREPARE a AS SELECT * from tbl WHERE name LIKE $1;

Where, if $1 is 'foo' or 'foo%' we might want to use an index scan, but if
$1 was '%foo' then we'd probably not.
I've not yet looked in great detail at what happens here, but from some
quick simple tests it seems to replan each time!? But perhaps that's due to
the parameter, where with my other tests the PREPARE statement had no
parameters.

There was some other discussion relating to some of this over here->
/messages/by-id/20140603235053.GA351732@tornado.leadboat.com

Regards

David Rowley

#18

Andres Freund

andres@2ndquadrant.com

over 11 years ago

In reply to: David Rowley (#17)

Re: Patch to support SEMI and ANTI join removal

On 2014-09-28 17:32:21 +1300, David Rowley wrote:

My understanding of foreign keys is that any pending foreign key triggers
will be executed just before the query completes, so we should only ever
encounter pending foreign key triggers during planning when we're planning
a query that's being executed from somewhere like a volatile function or
trigger function, if the outer query has updated or deleted some records
which are referenced by a foreign key.

Note that foreign key checks also can be deferred. So the window for
these cases is actually larger.

So I think with the check for pending triggers at planning time this is
safe at least for queries being planned right before they're executed, but
you've caused me to realise that I'll probably need to do some more work on
this for when it comes to PREPARE'd queries, as it looks like if we
executed a prepared query from inside a volatile function or trigger
function that was called from a DELETE or UPDATE statement that caused
foreign key triggers to be queued, and we'd happened to have removed some
INNER JOINs when we originally planned that prepare statement, then that
would be wrong.

I'm wondering whether this wouldn't actually be better handled by some
sort of 'one time filter' capability for joins. When noticing during
planning that one side of the join is nullable attach a check to the
join node. Then, whenever that check returns true, skip checking one
side of the join and return rows without looking at that side.

That capability might also be interesting for more efficient planning of
left joins that partially have a constant join expression.

Greetings,

Andres Freund

--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

#19

Tom Lane

tgl@sss.pgh.pa.us

over 11 years ago

In reply to: David Rowley (#17)

Re: Patch to support SEMI and ANTI join removal

David Rowley <dgrowleyml@gmail.com> writes:

Please correct anything that sounds wrong here, but my understanding is
that we'll always plan a query right before we execute it, with the
exception of PREPARE statements where PostgreSQL will cache the query plan
when the prepare statement is first executed.

If this optimization only works in that scenario, it's dead in the water,
because that assumption is unsupportable. The planner does not in general
use the same query snapshot as the executor, so even in an immediate-
execution workflow there could have been data changes (caused by other
transactions) between planning and execution.

Why do you need such an assumption?

regards, tom lane

#20

David Rowley

dgrowleyml@gmail.com

over 11 years ago

In reply to: Andres Freund (#18)

Re: Patch to support SEMI and ANTI join removal

On Mon, Sep 29, 2014 at 2:41 AM, Andres Freund <andres@2ndquadrant.com>
wrote:

On 2014-09-28 17:32:21 +1300, David Rowley wrote:

My understanding of foreign keys is that any pending foreign key triggers
will be executed just before the query completes, so we should only ever
encounter pending foreign key triggers during planning when we're planning
a query that's being executed from somewhere like a volatile function or
trigger function, if the outer query has updated or deleted some records
which are referenced by a foreign key.

Note that foreign key checks also can be deferred. So the window for
these cases is actually larger.

Thanks Andres, I know you had said this before but I had previously failed
to realise exactly what you meant. I thought you were talking about
defining a foreign key to reference a column that has a deferrable unique
index. I now realise you were talking about making the foreign key itself
deferrable. I've made a change to the patch locally to ignore foreign
keys that are marked as deferrable.

Regards

David Rowley

#21

Andres Freund

andres@2ndquadrant.com

over 11 years ago

In reply to: Tom Lane (#19)

Re: Patch to support SEMI and ANTI join removal

On 2014-09-28 10:41:56 -0400, Tom Lane wrote:

David Rowley <dgrowleyml@gmail.com> writes:

Please correct anything that sounds wrong here, but my understanding is
that we'll always plan a query right before we execute it, with the
exception of PREPARE statements where PostgreSQL will cache the query plan
when the prepare statement is first executed.

If this optimization only works in that scenario, it's dead in the water,
because that assumption is unsupportable. The planner does not in general
use the same query snapshot as the executor, so even in an immediate-
execution workflow there could have been data changes (caused by other
transactions) between planning and execution.

I don't think the effects of other queries are the problem here. The
effect of other backend's deferred FK checks shouldn't matter for other
backends for normal query purposes. It's the planning backend that might
have deferred checks and thus temporarily violated foreign keys.

Greetings,

Andres Freund

--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

#22

Andres Freund

andres@2ndquadrant.com

over 11 years ago

In reply to: David Rowley (#20)

Re: Patch to support SEMI and ANTI join removal

On 2014-09-29 22:42:57 +1300, David Rowley wrote:

On Mon, Sep 29, 2014 at 2:41 AM, Andres Freund <andres@2ndquadrant.com>
wrote:

On 2014-09-28 17:32:21 +1300, David Rowley wrote:

My understanding of foreign keys is that any pending foreign key triggers
will be executed just before the query completes, so we should only ever
encounter pending foreign key triggers during planning when we're planning
a query that's being executed from somewhere like a volatile function or
trigger function, if the outer query has updated or deleted some records
which are referenced by a foreign key.

Note that foreign key checks also can be deferred. So the window for
these cases is actually larger.

Thanks Andres, I know you had said this before but I had previously failed
to realise exactly what you meant. I thought you were talking about
defining a foreign key to reference a column that has a deferrable unique
index. I now realise you were talking about making the foreign key itself
deferrable.

Oh. Don't remember doing that ;)

I've made a change to the patch locally to ignore foreign
keys that are marked as deferrable.

I have serious doubts about the general usefulness if this is only
going to be useable in a very restricted set of circumstances (only one
time plans, no deferrable keys). I think it'd be awesome to have the
capability, but I don't think it's ok to restrict it that much.

To me that means you can't make the decision at plan time, but need to
move it to execution time. It really doesn't sound that hard to short
circuit the semi joins whenever, at execution time, there's no entries
in the deferred trigger queue. It's a bit annoying to have to add code
to all of nestloop/hashjoin/mergejoin to not check the outer tuple if
there's no such entry. But I don't think it'll be too bad. That'd mean
it can be used in prepared statements.

What I think would be a bit finicky is the costing...

Greetings,

Andres Freund

--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

#23

Tom Lane

tgl@sss.pgh.pa.us

over 11 years ago

In reply to: Andres Freund (#21)

Re: Patch to support SEMI and ANTI join removal

Andres Freund <andres@2ndquadrant.com> writes:

On 2014-09-28 10:41:56 -0400, Tom Lane wrote:

If this optimization only works in that scenario, it's dead in the water,
because that assumption is unsupportable. The planner does not in general
use the same query snapshot as the executor, so even in an immediate-
execution workflow there could have been data changes (caused by other
transactions) between planning and execution.

I don't think the effects of other queries are the problem here. The
effect of other backend's deferred FK checks shouldn't matter for other
backends for normal query purposes. It's the planning backend that might
have deferred checks and thus temporarily violated foreign keys.

I see. So why aren't we simply ignoring deferrable FKs when making the
optimization? That pushes it back from depending on execution-time state
(unsafe) to depending on table DDL (safe).

regards, tom lane

#24

Andres Freund

andres@2ndquadrant.com

over 11 years ago

In reply to: Tom Lane (#23)

Re: Patch to support SEMI and ANTI join removal

On 2014-09-29 10:12:25 -0400, Tom Lane wrote:

Andres Freund <andres@2ndquadrant.com> writes:

On 2014-09-28 10:41:56 -0400, Tom Lane wrote:

If this optimization only works in that scenario, it's dead in the water,
because that assumption is unsupportable. The planner does not in general
use the same query snapshot as the executor, so even in an immediate-
execution workflow there could have been data changes (caused by other
transactions) between planning and execution.

I don't think the effects of other queries are the problem here. The
effect of other backend's deferred FK checks shouldn't matter for other
backends for normal query purposes. It's the planning backend that might
have deferred checks and thus temporarily violated foreign keys.

I see. So why aren't we simply ignoring deferrable FKs when making the
optimization? That pushes it back from depending on execution-time state
(unsafe) to depending on table DDL (safe).

IIRC there's some scenarios where violated FKs are visible to client
code for nondeferrable ones as well. Consider e.g. cascading foreign
keys + triggers. Or, somewhat insane, operators used in fkey triggers
that execute queries themselves.

Greetings,

Andres Freund

--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

#25

Tom Lane

tgl@sss.pgh.pa.us

over 11 years ago

In reply to: Andres Freund (#24)

Re: Patch to support SEMI and ANTI join removal

Andres Freund <andres@2ndquadrant.com> writes:

On 2014-09-29 10:12:25 -0400, Tom Lane wrote:

I see. So why aren't we simply ignoring deferrable FKs when making the
optimization? That pushes it back from depending on execution-time state
(unsafe) to depending on table DDL (safe).

IIRC there's some scenarios where violated FKs are visible to client
code for nondeferrable ones as well. Consider e.g. cascading foreign
keys + triggers. Or, somewhat insane, operators used in fkey triggers
that execute queries themselves.

Yeah, I had just thought of the query-in-function-called-from-violating-
query case myself. I plead insufficient caffeine :-(. I'd been making
a mental analogy to non-deferred uniqueness constraints, but actually
what we will optimize outer joins on is "immediate" unique indexes,
wherein there's no delay at all before the constraint is checked.
Too bad there's no equivalent in foreign key land.

These are certainly corner cases, but it doesn't seem up to project
quality standards to just ignore them. So I'm thinking you're right
that a run-time short-circuit would be the only way to deal with the
case safely.

On the whole I'm feeling that the scope of applicability of this
optimization is going to be too narrow to justify the maintenance
effort and extra planning/runtime overhead.

regards, tom lane

#26

David Rowley

dgrowleyml@gmail.com

over 11 years ago

In reply to: Andres Freund (#22)

Re: Patch to support SEMI and ANTI join removal

On Tue, Sep 30, 2014 at 12:42 AM, Andres Freund <andres@2ndquadrant.com>
wrote:

On 2014-09-29 22:42:57 +1300, David Rowley wrote:

I've made a change to the patch locally to ignore foreign
keys that are marked as deferrable.

I have serious doubts about the general usefulness if this is only
going to be useable in a very restricted set of circumstances (only one
time plans, no deferrable keys). I think it'd be awesome to have the
capability, but I don't think it's ok to restrict it that much.

I had a look to see what Oracle does in this situation and I was quite
shocked to see that they're blatantly just ignoring the fact that the
foreign key is being deferred. I tested by deferring the foreign key in a
transaction then updating the referenced record and I see that Oracle just
returns the wrong results as they're just blindly removing the join. So it
appears that they've not solved this one very well.

To me that means you can't make the decision at plan time, but need to
move it to execution time. It really doesn't sound that hard to short
circuit the semi joins whenever, at execution time, there's no entries
in the deferred trigger queue. It's a bit annoying to have to add code
to all of nestloop/hashjoin/mergejoin to not check the outer tuple if
there's no such entry. But I don't think it'll be too bad. That'd mean
it can be used in prepared statements.

I'm starting to think about how this might be done, but I'm a bit confused
and I don't know if it's something you've overlooked or something I've
misunderstood.

I've not quite gotten my head around how we might stop the unneeded
relation from being the primary path to join the other inner relations,
i.e. what would stop the planner making a plan that hashed all the other
relations and planned to perform a sequential scan on the relation that we
have no need to scan (because the foreign key tells us the join is
pointless). If we were not to use that relation then we'd just have a
bunch of hash tables with no path to join them up. If we did anything to
force the planner into creating a plan that would suit skipping relations,
then we could possibly be throwing away the optimal plan..... Right?

Regards

David Rowley

#27

Andres Freund

andres@2ndquadrant.com

over 11 years ago

In reply to: David Rowley (#26)

Re: Patch to support SEMI and ANTI join removal

On 2014-09-30 23:25:45 +1300, David Rowley wrote:

On Tue, Sep 30, 2014 at 12:42 AM, Andres Freund <andres@2ndquadrant.com>
wrote:

On 2014-09-29 22:42:57 +1300, David Rowley wrote:

I've made a change to the patch locally to ignore foreign
keys that are marked as deferrable.

I have serious doubts about the general usefulness if this is only
going to be useable in a very restricted set of circumstances (only one
time plans, no deferrable keys). I think it'd be awesome to have the
capability, but I don't think it's ok to restrict it that much.

I had a look to see what Oracle does in this situation and I was quite
shocked to see that they're blatantly just ignoring the fact that the
foreign key is being deferred. I tested by deferring the foreign key in a
transaction then updating the referenced record and I see that Oracle just
returns the wrong results as they're just blindly removing the join. So it
appears that they've not solved this one very well.

Ick. I'm pretty strongly against going that way.

To me that means you can't make the decision at plan time, but need to
move it to execution time. It really doesn't sound that hard to short
circuit the semi joins whenever, at execution time, there's no entries
in the deferred trigger queue. It's a bit annoying to have to add code
to all of nestloop/hashjoin/mergejoin to not check the outer tuple if
there's no such entry. But I don't think it'll be too bad. That'd mean
it can be used in prepared statements.

I'm starting to think about how this might be done, but I'm a bit confused
and I don't know if it's something you've overlooked or something I've
misunderstood.

I've not quite gotten my head around how we might stop the unneeded
relation from being the primary path to join the other inner relations,
i.e. what would stop the planner making a plan that hashed all the other
relations and planned to perform a sequential scan on the relation that we
have no need to scan (because the foreign key tells us the join is
pointless). If we were not to use that relation then we'd just have a
bunch of hash tables with no path to join them up. If we did anything to
force the planner into creating a plan that would suit skipping relations,
then we could possibly be throwing away the optimal plan..... Right?

I'm not 100% sure I understand your problem description, but let me
describe how I think this would work. During planning, you'd emit
exactly the same plan as you'd today, with two exceptions:
a) When costing a node where one side of a join is very likely to be
removable, you'd cost it nearly as if there wasn't a join.
b) The planner would attach some sort of 'one time join qual' to the
'likely removable' join nodes. If, during executor init, that qual
returns false, simply don't perform the join. Just check the inner
relation, but entirely skip the outer relation.

With regard to your comment about hash tables that aren't joined up: Currently
hash tables aren't built if they're not used. I.e. it's not
ExecInitHash() that does the hashing; the tables are generally only built
when needed. E.g. nodeHashJoin.c:ExecHashJoin() only calls
MultiExecProcNode() when in the HJ_BUILD_HASHTABLE state, which it is only
initially and sometimes after rescans.
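
The 'one time join qual' idea described above can be sketched roughly as
follows. This is only a toy model, not PostgreSQL code: ToyJoin,
toy_join_init() and toy_join_next() are hypothetical names, rows are plain
ints, and the "other" side is assumed to hold unique values (as a primary
key would). The point it illustrates is that the skip decision is taken
once at init and that the removable side is then never examined.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy join node; every name here is hypothetical, not PostgreSQL's. */
typedef struct ToyJoin
{
    const int *kept;          /* rows of the side we must still return */
    size_t     nkept;
    const int *other;         /* rows of the possibly removable side */
    size_t     nother;
    bool       skip_other;    /* latched once at init, never re-evaluated */
    size_t     pos;           /* next row of "kept" to consider */
    size_t     other_probes;  /* how many rows of "other" were examined */
} ToyJoin;

/* Executor-init step: evaluate the one-time qual exactly once. */
static void
toy_join_init(ToyJoin *j, bool planner_marked_removable,
              bool trigger_queue_empty)
{
    assert(j != NULL);
    j->skip_other = planner_marked_removable && trigger_queue_empty;
    j->pos = 0;
    j->other_probes = 0;
}

/* Set *out and return true for each result row; false when exhausted. */
static bool
toy_join_next(ToyJoin *j, int *out)
{
    while (j->pos < j->nkept)
    {
        int v = j->kept[j->pos++];

        if (j->skip_other)
        {
            /* The foreign key is trusted to guarantee exactly one match. */
            *out = v;
            return true;
        }

        /* Ordinary path: look for the matching row on the other side. */
        for (size_t i = 0; i < j->nother; i++)
        {
            j->other_probes++;
            if (j->other[i] == v)
            {
                *out = v;
                return true;
            }
        }
    }
    return false;
}
```

Calling toy_join_init() with an empty queue and then draining the node
yields the same rows as the full join would, with other_probes left at
zero. Costing is not modelled here at all.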

Does that clear things up or have I completely missed your angle?

Greetings,

Andres Freund

--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#28

David Rowley

dgrowleyml@gmail.com

over 11 years ago

In reply to: Andres Freund (#27)

Re: Patch to support SEMI and ANTI join removal

On Wed, Oct 1, 2014 at 12:01 AM, Andres Freund <andres@2ndquadrant.com>
wrote:

On 2014-09-30 23:25:45 +1300, David Rowley wrote:

I've not quite gotten my head around how we might stop the unneeded
relation from being the primary path to join the other inner relations,
i.e. what would stop the planner making a plan that hashed all the other
relations and planned to perform a sequence scan on the relation that we
have no need to scan (because the foreign key tells us the join is
pointless). If we were not to use that relation then we'd just have a
bunch of hash tables with no path to join them up. If we did anything to
force the planner into creating a plan that would suit skipping relations,
then we could possibly be throwing away the optimal plan..... Right?

I'm not 100% sure I understand your problem description, but let me
describe how I think this would work. During planning, you'd emit
exactly the same plan as you would today, with two exceptions:
a) When costing a node where one side of a join is very likely to be
removable, you'd cost it nearly as if there wasn't a join.

Ok given the tables:
create table t1 (x int primary key);
create table t2 (y int primary key);

suppose the planner came up with something like:

test=# explain (costs off) select t2.* from t1 inner join t2 on t1.x=t2.y;
QUERY PLAN
----------------------------
 Hash Join
   Hash Cond: (t1.x = t2.y)
   ->  Seq Scan on t1
   ->  Hash
         ->  Seq Scan on t2

If we had a foreign key...

alter table t2 add constraint t2_y_fkey foreign key (y) references t1 (x);

...the join to t1 could possibly be "ignored" by the executor... but
there's a problem as the plan states we're going to seqscan then hash that
relation, then seqscan t1 with a hash lookup on each of t1's rows. In this
case how would the executor skip the scan on t1? I can see how this might
work if it was t2 that we were removing, as we'd just skip the hash lookup
part in the hash join node.

b) The planner would attach some sort of 'one time join qual' to the
'likely removable' join nodes. If, during executor init, that qual
returns false, simply don't perform the join. Just check the inner
relation, but entirely skip the outer relation.

I think in the example that I've given above that the relation we'd want to
keep is the outer relation, we'd want to throw away the inner one, but I
can't quite see how that would work.

Hopefully that makes sense.

Regards

David Rowley

#29

Andres Freund

andres@2ndquadrant.com

over 11 years ago

In reply to: David Rowley (#28)

Re: Patch to support SEMI and ANTI join removal

On 2014-10-01 01:03:35 +1300, David Rowley wrote:

On Wed, Oct 1, 2014 at 12:01 AM, Andres Freund <andres@2ndquadrant.com>
wrote:

On 2014-09-30 23:25:45 +1300, David Rowley wrote:

I've not quite gotten my head around how we might stop the unneeded
relation from being the primary path to join the other inner relations,
i.e. what would stop the planner making a plan that hashed all the other
relations and planned to perform a sequence scan on the relation that we
have no need to scan (because the foreign key tells us the join is
pointless). If we were not to use that relation then we'd just have a
bunch of hash tables with no path to join them up. If we did anything to
force the planner into creating a plan that would suit skipping relations,
then we could possibly be throwing away the optimal plan..... Right?

I'm not 100% sure I understand your problem description, but let me
describe how I think this would work. During planning, you'd emit
exactly the same plan as you would today, with two exceptions:
a) When costing a node where one side of a join is very likely to be
removable, you'd cost it nearly as if there wasn't a join.

Ok given the tables:
create table t1 (x int primary key);
create table t2 (y int primary key);

suppose the planner came up with something like:

test=# explain (costs off) select t2.* from t1 inner join t2 on t1.x=t2.y;
QUERY PLAN
----------------------------
 Hash Join
   Hash Cond: (t1.x = t2.y)
   ->  Seq Scan on t1
   ->  Hash
         ->  Seq Scan on t2

If we had a foreign key...

alter table t2 add constraint t2_y_fkey foreign key (y) references t1 (x);

...the join to t1 could possibly be "ignored" by the executor... but
there's a problem as the plan states we're going to seqscan then hash that
relation, then seqscan t1 with a hash lookup on each of t1's rows. In this
case how would the executor skip the scan on t1? I can see how this might
work if it was t2 that we were removing, as we'd just skip the hash lookup
part in the hash join node.

Hm, right. But that doesn't seem like a fatal problem to me. The planner
knows about t1/t2 and Seq(t1), Seq(t2), not just Hash(Seq(t2)). So it
can tell the HashJoin node that when the 'shortcut' qualifier is true,
it should source everything from Seq(t2). Since the sequence scan
doesn't care about the node ontop that doesn't seem to be overly
dramatic?
Obviously reality makes this a bit more complicated...

Greetings,

Andres Freund

--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


#30

David Rowley

dgrowleyml@gmail.com

over 11 years ago

In reply to: Andres Freund (#29)

Re: Patch to support SEMI and ANTI join removal

On Wed, Oct 1, 2014 at 1:34 AM, Andres Freund <andres@2ndquadrant.com>
wrote:

On 2014-10-01 01:03:35 +1300, David Rowley wrote:

On Wed, Oct 1, 2014 at 12:01 AM, Andres Freund <andres@2ndquadrant.com>
wrote:

On 2014-09-30 23:25:45 +1300, David Rowley wrote:

I've not quite gotten my head around how we might stop the unneeded
relation from being the primary path to join the other inner relations,
i.e. what would stop the planner making a plan that hashed all the other
relations and planned to perform a sequence scan on the relation that we
have no need to scan (because the foreign key tells us the join is
pointless). If we were not to use that relation then we'd just have a
bunch of hash tables with no path to join them up. If we did anything to
force the planner into creating a plan that would suit skipping relations,
then we could possibly be throwing away the optimal plan..... Right?

I'm not 100% sure I understand your problem description, but let me
describe how I think this would work. During planning, you'd emit
exactly the same plan as you would today, with two exceptions:
a) When costing a node where one side of a join is very likely to be
removable, you'd cost it nearly as if there wasn't a join.

Ok given the tables:
create table t1 (x int primary key);
create table t2 (y int primary key);

suppose the planner came up with something like:

test=# explain (costs off) select t2.* from t1 inner join t2 on t1.x=t2.y;
QUERY PLAN
----------------------------
 Hash Join
   Hash Cond: (t1.x = t2.y)
   ->  Seq Scan on t1
   ->  Hash
         ->  Seq Scan on t2

If we had a foreign key...

alter table t2 add constraint t2_y_fkey foreign key (y) references t1 (x);

...the join to t1 could possibly be "ignored" by the executor... but
there's a problem as the plan states we're going to seqscan then hash that
relation, then seqscan t1 with a hash lookup on each of t1's rows. In this
case how would the executor skip the scan on t1? I can see how this might
work if it was t2 that we were removing, as we'd just skip the hash lookup
part in the hash join node.

Hm, right. But that doesn't seem like a fatal problem to me. The planner
knows about t1/t2 and Seq(t1), Seq(t2), not just Hash(Seq(t2)). So it
can tell the HashJoin node that when the 'shortcut' qualifier is true,
it should source everything from Seq(t2). Since the sequence scan
doesn't care about the node ontop that doesn't seem to be overly
dramatic?
Obviously reality makes this a bit more complicated...

Ok, after a bit of study on the hash join code I can see what you mean, it
shouldn't really matter.
I've started working on this now and I've made some changes in
analyzejoins.c so that instead of removing joins, it just marks the
RangeTblEntry, setting a new flag named skipJoinPossible to true. (I'll
think of a better name later)

I'm starting off with hash joins and I'm hacking a bit at ExecInitHashJoin.
I want to add 2 bool flags to HashJoinState, outerIsRequired and
innerIsRequired. I think if neither of these flags is set then we can just
abort the join altogether. Though in order to set these flags I need to
identify which relations are for the outer and which are for the inner side
of the join. I've got the logic for that only partially worked out. My
understanding of it so far is that I just need to look at
the hjstate->js.ps.lefttree and righttree. Inner being right, and outer
being left. I'm a little stuck on more complex cases where the scan is
nested deeper in the tree and I'm not quite sure on the logic I should be
using to navigate to it.

Take the following plan: (which I've amended to mark the left and right
nodes)

explain select t1.* from t1 inner join t2 on t1.t2_id= t2.id inner join t3
on t2.t3_id=t3.id;
QUERY PLAN
------------------------------------------------------------------------
 Hash Join  (cost=122.15..212.40 rows=2140 width=8)
   Hash Cond: (t2.t3_id = t3.id)
   ->  Hash Join  (cost=58.15..118.98 rows=2140 width=12) (left node)
         Hash Cond: (t1.t2_id = t2.id)
         ->  Seq Scan on t1  (cost=0.00..31.40 rows=2140 width=8) (left node)
         ->  Hash  (cost=31.40..31.40 rows=2140 width=8) (right node)
               ->  Seq Scan on t2  (cost=0.00..31.40 rows=2140 width=8) (left node)
   ->  Hash  (cost=34.00..34.00 rows=2400 width=4) (right node)
         ->  Seq Scan on t3  (cost=0.00..34.00 rows=2400 width=4) (left node)

The schema is set up in such a way that the joins to t2 and t3 can be...
"skipped", and my new code in analyzejoins.c marks the RangeTblEntry records
for these relations to reflect that.

During ExecInitHashJoin for the join between t2 and t3, if I want to find
t2 in the tree, I'd need to do
hjstate->js.ps.lefttree->righttree->lefttree... (which I know just from
looking at the explain output). I just can't work out the logic behind where
the scan node will actually be. At first I had thought something like, loop
down the lefttree path until I reach a dead end, and that's the outer scan
node, but in this case there's a right turn in there too, so that won't
work. If I keep going down the left path I'd end up at t1, which is
completely wrong.

Can anyone shed any light on how I might determine where the scan rel is in
the tree? I need to find it so I can check if the RangeTblEntry is marked
as skip-able.

Regards

David Rowley

#31

Robert Haas

robertmhaas@gmail.com

over 11 years ago

In reply to: David Rowley (#30)

Re: Patch to support SEMI and ANTI join removal

On Mon, Oct 6, 2014 at 5:57 AM, David Rowley <dgrowleyml@gmail.com> wrote:

Hm, right. But that doesn't seem like a fatal problem to me. The planner
knows about t1/t2 and Seq(t1), Seq(t2), not just Hash(Seq(t2)). So it
can tell the HashJoin node that when the 'shortcut' qualifier is true,
it should source everything from Seq(t2). Since the sequence scan
doesn't care about the node ontop that doesn't seem to be overly
dramatic?
Obviously reality makes this a bit more complicated...

Ok, after a bit of study on the hash join code I can see what you mean, it
shouldn't really matter.
I've started working on this now and I've made some changes in
analyzejoins.c so that instead of removing joins, it just marks the
RangeTblEntry, setting a new flag named skipJoinPossible to true. (I'll
think of a better name later)

I'm starting off with hash joins and I'm hacking a bit at ExecInitHashJoin.
I want to add 2 bool flags to HashJoinState, outerIsRequired and
innerIsRequired. I think if neither of these flags is set then we can just
abort the join altogether. Though in order to set these flags I need to
identify which relations are for the outer and which are for the inner side
of the join. I've got the logic for that only partially worked out. My
understanding of it so far is that I just need to look at the
hjstate->js.ps.lefttree and righttree. Inner being right, and outer being
left. I'm a little stuck on more complex cases where the scan is nested
deeper in the tree and I'm not quite sure on the logic I should be using to
navigate to it.

Take the following plan: (which I've amended to mark the left and right
nodes)

explain select t1.* from t1 inner join t2 on t1.t2_id= t2.id inner join t3
on t2.t3_id=t3.id;
QUERY PLAN
------------------------------------------------------------------------
 Hash Join  (cost=122.15..212.40 rows=2140 width=8)
   Hash Cond: (t2.t3_id = t3.id)
   ->  Hash Join  (cost=58.15..118.98 rows=2140 width=12) (left node)
         Hash Cond: (t1.t2_id = t2.id)
         ->  Seq Scan on t1  (cost=0.00..31.40 rows=2140 width=8) (left node)
         ->  Hash  (cost=31.40..31.40 rows=2140 width=8) (right node)
               ->  Seq Scan on t2  (cost=0.00..31.40 rows=2140 width=8) (left node)
   ->  Hash  (cost=34.00..34.00 rows=2400 width=4) (right node)
         ->  Seq Scan on t3  (cost=0.00..34.00 rows=2400 width=4) (left node)

The schema is set up in such a way that the joins to t2 and t3 can be...
"skipped", and my new code in analyzejoins.c marks the RangeTblEntry records
for these relations to reflect that.

During ExecInitHashJoin for the join between t2 and t3, if I want to find t2
in the tree, I'd need to do hjstate->js.ps.lefttree->righttree->lefttree...
(which I know just from looking at the explain output) I just can't work out
the logic behind where the scan node will actually be. At first I had
thought something like, loop down the lefttree path until I reach a dead end,
and that's the outer scan node, but in this case there's a right turn in
there too, so that won't work. If I keep going down the left path I'd end up
at t1, which is completely wrong.

Can anyone shed any light on how I might determine where the scan rel is in
the tree? I need to find it so I can check if the RangeTblEntry is marked as
skip-able.

I think you're probably going to need logic that knows about
particular node types and does the right thing for each one. I think
- but maybe I'm misunderstanding - what you're looking for is a
function of the form Oid ScansOnePlainTableWithoutQuals(). The
algorithm could be something like:

switch (node type)
{
    case T_SeqScanState:
        if (no quals)
            return the_appropriate_oid;
        return InvalidOid;
    case T_HashJoinState:
        decide whether we can ignore one side of the join
        fish out the node from the other side of the join (including
        reaching through the Hash node if necessary)
        return ScansOnePlainTableWithoutQuals(the node we fished out);
    ...other specific cases...
    default:
        return InvalidOid;
}
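
For illustration, the pseudocode above could be fleshed out along these
lines. This is a self-contained toy, not executor code: ToyNode, ToyTag,
ToyOid and the skippable field are invented for the sketch, and the rule
used for "decide whether we can ignore one side" (a qual-free scan of a
planner-marked table, possibly under a Hash node) is an assumption.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef unsigned int ToyOid;
#define TOY_INVALID_OID ((ToyOid) 0)

typedef enum ToyTag { TOY_SEQSCAN, TOY_HASH, TOY_HASHJOIN, TOY_OTHER } ToyTag;

typedef struct ToyNode
{
    ToyTag          tag;
    struct ToyNode *left;       /* outer child (or the only child of a Hash) */
    struct ToyNode *right;      /* inner child */
    ToyOid          relid;      /* scans only: the table being scanned */
    bool            has_quals;  /* scans only */
    bool            skippable;  /* scans only: hypothetical planner mark */
} ToyNode;

/* Assumed rule: a side is ignorable if it is a bare, marked scan. */
static bool
side_is_ignorable(const ToyNode *node)
{
    assert(node != NULL);
    while (node != NULL && node->tag == TOY_HASH)
        node = node->left;      /* reach through the Hash node */
    return node != NULL && node->tag == TOY_SEQSCAN &&
           node->skippable && !node->has_quals;
}

/* Returns the one table the subtree reduces to, or TOY_INVALID_OID. */
static ToyOid
scans_one_plain_table_without_quals(const ToyNode *node)
{
    if (node == NULL)
        return TOY_INVALID_OID;

    switch (node->tag)
    {
        case TOY_SEQSCAN:
            return node->has_quals ? TOY_INVALID_OID : node->relid;
        case TOY_HASH:
            return scans_one_plain_table_without_quals(node->left);
        case TOY_HASHJOIN:
            if (node->left == NULL || node->right == NULL)
                return TOY_INVALID_OID;
            if (side_is_ignorable(node->right))
                return scans_one_plain_table_without_quals(node->left);
            if (side_is_ignorable(node->left))
                return scans_one_plain_table_without_quals(node->right);
            return TOY_INVALID_OID;
        default:
            return TOY_INVALID_OID;
    }
}
```

Applied to a tree shaped like the three-table plan quoted earlier, with t2
and t3 marked skippable, the walk ends at the scan of t1.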

This seems messy, though. Can't the deferred trigger queue become
non-empty at pretty much any point in time? At exactly what point are
we making this decision, and how do we know the correct answer can't
change after that point?

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


#32

Andres Freund

andres@2ndquadrant.com

over 11 years ago

In reply to: Robert Haas (#31)

Re: Patch to support SEMI and ANTI join removal

On 2014-10-06 10:46:09 -0400, Robert Haas wrote:

This seems messy, though. Can't the deferred trigger queue become
non-empty at pretty much any point in time? At exactly what point are
we making this decision, and how do we know the correct answer can't
change after that point?

What we've been talking about is doing this during executor startup. And
at that point we really don't care about new entries in the queue during
query execution - we can't see them anyway.

Greetings,

Andres Freund

--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


#33

Robert Haas

robertmhaas@gmail.com

over 11 years ago

In reply to: Andres Freund (#32)

Re: Patch to support SEMI and ANTI join removal

On Mon, Oct 6, 2014 at 10:59 AM, Andres Freund <andres@2ndquadrant.com> wrote:

On 2014-10-06 10:46:09 -0400, Robert Haas wrote:

This seems messy, though. Can't the deferred trigger queue become
non-empty at pretty much any point in time? At exactly what point are
we making this decision, and how do we know the correct answer can't
change after that point?

What we've been talking about is doing this during executor startup. And
at that point we really don't care about new entries in the queue during
query execution - we can't see them anyway.

Ah, OK.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


#34

David Rowley

dgrowleyml@gmail.com

over 11 years ago

In reply to: Robert Haas (#31)

Re: Patch to support SEMI and ANTI join removal

On Tue, Oct 7, 2014 at 3:46 AM, Robert Haas <robertmhaas@gmail.com> wrote:

On Mon, Oct 6, 2014 at 5:57 AM, David Rowley <dgrowleyml@gmail.com> wrote:

Can anyone shed any light on how I might determine where the scan rel is in
the tree? I need to find it so I can check if the RangeTblEntry is marked as
skip-able.

I think you're probably going to need logic that knows about
particular node types and does the right thing for each one. I think
- but maybe I'm misunderstanding - what you're looking for is a
function of the form Oid ScansOnePlainTableWithoutQuals(). The
algorithm could be something like:

switch (node type)
{
    case T_SeqScanState:
        if (no quals)
            return the_appropriate_oid;
        return InvalidOid;
    case T_HashJoinState:
        decide whether we can ignore one side of the join
        fish out the node from the other side of the join (including
        reaching through the Hash node if necessary)
        return ScansOnePlainTableWithoutQuals(the node we fished out);
    ...other specific cases...
    default:
        return InvalidOid;
}

Thanks Robert.

Ok, so I've been hacking away at this for a couple of evenings and I think
I have a working prototype finally!
My earlier thoughts about having to descend down until I find a seqscan
were wrong. It looks like I just need to look at the next node down: if it's
a seqscan and it's marked as not needed, then we can skip that side of the
join, or if the child node is a HashJoinState then check the skip status of
that node; if both of its sides are marked as not needed, then skip that
side of the join.
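
A rough sketch of that "look at the next node down" rule, loosely modelled
on the ExecCanSkipJoin() function in the patch attached later in this
thread. The Node type, the SKIP_* bit values and init_tree() are
assumptions made for the sketch; the patch's own flag macros are not shown
in this thread, so their real definitions may differ.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical flag bits; the real patch's values are not shown here. */
#define SKIP_OUTER 0x01
#define SKIP_INNER 0x02
#define SKIP_BOTH  (SKIP_OUTER | SKIP_INNER)

typedef enum NodeKind { N_SCAN, N_HASH, N_JOIN } NodeKind;

typedef struct Node
{
    NodeKind     kind;
    struct Node *left;          /* outer child, or the child of a Hash */
    struct Node *right;         /* inner child */
    bool         skip_possible; /* scans: mark set at plan time */
    int          skipflags;     /* joins: filled in by init_tree() */
} Node;

/* Inspect only the child (reaching through Hash), never the whole subtree. */
static bool
can_skip_child(const Node *child)
{
    while (child != NULL)
    {
        switch (child->kind)
        {
            case N_SCAN:
                return child->skip_possible;
            case N_HASH:
                child = child->left;
                break;
            case N_JOIN:
                /* a join child is skippable only if both its sides are */
                return (child->skipflags & SKIP_BOTH) == SKIP_BOTH;
        }
    }
    return false;
}

/* Bottom-up init, so a join's children already have their flags set. */
static void
init_tree(Node *node, bool trigger_queue_empty)
{
    if (node == NULL)
        return;
    init_tree(node->left, trigger_queue_empty);
    init_tree(node->right, trigger_queue_empty);

    if (node->kind == N_JOIN)
    {
        node->skipflags = 0;
        if (trigger_queue_empty)
        {
            if (can_skip_child(node->left))
                node->skipflags |= SKIP_OUTER;
            if (can_skip_child(node->right))
                node->skipflags |= SKIP_INNER;
        }
    }
}
```

With t2 and t3 marked and t1 not, both joins in the three-table plan end up
with only the inner side flagged, so each join just passes through its
outer side.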

I've just completed some simple performance tests:

create table t3 (id int primary key);
create table t2 (id int primary key, t3_id int not null references t3);
create table t1 (id int primary key, t2_id int not null references t2);

I've loaded these tables with 4 million rows each. The performance is as
follows:

test=# select count(*) from t1 inner join t2 on t1.t2_id=t2.id inner join
t3 on t2.t3_id=t3.id;
count
---------
4000000
(1 row)
Time: 1022.492 ms

test=# select count(*) from t1;
count
---------
4000000
(1 row)
Time: 693.642 ms

test=# alter table t2 drop constraint t2_t3_id_fkey;
ALTER TABLE
Time: 2.141 ms
test=# alter table t1 drop constraint t1_t2_id_fkey;
ALTER TABLE
Time: 1.678 ms
test=# select count(*) from t1 inner join t2 on t1.t2_id=t2.id inner join
t3 on t2.t3_id=t3.id;
count
---------
4000000
(1 row)
Time: 11682.525 ms

So it seems it's not quite as efficient as join removal at planning time,
but still a big win when it's possible to perform the join skipping.

As yet, I've only added support for hash joins, and I've not really looked
into detail on what's needed for nested loop joins or merge joins.

For hash join I just added some code into the case HJ_BUILD_HASHTABLE: in
ExecHashJoin(). The code checks if either side can be skipped; if one
can, the node never moves out of the HJ_BUILD_HASHTABLE state, so
each time ExecHashJoin() is called it just returns the next tuple
from the non-skip side of the join until we run out of tuples. If both
sides can be skipped, NULL is returned, as any nodes above this shouldn't
attempt to scan. Perhaps this should just be an Assert(), as I don't think
any parent node should ever bother executing that node if it's not
required. The fact that I've put this code into the switch
under HJ_BUILD_HASHTABLE makes me think it should add close to zero
overhead when the join cannot be skipped.
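
As a rough illustration of why the check costs almost nothing on the normal
path, here is a toy state machine in the same shape. It is not the real
nodeHashjoin.c: ToyHashJoin and the ST_* states are hypothetical, rows are
small non-negative ints, and a 256-entry membership array stands in for the
hash table. When a skip flag is set the node never leaves the build state
and never builds the table; otherwise the flags are looked at only on the
first call.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef enum ToyState { ST_BUILD_HASHTABLE, ST_PROBE, ST_DONE } ToyState;

typedef struct ToyHashJoin
{
    const int *outer;       /* outer rows, values in 0..255 */
    size_t     nouter;
    size_t     outer_pos;
    const int *inner;       /* inner rows, unique values in 0..255 */
    size_t     ninner;
    size_t     inner_pos;
    bool       skip_inner;
    bool       skip_outer;
    ToyState   state;
    int        hash_builds; /* how often the table was built */
    bool       table[256];  /* stand-in for the hash table */
} ToyHashJoin;

/* Set *out and return true per result row; false when exhausted. */
static bool
toy_hash_join_exec(ToyHashJoin *hj, int *out)
{
    for (;;)
    {
        switch (hj->state)
        {
            case ST_BUILD_HASHTABLE:
                if (hj->skip_inner && hj->skip_outer)
                    return false;   /* nothing needs scanning */
                if (hj->skip_inner)
                {
                    /* pass outer rows straight through, stay in this state */
                    if (hj->outer_pos >= hj->nouter)
                        return false;
                    *out = hj->outer[hj->outer_pos++];
                    return true;
                }
                if (hj->skip_outer)
                {
                    /* bypass the hash build and read the inner side directly */
                    if (hj->inner_pos >= hj->ninner)
                        return false;
                    *out = hj->inner[hj->inner_pos++];
                    return true;
                }
                /* normal path: build once, then never come back here */
                for (size_t i = 0; i < hj->ninner; i++)
                {
                    assert(hj->inner[i] >= 0 && hj->inner[i] < 256);
                    hj->table[hj->inner[i]] = true;
                }
                hj->hash_builds++;
                hj->state = ST_PROBE;
                break;

            case ST_PROBE:
                while (hj->outer_pos < hj->nouter)
                {
                    int v = hj->outer[hj->outer_pos++];

                    if (v >= 0 && v < 256 && hj->table[v])
                    {
                        *out = v;
                        return true;
                    }
                }
                hj->state = ST_DONE;
                break;

            case ST_DONE:
                return false;
        }
    }
}
```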

One thing that we've lost out of this execution time join removal checks
method is the ability to still remove joins where the join column is
NULLable. The previous patch added IS NOT NULL to ensure the query was
equivalent when a join was removed, of course this meant that any
subsequent joins may later not have been removed due to the IS NOT NULL
quals existing (which could restrict the rows and remove the ability that
the foreign key could guarantee the existence of exactly 1 row matching the
join condition). I've not yet come up with a nice way to reimplement these
null checks at execution time, so I've thought that perhaps I won't bother,
at least not for this patch. I'd just disable join skipping at planning
time if any of the join columns can have NULLs.

Anyway this is just a progress report, and also to say thanks to Andres,
you might just have saved this patch with the execution time checking idea.

I'll post a patch soon, hopefully once I have merge and nestloop join types
working.

Regards

David Rowley

#35

Andres Freund

andres@2ndquadrant.com

over 11 years ago

In reply to: David Rowley (#34)

Re: Patch to support SEMI and ANTI join removal

On 2014-10-09 00:21:44 +1300, David Rowley wrote:

Ok, so I've been hacking away at this for a couple of evenings and I think
I have a working prototype finally!

Cool!

So it seems it's not quite as efficient as join removal at planning time,
but still a big win when it's possible to perform the join skipping.

Have you checked where the overhead is? Is it really just the additional
node that the tuples are passed through?

Have you measured the overhead of the plan/execution time checks over
master?

One thing that we've lost out of this execution time join removal checks
method is the ability to still remove joins where the join column is
NULLable. The previous patch added IS NOT NULL to ensure the query was
equivalent when a join was removed, of course this meant that any
subsequent joins may later not have been removed due to the IS NOT NULL
quals existing (which could restrict the rows and remove the ability that
the foreign key could guarantee the existence of exactly 1 row matching the
join condition). I've not yet come up with a nice way to reimplement these
null checks at execution time, so I've thought that perhaps I won't bother,
at least not for this patch. I'd just disable join skipping at planning
time if any of the join columns can have NULLs.

Sounds fair enough for the first round.

Greetings,

Andres Freund

--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


#36

David Rowley

dgrowleyml@gmail.com

about 11 years ago

In reply to: Andres Freund (#35)

1 attachment(s)

Re: Patch to support SEMI and ANTI join removal

On Thu, Oct 9, 2014 at 12:40 AM, Andres Freund <andres@2ndquadrant.com>
wrote:

On 2014-10-09 00:21:44 +1300, David Rowley wrote:

Ok, so I've been hacking away at this for a couple of evenings and I think
I have a working prototype finally!

Cool!

Patch attached.

So it seems it's not quite as efficient as join removal at planning time,
but still a big win when it's possible to perform the join skipping.

Have you checked where the overhead is? Is it really just the additional
node that the tuples are passed through?

I've not checked this yet, but I'd assume that it has to be from the extra
node. I'll run some profiles soon.

Have you measured the overhead of the plan/execution time checks over
master?

I did a bit of benchmarking last night, but this was mostly for testing
that I've not added too much overhead on the nested loop code.
For the merge and hashjoin code I've managed to keep the special skipping
code in the EXEC_MJ_INITIALIZE_OUTER and HJ_BUILD_HASHTABLE part of the
main switch statement, so the extra checks should only be performed on the
first call of the node when skips are not possible. For nested loop I can't
see any other way but to pay the small price of setting the skipflags and
checking if there are any skip flags on every call to the node.

I tested the overhead of this on my laptop by creating 2 tables with 1
million rows each, joining them on an INT column, where each value of the
int column was unique. I seem to have added about a 2% overhead to this. :(
Which I was quite surprised at, given it's just 1000001 extra
settings of the skipflags and checks that the skip flags are not empty,
but perhaps the extra local variable is causing something else to not make
it into a register. At the moment I can't quite see another way to do it,
but I guess it may not be the end of the world as the chances of having to
perform a nested loop join on 2 tables of that size are probably not that
high.

Test case:
create table t3 (id int primary key);
create table t2 (id int primary key, t3_id int not null references t3);
create table t1 (id int primary key, t2_id int not null references t2);
insert into t3 select x.x from generate_series(1,1000000) x(x);
insert into t2 select x.x,x.x from generate_series(1,1000000) x(x);
insert into t1 select x.x,x.x from generate_series(1,1000000) x(x);
vacuum;
set enable_hashjoin = off;
set enable_mergejoin = off;

select count(*) from t1 inner join t2 on t1.id=t2.id;

Unpatched:
duration: 120 s
number of transactions actually processed: 45
latency average: 2666.667 ms
tps = 0.371901 (including connections establishing)
tps = 0.371965 (excluding connections establishing)

Patched:
duration: 120 s
number of transactions actually processed: 44
latency average: 2727.273 ms
tps = 0.363933 (including connections establishing)
tps = 0.363987 (excluding connections establishing)

102.19%
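
The 102.19% figure appears to be the unpatched tps divided by the patched
tps (using the "including connections establishing" numbers). A minimal
sketch of that arithmetic, with relative_cost_percent() being a
hypothetical helper name:

```c
#include <assert.h>

/* Baseline throughput relative to patched throughput, as a percentage. */
static double
relative_cost_percent(double baseline_tps, double patched_tps)
{
    assert(patched_tps > 0.0);
    return 100.0 * baseline_tps / patched_tps;
}
```

relative_cost_percent(0.371901, 0.363933) comes out at roughly 102.19, i.e.
about a 2% slowdown for this nested loop case.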

Of course if we do the join on the column that has the foreign key, then
this is much faster.

test=# select count(*) from t1 inner join t2 on t1.t2_id=t2.id;
count
---------
1000000
(1 row)

Time: 105.206 ms

The explain analyze from the above query looks like:
test=# explain (analyze, costs off, timing off) select count(*) from t1
inner join t2 on t1.t2_id=t2.id;
QUERY PLAN
------------------------------------------------------------------
 Aggregate (actual rows=1 loops=1)
   ->  Nested Loop (actual rows=1000000 loops=1)
         ->  Seq Scan on t1 (actual rows=1000000 loops=1)
         ->  Index Only Scan using t2_pkey on t2 (never executed)
               Index Cond: (id = t1.t2_id)
               Heap Fetches: 0
 Execution time: 124.990 ms
(7 rows)

As you can see the scan on t2 never occurred.

I've so far only managed to come up with 1 useful regression test for this
new code. It's not possible to tell if the removal has taken place at plan
time, as the plan looks the same as if it didn't get removed. The only way
to tell is from the explain analyze, but the problem is that the execution
time is shown and can't be removed. Perhaps I should modify the explain to
tag join nodes that the planner has marked to say skipping may be possible?
But this is not really ideal as it's only the join nodes that know about
skipping and they just don't bother executing the child nodes, so it's
really up to the executor to decide which child nodes don't get called, so
to add something to explain might require making it smarter than it
needs to be.

Right now I'm not quite sure if I should modify any costs. Quite possibly
hash joins could have the costing reduced a little to try to encourage
hashjoins over merge joins, as with merge joins we can't skip sort
operations, but with hash joins we can skip the hash table build.

Regards

David Rowley

Attachments:

inner_join_removals_2014-10-15_0f3f1ea.patch (application/octet-stream)

diff --git a/src/backend/commands/trigger.c b/src/backend/commands/trigger.c
index f4c0ffa..9451e54 100644
--- a/src/backend/commands/trigger.c
+++ b/src/backend/commands/trigger.c
@@ -3887,6 +3887,17 @@ afterTriggerInvokeEvents(AfterTriggerEventList *events,
 	return all_fired;
 }
 
+/* ----------
+ * AfterTriggerQueueIsEmpty()
+ *
+ *	True if there are no pending triggers in the queue.
+ * ----------
+ */
+bool
+AfterTriggerQueueIsEmpty(void)
+{
+	return (afterTriggers->query_depth == -1 && afterTriggers->events.head == NULL);
+}
 
 /* ----------
  * AfterTriggerBeginXact()
diff --git a/src/backend/executor/execUtils.c b/src/backend/executor/execUtils.c
index d5e1273..96452dc 100644
--- a/src/backend/executor/execUtils.c
+++ b/src/backend/executor/execUtils.c
@@ -45,6 +45,7 @@
 #include "access/relscan.h"
 #include "access/transam.h"
 #include "catalog/index.h"
+#include "commands/trigger.h"
 #include "executor/execdebug.h"
 #include "nodes/nodeFuncs.h"
 #include "parser/parsetree.h"
@@ -661,6 +662,76 @@ get_last_attnums(Node *node, ProjectionInfo *projInfo)
 								  (void *) projInfo);
 }
 
+/*
+ * ExecCanSkipJoin
+ *		Returns True if the join node can be safely skipped without affecting
+ *		query results.
+ */
+bool
+ExecCanSkipJoin(PlanState *planstate)
+{
+	/*
+	 * Currently the only possibility we have of skipping this join node is if
+	 * a foreign key can prove that the join condition will match exactly 1 row
+	 * on the join condition. The checks for this were all done at planning
+	 * time, and any relations that we found foreign keys on that could prove
+	 * this, we marked those relations as skipJoinPossible. Though if this flag
+	 * is true, it still does not mean that we can skip joining to this
+	 * relation. If any changes have been made to records in the referenced
+	 * relation we may not yet have fired the foreign key triggers to cascade
+	 * those changes to the referencing relations, in this case we mustn't skip
+	 * the join as we could produce wrong results by doing so.
+	 *
+	 * Currently this code is quite naive, as we won't allow join skipping if
+	 * there are *any* pending foreign key triggers, on any relation. It may be
+	 * worthwhile to improve this to check if there's any pending triggers for
+	 * the referencing relation in the join.
+	 */
+	if (!AfterTriggerQueueIsEmpty())
+		return false;
+
+	while (planstate != NULL)
+	{
+		switch (nodeTag(planstate))
+		{
+			case T_SeqScanState:
+			case T_IndexOnlyScanState:
+				{
+					Scan *scan = (Scan *) planstate->plan;
+					RangeTblEntry *rte;
+					rte = (RangeTblEntry *) list_nth(planstate->state->es_range_table, scan->scanrelid - 1);
+
+					if (rte->skipJoinPossible)
+						return true;
+					else
+						return false;
+				}
+				break;
+
+			case T_HashState:
+				/* descend to the scan node */
+				planstate = planstate->lefttree;
+				break;
+
+			case T_HashJoinState:
+				{
+					HashJoinState *hjstate = (HashJoinState *)planstate;
+
+					if (HasFlagSkipJoinBoth(hjstate->js.skipflags))
+						return true;
+					else
+						return false;
+				}
+				break;
+
+			default:
+				return false;
+		}
+	}
+	return false;
+}
+
+
 /* ----------------
  *		ExecAssignProjectionInfo
  *
diff --git a/src/backend/executor/nodeHashjoin.c b/src/backend/executor/nodeHashjoin.c
index 7eec3f3..997314e 100644
--- a/src/backend/executor/nodeHashjoin.c
+++ b/src/backend/executor/nodeHashjoin.c
@@ -71,6 +71,7 @@ ExecHashJoin(HashJoinState *node)
 	TupleTableSlot *outerTupleSlot;
 	uint32		hashvalue;
 	int			batchno;
+	int			skipflags;
 
 	/*
 	 * get information from HashJoin node
@@ -113,6 +114,32 @@ ExecHashJoin(HashJoinState *node)
 		switch (node->hj_JoinState)
 		{
 			case HJ_BUILD_HASHTABLE:
+				skipflags = node->js.skipflags;
+
+				/* Can we skip the whole thing? */
+				if (HasFlagSkipJoinBoth(skipflags))
+					return NULL;
+
+				if (HasFlagSkipJoinInner(skipflags))
+				{
+					node->hj_FirstOuterTupleSlot = ExecProcNode(outerNode);
+					if (TupIsNull(node->hj_FirstOuterTupleSlot))
+						return NULL;
+
+					return node->hj_FirstOuterTupleSlot;
+				}
+				else if (HasFlagSkipJoinOuter(skipflags))
+				{
+					TupleTableSlot *result;
+
+					/* bypass the hash node to the node below it */
+					result = ExecProcNode(hashNode->ps.lefttree);
+
+					if (TupIsNull(result))
+						return NULL;
+
+					return result;
+				}
 
 				/*
 				 * First time through: build hash table for inner relation.
@@ -489,6 +516,13 @@ ExecInitHashJoin(HashJoin *node, EState *estate, int eflags)
 	outerPlanState(hjstate) = ExecInitNode(outerNode, estate, eflags);
 	innerPlanState(hjstate) = ExecInitNode((Plan *) hashNode, estate, eflags);
 
+	hjstate->js.skipflags = 0;
+	if (ExecCanSkipJoin(outerPlanState(hjstate)))
+		hjstate->js.skipflags |= EXEC_SKIPJOIN_OUTER;
+
+	if (ExecCanSkipJoin(innerPlanState(hjstate)))
+		hjstate->js.skipflags |= EXEC_SKIPJOIN_INNER;
+
 	/*
 	 * tuple table initialization
 	 */
@@ -578,6 +612,7 @@ ExecInitHashJoin(HashJoin *node, EState *estate, int eflags)
 		Assert(IsA(hclause, OpExpr));
 		lclauses = lappend(lclauses, linitial(fstate->args));
 		rclauses = lappend(rclauses, lsecond(fstate->args));
+
 		hoperators = lappend_oid(hoperators, hclause->opno);
 	}
 	hjstate->hj_OuterHashKeys = lclauses;
diff --git a/src/backend/executor/nodeMergejoin.c b/src/backend/executor/nodeMergejoin.c
index fdf2f4c..49f6125 100644
--- a/src/backend/executor/nodeMergejoin.c
+++ b/src/backend/executor/nodeMergejoin.c
@@ -621,6 +621,7 @@ ExecMergeJoin(MergeJoinState *node)
 	ExprContext *econtext;
 	bool		doFillOuter;
 	bool		doFillInner;
+	int			skipflags;
 
 	/*
 	 * get information from node
@@ -679,6 +680,30 @@ ExecMergeJoin(MergeJoinState *node)
 			case EXEC_MJ_INITIALIZE_OUTER:
 				MJ_printf("ExecMergeJoin: EXEC_MJ_INITIALIZE_OUTER\n");
 
+				skipflags = node->js.skipflags;
+
+				if (HasFlagSkipJoinBoth(skipflags))
+					return NULL;
+
+				if (HasFlagSkipJoinInner(skipflags))
+				{
+					outerTupleSlot = ExecProcNode(outerPlan);
+					if (TupIsNull(outerTupleSlot))
+					{
+						return NULL;
+					}
+					return outerTupleSlot;
+				}
+				else if (HasFlagSkipJoinOuter(skipflags))
+				{
+					innerTupleSlot = ExecProcNode(innerPlan);
+					if (TupIsNull(innerTupleSlot))
+					{
+						return NULL;
+					}
+					return innerTupleSlot;
+				}
+
 				outerTupleSlot = ExecProcNode(outerPlan);
 				node->mj_OuterTupleSlot = outerTupleSlot;
 
@@ -1518,6 +1543,13 @@ ExecInitMergeJoin(MergeJoin *node, EState *estate, int eflags)
 	innerPlanState(mergestate) = ExecInitNode(innerPlan(node), estate,
 											  eflags | EXEC_FLAG_MARK);
 
+	mergestate->js.skipflags = 0;
+	if (ExecCanSkipJoin(outerPlanState(mergestate)))
+		mergestate->js.skipflags |= EXEC_SKIPJOIN_OUTER;
+
+	if (ExecCanSkipJoin(innerPlanState(mergestate)))
+		mergestate->js.skipflags |= EXEC_SKIPJOIN_INNER;
+
 	/*
 	 * For certain types of inner child nodes, it is advantageous to issue
 	 * MARK every time we advance past an inner tuple we will never return to.
diff --git a/src/backend/executor/nodeNestloop.c b/src/backend/executor/nodeNestloop.c
index 6cdd4ff..e0c844f 100644
--- a/src/backend/executor/nodeNestloop.c
+++ b/src/backend/executor/nodeNestloop.c
@@ -68,6 +68,7 @@ ExecNestLoop(NestLoopState *node)
 	List	   *otherqual;
 	ExprContext *econtext;
 	ListCell   *lc;
+	int			skipflags;
 
 	/*
 	 * get information from the node
@@ -105,6 +106,37 @@ ExecNestLoop(NestLoopState *node)
 	 */
 	ResetExprContext(econtext);
 
+	skipflags = node->js.skipflags;
+
+	if (HasFlagSkipJoinAny(skipflags))
+	{
+		if (HasFlagSkipJoinBoth(skipflags))
+			return NULL;
+		else if (HasFlagSkipJoinInner(skipflags))
+		{
+			outerTupleSlot = ExecProcNode(outerPlan);
+
+			if (TupIsNull(outerTupleSlot))
+			{
+				ENL1_printf("no outer tuple, ending join");
+				return NULL;
+			}
+			return outerTupleSlot;
+		}
+		else
+		{
+			innerTupleSlot = ExecProcNode(innerPlan);
+			econtext->ecxt_innertuple = innerTupleSlot;
+
+			if (TupIsNull(innerTupleSlot))
+			{
+				return NULL;
+			}
+
+			return innerTupleSlot;
+		}
+	}
+
 	/*
 	 * Ok, everything is setup for the join so now loop until we return a
 	 * qualifying join tuple.
@@ -347,6 +379,13 @@ ExecInitNestLoop(NestLoop *node, EState *estate, int eflags)
 		eflags &= ~EXEC_FLAG_REWIND;
 	innerPlanState(nlstate) = ExecInitNode(innerPlan(node), estate, eflags);
 
+	nlstate->js.skipflags = 0;
+	if (ExecCanSkipJoin(outerPlanState(nlstate)))
+		nlstate->js.skipflags |= EXEC_SKIPJOIN_OUTER;
+
+	if (ExecCanSkipJoin(innerPlanState(nlstate)))
+		nlstate->js.skipflags |= EXEC_SKIPJOIN_INNER;
+
 	/*
 	 * tuple table initialization
 	 */
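The hash join, merge join and nested loop changes above all apply the same decision: if both sides are skippable the node returns no rows, if only one side is skippable the node passes tuples through from the other side, and otherwise it joins as normal. The flag macros themselves are defined in a part of the patch that is not shown here, so the following is only a minimal standalone sketch of that decision. The `SKETCH_` names, the bit values and the `SketchAction` enum are assumptions made for illustration and are not taken from the patch.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumed bit values; the patch's real definitions are not shown here. */
#define SKETCH_SKIPJOIN_OUTER	0x01
#define SKETCH_SKIPJOIN_INNER	0x02
#define SKETCH_SKIPJOIN_BOTH	(SKETCH_SKIPJOIN_OUTER | SKETCH_SKIPJOIN_INNER)

/* Hypothetical summary of what a join node does on each call. */
typedef enum
{
	SKETCH_RETURN_EMPTY,		/* both sides skippable: return NULL */
	SKETCH_PASS_OUTER,			/* inner skippable: return outer tuples */
	SKETCH_PASS_INNER,			/* outer skippable: return inner tuples */
	SKETCH_DO_JOIN				/* nothing skippable: join as usual */
} SketchAction;

/* Mirrors the flag setup done in the ExecInit*Join functions. */
static int
sketch_init_skipflags(bool can_skip_outer, bool can_skip_inner)
{
	int			skipflags = 0;

	if (can_skip_outer)
		skipflags |= SKETCH_SKIPJOIN_OUTER;
	if (can_skip_inner)
		skipflags |= SKETCH_SKIPJOIN_INNER;
	return skipflags;
}

/* Mirrors the order of the checks made at the top of each join node. */
static SketchAction
sketch_join_action(int skipflags)
{
	/* only the two skip bits are expected to be set */
	assert((skipflags & ~SKETCH_SKIPJOIN_BOTH) == 0);

	if ((skipflags & SKETCH_SKIPJOIN_BOTH) == SKETCH_SKIPJOIN_BOTH)
		return SKETCH_RETURN_EMPTY;
	if (skipflags & SKETCH_SKIPJOIN_INNER)
		return SKETCH_PASS_OUTER;
	if (skipflags & SKETCH_SKIPJOIN_OUTER)
		return SKETCH_PASS_INNER;
	return SKETCH_DO_JOIN;
}
```

Testing for the "both" case first matters in this sketch, since a value with both bits set would otherwise also satisfy the single-side tests.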
diff --git a/src/backend/optimizer/path/equivclass.c b/src/backend/optimizer/path/equivclass.c
index e5dd58e..0a665b2 100644
--- a/src/backend/optimizer/path/equivclass.c
+++ b/src/backend/optimizer/path/equivclass.c
@@ -49,8 +49,6 @@ static List *generate_join_implied_equalities_broken(PlannerInfo *root,
 										Relids outer_relids,
 										Relids nominal_inner_relids,
 										RelOptInfo *inner_rel);
-static Oid select_equality_operator(EquivalenceClass *ec,
-						 Oid lefttype, Oid righttype);
 static RestrictInfo *create_join_clause(PlannerInfo *root,
 				   EquivalenceClass *ec, Oid opno,
 				   EquivalenceMember *leftem,
@@ -1283,7 +1281,7 @@ generate_join_implied_equalities_broken(PlannerInfo *root,
  *
  * Returns InvalidOid if no operator can be found for this datatype combination
  */
-static Oid
+Oid
 select_equality_operator(EquivalenceClass *ec, Oid lefttype, Oid righttype)
 {
 	ListCell   *lc;
diff --git a/src/backend/optimizer/plan/analyzejoins.c b/src/backend/optimizer/plan/analyzejoins.c
index 773f8a4..71684f1 100644
--- a/src/backend/optimizer/plan/analyzejoins.c
+++ b/src/backend/optimizer/plan/analyzejoins.c
@@ -32,13 +32,21 @@
 #include "utils/lsyscache.h"
 
 /* local functions */
-static bool join_is_removable(PlannerInfo *root, SpecialJoinInfo *sjinfo);
+static bool innerjoin_is_removable(PlannerInfo *root, List *joinlist,
+					  RangeTblRef *removalrtr, Relids ignoredrels);
+static bool leftjoin_is_removable(PlannerInfo *root, SpecialJoinInfo *sjinfo);
+static bool relation_is_needed(PlannerInfo *root, Relids joinrelids,
+					  RelOptInfo *rel, Relids ignoredrels);
+static bool relation_has_foreign_key_for(PlannerInfo *root, RelOptInfo *rel,
+					  RelOptInfo *referencedrel, List *referencing_vars,
+					  List *index_vars, List *operator_list);
+static bool expressions_match_foreign_key(ForeignKeyInfo *fk, List *fkvars,
+					  List *indexvars, List *operators);
 static void remove_rel_from_query(PlannerInfo *root, int relid,
 					  Relids joinrelids);
 static List *remove_rel_from_joinlist(List *joinlist, int relid, int *nremoved);
 static Oid	distinct_col_search(int colno, List *colnos, List *opids);
 
-
 /*
  * remove_useless_joins
  *		Check for relations that don't actually need to be joined at all,
@@ -46,26 +54,91 @@ static Oid	distinct_col_search(int colno, List *colnos, List *opids);
  *
  * We are passed the current joinlist and return the updated list.  Other
  * data structures that have to be updated are accessible via "root".
+ *
+ * There are 2 methods here for removing joins. Joins such as LEFT JOINs
+ * which can be proved to be needless due to lack of use of any of the joining
+ * relation's columns and the existence of a unique index on a subset of the
+ * join columns can simply be removed from the query plan at plan time. For
+ * certain other join types we make use of foreign keys to attempt to prove the
+ * join is needless, though, for these we're unable to be certain that the join
+ * is not required at planning time, as if the plan is executed when
+ * pending foreign key triggers have not yet been fired, then the foreign key
+ * is effectively violated until these triggers have fired. Removing a join
+ * in such a case could cause a query to produce incorrect results.
+ *
+ * Instead we handle this case by marking the RangeTblEntry for the relation
+ * with a special flag which tells the executor that it's possible that joining
+ * to this relation may not be required. The executor may then check this flag
+ * and choose to skip the join based on if there are foreign key triggers
+ * pending or not.
  */
 List *
 remove_useless_joins(PlannerInfo *root, List *joinlist)
 {
 	ListCell   *lc;
+	Relids		removedrels = NULL;
 
 	/*
-	 * We are only interested in relations that are left-joined to, so we can
-	 * scan the join_info_list to find them easily.
+	 * Start by analyzing INNER JOINed relations in order to determine if any
+	 * of the relations can be ignored.
 	 */
 restart:
+	foreach(lc, joinlist)
+	{
+		RangeTblRef *rtr = (RangeTblRef *) lfirst(lc);
+
+		if (!IsA(rtr, RangeTblRef))
+			continue;
+
+		/* Don't try to remove this one again if we've already removed it */
+		if (root->simple_rte_array[rtr->rtindex]->skipJoinPossible)
+			continue;
+
+		/* skip if the join can't be removed */
+		if (!innerjoin_is_removable(root, joinlist, rtr, removedrels))
+			continue;
+
+		/*
+		 * Since we're not actually removing the join here, we need to maintain
+		 * a set of relations that we've "removed". When checking whether some
+		 * other relation can be removed, a reference to it from a relation
+		 * that we've already "removed" does not count: if those are the only
+		 * references, we can safely assume that the relation is not referenced
+		 * by any useful relation.
+		 */
+		removedrels = bms_add_member(removedrels, rtr->rtindex);
+
+		/*
+		 * Make a mark for the executor to say that it may be able to skip
+		 * joining to this relation.
+		 */
+		root->simple_rte_array[rtr->rtindex]->skipJoinPossible = true;
+
+		/*
+		 * Restart the scan.  This is necessary to ensure we find all removable
+		 * joins independently of their ordering. (note that since we've added
+		 * this relation to the removedrels, we may now realize that other
+		 * relations can also be removed as they're only referenced by the one
+		 * that we've just marked as possibly removable).
+		 */
+		goto restart;
+	}
+
+	/* Now process special joins. Currently only left joins are supported */
 	foreach(lc, root->join_info_list)
 	{
 		SpecialJoinInfo *sjinfo = (SpecialJoinInfo *) lfirst(lc);
 		int			innerrelid;
 		int			nremoved;
 
-		/* Skip if not removable */
-		if (!join_is_removable(root, sjinfo))
-			continue;
+		if (sjinfo->jointype == JOIN_LEFT)
+		{
+			/* Skip if not removable */
+			if (!leftjoin_is_removable(root, sjinfo))
+				continue;
+		}
+		else
+			continue; /* we don't support this join type */
 
 		/*
 		 * Currently, join_is_removable can only succeed when the sjinfo's
@@ -91,12 +164,11 @@ restart:
 		root->join_info_list = list_delete_ptr(root->join_info_list, sjinfo);
 
 		/*
-		 * Restart the scan.  This is necessary to ensure we find all
-		 * removable joins independently of ordering of the join_info_list
-		 * (note that removal of attr_needed bits may make a join appear
-		 * removable that did not before).  Also, since we just deleted the
-		 * current list cell, we'd have to have some kluge to continue the
-		 * list scan anyway.
+		 * Restart the scan.  This is necessary to ensure we find all removable
+		 * joins independently of their ordering. (note that removal of
+		 * attr_needed bits may make a join, inner or outer, appear removable
+		 * that did not before).   Also, since we just deleted the current list
+		 * cell, we'd have to have some kluge to continue the list scan anyway.
 		 */
 		goto restart;
 	}
@@ -136,8 +208,213 @@ clause_sides_match_join(RestrictInfo *rinfo, Relids outerrelids,
 }
 
 /*
- * join_is_removable
- *	  Check whether we need not perform this special join at all, because
+ * innerjoin_is_removable
+ *		True if the join to removalrtr can be removed.
+ *
+ * In order to prove a relation which is inner joined is not required we must
+ * be sure that the join would emit exactly 1 row on the join condition. This
+ * differs from the logic which is used for proving LEFT JOINs can be removed,
+ * where it's possible to just check that a unique index exists on the relation
+ * being removed which has a set of columns that is a subset of the columns
+ * seen in the join condition. If no matching row is found then left join would
+ * not remove the non-matched row from the result set. This is not the case
+ * with INNER JOINs, so here we must use foreign keys as proof that the 1 row
+ * exists before we can allow any joins to be removed.
+ */
+static bool
+innerjoin_is_removable(PlannerInfo *root, List *joinlist,
+					   RangeTblRef *removalrtr, Relids ignoredrels)
+{
+	ListCell   *lc;
+	RelOptInfo *removalrel;
+
+	removalrel = find_base_rel(root, removalrtr->rtindex);
+
+	/*
+	 * As foreign keys may only reference base rels which have unique indexes,
+	 * we needn't go any further if we're not dealing with a base rel, or if
+	 * the base rel has no unique indexes. We'd also better abort if the
+	 * rtekind is anything but a relation, as things like sub-queries may have
+	 * grouping or distinct clauses that would cause us not to be able to use
+	 * the foreign key to prove the existence of a row matching the join
+	 * condition. We also abort if the rel has no eclass joins as such a rel
+	 * could well be joined using some operator which is not an equality
+	 * operator, or the rel may not even be inner joined at all.
+	 *
+	 * Here we actually only check if the rel has any indexes, ideally we'd be
+	 * checking for unique indexes, but we could only determine that by looping
+	 * over the indexlist, and this is likely too expensive a check to be worth
+	 * it here.
+	 */
+	if (removalrel->reloptkind != RELOPT_BASEREL ||
+		removalrel->rtekind != RTE_RELATION ||
+		removalrel->has_eclass_joins == false ||
+		removalrel->indexlist == NIL)
+		return false;
+
+	/*
+	 * Currently we disallow the removal if we find any baserestrictinfo items
+	 * on the relation being removed. The reason for this is that these would
+	 * filter out rows and make it so the foreign key cannot prove that we'll
+	 * match exactly 1 row on the join condition. However, this check is
+	 * currently probably a bit overly strict as it should be possible to just
+	 * check and ensure that each Var seen in the baserestrictinfo is also
+	 * present in an eclass and if so, just translate and move the whole
+	 * baserestrictinfo over to the relation which has the foreign key to prove
+	 * that this join is not needed. e.g:
+	 * SELECT a.* FROM a INNER JOIN b ON a.b_id = b.id WHERE b.id = 1;
+	 * could become: SELECT a.* FROM a WHERE a.b_id = 1;
+	 */
+	if (removalrel->baserestrictinfo != NIL)
+		return false;
+
+	/*
+	 * Currently only eclass joins are supported, so if there are any non
+	 * eclass join quals then we'll report the join is non-removable.
+	 */
+	if (removalrel->joininfo != NIL)
+		return false;
+
+	/*
+	 * Now we'll search through each relation in the joinlist to see if we can
+	 * find a relation which has a foreign key which references removalrel on
+	 * the join condition. If we find a rel with a foreign key which matches
+	 * the join condition exactly, then we can be sure that exactly 1 row will
+	 * be matched on the join, if we also see that no Vars from the relation
+	 * are needed, then we can report the join as removable.
+	 */
+	foreach (lc, joinlist)
+	{
+		RangeTblRef	*rtr = (RangeTblRef *) lfirst(lc);
+		RelOptInfo	*rel;
+		ListCell	*lc2;
+		List		*referencing_vars;
+		List		*index_vars;
+		List		*operator_list;
+		Relids		 joinrelids;
+
+		/* we can't remove ourself, or anything other than RangeTblRefs */
+		if (rtr == removalrtr || !IsA(rtr, RangeTblRef))
+			continue;
+
+		rel = find_base_rel(root, rtr->rtindex);
+
+		/*
+		 * The only relation type that can help us is a base rel with at least
+		 * one foreign key defined. If there are no eclass joins then this rel
+		 * is not going to help us prove the removalrel is not needed.
+		 */
+		if (rel->reloptkind != RELOPT_BASEREL ||
+			rel->rtekind != RTE_RELATION ||
+			rel->has_eclass_joins == false ||
+			rel->fklist == NIL)
+			continue;
+
+		/*
+		 * Both rels have eclass joins, but do they have eclass joins to each
+		 * other? Skip this rel if it does not.
+		 */
+		if (!have_relevant_eclass_joinclause(root, rel, removalrel))
+			continue;
+
+		joinrelids = bms_union(rel->relids, removalrel->relids);
+
+		/* if any of the Vars from the relation are needed then abort */
+		if (relation_is_needed(root, joinrelids, removalrel, ignoredrels))
+			return false;
+
+		referencing_vars = NIL;
+		index_vars = NIL;
+		operator_list = NIL;
+
+		/* now populate the lists with the join condition Vars */
+		foreach(lc2, root->eq_classes)
+		{
+			EquivalenceClass *ec = (EquivalenceClass *) lfirst(lc2);
+
+			if (list_length(ec->ec_members) <= 1)
+				continue;
+
+			if (bms_overlap(removalrel->relids, ec->ec_relids) &&
+				bms_overlap(rel->relids, ec->ec_relids))
+			{
+				ListCell *lc3;
+				Var *refvar = NULL;
+				Var *idxvar = NULL;
+
+				/*
+				 * Look at each member of the eclass and try to find a Var from
+				 * each side of the join that we can append to the list of
+				 * columns that should be checked against each foreign key.
+				 *
+				 * The following logic does not allow for join removals to take
+				 * place for foreign keys that have duplicate columns on the
+				 * referencing side of the foreign key, such as:
+				 * (a,a) references (x,y)
+				 * The use case for such a foreign key is likely small enough
+				 * that we needn't bother making this code any more complex to
+				 * solve. If we find more than 1 Var from any of the rels then
+				 * we'll bail out.
+				 */
+				foreach (lc3, ec->ec_members)
+				{
+					EquivalenceMember *ecm = (EquivalenceMember *) lfirst(lc3);
+
+					Var *var = (Var *) ecm->em_expr;
+
+					if (!IsA(var, Var))
+						continue; /* Ignore Consts */
+
+					if (var->varno == rel->relid)
+					{
+						if (refvar != NULL)
+							return false;
+						refvar = var;
+					}
+
+					else if (var->varno == removalrel->relid)
+					{
+						if (idxvar != NULL)
+							return false;
+						idxvar = var;
+					}
+				}
+
+				if (refvar != NULL && idxvar != NULL)
+				{
+					Oid opno;
+					Oid reloid = root->simple_rte_array[refvar->varno]->relid;
+
+					if (!get_attnotnull(reloid, refvar->varattno))
+						return false;
+
+					/* grab the correct equality operator for these two vars */
+					opno = select_equality_operator(ec, refvar->vartype, idxvar->vartype);
+
+					if (!OidIsValid(opno))
+						return false;
+
+					referencing_vars = lappend(referencing_vars, refvar);
+					index_vars = lappend(index_vars, idxvar);
+					operator_list = lappend_oid(operator_list, opno);
+				}
+			}
+		}
+
+		if (referencing_vars != NIL)
+		{
+			if (relation_has_foreign_key_for(root, rel, removalrel,
+				referencing_vars, index_vars, operator_list))
+				return true; /* removalrel can be removed */
+		}
+	}
+
+	return false; /* can't remove join */
+}
+
+/*
+ * leftjoin_is_removable
+ *	  Check whether we need not perform this left join at all, because
  *	  it will just duplicate its left input.
  *
  * This is true for a left join for which the join condition cannot match
@@ -147,7 +424,7 @@ clause_sides_match_join(RestrictInfo *rinfo, Relids outerrelids,
  * above the join.
  */
 static bool
-join_is_removable(PlannerInfo *root, SpecialJoinInfo *sjinfo)
+leftjoin_is_removable(PlannerInfo *root, SpecialJoinInfo *sjinfo)
 {
 	int			innerrelid;
 	RelOptInfo *innerrel;
@@ -155,14 +432,14 @@ join_is_removable(PlannerInfo *root, SpecialJoinInfo *sjinfo)
 	Relids		joinrelids;
 	List	   *clause_list = NIL;
 	ListCell   *l;
-	int			attroff;
+
+	Assert(sjinfo->jointype == JOIN_LEFT);
 
 	/*
-	 * Must be a non-delaying left join to a single baserel, else we aren't
+	 * Must be a non-delaying join to a single baserel, else we aren't
 	 * going to be able to do anything with it.
 	 */
-	if (sjinfo->jointype != JOIN_LEFT ||
-		sjinfo->delay_upper_joins ||
+	if (sjinfo->delay_upper_joins ||
 		bms_membership(sjinfo->min_righthand) != BMS_SINGLETON)
 		return false;
 
@@ -205,52 +482,9 @@ join_is_removable(PlannerInfo *root, SpecialJoinInfo *sjinfo)
 	/* Compute the relid set for the join we are considering */
 	joinrelids = bms_union(sjinfo->min_lefthand, sjinfo->min_righthand);
 
-	/*
-	 * We can't remove the join if any inner-rel attributes are used above the
-	 * join.
-	 *
-	 * Note that this test only detects use of inner-rel attributes in higher
-	 * join conditions and the target list.  There might be such attributes in
-	 * pushed-down conditions at this join, too.  We check that case below.
-	 *
-	 * As a micro-optimization, it seems better to start with max_attr and
-	 * count down rather than starting with min_attr and counting up, on the
-	 * theory that the system attributes are somewhat less likely to be wanted
-	 * and should be tested last.
-	 */
-	for (attroff = innerrel->max_attr - innerrel->min_attr;
-		 attroff >= 0;
-		 attroff--)
-	{
-		if (!bms_is_subset(innerrel->attr_needed[attroff], joinrelids))
-			return false;
-	}
-
-	/*
-	 * Similarly check that the inner rel isn't needed by any PlaceHolderVars
-	 * that will be used above the join.  We only need to fail if such a PHV
-	 * actually references some inner-rel attributes; but the correct check
-	 * for that is relatively expensive, so we first check against ph_eval_at,
-	 * which must mention the inner rel if the PHV uses any inner-rel attrs as
-	 * non-lateral references.  Note that if the PHV's syntactic scope is just
-	 * the inner rel, we can't drop the rel even if the PHV is variable-free.
-	 */
-	foreach(l, root->placeholder_list)
-	{
-		PlaceHolderInfo *phinfo = (PlaceHolderInfo *) lfirst(l);
-
-		if (bms_is_subset(phinfo->ph_needed, joinrelids))
-			continue;			/* PHV is not used above the join */
-		if (bms_overlap(phinfo->ph_lateral, innerrel->relids))
-			return false;		/* it references innerrel laterally */
-		if (!bms_overlap(phinfo->ph_eval_at, innerrel->relids))
-			continue;			/* it definitely doesn't reference innerrel */
-		if (bms_is_subset(phinfo->ph_eval_at, innerrel->relids))
-			return false;		/* there isn't any other place to eval PHV */
-		if (bms_overlap(pull_varnos((Node *) phinfo->ph_var->phexpr),
-						innerrel->relids))
-			return false;		/* it does reference innerrel */
-	}
+	/* if the relation is referenced in the query then it cannot be removed */
+	if (relation_is_needed(root, joinrelids, innerrel, NULL))
+		return false;
 
 	/*
 	 * Search for mergejoinable clauses that constrain the inner rel against
@@ -367,6 +601,218 @@ join_is_removable(PlannerInfo *root, SpecialJoinInfo *sjinfo)
 	return false;
 }
 
+/*
+ * relation_is_needed
+ *		True if any of the Vars from this relation are required in the query
+ */
+static inline bool
+relation_is_needed(PlannerInfo *root, Relids joinrelids, RelOptInfo *rel, Relids ignoredrels)
+{
+	int		  attroff;
+	ListCell *l;
+
+	/*
+	 * rel is referenced if any of its attributes are used above the join.
+	 *
+	 * Note that this test only detects use of rel's attributes in higher
+	 * join conditions and the target list.  There might be such attributes in
+	 * pushed-down conditions at this join, too.  We check that case below.
+	 *
+	 * As a micro-optimization, it seems better to start with max_attr and
+	 * count down rather than starting with min_attr and counting up, on the
+	 * theory that the system attributes are somewhat less likely to be wanted
+	 * and should be tested last.
+	 */
+	for (attroff = rel->max_attr - rel->min_attr;
+		 attroff >= 0;
+		 attroff--)
+	{
+		if (!bms_is_subset(bms_difference(rel->attr_needed[attroff], ignoredrels), joinrelids))
+			return true;
+	}
+
+	/*
+	 * Similarly check that rel isn't needed by any PlaceHolderVars that will
+	 * be used above the join.  We only need to fail if such a PHV actually
+	 * references some of rel's attributes; but the correct check for that is
+	 * relatively expensive, so we first check against ph_eval_at, which must
+	 * mention rel if the PHV uses any of rel's attrs as non-lateral
+	 * references.  Note that if the PHV's syntactic scope is just rel, we
+	 * can't return true even if the PHV is variable-free.
+	 */
+	foreach(l, root->placeholder_list)
+	{
+		PlaceHolderInfo *phinfo = (PlaceHolderInfo *) lfirst(l);
+
+		if (bms_is_subset(phinfo->ph_needed, joinrelids))
+			continue;			/* PHV is not used above the join */
+		if (bms_overlap(phinfo->ph_lateral, rel->relids))
+			return true;		/* it references rel laterally */
+		if (!bms_overlap(phinfo->ph_eval_at, rel->relids))
+			continue;			/* it definitely doesn't reference rel */
+		if (bms_is_subset(phinfo->ph_eval_at, rel->relids))
+			return true;		/* there isn't any other place to eval PHV */
+		if (bms_overlap(pull_varnos((Node *) phinfo->ph_var->phexpr),
+						rel->relids))
+			return true;		/* it does reference rel */
+	}
+
+	return false; /* it does not reference rel */
+}
+
+/*
+ * relation_has_foreign_key_for
+ *	  Checks if rel has a foreign key which references referencedrel with the
+ *	  given list of expressions.
+ *
+ *	For the match to succeed:
+ *	  referencing_vars must match the columns defined in the foreign key.
+ *	  index_vars must match the columns defined in the index for the foreign key.
+ */
+static bool
+relation_has_foreign_key_for(PlannerInfo *root, RelOptInfo *rel,
+			RelOptInfo *referencedrel, List *referencing_vars,
+			List *index_vars, List *operator_list)
+{
+	ListCell *lc;
+	Oid		  refreloid;
+
+	/*
+	 * Look up the Oid of the referenced relation. We only want to look at
+	 * foreign keys on the referencing relation which reference this relation.
+	 */
+	refreloid = root->simple_rte_array[referencedrel->relid]->relid;
+
+	Assert(list_length(referencing_vars) > 0);
+	Assert(list_length(referencing_vars) == list_length(index_vars));
+	Assert(list_length(referencing_vars) == list_length(operator_list));
+
+	/*
+	 * Search through each foreign key on the referencing relation and try
+	 * to find one which references the relation in the join condition. If we
+	 * find one then we'll send the join conditions off to
+	 * expressions_match_foreign_key() to see if they match the foreign key.
+	 */
+	foreach(lc, rel->fklist)
+	{
+		ForeignKeyInfo *fk = (ForeignKeyInfo *) lfirst(lc);
+
+		if (fk->confrelid == refreloid)
+		{
+			if (expressions_match_foreign_key(fk, referencing_vars,
+				index_vars, operator_list))
+				return true;
+		}
+	}
+
+	return false;
+}
+
+/*
+ * expressions_match_foreign_key
+ *		True if the given fkvars, indexvars and operators will match
+ *		exactly 1 record in the referenced relation of the foreign key.
+ *
+ * Note: This function expects fkvars and indexvars to only contain Var types.
+ *		 Expression indexes are not supported by foreign keys.
+ */
+static bool
+expressions_match_foreign_key(ForeignKeyInfo *fk, List *fkvars,
+					List *indexvars, List *operators)
+{
+	ListCell  *lc;
+	ListCell  *lc2;
+	ListCell  *lc3;
+	Bitmapset *allitems;
+	Bitmapset *matcheditems;
+	int		   lstidx;
+	int		   col;
+
+	Assert(list_length(fkvars) == list_length(indexvars));
+	Assert(list_length(fkvars) == list_length(operators));
+
+	/*
+	 * Fast path out if there are not enough conditions to match each column
+	 * in the foreign key. Note that we cannot require the number of
+	 * expressions to equal the number of columns here, since that would
+	 * cause any duplicated expressions not to match.
+	 */
+	if (list_length(fkvars) < fk->conncols)
+		return false;
+
+	/*
+	 * We need to ensure that each foreign key column can be matched to a list
+	 * item, and we need to ensure that each list item can be matched to a
+	 * foreign key column. We do this by looping over each foreign key column
+	 * and checking that we can find an item in the list which matches the
+	 * current column, however this method does not allow us to ensure that no
+	 * additional items exist in the list. We could solve that by performing
+	 * another loop over each list item and check that it matches a foreign key
+	 * column, but that's a bit wasteful. Instead we'll use 2 bitmapsets, one
+	 * to store the 0 based index of each list item, and with the other we'll
+	 * store each list index that we've managed to match. After we're done
+	 * matching we'll just make sure that both bitmapsets are equal.
+	 */
+	allitems = NULL;
+	matcheditems = NULL;
+
+	/*
+	 * Build a bitmapset which contains each 0 based list index. It seems more
+	 * efficient to do this in reverse so that we allocate enough memory for
+	 * the bitmapset on first loop rather than reallocating each time we find
+	 * we need a bit more space.
+	 */
+	for (lstidx = list_length(fkvars) - 1; lstidx >= 0; lstidx--)
+		allitems = bms_add_member(allitems, lstidx);
+
+	for (col = 0; col < fk->conncols; col++)
+	{
+		bool  matched = false;
+
+		lstidx = 0;
+
+		forthree(lc, fkvars, lc2, indexvars, lc3, operators)
+		{
+			Var *expr = (Var *) lfirst(lc);
+			Var *idxexpr = (Var *) lfirst(lc2);
+			Oid  opr = lfirst_oid(lc3);
+
+			Assert(IsA(expr, Var));
+			Assert(IsA(idxexpr, Var));
+
+			/* Does this join qual match up to the current fkey column? */
+			if (fk->conkey[col] == expr->varattno &&
+				fk->confkey[col] == idxexpr->varattno &&
+				equality_ops_are_compatible(opr, fk->conpfeqop[col]))
+			{
+				matched = true;
+
+				/* mark this list item as matched */
+				matcheditems = bms_add_member(matcheditems, lstidx);
+
+				/*
+				 * Don't break here as there may be duplicate expressions
+				 * that we also need to match against.
+				 */
+			}
+			lstidx++;
+		}
+
+		/* punt if there's no match. */
+		if (!matched)
+			return false;
+	}
+
+	/*
+	 * Ensure that we managed to match every item in the list to a foreign key
+	 * column.
+	 */
+	if (!bms_equal(allitems, matcheditems))
+		return false;
+
+	return true; /* matched */
+}
+
 
 /*
  * Remove the target relid from the planner's data structures, having
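The two-way matching done by expressions_match_foreign_key() above can be tried out in isolation: every foreign key column must be matched by at least one join qual, and every join qual must match some foreign key column. The following is a minimal standalone sketch of that idea only. It uses a plain 32-bit mask in place of a Bitmapset and leaves out the operator compatibility check; the `SketchQual` and `SketchForeignKey` types and the function name are hypothetical and are not part of the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical join qual: referencing column = referenced column. */
typedef struct
{
	int			fk_attno;		/* attribute number on the referencing rel */
	int			ref_attno;		/* attribute number on the referenced rel */
} SketchQual;

/* Hypothetical cut-down foreign key description. */
typedef struct
{
	int			ncols;			/* number of columns in the foreign key */
	const int  *conkey;			/* referencing columns */
	const int  *confkey;		/* referenced columns */
} SketchForeignKey;

static bool
sketch_quals_match_fk(const SketchForeignKey *fk,
					  const SketchQual *quals, int nquals)
{
	uint32_t	allitems;
	uint32_t	matcheditems = 0;
	int			col;
	int			i;

	/* the sketch keeps one bit per qual in a 32-bit mask */
	assert(nquals > 0 && nquals < 32);

	/* fast path: too few quals to cover every foreign key column */
	if (nquals < fk->ncols)
		return false;

	/* one bit for each 0 based qual index */
	allitems = (UINT32_C(1) << nquals) - 1;

	for (col = 0; col < fk->ncols; col++)
	{
		bool		matched = false;

		for (i = 0; i < nquals; i++)
		{
			if (fk->conkey[col] == quals[i].fk_attno &&
				fk->confkey[col] == quals[i].ref_attno)
			{
				matched = true;
				/* no break: duplicate quals must be marked too */
				matcheditems |= UINT32_C(1) << i;
			}
		}

		/* this foreign key column has no matching qual */
		if (!matched)
			return false;
	}

	/* every qual must have matched some foreign key column */
	return matcheditems == allitems;
}
```

Comparing the two masks at the end is what rejects a qual list that covers the foreign key but also contains an unrelated qual, without needing a second pass over the list.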
diff --git a/src/backend/optimizer/plan/planner.c b/src/backend/optimizer/plan/planner.c
index fb74d6b..7ea0149 100644
--- a/src/backend/optimizer/plan/planner.c
+++ b/src/backend/optimizer/plan/planner.c
@@ -3712,6 +3712,7 @@ plan_cluster_use_sort(Oid tableOid, Oid indexOid)
 	rte->lateral = false;
 	rte->inh = false;
 	rte->inFromCl = true;
+	rte->skipJoinPossible = false;
 	query->rtable = list_make1(rte);
 
 	/* Set up RTE/RelOptInfo arrays */
diff --git a/src/backend/optimizer/prep/prepsecurity.c b/src/backend/optimizer/prep/prepsecurity.c
index b625b5c..74a0dca 100644
--- a/src/backend/optimizer/prep/prepsecurity.c
+++ b/src/backend/optimizer/prep/prepsecurity.c
@@ -311,6 +311,7 @@ expand_security_qual(PlannerInfo *root, List *tlist, int rt_index,
 			subrte->security_barrier = rte->security_barrier;
 			subrte->eref = copyObject(rte->eref);
 			subrte->inFromCl = true;
+			subrte->skipJoinPossible = false;
 			subquery->rtable = list_make1(subrte);
 
 			subrtr = makeNode(RangeTblRef);
diff --git a/src/backend/optimizer/util/plancat.c b/src/backend/optimizer/util/plancat.c
index b2becfa..fea198e 100644
--- a/src/backend/optimizer/util/plancat.c
+++ b/src/backend/optimizer/util/plancat.c
@@ -25,7 +25,9 @@
 #include "access/transam.h"
 #include "access/xlog.h"
 #include "catalog/catalog.h"
+#include "catalog/pg_constraint.h"
 #include "catalog/heap.h"
+#include "catalog/pg_type.h"
 #include "foreign/fdwapi.h"
 #include "miscadmin.h"
 #include "nodes/makefuncs.h"
@@ -38,6 +40,7 @@
 #include "parser/parsetree.h"
 #include "rewrite/rewriteManip.h"
 #include "storage/bufmgr.h"
+#include "utils/fmgroids.h"
 #include "utils/lsyscache.h"
 #include "utils/rel.h"
 #include "utils/snapmgr.h"
@@ -89,6 +92,12 @@ get_relation_info(PlannerInfo *root, Oid relationObjectId, bool inhparent,
 	Relation	relation;
 	bool		hasindex;
 	List	   *indexinfos = NIL;
+	List	   *fkinfos = NIL;
+	Relation	fkeyRel;
+	Relation	fkeyRelIdx;
+	ScanKeyData fkeyScankey;
+	SysScanDesc fkeyScan;
+	HeapTuple	tuple;
 
 	/*
 	 * We need not lock the relation since it was already locked, either by
@@ -384,6 +393,111 @@ get_relation_info(PlannerInfo *root, Oid relationObjectId, bool inhparent,
 
 	heap_close(relation, NoLock);
 
+	/* load foreign key constraints */
+	ScanKeyInit(&fkeyScankey,
+				Anum_pg_constraint_conrelid,
+				BTEqualStrategyNumber, F_OIDEQ,
+				ObjectIdGetDatum(relationObjectId));
+
+	fkeyRel = heap_open(ConstraintRelationId, AccessShareLock);
+	fkeyRelIdx = index_open(ConstraintRelidIndexId, AccessShareLock);
+	fkeyScan = systable_beginscan_ordered(fkeyRel, fkeyRelIdx, NULL, 1, &fkeyScankey);
+
+	while ((tuple = systable_getnext_ordered(fkeyScan, ForwardScanDirection)) != NULL)
+	{
+		Form_pg_constraint con = (Form_pg_constraint) GETSTRUCT(tuple);
+		ForeignKeyInfo *fkinfo;
+		Datum		adatum;
+		bool		isNull;
+		ArrayType  *arr;
+		int			nelements;
+
+		/* skip if not a foreign key */
+		if (con->contype != CONSTRAINT_FOREIGN)
+			continue;
+
+		/* we're not interested unless the fkey has been validated */
+		if (!con->convalidated)
+			continue;
+
+		fkinfo = (ForeignKeyInfo *) palloc(sizeof(ForeignKeyInfo));
+		fkinfo->conindid = con->conindid;
+		fkinfo->confrelid = con->confrelid;
+		fkinfo->convalidated = con->convalidated;
+		fkinfo->conrelid = con->conrelid;
+		fkinfo->confupdtype = con->confupdtype;
+		fkinfo->confdeltype = con->confdeltype;
+		fkinfo->confmatchtype = con->confmatchtype;
+
+		adatum = heap_getattr(tuple, Anum_pg_constraint_conkey,
+							RelationGetDescr(fkeyRel), &isNull);
+
+		if (isNull)
+			elog(ERROR, "null conkey for constraint %u",
+				HeapTupleGetOid(tuple));
+
+		arr = DatumGetArrayTypeP(adatum);		/* ensure not toasted */
+		nelements = ARR_DIMS(arr)[0];
+		if (ARR_NDIM(arr) != 1 ||
+			nelements < 0 ||
+			ARR_HASNULL(arr) ||
+			ARR_ELEMTYPE(arr) != INT2OID)
+			elog(ERROR, "conkey is not a 1-D smallint array");
+
+		fkinfo->conkey = (int16 *) ARR_DATA_PTR(arr);
+		fkinfo->conncols = nelements;
+
+		adatum = heap_getattr(tuple, Anum_pg_constraint_confkey,
+							RelationGetDescr(fkeyRel), &isNull);
+
+		if (isNull)
+			elog(ERROR, "null confkey for constraint %u",
+				HeapTupleGetOid(tuple));
+
+		arr = DatumGetArrayTypeP(adatum);		/* ensure not toasted */
+		nelements = ARR_DIMS(arr)[0];
+
+		if (ARR_NDIM(arr) != 1 ||
+			nelements < 0 ||
+			ARR_HASNULL(arr) ||
+			ARR_ELEMTYPE(arr) != INT2OID)
+			elog(ERROR, "confkey is not a 1-D smallint array");
+
+		/* sanity check */
+		if (nelements != fkinfo->conncols)
+			elog(ERROR, "number of confkey elements does not equal conkey elements");
+
+		fkinfo->confkey = (int16 *) ARR_DATA_PTR(arr);
+		adatum = heap_getattr(tuple, Anum_pg_constraint_conpfeqop,
+							RelationGetDescr(fkeyRel), &isNull);
+
+		if (isNull)
+			elog(ERROR, "null conpfeqop for constraint %u",
+				HeapTupleGetOid(tuple));
+
+		arr = DatumGetArrayTypeP(adatum);		/* ensure not toasted */
+		nelements = ARR_DIMS(arr)[0];
+
+		if (ARR_NDIM(arr) != 1 ||
+			nelements < 0 ||
+			ARR_HASNULL(arr) ||
+			ARR_ELEMTYPE(arr) != OIDOID)
+			elog(ERROR, "conpfeqop is not a 1-D Oid array");
+
+		/* sanity check */
+		if (nelements != fkinfo->conncols)
+			elog(ERROR, "number of conpfeqop elements does not equal conkey elements");
+
+		fkinfo->conpfeqop = (Oid *) ARR_DATA_PTR(arr);
+
+		fkinfos = lappend(fkinfos, fkinfo);
+	}
+
+	rel->fklist = fkinfos;
+	systable_endscan_ordered(fkeyScan);
+	index_close(fkeyRelIdx, AccessShareLock);
+	heap_close(fkeyRel, AccessShareLock);
+
 	/*
 	 * Allow a plugin to editorialize on the info we obtained from the
 	 * catalogs.  Actions might include altering the assumed relation size,
diff --git a/src/backend/optimizer/util/relnode.c b/src/backend/optimizer/util/relnode.c
index 4c76f54..58d80bb 100644
--- a/src/backend/optimizer/util/relnode.c
+++ b/src/backend/optimizer/util/relnode.c
@@ -115,6 +115,7 @@ build_simple_rel(PlannerInfo *root, int relid, RelOptKind reloptkind)
 	rel->lateral_relids = NULL;
 	rel->lateral_referencers = NULL;
 	rel->indexlist = NIL;
+	rel->fklist = NIL;
 	rel->pages = 0;
 	rel->tuples = 0;
 	rel->allvisfrac = 0;
@@ -377,6 +378,7 @@ build_join_rel(PlannerInfo *root,
 	joinrel->lateral_relids = NULL;
 	joinrel->lateral_referencers = NULL;
 	joinrel->indexlist = NIL;
+	joinrel->fklist = NIL;
 	joinrel->pages = 0;
 	joinrel->tuples = 0;
 	joinrel->allvisfrac = 0;
diff --git a/src/backend/parser/parse_relation.c b/src/backend/parser/parse_relation.c
index 478584d..cafeba9 100644
--- a/src/backend/parser/parse_relation.c
+++ b/src/backend/parser/parse_relation.c
@@ -1048,6 +1048,7 @@ addRangeTableEntry(ParseState *pstate,
 	rte->lateral = false;
 	rte->inh = inh;
 	rte->inFromCl = inFromCl;
+	rte->skipJoinPossible = false;
 
 	rte->requiredPerms = ACL_SELECT;
 	rte->checkAsUser = InvalidOid;		/* not set-uid by default, either */
@@ -1101,6 +1102,7 @@ addRangeTableEntryForRelation(ParseState *pstate,
 	rte->lateral = false;
 	rte->inh = inh;
 	rte->inFromCl = inFromCl;
+	rte->skipJoinPossible = false;
 
 	rte->requiredPerms = ACL_SELECT;
 	rte->checkAsUser = InvalidOid;		/* not set-uid by default, either */
@@ -1179,6 +1181,7 @@ addRangeTableEntryForSubquery(ParseState *pstate,
 	rte->lateral = lateral;
 	rte->inh = false;			/* never true for subqueries */
 	rte->inFromCl = inFromCl;
+	rte->skipJoinPossible = false;
 
 	rte->requiredPerms = 0;
 	rte->checkAsUser = InvalidOid;
@@ -1433,6 +1436,7 @@ addRangeTableEntryForFunction(ParseState *pstate,
 	rte->lateral = lateral;
 	rte->inh = false;			/* never true for functions */
 	rte->inFromCl = inFromCl;
+	rte->skipJoinPossible = false;
 
 	rte->requiredPerms = 0;
 	rte->checkAsUser = InvalidOid;
@@ -1505,6 +1509,7 @@ addRangeTableEntryForValues(ParseState *pstate,
 	rte->lateral = lateral;
 	rte->inh = false;			/* never true for values RTEs */
 	rte->inFromCl = inFromCl;
+	rte->skipJoinPossible = false;
 
 	rte->requiredPerms = 0;
 	rte->checkAsUser = InvalidOid;
@@ -1573,6 +1578,7 @@ addRangeTableEntryForJoin(ParseState *pstate,
 	rte->lateral = false;
 	rte->inh = false;			/* never true for joins */
 	rte->inFromCl = inFromCl;
+	rte->skipJoinPossible = false;
 
 	rte->requiredPerms = 0;
 	rte->checkAsUser = InvalidOid;
@@ -1673,6 +1679,7 @@ addRangeTableEntryForCTE(ParseState *pstate,
 	rte->lateral = false;
 	rte->inh = false;			/* never true for subqueries */
 	rte->inFromCl = inFromCl;
+	rte->skipJoinPossible = false;
 
 	rte->requiredPerms = 0;
 	rte->checkAsUser = InvalidOid;
diff --git a/src/backend/utils/adt/ruleutils.c b/src/backend/utils/adt/ruleutils.c
index 24ade6c..11ab914 100644
--- a/src/backend/utils/adt/ruleutils.c
+++ b/src/backend/utils/adt/ruleutils.c
@@ -843,6 +843,7 @@ pg_get_triggerdef_worker(Oid trigid, bool pretty)
 		oldrte->lateral = false;
 		oldrte->inh = false;
 		oldrte->inFromCl = true;
+		oldrte->skipJoinPossible = false;
 
 		newrte = makeNode(RangeTblEntry);
 		newrte->rtekind = RTE_RELATION;
@@ -853,6 +854,7 @@ pg_get_triggerdef_worker(Oid trigid, bool pretty)
 		newrte->lateral = false;
 		newrte->inh = false;
 		newrte->inFromCl = true;
+		newrte->skipJoinPossible = false;
 
 		/* Build two-element rtable */
 		memset(&dpns, 0, sizeof(dpns));
@@ -2508,6 +2510,7 @@ deparse_context_for(const char *aliasname, Oid relid)
 	rte->lateral = false;
 	rte->inh = false;
 	rte->inFromCl = true;
+	rte->skipJoinPossible = false;
 
 	/* Build one-element rtable */
 	dpns->rtable = list_make1(rte);
diff --git a/src/backend/utils/cache/lsyscache.c b/src/backend/utils/cache/lsyscache.c
index 552e498..aa81c7c 100644
--- a/src/backend/utils/cache/lsyscache.c
+++ b/src/backend/utils/cache/lsyscache.c
@@ -916,6 +916,33 @@ get_atttypetypmodcoll(Oid relid, AttrNumber attnum,
 	ReleaseSysCache(tp);
 }
 
+/*
+ * get_attnotnull
+ *
+ *		Given the relation id and the attribute number,
+ *		return the "attnotnull" field from the attribute relation.
+ */
+bool
+get_attnotnull(Oid relid, AttrNumber attnum)
+{
+	HeapTuple	tp;
+
+	tp = SearchSysCache2(ATTNUM,
+						 ObjectIdGetDatum(relid),
+						 Int16GetDatum(attnum));
+	if (HeapTupleIsValid(tp))
+	{
+		Form_pg_attribute att_tup = (Form_pg_attribute) GETSTRUCT(tp);
+		bool		result;
+
+		result = att_tup->attnotnull;
+		ReleaseSysCache(tp);
+		return result;
+	}
+	else
+		return false;
+}
+
 /*				---------- COLLATION CACHE ----------					 */
 
 /*
diff --git a/src/include/commands/trigger.h b/src/include/commands/trigger.h
index d0b0356..34a75e4 100644
--- a/src/include/commands/trigger.h
+++ b/src/include/commands/trigger.h
@@ -181,6 +181,7 @@ extern void ExecBSTruncateTriggers(EState *estate,
 extern void ExecASTruncateTriggers(EState *estate,
 					   ResultRelInfo *relinfo);
 
+extern bool AfterTriggerQueueIsEmpty(void);
 extern void AfterTriggerBeginXact(void);
 extern void AfterTriggerBeginQuery(void);
 extern void AfterTriggerEndQuery(EState *estate);
diff --git a/src/include/executor/executor.h b/src/include/executor/executor.h
index d167b49..3bb227e 100644
--- a/src/include/executor/executor.h
+++ b/src/include/executor/executor.h
@@ -64,6 +64,15 @@
 #define EXEC_FLAG_WITHOUT_OIDS	0x0040	/* force no OIDs in returned tuples */
 #define EXEC_FLAG_WITH_NO_DATA	0x0080	/* rel scannability doesn't matter */
 
+/* Flags used for JoinState.skipflags */
+#define EXEC_SKIPJOIN_INNER		0x0001 /* Skip inner side of join */
+#define EXEC_SKIPJOIN_OUTER		0x0002 /* Skip outer side of join */
+#define EXEC_SKIPJOIN_BOTH		(EXEC_SKIPJOIN_OUTER|EXEC_SKIPJOIN_INNER)
+
+#define HasFlagSkipJoinInner(n)	((n) & EXEC_SKIPJOIN_INNER)
+#define HasFlagSkipJoinOuter(n)	((n) & EXEC_SKIPJOIN_OUTER)
+#define HasFlagSkipJoinBoth(n)	(((n) & EXEC_SKIPJOIN_BOTH) == EXEC_SKIPJOIN_BOTH)
+#define HasFlagSkipJoinAny(n)	((n) != 0)
 
 /*
  * ExecEvalExpr was formerly a function containing a switch statement;
@@ -337,6 +346,7 @@ extern ProjectionInfo *ExecBuildProjectionInfo(List *targetList,
 						ExprContext *econtext,
 						TupleTableSlot *slot,
 						TupleDesc inputDesc);
+extern bool ExecCanSkipJoin(PlanState *planstate);
 extern void ExecAssignProjectionInfo(PlanState *planstate,
 						 TupleDesc inputDesc);
 extern void ExecFreeExprContext(PlanState *planstate);
diff --git a/src/include/nodes/execnodes.h b/src/include/nodes/execnodes.h
index 39d2c10..8d1b3dc 100644
--- a/src/include/nodes/execnodes.h
+++ b/src/include/nodes/execnodes.h
@@ -1520,6 +1520,7 @@ typedef struct JoinState
 	PlanState	ps;
 	JoinType	jointype;
 	List	   *joinqual;		/* JOIN quals (in addition to ps.qual) */
+	int			skipflags;
 } JoinState;
 
 /* ----------------
diff --git a/src/include/nodes/parsenodes.h b/src/include/nodes/parsenodes.h
index cef9544..7504c53 100644
--- a/src/include/nodes/parsenodes.h
+++ b/src/include/nodes/parsenodes.h
@@ -813,6 +813,8 @@ typedef struct RangeTblEntry
 	bool		lateral;		/* subquery, function, or values is LATERAL? */
 	bool		inh;			/* inheritance requested? */
 	bool		inFromCl;		/* present in FROM clause? */
+	bool		skipJoinPossible; /* it may be possible to not bother joining
+								   * this relation at all */
 	AclMode		requiredPerms;	/* bitmask of required access permissions */
 	Oid			checkAsUser;	/* if valid, check access as this role */
 	Bitmapset  *selectedCols;	/* columns needing SELECT permission */
diff --git a/src/include/nodes/relation.h b/src/include/nodes/relation.h
index f1a0504..d762660 100644
--- a/src/include/nodes/relation.h
+++ b/src/include/nodes/relation.h
@@ -358,6 +358,8 @@ typedef struct PlannerInfo
  *		lateral_referencers - relids of rels that reference this one laterally
  *		indexlist - list of IndexOptInfo nodes for relation's indexes
  *					(always NIL if it's not a table)
+ *		fklist - list of ForeignKeyInfo's for relation's foreign key
+ *					constraints. (always NIL if it's not a table)
  *		pages - number of disk pages in relation (zero if not a table)
  *		tuples - number of tuples in relation (not considering restrictions)
  *		allvisfrac - fraction of disk pages that are marked all-visible
@@ -451,6 +453,7 @@ typedef struct RelOptInfo
 	Relids		lateral_relids; /* minimum parameterization of rel */
 	Relids		lateral_referencers;	/* rels that reference me laterally */
 	List	   *indexlist;		/* list of IndexOptInfo */
+	List	   *fklist;			/* list of ForeignKeyInfo */
 	BlockNumber pages;			/* size estimates derived from pg_class */
 	double		tuples;
 	double		allvisfrac;
@@ -541,6 +544,51 @@ typedef struct IndexOptInfo
 	bool		amhasgetbitmap; /* does AM have amgetbitmap interface? */
 } IndexOptInfo;
 
+/*
+ * ForeignKeyInfo
+ *		Used to store pg_constraint records for foreign key constraints for use
+ *		by the planner.
+ *
+ *		conindid - The index which supports the foreign key
+ *
+ *		confrelid - The relation that is referenced by this foreign key
+ *
+ *		convalidated - True if the foreign key has been validated.
+ *
+ *		conrelid - The Oid of the relation that the foreign key belongs to
+ *
+ *		confupdtype - ON UPDATE action for when the referenced table is updated
+ *
+ *		confdeltype - ON DELETE action, controls what to do when a record is
+ *					deleted from the referenced table.
+ *
+ *		confmatchtype - foreign key match type, e.g MATCH FULL, MATCH PARTIAL
+ *
+ *		conncols - Number of columns defined in the foreign key
+ *
+ *		conkey - An array of conncols elements to store the varattno of the
+ *					columns on the referencing side of the foreign key
+ *
+ *		confkey - An array of conncols elements to store the varattno of the
+ *					columns on the referenced side of the foreign key
+ *
+ *		conpfeqop - An array of conncols elements to store the operators for
+ *					PK = FK comparisons
+ */
+typedef struct ForeignKeyInfo
+{
+	Oid			conindid;		/* index supporting this constraint */
+	Oid			confrelid;		/* relation referenced by foreign key */
+	bool		convalidated;	/* constraint has been validated? */
+	Oid			conrelid;		/* relation this constraint constrains */
+	char		confupdtype;	/* foreign key's ON UPDATE action */
+	char		confdeltype;	/* foreign key's ON DELETE action */
+	char		confmatchtype;	/* foreign key's match type */
+	int			conncols;		/* number of columns referenced */
+	int16	   *conkey;			/* Columns of conrelid that the constraint applies to */
+	int16	   *confkey;		/* columns of confrelid that foreign key references */
+	Oid		   *conpfeqop;		/* Operator list for comparing PK to FK */
+} ForeignKeyInfo;
 
 /*
  * EquivalenceClasses
diff --git a/src/include/optimizer/paths.h b/src/include/optimizer/paths.h
index 9b22fda..b11ae78 100644
--- a/src/include/optimizer/paths.h
+++ b/src/include/optimizer/paths.h
@@ -108,10 +108,13 @@ extern EquivalenceClass *get_eclass_for_sort_expr(PlannerInfo *root,
 						 Relids rel,
 						 bool create_it);
 extern void generate_base_implied_equalities(PlannerInfo *root);
+extern void remove_rel_from_eclass(PlannerInfo *root, int relid);
 extern List *generate_join_implied_equalities(PlannerInfo *root,
 								 Relids join_relids,
 								 Relids outer_relids,
 								 RelOptInfo *inner_rel);
+extern Oid select_equality_operator(EquivalenceClass *ec, Oid lefttype,
+								 Oid righttype);
 extern bool exprs_known_equal(PlannerInfo *root, Node *item1, Node *item2);
 extern void add_child_rel_equivalences(PlannerInfo *root,
 						   AppendRelInfo *appinfo,
diff --git a/src/include/utils/lsyscache.h b/src/include/utils/lsyscache.h
index 07d24d4..910190d 100644
--- a/src/include/utils/lsyscache.h
+++ b/src/include/utils/lsyscache.h
@@ -68,6 +68,7 @@ extern Oid	get_atttype(Oid relid, AttrNumber attnum);
 extern int32 get_atttypmod(Oid relid, AttrNumber attnum);
 extern void get_atttypetypmodcoll(Oid relid, AttrNumber attnum,
 					  Oid *typid, int32 *typmod, Oid *collid);
+extern bool get_attnotnull(Oid relid, AttrNumber attnum);
 extern char *get_collation_name(Oid colloid);
 extern char *get_constraint_name(Oid conoid);
 extern Oid	get_opclass_family(Oid opclass);
diff --git a/src/test/regress/expected/join.out b/src/test/regress/expected/join.out
index 2501184..4299227 100644
--- a/src/test/regress/expected/join.out
+++ b/src/test/regress/expected/join.out
@@ -3276,6 +3276,32 @@ select i8.* from int8_tbl i8 left join (select f1 from int4_tbl group by f1) i4
 (1 row)
 
 rollback;
+begin work;
+create temp table c (
+  id int primary key
+);
+create temp table b (
+  id int primary key,
+  c_id int not null,
+  constraint b_c_id_fkey foreign key (c_id) references c deferrable
+);
+create temp table a (
+  id int primary key,
+  b_id int not null,
+  constraint a_b_id_fkey foreign key (b_id) references b deferrable
+);
+insert into c (id) values(1);
+insert into b (id, c_id) values(1,1);
+insert into a (id, b_id) values(1,1);
+set constraints b_c_id_fkey deferred;
+update c set id = 2 where id=1;
+-- ensure inner join to be is not skipped.
+select b.* from b inner join c on b.c_id = c.id;
+ id | c_id 
+----+------
+(0 rows)
+
+rollback;
 create temp table parent (k int primary key, pd int);
 create temp table child (k int unique, cd int);
 insert into parent values (1, 10), (2, 20), (3, 30);
diff --git a/src/test/regress/sql/join.sql b/src/test/regress/sql/join.sql
index 718e1d9..e226e4e 100644
--- a/src/test/regress/sql/join.sql
+++ b/src/test/regress/sql/join.sql
@@ -977,6 +977,34 @@ select i8.* from int8_tbl i8 left join (select f1 from int4_tbl group by f1) i4
 
 rollback;
 
+begin work;
+
+create temp table c (
+  id int primary key
+);
+create temp table b (
+  id int primary key,
+  c_id int not null,
+  constraint b_c_id_fkey foreign key (c_id) references c deferrable
+);
+create temp table a (
+  id int primary key,
+  b_id int not null,
+  constraint a_b_id_fkey foreign key (b_id) references b deferrable
+);
+
+insert into c (id) values(1);
+insert into b (id, c_id) values(1,1);
+insert into a (id, b_id) values(1,1);
+
+set constraints b_c_id_fkey deferred;
+update c set id = 2 where id=1;
+
+-- ensure inner join to be is not skipped.
+select b.* from b inner join c on b.c_id = c.id;
+
+rollback;
+
 create temp table parent (k int primary key, pd int);
 create temp table child (k int unique, cd int);
 insert into parent values (1, 10), (2, 20), (3, 30);

#37

Simon Riggs

simon@2ndQuadrant.com

about 11 years ago

In reply to: David Rowley (#36)

Re: Patch to support SEMI and ANTI join removal

On 15 October 2014 11:03, David Rowley <dgrowleyml@gmail.com> wrote:

The explain analyze from the above query looks like:
test=# explain (analyze, costs off, timing off) select count(*) from t1
inner join t2 on t1.t2_id=t2.id;
                            QUERY PLAN
------------------------------------------------------------------
 Aggregate (actual rows=1 loops=1)
   ->  Nested Loop (actual rows=1000000 loops=1)
         ->  Seq Scan on t1 (actual rows=1000000 loops=1)
         ->  Index Only Scan using t2_pkey on t2 (never executed)
               Index Cond: (id = t1.t2_id)
               Heap Fetches: 0
 Execution time: 124.990 ms
(7 rows)

As you can see the scan on t2 never occurred.

Very good, happy to see this happening (yay FKs!) and with
PostgreSQL-style rigour.

I've reviewed the patch from cold and I have a few comments.

The plan you end up with here works quite differently from left outer
join removal, where the join is simply absent. That inconsistency
causes most of the other problems I see.

I propose that we keep track of whether there are any potentially
skippable joins at the top of the plan. When we begin execution we do
a single if test to see if there is run-time work to do. If we pass
the run-time tests we then descend the tree and prune the plan to
completely remove unnecessary nodes. We end with an EXPLAIN and
EXPLAIN ANALYZE that look like this:

                            QUERY PLAN
------------------------------------------------------------------
 Aggregate (actual rows=1 loops=1)
   ->  Seq Scan on t1 (actual rows=1000000 loops=1)

Doing that removes all the overheads and complexity; it also matches
how join removal currently works.

The alternative is accepting some pretty horrible additional code in
most join types, plus a small regression on nested loop joins which I
would have to call out as regrettably unacceptable. (Horrible in the
sense that we don't want that code, not that David's code is poor.)

The tests in the patch are pretty poor. We should use EXPLAINs to
show a join removal that works and a join removal that fails, with a
few of the main permutations.

--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

#38

David Rowley

dgrowleyml@gmail.com

about 11 years ago

In reply to: Simon Riggs (#37)

1 attachment(s)

Re: Patch to support SEMI and ANTI join removal

On Sun, Nov 16, 2014 at 10:09 AM, Simon Riggs <simon@2ndquadrant.com> wrote:

On 15 October 2014 11:03, David Rowley <dgrowleyml@gmail.com> wrote:

The explain analyze from the above query looks like:
test=# explain (analyze, costs off, timing off) select count(*) from t1
inner join t2 on t1.t2_id=t2.id;
                            QUERY PLAN
------------------------------------------------------------------
 Aggregate (actual rows=1 loops=1)
   ->  Nested Loop (actual rows=1000000 loops=1)
         ->  Seq Scan on t1 (actual rows=1000000 loops=1)
         ->  Index Only Scan using t2_pkey on t2 (never executed)
               Index Cond: (id = t1.t2_id)
               Heap Fetches: 0
 Execution time: 124.990 ms
(7 rows)

As you can see the scan on t2 never occurred.

Very good, happy to see this happening (yay FKs!) and with
PostgreSQL-style rigour.

I've reviewed the patch from cold and I have a few comments.

Thanks!

The plan you end up with here works quite differently from left outer
join removal, where the join is simply absent. That inconsistency
causes most of the other problems I see.

I propose that we keep track of whether there are any potentially
skippable joins at the top of the plan. When we begin execution we do
a single if test to see if there is run-time work to do. If we pass
the run-time tests we then descend the tree and prune the plan to
completely remove unnecessary nodes. We end with an EXPLAIN and
EXPLAIN ANALYZE that look like this:

                            QUERY PLAN
------------------------------------------------------------------
 Aggregate (actual rows=1 loops=1)
   ->  Seq Scan on t1 (actual rows=1000000 loops=1)

Doing that removes all the overheads and complexity; it also matches
how join removal currently works.

This sounds much cleaner than what I have at the moment. You say, though,
that EXPLAIN would look like that, and I don't think that's quite true:
EXPLAIN would still show the un-pruned plan, as the pruning would be done
at executor start-up. Would it cause problems for EXPLAIN to show a
different-looking plan than EXPLAIN ANALYZE?

I'll need to look into how the plan is stored in the case of PREPARE
statements, as no doubt I can't go vandalising any plans that are stored in
the PREPARE hashtable. I'd need to make a copy first, unless that's already
done for me. But I guess I'd only have to do that if some flag on
PlannerInfo (say, hasSkippableNodes) was true, so the overhead of such a
copy would likely be regained by skipping some joins.
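
As a rough illustration of the copy-then-prune idea discussed above, here is a minimal, self-contained sketch. It is not PostgreSQL code: PlanNode, make_node, copy_and_prune and plan_for_execution are hypothetical names invented for this example, and the has_skippable_joins and trigger_queue_empty arguments merely stand in for a top-level planner flag and for the AfterTriggerQueueIsEmpty() check from the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical, heavily simplified plan node; not PostgreSQL's Plan struct. */
typedef enum { NODE_SCAN, NODE_JOIN, NODE_AGG } NodeKind;

typedef struct PlanNode
{
	NodeKind	kind;
	bool		skippable;	/* scan the planner marked as possibly removable */
	struct PlanNode *left;	/* outer (or only) child */
	struct PlanNode *right;	/* inner child, joins only */
} PlanNode;

PlanNode *
make_node(NodeKind kind, bool skippable, PlanNode *left, PlanNode *right)
{
	PlanNode   *n = malloc(sizeof(PlanNode));

	if (n == NULL)
		abort();
	n->kind = kind;
	n->skippable = skippable;
	n->left = left;
	n->right = right;
	return n;
}

/*
 * Build a pruned deep copy of "node".  The input tree is only read, so a
 * cached plan would be left untouched.  Assumes at most one side of any join
 * is skippable; if both are, this sketch returns NULL for that join.
 */
PlanNode *
copy_and_prune(const PlanNode *node)
{
	PlanNode   *l;
	PlanNode   *r;

	if (node == NULL)
		return NULL;
	if (node->kind == NODE_SCAN && node->skippable)
		return NULL;			/* drop the redundant scan */

	l = copy_and_prune(node->left);
	r = copy_and_prune(node->right);

	if (node->kind == NODE_JOIN && (l == NULL || r == NULL))
		return (l != NULL) ? l : r;	/* the join collapses to its survivor */

	return make_node(node->kind, false, l, r);
}

/* The single start-up test: prune only when it is both useful and safe. */
const PlanNode *
plan_for_execution(const PlanNode *cached, bool has_skippable_joins,
				   bool trigger_queue_empty)
{
	if (!has_skippable_joins || !trigger_queue_empty)
		return cached;
	return copy_and_prune(cached);
}
```

With an Aggregate over a join of t1 and a skippable t2, plan_for_execution would hand back a fresh Aggregate over a copy of the t1 scan, while the cached tree keeps its join. When the flag is false, or triggers are pending, the cached tree is returned as-is and no copy is made.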

The alternative is accepting some pretty horrible additional code in
most join types, plus a small regression on nested loop joins which I
would have to call out as regrettably unacceptable. (Horrible in the
sense that we don't want that code, not that David's code is poor.)

Yeah, it is quite horrid. I did try to keep it as simple and as
non-invasive as possible, but for nested loop it seemed there was just no
better way.
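
To make the nested loop overhead concrete, the following toy sketch mimics the shape of the check the patch adds to ExecNestLoop. Only the EXEC_SKIPJOIN_INNER value is taken from the patch; ToyScan, ToyNestLoop, toy_scan_next and toy_nestloop_next are hypothetical stand-ins, and the inner side is assumed to be unique, as a primary key would be.

```c
#include <assert.h>
#include <stddef.h>

/* Flag value as defined in the patch's executor.h hunk. */
#define EXEC_SKIPJOIN_INNER		0x0001

/* Hypothetical toy scan over an int array; "fetches" counts rows returned. */
typedef struct ToyScan
{
	const int  *rows;
	size_t		nrows;
	size_t		pos;
	size_t		fetches;
} ToyScan;

typedef struct ToyNestLoop
{
	ToyScan    *outer;
	ToyScan    *inner;
	int			skipflags;
} ToyNestLoop;

/* Return the next row of the scan, or NULL when it is exhausted. */
const int *
toy_scan_next(ToyScan *scan)
{
	if (scan->pos >= scan->nrows)
		return NULL;
	scan->fetches++;
	return &scan->rows[scan->pos++];
}

/* Return the next outer row that has a match in inner, or NULL at the end. */
const int *
toy_nestloop_next(ToyNestLoop *nl)
{
	const int  *outer_row;

	/* Evaluated on every call: this is the per-tuple cost being discussed. */
	if (nl->skipflags & EXEC_SKIPJOIN_INNER)
		return toy_scan_next(nl->outer);

	while ((outer_row = toy_scan_next(nl->outer)) != NULL)
	{
		const int  *inner_row;

		nl->inner->pos = 0;		/* rescan the inner side */
		while ((inner_row = toy_scan_next(nl->inner)) != NULL)
		{
			if (*inner_row == *outer_row)
				return outer_row;	/* inner is unique, so stop here */
		}
	}
	return NULL;
}
```

With the flag set, the inner scan's fetch counter stays at zero, which corresponds to the "(never executed)" line in the EXPLAIN ANALYZE output quoted earlier. The flag test still runs once per returned tuple, though, which is the regression a prune-at-start-up design would avoid.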

The tests in the patch are pretty poor. We should use EXPLAINs to
show a join removal that works and a join removal that fails, with a
few of the main permutations.

Agreed. To be honest I abandoned the tests due to a problem with EXPLAIN
ANALYZE outputting the variable timing information at the bottom. There's
no way to disable this! So that makes testing much harder.

I added myself to the list of complainers over here ->
/messages/by-id/CAApHDvoVzBTzLJbD9VfaznWo6jooK1k6-7rFQ8zYM9H7ndCcSA@mail.gmail.com
but the proposed solution (diff tool which supports regex matching) is not
all that simple, and I've not gotten around to attempting to make one yet.
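
For illustration only, a much cruder workaround than a regex-aware diff tool would be to mask the variable timing line before comparing output. The sketch below is not the tool proposed in the linked thread; mask_timing is a hypothetical helper, and it only handles the "Execution time:" line shown in the EXPLAIN ANALYZE output above.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Copy "line" into "out".  If the line is an (optionally indented)
 * "Execution time: ..." line, replace the variable part with "N ms" so that
 * two runs of the same query produce identical text.
 */
void
mask_timing(const char *line, char *out, size_t outlen)
{
	static const char prefix[] = "Execution time: ";
	const char *p = line;

	/* skip leading spaces, remembering how many there were */
	while (*p == ' ')
		p++;

	if (strncmp(p, prefix, sizeof(prefix) - 1) == 0)
		snprintf(out, outlen, "%.*s%sN ms", (int) (p - line), line, prefix);
	else
		snprintf(out, outlen, "%s", line);
}
```

Passing each line of the plan through such a filter would turn "Execution time: 124.990 ms" into "Execution time: N ms" and leave every other line alone, so the expected output file could hold the masked form.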

I've also attached a rebased patch, as the old one is no longer applying.

Regards

David Rowley

Attachments:

inner_join_removals_2014-11-16_3a40b4f.patch (application/octet-stream)

diff --git a/src/backend/commands/trigger.c b/src/backend/commands/trigger.c
index ebccfea..ea26615 100644
--- a/src/backend/commands/trigger.c
+++ b/src/backend/commands/trigger.c
@@ -3889,6 +3889,17 @@ afterTriggerInvokeEvents(AfterTriggerEventList *events,
 	return all_fired;
 }
 
+/* ----------
+ * AfterTriggerQueueIsEmpty()
+ *
+ *	True if there are no pending triggers in the queue.
+ * ----------
+ */
+bool
+AfterTriggerQueueIsEmpty(void)
+{
+	return (afterTriggers.query_depth == -1 && afterTriggers.events.head == NULL);
+}
 
 /* ----------
  * AfterTriggerBeginXact()
diff --git a/src/backend/executor/execUtils.c b/src/backend/executor/execUtils.c
index d5e1273..96452dc 100644
--- a/src/backend/executor/execUtils.c
+++ b/src/backend/executor/execUtils.c
@@ -45,6 +45,7 @@
 #include "access/relscan.h"
 #include "access/transam.h"
 #include "catalog/index.h"
+#include "commands/trigger.h"
 #include "executor/execdebug.h"
 #include "nodes/nodeFuncs.h"
 #include "parser/parsetree.h"
@@ -661,6 +662,76 @@ get_last_attnums(Node *node, ProjectionInfo *projInfo)
 								  (void *) projInfo);
 }
 
+/*
+ * ExecCanSkipJoin
+ *		Returns True if the join node can be safely skipped without affecting
+ *		query results.
+ */
+bool
+ExecCanSkipJoin(PlanState *planstate)
+{
+	/*
+	 * Currently the only possibility we have of skipping this join node is if
+	 * a foreign key can prove that the join condition will match exactly one
+	 * row. The checks for this were all done at planning time, and any
+	 * relation for which a foreign key could prove this was marked as
+	 * skipJoinPossible. Even if this flag is true, it still does not mean
+	 * that we can skip joining to this relation. If any changes have been
+	 * made to records in the referenced relation, we may not yet have fired
+	 * the foreign key triggers to cascade those changes to the referencing
+	 * relations; in this case we mustn't skip the join, as we could produce
+	 * wrong results by doing so.
+	 *
+	 * Currently this code is quite naive, as we won't allow join skipping if
+	 * there are *any* pending foreign key triggers, on any relation. It may be
+	 * worthwhile to improve this to check if there's any pending triggers for
+	 * the referencing relation in the join.
+	 */
+	if (!AfterTriggerQueueIsEmpty())
+		return false;
+
+	while (planstate != NULL)
+	{
+		switch (nodeTag(planstate))
+		{
+			case T_SeqScanState:
+			case T_IndexOnlyScanState:
+				{
+					Scan *scan = (Scan *) planstate->plan;
+					RangeTblEntry *rte;
+					rte = (RangeTblEntry *) list_nth(planstate->state->es_range_table, scan->scanrelid - 1);
+
+					if (rte->skipJoinPossible)
+						return true;
+					else
+						return false;
+				}
+				break;
+
+			case T_HashState:
+				/* descend to the scan node */
+				planstate = planstate->lefttree;
+				break;
+
+			case T_HashJoinState:
+				{
+					HashJoinState *hjstate = (HashJoinState *)planstate;
+
+					if (HasFlagSkipJoinBoth(hjstate->js.skipflags))
+						return true;
+					else
+						return false;
+				}
+				break;
+
+			default:
+				return false;
+		}
+	}
+	return false;
+}
+
+
 /* ----------------
  *		ExecAssignProjectionInfo
  *
diff --git a/src/backend/executor/nodeHashjoin.c b/src/backend/executor/nodeHashjoin.c
index 7eec3f3..997314e 100644
--- a/src/backend/executor/nodeHashjoin.c
+++ b/src/backend/executor/nodeHashjoin.c
@@ -71,6 +71,7 @@ ExecHashJoin(HashJoinState *node)
 	TupleTableSlot *outerTupleSlot;
 	uint32		hashvalue;
 	int			batchno;
+	int			skipflags;
 
 	/*
 	 * get information from HashJoin node
@@ -113,6 +114,32 @@ ExecHashJoin(HashJoinState *node)
 		switch (node->hj_JoinState)
 		{
 			case HJ_BUILD_HASHTABLE:
+				skipflags = node->js.skipflags;
+
+				/* Can we skip the whole thing? */
+				if (HasFlagSkipJoinBoth(skipflags))
+					return NULL;
+
+				if (HasFlagSkipJoinInner(skipflags))
+				{
+					node->hj_FirstOuterTupleSlot = ExecProcNode(outerNode);
+					if (TupIsNull(node->hj_FirstOuterTupleSlot))
+						return NULL;
+
+					return node->hj_FirstOuterTupleSlot;
+				}
+				else if (HasFlagSkipJoinOuter(skipflags))
+				{
+					TupleTableSlot *result;
+
+					/* bypass the hash node to the node below it */
+					result = ExecProcNode(hashNode->ps.lefttree);
+
+					if (TupIsNull(result))
+						return NULL;
+
+					return result;
+				}
 
 				/*
 				 * First time through: build hash table for inner relation.
@@ -489,6 +516,13 @@ ExecInitHashJoin(HashJoin *node, EState *estate, int eflags)
 	outerPlanState(hjstate) = ExecInitNode(outerNode, estate, eflags);
 	innerPlanState(hjstate) = ExecInitNode((Plan *) hashNode, estate, eflags);
 
+	hjstate->js.skipflags = 0;
+	if (ExecCanSkipJoin(outerPlanState(hjstate)))
+		hjstate->js.skipflags |= EXEC_SKIPJOIN_OUTER;
+
+	if (ExecCanSkipJoin(innerPlanState(hjstate)))
+		hjstate->js.skipflags |= EXEC_SKIPJOIN_INNER;
+
 	/*
 	 * tuple table initialization
 	 */
@@ -578,6 +612,7 @@ ExecInitHashJoin(HashJoin *node, EState *estate, int eflags)
 		Assert(IsA(hclause, OpExpr));
 		lclauses = lappend(lclauses, linitial(fstate->args));
 		rclauses = lappend(rclauses, lsecond(fstate->args));
+
 		hoperators = lappend_oid(hoperators, hclause->opno);
 	}
 	hjstate->hj_OuterHashKeys = lclauses;
diff --git a/src/backend/executor/nodeMergejoin.c b/src/backend/executor/nodeMergejoin.c
index fdf2f4c..49f6125 100644
--- a/src/backend/executor/nodeMergejoin.c
+++ b/src/backend/executor/nodeMergejoin.c
@@ -621,6 +621,7 @@ ExecMergeJoin(MergeJoinState *node)
 	ExprContext *econtext;
 	bool		doFillOuter;
 	bool		doFillInner;
+	int			skipflags;
 
 	/*
 	 * get information from node
@@ -679,6 +680,30 @@ ExecMergeJoin(MergeJoinState *node)
 			case EXEC_MJ_INITIALIZE_OUTER:
 				MJ_printf("ExecMergeJoin: EXEC_MJ_INITIALIZE_OUTER\n");
 
+				skipflags = node->js.skipflags;
+
+				if (HasFlagSkipJoinBoth(skipflags))
+					return NULL;
+
+				if (HasFlagSkipJoinInner(skipflags))
+				{
+					outerTupleSlot = ExecProcNode(outerPlan);
+					if (TupIsNull(outerTupleSlot))
+					{
+						return NULL;
+					}
+					return outerTupleSlot;
+				}
+				else if (HasFlagSkipJoinOuter(skipflags))
+				{
+					innerTupleSlot = ExecProcNode(innerPlan);
+					if (TupIsNull(innerTupleSlot))
+					{
+						return NULL;
+					}
+					return innerTupleSlot;
+				}
+
 				outerTupleSlot = ExecProcNode(outerPlan);
 				node->mj_OuterTupleSlot = outerTupleSlot;
 
@@ -1518,6 +1543,13 @@ ExecInitMergeJoin(MergeJoin *node, EState *estate, int eflags)
 	innerPlanState(mergestate) = ExecInitNode(innerPlan(node), estate,
 											  eflags | EXEC_FLAG_MARK);
 
+	mergestate->js.skipflags = 0;
+	if (ExecCanSkipJoin(outerPlanState(mergestate)))
+		mergestate->js.skipflags |= EXEC_SKIPJOIN_OUTER;
+
+	if (ExecCanSkipJoin(innerPlanState(mergestate)))
+		mergestate->js.skipflags |= EXEC_SKIPJOIN_INNER;
+
 	/*
 	 * For certain types of inner child nodes, it is advantageous to issue
 	 * MARK every time we advance past an inner tuple we will never return to.
diff --git a/src/backend/executor/nodeNestloop.c b/src/backend/executor/nodeNestloop.c
index 6cdd4ff..e0c844f 100644
--- a/src/backend/executor/nodeNestloop.c
+++ b/src/backend/executor/nodeNestloop.c
@@ -68,6 +68,7 @@ ExecNestLoop(NestLoopState *node)
 	List	   *otherqual;
 	ExprContext *econtext;
 	ListCell   *lc;
+	int			skipflags;
 
 	/*
 	 * get information from the node
@@ -105,6 +106,37 @@ ExecNestLoop(NestLoopState *node)
 	 */
 	ResetExprContext(econtext);
 
+	skipflags = node->js.skipflags;
+
+	if (HasFlagSkipJoinAny(skipflags))
+	{
+		if (HasFlagSkipJoinBoth(skipflags))
+			return NULL;
+		else if (HasFlagSkipJoinInner(skipflags))
+		{
+			outerTupleSlot = ExecProcNode(outerPlan);
+
+			if (TupIsNull(outerTupleSlot))
+			{
+				ENL1_printf("no outer tuple, ending join");
+				return NULL;
+			}
+			return outerTupleSlot;
+		}
+		else
+		{
+			innerTupleSlot = ExecProcNode(innerPlan);
+			econtext->ecxt_innertuple = innerTupleSlot;
+
+			if (TupIsNull(innerTupleSlot))
+			{
+				return NULL;
+			}
+
+			return innerTupleSlot;
+		}
+	}
+
 	/*
 	 * Ok, everything is setup for the join so now loop until we return a
 	 * qualifying join tuple.
@@ -347,6 +379,13 @@ ExecInitNestLoop(NestLoop *node, EState *estate, int eflags)
 		eflags &= ~EXEC_FLAG_REWIND;
 	innerPlanState(nlstate) = ExecInitNode(innerPlan(node), estate, eflags);
 
+	nlstate->js.skipflags = 0;
+	if (ExecCanSkipJoin(outerPlanState(nlstate)))
+		nlstate->js.skipflags |= EXEC_SKIPJOIN_OUTER;
+
+	if (ExecCanSkipJoin(innerPlanState(nlstate)))
+		nlstate->js.skipflags |= EXEC_SKIPJOIN_INNER;
+
 	/*
 	 * tuple table initialization
 	 */
diff --git a/src/backend/optimizer/path/equivclass.c b/src/backend/optimizer/path/equivclass.c
index e5dd58e..0a665b2 100644
--- a/src/backend/optimizer/path/equivclass.c
+++ b/src/backend/optimizer/path/equivclass.c
@@ -49,8 +49,6 @@ static List *generate_join_implied_equalities_broken(PlannerInfo *root,
 										Relids outer_relids,
 										Relids nominal_inner_relids,
 										RelOptInfo *inner_rel);
-static Oid select_equality_operator(EquivalenceClass *ec,
-						 Oid lefttype, Oid righttype);
 static RestrictInfo *create_join_clause(PlannerInfo *root,
 				   EquivalenceClass *ec, Oid opno,
 				   EquivalenceMember *leftem,
@@ -1283,7 +1281,7 @@ generate_join_implied_equalities_broken(PlannerInfo *root,
  *
  * Returns InvalidOid if no operator can be found for this datatype combination
  */
-static Oid
+Oid
 select_equality_operator(EquivalenceClass *ec, Oid lefttype, Oid righttype)
 {
 	ListCell   *lc;
diff --git a/src/backend/optimizer/plan/analyzejoins.c b/src/backend/optimizer/plan/analyzejoins.c
index 773f8a4..d5784ea 100644
--- a/src/backend/optimizer/plan/analyzejoins.c
+++ b/src/backend/optimizer/plan/analyzejoins.c
@@ -32,13 +32,21 @@
 #include "utils/lsyscache.h"
 
 /* local functions */
-static bool join_is_removable(PlannerInfo *root, SpecialJoinInfo *sjinfo);
+static bool innerjoin_is_removable(PlannerInfo *root, List *joinlist,
+					  RangeTblRef *removalrtr, Relids ignoredrels);
+static bool leftjoin_is_removable(PlannerInfo *root, SpecialJoinInfo *sjinfo);
+static bool relation_is_needed(PlannerInfo *root, Relids joinrelids,
+					  RelOptInfo *rel, Relids ignoredrels);
+static bool relation_has_foreign_key_for(PlannerInfo *root, RelOptInfo *rel,
+					  RelOptInfo *referencedrel, List *referencing_vars,
+					  List *index_vars, List *operator_list);
+static bool expressions_match_foreign_key(ForeignKeyInfo *fk, List *fkvars,
+					  List *indexvars, List *operators);
 static void remove_rel_from_query(PlannerInfo *root, int relid,
 					  Relids joinrelids);
 static List *remove_rel_from_joinlist(List *joinlist, int relid, int *nremoved);
 static Oid	distinct_col_search(int colno, List *colnos, List *opids);
 
-
 /*
  * remove_useless_joins
  *		Check for relations that don't actually need to be joined at all,
@@ -46,26 +54,91 @@ static Oid	distinct_col_search(int colno, List *colnos, List *opids);
  *
  * We are passed the current joinlist and return the updated list.  Other
  * data structures that have to be updated are accessible via "root".
+ *
+ * There are 2 methods here for removing joins. A join such as a LEFT JOIN,
+ * which can be proved needless because none of the joined relation's columns
+ * are used and a unique index exists on a subset of the join clause columns,
+ * can simply be removed from the query at plan time. For certain other join
+ * types we use foreign keys to try to prove that the join is needless, but
+ * for these we cannot be certain at plan time that the join is not required:
+ * if the plan is executed while foreign key triggers are still pending, the
+ * foreign key is effectively violated until those triggers have fired.
+ * Removing a join in such a case could cause a query to produce incorrect
+ * results.
+ *
+ * Instead we handle this case by marking the RangeTblEntry for the relation
+ * with a special flag which tells the executor that joining to this relation
+ * may not be required. The executor may then check this flag and choose to
+ * skip the join depending on whether or not there are foreign key triggers
+ * pending.
  */
 List *
 remove_useless_joins(PlannerInfo *root, List *joinlist)
 {
 	ListCell   *lc;
+	Relids		removedrels = NULL;
 
 	/*
-	 * We are only interested in relations that are left-joined to, so we can
-	 * scan the join_info_list to find them easily.
+	 * Start by analyzing INNER JOINed relations in order to determine if any
+	 * of the relations can be ignored.
 	 */
 restart:
+	foreach(lc, joinlist)
+	{
+		RangeTblRef *rtr = (RangeTblRef *) lfirst(lc);
+
+		if (!IsA(rtr, RangeTblRef))
+			continue;
+
+		/* Don't try to remove this one again if we've already removed it */
+		if (root->simple_rte_array[rtr->rtindex]->skipJoinPossible)
+			continue;
+
+		/* skip if the join can't be removed */
+		if (!innerjoin_is_removable(root, joinlist, rtr, removedrels))
+			continue;
+
+		/*
+		 * Since we're not actually removing the join here, we need to maintain
+		 * a set of the relations that we've "removed". Then, when checking
+		 * whether some other relation can be removed, we know that if that
+		 * relation is only referenced by relations which we've already
+		 * "removed", it can safely be assumed not to be referenced by any
+		 * useful relation.
+		 */
+		removedrels = bms_add_member(removedrels, rtr->rtindex);
+
+		/*
+		 * Make a mark for the executor to say that it may be able to skip
+		 * joining to this relation.
+		 */
+		root->simple_rte_array[rtr->rtindex]->skipJoinPossible = true;
+
+		/*
+		 * Restart the scan.  This is necessary to ensure we find all removable
+		 * joins independently of their ordering. (note that since we've added
+		 * this relation to the removedrels, we may now realize that other
+		 * relations can also be removed as they're only referenced by the one
+		 * that we've just marked as possibly removable).
+		 */
+		goto restart;
+	}
+
+	/* now process special joins. Currently only left joins are supported */
 	foreach(lc, root->join_info_list)
 	{
 		SpecialJoinInfo *sjinfo = (SpecialJoinInfo *) lfirst(lc);
 		int			innerrelid;
 		int			nremoved;
 
-		/* Skip if not removable */
-		if (!join_is_removable(root, sjinfo))
-			continue;
+		if (sjinfo->jointype == JOIN_LEFT)
+		{
+			/* Skip if not removable */
+			if (!leftjoin_is_removable(root, sjinfo))
+				continue;
+		}
+		else
+			continue; /* we don't support this join type */
 
 		/*
 		 * Currently, join_is_removable can only succeed when the sjinfo's
@@ -91,12 +164,11 @@ restart:
 		root->join_info_list = list_delete_ptr(root->join_info_list, sjinfo);
 
 		/*
-		 * Restart the scan.  This is necessary to ensure we find all
-		 * removable joins independently of ordering of the join_info_list
-		 * (note that removal of attr_needed bits may make a join appear
-		 * removable that did not before).  Also, since we just deleted the
-		 * current list cell, we'd have to have some kluge to continue the
-		 * list scan anyway.
+		 * Restart the scan.  This is necessary to ensure we find all removable
+		 * joins independently of their ordering. (note that removal of
+		 * attr_needed bits may make a join, inner or outer, appear removable
+		 * that did not before).   Also, since we just deleted the current list
+		 * cell, we'd have to have some kluge to continue the list scan anyway.
 		 */
 		goto restart;
 	}
@@ -136,8 +208,213 @@ clause_sides_match_join(RestrictInfo *rinfo, Relids outerrelids,
 }
 
 /*
- * join_is_removable
- *	  Check whether we need not perform this special join at all, because
+ * innerjoin_is_removable
+ *		True if the join to removalrtr can be removed.
+ *
+ * In order to prove that an inner joined relation is not required, we must be
+ * sure that the join would match exactly 1 row on the join condition. This
+ * differs from the logic used for proving that LEFT JOINs can be removed,
+ * where it's enough to check that a unique index exists on the relation being
+ * removed whose columns are a subset of the columns seen in the join
+ * condition: if no matching row is found, a left join still emits the
+ * unmatched outer row. This is not the case with INNER JOINs, so here we must
+ * use foreign keys as proof that the 1 row exists before we can allow any
+ * joins to be removed.
+ */
+static bool
+innerjoin_is_removable(PlannerInfo *root, List *joinlist,
+					   RangeTblRef *removalrtr, Relids ignoredrels)
+{
+	ListCell   *lc;
+	RelOptInfo *removalrel;
+
+	removalrel = find_base_rel(root, removalrtr->rtindex);
+
+	/*
+	 * As foreign keys may only reference base rels which have unique indexes,
+	 * we needn't go any further if we're not dealing with a base rel, or if
+	 * the base rel has no unique indexes. We'd also better abort if the
+	 * rtekind is anything but a relation, as things like sub-queries may have
+	 * grouping or distinct clauses that would cause us not to be able to use
+	 * the foreign key to prove the existence of a row matching the join
+	 * condition. We also abort if the rel has no eclass joins as such a rel
+	 * could well be joined using some operator which is not an equality
+	 * operator, or the rel may not even be inner joined at all.
+	 *
+	 * Here we actually only check if the rel has any indexes, ideally we'd be
+	 * checking for unique indexes, but we could only determine that by looping
+	 * over the indexlist, and this is likely too expensive a check to be worth
+	 * it here.
+	 */
+	if (removalrel->reloptkind != RELOPT_BASEREL ||
+		removalrel->rtekind != RTE_RELATION ||
+		removalrel->has_eclass_joins == false ||
+		removalrel->indexlist == NIL)
+		return false;
+
+	/*
+	 * Currently we disallow the removal if we find any baserestrictinfo items
+	 * on the relation being removed. The reason for this is that these would
+	 * filter out rows and make it so the foreign key cannot prove that we'll
+	 * match exactly 1 row on the join condition. However, this check is
+	 * currently probably a bit overly strict as it should be possible to just
+	 * check and ensure that each Var seen in the baserestrictinfo is also
+	 * present in an eclass and if so, just translate and move the whole
+	 * baserestrictinfo over to the relation which has the foreign key to prove
+	 * that this join is not needed. e.g:
+	 * SELECT a.* FROM a INNER JOIN b ON a.b_id = b.id WHERE b.id = 1;
+	 * could become: SELECT a.* FROM a WHERE a.b_id = 1;
+	 */
+	if (removalrel->baserestrictinfo != NIL)
+		return false;
+
+	/*
+	 * Currently only eclass joins are supported, so if there are any non
+	 * eclass join quals then we'll report the join is non-removable.
+	 */
+	if (removalrel->joininfo != NIL)
+		return false;
+
+	/*
+	 * Now we'll search through each relation in the joinlist to see if we can
+	 * find a relation which has a foreign key which references removalrel on
+	 * the join condition. If we find a rel with a foreign key which matches
+	 * the join condition exactly, then we can be sure that exactly 1 row will
+	 * be matched on the join. If we also see that no Vars from removalrel
+	 * are needed, then we can report the join as removable.
+	 */
+	foreach (lc, joinlist)
+	{
+		RangeTblRef	*rtr = (RangeTblRef *) lfirst(lc);
+		RelOptInfo	*rel;
+		ListCell	*lc2;
+		List		*referencing_vars;
+		List		*index_vars;
+		List		*operator_list;
+		Relids		 joinrelids;
+
+		/* skip removalrtr itself, and anything which is not a RangeTblRef */
+		if (rtr == removalrtr || !IsA(rtr, RangeTblRef))
+			continue;
+
+		rel = find_base_rel(root, rtr->rtindex);
+
+		/*
+		 * The only relation type that can help us is a base rel with at least
+		 * one foreign key defined. If it has no eclass joins then this rel
+		 * is not going to help us prove the removalrel is not needed.
+		 */
+		if (rel->reloptkind != RELOPT_BASEREL ||
+			rel->rtekind != RTE_RELATION ||
+			rel->has_eclass_joins == false ||
+			rel->fklist == NIL)
+			continue;
+
+		/*
+		 * Both rels have eclass joins, but do they have eclass joins to each
+		 * other? Skip this rel if it does not.
+		 */
+		if (!have_relevant_eclass_joinclause(root, rel, removalrel))
+			continue;
+
+		joinrelids = bms_union(rel->relids, removalrel->relids);
+
+		/* if any of the Vars from the relation are needed then abort */
+		if (relation_is_needed(root, joinrelids, removalrel, ignoredrels))
+			return false;
+
+		referencing_vars = NIL;
+		index_vars = NIL;
+		operator_list = NIL;
+
+		/* now populate the lists with the join condition Vars */
+		foreach(lc2, root->eq_classes)
+		{
+			EquivalenceClass *ec = (EquivalenceClass *) lfirst(lc2);
+
+			if (list_length(ec->ec_members) <= 1)
+				continue;
+
+			if (bms_overlap(removalrel->relids, ec->ec_relids) &&
+				bms_overlap(rel->relids, ec->ec_relids))
+			{
+				ListCell *lc3;
+				Var *refvar = NULL;
+				Var *idxvar = NULL;
+
+				/*
+				 * Look at each member of the eclass and try to find a Var from
+				 * each side of the join that we can append to the list of
+				 * columns that should be checked against each foreign key.
+				 *
+				 * The following logic does not allow for join removals to take
+				 * place for foreign keys that have duplicate columns on the
+				 * referencing side of the foreign key, such as:
+				 * (a,a) references (x,y)
+				 * The use case for such a foreign key is likely small enough
+				 * that we needn't bother making this code any more complex to
+				 * solve. If we find more than 1 Var from any of the rels then
+				 * we'll bail out.
+				 */
+				foreach (lc3, ec->ec_members)
+				{
+					EquivalenceMember *ecm = (EquivalenceMember *) lfirst(lc3);
+
+					Var *var = (Var *) ecm->em_expr;
+
+					if (!IsA(var, Var))
+						continue; /* Ignore Consts */
+
+					if (var->varno == rel->relid)
+					{
+						if (refvar != NULL)
+							return false;
+						refvar = var;
+					}
+
+					else if (var->varno == removalrel->relid)
+					{
+						if (idxvar != NULL)
+							return false;
+						idxvar = var;
+					}
+				}
+
+				if (refvar != NULL && idxvar != NULL)
+				{
+					Oid opno;
+					Oid reloid = root->simple_rte_array[refvar->varno]->relid;
+
+					if (!get_attnotnull(reloid, refvar->varattno))
+						return false;
+
+					/* grab the correct equality operator for these two vars */
+					opno = select_equality_operator(ec, refvar->vartype, idxvar->vartype);
+
+					if (!OidIsValid(opno))
+						return false;
+
+					referencing_vars = lappend(referencing_vars, refvar);
+					index_vars = lappend(index_vars, idxvar);
+					operator_list = lappend_oid(operator_list, opno);
+				}
+			}
+		}
+
+		if (referencing_vars != NIL)
+		{
+			if (relation_has_foreign_key_for(root, rel, removalrel,
+				referencing_vars, index_vars, operator_list))
+				return true; /* removalrel can be removed */
+		}
+	}
+
+	return false; /* can't remove join */
+}
+
+/*
+ * leftjoin_is_removable
+ *	  Check whether we need not perform this left join at all, because
  *	  it will just duplicate its left input.
  *
  * This is true for a left join for which the join condition cannot match
@@ -147,7 +424,7 @@ clause_sides_match_join(RestrictInfo *rinfo, Relids outerrelids,
  * above the join.
  */
 static bool
-join_is_removable(PlannerInfo *root, SpecialJoinInfo *sjinfo)
+leftjoin_is_removable(PlannerInfo *root, SpecialJoinInfo *sjinfo)
 {
 	int			innerrelid;
 	RelOptInfo *innerrel;
@@ -155,14 +432,14 @@ join_is_removable(PlannerInfo *root, SpecialJoinInfo *sjinfo)
 	Relids		joinrelids;
 	List	   *clause_list = NIL;
 	ListCell   *l;
-	int			attroff;
+
+	Assert(sjinfo->jointype == JOIN_LEFT);
 
 	/*
-	 * Must be a non-delaying left join to a single baserel, else we aren't
+	 * Must be a non-delaying join to a single baserel, else we aren't
 	 * going to be able to do anything with it.
 	 */
-	if (sjinfo->jointype != JOIN_LEFT ||
-		sjinfo->delay_upper_joins ||
+	if (sjinfo->delay_upper_joins ||
 		bms_membership(sjinfo->min_righthand) != BMS_SINGLETON)
 		return false;
 
@@ -205,52 +482,9 @@ join_is_removable(PlannerInfo *root, SpecialJoinInfo *sjinfo)
 	/* Compute the relid set for the join we are considering */
 	joinrelids = bms_union(sjinfo->min_lefthand, sjinfo->min_righthand);
 
-	/*
-	 * We can't remove the join if any inner-rel attributes are used above the
-	 * join.
-	 *
-	 * Note that this test only detects use of inner-rel attributes in higher
-	 * join conditions and the target list.  There might be such attributes in
-	 * pushed-down conditions at this join, too.  We check that case below.
-	 *
-	 * As a micro-optimization, it seems better to start with max_attr and
-	 * count down rather than starting with min_attr and counting up, on the
-	 * theory that the system attributes are somewhat less likely to be wanted
-	 * and should be tested last.
-	 */
-	for (attroff = innerrel->max_attr - innerrel->min_attr;
-		 attroff >= 0;
-		 attroff--)
-	{
-		if (!bms_is_subset(innerrel->attr_needed[attroff], joinrelids))
-			return false;
-	}
-
-	/*
-	 * Similarly check that the inner rel isn't needed by any PlaceHolderVars
-	 * that will be used above the join.  We only need to fail if such a PHV
-	 * actually references some inner-rel attributes; but the correct check
-	 * for that is relatively expensive, so we first check against ph_eval_at,
-	 * which must mention the inner rel if the PHV uses any inner-rel attrs as
-	 * non-lateral references.  Note that if the PHV's syntactic scope is just
-	 * the inner rel, we can't drop the rel even if the PHV is variable-free.
-	 */
-	foreach(l, root->placeholder_list)
-	{
-		PlaceHolderInfo *phinfo = (PlaceHolderInfo *) lfirst(l);
-
-		if (bms_is_subset(phinfo->ph_needed, joinrelids))
-			continue;			/* PHV is not used above the join */
-		if (bms_overlap(phinfo->ph_lateral, innerrel->relids))
-			return false;		/* it references innerrel laterally */
-		if (!bms_overlap(phinfo->ph_eval_at, innerrel->relids))
-			continue;			/* it definitely doesn't reference innerrel */
-		if (bms_is_subset(phinfo->ph_eval_at, innerrel->relids))
-			return false;		/* there isn't any other place to eval PHV */
-		if (bms_overlap(pull_varnos((Node *) phinfo->ph_var->phexpr),
-						innerrel->relids))
-			return false;		/* it does reference innerrel */
-	}
+	/* if the relation is referenced in the query then it cannot be removed */
+	if (relation_is_needed(root, joinrelids, innerrel, NULL))
+		return false;
 
 	/*
 	 * Search for mergejoinable clauses that constrain the inner rel against
@@ -367,6 +601,218 @@ join_is_removable(PlannerInfo *root, SpecialJoinInfo *sjinfo)
 	return false;
 }
 
+/*
+ * relation_is_needed
+ *		True if any of the Vars from this relation are required in the query
+ */
+static inline bool
+relation_is_needed(PlannerInfo *root, Relids joinrelids, RelOptInfo *rel, Relids ignoredrels)
+{
+	int		  attroff;
+	ListCell *l;
+
+	/*
+	 * rel is referenced if any of its attributes are used above the join.
+	 *
+	 * Note that this test only detects use of rel's attributes in higher
+	 * join conditions and the target list.  There might be such attributes in
+	 * pushed-down conditions at this join, too.  We check that case below.
+	 *
+	 * As a micro-optimization, it seems better to start with max_attr and
+	 * count down rather than starting with min_attr and counting up, on the
+	 * theory that the system attributes are somewhat less likely to be wanted
+	 * and should be tested last.
+	 */
+	for (attroff = rel->max_attr - rel->min_attr;
+		 attroff >= 0;
+		 attroff--)
+	{
+		if (!bms_is_subset(bms_difference(rel->attr_needed[attroff], ignoredrels), joinrelids))
+			return true;
+	}
+
+	/*
+	 * Similarly check that rel isn't needed by any PlaceHolderVars that will
+	 * be used above the join.  We only need to fail if such a PHV actually
+	 * references some of rel's attributes; but the correct check for that is
+	 * relatively expensive, so we first check against ph_eval_at, which must
+	 * mention rel if the PHV uses any of rel's attrs as non-lateral
+	 * references.  Note that if the PHV's syntactic scope is just rel, we
+	 * must return true even if the PHV is variable-free.
+	 */
+	foreach(l, root->placeholder_list)
+	{
+		PlaceHolderInfo *phinfo = (PlaceHolderInfo *) lfirst(l);
+
+		if (bms_is_subset(phinfo->ph_needed, joinrelids))
+			continue;			/* PHV is not used above the join */
+		if (bms_overlap(phinfo->ph_lateral, rel->relids))
+			return true;		/* it references rel laterally */
+		if (!bms_overlap(phinfo->ph_eval_at, rel->relids))
+			continue;			/* it definitely doesn't reference rel */
+		if (bms_is_subset(phinfo->ph_eval_at, rel->relids))
+			return true;		/* there isn't any other place to eval PHV */
+		if (bms_overlap(pull_varnos((Node *) phinfo->ph_var->phexpr),
+						rel->relids))
+			return true;		/* it does reference rel */
+	}
+
+	return false; /* it does not reference rel */
+}
+
+/*
+ * relation_has_foreign_key_for
+ *	  Checks if rel has a foreign key which references referencedrel with the
+ *	  given list of expressions.
+ *
+ *	For the match to succeed:
+ *	  referencing_vars must match the columns defined in the foreign key.
+ *	  index_vars must match the columns defined in the index for the foreign key.
+ */
+static bool
+relation_has_foreign_key_for(PlannerInfo *root, RelOptInfo *rel,
+			RelOptInfo *referencedrel, List *referencing_vars,
+			List *index_vars, List *operator_list)
+{
+	ListCell *lc;
+	Oid		  refreloid;
+
+	/*
+	 * Look up the Oid of the referenced relation. We only want to look at
+	 * foreign keys on the referencing relation which reference this relation.
+	 */
+	refreloid = root->simple_rte_array[referencedrel->relid]->relid;
+
+	Assert(list_length(referencing_vars) > 0);
+	Assert(list_length(referencing_vars) == list_length(index_vars));
+	Assert(list_length(referencing_vars) == list_length(operator_list));
+
+	/*
+	 * Search through each foreign key on the referencing relation and try
+	 * to find one which references the relation in the join condition. If we
+	 * find one then we'll send the join conditions off to
+	 * expressions_match_foreign_key() to see if they match the foreign key.
+	 */
+	foreach(lc, rel->fklist)
+	{
+		ForeignKeyInfo *fk = (ForeignKeyInfo *) lfirst(lc);
+
+		if (fk->confrelid == refreloid)
+		{
+			if (expressions_match_foreign_key(fk, referencing_vars,
+				index_vars, operator_list))
+				return true;
+		}
+	}
+
+	return false;
+}
+
+/*
+ * expressions_match_foreign_key
+ *		True if the given fkvars, indexvars and operators will match
+ *		exactly 1 record in the referenced relation of the foreign key.
+ *
+ * Note: This function expects fkvars and indexvars to only contain Var types.
+ *		 Expression indexes are not supported by foreign keys.
+ */
+static bool
+expressions_match_foreign_key(ForeignKeyInfo *fk, List *fkvars,
+					List *indexvars, List *operators)
+{
+	ListCell  *lc;
+	ListCell  *lc2;
+	ListCell  *lc3;
+	Bitmapset *allitems;
+	Bitmapset *matcheditems;
+	int		   lstidx;
+	int		   col;
+
+	Assert(list_length(fkvars) == list_length(indexvars));
+	Assert(list_length(fkvars) == list_length(operators));
+
+	/*
+	 * Fast path out if there are not enough conditions to match each column
+	 * in the foreign key. Note that we cannot check that the number of
+	 * expressions is equal here since it would cause any expressions which
+	 * are duplicated not to match.
+	 */
+	if (list_length(fkvars) < fk->conncols)
+		return false;
+
+	/*
+	 * We need to ensure that each foreign key column can be matched to a list
+	 * item, and we need to ensure that each list item can be matched to a
+	 * foreign key column. We do this by looping over each foreign key column
+	 * and checking that we can find an item in the list which matches the
+	 * current column, however this method does not allow us to ensure that no
+	 * additional items exist in the list. We could solve that by performing
+	 * another loop over each list item and check that it matches a foreign key
+	 * column, but that's a bit wasteful. Instead we'll use 2 bitmapsets, one
+	 * to store the 0 based index of each list item, and with the other we'll
+	 * store each list index that we've managed to match. After we're done
+	 * matching we'll just make sure that both bitmapsets are equal.
+	 */
+	allitems = NULL;
+	matcheditems = NULL;
+
+	/*
+	 * Build a bitmapset which contains each 0 based list index. It seems more
+	 * efficient to do this in reverse so that we allocate enough memory for
+	 * the bitmapset on first loop rather than reallocating each time we find
+	 * we need a bit more space.
+	 */
+	for (lstidx = list_length(fkvars) - 1; lstidx >= 0; lstidx--)
+		allitems = bms_add_member(allitems, lstidx);
+
+	for (col = 0; col < fk->conncols; col++)
+	{
+		bool  matched = false;
+
+		lstidx = 0;
+
+		forthree(lc, fkvars, lc2, indexvars, lc3, operators)
+		{
+			Var *expr = (Var *) lfirst(lc);
+			Var *idxexpr = (Var *) lfirst(lc2);
+			Oid  opr = lfirst_oid(lc3);
+
+			Assert(IsA(expr, Var));
+			Assert(IsA(idxexpr, Var));
+
+			/* Does this join qual match up to the current fkey column? */
+			if (fk->conkey[col] == expr->varattno &&
+				fk->confkey[col] == idxexpr->varattno &&
+				equality_ops_are_compatible(opr, fk->conpfeqop[col]))
+			{
+				matched = true;
+
+				/* mark this list item as matched */
+				matcheditems = bms_add_member(matcheditems, lstidx);
+
+				/*
+				 * Don't break here as there may be duplicate expressions
+				 * that we also need to match against.
+				 */
+			}
+			lstidx++;
+		}
+
+		/* punt if there's no match. */
+		if (!matched)
+			return false;
+	}
+
+	/*
+	 * Ensure that we managed to match every item in the list to a foreign key
+	 * column.
+	 */
+	if (!bms_equal(allitems, matcheditems))
+		return false;
+
+	return true; /* matched */
+}
+
 
 /*
  * Remove the target relid from the planner's data structures, having
diff --git a/src/backend/optimizer/plan/planner.c b/src/backend/optimizer/plan/planner.c
index fb74d6b..7ea0149 100644
--- a/src/backend/optimizer/plan/planner.c
+++ b/src/backend/optimizer/plan/planner.c
@@ -3712,6 +3712,7 @@ plan_cluster_use_sort(Oid tableOid, Oid indexOid)
 	rte->lateral = false;
 	rte->inh = false;
 	rte->inFromCl = true;
+	rte->skipJoinPossible = false;
 	query->rtable = list_make1(rte);
 
 	/* Set up RTE/RelOptInfo arrays */
diff --git a/src/backend/optimizer/prep/prepsecurity.c b/src/backend/optimizer/prep/prepsecurity.c
index b625b5c..74a0dca 100644
--- a/src/backend/optimizer/prep/prepsecurity.c
+++ b/src/backend/optimizer/prep/prepsecurity.c
@@ -311,6 +311,7 @@ expand_security_qual(PlannerInfo *root, List *tlist, int rt_index,
 			subrte->security_barrier = rte->security_barrier;
 			subrte->eref = copyObject(rte->eref);
 			subrte->inFromCl = true;
+			subrte->skipJoinPossible = false;
 			subquery->rtable = list_make1(subrte);
 
 			subrtr = makeNode(RangeTblRef);
diff --git a/src/backend/optimizer/util/plancat.c b/src/backend/optimizer/util/plancat.c
index b2becfa..fea198e 100644
--- a/src/backend/optimizer/util/plancat.c
+++ b/src/backend/optimizer/util/plancat.c
@@ -25,7 +25,9 @@
 #include "access/transam.h"
 #include "access/xlog.h"
 #include "catalog/catalog.h"
+#include "catalog/pg_constraint.h"
 #include "catalog/heap.h"
+#include "catalog/pg_type.h"
 #include "foreign/fdwapi.h"
 #include "miscadmin.h"
 #include "nodes/makefuncs.h"
@@ -38,6 +40,7 @@
 #include "parser/parsetree.h"
 #include "rewrite/rewriteManip.h"
 #include "storage/bufmgr.h"
+#include "utils/fmgroids.h"
 #include "utils/lsyscache.h"
 #include "utils/rel.h"
 #include "utils/snapmgr.h"
@@ -89,6 +92,12 @@ get_relation_info(PlannerInfo *root, Oid relationObjectId, bool inhparent,
 	Relation	relation;
 	bool		hasindex;
 	List	   *indexinfos = NIL;
+	List	   *fkinfos = NIL;
+	Relation	fkeyRel;
+	Relation	fkeyRelIdx;
+	ScanKeyData fkeyScankey;
+	SysScanDesc fkeyScan;
+	HeapTuple	tuple;
 
 	/*
 	 * We need not lock the relation since it was already locked, either by
@@ -384,6 +393,111 @@ get_relation_info(PlannerInfo *root, Oid relationObjectId, bool inhparent,
 
 	heap_close(relation, NoLock);
 
+	/* load foreign key constraints */
+	ScanKeyInit(&fkeyScankey,
+				Anum_pg_constraint_conrelid,
+				BTEqualStrategyNumber, F_OIDEQ,
+				ObjectIdGetDatum(relationObjectId));
+
+	fkeyRel = heap_open(ConstraintRelationId, AccessShareLock);
+	fkeyRelIdx = index_open(ConstraintRelidIndexId, AccessShareLock);
+	fkeyScan = systable_beginscan_ordered(fkeyRel, fkeyRelIdx, NULL, 1, &fkeyScankey);
+
+	while ((tuple = systable_getnext_ordered(fkeyScan, ForwardScanDirection)) != NULL)
+	{
+		Form_pg_constraint con = (Form_pg_constraint) GETSTRUCT(tuple);
+		ForeignKeyInfo *fkinfo;
+		Datum		adatum;
+		bool		isNull;
+		ArrayType  *arr;
+		int			nelements;
+
+		/* skip if not a foreign key */
+		if (con->contype != CONSTRAINT_FOREIGN)
+			continue;
+
+		/* we're not interested unless the fkey has been validated */
+		if (!con->convalidated)
+			continue;
+
+		fkinfo = (ForeignKeyInfo *) palloc(sizeof(ForeignKeyInfo));
+		fkinfo->conindid = con->conindid;
+		fkinfo->confrelid = con->confrelid;
+		fkinfo->convalidated = con->convalidated;
+		fkinfo->conrelid = con->conrelid;
+		fkinfo->confupdtype = con->confupdtype;
+		fkinfo->confdeltype = con->confdeltype;
+		fkinfo->confmatchtype = con->confmatchtype;
+
+		adatum = heap_getattr(tuple, Anum_pg_constraint_conkey,
+							RelationGetDescr(fkeyRel), &isNull);
+
+		if (isNull)
+			elog(ERROR, "null conkey for constraint %u",
+				HeapTupleGetOid(tuple));
+
+		arr = DatumGetArrayTypeP(adatum);		/* ensure not toasted */
+		nelements = ARR_DIMS(arr)[0];
+		if (ARR_NDIM(arr) != 1 ||
+			nelements < 0 ||
+			ARR_HASNULL(arr) ||
+			ARR_ELEMTYPE(arr) != INT2OID)
+			elog(ERROR, "conkey is not a 1-D smallint array");
+
+		fkinfo->conkey = (int16 *) ARR_DATA_PTR(arr);
+		fkinfo->conncols = nelements;
+
+		adatum = heap_getattr(tuple, Anum_pg_constraint_confkey,
+							RelationGetDescr(fkeyRel), &isNull);
+
+		if (isNull)
+			elog(ERROR, "null confkey for constraint %u",
+				HeapTupleGetOid(tuple));
+
+		arr = DatumGetArrayTypeP(adatum);		/* ensure not toasted */
+		nelements = ARR_DIMS(arr)[0];
+
+		if (ARR_NDIM(arr) != 1 ||
+			nelements < 0 ||
+			ARR_HASNULL(arr) ||
+			ARR_ELEMTYPE(arr) != INT2OID)
+			elog(ERROR, "confkey is not a 1-D smallint array");
+
+		/* sanity check */
+		if (nelements != fkinfo->conncols)
+			elog(ERROR, "number of confkey elements does not equal conkey elements");
+
+		fkinfo->confkey = (int16 *) ARR_DATA_PTR(arr);
+		adatum = heap_getattr(tuple, Anum_pg_constraint_conpfeqop,
+							RelationGetDescr(fkeyRel), &isNull);
+
+		if (isNull)
+			elog(ERROR, "null conpfeqop for constraint %u",
+				HeapTupleGetOid(tuple));
+
+		arr = DatumGetArrayTypeP(adatum);		/* ensure not toasted */
+		nelements = ARR_DIMS(arr)[0];
+
+		if (ARR_NDIM(arr) != 1 ||
+			nelements < 0 ||
+			ARR_HASNULL(arr) ||
+			ARR_ELEMTYPE(arr) != OIDOID)
+			elog(ERROR, "conpfeqop is not a 1-D Oid array");
+
+		/* sanity check */
+		if (nelements != fkinfo->conncols)
+			elog(ERROR, "number of conpfeqop elements does not equal conkey elements");
+
+		fkinfo->conpfeqop = (Oid *) ARR_DATA_PTR(arr);
+
+		fkinfos = lappend(fkinfos, fkinfo);
+	}
+
+	rel->fklist = fkinfos;
+	systable_endscan_ordered(fkeyScan);
+	index_close(fkeyRelIdx, AccessShareLock);
+	heap_close(fkeyRel, AccessShareLock);
+
 	/*
 	 * Allow a plugin to editorialize on the info we obtained from the
 	 * catalogs.  Actions might include altering the assumed relation size,
diff --git a/src/backend/optimizer/util/relnode.c b/src/backend/optimizer/util/relnode.c
index 4c76f54..58d80bb 100644
--- a/src/backend/optimizer/util/relnode.c
+++ b/src/backend/optimizer/util/relnode.c
@@ -115,6 +115,7 @@ build_simple_rel(PlannerInfo *root, int relid, RelOptKind reloptkind)
 	rel->lateral_relids = NULL;
 	rel->lateral_referencers = NULL;
 	rel->indexlist = NIL;
+	rel->fklist = NIL;
 	rel->pages = 0;
 	rel->tuples = 0;
 	rel->allvisfrac = 0;
@@ -377,6 +378,7 @@ build_join_rel(PlannerInfo *root,
 	joinrel->lateral_relids = NULL;
 	joinrel->lateral_referencers = NULL;
 	joinrel->indexlist = NIL;
+	joinrel->fklist = NIL;
 	joinrel->pages = 0;
 	joinrel->tuples = 0;
 	joinrel->allvisfrac = 0;
diff --git a/src/backend/parser/parse_relation.c b/src/backend/parser/parse_relation.c
index 478584d..cafeba9 100644
--- a/src/backend/parser/parse_relation.c
+++ b/src/backend/parser/parse_relation.c
@@ -1048,6 +1048,7 @@ addRangeTableEntry(ParseState *pstate,
 	rte->lateral = false;
 	rte->inh = inh;
 	rte->inFromCl = inFromCl;
+	rte->skipJoinPossible = false;
 
 	rte->requiredPerms = ACL_SELECT;
 	rte->checkAsUser = InvalidOid;		/* not set-uid by default, either */
@@ -1101,6 +1102,7 @@ addRangeTableEntryForRelation(ParseState *pstate,
 	rte->lateral = false;
 	rte->inh = inh;
 	rte->inFromCl = inFromCl;
+	rte->skipJoinPossible = false;
 
 	rte->requiredPerms = ACL_SELECT;
 	rte->checkAsUser = InvalidOid;		/* not set-uid by default, either */
@@ -1179,6 +1181,7 @@ addRangeTableEntryForSubquery(ParseState *pstate,
 	rte->lateral = lateral;
 	rte->inh = false;			/* never true for subqueries */
 	rte->inFromCl = inFromCl;
+	rte->skipJoinPossible = false;
 
 	rte->requiredPerms = 0;
 	rte->checkAsUser = InvalidOid;
@@ -1433,6 +1436,7 @@ addRangeTableEntryForFunction(ParseState *pstate,
 	rte->lateral = lateral;
 	rte->inh = false;			/* never true for functions */
 	rte->inFromCl = inFromCl;
+	rte->skipJoinPossible = false;
 
 	rte->requiredPerms = 0;
 	rte->checkAsUser = InvalidOid;
@@ -1505,6 +1509,7 @@ addRangeTableEntryForValues(ParseState *pstate,
 	rte->lateral = lateral;
 	rte->inh = false;			/* never true for values RTEs */
 	rte->inFromCl = inFromCl;
+	rte->skipJoinPossible = false;
 
 	rte->requiredPerms = 0;
 	rte->checkAsUser = InvalidOid;
@@ -1573,6 +1578,7 @@ addRangeTableEntryForJoin(ParseState *pstate,
 	rte->lateral = false;
 	rte->inh = false;			/* never true for joins */
 	rte->inFromCl = inFromCl;
+	rte->skipJoinPossible = false;
 
 	rte->requiredPerms = 0;
 	rte->checkAsUser = InvalidOid;
@@ -1673,6 +1679,7 @@ addRangeTableEntryForCTE(ParseState *pstate,
 	rte->lateral = false;
 	rte->inh = false;			/* never true for subqueries */
 	rte->inFromCl = inFromCl;
+	rte->skipJoinPossible = false;
 
 	rte->requiredPerms = 0;
 	rte->checkAsUser = InvalidOid;
diff --git a/src/backend/utils/adt/ruleutils.c b/src/backend/utils/adt/ruleutils.c
index bf4e81f..919a3b2 100644
--- a/src/backend/utils/adt/ruleutils.c
+++ b/src/backend/utils/adt/ruleutils.c
@@ -843,6 +843,7 @@ pg_get_triggerdef_worker(Oid trigid, bool pretty)
 		oldrte->lateral = false;
 		oldrte->inh = false;
 		oldrte->inFromCl = true;
+		oldrte->skipJoinPossible = false;
 
 		newrte = makeNode(RangeTblEntry);
 		newrte->rtekind = RTE_RELATION;
@@ -853,6 +854,7 @@ pg_get_triggerdef_worker(Oid trigid, bool pretty)
 		newrte->lateral = false;
 		newrte->inh = false;
 		newrte->inFromCl = true;
+		newrte->skipJoinPossible = false;
 
 		/* Build two-element rtable */
 		memset(&dpns, 0, sizeof(dpns));
@@ -2508,6 +2510,7 @@ deparse_context_for(const char *aliasname, Oid relid)
 	rte->lateral = false;
 	rte->inh = false;
 	rte->inFromCl = true;
+	rte->skipJoinPossible = false;
 
 	/* Build one-element rtable */
 	dpns->rtable = list_make1(rte);
diff --git a/src/backend/utils/cache/lsyscache.c b/src/backend/utils/cache/lsyscache.c
index 552e498..aa81c7c 100644
--- a/src/backend/utils/cache/lsyscache.c
+++ b/src/backend/utils/cache/lsyscache.c
@@ -916,6 +916,33 @@ get_atttypetypmodcoll(Oid relid, AttrNumber attnum,
 	ReleaseSysCache(tp);
 }
 
+/*
+ * get_attnotnull
+ *
+ *		Given the relation id and the attribute number,
+ *		return the "attnotnull" field from the attribute relation.
+ */
+bool
+get_attnotnull(Oid relid, AttrNumber attnum)
+{
+	HeapTuple	tp;
+
+	tp = SearchSysCache2(ATTNUM,
+						 ObjectIdGetDatum(relid),
+						 Int16GetDatum(attnum));
+	if (HeapTupleIsValid(tp))
+	{
+		Form_pg_attribute att_tup = (Form_pg_attribute) GETSTRUCT(tp);
+		bool		result;
+
+		result = att_tup->attnotnull;
+		ReleaseSysCache(tp);
+		return result;
+	}
+	else
+		return false;
+}
+
 /*				---------- COLLATION CACHE ----------					 */
 
 /*
diff --git a/src/include/commands/trigger.h b/src/include/commands/trigger.h
index d0b0356..34a75e4 100644
--- a/src/include/commands/trigger.h
+++ b/src/include/commands/trigger.h
@@ -181,6 +181,7 @@ extern void ExecBSTruncateTriggers(EState *estate,
 extern void ExecASTruncateTriggers(EState *estate,
 					   ResultRelInfo *relinfo);
 
+extern bool AfterTriggerQueueIsEmpty(void);
 extern void AfterTriggerBeginXact(void);
 extern void AfterTriggerBeginQuery(void);
 extern void AfterTriggerEndQuery(EState *estate);
diff --git a/src/include/executor/executor.h b/src/include/executor/executor.h
index f1b65b4..2a0c0f6 100644
--- a/src/include/executor/executor.h
+++ b/src/include/executor/executor.h
@@ -65,6 +65,15 @@
 #define EXEC_FLAG_WITHOUT_OIDS	0x0040	/* force no OIDs in returned tuples */
 #define EXEC_FLAG_WITH_NO_DATA	0x0080	/* rel scannability doesn't matter */
 
+/* Flags used for JoinState.skipflags */
+#define EXEC_SKIPJOIN_INNER		0x0001 /* Skip inner side of join */
+#define EXEC_SKIPJOIN_OUTER		0x0002 /* Skip outer side of join */
+#define EXEC_SKIPJOIN_BOTH		(EXEC_SKIPJOIN_OUTER|EXEC_SKIPJOIN_INNER)
+
+#define HasFlagSkipJoinInner(n)	((n) & EXEC_SKIPJOIN_INNER)
+#define HasFlagSkipJoinOuter(n)	((n) & EXEC_SKIPJOIN_OUTER)
+#define HasFlagSkipJoinBoth(n)	(((n) & EXEC_SKIPJOIN_BOTH) == EXEC_SKIPJOIN_BOTH)
+#define HasFlagSkipJoinAny(n)	((n) != 0)
 
 /*
  * ExecEvalExpr was formerly a function containing a switch statement;
@@ -339,6 +348,7 @@ extern ProjectionInfo *ExecBuildProjectionInfo(List *targetList,
 						ExprContext *econtext,
 						TupleTableSlot *slot,
 						TupleDesc inputDesc);
+extern bool ExecCanSkipJoin(PlanState *planstate);
 extern void ExecAssignProjectionInfo(PlanState *planstate,
 						 TupleDesc inputDesc);
 extern void ExecFreeExprContext(PlanState *planstate);
diff --git a/src/include/nodes/execnodes.h b/src/include/nodes/execnodes.h
index 8c8c01f..6403149 100644
--- a/src/include/nodes/execnodes.h
+++ b/src/include/nodes/execnodes.h
@@ -1561,6 +1561,7 @@ typedef struct JoinState
 	PlanState	ps;
 	JoinType	jointype;
 	List	   *joinqual;		/* JOIN quals (in addition to ps.qual) */
+	int			skipflags;
 } JoinState;
 
 /* ----------------
diff --git a/src/include/nodes/parsenodes.h b/src/include/nodes/parsenodes.h
index 3e4f815..7f74202 100644
--- a/src/include/nodes/parsenodes.h
+++ b/src/include/nodes/parsenodes.h
@@ -813,6 +813,8 @@ typedef struct RangeTblEntry
 	bool		lateral;		/* subquery, function, or values is LATERAL? */
 	bool		inh;			/* inheritance requested? */
 	bool		inFromCl;		/* present in FROM clause? */
+	bool		skipJoinPossible; /* it may be possible to not bother joining
+								   * this relation at all */
 	AclMode		requiredPerms;	/* bitmask of required access permissions */
 	Oid			checkAsUser;	/* if valid, check access as this role */
 	Bitmapset  *selectedCols;	/* columns needing SELECT permission */
diff --git a/src/include/nodes/relation.h b/src/include/nodes/relation.h
index 05cfbcd..4a1bf47 100644
--- a/src/include/nodes/relation.h
+++ b/src/include/nodes/relation.h
@@ -359,6 +359,8 @@ typedef struct PlannerInfo
  *		lateral_referencers - relids of rels that reference this one laterally
  *		indexlist - list of IndexOptInfo nodes for relation's indexes
  *					(always NIL if it's not a table)
+ *		fklist - list of ForeignKeyInfo's for relation's foreign key
+ *					constraints. (always NIL if it's not a table)
  *		pages - number of disk pages in relation (zero if not a table)
  *		tuples - number of tuples in relation (not considering restrictions)
  *		allvisfrac - fraction of disk pages that are marked all-visible
@@ -452,6 +454,7 @@ typedef struct RelOptInfo
 	Relids		lateral_relids; /* minimum parameterization of rel */
 	Relids		lateral_referencers;	/* rels that reference me laterally */
 	List	   *indexlist;		/* list of IndexOptInfo */
+	List	   *fklist;			/* list of ForeignKeyInfo */
 	BlockNumber pages;			/* size estimates derived from pg_class */
 	double		tuples;
 	double		allvisfrac;
@@ -542,6 +545,51 @@ typedef struct IndexOptInfo
 	bool		amhasgetbitmap; /* does AM have amgetbitmap interface? */
 } IndexOptInfo;
 
+/*
+ * ForeignKeyInfo
+ *		Used to store pg_constraint records for foreign key constraints for use
+ *		by the planner.
+ *
+ *		conindid - The index which supports the foreign key
+ *
+ *		confrelid - The relation that is referenced by this foreign key
+ *
+ *		convalidated - True if the foreign key has been validated.
+ *
+ *		conrelid - The Oid of the relation that the foreign key belongs to
+ *
+ *		confupdtype - ON UPDATE action for when the referenced table is updated
+ *
+ *		confdeltype - ON DELETE action, controls what to do when a record is
+ *					deleted from the referenced table.
+ *
+ *		confmatchtype - foreign key match type, e.g. MATCH FULL, MATCH PARTIAL
+ *
+ *		conncols - Number of columns defined in the foreign key
+ *
+ *		conkey - An array of conncols elements to store the varattno of the
+ *					columns on the referencing side of the foreign key
+ *
+ *		confkey - An array of conncols elements to store the varattno of the
+ *					columns on the referenced side of the foreign key
+ *
+ *		conpfeqop - An array of conncols elements to store the operators for
+ *					PK = FK comparisons
+ */
+typedef struct ForeignKeyInfo
+{
+	Oid			conindid;		/* index supporting this constraint */
+	Oid			confrelid;		/* relation referenced by foreign key */
+	bool		convalidated;	/* constraint has been validated? */
+	Oid			conrelid;		/* relation this constraint constrains */
+	char		confupdtype;	/* foreign key's ON UPDATE action */
+	char		confdeltype;	/* foreign key's ON DELETE action */
+	char		confmatchtype;	/* foreign key's match type */
+	int			conncols;		/* number of columns referenced */
+	int16	   *conkey;			/* Columns of conrelid that the constraint applies to */
+	int16	   *confkey;		/* columns of confrelid that foreign key references */
+	Oid		   *conpfeqop;		/* Operator list for comparing PK to FK */
+} ForeignKeyInfo;
 
 /*
  * EquivalenceClasses
diff --git a/src/include/optimizer/paths.h b/src/include/optimizer/paths.h
index 9b22fda..b11ae78 100644
--- a/src/include/optimizer/paths.h
+++ b/src/include/optimizer/paths.h
@@ -108,10 +108,13 @@ extern EquivalenceClass *get_eclass_for_sort_expr(PlannerInfo *root,
 						 Relids rel,
 						 bool create_it);
 extern void generate_base_implied_equalities(PlannerInfo *root);
+extern void remove_rel_from_eclass(PlannerInfo *root, int relid);
 extern List *generate_join_implied_equalities(PlannerInfo *root,
 								 Relids join_relids,
 								 Relids outer_relids,
 								 RelOptInfo *inner_rel);
+extern Oid select_equality_operator(EquivalenceClass *ec, Oid lefttype,
+								 Oid righttype);
 extern bool exprs_known_equal(PlannerInfo *root, Node *item1, Node *item2);
 extern void add_child_rel_equivalences(PlannerInfo *root,
 						   AppendRelInfo *appinfo,
diff --git a/src/include/utils/lsyscache.h b/src/include/utils/lsyscache.h
index 07d24d4..910190d 100644
--- a/src/include/utils/lsyscache.h
+++ b/src/include/utils/lsyscache.h
@@ -68,6 +68,7 @@ extern Oid	get_atttype(Oid relid, AttrNumber attnum);
 extern int32 get_atttypmod(Oid relid, AttrNumber attnum);
 extern void get_atttypetypmodcoll(Oid relid, AttrNumber attnum,
 					  Oid *typid, int32 *typmod, Oid *collid);
+extern bool get_attnotnull(Oid relid, AttrNumber attnum);
 extern char *get_collation_name(Oid colloid);
 extern char *get_constraint_name(Oid conoid);
 extern Oid	get_opclass_family(Oid opclass);
diff --git a/src/test/regress/expected/join.out b/src/test/regress/expected/join.out
index 2501184..4299227 100644
--- a/src/test/regress/expected/join.out
+++ b/src/test/regress/expected/join.out
@@ -3276,6 +3276,32 @@ select i8.* from int8_tbl i8 left join (select f1 from int4_tbl group by f1) i4
 (1 row)
 
 rollback;
+begin work;
+create temp table c (
+  id int primary key
+);
+create temp table b (
+  id int primary key,
+  c_id int not null,
+  constraint b_c_id_fkey foreign key (c_id) references c deferrable
+);
+create temp table a (
+  id int primary key,
+  b_id int not null,
+  constraint a_b_id_fkey foreign key (b_id) references b deferrable
+);
+insert into c (id) values(1);
+insert into b (id, c_id) values(1,1);
+insert into a (id, b_id) values(1,1);
+set constraints b_c_id_fkey deferred;
+update c set id = 2 where id=1;
+-- ensure inner join to be is not skipped.
+select b.* from b inner join c on b.c_id = c.id;
+ id | c_id 
+----+------
+(0 rows)
+
+rollback;
 create temp table parent (k int primary key, pd int);
 create temp table child (k int unique, cd int);
 insert into parent values (1, 10), (2, 20), (3, 30);
diff --git a/src/test/regress/sql/join.sql b/src/test/regress/sql/join.sql
index 718e1d9..e226e4e 100644
--- a/src/test/regress/sql/join.sql
+++ b/src/test/regress/sql/join.sql
@@ -977,6 +977,34 @@ select i8.* from int8_tbl i8 left join (select f1 from int4_tbl group by f1) i4
 
 rollback;
 
+begin work;
+
+create temp table c (
+  id int primary key
+);
+create temp table b (
+  id int primary key,
+  c_id int not null,
+  constraint b_c_id_fkey foreign key (c_id) references c deferrable
+);
+create temp table a (
+  id int primary key,
+  b_id int not null,
+  constraint a_b_id_fkey foreign key (b_id) references b deferrable
+);
+
+insert into c (id) values(1);
+insert into b (id, c_id) values(1,1);
+insert into a (id, b_id) values(1,1);
+
+set constraints b_c_id_fkey deferred;
+update c set id = 2 where id=1;
+
+-- ensure inner join to be is not skipped.
+select b.* from b inner join c on b.c_id = c.id;
+
+rollback;
+
 create temp table parent (k int primary key, pd int);
 create temp table child (k int unique, cd int);
 insert into parent values (1, 10), (2, 20), (3, 30);

#39

David Rowley

dgrowleyml@gmail.com

about 11 years ago

In reply to: David Rowley (#38)

Re: Patch to support SEMI and ANTI join removal

On Sun, Nov 16, 2014 at 12:19 PM, David Rowley <dgrowleyml@gmail.com> wrote:

On Sun, Nov 16, 2014 at 10:09 AM, Simon Riggs <simon@2ndquadrant.com>
wrote:

I propose that we keep track of whether there are any potentially
skippable joins at the top of the plan. When we begin execution we do
a single if test to see if there is run-time work to do. If we pass
the run-time tests we then descend the tree and prune the plan to
completely remove unnecessary nodes. We end with an EXPLAIN and
EXPLAIN ANALYZE that looks like this

                            QUERY PLAN
------------------------------------------------------------------
 Aggregate (actual rows=1 loops=1)
   ->  Seq Scan on t1 (actual rows=1000000 loops=1)

Doing that removes all the overheads and complexity; it also matches
how join removal currently works.

This sounds much cleaner than what I have at the moment, although, you say
EXPLAIN would look like that... I don't think that's quite true as the
EXPLAIN still would have the un-pruned version, as the pruning would be
done at executor start-up. Would it cause problems to have the EXPLAIN have
a different looking plan than EXPLAIN ANALYZE?

Oops, it seems you're right about the EXPLAIN output. I had not previously
realised that plain old EXPLAIN would initialise the plan. It's nice to see
that I'll get my old tests working again!

I've been hacking away at this, and I now have a function which
"implodes" the plan down to just what is required. I only call this
function if there are no pending foreign key triggers.
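
To make the idea concrete, here is a minimal sketch of such an "implode"
pass over a toy plan tree. The ToyPlan type, the toy_* function names and
the pending_fk_triggers counter are all made up for illustration; they are
not the structures or functions used in the patch, and real plan nodes
carry far more state (target lists, quals and so on) that this ignores.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, heavily simplified stand-in for a plan node. */
typedef enum { TOY_SCAN, TOY_JOIN } ToyKind;

typedef struct ToyPlan
{
	ToyKind		kind;
	bool		skippable;	/* scans only: planner marked the rel removable */
	struct ToyPlan *left;
	struct ToyPlan *right;
} ToyPlan;

/* A join input can be dropped if it is already gone or is a skippable scan. */
static bool
toy_can_skip(const ToyPlan *input)
{
	return input == NULL ||
		(input->kind == TOY_SCAN && input->skippable);
}

/*
 * Process the tree bottom-up and return the node that should take the
 * place of 'node': the join itself, one of its inputs, or NULL when
 * neither input is needed.
 */
static ToyPlan *
toy_implode(ToyPlan *node)
{
	bool		skip_left;
	bool		skip_right;

	if (node == NULL || node->kind == TOY_SCAN)
		return node;

	node->left = toy_implode(node->left);
	node->right = toy_implode(node->right);

	skip_left = toy_can_skip(node->left);
	skip_right = toy_can_skip(node->right);

	if (skip_left && skip_right)
		return NULL;
	if (skip_left)
		return node->right;
	if (skip_right)
		return node->left;
	return node;
}

/* Only trim the plan when no foreign key triggers are waiting to fire. */
static ToyPlan *
toy_init_plan(ToyPlan *plan, int pending_fk_triggers)
{
	assert(pending_fk_triggers >= 0);

	if (pending_fk_triggers == 0)
		return toy_implode(plan);
	return plan;
}
```

With a join of b to a skippable c, toy_init_plan() hands back just the scan
of b when the queue is empty, and the untouched join when it is not.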

Writing this has made me realise that I may need to remove some
functionality that I added to the planner. Currently, after the planner
removes one inner join, it puts that relation in an "ignore list" and then
tries again to remove other relations, this time ignoring any Vars from the
ignored relations. The problem I see with this is that, with a plan such as:

 Hash Join
   Hash Cond: (t1.id = t4.id)
   ->  Hash Join
         Hash Cond: (t1.id = t3.id)
         ->  Hash Join
               Hash Cond: (t1.id = t2.id)
               ->  Seq Scan on t1
               ->  Hash
                     ->  Seq Scan on t2
         ->  Hash
               ->  Seq Scan on t3
   ->  Hash
         ->  Seq Scan on t4

If t1 and t4 are marked as "can remove", then the code that "implodes" the
plan to remove the nodes which are no longer required would render the plan
useless, as there's no join condition between t2 and t3. We'd need to keep
t1 in this case, even though none of its Vars are required. Perhaps I could
fix this by writing some more intelligent code which would leave joins in
place in this situation, and maybe I could coerce the planner into not
producing plans like this by lowering the costs of joins where one of the
relations could be removed. Andres did mention lowering costs previously,
but at the time I'd not realised why it was required.
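
One way the "more intelligent code" could decide whether a set of removals
is safe is to check that the relations being kept are still linked to each
other by join conditions that don't involve a removed relation. The sketch
below is only an illustration of that check under my own assumptions:
ToyJoinCond and toy_kept_rels_connected() are invented names, relations are
identified by small integers used as bit positions, and nothing like this
exists in the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical join condition: left_rel.col = right_rel.col */
typedef struct ToyJoinCond
{
	int			left_rel;	/* small relation id, used as a bit position */
	int			right_rel;
} ToyJoinCond;

/*
 * Returns true if every relation in the 'kept' bitmask can still be reached
 * from every other kept relation using only join conditions whose two sides
 * are both kept.
 */
static bool
toy_kept_rels_connected(const ToyJoinCond *conds, size_t nconds,
						unsigned kept)
{
	unsigned	reached;
	bool		grew = true;

	assert(kept != 0);

	/* start from the lowest-numbered kept relation */
	reached = kept & (~kept + 1u);

	while (grew)
	{
		size_t		i;

		grew = false;
		for (i = 0; i < nconds; i++)
		{
			unsigned	l = 1u << conds[i].left_rel;
			unsigned	r = 1u << conds[i].right_rel;

			/* a condition touching a removed relation is unusable */
			if ((l & kept) == 0 || (r & kept) == 0)
				continue;

			/* exactly one side reached so far: pull in the other side */
			if (((reached & l) != 0) != ((reached & r) != 0))
			{
				reached |= l | r;
				grew = true;
			}
		}
	}

	return reached == kept;
}
```

For the plan above, the conditions are t1-t2, t1-t3 and t1-t4. Keeping only
t2 and t3 fails the check, while keeping t1, t2 and t3 passes it, which
matches the observation that t1 has to stay.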

I'm also a little concerned about Merge Joins. If I removed a Merge Join
because one of the relations was not required, and just left, say, the
SeqScan node for the other relation in place of the Merge Join, then I'd
need to somehow check that none of the parent nodes were expecting some
specific sort order. Perhaps I could just always leave any Sort node in
place, if it existed, and just put the scan below that, but it all feels a
bit like the executor performing voodoo on the plan, i.e. it feels like a
little bit more than the executor should know about plans. I'm a bit
worried that I could spend a week on this and Tom or someone else then
comes along and throws it out.

So I'm really just looking for some confirmation as to whether this is a
good or bad idea, based on the discoveries I've explained above. I really
want to see this stuff working, but at the same time don't want to waste
time on it if it's never going to be committed.

Regards

David Rowley

#40

David Rowley

dgrowleyml@gmail.com

about 11 years ago

In reply to: David Rowley (#39)

1 attachment(s)

Re: Patch to support SEMI and ANTI join removal

On Wed, Nov 19, 2014 at 11:49 PM, David Rowley <dgrowleyml@gmail.com> wrote:

On Sun, Nov 16, 2014 at 12:19 PM, David Rowley <dgrowleyml@gmail.com>
wrote:

On Sun, Nov 16, 2014 at 10:09 AM, Simon Riggs <simon@2ndquadrant.com>
wrote:

I propose that we keep track of whether there are any potentially
skippable joins at the top of the plan. When we begin execution we do
a single if test to see if there is run-time work to do. If we pass
the run-time tests we then descend the tree and prune the plan to
completely remove unnecessary nodes. We end with an EXPLAIN and
EXPLAIN ANALYZE that looks like this

                            QUERY PLAN
------------------------------------------------------------------
 Aggregate (actual rows=1 loops=1)
   ->  Seq Scan on t1 (actual rows=1000000 loops=1)

Doing that removes all the overheads and complexity; it also matches
how join removal currently works.

I've attached an updated patch which works in this way. All of the skipping
code that I had added to the executor's join functions has now been removed.

Here's an example output with the plan trimmed, and then untrimmed.

set constraints b_c_id_fkey deferred;
explain (costs off) select b.* from b inner join c on b.c_id = c.id;
  QUERY PLAN
---------------
 Seq Scan on b
(1 row)

-- add an item to the trigger queue by updating a referenced record.
update c set id = 2 where id=1;
explain (costs off) select b.* from b inner join c on b.c_id = c.id;
          QUERY PLAN
------------------------------
 Hash Join
   Hash Cond: (b.c_id = c.id)
   ->  Seq Scan on b
   ->  Hash
         ->  Seq Scan on c
(5 rows)

A slight quirk with the patch as it stands is that I'm unconditionally NOT
removing Sort nodes that sit below a MergeJoin node. The reason for this is
that I've not quite figured out a way to determine whether the sort order is
still required.

An example of this can be seen in the regression tests:

-- check merge join nodes are removed properly
set enable_hashjoin = off;
-- this should remove joins to b and c.
explain (costs off)
select COUNT(*) from a inner join b on a.b_id = b.id left join c on a.id = c.id;
        QUERY PLAN
---------------------------
 Aggregate
   ->  Sort
         Sort Key: a.b_id
         ->  Seq Scan on a
(4 rows)
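
The Sort-keeping behaviour can be sketched roughly as follows. This is a
simplified model rather than the code in the patch: ToyNode, ToyTag and the
toy_* functions are invented for the example. The point it shows is that
the decision looks through a Sort to the scan beneath it, but the node
handed back to the parent is the merge join's direct child, so any Sort
stays in the plan.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified plan node for illustration only. */
typedef enum { TOY_SEQSCAN, TOY_SORT, TOY_MERGEJOIN } ToyTag;

typedef struct ToyNode
{
	ToyTag		tag;
	bool		skippable;	/* scans only: planner marked the rel removable */
	struct ToyNode *left;
	struct ToyNode *right;
} ToyNode;

/* Look through an optional Sort to decide whether this input is needed. */
static bool
toy_input_skippable(const ToyNode *input)
{
	if (input != NULL && input->tag == TOY_SORT)
		input = input->left;

	return input == NULL ||
		(input->tag == TOY_SEQSCAN && input->skippable);
}

/*
 * Returns the node that should replace the merge join.  When only one side
 * is needed we return that side's direct child, Sort included, because we
 * cannot tell whether a parent node relies on the sort order.
 */
static ToyNode *
toy_try_remove_mergejoin(ToyNode *mj)
{
	bool		skip_left;
	bool		skip_right;

	assert(mj != NULL && mj->tag == TOY_MERGEJOIN);

	skip_left = toy_input_skippable(mj->left);
	skip_right = toy_input_skippable(mj->right);

	if (skip_left && skip_right)
		return NULL;
	if (skip_left)
		return mj->right;
	if (skip_right)
		return mj->left;
	return mj;
}
```

Applied to a merge join of Sort(a) and Sort(b) where b is skippable, this
returns the Sort node over a, which is the same shape as the plan above.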

As the patch stands there are still a couple of FIXMEs in there, so there's
still a bit of work to do.

Comments are welcome.

Regards

David Rowley

Attachments:

inner_join_removals_2014-11-24_7cde1e4.patch (application/octet-stream)

diff --git a/src/backend/commands/trigger.c b/src/backend/commands/trigger.c
index ebccfea..ea26615 100644
--- a/src/backend/commands/trigger.c
+++ b/src/backend/commands/trigger.c
@@ -3889,6 +3889,17 @@ afterTriggerInvokeEvents(AfterTriggerEventList *events,
 	return all_fired;
 }
 
+/* ----------
+ * AfterTriggerQueueIsEmpty()
+ *
+ *	True if there are no pending triggers in the queue.
+ * ----------
+ */
+bool
+AfterTriggerQueueIsEmpty(void)
+{
+	return (afterTriggers.query_depth == -1 && afterTriggers.events.head == NULL);
+}
 
 /* ----------
  * AfterTriggerBeginXact()
diff --git a/src/backend/executor/execMain.c b/src/backend/executor/execMain.c
index a753b20..13b6d92 100644
--- a/src/backend/executor/execMain.c
+++ b/src/backend/executor/execMain.c
@@ -885,6 +885,10 @@ InitPlan(QueryDesc *queryDesc, int eflags)
 		i++;
 	}
 
+	if (AfterTriggerQueueIsEmpty())
+		ExecImplodePlan(&plan, estate);
+
+
 	/*
 	 * Initialize the private state information for all the nodes in the query
 	 * tree.  This opens files, allocates storage and leaves us ready to start
diff --git a/src/backend/executor/execUtils.c b/src/backend/executor/execUtils.c
index d5e1273..15f68f9 100644
--- a/src/backend/executor/execUtils.c
+++ b/src/backend/executor/execUtils.c
@@ -45,6 +45,7 @@
 #include "access/relscan.h"
 #include "access/transam.h"
 #include "catalog/index.h"
+#include "commands/trigger.h"
 #include "executor/execdebug.h"
 #include "nodes/nodeFuncs.h"
 #include "parser/parsetree.h"
@@ -54,6 +55,11 @@
 
 
 static bool get_last_attnums(Node *node, ProjectionInfo *projInfo);
+static bool ExecCanSkipScanNode(Plan *scannode, EState *estate);
+static Plan *ExecTryRemoveHashJoin(Plan *hashjoin, EState *estate);
+static Plan *ExecTryRemoveNestedLoop(Plan *nestedloop, EState *estate);
+static Plan *ExecTryRemoveMergeJoin(Plan *mergejoin, EState *estate);
+static void ExecResetJoinVarnos(List *tlist);
 static bool index_recheck_constraint(Relation index, Oid *constr_procs,
 						 Datum *existing_values, bool *existing_isnull,
 						 Datum *new_values);
@@ -74,7 +80,7 @@ static void ShutdownExprContext(ExprContext *econtext, bool isCommit);
  * Principally, this creates the per-query memory context that will be
  * used to hold all working data that lives till the end of the query.
  * Note that the per-query context will become a child of the caller's
- * CurrentMemoryContext.
+ * CurrentMemoryContext.-
  * ----------------
  */
 EState *
@@ -661,6 +667,271 @@ get_last_attnums(Node *node, ProjectionInfo *projInfo)
 								  (void *) projInfo);
 }
 
+/* Returns true if the scan node may be skipped, otherwise returns false */
+static bool
+ExecCanSkipScanNode(Plan *scannode, EState *estate)
+{
+	Scan *scan = (Scan *) scannode;
+	RangeTblEntry *rte;
+
+	switch (nodeTag(scannode))
+	{
+		case T_IndexOnlyScan:
+		case T_IndexScan:
+		case T_SeqScan:
+			rte = (RangeTblEntry *) list_nth(estate->es_range_table, scan->scanrelid - 1);
+			return rte->skipJoinPossible;
+		default:
+			/* If it's not a scan node then we can't remove it */
+			return false;
+	}
+}
+
+
+static Plan *
+ExecTryRemoveHashJoin(Plan *hashjoin, EState *estate)
+{
+	bool canRemoveLeft = false;
+	bool canRemoveRight = false;
+	Plan *leftnode = hashjoin->lefttree;
+	Plan *rightnode = hashjoin->righttree;
+
+	/*
+	 * If the left node is NULL, then most likely the node has already been
+	 * removed, in this case we can skip it
+	 */
+	if (leftnode == NULL)
+		canRemoveLeft = true;
+	else
+		canRemoveLeft = ExecCanSkipScanNode(leftnode, estate);
+
+	if (rightnode == NULL)
+		canRemoveRight = true;
+	else
+	{
+		if (nodeTag(rightnode) != T_Hash)
+		{
+			elog(ERROR, "HashJoin's righttree node should be a Hash node");
+			return hashjoin;
+		}
+
+		/* move to the node where the hash is getting tuples from */
+		rightnode = rightnode->lefttree;
+
+		/*
+		 * If this node is NULL then most likely a hashjoin has been completely
+		 * removed from below the T_Hash node. In this case we can certainly
+		 * remove the right node as there's nothing under it.
+		 */
+		if (rightnode == NULL)
+			canRemoveRight = true;
+		else
+			canRemoveRight = ExecCanSkipScanNode(rightnode, estate);
+	}
+
+	if (canRemoveLeft)
+	{
+		if (canRemoveRight)
+			return NULL; /* this join is not required at all */
+		else
+			return rightnode;
+	}
+	else
+	{
+		if (canRemoveRight)
+			return leftnode; /* only left is required */
+		else
+			return hashjoin; /* both sides are required */
+	}
+}
+
+static Plan *
+ExecTryRemoveNestedLoop(Plan *nestedloop, EState *estate)
+{
+	bool canRemoveLeft = false;
+	bool canRemoveRight = false;
+	Plan *leftnode = nestedloop->lefttree;
+	Plan *rightnode = nestedloop->righttree;
+
+	/*
+	 * If the left node is NULL, then most likely the node has already been
+	 * removed, in this case we can skip it
+	 */
+	if (leftnode == NULL)
+		canRemoveLeft = true;
+	else
+		canRemoveLeft = ExecCanSkipScanNode(leftnode, estate);
+
+	if (rightnode == NULL)
+		canRemoveRight = true;
+	else
+		canRemoveRight = ExecCanSkipScanNode(rightnode, estate);
+
+	if (canRemoveLeft)
+	{
+		if (canRemoveRight)
+			return NULL; /* this join is not required at all */
+		else
+			return rightnode;
+	}
+	else
+	{
+		if (canRemoveRight)
+			return leftnode; /* only left is required */
+		else
+			return nestedloop; /* both sides are required */
+	}
+}
+
+static Plan *
+ExecTryRemoveMergeJoin(Plan *mergejoin, EState *estate)
+{
+	bool canRemoveLeft = false;
+	bool canRemoveRight = false;
+	Plan *leftnode = mergejoin->lefttree;
+	Plan *rightnode = mergejoin->righttree;
+
+	/*
+	 * If the left node is NULL, then most likely the node has already been
+	 * removed, in this case we can skip it
+	 */
+	if (leftnode == NULL)
+		canRemoveLeft = true;
+	else
+	{
+		if (nodeTag(leftnode) == T_Sort)
+		{
+			/* move to the node where the merge join is getting tuples from */
+			leftnode = leftnode->lefttree;
+		}
+
+		canRemoveLeft = ExecCanSkipScanNode(leftnode, estate);
+	}
+
+	if (rightnode == NULL)
+		canRemoveRight = true;
+	else
+	{
+		if (nodeTag(rightnode) == T_Sort)
+		{
+			/* move to the node where the merge join is getting tuples from */
+			rightnode = rightnode->lefttree;
+		}
+
+		/*
+		 * Check just in case the node from below the sort was already removed,
+		 * if it has then there's no point in this side of the join.
+		 */
+		if (rightnode == NULL)
+			canRemoveRight = true;
+		else
+			canRemoveRight = ExecCanSkipScanNode(rightnode, estate);
+	}
+
+	if (canRemoveLeft)
+	{
+		if (canRemoveRight)
+			return NULL; /* this join is not required at all */
+
+		/*
+		 * Right is required, skip left but maintain any sort nodes above the
+		 * scan node as sort order may be critical for the parent node.
+		 * XXX: Is there any way which we can check if the Sort order is
+		 * important to the parent?
+		 */
+		else
+			return mergejoin->righttree;
+	}
+	else
+	{
+		/* Left is required, but right is not, again keep the sort */
+		if (canRemoveRight)
+			return mergejoin->lefttree;
+		else
+			return mergejoin; /* both sides are required */
+	}
+}
+
+
+/*
+ * Reset a join node's targetlist Vars to remove the OUTER_VAR and INNER_VAR
+ * varnos
+ */
+static void
+ExecResetJoinVarnos(List *tlist)
+{
+	ListCell *lc;
+
+	foreach (lc, tlist)
+	{
+		TargetEntry *tle = (TargetEntry *) lfirst(lc);
+		Var *var = (Var *) tle->expr;
+
+		if (IsA(var, Var))
+			var->varno = var->varnoold;
+	}
+}
+
+/*
+ * Recursively process the plan tree to "move-up" nodes that sit beneath join
+ * nodes of any joins which are deemed unnecessary by the planner during the
+ * join removal process.
+ */
+void
+ExecImplodePlan(Plan **node, EState *estate)
+{
+	Plan *skippedToNode;
+	if (*node == NULL)
+		return;
+
+	/* visit each node recursively */
+	ExecImplodePlan(&(*node)->lefttree, estate);
+	ExecImplodePlan(&(*node)->righttree, estate);
+
+	switch (nodeTag(*node))
+	{
+		case T_HashJoin:
+			skippedToNode = ExecTryRemoveHashJoin(*node, estate);
+			break;
+		case T_NestLoop:
+			skippedToNode = ExecTryRemoveNestedLoop(*node, estate);
+			break;
+		case T_MergeJoin:
+			skippedToNode = ExecTryRemoveMergeJoin(*node, estate);
+			break;
+		default:
+			return;
+	}
+
+	/* both sides of join were removed, so we've nothing more to do here. */
+	if (skippedToNode == NULL)
+	{
+		*node = NULL;
+		return;
+	}
+
+	/*
+	 * If we've managed to move the node up a level, then we'd better also
+	 * replace the targetlist of the new node with that of the original node.
+	 * If we didn't do this then we might end up with columns in the result-set
+	 * that the query did not ask for.
+	 *
+	 * Also, since the original node was a join type node, the targetlist will
+	 * contain OUTER_VAR and INNER_VAR in place of the real varnos, so we must
+	 * put these back to what they should be.
+	 */
+	if (skippedToNode != *node)
+	{
+		/* FIXME: What else apart from Sort should not be changed? */
+		if (nodeTag(skippedToNode) != T_Sort)
+		{
+			ExecResetJoinVarnos((*node)->targetlist);
+			skippedToNode->targetlist = (*node)->targetlist;
+		}
+		*node = skippedToNode;
+	}
+}
+
 /* ----------------
  *		ExecAssignProjectionInfo
  *
diff --git a/src/backend/optimizer/path/equivclass.c b/src/backend/optimizer/path/equivclass.c
index e5dd58e..0a665b2 100644
--- a/src/backend/optimizer/path/equivclass.c
+++ b/src/backend/optimizer/path/equivclass.c
@@ -49,8 +49,6 @@ static List *generate_join_implied_equalities_broken(PlannerInfo *root,
 										Relids outer_relids,
 										Relids nominal_inner_relids,
 										RelOptInfo *inner_rel);
-static Oid select_equality_operator(EquivalenceClass *ec,
-						 Oid lefttype, Oid righttype);
 static RestrictInfo *create_join_clause(PlannerInfo *root,
 				   EquivalenceClass *ec, Oid opno,
 				   EquivalenceMember *leftem,
@@ -1283,7 +1281,7 @@ generate_join_implied_equalities_broken(PlannerInfo *root,
  *
  * Returns InvalidOid if no operator can be found for this datatype combination
  */
-static Oid
+Oid
 select_equality_operator(EquivalenceClass *ec, Oid lefttype, Oid righttype)
 {
 	ListCell   *lc;
diff --git a/src/backend/optimizer/plan/analyzejoins.c b/src/backend/optimizer/plan/analyzejoins.c
index 773f8a4..f120cb9 100644
--- a/src/backend/optimizer/plan/analyzejoins.c
+++ b/src/backend/optimizer/plan/analyzejoins.c
@@ -32,13 +32,21 @@
 #include "utils/lsyscache.h"
 
 /* local functions */
-static bool join_is_removable(PlannerInfo *root, SpecialJoinInfo *sjinfo);
+static bool innerjoin_is_removable(PlannerInfo *root, List *joinlist,
+					  RangeTblRef *removalrtr, Relids ignoredrels);
+static bool leftjoin_is_removable(PlannerInfo *root, SpecialJoinInfo *sjinfo);
+static bool relation_is_needed(PlannerInfo *root, Relids joinrelids,
+					  RelOptInfo *rel, Relids ignoredrels);
+static bool relation_has_foreign_key_for(PlannerInfo *root, RelOptInfo *rel,
+					  RelOptInfo *referencedrel, List *referencing_vars,
+					  List *index_vars, List *operator_list);
+static bool expressions_match_foreign_key(ForeignKeyInfo *fk, List *fkvars,
+					  List *indexvars, List *operators);
 static void remove_rel_from_query(PlannerInfo *root, int relid,
 					  Relids joinrelids);
 static List *remove_rel_from_joinlist(List *joinlist, int relid, int *nremoved);
 static Oid	distinct_col_search(int colno, List *colnos, List *opids);
 
-
 /*
  * remove_useless_joins
  *		Check for relations that don't actually need to be joined at all,
@@ -46,26 +54,94 @@ static Oid	distinct_col_search(int colno, List *colnos, List *opids);
  *
  * We are passed the current joinlist and return the updated list.  Other
  * data structures that have to be updated are accessible via "root".
+ *
+ * There are 2 methods here for removing joins. Joins such as LEFT JOINs
+ * which can be proved to be needless due to lack of use of any of the joining
+ * relation's columns and the existence of a unique index on a subset of the
+ * join clause, can simply be removed from the query plan at plan time. For
+ * certain other join types we make use of foreign keys to attempt to prove the
+ * join is needless, though, for these we're unable to be certain that the join
+ * is not required at plan time, as if the plan is executed when pending
+ * foreign key triggers have not yet been fired, then the foreign key is
+ * effectively violated until these triggers have fired. Removing a join in
+ * such a case could cause a query to produce incorrect results.
+ *
+ * Instead we handle this case by marking the RangeTblEntry for the relation
+ * with a special flag which tells the executor that it's possible that joining
+ * to this relation may not be required. The executor may then check this flag
+ * and choose to skip the join based on if there are foreign key triggers
+ * pending or not.
  */
 List *
 remove_useless_joins(PlannerInfo *root, List *joinlist)
 {
 	ListCell   *lc;
+	Relids		removedrels = NULL;
 
 	/*
-	 * We are only interested in relations that are left-joined to, so we can
-	 * scan the join_info_list to find them easily.
+	 * Start by analyzing INNER JOINed relations in order to determine if any
+	 * of the relations can be ignored.
 	 */
 restart:
+	foreach(lc, joinlist)
+	{
+		RangeTblRef		*rtr = (RangeTblRef *) lfirst(lc);
+		RangeTblEntry	*rte;
+
+		if (!IsA(rtr, RangeTblRef))
+			continue;
+
+		rte = root->simple_rte_array[rtr->rtindex];
+
+		/* Don't try to remove this one again if we've already removed it */
+		if (rte->skipJoinPossible == true)
+			continue;
+
+		/* skip if the join can't be removed */
+		if (!innerjoin_is_removable(root, joinlist, rtr, removedrels))
+			continue;
+
+		/*
+		 * Since we're not actually removing the join here, we need to maintain
+		 * a set of relations that we've "removed". Then, when checking whether
+		 * another relation can be removed, we'll know that if that relation is
+		 * only referenced by relations that we've already "removed", it can
+		 * be safely assumed that the relation is not referenced by any useful
+		 * relation.
+		 */
+		removedrels = bms_add_member(removedrels, rtr->rtindex);
+
+		/*
+		 * Make a mark for the executor to say that it may be able to skip
+		 * joining to this relation.
+		 */
+		rte->skipJoinPossible = true;
+
+		/*
+		 * Restart the scan.  This is necessary to ensure we find all removable
+		 * joins independently of their ordering. (note that since we've added
+		 * this relation to the removedrels, we may now realize that other
+		 * relations can also be removed as they're only referenced by the one
+		 * that we've just marked as possibly removable).
+		 */
+		goto restart;
+	}
+
+	/* now process special joins. Currently only left joins are supported */
 	foreach(lc, root->join_info_list)
 	{
 		SpecialJoinInfo *sjinfo = (SpecialJoinInfo *) lfirst(lc);
 		int			innerrelid;
 		int			nremoved;
 
-		/* Skip if not removable */
-		if (!join_is_removable(root, sjinfo))
-			continue;
+		if (sjinfo->jointype == JOIN_LEFT)
+		{
+			/* Skip if not removable */
+			if (!leftjoin_is_removable(root, sjinfo))
+				continue;
+		}
+		else
+			continue; /* we don't support this join type */
 
 		/*
 		 * Currently, join_is_removable can only succeed when the sjinfo's
@@ -91,12 +167,11 @@ restart:
 		root->join_info_list = list_delete_ptr(root->join_info_list, sjinfo);
 
 		/*
-		 * Restart the scan.  This is necessary to ensure we find all
-		 * removable joins independently of ordering of the join_info_list
-		 * (note that removal of attr_needed bits may make a join appear
-		 * removable that did not before).  Also, since we just deleted the
-		 * current list cell, we'd have to have some kluge to continue the
-		 * list scan anyway.
+		 * Restart the scan.  This is necessary to ensure we find all removable
+		 * joins independently of their ordering. (note that removal of
+		 * attr_needed bits may make a join, inner or outer, appear removable
+		 * that did not before).  Also, since we just deleted the current list
+		 * cell, we'd have to have some kluge to continue the list scan anyway.
 		 */
 		goto restart;
 	}
@@ -136,8 +211,213 @@ clause_sides_match_join(RestrictInfo *rinfo, Relids outerrelids,
 }
 
 /*
- * join_is_removable
- *	  Check whether we need not perform this special join at all, because
+ * innerjoin_is_removable
+ *		True if the join to removalrtr can be removed.
+ *
+ * In order to prove a relation which is inner joined is not required we must
+ * be sure that the join would emit exactly 1 row on the join condition. This
+ * differs from the logic which is used for proving LEFT JOINs can be removed,
+ * where it's possible to just check that a unique index exists on the relation
+ * being removed which has a set of columns that is a subset of the columns
+ * seen in the join condition. If no matching row is found then a left join
+ * would not remove the non-matched row from the result set. This is not the
+ * case with INNER JOINs, so here we must use foreign keys as proof that the
+ * 1 row exists before we can allow any joins to be removed.
+ */
+static bool
+innerjoin_is_removable(PlannerInfo *root, List *joinlist,
+					   RangeTblRef *removalrtr, Relids ignoredrels)
+{
+	ListCell   *lc;
+	RelOptInfo *removalrel;
+
+	removalrel = find_base_rel(root, removalrtr->rtindex);
+
+	/*
+	 * As foreign keys may only reference base rels which have unique indexes,
+	 * we needn't go any further if we're not dealing with a base rel, or if
+	 * the base rel has no unique indexes. We'd also better abort if the
+	 * rtekind is anything but a relation, as things like sub-queries may have
+	 * grouping or distinct clauses that would cause us not to be able to use
+	 * the foreign key to prove the existence of a row matching the join
+	 * condition. We also abort if the rel has no eclass joins as such a rel
+	 * could well be joined using some operator which is not an equality
+	 * operator, or the rel may not even be inner joined at all.
+	 *
+	 * Here we actually only check if the rel has any indexes, ideally we'd be
+	 * checking for unique indexes, but we could only determine that by looping
+	 * over the indexlist, and this is likely too expensive a check to be worth
+	 * it here.
+	 */
+	if (removalrel->reloptkind != RELOPT_BASEREL ||
+		removalrel->rtekind != RTE_RELATION ||
+		removalrel->has_eclass_joins == false ||
+		removalrel->indexlist == NIL)
+		return false;
+
+	/*
+	 * Currently we disallow the removal if we find any baserestrictinfo items
+	 * on the relation being removed. The reason for this is that these would
+	 * filter out rows and make it so the foreign key cannot prove that we'll
+	 * match exactly 1 row on the join condition. However, this check is
+	 * currently probably a bit overly strict as it should be possible to just
+	 * check and ensure that each Var seen in the baserestrictinfo is also
+	 * present in an eclass and if so, just translate and move the whole
+	 * baserestrictinfo over to the relation which has the foreign key to prove
+	 * that this join is not needed. e.g:
+	 * SELECT a.* FROM a INNER JOIN b ON a.b_id = b.id WHERE b.id = 1;
+	 * could become: SELECT a.* FROM a WHERE a.b_id = 1;
+	 */
+	if (removalrel->baserestrictinfo != NIL)
+		return false;
+
+	/*
+	 * Currently only eclass joins are supported, so if there are any non
+	 * eclass join quals then we'll report the join is non-removable.
+	 */
+	if (removalrel->joininfo != NIL)
+		return false;
+
+	/*
+	 * Now we'll search through each relation in the joinlist to see if we can
+	 * find a relation which has a foreign key which references removalrel on
+	 * the join condition. If we find a rel with a foreign key which matches
+	 * the join condition exactly, then we can be sure that exactly 1 row will
+	 * be matched on the join, if we also see that no Vars from the relation
+	 * are needed, then we can report the join as removable.
+	 */
+	foreach (lc, joinlist)
+	{
+		RangeTblRef	*rtr = (RangeTblRef *) lfirst(lc);
+		RelOptInfo	*rel;
+		ListCell	*lc2;
+		List		*referencing_vars;
+		List		*index_vars;
+		List		*operator_list;
+		Relids		 joinrelids;
+
+		/* we can't remove ourself, or anything other than RangeTblRefs */
+		if (rtr == removalrtr || !IsA(rtr, RangeTblRef))
+			continue;
+
+		rel = find_base_rel(root, rtr->rtindex);
+
+		/*
+		 * The only relation type that can help us is a base rel with at least
+		 * one foreign key defined. If there are no eclass joins then this rel
+		 * is not going to help us prove the removalrel is not needed.
+		 */
+		if (rel->reloptkind != RELOPT_BASEREL ||
+			rel->rtekind != RTE_RELATION ||
+			rel->has_eclass_joins == false ||
+			rel->fklist == NIL)
+			continue;
+
+		/*
+		 * Both rels have eclass joins, but do they have eclass joins to each
+		 * other? Skip this rel if it does not.
+		 */
+		if (!have_relevant_eclass_joinclause(root, rel, removalrel))
+			continue;
+
+		joinrelids = bms_union(rel->relids, removalrel->relids);
+
+		/* if any of the Vars from the relation are needed then abort */
+		if (relation_is_needed(root, joinrelids, removalrel, ignoredrels))
+			return false;
+
+		referencing_vars = NIL;
+		index_vars = NIL;
+		operator_list = NIL;
+
+		/* now populate the lists with the join condition Vars */
+		foreach(lc2, root->eq_classes)
+		{
+			EquivalenceClass *ec = (EquivalenceClass *) lfirst(lc2);
+
+			if (list_length(ec->ec_members) <= 1)
+				continue;
+
+			if (bms_overlap(removalrel->relids, ec->ec_relids) &&
+				bms_overlap(rel->relids, ec->ec_relids))
+			{
+				ListCell *lc3;
+				Var *refvar = NULL;
+				Var *idxvar = NULL;
+
+				/*
+				 * Look at each member of the eclass and try to find a Var from
+				 * each side of the join that we can append to the list of
+				 * columns that should be checked against each foreign key.
+				 *
+				 * The following logic does not allow for join removals to take
+				 * place for foreign keys that have duplicate columns on the
+				 * referencing side of the foreign key, such as:
+				 * (a,a) references (x,y)
+				 * The use case for such a foreign key is likely small enough
+				 * that we needn't bother making this code any more complex to
+				 * solve. If we find more than 1 Var from any of the rels then
+				 * we'll bail out.
+				 */
+				foreach (lc3, ec->ec_members)
+				{
+					EquivalenceMember *ecm = (EquivalenceMember *) lfirst(lc3);
+
+					Var *var = (Var *) ecm->em_expr;
+
+					if (!IsA(var, Var))
+						continue; /* Ignore Consts */
+
+					if (var->varno == rel->relid)
+					{
+						if (refvar != NULL)
+							return false;
+						refvar = var;
+					}
+
+					else if (var->varno == removalrel->relid)
+					{
+						if (idxvar != NULL)
+							return false;
+						idxvar = var;
+					}
+				}
+
+				if (refvar != NULL && idxvar != NULL)
+				{
+					Oid opno;
+					Oid reloid = root->simple_rte_array[refvar->varno]->relid;
+
+					if (!get_attnotnull(reloid, refvar->varattno))
+						return false;
+
+					/* grab the correct equality operator for these two vars */
+					opno = select_equality_operator(ec, refvar->vartype, idxvar->vartype);
+
+					if (!OidIsValid(opno))
+						return false;
+
+					referencing_vars = lappend(referencing_vars, refvar);
+					index_vars = lappend(index_vars, idxvar);
+					operator_list = lappend_oid(operator_list, opno);
+				}
+			}
+		}
+
+		if (referencing_vars != NIL)
+		{
+			if (relation_has_foreign_key_for(root, rel, removalrel,
+				referencing_vars, index_vars, operator_list))
+				return true; /* removalrel can be removed */
+		}
+	}
+
+	return false; /* can't remove join */
+}
+
+/*
+ * leftjoin_is_removable
+ *	  Check whether we need not perform this left join at all, because
  *	  it will just duplicate its left input.
  *
  * This is true for a left join for which the join condition cannot match
@@ -147,7 +427,7 @@ clause_sides_match_join(RestrictInfo *rinfo, Relids outerrelids,
  * above the join.
  */
 static bool
-join_is_removable(PlannerInfo *root, SpecialJoinInfo *sjinfo)
+leftjoin_is_removable(PlannerInfo *root, SpecialJoinInfo *sjinfo)
 {
 	int			innerrelid;
 	RelOptInfo *innerrel;
@@ -155,14 +435,14 @@ join_is_removable(PlannerInfo *root, SpecialJoinInfo *sjinfo)
 	Relids		joinrelids;
 	List	   *clause_list = NIL;
 	ListCell   *l;
-	int			attroff;
+
+	Assert(sjinfo->jointype == JOIN_LEFT);
 
 	/*
-	 * Must be a non-delaying left join to a single baserel, else we aren't
+	 * Must be a non-delaying join to a single baserel, else we aren't
 	 * going to be able to do anything with it.
 	 */
-	if (sjinfo->jointype != JOIN_LEFT ||
-		sjinfo->delay_upper_joins ||
+	if (sjinfo->delay_upper_joins ||
 		bms_membership(sjinfo->min_righthand) != BMS_SINGLETON)
 		return false;
 
@@ -205,52 +485,9 @@ join_is_removable(PlannerInfo *root, SpecialJoinInfo *sjinfo)
 	/* Compute the relid set for the join we are considering */
 	joinrelids = bms_union(sjinfo->min_lefthand, sjinfo->min_righthand);
 
-	/*
-	 * We can't remove the join if any inner-rel attributes are used above the
-	 * join.
-	 *
-	 * Note that this test only detects use of inner-rel attributes in higher
-	 * join conditions and the target list.  There might be such attributes in
-	 * pushed-down conditions at this join, too.  We check that case below.
-	 *
-	 * As a micro-optimization, it seems better to start with max_attr and
-	 * count down rather than starting with min_attr and counting up, on the
-	 * theory that the system attributes are somewhat less likely to be wanted
-	 * and should be tested last.
-	 */
-	for (attroff = innerrel->max_attr - innerrel->min_attr;
-		 attroff >= 0;
-		 attroff--)
-	{
-		if (!bms_is_subset(innerrel->attr_needed[attroff], joinrelids))
-			return false;
-	}
-
-	/*
-	 * Similarly check that the inner rel isn't needed by any PlaceHolderVars
-	 * that will be used above the join.  We only need to fail if such a PHV
-	 * actually references some inner-rel attributes; but the correct check
-	 * for that is relatively expensive, so we first check against ph_eval_at,
-	 * which must mention the inner rel if the PHV uses any inner-rel attrs as
-	 * non-lateral references.  Note that if the PHV's syntactic scope is just
-	 * the inner rel, we can't drop the rel even if the PHV is variable-free.
-	 */
-	foreach(l, root->placeholder_list)
-	{
-		PlaceHolderInfo *phinfo = (PlaceHolderInfo *) lfirst(l);
-
-		if (bms_is_subset(phinfo->ph_needed, joinrelids))
-			continue;			/* PHV is not used above the join */
-		if (bms_overlap(phinfo->ph_lateral, innerrel->relids))
-			return false;		/* it references innerrel laterally */
-		if (!bms_overlap(phinfo->ph_eval_at, innerrel->relids))
-			continue;			/* it definitely doesn't reference innerrel */
-		if (bms_is_subset(phinfo->ph_eval_at, innerrel->relids))
-			return false;		/* there isn't any other place to eval PHV */
-		if (bms_overlap(pull_varnos((Node *) phinfo->ph_var->phexpr),
-						innerrel->relids))
-			return false;		/* it does reference innerrel */
-	}
+	/* if the relation is referenced in the query then it cannot be removed */
+	if (relation_is_needed(root, joinrelids, innerrel, NULL))
+		return false;
 
 	/*
 	 * Search for mergejoinable clauses that constrain the inner rel against
@@ -367,6 +604,218 @@ join_is_removable(PlannerInfo *root, SpecialJoinInfo *sjinfo)
 	return false;
 }
 
+/*
+ * relation_is_needed
+ *		True if any of the Vars from this relation are required in the query
+ */
+static inline bool
+relation_is_needed(PlannerInfo *root, Relids joinrelids, RelOptInfo *rel, Relids ignoredrels)
+{
+	int		  attroff;
+	ListCell *l;
+
+	/*
+	 * rel is referenced if any of its attributes are used above the join.
+	 *
+	 * Note that this test only detects use of rel's attributes in higher
+	 * join conditions and the target list.  There might be such attributes in
+	 * pushed-down conditions at this join, too.  We check that case below.
+	 *
+	 * As a micro-optimization, it seems better to start with max_attr and
+	 * count down rather than starting with min_attr and counting up, on the
+	 * theory that the system attributes are somewhat less likely to be wanted
+	 * and should be tested last.
+	 */
+	for (attroff = rel->max_attr - rel->min_attr;
+		 attroff >= 0;
+		 attroff--)
+	{
+		if (!bms_is_subset(bms_difference(rel->attr_needed[attroff], ignoredrels), joinrelids))
+			return true;
+	}
+
+	/*
+	 * Similarly check that rel isn't needed by any PlaceHolderVars that will
+	 * be used above the join.  We only need to fail if such a PHV actually
+	 * references some of rel's attributes; but the correct check for that is
+	 * relatively expensive, so we first check against ph_eval_at, which must
+	 * mention rel if the PHV uses any of rel's attrs as non-lateral
+	 * references.  Note that if the PHV's syntactic scope is just rel, we
+	 * must return true even if the PHV is variable-free.
+	 */
+	foreach(l, root->placeholder_list)
+	{
+		PlaceHolderInfo *phinfo = (PlaceHolderInfo *) lfirst(l);
+
+		if (bms_is_subset(phinfo->ph_needed, joinrelids))
+			continue;			/* PHV is not used above the join */
+		if (bms_overlap(phinfo->ph_lateral, rel->relids))
+			return true;		/* it references rel laterally */
+		if (!bms_overlap(phinfo->ph_eval_at, rel->relids))
+			continue;			/* it definitely doesn't reference rel */
+		if (bms_is_subset(phinfo->ph_eval_at, rel->relids))
+			return true;		/* there isn't any other place to eval PHV */
+		if (bms_overlap(pull_varnos((Node *) phinfo->ph_var->phexpr),
+						rel->relids))
+			return true;		/* it does reference rel */
+	}
+
+	return false; /* it does not reference rel */
+}
+
+/*
+ * relation_has_foreign_key_for
+ *	  Checks if rel has a foreign key which references referencedrel with the
+ *	  given list of expressions.
+ *
+ *	For the match to succeed:
+ *	  referencing_vars must match the columns defined in the foreign key.
+ *	  index_vars must match the columns defined in the index for the foreign key.
+ */
+static bool
+relation_has_foreign_key_for(PlannerInfo *root, RelOptInfo *rel,
+			RelOptInfo *referencedrel, List *referencing_vars,
+			List *index_vars, List *operator_list)
+{
+	ListCell *lc;
+	Oid		  refreloid;
+
+	/*
+	 * Look up the Oid of the referenced relation. We only want to look at
+	 * foreign keys on the referencing relation which reference this relation.
+	 */
+	refreloid = root->simple_rte_array[referencedrel->relid]->relid;
+
+	Assert(list_length(referencing_vars) > 0);
+	Assert(list_length(referencing_vars) == list_length(index_vars));
+	Assert(list_length(referencing_vars) == list_length(operator_list));
+
+	/*
+	 * Search through each foreign key on the referencing relation and try
+	 * to find one which references the relation in the join condition. If we
+	 * find one then we'll send the join conditions off to
+	 * expressions_match_foreign_key() to see if they match the foreign key.
+	 */
+	foreach(lc, rel->fklist)
+	{
+		ForeignKeyInfo *fk = (ForeignKeyInfo *) lfirst(lc);
+
+		if (fk->confrelid == refreloid)
+		{
+			if (expressions_match_foreign_key(fk, referencing_vars,
+				index_vars, operator_list))
+				return true;
+		}
+	}
+
+	return false;
+}
+
+/*
+ * expressions_match_foreign_key
+ *		True if the given fkvars, indexvars and operators will match
+ *		exactly 1 record in the referenced relation of the foreign key.
+ *
+ * Note: This function expects fkvars and indexvars to only contain Var types.
+ *		 Expression indexes are not supported by foreign keys.
+ */
+static bool
+expressions_match_foreign_key(ForeignKeyInfo *fk, List *fkvars,
+					List *indexvars, List *operators)
+{
+	ListCell  *lc;
+	ListCell  *lc2;
+	ListCell  *lc3;
+	Bitmapset *allitems;
+	Bitmapset *matcheditems;
+	int		   lstidx;
+	int		   col;
+
+	Assert(list_length(fkvars) == list_length(indexvars));
+	Assert(list_length(fkvars) == list_length(operators));
+
+	/*
+	 * Fast path out if there are not enough conditions to match each column
+	 * in the foreign key. Note that we cannot check that the number of
+	 * expressions is equal here since it would cause any expressions which
+	 * are duplicated not to match.
+	 */
+	if (list_length(fkvars) < fk->conncols)
+		return false;
+
+	/*
+	 * We need to ensure that each foreign key column can be matched to a list
+	 * item, and we need to ensure that each list item can be matched to a
+	 * foreign key column. We do this by looping over each foreign key column
+	 * and checking that we can find an item in the list which matches the
+	 * current column, however this method does not allow us to ensure that no
+	 * additional items exist in the list. We could solve that by performing
+	 * another loop over each list item and check that it matches a foreign key
+	 * column, but that's a bit wasteful. Instead we'll use 2 bitmapsets, one
+	 * to store the 0 based index of each list item, and with the other we'll
+	 * store each list index that we've managed to match. After we're done
+	 * matching we'll just make sure that both bitmapsets are equal.
+	 */
+	allitems = NULL;
+	matcheditems = NULL;
+
+	/*
+	 * Build a bitmapset which contains each 0 based list index. It seems more
+	 * efficient to do this in reverse so that we allocate enough memory for
+	 * the bitmapset on first loop rather than reallocating each time we find
+	 * we need a bit more space.
+	 */
+	for (lstidx = list_length(fkvars) - 1; lstidx >= 0; lstidx--)
+		allitems = bms_add_member(allitems, lstidx);
+
+	for (col = 0; col < fk->conncols; col++)
+	{
+		bool  matched = false;
+
+		lstidx = 0;
+
+		forthree(lc, fkvars, lc2, indexvars, lc3, operators)
+		{
+			Var *expr = (Var *) lfirst(lc);
+			Var *idxexpr = (Var *) lfirst(lc2);
+			Oid  opr = lfirst_oid(lc3);
+
+			Assert(IsA(expr, Var));
+			Assert(IsA(idxexpr, Var));
+
+			/* Does this join qual match up to the current fkey column? */
+			if (fk->conkey[col] == expr->varattno &&
+				fk->confkey[col] == idxexpr->varattno &&
+				equality_ops_are_compatible(opr, fk->conpfeqop[col]))
+			{
+				matched = true;
+
+				/* mark this list item as matched */
+				matcheditems = bms_add_member(matcheditems, lstidx);
+
+				/*
+				 * Don't break here as there may be duplicate expressions
+				 * that we also need to match against.
+				 */
+			}
+			lstidx++;
+		}
+
+		/* punt if there's no match. */
+		if (!matched)
+			return false;
+	}
+
+	/*
+	 * Ensure that we managed to match every item in the list to a foreign key
+	 * column.
+	 */
+	if (!bms_equal(allitems, matcheditems))
+		return false;
+
+	return true; /* matched */
+}
+
 
 /*
  * Remove the target relid from the planner's data structures, having
diff --git a/src/backend/optimizer/plan/planner.c b/src/backend/optimizer/plan/planner.c
index fb74d6b..7ea0149 100644
--- a/src/backend/optimizer/plan/planner.c
+++ b/src/backend/optimizer/plan/planner.c
@@ -3712,6 +3712,7 @@ plan_cluster_use_sort(Oid tableOid, Oid indexOid)
 	rte->lateral = false;
 	rte->inh = false;
 	rte->inFromCl = true;
+	rte->skipJoinPossible = false;
 	query->rtable = list_make1(rte);
 
 	/* Set up RTE/RelOptInfo arrays */
diff --git a/src/backend/optimizer/prep/prepsecurity.c b/src/backend/optimizer/prep/prepsecurity.c
index b625b5c..74a0dca 100644
--- a/src/backend/optimizer/prep/prepsecurity.c
+++ b/src/backend/optimizer/prep/prepsecurity.c
@@ -311,6 +311,7 @@ expand_security_qual(PlannerInfo *root, List *tlist, int rt_index,
 			subrte->security_barrier = rte->security_barrier;
 			subrte->eref = copyObject(rte->eref);
 			subrte->inFromCl = true;
+			subrte->skipJoinPossible = false;
 			subquery->rtable = list_make1(subrte);
 
 			subrtr = makeNode(RangeTblRef);
diff --git a/src/backend/optimizer/util/plancat.c b/src/backend/optimizer/util/plancat.c
index b2becfa..fea198e 100644
--- a/src/backend/optimizer/util/plancat.c
+++ b/src/backend/optimizer/util/plancat.c
@@ -25,7 +25,9 @@
 #include "access/transam.h"
 #include "access/xlog.h"
 #include "catalog/catalog.h"
+#include "catalog/pg_constraint.h"
 #include "catalog/heap.h"
+#include "catalog/pg_type.h"
 #include "foreign/fdwapi.h"
 #include "miscadmin.h"
 #include "nodes/makefuncs.h"
@@ -38,6 +40,7 @@
 #include "parser/parsetree.h"
 #include "rewrite/rewriteManip.h"
 #include "storage/bufmgr.h"
+#include "utils/fmgroids.h"
 #include "utils/lsyscache.h"
 #include "utils/rel.h"
 #include "utils/snapmgr.h"
@@ -89,6 +92,12 @@ get_relation_info(PlannerInfo *root, Oid relationObjectId, bool inhparent,
 	Relation	relation;
 	bool		hasindex;
 	List	   *indexinfos = NIL;
+	List	   *fkinfos = NIL;
+	Relation	fkeyRel;
+	Relation	fkeyRelIdx;
+	ScanKeyData fkeyScankey;
+	SysScanDesc fkeyScan;
+	HeapTuple	tuple;
 
 	/*
 	 * We need not lock the relation since it was already locked, either by
@@ -384,6 +393,111 @@ get_relation_info(PlannerInfo *root, Oid relationObjectId, bool inhparent,
 
 	heap_close(relation, NoLock);
 
+	/* load foreign key constraints */
+	ScanKeyInit(&fkeyScankey,
+				Anum_pg_constraint_conrelid,
+				BTEqualStrategyNumber, F_OIDEQ,
+				ObjectIdGetDatum(relationObjectId));
+
+	fkeyRel = heap_open(ConstraintRelationId, AccessShareLock);
+	fkeyRelIdx = index_open(ConstraintRelidIndexId, AccessShareLock);
+	fkeyScan = systable_beginscan_ordered(fkeyRel, fkeyRelIdx, NULL, 1, &fkeyScankey);
+
+	while ((tuple = systable_getnext_ordered(fkeyScan, ForwardScanDirection)) != NULL)
+	{
+		Form_pg_constraint con = (Form_pg_constraint) GETSTRUCT(tuple);
+		ForeignKeyInfo *fkinfo;
+		Datum		adatum;
+		bool		isNull;
+		ArrayType  *arr;
+		int			nelements;
+
+		/* skip if not a foreign key */
+		if (con->contype != CONSTRAINT_FOREIGN)
+			continue;
+
+		/* we're not interested unless the fkey has been validated */
+		if (!con->convalidated)
+			continue;
+
+		fkinfo = (ForeignKeyInfo *) palloc(sizeof(ForeignKeyInfo));
+		fkinfo->conindid = con->conindid;
+		fkinfo->confrelid = con->confrelid;
+		fkinfo->convalidated = con->convalidated;
+		fkinfo->conrelid = con->conrelid;
+		fkinfo->confupdtype = con->confupdtype;
+		fkinfo->confdeltype = con->confdeltype;
+		fkinfo->confmatchtype = con->confmatchtype;
+
+		adatum = heap_getattr(tuple, Anum_pg_constraint_conkey,
+							RelationGetDescr(fkeyRel), &isNull);
+
+		if (isNull)
+			elog(ERROR, "null conkey for constraint %u",
+				HeapTupleGetOid(tuple));
+
+		arr = DatumGetArrayTypeP(adatum);		/* ensure not toasted */
+		nelements = ARR_DIMS(arr)[0];
+		if (ARR_NDIM(arr) != 1 ||
+			nelements < 0 ||
+			ARR_HASNULL(arr) ||
+			ARR_ELEMTYPE(arr) != INT2OID)
+			elog(ERROR, "conkey is not a 1-D smallint array");
+
+		fkinfo->conkey = (int16 *) ARR_DATA_PTR(arr);
+		fkinfo->conncols = nelements;
+
+		adatum = heap_getattr(tuple, Anum_pg_constraint_confkey,
+							RelationGetDescr(fkeyRel), &isNull);
+
+		if (isNull)
+			elog(ERROR, "null confkey for constraint %u",
+				HeapTupleGetOid(tuple));
+
+		arr = DatumGetArrayTypeP(adatum);		/* ensure not toasted */
+		nelements = ARR_DIMS(arr)[0];
+
+		if (ARR_NDIM(arr) != 1 ||
+			nelements < 0 ||
+			ARR_HASNULL(arr) ||
+			ARR_ELEMTYPE(arr) != INT2OID)
+			elog(ERROR, "confkey is not a 1-D smallint array");
+
+		/* sanity check */
+		if (nelements != fkinfo->conncols)
+			elog(ERROR, "number of confkey elements does not equal conkey elements");
+
+		fkinfo->confkey = (int16 *) ARR_DATA_PTR(arr);
+		adatum = heap_getattr(tuple, Anum_pg_constraint_conpfeqop,
+							RelationGetDescr(fkeyRel), &isNull);
+
+		if (isNull)
+			elog(ERROR, "null conpfeqop for constraint %u",
+				HeapTupleGetOid(tuple));
+
+		arr = DatumGetArrayTypeP(adatum);		/* ensure not toasted */
+		nelements = ARR_DIMS(arr)[0];
+
+		if (ARR_NDIM(arr) != 1 ||
+			nelements < 0 ||
+			ARR_HASNULL(arr) ||
+			ARR_ELEMTYPE(arr) != OIDOID)
+			elog(ERROR, "conpfeqop is not a 1-D Oid array");
+
+		/* sanity check */
+		if (nelements != fkinfo->conncols)
+			elog(ERROR, "number of conpfeqop elements does not equal conkey elements");
+
+		fkinfo->conpfeqop = (Oid *) ARR_DATA_PTR(arr);
+
+		fkinfos = lappend(fkinfos, fkinfo);
+	}
+
+	rel->fklist = fkinfos;
+	systable_endscan_ordered(fkeyScan);
+	index_close(fkeyRelIdx, AccessShareLock);
+	heap_close(fkeyRel, AccessShareLock);
+
 	/*
 	 * Allow a plugin to editorialize on the info we obtained from the
 	 * catalogs.  Actions might include altering the assumed relation size,
diff --git a/src/backend/optimizer/util/relnode.c b/src/backend/optimizer/util/relnode.c
index 4c76f54..58d80bb 100644
--- a/src/backend/optimizer/util/relnode.c
+++ b/src/backend/optimizer/util/relnode.c
@@ -115,6 +115,7 @@ build_simple_rel(PlannerInfo *root, int relid, RelOptKind reloptkind)
 	rel->lateral_relids = NULL;
 	rel->lateral_referencers = NULL;
 	rel->indexlist = NIL;
+	rel->fklist = NIL;
 	rel->pages = 0;
 	rel->tuples = 0;
 	rel->allvisfrac = 0;
@@ -377,6 +378,7 @@ build_join_rel(PlannerInfo *root,
 	joinrel->lateral_relids = NULL;
 	joinrel->lateral_referencers = NULL;
 	joinrel->indexlist = NIL;
+	joinrel->fklist = NIL;
 	joinrel->pages = 0;
 	joinrel->tuples = 0;
 	joinrel->allvisfrac = 0;
diff --git a/src/backend/parser/parse_relation.c b/src/backend/parser/parse_relation.c
index 478584d..cafeba9 100644
--- a/src/backend/parser/parse_relation.c
+++ b/src/backend/parser/parse_relation.c
@@ -1048,6 +1048,7 @@ addRangeTableEntry(ParseState *pstate,
 	rte->lateral = false;
 	rte->inh = inh;
 	rte->inFromCl = inFromCl;
+	rte->skipJoinPossible = false;
 
 	rte->requiredPerms = ACL_SELECT;
 	rte->checkAsUser = InvalidOid;		/* not set-uid by default, either */
@@ -1101,6 +1102,7 @@ addRangeTableEntryForRelation(ParseState *pstate,
 	rte->lateral = false;
 	rte->inh = inh;
 	rte->inFromCl = inFromCl;
+	rte->skipJoinPossible = false;
 
 	rte->requiredPerms = ACL_SELECT;
 	rte->checkAsUser = InvalidOid;		/* not set-uid by default, either */
@@ -1179,6 +1181,7 @@ addRangeTableEntryForSubquery(ParseState *pstate,
 	rte->lateral = lateral;
 	rte->inh = false;			/* never true for subqueries */
 	rte->inFromCl = inFromCl;
+	rte->skipJoinPossible = false;
 
 	rte->requiredPerms = 0;
 	rte->checkAsUser = InvalidOid;
@@ -1433,6 +1436,7 @@ addRangeTableEntryForFunction(ParseState *pstate,
 	rte->lateral = lateral;
 	rte->inh = false;			/* never true for functions */
 	rte->inFromCl = inFromCl;
+	rte->skipJoinPossible = false;
 
 	rte->requiredPerms = 0;
 	rte->checkAsUser = InvalidOid;
@@ -1505,6 +1509,7 @@ addRangeTableEntryForValues(ParseState *pstate,
 	rte->lateral = lateral;
 	rte->inh = false;			/* never true for values RTEs */
 	rte->inFromCl = inFromCl;
+	rte->skipJoinPossible = false;
 
 	rte->requiredPerms = 0;
 	rte->checkAsUser = InvalidOid;
@@ -1573,6 +1578,7 @@ addRangeTableEntryForJoin(ParseState *pstate,
 	rte->lateral = false;
 	rte->inh = false;			/* never true for joins */
 	rte->inFromCl = inFromCl;
+	rte->skipJoinPossible = false;
 
 	rte->requiredPerms = 0;
 	rte->checkAsUser = InvalidOid;
@@ -1673,6 +1679,7 @@ addRangeTableEntryForCTE(ParseState *pstate,
 	rte->lateral = false;
 	rte->inh = false;			/* never true for subqueries */
 	rte->inFromCl = inFromCl;
+	rte->skipJoinPossible = false;
 
 	rte->requiredPerms = 0;
 	rte->checkAsUser = InvalidOid;
diff --git a/src/backend/utils/adt/ruleutils.c b/src/backend/utils/adt/ruleutils.c
index 24ade6c..11ab914 100644
--- a/src/backend/utils/adt/ruleutils.c
+++ b/src/backend/utils/adt/ruleutils.c
@@ -843,6 +843,7 @@ pg_get_triggerdef_worker(Oid trigid, bool pretty)
 		oldrte->lateral = false;
 		oldrte->inh = false;
 		oldrte->inFromCl = true;
+		oldrte->skipJoinPossible = false;
 
 		newrte = makeNode(RangeTblEntry);
 		newrte->rtekind = RTE_RELATION;
@@ -853,6 +854,7 @@ pg_get_triggerdef_worker(Oid trigid, bool pretty)
 		newrte->lateral = false;
 		newrte->inh = false;
 		newrte->inFromCl = true;
+		newrte->skipJoinPossible = false;
 
 		/* Build two-element rtable */
 		memset(&dpns, 0, sizeof(dpns));
@@ -2508,6 +2510,7 @@ deparse_context_for(const char *aliasname, Oid relid)
 	rte->lateral = false;
 	rte->inh = false;
 	rte->inFromCl = true;
+	rte->skipJoinPossible = false;
 
 	/* Build one-element rtable */
 	dpns->rtable = list_make1(rte);
diff --git a/src/backend/utils/cache/lsyscache.c b/src/backend/utils/cache/lsyscache.c
index 552e498..aa81c7c 100644
--- a/src/backend/utils/cache/lsyscache.c
+++ b/src/backend/utils/cache/lsyscache.c
@@ -916,6 +916,33 @@ get_atttypetypmodcoll(Oid relid, AttrNumber attnum,
 	ReleaseSysCache(tp);
 }
 
+/*
+ * get_attnotnull
+ *
+ *		Given the relation id and the attribute number,
+ *		return the "attnotnull" field from the attribute relation.
+ */
+bool
+get_attnotnull(Oid relid, AttrNumber attnum)
+{
+	HeapTuple	tp;
+
+	tp = SearchSysCache2(ATTNUM,
+						 ObjectIdGetDatum(relid),
+						 Int16GetDatum(attnum));
+	if (HeapTupleIsValid(tp))
+	{
+		Form_pg_attribute att_tup = (Form_pg_attribute) GETSTRUCT(tp);
+		bool		result;
+
+		result = att_tup->attnotnull;
+		ReleaseSysCache(tp);
+		return result;
+	}
+	else
+		return false;
+}
+
 /*				---------- COLLATION CACHE ----------					 */
 
 /*
diff --git a/src/include/commands/trigger.h b/src/include/commands/trigger.h
index d0b0356..34a75e4 100644
--- a/src/include/commands/trigger.h
+++ b/src/include/commands/trigger.h
@@ -181,6 +181,7 @@ extern void ExecBSTruncateTriggers(EState *estate,
 extern void ExecASTruncateTriggers(EState *estate,
 					   ResultRelInfo *relinfo);
 
+extern bool AfterTriggerQueueIsEmpty(void);
 extern void AfterTriggerBeginXact(void);
 extern void AfterTriggerBeginQuery(void);
 extern void AfterTriggerEndQuery(EState *estate);
diff --git a/src/include/executor/executor.h b/src/include/executor/executor.h
index ed3ae39..1c2ef45 100644
--- a/src/include/executor/executor.h
+++ b/src/include/executor/executor.h
@@ -340,6 +340,7 @@ extern ProjectionInfo *ExecBuildProjectionInfo(List *targetList,
 						ExprContext *econtext,
 						TupleTableSlot *slot,
 						TupleDesc inputDesc);
+extern void ExecImplodePlan(Plan **planstate, EState *estate);
 extern void ExecAssignProjectionInfo(PlanState *planstate,
 						 TupleDesc inputDesc);
 extern void ExecFreeExprContext(PlanState *planstate);
diff --git a/src/include/nodes/parsenodes.h b/src/include/nodes/parsenodes.h
index 3e4f815..7f74202 100644
--- a/src/include/nodes/parsenodes.h
+++ b/src/include/nodes/parsenodes.h
@@ -813,6 +813,8 @@ typedef struct RangeTblEntry
 	bool		lateral;		/* subquery, function, or values is LATERAL? */
 	bool		inh;			/* inheritance requested? */
 	bool		inFromCl;		/* present in FROM clause? */
+	bool		skipJoinPossible; /* it may be possible to not bother joining
+								   * this relation at all */
 	AclMode		requiredPerms;	/* bitmask of required access permissions */
 	Oid			checkAsUser;	/* if valid, check access as this role */
 	Bitmapset  *selectedCols;	/* columns needing SELECT permission */
diff --git a/src/include/nodes/relation.h b/src/include/nodes/relation.h
index 810b9c8..1947ce3 100644
--- a/src/include/nodes/relation.h
+++ b/src/include/nodes/relation.h
@@ -359,6 +359,8 @@ typedef struct PlannerInfo
  *		lateral_referencers - relids of rels that reference this one laterally
  *		indexlist - list of IndexOptInfo nodes for relation's indexes
  *					(always NIL if it's not a table)
+ *		fklist - list of ForeignKeyInfo's for relation's foreign key
+ *					constraints. (always NIL if it's not a table)
  *		pages - number of disk pages in relation (zero if not a table)
  *		tuples - number of tuples in relation (not considering restrictions)
  *		allvisfrac - fraction of disk pages that are marked all-visible
@@ -452,6 +454,7 @@ typedef struct RelOptInfo
 	Relids		lateral_relids; /* minimum parameterization of rel */
 	Relids		lateral_referencers;	/* rels that reference me laterally */
 	List	   *indexlist;		/* list of IndexOptInfo */
+	List	   *fklist;			/* list of ForeignKeyInfo */
 	BlockNumber pages;			/* size estimates derived from pg_class */
 	double		tuples;
 	double		allvisfrac;
@@ -542,6 +545,51 @@ typedef struct IndexOptInfo
 	bool		amhasgetbitmap; /* does AM have amgetbitmap interface? */
 } IndexOptInfo;
 
+/*
+ * ForeignKeyInfo
+ *		Used to store pg_constraint records for foreign key constraints for use
+ *		by the planner.
+ *
+ *		conindid - The index which supports the foreign key
+ *
+ *		confrelid - The relation that is referenced by this foreign key
+ *
+ *		convalidated - True if the foreign key has been validated.
+ *
+ *		conrelid - The Oid of the relation that the foreign key belongs to
+ *
+ *		confupdtype - ON UPDATE action for when the referenced table is updated
+ *
+ *		confdeltype - ON DELETE action, controls what to do when a record is
+ *					deleted from the referenced table.
+ *
+ *		confmatchtype - foreign key match type, e.g. MATCH FULL, MATCH PARTIAL
+ *
+ *		conncols - Number of columns defined in the foreign key
+ *
+ *		conkey - An array of conncols elements to store the varattno of the
+ *					columns on the referencing side of the foreign key
+ *
+ *		confkey - An array of conncols elements to store the varattno of the
+ *					columns on the referenced side of the foreign key
+ *
+ *		conpfeqop - An array of conncols elements to store the operators for
+ *					PK = FK comparisons
+ */
+typedef struct ForeignKeyInfo
+{
+	Oid			conindid;		/* index supporting this constraint */
+	Oid			confrelid;		/* relation referenced by foreign key */
+	bool		convalidated;	/* constraint has been validated? */
+	Oid			conrelid;		/* relation this constraint constrains */
+	char		confupdtype;	/* foreign key's ON UPDATE action */
+	char		confdeltype;	/* foreign key's ON DELETE action */
+	char		confmatchtype;	/* foreign key's match type */
+	int			conncols;		/* number of columns referenced */
+	int16	   *conkey;			/* Columns of conrelid that the constraint applies to */
+	int16	   *confkey;		/* columns of confrelid that foreign key references */
+	Oid		   *conpfeqop;		/* Operator list for comparing PK to FK */
+} ForeignKeyInfo;
 
 /*
  * EquivalenceClasses
diff --git a/src/include/optimizer/paths.h b/src/include/optimizer/paths.h
index afa5f9b..6dada00 100644
--- a/src/include/optimizer/paths.h
+++ b/src/include/optimizer/paths.h
@@ -119,6 +119,8 @@ extern List *generate_join_implied_equalities(PlannerInfo *root,
 								 Relids join_relids,
 								 Relids outer_relids,
 								 RelOptInfo *inner_rel);
+extern Oid select_equality_operator(EquivalenceClass *ec, Oid lefttype,
+								 Oid righttype);
 extern bool exprs_known_equal(PlannerInfo *root, Node *item1, Node *item2);
 extern void add_child_rel_equivalences(PlannerInfo *root,
 						   AppendRelInfo *appinfo,
diff --git a/src/include/utils/lsyscache.h b/src/include/utils/lsyscache.h
index 07d24d4..910190d 100644
--- a/src/include/utils/lsyscache.h
+++ b/src/include/utils/lsyscache.h
@@ -68,6 +68,7 @@ extern Oid	get_atttype(Oid relid, AttrNumber attnum);
 extern int32 get_atttypmod(Oid relid, AttrNumber attnum);
 extern void get_atttypetypmodcoll(Oid relid, AttrNumber attnum,
 					  Oid *typid, int32 *typmod, Oid *collid);
+extern bool get_attnotnull(Oid relid, AttrNumber attnum);
 extern char *get_collation_name(Oid colloid);
 extern char *get_constraint_name(Oid conoid);
 extern Oid	get_opclass_family(Oid opclass);
diff --git a/src/test/regress/expected/join.out b/src/test/regress/expected/join.out
index 2501184..21c97e4 100644
--- a/src/test/regress/expected/join.out
+++ b/src/test/regress/expected/join.out
@@ -3276,6 +3276,178 @@ select i8.* from int8_tbl i8 left join (select f1 from int4_tbl group by f1) i4
 (1 row)
 
 rollback;
+begin work;
+create temp table c (
+  id int primary key
+);
+create temp table b (
+  id int primary key,
+  c_id int not null,
+  val int not null,
+  constraint b_c_id_fkey foreign key (c_id) references c deferrable
+);
+create temp table a (
+  id int primary key,
+  b_id int not null,
+  constraint a_b_id_fkey foreign key (b_id) references b deferrable
+);
+insert into c (id) values(1);
+insert into b (id,c_id,val) values(2,1,10);
+insert into a (id,b_id) values(3,2);
+-- this should remove inner join to b
+explain (costs off)
+select a.* from a inner join b on a.b_id = b.id;
+  QUERY PLAN   
+---------------
+ Seq Scan on a
+(1 row)
+
+-- this should remove inner join to b and c
+explain (costs off)
+select a.* from a inner join b on a.b_id = b.id inner join c on b.c_id = c.id;
+  QUERY PLAN   
+---------------
+ Seq Scan on a
+(1 row)
+
+-- Ensure all of the target entries have their proper aliases.
+select a.* from a inner join b on a.b_id = b.id inner join c on b.c_id = c.id;
+ id | b_id 
+----+------
+  3 |    2
+(1 row)
+
+-- change order of tables in query, this should generate the same plan as above.
+explain (costs off)
+select a.* from c inner join b on c.id = b.c_id inner join a on a.b_id = b.id;
+  QUERY PLAN   
+---------------
+ Seq Scan on a
+(1 row)
+
+-- inner join can't be removed due to b columns in the target list
+explain (costs off)
+select * from a inner join b on a.b_id = b.id;
+          QUERY PLAN          
+------------------------------
+ Hash Join
+   Hash Cond: (a.b_id = b.id)
+   ->  Seq Scan on a
+   ->  Hash
+         ->  Seq Scan on b
+(5 rows)
+
+-- this should not remove inner join to b due to quals restricting results from b
+explain (costs off)
+select a.* from a inner join b on a.b_id = b.id where b.val = 10;
+            QUERY PLAN            
+----------------------------------
+ Hash Join
+   Hash Cond: (a.b_id = b.id)
+   ->  Seq Scan on a
+   ->  Hash
+         ->  Seq Scan on b
+               Filter: (val = 10)
+(6 rows)
+
+-- check merge join nodes are removed properly 
+set enable_hashjoin = off;
+-- this should remove joins to b and c.
+explain (costs off)
+select COUNT(*) from a inner join b on a.b_id = b.id left join c on a.id = c.id;
+        QUERY PLAN         
+---------------------------
+ Aggregate
+   ->  Sort
+         Sort Key: a.b_id
+         ->  Seq Scan on a
+(4 rows)
+
+-- this should remove joins to b and c, however it b will only be removed on
+-- 2nd attempt after c is removed by the left join removal code.
+explain (costs off)
+select COUNT(*) from a inner join b on a.b_id = b.id left join c on b.id = c.id;
+        QUERY PLAN         
+---------------------------
+ Aggregate
+   ->  Sort
+         Sort Key: a.b_id
+         ->  Seq Scan on a
+(4 rows)
+
+set enable_hashjoin = on;
+-- this should not remove join to b
+explain (costs off)
+select a.* from a inner join b on a.b_id = b.id where b.val = b.id;
+            QUERY PLAN            
+----------------------------------
+ Hash Join
+   Hash Cond: (a.b_id = b.id)
+   ->  Seq Scan on a
+   ->  Hash
+         ->  Seq Scan on b
+               Filter: (id = val)
+(6 rows)
+
+-- this should not remove the join, no foreign key exists between a.id and b.id
+explain (costs off)
+select a.* from a inner join b on a.id = b.id;
+         QUERY PLAN         
+----------------------------
+ Hash Join
+   Hash Cond: (a.id = b.id)
+   ->  Seq Scan on a
+   ->  Hash
+         ->  Seq Scan on b
+(5 rows)
+
+-- ensure a left joined rel can't remove an inner joined rel
+explain (costs off)
+select a.* from b LEFT JOIN a on b.id = a.b_id;
+          QUERY PLAN          
+------------------------------
+ Hash Right Join
+   Hash Cond: (a.b_id = b.id)
+   ->  Seq Scan on a
+   ->  Hash
+         ->  Seq Scan on b
+(5 rows)
+
+-- Ensure we remove b, but don't try and remove c. c has no join condition.
+explain (costs off)
+select a.* from a inner join b on a.b_id = b.id cross join c;
+        QUERY PLAN         
+---------------------------
+ Nested Loop
+   ->  Seq Scan on c
+   ->  Materialize
+         ->  Seq Scan on a
+(4 rows)
+
+set constraints b_c_id_fkey deferred;
+-- join should be removed.
+explain (costs off)
+select b.* from b inner join c on b.c_id = c.id;
+  QUERY PLAN   
+---------------
+ Seq Scan on b
+(1 row)
+
+-- perform an update which will cause some pending fk triggers to be added
+update c set id = 2 where id=1;
+-- ensure inner join is no longer removed.
+explain (costs off)
+select b.* from b inner join c on b.c_id = c.id;
+          QUERY PLAN          
+------------------------------
+ Hash Join
+   Hash Cond: (b.c_id = c.id)
+   ->  Seq Scan on b
+   ->  Hash
+         ->  Seq Scan on c
+(5 rows)
+
+rollback;
 create temp table parent (k int primary key, pd int);
 create temp table child (k int unique, cd int);
 insert into parent values (1, 10), (2, 20), (3, 30);
diff --git a/src/test/regress/sql/join.sql b/src/test/regress/sql/join.sql
index 718e1d9..6cf6b87 100644
--- a/src/test/regress/sql/join.sql
+++ b/src/test/regress/sql/join.sql
@@ -977,6 +977,95 @@ select i8.* from int8_tbl i8 left join (select f1 from int4_tbl group by f1) i4
 
 rollback;
 
+begin work;
+
+create temp table c (
+  id int primary key
+);
+create temp table b (
+  id int primary key,
+  c_id int not null,
+  val int not null,
+  constraint b_c_id_fkey foreign key (c_id) references c deferrable
+);
+create temp table a (
+  id int primary key,
+  b_id int not null,
+  constraint a_b_id_fkey foreign key (b_id) references b deferrable
+);
+
+insert into c (id) values(1);
+insert into b (id,c_id,val) values(2,1,10);
+insert into a (id,b_id) values(3,2);
+
+-- this should remove inner join to b
+explain (costs off)
+select a.* from a inner join b on a.b_id = b.id;
+
+-- this should remove inner join to b and c
+explain (costs off)
+select a.* from a inner join b on a.b_id = b.id inner join c on b.c_id = c.id;
+
+-- Ensure all of the target entries have their proper aliases.
+select a.* from a inner join b on a.b_id = b.id inner join c on b.c_id = c.id;
+
+-- change order of tables in query, this should generate the same plan as above.
+explain (costs off)
+select a.* from c inner join b on c.id = b.c_id inner join a on a.b_id = b.id;
+
+-- inner join can't be removed due to b columns in the target list
+explain (costs off)
+select * from a inner join b on a.b_id = b.id;
+
+-- this should not remove inner join to b due to quals restricting results from b
+explain (costs off)
+select a.* from a inner join b on a.b_id = b.id where b.val = 10;
+
+-- check merge join nodes are removed properly
+set enable_hashjoin = off;
+
+-- this should remove joins to b and c.
+explain (costs off)
+select COUNT(*) from a inner join b on a.b_id = b.id left join c on a.id = c.id;
+
+-- this should remove joins to b and c, however it b will only be removed on
+-- 2nd attempt after c is removed by the left join removal code.
+explain (costs off)
+select COUNT(*) from a inner join b on a.b_id = b.id left join c on b.id = c.id;
+
+set enable_hashjoin = on;
+
+-- this should not remove join to b
+explain (costs off)
+select a.* from a inner join b on a.b_id = b.id where b.val = b.id;
+
+-- this should not remove the join, no foreign key exists between a.id and b.id
+explain (costs off)
+select a.* from a inner join b on a.id = b.id;
+
+-- ensure a left joined rel can't remove an inner joined rel
+explain (costs off)
+select a.* from b LEFT JOIN a on b.id = a.b_id;
+
+-- Ensure we remove b, but don't try and remove c. c has no join condition.
+explain (costs off)
+select a.* from a inner join b on a.b_id = b.id cross join c;
+
+set constraints b_c_id_fkey deferred;
+
+-- join should be removed.
+explain (costs off)
+select b.* from b inner join c on b.c_id = c.id;
+
+-- perform an update which will cause some pending fk triggers to be added
+update c set id = 2 where id=1;
+
+-- ensure inner join is no longer removed.
+explain (costs off)
+select b.* from b inner join c on b.c_id = c.id;
+
+rollback;
+
 create temp table parent (k int primary key, pd int);
 create temp table child (k int unique, cd int);
 insert into parent values (1, 10), (2, 20), (3, 30);
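
For readers skimming the patch, here is a minimal, self-contained sketch of the kind of test the ForeignKeyInfo data and the regression tests above suggest: a join to the referenced relation is only a removal candidate when the join quals line up column for column with a validated foreign key, and when no foreign key triggers are still pending. This is not the planner code from the patch. FkSketch, JoinColPair, fk_allows_join_removal, the conkey_notnull array and the trigger_queue_empty parameter are all hypothetical stand-ins. The NOT NULL requirement is an assumption drawn from the test tables declaring their referencing columns not null and from the new get_attnotnull() helper.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef unsigned int Oid;		/* stand-in for PostgreSQL's Oid */

/* Hypothetical, trimmed-down version of the patch's ForeignKeyInfo. */
typedef struct FkSketch
{
	Oid			confrelid;		/* relation referenced by the foreign key */
	bool		convalidated;	/* constraint has been validated? */
	int			conncols;		/* number of columns in the foreign key */
	const int16_t *conkey;		/* referencing-side column numbers */
	const int16_t *confkey;		/* referenced-side column numbers */
	const bool *conkey_notnull; /* assumption: is each conkey column NOT NULL? */
} FkSketch;

/* Hypothetical: one equality join qual, referencing col = referenced col. */
typedef struct JoinColPair
{
	int16_t		fkcol;
	int16_t		pkcol;
} JoinColPair;

/*
 * Return true if the join to relation "joinrelid" could be skipped on the
 * strength of foreign key "fk".  The caller passes the result of a check
 * like AfterTriggerQueueIsEmpty() as trigger_queue_empty.
 */
static bool
fk_allows_join_removal(const FkSketch *fk, Oid joinrelid,
					   const JoinColPair *quals, int nquals,
					   bool trigger_queue_empty)
{
	assert(fk != NULL && quals != NULL);

	/* Pending FK triggers mean the constraint may not hold right now. */
	if (!trigger_queue_empty)
		return false;

	/* The key must be validated and must reference the joined relation. */
	if (!fk->convalidated || fk->confrelid != joinrelid)
		return false;

	/* Extra or missing quals would change which rows survive the join. */
	if (nquals != fk->conncols)
		return false;

	for (int i = 0; i < fk->conncols; i++)
	{
		bool		found = false;

		/* A NULL referencing value matches nothing, so require NOT NULL. */
		if (!fk->conkey_notnull[i])
			return false;

		for (int j = 0; j < nquals; j++)
		{
			if (quals[j].fkcol == fk->conkey[i] &&
				quals[j].pkcol == fk->confkey[i])
			{
				found = true;
				break;
			}
		}
		if (!found)
			return false;
	}
	return true;
}
```

In terms of the tests above, a.b_id = b.id would pass this check against a_b_id_fkey, a.id = b.id would not, and the query planned after "update c set id = 2" under deferred constraints would fail on the pending-trigger test.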

#41

Michael Paquier

michael.paquier@gmail.com

almost 11 years ago

In reply to: David Rowley (#40)

Re: Patch to support SEMI and ANTI join removal

On Sun, Nov 23, 2014 at 8:23 PM, David Rowley <dgrowleyml@gmail.com> wrote:

As the patch stands there's still a couple of FIXMEs in there, so there's
still a bit of work to do yet.
Comments are welcome

Hm, if there is still work to do, we may as well mark this patch as
rejected as-is, also because it stands in this state for a couple of months.
--
Michael

#42

Marko Tiikkaja

marko@joh.to

almost 11 years ago

In reply to: Michael Paquier (#41)

Re: Patch to support SEMI and ANTI join removal

On 2/13/15 8:52 AM, Michael Paquier wrote:

On Sun, Nov 23, 2014 at 8:23 PM, David Rowley <dgrowleyml@gmail.com> wrote:

As the patch stands there's still a couple of FIXMEs in there, so there's
still a bit of work to do yet.
Comments are welcome

Hm, if there is still work to do, we may as well mark this patch as
rejected as-is, also because it stands in this state for a couple of months.

I didn't bring this up before, but I'm pretty sure this patch should be
marked "returned with feedback". From what I've understood, "rejected"
means "we don't want this thing, not in this form or any other". That
doesn't seem to be the case for this patch, nor for a few others marked
"rejected" in the currently in-progress commit fest.

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#43

Michael Paquier

michael.paquier@gmail.com

almost 11 years ago

In reply to: Marko Tiikkaja (#42)

Re: Patch to support SEMI and ANTI join removal

On Fri, Feb 13, 2015 at 4:57 PM, Marko Tiikkaja <marko@joh.to> wrote:

On 2/13/15 8:52 AM, Michael Paquier wrote:

On Sun, Nov 23, 2014 at 8:23 PM, David Rowley <dgrowleyml@gmail.com>
wrote:

As the patch stands there's still a couple of FIXMEs in there, so there's
still a bit of work to do yet.
Comments are welcome

Hm, if there is still work to do, we may as well mark this patch as
rejected as-is, also because it stands in this state for a couple of
months.

I didn't bring this up before, but I'm pretty sure this patch should be
marked "returned with feedback". From what I've understood, "rejected"
means "we don't want this thing, not in this form or any other". That
doesn't seem to be the case for this patch, nor for a few others marked
"rejected" in the currently in-progress commit fest.

In the new CF app, marking a patch as "returned with feedback" adds it
automatically to the next commit fest. And note that it is actually what I
did for now to move on to the next CF in the doubt:
https://commitfest.postgresql.org/3/27/
But if nothing is done, we should as well mark it as "rejected". Not based
on the fact that it is rejected based on its content, but to not bloat the
CF app with entries that have no activity for months.
--
Michael

#44

Andres Freund

andres@2ndquadrant.com

almost 11 years ago

In reply to: Michael Paquier (#43)

Re: Patch to support SEMI and ANTI join removal

On 2015-02-13 17:06:14 +0900, Michael Paquier wrote:

On Fri, Feb 13, 2015 at 4:57 PM, Marko Tiikkaja <marko@joh.to> wrote:

On 2/13/15 8:52 AM, Michael Paquier wrote:

On Sun, Nov 23, 2014 at 8:23 PM, David Rowley <dgrowleyml@gmail.com>
wrote:

As the patch stands there's still a couple of FIXMEs in there, so there's
still a bit of work to do yet.
Comments are welcome

Hm, if there is still work to do, we may as well mark this patch as
rejected as-is, also because it stands in this state for a couple of
months.

I didn't bring this up before, but I'm pretty sure this patch should be
marked "returned with feedback". From what I've understood, "rejected"
means "we don't want this thing, not in this form or any other". That
doesn't seem to be the case for this patch, nor for a few others marked
"rejected" in the currently in-progress commit fest.

In the new CF app, marking a patch as "returned with feedback" adds it
automatically to the next commit fest. And note that it is actually what I
did for now to move on to the next CF in the doubt:
https://commitfest.postgresql.org/3/27/
But if nothing is done, we should as well mark it as "rejected". Not based
on the fact that it is rejected based on its content, but to not bloat the
CF app with entries that have no activity for months.

Then the CF app needs to be fixed. Marking patches as rejected on these
grounds is a bad idea.

Greetings,

Andres Freund

--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


#45

Michael Paquier

michael.paquier@gmail.com

almost 11 years ago

In reply to: Andres Freund (#44)

Re: Patch to support SEMI and ANTI join removal

On Fri, Feb 13, 2015 at 5:12 PM, Andres Freund <andres@2ndquadrant.com>
wrote:

On 2015-02-13 17:06:14 +0900, Michael Paquier wrote:

On Fri, Feb 13, 2015 at 4:57 PM, Marko Tiikkaja <marko@joh.to> wrote:

On 2/13/15 8:52 AM, Michael Paquier wrote:

On Sun, Nov 23, 2014 at 8:23 PM, David Rowley <dgrowleyml@gmail.com>
wrote:

As the patch stands there's still a couple of FIXMEs in there, so there's
still a bit of work to do yet.
Comments are welcome

Hm, if there is still work to do, we may as well mark this patch as
rejected as-is, also because it stands in this state for a couple of
months.

I didn't bring this up before, but I'm pretty sure this patch should be
marked "returned with feedback". From what I've understood, "rejected"
means "we don't want this thing, not in this form or any other". That
doesn't seem to be the case for this patch, nor for a few others marked
"rejected" in the currently in-progress commit fest.

In the new CF app, marking a patch as "returned with feedback" adds it
automatically to the next commit fest. And note that it is actually what I
did for now to move on to the next CF in the doubt:
https://commitfest.postgresql.org/3/27/
But if nothing is done, we should as well mark it as "rejected". Not based
on the fact that it is rejected based on its content, but to not bloat the
CF app with entries that have no activity for months.

Then the CF app needs to be fixed. Marking patches as rejected on these
grounds is a bad idea.

Yup, definitely the term is incorrect. We need "Returned with feedback but
please do not add it to the next CF dear CF app".
--
Michael

#46

David Rowley

dgrowley@gmail.com

almost 11 years ago

In reply to: Michael Paquier (#45)

Re: Patch to support SEMI and ANTI join removal

There does not seem to be a delete button, so marking as "rejected" due to this now being a duplicate entry for this patch.


#47

David Rowley

dgrowley@gmail.com

almost 11 years ago

In reply to: Michael Paquier (#41)

Re: Patch to support SEMI and ANTI join removal

On 13 February 2015 at 20:52, Michael Paquier <michael.paquier@gmail.com>
wrote:

On Sun, Nov 23, 2014 at 8:23 PM, David Rowley <dgrowleyml@gmail.com>
wrote:

As the patch stands there's still a couple of FIXMEs in there, so there's
still a bit of work to do yet.
Comments are welcome

Hm, if there is still work to do, we may as well mark this patch as
rejected as-is, also because it stands in this state for a couple of months.

My apologies, I'd not realised that the thread link on the commitfest app
was pointing to the wrong thread.

I see now that, since the patch has been bumped off the December fest, it's
now a duplicate in the February commitfest, as I'd hastily added it to Feb
before I rushed off on my summer holidays 2 weeks ago.

I've now changed the duplicate to "returned with feedback" as there was no
option that I could find to delete it. (If anyone has extra rights to do
this, could that be done instead?)

The state of the patch is currently ready for review. The FIXME stuff that
I had talked about above is old news.

Please can we use the
/messages/by-id/CAApHDvocUEYdt1uT+DLDPs2xEu=v3qJGT6HeXKonQM4rY_OsSA@mail.gmail.com
thread for further communications about this patch. I'm trying to kill this
one off due to the out-of-date subject.

Regards

David Rowley