Re: Custom/Foreign-Join-APIs (Re: [v9.5] Custom Plan API)

Started by Kouhei Kaigai almost 11 years ago, 39 messages

kaigai@ak.jp.nec.com

almost 11 years ago

1 attachment(s)

On Wed, Mar 18, 2015 at 9:33 AM, Kouhei Kaigai <kaigai@ak.jp.nec.com> wrote:

On Wed, Mar 18, 2015 at 2:34 AM, Kouhei Kaigai <kaigai@ak.jp.nec.com> wrote:

So, the overall consensus for the FDW hook location is just before the
set_cheapest() call in standard_join_search() and merge_clump(), isn't it?

Yes, I think so.

Let me design an FDW hook that reduces code duplication across FDW drivers,
especially for identifying the baserels/joinrels involved in the join.

Great, thanks!

One issue, which I think Ashutosh alluded to upthread, is that we need
to make sure it's not unreasonably difficult for foreign data wrappers
to construct the FROM clause of an SQL query to be pushed down to the
remote side. It should be simple when there are only inner joins
involved, but when there are all outer joins it might be a bit
complex. It would be very good if someone could try to write that
code, based on the new hook locations, and see how it turns out, so
that we can figure out how to address any issues that may crop up
there.

Here is an idea: provide a common utility function that breaks down
the supplied RelOptInfo of a joinrel into a join type and a list of the
baserels/joinrels involved in the join. It is intended to be
called by an FDW driver to list the underlying relations.
IIUC, root->join_info_list provides information about how relations are
combined into the upper joined relations, so I expect this will not be an
unreasonably complicated way to solve the problem.
Once the RelOptInfo of the target joinrel is broken down into multiple
sub-relations (N>=2 if all inner joins, otherwise N=2), the FDW driver can
reference the RestrictInfo to be used in the join.

Anyway, I'll try to investigate the existing code for more detail today,
to clarify whether the above approach is feasible.

Sounds good. Keep in mind that, while the parse tree will obviously
reflect the way that the user actually specified the join
syntactically, it's not the job of the join_info_list to make it
simple to reconstruct that information. To the contrary,
join_info_list is supposed to be structured in a way that makes it
easy to determine whether *a particular join order is one of the legal
join orders* not *whether it is the specific join order selected by
the user*. See join_is_legal().

For FDW pushdown, I think it's sufficient to be able to identify *any
one* legal join order, not necessarily the same order the user
originally entered. For example, if the user entered A LEFT JOIN B ON
A.x = B.x LEFT JOIN C ON A.y = C.y and the FDW generates a query that
instead does A LEFT JOIN C ON A.y = C.y LEFT JOIN B ON A.x = B.x, I
suspect that's just fine. Particular FDWs might wish to try to be
smart about what they emit based on knowledge of what the remote
side's optimizer is likely to do, and that's fine. If the remote side
is PostgreSQL, it shouldn't matter much.

Sorry for my late response. It was not easy to code during a business trip.

The attached patch adds a hook for FDW/CSP to replace an entire join subtree
with a foreign/custom scan, according to the discussion upthread.

GetForeignJoinPaths handler of FDW is simplified as follows:
typedef void (*GetForeignJoinPaths_function) (PlannerInfo *root,
RelOptInfo *joinrel);

It takes the PlannerInfo and the RelOptInfo of the join relation to be
replaced, if available. The RelOptInfo contains a 'relids' bitmap, so the FDW
driver can tell which relations are involved and construct a remote join query.
However, the RelOptInfo alone does not make it obvious how the relations are
joined.
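
As a rough illustration of the 'relids' point above, here is a minimal
standalone sketch of how a driver might walk such a bitmap to list the
relations involved. It does not use the real PostgreSQL headers: ToyRelids,
toy_next_member() and toy_list_relations() are hypothetical stand-ins, with a
plain 64-bit word in place of a Bitmapset and a loop shaped like the
bms_next_member() loop used in the patch.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Toy stand-in for a Bitmapset: bit N set means relid N is a member. */
typedef uint64_t ToyRelids;

/*
 * Hypothetical helper shaped like bms_next_member(): return the smallest
 * member greater than prevbit, or a negative value when none is left.
 */
static int
toy_next_member(ToyRelids set, int prevbit)
{
    for (int bit = prevbit + 1; bit < 64; bit++)
    {
        if (set & (UINT64_C(1) << bit))
            return bit;
    }
    return -2;
}

/*
 * Write a comma-separated list of the member relations' names into buf.
 * 'names' is indexed by relid; output is truncated if buf is too small.
 */
static void
toy_list_relations(ToyRelids relids, const char *const *names,
                   char *buf, size_t buflen)
{
    int     relid = -1;
    size_t  used = 0;

    assert(buflen > 0);
    buf[0] = '\0';
    while ((relid = toy_next_member(relids, relid)) >= 0)
    {
        int n = snprintf(buf + used, buflen - used, "%s%s",
                         used > 0 ? ", " : "", names[relid]);

        if (n < 0 || (size_t) n >= buflen - used)
            break;              /* stop once the buffer is full */
        used += (size_t) n;
    }
}
```

This only tells the driver which relations take part in the join; it says
nothing about join types or join order.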

The function below helps implement an FDW driver that supports remote joins.

List *
get_joinrel_broken_down(PlannerInfo *root, RelOptInfo *joinrel,
SpecialJoinInfo **p_sjinfo)

It returns a list of the RelOptInfos involved in the join that 'joinrel'
represents, and also sets a SpecialJoinInfo through the third argument
if the join is not an inner join.
For an inner join, it returns two or more relations to be inner-joined.
Otherwise, it returns two relations and a valid SpecialJoinInfo.

The #if 0 ... #endif block is there only for confirmation purposes, to show
how the hook is invoked and how the joinrel is broken down by the above
service routine.

postgres=# select * from t0 left join t1 on t1.aid=bid
left join t2 on t1.aid=cid
left join t3 on t1.aid=did
left join t4 on t1.aid=eid;
INFO: LEFT JOIN: t0, t1
INFO: LEFT JOIN: (t0, t1), t2
INFO: LEFT JOIN: (t0, t1), t3
INFO: LEFT JOIN: (t0, t1), t4
INFO: LEFT JOIN: (t0, t1, t3), t2
INFO: LEFT JOIN: (t0, t1, t4), t2
INFO: LEFT JOIN: (t0, t1, t4), t3
INFO: LEFT JOIN: (t0, t1, t3, t4), t2

In this case, the joinrel is broken down into (t0, t1, t3, t4) and t2.
The former is itself a joinrel, so the FDW driver is expected to call
get_joinrel_broken_down() recursively.
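
To make the recursion concrete, here is a small standalone sketch of how a
driver might turn such a broken-down tree into a remote FROM clause. It is a
toy model, not code from the patch: ToyRel, ToyBuf, toy_append() and
toy_deparse_from() are hypothetical, the tree is built by hand rather than by
get_joinrel_broken_down(), and the join condition is carried as plain text
instead of being deparsed from RestrictInfo. For the inner-join case with N
children it simply emits CROSS JOIN, on the assumption that the quals would go
into the WHERE clause.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define TOY_MAX_CHILDREN 8

/* Hypothetical stand-in for a RelOptInfo together with its breakdown. */
typedef struct ToyRel
{
    const char *name;       /* table name for a base relation, else NULL */
    const char *jointype;   /* e.g. "LEFT JOIN"; NULL means inner join */
    const char *cond;       /* join condition text, outer joins only */
    int         nchildren;  /* 2 for outer joins, >= 2 for inner joins */
    const struct ToyRel *children[TOY_MAX_CHILDREN];
} ToyRel;

typedef struct ToyBuf
{
    char    data[256];
    size_t  len;
} ToyBuf;

/* Append s to buf; input that does not fit is dropped in this toy. */
static void
toy_append(ToyBuf *buf, const char *s)
{
    size_t n = strlen(s);

    if (buf->len + n >= sizeof(buf->data))
        return;
    memcpy(buf->data + buf->len, s, n + 1);
    buf->len += n;
}

/* Recursively deparse a relation into a parenthesized join expression. */
static void
toy_deparse_from(const ToyRel *rel, ToyBuf *buf)
{
    if (rel->name != NULL)
    {
        toy_append(buf, rel->name);     /* base relation: just its name */
        return;
    }

    /* Same contract as described above: N >= 2, and N == 2 if not inner. */
    assert(rel->nchildren >= 2);
    assert(rel->jointype == NULL || rel->nchildren == 2);

    toy_append(buf, "(");
    for (int i = 0; i < rel->nchildren; i++)
    {
        if (i > 0)
        {
            toy_append(buf, " ");
            toy_append(buf, rel->jointype ? rel->jointype : "CROSS JOIN");
            toy_append(buf, " ");
        }
        /* A child may itself be a joinrel, hence the recursion. */
        toy_deparse_from(rel->children[i], buf);
    }
    if (rel->jointype != NULL)
    {
        toy_append(buf, " ON ");
        toy_append(buf, rel->cond);
    }
    toy_append(buf, ")");
}
```

A real driver would get the children and the SpecialJoinInfo from the service
routine at each level rather than from a prebuilt tree.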

postgres=# explain select * from t0 natural join t1
natural join t2
natural join t3
natural join t4;
INFO: INNER JOIN: t0, t1
INFO: INNER JOIN: t0, t2
INFO: INNER JOIN: t0, t3
INFO: INNER JOIN: t0, t4
INFO: INNER JOIN: t0, t1, t2
INFO: INNER JOIN: t0, t1, t3
INFO: INNER JOIN: t0, t1, t4
INFO: INNER JOIN: t0, t2, t3
INFO: INNER JOIN: t0, t2, t4
INFO: INNER JOIN: t0, t3, t4
INFO: INNER JOIN: t0, t1, t2, t3
INFO: INNER JOIN: t0, t1, t2, t4
INFO: INNER JOIN: t0, t1, t3, t4
INFO: INNER JOIN: t0, t2, t3, t4
INFO: INNER JOIN: t0, t1, t2, t3, t4

In this case, the joinrel consists only of inner joins, so
get_joinrel_broken_down() returns a list that contains the RelOptInfos of
the 5 base relations.

postgres=# explain select * from t0 natural join t1
left join t2 on t1.aid=t2.bid
natural join t3
natural join t4;
INFO: INNER JOIN: t0, t1
INFO: INNER JOIN: t0, t3
INFO: INNER JOIN: t0, t4
INFO: LEFT JOIN: t1, t2
INFO: INNER JOIN: (t1, t2), t0
INFO: INNER JOIN: t0, t1, t3
INFO: INNER JOIN: t0, t1, t4
INFO: INNER JOIN: t0, t3, t4
INFO: INNER JOIN: (t1, t2), t0, t3
INFO: INNER JOIN: (t1, t2), t0, t4
INFO: INNER JOIN: t0, t1, t3, t4
INFO: INNER JOIN: (t1, t2), t0, t3, t4

In the mixed case, it preserves the join legality restriction (t1 and t2 must
be left joined) while breaking the joinrel down.

At this moment, I'm not 100% certain about its logic. In particular, I have
not tested the SEMI- and ANTI-join cases yet.
However, time is money - I would like people to check the overall design first,
rather than do detailed debugging. Please tell me if I have misunderstood the
logic for breaking down join relations.

Thanks,
--
NEC OSS Promotion Center / PG-Strom Project
KaiGai Kohei <kaigai@ak.jp.nec.com>

Attachments:

pgsql-v9.5-custom-join.v10.patch (application/octet-stream)

 doc/src/sgml/custom-scan.sgml           |  43 +++++
 doc/src/sgml/fdwhandler.sgml            |  54 ++++++
 src/backend/commands/explain.c          |  15 +-
 src/backend/executor/execScan.c         |   4 +
 src/backend/executor/nodeCustom.c       |  38 ++++-
 src/backend/executor/nodeForeignscan.c  |  34 ++--
 src/backend/foreign/foreign.c           |  32 +++-
 src/backend/nodes/bitmapset.c           |  57 +++++++
 src/backend/nodes/copyfuncs.c           |   5 +
 src/backend/nodes/outfuncs.c            |   5 +
 src/backend/optimizer/geqo/geqo_eval.c  |   3 +
 src/backend/optimizer/path/allpaths.c   | 292 ++++++++++++++++++++++++++++++++
 src/backend/optimizer/path/joinpath.c   |  13 ++
 src/backend/optimizer/plan/createplan.c |  80 +++++++--
 src/backend/optimizer/plan/setrefs.c    |  64 +++++++
 src/backend/optimizer/util/plancat.c    |   7 +-
 src/backend/optimizer/util/relnode.c    |  14 ++
 src/backend/utils/adt/ruleutils.c       |   4 +
 src/include/foreign/fdwapi.h            |   8 +
 src/include/nodes/bitmapset.h           |   1 +
 src/include/nodes/plannodes.h           |  24 ++-
 src/include/nodes/relation.h            |   2 +
 src/include/optimizer/paths.h           |  21 +++
 src/include/optimizer/planmain.h        |   1 +
 24 files changed, 772 insertions(+), 49 deletions(-)

diff --git a/doc/src/sgml/custom-scan.sgml b/doc/src/sgml/custom-scan.sgml
index 8a4a3df..b1400ae 100644
--- a/doc/src/sgml/custom-scan.sgml
+++ b/doc/src/sgml/custom-scan.sgml
@@ -48,6 +48,27 @@ extern PGDLLIMPORT set_rel_pathlist_hook_type set_rel_pathlist_hook;
   </para>
 
   <para>
+   A custom scan provider will be also able to add paths by setting the
+   following hook, to replace built-in join paths by custom-scan that
+   performs as if a scan on preliminary joined relations, which is called
+   after the core code has generated what it believes to be the complete
+   and correct set of access paths for the join.
+<programlisting>
+typedef void (*set_join_pathlist_hook_type) (PlannerInfo *root,
+                                             RelOptInfo *joinrel,
+                                             RelOptInfo *outerrel,
+                                             RelOptInfo *innerrel,
+                                             List *restrictlist,
+                                             JoinType jointype,
+                                             SpecialJoinInfo *sjinfo,
+                                             SemiAntiJoinFactors *semifactors,
+                                             Relids param_source_rels,
+                                             Relids extra_lateral_rels);
+extern PGDLLIMPORT set_join_pathlist_hook_type set_join_pathlist_hook;
+</programlisting>
+  </para>
+
+  <para>
     Although this hook function can be used to examine, modify, or remove
     paths generated by the core system, a custom scan provider will typically
     confine itself to generating <structname>CustomPath</> objects and adding
@@ -124,7 +145,9 @@ typedef struct CustomScan
     Scan      scan;
     uint32    flags;
     List     *custom_exprs;
+    List     *custom_ps_tlist;
     List     *custom_private;
+    List     *custom_relids;
     const CustomScanMethods *methods;
 } CustomScan;
 </programlisting>
@@ -141,10 +164,30 @@ typedef struct CustomScan
     is only used by the custom scan provider itself.  Plan trees must be able
     to be duplicated using <function>copyObject</>, so all the data stored
     within these two fields must consist of nodes that function can handle.
+    <literal>custom_relids</> is set by the backend, thus custom-scan provider
+    does not need to touch, to track underlying relations represented by this
+    custom-scan node.
     <structfield>methods</> must point to a (usually statically allocated)
     object implementing the required custom scan methods, which are further
     detailed below.
   </para>
+  <para>
+   In case when <structname>CustomScan</> replaced built-in join paths,
+   custom-scan provider must have two characteristic setup.
+   The first one is zero on the <structfield>scan.scanrelid</>, which
+   should be usually an index of range-tables. It informs the backend
+   this <structname>CustomScan</> node is not associated with a particular
+   table. The second one is valid list of <structname>TargetEntry</> on
+   the <structfield>custom_ps_tlist</>. A <structname>CustomScan</> node
+   looks to the backend like a scan as literal, but on a relation which is
+   the result of relations join. It means we cannot construct a tuple
+   descriptor based on table definition, thus custom-scan provider must
+   introduce the expected record-type of the tuples.
+   Tuple-descriptor of scan-slot shall be constructed based on the
+   <structfield>custom_ps_tlist</>, and assigned on executor initialization.
+   Also, referenced by <command>EXPLAIN</> to solve name of the underlying
+   columns and relations.
+  </para>
 
   <sect2 id="custom-scan-plan-callbacks">
    <title>Custom Scan Callbacks</title>
diff --git a/doc/src/sgml/fdwhandler.sgml b/doc/src/sgml/fdwhandler.sgml
index c1daa4b..77477c8 100644
--- a/doc/src/sgml/fdwhandler.sgml
+++ b/doc/src/sgml/fdwhandler.sgml
@@ -598,6 +598,60 @@ IsForeignRelUpdatable (Relation rel);
 
    </sect2>
 
+   <sect2>
+    <title>FDW Routines for remote join</title>
+    <para>
+<programlisting>
+void
+GetForeignJoinPaths(PlannerInfo *root,
+                    RelOptInfo *joinrel,
+                    RelOptInfo *outerrel,
+                    RelOptInfo *innerrel,
+                    JoinType jointype,
+                    SpecialJoinInfo *sjinfo,
+                    SemiAntiJoinFactors *semifactors,
+                    List *restrictlist,
+                    Relids extra_lateral_rels);
+</programlisting>
+     Create possible access paths for a join of two foreign tables or
+     joined relations, but both of them needs to be managed with same
+     FDW driver.
+     This optional function is called during query planning.
+    </para>
+    <para>
+     This function allows FDW driver to add <literal>ForeignScan</> path
+     towards the supplied <literal>joinrel</>. From the standpoint of
+     query planner, it looks like scan-node is added for join-relation.
+     It means, <literal>ForeignScan</> path added instead of the built-in
+     local join logic has to generate tuples as if it scans on a joined
+     and materialized relations.
+    </para>
+    <para>
+     Usually, we expect FDW drivers issues a remote query that involves
+     tables join on remote side, then FDW driver fetches the joined result
+     on local side.
+     Unlike simple table scan, definition of slot descriptor of the joined
+     relations is determined on the fly, thus we cannot know its definition
+     from the system catalog.
+     So, FDW driver is responsible to introduce the query planner expected
+     form of the joined relations. In case when <literal>ForeignScan</>
+     replaced a relations join, <literal>scanrelid</> of the generated plan
+     node shall be zero, to mark this <literal>ForeignScan</> node is not
+     associated with a particular foreign tables.
+     Also, it need to construct pseudo scan tlist (<literal>fdw_ps_tlist</>)
+     to indicate expected tuple definition.
+    </para>
+    <para>
+     Once <literal>scanrelid</> equals zero, executor initializes the slot
+     for scan according to <literal>fdw_ps_tlist</>, but excludes junk
+     entries. This list is also used to solve the name of the original
+     relation and columns, so FDW can chains expression nodes which are
+     not run on local side actually, like a join clause to be executed on
+     the remote side, however, target-entries of them will have
+     <literal>resjunk=true</>.
+    </para>
+   </sect2>
+
    <sect2 id="fdw-callbacks-explain">
     <title>FDW Routines for <command>EXPLAIN</></title>
 
diff --git a/src/backend/commands/explain.c b/src/backend/commands/explain.c
index a951c55..8892dca 100644
--- a/src/backend/commands/explain.c
+++ b/src/backend/commands/explain.c
@@ -730,11 +730,17 @@ ExplainPreScanNode(PlanState *planstate, Bitmapset **rels_used)
 		case T_ValuesScan:
 		case T_CteScan:
 		case T_WorkTableScan:
-		case T_ForeignScan:
-		case T_CustomScan:
 			*rels_used = bms_add_member(*rels_used,
 										((Scan *) plan)->scanrelid);
 			break;
+		case T_ForeignScan:
+			*rels_used = bms_add_members(*rels_used,
+										 ((ForeignScan *) plan)->fdw_relids);
+			break;
+		case T_CustomScan:
+			*rels_used = bms_add_members(*rels_used,
+										 ((CustomScan *) plan)->custom_relids);
+			break;
 		case T_ModifyTable:
 			*rels_used = bms_add_member(*rels_used,
 									((ModifyTable *) plan)->nominalRelation);
@@ -1072,9 +1078,12 @@ ExplainNode(PlanState *planstate, List *ancestors,
 		case T_ValuesScan:
 		case T_CteScan:
 		case T_WorkTableScan:
+			ExplainScanTarget((Scan *) plan, es);
+			break;
 		case T_ForeignScan:
 		case T_CustomScan:
-			ExplainScanTarget((Scan *) plan, es);
+			if (((Scan *) plan)->scanrelid > 0)
+				ExplainScanTarget((Scan *) plan, es);
 			break;
 		case T_IndexScan:
 			{
diff --git a/src/backend/executor/execScan.c b/src/backend/executor/execScan.c
index 3f0d809..2f18a8a 100644
--- a/src/backend/executor/execScan.c
+++ b/src/backend/executor/execScan.c
@@ -251,6 +251,10 @@ ExecAssignScanProjectionInfo(ScanState *node)
 	/* Vars in an index-only scan's tlist should be INDEX_VAR */
 	if (IsA(scan, IndexOnlyScan))
 		varno = INDEX_VAR;
+	/* Also foreign-/custom-scan on pseudo relation should be INDEX_VAR */
+	else if (scan->scanrelid == 0 &&
+			 (IsA(scan, ForeignScan) || IsA(scan, CustomScan)))
+		varno = INDEX_VAR;
 	else
 		varno = scan->scanrelid;
 
diff --git a/src/backend/executor/nodeCustom.c b/src/backend/executor/nodeCustom.c
index b07932b..2344129 100644
--- a/src/backend/executor/nodeCustom.c
+++ b/src/backend/executor/nodeCustom.c
@@ -23,6 +23,7 @@ CustomScanState *
 ExecInitCustomScan(CustomScan *cscan, EState *estate, int eflags)
 {
 	CustomScanState    *css;
+	Index				scan_relid = cscan->scan.scanrelid;
 	Relation			scan_rel;
 
 	/* populate a CustomScanState according to the CustomScan */
@@ -48,12 +49,31 @@ ExecInitCustomScan(CustomScan *cscan, EState *estate, int eflags)
 	ExecInitScanTupleSlot(estate, &css->ss);
 	ExecInitResultTupleSlot(estate, &css->ss.ps);
 
-	/* initialize scan relation */
-	scan_rel = ExecOpenScanRelation(estate, cscan->scan.scanrelid, eflags);
-	css->ss.ss_currentRelation = scan_rel;
-	css->ss.ss_currentScanDesc = NULL;	/* set by provider */
-	ExecAssignScanType(&css->ss, RelationGetDescr(scan_rel));
-
+	/*
+	 * open the base relation and acquire appropriate lock on it, then
+	 * get the scan type from the relation descriptor, if this custom
+	 * scan is on actual relations.
+	 *
+	 * on the other hands, custom-scan may scan on a pseudo relation;
+	 * that is usually a result-set of relations join by external
+	 * computing resource, or others. It has to get the scan type from
+	 * the pseudo-scan target-list that should be assigned by custom-scan
+	 * provider.
+	 */
+	if (scan_relid > 0)
+	{
+		scan_rel = ExecOpenScanRelation(estate, scan_relid, eflags);
+		css->ss.ss_currentRelation = scan_rel;
+		css->ss.ss_currentScanDesc = NULL;	/* set by provider */
+		ExecAssignScanType(&css->ss, RelationGetDescr(scan_rel));
+	}
+	else
+	{
+		TupleDesc	ps_tupdesc;
+
+		ps_tupdesc = ExecCleanTypeFromTL(cscan->custom_ps_tlist, false);
+		ExecAssignScanType(&css->ss, ps_tupdesc);
+	}
 	css->ss.ps.ps_TupFromTlist = false;
 
 	/*
@@ -89,11 +109,11 @@ ExecEndCustomScan(CustomScanState *node)
 
 	/* Clean out the tuple table */
 	ExecClearTuple(node->ss.ps.ps_ResultTupleSlot);
-	if (node->ss.ss_ScanTupleSlot)
-		ExecClearTuple(node->ss.ss_ScanTupleSlot);
+	ExecClearTuple(node->ss.ss_ScanTupleSlot);
 
 	/* Close the heap relation */
-	ExecCloseScanRelation(node->ss.ss_currentRelation);
+	if (node->ss.ss_currentRelation)
+		ExecCloseScanRelation(node->ss.ss_currentRelation);
 }
 
 void
diff --git a/src/backend/executor/nodeForeignscan.c b/src/backend/executor/nodeForeignscan.c
index 7399053..542d176 100644
--- a/src/backend/executor/nodeForeignscan.c
+++ b/src/backend/executor/nodeForeignscan.c
@@ -102,6 +102,7 @@ ForeignScanState *
 ExecInitForeignScan(ForeignScan *node, EState *estate, int eflags)
 {
 	ForeignScanState *scanstate;
+	Index		scanrelid = node->scan.scanrelid;
 	Relation	currentRelation;
 	FdwRoutine *fdwroutine;
 
@@ -141,16 +142,28 @@ ExecInitForeignScan(ForeignScan *node, EState *estate, int eflags)
 	ExecInitScanTupleSlot(estate, &scanstate->ss);
 
 	/*
-	 * open the base relation and acquire appropriate lock on it.
+	 * open the base relation and acquire appropriate lock on it, then
+	 * get the scan type from the relation descriptor, if this foreign
+	 * scan is on actual foreign-table.
+	 *
+	 * on the other hands, foreign-scan may scan on a pseudo relation;
+	 * that is usually a result-set of remote relations join. It has
+	 * to get the scan type from the pseudo-scan target-list that should
+	 * be assigned by FDW driver.
 	 */
-	currentRelation = ExecOpenScanRelation(estate, node->scan.scanrelid, eflags);
-	scanstate->ss.ss_currentRelation = currentRelation;
+	if (scanrelid > 0)
+	{
+		currentRelation = ExecOpenScanRelation(estate, scanrelid, eflags);
+		scanstate->ss.ss_currentRelation = currentRelation;
+		ExecAssignScanType(&scanstate->ss, RelationGetDescr(currentRelation));
+	}
+	else
+	{
+		TupleDesc	ps_tupdesc;
 
-	/*
-	 * get the scan type from the relation descriptor.  (XXX at some point we
-	 * might want to let the FDW editorialize on the scan tupdesc.)
-	 */
-	ExecAssignScanType(&scanstate->ss, RelationGetDescr(currentRelation));
+		ps_tupdesc = ExecCleanTypeFromTL(node->fdw_ps_tlist, false);
+		ExecAssignScanType(&scanstate->ss, ps_tupdesc);
+	}
 
 	/*
 	 * Initialize result tuple type and projection info.
@@ -161,7 +174,7 @@ ExecInitForeignScan(ForeignScan *node, EState *estate, int eflags)
 	/*
 	 * Acquire function pointers from the FDW's handler, and init fdw_state.
 	 */
-	fdwroutine = GetFdwRoutineForRelation(currentRelation, true);
+	fdwroutine = GetFdwRoutine(node->fdw_handler);
 	scanstate->fdwroutine = fdwroutine;
 	scanstate->fdw_state = NULL;
 
@@ -193,7 +206,8 @@ ExecEndForeignScan(ForeignScanState *node)
 	ExecClearTuple(node->ss.ss_ScanTupleSlot);
 
 	/* close the relation. */
-	ExecCloseScanRelation(node->ss.ss_currentRelation);
+	if (node->ss.ss_currentRelation)
+		ExecCloseScanRelation(node->ss.ss_currentRelation);
 }
 
 /* ----------------------------------------------------------------
diff --git a/src/backend/foreign/foreign.c b/src/backend/foreign/foreign.c
index cbe8b78..df69a95 100644
--- a/src/backend/foreign/foreign.c
+++ b/src/backend/foreign/foreign.c
@@ -302,13 +302,12 @@ GetFdwRoutine(Oid fdwhandler)
 	return routine;
 }
 
-
 /*
- * GetFdwRoutineByRelId - look up the handler of the foreign-data wrapper
- * for the given foreign table, and retrieve its FdwRoutine struct.
+ * GetFdwHandlerByRelId - look up the handler of the foreign-data wrapper
+ * for the given foreign table
  */
-FdwRoutine *
-GetFdwRoutineByRelId(Oid relid)
+static Oid
+GetFdwHandlerByRelId(Oid relid)
 {
 	HeapTuple	tp;
 	Form_pg_foreign_data_wrapper fdwform;
@@ -350,7 +349,18 @@ GetFdwRoutineByRelId(Oid relid)
 
 	ReleaseSysCache(tp);
 
-	/* And finally, call the handler function. */
+	return fdwhandler;
+}
+
+/*
+ * GetFdwRoutineByRelId - look up the handler of the foreign-data wrapper
+ * for the given foreign table, and retrieve its FdwRoutine struct.
+ */
+FdwRoutine *
+GetFdwRoutineByRelId(Oid relid)
+{
+	Oid			fdwhandler = GetFdwHandlerByRelId(relid);
+
 	return GetFdwRoutine(fdwhandler);
 }
 
@@ -398,6 +408,16 @@ GetFdwRoutineForRelation(Relation relation, bool makecopy)
 	return relation->rd_fdwroutine;
 }
 
+/*
+ * GetFdwHandlerForRelation
+ *
+ * returns OID of FDW handler which is associated with the given relation.
+ */
+Oid
+GetFdwHandlerForRelation(Relation relation)
+{
+	return GetFdwHandlerByRelId(RelationGetRelid(relation));
+}
 
 /*
  * IsImportableForeignTable - filter table names for IMPORT FOREIGN SCHEMA
diff --git a/src/backend/nodes/bitmapset.c b/src/backend/nodes/bitmapset.c
index a9c3b4b..4dc3286 100644
--- a/src/backend/nodes/bitmapset.c
+++ b/src/backend/nodes/bitmapset.c
@@ -301,6 +301,63 @@ bms_difference(const Bitmapset *a, const Bitmapset *b)
 }
 
 /*
+ * bms_shift_members - move all the bits by shift
+ */
+Bitmapset *
+bms_shift_members(const Bitmapset *a, int shift)
+{
+	Bitmapset  *b;
+	bitmapword	h_word;
+	bitmapword	l_word;
+	int			nwords;
+	int			w_shift;
+	int			b_shift;
+	int			i, j;
+
+	/* fast path if result shall be NULL obviously */
+	if (a == NULL || a->nwords * BITS_PER_BITMAPWORD + shift <= 0)
+		return NULL;
+	/* actually, not shift members */
+	if (shift == 0)
+		return bms_copy(a);
+
+	nwords = (a->nwords * BITS_PER_BITMAPWORD + shift +
+			  BITS_PER_BITMAPWORD - 1) / BITS_PER_BITMAPWORD;
+	b = palloc(BITMAPSET_SIZE(nwords));
+	b->nwords = nwords;
+
+	if (shift > 0)
+	{
+		/* Left shift */
+		w_shift = WORDNUM(shift);
+		b_shift = BITNUM(shift);
+
+		for (i=0, j=-w_shift; i < b->nwords; i++, j++)
+		{
+			h_word = (j >= 0   && j   < a->nwords ? a->words[j] : 0);
+			l_word = (j-1 >= 0 && j-1 < a->nwords ? a->words[j-1] : 0);
+			b->words[i] = ((h_word << b_shift) |
+						   (l_word >> (BITS_PER_BITMAPWORD - b_shift)));
+		}
+	}
+	else
+	{
+		/* Right shift */
+		w_shift = WORDNUM(-shift);
+		b_shift = BITNUM(-shift);
+
+		for (i=0, j=-w_shift; i < b->nwords; i++, j++)
+		{
+			h_word = (j+1 >= 0 && j+1 < a->nwords ? a->words[j+1] : 0);
+			l_word = (j >= 0 && j < a->nwords ? a->words[j] : 0);
+			b->words[i] = ((h_word >> (BITS_PER_BITMAPWORD - b_shift)) |
+						   (l_word << b_shift));
+		}
+	}
+	return b;
+}
+
+/*
  * bms_is_subset - is A a subset of B?
  */
 bool
diff --git a/src/backend/nodes/copyfuncs.c b/src/backend/nodes/copyfuncs.c
index 029761e..61379a7 100644
--- a/src/backend/nodes/copyfuncs.c
+++ b/src/backend/nodes/copyfuncs.c
@@ -592,8 +592,11 @@ _copyForeignScan(const ForeignScan *from)
 	/*
 	 * copy remainder of node
 	 */
+	COPY_SCALAR_FIELD(fdw_handler);
 	COPY_NODE_FIELD(fdw_exprs);
+	COPY_NODE_FIELD(fdw_ps_tlist);
 	COPY_NODE_FIELD(fdw_private);
+	COPY_BITMAPSET_FIELD(fdw_relids);
 	COPY_SCALAR_FIELD(fsSystemCol);
 
 	return newnode;
@@ -617,7 +620,9 @@ _copyCustomScan(const CustomScan *from)
 	 */
 	COPY_SCALAR_FIELD(flags);
 	COPY_NODE_FIELD(custom_exprs);
+	COPY_NODE_FIELD(custom_ps_tlist);
 	COPY_NODE_FIELD(custom_private);
+	COPY_BITMAPSET_FIELD(custom_relids);
 
 	/*
 	 * NOTE: The method field of CustomScan is required to be a pointer to a
diff --git a/src/backend/nodes/outfuncs.c b/src/backend/nodes/outfuncs.c
index 385b289..a178132 100644
--- a/src/backend/nodes/outfuncs.c
+++ b/src/backend/nodes/outfuncs.c
@@ -558,8 +558,11 @@ _outForeignScan(StringInfo str, const ForeignScan *node)
 
 	_outScanInfo(str, (const Scan *) node);
 
+	WRITE_OID_FIELD(fdw_handler);
 	WRITE_NODE_FIELD(fdw_exprs);
+	WRITE_NODE_FIELD(fdw_ps_tlist);
 	WRITE_NODE_FIELD(fdw_private);
+	WRITE_BITMAPSET_FIELD(fdw_relids);
 	WRITE_BOOL_FIELD(fsSystemCol);
 }
 
@@ -572,7 +575,9 @@ _outCustomScan(StringInfo str, const CustomScan *node)
 
 	WRITE_UINT_FIELD(flags);
 	WRITE_NODE_FIELD(custom_exprs);
+	WRITE_NODE_FIELD(custom_ps_tlist);
 	WRITE_NODE_FIELD(custom_private);
+	WRITE_BITMAPSET_FIELD(custom_relids);
 	appendStringInfoString(str, " :methods ");
 	_outToken(str, node->methods->CustomName);
 	if (node->methods->TextOutCustomScan)
diff --git a/src/backend/optimizer/geqo/geqo_eval.c b/src/backend/optimizer/geqo/geqo_eval.c
index d9a20da..efa64cf 100644
--- a/src/backend/optimizer/geqo/geqo_eval.c
+++ b/src/backend/optimizer/geqo/geqo_eval.c
@@ -266,6 +266,9 @@ merge_clump(PlannerInfo *root, List *clumps, Clump *new_clump, bool force)
 			/* Keep searching if join order is not valid */
 			if (joinrel)
 			{
+				/* Add extra paths provided by extensions (FDW/CSP) */
+				add_joinrel_extra_paths(root, joinrel);
+
 				/* Find and save the cheapest paths for this joinrel */
 				set_cheapest(joinrel);
 
diff --git a/src/backend/optimizer/path/allpaths.c b/src/backend/optimizer/path/allpaths.c
index 58d78e6..797c8b8 100644
--- a/src/backend/optimizer/path/allpaths.c
+++ b/src/backend/optimizer/path/allpaths.c
@@ -60,6 +60,8 @@ set_rel_pathlist_hook_type set_rel_pathlist_hook = NULL;
 /* Hook for plugins to replace standard_join_search() */
 join_search_hook_type join_search_hook = NULL;
 
+/* Hook for plugins to add extra joinpath for each level */
+set_extra_joinpaths_hook_type set_extra_joinpaths_hook = NULL;
 
 static void set_base_rel_sizes(PlannerInfo *root);
 static void set_base_rel_pathlists(PlannerInfo *root);
@@ -1572,6 +1574,293 @@ make_rel_from_joinlist(PlannerInfo *root, List *joinlist)
 }
 
 /*
+ * lookup_reloptinfo
+ *
+ * A utility function to look up a RelOptInfo that represents a set of
+ * relation ids specified by relids.
+ */
+static RelOptInfo *
+lookup_reloptinfo(PlannerInfo *root, Relids relids)
+{
+	RelOptInfo *rel = NULL;
+
+	switch (bms_membership(relids))
+	{
+	case BMS_EMPTY_SET:
+		/* should not happen */
+		break;
+	case BMS_SINGLETON:
+		rel = find_base_rel(root, bms_singleton_member(relids));
+		break;
+	case BMS_MULTIPLE:
+		rel = find_join_rel(root, relids);
+		break;
+	}
+	return rel;
+}
+
+/*
+ * get_joinrel_broken_down
+ *
+ * We intend extensions to call this function to break down the supplied
+ * 'joinrel' into multiple relations that are involved in this join.
+ * It returns a list of RelOptInfo (which is either RELOPT_BASEREL or
+ * RELOPT_JOINREL) and SpecialJoinInfo if join-type is not JOIN_INNER.
+ * If not inner join, length of the RelOptInfo list should be 2, expected
+ * to join them according to the SpecialJoinInfo also returned.
+ * If inner join, it returns RelOptInfo more than or equal to 2.
+ */
+List *
+get_joinrel_broken_down(PlannerInfo *root, RelOptInfo *joinrel,
+						SpecialJoinInfo **p_sjinfo)
+{
+	List	   *sjinfo_list = NIL;
+	List	   *child_relops = NIL;
+	Bitmapset  *inner_relids = bms_copy(joinrel->relids);
+	Bitmapset  *tempset;
+	RelOptInfo *rel;
+	ListCell   *lc;
+	int			relid;
+
+	/* sanity checks */
+	Assert(joinrel->reloptkind == RELOPT_JOINREL);
+	Assert(bms_num_members(joinrel->relids) > 1);
+
+	/*
+	 * Walk on the SpecialJoinInfo that specifies the way to join
+	 * relations that are involved in this joinrel.
+	 */
+	foreach (lc, root->join_info_list)
+	{
+		SpecialJoinInfo *sjinfo = lfirst(lc);
+		ListCell   *prev;
+		ListCell   *cell;
+		ListCell   *next;
+
+		/* This special-join involves any other relations in-addition
+		 * to the relations joined with this joinrel.
+		 */
+		if (!bms_is_subset(sjinfo->min_lefthand, joinrel->relids) ||
+			!bms_is_subset(sjinfo->min_righthand, joinrel->relids))
+			continue;
+
+		tempset = bms_union(sjinfo->min_lefthand,
+							sjinfo->min_righthand);
+		inner_relids = bms_difference(inner_relids, tempset);
+
+		/*
+		 * Remove SpecialJoinInfo if it dominates / is dominated by
+		 * another one.
+		 */
+		for (prev = NULL, cell = list_head(sjinfo_list),
+			 next = (!cell ? NULL : lnext(cell));
+			 cell != NULL;
+			 prev = cell, cell = next, next = (!cell ? NULL : lnext(cell)))
+		{
+			SpecialJoinInfo *other = lfirst(cell);
+
+			/*
+			 * If sjinfo is dominated by other one already in the list,
+			 * no need to track this SpecialJoinInfo any more.
+			 */
+			if (bms_is_subset(tempset, other->min_lefthand) ||
+				bms_is_subset(tempset, other->min_righthand))
+				break;
+
+			/*
+			 * On the other hand, this sjinfo may dominates other one
+			 * already in the list. If so, we don't need to care about
+			 * the older one any more.
+			 */
+			if ((bms_is_subset(other->min_lefthand, sjinfo->min_lefthand) &&
+				 bms_is_subset(other->min_righthand, sjinfo->min_lefthand)) ||
+				(bms_is_subset(other->min_lefthand, sjinfo->min_righthand) &&
+				 bms_is_subset(other->min_righthand, sjinfo->min_righthand)))
+			{
+				sjinfo_list = list_delete_cell(sjinfo_list, cell, prev);
+				cell = prev;
+			}
+		}
+		/* OK, it makes sense to track this sjinfo */
+		if (!cell)
+			sjinfo_list = lappend(sjinfo_list, sjinfo);
+	}
+
+	/*
+	 * An empty inner_relids means that we have no relations to be
+	 * joined using inner join manner, thus, all we can do is to pick
+	 * up a special join info dominated by this 'joinrel', and split
+	 * it into two portions.
+	 * These two relations shall be able to joined according to the
+	 * SpecialJoinInfo to be returned to the caller.
+	 */
+	if (bms_is_empty(inner_relids))
+	{
+		foreach (lc, sjinfo_list)
+		{
+			SpecialJoinInfo *sjinfo = lfirst(lc);
+			RelOptInfo		*l_rel;
+			RelOptInfo		*r_rel;
+
+			tempset = bms_difference(joinrel->relids, sjinfo->min_lefthand);
+			l_rel = lookup_reloptinfo(root, sjinfo->min_lefthand);
+			r_rel = lookup_reloptinfo(root, tempset);
+			bms_free(tempset);
+
+			if (l_rel && r_rel)
+			{
+				*p_sjinfo = sjinfo;
+				return list_make2(l_rel, r_rel);
+			}
+
+			tempset = bms_difference(joinrel->relids, sjinfo->min_righthand);
+			l_rel = lookup_reloptinfo(root, tempset);
+			r_rel = lookup_reloptinfo(root, sjinfo->min_righthand);
+			bms_free(tempset);
+
+			if (l_rel && r_rel)
+			{
+				*p_sjinfo = sjinfo;
+				return list_make2(l_rel, r_rel);
+			}
+		}
+		elog(ERROR, "could not find suitable child joinrels");
+	}
+
+	/*
+	 * Otherwise, all the relations still in 'inner_relids' shall be
+	 * able to be joined using inner-join manner.
+	 */
+	tempset = bms_difference(joinrel->relids, inner_relids);
+	if (!bms_is_empty(tempset))
+	{
+		rel = lookup_reloptinfo(root, tempset);
+		if (!rel)
+			elog(ERROR, "could not find RelOptInfo for given relids");
+		child_relops = lappend(child_relops, rel);
+	}
+	relid = -1;
+	while ((relid = bms_next_member(inner_relids, relid)) >= 0)
+	{
+		rel = find_base_rel(root, relid);
+		child_relops = lappend(child_relops, rel);
+	}
+	Assert(list_length(child_relops) > 1);
+	*p_sjinfo = NULL;
+	return child_relops;
+}
+
+/*
+ * add_joinrel_extra_paths
+ *
+ * Entrypoint of the hooks for FDW/CSP to add alternative scan path
+ * towards the supplied 'joinrel'.
+ */
+void
+add_joinrel_extra_paths(PlannerInfo *root, RelOptInfo *joinrel)
+{
+	/*
+	 * Consider the paths added by FDWs if and when all the relations
+	 * involved in this joinrel are managed by the same foreign-data wrapper.
+	 * It is the role of the FDW driver's GetForeignJoinPaths handler to
+	 * check whether the combination of foreign server and/or checkAsUser
+	 * is suitable.
+	 */
+	if (joinrel->fdwroutine && joinrel->fdwroutine->GetForeignJoinPaths)
+		joinrel->fdwroutine->GetForeignJoinPaths(root, joinrel);
+
+	/*
+	 * Also, consider paths added by CSPs, not only FDW, if and when
+	 * someone tries to add a custom-path that tries to replace whole
+	 * of the join subtree. (e.g, CSP that inject materialized-view
+	 * scan if join-subtree is strictly matched with its definition)
+	 */
+	if (set_extra_joinpaths_hook)
+		set_extra_joinpaths_hook(root, joinrel);
+#if 1
+	/*
+	 * The block below is just for observation of the behavior when
+	 * get_joinrel_broken_down() is called back by extensions.
+	 */
+	{
+		SpecialJoinInfo	*sjinfo;
+		List	    *child_relops;
+		JoinType	jointype;
+		ListCell   *lc;
+		StringInfoData str;
+
+		child_relops = get_joinrel_broken_down(root, joinrel, &sjinfo);
+		jointype = (!sjinfo ? JOIN_INNER : sjinfo->jointype);
+
+		initStringInfo(&str);
+		switch (jointype)
+		{
+		case JOIN_INNER:
+			appendStringInfo(&str, "INNER JOIN: ");
+			break;
+		case JOIN_LEFT:
+			appendStringInfo(&str, "LEFT JOIN: ");
+			break;
+		case JOIN_FULL:
+			appendStringInfo(&str, "FULL JOIN: ");
+			break;
+		case JOIN_RIGHT:
+			appendStringInfo(&str, "RIGHT JOIN: ");
+			break;
+		case JOIN_SEMI:
+			appendStringInfo(&str, "SEMI JOIN: ");
+			break;
+		case JOIN_ANTI:
+			appendStringInfo(&str, "ANTI JOIN: ");
+			break;
+		case JOIN_UNIQUE_OUTER:
+			appendStringInfo(&str, "JOIN UNIQUE OUTER: ");
+			break;
+		case JOIN_UNIQUE_INNER:
+			appendStringInfo(&str, "JOIN UNIQUE INNER: ");
+			break;
+		default:
+			appendStringInfo(&str, "JOIN UNKNOWN: ");
+			break;
+		}
+
+		foreach (lc, child_relops)
+		{
+			RelOptInfo *rel = lfirst(lc);
+			RangeTblEntry  *rte;
+
+			if (lc != list_head(child_relops))
+				appendStringInfo(&str, ", ");
+
+			if (rel->reloptkind == RELOPT_BASEREL)
+			{
+				rte = root->simple_rte_array[rel->relid];
+				appendStringInfo(&str, "%s", rte->eref->aliasname);
+			}
+			else
+			{
+				int		relid = -1;
+				bool	is_first = true;
+
+				appendStringInfo(&str, "(");
+				while ((relid = bms_next_member(rel->relids, relid)) >= 0)
+				{
+					if (!is_first)
+						appendStringInfo(&str, ", ");
+					rte = root->simple_rte_array[relid];
+					appendStringInfo(&str, "%s", rte->eref->aliasname);
+					is_first = false;
+				}
+				appendStringInfo(&str, ")");
+			}
+		}
+		elog(INFO, "%s", str.data);
+		pfree(str.data);
+	}
+#endif
+}
+
+/*
  * standard_join_search
  *	  Find possible joinpaths for a query by successively finding ways
  *	  to join component relations into join relations.
@@ -1645,6 +1934,9 @@ standard_join_search(PlannerInfo *root, int levels_needed, List *initial_rels)
 		{
 			rel = (RelOptInfo *) lfirst(lc);
 
+			/* Add extra paths provided by extensions (FDW/CSP) */
+			add_joinrel_extra_paths(root, rel);
+
 			/* Find and save the cheapest paths for this rel */
 			set_cheapest(rel);
 
diff --git a/src/backend/optimizer/path/joinpath.c b/src/backend/optimizer/path/joinpath.c
index 1da953f..61f1a78 100644
--- a/src/backend/optimizer/path/joinpath.c
+++ b/src/backend/optimizer/path/joinpath.c
@@ -17,10 +17,13 @@
 #include <math.h>
 
 #include "executor/executor.h"
+#include "foreign/fdwapi.h"
 #include "optimizer/cost.h"
 #include "optimizer/pathnode.h"
 #include "optimizer/paths.h"
 
+/* Hook for plugins to get control in add_paths_to_joinrel() */
+set_join_pathlist_hook_type set_join_pathlist_hook = NULL;
 
 #define PATH_PARAM_BY_REL(path, rel)  \
 	((path)->param_info && bms_overlap(PATH_REQ_OUTER(path), (rel)->relids))
@@ -260,6 +263,16 @@ add_paths_to_joinrel(PlannerInfo *root,
 							 restrictlist, jointype,
 							 sjinfo, &semifactors,
 							 param_source_rels, extra_lateral_rels);
+
+	/*
+	 * 5. Consider paths added by custom-scan providers, or other extensions
+	 * in addition to the built-in paths.
+	 */
+	if (set_join_pathlist_hook)
+		set_join_pathlist_hook(root, joinrel, outerrel, innerrel,
+							   restrictlist, jointype,
+							   sjinfo, &semifactors,
+							   param_source_rels, extra_lateral_rels);
 }
 
 /*
diff --git a/src/backend/optimizer/plan/createplan.c b/src/backend/optimizer/plan/createplan.c
index cb69c03..7f86fcb 100644
--- a/src/backend/optimizer/plan/createplan.c
+++ b/src/backend/optimizer/plan/createplan.c
@@ -44,7 +44,6 @@
 #include "utils/lsyscache.h"
 
 
-static Plan *create_plan_recurse(PlannerInfo *root, Path *best_path);
 static Plan *create_scan_plan(PlannerInfo *root, Path *best_path);
 static List *build_path_tlist(PlannerInfo *root, Path *path);
 static bool use_physical_tlist(PlannerInfo *root, RelOptInfo *rel);
@@ -220,7 +219,7 @@ create_plan(PlannerInfo *root, Path *best_path)
  * create_plan_recurse
  *	  Recursive guts of create_plan().
  */
-static Plan *
+Plan *
 create_plan_recurse(PlannerInfo *root, Path *best_path)
 {
 	Plan	   *plan;
@@ -1961,16 +1960,26 @@ create_foreignscan_plan(PlannerInfo *root, ForeignPath *best_path,
 	ForeignScan *scan_plan;
 	RelOptInfo *rel = best_path->path.parent;
 	Index		scan_relid = rel->relid;
-	RangeTblEntry *rte;
+	Oid			rel_oid = InvalidOid;
 	Bitmapset  *attrs_used = NULL;
 	ListCell   *lc;
 	int			i;
 
-	/* it should be a base rel... */
-	Assert(scan_relid > 0);
-	Assert(rel->rtekind == RTE_RELATION);
-	rte = planner_rt_fetch(scan_relid, root);
-	Assert(rte->rtekind == RTE_RELATION);
+	/*
+	 * Fetch relation-id, if this foreign-scan node actually scans on
+	 * a particular real relation. Otherwise, InvalidOid shall be
+	 * passed to the FDW driver.
+	 */
+	if (scan_relid > 0)
+	{
+		RangeTblEntry *rte;
+
+		Assert(rel->rtekind == RTE_RELATION);
+		rte = planner_rt_fetch(scan_relid, root);
+		Assert(rte->rtekind == RTE_RELATION);
+		rel_oid = rte->relid;
+	}
+	Assert(rel->fdwroutine != NULL);
 
 	/*
 	 * Sort clauses into best execution order.  We do this first since the FDW
@@ -1985,13 +1994,37 @@ create_foreignscan_plan(PlannerInfo *root, ForeignPath *best_path,
 	 * has selected some join clauses for remote use but also wants them
 	 * rechecked locally).
 	 */
-	scan_plan = rel->fdwroutine->GetForeignPlan(root, rel, rte->relid,
+	scan_plan = rel->fdwroutine->GetForeignPlan(root, rel, rel_oid,
 												best_path,
 												tlist, scan_clauses);
+	/*
+	 * Sanity check. Pseudo scan tuple-descriptor shall be constructed
+	 * based on the fdw_ps_tlist, excluding resjunk=true, so we need to
+	 * ensure all valid TLEs have to locate prior to junk ones.
+	 */
+	if (scan_plan->scan.scanrelid == 0)
+	{
+		bool	found_resjunk = false;
+
+		foreach (lc, scan_plan->fdw_ps_tlist)
+		{
+			TargetEntry	   *tle = lfirst(lc);
+
+			if (tle->resjunk)
+				found_resjunk = true;
+			else if (found_resjunk)
+				elog(ERROR, "junk TLE should not appear prior to valid one");
+		}
+	}
+	/* Set the relids that are represented by this foreign scan for Explain */
+	scan_plan->fdw_relids = best_path->path.parent->relids;
 
 	/* Copy cost data from Path to Plan; no need to make FDW do this */
 	copy_path_costsize(&scan_plan->scan.plan, &best_path->path);
 
+	/* Track FDW handler OID; no need to make FDW do this */
+	scan_plan->fdw_handler = rel->fdw_handler;
+
 	/*
 	 * Replace any outer-relation variables with nestloop params in the qual
 	 * and fdw_exprs expressions.  We do this last so that the FDW doesn't
@@ -2053,12 +2086,7 @@ create_customscan_plan(PlannerInfo *root, CustomPath *best_path,
 {
 	CustomScan *cplan;
 	RelOptInfo *rel = best_path->path.parent;
-
-	/*
-	 * Right now, all we can support is CustomScan node which is associated
-	 * with a particular base relation to be scanned.
-	 */
-	Assert(rel && rel->reloptkind == RELOPT_BASEREL);
+	ListCell   *lc;
 
 	/*
 	 * Sort clauses into the best execution order, although custom-scan
@@ -2078,6 +2106,28 @@ create_customscan_plan(PlannerInfo *root, CustomPath *best_path,
 	Assert(IsA(cplan, CustomScan));
 
 	/*
+	 * Sanity check. Pseudo scan tuple-descriptor shall be constructed
+	 * based on the custom_ps_tlist, excluding resjunk=true, so we need
+	 * to ensure all valid TLEs have to locate prior to junk ones.
+	 */
+	if (cplan->scan.scanrelid == 0)
+	{
+		bool	found_resjunk = false;
+
+		foreach (lc, cplan->custom_ps_tlist)
+		{
+			TargetEntry	   *tle = lfirst(lc);
+
+			if (tle->resjunk)
+				found_resjunk = true;
+			else if (found_resjunk)
+				elog(ERROR, "junk TLE should not appear prior to valid one");
+		}
+	}
+	/* Set the relids that are represented by this custom scan for Explain */
+	cplan->custom_relids = best_path->path.parent->relids;
+
+	/*
 	 * Copy cost data from Path to Plan; no need to make custom-plan providers
 	 * do this
 	 */
diff --git a/src/backend/optimizer/plan/setrefs.c b/src/backend/optimizer/plan/setrefs.c
index ec828cd..2961f44 100644
--- a/src/backend/optimizer/plan/setrefs.c
+++ b/src/backend/optimizer/plan/setrefs.c
@@ -568,6 +568,38 @@ set_plan_refs(PlannerInfo *root, Plan *plan, int rtoffset)
 			{
 				ForeignScan *splan = (ForeignScan *) plan;
 
+				if (rtoffset > 0)
+					splan->fdw_relids =
+						bms_shift_members(splan->fdw_relids, rtoffset);
+
+				if (splan->scan.scanrelid == 0)
+				{
+					indexed_tlist *pscan_itlist =
+						build_tlist_index(splan->fdw_ps_tlist);
+
+					splan->scan.plan.targetlist = (List *)
+						fix_upper_expr(root,
+									   (Node *) splan->scan.plan.targetlist,
+									   pscan_itlist,
+									   INDEX_VAR,
+									   rtoffset);
+					splan->scan.plan.qual = (List *)
+						fix_upper_expr(root,
+									   (Node *) splan->scan.plan.qual,
+									   pscan_itlist,
+									   INDEX_VAR,
+									   rtoffset);
+					splan->fdw_exprs = (List *)
+						fix_upper_expr(root,
+									   (Node *) splan->fdw_exprs,
+									   pscan_itlist,
+									   INDEX_VAR,
+									   rtoffset);
+					splan->fdw_ps_tlist =
+						fix_scan_list(root, splan->fdw_ps_tlist, rtoffset);
+					pfree(pscan_itlist);
+					break;
+				}
 				splan->scan.scanrelid += rtoffset;
 				splan->scan.plan.targetlist =
 					fix_scan_list(root, splan->scan.plan.targetlist, rtoffset);
@@ -582,6 +614,38 @@ set_plan_refs(PlannerInfo *root, Plan *plan, int rtoffset)
 			{
 				CustomScan *splan = (CustomScan *) plan;
 
+				if (rtoffset > 0)
+					splan->custom_relids =
+						bms_shift_members(splan->custom_relids, rtoffset);
+
+				if (splan->scan.scanrelid == 0)
+				{
+					indexed_tlist *pscan_itlist =
+						build_tlist_index(splan->custom_ps_tlist);
+
+					splan->scan.plan.targetlist = (List *)
+						fix_upper_expr(root,
+									   (Node *) splan->scan.plan.targetlist,
+									   pscan_itlist,
+									   INDEX_VAR,
+									   rtoffset);
+					splan->scan.plan.qual = (List *)
+						fix_upper_expr(root,
+									   (Node *) splan->scan.plan.qual,
+									   pscan_itlist,
+									   INDEX_VAR,
+									   rtoffset);
+					splan->custom_exprs = (List *)
+						fix_upper_expr(root,
+									   (Node *) splan->custom_exprs,
+									   pscan_itlist,
+									   INDEX_VAR,
+									   rtoffset);
+					splan->custom_ps_tlist =
+						fix_scan_list(root, splan->custom_ps_tlist, rtoffset);
+					pfree(pscan_itlist);
+					break;
+				}
 				splan->scan.scanrelid += rtoffset;
 				splan->scan.plan.targetlist =
 					fix_scan_list(root, splan->scan.plan.targetlist, rtoffset);
diff --git a/src/backend/optimizer/util/plancat.c b/src/backend/optimizer/util/plancat.c
index 313a5c1..1c570c8 100644
--- a/src/backend/optimizer/util/plancat.c
+++ b/src/backend/optimizer/util/plancat.c
@@ -378,10 +378,15 @@ get_relation_info(PlannerInfo *root, Oid relationObjectId, bool inhparent,
 
 	/* Grab the fdwroutine info using the relcache, while we have it */
 	if (relation->rd_rel->relkind == RELKIND_FOREIGN_TABLE)
+	{
+		rel->fdw_handler = GetFdwHandlerForRelation(relation);
 		rel->fdwroutine = GetFdwRoutineForRelation(relation, true);
+	}
 	else
+	{
+		rel->fdw_handler = InvalidOid;
 		rel->fdwroutine = NULL;
-
+	}
 	heap_close(relation, NoLock);
 
 	/*
diff --git a/src/backend/optimizer/util/relnode.c b/src/backend/optimizer/util/relnode.c
index 8cfbea0..5623566 100644
--- a/src/backend/optimizer/util/relnode.c
+++ b/src/backend/optimizer/util/relnode.c
@@ -14,6 +14,7 @@
  */
 #include "postgres.h"
 
+#include "foreign/fdwapi.h"
 #include "optimizer/cost.h"
 #include "optimizer/pathnode.h"
 #include "optimizer/paths.h"
@@ -122,6 +123,7 @@ build_simple_rel(PlannerInfo *root, int relid, RelOptKind reloptkind)
 	rel->subroot = NULL;
 	rel->subplan_params = NIL;
 	rel->fdwroutine = NULL;
+	rel->fdw_handler = InvalidOid;
 	rel->fdw_private = NULL;
 	rel->baserestrictinfo = NIL;
 	rel->baserestrictcost.startup = 0;
@@ -427,6 +429,18 @@ build_join_rel(PlannerInfo *root,
 							   sjinfo, restrictlist);
 
 	/*
+	 * Set FDW handler and routine if both outer and inner relation
+	 * are managed by same FDW driver.
+	 */
+	if (OidIsValid(outer_rel->fdw_handler) &&
+		OidIsValid(inner_rel->fdw_handler) &&
+		outer_rel->fdw_handler == inner_rel->fdw_handler)
+	{
+		joinrel->fdw_handler = outer_rel->fdw_handler;
+		joinrel->fdwroutine = GetFdwRoutine(joinrel->fdw_handler);
+	}
+
+	/*
 	 * Add the joinrel to the query's joinrel list, and store it into the
 	 * auxiliary hashtable if there is one.  NB: GEQO requires us to append
 	 * the new joinrel to the end of the list!
diff --git a/src/backend/utils/adt/ruleutils.c b/src/backend/utils/adt/ruleutils.c
index 28e1acf..90e1107 100644
--- a/src/backend/utils/adt/ruleutils.c
+++ b/src/backend/utils/adt/ruleutils.c
@@ -3842,6 +3842,10 @@ set_deparse_planstate(deparse_namespace *dpns, PlanState *ps)
 	/* index_tlist is set only if it's an IndexOnlyScan */
 	if (IsA(ps->plan, IndexOnlyScan))
 		dpns->index_tlist = ((IndexOnlyScan *) ps->plan)->indextlist;
+	else if (IsA(ps->plan, ForeignScan))
+		dpns->index_tlist = ((ForeignScan *) ps->plan)->fdw_ps_tlist;
+	else if (IsA(ps->plan, CustomScan))
+		dpns->index_tlist = ((CustomScan *) ps->plan)->custom_ps_tlist;
 	else
 		dpns->index_tlist = NIL;
 }
diff --git a/src/include/foreign/fdwapi.h b/src/include/foreign/fdwapi.h
index 1d76841..ab0a05d 100644
--- a/src/include/foreign/fdwapi.h
+++ b/src/include/foreign/fdwapi.h
@@ -82,6 +82,9 @@ typedef void (*EndForeignModify_function) (EState *estate,
 
 typedef int (*IsForeignRelUpdatable_function) (Relation rel);
 
+typedef void (*GetForeignJoinPaths_function ) (PlannerInfo *root,
+											   RelOptInfo *joinrel);
+
 typedef void (*ExplainForeignScan_function) (ForeignScanState *node,
 													struct ExplainState *es);
 
@@ -150,6 +153,10 @@ typedef struct FdwRoutine
 
 	/* Support functions for IMPORT FOREIGN SCHEMA */
 	ImportForeignSchema_function ImportForeignSchema;
+
+	/* Support functions for join push-down */
+	GetForeignJoinPaths_function GetForeignJoinPaths;
+
 } FdwRoutine;
 
 
@@ -157,6 +164,7 @@ typedef struct FdwRoutine
 extern FdwRoutine *GetFdwRoutine(Oid fdwhandler);
 extern FdwRoutine *GetFdwRoutineByRelId(Oid relid);
 extern FdwRoutine *GetFdwRoutineForRelation(Relation relation, bool makecopy);
+extern Oid	GetFdwHandlerForRelation(Relation relation);
 extern bool IsImportableForeignTable(const char *tablename,
 						 ImportForeignSchemaStmt *stmt);
 
diff --git a/src/include/nodes/bitmapset.h b/src/include/nodes/bitmapset.h
index 3a556ee..3ca9791 100644
--- a/src/include/nodes/bitmapset.h
+++ b/src/include/nodes/bitmapset.h
@@ -66,6 +66,7 @@ extern void bms_free(Bitmapset *a);
 extern Bitmapset *bms_union(const Bitmapset *a, const Bitmapset *b);
 extern Bitmapset *bms_intersect(const Bitmapset *a, const Bitmapset *b);
 extern Bitmapset *bms_difference(const Bitmapset *a, const Bitmapset *b);
+extern Bitmapset *bms_shift_members(const Bitmapset *a, int shift);
 extern bool bms_is_subset(const Bitmapset *a, const Bitmapset *b);
 extern BMS_Comparison bms_subset_compare(const Bitmapset *a, const Bitmapset *b);
 extern bool bms_is_member(int x, const Bitmapset *a);
diff --git a/src/include/nodes/plannodes.h b/src/include/nodes/plannodes.h
index 21cbfa8..b25330e 100644
--- a/src/include/nodes/plannodes.h
+++ b/src/include/nodes/plannodes.h
@@ -471,7 +471,13 @@ typedef struct WorkTableScan
  * fdw_exprs and fdw_private are both under the control of the foreign-data
  * wrapper, but fdw_exprs is presumed to contain expression trees and will
  * be post-processed accordingly by the planner; fdw_private won't be.
- * Note that everything in both lists must be copiable by copyObject().
+ * An optional fdw_ps_tlist is used to map a reference to an attribute of
+ * the underlying relation(s) onto a pair of INDEX_VAR and alternative
+ * varattno.  The node then looks like a scan on a pseudo relation, usually
+ * the result of a join on the remote data source, and the FDW driver is
+ * responsible for setting the expected target list.  If the FDW returns
+ * records as per the foreign-table definition, just put NIL here.
+ * Note that everything in the above lists must be copiable by copyObject().
  * One way to store an arbitrary blob of bytes is to represent it as a bytea
  * Const.  Usually, though, you'll be better off choosing a representation
  * that can be dumped usefully by nodeToString().
@@ -480,18 +486,23 @@ typedef struct WorkTableScan
 typedef struct ForeignScan
 {
 	Scan		scan;
+	Oid			fdw_handler;	/* OID of FDW handler */
 	List	   *fdw_exprs;		/* expressions that FDW may evaluate */
+	List	   *fdw_ps_tlist;	/* optional pseudo-scan tlist for FDW */
 	List	   *fdw_private;	/* private data for FDW */
+	Bitmapset  *fdw_relids;		/* set of relid (index of range-tables)
+								 * represented by this node */
 	bool		fsSystemCol;	/* true if any "system column" is needed */
 } ForeignScan;
 
 /* ----------------
  *	   CustomScan node
  *
- * The comments for ForeignScan's fdw_exprs and fdw_private fields apply
- * equally to custom_exprs and custom_private.  Note that since Plan trees
- * can be copied, custom scan providers *must* fit all plan data they need
- * into those fields; embedding CustomScan in a larger struct will not work.
+ * The comments for ForeignScan's fdw_exprs, fdw_ps_tlist and fdw_private
+ * fields apply equally to custom_exprs, custom_ps_tlist and custom_private.
+ * Note that since Plan trees can be copied, custom scan providers *must*
+ * fit all plan data they need into those fields; embedding CustomScan in
+ * a larger struct will not work.
  * ----------------
  */
 struct CustomScan;
@@ -512,7 +523,10 @@ typedef struct CustomScan
 	Scan		scan;
 	uint32		flags;			/* mask of CUSTOMPATH_* flags, see relation.h */
 	List	   *custom_exprs;	/* expressions that custom code may evaluate */
+	List	   *custom_ps_tlist;/* optional pseudo-scan target list */
 	List	   *custom_private; /* private data for custom code */
+	Bitmapset  *custom_relids;	/* set of relid (index of range-tables)
+								 * represented by this node */
 	const CustomScanMethods *methods;
 } CustomScan;
 
diff --git a/src/include/nodes/relation.h b/src/include/nodes/relation.h
index 334cf51..4eb89c6 100644
--- a/src/include/nodes/relation.h
+++ b/src/include/nodes/relation.h
@@ -366,6 +366,7 @@ typedef struct PlannerInfo
  *		subroot - PlannerInfo for subquery (NULL if it's not a subquery)
  *		subplan_params - list of PlannerParamItems to be passed to subquery
  *		fdwroutine - function hooks for FDW, if foreign table (else NULL)
+ *		fdw_handler - OID of FDW handler, if foreign table (else InvalidOid)
  *		fdw_private - private state for FDW, if foreign table (else NULL)
  *
  *		Note: for a subquery, tuples, subplan, subroot are not set immediately
@@ -461,6 +462,7 @@ typedef struct RelOptInfo
 	List	   *subplan_params; /* if subquery */
 	/* use "struct FdwRoutine" to avoid including fdwapi.h here */
 	struct FdwRoutine *fdwroutine;		/* if foreign table */
+	Oid			fdw_handler;	/* if foreign table */
 	void	   *fdw_private;	/* if foreign table */
 
 	/* used by various scans and joins: */
diff --git a/src/include/optimizer/paths.h b/src/include/optimizer/paths.h
index 6cad92e..e5676c8 100644
--- a/src/include/optimizer/paths.h
+++ b/src/include/optimizer/paths.h
@@ -30,13 +30,34 @@ typedef void (*set_rel_pathlist_hook_type) (PlannerInfo *root,
 														RangeTblEntry *rte);
 extern PGDLLIMPORT set_rel_pathlist_hook_type set_rel_pathlist_hook;
 
+/* Hook for plugins to get control in add_paths_to_joinrel() */
+typedef void (*set_join_pathlist_hook_type) (PlannerInfo *root,
+											 RelOptInfo *joinrel,
+											 RelOptInfo *outerrel,
+											 RelOptInfo *innerrel,
+											 List *restrictlist,
+											 JoinType jointype,
+											 SpecialJoinInfo *sjinfo,
+											 SemiAntiJoinFactors *semifactors,
+											 Relids param_source_rels,
+											 Relids extra_lateral_rels);
+extern PGDLLIMPORT set_join_pathlist_hook_type set_join_pathlist_hook;
+
 /* Hook for plugins to replace standard_join_search() */
 typedef RelOptInfo *(*join_search_hook_type) (PlannerInfo *root,
 														  int levels_needed,
 														  List *initial_rels);
 extern PGDLLIMPORT join_search_hook_type join_search_hook;
 
+/* Hook for plugins to add extra joinpath for each level */
+typedef void (*set_extra_joinpaths_hook_type)(PlannerInfo *root,
+											  RelOptInfo *joinrel);
+extern PGDLLIMPORT set_extra_joinpaths_hook_type set_extra_joinpaths_hook;
 
+extern List *get_joinrel_broken_down(PlannerInfo *root,
+									 RelOptInfo *joinrel,
+									 SpecialJoinInfo **p_sjinfo);
+extern void add_joinrel_extra_paths(PlannerInfo *root, RelOptInfo *joinrel);
 extern RelOptInfo *make_one_rel(PlannerInfo *root, List *joinlist);
 extern RelOptInfo *standard_join_search(PlannerInfo *root, int levels_needed,
 					 List *initial_rels);
diff --git a/src/include/optimizer/planmain.h b/src/include/optimizer/planmain.h
index fa72918..0c8cbcd 100644
--- a/src/include/optimizer/planmain.h
+++ b/src/include/optimizer/planmain.h
@@ -41,6 +41,7 @@ extern Plan *optimize_minmax_aggregates(PlannerInfo *root, List *tlist,
  * prototypes for plan/createplan.c
  */
 extern Plan *create_plan(PlannerInfo *root, Path *best_path);
+extern Plan *create_plan_recurse(PlannerInfo *root, Path *best_path);
 extern SubqueryScan *make_subqueryscan(List *qptlist, List *qpqual,
 				  Index scanrelid, Plan *subplan);
 extern ForeignScan *make_foreignscan(List *qptlist, List *qpqual,

Shigeru HANADA

shigeru.hanada@gmail.com

almost 11 years ago

In reply to: Kouhei Kaigai (#1)

On 2015/03/23 9:12, Kouhei Kaigai <kaigai@ak.jp.nec.com> wrote:

Sorry for my late response. It was not easy to code during a business trip.

The attached patch adds a hook for FDW/CSP to replace an entire join subtree
with a foreign/custom scan, according to the discussion upthread.

The GetForeignJoinPaths handler of FDW is simplified as follows:
typedef void (*GetForeignJoinPaths_function) (PlannerInfo *root,
RelOptInfo *joinrel);

It’s not a critical issue, but I’d like to propose renaming add_joinrel_extra_paths() to add_extra_paths_to_joinrel(), because the latter makes it clearer that it does extra work in addition to add_paths_to_joinrel().

It takes the PlannerInfo and the RelOptInfo of the join relation to be
replaced, if available. RelOptInfo contains the 'relids' bitmap, so the FDW
driver can tell which relations are involved and construct a remote join query.
However, RelOptInfo alone does not make it obvious how the relations are joined.
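
As a rough illustration only (not part of the patch), walking a relids set can be sketched in standalone C. It follows the loop idiom the patch uses with bms_next_member(), but models the set as a plain 32-bit mask; toy_next_member() and toy_list_members() are hypothetical names, and the -2 end marker is just this toy's convention.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Toy model: a relids set is a 32-bit mask, bit i set means range-table
 * index i is a member.  Return the smallest member greater than 'prevbit',
 * or -2 if there is none.  Start an iteration with prevbit = -1.
 */
static int
toy_next_member(uint32_t set, int prevbit)
{
	int			bit;

	assert(prevbit >= -1);
	for (bit = prevbit + 1; bit < 32; bit++)
	{
		if (set & (UINT32_C(1) << bit))
			return bit;
	}
	return -2;
}

/* Collect the members of 'relids' into 'out'; return how many were found. */
static int
toy_list_members(uint32_t relids, int *out, int maxout)
{
	int			n = 0;
	int			relid = -1;

	while ((relid = toy_next_member(relids, relid)) >= 0)
	{
		assert(n < maxout);
		out[n++] = relid;
	}
	return n;
}
```

Listing the members this way tells the driver which relations take part, but, as noted above, says nothing about the join order or join type between them.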

The function below will help implement an FDW driver that supports remote join.

List *
get_joinrel_broken_down(PlannerInfo *root, RelOptInfo *joinrel,
SpecialJoinInfo **p_sjinfo)

It returns a list of the RelOptInfos involved in the join that is
represented by 'joinrel', and also sets a SpecialJoinInfo through the third
argument if the join is not an inner join.
In case of an inner join, it returns two or more relations to be
inner-joined. Otherwise, it returns two relations and
a valid SpecialJoinInfo.
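
A minimal sketch of that contract, assuming simplified stand-in types (ToyBreakdown, ToySpecialJoin and the toy_* functions are hypothetical and do not exist in PostgreSQL): a NULL SpecialJoinInfo means an inner join over two or more relations, anything else means exactly two relations joined with the type recorded in the SpecialJoinInfo.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef enum ToyJoinType
{
	TOY_JOIN_INNER,
	TOY_JOIN_LEFT,
	TOY_JOIN_FULL,
	TOY_JOIN_SEMI,
	TOY_JOIN_ANTI
} ToyJoinType;

/* Stand-in for SpecialJoinInfo: only the join type matters here. */
typedef struct ToySpecialJoin
{
	ToyJoinType jointype;
} ToySpecialJoin;

/* Stand-in for the result of breaking down a joinrel. */
typedef struct ToyBreakdown
{
	int			nrels;			/* number of child relations returned */
	const ToySpecialJoin *sjinfo;	/* NULL when the join is an inner join */
} ToyBreakdown;

/* Join type the caller should assume for this breakdown. */
static ToyJoinType
toy_breakdown_jointype(const ToyBreakdown *bd)
{
	return (bd->sjinfo == NULL) ? TOY_JOIN_INNER : bd->sjinfo->jointype;
}

/* Check the arity rule: N >= 2 for inner joins, exactly 2 otherwise. */
static bool
toy_breakdown_is_consistent(const ToyBreakdown *bd)
{
	assert(bd != NULL);
	if (bd->sjinfo == NULL)
		return bd->nrels >= 2;
	return bd->nrels == 2;
}
```

The jointype rule is the same one the observation block in add_joinrel_extra_paths() applies to the returned SpecialJoinInfo.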

As far as I tested, it works fine for SEMI and ANTI.
# I want a dump function for Bitmapset for debugging, like nodeToString() for Node...

At this moment, I'm not 100% certain about its logic. In particular, I have
not tested the SEMI- and ANTI-join cases yet.
However, time is money - I want people to check the overall design first,
rather than debug the details. Please tell me if I misunderstood the logic
for breaking down join relations.

With your patch applied, the “updatable view” regression tests failed. regression.diff contains some errors like this:
! ERROR: could not find RelOptInfo for given relids

Could you check that?

--
Shigeru HANADA

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Kouhei Kaigai

kaigai@ak.jp.nec.com

almost 11 years ago

In reply to: Shigeru HANADA (#2)

At this moment, I'm not 100% certain about its logic. In particular, I have
not tested the SEMI- and ANTI-join cases yet.
However, time is money - I want people to check the overall design first,
rather than debug the details. Please tell me if I misunderstood the logic
for breaking down join relations.

With your patch applied, the “updatable view” regression tests failed.
regression.diff contains some errors like this:
! ERROR: could not find RelOptInfo for given relids

Could you check that?

It is a bug in the logic that finds the two RelOptInfos which together
construct the RelOptInfo of the joinrel.
I'm now working to correct the logic, but it is not obvious how to
identify two sets of relids that together satisfy joinrel->relids.
(Yep, the law of increasing entropy...)

On the other hand, we may have a solution that does not need a complicated
reconstruction process. The original concern was that an FDW driver may add
paths that replace the entire join subtree with a foreign scan on a remote
join multiple times, repeatedly, even though these paths would be identical.

If we put a hook for FDW/CSP at the bottom of build_join_rel(), we may be
able to solve the problem in a simpler and more straightforward way.
Because build_join_rel() looks up the cache in root->join_rel_hash and returns
immediately if the required joinrelids already have a RelOptInfo, the bottom
of this function is never reached twice for a particular set of joinrelids.
Once FDW/CSP adds a path that replaces the entire join subtree to the joinrel
just after its construction, the path is kept until cheaper built-in
paths are added (if any).
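
To illustrate the idea only (this is a toy model, not the planner code), the "hook at the bottom of a cached builder" pattern can be sketched in standalone C. ToyJoinRel, toy_build_join_rel() and toy_extension_hook() are hypothetical names, and a linear array stands in for the joinrel cache.

```c
#include <assert.h>
#include <stdint.h>

/* Toy stand-in for RelOptInfo: a joinrel identified by a relids bitmask. */
typedef struct ToyJoinRel
{
	uint32_t	relids;			/* bit i set => base relation i is a member */
	int			npaths;			/* number of paths added so far */
} ToyJoinRel;

#define TOY_MAX_JOINRELS 64

static ToyJoinRel toy_joinrels[TOY_MAX_JOINRELS];	/* plays the cache role */
static int	toy_njoinrels = 0;
static int	toy_hook_calls = 0;	/* how often the extension hook fired */

/* Hypothetical extension hook: adds one "remote join" path to a new joinrel. */
static void
toy_extension_hook(ToyJoinRel *joinrel)
{
	joinrel->npaths++;
	toy_hook_calls++;
}

/*
 * Return the joinrel for 'relids', building it on first request.  The hook
 * sits at the bottom, so it runs only when a new entry is actually built.
 */
static ToyJoinRel *
toy_build_join_rel(uint32_t relids)
{
	ToyJoinRel *joinrel;
	int			i;

	/* Cache hit: return early and never reach the hook below. */
	for (i = 0; i < toy_njoinrels; i++)
	{
		if (toy_joinrels[i].relids == relids)
			return &toy_joinrels[i];
	}

	assert(toy_njoinrels < TOY_MAX_JOINRELS);
	joinrel = &toy_joinrels[toy_njoinrels++];
	joinrel->relids = relids;
	joinrel->npaths = 0;

	toy_extension_hook(joinrel);
	return joinrel;
}
```

Asking for the same relids twice returns the same entry and leaves toy_hook_calls at 1, which is the "never called twice" property described above.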

This idea has one other positive side effect. We expect a remote join to be
cheaper than a local join over two remote scans in most cases. Once a much
cheaper path has been added before local joins are considered,
add_path_precheck() cuts path consideration short earlier.
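
A deliberately simplified sketch of that pruning effect, in standalone C: it compares only startup and total cost, whereas the real add_path_precheck() also takes sort order and parameterization into account. ToyPath and toy_path_precheck() are hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for a Path: only the two cost figures are modeled. */
typedef struct ToyPath
{
	double		startup_cost;
	double		total_cost;
} ToyPath;

/*
 * Return true if a candidate with the given cost estimates is still worth
 * building, i.e. no existing path is at least as cheap on both costs.
 */
static bool
toy_path_precheck(const ToyPath *existing, int nexisting,
				  double startup_cost, double total_cost)
{
	int			i;

	assert(nexisting >= 0);
	assert(existing != NULL || nexisting == 0);

	for (i = 0; i < nexisting; i++)
	{
		if (existing[i].startup_cost <= startup_cost &&
			existing[i].total_cost <= total_cost)
			return false;		/* dominated: skip building this path */
	}
	return true;
}
```

With a cheap remote-join path already in the list, a more expensive local-join candidate is rejected before any Path node is built for it.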

Comments are welcome.

Thanks,
--
NEC OSS Promotion Center / PG-Strom Project
KaiGai Kohei <kaigai@ak.jp.nec.com>

-----Original Message-----
From: Shigeru HANADA [mailto:shigeru.hanada@gmail.com]
Sent: Tuesday, March 24, 2015 7:36 PM
To: Kaigai Kouhei(海外浩平)
Cc: Robert Haas; Tom Lane; Ashutosh Bapat; Thom Brown;
pgsql-hackers@postgreSQL.org
Subject: Re: Custom/Foreign-Join-APIs (Re: [HACKERS] [v9.5] Custom Plan API)


Kyotaro HORIGUCHI

horiguchi.kyotaro@lab.ntt.co.jp

almost 11 years ago

In reply to: Kouhei Kaigai (#3)

Re: Custom/Foreign-Join-APIs

Hello, I had a look at this.

At Wed, 25 Mar 2015 03:59:28 +0000, Kouhei Kaigai <kaigai@ak.jp.nec.com> wrote in <9A28C8860F777E439AA12E8AEA7694F8010C6819@BPXM15GP.gisp.nec.co.jp>

At this moment, I'm not 100% certain about its logic. In particular, I have
not tested the SEMI- and ANTI-join cases yet.
However, time is money - I want people to check the overall design first,
rather than debug the details. Please tell me if I misunderstood the logic
for breaking down join relations.

With your patch applied, the “updatable view” regression tests failed.
regression.diff contains some errors like this:
! ERROR: could not find RelOptInfo for given relids

Could you check that?

It is a bug in the logic that finds the two RelOptInfos which together
construct the RelOptInfo of the joinrel.

It is caused by a split (or multilevel) joinlist. Setting
join_collapse_limit to 10 makes the query work.

I suppose that get_joinrel_broken_down() should give up and return no
result when the given joinrel spans multiple join subproblems,
because they cannot be merged by the FDW anyway, even if they
conform to the basic requirements for merging.
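
One possible reading of that check, as a standalone sketch under stated assumptions: each join subproblem is modeled as a bitmask of the base relations it contains, and a joinrel is handled only if all of its members fall inside a single subproblem. toy_fits_one_subproblem() is a hypothetical helper, not an existing planner function.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Return true if every member of 'relids' belongs to one single entry of
 * 'subproblems'.  A caller would give up (return no breakdown) when this
 * is false, instead of raising an error.
 */
static bool
toy_fits_one_subproblem(uint32_t relids,
						const uint32_t *subproblems, int nsubproblems)
{
	int			i;

	assert(relids != 0);
	for (i = 0; i < nsubproblems; i++)
	{
		/* No member of relids lies outside subproblem i. */
		if ((relids & ~subproblems[i]) == 0)
			return true;
	}
	return false;
}
```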

I'm now working to correct the logic, but it is not obvious how to
identify two sets of relids that together satisfy joinrel->relids.
(Yep, the law of increasing entropy...)

On the other hand, we may have a solution that does not need a complicated
reconstruction process. The original concern was that an FDW driver may add
paths that replace the entire join subtree with a foreign scan on a remote
join multiple times, repeatedly, even though these paths would be identical.

If we put a hook for FDW/CSP on bottom of build_join_rel(), we may be able
to solve the problem more straight-forward and simply way.
Because build_join_rel() finds a cache on root->join_rel_hash then returns
immediately if required joinrelids already has its RelOptInfo, bottom of
this function never called twice on a particular set of joinrelids.
Once FDW/CSP constructs a path that replaces entire join subtree towards
the joinrel just after construction, it shall be kept until cheaper built-in
paths are added (if exists).

This idea has one other positive side-effect. We expect remote-join is cheaper
than local join with two remote scan in most cases. Once a much cheaper path
is added prior to local join consideration, add_path_precheck() breaks path
consideration earlier.

+1 as a whole.
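
The early-exit effect mentioned above can be pictured with a much simplified model. This is only a sketch, not the planner's real add_path_precheck(): it ignores sort order and parameterization, and ToyPath and toy_precheck are hypothetical names invented for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a planner path: only the two cost fields. */
typedef struct ToyPath
{
    double      startup_cost;
    double      total_cost;
} ToyPath;

/*
 * Return true if a candidate with the given costs is still worth building,
 * i.e. no already-registered path is at least as cheap on both measures.
 * (Assumption: the real check also weighs pathkeys and parameterization.)
 */
bool
toy_precheck(const ToyPath *existing, size_t n,
             double startup_cost, double total_cost)
{
    assert(existing != NULL || n == 0);

    for (size_t i = 0; i < n; i++)
    {
        if (existing[i].startup_cost <= startup_cost &&
            existing[i].total_cost <= total_cost)
            return false;       /* dominated: skip building this path */
    }
    return true;
}
```

If a cheap remote-join path is registered first, a more expensive local join candidate is rejected before any path node is built for it; a candidate that is cheaper on either measure still passes.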

regards,

--
Kyotaro Horiguchi

Nippon Telegraph and Telephone Corporation, NTT Open Source Software Center
Phone: 03-5860-5115 / Fax: 03-5463-5490

Shigeru HANADA

shigeru.hanada@gmail.com

almost 11 years ago

In reply to: Kouhei Kaigai (#3)

On 2015/03/25 12:59, Kouhei Kaigai <kaigai@ak.jp.nec.com> wrote:

At this moment, I'm not 100% certain about its logic. Especially, I didn't
test SEMI- and ANTI- join cases yet.
However, time is money - I want people to check overall design first, rather
than detailed debugging. Please tell me if I misunderstood the logic to break
down join relations.

With applying your patch, regression tests of “updatable view” failed.
regression.diff contains some errors like this:
! ERROR: could not find RelOptInfo for given relids

Could you check that?

It is a bug around the logic to find out two RelOptInfo that can construct
another RelOptInfo of joinrel.
Even though I'm now working to correct the logic, it is not obvious to
identify two relids that satisfy joinrel->relids.
(Yep, law of entropy enhancement…)

IIUC, this problem arises only with non-INNER JOINs, because relations joined only by INNER JOINs can be treated in arbitrary order. But supporting OUTER JOINs would be necessary even for the first cut.

On the other hands, we may have a solution that does not need a complicated
reconstruction process. The original concern was, FDW driver may add paths
that will replace entire join subtree by foreign-scan on remote join multiple
times, repeatedly, but these paths shall be identical.

If we put a hook for FDW/CSP on bottom of build_join_rel(), we may be able
to solve the problem more straight-forward and simply way.
Because build_join_rel() finds a cache on root->join_rel_hash then returns
immediately if required joinrelids already has its RelOptInfo, bottom of
this function never called twice on a particular set of joinrelids.
Once FDW/CSP constructs a path that replaces entire join subtree towards
the joinrel just after construction, it shall be kept until cheaper built-in
paths are added (if exists).

This idea has one other positive side-effect. We expect remote-join is cheaper
than local join with two remote scan in most cases. Once a much cheaper path
is added prior to local join consideration, add_path_precheck() breaks path
consideration earlier.

Please comment on.

Or the bottom of make_join_rel(). IMO build_join_rel() is responsible only for building (or looking up in a list) a RelOptInfo for the given relids. After that, make_join_rel() calls add_paths_to_joinrel() with appropriate arguments per join type to generate the actual Paths that implement the join. make_join_rel() is called only once for a particular relid combination, and SpecialJoinInfo and restrictlist (conditions specified in JOIN-ON and WHERE) are available there, so it seems promising for FDW cases.

Though I’m not sure that it also fits custom join provider’s requirements.

—
Shigeru HANADA

Ashutosh Bapat

ashutosh.bapat@enterprisedb.com

almost 11 years ago

In reply to: Shigeru HANADA (#5)

On Wed, Mar 25, 2015 at 3:14 PM, Shigeru HANADA <shigeru.hanada@gmail.com>
wrote:

Or bottom of make_join_rel(). IMO build_join_rel() is responsible for
just building (or searching from a list) a RelOptInfo for given relids.
After that make_join_rel() calls add_paths_to_joinrel() with appropriate
arguments per join type to generate actual Paths implements the join.
make_join_rel() is called only once for particular relid combination, and
there SpecialJoinInfo and restrictlist (conditions specified in JOIN-ON and
WHERE), so it seems promising for FDW cases.

I like that idea, but I think we will end up with a complex hook signature;
it won't remain as simple as hook (root, joinrel).

--
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company

Kouhei Kaigai

kaigai@ak.jp.nec.com

almost 11 years ago

In reply to: Ashutosh Bapat (#6)

On Wed, Mar 25, 2015 at 3:14 PM, Shigeru HANADA <shigeru.hanada@gmail.com> wrote:
Or bottom of make_join_rel(). IMO build_join_rel() is responsible for
just building (or searching from a list) a RelOptInfo for given relids. After
that make_join_rel() calls add_paths_to_joinrel() with appropriate arguments per
join type to generate actual Paths implements the join. make_join_rel() is
called only once for particular relid combination, and there SpecialJoinInfo and
restrictlist (conditions specified in JOIN-ON and WHERE), so it seems promising
for FDW cases.

I like that idea, but I think we will have complex hook signature, it won't remain
as simple as hook (root, joinrel).

In this case, GetForeignJoinPaths() will take root, joinrel, rel1, rel2,
sjinfo and restrictlist.
That signature is not very simple, but it is not complicated either.

Even if we reconstruct rel1 and rel2 using sjinfo, we also need to compute
restrictlist using build_joinrel_restrictlist() again, which is a static
function in relnode.c. So, I don't think either approach has a definitive
advantage from the standpoint of simplicity.

Thanks,
--
NEC OSS Promotion Center / PG-Strom Project
KaiGai Kohei <kaigai@ak.jp.nec.com>

Shigeru HANADA

shigeru.hanada@gmail.com

almost 11 years ago

In reply to: Kouhei Kaigai (#7)

On 2015/03/25 19:09, Kouhei Kaigai <kaigai@ak.jp.nec.com> wrote:

On Wed, Mar 25, 2015 at 3:14 PM, Shigeru HANADA <shigeru.hanada@gmail.com> wrote:
Or bottom of make_join_rel(). IMO build_join_rel() is responsible for
just building (or searching from a list) a RelOptInfo for given relids. After
that make_join_rel() calls add_paths_to_joinrel() with appropriate arguments per
join type to generate actual Paths implements the join. make_join_rel() is
called only once for particular relid combination, and there SpecialJoinInfo and
restrictlist (conditions specified in JOIN-ON and WHERE), so it seems promising
for FDW cases.

I like that idea, but I think we will have complex hook signature, it won't remain
as simple as hook (root, joinrel).

In this case, GetForeignJoinPaths() will take root, joinrel, rel1, rel2,
sjinfo and restrictlist.
It is not too simple, but not complicated signature.

Even if we reconstruct rel1 and rel2 using sjinfo, we also need to compute
restrictlist using build_joinrel_restrictlist() again. It is a static function
in relnode.c. So, I don't think either of them has definitive advantage from
the standpoint of simplicity.

The bottom of make_join_rel() seems good from the viewpoint of available information, but it is called multiple times for join combinations that are essentially identical, for an INNER JOIN case like this:

fdw=# explain select * from pgbench_branches b join pgbench_tellers t on t.bid = b.bid join pgbench_accounts a on a.bid = b.bid and a.bid = t.bid;
INFO: postgresGetForeignJoinPaths() 1x2
INFO: postgresGetForeignJoinPaths() 1x4
INFO: postgresGetForeignJoinPaths() 2x4
INFO: standard_join_search() old hook point
INFO: standard_join_search() old hook point
INFO: standard_join_search() old hook point
INFO: postgresGetForeignJoinPaths() 0x4
INFO: postgresGetForeignJoinPaths() 0x2
INFO: postgresGetForeignJoinPaths() 0x1
INFO: standard_join_search() old hook point
QUERY PLAN
---------------------------------------------------------
Foreign Scan (cost=100.00..102.11 rows=211 width=1068)
(1 row)

Here I’ve put probe points at the beginning of the GetForeignJoinPaths handler and just before the set_cheapest() call in standard_join_search() (shown as “old hook point”). In this example 1, 2, and 4 are base relations, and at join level 3 the planner calls GetForeignJoinPaths() three times, for the combinations:

1) (1x2)x4
2) (1x4)x2
3) (2x4)x1
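
To see where the count of three comes from, here is a minimal sketch that treats a relid set as a bitmask and counts the ways it can be formed as "smaller join x one base relation". The function name is hypothetical, and the model only covers this last-step shape, which is all that exists for three relations.

```c
#include <assert.h>

/*
 * Count the ways a relid bitmask can be split into
 * (join of the remaining relations) x (one base relation).
 * A set with a single member has no such split.
 */
int
count_last_step_splits(unsigned relids)
{
    int         count = 0;

    assert(relids != 0);

    for (unsigned bit = 1; bit != 0 && bit <= relids; bit <<= 1)
    {
        /* 'bit' is a member, and something is left over to join it with */
        if ((relids & bit) != 0 && (relids & ~bit) != 0)
            count++;
    }
    return count;
}
```

With relations 1, 2 and 4 encoded as bits 1, 2 and 4 (mask 0x16), the function reports three splits, one per combination listed above, although all three describe the same joined relation.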

Tom’s suggestion aims at providing a chance to consider join push-down at a more abstract level, IIUC. So it would be good to call the handler only once in that case, for the flattened combination (1x2x4).

Hmm, how about skipping the handler (or hook) call if the joinrel was found by find_join_rel()? At least it suppresses redundant calls for different join orders, and the handler can determine whether the combination can be flattened by checking that every RelOptInfo with RELOPT_JOINREL under the joinrel has JOIN_INNER as its jointype.
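
The flattening check described above could look roughly like the following. This is a sketch under a strong assumption: the ToyRel structure, with a stored join type and child pointers, is hypothetical and exists only for illustration. The planner's RelOptInfo is not assumed here to carry such fields, so a handler would have to track this information itself.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef enum ToyRelKind { TOY_BASEREL, TOY_JOINREL } ToyRelKind;
typedef enum ToyJoinType { TOY_JOIN_INNER, TOY_JOIN_LEFT, TOY_JOIN_FULL } ToyJoinType;

/* Hypothetical relation node; jointype is meaningful only for join rels. */
typedef struct ToyRel
{
    ToyRelKind  kind;
    ToyJoinType jointype;
    const struct ToyRel *outer;
    const struct ToyRel *inner;
} ToyRel;

/* True if every join below (and including) 'rel' is an inner join. */
bool
toy_can_flatten(const ToyRel *rel)
{
    assert(rel != NULL);

    if (rel->kind == TOY_BASEREL)
        return true;
    if (rel->jointype != TOY_JOIN_INNER)
        return false;
    return toy_can_flatten(rel->outer) && toy_can_flatten(rel->inner);
}
```

A tree such as (1x2)x4 built only from inner joins passes the check, while the same tree with 1 LEFT JOIN 2 at the bottom does not, because the join order then matters for the remote query.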

—
Shigeru HANADA

Kouhei Kaigai

kaigai@ak.jp.nec.com

almost 11 years ago

In reply to: Shigeru HANADA (#5)

On 2015/03/25 12:59, Kouhei Kaigai <kaigai@ak.jp.nec.com> wrote:

At this moment, I'm not 100% certain about its logic. Especially, I didn't
test SEMI- and ANTI- join cases yet.
However, time is money - I want people to check overall design first, rather
than detailed debugging. Please tell me if I misunderstood the logic to break
down join relations.

With applying your patch, regression tests of “updatable view” failed.
regression.diff contains some errors like this:
! ERROR: could not find RelOptInfo for given relids

Could you check that?

It is a bug around the logic to find out two RelOptInfo that can construct
another RelOptInfo of joinrel.
Even though I'm now working to correct the logic, it is not obvious to
identify two relids that satisfy joinrel->relids.
(Yep, law of entropy enhancement…)

IIUC, this problem is in only non-INNER JOINs because we can treat relations joined
with only INNER JOIN in arbitrary order. But supporting OUTER JOINs would be
necessary even for the first cut.

Yep. When the joinrel contains only inner-joined relations managed by the same
FDW driver, the job of get_joinrel_broken_down() is quite simple.
However, people want outer joins to be supported as well, don't they?

On the other hands, we may have a solution that does not need a complicated
reconstruction process. The original concern was, FDW driver may add paths
that will replace entire join subtree by foreign-scan on remote join multiple
times, repeatedly, but these paths shall be identical.

If we put a hook for FDW/CSP on bottom of build_join_rel(), we may be able
to solve the problem more straight-forward and simply way.
Because build_join_rel() finds a cache on root->join_rel_hash then returns
immediately if required joinrelids already has its RelOptInfo, bottom of
this function never called twice on a particular set of joinrelids.
Once FDW/CSP constructs a path that replaces entire join subtree towards
the joinrel just after construction, it shall be kept until cheaper built-in
paths are added (if exists).

This idea has one other positive side-effect. We expect remote-join is cheaper
than local join with two remote scan in most cases. Once a much cheaper path
is added prior to local join consideration, add_path_precheck() breaks path
consideration earlier.

Please comment on.

Or bottom of make_join_rel(). IMO build_join_rel() is responsible for just
building (or searching from a list) a RelOptInfo for given relids. After that
make_join_rel() calls add_paths_to_joinrel() with appropriate arguments per join
type to generate actual Paths implements the join. make_join_rel() is called
only once for particular relid combination, and there SpecialJoinInfo and
restrictlist (conditions specified in JOIN-ON and WHERE), so it seems promising
for FDW cases.

As long as the caller can know whether build_join_rel() actually constructed
a new RelOptInfo object or not, I think it makes more sense than putting a hook
within make_join_rel().

Though I’m not sure that it also fits custom join provider’s requirements.

A join replaced by a CSP has two scenarios. The first one just implements
alternative logic for a built-in join and takes the underlying inner/outer
nodes, so its hook is located in add_paths_to_joinrel(), like the built-in
join logic.
The second one tries to replace an entire join sub-tree with a materialized
view (for example), like the FDW remote join case. So, it has to be hooked
near the location of GetForeignJoinPaths().
In the second scenario, the CSP does not have a private field in RelOptInfo,
so it may not be obvious how to check whether the given joinrel exactly
matches a particular materialized view or other cache.

At this moment, what I'm interested in is the first scenario, so the second
case is not a high priority for me, at least.

Thanks.
--
NEC OSS Promotion Center / PG-Strom Project
KaiGai Kohei <kaigai@ak.jp.nec.com>

#10

Kouhei Kaigai

kaigai@ak.jp.nec.com

almost 11 years ago

In reply to: Shigeru HANADA (#8)

On 2015/03/25 19:09, Kouhei Kaigai <kaigai@ak.jp.nec.com> wrote:

On Wed, Mar 25, 2015 at 3:14 PM, Shigeru HANADA <shigeru.hanada@gmail.com> wrote:
Or bottom of make_join_rel(). IMO build_join_rel() is responsible for
just building (or searching from a list) a RelOptInfo for given relids. After
that make_join_rel() calls add_paths_to_joinrel() with appropriate arguments per
join type to generate actual Paths implements the join. make_join_rel() is
called only once for particular relid combination, and there SpecialJoinInfo and
restrictlist (conditions specified in JOIN-ON and WHERE), so it seems promising
for FDW cases.

I like that idea, but I think we will have complex hook signature, it won't remain
as simple as hook (root, joinrel).

In this case, GetForeignJoinPaths() will take root, joinrel, rel1, rel2,
sjinfo and restrictlist.
It is not too simple, but not complicated signature.

Even if we reconstruct rel1 and rel2 using sjinfo, we also need to compute
restrictlist using build_joinrel_restrictlist() again. It is a static function
in relnode.c. So, I don't think either of them has definitive advantage from
the standpoint of simplicity.

The bottom of make_join_rel() seems good from the viewpoint of information, but
it is called multiple times for join combinations which are essentially identical,
for INNER JOIN case like this:

fdw=# explain select * from pgbench_branches b join pgbench_tellers t on t.bid
= b.bid join pgbench_accounts a on a.bid = b.bid and a.bid = t.bid;
INFO: postgresGetForeignJoinPaths() 1x2
INFO: postgresGetForeignJoinPaths() 1x4
INFO: postgresGetForeignJoinPaths() 2x4
INFO: standard_join_search() old hook point
INFO: standard_join_search() old hook point
INFO: standard_join_search() old hook point
INFO: postgresGetForeignJoinPaths() 0x4
INFO: postgresGetForeignJoinPaths() 0x2
INFO: postgresGetForeignJoinPaths() 0x1
INFO: standard_join_search() old hook point
QUERY PLAN
---------------------------------------------------------
Foreign Scan (cost=100.00..102.11 rows=211 width=1068)
(1 row)

Here I’ve put probe point in the beginning of GetForeignJoinPaths handler and just
before set_cheapest() call in standard_join_search() as “old hook point”. In
this example 1, 2, and 4 are base relations, and in the join level 3 planner calls
GetForeignJoinPaths() three times for the combinations:

1) (1x2)x4
2) (1x4)x2
3) (2x4)x1

Tom’s suggestion is aiming at providing a chance to consider join push-down in
more abstract level, IIUC. So it would be good to call handler only once for
that case, for flattened combination (1x2x4).

Hum, how about skipping calling handler (or hook) if the joinrel was found by
find_join_rel()? At least it suppress redundant call for different join orders,
and handler can determine whether the combination can be flattened by checking
that all RelOptInfo with RELOPT_JOINREL under joinrel has JOIN_INNER as jointype.

The reason the FDW handler was called multiple times in your example is that
your modified make_join_rel() does not check whether build_join_rel()
actually built a new RelOptInfo or just returned a cached reference, isn't it?

If so, I'm inclined toward your proposition.
A new "bool *found" argument to build_join_rel() reduces the number of
FDW handler calls, while keeping reasonable information to build the
remote-join query.
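
A minimal sketch of that idea, with a toy cache standing in for root->join_rel_hash. Everything here (ToyJoinRel, toy_build_join_rel, toy_make_join_rel, the call counter) is a hypothetical stand-in rather than the planner's code; relid sets are plain bitmasks.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_MAX_RELIDS 64       /* relid bitmasks must stay below this */

typedef struct ToyJoinRel
{
    unsigned    relids;
    bool        in_use;
} ToyJoinRel;

ToyJoinRel  toy_cache[TOY_MAX_RELIDS];  /* toy replacement for the hash */
int         toy_handler_calls = 0;      /* counts FDW handler invocations */

/* Look up or create the join rel; report through 'found' which happened. */
ToyJoinRel *
toy_build_join_rel(unsigned relids, bool *found)
{
    assert(relids < TOY_MAX_RELIDS);

    ToyJoinRel *rel = &toy_cache[relids];

    if (found != NULL)
        *found = rel->in_use;
    if (!rel->in_use)
    {
        rel->relids = relids;
        rel->in_use = true;
    }
    return rel;
}

/* Caller side: invoke the handler only for a freshly built join rel. */
void
toy_make_join_rel(unsigned relids1, unsigned relids2)
{
    bool        found;

    (void) toy_build_join_rel(relids1 | relids2, &found);
    if (!found)
        toy_handler_calls++;    /* stands in for GetForeignJoinPaths() */
    /* built-in join paths would still be considered for every pair */
}
```

Joining base relations 1, 2 and 4 pairwise and then completing the three-way join by each of its three routes gives four handler calls in this model: three at level two and a single one at level three.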

Thanks,
--
NEC OSS Promotion Center / PG-Strom Project
KaiGai Kohei <kaigai@ak.jp.nec.com>

#11

Shigeru HANADA

shigeru.hanada@gmail.com

almost 11 years ago

In reply to: Ashutosh Bapat (#6)

On 2015/03/25 18:53, Ashutosh Bapat <ashutosh.bapat@enterprisedb.com> wrote:

On Wed, Mar 25, 2015 at 3:14 PM, Shigeru HANADA <shigeru.hanada@gmail.com> wrote:

Or bottom of make_join_rel(). IMO build_join_rel() is responsible for just building (or searching from a list) a RelOptInfo for given relids. After that make_join_rel() calls add_paths_to_joinrel() with appropriate arguments per join type to generate actual Paths implements the join. make_join_rel() is called only once for particular relid combination, and there SpecialJoinInfo and restrictlist (conditions specified in JOIN-ON and WHERE), so it seems promising for FDW cases.

I like that idea, but I think we will have complex hook signature, it won't remain as simple as hook (root, joinrel).

Signature of the hook (or the FDW API handler) would be like this:

typedef void (*GetForeignJoinPaths_function) (PlannerInfo *root,
                                              RelOptInfo *joinrel,
                                              RelOptInfo *outerrel,
                                              RelOptInfo *innerrel,
                                              JoinType jointype,
                                              SpecialJoinInfo *sjinfo,
                                              List *restrictlist);

This is very similar to add_paths_to_joinrel(), but lacks semifactors and extra_lateral_rels. semifactors can be obtained with compute_semi_anti_join_factors(), and extra_lateral_rels can be constructed from root->placeholder_list as add_paths_to_joinrel() does.

From the viewpoint of postgres_fdw, jointype and restrictlist are necessary to generate the SELECT statement, so most of the work done in make_join_rel() would have to be repeated if the signature were hook(root, joinrel). sjinfo will be necessary for supporting SEMI/ANTI joins, but that is currently out of scope for postgres_fdw.

I guess that other FDWs require at least jointype and restrictlist.
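
To illustrate why jointype and the join conditions are needed, here is a rough sketch of assembling one FROM-clause item for the remote query. It is not postgres_fdw's deparse code; the enum, the function names and the use of pre-rendered strings for relations and conditions are all assumptions made for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

typedef enum ToyJoinType
{
    TOY_JOIN_INNER,
    TOY_JOIN_LEFT,
    TOY_JOIN_RIGHT,
    TOY_JOIN_FULL
} ToyJoinType;

/* Map a join type to the SQL keyword used in the remote query. */
const char *
toy_join_keyword(ToyJoinType jointype)
{
    switch (jointype)
    {
        case TOY_JOIN_INNER:
            return "INNER JOIN";
        case TOY_JOIN_LEFT:
            return "LEFT JOIN";
        case TOY_JOIN_RIGHT:
            return "RIGHT JOIN";
        case TOY_JOIN_FULL:
            return "FULL JOIN";
    }
    return "INNER JOIN";        /* not reached for valid input */
}

/*
 * Write "(outer <JOIN> inner ON (cond))" into buf.  'outer' and 'inner'
 * may themselves be parenthesized joins, so nested joins compose.
 * Returns what snprintf() returns.
 */
int
toy_deparse_join(char *buf, size_t buflen,
                 const char *outer, const char *inner,
                 ToyJoinType jointype, const char *cond)
{
    assert(buf != NULL && outer != NULL && inner != NULL && cond != NULL);

    return snprintf(buf, buflen, "(%s %s %s ON (%s))",
                    outer, toy_join_keyword(jointype), inner, cond);
}
```

Without the join type the handler could not choose between the keywords, and without the restrictlist it would have nothing to put in the ON clause; both would have to be recomputed from (root, joinrel) alone.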

—
Shigeru HANADA

#12

Shigeru HANADA

shigeru.hanada@gmail.com

almost 11 years ago

In reply to: Kouhei Kaigai (#10)

On 2015/03/25 19:47, Kouhei Kaigai <kaigai@ak.jp.nec.com> wrote:

The reason why FDW handler was called multiple times on your example is,
your modified make_join_rel() does not check whether build_join_rel()
actually build a new RelOptInfo, or just a cache reference, doesn't it?

Yep. After that change, the call count looks like this:

fdw=# explain select * from pgbench_branches b join pgbench_tellers t on t.bid = b.bid join pgbench_accounts a on a.bid = b.bid and a.bid = t.bid;
INFO: postgresGetForeignJoinPaths() 1x2
INFO: postgresGetForeignJoinPaths() 1x4
INFO: postgresGetForeignJoinPaths() 2x4
INFO: standard_join_search() old hook point
INFO: standard_join_search() old hook point
INFO: standard_join_search() old hook point
INFO: postgresGetForeignJoinPaths() 0x4
INFO: standard_join_search() old hook point
QUERY PLAN
---------------------------------------------------------
Foreign Scan (cost=100.00..102.11 rows=211 width=1068)
(1 row)

fdw=#

If so, I'm inclined to your proposition.
A new "bool *found" argument of build_join_rel() makes reduce number of
FDW handler call, with keeping reasonable information to build remote-
join query.

Another idea is to pass “found” as a parameter to the FDW handler, and let the FDW decide whether to skip or not. Some FDWs (and some CSPs?) might want to be aware of the join combination.

—
Shigeru HANADA

#13

Kouhei Kaigai

kaigai@ak.jp.nec.com

almost 11 years ago

In reply to: Shigeru HANADA (#12)

On 2015/03/25 19:47, Kouhei Kaigai <kaigai@ak.jp.nec.com> wrote:

The reason why FDW handler was called multiple times on your example is,
your modified make_join_rel() does not check whether build_join_rel()
actually build a new RelOptInfo, or just a cache reference, doesn't it?

Yep. After that change calling count looks like this:

fdw=# explain select * from pgbench_branches b join pgbench_tellers t on t.bid
= b.bid join pgbench_accounts a on a.bid = b.bid and a.bid = t.bid;
INFO: postgresGetForeignJoinPaths() 1x2
INFO: postgresGetForeignJoinPaths() 1x4
INFO: postgresGetForeignJoinPaths() 2x4
INFO: standard_join_search() old hook point
INFO: standard_join_search() old hook point
INFO: standard_join_search() old hook point
INFO: postgresGetForeignJoinPaths() 0x4
INFO: standard_join_search() old hook point
QUERY PLAN
---------------------------------------------------------
Foreign Scan (cost=100.00..102.11 rows=211 width=1068)
(1 row)

fdw=#

If so, I'm inclined to your proposition.
A new "bool *found" argument of build_join_rel() makes reduce number of
FDW handler call, with keeping reasonable information to build remote-
join query.

Another idea is to pass “found” as parameter to FDW handler, and let FDW to decide
to skip or not. Some of FDWs (and some of CSP?) might want to be conscious of
join combination.

I think it does not match the concept we stand on.
Unlike a CSP, an FDW intends to replace the entire join sub-tree that is
represented by a particular joinrel, regardless of the order in which
the joinrel is constructed from multiple baserels.
So, it is sufficient to call GetForeignJoinPaths() once, when a joinrel
is constructed, isn't it?

Thanks,
--
NEC OSS Promotion Center / PG-Strom Project
KaiGai Kohei <kaigai@ak.jp.nec.com>

#14

Kouhei Kaigai

kaigai@ak.jp.nec.com

almost 11 years ago

In reply to: Shigeru HANADA (#11)

1 attachment(s)

Or bottom of make_join_rel(). IMO build_join_rel() is responsible for just
building (or searching from a list) a RelOptInfo for given relids. After that
make_join_rel() calls add_paths_to_joinrel() with appropriate arguments per join
type to generate actual Paths implements the join. make_join_rel() is called
only once for particular relid combination, and there SpecialJoinInfo and
restrictlist (conditions specified in JOIN-ON and WHERE), so it seems promising
for FDW cases.

I like that idea, but I think we will have complex hook signature, it won't
remain as simple as hook (root, joinrel).

Signature of the hook (or the FDW API handler) would be like this:

typedef void (*GetForeignJoinPaths_function) (PlannerInfo *root,
                                              RelOptInfo *joinrel,
                                              RelOptInfo *outerrel,
                                              RelOptInfo *innerrel,
                                              JoinType jointype,
                                              SpecialJoinInfo *sjinfo,
                                              List *restrictlist);

This is very similar to add_paths_to_joinrel(), but lacks semifactors and
extra_lateral_rels. semifactors can be obtained with
compute_semi_anti_join_factors(), and extra_lateral_rels can be constructed
from root->placeholder_list as add_paths_to_joinrel() does.

From the viewpoint of postgres_fdw, jointype and restrictlist is necessary to
generate SELECT statement, so it would require most work done in make_join_rel
again if the signature was hook(root, joinrel). sjinfo will be necessary for
supporting SEMI/ANTI joins, but currently it is not in the scope of postgres_fdw.

I guess that other FDWs require at least jointype and restrictlist.

The attached patch adds a GetForeignJoinPaths call in make_join_rel(), made
only when 'joinrel' is actually built and both child relations are managed by
the same FDW driver, prior to any other built-in join paths.
I adjusted the hook definition a little bit, because jointype can be reproduced
using SpecialJoinInfo. Right?
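
One way to picture reproducing jointype from the SpecialJoinInfo is sketched below. It rests on two assumptions that are not stated in this message: that the join type is stored in the SpecialJoinInfo, and that a LEFT JOIN looks like a RIGHT JOIN when the handler's outer relation does not cover the syntactic left-hand side. ToySpecialJoinInfo and toy_effective_jointype are hypothetical names, with relid sets modeled as bitmasks.

```c
#include <assert.h>
#include <stddef.h>

typedef enum ToyJoinType
{
    TOY_JOIN_INNER,
    TOY_JOIN_LEFT,
    TOY_JOIN_RIGHT,
    TOY_JOIN_FULL
} ToyJoinType;

/* Hypothetical, reduced stand-in for SpecialJoinInfo. */
typedef struct ToySpecialJoinInfo
{
    unsigned    min_lefthand;   /* relids required on the left side */
    unsigned    min_righthand;  /* relids required on the right side */
    ToyJoinType jointype;       /* join type as written in the query */
} ToySpecialJoinInfo;

/*
 * Derive the join type as seen from (outer, inner).  If the outer side
 * does not contain the whole syntactic left-hand side, the sides are
 * swapped and a LEFT JOIN must be read as a RIGHT JOIN.
 */
ToyJoinType
toy_effective_jointype(const ToySpecialJoinInfo *sjinfo, unsigned outer_relids)
{
    assert(sjinfo != NULL);

    if (sjinfo->jointype == TOY_JOIN_LEFT &&
        (sjinfo->min_lefthand & ~outer_relids) != 0)
        return TOY_JOIN_RIGHT;
    return sjinfo->jointype;
}
```

So the join type itself can be recovered, but the handler also has to look at which relation it was given as the outer side before it writes LEFT or RIGHT into the remote query.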

This will probably solve the original concern about multiple calls of the FDW
handler when it tries to replace an entire join subtree with a foreign scan
on the result of a remote join query.

What do you think?

Thanks,
--
NEC OSS Promotion Center / PG-Strom Project
KaiGai Kohei <kaigai@ak.jp.nec.com>

Attachments:

pgsql-v9.5-custom-join.v11.patch (application/octet-stream)

 doc/src/sgml/custom-scan.sgml           | 43 ++++++++++++++++++
 doc/src/sgml/fdwhandler.sgml            | 51 +++++++++++++++++++++
 src/backend/commands/explain.c          | 15 +++++--
 src/backend/executor/execScan.c         |  4 ++
 src/backend/executor/nodeCustom.c       | 38 ++++++++++++----
 src/backend/executor/nodeForeignscan.c  | 34 +++++++++-----
 src/backend/foreign/foreign.c           | 31 ++++++++++---
 src/backend/nodes/bitmapset.c           | 57 +++++++++++++++++++++++
 src/backend/nodes/copyfuncs.c           |  5 +++
 src/backend/nodes/outfuncs.c            |  5 +++
 src/backend/optimizer/path/allpaths.c   |  1 -
 src/backend/optimizer/path/joinpath.c   | 13 ++++++
 src/backend/optimizer/path/joinrels.c   | 21 ++++++++-
 src/backend/optimizer/plan/createplan.c | 80 ++++++++++++++++++++++++++-------
 src/backend/optimizer/plan/setrefs.c    | 64 ++++++++++++++++++++++++++
 src/backend/optimizer/util/plancat.c    |  7 ++-
 src/backend/optimizer/util/relnode.c    | 22 ++++++++-
 src/backend/utils/adt/ruleutils.c       |  4 ++
 src/include/foreign/fdwapi.h            | 12 +++++
 src/include/nodes/bitmapset.h           |  1 +
 src/include/nodes/plannodes.h           | 24 +++++++---
 src/include/nodes/relation.h            |  2 +
 src/include/optimizer/pathnode.h        |  3 +-
 src/include/optimizer/paths.h           | 13 ++++++
 src/include/optimizer/planmain.h        |  1 +
 25 files changed, 499 insertions(+), 52 deletions(-)

diff --git a/doc/src/sgml/custom-scan.sgml b/doc/src/sgml/custom-scan.sgml
index 8a4a3df..b1400ae 100644
--- a/doc/src/sgml/custom-scan.sgml
+++ b/doc/src/sgml/custom-scan.sgml
@@ -48,6 +48,27 @@ extern PGDLLIMPORT set_rel_pathlist_hook_type set_rel_pathlist_hook;
   </para>
 
   <para>
+   A custom scan provider will be also able to add paths by setting the
+   following hook, to replace built-in join paths by custom-scan that
+   performs as if a scan on preliminary joined relations, which is called
+   after the core code has generated what it believes to be the complete
+   and correct set of access paths for the join.
+<programlisting>
+typedef void (*set_join_pathlist_hook_type) (PlannerInfo *root,
+                                             RelOptInfo *joinrel,
+                                             RelOptInfo *outerrel,
+                                             RelOptInfo *innerrel,
+                                             List *restrictlist,
+                                             JoinType jointype,
+                                             SpecialJoinInfo *sjinfo,
+                                             SemiAntiJoinFactors *semifactors,
+                                             Relids param_source_rels,
+                                             Relids extra_lateral_rels);
+extern PGDLLIMPORT set_join_pathlist_hook_type set_join_pathlist_hook;
+</programlisting>
+  </para>
+
+  <para>
     Although this hook function can be used to examine, modify, or remove
     paths generated by the core system, a custom scan provider will typically
     confine itself to generating <structname>CustomPath</> objects and adding
@@ -124,7 +145,9 @@ typedef struct CustomScan
     Scan      scan;
     uint32    flags;
     List     *custom_exprs;
+    List     *custom_ps_tlist;
     List     *custom_private;
+    List     *custom_relids;
     const CustomScanMethods *methods;
 } CustomScan;
 </programlisting>
@@ -141,10 +164,30 @@ typedef struct CustomScan
     is only used by the custom scan provider itself.  Plan trees must be able
     to be duplicated using <function>copyObject</>, so all the data stored
     within these two fields must consist of nodes that function can handle.
+    <literal>custom_relids</> is set by the backend, thus custom-scan provider
+    does not need to touch, to track underlying relations represented by this
+    custom-scan node.
     <structfield>methods</> must point to a (usually statically allocated)
     object implementing the required custom scan methods, which are further
     detailed below.
   </para>
+  <para>
+   In case when <structname>CustomScan</> replaced built-in join paths,
+   custom-scan provider must have two characteristic setup.
+   The first one is zero on the <structfield>scan.scanrelid</>, which
+   should be usually an index of range-tables. It informs the backend
+   this <structname>CustomScan</> node is not associated with a particular
+   table. The second one is valid list of <structname>TargetEntry</> on
+   the <structfield>custom_ps_tlist</>. A <structname>CustomScan</> node
+   looks to the backend like a scan as literal, but on a relation which is
+   the result of relations join. It means we cannot construct a tuple
+   descriptor based on table definition, thus custom-scan provider must
+   introduce the expected record-type of the tuples.
+   Tuple-descriptor of scan-slot shall be constructed based on the
+   <structfield>custom_ps_tlist</>, and assigned on executor initialization.
+   Also, referenced by <command>EXPLAIN</> to solve name of the underlying
+   columns and relations.
+  </para>
 
   <sect2 id="custom-scan-plan-callbacks">
    <title>Custom Scan Callbacks</title>
diff --git a/doc/src/sgml/fdwhandler.sgml b/doc/src/sgml/fdwhandler.sgml
index c1daa4b..54ba45f 100644
--- a/doc/src/sgml/fdwhandler.sgml
+++ b/doc/src/sgml/fdwhandler.sgml
@@ -598,6 +598,57 @@ IsForeignRelUpdatable (Relation rel);
 
    </sect2>
 
+   <sect2>
+    <title>FDW Routines for remote join</title>
+    <para>
+<programlisting>
+void
+GetForeignJoinPaths(PlannerInfo *root,
+                    RelOptInfo *joinrel,
+                    RelOptInfo *outerrel,
+                    RelOptInfo *innerrel,
+                    SpecialJoinInfo *sjinfo,
+                    List *restrictlist);
+</programlisting>
+     Create possible access paths for a join of two foreign tables or
+     joined relations, but both of them needs to be managed with same
+     FDW driver.
+     This optional function is called during query planning.
+    </para>
+    <para>
+     This function allows FDW driver to add <literal>ForeignScan</> path
+     towards the supplied <literal>joinrel</>. From the standpoint of
+     query planner, it looks like scan-node is added for join-relation.
+     It means, <literal>ForeignScan</> path added instead of the built-in
+     local join logic has to generate tuples as if it scans on a joined
+     and materialized relations.
+    </para>
+    <para>
+     Usually, we expect the FDW driver to issue a remote query that joins
+     the tables on the remote side and then to fetch the joined result
+     on the local side.
+     Unlike a simple table scan, the slot descriptor of the joined
+     relation is determined on the fly, so its definition cannot be
+     looked up in the system catalog.
+     The FDW driver is therefore responsible for telling the query planner
+     the expected form of the joined relation. When a <literal>ForeignScan</>
+     replaces a join, <literal>scanrelid</> of the generated plan
+     node shall be zero, to mark that this <literal>ForeignScan</> node is
+     not associated with a particular foreign table.
+     The driver also needs to construct a pseudo-scan tlist
+     (<literal>fdw_ps_tlist</>) to describe the expected tuple definition.
+    </para>
+    <para>
+     When <literal>scanrelid</> is zero, the executor initializes the scan
+     slot according to <literal>fdw_ps_tlist</>, excluding junk entries.
+     This list is also used to resolve the names of the original relations
+     and columns, so the FDW can chain expression nodes that are not
+     actually evaluated on the local side, such as a join clause executed
+     on the remote side; the target entries for such expressions must have
+     <literal>resjunk=true</>.
+    </para>
+   </sect2>
+
    <sect2 id="fdw-callbacks-explain">
     <title>FDW Routines for <command>EXPLAIN</></title>
 
diff --git a/src/backend/commands/explain.c b/src/backend/commands/explain.c
index a951c55..8892dca 100644
--- a/src/backend/commands/explain.c
+++ b/src/backend/commands/explain.c
@@ -730,11 +730,17 @@ ExplainPreScanNode(PlanState *planstate, Bitmapset **rels_used)
 		case T_ValuesScan:
 		case T_CteScan:
 		case T_WorkTableScan:
-		case T_ForeignScan:
-		case T_CustomScan:
 			*rels_used = bms_add_member(*rels_used,
 										((Scan *) plan)->scanrelid);
 			break;
+		case T_ForeignScan:
+			*rels_used = bms_add_members(*rels_used,
+										 ((ForeignScan *) plan)->fdw_relids);
+			break;
+		case T_CustomScan:
+			*rels_used = bms_add_members(*rels_used,
+										 ((CustomScan *) plan)->custom_relids);
+			break;
 		case T_ModifyTable:
 			*rels_used = bms_add_member(*rels_used,
 									((ModifyTable *) plan)->nominalRelation);
@@ -1072,9 +1078,12 @@ ExplainNode(PlanState *planstate, List *ancestors,
 		case T_ValuesScan:
 		case T_CteScan:
 		case T_WorkTableScan:
+			ExplainScanTarget((Scan *) plan, es);
+			break;
 		case T_ForeignScan:
 		case T_CustomScan:
-			ExplainScanTarget((Scan *) plan, es);
+			if (((Scan *) plan)->scanrelid > 0)
+				ExplainScanTarget((Scan *) plan, es);
 			break;
 		case T_IndexScan:
 			{
diff --git a/src/backend/executor/execScan.c b/src/backend/executor/execScan.c
index 3f0d809..2f18a8a 100644
--- a/src/backend/executor/execScan.c
+++ b/src/backend/executor/execScan.c
@@ -251,6 +251,10 @@ ExecAssignScanProjectionInfo(ScanState *node)
 	/* Vars in an index-only scan's tlist should be INDEX_VAR */
 	if (IsA(scan, IndexOnlyScan))
 		varno = INDEX_VAR;
+	/* Vars of a foreign/custom scan on a pseudo relation are INDEX_VAR too */
+	else if (scan->scanrelid == 0 &&
+			 (IsA(scan, ForeignScan) || IsA(scan, CustomScan)))
+		varno = INDEX_VAR;
 	else
 		varno = scan->scanrelid;
 
diff --git a/src/backend/executor/nodeCustom.c b/src/backend/executor/nodeCustom.c
index b07932b..2344129 100644
--- a/src/backend/executor/nodeCustom.c
+++ b/src/backend/executor/nodeCustom.c
@@ -23,6 +23,7 @@ CustomScanState *
 ExecInitCustomScan(CustomScan *cscan, EState *estate, int eflags)
 {
 	CustomScanState    *css;
+	Index				scan_relid = cscan->scan.scanrelid;
 	Relation			scan_rel;
 
 	/* populate a CustomScanState according to the CustomScan */
@@ -48,12 +49,31 @@ ExecInitCustomScan(CustomScan *cscan, EState *estate, int eflags)
 	ExecInitScanTupleSlot(estate, &css->ss);
 	ExecInitResultTupleSlot(estate, &css->ss.ps);
 
-	/* initialize scan relation */
-	scan_rel = ExecOpenScanRelation(estate, cscan->scan.scanrelid, eflags);
-	css->ss.ss_currentRelation = scan_rel;
-	css->ss.ss_currentScanDesc = NULL;	/* set by provider */
-	ExecAssignScanType(&css->ss, RelationGetDescr(scan_rel));
-
+	/*
+	 * If this custom scan is on an actual relation, open the base relation,
+	 * acquire the appropriate lock on it, and get the scan type from the
+	 * relation descriptor.
+	 *
+	 * On the other hand, a custom scan may scan a pseudo relation, which
+	 * is usually the result set of a join computed by an external
+	 * computing resource or the like. In that case the scan type comes
+	 * from the pseudo-scan target list, which must be set by the
+	 * custom-scan provider.
+	 */
+	if (scan_relid > 0)
+	{
+		scan_rel = ExecOpenScanRelation(estate, scan_relid, eflags);
+		css->ss.ss_currentRelation = scan_rel;
+		css->ss.ss_currentScanDesc = NULL;	/* set by provider */
+		ExecAssignScanType(&css->ss, RelationGetDescr(scan_rel));
+	}
+	else
+	{
+		TupleDesc	ps_tupdesc;
+
+		ps_tupdesc = ExecCleanTypeFromTL(cscan->custom_ps_tlist, false);
+		ExecAssignScanType(&css->ss, ps_tupdesc);
+	}
 	css->ss.ps.ps_TupFromTlist = false;
 
 	/*
@@ -89,11 +109,11 @@ ExecEndCustomScan(CustomScanState *node)
 
 	/* Clean out the tuple table */
 	ExecClearTuple(node->ss.ps.ps_ResultTupleSlot);
-	if (node->ss.ss_ScanTupleSlot)
-		ExecClearTuple(node->ss.ss_ScanTupleSlot);
+	ExecClearTuple(node->ss.ss_ScanTupleSlot);
 
 	/* Close the heap relation */
-	ExecCloseScanRelation(node->ss.ss_currentRelation);
+	if (node->ss.ss_currentRelation)
+		ExecCloseScanRelation(node->ss.ss_currentRelation);
 }
 
 void
diff --git a/src/backend/executor/nodeForeignscan.c b/src/backend/executor/nodeForeignscan.c
index 7399053..542d176 100644
--- a/src/backend/executor/nodeForeignscan.c
+++ b/src/backend/executor/nodeForeignscan.c
@@ -102,6 +102,7 @@ ForeignScanState *
 ExecInitForeignScan(ForeignScan *node, EState *estate, int eflags)
 {
 	ForeignScanState *scanstate;
+	Index		scanrelid = node->scan.scanrelid;
 	Relation	currentRelation;
 	FdwRoutine *fdwroutine;
 
@@ -141,16 +142,28 @@ ExecInitForeignScan(ForeignScan *node, EState *estate, int eflags)
 	ExecInitScanTupleSlot(estate, &scanstate->ss);
 
 	/*
-	 * open the base relation and acquire appropriate lock on it.
+	 * If this foreign scan is on an actual foreign table, open the base
+	 * relation, acquire the appropriate lock on it, and get the scan type
+	 * from the relation descriptor.
+	 *
+	 * On the other hand, a foreign scan may scan a pseudo relation, which
+	 * is usually the result set of a remote join. In that case the scan
+	 * type comes from the pseudo-scan target list, which must be set by
+	 * the FDW driver.
 	 */
-	currentRelation = ExecOpenScanRelation(estate, node->scan.scanrelid, eflags);
-	scanstate->ss.ss_currentRelation = currentRelation;
+	if (scanrelid > 0)
+	{
+		currentRelation = ExecOpenScanRelation(estate, scanrelid, eflags);
+		scanstate->ss.ss_currentRelation = currentRelation;
+		ExecAssignScanType(&scanstate->ss, RelationGetDescr(currentRelation));
+	}
+	else
+	{
+		TupleDesc	ps_tupdesc;
 
-	/*
-	 * get the scan type from the relation descriptor.  (XXX at some point we
-	 * might want to let the FDW editorialize on the scan tupdesc.)
-	 */
-	ExecAssignScanType(&scanstate->ss, RelationGetDescr(currentRelation));
+		ps_tupdesc = ExecCleanTypeFromTL(node->fdw_ps_tlist, false);
+		ExecAssignScanType(&scanstate->ss, ps_tupdesc);
+	}
 
 	/*
 	 * Initialize result tuple type and projection info.
@@ -161,7 +174,7 @@ ExecInitForeignScan(ForeignScan *node, EState *estate, int eflags)
 	/*
 	 * Acquire function pointers from the FDW's handler, and init fdw_state.
 	 */
-	fdwroutine = GetFdwRoutineForRelation(currentRelation, true);
+	fdwroutine = GetFdwRoutine(node->fdw_handler);
 	scanstate->fdwroutine = fdwroutine;
 	scanstate->fdw_state = NULL;
 
@@ -193,7 +206,8 @@ ExecEndForeignScan(ForeignScanState *node)
 	ExecClearTuple(node->ss.ss_ScanTupleSlot);
 
 	/* close the relation. */
-	ExecCloseScanRelation(node->ss.ss_currentRelation);
+	if (node->ss.ss_currentRelation)
+		ExecCloseScanRelation(node->ss.ss_currentRelation);
 }
 
 /* ----------------------------------------------------------------
diff --git a/src/backend/foreign/foreign.c b/src/backend/foreign/foreign.c
index cbe8b78..1901749 100644
--- a/src/backend/foreign/foreign.c
+++ b/src/backend/foreign/foreign.c
@@ -304,11 +304,11 @@ GetFdwRoutine(Oid fdwhandler)
 
 
 /*
- * GetFdwRoutineByRelId - look up the handler of the foreign-data wrapper
- * for the given foreign table, and retrieve its FdwRoutine struct.
+ * GetFdwHandlerByRelId - look up the handler of the foreign-data wrapper
+ * for the given foreign table
  */
-FdwRoutine *
-GetFdwRoutineByRelId(Oid relid)
+static Oid
+GetFdwHandlerByRelId(Oid relid)
 {
 	HeapTuple	tp;
 	Form_pg_foreign_data_wrapper fdwform;
@@ -350,7 +350,18 @@ GetFdwRoutineByRelId(Oid relid)
 
 	ReleaseSysCache(tp);
 
-	/* And finally, call the handler function. */
+	return fdwhandler;
+}
+
+/*
+ * GetFdwRoutineByRelId - look up the handler of the foreign-data wrapper
+ * for the given foreign table, and retrieve its FdwRoutine struct.
+ */
+FdwRoutine *
+GetFdwRoutineByRelId(Oid relid)
+{
+	Oid			fdwhandler = GetFdwHandlerByRelId(relid);
+
 	return GetFdwRoutine(fdwhandler);
 }
 
@@ -398,6 +409,16 @@ GetFdwRoutineForRelation(Relation relation, bool makecopy)
 	return relation->rd_fdwroutine;
 }
 
+/*
+ * GetFdwHandlerForRelation
+ *
+ * returns OID of FDW handler which is associated with the given relation.
+ */
+Oid
+GetFdwHandlerForRelation(Relation relation)
+{
+	return GetFdwHandlerByRelId(RelationGetRelid(relation));
+}
 
 /*
  * IsImportableForeignTable - filter table names for IMPORT FOREIGN SCHEMA
diff --git a/src/backend/nodes/bitmapset.c b/src/backend/nodes/bitmapset.c
index a9c3b4b..4dc3286 100644
--- a/src/backend/nodes/bitmapset.c
+++ b/src/backend/nodes/bitmapset.c
@@ -301,6 +301,63 @@ bms_difference(const Bitmapset *a, const Bitmapset *b)
 }
 
 /*
+ * bms_shift_members - move all the bits by shift
+ */
+Bitmapset *
+bms_shift_members(const Bitmapset *a, int shift)
+{
+	Bitmapset  *b;
+	bitmapword	h_word;
+	bitmapword	l_word;
+	int			nwords;
+	int			w_shift;
+	int			b_shift;
+	int			i, j;
+
+	/* fast path: the result is obviously empty */
+	if (a == NULL || a->nwords * BITS_PER_BITMAPWORD + shift <= 0)
+		return NULL;
+	/* nothing to shift; just return a copy */
+	if (shift == 0)
+		return bms_copy(a);
+
+	nwords = (a->nwords * BITS_PER_BITMAPWORD + shift +
+			  BITS_PER_BITMAPWORD - 1) / BITS_PER_BITMAPWORD;
+	b = palloc(BITMAPSET_SIZE(nwords));
+	b->nwords = nwords;
+
+	if (shift > 0)
+	{
+		/* Left shift */
+		w_shift = WORDNUM(shift);
+		b_shift = BITNUM(shift);
+
+		for (i=0, j=-w_shift; i < b->nwords; i++, j++)
+		{
+			h_word = (j >= 0   && j   < a->nwords ? a->words[j] : 0);
+			l_word = (j-1 >= 0 && j-1 < a->nwords ? a->words[j-1] : 0);
+			b->words[i] = ((h_word << b_shift) | (b_shift == 0 ? 0 :
+						   (l_word >> (BITS_PER_BITMAPWORD - b_shift))));
+		}
+	}
+	else
+	{
+		/* Right shift */
+		w_shift = WORDNUM(-shift);
+		b_shift = BITNUM(-shift);
+
+		for (i=0, j=w_shift; i < b->nwords; i++, j++)
+		{
+			h_word = (j+1 < a->nwords ? a->words[j+1] : 0);
+			l_word = (j < a->nwords ? a->words[j] : 0);
+			b->words[i] = ((l_word >> b_shift) | (b_shift == 0 ? 0 :
+						   (h_word << (BITS_PER_BITMAPWORD - b_shift))));
+		}
+	}
+	return b;
+}
+
+/*
  * bms_is_subset - is A a subset of B?
  */
 bool
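
Note on the bms_shift_members() hunk above: the word-level arithmetic
is easy to get subtly wrong, so here is a minimal standalone sketch of
the upward (positive) shift only. It is not the patch's code: BitSet,
bitset_shift_up() and WORD_BITS are hypothetical names, 32-bit words are
assumed, and malloc/calloc stand in for palloc. The point it illustrates
is that the carry from the lower word has to be skipped when the bit
offset is zero, because a shift by the full word width is undefined in C.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define WORD_BITS 32			/* assumed word width for this sketch */

/* Hypothetical stand-in for a Bitmapset: a plain array of 32-bit words. */
typedef struct
{
	int			nwords;
	uint32_t   *words;
} BitSet;

/*
 * Return a new set in which every member x of 'a' appears as x + shift.
 * Only shift >= 0 is handled here.
 */
static BitSet *
bitset_shift_up(const BitSet *a, int shift)
{
	int			w_shift = shift / WORD_BITS;	/* whole words to move */
	int			b_shift = shift % WORD_BITS;	/* remaining bit offset */
	int			nwords = a->nwords + w_shift + (b_shift > 0 ? 1 : 0);
	BitSet	   *b = malloc(sizeof(BitSet));
	int			i;

	assert(shift >= 0 && b != NULL);
	b->nwords = nwords;
	b->words = calloc(nwords, sizeof(uint32_t));
	assert(b->words != NULL);

	for (i = 0; i < nwords; i++)
	{
		int			j = i - w_shift;	/* source word for the high part */
		uint32_t	cur = (j >= 0 && j < a->nwords) ? a->words[j] : 0;
		uint32_t	prev = (j - 1 >= 0 && j - 1 < a->nwords) ? a->words[j - 1] : 0;

		b->words[i] = cur << b_shift;
		/* carry bits from the lower word; skipped when b_shift == 0 */
		if (b_shift > 0)
			b->words[i] |= prev >> (WORD_BITS - b_shift);
	}
	return b;
}
```

With a one-word input holding members 0 and 31, shifting by 1 moves
member 31 into the second word, and shifting by 32 moves the whole word
up unchanged.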
diff --git a/src/backend/nodes/copyfuncs.c b/src/backend/nodes/copyfuncs.c
index 029761e..61379a7 100644
--- a/src/backend/nodes/copyfuncs.c
+++ b/src/backend/nodes/copyfuncs.c
@@ -592,8 +592,11 @@ _copyForeignScan(const ForeignScan *from)
 	/*
 	 * copy remainder of node
 	 */
+	COPY_SCALAR_FIELD(fdw_handler);
 	COPY_NODE_FIELD(fdw_exprs);
+	COPY_NODE_FIELD(fdw_ps_tlist);
 	COPY_NODE_FIELD(fdw_private);
+	COPY_BITMAPSET_FIELD(fdw_relids);
 	COPY_SCALAR_FIELD(fsSystemCol);
 
 	return newnode;
@@ -617,7 +620,9 @@ _copyCustomScan(const CustomScan *from)
 	 */
 	COPY_SCALAR_FIELD(flags);
 	COPY_NODE_FIELD(custom_exprs);
+	COPY_NODE_FIELD(custom_ps_tlist);
 	COPY_NODE_FIELD(custom_private);
+	COPY_BITMAPSET_FIELD(custom_relids);
 
 	/*
 	 * NOTE: The method field of CustomScan is required to be a pointer to a
diff --git a/src/backend/nodes/outfuncs.c b/src/backend/nodes/outfuncs.c
index 385b289..a178132 100644
--- a/src/backend/nodes/outfuncs.c
+++ b/src/backend/nodes/outfuncs.c
@@ -558,8 +558,11 @@ _outForeignScan(StringInfo str, const ForeignScan *node)
 
 	_outScanInfo(str, (const Scan *) node);
 
+	WRITE_OID_FIELD(fdw_handler);
 	WRITE_NODE_FIELD(fdw_exprs);
+	WRITE_NODE_FIELD(fdw_ps_tlist);
 	WRITE_NODE_FIELD(fdw_private);
+	WRITE_BITMAPSET_FIELD(fdw_relids);
 	WRITE_BOOL_FIELD(fsSystemCol);
 }
 
@@ -572,7 +575,9 @@ _outCustomScan(StringInfo str, const CustomScan *node)
 
 	WRITE_UINT_FIELD(flags);
 	WRITE_NODE_FIELD(custom_exprs);
+	WRITE_NODE_FIELD(custom_ps_tlist);
 	WRITE_NODE_FIELD(custom_private);
+	WRITE_BITMAPSET_FIELD(custom_relids);
 	appendStringInfoString(str, " :methods ");
 	_outToken(str, node->methods->CustomName);
 	if (node->methods->TextOutCustomScan)
diff --git a/src/backend/optimizer/path/allpaths.c b/src/backend/optimizer/path/allpaths.c
index 58d78e6..14872ae 100644
--- a/src/backend/optimizer/path/allpaths.c
+++ b/src/backend/optimizer/path/allpaths.c
@@ -60,7 +60,6 @@ set_rel_pathlist_hook_type set_rel_pathlist_hook = NULL;
 /* Hook for plugins to replace standard_join_search() */
 join_search_hook_type join_search_hook = NULL;
 
-
 static void set_base_rel_sizes(PlannerInfo *root);
 static void set_base_rel_pathlists(PlannerInfo *root);
 static void set_rel_size(PlannerInfo *root, RelOptInfo *rel,
diff --git a/src/backend/optimizer/path/joinpath.c b/src/backend/optimizer/path/joinpath.c
index 1da953f..61f1a78 100644
--- a/src/backend/optimizer/path/joinpath.c
+++ b/src/backend/optimizer/path/joinpath.c
@@ -17,10 +17,13 @@
 #include <math.h>
 
 #include "executor/executor.h"
+#include "foreign/fdwapi.h"
 #include "optimizer/cost.h"
 #include "optimizer/pathnode.h"
 #include "optimizer/paths.h"
 
+/* Hook for plugins to get control in add_paths_to_joinrel() */
+set_join_pathlist_hook_type set_join_pathlist_hook = NULL;
 
 #define PATH_PARAM_BY_REL(path, rel)  \
 	((path)->param_info && bms_overlap(PATH_REQ_OUTER(path), (rel)->relids))
@@ -260,6 +263,16 @@ add_paths_to_joinrel(PlannerInfo *root,
 							 restrictlist, jointype,
 							 sjinfo, &semifactors,
 							 param_source_rels, extra_lateral_rels);
+
+	/*
+	 * 5. Consider paths added by custom-scan providers, or other extensions
+	 * in addition to the built-in paths.
+	 */
+	if (set_join_pathlist_hook)
+		set_join_pathlist_hook(root, joinrel, outerrel, innerrel,
+							   restrictlist, jointype,
+							   sjinfo, &semifactors,
+							   param_source_rels, extra_lateral_rels);
 }
 
 /*
diff --git a/src/backend/optimizer/path/joinrels.c b/src/backend/optimizer/path/joinrels.c
index fe9fd57..b1c7bcb 100644
--- a/src/backend/optimizer/path/joinrels.c
+++ b/src/backend/optimizer/path/joinrels.c
@@ -14,6 +14,7 @@
  */
 #include "postgres.h"
 
+#include "foreign/fdwapi.h"
 #include "optimizer/joininfo.h"
 #include "optimizer/pathnode.h"
 #include "optimizer/paths.h"
@@ -582,6 +583,7 @@ make_join_rel(PlannerInfo *root, RelOptInfo *rel1, RelOptInfo *rel2)
 	SpecialJoinInfo sjinfo_data;
 	RelOptInfo *joinrel;
 	List	   *restrictlist;
+	bool		found;
 
 	/* We should never try to join two overlapping sets of rels. */
 	Assert(!bms_overlap(rel1->relids, rel2->relids));
@@ -635,7 +637,7 @@ make_join_rel(PlannerInfo *root, RelOptInfo *rel1, RelOptInfo *rel2)
 	 * goes with this particular joining.
 	 */
 	joinrel = build_join_rel(root, joinrelids, rel1, rel2, sjinfo,
-							 &restrictlist);
+							 &restrictlist, &found);
 
 	/*
 	 * If we've already proven this join is empty, we needn't consider any
@@ -648,6 +650,23 @@ make_join_rel(PlannerInfo *root, RelOptInfo *rel1, RelOptInfo *rel2)
 	}
 
 	/*
+	 * Before trying the built-in join methods, consider paths that replace
+	 * the entire join sub-tree with a foreign-scan path, when both the
+	 * inner and outer relations are managed by the same FDW driver.
+	 * A remote join is usually expected to be cheaper than a local join on
+	 * top of two foreign scans, so we let the FDW driver add its remote-join
+	 * paths first, so that consideration of local join paths can be cut
+	 * short.
+	 */
+	if (!found &&
+		joinrel->fdwroutine &&
+		joinrel->fdwroutine->GetForeignJoinPaths)
+	{
+		joinrel->fdwroutine->GetForeignJoinPaths(root, joinrel, rel1, rel2,
+												 sjinfo, restrictlist);
+	}
+
+	/*
 	 * Consider paths using each rel as both outer and inner.  Depending on
 	 * the join type, a provably empty outer or inner rel might mean the join
 	 * is provably empty too; in which case throw away any previously computed
diff --git a/src/backend/optimizer/plan/createplan.c b/src/backend/optimizer/plan/createplan.c
index cb69c03..7f86fcb 100644
--- a/src/backend/optimizer/plan/createplan.c
+++ b/src/backend/optimizer/plan/createplan.c
@@ -44,7 +44,6 @@
 #include "utils/lsyscache.h"
 
 
-static Plan *create_plan_recurse(PlannerInfo *root, Path *best_path);
 static Plan *create_scan_plan(PlannerInfo *root, Path *best_path);
 static List *build_path_tlist(PlannerInfo *root, Path *path);
 static bool use_physical_tlist(PlannerInfo *root, RelOptInfo *rel);
@@ -220,7 +219,7 @@ create_plan(PlannerInfo *root, Path *best_path)
  * create_plan_recurse
  *	  Recursive guts of create_plan().
  */
-static Plan *
+Plan *
 create_plan_recurse(PlannerInfo *root, Path *best_path)
 {
 	Plan	   *plan;
@@ -1961,16 +1960,26 @@ create_foreignscan_plan(PlannerInfo *root, ForeignPath *best_path,
 	ForeignScan *scan_plan;
 	RelOptInfo *rel = best_path->path.parent;
 	Index		scan_relid = rel->relid;
-	RangeTblEntry *rte;
+	Oid			rel_oid = InvalidOid;
 	Bitmapset  *attrs_used = NULL;
 	ListCell   *lc;
 	int			i;
 
-	/* it should be a base rel... */
-	Assert(scan_relid > 0);
-	Assert(rel->rtekind == RTE_RELATION);
-	rte = planner_rt_fetch(scan_relid, root);
-	Assert(rte->rtekind == RTE_RELATION);
+	/*
+	 * Fetch the relation OID if this foreign-scan node actually scans a
+	 * particular real relation. Otherwise, InvalidOid is passed to the
+	 * FDW driver.
+	 */
+	if (scan_relid > 0)
+	{
+		RangeTblEntry *rte;
+
+		Assert(rel->rtekind == RTE_RELATION);
+		rte = planner_rt_fetch(scan_relid, root);
+		Assert(rte->rtekind == RTE_RELATION);
+		rel_oid = rte->relid;
+	}
+	Assert(rel->fdwroutine != NULL);
 
 	/*
 	 * Sort clauses into best execution order.  We do this first since the FDW
@@ -1985,13 +1994,37 @@ create_foreignscan_plan(PlannerInfo *root, ForeignPath *best_path,
 	 * has selected some join clauses for remote use but also wants them
 	 * rechecked locally).
 	 */
-	scan_plan = rel->fdwroutine->GetForeignPlan(root, rel, rte->relid,
+	scan_plan = rel->fdwroutine->GetForeignPlan(root, rel, rel_oid,
 												best_path,
 												tlist, scan_clauses);
+	/*
+	 * Sanity check. The pseudo-scan tuple descriptor is constructed from
+	 * fdw_ps_tlist, excluding resjunk=true entries, so all valid TLEs
+	 * must be placed before the junk ones.
+	 */
+	if (scan_plan->scan.scanrelid == 0)
+	{
+		bool	found_resjunk = false;
+
+		foreach (lc, scan_plan->fdw_ps_tlist)
+		{
+			TargetEntry	   *tle = lfirst(lc);
+
+			if (tle->resjunk)
+				found_resjunk = true;
+			else if (found_resjunk)
+				elog(ERROR, "junk TLE should not appear prior to valid one");
+		}
+	}
+	/* Set the relids that are represented by this foreign scan for Explain */
+	scan_plan->fdw_relids = best_path->path.parent->relids;
 
 	/* Copy cost data from Path to Plan; no need to make FDW do this */
 	copy_path_costsize(&scan_plan->scan.plan, &best_path->path);
 
+	/* Track the FDW handler OID; no need to make FDW do this */
+	scan_plan->fdw_handler = rel->fdw_handler;
+
 	/*
 	 * Replace any outer-relation variables with nestloop params in the qual
 	 * and fdw_exprs expressions.  We do this last so that the FDW doesn't
@@ -2053,12 +2086,7 @@ create_customscan_plan(PlannerInfo *root, CustomPath *best_path,
 {
 	CustomScan *cplan;
 	RelOptInfo *rel = best_path->path.parent;
-
-	/*
-	 * Right now, all we can support is CustomScan node which is associated
-	 * with a particular base relation to be scanned.
-	 */
-	Assert(rel && rel->reloptkind == RELOPT_BASEREL);
+	ListCell   *lc;
 
 	/*
 	 * Sort clauses into the best execution order, although custom-scan
@@ -2078,6 +2106,28 @@ create_customscan_plan(PlannerInfo *root, CustomPath *best_path,
 	Assert(IsA(cplan, CustomScan));
 
 	/*
+	 * Sanity check. The pseudo-scan tuple descriptor is constructed from
+	 * custom_ps_tlist, excluding resjunk=true entries, so all valid TLEs
+	 * must be placed before the junk ones.
+	 */
+	if (cplan->scan.scanrelid == 0)
+	{
+		bool	found_resjunk = false;
+
+		foreach (lc, cplan->custom_ps_tlist)
+		{
+			TargetEntry	   *tle = lfirst(lc);
+
+			if (tle->resjunk)
+				found_resjunk = true;
+			else if (found_resjunk)
+				elog(ERROR, "junk TLE should not appear prior to valid one");
+		}
+	}
+	/* Set the relids that are represented by this custom scan for Explain */
+	cplan->custom_relids = best_path->path.parent->relids;
+
+	/*
 	 * Copy cost data from Path to Plan; no need to make custom-plan providers
 	 * do this
 	 */
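
Note on the sanity checks added to create_foreignscan_plan() and
create_customscan_plan() above: the scan tuple descriptor is built from
the pseudo-scan tlist with junk entries left out, so a valid entry that
followed a junk one would end up at a different position than its place
in the list suggests. The rule can be sketched standalone as follows.
Entry, junk_entries_trail() and clean_length() are hypothetical,
simplified stand-ins for illustration, not backend APIs.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for a TargetEntry. */
typedef struct
{
	const char *name;			/* what the entry represents */
	bool		resjunk;		/* true if not part of the scan tuple */
} Entry;

/* Return true when no valid (non-junk) entry follows a junk entry. */
static bool
junk_entries_trail(const Entry *tlist, size_t n)
{
	bool		found_resjunk = false;
	size_t		i;

	assert(tlist != NULL || n == 0);
	for (i = 0; i < n; i++)
	{
		if (tlist[i].resjunk)
			found_resjunk = true;
		else if (found_resjunk)
			return false;		/* valid entry after a junk one */
	}
	return true;
}

/* Number of columns in the scan tuple: the non-junk entries only. */
static size_t
clean_length(const Entry *tlist, size_t n)
{
	size_t		count = 0;
	size_t		i;

	assert(tlist != NULL || n == 0);
	for (i = 0; i < n; i++)
	{
		if (!tlist[i].resjunk)
			count++;
	}
	return count;
}
```

A list of two output columns followed by a junk join clause passes the
check and yields a two-column scan tuple; the same junk entry placed
first is rejected.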
diff --git a/src/backend/optimizer/plan/setrefs.c b/src/backend/optimizer/plan/setrefs.c
index ec828cd..2961f44 100644
--- a/src/backend/optimizer/plan/setrefs.c
+++ b/src/backend/optimizer/plan/setrefs.c
@@ -568,6 +568,38 @@ set_plan_refs(PlannerInfo *root, Plan *plan, int rtoffset)
 			{
 				ForeignScan *splan = (ForeignScan *) plan;
 
+				if (rtoffset > 0)
+					splan->fdw_relids =
+						bms_shift_members(splan->fdw_relids, rtoffset);
+
+				if (splan->scan.scanrelid == 0)
+				{
+					indexed_tlist *pscan_itlist =
+						build_tlist_index(splan->fdw_ps_tlist);
+
+					splan->scan.plan.targetlist = (List *)
+						fix_upper_expr(root,
+									   (Node *) splan->scan.plan.targetlist,
+									   pscan_itlist,
+									   INDEX_VAR,
+									   rtoffset);
+					splan->scan.plan.qual = (List *)
+						fix_upper_expr(root,
+									   (Node *) splan->scan.plan.qual,
+									   pscan_itlist,
+									   INDEX_VAR,
+									   rtoffset);
+					splan->fdw_exprs = (List *)
+						fix_upper_expr(root,
+									   (Node *) splan->fdw_exprs,
+									   pscan_itlist,
+									   INDEX_VAR,
+									   rtoffset);
+					splan->fdw_ps_tlist =
+						fix_scan_list(root, splan->fdw_ps_tlist, rtoffset);
+					pfree(pscan_itlist);
+					break;
+				}
 				splan->scan.scanrelid += rtoffset;
 				splan->scan.plan.targetlist =
 					fix_scan_list(root, splan->scan.plan.targetlist, rtoffset);
@@ -582,6 +614,38 @@ set_plan_refs(PlannerInfo *root, Plan *plan, int rtoffset)
 			{
 				CustomScan *splan = (CustomScan *) plan;
 
+				if (rtoffset > 0)
+					splan->custom_relids =
+						bms_shift_members(splan->custom_relids, rtoffset);
+
+				if (splan->scan.scanrelid == 0)
+				{
+					indexed_tlist *pscan_itlist =
+						build_tlist_index(splan->custom_ps_tlist);
+
+					splan->scan.plan.targetlist = (List *)
+						fix_upper_expr(root,
+									   (Node *) splan->scan.plan.targetlist,
+									   pscan_itlist,
+									   INDEX_VAR,
+									   rtoffset);
+					splan->scan.plan.qual = (List *)
+						fix_upper_expr(root,
+									   (Node *) splan->scan.plan.qual,
+									   pscan_itlist,
+									   INDEX_VAR,
+									   rtoffset);
+					splan->custom_exprs = (List *)
+						fix_upper_expr(root,
+									   (Node *) splan->custom_exprs,
+									   pscan_itlist,
+									   INDEX_VAR,
+									   rtoffset);
+					splan->custom_ps_tlist =
+						fix_scan_list(root, splan->custom_ps_tlist, rtoffset);
+					pfree(pscan_itlist);
+					break;
+				}
 				splan->scan.scanrelid += rtoffset;
 				splan->scan.plan.targetlist =
 					fix_scan_list(root, splan->scan.plan.targetlist, rtoffset);
diff --git a/src/backend/optimizer/util/plancat.c b/src/backend/optimizer/util/plancat.c
index 313a5c1..1c570c8 100644
--- a/src/backend/optimizer/util/plancat.c
+++ b/src/backend/optimizer/util/plancat.c
@@ -378,10 +378,15 @@ get_relation_info(PlannerInfo *root, Oid relationObjectId, bool inhparent,
 
 	/* Grab the fdwroutine info using the relcache, while we have it */
 	if (relation->rd_rel->relkind == RELKIND_FOREIGN_TABLE)
+	{
+		rel->fdw_handler = GetFdwHandlerForRelation(relation);
 		rel->fdwroutine = GetFdwRoutineForRelation(relation, true);
+	}
 	else
+	{
+		rel->fdw_handler = InvalidOid;
 		rel->fdwroutine = NULL;
-
+	}
 	heap_close(relation, NoLock);
 
 	/*
diff --git a/src/backend/optimizer/util/relnode.c b/src/backend/optimizer/util/relnode.c
index 8cfbea0..da2bd22 100644
--- a/src/backend/optimizer/util/relnode.c
+++ b/src/backend/optimizer/util/relnode.c
@@ -14,6 +14,7 @@
  */
 #include "postgres.h"
 
+#include "foreign/fdwapi.h"
 #include "optimizer/cost.h"
 #include "optimizer/pathnode.h"
 #include "optimizer/paths.h"
@@ -122,6 +123,7 @@ build_simple_rel(PlannerInfo *root, int relid, RelOptKind reloptkind)
 	rel->subroot = NULL;
 	rel->subplan_params = NIL;
 	rel->fdwroutine = NULL;
+	rel->fdw_handler = InvalidOid;
 	rel->fdw_private = NULL;
 	rel->baserestrictinfo = NIL;
 	rel->baserestrictcost.startup = 0;
@@ -316,6 +318,8 @@ find_join_rel(PlannerInfo *root, Relids relids)
  * 'restrictlist_ptr': result variable.  If not NULL, *restrictlist_ptr
  *		receives the list of RestrictInfo nodes that apply to this
  *		particular pair of joinable relations.
+ * 'found': result variable.  Set to true if the joinrel had already been
+ *		built and was found in the cache, false if it is newly constructed.
  *
  * restrictlist_ptr makes the routine's API a little grotty, but it saves
  * duplicated calculation of the restrictlist...
@@ -326,7 +330,8 @@ build_join_rel(PlannerInfo *root,
 			   RelOptInfo *outer_rel,
 			   RelOptInfo *inner_rel,
 			   SpecialJoinInfo *sjinfo,
-			   List **restrictlist_ptr)
+			   List **restrictlist_ptr,
+			   bool *found)
 {
 	RelOptInfo *joinrel;
 	List	   *restrictlist;
@@ -347,8 +352,11 @@ build_join_rel(PlannerInfo *root,
 														   joinrel,
 														   outer_rel,
 														   inner_rel);
+		*found = true;
 		return joinrel;
 	}
+	/* not found on the cache */
+	*found = false;
 
 	/*
 	 * Nope, so make one.
@@ -427,6 +435,18 @@ build_join_rel(PlannerInfo *root,
 							   sjinfo, restrictlist);
 
 	/*
+	 * Set FDW handler and routine if both outer and inner relation
+	 * are managed by same FDW driver.
+	 */
+	if (OidIsValid(outer_rel->fdw_handler) &&
+		OidIsValid(inner_rel->fdw_handler) &&
+		outer_rel->fdw_handler == inner_rel->fdw_handler)
+	{
+		joinrel->fdw_handler = outer_rel->fdw_handler;
+		joinrel->fdwroutine = GetFdwRoutine(joinrel->fdw_handler);
+	}
+
+	/*
 	 * Add the joinrel to the query's joinrel list, and store it into the
 	 * auxiliary hashtable if there is one.  NB: GEQO requires us to append
 	 * the new joinrel to the end of the list!
diff --git a/src/backend/utils/adt/ruleutils.c b/src/backend/utils/adt/ruleutils.c
index 28e1acf..90e1107 100644
--- a/src/backend/utils/adt/ruleutils.c
+++ b/src/backend/utils/adt/ruleutils.c
@@ -3842,6 +3842,10 @@ set_deparse_planstate(deparse_namespace *dpns, PlanState *ps)
 	/* index_tlist is set only if it's an IndexOnlyScan */
 	if (IsA(ps->plan, IndexOnlyScan))
 		dpns->index_tlist = ((IndexOnlyScan *) ps->plan)->indextlist;
+	else if (IsA(ps->plan, ForeignScan))
+		dpns->index_tlist = ((ForeignScan *) ps->plan)->fdw_ps_tlist;
+	else if (IsA(ps->plan, CustomScan))
+		dpns->index_tlist = ((CustomScan *) ps->plan)->custom_ps_tlist;
 	else
 		dpns->index_tlist = NIL;
 }
diff --git a/src/include/foreign/fdwapi.h b/src/include/foreign/fdwapi.h
index 1d76841..d3a5261 100644
--- a/src/include/foreign/fdwapi.h
+++ b/src/include/foreign/fdwapi.h
@@ -82,6 +82,13 @@ typedef void (*EndForeignModify_function) (EState *estate,
 
 typedef int (*IsForeignRelUpdatable_function) (Relation rel);
 
+typedef void (*GetForeignJoinPaths_function) (PlannerInfo *root,
+											   RelOptInfo *joinrel,
+											   RelOptInfo *outerrel,
+											   RelOptInfo *innerrel,
+											   SpecialJoinInfo *sjinfo,
+											   List *restrictlist);
+
 typedef void (*ExplainForeignScan_function) (ForeignScanState *node,
 													struct ExplainState *es);
 
@@ -150,6 +157,10 @@ typedef struct FdwRoutine
 
 	/* Support functions for IMPORT FOREIGN SCHEMA */
 	ImportForeignSchema_function ImportForeignSchema;
+
+	/* Support functions for join push-down */
+	GetForeignJoinPaths_function GetForeignJoinPaths;
+
 } FdwRoutine;
 
 
@@ -157,6 +168,7 @@ typedef struct FdwRoutine
 extern FdwRoutine *GetFdwRoutine(Oid fdwhandler);
 extern FdwRoutine *GetFdwRoutineByRelId(Oid relid);
 extern FdwRoutine *GetFdwRoutineForRelation(Relation relation, bool makecopy);
+extern Oid	GetFdwHandlerForRelation(Relation relation);
 extern bool IsImportableForeignTable(const char *tablename,
 						 ImportForeignSchemaStmt *stmt);
 
diff --git a/src/include/nodes/bitmapset.h b/src/include/nodes/bitmapset.h
index 3a556ee..3ca9791 100644
--- a/src/include/nodes/bitmapset.h
+++ b/src/include/nodes/bitmapset.h
@@ -66,6 +66,7 @@ extern void bms_free(Bitmapset *a);
 extern Bitmapset *bms_union(const Bitmapset *a, const Bitmapset *b);
 extern Bitmapset *bms_intersect(const Bitmapset *a, const Bitmapset *b);
 extern Bitmapset *bms_difference(const Bitmapset *a, const Bitmapset *b);
+extern Bitmapset *bms_shift_members(const Bitmapset *a, int shift);
 extern bool bms_is_subset(const Bitmapset *a, const Bitmapset *b);
 extern BMS_Comparison bms_subset_compare(const Bitmapset *a, const Bitmapset *b);
 extern bool bms_is_member(int x, const Bitmapset *a);
diff --git a/src/include/nodes/plannodes.h b/src/include/nodes/plannodes.h
index 21cbfa8..b25330e 100644
--- a/src/include/nodes/plannodes.h
+++ b/src/include/nodes/plannodes.h
@@ -471,7 +471,13 @@ typedef struct WorkTableScan
  * fdw_exprs and fdw_private are both under the control of the foreign-data
  * wrapper, but fdw_exprs is presumed to contain expression trees and will
  * be post-processed accordingly by the planner; fdw_private won't be.
- * Note that everything in both lists must be copiable by copyObject().
+ * An optional fdw_ps_tlist is used to map a reference to an attribute of
+ * the underlying relation(s) onto a pair of INDEX_VAR and an alternative
+ * varattno.  The node then looks like a scan on a pseudo relation, usually
+ * the result of a join on the remote data source, and the FDW driver is
+ * responsible for setting the expected target list for it.  If the FDW
+ * returns records matching the foreign-table definition, just put NIL here.
+ * Note that everything in the above lists must be copiable by copyObject().
  * One way to store an arbitrary blob of bytes is to represent it as a bytea
  * Const.  Usually, though, you'll be better off choosing a representation
  * that can be dumped usefully by nodeToString().
@@ -480,18 +486,23 @@ typedef struct WorkTableScan
 typedef struct ForeignScan
 {
 	Scan		scan;
+	Oid			fdw_handler;	/* OID of FDW handler */
 	List	   *fdw_exprs;		/* expressions that FDW may evaluate */
+	List	   *fdw_ps_tlist;	/* optional pseudo-scan tlist for FDW */
 	List	   *fdw_private;	/* private data for FDW */
+	Bitmapset  *fdw_relids;		/* set of relid (index of range-tables)
+								 * represented by this node */
 	bool		fsSystemCol;	/* true if any "system column" is needed */
 } ForeignScan;
 
 /* ----------------
  *	   CustomScan node
  *
- * The comments for ForeignScan's fdw_exprs and fdw_private fields apply
- * equally to custom_exprs and custom_private.  Note that since Plan trees
- * can be copied, custom scan providers *must* fit all plan data they need
- * into those fields; embedding CustomScan in a larger struct will not work.
+ * The comments for ForeignScan's fdw_exprs, fdw_ps_tlist and fdw_private
+ * fields apply equally to custom_exprs, custom_ps_tlist and custom_private.
+ * Note that since Plan trees can be copied, custom scan providers *must*
+ * fit all plan data they need into those fields; embedding CustomScan in
+ * a larger struct will not work.
  * ----------------
  */
 struct CustomScan;
@@ -512,7 +523,10 @@ typedef struct CustomScan
 	Scan		scan;
 	uint32		flags;			/* mask of CUSTOMPATH_* flags, see relation.h */
 	List	   *custom_exprs;	/* expressions that custom code may evaluate */
+	List	   *custom_ps_tlist;/* optional pseudo-scan target list */
 	List	   *custom_private; /* private data for custom code */
+	Bitmapset  *custom_relids;	/* set of relid (index of range-tables)
+								 * represented by this node */
 	const CustomScanMethods *methods;
 } CustomScan;
 
diff --git a/src/include/nodes/relation.h b/src/include/nodes/relation.h
index 334cf51..4eb89c6 100644
--- a/src/include/nodes/relation.h
+++ b/src/include/nodes/relation.h
@@ -366,6 +366,7 @@ typedef struct PlannerInfo
  *		subroot - PlannerInfo for subquery (NULL if it's not a subquery)
  *		subplan_params - list of PlannerParamItems to be passed to subquery
  *		fdwroutine - function hooks for FDW, if foreign table (else NULL)
+ *		fdw_handler - OID of FDW handler, if foreign table (else InvalidOid)
  *		fdw_private - private state for FDW, if foreign table (else NULL)
  *
  *		Note: for a subquery, tuples, subplan, subroot are not set immediately
@@ -461,6 +462,7 @@ typedef struct RelOptInfo
 	List	   *subplan_params; /* if subquery */
 	/* use "struct FdwRoutine" to avoid including fdwapi.h here */
 	struct FdwRoutine *fdwroutine;		/* if foreign table */
+	Oid			fdw_handler;	/* if foreign table */
 	void	   *fdw_private;	/* if foreign table */
 
 	/* used by various scans and joins: */
diff --git a/src/include/optimizer/pathnode.h b/src/include/optimizer/pathnode.h
index 9923f0e..3053f0f 100644
--- a/src/include/optimizer/pathnode.h
+++ b/src/include/optimizer/pathnode.h
@@ -141,7 +141,8 @@ extern RelOptInfo *build_join_rel(PlannerInfo *root,
 			   RelOptInfo *outer_rel,
 			   RelOptInfo *inner_rel,
 			   SpecialJoinInfo *sjinfo,
-			   List **restrictlist_ptr);
+			   List **restrictlist_ptr,
+			   bool *found);
 extern RelOptInfo *build_empty_join_rel(PlannerInfo *root);
 extern AppendRelInfo *find_childrel_appendrelinfo(PlannerInfo *root,
 							RelOptInfo *rel);
diff --git a/src/include/optimizer/paths.h b/src/include/optimizer/paths.h
index 6cad92e..c42c69d 100644
--- a/src/include/optimizer/paths.h
+++ b/src/include/optimizer/paths.h
@@ -30,6 +30,19 @@ typedef void (*set_rel_pathlist_hook_type) (PlannerInfo *root,
 														RangeTblEntry *rte);
 extern PGDLLIMPORT set_rel_pathlist_hook_type set_rel_pathlist_hook;
 
+/* Hook for plugins to get control in add_paths_to_joinrel() */
+typedef void (*set_join_pathlist_hook_type) (PlannerInfo *root,
+											 RelOptInfo *joinrel,
+											 RelOptInfo *outerrel,
+											 RelOptInfo *innerrel,
+											 List *restrictlist,
+											 JoinType jointype,
+											 SpecialJoinInfo *sjinfo,
+											 SemiAntiJoinFactors *semifactors,
+											 Relids param_source_rels,
+											 Relids extra_lateral_rels);
+extern PGDLLIMPORT set_join_pathlist_hook_type set_join_pathlist_hook;
+
 /* Hook for plugins to replace standard_join_search() */
 typedef RelOptInfo *(*join_search_hook_type) (PlannerInfo *root,
 														  int levels_needed,
diff --git a/src/include/optimizer/planmain.h b/src/include/optimizer/planmain.h
index fa72918..0c8cbcd 100644
--- a/src/include/optimizer/planmain.h
+++ b/src/include/optimizer/planmain.h
@@ -41,6 +41,7 @@ extern Plan *optimize_minmax_aggregates(PlannerInfo *root, List *tlist,
  * prototypes for plan/createplan.c
  */
 extern Plan *create_plan(PlannerInfo *root, Path *best_path);
+extern Plan *create_plan_recurse(PlannerInfo *root, Path *best_path);
 extern SubqueryScan *make_subqueryscan(List *qptlist, List *qpqual,
 				  Index scanrelid, Plan *subplan);
 extern ForeignScan *make_foreignscan(List *qptlist, List *qpqual,

#15

Shigeru HANADA

shigeru.hanada@gmail.com

almost 11 years ago

In reply to: Kouhei Kaigai (#14)

On 2015/03/26 10:51, Kouhei Kaigai <kaigai@ak.jp.nec.com> wrote:

The attached patch adds a GetForeignJoinPaths call in make_join_rel(), made only
when 'joinrel' is actually built and both child relations are managed by the same
FDW driver, prior to any other built-in join paths.
I adjusted the hook definition a little, because jointype can be reproduced
from SpecialJoinInfo. Right?

OK.

This will probably resolve the original concern about multiple calls to the FDW
handler when it tries to replace an entire join subtree with a foreign scan on
the result of a remote join query.

What is your opinion?

Seems fine. I’ve fixed my postgres_fdw code to fit the new version, and am working on handling a whole-join-tree.

It would be difficult in the 9.5 cycle, but a hook point where we can handle a whole joinrel might allow us to optimize a query that accesses multiple parent tables, each inherited by foreign tables and partitioned on an identical join key, by building a path tree that joins the sharded tables first and then unions the results.

--
Shigeru HANADA
shigeru.hanada@gmail.com

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)

#16

Shigeru Hanada

shigeru.hanada@gmail.com

almost 11 years ago

In reply to: Shigeru HANADA (#15)

1 attachment(s)

Attached is the patch which adds join push-down support to postgres_fdw
(v7). It supports SELECT statements with JOIN, but still leaves room for
enhancements (see below). I'd like to share the WIP patch here to get
comments on the new FDW API design provided by KaiGai-san's v11 patch.

To make reviewing easier, I have summarized the changes relative to the
Custom/Foreign join v11 patch.

Changes for Join push-down support
==================================
- Add FDW API GetForeignJoinPaths(). It generates a ForeignPath which
represents a scan against the pseudo join relation represented by the given
RelOptInfo.
- Expand the deparsing module to handle multi-relation queries. Steps of
deparsing a join query:

1) Optimizer calls postgresGetForeignPaths() for each BASEREL. Here
postgres_fdw does the same things as before, except that it adds column
aliases in the SELECT clause.
2) Optimizer calls postgresGetForeignJoinPaths() for each JOINREL.
It is called once per RelOptInfo with reloptkind == RELOPT_JOINREL, so
postgres_fdw should consider both A JOIN B and B JOIN A in one call.

postgres_fdw checks whether the join can be pushed down.

a) Both outer and inner relations can be pushed down (NULL in
RelOptInfo#fdw_private indicates such a situation)
b) Outermost command is a SELECT (this can be relaxed in the future)
c) Join type is inner or one of the outer join types
d) All relations in the join belong to the same server
e) Effective user id is identical for all relations in the join (they
might differ if some were accessed via views)
f) No local filters (this can be relaxed if inner && non-volatile)
g) Join conditions don't contain any "unsafe" expression
h) Remote filters don't contain any "unsafe" expression
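
To make the checklist above easier to discuss, here is a minimal, self-contained sketch of criteria a) to h) as a single predicate. It is not the code in the patch: JoinSide, JoinCandidate, SketchJoinType and join_is_pushdown_candidate() are hypothetical names invented for illustration, and the "safe expression" checks are reduced to precomputed flags.

```c
#include <stdbool.h>

typedef unsigned int Oid;		/* stand-in for PostgreSQL's Oid type */

/* Hypothetical join types; SEMI is here only as a rejected example. */
typedef enum
{
	SKETCH_JOIN_INNER,
	SKETCH_JOIN_LEFT,
	SKETCH_JOIN_RIGHT,
	SKETCH_JOIN_FULL,
	SKETCH_JOIN_SEMI
} SketchJoinType;

/* Hypothetical per-relation summary of what the FDW knows. */
typedef struct JoinSide
{
	bool		pushable;			/* a) relation itself can be pushed down */
	Oid			serverid;			/* d) foreign server of the relation */
	Oid			userid;				/* e) effective user id for access */
	bool		has_local_filters;	/* f) quals that must be run locally */
} JoinSide;

typedef struct JoinCandidate
{
	JoinSide	outer;
	JoinSide	inner;
	bool		is_select;				/* b) outermost command is SELECT */
	SketchJoinType jointype;			/* c) */
	bool		join_conds_safe;		/* g) */
	bool		remote_filters_safe;	/* h) */
} JoinCandidate;

/* Returns true only when every criterion a) to h) holds. */
static bool
join_is_pushdown_candidate(const JoinCandidate *jc)
{
	if (!jc->outer.pushable || !jc->inner.pushable)
		return false;			/* a) */
	if (!jc->is_select)
		return false;			/* b) */
	if (jc->jointype == SKETCH_JOIN_SEMI)
		return false;			/* c) only inner and outer joins */
	if (jc->outer.serverid != jc->inner.serverid)
		return false;			/* d) */
	if (jc->outer.userid != jc->inner.userid)
		return false;			/* e) */
	if (jc->outer.has_local_filters || jc->inner.has_local_filters)
		return false;			/* f) */
	return jc->join_conds_safe && jc->remote_filters_safe;	/* g), h) */
}
```

Ordering the cheap checks first means the expression walks behind g) and h) would only run for joins that already passed the structural tests.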

If all criteria are met, postgres_fdw makes a ForeignPath for the join and
stores the following information in its fdw_private.

a) ForeignPath of outer relation, first non-parameterized one
b) ForeignPath of inner relation, first non-parameterized one
c) join type (as integer)
d) join conditions (as List of Expr)
e) other conditions (as List of Expr)

As of now the cost of the path is not very accurate; this is a possible
enhancement.

3) Optimizer calls postgresGetForeignPlan() for the cheapest topmost Path.
If a foreign join is the cheapest way to execute the query, the optimizer calls
postgresGetForeignPlan for the topmost path generated by
postgresGetForeignJoinPaths. As Robert and Tom mentioned in the thread,
large_table JOIN huge_table might be removed even if (large_table JOIN
huge_table) JOIN small_table is the cheapest at join level 3, so
postgres_fdw can't assume that paths at lower levels survived planning.

To cope with this situation, I'm trying to avoid calling create_plan_recurse()
for underlying paths by putting the necessary information into
PgFdwRelationInfo and linking it to the appropriate RelOptInfo.

Anyway, in the current version the query string is built up recursively from
the bottom (BASEREL) upward. For a join, the underlying outer/inner queries are
put into the FROM clause, wrapped in parentheses and aliased. For example:

select * from pgbench_branches b join pgbench_tellers t on t.bid = b.bid;

is transformed to a query like this:

SELECT l.a1, l.a2, l.a3, r.a1, r.a2, r.a3, r.a4 FROM (SELECT bid a9,
bbalance a10, filler a11 FROM public.pgbench_branches) l (a1, a2, a3) INNER
JOIN (SELECT tid a9, bid a10, balance a11, filler a12 FROM
public.pgbench_tellers)
r (a1, a2, a3, a4) ON ((l.a1 = r.a2));
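
The shape of that remote query can be sketched in plain C without any PostgreSQL headers. This is only an illustration of the string layout, assuming fixed-size buffers instead of StringInfo; build_alias_list() and build_join_sql() are hypothetical helpers, not functions from the patch.

```c
#include <stdio.h>
#include <string.h>

/* Write "a1, a2, ..., aN" into buf.  Returns 0 on success, -1 if too small. */
static int
build_alias_list(char *buf, size_t buflen, int ncols)
{
	size_t		used = 0;
	int			i;

	if (buflen == 0)
		return -1;
	buf[0] = '\0';
	for (i = 1; i <= ncols; i++)
	{
		int			n = snprintf(buf + used, buflen - used, "%sa%d",
								 i > 1 ? ", " : "", i);

		if (n < 0 || (size_t) n >= buflen - used)
			return -1;			/* output would have been truncated */
		used += (size_t) n;
	}
	return 0;
}

/*
 * Wrap the two child queries as subqueries aliased "l" and "r", each with a
 * positional column alias list, and join them.  Returns 0 on success.
 */
static int
build_join_sql(char *buf, size_t buflen, const char *select_list,
			   const char *sql_outer, int ncols_outer,
			   const char *sql_inner, int ncols_inner,
			   const char *jointype, const char *on_clause)
{
	char		aliases_o[256];
	char		aliases_i[256];
	int			n;

	if (build_alias_list(aliases_o, sizeof(aliases_o), ncols_outer) != 0 ||
		build_alias_list(aliases_i, sizeof(aliases_i), ncols_inner) != 0)
		return -1;

	n = snprintf(buf, buflen,
				 "SELECT %s FROM (%s) l (%s) %s JOIN (%s) r (%s) ON %s",
				 select_list, sql_outer, aliases_o, jointype,
				 sql_inner, aliases_i, on_clause);
	return (n < 0 || (size_t) n >= buflen) ? -1 : 0;
}
```

Because each subquery gets a positional alias list (a1, a2, ...), the join level only needs the side ('l' or 'r') and the position of a column in the child's target list, never the remote column names.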

In the remote query, column aliases use attnum-based numbering,
shifted by FirstLowInvalidHeapAttributeNumber to make all attnums positive.
For instance, this scheme uses alias "a9" for the first user column. For
readability of the code around this, I introduced the TO_RELATIVE() macro, which
converts absolute attnums (-8~) to relative ones (0~). The current deparser can
also handle whole-row references (attnum == 0) correctly.
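
A small standalone sketch of that alias arithmetic follows. It assumes FirstLowInvalidHeapAttributeNumber is -8, as the "(-8~)" remark above implies; the real value comes from access/sysattr.h and may differ between PostgreSQL versions. FIRST_LOW_INVALID_ATTNUM and format_column_alias() are hypothetical names used only here.

```c
#include <stdio.h>

/*
 * ASSUMPTION: -8 is taken from the "(-8~)" remark in this mail, not from
 * the PostgreSQL headers.
 */
#define FIRST_LOW_INVALID_ATTNUM	(-8)

/* Same arithmetic as the patch's TO_RELATIVE(): shift attnum to be >= 0. */
#define TO_RELATIVE(x)	((x) - FIRST_LOW_INVALID_ATTNUM)

/*
 * Hypothetical helper: write the column alias "a<N>" for an attnum into buf.
 * Returns what snprintf() returns, i.e. the length that was needed.
 */
static int
format_column_alias(char *buf, size_t buflen, int attnum)
{
	return snprintf(buf, buflen, "a%d", TO_RELATIVE(attnum));
}
```

Under that assumption the first user column (attnum 1) gets alias a9, a whole-row reference (attnum 0) maps to 8, and ctid (attnum -1) maps to 7, so no alias ever carries a minus sign.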

4) Executor calls BeginForeignScan to initialize a scan. Here the TupleDesc is
taken from the slot, not from the Relation.

Possible enhancement
====================
- Make deparseSelectSql() more general, so that it can handle both simple
SELECT and join SELECT by calling itself recursively. This would avoid
assuming that the underlying ForeignPath remains in RelOptInfo. (WIP)
- Move appendConditions() calls into deparse.c, to clarify the responsibility
of each module.
- More accurate estimation
- More detailed information for error location (currently "foreign table"
is always used as the relation name)

Attachments:

foreign_join_v7.patch (application/octet-stream)

diff --git a/contrib/postgres_fdw/deparse.c b/contrib/postgres_fdw/deparse.c
index 94fab18..dee1479 100644
--- a/contrib/postgres_fdw/deparse.c
+++ b/contrib/postgres_fdw/deparse.c
@@ -44,7 +44,9 @@
 #include "catalog/pg_proc.h"
 #include "catalog/pg_type.h"
 #include "commands/defrem.h"
+#include "nodes/makefuncs.h"
 #include "nodes/nodeFuncs.h"
+#include "nodes/plannodes.h"
 #include "optimizer/clauses.h"
 #include "optimizer/var.h"
 #include "parser/parsetree.h"
@@ -89,6 +91,8 @@ typedef struct deparse_expr_cxt
 	RelOptInfo *foreignrel;		/* the foreign relation we are planning for */
 	StringInfo	buf;			/* output buffer to append to */
 	List	  **params_list;	/* exprs that will become remote Params */
+	List	   *outertlist;		/* outer child's target list */
+	List	   *innertlist;		/* inner child's target list */
 } deparse_expr_cxt;
 
 /*
@@ -137,12 +141,19 @@ static void printRemoteParam(int paramindex, Oid paramtype, int32 paramtypmod,
 static void printRemotePlaceholder(Oid paramtype, int32 paramtypmod,
 					   deparse_expr_cxt *context);
 
+/*
+ * convert absolute attnum to relative one.  This would be handy for handling
+ * attnum for attrs_used and column aliases.
+ */
+#define TO_RELATIVE(x)	((x) - FirstLowInvalidHeapAttributeNumber)
+
 
 /*
  * Examine each qual clause in input_conds, and classify them into two groups,
  * which are returned as two lists:
  *	- remote_conds contains expressions that can be evaluated remotely
  *	- local_conds contains expressions that can't be evaluated remotely
+ * Note that each element is an Expr that was stripped from its RestrictInfo.
  */
 void
 classifyConditions(PlannerInfo *root,
@@ -161,9 +172,9 @@ classifyConditions(PlannerInfo *root,
 		RestrictInfo *ri = (RestrictInfo *) lfirst(lc);
 
 		if (is_foreign_expr(root, baserel, ri->clause))
-			*remote_conds = lappend(*remote_conds, ri);
+			*remote_conds = lappend(*remote_conds, ri->clause);
 		else
-			*local_conds = lappend(*local_conds, ri);
+			*local_conds = lappend(*local_conds, ri->clause);
 	}
 }
 
@@ -250,7 +261,7 @@ foreign_expr_walker(Node *node,
 				 * Param's collation, ie it's not safe for it to have a
 				 * non-default collation.
 				 */
-				if (var->varno == glob_cxt->foreignrel->relid &&
+				if (bms_is_member(var->varno, glob_cxt->foreignrel->relids) &&
 					var->varlevelsup == 0)
 				{
 					/* Var belongs to foreign table */
@@ -731,8 +742,7 @@ deparseTargetList(StringInfo buf,
 	*retrieved_attrs = NIL;
 
 	/* If there's a whole-row reference, we'll need all the columns. */
-	have_wholerow = bms_is_member(0 - FirstLowInvalidHeapAttributeNumber,
-								  attrs_used);
+	have_wholerow = bms_is_member(TO_RELATIVE(0), attrs_used);
 
 	first = true;
 	for (i = 1; i <= tupdesc->natts; i++)
@@ -743,15 +753,14 @@ deparseTargetList(StringInfo buf,
 		if (attr->attisdropped)
 			continue;
 
-		if (have_wholerow ||
-			bms_is_member(i - FirstLowInvalidHeapAttributeNumber,
-						  attrs_used))
+		if (have_wholerow || bms_is_member(TO_RELATIVE(i), attrs_used))
 		{
 			if (!first)
 				appendStringInfoString(buf, ", ");
 			first = false;
 
 			deparseColumnRef(buf, rtindex, i, root);
+			appendStringInfo(buf, " a%d", TO_RELATIVE(i));
 
 			*retrieved_attrs = lappend_int(*retrieved_attrs, i);
 		}
@@ -761,14 +770,14 @@ deparseTargetList(StringInfo buf,
 	 * Add ctid if needed.  We currently don't support retrieving any other
 	 * system columns.
 	 */
-	if (bms_is_member(SelfItemPointerAttributeNumber - FirstLowInvalidHeapAttributeNumber,
-					  attrs_used))
+	if (bms_is_member(TO_RELATIVE(SelfItemPointerAttributeNumber), attrs_used))
 	{
 		if (!first)
 			appendStringInfoString(buf, ", ");
 		first = false;
 
-		appendStringInfoString(buf, "ctid");
+		appendStringInfo(buf, "ctid a%d",
+						 TO_RELATIVE(SelfItemPointerAttributeNumber));
 
 		*retrieved_attrs = lappend_int(*retrieved_attrs,
 									   SelfItemPointerAttributeNumber);
@@ -780,7 +789,8 @@ deparseTargetList(StringInfo buf,
 }
 
 /*
- * Deparse WHERE clauses in given list of RestrictInfos and append them to buf.
+ * Deparse conditions, such as WHERE clause and ON clause of JOIN, in given
+ * list of Expr and append them to buf.
  *
  * baserel is the foreign table we're planning for.
  *
@@ -794,12 +804,14 @@ deparseTargetList(StringInfo buf,
  * so Params and other-relation Vars should be replaced by dummy values.
  */
 void
-appendWhereClause(StringInfo buf,
-				  PlannerInfo *root,
-				  RelOptInfo *baserel,
-				  List *exprs,
-				  bool is_first,
-				  List **params)
+appendConditions(StringInfo buf,
+				 PlannerInfo *root,
+				 RelOptInfo *baserel,
+				 List *outertlist,
+				 List *innertlist,
+				 List *exprs,
+				 const char *prefix,
+				 List **params)
 {
 	deparse_expr_cxt context;
 	int			nestlevel;
@@ -813,31 +825,266 @@ appendWhereClause(StringInfo buf,
 	context.foreignrel = baserel;
 	context.buf = buf;
 	context.params_list = params;
+	context.outertlist = outertlist;
+	context.innertlist = innertlist;
 
 	/* Make sure any constants in the exprs are printed portably */
 	nestlevel = set_transmission_modes();
 
 	foreach(lc, exprs)
 	{
-		RestrictInfo *ri = (RestrictInfo *) lfirst(lc);
+		Expr	   *expr = (Expr *) lfirst(lc);
 
 		/* Connect expressions with "AND" and parenthesize each condition. */
-		if (is_first)
-			appendStringInfoString(buf, " WHERE ");
-		else
-			appendStringInfoString(buf, " AND ");
+		if (prefix)
+			appendStringInfo(buf, "%s", prefix);
 
 		appendStringInfoChar(buf, '(');
-		deparseExpr(ri->clause, &context);
+		deparseExpr(expr, &context);
 		appendStringInfoChar(buf, ')');
 
-		is_first = false;
+		prefix = " AND ";
 	}
 
 	reset_transmission_modes(nestlevel);
 }
 
 /*
+ * Returns position index (start with 1) of given var in given target list, or
+ * 0 when not found.
+ */
+static int
+find_var_pos(Var *node, List *tlist)
+{
+	int		pos = 1;
+	ListCell *lc;
+
+	foreach(lc, tlist)
+	{
+		Var *var = (Var *) lfirst(lc);
+
+		if (equal(var, node))
+		{
+			return pos;
+		}
+		pos++;
+	}
+
+	return 0;
+}
+
+/*
+ * Deparse given Var into buf.
+ */
+static void
+deparseJoinVar(Var *node, deparse_expr_cxt *context)
+{
+	char		side;
+	int			pos;
+
+	pos = find_var_pos(node, context->outertlist);
+	if (pos > 0)
+		side = 'l';
+	else
+	{
+		side = 'r';
+		pos = find_var_pos(node, context->innertlist);
+	}
+	Assert(pos > 0);
+	Assert(side == 'l' || side == 'r');
+
+	/*
+	 * We treat whole-row reference same as ordinary attribute references,
+	 * because such transformation should be done in lower level.
+	 */
+	appendStringInfo(context->buf, "%c.a%d", side, pos);
+}
+
+/*
+ * Deparse column alias list for a subquery in FROM clause.
+ */
+static void
+deparseColumnAliases(StringInfo buf, List *tlist)
+{
+	int			pos;
+	ListCell   *lc;
+
+	pos = 1;
+	foreach(lc, tlist)
+	{
+		/* Deparse column alias for the subquery */
+		if (pos > 1)
+			appendStringInfoString(buf, ", ");
+		appendStringInfo(buf, "a%d", pos);
+		pos++;
+	}
+}
+
+/*
+ * Deparse "wrapper" SQL for a query which contains whole-row reference.
+ * This is almost same as ExecProjection does for such targetlist entry,
+ * but join push-down skips that step so we do it in remote SQL.
+ */
+static const char *
+deparseProjectionSql(List *targetlist, const char *sql, char side)
+{
+	StringInfoData buf;
+	ListCell   *lc;
+	bool		first = true;
+	bool		have_wholerow = false;
+	bool		have_ctid = false;
+
+	/*
+	 * We have nothing to do if the targetlist contains no special reference,
+	 * such as whole-row and ctid.
+	 */
+	foreach(lc, targetlist)
+	{
+		Var		   *var = (Var *) lfirst(lc);
+		if (var->varattno == 0)
+		{
+			have_wholerow = true;
+			break;
+		}
+		else if (var->varattno == SelfItemPointerAttributeNumber)
+		{
+			have_ctid = true;
+			break;
+		}
+	}
+	if (!have_wholerow && !have_ctid)
+		return sql;
+
+	/*
+	 * Construct a SELECT statement which has the original query in its FROM
+	 * clause, and have target list entries in its SELECT clause.  The number
+	 * used in column aliases are attnum - FirstLowInvalidHeapAttributeNumber in
+	 * order to make all numbers positive even for system columns which have
+	 * minus value as attnum.
+	 */
+	initStringInfo(&buf);
+	appendStringInfoString(&buf, "SELECT ");
+	foreach(lc, targetlist)
+	{
+		Var *var = (Var *) lfirst(lc);
+
+		if (!first)
+			appendStringInfoString(&buf, ", ");
+	
+		if (var->varattno == 0)
+			appendStringInfo(&buf, "%c", side);
+		else
+			appendStringInfo(&buf, "%c.a%d", side, TO_RELATIVE(var->varattno));
+
+		first = false;
+	}
+	appendStringInfo(&buf, " FROM (%s) %c", sql, side);
+
+	return buf.data;
+}
+
+/*
+ * Construct a SELECT statement which contains join clause.
+ *
+ * We also create a TargetEntry List of the columns being retrieved, which is
+ * returned to *fdw_ps_tlist.
+ *
+ * path_o, tl_o, sql_o are respectively path, targetlist, and remote query
+ * statement of the outer child relation.  postfix _i means those for the inner
+ * child relation.  jointype and joinclauses are information of join method.
+ * fdw_ps_tlist is output parameter to pass target list of the pseudo scan to
+ * caller.
+ */
+void
+deparseJoinSql(StringInfo sql,
+			   PlannerInfo *root,
+			   RelOptInfo *baserel,
+			   RelOptInfo *outerrel,
+			   RelOptInfo *innerrel,
+			   const char *sql_o,
+			   const char *sql_i,
+			   JoinType jointype,
+			   List *joinclauses,
+			   List *otherclauses,
+			   List **fdw_ps_tlist)
+{
+	StringInfoData selbuf;		/* buffer for SELECT clause */
+	StringInfoData abuf_o;		/* buffer for column alias list of outer */
+	StringInfoData abuf_i;		/* buffer for column alias list of inner */
+	int			i;
+	ListCell   *lc;
+	const char *jointype_str;
+	deparse_expr_cxt context;
+
+	context.root = root;
+	context.foreignrel = baserel;
+	context.buf = &selbuf;
+	context.params_list = NULL;
+	context.outertlist = outerrel->reltargetlist;
+	context.innertlist = innerrel->reltargetlist;
+
+	jointype_str = jointype == JOIN_INNER ? "INNER" :
+				   jointype == JOIN_LEFT ? "LEFT" :
+				   jointype == JOIN_RIGHT ? "RIGHT" :
+				   jointype == JOIN_FULL ? "FULL" : "";
+
+	/* print SELECT clause of the join scan */
+	initStringInfo(&selbuf);
+	i = 0;
+	foreach(lc, baserel->reltargetlist)
+	{
+		Var		   *var = (Var *) lfirst(lc);
+		TargetEntry *tle;
+
+		if (i > 0)
+			appendStringInfoString(&selbuf, ", ");
+		deparseJoinVar(var, &context);
+
+		tle = makeTargetEntry((Expr *) copyObject(var),
+							  i + 1, pstrdup(""), false);
+		if (fdw_ps_tlist)
+			*fdw_ps_tlist = lappend(*fdw_ps_tlist, copyObject(tle));
+
+		i++;
+	}
+	if (i == 0)
+		appendStringInfoString(&selbuf, "NULL");
+
+	/*
+	 * Do pseudo-projection for an underlying scan on a foreign table, if a) the
+	 * relation is a base relation, and b) its targetlist contains whole-row
+	 * reference.
+	 */
+	if (outerrel->reloptkind == RELOPT_BASEREL)
+		sql_o = deparseProjectionSql(outerrel->reltargetlist, sql_o, 'l');
+	if (innerrel->reloptkind == RELOPT_BASEREL)
+		sql_i = deparseProjectionSql(innerrel->reltargetlist, sql_i, 'r');
+
+	/* Deparse column alias portion of subquery in FROM clause. */
+	initStringInfo(&abuf_o);
+	deparseColumnAliases(&abuf_o, outerrel->reltargetlist);
+	initStringInfo(&abuf_i);
+	deparseColumnAliases(&abuf_i, innerrel->reltargetlist);
+
+	/* Construct SELECT statement */
+	appendStringInfo(sql, "SELECT %s FROM", selbuf.data);
+	appendStringInfo(sql, " (%s) l (%s) %s JOIN (%s) r (%s)",
+					 sql_o, abuf_o.data, jointype_str, sql_i, abuf_i.data);
+	/* Append ON clause */
+	if (joinclauses)
+		appendConditions(sql, root, baserel,
+						 outerrel->reltargetlist, innerrel->reltargetlist,
+						 joinclauses,
+						 " ON ", NULL);
+	/* Append WHERE clause */
+	if (otherclauses)
+		appendConditions(sql, root, baserel,
+						 outerrel->reltargetlist, innerrel->reltargetlist,
+						 otherclauses,
+						 " WHERE ", NULL);
+}
+
+/*
  * deparse remote INSERT statement
  *
  * The statement text is appended to buf, and we also create an integer List
@@ -976,8 +1223,7 @@ deparseReturningList(StringInfo buf, PlannerInfo *root,
 	if (trig_after_row)
 	{
 		/* whole-row reference acquires all non-system columns */
-		attrs_used =
-			bms_make_singleton(0 - FirstLowInvalidHeapAttributeNumber);
+		attrs_used = bms_make_singleton(TO_RELATIVE(0));
 	}
 
 	if (returningList != NIL)
@@ -1261,6 +1507,8 @@ deparseExpr(Expr *node, deparse_expr_cxt *context)
 /*
  * Deparse given Var node into context->buf.
  *
+ * If the foreign relation is a join relation, this is invoked for join conditions.
+ *
  * If the Var belongs to the foreign relation, just print its remote name.
  * Otherwise, it's effectively a Param (and will in fact be a Param at
  * run time).  Handle it the same way we handle plain Params --- see
@@ -1271,39 +1519,46 @@ deparseVar(Var *node, deparse_expr_cxt *context)
 {
 	StringInfo	buf = context->buf;
 
-	if (node->varno == context->foreignrel->relid &&
-		node->varlevelsup == 0)
+	if (context->foreignrel->reloptkind == RELOPT_JOINREL)
 	{
-		/* Var belongs to foreign table */
-		deparseColumnRef(buf, node->varno, node->varattno, context->root);
+		deparseJoinVar(node, context);
 	}
 	else
 	{
-		/* Treat like a Param */
-		if (context->params_list)
+		if (node->varno == context->foreignrel->relid &&
+			node->varlevelsup == 0)
 		{
-			int			pindex = 0;
-			ListCell   *lc;
-
-			/* find its index in params_list */
-			foreach(lc, *context->params_list)
+			/* Var belongs to foreign table */
+			deparseColumnRef(buf, node->varno, node->varattno, context->root);
+		}
+		else
+		{
+			/* Treat like a Param */
+			if (context->params_list)
 			{
-				pindex++;
-				if (equal(node, (Node *) lfirst(lc)))
-					break;
+				int			pindex = 0;
+				ListCell   *lc;
+
+				/* find its index in params_list */
+				foreach(lc, *context->params_list)
+				{
+					pindex++;
+					if (equal(node, (Node *) lfirst(lc)))
+						break;
+				}
+				if (lc == NULL)
+				{
+					/* not in list, so add it */
+					pindex++;
+					*context->params_list = lappend(*context->params_list, node);
+				}
+
+				printRemoteParam(pindex, node->vartype, node->vartypmod, context);
 			}
-			if (lc == NULL)
+			else
 			{
-				/* not in list, so add it */
-				pindex++;
-				*context->params_list = lappend(*context->params_list, node);
+				printRemotePlaceholder(node->vartype, node->vartypmod, context);
 			}
-
-			printRemoteParam(pindex, node->vartype, node->vartypmod, context);
-		}
-		else
-		{
-			printRemotePlaceholder(node->vartype, node->vartypmod, context);
 		}
 	}
 }
diff --git a/contrib/postgres_fdw/expected/postgres_fdw.out b/contrib/postgres_fdw/expected/postgres_fdw.out
index 783cb41..afb3f8f 100644
--- a/contrib/postgres_fdw/expected/postgres_fdw.out
+++ b/contrib/postgres_fdw/expected/postgres_fdw.out
@@ -9,11 +9,16 @@ DO $d$
             OPTIONS (dbname '$$||current_database()||$$',
                      port '$$||current_setting('port')||$$'
             )$$;
+        EXECUTE $$CREATE SERVER loopback2 FOREIGN DATA WRAPPER postgres_fdw
+            OPTIONS (dbname '$$||current_database()||$$',
+                     port '$$||current_setting('port')||$$'
+            )$$;
     END;
 $d$;
 CREATE USER MAPPING FOR public SERVER testserver1
 	OPTIONS (user 'value', password 'value');
 CREATE USER MAPPING FOR CURRENT_USER SERVER loopback;
+CREATE USER MAPPING FOR CURRENT_USER SERVER loopback2;
 -- ===================================================================
 -- create objects used through FDW loopback server
 -- ===================================================================
@@ -35,6 +40,18 @@ CREATE TABLE "S 1"."T 2" (
 	c2 text,
 	CONSTRAINT t2_pkey PRIMARY KEY (c1)
 );
+CREATE TABLE "S 1"."T 3" (
+	c1 int NOT NULL,
+	c2 int NOT NULL,
+	c3 text,
+	CONSTRAINT t3_pkey PRIMARY KEY (c1)
+);
+CREATE TABLE "S 1"."T 4" (
+	c1 int NOT NULL,
+	c2 int NOT NULL,
+	c4 text,
+	CONSTRAINT t4_pkey PRIMARY KEY (c1)
+);
 INSERT INTO "S 1"."T 1"
 	SELECT id,
 	       id % 10,
@@ -49,8 +66,22 @@ INSERT INTO "S 1"."T 2"
 	SELECT id,
 	       'AAA' || to_char(id, 'FM000')
 	FROM generate_series(1, 100) id;
+INSERT INTO "S 1"."T 3"
+	SELECT id,
+	       id + 1,
+	       'AAA' || to_char(id, 'FM000')
+	FROM generate_series(1, 100) id;
+DELETE FROM "S 1"."T 3" WHERE c1 % 2 != 0;	-- delete for outer join tests
+INSERT INTO "S 1"."T 4"
+	SELECT id,
+	       id + 1,
+	       'AAA' || to_char(id, 'FM000')
+	FROM generate_series(1, 100) id;
+DELETE FROM "S 1"."T 4" WHERE c1 % 3 != 0;	-- delete for outer join tests
 ANALYZE "S 1"."T 1";
 ANALYZE "S 1"."T 2";
+ANALYZE "S 1"."T 3";
+ANALYZE "S 1"."T 4";
 -- ===================================================================
 -- create foreign tables
 -- ===================================================================
@@ -78,6 +109,26 @@ CREATE FOREIGN TABLE ft2 (
 	c8 user_enum
 ) SERVER loopback;
 ALTER FOREIGN TABLE ft2 DROP COLUMN cx;
+CREATE FOREIGN TABLE ft4 (
+	c1 int NOT NULL,
+	c2 int NOT NULL,
+	c3 text
+) SERVER loopback OPTIONS (schema_name 'S 1', table_name 'T 3');
+CREATE FOREIGN TABLE ft5 (
+	c1 int NOT NULL,
+	c2 int NOT NULL,
+	c3 text
+) SERVER loopback OPTIONS (schema_name 'S 1', table_name 'T 4');
+CREATE FOREIGN TABLE ft6 (
+	c1 int NOT NULL,
+	c2 int NOT NULL,
+	c3 text
+) SERVER loopback2 OPTIONS (schema_name 'S 1', table_name 'T 4');
+CREATE USER view_owner;
+GRANT ALL ON ft5 TO view_owner;
+CREATE VIEW v_ft5 AS SELECT * FROM ft5;
+ALTER VIEW v_ft5 OWNER TO view_owner;
+CREATE USER MAPPING FOR view_owner SERVER loopback;
 -- ===================================================================
 -- tests for validator
 -- ===================================================================
@@ -119,12 +170,15 @@ ALTER FOREIGN TABLE ft2 OPTIONS (schema_name 'S 1', table_name 'T 1');
 ALTER FOREIGN TABLE ft1 ALTER COLUMN c1 OPTIONS (column_name 'C 1');
 ALTER FOREIGN TABLE ft2 ALTER COLUMN c1 OPTIONS (column_name 'C 1');
 \det+
-                             List of foreign tables
- Schema | Table |  Server  |              FDW Options              | Description 
---------+-------+----------+---------------------------------------+-------------
- public | ft1   | loopback | (schema_name 'S 1', table_name 'T 1') | 
- public | ft2   | loopback | (schema_name 'S 1', table_name 'T 1') | 
-(2 rows)
+                              List of foreign tables
+ Schema | Table |  Server   |              FDW Options              | Description 
+--------+-------+-----------+---------------------------------------+-------------
+ public | ft1   | loopback  | (schema_name 'S 1', table_name 'T 1') | 
+ public | ft2   | loopback  | (schema_name 'S 1', table_name 'T 1') | 
+ public | ft4   | loopback  | (schema_name 'S 1', table_name 'T 3') | 
+ public | ft5   | loopback  | (schema_name 'S 1', table_name 'T 4') | 
+ public | ft6   | loopback2 | (schema_name 'S 1', table_name 'T 4') | 
+(5 rows)
 
 -- Now we should be able to run ANALYZE.
 -- To exercise multiple code paths, we use local stats on ft1
@@ -160,8 +214,8 @@ SELECT * FROM ft1 ORDER BY c3, c1 OFFSET 100 LIMIT 10;
 (10 rows)
 
 EXPLAIN (VERBOSE, COSTS false) SELECT * FROM ft1 t1 ORDER BY t1.c3, t1.c1 OFFSET 100 LIMIT 10;
-                                     QUERY PLAN                                      
--------------------------------------------------------------------------------------
+                                                     QUERY PLAN                                                      
+---------------------------------------------------------------------------------------------------------------------
  Limit
    Output: c1, c2, c3, c4, c5, c6, c7, c8
    ->  Sort
@@ -169,7 +223,7 @@ EXPLAIN (VERBOSE, COSTS false) SELECT * FROM ft1 t1 ORDER BY t1.c3, t1.c1 OFFSET
          Sort Key: t1.c3, t1.c1
          ->  Foreign Scan on public.ft1 t1
                Output: c1, c2, c3, c4, c5, c6, c7, c8
-               Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1"
+               Remote SQL: SELECT "C 1" a10, c2 a11, c3 a12, c4 a13, c5 a14, c6 a15, c7 a16, c8 a17 FROM "S 1"."T 1"
 (8 rows)
 
 SELECT * FROM ft1 t1 ORDER BY t1.c3, t1.c1 OFFSET 100 LIMIT 10;
@@ -189,8 +243,8 @@ SELECT * FROM ft1 t1 ORDER BY t1.c3, t1.c1 OFFSET 100 LIMIT 10;
 
 -- whole-row reference
 EXPLAIN (VERBOSE, COSTS false) SELECT t1 FROM ft1 t1 ORDER BY t1.c3, t1.c1 OFFSET 100 LIMIT 10;
-                                     QUERY PLAN                                      
--------------------------------------------------------------------------------------
+                                                     QUERY PLAN                                                      
+---------------------------------------------------------------------------------------------------------------------
  Limit
    Output: t1.*, c3, c1
    ->  Sort
@@ -198,7 +252,7 @@ EXPLAIN (VERBOSE, COSTS false) SELECT t1 FROM ft1 t1 ORDER BY t1.c3, t1.c1 OFFSE
          Sort Key: t1.c3, t1.c1
          ->  Foreign Scan on public.ft1 t1
                Output: t1.*, c3, c1
-               Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1"
+               Remote SQL: SELECT "C 1" a10, c2 a11, c3 a12, c4 a13, c5 a14, c6 a15, c7 a16, c8 a17 FROM "S 1"."T 1"
 (8 rows)
 
 SELECT t1 FROM ft1 t1 ORDER BY t1.c3, t1.c1 OFFSET 100 LIMIT 10;
@@ -224,11 +278,11 @@ SELECT * FROM ft1 WHERE false;
 
 -- with WHERE clause
 EXPLAIN (VERBOSE, COSTS false) SELECT * FROM ft1 t1 WHERE t1.c1 = 101 AND t1.c6 = '1' AND t1.c7 >= '1';
-                                                                   QUERY PLAN                                                                   
-------------------------------------------------------------------------------------------------------------------------------------------------
+                                                                                   QUERY PLAN                                                                                   
+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
  Foreign Scan on public.ft1 t1
    Output: c1, c2, c3, c4, c5, c6, c7, c8
-   Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" WHERE ((c7 >= '1'::bpchar)) AND (("C 1" = 101)) AND ((c6 = '1'::text))
+   Remote SQL: SELECT "C 1" a10, c2 a11, c3 a12, c4 a13, c5 a14, c6 a15, c7 a16, c8 a17 FROM "S 1"."T 1" WHERE ((c7 >= '1'::bpchar)) AND (("C 1" = 101)) AND ((c6 = '1'::text))
 (3 rows)
 
 SELECT * FROM ft1 t1 WHERE t1.c1 = 101 AND t1.c6 = '1' AND t1.c7 >= '1';
@@ -239,13 +293,13 @@ SELECT * FROM ft1 t1 WHERE t1.c1 = 101 AND t1.c6 = '1' AND t1.c7 >= '1';
 
 -- with FOR UPDATE/SHARE
 EXPLAIN (VERBOSE, COSTS false) SELECT * FROM ft1 t1 WHERE c1 = 101 FOR UPDATE;
-                                                   QUERY PLAN                                                   
-----------------------------------------------------------------------------------------------------------------
+                                                                   QUERY PLAN                                                                   
+------------------------------------------------------------------------------------------------------------------------------------------------
  LockRows
    Output: c1, c2, c3, c4, c5, c6, c7, c8, t1.*
    ->  Foreign Scan on public.ft1 t1
          Output: c1, c2, c3, c4, c5, c6, c7, c8, t1.*
-         Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" WHERE (("C 1" = 101)) FOR UPDATE
+         Remote SQL: SELECT "C 1" a10, c2 a11, c3 a12, c4 a13, c5 a14, c6 a15, c7 a16, c8 a17 FROM "S 1"."T 1" WHERE (("C 1" = 101)) FOR UPDATE
 (5 rows)
 
 SELECT * FROM ft1 t1 WHERE c1 = 101 FOR UPDATE;
@@ -255,13 +309,13 @@ SELECT * FROM ft1 t1 WHERE c1 = 101 FOR UPDATE;
 (1 row)
 
 EXPLAIN (VERBOSE, COSTS false) SELECT * FROM ft1 t1 WHERE c1 = 102 FOR SHARE;
-                                                  QUERY PLAN                                                   
----------------------------------------------------------------------------------------------------------------
+                                                                  QUERY PLAN                                                                   
+-----------------------------------------------------------------------------------------------------------------------------------------------
  LockRows
    Output: c1, c2, c3, c4, c5, c6, c7, c8, t1.*
    ->  Foreign Scan on public.ft1 t1
          Output: c1, c2, c3, c4, c5, c6, c7, c8, t1.*
-         Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" WHERE (("C 1" = 102)) FOR SHARE
+         Remote SQL: SELECT "C 1" a10, c2 a11, c3 a12, c4 a13, c5 a14, c6 a15, c7 a16, c8 a17 FROM "S 1"."T 1" WHERE (("C 1" = 102)) FOR SHARE
 (5 rows)
 
 SELECT * FROM ft1 t1 WHERE c1 = 102 FOR SHARE;
@@ -277,22 +331,6 @@ SELECT COUNT(*) FROM ft1 t1;
   1000
 (1 row)
 
--- join two tables
-SELECT t1.c1 FROM ft1 t1 JOIN ft2 t2 ON (t1.c1 = t2.c1) ORDER BY t1.c3, t1.c1 OFFSET 100 LIMIT 10;
- c1  
------
- 101
- 102
- 103
- 104
- 105
- 106
- 107
- 108
- 109
- 110
-(10 rows)
-
 -- subquery
 SELECT * FROM ft1 t1 WHERE t1.c3 IN (SELECT c3 FROM ft2 t2 WHERE c1 <= 10) ORDER BY c1;
  c1 | c2 |  c3   |              c4              |            c5            | c6 |     c7     | c8  
@@ -353,153 +391,148 @@ CREATE OPERATOR === (
     NEGATOR = !==
 );
 EXPLAIN (VERBOSE, COSTS false) SELECT * FROM ft1 t1 WHERE t1.c1 = postgres_fdw_abs(t1.c2);
-                               QUERY PLAN                                
--------------------------------------------------------------------------
+                                               QUERY PLAN                                                
+---------------------------------------------------------------------------------------------------------
  Foreign Scan on public.ft1 t1
    Output: c1, c2, c3, c4, c5, c6, c7, c8
    Filter: (t1.c1 = postgres_fdw_abs(t1.c2))
-   Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1"
+   Remote SQL: SELECT "C 1" a10, c2 a11, c3 a12, c4 a13, c5 a14, c6 a15, c7 a16, c8 a17 FROM "S 1"."T 1"
 (4 rows)
 
 EXPLAIN (VERBOSE, COSTS false) SELECT * FROM ft1 t1 WHERE t1.c1 === t1.c2;
-                               QUERY PLAN                                
--------------------------------------------------------------------------
+                                               QUERY PLAN                                                
+---------------------------------------------------------------------------------------------------------
  Foreign Scan on public.ft1 t1
    Output: c1, c2, c3, c4, c5, c6, c7, c8
    Filter: (t1.c1 === t1.c2)
-   Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1"
+   Remote SQL: SELECT "C 1" a10, c2 a11, c3 a12, c4 a13, c5 a14, c6 a15, c7 a16, c8 a17 FROM "S 1"."T 1"
 (4 rows)
 
 EXPLAIN (VERBOSE, COSTS false) SELECT * FROM ft1 t1 WHERE t1.c1 = abs(t1.c2);
-                                            QUERY PLAN                                             
----------------------------------------------------------------------------------------------------
+                                                            QUERY PLAN                                                             
+-----------------------------------------------------------------------------------------------------------------------------------
  Foreign Scan on public.ft1 t1
    Output: c1, c2, c3, c4, c5, c6, c7, c8
-   Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" WHERE (("C 1" = abs(c2)))
+   Remote SQL: SELECT "C 1" a10, c2 a11, c3 a12, c4 a13, c5 a14, c6 a15, c7 a16, c8 a17 FROM "S 1"."T 1" WHERE (("C 1" = abs(c2)))
 (3 rows)
 
 EXPLAIN (VERBOSE, COSTS false) SELECT * FROM ft1 t1 WHERE t1.c1 = t1.c2;
-                                          QUERY PLAN                                          
-----------------------------------------------------------------------------------------------
+                                                          QUERY PLAN                                                          
+------------------------------------------------------------------------------------------------------------------------------
  Foreign Scan on public.ft1 t1
    Output: c1, c2, c3, c4, c5, c6, c7, c8
-   Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" WHERE (("C 1" = c2))
+   Remote SQL: SELECT "C 1" a10, c2 a11, c3 a12, c4 a13, c5 a14, c6 a15, c7 a16, c8 a17 FROM "S 1"."T 1" WHERE (("C 1" = c2))
 (3 rows)
 
 -- ===================================================================
 -- WHERE with remotely-executable conditions
 -- ===================================================================
 EXPLAIN (VERBOSE, COSTS false) SELECT * FROM ft1 t1 WHERE t1.c1 = 1;         -- Var, OpExpr(b), Const
-                                         QUERY PLAN                                          
----------------------------------------------------------------------------------------------
+                                                         QUERY PLAN                                                          
+-----------------------------------------------------------------------------------------------------------------------------
  Foreign Scan on public.ft1 t1
    Output: c1, c2, c3, c4, c5, c6, c7, c8
-   Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" WHERE (("C 1" = 1))
+   Remote SQL: SELECT "C 1" a10, c2 a11, c3 a12, c4 a13, c5 a14, c6 a15, c7 a16, c8 a17 FROM "S 1"."T 1" WHERE (("C 1" = 1))
 (3 rows)
 
 EXPLAIN (VERBOSE, COSTS false) SELECT * FROM ft1 t1 WHERE t1.c1 = 100 AND t1.c2 = 0; -- BoolExpr
-                                                  QUERY PLAN                                                  
---------------------------------------------------------------------------------------------------------------
+                                                                  QUERY PLAN                                                                  
+----------------------------------------------------------------------------------------------------------------------------------------------
  Foreign Scan on public.ft1 t1
    Output: c1, c2, c3, c4, c5, c6, c7, c8
-   Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" WHERE (("C 1" = 100)) AND ((c2 = 0))
+   Remote SQL: SELECT "C 1" a10, c2 a11, c3 a12, c4 a13, c5 a14, c6 a15, c7 a16, c8 a17 FROM "S 1"."T 1" WHERE (("C 1" = 100)) AND ((c2 = 0))
 (3 rows)
 
 EXPLAIN (VERBOSE, COSTS false) SELECT * FROM ft1 t1 WHERE c1 IS NULL;        -- NullTest
-                                           QUERY PLAN                                            
--------------------------------------------------------------------------------------------------
+                                                           QUERY PLAN                                                            
+---------------------------------------------------------------------------------------------------------------------------------
  Foreign Scan on public.ft1 t1
    Output: c1, c2, c3, c4, c5, c6, c7, c8
-   Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" WHERE (("C 1" IS NULL))
+   Remote SQL: SELECT "C 1" a10, c2 a11, c3 a12, c4 a13, c5 a14, c6 a15, c7 a16, c8 a17 FROM "S 1"."T 1" WHERE (("C 1" IS NULL))
 (3 rows)
 
 EXPLAIN (VERBOSE, COSTS false) SELECT * FROM ft1 t1 WHERE c1 IS NOT NULL;    -- NullTest
-                                             QUERY PLAN                                              
------------------------------------------------------------------------------------------------------
+                                                             QUERY PLAN                                                              
+-------------------------------------------------------------------------------------------------------------------------------------
  Foreign Scan on public.ft1 t1
    Output: c1, c2, c3, c4, c5, c6, c7, c8
-   Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" WHERE (("C 1" IS NOT NULL))
+   Remote SQL: SELECT "C 1" a10, c2 a11, c3 a12, c4 a13, c5 a14, c6 a15, c7 a16, c8 a17 FROM "S 1"."T 1" WHERE (("C 1" IS NOT NULL))
 (3 rows)
 
 EXPLAIN (VERBOSE, COSTS false) SELECT * FROM ft1 t1 WHERE round(abs(c1), 0) = 1; -- FuncExpr
-                                                     QUERY PLAN                                                      
----------------------------------------------------------------------------------------------------------------------
+                                                                     QUERY PLAN                                                                      
+-----------------------------------------------------------------------------------------------------------------------------------------------------
  Foreign Scan on public.ft1 t1
    Output: c1, c2, c3, c4, c5, c6, c7, c8
-   Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" WHERE ((round(abs("C 1"), 0) = 1::numeric))
+   Remote SQL: SELECT "C 1" a10, c2 a11, c3 a12, c4 a13, c5 a14, c6 a15, c7 a16, c8 a17 FROM "S 1"."T 1" WHERE ((round(abs("C 1"), 0) = 1::numeric))
 (3 rows)
 
 EXPLAIN (VERBOSE, COSTS false) SELECT * FROM ft1 t1 WHERE c1 = -c1;          -- OpExpr(l)
-                                             QUERY PLAN                                              
------------------------------------------------------------------------------------------------------
+                                                             QUERY PLAN                                                              
+-------------------------------------------------------------------------------------------------------------------------------------
  Foreign Scan on public.ft1 t1
    Output: c1, c2, c3, c4, c5, c6, c7, c8
-   Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" WHERE (("C 1" = (- "C 1")))
+   Remote SQL: SELECT "C 1" a10, c2 a11, c3 a12, c4 a13, c5 a14, c6 a15, c7 a16, c8 a17 FROM "S 1"."T 1" WHERE (("C 1" = (- "C 1")))
 (3 rows)
 
 EXPLAIN (VERBOSE, COSTS false) SELECT * FROM ft1 t1 WHERE 1 = c1!;           -- OpExpr(r)
-                                                QUERY PLAN                                                
-----------------------------------------------------------------------------------------------------------
+                                                                QUERY PLAN                                                                
+------------------------------------------------------------------------------------------------------------------------------------------
  Foreign Scan on public.ft1 t1
    Output: c1, c2, c3, c4, c5, c6, c7, c8
-   Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" WHERE ((1::numeric = ("C 1" !)))
+   Remote SQL: SELECT "C 1" a10, c2 a11, c3 a12, c4 a13, c5 a14, c6 a15, c7 a16, c8 a17 FROM "S 1"."T 1" WHERE ((1::numeric = ("C 1" !)))
 (3 rows)
 
 EXPLAIN (VERBOSE, COSTS false) SELECT * FROM ft1 t1 WHERE (c1 IS NOT NULL) IS DISTINCT FROM (c1 IS NOT NULL); -- DistinctExpr
-                                                                 QUERY PLAN                                                                 
---------------------------------------------------------------------------------------------------------------------------------------------
+                                                                                 QUERY PLAN                                                                                 
+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------
  Foreign Scan on public.ft1 t1
    Output: c1, c2, c3, c4, c5, c6, c7, c8
-   Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" WHERE ((("C 1" IS NOT NULL) IS DISTINCT FROM ("C 1" IS NOT NULL)))
+   Remote SQL: SELECT "C 1" a10, c2 a11, c3 a12, c4 a13, c5 a14, c6 a15, c7 a16, c8 a17 FROM "S 1"."T 1" WHERE ((("C 1" IS NOT NULL) IS DISTINCT FROM ("C 1" IS NOT NULL)))
 (3 rows)
 
 EXPLAIN (VERBOSE, COSTS false) SELECT * FROM ft1 t1 WHERE c1 = ANY(ARRAY[c2, 1, c1 + 0]); -- ScalarArrayOpExpr
-                                                        QUERY PLAN                                                         
----------------------------------------------------------------------------------------------------------------------------
+                                                                        QUERY PLAN                                                                         
+-----------------------------------------------------------------------------------------------------------------------------------------------------------
  Foreign Scan on public.ft1 t1
    Output: c1, c2, c3, c4, c5, c6, c7, c8
-   Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" WHERE (("C 1" = ANY (ARRAY[c2, 1, ("C 1" + 0)])))
+   Remote SQL: SELECT "C 1" a10, c2 a11, c3 a12, c4 a13, c5 a14, c6 a15, c7 a16, c8 a17 FROM "S 1"."T 1" WHERE (("C 1" = ANY (ARRAY[c2, 1, ("C 1" + 0)])))
 (3 rows)
 
 EXPLAIN (VERBOSE, COSTS false) SELECT * FROM ft1 t1 WHERE c1 = (ARRAY[c1,c2,3])[1]; -- ArrayRef
-                                                      QUERY PLAN                                                      
-----------------------------------------------------------------------------------------------------------------------
+                                                                      QUERY PLAN                                                                      
+------------------------------------------------------------------------------------------------------------------------------------------------------
  Foreign Scan on public.ft1 t1
    Output: c1, c2, c3, c4, c5, c6, c7, c8
-   Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" WHERE (("C 1" = ((ARRAY["C 1", c2, 3])[1])))
+   Remote SQL: SELECT "C 1" a10, c2 a11, c3 a12, c4 a13, c5 a14, c6 a15, c7 a16, c8 a17 FROM "S 1"."T 1" WHERE (("C 1" = ((ARRAY["C 1", c2, 3])[1])))
 (3 rows)
 
 EXPLAIN (VERBOSE, COSTS false) SELECT * FROM ft1 t1 WHERE c6 = E'foo''s\\bar';  -- check special chars
-                                                 QUERY PLAN                                                  
--------------------------------------------------------------------------------------------------------------
+                                                                 QUERY PLAN                                                                  
+---------------------------------------------------------------------------------------------------------------------------------------------
  Foreign Scan on public.ft1 t1
    Output: c1, c2, c3, c4, c5, c6, c7, c8
-   Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" WHERE ((c6 = E'foo''s\\bar'::text))
+   Remote SQL: SELECT "C 1" a10, c2 a11, c3 a12, c4 a13, c5 a14, c6 a15, c7 a16, c8 a17 FROM "S 1"."T 1" WHERE ((c6 = E'foo''s\\bar'::text))
 (3 rows)
 
 EXPLAIN (VERBOSE, COSTS false) SELECT * FROM ft1 t1 WHERE c8 = 'foo';  -- can't be sent to remote
-                               QUERY PLAN                                
--------------------------------------------------------------------------
+                                               QUERY PLAN                                                
+---------------------------------------------------------------------------------------------------------
  Foreign Scan on public.ft1 t1
    Output: c1, c2, c3, c4, c5, c6, c7, c8
    Filter: (t1.c8 = 'foo'::user_enum)
-   Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1"
+   Remote SQL: SELECT "C 1" a10, c2 a11, c3 a12, c4 a13, c5 a14, c6 a15, c7 a16, c8 a17 FROM "S 1"."T 1"
 (4 rows)
 
 -- parameterized remote path
 EXPLAIN (VERBOSE, COSTS false)
   SELECT * FROM ft2 a, ft2 b WHERE a.c1 = 47 AND b.c1 = a.c2;
-                                                 QUERY PLAN                                                  
--------------------------------------------------------------------------------------------------------------
- Nested Loop
+                                                                                                                                                                                                               QUERY PLAN                                                                                                                                                                                                                
+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+ Foreign Scan
    Output: a.c1, a.c2, a.c3, a.c4, a.c5, a.c6, a.c7, a.c8, b.c1, b.c2, b.c3, b.c4, b.c5, b.c6, b.c7, b.c8
-   ->  Foreign Scan on public.ft2 a
-         Output: a.c1, a.c2, a.c3, a.c4, a.c5, a.c6, a.c7, a.c8
-         Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" WHERE (("C 1" = 47))
-   ->  Foreign Scan on public.ft2 b
-         Output: b.c1, b.c2, b.c3, b.c4, b.c5, b.c6, b.c7, b.c8
-         Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" WHERE (($1::integer = "C 1"))
-(8 rows)
+   Remote SQL: SELECT l.a1, l.a2, l.a3, l.a4, l.a5, l.a6, l.a7, l.a8, r.a1, r.a2, r.a3, r.a4, r.a5, r.a6, r.a7, r.a8 FROM (SELECT "C 1" a9, c2 a10, c3 a12, c4 a13, c5 a14, c6 a15, c7 a16, c8 a17 FROM "S 1"."T 1" WHERE (("C 1" = 47))) l (a1, a2, a3, a4, a5, a6, a7, a8) INNER JOIN (SELECT "C 1" a9, c2 a10, c3 a12, c4 a13, c5 a14, c6 a15, c7 a16, c8 a17 FROM "S 1"."T 1") r (a1, a2, a3, a4, a5, a6, a7, a8) ON ((l.a2 = r.a1))
+(3 rows)
 
 SELECT * FROM ft2 a, ft2 b WHERE a.c1 = 47 AND b.c1 = a.c2;
  c1 | c2 |  c3   |              c4              |            c5            | c6 |     c7     | c8  | c1 | c2 |  c3   |              c4              |            c5            | c6 |     c7     | c8  
@@ -511,18 +544,18 @@ SELECT * FROM ft2 a, ft2 b WHERE a.c1 = 47 AND b.c1 = a.c2;
 EXPLAIN (VERBOSE, COSTS false)
   SELECT * FROM ft2 a, ft2 b
   WHERE a.c2 = 6 AND b.c1 = a.c1 AND a.c8 = 'foo' AND b.c7 = upper(a.c7);
-                                                 QUERY PLAN                                                  
--------------------------------------------------------------------------------------------------------------
+                                                                 QUERY PLAN                                                                 
+--------------------------------------------------------------------------------------------------------------------------------------------
  Nested Loop
    Output: a.c1, a.c2, a.c3, a.c4, a.c5, a.c6, a.c7, a.c8, b.c1, b.c2, b.c3, b.c4, b.c5, b.c6, b.c7, b.c8
    ->  Foreign Scan on public.ft2 a
          Output: a.c1, a.c2, a.c3, a.c4, a.c5, a.c6, a.c7, a.c8
          Filter: (a.c8 = 'foo'::user_enum)
-         Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" WHERE ((c2 = 6))
+         Remote SQL: SELECT "C 1" a9, c2 a10, c3 a12, c4 a13, c5 a14, c6 a15, c7 a16, c8 a17 FROM "S 1"."T 1" WHERE ((c2 = 6))
    ->  Foreign Scan on public.ft2 b
          Output: b.c1, b.c2, b.c3, b.c4, b.c5, b.c6, b.c7, b.c8
          Filter: (upper((a.c7)::text) = (b.c7)::text)
-         Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" WHERE (($1::integer = "C 1"))
+         Remote SQL: SELECT "C 1" a9, c2 a10, c3 a12, c4 a13, c5 a14, c6 a15, c7 a16, c8 a17 FROM "S 1"."T 1" WHERE (($1::integer = "C 1"))
 (10 rows)
 
 SELECT * FROM ft2 a, ft2 b
@@ -651,21 +684,557 @@ SELECT * FROM ft2 WHERE c1 = ANY (ARRAY(SELECT c1 FROM ft1 WHERE c1 < 5));
 (4 rows)
 
 -- ===================================================================
+-- JOIN queries
+-- ===================================================================
+-- join two tables
+EXPLAIN (COSTS false, VERBOSE)
+SELECT t1.c1, t2.c1 FROM ft1 t1 JOIN ft2 t2 ON (t1.c1 = t2.c1) ORDER BY t1.c3, t1.c1 OFFSET 100 LIMIT 10;
+                                                                                      QUERY PLAN                                                                                       
+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+ Limit
+   Output: t1.c1, t2.c1, t1.c3
+   ->  Sort
+         Output: t1.c1, t2.c1, t1.c3
+         Sort Key: t1.c3, t1.c1
+         ->  Foreign Scan
+               Output: t1.c1, t2.c1, t1.c3
+               Remote SQL: SELECT l.a1, l.a2, r.a1 FROM (SELECT "C 1" a10, c3 a12 FROM "S 1"."T 1") l (a1, a2) INNER JOIN (SELECT "C 1" a9 FROM "S 1"."T 1") r (a1) ON ((l.a1 = r.a1))
+(8 rows)
+
+SELECT t1.c1, t2.c1 FROM ft1 t1 JOIN ft2 t2 ON (t1.c1 = t2.c1) ORDER BY t1.c3, t1.c1 OFFSET 100 LIMIT 10;
+ c1  | c1  
+-----+-----
+ 101 | 101
+ 102 | 102
+ 103 | 103
+ 104 | 104
+ 105 | 105
+ 106 | 106
+ 107 | 107
+ 108 | 108
+ 109 | 109
+ 110 | 110
+(10 rows)
+
+-- join three tables
+EXPLAIN (COSTS false, VERBOSE)
+SELECT t1.c1, t2.c1, t3.c1 FROM ft1 t1 JOIN ft2 t2 ON (t1.c1 = t2.c1) JOIN ft4 t3 ON (t3.c1 = t1.c1) ORDER BY t1.c3, t1.c1 OFFSET 10 LIMIT 10;
+                                                                                                                                                   QUERY PLAN                                                                                                                                                   
+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+ Limit
+   Output: t1.c1, t2.c1, t3.c1, t1.c3
+   ->  Sort
+         Output: t1.c1, t2.c1, t3.c1, t1.c3
+         Sort Key: t1.c3, t1.c1
+         ->  Foreign Scan
+               Output: t1.c1, t2.c1, t3.c1, t1.c3
+               Remote SQL: SELECT l.a1, l.a2, l.a3, r.a1 FROM (SELECT l.a1, l.a2, r.a1 FROM (SELECT "C 1" a10, c3 a12 FROM "S 1"."T 1") l (a1, a2) INNER JOIN (SELECT "C 1" a9 FROM "S 1"."T 1") r (a1) ON ((l.a1 = r.a1))) l (a1, a2, a3) INNER JOIN (SELECT c1 a9 FROM "S 1"."T 3") r (a1) ON ((l.a1 = r.a1))
+(8 rows)
+
+SELECT t1.c1, t2.c1, t3.c1 FROM ft1 t1 JOIN ft2 t2 ON (t1.c1 = t2.c1) JOIN ft4 t3 ON (t3.c1 = t1.c1) ORDER BY t1.c3, t1.c1 OFFSET 10 LIMIT 10;
+ c1 | c1 | c1 
+----+----+----
+ 22 | 22 | 22
+ 24 | 24 | 24
+ 26 | 26 | 26
+ 28 | 28 | 28
+ 30 | 30 | 30
+ 32 | 32 | 32
+ 34 | 34 | 34
+ 36 | 36 | 36
+ 38 | 38 | 38
+ 40 | 40 | 40
+(10 rows)
+
+-- left outer join
+EXPLAIN (COSTS false, VERBOSE)
+SELECT t1.c1, t2.c1 FROM ft4 t1 LEFT JOIN ft5 t2 ON (t1.c1 = t2.c1) ORDER BY t1.c1, t2.c1 OFFSET 10 LIMIT 10;
+                                                                         QUERY PLAN                                                                          
+-------------------------------------------------------------------------------------------------------------------------------------------------------------
+ Limit
+   Output: t1.c1, t2.c1
+   ->  Sort
+         Output: t1.c1, t2.c1
+         Sort Key: t1.c1, t2.c1
+         ->  Foreign Scan
+               Output: t1.c1, t2.c1
+               Remote SQL: SELECT l.a1, r.a1 FROM (SELECT c1 a9 FROM "S 1"."T 3") l (a1) LEFT JOIN (SELECT c1 a9 FROM "S 1"."T 4") r (a1) ON ((l.a1 = r.a1))
+(8 rows)
+
+SELECT t1.c1, t2.c1 FROM ft4 t1 LEFT JOIN ft5 t2 ON (t1.c1 = t2.c1) ORDER BY t1.c1, t2.c1 OFFSET 10 LIMIT 10;
+ c1 | c1 
+----+----
+ 22 |   
+ 24 | 24
+ 26 |   
+ 28 |   
+ 30 | 30
+ 32 |   
+ 34 |   
+ 36 | 36
+ 38 |   
+ 40 |   
+(10 rows)
+
+-- right outer join
+SET enable_mergejoin = off; -- planner chooses MergeJoin even though it has higher costs, so disable it for testing.
+EXPLAIN (COSTS false, VERBOSE)
+SELECT t1.c1, t2.c1 FROM ft4 t1 RIGHT JOIN ft5 t2 ON (t1.c1 = t2.c1) ORDER BY t2.c1, t2.c1 OFFSET 10 LIMIT 10;
+                                                                         QUERY PLAN                                                                          
+-------------------------------------------------------------------------------------------------------------------------------------------------------------
+ Limit
+   Output: t1.c1, t2.c1
+   ->  Sort
+         Output: t1.c1, t2.c1
+         Sort Key: t2.c1
+         ->  Foreign Scan
+               Output: t1.c1, t2.c1
+               Remote SQL: SELECT l.a1, r.a1 FROM (SELECT c1 a9 FROM "S 1"."T 4") l (a1) LEFT JOIN (SELECT c1 a9 FROM "S 1"."T 3") r (a1) ON ((r.a1 = l.a1))
+(8 rows)
+
+SELECT t1.c1, t2.c1 FROM ft4 t1 RIGHT JOIN ft5 t2 ON (t1.c1 = t2.c1) ORDER BY t2.c1, t2.c1 OFFSET 10 LIMIT 10;
+ c1 | c1 
+----+----
+    | 33
+ 36 | 36
+    | 39
+ 42 | 42
+    | 45
+ 48 | 48
+    | 51
+ 54 | 54
+    | 57
+ 60 | 60
+(10 rows)
+
+SET enable_mergejoin = on;
+-- full outer join
+EXPLAIN (COSTS false, VERBOSE)
+SELECT t1.c1, t2.c1 FROM ft4 t1 FULL JOIN ft5 t2 ON (t1.c1 = t2.c1) ORDER BY t1.c1, t2.c1 OFFSET 45 LIMIT 10;
+                                                                         QUERY PLAN                                                                          
+-------------------------------------------------------------------------------------------------------------------------------------------------------------
+ Limit
+   Output: t1.c1, t2.c1
+   ->  Sort
+         Output: t1.c1, t2.c1
+         Sort Key: t1.c1, t2.c1
+         ->  Foreign Scan
+               Output: t1.c1, t2.c1
+               Remote SQL: SELECT l.a1, r.a1 FROM (SELECT c1 a9 FROM "S 1"."T 3") l (a1) FULL JOIN (SELECT c1 a9 FROM "S 1"."T 4") r (a1) ON ((l.a1 = r.a1))
+(8 rows)
+
+SELECT t1.c1, t2.c1 FROM ft4 t1 FULL JOIN ft5 t2 ON (t1.c1 = t2.c1) ORDER BY t1.c1, t2.c1 OFFSET 45 LIMIT 10;
+ c1  | c1 
+-----+----
+  92 |   
+  94 |   
+  96 | 96
+  98 |   
+ 100 |   
+     |  3
+     |  9
+     | 15
+     | 21
+     | 27
+(10 rows)
+
+-- full outer join + WHERE clause, only matched rows
+EXPLAIN (COSTS false, VERBOSE)
+SELECT t1.c1, t2.c1 FROM ft4 t1 FULL JOIN ft5 t2 ON (t1.c1 = t2.c1) WHERE (t1.c1 = t2.c1 OR t1.c1 IS NULL) ORDER BY t1.c1, t2.c1 OFFSET 10 LIMIT 10;
+                                                                                              QUERY PLAN                                                                                               
+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+ Limit
+   Output: t1.c1, t2.c1
+   ->  Sort
+         Output: t1.c1, t2.c1
+         Sort Key: t1.c1, t2.c1
+         ->  Foreign Scan
+               Output: t1.c1, t2.c1
+               Remote SQL: SELECT l.a1, r.a1 FROM (SELECT c1 a9 FROM "S 1"."T 3") l (a1) FULL JOIN (SELECT c1 a9 FROM "S 1"."T 4") r (a1) ON ((l.a1 = r.a1)) WHERE (((l.a1 = r.a1) OR (l.a1 IS NULL)))
+(8 rows)
+
+SELECT t1.c1, t2.c1 FROM ft4 t1 FULL JOIN ft5 t2 ON (t1.c1 = t2.c1) WHERE (t1.c1 = t2.c1 OR t1.c1 IS NULL) ORDER BY t1.c1, t2.c1 OFFSET 10 LIMIT 10;
+ c1 | c1 
+----+----
+ 66 | 66
+ 72 | 72
+ 78 | 78
+ 84 | 84
+ 90 | 90
+ 96 | 96
+    |  3
+    |  9
+    | 15
+    | 21
+(10 rows)
+
+-- join at WHERE clause 
+SET enable_mergejoin = off; -- planner chooses MergeJoin even though it has higher costs, so disable it for testing.
+EXPLAIN (COSTS false, VERBOSE)
+SELECT t1.c1, t2.c1 FROM ft4 t1 LEFT JOIN ft5 t2 ON true WHERE (t1.c1 = t2.c1) ORDER BY t1.c1, t2.c1 OFFSET 10 LIMIT 10;
+                                                                          QUERY PLAN                                                                          
+--------------------------------------------------------------------------------------------------------------------------------------------------------------
+ Limit
+   Output: t1.c1, t2.c1
+   ->  Sort
+         Output: t1.c1, t2.c1
+         Sort Key: t1.c1
+         ->  Foreign Scan
+               Output: t1.c1, t2.c1
+               Remote SQL: SELECT l.a1, r.a1 FROM (SELECT c1 a9 FROM "S 1"."T 3") l (a1) INNER JOIN (SELECT c1 a9 FROM "S 1"."T 4") r (a1) ON ((l.a1 = r.a1))
+(8 rows)
+
+SELECT t1.c1, t2.c1 FROM ft4 t1 LEFT JOIN ft5 t2 ON true WHERE (t1.c1 = t2.c1) ORDER BY t1.c1, t2.c1 OFFSET 10 LIMIT 10;
+ c1 | c1 
+----+----
+ 66 | 66
+ 72 | 72
+ 78 | 78
+ 84 | 84
+ 90 | 90
+ 96 | 96
+(6 rows)
+
+SET enable_mergejoin = on;
+-- join in CTE
+EXPLAIN (COSTS false, VERBOSE)
+WITH t (c1_1, c1_3, c2_1) AS (SELECT t1.c1, t1.c3, t2.c1 FROM ft1 t1 JOIN ft2 t2 ON (t1.c1 = t2.c1)) SELECT c1_1, c2_1 FROM t ORDER BY c1_3, c1_1 OFFSET 100 LIMIT 10;
+                                                                                    QUERY PLAN                                                                                     
+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+ Limit
+   Output: t.c1_1, t.c2_1, t.c1_3
+   CTE t
+     ->  Foreign Scan
+           Output: t1.c1, t1.c3, t2.c1
+           Remote SQL: SELECT l.a1, l.a2, r.a1 FROM (SELECT "C 1" a10, c3 a12 FROM "S 1"."T 1") l (a1, a2) INNER JOIN (SELECT "C 1" a9 FROM "S 1"."T 1") r (a1) ON ((l.a1 = r.a1))
+   ->  Sort
+         Output: t.c1_1, t.c2_1, t.c1_3
+         Sort Key: t.c1_3, t.c1_1
+         ->  CTE Scan on t
+               Output: t.c1_1, t.c2_1, t.c1_3
+(11 rows)
+
+WITH t (c1_1, c1_3, c2_1) AS (SELECT t1.c1, t1.c3, t2.c1 FROM ft1 t1 JOIN ft2 t2 ON (t1.c1 = t2.c1)) SELECT c1_1, c2_1 FROM t ORDER BY c1_3, c1_1 OFFSET 100 LIMIT 10;
+ c1_1 | c2_1 
+------+------
+  101 |  101
+  102 |  102
+  103 |  103
+  104 |  104
+  105 |  105
+  106 |  106
+  107 |  107
+  108 |  108
+  109 |  109
+  110 |  110
+(10 rows)
+
+-- partially unsafe to push down, not pushed down
+EXPLAIN (COSTS false, VERBOSE)
+SELECT t1.c1 FROM ft1 t1 JOIN ft2 t2 ON t2.c1 = t2.c1 JOIN ft4 t3 ON t2.c1 = t3.c1 ORDER BY t1.c1 OFFSET 10 LIMIT 10;
+                                                                                          QUERY PLAN                                                                                           
+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+ Limit
+   Output: t1.c1
+   ->  Sort
+         Output: t1.c1
+         Sort Key: t1.c1
+         ->  Nested Loop
+               Output: t1.c1
+               ->  Foreign Scan on public.ft1 t1
+                     Output: t1.c1
+                     Remote SQL: SELECT "C 1" a10 FROM "S 1"."T 1"
+               ->  Materialize
+                     ->  Foreign Scan
+                           Remote SQL: SELECT NULL FROM (SELECT "C 1" a9 FROM "S 1"."T 1" WHERE (("C 1" = "C 1"))) l (a1) INNER JOIN (SELECT c1 a9 FROM "S 1"."T 3") r (a1) ON ((l.a1 = r.a1))
+(13 rows)
+
+SELECT t1.c1 FROM ft1 t1 JOIN ft2 t2 ON t2.c1 = t2.c1 JOIN ft4 t3 ON t2.c1 = t3.c1 ORDER BY t1.c1 OFFSET 10 LIMIT 10;
+ c1 
+----
+  1
+  1
+  1
+  1
+  1
+  1
+  1
+  1
+  1
+  1
+(10 rows)
+
+-- SEMI JOIN, not pushed down
+EXPLAIN (COSTS false, VERBOSE)
+SELECT t1.c1 FROM ft1 t1 WHERE EXISTS (SELECT 1 FROM ft2 t2 WHERE t1.c1 = t2.c1) ORDER BY t1.c1 OFFSET 100 LIMIT 10;
+                                  QUERY PLAN                                  
+------------------------------------------------------------------------------
+ Limit
+   Output: t1.c1
+   ->  Sort
+         Output: t1.c1
+         Sort Key: t1.c1
+         ->  Hash Join
+               Output: t1.c1
+               Hash Cond: (t1.c1 = t2.c1)
+               ->  Foreign Scan on public.ft1 t1
+                     Output: t1.c1
+                     Remote SQL: SELECT "C 1" a10 FROM "S 1"."T 1"
+               ->  Hash
+                     Output: t2.c1
+                     ->  HashAggregate
+                           Output: t2.c1
+                           Group Key: t2.c1
+                           ->  Foreign Scan on public.ft2 t2
+                                 Output: t2.c1
+                                 Remote SQL: SELECT "C 1" a9 FROM "S 1"."T 1"
+(19 rows)
+
+SELECT t1.c1 FROM ft1 t1 WHERE EXISTS (SELECT 1 FROM ft2 t2 WHERE t1.c1 = t2.c1) ORDER BY t1.c1 OFFSET 100 LIMIT 10;
+ c1  
+-----
+ 101
+ 102
+ 103
+ 104
+ 105
+ 106
+ 107
+ 108
+ 109
+ 110
+(10 rows)
+
+-- ANTI JOIN, not pushed down
+EXPLAIN (COSTS false, VERBOSE)
+SELECT t1.c1 FROM ft1 t1 WHERE NOT EXISTS (SELECT 1 FROM ft2 t2 WHERE t1.c1 = t2.c2) ORDER BY t1.c1 OFFSET 100 LIMIT 10;
+                              QUERY PLAN                              
+----------------------------------------------------------------------
+ Limit
+   Output: t1.c1
+   ->  Sort
+         Output: t1.c1
+         Sort Key: t1.c1
+         ->  Hash Anti Join
+               Output: t1.c1
+               Hash Cond: (t1.c1 = t2.c2)
+               ->  Foreign Scan on public.ft1 t1
+                     Output: t1.c1
+                     Remote SQL: SELECT "C 1" a10 FROM "S 1"."T 1"
+               ->  Hash
+                     Output: t2.c2
+                     ->  Foreign Scan on public.ft2 t2
+                           Output: t2.c2
+                           Remote SQL: SELECT c2 a10 FROM "S 1"."T 1"
+(16 rows)
+
+SELECT t1.c1 FROM ft1 t1 WHERE NOT EXISTS (SELECT 1 FROM ft2 t2 WHERE t1.c1 = t2.c2) ORDER BY t1.c1 OFFSET 100 LIMIT 10;
+ c1  
+-----
+ 110
+ 111
+ 112
+ 113
+ 114
+ 115
+ 116
+ 117
+ 118
+ 119
+(10 rows)
+
+-- CROSS JOIN, not pushed down
+EXPLAIN (COSTS false, VERBOSE)
+SELECT t1.c1, t2.c1 FROM ft1 t1 CROSS JOIN ft2 t2 ORDER BY t1.c1, t2.c1 OFFSET 100 LIMIT 10;
+                               QUERY PLAN                               
+------------------------------------------------------------------------
+ Limit
+   Output: t1.c1, t2.c1
+   ->  Sort
+         Output: t1.c1, t2.c1
+         Sort Key: t1.c1, t2.c1
+         ->  Nested Loop
+               Output: t1.c1, t2.c1
+               ->  Foreign Scan on public.ft1 t1
+                     Output: t1.c1
+                     Remote SQL: SELECT "C 1" a10 FROM "S 1"."T 1"
+               ->  Materialize
+                     Output: t2.c1
+                     ->  Foreign Scan on public.ft2 t2
+                           Output: t2.c1
+                           Remote SQL: SELECT "C 1" a9 FROM "S 1"."T 1"
+(15 rows)
+
+SELECT t1.c1, t2.c1 FROM ft1 t1 CROSS JOIN ft2 t2 ORDER BY t1.c1, t2.c1 OFFSET 100 LIMIT 10;
+ c1 | c1  
+----+-----
+  1 | 101
+  1 | 102
+  1 | 103
+  1 | 104
+  1 | 105
+  1 | 106
+  1 | 107
+  1 | 108
+  1 | 109
+  1 | 110
+(10 rows)
+
+-- different server
+EXPLAIN (COSTS false, VERBOSE)
+SELECT t1.c1, t2.c1 FROM ft5 t1 JOIN ft6 t2 ON (t1.c1 = t2.c1) ORDER BY t1.c1, t2.c1 OFFSET 100 LIMIT 10;
+                          QUERY PLAN                           
+---------------------------------------------------------------
+ Limit
+   Output: t1.c1, t2.c1
+   ->  Merge Join
+         Output: t1.c1, t2.c1
+         Merge Cond: (t1.c1 = t2.c1)
+         ->  Sort
+               Output: t1.c1
+               Sort Key: t1.c1
+               ->  Foreign Scan on public.ft5 t1
+                     Output: t1.c1
+                     Remote SQL: SELECT c1 a9 FROM "S 1"."T 4"
+         ->  Sort
+               Output: t2.c1
+               Sort Key: t2.c1
+               ->  Foreign Scan on public.ft6 t2
+                     Output: t2.c1
+                     Remote SQL: SELECT c1 a9 FROM "S 1"."T 4"
+(17 rows)
+
+SELECT t1.c1, t2.c1 FROM ft5 t1 JOIN ft6 t2 ON (t1.c1 = t2.c1) ORDER BY t1.c1, t2.c1 OFFSET 100 LIMIT 10;
+ c1 | c1 
+----+----
+(0 rows)
+
+-- different effective user for permission check
+EXPLAIN (COSTS false, VERBOSE)
+SELECT t1.c1, t2.c1 FROM ft5 t1 JOIN v_ft5 t2 ON (t1.c1 = t2.c1) ORDER BY t1.c1, t2.c1 OFFSET 100 LIMIT 10;
+                          QUERY PLAN                           
+---------------------------------------------------------------
+ Limit
+   Output: t1.c1, ft5.c1
+   ->  Merge Join
+         Output: t1.c1, ft5.c1
+         Merge Cond: (t1.c1 = ft5.c1)
+         ->  Sort
+               Output: t1.c1
+               Sort Key: t1.c1
+               ->  Foreign Scan on public.ft5 t1
+                     Output: t1.c1
+                     Remote SQL: SELECT c1 a9 FROM "S 1"."T 4"
+         ->  Sort
+               Output: ft5.c1
+               Sort Key: ft5.c1
+               ->  Foreign Scan on public.ft5
+                     Output: ft5.c1
+                     Remote SQL: SELECT c1 a9 FROM "S 1"."T 4"
+(17 rows)
+
+SELECT t1.c1, t2.c1 FROM ft5 t1 JOIN v_ft5 t2 ON (t1.c1 = t2.c1) ORDER BY t1.c1, t2.c1 OFFSET 100 LIMIT 10;
+ c1 | c1 
+----+----
+(0 rows)
+
+-- unsafe join conditions
+EXPLAIN (COSTS false, VERBOSE)
+SELECT t1.c1, t2.c1 FROM ft1 t1 JOIN ft2 t2 ON (t1.c8 = t2.c8) ORDER BY t1.c3, t1.c1 OFFSET 100 LIMIT 10;
+                                       QUERY PLAN                                        
+-----------------------------------------------------------------------------------------
+ Limit
+   Output: t1.c1, t2.c1, t1.c3
+   ->  Sort
+         Output: t1.c1, t2.c1, t1.c3
+         Sort Key: t1.c3, t1.c1
+         ->  Merge Join
+               Output: t1.c1, t2.c1, t1.c3
+               Merge Cond: (t1.c8 = t2.c8)
+               ->  Sort
+                     Output: t1.c1, t1.c3, t1.c8
+                     Sort Key: t1.c8
+                     ->  Foreign Scan on public.ft1 t1
+                           Output: t1.c1, t1.c3, t1.c8
+                           Remote SQL: SELECT "C 1" a10, c3 a12, c8 a17 FROM "S 1"."T 1"
+               ->  Sort
+                     Output: t2.c1, t2.c8
+                     Sort Key: t2.c8
+                     ->  Foreign Scan on public.ft2 t2
+                           Output: t2.c1, t2.c8
+                           Remote SQL: SELECT "C 1" a9, c8 a17 FROM "S 1"."T 1"
+(20 rows)
+
+SELECT t1.c1, t2.c1 FROM ft1 t1 JOIN ft2 t2 ON (t1.c8 = t2.c8) ORDER BY t1.c3, t1.c1 OFFSET 100 LIMIT 10;
+ c1 | c1  
+----+-----
+  1 | 102
+  1 | 103
+  1 | 104
+  1 | 105
+  1 | 106
+  1 | 107
+  1 | 108
+  1 | 109
+  1 | 110
+  1 |   1
+(10 rows)
+
+-- local filter (unsafe conditions on one side)
+EXPLAIN (COSTS false, VERBOSE)
+SELECT t1.c1, t2.c1 FROM ft1 t1 JOIN ft2 t2 ON (t1.c1 = t2.c1) WHERE t1.c8 = 'foo' ORDER BY t1.c3, t1.c1 OFFSET 100 LIMIT 10;
+                                       QUERY PLAN                                        
+-----------------------------------------------------------------------------------------
+ Limit
+   Output: t1.c1, t2.c1, t1.c3
+   ->  Sort
+         Output: t1.c1, t2.c1, t1.c3
+         Sort Key: t1.c3, t1.c1
+         ->  Hash Join
+               Output: t1.c1, t2.c1, t1.c3
+               Hash Cond: (t2.c1 = t1.c1)
+               ->  Foreign Scan on public.ft2 t2
+                     Output: t2.c1
+                     Remote SQL: SELECT "C 1" a9 FROM "S 1"."T 1"
+               ->  Hash
+                     Output: t1.c1, t1.c3
+                     ->  Foreign Scan on public.ft1 t1
+                           Output: t1.c1, t1.c3
+                           Filter: (t1.c8 = 'foo'::user_enum)
+                           Remote SQL: SELECT "C 1" a10, c3 a12, c8 a17 FROM "S 1"."T 1"
+(17 rows)
+
+SELECT t1.c1, t2.c1 FROM ft1 t1 JOIN ft2 t2 ON (t1.c1 = t2.c1) WHERE t1.c8 = 'foo' ORDER BY t1.c3, t1.c1 OFFSET 100 LIMIT 10;
+ c1  | c1  
+-----+-----
+ 101 | 101
+ 102 | 102
+ 103 | 103
+ 104 | 104
+ 105 | 105
+ 106 | 106
+ 107 | 107
+ 108 | 108
+ 109 | 109
+ 110 | 110
+(10 rows)
+
+-- ===================================================================
 -- parameterized queries
 -- ===================================================================
 -- simple join
 PREPARE st1(int, int) AS SELECT t1.c3, t2.c3 FROM ft1 t1, ft2 t2 WHERE t1.c1 = $1 AND t2.c1 = $2;
 EXPLAIN (VERBOSE, COSTS false) EXECUTE st1(1, 2);
-                             QUERY PLAN                             
---------------------------------------------------------------------
+                               QUERY PLAN                               
+------------------------------------------------------------------------
  Nested Loop
    Output: t1.c3, t2.c3
    ->  Foreign Scan on public.ft1 t1
          Output: t1.c3
-         Remote SQL: SELECT c3 FROM "S 1"."T 1" WHERE (("C 1" = 1))
+         Remote SQL: SELECT c3 a12 FROM "S 1"."T 1" WHERE (("C 1" = 1))
    ->  Foreign Scan on public.ft2 t2
          Output: t2.c3
-         Remote SQL: SELECT c3 FROM "S 1"."T 1" WHERE (("C 1" = 2))
+         Remote SQL: SELECT c3 a12 FROM "S 1"."T 1" WHERE (("C 1" = 2))
 (8 rows)
 
 EXECUTE st1(1, 1);
@@ -683,8 +1252,8 @@ EXECUTE st1(101, 101);
 -- subquery using stable function (can't be sent to remote)
 PREPARE st2(int) AS SELECT * FROM ft1 t1 WHERE t1.c1 < $2 AND t1.c3 IN (SELECT c3 FROM ft2 t2 WHERE c1 > $1 AND date(c4) = '1970-01-17'::date) ORDER BY c1;
 EXPLAIN (VERBOSE, COSTS false) EXECUTE st2(10, 20);
-                                                QUERY PLAN                                                
-----------------------------------------------------------------------------------------------------------
+                                                                QUERY PLAN                                                                
+------------------------------------------------------------------------------------------------------------------------------------------
  Sort
    Output: t1.c1, t1.c2, t1.c3, t1.c4, t1.c5, t1.c6, t1.c7, t1.c8
    Sort Key: t1.c1
@@ -693,13 +1262,13 @@ EXPLAIN (VERBOSE, COSTS false) EXECUTE st2(10, 20);
          Join Filter: (t1.c3 = t2.c3)
          ->  Foreign Scan on public.ft1 t1
                Output: t1.c1, t1.c2, t1.c3, t1.c4, t1.c5, t1.c6, t1.c7, t1.c8
-               Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" WHERE (("C 1" < 20))
+               Remote SQL: SELECT "C 1" a10, c2 a11, c3 a12, c4 a13, c5 a14, c6 a15, c7 a16, c8 a17 FROM "S 1"."T 1" WHERE (("C 1" < 20))
          ->  Materialize
                Output: t2.c3
                ->  Foreign Scan on public.ft2 t2
                      Output: t2.c3
                      Filter: (date(t2.c4) = '01-17-1970'::date)
-                     Remote SQL: SELECT c3, c4 FROM "S 1"."T 1" WHERE (("C 1" > 10))
+                     Remote SQL: SELECT c3 a12, c4 a13 FROM "S 1"."T 1" WHERE (("C 1" > 10))
 (15 rows)
 
 EXECUTE st2(10, 20);
@@ -717,8 +1286,8 @@ EXECUTE st2(101, 121);
 -- subquery using immutable function (can be sent to remote)
 PREPARE st3(int) AS SELECT * FROM ft1 t1 WHERE t1.c1 < $2 AND t1.c3 IN (SELECT c3 FROM ft2 t2 WHERE c1 > $1 AND date(c5) = '1970-01-17'::date) ORDER BY c1;
 EXPLAIN (VERBOSE, COSTS false) EXECUTE st3(10, 20);
-                                                      QUERY PLAN                                                       
------------------------------------------------------------------------------------------------------------------------
+                                                                QUERY PLAN                                                                
+------------------------------------------------------------------------------------------------------------------------------------------
  Sort
    Output: t1.c1, t1.c2, t1.c3, t1.c4, t1.c5, t1.c6, t1.c7, t1.c8
    Sort Key: t1.c1
@@ -727,12 +1296,12 @@ EXPLAIN (VERBOSE, COSTS false) EXECUTE st3(10, 20);
          Join Filter: (t1.c3 = t2.c3)
          ->  Foreign Scan on public.ft1 t1
                Output: t1.c1, t1.c2, t1.c3, t1.c4, t1.c5, t1.c6, t1.c7, t1.c8
-               Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" WHERE (("C 1" < 20))
+               Remote SQL: SELECT "C 1" a10, c2 a11, c3 a12, c4 a13, c5 a14, c6 a15, c7 a16, c8 a17 FROM "S 1"."T 1" WHERE (("C 1" < 20))
          ->  Materialize
                Output: t2.c3
                ->  Foreign Scan on public.ft2 t2
                      Output: t2.c3
-                     Remote SQL: SELECT c3 FROM "S 1"."T 1" WHERE (("C 1" > 10)) AND ((date(c5) = '1970-01-17'::date))
+                     Remote SQL: SELECT c3 a12 FROM "S 1"."T 1" WHERE (("C 1" > 10)) AND ((date(c5) = '1970-01-17'::date))
 (14 rows)
 
 EXECUTE st3(10, 20);
@@ -749,108 +1318,108 @@ EXECUTE st3(20, 30);
 -- custom plan should be chosen initially
 PREPARE st4(int) AS SELECT * FROM ft1 t1 WHERE t1.c1 = $1;
 EXPLAIN (VERBOSE, COSTS false) EXECUTE st4(1);
-                                         QUERY PLAN                                          
----------------------------------------------------------------------------------------------
+                                                         QUERY PLAN                                                          
+-----------------------------------------------------------------------------------------------------------------------------
  Foreign Scan on public.ft1 t1
    Output: c1, c2, c3, c4, c5, c6, c7, c8
-   Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" WHERE (("C 1" = 1))
+   Remote SQL: SELECT "C 1" a10, c2 a11, c3 a12, c4 a13, c5 a14, c6 a15, c7 a16, c8 a17 FROM "S 1"."T 1" WHERE (("C 1" = 1))
 (3 rows)
 
 EXPLAIN (VERBOSE, COSTS false) EXECUTE st4(1);
-                                         QUERY PLAN                                          
----------------------------------------------------------------------------------------------
+                                                         QUERY PLAN                                                          
+-----------------------------------------------------------------------------------------------------------------------------
  Foreign Scan on public.ft1 t1
    Output: c1, c2, c3, c4, c5, c6, c7, c8
-   Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" WHERE (("C 1" = 1))
+   Remote SQL: SELECT "C 1" a10, c2 a11, c3 a12, c4 a13, c5 a14, c6 a15, c7 a16, c8 a17 FROM "S 1"."T 1" WHERE (("C 1" = 1))
 (3 rows)
 
 EXPLAIN (VERBOSE, COSTS false) EXECUTE st4(1);
-                                         QUERY PLAN                                          
----------------------------------------------------------------------------------------------
+                                                         QUERY PLAN                                                          
+-----------------------------------------------------------------------------------------------------------------------------
  Foreign Scan on public.ft1 t1
    Output: c1, c2, c3, c4, c5, c6, c7, c8
-   Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" WHERE (("C 1" = 1))
+   Remote SQL: SELECT "C 1" a10, c2 a11, c3 a12, c4 a13, c5 a14, c6 a15, c7 a16, c8 a17 FROM "S 1"."T 1" WHERE (("C 1" = 1))
 (3 rows)
 
 EXPLAIN (VERBOSE, COSTS false) EXECUTE st4(1);
-                                         QUERY PLAN                                          
----------------------------------------------------------------------------------------------
+                                                         QUERY PLAN                                                          
+-----------------------------------------------------------------------------------------------------------------------------
  Foreign Scan on public.ft1 t1
    Output: c1, c2, c3, c4, c5, c6, c7, c8
-   Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" WHERE (("C 1" = 1))
+   Remote SQL: SELECT "C 1" a10, c2 a11, c3 a12, c4 a13, c5 a14, c6 a15, c7 a16, c8 a17 FROM "S 1"."T 1" WHERE (("C 1" = 1))
 (3 rows)
 
 EXPLAIN (VERBOSE, COSTS false) EXECUTE st4(1);
-                                         QUERY PLAN                                          
----------------------------------------------------------------------------------------------
+                                                         QUERY PLAN                                                          
+-----------------------------------------------------------------------------------------------------------------------------
  Foreign Scan on public.ft1 t1
    Output: c1, c2, c3, c4, c5, c6, c7, c8
-   Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" WHERE (("C 1" = 1))
+   Remote SQL: SELECT "C 1" a10, c2 a11, c3 a12, c4 a13, c5 a14, c6 a15, c7 a16, c8 a17 FROM "S 1"."T 1" WHERE (("C 1" = 1))
 (3 rows)
 
 -- once we try it enough times, should switch to generic plan
 EXPLAIN (VERBOSE, COSTS false) EXECUTE st4(1);
-                                              QUERY PLAN                                               
--------------------------------------------------------------------------------------------------------
+                                                              QUERY PLAN                                                               
+---------------------------------------------------------------------------------------------------------------------------------------
  Foreign Scan on public.ft1 t1
    Output: c1, c2, c3, c4, c5, c6, c7, c8
-   Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" WHERE (("C 1" = $1::integer))
+   Remote SQL: SELECT "C 1" a10, c2 a11, c3 a12, c4 a13, c5 a14, c6 a15, c7 a16, c8 a17 FROM "S 1"."T 1" WHERE (("C 1" = $1::integer))
 (3 rows)
 
 -- value of $1 should not be sent to remote
 PREPARE st5(user_enum,int) AS SELECT * FROM ft1 t1 WHERE c8 = $1 and c1 = $2;
 EXPLAIN (VERBOSE, COSTS false) EXECUTE st5('foo', 1);
-                                         QUERY PLAN                                          
----------------------------------------------------------------------------------------------
+                                                         QUERY PLAN                                                          
+-----------------------------------------------------------------------------------------------------------------------------
  Foreign Scan on public.ft1 t1
    Output: c1, c2, c3, c4, c5, c6, c7, c8
    Filter: (t1.c8 = 'foo'::user_enum)
-   Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" WHERE (("C 1" = 1))
+   Remote SQL: SELECT "C 1" a10, c2 a11, c3 a12, c4 a13, c5 a14, c6 a15, c7 a16, c8 a17 FROM "S 1"."T 1" WHERE (("C 1" = 1))
 (4 rows)
 
 EXPLAIN (VERBOSE, COSTS false) EXECUTE st5('foo', 1);
-                                         QUERY PLAN                                          
----------------------------------------------------------------------------------------------
+                                                         QUERY PLAN                                                          
+-----------------------------------------------------------------------------------------------------------------------------
  Foreign Scan on public.ft1 t1
    Output: c1, c2, c3, c4, c5, c6, c7, c8
    Filter: (t1.c8 = 'foo'::user_enum)
-   Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" WHERE (("C 1" = 1))
+   Remote SQL: SELECT "C 1" a10, c2 a11, c3 a12, c4 a13, c5 a14, c6 a15, c7 a16, c8 a17 FROM "S 1"."T 1" WHERE (("C 1" = 1))
 (4 rows)
 
 EXPLAIN (VERBOSE, COSTS false) EXECUTE st5('foo', 1);
-                                         QUERY PLAN                                          
----------------------------------------------------------------------------------------------
+                                                         QUERY PLAN                                                          
+-----------------------------------------------------------------------------------------------------------------------------
  Foreign Scan on public.ft1 t1
    Output: c1, c2, c3, c4, c5, c6, c7, c8
    Filter: (t1.c8 = 'foo'::user_enum)
-   Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" WHERE (("C 1" = 1))
+   Remote SQL: SELECT "C 1" a10, c2 a11, c3 a12, c4 a13, c5 a14, c6 a15, c7 a16, c8 a17 FROM "S 1"."T 1" WHERE (("C 1" = 1))
 (4 rows)
 
 EXPLAIN (VERBOSE, COSTS false) EXECUTE st5('foo', 1);
-                                         QUERY PLAN                                          
----------------------------------------------------------------------------------------------
+                                                         QUERY PLAN                                                          
+-----------------------------------------------------------------------------------------------------------------------------
  Foreign Scan on public.ft1 t1
    Output: c1, c2, c3, c4, c5, c6, c7, c8
    Filter: (t1.c8 = 'foo'::user_enum)
-   Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" WHERE (("C 1" = 1))
+   Remote SQL: SELECT "C 1" a10, c2 a11, c3 a12, c4 a13, c5 a14, c6 a15, c7 a16, c8 a17 FROM "S 1"."T 1" WHERE (("C 1" = 1))
 (4 rows)
 
 EXPLAIN (VERBOSE, COSTS false) EXECUTE st5('foo', 1);
-                                         QUERY PLAN                                          
----------------------------------------------------------------------------------------------
+                                                         QUERY PLAN                                                          
+-----------------------------------------------------------------------------------------------------------------------------
  Foreign Scan on public.ft1 t1
    Output: c1, c2, c3, c4, c5, c6, c7, c8
    Filter: (t1.c8 = 'foo'::user_enum)
-   Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" WHERE (("C 1" = 1))
+   Remote SQL: SELECT "C 1" a10, c2 a11, c3 a12, c4 a13, c5 a14, c6 a15, c7 a16, c8 a17 FROM "S 1"."T 1" WHERE (("C 1" = 1))
 (4 rows)
 
 EXPLAIN (VERBOSE, COSTS false) EXECUTE st5('foo', 1);
-                                              QUERY PLAN                                               
--------------------------------------------------------------------------------------------------------
+                                                              QUERY PLAN                                                               
+---------------------------------------------------------------------------------------------------------------------------------------
  Foreign Scan on public.ft1 t1
    Output: c1, c2, c3, c4, c5, c6, c7, c8
    Filter: (t1.c8 = $1)
-   Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" WHERE (("C 1" = $1::integer))
+   Remote SQL: SELECT "C 1" a10, c2 a11, c3 a12, c4 a13, c5 a14, c6 a15, c7 a16, c8 a17 FROM "S 1"."T 1" WHERE (("C 1" = $1::integer))
 (4 rows)
 
 EXECUTE st5('foo', 1);
@@ -868,14 +1437,14 @@ DEALLOCATE st5;
 -- System columns, except ctid, should not be sent to remote
 EXPLAIN (VERBOSE, COSTS false)
 SELECT * FROM ft1 t1 WHERE t1.tableoid = 'pg_class'::regclass LIMIT 1;
-                                  QUERY PLAN                                   
--------------------------------------------------------------------------------
+                                                  QUERY PLAN                                                   
+---------------------------------------------------------------------------------------------------------------
  Limit
    Output: c1, c2, c3, c4, c5, c6, c7, c8
    ->  Foreign Scan on public.ft1 t1
          Output: c1, c2, c3, c4, c5, c6, c7, c8
          Filter: (t1.tableoid = '1259'::oid)
-         Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1"
+         Remote SQL: SELECT "C 1" a10, c2 a11, c3 a12, c4 a13, c5 a14, c6 a15, c7 a16, c8 a17 FROM "S 1"."T 1"
 (6 rows)
 
 SELECT * FROM ft1 t1 WHERE t1.tableoid = 'ft1'::regclass LIMIT 1;
@@ -886,13 +1455,13 @@ SELECT * FROM ft1 t1 WHERE t1.tableoid = 'ft1'::regclass LIMIT 1;
 
 EXPLAIN (VERBOSE, COSTS false)
 SELECT tableoid::regclass, * FROM ft1 t1 LIMIT 1;
-                                  QUERY PLAN                                   
--------------------------------------------------------------------------------
+                                                  QUERY PLAN                                                   
+---------------------------------------------------------------------------------------------------------------
  Limit
    Output: ((tableoid)::regclass), c1, c2, c3, c4, c5, c6, c7, c8
    ->  Foreign Scan on public.ft1 t1
          Output: (tableoid)::regclass, c1, c2, c3, c4, c5, c6, c7, c8
-         Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1"
+         Remote SQL: SELECT "C 1" a10, c2 a11, c3 a12, c4 a13, c5 a14, c6 a15, c7 a16, c8 a17 FROM "S 1"."T 1"
 (5 rows)
 
 SELECT tableoid::regclass, * FROM ft1 t1 LIMIT 1;
@@ -903,11 +1472,11 @@ SELECT tableoid::regclass, * FROM ft1 t1 LIMIT 1;
 
 EXPLAIN (VERBOSE, COSTS false)
 SELECT * FROM ft1 t1 WHERE t1.ctid = '(0,2)';
-                                              QUERY PLAN                                               
--------------------------------------------------------------------------------------------------------
+                                                              QUERY PLAN                                                               
+---------------------------------------------------------------------------------------------------------------------------------------
  Foreign Scan on public.ft1 t1
    Output: c1, c2, c3, c4, c5, c6, c7, c8
-   Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" WHERE ((ctid = '(0,2)'::tid))
+   Remote SQL: SELECT "C 1" a10, c2 a11, c3 a12, c4 a13, c5 a14, c6 a15, c7 a16, c8 a17 FROM "S 1"."T 1" WHERE ((ctid = '(0,2)'::tid))
 (3 rows)
 
 SELECT * FROM ft1 t1 WHERE t1.ctid = '(0,2)';
@@ -918,13 +1487,13 @@ SELECT * FROM ft1 t1 WHERE t1.ctid = '(0,2)';
 
 EXPLAIN (VERBOSE, COSTS false)
 SELECT ctid, * FROM ft1 t1 LIMIT 1;
-                                     QUERY PLAN                                      
--------------------------------------------------------------------------------------
+                                                       QUERY PLAN                                                       
+------------------------------------------------------------------------------------------------------------------------
  Limit
    Output: ctid, c1, c2, c3, c4, c5, c6, c7, c8
    ->  Foreign Scan on public.ft1 t1
          Output: ctid, c1, c2, c3, c4, c5, c6, c7, c8
-         Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8, ctid FROM "S 1"."T 1"
+         Remote SQL: SELECT "C 1" a10, c2 a11, c3 a12, c4 a13, c5 a14, c6 a15, c7 a16, c8 a17, ctid a7 FROM "S 1"."T 1"
 (5 rows)
 
 SELECT ctid, * FROM ft1 t1 LIMIT 1;
@@ -987,7 +1556,7 @@ FETCH c;
 SAVEPOINT s;
 SELECT * FROM ft1 WHERE 1 / (c1 - 1) > 0;  -- ERROR
 ERROR:  division by zero
-CONTEXT:  Remote SQL command: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" WHERE (((1 / ("C 1" - 1)) > 0))
+CONTEXT:  Remote SQL command: SELECT "C 1" a10, c2 a11, c3 a12, c4 a13, c5 a14, c6 a15, c7 a16, c8 a17 FROM "S 1"."T 1" WHERE (((1 / ("C 1" - 1)) > 0))
 ROLLBACK TO s;
 FETCH c;
  c1 | c2 |  c3   |              c4              |            c5            | c6 |     c7     | c8  
@@ -1010,64 +1579,64 @@ create foreign table ft3 (f1 text collate "C", f2 text)
   server loopback options (table_name 'loct3');
 -- can be sent to remote
 explain (verbose, costs off) select * from ft3 where f1 = 'foo';
-                                QUERY PLAN                                
---------------------------------------------------------------------------
+                                   QUERY PLAN                                    
+---------------------------------------------------------------------------------
  Foreign Scan on public.ft3
    Output: f1, f2
-   Remote SQL: SELECT f1, f2 FROM public.loct3 WHERE ((f1 = 'foo'::text))
+   Remote SQL: SELECT f1 a9, f2 a10 FROM public.loct3 WHERE ((f1 = 'foo'::text))
 (3 rows)
 
 explain (verbose, costs off) select * from ft3 where f1 COLLATE "C" = 'foo';
-                                QUERY PLAN                                
---------------------------------------------------------------------------
+                                   QUERY PLAN                                    
+---------------------------------------------------------------------------------
  Foreign Scan on public.ft3
    Output: f1, f2
-   Remote SQL: SELECT f1, f2 FROM public.loct3 WHERE ((f1 = 'foo'::text))
+   Remote SQL: SELECT f1 a9, f2 a10 FROM public.loct3 WHERE ((f1 = 'foo'::text))
 (3 rows)
 
 explain (verbose, costs off) select * from ft3 where f2 = 'foo';
-                                QUERY PLAN                                
---------------------------------------------------------------------------
+                                   QUERY PLAN                                    
+---------------------------------------------------------------------------------
  Foreign Scan on public.ft3
    Output: f1, f2
-   Remote SQL: SELECT f1, f2 FROM public.loct3 WHERE ((f2 = 'foo'::text))
+   Remote SQL: SELECT f1 a9, f2 a10 FROM public.loct3 WHERE ((f2 = 'foo'::text))
 (3 rows)
 
 -- can't be sent to remote
 explain (verbose, costs off) select * from ft3 where f1 COLLATE "POSIX" = 'foo';
-                  QUERY PLAN                   
------------------------------------------------
+                      QUERY PLAN                      
+------------------------------------------------------
  Foreign Scan on public.ft3
    Output: f1, f2
    Filter: ((ft3.f1)::text = 'foo'::text)
-   Remote SQL: SELECT f1, f2 FROM public.loct3
+   Remote SQL: SELECT f1 a9, f2 a10 FROM public.loct3
 (4 rows)
 
 explain (verbose, costs off) select * from ft3 where f1 = 'foo' COLLATE "C";
-                  QUERY PLAN                   
------------------------------------------------
+                      QUERY PLAN                      
+------------------------------------------------------
  Foreign Scan on public.ft3
    Output: f1, f2
    Filter: (ft3.f1 = 'foo'::text COLLATE "C")
-   Remote SQL: SELECT f1, f2 FROM public.loct3
+   Remote SQL: SELECT f1 a9, f2 a10 FROM public.loct3
 (4 rows)
 
 explain (verbose, costs off) select * from ft3 where f2 COLLATE "C" = 'foo';
-                  QUERY PLAN                   
------------------------------------------------
+                      QUERY PLAN                      
+------------------------------------------------------
  Foreign Scan on public.ft3
    Output: f1, f2
    Filter: ((ft3.f2)::text = 'foo'::text)
-   Remote SQL: SELECT f1, f2 FROM public.loct3
+   Remote SQL: SELECT f1 a9, f2 a10 FROM public.loct3
 (4 rows)
 
 explain (verbose, costs off) select * from ft3 where f2 = 'foo' COLLATE "C";
-                  QUERY PLAN                   
------------------------------------------------
+                      QUERY PLAN                      
+------------------------------------------------------
  Foreign Scan on public.ft3
    Output: f1, f2
    Filter: (ft3.f2 = 'foo'::text COLLATE "C")
-   Remote SQL: SELECT f1, f2 FROM public.loct3
+   Remote SQL: SELECT f1 a9, f2 a10 FROM public.loct3
 (4 rows)
 
 -- ===================================================================
@@ -1085,7 +1654,7 @@ INSERT INTO ft2 (c1,c2,c3) SELECT c1+1000,c2+100, c3 || c3 FROM ft2 LIMIT 20;
                Output: ((ft2_1.c1 + 1000)), ((ft2_1.c2 + 100)), ((ft2_1.c3 || ft2_1.c3))
                ->  Foreign Scan on public.ft2 ft2_1
                      Output: (ft2_1.c1 + 1000), (ft2_1.c2 + 100), (ft2_1.c3 || ft2_1.c3)
-                     Remote SQL: SELECT "C 1", c2, c3 FROM "S 1"."T 1"
+                     Remote SQL: SELECT "C 1" a9, c2 a10, c3 a12 FROM "S 1"."T 1"
 (9 rows)
 
 INSERT INTO ft2 (c1,c2,c3) SELECT c1+1000,c2+100, c3 || c3 FROM ft2 LIMIT 20;
@@ -1219,26 +1788,26 @@ UPDATE ft2 SET c2 = ft2.c2 + 500, c3 = ft2.c3 || '_update9', c7 = DEFAULT
          Hash Cond: (ft2.c2 = ft1.c1)
          ->  Foreign Scan on public.ft2
                Output: ft2.c1, ft2.c2, ft2.c3, ft2.c4, ft2.c5, ft2.c6, ft2.c8, ft2.ctid
-               Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c8, ctid FROM "S 1"."T 1" FOR UPDATE
+               Remote SQL: SELECT "C 1" a9, c2 a10, c3 a12, c4 a13, c5 a14, c6 a15, c8 a17, ctid a7 FROM "S 1"."T 1" FOR UPDATE
          ->  Hash
                Output: ft1.*, ft1.c1
                ->  Foreign Scan on public.ft1
                      Output: ft1.*, ft1.c1
-                     Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" WHERE ((("C 1" % 10) = 9))
+                     Remote SQL: SELECT "C 1" a10, c2 a11, c3 a12, c4 a13, c5 a14, c6 a15, c7 a16, c8 a17 FROM "S 1"."T 1" WHERE ((("C 1" % 10) = 9))
 (13 rows)
 
 UPDATE ft2 SET c2 = ft2.c2 + 500, c3 = ft2.c3 || '_update9', c7 = DEFAULT
   FROM ft1 WHERE ft1.c1 = ft2.c2 AND ft1.c1 % 10 = 9;
 EXPLAIN (verbose, costs off)
   DELETE FROM ft2 WHERE c1 % 10 = 5 RETURNING c1, c4;
-                                       QUERY PLAN                                       
-----------------------------------------------------------------------------------------
+                                        QUERY PLAN                                         
+-------------------------------------------------------------------------------------------
  Delete on public.ft2
    Output: c1, c4
-   Remote SQL: DELETE FROM "S 1"."T 1" WHERE ctid = $1 RETURNING "C 1", c4
+   Remote SQL: DELETE FROM "S 1"."T 1" WHERE ctid = $1 RETURNING "C 1" a9, c4 a13
    ->  Foreign Scan on public.ft2
          Output: ctid
-         Remote SQL: SELECT ctid FROM "S 1"."T 1" WHERE ((("C 1" % 10) = 5)) FOR UPDATE
+         Remote SQL: SELECT ctid a7 FROM "S 1"."T 1" WHERE ((("C 1" % 10) = 5)) FOR UPDATE
 (6 rows)
 
 DELETE FROM ft2 WHERE c1 % 10 = 5 RETURNING c1, c4;
@@ -1351,8 +1920,8 @@ DELETE FROM ft2 WHERE c1 % 10 = 5 RETURNING c1, c4;
 
 EXPLAIN (verbose, costs off)
 DELETE FROM ft2 USING ft1 WHERE ft1.c1 = ft2.c2 AND ft1.c1 % 10 = 2;
-                                                      QUERY PLAN                                                      
-----------------------------------------------------------------------------------------------------------------------
+                                                                      QUERY PLAN                                                                      
+------------------------------------------------------------------------------------------------------------------------------------------------------
  Delete on public.ft2
    Remote SQL: DELETE FROM "S 1"."T 1" WHERE ctid = $1
    ->  Hash Join
@@ -1360,12 +1929,12 @@ DELETE FROM ft2 USING ft1 WHERE ft1.c1 = ft2.c2 AND ft1.c1 % 10 = 2;
          Hash Cond: (ft2.c2 = ft1.c1)
          ->  Foreign Scan on public.ft2
                Output: ft2.ctid, ft2.c2
-               Remote SQL: SELECT c2, ctid FROM "S 1"."T 1" FOR UPDATE
+               Remote SQL: SELECT c2 a10, ctid a7 FROM "S 1"."T 1" FOR UPDATE
          ->  Hash
                Output: ft1.*, ft1.c1
                ->  Foreign Scan on public.ft1
                      Output: ft1.*, ft1.c1
-                     Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" WHERE ((("C 1" % 10) = 2))
+                     Remote SQL: SELECT "C 1" a10, c2 a11, c3 a12, c4 a13, c5 a14, c6 a15, c7 a16, c8 a17 FROM "S 1"."T 1" WHERE ((("C 1" % 10) = 2))
 (13 rows)
 
 DELETE FROM ft2 USING ft1 WHERE ft1.c1 = ft2.c2 AND ft1.c1 % 10 = 2;
@@ -3027,386 +3596,6 @@ NOTICE:  NEW: (13,"test triggered !")
 (1 row)
 
 -- ===================================================================
--- test inheritance features
--- ===================================================================
-CREATE TABLE a (aa TEXT);
-CREATE TABLE loct (aa TEXT, bb TEXT);
-CREATE FOREIGN TABLE b (bb TEXT) INHERITS (a)
-  SERVER loopback OPTIONS (table_name 'loct');
-INSERT INTO a(aa) VALUES('aaa');
-INSERT INTO a(aa) VALUES('aaaa');
-INSERT INTO a(aa) VALUES('aaaaa');
-INSERT INTO b(aa) VALUES('bbb');
-INSERT INTO b(aa) VALUES('bbbb');
-INSERT INTO b(aa) VALUES('bbbbb');
-SELECT tableoid::regclass, * FROM a;
- tableoid |  aa   
-----------+-------
- a        | aaa
- a        | aaaa
- a        | aaaaa
- b        | bbb
- b        | bbbb
- b        | bbbbb
-(6 rows)
-
-SELECT tableoid::regclass, * FROM b;
- tableoid |  aa   | bb 
-----------+-------+----
- b        | bbb   | 
- b        | bbbb  | 
- b        | bbbbb | 
-(3 rows)
-
-SELECT tableoid::regclass, * FROM ONLY a;
- tableoid |  aa   
-----------+-------
- a        | aaa
- a        | aaaa
- a        | aaaaa
-(3 rows)
-
-UPDATE a SET aa = 'zzzzzz' WHERE aa LIKE 'aaaa%';
-SELECT tableoid::regclass, * FROM a;
- tableoid |   aa   
-----------+--------
- a        | aaa
- a        | zzzzzz
- a        | zzzzzz
- b        | bbb
- b        | bbbb
- b        | bbbbb
-(6 rows)
-
-SELECT tableoid::regclass, * FROM b;
- tableoid |  aa   | bb 
-----------+-------+----
- b        | bbb   | 
- b        | bbbb  | 
- b        | bbbbb | 
-(3 rows)
-
-SELECT tableoid::regclass, * FROM ONLY a;
- tableoid |   aa   
-----------+--------
- a        | aaa
- a        | zzzzzz
- a        | zzzzzz
-(3 rows)
-
-UPDATE b SET aa = 'new';
-SELECT tableoid::regclass, * FROM a;
- tableoid |   aa   
-----------+--------
- a        | aaa
- a        | zzzzzz
- a        | zzzzzz
- b        | new
- b        | new
- b        | new
-(6 rows)
-
-SELECT tableoid::regclass, * FROM b;
- tableoid | aa  | bb 
-----------+-----+----
- b        | new | 
- b        | new | 
- b        | new | 
-(3 rows)
-
-SELECT tableoid::regclass, * FROM ONLY a;
- tableoid |   aa   
-----------+--------
- a        | aaa
- a        | zzzzzz
- a        | zzzzzz
-(3 rows)
-
-UPDATE a SET aa = 'newtoo';
-SELECT tableoid::regclass, * FROM a;
- tableoid |   aa   
-----------+--------
- a        | newtoo
- a        | newtoo
- a        | newtoo
- b        | newtoo
- b        | newtoo
- b        | newtoo
-(6 rows)
-
-SELECT tableoid::regclass, * FROM b;
- tableoid |   aa   | bb 
-----------+--------+----
- b        | newtoo | 
- b        | newtoo | 
- b        | newtoo | 
-(3 rows)
-
-SELECT tableoid::regclass, * FROM ONLY a;
- tableoid |   aa   
-----------+--------
- a        | newtoo
- a        | newtoo
- a        | newtoo
-(3 rows)
-
-DELETE FROM a;
-SELECT tableoid::regclass, * FROM a;
- tableoid | aa 
-----------+----
-(0 rows)
-
-SELECT tableoid::regclass, * FROM b;
- tableoid | aa | bb 
-----------+----+----
-(0 rows)
-
-SELECT tableoid::regclass, * FROM ONLY a;
- tableoid | aa 
-----------+----
-(0 rows)
-
-DROP TABLE a CASCADE;
-NOTICE:  drop cascades to foreign table b
-DROP TABLE loct;
--- Check SELECT FOR UPDATE/SHARE with an inherited source table
-create table loct1 (f1 int, f2 int, f3 int);
-create table loct2 (f1 int, f2 int, f3 int);
-create table foo (f1 int, f2 int);
-create foreign table foo2 (f3 int) inherits (foo)
-  server loopback options (table_name 'loct1');
-create table bar (f1 int, f2 int);
-create foreign table bar2 (f3 int) inherits (bar)
-  server loopback options (table_name 'loct2');
-insert into foo values(1,1);
-insert into foo values(3,3);
-insert into foo2 values(2,2,2);
-insert into foo2 values(4,4,4);
-insert into bar values(1,11);
-insert into bar values(2,22);
-insert into bar values(6,66);
-insert into bar2 values(3,33,33);
-insert into bar2 values(4,44,44);
-insert into bar2 values(7,77,77);
-explain (verbose, costs off)
-select * from bar where f1 in (select f1 from foo) for update;
-                                          QUERY PLAN                                          
-----------------------------------------------------------------------------------------------
- LockRows
-   Output: bar.f1, bar.f2, bar.ctid, bar.tableoid, bar.*, foo.ctid, foo.tableoid, foo.*
-   ->  Hash Join
-         Output: bar.f1, bar.f2, bar.ctid, bar.tableoid, bar.*, foo.ctid, foo.tableoid, foo.*
-         Hash Cond: (bar.f1 = foo.f1)
-         ->  Append
-               ->  Seq Scan on public.bar
-                     Output: bar.f1, bar.f2, bar.ctid, bar.tableoid, bar.*
-               ->  Foreign Scan on public.bar2
-                     Output: bar2.f1, bar2.f2, bar2.ctid, bar2.tableoid, bar2.*
-                     Remote SQL: SELECT f1, f2, f3, ctid FROM public.loct2 FOR UPDATE
-         ->  Hash
-               Output: foo.ctid, foo.tableoid, foo.*, foo.f1
-               ->  HashAggregate
-                     Output: foo.ctid, foo.tableoid, foo.*, foo.f1
-                     Group Key: foo.f1
-                     ->  Append
-                           ->  Seq Scan on public.foo
-                                 Output: foo.ctid, foo.tableoid, foo.*, foo.f1
-                           ->  Foreign Scan on public.foo2
-                                 Output: foo2.ctid, foo2.tableoid, foo2.*, foo2.f1
-                                 Remote SQL: SELECT f1, f2, f3, ctid FROM public.loct1
-(22 rows)
-
-select * from bar where f1 in (select f1 from foo) for update;
- f1 | f2 
-----+----
-  1 | 11
-  2 | 22
-  3 | 33
-  4 | 44
-(4 rows)
-
-explain (verbose, costs off)
-select * from bar where f1 in (select f1 from foo) for share;
-                                          QUERY PLAN                                          
-----------------------------------------------------------------------------------------------
- LockRows
-   Output: bar.f1, bar.f2, bar.ctid, bar.tableoid, bar.*, foo.ctid, foo.tableoid, foo.*
-   ->  Hash Join
-         Output: bar.f1, bar.f2, bar.ctid, bar.tableoid, bar.*, foo.ctid, foo.tableoid, foo.*
-         Hash Cond: (bar.f1 = foo.f1)
-         ->  Append
-               ->  Seq Scan on public.bar
-                     Output: bar.f1, bar.f2, bar.ctid, bar.tableoid, bar.*
-               ->  Foreign Scan on public.bar2
-                     Output: bar2.f1, bar2.f2, bar2.ctid, bar2.tableoid, bar2.*
-                     Remote SQL: SELECT f1, f2, f3, ctid FROM public.loct2 FOR SHARE
-         ->  Hash
-               Output: foo.ctid, foo.tableoid, foo.*, foo.f1
-               ->  HashAggregate
-                     Output: foo.ctid, foo.tableoid, foo.*, foo.f1
-                     Group Key: foo.f1
-                     ->  Append
-                           ->  Seq Scan on public.foo
-                                 Output: foo.ctid, foo.tableoid, foo.*, foo.f1
-                           ->  Foreign Scan on public.foo2
-                                 Output: foo2.ctid, foo2.tableoid, foo2.*, foo2.f1
-                                 Remote SQL: SELECT f1, f2, f3, ctid FROM public.loct1
-(22 rows)
-
-select * from bar where f1 in (select f1 from foo) for share;
- f1 | f2 
-----+----
-  1 | 11
-  2 | 22
-  3 | 33
-  4 | 44
-(4 rows)
-
--- Check UPDATE with inherited target and an inherited source table
-explain (verbose, costs off)
-update bar set f2 = f2 + 100 where f1 in (select f1 from foo);
-                                         QUERY PLAN                                          
----------------------------------------------------------------------------------------------
- Update on public.bar
-   Update on public.bar
-   Foreign Update on public.bar2
-     Remote SQL: UPDATE public.loct2 SET f2 = $2 WHERE ctid = $1
-   ->  Hash Join
-         Output: bar.f1, (bar.f2 + 100), bar.ctid, foo.ctid, foo.tableoid, foo.*
-         Hash Cond: (bar.f1 = foo.f1)
-         ->  Seq Scan on public.bar
-               Output: bar.f1, bar.f2, bar.ctid
-         ->  Hash
-               Output: foo.ctid, foo.tableoid, foo.*, foo.f1
-               ->  HashAggregate
-                     Output: foo.ctid, foo.tableoid, foo.*, foo.f1
-                     Group Key: foo.f1
-                     ->  Append
-                           ->  Seq Scan on public.foo
-                                 Output: foo.ctid, foo.tableoid, foo.*, foo.f1
-                           ->  Foreign Scan on public.foo2
-                                 Output: foo2.ctid, foo2.tableoid, foo2.*, foo2.f1
-                                 Remote SQL: SELECT f1, f2, f3, ctid FROM public.loct1
-   ->  Hash Join
-         Output: bar2.f1, (bar2.f2 + 100), bar2.f3, bar2.ctid, foo.ctid, foo.tableoid, foo.*
-         Hash Cond: (bar2.f1 = foo.f1)
-         ->  Foreign Scan on public.bar2
-               Output: bar2.f1, bar2.f2, bar2.f3, bar2.ctid
-               Remote SQL: SELECT f1, f2, f3, ctid FROM public.loct2 FOR UPDATE
-         ->  Hash
-               Output: foo.ctid, foo.tableoid, foo.*, foo.f1
-               ->  HashAggregate
-                     Output: foo.ctid, foo.tableoid, foo.*, foo.f1
-                     Group Key: foo.f1
-                     ->  Append
-                           ->  Seq Scan on public.foo
-                                 Output: foo.ctid, foo.tableoid, foo.*, foo.f1
-                           ->  Foreign Scan on public.foo2
-                                 Output: foo2.ctid, foo2.tableoid, foo2.*, foo2.f1
-                                 Remote SQL: SELECT f1, f2, f3, ctid FROM public.loct1
-(37 rows)
-
-update bar set f2 = f2 + 100 where f1 in (select f1 from foo);
-select tableoid::regclass, * from bar order by 1,2;
- tableoid | f1 | f2  
-----------+----+-----
- bar      |  1 | 111
- bar      |  2 | 122
- bar      |  6 |  66
- bar2     |  3 | 133
- bar2     |  4 | 144
- bar2     |  7 |  77
-(6 rows)
-
--- Check UPDATE with inherited target and an appendrel subquery
-explain (verbose, costs off)
-update bar set f2 = f2 + 100
-from
-  ( select f1 from foo union all select f1+3 from foo ) ss
-where bar.f1 = ss.f1;
-                                      QUERY PLAN                                      
---------------------------------------------------------------------------------------
- Update on public.bar
-   Update on public.bar
-   Foreign Update on public.bar2
-     Remote SQL: UPDATE public.loct2 SET f2 = $2 WHERE ctid = $1
-   ->  Hash Join
-         Output: bar.f1, (bar.f2 + 100), bar.ctid, (ROW(foo.f1))
-         Hash Cond: (foo.f1 = bar.f1)
-         ->  Append
-               ->  Seq Scan on public.foo
-                     Output: ROW(foo.f1), foo.f1
-               ->  Foreign Scan on public.foo2
-                     Output: ROW(foo2.f1), foo2.f1
-                     Remote SQL: SELECT f1 FROM public.loct1
-               ->  Seq Scan on public.foo foo_1
-                     Output: ROW((foo_1.f1 + 3)), (foo_1.f1 + 3)
-               ->  Foreign Scan on public.foo2 foo2_1
-                     Output: ROW((foo2_1.f1 + 3)), (foo2_1.f1 + 3)
-                     Remote SQL: SELECT f1 FROM public.loct1
-         ->  Hash
-               Output: bar.f1, bar.f2, bar.ctid
-               ->  Seq Scan on public.bar
-                     Output: bar.f1, bar.f2, bar.ctid
-   ->  Merge Join
-         Output: bar2.f1, (bar2.f2 + 100), bar2.f3, bar2.ctid, (ROW(foo.f1))
-         Merge Cond: (bar2.f1 = foo.f1)
-         ->  Sort
-               Output: bar2.f1, bar2.f2, bar2.f3, bar2.ctid
-               Sort Key: bar2.f1
-               ->  Foreign Scan on public.bar2
-                     Output: bar2.f1, bar2.f2, bar2.f3, bar2.ctid
-                     Remote SQL: SELECT f1, f2, f3, ctid FROM public.loct2 FOR UPDATE
-         ->  Sort
-               Output: (ROW(foo.f1)), foo.f1
-               Sort Key: foo.f1
-               ->  Append
-                     ->  Seq Scan on public.foo
-                           Output: ROW(foo.f1), foo.f1
-                     ->  Foreign Scan on public.foo2
-                           Output: ROW(foo2.f1), foo2.f1
-                           Remote SQL: SELECT f1 FROM public.loct1
-                     ->  Seq Scan on public.foo foo_1
-                           Output: ROW((foo_1.f1 + 3)), (foo_1.f1 + 3)
-                     ->  Foreign Scan on public.foo2 foo2_1
-                           Output: ROW((foo2_1.f1 + 3)), (foo2_1.f1 + 3)
-                           Remote SQL: SELECT f1 FROM public.loct1
-(45 rows)
-
-update bar set f2 = f2 + 100
-from
-  ( select f1 from foo union all select f1+3 from foo ) ss
-where bar.f1 = ss.f1;
-select tableoid::regclass, * from bar order by 1,2;
- tableoid | f1 | f2  
-----------+----+-----
- bar      |  1 | 211
- bar      |  2 | 222
- bar      |  6 | 166
- bar2     |  3 | 233
- bar2     |  4 | 244
- bar2     |  7 | 177
-(6 rows)
-
--- Test that WHERE CURRENT OF is not supported
-begin;
-declare c cursor for select * from bar where f1 = 7;
-fetch from c;
- f1 | f2  
-----+-----
-  7 | 177
-(1 row)
-
-update bar set f2 = null where current of c;
-ERROR:  WHERE CURRENT OF is not supported for this table type
-rollback;
-drop table foo cascade;
-NOTICE:  drop cascades to foreign table foo2
-drop table bar cascade;
-NOTICE:  drop cascades to foreign table bar2
-drop table loct1;
-drop table loct2;
--- ===================================================================
 -- test IMPORT FOREIGN SCHEMA
 -- ===================================================================
 CREATE SCHEMA import_source;
@@ -3636,3 +3825,6 @@ QUERY:  CREATE FOREIGN TABLE t5 (
 OPTIONS (schema_name 'import_source', table_name 't5');
 CONTEXT:  importing foreign table "t5"
 ROLLBACK;
+-- Cleanup
+DROP OWNED BY view_owner;
+DROP USER view_owner;
diff --git a/contrib/postgres_fdw/postgres_fdw.c b/contrib/postgres_fdw/postgres_fdw.c
index 478e124..593a08d 100644
--- a/contrib/postgres_fdw/postgres_fdw.c
+++ b/contrib/postgres_fdw/postgres_fdw.c
@@ -48,7 +48,8 @@ PG_MODULE_MAGIC;
 
 /*
  * FDW-specific planner information kept in RelOptInfo.fdw_private for a
- * foreign table.  This information is collected by postgresGetForeignRelSize.
+ * foreign table or foreign join.  This information is collected by
+ * postgresGetForeignRelSize, or calculated from join source relations.
  */
 typedef struct PgFdwRelationInfo
 {
@@ -78,10 +79,31 @@ typedef struct PgFdwRelationInfo
 	ForeignTable *table;
 	ForeignServer *server;
 	UserMapping *user;			/* only set in use_remote_estimate mode */
+	Oid			userid;			/* effective user for user mapping lookup */
 } PgFdwRelationInfo;
 
 /*
- * Indexes of FDW-private information stored in fdw_private lists.
+ * Indexes of FDW-private information stored in fdw_private of ForeignPath.
+ * We use fdw_private of a ForeignPath only when the path represents a join
+ * which can be pushed down to the remote side:
+ * 1) Outer child path node
+ * 2) Inner child path node
+ * 3) Join type number (as an Integer node)
+ * 4) Expr list of join conditions
+ * 5) Expr list of other clauses
+ */
+enum FdwPathPrivateIndex
+{
+	FdwPathPrivateOuterPath,
+	FdwPathPrivateInnerPath,
+	FdwPathPrivateJoinType,
+	FdwPathPrivateJoinClauses,
+	FdwPathPrivateOtherClauses,
+};
+
+/*
+ * Indexes of FDW-private information stored in fdw_private of ForeignScan of
+ * a simple foreign table scan for a SELECT statement.
  *
  * We store various information in ForeignScan.fdw_private to pass it from
  * planner to executor.  Currently we store:
@@ -98,7 +120,11 @@ enum FdwScanPrivateIndex
 	/* SQL statement to execute remotely (as a String node) */
 	FdwScanPrivateSelectSql,
 	/* Integer list of attribute numbers retrieved by the SELECT */
-	FdwScanPrivateRetrievedAttrs
+	FdwScanPrivateRetrievedAttrs,
+	/* OID of the foreign server for the scan (as an Integer node) */
+	FdwScanPrivateServerOid,
+	/* OID of the effective user for the scan (as an Integer node) */
+	FdwScanPrivateUserOid,
 };
 
 /*
@@ -128,7 +154,8 @@ enum FdwModifyPrivateIndex
  */
 typedef struct PgFdwScanState
 {
-	Relation	rel;			/* relcache entry for the foreign table */
+	const char *relname;		/* name of relation being scanned */
+	TupleDesc	tupdesc;		/* tuple descriptor of the scan */
 	AttInMetadata *attinmeta;	/* attribute datatype conversion metadata */
 
 	/* extracted fdw_private data */
@@ -214,7 +241,8 @@ typedef struct PgFdwAnalyzeState
  */
 typedef struct ConversionLocation
 {
-	Relation	rel;			/* foreign table's relcache entry */
+	const char *relname;		/* name of relation being processed */
+	TupleDesc	tupdesc;		/* tuple descriptor for attribute names */
 	AttrNumber	cur_attno;		/* attribute number being processed, or 0 */
 } ConversionLocation;
 
@@ -288,6 +316,12 @@ static bool postgresAnalyzeForeignTable(Relation relation,
 							BlockNumber *totalpages);
 static List *postgresImportForeignSchema(ImportForeignSchemaStmt *stmt,
 							Oid serverOid);
+static void postgresGetForeignJoinPaths(PlannerInfo *root,
+						   RelOptInfo *joinrel,
+						   RelOptInfo *outerrel,
+						   RelOptInfo *innerrel,
+						   SpecialJoinInfo *sjinfo,
+						   List *restrictlist);
 
 /*
  * Helper functions
@@ -323,7 +357,8 @@ static void analyze_row_processor(PGresult *res, int row,
 					  PgFdwAnalyzeState *astate);
 static HeapTuple make_tuple_from_result_row(PGresult *res,
 						   int row,
-						   Relation rel,
+						   const char *relname,
+						   TupleDesc tupdesc,
 						   AttInMetadata *attinmeta,
 						   List *retrieved_attrs,
 						   MemoryContext temp_context);
@@ -368,6 +403,9 @@ postgres_fdw_handler(PG_FUNCTION_ARGS)
 	/* Support functions for IMPORT FOREIGN SCHEMA */
 	routine->ImportForeignSchema = postgresImportForeignSchema;
 
+	/* Support functions for join push-down */
+	routine->GetForeignJoinPaths = postgresGetForeignJoinPaths;
+
 	PG_RETURN_POINTER(routine);
 }
 
@@ -385,6 +423,7 @@ postgresGetForeignRelSize(PlannerInfo *root,
 {
 	PgFdwRelationInfo *fpinfo;
 	ListCell   *lc;
+	RangeTblEntry *rte;
 
 	/*
 	 * We use PgFdwRelationInfo to pass various information to subsequent
@@ -428,18 +467,20 @@ postgresGetForeignRelSize(PlannerInfo *root,
 	}
 
 	/*
+	 * Fetch the RTE to obtain checkAsUser, which determines the user whose
+	 * user mapping is used for remote access.
+	 */
+	rte = planner_rt_fetch(baserel->relid, root);
+	fpinfo->userid = rte->checkAsUser ? rte->checkAsUser : GetUserId();
+
+	/*
 	 * If the table or the server is configured to use remote estimates,
 	 * identify which user to do remote access as during planning.  This
 	 * should match what ExecCheckRTEPerms() does.  If we fail due to lack of
 	 * permissions, the query would have failed at runtime anyway.
 	 */
 	if (fpinfo->use_remote_estimate)
-	{
-		RangeTblEntry *rte = planner_rt_fetch(baserel->relid, root);
-		Oid			userid = rte->checkAsUser ? rte->checkAsUser : GetUserId();
-
-		fpinfo->user = GetUserMapping(userid, fpinfo->server->serverid);
-	}
+		fpinfo->user = GetUserMapping(fpinfo->userid, fpinfo->server->serverid);
 	else
 		fpinfo->user = NULL;
 
@@ -463,10 +504,9 @@ postgresGetForeignRelSize(PlannerInfo *root,
 				   &fpinfo->attrs_used);
 	foreach(lc, fpinfo->local_conds)
 	{
-		RestrictInfo *rinfo = (RestrictInfo *) lfirst(lc);
+		Expr *expr = (Expr *) lfirst(lc);
 
-		pull_varattnos((Node *) rinfo->clause, baserel->relid,
-					   &fpinfo->attrs_used);
+		pull_varattnos((Node *) expr, baserel->relid, &fpinfo->attrs_used);
 	}
 
 	/*
@@ -752,6 +792,8 @@ postgresGetForeignPlan(PlannerInfo *root,
 	List	   *retrieved_attrs;
 	StringInfoData sql;
 	ListCell   *lc;
+	List	   *fdw_ps_tlist = NIL;
+	ForeignScan *scan;
 
 	/*
 	 * Separate the scan_clauses into those that can be executed remotely and
@@ -768,9 +810,6 @@ postgresGetForeignPlan(PlannerInfo *root,
 	 *
 	 * This code must match "extract_actual_clauses(scan_clauses, false)"
 	 * except for the additional decision about remote versus local execution.
-	 * Note however that we only strip the RestrictInfo nodes from the
-	 * local_exprs list, since appendWhereClause expects a list of
-	 * RestrictInfos.
 	 */
 	foreach(lc, scan_clauses)
 	{
@@ -783,11 +822,11 @@ postgresGetForeignPlan(PlannerInfo *root,
 			continue;
 
 		if (list_member_ptr(fpinfo->remote_conds, rinfo))
-			remote_conds = lappend(remote_conds, rinfo);
+			remote_conds = lappend(remote_conds, rinfo->clause);
 		else if (list_member_ptr(fpinfo->local_conds, rinfo))
 			local_exprs = lappend(local_exprs, rinfo->clause);
 		else if (is_foreign_expr(root, baserel, rinfo->clause))
-			remote_conds = lappend(remote_conds, rinfo);
+			remote_conds = lappend(remote_conds, rinfo->clause);
 		else
 			local_exprs = lappend(local_exprs, rinfo->clause);
 	}
@@ -797,68 +836,126 @@ postgresGetForeignPlan(PlannerInfo *root,
 	 * expressions to be sent as parameters.
 	 */
 	initStringInfo(&sql);
-	deparseSelectSql(&sql, root, baserel, fpinfo->attrs_used,
-					 &retrieved_attrs);
-	if (remote_conds)
-		appendWhereClause(&sql, root, baserel, remote_conds,
-						  true, &params_list);
-
-	/*
-	 * Add FOR UPDATE/SHARE if appropriate.  We apply locking during the
-	 * initial row fetch, rather than later on as is done for local tables.
-	 * The extra roundtrips involved in trying to duplicate the local
-	 * semantics exactly don't seem worthwhile (see also comments for
-	 * RowMarkType).
-	 *
-	 * Note: because we actually run the query as a cursor, this assumes that
-	 * DECLARE CURSOR ... FOR UPDATE is supported, which it isn't before 8.3.
-	 */
-	if (baserel->relid == root->parse->resultRelation &&
-		(root->parse->commandType == CMD_UPDATE ||
-		 root->parse->commandType == CMD_DELETE))
-	{
-		/* Relation is UPDATE/DELETE target, so use FOR UPDATE */
-		appendStringInfoString(&sql, " FOR UPDATE");
-	}
-	else
+	if (scan_relid > 0)
 	{
-		PlanRowMark *rc = get_plan_rowmark(root->rowMarks, baserel->relid);
+		deparseSelectSql(&sql, root, baserel, fpinfo->attrs_used,
+						 &retrieved_attrs);
+		if (remote_conds)
+			appendConditions(&sql, root, baserel, NULL, NULL,
+							 remote_conds, " WHERE ", &params_list);
 
-		if (rc)
+		/*
+		 * Add FOR UPDATE/SHARE if appropriate.  We apply locking during the
+		 * initial row fetch, rather than later on as is done for local tables.
+		 * The extra roundtrips involved in trying to duplicate the local
+		 * semantics exactly don't seem worthwhile (see also comments for
+		 * RowMarkType).
+		 *
+		 * Note: because we actually run the query as a cursor, this assumes
+		 * that DECLARE CURSOR ... FOR UPDATE is supported, which it isn't
+		 * before 8.3.
+		 */
+		if (baserel->relid == root->parse->resultRelation &&
+			(root->parse->commandType == CMD_UPDATE ||
+			 root->parse->commandType == CMD_DELETE))
 		{
-			/*
-			 * Relation is specified as a FOR UPDATE/SHARE target, so handle
-			 * that.  (But we could also see LCS_NONE, meaning this isn't a
-			 * target relation after all.)
-			 *
-			 * For now, just ignore any [NO] KEY specification, since (a) it's
-			 * not clear what that means for a remote table that we don't have
-			 * complete information about, and (b) it wouldn't work anyway on
-			 * older remote servers.  Likewise, we don't worry about NOWAIT.
-			 */
-			switch (rc->strength)
+			/* Relation is UPDATE/DELETE target, so use FOR UPDATE */
+			appendStringInfoString(&sql, " FOR UPDATE");
+		}
+		else
+		{
+			PlanRowMark *rc = get_plan_rowmark(root->rowMarks, baserel->relid);
+
+			if (rc)
 			{
-				case LCS_NONE:
-					/* No locking needed */
-					break;
-				case LCS_FORKEYSHARE:
-				case LCS_FORSHARE:
-					appendStringInfoString(&sql, " FOR SHARE");
-					break;
-				case LCS_FORNOKEYUPDATE:
-				case LCS_FORUPDATE:
-					appendStringInfoString(&sql, " FOR UPDATE");
-					break;
+				/*
+				 * Relation is specified as a FOR UPDATE/SHARE target, so handle
+				 * that.  (But we could also see LCS_NONE, meaning this isn't a
+				 * target relation after all.)
+				 *
+				 * For now, just ignore any [NO] KEY specification, since (a)
+				 * it's not clear what that means for a remote table that we
+				 * don't have complete information about, and (b) it wouldn't
+				 * work anyway on older remote servers.  Likewise, we don't
+				 * worry about NOWAIT.
+				 */
+				switch (rc->strength)
+				{
+					case LCS_NONE:
+						/* No locking needed */
+						break;
+					case LCS_FORKEYSHARE:
+					case LCS_FORSHARE:
+						appendStringInfoString(&sql, " FOR SHARE");
+						break;
+					case LCS_FORNOKEYUPDATE:
+					case LCS_FORUPDATE:
+						appendStringInfoString(&sql, " FOR UPDATE");
+						break;
+				}
 			}
 		}
 	}
+	else
+	{
+		/* Join case */
+		RelOptInfo *rel_o;
+		RelOptInfo *rel_i;
+		Path	   *path_o;
+		Path	   *path_i;
+		const char *sql_o;
+		const char *sql_i;
+		ForeignScan *plan_o;
+		ForeignScan *plan_i;
+		JoinType	jointype;
+		List	   *joinclauses;
+		List	   *otherclauses;
+		int			i;
+
+		/*
+		 * Retrieve information from fdw_private.
+		 */
+		path_o = list_nth(best_path->fdw_private, FdwPathPrivateOuterPath);
+		path_i = list_nth(best_path->fdw_private, FdwPathPrivateInnerPath);
+		jointype = intVal(list_nth(best_path->fdw_private,
+								   FdwPathPrivateJoinType));
+		joinclauses = list_nth(best_path->fdw_private,
+							   FdwPathPrivateJoinClauses);
+		otherclauses = list_nth(best_path->fdw_private,
+							    FdwPathPrivateOtherClauses);
+
+		rel_o = path_o->parent;
+		rel_i = path_i->parent;
+
+		/*
+		 * Construct the remote query from the bottom up.  ForeignScan plan nodes
+		 * of the underlying scans are not needed to execute the plan tree, but
+		 * creating them is a handy way to build the remote query recursively.
+		 */
+		plan_o = (ForeignScan *) create_plan_recurse(root, path_o);
+		Assert(IsA(plan_o, ForeignScan));
+		sql_o = strVal(list_nth(plan_o->fdw_private, FdwScanPrivateSelectSql));
+
+		plan_i = (ForeignScan *) create_plan_recurse(root, path_i);
+		Assert(IsA(plan_i, ForeignScan));
+		sql_i = strVal(list_nth(plan_i->fdw_private, FdwScanPrivateSelectSql));
+
+		deparseJoinSql(&sql, root, baserel, rel_o, rel_i,
+					   sql_o, sql_i, jointype, joinclauses, otherclauses,
+					   &fdw_ps_tlist);
+		retrieved_attrs = NIL;
+		for (i = 0; i < list_length(fdw_ps_tlist); i++)
+			retrieved_attrs = lappend_int(retrieved_attrs, i + 1);
+	}
 
 	/*
-	 * Build the fdw_private list that will be available to the executor.
+	 * Build the fdw_private list that will be available in the executor.
 	 * Items in the list must match enum FdwScanPrivateIndex, above.
 	 */
-	fdw_private = list_make2(makeString(sql.data),
-							 retrieved_attrs);
+	fdw_private = list_make4(makeString(sql.data),
+							 retrieved_attrs,
+							 makeInteger(fpinfo->server->serverid),
+							 makeInteger(fpinfo->userid));
 
 	/*
 	 * Create the ForeignScan node from target list, local filtering
@@ -868,11 +965,18 @@ postgresGetForeignPlan(PlannerInfo *root,
 	 * field of the finished plan node; we can't keep them in private state
 	 * because then they wouldn't be subject to later planner processing.
 	 */
-	return make_foreignscan(tlist,
+	scan = make_foreignscan(tlist,
 							local_exprs,
 							scan_relid,
 							params_list,
 							fdw_private);
+
+	/*
+	 * set fdw_ps_tlist to handle tuples generated by this scan.
+	 */
+	scan->fdw_ps_tlist = fdw_ps_tlist;
+
+	return scan;
 }
 
 /*
@@ -885,9 +989,8 @@ postgresBeginForeignScan(ForeignScanState *node, int eflags)
 	ForeignScan *fsplan = (ForeignScan *) node->ss.ps.plan;
 	EState	   *estate = node->ss.ps.state;
 	PgFdwScanState *fsstate;
-	RangeTblEntry *rte;
+	Oid			serverid;
 	Oid			userid;
-	ForeignTable *table;
 	ForeignServer *server;
 	UserMapping *user;
 	int			numParams;
@@ -907,22 +1010,13 @@ postgresBeginForeignScan(ForeignScanState *node, int eflags)
 	node->fdw_state = (void *) fsstate;
 
 	/*
-	 * Identify which user to do the remote access as.  This should match what
-	 * ExecCheckRTEPerms() does.
-	 */
-	rte = rt_fetch(fsplan->scan.scanrelid, estate->es_range_table);
-	userid = rte->checkAsUser ? rte->checkAsUser : GetUserId();
-
-	/* Get info about foreign table. */
-	fsstate->rel = node->ss.ss_currentRelation;
-	table = GetForeignTable(RelationGetRelid(fsstate->rel));
-	server = GetForeignServer(table->serverid);
-	user = GetUserMapping(userid, server->serverid);
-
-	/*
 	 * Get connection to the foreign server.  Connection manager will
 	 * establish new connection if necessary.
 	 */
+	serverid = intVal(list_nth(fsplan->fdw_private, FdwScanPrivateServerOid));
+	userid = intVal(list_nth(fsplan->fdw_private, FdwScanPrivateUserOid));
+	server = GetForeignServer(serverid);
+	user = GetUserMapping(userid, server->serverid);
 	fsstate->conn = GetConnection(server, user, false);
 
 	/* Assign a unique ID for my cursor */
@@ -933,7 +1027,7 @@ postgresBeginForeignScan(ForeignScanState *node, int eflags)
 	fsstate->query = strVal(list_nth(fsplan->fdw_private,
 									 FdwScanPrivateSelectSql));
 	fsstate->retrieved_attrs = (List *) list_nth(fsplan->fdw_private,
-											   FdwScanPrivateRetrievedAttrs);
+												 FdwScanPrivateRetrievedAttrs);
 
 	/* Create contexts for batches of tuples and per-tuple temp workspace. */
 	fsstate->batch_cxt = AllocSetContextCreate(estate->es_query_cxt,
@@ -947,8 +1041,18 @@ postgresBeginForeignScan(ForeignScanState *node, int eflags)
 											  ALLOCSET_SMALL_INITSIZE,
 											  ALLOCSET_SMALL_MAXSIZE);
 
-	/* Get info we'll need for input data conversion. */
-	fsstate->attinmeta = TupleDescGetAttInMetadata(RelationGetDescr(fsstate->rel));
+	/* Get info we'll need for input data conversion and error report. */
+	if (fsplan->scan.scanrelid > 0)
+	{
+		fsstate->relname = RelationGetRelationName(node->ss.ss_currentRelation);
+		fsstate->tupdesc = RelationGetDescr(node->ss.ss_currentRelation);
+	}
+	else
+	{
+		fsstate->relname = "foreign join";	/* TODO should be more detailed? */
+		fsstate->tupdesc = node->ss.ss_ScanTupleSlot->tts_tupleDescriptor;
+	}
+	fsstate->attinmeta = TupleDescGetAttInMetadata(fsstate->tupdesc);
 
 	/* Prepare for output conversion of parameters used in remote query. */
 	numParams = list_length(fsplan->fdw_exprs);
@@ -1751,11 +1855,13 @@ estimate_path_cost_size(PlannerInfo *root,
 		deparseSelectSql(&sql, root, baserel, fpinfo->attrs_used,
 						 &retrieved_attrs);
 		if (fpinfo->remote_conds)
-			appendWhereClause(&sql, root, baserel, fpinfo->remote_conds,
-							  true, NULL);
+			appendConditions(&sql, root, baserel, NULL, NULL,
+							 fpinfo->remote_conds, " WHERE ", NULL);
 		if (remote_join_conds)
-			appendWhereClause(&sql, root, baserel, remote_join_conds,
-							  (fpinfo->remote_conds == NIL), NULL);
+			appendConditions(&sql, root, baserel, NULL, NULL,
+							 remote_join_conds,
+							 fpinfo->remote_conds == NIL ? " WHERE " : " AND ",
+							 NULL);
 
 		/* Get the remote estimate */
 		conn = GetConnection(fpinfo->server, fpinfo->user, false);
@@ -2055,7 +2161,8 @@ fetch_more_data(ForeignScanState *node)
 		{
 			fsstate->tuples[i] =
 				make_tuple_from_result_row(res, i,
-										   fsstate->rel,
+										   fsstate->relname,
+										   fsstate->tupdesc,
 										   fsstate->attinmeta,
 										   fsstate->retrieved_attrs,
 										   fsstate->temp_cxt);
@@ -2273,7 +2380,8 @@ store_returning_result(PgFdwModifyState *fmstate,
 		HeapTuple	newtup;
 
 		newtup = make_tuple_from_result_row(res, 0,
-											fmstate->rel,
+										RelationGetRelationName(fmstate->rel),
+											RelationGetDescr(fmstate->rel),
 											fmstate->attinmeta,
 											fmstate->retrieved_attrs,
 											fmstate->temp_cxt);
@@ -2565,7 +2673,8 @@ analyze_row_processor(PGresult *res, int row, PgFdwAnalyzeState *astate)
 		oldcontext = MemoryContextSwitchTo(astate->anl_cxt);
 
 		astate->rows[pos] = make_tuple_from_result_row(res, row,
-													   astate->rel,
+										   RelationGetRelationName(astate->rel),
+											   RelationGetDescr(astate->rel),
 													   astate->attinmeta,
 													 astate->retrieved_attrs,
 													   astate->temp_cxt);
@@ -2839,6 +2948,302 @@ postgresImportForeignSchema(ImportForeignSchemaStmt *stmt, Oid serverOid)
 }
 
 /*
+ * Construct PgFdwRelationInfo from two join sources
+ */
+static PgFdwRelationInfo *
+merge_fpinfo(RelOptInfo *outerrel, RelOptInfo *innerrel, JoinType jointype)
+{
+	PgFdwRelationInfo *fpinfo_o;
+	PgFdwRelationInfo *fpinfo_i;
+	PgFdwRelationInfo *fpinfo;
+
+	fpinfo_o = (PgFdwRelationInfo *) outerrel->fdw_private;
+	fpinfo_i = (PgFdwRelationInfo *) innerrel->fdw_private;
+
+	fpinfo = (PgFdwRelationInfo *) palloc0(sizeof(PgFdwRelationInfo));
+
+	fpinfo->remote_conds = list_concat(copyObject(fpinfo_o->remote_conds),
+									   copyObject(fpinfo_i->remote_conds));
+	fpinfo->local_conds = list_concat(copyObject(fpinfo_o->local_conds),
+									  copyObject(fpinfo_i->local_conds));
+
+	fpinfo->attrs_used = NULL;		/* Use fdw_ps_tlist */
+	fpinfo->local_conds_cost.startup = fpinfo_o->local_conds_cost.startup +
+									   fpinfo_i->local_conds_cost.startup;
+	fpinfo->local_conds_cost.per_tuple = fpinfo_o->local_conds_cost.per_tuple +
+										 fpinfo_i->local_conds_cost.per_tuple;
+	fpinfo->local_conds_sel = fpinfo_o->local_conds_sel *
+							  fpinfo_i->local_conds_sel;
+	if (jointype == JOIN_INNER)
+		fpinfo->rows = Min(fpinfo_o->rows, fpinfo_i->rows);
+	else
+		fpinfo->rows = Max(fpinfo_o->rows, fpinfo_i->rows);
+	/* TODO estimate more accurately */
+	fpinfo->rows = Min(fpinfo_o->rows, fpinfo_i->rows);
+	fpinfo->width = fpinfo_o->width + fpinfo_i->width;
+	fpinfo->use_remote_estimate = false;
+	fpinfo->fdw_startup_cost = (fpinfo_o->fdw_startup_cost +
+								fpinfo_i->fdw_startup_cost) / 2;
+	fpinfo->fdw_tuple_cost = (fpinfo_o->fdw_tuple_cost +
+							  fpinfo_i->fdw_tuple_cost) / 2;
+
+	fpinfo->startup_cost = fpinfo->fdw_startup_cost +
+						   fpinfo_i->fdw_startup_cost;
+	fpinfo->total_cost = fpinfo->startup_cost +
+						 fpinfo->fdw_tuple_cost * fpinfo->rows;
+
+	fpinfo->table = NULL;	/* always NULL in join case */
+	fpinfo->server = fpinfo_o->server;
+	fpinfo->user = fpinfo_o->user ? fpinfo_o->user : fpinfo_i->user;
+	/* checkAsUser must be identical */
+	fpinfo->userid = fpinfo_o->userid;
+
+	return fpinfo;
+}
+
+/*
+ * postgresGetForeignJoinPaths
+ *		Add possible ForeignPath to joinrel.
+ *
+ * Joins satisfying conditions below can be pushed down to remote PostgreSQL server.
+ *
+ * 1) Join type is inner or outer
+ * 2) Join conditions consist of remote-safe expressions.
+ * 3) Join source relations don't have any local filter.
+ */
+static void
+postgresGetForeignJoinPaths(PlannerInfo *root,
+							RelOptInfo *joinrel,
+							RelOptInfo *outerrel,
+							RelOptInfo *innerrel,
+							SpecialJoinInfo *sjinfo,
+							List *restrictlist)
+{
+	Index			rti;
+	Oid				serverid = InvalidOid;
+	Oid				userid = InvalidOid;
+	PgFdwRelationInfo *fpinfo;
+	JoinType		jointype = !sjinfo ? JOIN_INNER : sjinfo->jointype;
+	ForeignPath	   *joinpath;
+	double			rows;
+	Cost			startup_cost;
+	Cost			total_cost;
+
+	ForeignPath	   *path_o;
+	ForeignPath	   *path_i;
+	ListCell	   *lc;
+	List		   *fdw_private;
+	List		   *joinclauses;
+	List		   *otherclauses;
+
+	/*
+	 * fdw_private might be NULL if outer/inner relation is not safe to
+	 * push-down.
+	 */
+	if (!outerrel->fdw_private)
+	{
+		ereport(DEBUG3, (errmsg("outer is not safe to push-down")));
+		return;
+	}
+	if (!innerrel->fdw_private)
+	{
+		ereport(DEBUG3, (errmsg("inner is not safe to push-down")));
+		return;
+	}
+
+	/*
+	 * Currently we don't push-down joins in query for UPDATE/DELETE, because it
+	 * introduces complexity of whole-row-reference.  This restriction might be
+	 * relaxed in a future release.
+	 */ 
+	if (root->parse->commandType != CMD_SELECT)
+	{
+		ereport(DEBUG3, (errmsg("command type is not SELECT")));
+		return;
+	}
+
+	/*
+	 * We support all outer joins in addition to inner join.
+	 */
+	if (jointype != JOIN_INNER && jointype != JOIN_LEFT &&
+		jointype != JOIN_RIGHT && jointype != JOIN_FULL)
+	{
+		ereport(DEBUG3, (errmsg("unsupported join type (SEMI, ANTI)")));
+		return;
+	}
+
+	/*
+	 * Both outer and inner relation must have at least one ForeignPath.
+	 * Currently we choose the first ForeignPath which doesn't have param_info,
+	 * thus simplest ForeignPath, but we need to consider others when we
+	 * support multiple ForeignPaths for a RelOptInfo.
+	 */
+	Assert(joinrel->fdw_handler != InvalidOid);
+	for (rti = 1; rti < root->simple_rel_array_size; rti++)
+	{
+		RelOptInfo	   *rel;
+		PgFdwRelationInfo *fpinfo_tmp;
+
+		/*
+		 * Skip relations which are not used in the join.
+		 */
+		if (!bms_is_member(rti, joinrel->relids))
+			continue;
+		rel = root->simple_rel_array[rti];
+		Assert(rel);
+
+		/*
+		 * All relations in the join must belong to same server.
+		 */
+		fpinfo_tmp = rel->fdw_private;
+		if (serverid != InvalidOid && fpinfo_tmp->server->serverid != serverid)
+		{
+			ereport(DEBUG3, (errmsg("server unmatch")));
+			return;
+		}
+		serverid = fpinfo_tmp->server->serverid;
+
+		/*
+		 * No source relation can have local conditions.  This can be relaxed
+		 * if the join is an inner join and local conditions don't contain
+		 * volatile function/operator, but as of now we leave it as future
+		 * enhancement.
+		 */
+		if (fpinfo_tmp->local_conds != NULL)
+		{
+			ereport(DEBUG3, (errmsg("join with local filter")));
+			return;
+		}
+
+		/*
+		 * effective userid of all source relations should be identical.
+		 */
+		if (userid != InvalidOid && fpinfo_tmp->userid != userid)
+		{
+			ereport(DEBUG3, (errmsg("unmatch userid")));
+			return;
+		}
+		userid = fpinfo_tmp->userid;
+	}
+
+	/*
+	 * Separate restrictlist into two lists, join conditions and remote filters.
+	 */
+	joinclauses = restrictlist;
+	if (IS_OUTER_JOIN(jointype))
+	{
+		extract_actual_join_clauses(joinclauses, &joinclauses, &otherclauses);
+	}
+	else
+	{
+		joinclauses = extract_actual_clauses(joinclauses, false);
+		otherclauses = NIL;
+	}
+
+	/*
+	 * Note that CROSS JOIN (cartesian product) is transformed to JOIN_INNER
+	 * with empty joinclauses.  Pushing down CROSS JOIN usually produces more
+	 * result than retrieving each table separately, so we don't push down
+	 * such joins.
+	 */
+	if (jointype == JOIN_INNER && joinclauses == NIL)
+	{
+		ereport(DEBUG3, (errmsg("unsupported join type (CROSS)")));
+		return;
+	}
+
+	/*
+	 * Join condition must be safe to push down.
+	 */
+	foreach(lc, joinclauses)
+	{
+		Expr *expr = (Expr *) lfirst(lc);
+
+		if (!is_foreign_expr(root, joinrel, expr))
+		{
+			ereport(DEBUG3, (errmsg("join quals contains unsafe conditions")));
+			return;
+		}
+	}
+
+	/*
+	 * Other condition evaluated on remote side must be safe to push down.
+	 */
+	foreach(lc, otherclauses)
+	{
+		Expr *expr = (Expr *) lfirst(lc);
+
+		if (!is_foreign_expr(root, joinrel, expr))
+		{
+			ereport(DEBUG3, (errmsg("remote filter contains unsafe conditions")));
+			return;
+		}
+	}
+
+	/* Here we know that this join can be pushed-down to remote side. */
+
+	/* Construct fpinfo for the join relation */
+	fpinfo = merge_fpinfo(outerrel, innerrel, jointype); 
+	joinrel->fdw_private = fpinfo;
+
+	/* TODO determine more accurate cost and rows of the join. */
+	rows = fpinfo->rows;
+	startup_cost = fpinfo->startup_cost;
+	total_cost = fpinfo->total_cost;
+
+	/*
+	 * Find child paths to pass them to GetForeignPlan.  We ignore
+	 * parameterized path here to simplify planning.
+	 */
+	path_o = path_i = NULL;
+	foreach(lc, outerrel->pathlist)
+	{
+		Path *path = (Path *) lfirst(lc);
+		if (IsA(path, ForeignPath) && !path->param_info)
+		{
+			path_o = (ForeignPath *) path;
+			break;
+		}
+	}
+	foreach(lc, innerrel->pathlist)
+	{
+		Path *path = (Path *) lfirst(lc);
+		if (IsA(path, ForeignPath) && !path->param_info)
+		{
+			path_i = (ForeignPath *) path;
+			break;
+		}
+	}
+	Assert(path_o);
+	Assert(path_i);
+
+	fdw_private = list_make4(path_o,
+							 path_i,
+							 makeInteger(jointype),
+							 joinclauses);
+	fdw_private = lappend(fdw_private, otherclauses);
+
+	/*
+	 * Create a new join path and add it to the joinrel which represents a join
+	 * between foreign tables.
+	 */
+	joinpath = create_foreignscan_path(root,
+									   joinrel,
+									   rows,
+									   startup_cost,
+									   total_cost,
+									   NIL,		/* no pathkeys */
+									   NULL,	/* no required_outer */
+									   fdw_private);
+
+	/* Add generated path into joinrel by add_path(). */
+	add_path(joinrel, (Path *) joinpath);
+	elog(DEBUG3, "join path added");
+
+	/* TODO consider parameterized paths */
+}
+
+/*
  * Create a tuple from the specified row of the PGresult.
  *
  * rel is the local representation of the foreign table, attinmeta is
@@ -2849,13 +3254,13 @@ postgresImportForeignSchema(ImportForeignSchemaStmt *stmt, Oid serverOid)
 static HeapTuple
 make_tuple_from_result_row(PGresult *res,
 						   int row,
-						   Relation rel,
+						   const char *relname,
+						   TupleDesc tupdesc,
 						   AttInMetadata *attinmeta,
 						   List *retrieved_attrs,
 						   MemoryContext temp_context)
 {
 	HeapTuple	tuple;
-	TupleDesc	tupdesc = RelationGetDescr(rel);
 	Datum	   *values;
 	bool	   *nulls;
 	ItemPointer ctid = NULL;
@@ -2882,7 +3287,8 @@ make_tuple_from_result_row(PGresult *res,
 	/*
 	 * Set up and install callback to report where conversion error occurs.
 	 */
-	errpos.rel = rel;
+	errpos.relname = relname;
+	errpos.tupdesc = tupdesc;
 	errpos.cur_attno = 0;
 	errcallback.callback = conversion_error_callback;
 	errcallback.arg = (void *) &errpos;
@@ -2967,10 +3373,10 @@ static void
 conversion_error_callback(void *arg)
 {
 	ConversionLocation *errpos = (ConversionLocation *) arg;
-	TupleDesc	tupdesc = RelationGetDescr(errpos->rel);
+	TupleDesc	tupdesc = errpos->tupdesc;
 
 	if (errpos->cur_attno > 0 && errpos->cur_attno <= tupdesc->natts)
 		errcontext("column \"%s\" of foreign table \"%s\"",
 				   NameStr(tupdesc->attrs[errpos->cur_attno - 1]->attname),
-				   RelationGetRelationName(errpos->rel));
+				   errpos->relname);
 }
diff --git a/contrib/postgres_fdw/postgres_fdw.h b/contrib/postgres_fdw/postgres_fdw.h
index 950c6f7..2ce8cb2e 100644
--- a/contrib/postgres_fdw/postgres_fdw.h
+++ b/contrib/postgres_fdw/postgres_fdw.h
@@ -16,6 +16,7 @@
 #include "foreign/foreign.h"
 #include "lib/stringinfo.h"
 #include "nodes/relation.h"
+#include "nodes/plannodes.h"
 #include "utils/relcache.h"
 
 #include "libpq-fe.h"
@@ -52,12 +53,25 @@ extern void deparseSelectSql(StringInfo buf,
 				 RelOptInfo *baserel,
 				 Bitmapset *attrs_used,
 				 List **retrieved_attrs);
-extern void appendWhereClause(StringInfo buf,
+extern void appendConditions(StringInfo buf,
 				  PlannerInfo *root,
 				  RelOptInfo *baserel,
+				  List *outertlist,
+				  List *innertlist,
 				  List *exprs,
-				  bool is_first,
+				  const char *prefix,
 				  List **params);
+extern void deparseJoinSql(StringInfo sql,
+			   PlannerInfo *root,
+			   RelOptInfo *baserel,
+			   RelOptInfo *outerrel,
+			   RelOptInfo *innerrel,
+			   const char *sql_o,
+			   const char *sql_i,
+			   JoinType jointype,
+			   List *joinclauses,
+			   List *otherclauses,
+			   List **retrieved_attrs);
 extern void deparseInsertSql(StringInfo buf, PlannerInfo *root,
 				 Index rtindex, Relation rel,
 				 List *targetAttrs, List *returningList,
diff --git a/contrib/postgres_fdw/sql/postgres_fdw.sql b/contrib/postgres_fdw/sql/postgres_fdw.sql
index 4a23457..749d159 100644
--- a/contrib/postgres_fdw/sql/postgres_fdw.sql
+++ b/contrib/postgres_fdw/sql/postgres_fdw.sql
@@ -11,12 +11,17 @@ DO $d$
             OPTIONS (dbname '$$||current_database()||$$',
                      port '$$||current_setting('port')||$$'
             )$$;
+        EXECUTE $$CREATE SERVER loopback2 FOREIGN DATA WRAPPER postgres_fdw
+            OPTIONS (dbname '$$||current_database()||$$',
+                     port '$$||current_setting('port')||$$'
+            )$$;
     END;
 $d$;
 
 CREATE USER MAPPING FOR public SERVER testserver1
 	OPTIONS (user 'value', password 'value');
 CREATE USER MAPPING FOR CURRENT_USER SERVER loopback;
+CREATE USER MAPPING FOR CURRENT_USER SERVER loopback2;
 
 -- ===================================================================
 -- create objects used through FDW loopback server
@@ -39,6 +44,18 @@ CREATE TABLE "S 1"."T 2" (
 	c2 text,
 	CONSTRAINT t2_pkey PRIMARY KEY (c1)
 );
+CREATE TABLE "S 1"."T 3" (
+	c1 int NOT NULL,
+	c2 int NOT NULL,
+	c3 text,
+	CONSTRAINT t3_pkey PRIMARY KEY (c1)
+);
+CREATE TABLE "S 1"."T 4" (
+	c1 int NOT NULL,
+	c2 int NOT NULL,
+	c4 text,
+	CONSTRAINT t4_pkey PRIMARY KEY (c1)
+);
 
 INSERT INTO "S 1"."T 1"
 	SELECT id,
@@ -54,9 +71,23 @@ INSERT INTO "S 1"."T 2"
 	SELECT id,
 	       'AAA' || to_char(id, 'FM000')
 	FROM generate_series(1, 100) id;
+INSERT INTO "S 1"."T 3"
+	SELECT id,
+	       id + 1,
+	       'AAA' || to_char(id, 'FM000')
+	FROM generate_series(1, 100) id;
+DELETE FROM "S 1"."T 3" WHERE c1 % 2 != 0;	-- delete for outer join tests
+INSERT INTO "S 1"."T 4"
+	SELECT id,
+	       id + 1,
+	       'AAA' || to_char(id, 'FM000')
+	FROM generate_series(1, 100) id;
+DELETE FROM "S 1"."T 4" WHERE c1 % 3 != 0;	-- delete for outer join tests
 
 ANALYZE "S 1"."T 1";
 ANALYZE "S 1"."T 2";
+ANALYZE "S 1"."T 3";
+ANALYZE "S 1"."T 4";
 
 -- ===================================================================
 -- create foreign tables
@@ -87,6 +118,29 @@ CREATE FOREIGN TABLE ft2 (
 ) SERVER loopback;
 ALTER FOREIGN TABLE ft2 DROP COLUMN cx;
 
+CREATE FOREIGN TABLE ft4 (
+	c1 int NOT NULL,
+	c2 int NOT NULL,
+	c3 text
+) SERVER loopback OPTIONS (schema_name 'S 1', table_name 'T 3');
+
+CREATE FOREIGN TABLE ft5 (
+	c1 int NOT NULL,
+	c2 int NOT NULL,
+	c3 text
+) SERVER loopback OPTIONS (schema_name 'S 1', table_name 'T 4');
+
+CREATE FOREIGN TABLE ft6 (
+	c1 int NOT NULL,
+	c2 int NOT NULL,
+	c3 text
+) SERVER loopback2 OPTIONS (schema_name 'S 1', table_name 'T 4');
+CREATE USER view_owner;
+GRANT ALL ON ft5 TO view_owner;
+CREATE VIEW v_ft5 AS SELECT * FROM ft5;
+ALTER VIEW v_ft5 OWNER TO view_owner;
+CREATE USER MAPPING FOR view_owner SERVER loopback;
+
 -- ===================================================================
 -- tests for validator
 -- ===================================================================
@@ -158,8 +212,6 @@ EXPLAIN (VERBOSE, COSTS false) SELECT * FROM ft1 t1 WHERE c1 = 102 FOR SHARE;
 SELECT * FROM ft1 t1 WHERE c1 = 102 FOR SHARE;
 -- aggregate
 SELECT COUNT(*) FROM ft1 t1;
--- join two tables
-SELECT t1.c1 FROM ft1 t1 JOIN ft2 t2 ON (t1.c1 = t2.c1) ORDER BY t1.c3, t1.c1 OFFSET 100 LIMIT 10;
 -- subquery
 SELECT * FROM ft1 t1 WHERE t1.c3 IN (SELECT c3 FROM ft2 t2 WHERE c1 <= 10) ORDER BY c1;
 -- subquery+MAX
@@ -216,6 +268,78 @@ SELECT * FROM ft1 WHERE c1 = ANY (ARRAY(SELECT c1 FROM ft2 WHERE c1 < 5));
 SELECT * FROM ft2 WHERE c1 = ANY (ARRAY(SELECT c1 FROM ft1 WHERE c1 < 5));
 
 -- ===================================================================
+-- JOIN queries
+-- ===================================================================
+-- join two tables
+EXPLAIN (COSTS false, VERBOSE)
+SELECT t1.c1, t2.c1 FROM ft1 t1 JOIN ft2 t2 ON (t1.c1 = t2.c1) ORDER BY t1.c3, t1.c1 OFFSET 100 LIMIT 10;
+SELECT t1.c1, t2.c1 FROM ft1 t1 JOIN ft2 t2 ON (t1.c1 = t2.c1) ORDER BY t1.c3, t1.c1 OFFSET 100 LIMIT 10;
+-- join three tables
+EXPLAIN (COSTS false, VERBOSE)
+SELECT t1.c1, t2.c1, t3.c1 FROM ft1 t1 JOIN ft2 t2 ON (t1.c1 = t2.c1) JOIN ft4 t3 ON (t3.c1 = t1.c1) ORDER BY t1.c3, t1.c1 OFFSET 10 LIMIT 10;
+SELECT t1.c1, t2.c1, t3.c1 FROM ft1 t1 JOIN ft2 t2 ON (t1.c1 = t2.c1) JOIN ft4 t3 ON (t3.c1 = t1.c1) ORDER BY t1.c3, t1.c1 OFFSET 10 LIMIT 10;
+-- left outer join
+EXPLAIN (COSTS false, VERBOSE)
+SELECT t1.c1, t2.c1 FROM ft4 t1 LEFT JOIN ft5 t2 ON (t1.c1 = t2.c1) ORDER BY t1.c1, t2.c1 OFFSET 10 LIMIT 10;
+SELECT t1.c1, t2.c1 FROM ft4 t1 LEFT JOIN ft5 t2 ON (t1.c1 = t2.c1) ORDER BY t1.c1, t2.c1 OFFSET 10 LIMIT 10;
+-- right outer join
+SET enable_mergejoin = off; -- planner chooses MergeJoin even though it has higher costs, so disable it for testing.
+EXPLAIN (COSTS false, VERBOSE)
+SELECT t1.c1, t2.c1 FROM ft4 t1 RIGHT JOIN ft5 t2 ON (t1.c1 = t2.c1) ORDER BY t2.c1, t2.c1 OFFSET 10 LIMIT 10;
+SELECT t1.c1, t2.c1 FROM ft4 t1 RIGHT JOIN ft5 t2 ON (t1.c1 = t2.c1) ORDER BY t2.c1, t2.c1 OFFSET 10 LIMIT 10;
+SET enable_mergejoin = on;
+-- full outer join
+EXPLAIN (COSTS false, VERBOSE)
+SELECT t1.c1, t2.c1 FROM ft4 t1 FULL JOIN ft5 t2 ON (t1.c1 = t2.c1) ORDER BY t1.c1, t2.c1 OFFSET 45 LIMIT 10;
+SELECT t1.c1, t2.c1 FROM ft4 t1 FULL JOIN ft5 t2 ON (t1.c1 = t2.c1) ORDER BY t1.c1, t2.c1 OFFSET 45 LIMIT 10;
+-- full outer join + WHERE clause, only matched rows
+EXPLAIN (COSTS false, VERBOSE)
+SELECT t1.c1, t2.c1 FROM ft4 t1 FULL JOIN ft5 t2 ON (t1.c1 = t2.c1) WHERE (t1.c1 = t2.c1 OR t1.c1 IS NULL) ORDER BY t1.c1, t2.c1 OFFSET 10 LIMIT 10;
+SELECT t1.c1, t2.c1 FROM ft4 t1 FULL JOIN ft5 t2 ON (t1.c1 = t2.c1) WHERE (t1.c1 = t2.c1 OR t1.c1 IS NULL) ORDER BY t1.c1, t2.c1 OFFSET 10 LIMIT 10;
+-- join at WHERE clause 
+SET enable_mergejoin = off; -- planner chooses MergeJoin even though it has higher costs, so disable it for testing.
+EXPLAIN (COSTS false, VERBOSE)
+SELECT t1.c1, t2.c1 FROM ft4 t1 LEFT JOIN ft5 t2 ON true WHERE (t1.c1 = t2.c1) ORDER BY t1.c1, t2.c1 OFFSET 10 LIMIT 10;
+SELECT t1.c1, t2.c1 FROM ft4 t1 LEFT JOIN ft5 t2 ON true WHERE (t1.c1 = t2.c1) ORDER BY t1.c1, t2.c1 OFFSET 10 LIMIT 10;
+SET enable_mergejoin = on;
+-- join in CTE
+EXPLAIN (COSTS false, VERBOSE)
+WITH t (c1_1, c1_3, c2_1) AS (SELECT t1.c1, t1.c3, t2.c1 FROM ft1 t1 JOIN ft2 t2 ON (t1.c1 = t2.c1)) SELECT c1_1, c2_1 FROM t ORDER BY c1_3, c1_1 OFFSET 100 LIMIT 10;
+WITH t (c1_1, c1_3, c2_1) AS (SELECT t1.c1, t1.c3, t2.c1 FROM ft1 t1 JOIN ft2 t2 ON (t1.c1 = t2.c1)) SELECT c1_1, c2_1 FROM t ORDER BY c1_3, c1_1 OFFSET 100 LIMIT 10;
+-- partially unsafe to push down, not pushed down
+EXPLAIN (COSTS false, VERBOSE)
+SELECT t1.c1 FROM ft1 t1 JOIN ft2 t2 ON t2.c1 = t2.c1 JOIN ft4 t3 ON t2.c1 = t3.c1 ORDER BY t1.c1 OFFSET 10 LIMIT 10;
+SELECT t1.c1 FROM ft1 t1 JOIN ft2 t2 ON t2.c1 = t2.c1 JOIN ft4 t3 ON t2.c1 = t3.c1 ORDER BY t1.c1 OFFSET 10 LIMIT 10;
+-- SEMI JOIN, not pushed down
+EXPLAIN (COSTS false, VERBOSE)
+SELECT t1.c1 FROM ft1 t1 WHERE EXISTS (SELECT 1 FROM ft2 t2 WHERE t1.c1 = t2.c1) ORDER BY t1.c1 OFFSET 100 LIMIT 10;
+SELECT t1.c1 FROM ft1 t1 WHERE EXISTS (SELECT 1 FROM ft2 t2 WHERE t1.c1 = t2.c1) ORDER BY t1.c1 OFFSET 100 LIMIT 10;
+-- ANTI JOIN, not pushed down
+EXPLAIN (COSTS false, VERBOSE)
+SELECT t1.c1 FROM ft1 t1 WHERE NOT EXISTS (SELECT 1 FROM ft2 t2 WHERE t1.c1 = t2.c2) ORDER BY t1.c1 OFFSET 100 LIMIT 10;
+SELECT t1.c1 FROM ft1 t1 WHERE NOT EXISTS (SELECT 1 FROM ft2 t2 WHERE t1.c1 = t2.c2) ORDER BY t1.c1 OFFSET 100 LIMIT 10;
+-- CROSS JOIN, not pushed down
+EXPLAIN (COSTS false, VERBOSE)
+SELECT t1.c1, t2.c1 FROM ft1 t1 CROSS JOIN ft2 t2 ORDER BY t1.c1, t2.c1 OFFSET 100 LIMIT 10;
+SELECT t1.c1, t2.c1 FROM ft1 t1 CROSS JOIN ft2 t2 ORDER BY t1.c1, t2.c1 OFFSET 100 LIMIT 10;
+-- different server
+EXPLAIN (COSTS false, VERBOSE)
+SELECT t1.c1, t2.c1 FROM ft5 t1 JOIN ft6 t2 ON (t1.c1 = t2.c1) ORDER BY t1.c1, t2.c1 OFFSET 100 LIMIT 10;
+SELECT t1.c1, t2.c1 FROM ft5 t1 JOIN ft6 t2 ON (t1.c1 = t2.c1) ORDER BY t1.c1, t2.c1 OFFSET 100 LIMIT 10;
+-- different effective user for permission check
+EXPLAIN (COSTS false, VERBOSE)
+SELECT t1.c1, t2.c1 FROM ft5 t1 JOIN v_ft5 t2 ON (t1.c1 = t2.c1) ORDER BY t1.c1, t2.c1 OFFSET 100 LIMIT 10;
+SELECT t1.c1, t2.c1 FROM ft5 t1 JOIN v_ft5 t2 ON (t1.c1 = t2.c1) ORDER BY t1.c1, t2.c1 OFFSET 100 LIMIT 10;
+-- unsafe join conditions
+EXPLAIN (COSTS false, VERBOSE)
+SELECT t1.c1, t2.c1 FROM ft1 t1 JOIN ft2 t2 ON (t1.c8 = t2.c8) ORDER BY t1.c3, t1.c1 OFFSET 100 LIMIT 10;
+SELECT t1.c1, t2.c1 FROM ft1 t1 JOIN ft2 t2 ON (t1.c8 = t2.c8) ORDER BY t1.c3, t1.c1 OFFSET 100 LIMIT 10;
+-- local filter (unsafe conditions on one side)
+EXPLAIN (COSTS false, VERBOSE)
+SELECT t1.c1, t2.c1 FROM ft1 t1 JOIN ft2 t2 ON (t1.c1 = t2.c1) WHERE t1.c8 = 'foo' ORDER BY t1.c3, t1.c1 OFFSET 100 LIMIT 10;
+SELECT t1.c1, t2.c1 FROM ft1 t1 JOIN ft2 t2 ON (t1.c1 = t2.c1) WHERE t1.c8 = 'foo' ORDER BY t1.c3, t1.c1 OFFSET 100 LIMIT 10;
+
+-- ===================================================================
 -- parameterized queries
 -- ===================================================================
 -- simple join
@@ -666,116 +790,6 @@ UPDATE rem1 SET f2 = 'testo';
 INSERT INTO rem1(f2) VALUES ('test') RETURNING ctid;
 
 -- ===================================================================
--- test inheritance features
--- ===================================================================
-
-CREATE TABLE a (aa TEXT);
-CREATE TABLE loct (aa TEXT, bb TEXT);
-CREATE FOREIGN TABLE b (bb TEXT) INHERITS (a)
-  SERVER loopback OPTIONS (table_name 'loct');
-
-INSERT INTO a(aa) VALUES('aaa');
-INSERT INTO a(aa) VALUES('aaaa');
-INSERT INTO a(aa) VALUES('aaaaa');
-
-INSERT INTO b(aa) VALUES('bbb');
-INSERT INTO b(aa) VALUES('bbbb');
-INSERT INTO b(aa) VALUES('bbbbb');
-
-SELECT tableoid::regclass, * FROM a;
-SELECT tableoid::regclass, * FROM b;
-SELECT tableoid::regclass, * FROM ONLY a;
-
-UPDATE a SET aa = 'zzzzzz' WHERE aa LIKE 'aaaa%';
-
-SELECT tableoid::regclass, * FROM a;
-SELECT tableoid::regclass, * FROM b;
-SELECT tableoid::regclass, * FROM ONLY a;
-
-UPDATE b SET aa = 'new';
-
-SELECT tableoid::regclass, * FROM a;
-SELECT tableoid::regclass, * FROM b;
-SELECT tableoid::regclass, * FROM ONLY a;
-
-UPDATE a SET aa = 'newtoo';
-
-SELECT tableoid::regclass, * FROM a;
-SELECT tableoid::regclass, * FROM b;
-SELECT tableoid::regclass, * FROM ONLY a;
-
-DELETE FROM a;
-
-SELECT tableoid::regclass, * FROM a;
-SELECT tableoid::regclass, * FROM b;
-SELECT tableoid::regclass, * FROM ONLY a;
-
-DROP TABLE a CASCADE;
-DROP TABLE loct;
-
--- Check SELECT FOR UPDATE/SHARE with an inherited source table
-create table loct1 (f1 int, f2 int, f3 int);
-create table loct2 (f1 int, f2 int, f3 int);
-
-create table foo (f1 int, f2 int);
-create foreign table foo2 (f3 int) inherits (foo)
-  server loopback options (table_name 'loct1');
-create table bar (f1 int, f2 int);
-create foreign table bar2 (f3 int) inherits (bar)
-  server loopback options (table_name 'loct2');
-
-insert into foo values(1,1);
-insert into foo values(3,3);
-insert into foo2 values(2,2,2);
-insert into foo2 values(4,4,4);
-insert into bar values(1,11);
-insert into bar values(2,22);
-insert into bar values(6,66);
-insert into bar2 values(3,33,33);
-insert into bar2 values(4,44,44);
-insert into bar2 values(7,77,77);
-
-explain (verbose, costs off)
-select * from bar where f1 in (select f1 from foo) for update;
-select * from bar where f1 in (select f1 from foo) for update;
-
-explain (verbose, costs off)
-select * from bar where f1 in (select f1 from foo) for share;
-select * from bar where f1 in (select f1 from foo) for share;
-
--- Check UPDATE with inherited target and an inherited source table
-explain (verbose, costs off)
-update bar set f2 = f2 + 100 where f1 in (select f1 from foo);
-update bar set f2 = f2 + 100 where f1 in (select f1 from foo);
-
-select tableoid::regclass, * from bar order by 1,2;
-
--- Check UPDATE with inherited target and an appendrel subquery
-explain (verbose, costs off)
-update bar set f2 = f2 + 100
-from
-  ( select f1 from foo union all select f1+3 from foo ) ss
-where bar.f1 = ss.f1;
-update bar set f2 = f2 + 100
-from
-  ( select f1 from foo union all select f1+3 from foo ) ss
-where bar.f1 = ss.f1;
-
-select tableoid::regclass, * from bar order by 1,2;
-
--- Test that WHERE CURRENT OF is not supported
-begin;
-declare c cursor for select * from bar where f1 = 7;
-fetch from c;
-update bar set f2 = null where current of c;
-rollback;
-
-drop table foo cascade;
-drop table bar cascade;
-drop table loct1;
-drop table loct2;
-
--- ===================================================================
 -- test IMPORT FOREIGN SCHEMA
 -- ===================================================================
 
@@ -831,3 +845,7 @@ DROP TYPE "Colors" CASCADE;
 IMPORT FOREIGN SCHEMA import_source LIMIT TO (t5)
   FROM SERVER loopback INTO import_dest5;  -- ERROR
 ROLLBACK;
+
+-- Cleanup
+DROP OWNED BY view_owner;
+DROP USER view_owner;

#17

Kouhei Kaigai

kaigai@ak.jp.nec.com

almost 11 years ago

In reply to: Shigeru Hanada (#16)

Hanada-san,

Thanks for your dedicated efforts on the remote-join feature.
Below are the comments from my side.

* Bug - mixture of ctid system column and whole-row reference
When the "ctid" system column is required, deparseSelectSql()
adds a ctid reference at the base relation scan level.
On the other hand, a whole-row reference is transformed into
a reference to the underlying relation. This works fine as long as
no system column is specified. However, a system column reference
makes the tuple layout differ from the expected one,
which eventually leads to an error.

postgres=# select ft1.ctid,ft1 from ft1,ft2 where a=b;
ERROR: malformed record literal: "(2,2,bbb,"(0,2)")"
DETAIL: Too many columns.
CONTEXT: column "" of foreign table "foreign join"
STATEMENT: select ft1.ctid,ft1 from ft1,ft2 where a=b;

postgres=# explain verbose
select ft1.ctid,ft1 from ft1,ft2 where a=b;
QUERY PLAN
--------------------------------------------------------------------------------
Foreign Scan (cost=200.00..208.35 rows=835 width=70)
Output: ft1.ctid, ft1.*
Remote SQL: SELECT l.a1, l.a2 FROM (SELECT l.a7, l, l.a10 FROM (SELECT id a9,
a a10, atext a11, ctid a7 FROM public.t1) l) l (a1, a2, a3) INNER JOIN (SELECT
b a10 FROM public.t2) r (a1) ON ((l.a3 = r.a1))

"l" of the first SELECT represents a whole-row reference.
However, the underlying SELECT contains system columns in its target
list.

Is it possible to construct a query like this?
SELECT l.a1, l.a2 FROM (SELECT (id,a,atext), ctid) l (a1, a2) ...
^^^^^^^^^^
Probably, it is a job for deparseProjectionSql().

* postgresGetForeignJoinPaths()
It walks root->simple_rel_array to check whether
all the relations involved are managed by the same foreign
server with the same credentials.
There may be a more graceful way to do this. Pay attention to
the fdw_private field of innerrel/outerrel. If they have
a valid fdw_private, it means the FDW driver (postgres_fdw)
considers that all the underlying relation scans/joins can
run on the remote server.
So, all we need to check is whether the server-id and user-id
of both relations are identical.
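
To make the suggestion concrete, here is a minimal sketch of that check. The struct and function names (SketchRelInfo, sketch_same_server_and_user) are hypothetical stand-ins for whatever postgres_fdw keeps in fdw_private; they are not taken from the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical stand-in for the per-relation state an FDW keeps in
 * RelOptInfo->fdw_private.  A NULL pointer models "not pushable".
 */
typedef struct SketchRelInfo
{
    unsigned int serverid;      /* foreign server the relation lives on */
    unsigned int userid;        /* effective user used to access it */
} SketchRelInfo;

/*
 * Return true only when both sides were already judged pushable
 * (non-NULL info) and they agree on server and user.
 */
static bool
sketch_same_server_and_user(const SketchRelInfo *outer,
                            const SketchRelInfo *inner)
{
    if (outer == NULL || inner == NULL)
        return false;
    return outer->serverid == inner->serverid &&
           outer->userid == inner->userid;
}
```

The point of the design is that no loop over simple_rel_array is needed: each child relation already summarizes everything beneath it.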

* merge_fpinfo()
It seems to me fpinfo->rows should be joinrel->rows, and
fpinfo->width also should be joinrel->width.
No need to have special intelligence here, is there?

* explain output

The EXPLAIN output may be a bit insufficient for understanding what
the plan actually tries to do.

postgres=# explain select * from ft1,ft2 where a = b;
QUERY PLAN
--------------------------------------------------------
Foreign Scan (cost=200.00..212.80 rows=1280 width=80)
(1 row)

Even though it is not an immediate request, it seems to me
better to show users the joined relations and the remote ON/WHERE
clauses here.

Please don't hesitate to consult me, if you have any questions.

Thanks,
--
NEC Business Creation Division / PG-Strom Project
KaiGai Kohei <kaigai@ak.jp.nec.com>

-----Original Message-----
From: pgsql-hackers-owner@postgresql.org
[mailto:pgsql-hackers-owner@postgresql.org] On Behalf Of Shigeru Hanada
Sent: Friday, April 03, 2015 7:32 PM
To: Kaigai Kouhei(海外浩平)
Cc: Ashutosh Bapat; Robert Haas; Tom Lane; Thom Brown;
pgsql-hackers@postgreSQL.org
Subject: Re: Custom/Foreign-Join-APIs (Re: [HACKERS] [v9.5] Custom Plan API)

Attached is the patch which adds join push-down support to postgres_fdw (v7).
It supports SELECT statements with JOIN, but has some more possible enhancements
(see below). I'd like to share the WIP patch here to get comments about new FDW
API design provided by KaiGai-san's v11 patch.

To make reviewing easier, I summarized changes against Custom/Foreign join v11
patch.

Changes for Join push-down support
==================================
- Add FDW API GetForeignJoinPaths(). It generates a ForeignPath which represents
a scan against pseudo join relation represented by given RelOptInfo.
- Expand deparsing module to handle multi-relation queries. Steps of deparsing
a join query:

1) Optimizer calls postgresGetForeignPaths() for each BASEREL. Here
postgres_fdw does the same things as before, except adding column aliases in SELECT
clause.
2) Optimizer calls postgresGetForeignJoinPaths() for each JOINREL. Optimizer
calls once per RelOptInfo with reloptkind == RELOPT_JOINREL, so postgres_fdw
should consider both A JOIN B and B JOIN A in one call.

postgres_fdw checks whether the join can be pushed down.

a) Both outer and inner relations can be pushed down (NULL in
RelOptInfo#fdw_private indicates such situation)
b) Outmost command is a SELECT (this can be relaxed in the future)
c) Join type is inner or one of outer
d) Server of all relations in the join are identical
e) Effective user id for all relations in the join are identical (they might be
different if some were accessed via views)
f) No local filters (this can be relaxed if inner && non-volatile)
g) Join conditions don't contain any "unsafe" expression
h) Remote filter doesn't contain any "unsafe" expression
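
The criteria above can be sketched as a single predicate. This is only an illustration under assumptions: the struct, enum, and function names are hypothetical, and each flag stands in for a check that the real code would perform against planner structures.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical join types; SEMI stands for any type that is not pushed. */
typedef enum SketchJoinType
{
    SKETCH_JOIN_INNER,
    SKETCH_JOIN_LEFT,
    SKETCH_JOIN_RIGHT,
    SKETCH_JOIN_FULL,
    SKETCH_JOIN_SEMI
} SketchJoinType;

/* One field per criterion; letters refer to the list in the message. */
typedef struct SketchJoinCandidate
{
    bool            outer_pushable;     /* a) */
    bool            inner_pushable;     /* a) */
    bool            is_select;          /* b) */
    SketchJoinType  jointype;           /* c) */
    unsigned int    outer_serverid;     /* d) */
    unsigned int    inner_serverid;     /* d) */
    unsigned int    outer_userid;       /* e) */
    unsigned int    inner_userid;       /* e) */
    bool            has_local_filters;  /* f) */
    bool            join_conds_safe;    /* g) */
    bool            remote_filter_safe; /* h) */
} SketchJoinCandidate;

/* Return true only if every criterion a) through h) holds. */
static bool
sketch_can_push_down(const SketchJoinCandidate *c)
{
    assert(c != NULL);

    if (!c->outer_pushable || !c->inner_pushable)
        return false;
    if (!c->is_select)
        return false;
    if (c->jointype != SKETCH_JOIN_INNER &&
        c->jointype != SKETCH_JOIN_LEFT &&
        c->jointype != SKETCH_JOIN_RIGHT &&
        c->jointype != SKETCH_JOIN_FULL)
        return false;
    if (c->outer_serverid != c->inner_serverid)
        return false;
    if (c->outer_userid != c->inner_userid)
        return false;
    if (c->has_local_filters)
        return false;
    return c->join_conds_safe && c->remote_filter_safe;
}
```

Checking the cheap conditions first keeps the expensive expression-safety checks off the common rejection paths.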

If all criteria are passed, postgres_fdw makes a ForeignPath for the join and
stores the following information in its fdw_private.

a) ForeignPath of outer relation, first non-parameterized one
b) ForeignPath of inner relation, first non-parameterized one
c) join type (as integer)
d) join conditions (as List of Expr)
e) other conditions (as List of Expr)

As of now the cost of the path is not very accurate; this is a possible enhancement.

2) Optimizer calls postgresGetForeignPlan() for the cheapest topmost Path. If
foreign join is the cheapest way to execute the query, optimizer calls
postgresGetForeignPlan for the topmost path generated by
postgresGetForeignJoinPaths. As Robert and Tom mentioned in the thread,
large_table JOIN huge_table might be removed even if (large_table JOIN huge_table)
JOIN small_table is the cheapest in the join level 3, so postgres_fdw can't assume
that paths in lower levels survived planning.

To cope with the situation, I'm trying to avoid calling create_plan_recurse()
for underlying paths by putting necessary information into PgFdwRelationInfo and
link it to appropriate RelOptInfo.

Anyway, in the current version the query string is built up recursively from the
bottom (BASEREL) upward. For a join, the underlying outer/inner queries are put
into the FROM clause, wrapped in parentheses and aliased. For example:

select * from pgbench_branches b join pgbench_tellers t on t.bid = b.bid;

is transformed to a query like this:

SELECT l.a1, l.a2, l.a3, r.a1, r.a2, r.a3, r.a4 FROM (SELECT bid a9, bbalance
a10, filler a11 FROM public.pgbench_branches) l (a1, a2, a3) INNER JOIN (SELECT
tid a9, bid a10, balance a11, filler a12 FROM public.pgbench_tellers) r (a1, a2,
a3, a4) ON ((l.a1 = r.a2));
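
As a rough illustration of how the outer and inner queries are combined, here is a minimal sketch that assembles a string of the same shape with plain snprintf(). The function names and the fixed-size alias buffers are assumptions for the example; the patch itself uses StringInfo and deparses expressions rather than taking them as text.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Write "a1, a2, ..., aN" into buf; stops early if the buffer is too small. */
static void
sketch_alias_list(char *buf, size_t buflen, int ncols)
{
    size_t  used = 0;
    int     i;

    assert(buflen > 0);
    buf[0] = '\0';
    for (i = 1; i <= ncols; i++)
    {
        int n = snprintf(buf + used, buflen - used, "%sa%d",
                         i > 1 ? ", " : "", i);

        if (n < 0 || (size_t) n >= buflen - used)
            break;              /* stop on error or truncation */
        used += (size_t) n;
    }
}

/*
 * Wrap the outer and inner queries as subqueries aliased "l" and "r",
 * each with a positional column alias list, and join them.
 * Returns the snprintf() result for the final statement.
 */
static int
sketch_join_sql(char *buf, size_t buflen,
                const char *select_list,
                const char *sql_o, int ncols_o,
                const char *jointype,
                const char *sql_i, int ncols_i,
                const char *on_expr)
{
    char    aliases_o[256];
    char    aliases_i[256];

    sketch_alias_list(aliases_o, sizeof(aliases_o), ncols_o);
    sketch_alias_list(aliases_i, sizeof(aliases_i), ncols_i);

    return snprintf(buf, buflen,
                    "SELECT %s FROM (%s) l (%s) %s JOIN (%s) r (%s) ON (%s)",
                    select_list, sql_o, aliases_o, jointype,
                    sql_i, aliases_i, on_expr);
}
```

Because each level only sees its children as "l" and "r" with positional aliases, the same routine can be applied again when a join result becomes the child of a higher join.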

As seen in the remote query, column aliasing uses attnum-based numbering shifted
by FirstLowInvalidHeapAttributeNumber to make all attnums positive. For instance,
this system uses alias "a9" for the first user column. For readability of the code
around this, I introduced the TO_RELATIVE() macro, which converts absolute attnums
(-8~) to relative ones (0~). The current deparser can also handle whole-row
references (attnum == 0) correctly.
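
The numbering can be checked with a small standalone sketch. The two constants are assumptions chosen to be consistent with the numbers in this message ("a9" for the first user column, "a7" for ctid); the real values come from PostgreSQL's headers, and sketch_column_alias is a hypothetical helper.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Assumed values, consistent with the aliases shown in this thread. */
#define SKETCH_FIRST_LOW_INVALID_ATTNUM  (-8)
#define SKETCH_SELF_ITEM_POINTER_ATTNUM  (-1)   /* ctid */

/* Same shape as the macro in the patch: shift attnum into positive range. */
#define TO_RELATIVE(x)  ((x) - SKETCH_FIRST_LOW_INVALID_ATTNUM)

/* Format the column alias ("a<N>") for an absolute attribute number. */
static int
sketch_column_alias(char *buf, size_t buflen, int attnum)
{
    int relative = TO_RELATIVE(attnum);

    assert(relative > 0);       /* attnum must be above the invalid bound */
    return snprintf(buf, buflen, "a%d", relative);
}
```

With these values, attnum 1 maps to a9, the whole-row attnum 0 maps to a8, and ctid (attnum -1) maps to a7.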

3) Executor calls BeginForeignScan to initialize a scan. Here TupleDesc is taken
from the slot, not Relation.

Possible enhancement
====================
- Make deparseSelectSql() more general, thus it can handle both simple SELECT
and join SELECT by calling itself recursively. This would avoid assuming that
underlying ForeignPath remains in RelOptInfo. (WIP)
- Move appendConditions() calls into deparse.c, to clarify responsibility of
modules.
- more accurate estimation
- more detailed information for error location (currently "foreign table" is used
as relation name always)

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#18

Shigeru Hanada

shigeru.hanada@gmail.com

almost 11 years ago

In reply to: Kouhei Kaigai (#17)

1 attachment(s)

Hi KaiGai-san,

Thanks for the review. Attached is the v8 patch of foreign join support for postgres_fdw.

In addition to your comments, I removed useless code that retrieved ForeignPath from outer/inner RelOptInfo and stored them into ForeignPath#fdw_private. Now postgres_fdw's join push-down no longer depends on the existence of a ForeignPath under the join relation. This means that we can support the case Robert mentioned before, in which the whole "(huge JOIN large) JOIN small" can be pushed down even if "(huge JOIN large)" is dominated by another join path.

2015-04-07 13:46 GMT+09:00 Kouhei Kaigai <kaigai@ak.jp.nec.com>:

Thanks for your dedicated efforts on the remote-join feature.
Below are the comments from my side.

* Bug - mixture of ctid system column and whole-row reference
When the "ctid" system column is required, deparseSelectSql()
adds a ctid reference at the base relation scan level.
On the other hand, a whole-row reference is transformed into
a reference to the underlying relation. This works fine if
no system column is specified. However, a system column reference
changes the tuple layout from the expected one,
which eventually leads to an error.

I found the bug too. As you suggested, deparseProjectionSql() is the place to fix it.

postgres=# select ft1.ctid,ft1 from ft1,ft2 where a=b;
ERROR: malformed record literal: "(2,2,bbb,"(0,2)")"
DETAIL: Too many columns.
CONTEXT: column "" of foreign table "foreign join"
STATEMENT: select ft1.ctid,ft1 from ft1,ft2 where a=b;

postgres=# explain verbose
select ft1.ctid,ft1 from ft1,ft2 where a=b;
QUERY PLAN
--------------------------------------------------------------------------------
Foreign Scan (cost=200.00..208.35 rows=835 width=70)
Output: ft1.ctid, ft1.*
Remote SQL: SELECT l.a1, l.a2 FROM (SELECT l.a7, l, l.a10 FROM (SELECT id a9,
a a10, atext a11, ctid a7 FROM public.t1) l) l (a1, a2, a3) INNER JOIN (SELECT
b a10 FROM public.t2) r (a1) ON ((l.a3 = r.a1))

"l" of the first SELECT represents a whole-row reference.
However, the underlying SELECT contains system columns in its target
list.

Is it possible to construct a query like this?
SELECT l.a1, l.a2 FROM (SELECT (id,a,atext), ctid) l (a1, a2) ...
^^^^^^^^^^

Yes, a simple relation reference such as "l" is not sufficient for the purpose. But putting columns in parentheses would not work when a user column is referenced in the original query.

I implemented deparseProjectionSql to use ROW(...) expression for a whole-row reference in the target list, in addition to ordinary column references for actually used columns and ctid.
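
To show the idea in isolation, here is a minimal sketch that builds the ROW(...) expression for one side of the join, skipping dropped columns. It is not the patch's code: the helper names, the bool array standing in for the tuple descriptor, and the offset constant are assumptions for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Assumed shift, matching the "a9 = first user column" aliasing scheme. */
#define SKETCH_ATTNUM_OFFSET 8

/* Append piece to the NUL-terminated buf without overflowing buflen. */
static void
sketch_append(char *buf, size_t buflen, const char *piece)
{
    size_t used = strlen(buf);

    if (used + 1 < buflen)
        strncat(buf, piece, buflen - used - 1);
}

/*
 * Build "ROW(<side>.a9, <side>.a10, ...)" for a relation with natts
 * columns; dropped[i - 1] marks column i as dropped and is skipped.
 */
static void
sketch_wholerow_expr(char *buf, size_t buflen, char side,
                     const bool *dropped, int natts)
{
    bool    first = true;
    int     i;

    assert(buflen > 0);
    buf[0] = '\0';
    sketch_append(buf, buflen, "ROW(");
    for (i = 1; i <= natts; i++)
    {
        char piece[32];

        if (dropped[i - 1])
            continue;
        snprintf(piece, sizeof(piece), "%s%c.a%d",
                 first ? "" : ", ", side, i + SKETCH_ATTNUM_OFFSET);
        sketch_append(buf, buflen, piece);
        first = false;
    }
    sketch_append(buf, buflen, ")");
}
```

The ROW(...) expression is then emitted as one target-list entry next to the ordinary column and ctid references, so the whole-row value and ctid no longer share one record literal.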

Please see the test case for mixed use of ctid and whole-row reference in postgres_fdw's regression tests. Now a whole-row reference in the remote query looks like this:

-- ctid with whole-row reference
EXPLAIN (COSTS false, VERBOSE)
SELECT t1.ctid, t1, t2 FROM ft1 t1 JOIN ft2 t2 ON (t1.c1 = t2.c1) ORDER BY t1.c3, t1.c1 OFFSET 100 LIMIT 10;

In fact l.a12 and l.a10, for t1.c3 and t1.c1, are redundant in transferred data, but IMO this would simplify the code for deparsing.

* postgresGetForeignJoinPaths()
It walks root->simple_rel_array to check whether
all the relations involved are managed by the same foreign
server with the same credentials.
There may be a more graceful way to do this. Pay attention to
the fdw_private field of innerrel/outerrel. If they have
a valid fdw_private, it means the FDW driver (postgres_fdw)
considers that all the underlying relation scans/joins can
run on the remote server.
So, all we need to check is whether the server-id and user-id
of both relations are identical.

Exactly. I fixed the code not to loop around.

* merge_fpinfo()
It seems to me fpinfo->rows should be joinrel->rows, and
fpinfo->width also should be joinrel->width.
No need to have special intelligence here, is there?

Oops. They are a vestige of my struggle, which disabled the SELECT clause optimization (omitting unused columns). Now width and rows are inherited from joinrel. Besides that, fdw_startup_cost and fdw_tuple_cost seemed wrong, so I fixed them to use a simple sum, not an average.

* explain output

The EXPLAIN output may be a bit insufficient for understanding what
the plan actually tries to do.

postgres=# explain select * from ft1,ft2 where a = b;
QUERY PLAN
--------------------------------------------------------
Foreign Scan (cost=200.00..212.80 rows=1280 width=80)
(1 row)

Even though it is not an immediate request, it seems to me
better to show users the joined relations and the remote ON/WHERE
clauses here.

Like this?

Foreign Scan on ft1 INNER JOIN ft2 ON ft1.a = ft2.b (cost=200.00..212.80 rows=1280 width=80)
…

It might produce a very long line when joining many tables, because it contains most of the remote query other than the SELECT clause, but I prefer the detailed output. Another idea is to print “Join Cond” and “Remote Filter” as separate EXPLAIN items.

Note that the v8 patch doesn’t contain this change yet!

--
Shigeru HANADA

Attachments:

foreign_join_v8.patch (application/octet-stream)

diff --git a/contrib/postgres_fdw/deparse.c b/contrib/postgres_fdw/deparse.c
index 94fab18..0b29159 100644
--- a/contrib/postgres_fdw/deparse.c
+++ b/contrib/postgres_fdw/deparse.c
@@ -44,8 +44,11 @@
 #include "catalog/pg_proc.h"
 #include "catalog/pg_type.h"
 #include "commands/defrem.h"
+#include "nodes/makefuncs.h"
 #include "nodes/nodeFuncs.h"
+#include "nodes/plannodes.h"
 #include "optimizer/clauses.h"
+#include "optimizer/prep.h"
 #include "optimizer/var.h"
 #include "parser/parsetree.h"
 #include "utils/builtins.h"
@@ -89,6 +92,8 @@ typedef struct deparse_expr_cxt
 	RelOptInfo *foreignrel;		/* the foreign relation we are planning for */
 	StringInfo	buf;			/* output buffer to append to */
 	List	  **params_list;	/* exprs that will become remote Params */
+	List	   *outertlist;		/* outer child's target list */
+	List	   *innertlist;		/* inner child's target list */
 } deparse_expr_cxt;
 
 /*
@@ -137,12 +142,19 @@ static void printRemoteParam(int paramindex, Oid paramtype, int32 paramtypmod,
 static void printRemotePlaceholder(Oid paramtype, int32 paramtypmod,
 					   deparse_expr_cxt *context);
 
+/*
+ * convert absolute attnum to relative one.  This would be handy for handling
+ * attnum for attrs_used and column aliases.
+ */
+#define TO_RELATIVE(x)	((x) - FirstLowInvalidHeapAttributeNumber)
+
 
 /*
  * Examine each qual clause in input_conds, and classify them into two groups,
  * which are returned as two lists:
  *	- remote_conds contains expressions that can be evaluated remotely
  *	- local_conds contains expressions that can't be evaluated remotely
+ * Note that each element is Expr, which was stripped from RestrictInfo, 
  */
 void
 classifyConditions(PlannerInfo *root,
@@ -161,9 +173,9 @@ classifyConditions(PlannerInfo *root,
 		RestrictInfo *ri = (RestrictInfo *) lfirst(lc);
 
 		if (is_foreign_expr(root, baserel, ri->clause))
-			*remote_conds = lappend(*remote_conds, ri);
+			*remote_conds = lappend(*remote_conds, ri->clause);
 		else
-			*local_conds = lappend(*local_conds, ri);
+			*local_conds = lappend(*local_conds, ri->clause);
 	}
 }
 
@@ -250,7 +262,7 @@ foreign_expr_walker(Node *node,
 				 * Param's collation, ie it's not safe for it to have a
 				 * non-default collation.
 				 */
-				if (var->varno == glob_cxt->foreignrel->relid &&
+				if (bms_is_member(var->varno, glob_cxt->foreignrel->relids) &&
 					var->varlevelsup == 0)
 				{
 					/* Var belongs to foreign table */
@@ -681,12 +693,57 @@ deparseSelectSql(StringInfo buf,
 				 PlannerInfo *root,
 				 RelOptInfo *baserel,
 				 Bitmapset *attrs_used,
+				 List *remote_conds,
+				 List **params_list,
+				 List **fdw_ps_tlist,
 				 List **retrieved_attrs)
 {
+	PgFdwRelationInfo  *fpinfo = (PgFdwRelationInfo *) baserel->fdw_private;
 	RangeTblEntry *rte = planner_rt_fetch(baserel->relid, root);
 	Relation	rel;
 
 	/*
+	 * If given relation was a join relation, recursively construct statement
+	 * by putting each outer and inner relations in FROM clause as a subquery
+	 * with aliasing.
+	 */
+	if (baserel->reloptkind == RELOPT_JOINREL)
+	{
+		RelOptInfo		   *rel_o = fpinfo->outerrel;
+		RelOptInfo		   *rel_i = fpinfo->innerrel;
+		PgFdwRelationInfo  *fpinfo_o = (PgFdwRelationInfo *) rel_o->fdw_private;
+		PgFdwRelationInfo  *fpinfo_i = (PgFdwRelationInfo *) rel_i->fdw_private;
+		StringInfoData		sql_o;
+		StringInfoData		sql_i;
+		List			   *ret_attrs_tmp;	/* not used */
+
+		/*
+		 * Deparse query for outer and inner relation, and combine them into
+		 * a query.
+		 */
+		initStringInfo(&sql_o);
+		deparseSelectSql(&sql_o, root, rel_o, fpinfo_o->attrs_used,
+						 fpinfo_o->remote_conds, params_list,
+						 fdw_ps_tlist, &ret_attrs_tmp);
+		initStringInfo(&sql_i);
+		deparseSelectSql(&sql_i, root, rel_i, fpinfo_i->attrs_used,
+						 fpinfo_i->remote_conds, params_list,
+						 fdw_ps_tlist, &ret_attrs_tmp);
+
+		deparseJoinSql(buf, root, baserel,
+					   fpinfo->outerrel,
+					   fpinfo->innerrel,
+					   sql_o.data,
+					   sql_i.data,
+					   fpinfo->jointype,
+					   fpinfo->joinclauses,
+					   fpinfo->otherclauses,
+					   fdw_ps_tlist,
+					   retrieved_attrs);
+		return;
+	}
+
+	/*
 	 * Core code already has some lock on each rel being planned, so we can
 	 * use NoLock here.
 	 */
@@ -705,6 +762,65 @@ deparseSelectSql(StringInfo buf,
 	appendStringInfoString(buf, " FROM ");
 	deparseRelation(buf, rel);
 
+	/*
+	 * Construct WHERE clause
+	 */
+	if (remote_conds)
+		appendConditions(buf, root, baserel, NULL, NULL, remote_conds,
+						 " WHERE ", params_list);
+
+	/*
+	 * Add FOR UPDATE/SHARE if appropriate.  We apply locking during the
+	 * initial row fetch, rather than later on as is done for local tables.
+	 * The extra roundtrips involved in trying to duplicate the local
+	 * semantics exactly don't seem worthwhile (see also comments for
+	 * RowMarkType).
+	 *
+	 * Note: because we actually run the query as a cursor, this assumes
+	 * that DECLARE CURSOR ... FOR UPDATE is supported, which it isn't
+	 * before 8.3.
+	 */
+	if (baserel->relid == root->parse->resultRelation &&
+		(root->parse->commandType == CMD_UPDATE ||
+		 root->parse->commandType == CMD_DELETE))
+	{
+		/* Relation is UPDATE/DELETE target, so use FOR UPDATE */
+		appendStringInfoString(buf, " FOR UPDATE");
+	}
+	else
+	{
+		PlanRowMark *rc = get_plan_rowmark(root->rowMarks, baserel->relid);
+
+		if (rc)
+		{
+			/*
+			 * Relation is specified as a FOR UPDATE/SHARE target, so handle
+			 * that.  (But we could also see LCS_NONE, meaning this isn't a
+			 * target relation after all.)
+			 *
+			 * For now, just ignore any [NO] KEY specification, since (a)
+			 * it's not clear what that means for a remote table that we
+			 * don't have complete information about, and (b) it wouldn't
+			 * work anyway on older remote servers.  Likewise, we don't
+			 * worry about NOWAIT.
+			 */
+			switch (rc->strength)
+			{
+				case LCS_NONE:
+					/* No locking needed */
+					break;
+				case LCS_FORKEYSHARE:
+				case LCS_FORSHARE:
+					appendStringInfoString(buf, " FOR SHARE");
+					break;
+				case LCS_FORNOKEYUPDATE:
+				case LCS_FORUPDATE:
+					appendStringInfoString(buf, " FOR UPDATE");
+					break;
+			}
+		}
+	}
+
 	heap_close(rel, NoLock);
 }
 
@@ -731,8 +847,7 @@ deparseTargetList(StringInfo buf,
 	*retrieved_attrs = NIL;
 
 	/* If there's a whole-row reference, we'll need all the columns. */
-	have_wholerow = bms_is_member(0 - FirstLowInvalidHeapAttributeNumber,
-								  attrs_used);
+	have_wholerow = bms_is_member(TO_RELATIVE(0), attrs_used);
 
 	first = true;
 	for (i = 1; i <= tupdesc->natts; i++)
@@ -743,15 +858,14 @@ deparseTargetList(StringInfo buf,
 		if (attr->attisdropped)
 			continue;
 
-		if (have_wholerow ||
-			bms_is_member(i - FirstLowInvalidHeapAttributeNumber,
-						  attrs_used))
+		if (have_wholerow || bms_is_member(TO_RELATIVE(i), attrs_used))
 		{
 			if (!first)
 				appendStringInfoString(buf, ", ");
 			first = false;
 
 			deparseColumnRef(buf, rtindex, i, root);
+			appendStringInfo(buf, " a%d", TO_RELATIVE(i));
 
 			*retrieved_attrs = lappend_int(*retrieved_attrs, i);
 		}
@@ -761,17 +875,17 @@ deparseTargetList(StringInfo buf,
 	 * Add ctid if needed.  We currently don't support retrieving any other
 	 * system columns.
 	 */
-	if (bms_is_member(SelfItemPointerAttributeNumber - FirstLowInvalidHeapAttributeNumber,
-					  attrs_used))
+	if (bms_is_member(TO_RELATIVE(SelfItemPointerAttributeNumber), attrs_used))
 	{
 		if (!first)
 			appendStringInfoString(buf, ", ");
 		first = false;
 
-		appendStringInfoString(buf, "ctid");
+		appendStringInfo(buf, "ctid a%d",
+						 TO_RELATIVE(SelfItemPointerAttributeNumber));
 
 		*retrieved_attrs = lappend_int(*retrieved_attrs,
-									   SelfItemPointerAttributeNumber);
+										   SelfItemPointerAttributeNumber);
 	}
 
 	/* Don't generate bad syntax if no undropped columns */
@@ -780,7 +894,8 @@ deparseTargetList(StringInfo buf,
 }
 
 /*
- * Deparse WHERE clauses in given list of RestrictInfos and append them to buf.
+ * Deparse conditions, such as WHERE clause and ON clause of JOIN, in given
+ * list of Expr and append them to buf.
  *
  * baserel is the foreign table we're planning for.
  *
@@ -794,12 +909,14 @@ deparseTargetList(StringInfo buf,
  * so Params and other-relation Vars should be replaced by dummy values.
  */
 void
-appendWhereClause(StringInfo buf,
-				  PlannerInfo *root,
-				  RelOptInfo *baserel,
-				  List *exprs,
-				  bool is_first,
-				  List **params)
+appendConditions(StringInfo buf,
+				 PlannerInfo *root,
+				 RelOptInfo *baserel,
+				 List *outertlist,
+				 List *innertlist,
+				 List *exprs,
+				 const char *prefix,
+				 List **params)
 {
 	deparse_expr_cxt context;
 	int			nestlevel;
@@ -813,31 +930,319 @@ appendWhereClause(StringInfo buf,
 	context.foreignrel = baserel;
 	context.buf = buf;
 	context.params_list = params;
+	context.outertlist = outertlist;
+	context.innertlist = innertlist;
 
 	/* Make sure any constants in the exprs are printed portably */
 	nestlevel = set_transmission_modes();
 
 	foreach(lc, exprs)
 	{
-		RestrictInfo *ri = (RestrictInfo *) lfirst(lc);
+		Expr	   *expr = (Expr *) lfirst(lc);
 
 		/* Connect expressions with "AND" and parenthesize each condition. */
-		if (is_first)
-			appendStringInfoString(buf, " WHERE ");
-		else
-			appendStringInfoString(buf, " AND ");
+		if (prefix)
+			appendStringInfo(buf, "%s", prefix);
 
 		appendStringInfoChar(buf, '(');
-		deparseExpr(ri->clause, &context);
+		deparseExpr(expr, &context);
 		appendStringInfoChar(buf, ')');
 
-		is_first = false;
+		prefix= " AND ";
 	}
 
 	reset_transmission_modes(nestlevel);
 }
 
 /*
+ * Returns position index (start with 1) of given var in given target list, or
+ * 0 when not found.
+ */
+static int
+find_var_pos(Var *node, List *tlist)
+{
+	int		pos = 1;
+	ListCell *lc;
+
+	foreach(lc, tlist)
+	{
+		Var *var = (Var *) lfirst(lc);
+
+		if (equal(var, node))
+		{
+			return pos;
+		}
+		pos++;
+	}
+
+	return 0;
+}
+
+/*
+ * Deparse given Var into buf.
+ */
+static void
+deparseJoinVar(Var *node, deparse_expr_cxt *context)
+{
+	char		side;
+	int			pos;
+
+	pos = find_var_pos(node, context->outertlist);
+	if (pos > 0)
+		side = 'l';
+	else
+	{
+		side = 'r';
+		pos = find_var_pos(node, context->innertlist);
+	}
+	Assert(pos > 0);
+	Assert(side == 'l' || side == 'r');
+
+	/*
+	 * We treat whole-row reference same as ordinary attribute references,
+	 * because such transformation should be done in lower level.
+	 */
+	appendStringInfo(context->buf, "%c.a%d", side, pos);
+}
+
+/*
+ * Deparse column alias list for a subquery in FROM clause.
+ */
+static void
+deparseColumnAliases(StringInfo buf, List *tlist)
+{
+	int			pos;
+	ListCell   *lc;
+
+	pos = 1;
+	foreach(lc, tlist)
+	{
+		/* Deparse column alias for the subquery */
+		if (pos > 1)
+			appendStringInfoString(buf, ", ");
+		appendStringInfo(buf, "a%d", pos);
+		pos++;
+	}
+}
+
+/*
+ * Deparse "wrapper" SQL for a query which contains whole-row reference or ctid.
+ * If the SQL is enough simple, just return it.
+ *
+ * Projecting whole-row value from each column value is done by ExecProjection
+ * for results of a scan against an ordinary tables, but join push-down omits
+ * ExecProjection calls so we need to do it in the remote SQL.
+ */
+static const char *
+deparseProjectionSql(PlannerInfo *root,
+					 RelOptInfo *baserel,
+					 const char *sql,
+					 char side)
+{
+	StringInfoData wholerow;
+	StringInfoData buf;
+	ListCell   *lc;
+	bool		first;
+	bool		have_wholerow = false;
+	bool		have_ctid = false;
+
+	/*
+	 * We have nothing to do if the targetlist contains no special reference,
+	 * such as whole-row and ctid.
+	 */
+	foreach(lc, baserel->reltargetlist)
+	{
+		Var		   *var = (Var *) lfirst(lc);
+		if (var->varattno == 0)
+		{
+			have_wholerow = true;
+			if (have_ctid)
+				break;
+		}
+		else if (var->varattno == SelfItemPointerAttributeNumber)
+		{
+			have_ctid = true;
+			if (have_wholerow)
+				break;
+		}
+	}
+	if (!have_wholerow && !have_ctid)
+		return sql;
+
+	/*
+	 * Construct whole-row reference with ROW() syntax
+	 */
+	if (have_wholerow)
+	{
+		RangeTblEntry *rte;
+		Relation		rel;
+		TupleDesc		tupdesc;
+		int				i;
+
+		/* Obtain TupleDesc for deparsing all valid columns */
+		rte = planner_rt_fetch(baserel->relid, root);
+		rel = heap_open(rte->relid, NoLock);
+		tupdesc = rel->rd_att;
+
+		/* Print all valid columns in ROW() to generate whole-row value */
+		initStringInfo(&wholerow);
+		appendStringInfoString(&wholerow, "ROW(");
+		first = true;
+		for (i = 1; i <= tupdesc->natts; i++)
+		{
+			Form_pg_attribute attr = tupdesc->attrs[i - 1];
+
+			/* Ignore dropped columns. */
+			if (attr->attisdropped)
+				continue;
+
+			if (!first)
+				appendStringInfoString(&wholerow, ", ");
+			first = false;
+
+			appendStringInfo(&wholerow, "%c.a%d", side, TO_RELATIVE(i));
+		}
+		appendStringInfoString(&wholerow, ")");
+
+		heap_close(rel, NoLock);
+	}
+
+	/*
+	 * Construct a SELECT statement which has the original query in its FROM
+	 * clause, and have target list entries in its SELECT clause.  The number
+	 * used in column aliases are attnum - FirstLowInvalidHeapAttributeNumber in
+	 * order to make all numbers positive even for system columns which have
+	 * minus value as attnum.
+	 */
+	initStringInfo(&buf);
+	appendStringInfoString(&buf, "SELECT ");
+	first = true;
+	foreach(lc, baserel->reltargetlist)
+	{
+		Var *var = (Var *) lfirst(lc);
+
+		if (!first)
+			appendStringInfoString(&buf, ", ");
+	
+		if (var->varattno == 0)
+			appendStringInfo(&buf, "%s", wholerow.data);
+		else
+			appendStringInfo(&buf, "%c.a%d", side, TO_RELATIVE(var->varattno));
+
+		first = false;
+	}
+	appendStringInfo(&buf, " FROM (%s) %c", sql, side);
+
+	return buf.data;
+}
+
+/*
+ * Construct a SELECT statement which contains join clause.
+ *
+ * We also create an TargetEntry List of the columns being retrieved, which is
+ * returned to *fdw_ps_tlist.
+ *
+ * path_o, tl_o, sql_o are respectively path, targetlist, and remote query
+ * statement of the outer child relation.  postfix _i means those for the inner
+ * child relation.  jointype and joinclauses are information of join method.
+ * fdw_ps_tlist is output parameter to pass target list of the pseudo scan to
+ * caller.
+ */
+void
+deparseJoinSql(StringInfo buf,
+			   PlannerInfo *root,
+			   RelOptInfo *baserel,
+			   RelOptInfo *outerrel,
+			   RelOptInfo *innerrel,
+			   const char *sql_o,
+			   const char *sql_i,
+			   JoinType jointype,
+			   List *joinclauses,
+			   List *otherclauses,
+			   List **fdw_ps_tlist,
+			   List **retrieved_attrs)
+{
+	StringInfoData selbuf;		/* buffer for SELECT clause */
+	StringInfoData abuf_o;		/* buffer for column alias list of outer */
+	StringInfoData abuf_i;		/* buffer for column alias list of inner */
+	int			i;
+	ListCell   *lc;
+	const char *jointype_str;
+	deparse_expr_cxt context;
+
+	context.root = root;
+	context.foreignrel = baserel;
+	context.buf = &selbuf;
+	context.params_list = NULL;
+	context.outertlist = outerrel->reltargetlist;
+	context.innertlist = innerrel->reltargetlist;
+
+	jointype_str = jointype == JOIN_INNER ? "INNER" :
+				   jointype == JOIN_LEFT ? "LEFT" :
+				   jointype == JOIN_RIGHT ? "RIGHT" :
+				   jointype == JOIN_FULL ? "FULL" : "";
+
+	*retrieved_attrs = NIL;
+
+	/* print SELECT clause of the join scan */
+	initStringInfo(&selbuf);
+	i = 0;
+	foreach(lc, baserel->reltargetlist)
+	{
+		Var		   *var = (Var *) lfirst(lc);
+		TargetEntry *tle;
+
+		if (i > 0)
+			appendStringInfoString(&selbuf, ", ");
+		deparseJoinVar(var, &context);
+
+		tle = makeTargetEntry((Expr *) copyObject(var),
+							  i + 1, pstrdup(""), false);
+		if (fdw_ps_tlist)
+			*fdw_ps_tlist = lappend(*fdw_ps_tlist, copyObject(tle));
+
+		*retrieved_attrs = lappend_int(*retrieved_attrs, i + 1);
+
+		i++;
+	}
+	if (i == 0)
+		appendStringInfoString(&selbuf, "NULL");
+
+	/*
+	 * Do pseudo-projection for an underlying scan on a foreign table, if a) the
+	 * relation is a base relation, and b) its targetlist contains whole-row
+	 * reference.
+	 */
+	if (outerrel->reloptkind == RELOPT_BASEREL)
+		sql_o = deparseProjectionSql(root, outerrel, sql_o, 'l');
+	if (innerrel->reloptkind == RELOPT_BASEREL)
+		sql_i = deparseProjectionSql(root, innerrel, sql_i, 'r');
+
+	/* Deparse column alias portion of subquery in FROM clause. */
+	initStringInfo(&abuf_o);
+	deparseColumnAliases(&abuf_o, outerrel->reltargetlist);
+	initStringInfo(&abuf_i);
+	deparseColumnAliases(&abuf_i, innerrel->reltargetlist);
+
+	/* Construct SELECT statement */
+	appendStringInfo(buf, "SELECT %s FROM", selbuf.data);
+	appendStringInfo(buf, " (%s) l (%s) %s JOIN (%s) r (%s)",
+					 sql_o, abuf_o.data, jointype_str, sql_i, abuf_i.data);
+	/* Append ON clause */
+	if (joinclauses)
+		appendConditions(buf, root, baserel,
+						 outerrel->reltargetlist, innerrel->reltargetlist,
+						 joinclauses,
+						 " ON ", NULL);
+	/* Append WHERE clause */
+	if (otherclauses)
+		appendConditions(buf, root, baserel,
+						 outerrel->reltargetlist, innerrel->reltargetlist,
+						 otherclauses,
+						 " WHERE ", NULL);
+}
+
+/*
  * deparse remote INSERT statement
  *
  * The statement text is appended to buf, and we also create an integer List
@@ -976,8 +1381,7 @@ deparseReturningList(StringInfo buf, PlannerInfo *root,
 	if (trig_after_row)
 	{
 		/* whole-row reference acquires all non-system columns */
-		attrs_used =
-			bms_make_singleton(0 - FirstLowInvalidHeapAttributeNumber);
+		attrs_used = bms_make_singleton(TO_RELATIVE(0));
 	}
 
 	if (returningList != NIL)
@@ -1261,6 +1665,8 @@ deparseExpr(Expr *node, deparse_expr_cxt *context)
 /*
  * Deparse given Var node into context->buf.
  *
+ * If context has valid innerrel, this is invoked for a join conditions.
+ *
  * If the Var belongs to the foreign relation, just print its remote name.
  * Otherwise, it's effectively a Param (and will in fact be a Param at
  * run time).  Handle it the same way we handle plain Params --- see
@@ -1271,39 +1677,46 @@ deparseVar(Var *node, deparse_expr_cxt *context)
 {
 	StringInfo	buf = context->buf;
 
-	if (node->varno == context->foreignrel->relid &&
-		node->varlevelsup == 0)
+	if (context->foreignrel->reloptkind == RELOPT_JOINREL)
 	{
-		/* Var belongs to foreign table */
-		deparseColumnRef(buf, node->varno, node->varattno, context->root);
+		deparseJoinVar(node, context);
 	}
 	else
 	{
-		/* Treat like a Param */
-		if (context->params_list)
+		if (node->varno == context->foreignrel->relid &&
+			node->varlevelsup == 0)
 		{
-			int			pindex = 0;
-			ListCell   *lc;
-
-			/* find its index in params_list */
-			foreach(lc, *context->params_list)
+			/* Var belongs to foreign table */
+			deparseColumnRef(buf, node->varno, node->varattno, context->root);
+		}
+		else
+		{
+			/* Treat like a Param */
+			if (context->params_list)
 			{
-				pindex++;
-				if (equal(node, (Node *) lfirst(lc)))
-					break;
+				int			pindex = 0;
+				ListCell   *lc;
+
+				/* find its index in params_list */
+				foreach(lc, *context->params_list)
+				{
+					pindex++;
+					if (equal(node, (Node *) lfirst(lc)))
+						break;
+				}
+				if (lc == NULL)
+				{
+					/* not in list, so add it */
+					pindex++;
+					*context->params_list = lappend(*context->params_list, node);
+				}
+
+				printRemoteParam(pindex, node->vartype, node->vartypmod, context);
 			}
-			if (lc == NULL)
+			else
 			{
-				/* not in list, so add it */
-				pindex++;
-				*context->params_list = lappend(*context->params_list, node);
+				printRemotePlaceholder(node->vartype, node->vartypmod, context);
 			}
-
-			printRemoteParam(pindex, node->vartype, node->vartypmod, context);
-		}
-		else
-		{
-			printRemotePlaceholder(node->vartype, node->vartypmod, context);
 		}
 	}
 }
diff --git a/contrib/postgres_fdw/expected/postgres_fdw.out b/contrib/postgres_fdw/expected/postgres_fdw.out
index 783cb41..19c1115 100644
--- a/contrib/postgres_fdw/expected/postgres_fdw.out
+++ b/contrib/postgres_fdw/expected/postgres_fdw.out
@@ -9,11 +9,16 @@ DO $d$
             OPTIONS (dbname '$$||current_database()||$$',
                      port '$$||current_setting('port')||$$'
             )$$;
+        EXECUTE $$CREATE SERVER loopback2 FOREIGN DATA WRAPPER postgres_fdw
+            OPTIONS (dbname '$$||current_database()||$$',
+                     port '$$||current_setting('port')||$$'
+            )$$;
     END;
 $d$;
 CREATE USER MAPPING FOR public SERVER testserver1
 	OPTIONS (user 'value', password 'value');
 CREATE USER MAPPING FOR CURRENT_USER SERVER loopback;
+CREATE USER MAPPING FOR CURRENT_USER SERVER loopback2;
 -- ===================================================================
 -- create objects used through FDW loopback server
 -- ===================================================================
@@ -35,6 +40,18 @@ CREATE TABLE "S 1"."T 2" (
 	c2 text,
 	CONSTRAINT t2_pkey PRIMARY KEY (c1)
 );
+CREATE TABLE "S 1"."T 3" (
+	c1 int NOT NULL,
+	c2 int NOT NULL,
+	c3 text,
+	CONSTRAINT t3_pkey PRIMARY KEY (c1)
+);
+CREATE TABLE "S 1"."T 4" (
+	c1 int NOT NULL,
+	c2 int NOT NULL,
+	c4 text,
+	CONSTRAINT t4_pkey PRIMARY KEY (c1)
+);
 INSERT INTO "S 1"."T 1"
 	SELECT id,
 	       id % 10,
@@ -49,8 +66,22 @@ INSERT INTO "S 1"."T 2"
 	SELECT id,
 	       'AAA' || to_char(id, 'FM000')
 	FROM generate_series(1, 100) id;
+INSERT INTO "S 1"."T 3"
+	SELECT id,
+	       id + 1,
+	       'AAA' || to_char(id, 'FM000')
+	FROM generate_series(1, 100) id;
+DELETE FROM "S 1"."T 3" WHERE c1 % 2 != 0;	-- delete for outer join tests
+INSERT INTO "S 1"."T 4"
+	SELECT id,
+	       id + 1,
+	       'AAA' || to_char(id, 'FM000')
+	FROM generate_series(1, 100) id;
+DELETE FROM "S 1"."T 4" WHERE c1 % 3 != 0;	-- delete for outer join tests
 ANALYZE "S 1"."T 1";
 ANALYZE "S 1"."T 2";
+ANALYZE "S 1"."T 3";
+ANALYZE "S 1"."T 4";
 -- ===================================================================
 -- create foreign tables
 -- ===================================================================
@@ -78,6 +109,26 @@ CREATE FOREIGN TABLE ft2 (
 	c8 user_enum
 ) SERVER loopback;
 ALTER FOREIGN TABLE ft2 DROP COLUMN cx;
+CREATE FOREIGN TABLE ft4 (
+	c1 int NOT NULL,
+	c2 int NOT NULL,
+	c3 text
+) SERVER loopback OPTIONS (schema_name 'S 1', table_name 'T 3');
+CREATE FOREIGN TABLE ft5 (
+	c1 int NOT NULL,
+	c2 int NOT NULL,
+	c3 text
+) SERVER loopback OPTIONS (schema_name 'S 1', table_name 'T 4');
+CREATE FOREIGN TABLE ft6 (
+	c1 int NOT NULL,
+	c2 int NOT NULL,
+	c3 text
+) SERVER loopback2 OPTIONS (schema_name 'S 1', table_name 'T 4');
+CREATE USER view_owner;
+GRANT ALL ON ft5 TO view_owner;
+CREATE VIEW v_ft5 AS SELECT * FROM ft5;
+ALTER VIEW v_ft5 OWNER TO view_owner;
+CREATE USER MAPPING FOR view_owner SERVER loopback;
 -- ===================================================================
 -- tests for validator
 -- ===================================================================
@@ -119,12 +170,15 @@ ALTER FOREIGN TABLE ft2 OPTIONS (schema_name 'S 1', table_name 'T 1');
 ALTER FOREIGN TABLE ft1 ALTER COLUMN c1 OPTIONS (column_name 'C 1');
 ALTER FOREIGN TABLE ft2 ALTER COLUMN c1 OPTIONS (column_name 'C 1');
 \det+
-                             List of foreign tables
- Schema | Table |  Server  |              FDW Options              | Description 
---------+-------+----------+---------------------------------------+-------------
- public | ft1   | loopback | (schema_name 'S 1', table_name 'T 1') | 
- public | ft2   | loopback | (schema_name 'S 1', table_name 'T 1') | 
-(2 rows)
+                              List of foreign tables
+ Schema | Table |  Server   |              FDW Options              | Description 
+--------+-------+-----------+---------------------------------------+-------------
+ public | ft1   | loopback  | (schema_name 'S 1', table_name 'T 1') | 
+ public | ft2   | loopback  | (schema_name 'S 1', table_name 'T 1') | 
+ public | ft4   | loopback  | (schema_name 'S 1', table_name 'T 3') | 
+ public | ft5   | loopback  | (schema_name 'S 1', table_name 'T 4') | 
+ public | ft6   | loopback2 | (schema_name 'S 1', table_name 'T 4') | 
+(5 rows)
 
 -- Now we should be able to run ANALYZE.
 -- To exercise multiple code paths, we use local stats on ft1
@@ -160,8 +214,8 @@ SELECT * FROM ft1 ORDER BY c3, c1 OFFSET 100 LIMIT 10;
 (10 rows)
 
 EXPLAIN (VERBOSE, COSTS false) SELECT * FROM ft1 t1 ORDER BY t1.c3, t1.c1 OFFSET 100 LIMIT 10;
-                                     QUERY PLAN                                      
--------------------------------------------------------------------------------------
+                                                     QUERY PLAN                                                      
+---------------------------------------------------------------------------------------------------------------------
  Limit
    Output: c1, c2, c3, c4, c5, c6, c7, c8
    ->  Sort
@@ -169,7 +223,7 @@ EXPLAIN (VERBOSE, COSTS false) SELECT * FROM ft1 t1 ORDER BY t1.c3, t1.c1 OFFSET
          Sort Key: t1.c3, t1.c1
          ->  Foreign Scan on public.ft1 t1
                Output: c1, c2, c3, c4, c5, c6, c7, c8
-               Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1"
+               Remote SQL: SELECT "C 1" a10, c2 a11, c3 a12, c4 a13, c5 a14, c6 a15, c7 a16, c8 a17 FROM "S 1"."T 1"
 (8 rows)
 
 SELECT * FROM ft1 t1 ORDER BY t1.c3, t1.c1 OFFSET 100 LIMIT 10;
@@ -189,8 +243,8 @@ SELECT * FROM ft1 t1 ORDER BY t1.c3, t1.c1 OFFSET 100 LIMIT 10;
 
 -- whole-row reference
 EXPLAIN (VERBOSE, COSTS false) SELECT t1 FROM ft1 t1 ORDER BY t1.c3, t1.c1 OFFSET 100 LIMIT 10;
-                                     QUERY PLAN                                      
--------------------------------------------------------------------------------------
+                                                     QUERY PLAN                                                      
+---------------------------------------------------------------------------------------------------------------------
  Limit
    Output: t1.*, c3, c1
    ->  Sort
@@ -198,7 +252,7 @@ EXPLAIN (VERBOSE, COSTS false) SELECT t1 FROM ft1 t1 ORDER BY t1.c3, t1.c1 OFFSE
          Sort Key: t1.c3, t1.c1
          ->  Foreign Scan on public.ft1 t1
                Output: t1.*, c3, c1
-               Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1"
+               Remote SQL: SELECT "C 1" a10, c2 a11, c3 a12, c4 a13, c5 a14, c6 a15, c7 a16, c8 a17 FROM "S 1"."T 1"
 (8 rows)
 
 SELECT t1 FROM ft1 t1 ORDER BY t1.c3, t1.c1 OFFSET 100 LIMIT 10;
@@ -224,11 +278,11 @@ SELECT * FROM ft1 WHERE false;
 
 -- with WHERE clause
 EXPLAIN (VERBOSE, COSTS false) SELECT * FROM ft1 t1 WHERE t1.c1 = 101 AND t1.c6 = '1' AND t1.c7 >= '1';
-                                                                   QUERY PLAN                                                                   
-------------------------------------------------------------------------------------------------------------------------------------------------
+                                                                                   QUERY PLAN                                                                                   
+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
  Foreign Scan on public.ft1 t1
    Output: c1, c2, c3, c4, c5, c6, c7, c8
-   Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" WHERE ((c7 >= '1'::bpchar)) AND (("C 1" = 101)) AND ((c6 = '1'::text))
+   Remote SQL: SELECT "C 1" a10, c2 a11, c3 a12, c4 a13, c5 a14, c6 a15, c7 a16, c8 a17 FROM "S 1"."T 1" WHERE ((c7 >= '1'::bpchar)) AND (("C 1" = 101)) AND ((c6 = '1'::text))
 (3 rows)
 
 SELECT * FROM ft1 t1 WHERE t1.c1 = 101 AND t1.c6 = '1' AND t1.c7 >= '1';
@@ -239,13 +293,13 @@ SELECT * FROM ft1 t1 WHERE t1.c1 = 101 AND t1.c6 = '1' AND t1.c7 >= '1';
 
 -- with FOR UPDATE/SHARE
 EXPLAIN (VERBOSE, COSTS false) SELECT * FROM ft1 t1 WHERE c1 = 101 FOR UPDATE;
-                                                   QUERY PLAN                                                   
-----------------------------------------------------------------------------------------------------------------
+                                                                   QUERY PLAN                                                                   
+------------------------------------------------------------------------------------------------------------------------------------------------
  LockRows
    Output: c1, c2, c3, c4, c5, c6, c7, c8, t1.*
    ->  Foreign Scan on public.ft1 t1
          Output: c1, c2, c3, c4, c5, c6, c7, c8, t1.*
-         Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" WHERE (("C 1" = 101)) FOR UPDATE
+         Remote SQL: SELECT "C 1" a10, c2 a11, c3 a12, c4 a13, c5 a14, c6 a15, c7 a16, c8 a17 FROM "S 1"."T 1" WHERE (("C 1" = 101)) FOR UPDATE
 (5 rows)
 
 SELECT * FROM ft1 t1 WHERE c1 = 101 FOR UPDATE;
@@ -255,13 +309,13 @@ SELECT * FROM ft1 t1 WHERE c1 = 101 FOR UPDATE;
 (1 row)
 
 EXPLAIN (VERBOSE, COSTS false) SELECT * FROM ft1 t1 WHERE c1 = 102 FOR SHARE;
-                                                  QUERY PLAN                                                   
----------------------------------------------------------------------------------------------------------------
+                                                                  QUERY PLAN                                                                   
+-----------------------------------------------------------------------------------------------------------------------------------------------
  LockRows
    Output: c1, c2, c3, c4, c5, c6, c7, c8, t1.*
    ->  Foreign Scan on public.ft1 t1
          Output: c1, c2, c3, c4, c5, c6, c7, c8, t1.*
-         Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" WHERE (("C 1" = 102)) FOR SHARE
+         Remote SQL: SELECT "C 1" a10, c2 a11, c3 a12, c4 a13, c5 a14, c6 a15, c7 a16, c8 a17 FROM "S 1"."T 1" WHERE (("C 1" = 102)) FOR SHARE
 (5 rows)
 
 SELECT * FROM ft1 t1 WHERE c1 = 102 FOR SHARE;
@@ -277,22 +331,6 @@ SELECT COUNT(*) FROM ft1 t1;
   1000
 (1 row)
 
--- join two tables
-SELECT t1.c1 FROM ft1 t1 JOIN ft2 t2 ON (t1.c1 = t2.c1) ORDER BY t1.c3, t1.c1 OFFSET 100 LIMIT 10;
- c1  
------
- 101
- 102
- 103
- 104
- 105
- 106
- 107
- 108
- 109
- 110
-(10 rows)
-
 -- subquery
 SELECT * FROM ft1 t1 WHERE t1.c3 IN (SELECT c3 FROM ft2 t2 WHERE c1 <= 10) ORDER BY c1;
  c1 | c2 |  c3   |              c4              |            c5            | c6 |     c7     | c8  
@@ -353,153 +391,148 @@ CREATE OPERATOR === (
     NEGATOR = !==
 );
 EXPLAIN (VERBOSE, COSTS false) SELECT * FROM ft1 t1 WHERE t1.c1 = postgres_fdw_abs(t1.c2);
-                               QUERY PLAN                                
--------------------------------------------------------------------------
+                                               QUERY PLAN                                                
+---------------------------------------------------------------------------------------------------------
  Foreign Scan on public.ft1 t1
    Output: c1, c2, c3, c4, c5, c6, c7, c8
    Filter: (t1.c1 = postgres_fdw_abs(t1.c2))
-   Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1"
+   Remote SQL: SELECT "C 1" a10, c2 a11, c3 a12, c4 a13, c5 a14, c6 a15, c7 a16, c8 a17 FROM "S 1"."T 1"
 (4 rows)
 
 EXPLAIN (VERBOSE, COSTS false) SELECT * FROM ft1 t1 WHERE t1.c1 === t1.c2;
-                               QUERY PLAN                                
--------------------------------------------------------------------------
+                                               QUERY PLAN                                                
+---------------------------------------------------------------------------------------------------------
  Foreign Scan on public.ft1 t1
    Output: c1, c2, c3, c4, c5, c6, c7, c8
    Filter: (t1.c1 === t1.c2)
-   Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1"
+   Remote SQL: SELECT "C 1" a10, c2 a11, c3 a12, c4 a13, c5 a14, c6 a15, c7 a16, c8 a17 FROM "S 1"."T 1"
 (4 rows)
 
 EXPLAIN (VERBOSE, COSTS false) SELECT * FROM ft1 t1 WHERE t1.c1 = abs(t1.c2);
-                                            QUERY PLAN                                             
----------------------------------------------------------------------------------------------------
+                                                            QUERY PLAN                                                             
+-----------------------------------------------------------------------------------------------------------------------------------
  Foreign Scan on public.ft1 t1
    Output: c1, c2, c3, c4, c5, c6, c7, c8
-   Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" WHERE (("C 1" = abs(c2)))
+   Remote SQL: SELECT "C 1" a10, c2 a11, c3 a12, c4 a13, c5 a14, c6 a15, c7 a16, c8 a17 FROM "S 1"."T 1" WHERE (("C 1" = abs(c2)))
 (3 rows)
 
 EXPLAIN (VERBOSE, COSTS false) SELECT * FROM ft1 t1 WHERE t1.c1 = t1.c2;
-                                          QUERY PLAN                                          
-----------------------------------------------------------------------------------------------
+                                                          QUERY PLAN                                                          
+------------------------------------------------------------------------------------------------------------------------------
  Foreign Scan on public.ft1 t1
    Output: c1, c2, c3, c4, c5, c6, c7, c8
-   Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" WHERE (("C 1" = c2))
+   Remote SQL: SELECT "C 1" a10, c2 a11, c3 a12, c4 a13, c5 a14, c6 a15, c7 a16, c8 a17 FROM "S 1"."T 1" WHERE (("C 1" = c2))
 (3 rows)
 
 -- ===================================================================
 -- WHERE with remotely-executable conditions
 -- ===================================================================
 EXPLAIN (VERBOSE, COSTS false) SELECT * FROM ft1 t1 WHERE t1.c1 = 1;         -- Var, OpExpr(b), Const
-                                         QUERY PLAN                                          
----------------------------------------------------------------------------------------------
+                                                         QUERY PLAN                                                          
+-----------------------------------------------------------------------------------------------------------------------------
  Foreign Scan on public.ft1 t1
    Output: c1, c2, c3, c4, c5, c6, c7, c8
-   Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" WHERE (("C 1" = 1))
+   Remote SQL: SELECT "C 1" a10, c2 a11, c3 a12, c4 a13, c5 a14, c6 a15, c7 a16, c8 a17 FROM "S 1"."T 1" WHERE (("C 1" = 1))
 (3 rows)
 
 EXPLAIN (VERBOSE, COSTS false) SELECT * FROM ft1 t1 WHERE t1.c1 = 100 AND t1.c2 = 0; -- BoolExpr
-                                                  QUERY PLAN                                                  
---------------------------------------------------------------------------------------------------------------
+                                                                  QUERY PLAN                                                                  
+----------------------------------------------------------------------------------------------------------------------------------------------
  Foreign Scan on public.ft1 t1
    Output: c1, c2, c3, c4, c5, c6, c7, c8
-   Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" WHERE (("C 1" = 100)) AND ((c2 = 0))
+   Remote SQL: SELECT "C 1" a10, c2 a11, c3 a12, c4 a13, c5 a14, c6 a15, c7 a16, c8 a17 FROM "S 1"."T 1" WHERE (("C 1" = 100)) AND ((c2 = 0))
 (3 rows)
 
 EXPLAIN (VERBOSE, COSTS false) SELECT * FROM ft1 t1 WHERE c1 IS NULL;        -- NullTest
-                                           QUERY PLAN                                            
--------------------------------------------------------------------------------------------------
+                                                           QUERY PLAN                                                            
+---------------------------------------------------------------------------------------------------------------------------------
  Foreign Scan on public.ft1 t1
    Output: c1, c2, c3, c4, c5, c6, c7, c8
-   Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" WHERE (("C 1" IS NULL))
+   Remote SQL: SELECT "C 1" a10, c2 a11, c3 a12, c4 a13, c5 a14, c6 a15, c7 a16, c8 a17 FROM "S 1"."T 1" WHERE (("C 1" IS NULL))
 (3 rows)
 
 EXPLAIN (VERBOSE, COSTS false) SELECT * FROM ft1 t1 WHERE c1 IS NOT NULL;    -- NullTest
-                                             QUERY PLAN                                              
------------------------------------------------------------------------------------------------------
+                                                             QUERY PLAN                                                              
+-------------------------------------------------------------------------------------------------------------------------------------
  Foreign Scan on public.ft1 t1
    Output: c1, c2, c3, c4, c5, c6, c7, c8
-   Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" WHERE (("C 1" IS NOT NULL))
+   Remote SQL: SELECT "C 1" a10, c2 a11, c3 a12, c4 a13, c5 a14, c6 a15, c7 a16, c8 a17 FROM "S 1"."T 1" WHERE (("C 1" IS NOT NULL))
 (3 rows)
 
 EXPLAIN (VERBOSE, COSTS false) SELECT * FROM ft1 t1 WHERE round(abs(c1), 0) = 1; -- FuncExpr
-                                                     QUERY PLAN                                                      
----------------------------------------------------------------------------------------------------------------------
+                                                                     QUERY PLAN                                                                      
+-----------------------------------------------------------------------------------------------------------------------------------------------------
  Foreign Scan on public.ft1 t1
    Output: c1, c2, c3, c4, c5, c6, c7, c8
-   Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" WHERE ((round(abs("C 1"), 0) = 1::numeric))
+   Remote SQL: SELECT "C 1" a10, c2 a11, c3 a12, c4 a13, c5 a14, c6 a15, c7 a16, c8 a17 FROM "S 1"."T 1" WHERE ((round(abs("C 1"), 0) = 1::numeric))
 (3 rows)
 
 EXPLAIN (VERBOSE, COSTS false) SELECT * FROM ft1 t1 WHERE c1 = -c1;          -- OpExpr(l)
-                                             QUERY PLAN                                              
------------------------------------------------------------------------------------------------------
+                                                             QUERY PLAN                                                              
+-------------------------------------------------------------------------------------------------------------------------------------
  Foreign Scan on public.ft1 t1
    Output: c1, c2, c3, c4, c5, c6, c7, c8
-   Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" WHERE (("C 1" = (- "C 1")))
+   Remote SQL: SELECT "C 1" a10, c2 a11, c3 a12, c4 a13, c5 a14, c6 a15, c7 a16, c8 a17 FROM "S 1"."T 1" WHERE (("C 1" = (- "C 1")))
 (3 rows)
 
 EXPLAIN (VERBOSE, COSTS false) SELECT * FROM ft1 t1 WHERE 1 = c1!;           -- OpExpr(r)
-                                                QUERY PLAN                                                
-----------------------------------------------------------------------------------------------------------
+                                                                QUERY PLAN                                                                
+------------------------------------------------------------------------------------------------------------------------------------------
  Foreign Scan on public.ft1 t1
    Output: c1, c2, c3, c4, c5, c6, c7, c8
-   Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" WHERE ((1::numeric = ("C 1" !)))
+   Remote SQL: SELECT "C 1" a10, c2 a11, c3 a12, c4 a13, c5 a14, c6 a15, c7 a16, c8 a17 FROM "S 1"."T 1" WHERE ((1::numeric = ("C 1" !)))
 (3 rows)
 
 EXPLAIN (VERBOSE, COSTS false) SELECT * FROM ft1 t1 WHERE (c1 IS NOT NULL) IS DISTINCT FROM (c1 IS NOT NULL); -- DistinctExpr
-                                                                 QUERY PLAN                                                                 
---------------------------------------------------------------------------------------------------------------------------------------------
+                                                                                 QUERY PLAN                                                                                 
+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------
  Foreign Scan on public.ft1 t1
    Output: c1, c2, c3, c4, c5, c6, c7, c8
-   Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" WHERE ((("C 1" IS NOT NULL) IS DISTINCT FROM ("C 1" IS NOT NULL)))
+   Remote SQL: SELECT "C 1" a10, c2 a11, c3 a12, c4 a13, c5 a14, c6 a15, c7 a16, c8 a17 FROM "S 1"."T 1" WHERE ((("C 1" IS NOT NULL) IS DISTINCT FROM ("C 1" IS NOT NULL)))
 (3 rows)
 
 EXPLAIN (VERBOSE, COSTS false) SELECT * FROM ft1 t1 WHERE c1 = ANY(ARRAY[c2, 1, c1 + 0]); -- ScalarArrayOpExpr
-                                                        QUERY PLAN                                                         
----------------------------------------------------------------------------------------------------------------------------
+                                                                        QUERY PLAN                                                                         
+-----------------------------------------------------------------------------------------------------------------------------------------------------------
  Foreign Scan on public.ft1 t1
    Output: c1, c2, c3, c4, c5, c6, c7, c8
-   Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" WHERE (("C 1" = ANY (ARRAY[c2, 1, ("C 1" + 0)])))
+   Remote SQL: SELECT "C 1" a10, c2 a11, c3 a12, c4 a13, c5 a14, c6 a15, c7 a16, c8 a17 FROM "S 1"."T 1" WHERE (("C 1" = ANY (ARRAY[c2, 1, ("C 1" + 0)])))
 (3 rows)
 
 EXPLAIN (VERBOSE, COSTS false) SELECT * FROM ft1 t1 WHERE c1 = (ARRAY[c1,c2,3])[1]; -- ArrayRef
-                                                      QUERY PLAN                                                      
-----------------------------------------------------------------------------------------------------------------------
+                                                                      QUERY PLAN                                                                      
+------------------------------------------------------------------------------------------------------------------------------------------------------
  Foreign Scan on public.ft1 t1
    Output: c1, c2, c3, c4, c5, c6, c7, c8
-   Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" WHERE (("C 1" = ((ARRAY["C 1", c2, 3])[1])))
+   Remote SQL: SELECT "C 1" a10, c2 a11, c3 a12, c4 a13, c5 a14, c6 a15, c7 a16, c8 a17 FROM "S 1"."T 1" WHERE (("C 1" = ((ARRAY["C 1", c2, 3])[1])))
 (3 rows)
 
 EXPLAIN (VERBOSE, COSTS false) SELECT * FROM ft1 t1 WHERE c6 = E'foo''s\\bar';  -- check special chars
-                                                 QUERY PLAN                                                  
--------------------------------------------------------------------------------------------------------------
+                                                                 QUERY PLAN                                                                  
+---------------------------------------------------------------------------------------------------------------------------------------------
  Foreign Scan on public.ft1 t1
    Output: c1, c2, c3, c4, c5, c6, c7, c8
-   Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" WHERE ((c6 = E'foo''s\\bar'::text))
+   Remote SQL: SELECT "C 1" a10, c2 a11, c3 a12, c4 a13, c5 a14, c6 a15, c7 a16, c8 a17 FROM "S 1"."T 1" WHERE ((c6 = E'foo''s\\bar'::text))
 (3 rows)
 
 EXPLAIN (VERBOSE, COSTS false) SELECT * FROM ft1 t1 WHERE c8 = 'foo';  -- can't be sent to remote
-                               QUERY PLAN                                
--------------------------------------------------------------------------
+                                               QUERY PLAN                                                
+---------------------------------------------------------------------------------------------------------
  Foreign Scan on public.ft1 t1
    Output: c1, c2, c3, c4, c5, c6, c7, c8
    Filter: (t1.c8 = 'foo'::user_enum)
-   Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1"
+   Remote SQL: SELECT "C 1" a10, c2 a11, c3 a12, c4 a13, c5 a14, c6 a15, c7 a16, c8 a17 FROM "S 1"."T 1"
 (4 rows)
 
 -- parameterized remote path
 EXPLAIN (VERBOSE, COSTS false)
   SELECT * FROM ft2 a, ft2 b WHERE a.c1 = 47 AND b.c1 = a.c2;
-                                                 QUERY PLAN                                                  
--------------------------------------------------------------------------------------------------------------
- Nested Loop
+                                                                                                                                                                                                               QUERY PLAN                                                                                                                                                                                                                
+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+ Foreign Scan
    Output: a.c1, a.c2, a.c3, a.c4, a.c5, a.c6, a.c7, a.c8, b.c1, b.c2, b.c3, b.c4, b.c5, b.c6, b.c7, b.c8
-   ->  Foreign Scan on public.ft2 a
-         Output: a.c1, a.c2, a.c3, a.c4, a.c5, a.c6, a.c7, a.c8
-         Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" WHERE (("C 1" = 47))
-   ->  Foreign Scan on public.ft2 b
-         Output: b.c1, b.c2, b.c3, b.c4, b.c5, b.c6, b.c7, b.c8
-         Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" WHERE (($1::integer = "C 1"))
-(8 rows)
+   Remote SQL: SELECT l.a1, l.a2, l.a3, l.a4, l.a5, l.a6, l.a7, l.a8, r.a1, r.a2, r.a3, r.a4, r.a5, r.a6, r.a7, r.a8 FROM (SELECT "C 1" a9, c2 a10, c3 a12, c4 a13, c5 a14, c6 a15, c7 a16, c8 a17 FROM "S 1"."T 1" WHERE (("C 1" = 47))) l (a1, a2, a3, a4, a5, a6, a7, a8) INNER JOIN (SELECT "C 1" a9, c2 a10, c3 a12, c4 a13, c5 a14, c6 a15, c7 a16, c8 a17 FROM "S 1"."T 1") r (a1, a2, a3, a4, a5, a6, a7, a8) ON ((l.a2 = r.a1))
+(3 rows)
 
 SELECT * FROM ft2 a, ft2 b WHERE a.c1 = 47 AND b.c1 = a.c2;
  c1 | c2 |  c3   |              c4              |            c5            | c6 |     c7     | c8  | c1 | c2 |  c3   |              c4              |            c5            | c6 |     c7     | c8  
@@ -511,18 +544,18 @@ SELECT * FROM ft2 a, ft2 b WHERE a.c1 = 47 AND b.c1 = a.c2;
 EXPLAIN (VERBOSE, COSTS false)
   SELECT * FROM ft2 a, ft2 b
   WHERE a.c2 = 6 AND b.c1 = a.c1 AND a.c8 = 'foo' AND b.c7 = upper(a.c7);
-                                                 QUERY PLAN                                                  
--------------------------------------------------------------------------------------------------------------
+                                                                 QUERY PLAN                                                                 
+--------------------------------------------------------------------------------------------------------------------------------------------
  Nested Loop
    Output: a.c1, a.c2, a.c3, a.c4, a.c5, a.c6, a.c7, a.c8, b.c1, b.c2, b.c3, b.c4, b.c5, b.c6, b.c7, b.c8
    ->  Foreign Scan on public.ft2 a
          Output: a.c1, a.c2, a.c3, a.c4, a.c5, a.c6, a.c7, a.c8
          Filter: (a.c8 = 'foo'::user_enum)
-         Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" WHERE ((c2 = 6))
+         Remote SQL: SELECT "C 1" a9, c2 a10, c3 a12, c4 a13, c5 a14, c6 a15, c7 a16, c8 a17 FROM "S 1"."T 1" WHERE ((c2 = 6))
    ->  Foreign Scan on public.ft2 b
          Output: b.c1, b.c2, b.c3, b.c4, b.c5, b.c6, b.c7, b.c8
          Filter: (upper((a.c7)::text) = (b.c7)::text)
-         Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" WHERE (($1::integer = "C 1"))
+         Remote SQL: SELECT "C 1" a9, c2 a10, c3 a12, c4 a13, c5 a14, c6 a15, c7 a16, c8 a17 FROM "S 1"."T 1" WHERE (($1::integer = "C 1"))
 (10 rows)
 
 SELECT * FROM ft2 a, ft2 b
@@ -651,21 +684,587 @@ SELECT * FROM ft2 WHERE c1 = ANY (ARRAY(SELECT c1 FROM ft1 WHERE c1 < 5));
 (4 rows)
 
 -- ===================================================================
+-- JOIN queries
+-- ===================================================================
+-- join two tables
+EXPLAIN (COSTS false, VERBOSE)
+SELECT t1.c1, t2.c1 FROM ft1 t1 JOIN ft2 t2 ON (t1.c1 = t2.c1) ORDER BY t1.c3, t1.c1 OFFSET 100 LIMIT 10;
+                                                                                      QUERY PLAN                                                                                       
+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+ Limit
+   Output: t1.c1, t2.c1, t1.c3
+   ->  Sort
+         Output: t1.c1, t2.c1, t1.c3
+         Sort Key: t1.c3, t1.c1
+         ->  Foreign Scan
+               Output: t1.c1, t2.c1, t1.c3
+               Remote SQL: SELECT l.a1, l.a2, r.a1 FROM (SELECT "C 1" a10, c3 a12 FROM "S 1"."T 1") l (a1, a2) INNER JOIN (SELECT "C 1" a9 FROM "S 1"."T 1") r (a1) ON ((l.a1 = r.a1))
+(8 rows)
+
+SELECT t1.c1, t2.c1 FROM ft1 t1 JOIN ft2 t2 ON (t1.c1 = t2.c1) ORDER BY t1.c3, t1.c1 OFFSET 100 LIMIT 10;
+ c1  | c1  
+-----+-----
+ 101 | 101
+ 102 | 102
+ 103 | 103
+ 104 | 104
+ 105 | 105
+ 106 | 106
+ 107 | 107
+ 108 | 108
+ 109 | 109
+ 110 | 110
+(10 rows)
+
+-- join three tables
+EXPLAIN (COSTS false, VERBOSE)
+SELECT t1.c1, t2.c1, t3.c1 FROM ft1 t1 JOIN ft2 t2 ON (t1.c1 = t2.c1) JOIN ft4 t3 ON (t3.c1 = t1.c1) ORDER BY t1.c3, t1.c1 OFFSET 10 LIMIT 10;
+                                                                                                                                                   QUERY PLAN                                                                                                                                                   
+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+ Limit
+   Output: t1.c1, t2.c1, t3.c1, t1.c3
+   ->  Sort
+         Output: t1.c1, t2.c1, t3.c1, t1.c3
+         Sort Key: t1.c3, t1.c1
+         ->  Foreign Scan
+               Output: t1.c1, t2.c1, t3.c1, t1.c3
+               Remote SQL: SELECT l.a1, l.a2, l.a3, r.a1 FROM (SELECT l.a1, l.a2, r.a1 FROM (SELECT "C 1" a10, c3 a12 FROM "S 1"."T 1") l (a1, a2) INNER JOIN (SELECT "C 1" a9 FROM "S 1"."T 1") r (a1) ON ((l.a1 = r.a1))) l (a1, a2, a3) INNER JOIN (SELECT c1 a9 FROM "S 1"."T 3") r (a1) ON ((l.a1 = r.a1))
+(8 rows)
+
+SELECT t1.c1, t2.c1, t3.c1 FROM ft1 t1 JOIN ft2 t2 ON (t1.c1 = t2.c1) JOIN ft4 t3 ON (t3.c1 = t1.c1) ORDER BY t1.c3, t1.c1 OFFSET 10 LIMIT 10;
+ c1 | c1 | c1 
+----+----+----
+ 22 | 22 | 22
+ 24 | 24 | 24
+ 26 | 26 | 26
+ 28 | 28 | 28
+ 30 | 30 | 30
+ 32 | 32 | 32
+ 34 | 34 | 34
+ 36 | 36 | 36
+ 38 | 38 | 38
+ 40 | 40 | 40
+(10 rows)
+
+-- left outer join
+EXPLAIN (COSTS false, VERBOSE)
+SELECT t1.c1, t2.c1 FROM ft4 t1 LEFT JOIN ft5 t2 ON (t1.c1 = t2.c1) ORDER BY t1.c1, t2.c1 OFFSET 10 LIMIT 10;
+                                                                         QUERY PLAN                                                                          
+-------------------------------------------------------------------------------------------------------------------------------------------------------------
+ Limit
+   Output: t1.c1, t2.c1
+   ->  Sort
+         Output: t1.c1, t2.c1
+         Sort Key: t1.c1, t2.c1
+         ->  Foreign Scan
+               Output: t1.c1, t2.c1
+               Remote SQL: SELECT l.a1, r.a1 FROM (SELECT c1 a9 FROM "S 1"."T 3") l (a1) LEFT JOIN (SELECT c1 a9 FROM "S 1"."T 4") r (a1) ON ((l.a1 = r.a1))
+(8 rows)
+
+SELECT t1.c1, t2.c1 FROM ft4 t1 LEFT JOIN ft5 t2 ON (t1.c1 = t2.c1) ORDER BY t1.c1, t2.c1 OFFSET 10 LIMIT 10;
+ c1 | c1 
+----+----
+ 22 |   
+ 24 | 24
+ 26 |   
+ 28 |   
+ 30 | 30
+ 32 |   
+ 34 |   
+ 36 | 36
+ 38 |   
+ 40 |   
+(10 rows)
+
+-- right outer join
+SET enable_mergejoin = off; -- planner chooses MergeJoin even though it has a higher cost, so disable it for testing.
+EXPLAIN (COSTS false, VERBOSE)
+SELECT t1.c1, t2.c1 FROM ft4 t1 RIGHT JOIN ft5 t2 ON (t1.c1 = t2.c1) ORDER BY t2.c1, t2.c1 OFFSET 10 LIMIT 10;
+                                                                         QUERY PLAN                                                                          
+-------------------------------------------------------------------------------------------------------------------------------------------------------------
+ Limit
+   Output: t1.c1, t2.c1
+   ->  Sort
+         Output: t1.c1, t2.c1
+         Sort Key: t2.c1
+         ->  Foreign Scan
+               Output: t1.c1, t2.c1
+               Remote SQL: SELECT l.a1, r.a1 FROM (SELECT c1 a9 FROM "S 1"."T 4") l (a1) LEFT JOIN (SELECT c1 a9 FROM "S 1"."T 3") r (a1) ON ((r.a1 = l.a1))
+(8 rows)
+
+SELECT t1.c1, t2.c1 FROM ft4 t1 RIGHT JOIN ft5 t2 ON (t1.c1 = t2.c1) ORDER BY t2.c1, t2.c1 OFFSET 10 LIMIT 10;
+ c1 | c1 
+----+----
+    | 33
+ 36 | 36
+    | 39
+ 42 | 42
+    | 45
+ 48 | 48
+    | 51
+ 54 | 54
+    | 57
+ 60 | 60
+(10 rows)
+
+SET enable_mergejoin = on;
+-- full outer join
+EXPLAIN (COSTS false, VERBOSE)
+SELECT t1.c1, t2.c1 FROM ft4 t1 FULL JOIN ft5 t2 ON (t1.c1 = t2.c1) ORDER BY t1.c1, t2.c1 OFFSET 45 LIMIT 10;
+                                                                         QUERY PLAN                                                                          
+-------------------------------------------------------------------------------------------------------------------------------------------------------------
+ Limit
+   Output: t1.c1, t2.c1
+   ->  Sort
+         Output: t1.c1, t2.c1
+         Sort Key: t1.c1, t2.c1
+         ->  Foreign Scan
+               Output: t1.c1, t2.c1
+               Remote SQL: SELECT l.a1, r.a1 FROM (SELECT c1 a9 FROM "S 1"."T 3") l (a1) FULL JOIN (SELECT c1 a9 FROM "S 1"."T 4") r (a1) ON ((l.a1 = r.a1))
+(8 rows)
+
+SELECT t1.c1, t2.c1 FROM ft4 t1 FULL JOIN ft5 t2 ON (t1.c1 = t2.c1) ORDER BY t1.c1, t2.c1 OFFSET 45 LIMIT 10;
+ c1  | c1 
+-----+----
+  92 |   
+  94 |   
+  96 | 96
+  98 |   
+ 100 |   
+     |  3
+     |  9
+     | 15
+     | 21
+     | 27
+(10 rows)
+
+-- full outer join + WHERE clause, only matched rows
+EXPLAIN (COSTS false, VERBOSE)
+SELECT t1.c1, t2.c1 FROM ft4 t1 FULL JOIN ft5 t2 ON (t1.c1 = t2.c1) WHERE (t1.c1 = t2.c1 OR t1.c1 IS NULL) ORDER BY t1.c1, t2.c1 OFFSET 10 LIMIT 10;
+                                                                                              QUERY PLAN                                                                                               
+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+ Limit
+   Output: t1.c1, t2.c1
+   ->  Sort
+         Output: t1.c1, t2.c1
+         Sort Key: t1.c1, t2.c1
+         ->  Foreign Scan
+               Output: t1.c1, t2.c1
+               Remote SQL: SELECT l.a1, r.a1 FROM (SELECT c1 a9 FROM "S 1"."T 3") l (a1) FULL JOIN (SELECT c1 a9 FROM "S 1"."T 4") r (a1) ON ((l.a1 = r.a1)) WHERE (((l.a1 = r.a1) OR (l.a1 IS NULL)))
+(8 rows)
+
+SELECT t1.c1, t2.c1 FROM ft4 t1 FULL JOIN ft5 t2 ON (t1.c1 = t2.c1) WHERE (t1.c1 = t2.c1 OR t1.c1 IS NULL) ORDER BY t1.c1, t2.c1 OFFSET 10 LIMIT 10;
+ c1 | c1 
+----+----
+ 66 | 66
+ 72 | 72
+ 78 | 78
+ 84 | 84
+ 90 | 90
+ 96 | 96
+    |  3
+    |  9
+    | 15
+    | 21
+(10 rows)
+
+-- join at WHERE clause 
+SET enable_mergejoin = off; -- planner chooses MergeJoin even though it has a higher cost, so disable it for testing.
+EXPLAIN (COSTS false, VERBOSE)
+SELECT t1.c1, t2.c1 FROM ft4 t1 LEFT JOIN ft5 t2 ON true WHERE (t1.c1 = t2.c1) ORDER BY t1.c1, t2.c1 OFFSET 10 LIMIT 10;
+                                                                          QUERY PLAN                                                                          
+--------------------------------------------------------------------------------------------------------------------------------------------------------------
+ Limit
+   Output: t1.c1, t2.c1
+   ->  Sort
+         Output: t1.c1, t2.c1
+         Sort Key: t1.c1
+         ->  Foreign Scan
+               Output: t1.c1, t2.c1
+               Remote SQL: SELECT l.a1, r.a1 FROM (SELECT c1 a9 FROM "S 1"."T 3") l (a1) INNER JOIN (SELECT c1 a9 FROM "S 1"."T 4") r (a1) ON ((l.a1 = r.a1))
+(8 rows)
+
+SELECT t1.c1, t2.c1 FROM ft4 t1 LEFT JOIN ft5 t2 ON true WHERE (t1.c1 = t2.c1) ORDER BY t1.c1, t2.c1 OFFSET 10 LIMIT 10;
+ c1 | c1 
+----+----
+ 66 | 66
+ 72 | 72
+ 78 | 78
+ 84 | 84
+ 90 | 90
+ 96 | 96
+(6 rows)
+
+SET enable_mergejoin = on;
+-- join in CTE
+EXPLAIN (COSTS false, VERBOSE)
+WITH t (c1_1, c1_3, c2_1) AS (SELECT t1.c1, t1.c3, t2.c1 FROM ft1 t1 JOIN ft2 t2 ON (t1.c1 = t2.c1)) SELECT c1_1, c2_1 FROM t ORDER BY c1_3, c1_1 OFFSET 100 LIMIT 10;
+                                                                                    QUERY PLAN                                                                                     
+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+ Limit
+   Output: t.c1_1, t.c2_1, t.c1_3
+   CTE t
+     ->  Foreign Scan
+           Output: t1.c1, t1.c3, t2.c1
+           Remote SQL: SELECT l.a1, l.a2, r.a1 FROM (SELECT "C 1" a10, c3 a12 FROM "S 1"."T 1") l (a1, a2) INNER JOIN (SELECT "C 1" a9 FROM "S 1"."T 1") r (a1) ON ((l.a1 = r.a1))
+   ->  Sort
+         Output: t.c1_1, t.c2_1, t.c1_3
+         Sort Key: t.c1_3, t.c1_1
+         ->  CTE Scan on t
+               Output: t.c1_1, t.c2_1, t.c1_3
+(11 rows)
+
+WITH t (c1_1, c1_3, c2_1) AS (SELECT t1.c1, t1.c3, t2.c1 FROM ft1 t1 JOIN ft2 t2 ON (t1.c1 = t2.c1)) SELECT c1_1, c2_1 FROM t ORDER BY c1_3, c1_1 OFFSET 100 LIMIT 10;
+ c1_1 | c2_1 
+------+------
+  101 |  101
+  102 |  102
+  103 |  103
+  104 |  104
+  105 |  105
+  106 |  106
+  107 |  107
+  108 |  108
+  109 |  109
+  110 |  110
+(10 rows)
+
+-- ctid with whole-row reference
+EXPLAIN (COSTS false, VERBOSE)
+SELECT t1.ctid, t1, t2, t1.c1 FROM ft1 t1 JOIN ft2 t2 ON (t1.c1 = t2.c1) ORDER BY t1.c3, t1.c1 OFFSET 100 LIMIT 10;
+                                                                                                                                                                                                                                                   QUERY PLAN                                                                                                                                                                                                                                                    
+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+ Limit
+   Output: t1.ctid, t1.*, t2.*, t1.c1, t1.c3
+   ->  Sort
+         Output: t1.ctid, t1.*, t2.*, t1.c1, t1.c3
+         Sort Key: t1.c3, t1.c1
+         ->  Foreign Scan
+               Output: t1.ctid, t1.*, t2.*, t1.c1, t1.c3
+               Remote SQL: SELECT l.a1, l.a2, l.a3, l.a4, r.a1 FROM (SELECT l.a7, ROW(l.a10, l.a11, l.a12, l.a13, l.a14, l.a15, l.a16, l.a17), l.a10, l.a12 FROM (SELECT "C 1" a10, c2 a11, c3 a12, c4 a13, c5 a14, c6 a15, c7 a16, c8 a17, ctid a7 FROM "S 1"."T 1") l) l (a1, a2, a3, a4) INNER JOIN (SELECT ROW(r.a9, r.a10, r.a12, r.a13, r.a14, r.a15, r.a16, r.a17), r.a9 FROM (SELECT "C 1" a9, c2 a10, c3 a12, c4 a13, c5 a14, c6 a15, c7 a16, c8 a17 FROM "S 1"."T 1") r) r (a1, a2) ON ((l.a3 = r.a2))
+(8 rows)
+
+SELECT t1.ctid, t1, t2, t1.c1 FROM ft1 t1 JOIN ft2 t2 ON (t1.c1 = t2.c1) ORDER BY t1.c3, t1.c1 OFFSET 100 LIMIT 10;
+  ctid  |                                             t1                                             |                                             t2                                             | c1  
+--------+--------------------------------------------------------------------------------------------+--------------------------------------------------------------------------------------------+-----
+ (1,4)  | (101,1,00101,"Fri Jan 02 00:00:00 1970 PST","Fri Jan 02 00:00:00 1970",1,"1         ",foo) | (101,1,00101,"Fri Jan 02 00:00:00 1970 PST","Fri Jan 02 00:00:00 1970",1,"1         ",foo) | 101
+ (1,5)  | (102,2,00102,"Sat Jan 03 00:00:00 1970 PST","Sat Jan 03 00:00:00 1970",2,"2         ",foo) | (102,2,00102,"Sat Jan 03 00:00:00 1970 PST","Sat Jan 03 00:00:00 1970",2,"2         ",foo) | 102
+ (1,6)  | (103,3,00103,"Sun Jan 04 00:00:00 1970 PST","Sun Jan 04 00:00:00 1970",3,"3         ",foo) | (103,3,00103,"Sun Jan 04 00:00:00 1970 PST","Sun Jan 04 00:00:00 1970",3,"3         ",foo) | 103
+ (1,7)  | (104,4,00104,"Mon Jan 05 00:00:00 1970 PST","Mon Jan 05 00:00:00 1970",4,"4         ",foo) | (104,4,00104,"Mon Jan 05 00:00:00 1970 PST","Mon Jan 05 00:00:00 1970",4,"4         ",foo) | 104
+ (1,8)  | (105,5,00105,"Tue Jan 06 00:00:00 1970 PST","Tue Jan 06 00:00:00 1970",5,"5         ",foo) | (105,5,00105,"Tue Jan 06 00:00:00 1970 PST","Tue Jan 06 00:00:00 1970",5,"5         ",foo) | 105
+ (1,9)  | (106,6,00106,"Wed Jan 07 00:00:00 1970 PST","Wed Jan 07 00:00:00 1970",6,"6         ",foo) | (106,6,00106,"Wed Jan 07 00:00:00 1970 PST","Wed Jan 07 00:00:00 1970",6,"6         ",foo) | 106
+ (1,10) | (107,7,00107,"Thu Jan 08 00:00:00 1970 PST","Thu Jan 08 00:00:00 1970",7,"7         ",foo) | (107,7,00107,"Thu Jan 08 00:00:00 1970 PST","Thu Jan 08 00:00:00 1970",7,"7         ",foo) | 107
+ (1,11) | (108,8,00108,"Fri Jan 09 00:00:00 1970 PST","Fri Jan 09 00:00:00 1970",8,"8         ",foo) | (108,8,00108,"Fri Jan 09 00:00:00 1970 PST","Fri Jan 09 00:00:00 1970",8,"8         ",foo) | 108
+ (1,12) | (109,9,00109,"Sat Jan 10 00:00:00 1970 PST","Sat Jan 10 00:00:00 1970",9,"9         ",foo) | (109,9,00109,"Sat Jan 10 00:00:00 1970 PST","Sat Jan 10 00:00:00 1970",9,"9         ",foo) | 109
+ (1,13) | (110,0,00110,"Sun Jan 11 00:00:00 1970 PST","Sun Jan 11 00:00:00 1970",0,"0         ",foo) | (110,0,00110,"Sun Jan 11 00:00:00 1970 PST","Sun Jan 11 00:00:00 1970",0,"0         ",foo) | 110
+(10 rows)
+
+-- partially unsafe to push down, not pushed down
+EXPLAIN (COSTS false, VERBOSE)
+SELECT t1.c1 FROM ft1 t1 JOIN ft2 t2 ON t2.c1 = t2.c1 JOIN ft4 t3 ON t2.c1 = t3.c1 ORDER BY t1.c1 OFFSET 10 LIMIT 10;
+                                                                                          QUERY PLAN                                                                                           
+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+ Limit
+   Output: t1.c1
+   ->  Sort
+         Output: t1.c1
+         Sort Key: t1.c1
+         ->  Nested Loop
+               Output: t1.c1
+               ->  Foreign Scan on public.ft1 t1
+                     Output: t1.c1
+                     Remote SQL: SELECT "C 1" a10 FROM "S 1"."T 1"
+               ->  Materialize
+                     ->  Foreign Scan
+                           Remote SQL: SELECT NULL FROM (SELECT "C 1" a9 FROM "S 1"."T 1" WHERE (("C 1" = "C 1"))) l (a1) INNER JOIN (SELECT c1 a9 FROM "S 1"."T 3") r (a1) ON ((l.a1 = r.a1))
+(13 rows)
+
+SELECT t1.c1 FROM ft1 t1 JOIN ft2 t2 ON t2.c1 = t2.c1 JOIN ft4 t3 ON t2.c1 = t3.c1 ORDER BY t1.c1 OFFSET 10 LIMIT 10;
+ c1 
+----
+  1
+  1
+  1
+  1
+  1
+  1
+  1
+  1
+  1
+  1
+(10 rows)
+
+-- SEMI JOIN, not pushed down
+EXPLAIN (COSTS false, VERBOSE)
+SELECT t1.c1 FROM ft1 t1 WHERE EXISTS (SELECT 1 FROM ft2 t2 WHERE t1.c1 = t2.c1) ORDER BY t1.c1 OFFSET 100 LIMIT 10;
+                                  QUERY PLAN                                  
+------------------------------------------------------------------------------
+ Limit
+   Output: t1.c1
+   ->  Sort
+         Output: t1.c1
+         Sort Key: t1.c1
+         ->  Hash Join
+               Output: t1.c1
+               Hash Cond: (t1.c1 = t2.c1)
+               ->  Foreign Scan on public.ft1 t1
+                     Output: t1.c1
+                     Remote SQL: SELECT "C 1" a10 FROM "S 1"."T 1"
+               ->  Hash
+                     Output: t2.c1
+                     ->  HashAggregate
+                           Output: t2.c1
+                           Group Key: t2.c1
+                           ->  Foreign Scan on public.ft2 t2
+                                 Output: t2.c1
+                                 Remote SQL: SELECT "C 1" a9 FROM "S 1"."T 1"
+(19 rows)
+
+SELECT t1.c1 FROM ft1 t1 WHERE EXISTS (SELECT 1 FROM ft2 t2 WHERE t1.c1 = t2.c1) ORDER BY t1.c1 OFFSET 100 LIMIT 10;
+ c1  
+-----
+ 101
+ 102
+ 103
+ 104
+ 105
+ 106
+ 107
+ 108
+ 109
+ 110
+(10 rows)
+
+-- ANTI JOIN, not pushed down
+EXPLAIN (COSTS false, VERBOSE)
+SELECT t1.c1 FROM ft1 t1 WHERE NOT EXISTS (SELECT 1 FROM ft2 t2 WHERE t1.c1 = t2.c2) ORDER BY t1.c1 OFFSET 100 LIMIT 10;
+                              QUERY PLAN                              
+----------------------------------------------------------------------
+ Limit
+   Output: t1.c1
+   ->  Sort
+         Output: t1.c1
+         Sort Key: t1.c1
+         ->  Hash Anti Join
+               Output: t1.c1
+               Hash Cond: (t1.c1 = t2.c2)
+               ->  Foreign Scan on public.ft1 t1
+                     Output: t1.c1
+                     Remote SQL: SELECT "C 1" a10 FROM "S 1"."T 1"
+               ->  Hash
+                     Output: t2.c2
+                     ->  Foreign Scan on public.ft2 t2
+                           Output: t2.c2
+                           Remote SQL: SELECT c2 a10 FROM "S 1"."T 1"
+(16 rows)
+
+SELECT t1.c1 FROM ft1 t1 WHERE NOT EXISTS (SELECT 1 FROM ft2 t2 WHERE t1.c1 = t2.c2) ORDER BY t1.c1 OFFSET 100 LIMIT 10;
+ c1  
+-----
+ 110
+ 111
+ 112
+ 113
+ 114
+ 115
+ 116
+ 117
+ 118
+ 119
+(10 rows)
+
+-- CROSS JOIN, not pushed down
+EXPLAIN (COSTS false, VERBOSE)
+SELECT t1.c1, t2.c1 FROM ft1 t1 CROSS JOIN ft2 t2 ORDER BY t1.c1, t2.c1 OFFSET 100 LIMIT 10;
+                               QUERY PLAN                               
+------------------------------------------------------------------------
+ Limit
+   Output: t1.c1, t2.c1
+   ->  Sort
+         Output: t1.c1, t2.c1
+         Sort Key: t1.c1, t2.c1
+         ->  Nested Loop
+               Output: t1.c1, t2.c1
+               ->  Foreign Scan on public.ft1 t1
+                     Output: t1.c1
+                     Remote SQL: SELECT "C 1" a10 FROM "S 1"."T 1"
+               ->  Materialize
+                     Output: t2.c1
+                     ->  Foreign Scan on public.ft2 t2
+                           Output: t2.c1
+                           Remote SQL: SELECT "C 1" a9 FROM "S 1"."T 1"
+(15 rows)
+
+SELECT t1.c1, t2.c1 FROM ft1 t1 CROSS JOIN ft2 t2 ORDER BY t1.c1, t2.c1 OFFSET 100 LIMIT 10;
+ c1 | c1  
+----+-----
+  1 | 101
+  1 | 102
+  1 | 103
+  1 | 104
+  1 | 105
+  1 | 106
+  1 | 107
+  1 | 108
+  1 | 109
+  1 | 110
+(10 rows)
+
+-- different server
+EXPLAIN (COSTS false, VERBOSE)
+SELECT t1.c1, t2.c1 FROM ft5 t1 JOIN ft6 t2 ON (t1.c1 = t2.c1) ORDER BY t1.c1, t2.c1 OFFSET 100 LIMIT 10;
+                          QUERY PLAN                           
+---------------------------------------------------------------
+ Limit
+   Output: t1.c1, t2.c1
+   ->  Merge Join
+         Output: t1.c1, t2.c1
+         Merge Cond: (t1.c1 = t2.c1)
+         ->  Sort
+               Output: t1.c1
+               Sort Key: t1.c1
+               ->  Foreign Scan on public.ft5 t1
+                     Output: t1.c1
+                     Remote SQL: SELECT c1 a9 FROM "S 1"."T 4"
+         ->  Sort
+               Output: t2.c1
+               Sort Key: t2.c1
+               ->  Foreign Scan on public.ft6 t2
+                     Output: t2.c1
+                     Remote SQL: SELECT c1 a9 FROM "S 1"."T 4"
+(17 rows)
+
+SELECT t1.c1, t2.c1 FROM ft5 t1 JOIN ft6 t2 ON (t1.c1 = t2.c1) ORDER BY t1.c1, t2.c1 OFFSET 100 LIMIT 10;
+ c1 | c1 
+----+----
+(0 rows)
+
+-- different effective user for permission check
+EXPLAIN (COSTS false, VERBOSE)
+SELECT t1.c1, t2.c1 FROM ft5 t1 JOIN v_ft5 t2 ON (t1.c1 = t2.c1) ORDER BY t1.c1, t2.c1 OFFSET 100 LIMIT 10;
+                          QUERY PLAN                           
+---------------------------------------------------------------
+ Limit
+   Output: t1.c1, ft5.c1
+   ->  Merge Join
+         Output: t1.c1, ft5.c1
+         Merge Cond: (t1.c1 = ft5.c1)
+         ->  Sort
+               Output: t1.c1
+               Sort Key: t1.c1
+               ->  Foreign Scan on public.ft5 t1
+                     Output: t1.c1
+                     Remote SQL: SELECT c1 a9 FROM "S 1"."T 4"
+         ->  Sort
+               Output: ft5.c1
+               Sort Key: ft5.c1
+               ->  Foreign Scan on public.ft5
+                     Output: ft5.c1
+                     Remote SQL: SELECT c1 a9 FROM "S 1"."T 4"
+(17 rows)
+
+SELECT t1.c1, t2.c1 FROM ft5 t1 JOIN v_ft5 t2 ON (t1.c1 = t2.c1) ORDER BY t1.c1, t2.c1 OFFSET 100 LIMIT 10;
+ c1 | c1 
+----+----
+(0 rows)
+
+-- unsafe join conditions
+EXPLAIN (COSTS false, VERBOSE)
+SELECT t1.c1, t2.c1 FROM ft1 t1 JOIN ft2 t2 ON (t1.c8 = t2.c8) ORDER BY t1.c3, t1.c1 OFFSET 100 LIMIT 10;
+                                       QUERY PLAN                                        
+-----------------------------------------------------------------------------------------
+ Limit
+   Output: t1.c1, t2.c1, t1.c3
+   ->  Sort
+         Output: t1.c1, t2.c1, t1.c3
+         Sort Key: t1.c3, t1.c1
+         ->  Merge Join
+               Output: t1.c1, t2.c1, t1.c3
+               Merge Cond: (t1.c8 = t2.c8)
+               ->  Sort
+                     Output: t1.c1, t1.c3, t1.c8
+                     Sort Key: t1.c8
+                     ->  Foreign Scan on public.ft1 t1
+                           Output: t1.c1, t1.c3, t1.c8
+                           Remote SQL: SELECT "C 1" a10, c3 a12, c8 a17 FROM "S 1"."T 1"
+               ->  Sort
+                     Output: t2.c1, t2.c8
+                     Sort Key: t2.c8
+                     ->  Foreign Scan on public.ft2 t2
+                           Output: t2.c1, t2.c8
+                           Remote SQL: SELECT "C 1" a9, c8 a17 FROM "S 1"."T 1"
+(20 rows)
+
+SELECT t1.c1, t2.c1 FROM ft1 t1 JOIN ft2 t2 ON (t1.c8 = t2.c8) ORDER BY t1.c3, t1.c1 OFFSET 100 LIMIT 10;
+ c1 | c1  
+----+-----
+  1 | 102
+  1 | 103
+  1 | 104
+  1 | 105
+  1 | 106
+  1 | 107
+  1 | 108
+  1 | 109
+  1 | 110
+  1 |   1
+(10 rows)
+
+-- local filter (unsafe conditions on one side)
+EXPLAIN (COSTS false, VERBOSE)
+SELECT t1.c1, t2.c1 FROM ft1 t1 JOIN ft2 t2 ON (t1.c1 = t2.c1) WHERE t1.c8 = 'foo' ORDER BY t1.c3, t1.c1 OFFSET 100 LIMIT 10;
+                                       QUERY PLAN                                        
+-----------------------------------------------------------------------------------------
+ Limit
+   Output: t1.c1, t2.c1, t1.c3
+   ->  Sort
+         Output: t1.c1, t2.c1, t1.c3
+         Sort Key: t1.c3, t1.c1
+         ->  Hash Join
+               Output: t1.c1, t2.c1, t1.c3
+               Hash Cond: (t2.c1 = t1.c1)
+               ->  Foreign Scan on public.ft2 t2
+                     Output: t2.c1
+                     Remote SQL: SELECT "C 1" a9 FROM "S 1"."T 1"
+               ->  Hash
+                     Output: t1.c1, t1.c3
+                     ->  Foreign Scan on public.ft1 t1
+                           Output: t1.c1, t1.c3
+                           Filter: (t1.c8 = 'foo'::user_enum)
+                           Remote SQL: SELECT "C 1" a10, c3 a12, c8 a17 FROM "S 1"."T 1"
+(17 rows)
+
+SELECT t1.c1, t2.c1 FROM ft1 t1 JOIN ft2 t2 ON (t1.c1 = t2.c1) WHERE t1.c8 = 'foo' ORDER BY t1.c3, t1.c1 OFFSET 100 LIMIT 10;
+ c1  | c1  
+-----+-----
+ 101 | 101
+ 102 | 102
+ 103 | 103
+ 104 | 104
+ 105 | 105
+ 106 | 106
+ 107 | 107
+ 108 | 108
+ 109 | 109
+ 110 | 110
+(10 rows)
+
+-- ===================================================================
 -- parameterized queries
 -- ===================================================================
 -- simple join
 PREPARE st1(int, int) AS SELECT t1.c3, t2.c3 FROM ft1 t1, ft2 t2 WHERE t1.c1 = $1 AND t2.c1 = $2;
 EXPLAIN (VERBOSE, COSTS false) EXECUTE st1(1, 2);
-                             QUERY PLAN                             
---------------------------------------------------------------------
+                               QUERY PLAN                               
+------------------------------------------------------------------------
  Nested Loop
    Output: t1.c3, t2.c3
    ->  Foreign Scan on public.ft1 t1
          Output: t1.c3
-         Remote SQL: SELECT c3 FROM "S 1"."T 1" WHERE (("C 1" = 1))
+         Remote SQL: SELECT c3 a12 FROM "S 1"."T 1" WHERE (("C 1" = 1))
    ->  Foreign Scan on public.ft2 t2
          Output: t2.c3
-         Remote SQL: SELECT c3 FROM "S 1"."T 1" WHERE (("C 1" = 2))
+         Remote SQL: SELECT c3 a12 FROM "S 1"."T 1" WHERE (("C 1" = 2))
 (8 rows)
 
 EXECUTE st1(1, 1);
@@ -683,8 +1282,8 @@ EXECUTE st1(101, 101);
 -- subquery using stable function (can't be sent to remote)
 PREPARE st2(int) AS SELECT * FROM ft1 t1 WHERE t1.c1 < $2 AND t1.c3 IN (SELECT c3 FROM ft2 t2 WHERE c1 > $1 AND date(c4) = '1970-01-17'::date) ORDER BY c1;
 EXPLAIN (VERBOSE, COSTS false) EXECUTE st2(10, 20);
-                                                QUERY PLAN                                                
-----------------------------------------------------------------------------------------------------------
+                                                                QUERY PLAN                                                                
+------------------------------------------------------------------------------------------------------------------------------------------
  Sort
    Output: t1.c1, t1.c2, t1.c3, t1.c4, t1.c5, t1.c6, t1.c7, t1.c8
    Sort Key: t1.c1
@@ -693,13 +1292,13 @@ EXPLAIN (VERBOSE, COSTS false) EXECUTE st2(10, 20);
          Join Filter: (t1.c3 = t2.c3)
          ->  Foreign Scan on public.ft1 t1
                Output: t1.c1, t1.c2, t1.c3, t1.c4, t1.c5, t1.c6, t1.c7, t1.c8
-               Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" WHERE (("C 1" < 20))
+               Remote SQL: SELECT "C 1" a10, c2 a11, c3 a12, c4 a13, c5 a14, c6 a15, c7 a16, c8 a17 FROM "S 1"."T 1" WHERE (("C 1" < 20))
          ->  Materialize
                Output: t2.c3
                ->  Foreign Scan on public.ft2 t2
                      Output: t2.c3
                      Filter: (date(t2.c4) = '01-17-1970'::date)
-                     Remote SQL: SELECT c3, c4 FROM "S 1"."T 1" WHERE (("C 1" > 10))
+                     Remote SQL: SELECT c3 a12, c4 a13 FROM "S 1"."T 1" WHERE (("C 1" > 10))
 (15 rows)
 
 EXECUTE st2(10, 20);
@@ -717,8 +1316,8 @@ EXECUTE st2(101, 121);
 -- subquery using immutable function (can be sent to remote)
 PREPARE st3(int) AS SELECT * FROM ft1 t1 WHERE t1.c1 < $2 AND t1.c3 IN (SELECT c3 FROM ft2 t2 WHERE c1 > $1 AND date(c5) = '1970-01-17'::date) ORDER BY c1;
 EXPLAIN (VERBOSE, COSTS false) EXECUTE st3(10, 20);
-                                                      QUERY PLAN                                                       
------------------------------------------------------------------------------------------------------------------------
+                                                                QUERY PLAN                                                                
+------------------------------------------------------------------------------------------------------------------------------------------
  Sort
    Output: t1.c1, t1.c2, t1.c3, t1.c4, t1.c5, t1.c6, t1.c7, t1.c8
    Sort Key: t1.c1
@@ -727,12 +1326,12 @@ EXPLAIN (VERBOSE, COSTS false) EXECUTE st3(10, 20);
          Join Filter: (t1.c3 = t2.c3)
          ->  Foreign Scan on public.ft1 t1
                Output: t1.c1, t1.c2, t1.c3, t1.c4, t1.c5, t1.c6, t1.c7, t1.c8
-               Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" WHERE (("C 1" < 20))
+               Remote SQL: SELECT "C 1" a10, c2 a11, c3 a12, c4 a13, c5 a14, c6 a15, c7 a16, c8 a17 FROM "S 1"."T 1" WHERE (("C 1" < 20))
          ->  Materialize
                Output: t2.c3
                ->  Foreign Scan on public.ft2 t2
                      Output: t2.c3
-                     Remote SQL: SELECT c3 FROM "S 1"."T 1" WHERE (("C 1" > 10)) AND ((date(c5) = '1970-01-17'::date))
+                     Remote SQL: SELECT c3 a12 FROM "S 1"."T 1" WHERE (("C 1" > 10)) AND ((date(c5) = '1970-01-17'::date))
 (14 rows)
 
 EXECUTE st3(10, 20);
@@ -749,108 +1348,108 @@ EXECUTE st3(20, 30);
 -- custom plan should be chosen initially
 PREPARE st4(int) AS SELECT * FROM ft1 t1 WHERE t1.c1 = $1;
 EXPLAIN (VERBOSE, COSTS false) EXECUTE st4(1);
-                                         QUERY PLAN                                          
----------------------------------------------------------------------------------------------
+                                                         QUERY PLAN                                                          
+-----------------------------------------------------------------------------------------------------------------------------
  Foreign Scan on public.ft1 t1
    Output: c1, c2, c3, c4, c5, c6, c7, c8
-   Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" WHERE (("C 1" = 1))
+   Remote SQL: SELECT "C 1" a10, c2 a11, c3 a12, c4 a13, c5 a14, c6 a15, c7 a16, c8 a17 FROM "S 1"."T 1" WHERE (("C 1" = 1))
 (3 rows)
 
 EXPLAIN (VERBOSE, COSTS false) EXECUTE st4(1);
-                                         QUERY PLAN                                          
----------------------------------------------------------------------------------------------
+                                                         QUERY PLAN                                                          
+-----------------------------------------------------------------------------------------------------------------------------
  Foreign Scan on public.ft1 t1
    Output: c1, c2, c3, c4, c5, c6, c7, c8
-   Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" WHERE (("C 1" = 1))
+   Remote SQL: SELECT "C 1" a10, c2 a11, c3 a12, c4 a13, c5 a14, c6 a15, c7 a16, c8 a17 FROM "S 1"."T 1" WHERE (("C 1" = 1))
 (3 rows)
 
 EXPLAIN (VERBOSE, COSTS false) EXECUTE st4(1);
-                                         QUERY PLAN                                          
----------------------------------------------------------------------------------------------
+                                                         QUERY PLAN                                                          
+-----------------------------------------------------------------------------------------------------------------------------
  Foreign Scan on public.ft1 t1
    Output: c1, c2, c3, c4, c5, c6, c7, c8
-   Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" WHERE (("C 1" = 1))
+   Remote SQL: SELECT "C 1" a10, c2 a11, c3 a12, c4 a13, c5 a14, c6 a15, c7 a16, c8 a17 FROM "S 1"."T 1" WHERE (("C 1" = 1))
 (3 rows)
 
 EXPLAIN (VERBOSE, COSTS false) EXECUTE st4(1);
-                                         QUERY PLAN                                          
----------------------------------------------------------------------------------------------
+                                                         QUERY PLAN                                                          
+-----------------------------------------------------------------------------------------------------------------------------
  Foreign Scan on public.ft1 t1
    Output: c1, c2, c3, c4, c5, c6, c7, c8
-   Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" WHERE (("C 1" = 1))
+   Remote SQL: SELECT "C 1" a10, c2 a11, c3 a12, c4 a13, c5 a14, c6 a15, c7 a16, c8 a17 FROM "S 1"."T 1" WHERE (("C 1" = 1))
 (3 rows)
 
 EXPLAIN (VERBOSE, COSTS false) EXECUTE st4(1);
-                                         QUERY PLAN                                          
----------------------------------------------------------------------------------------------
+                                                         QUERY PLAN                                                          
+-----------------------------------------------------------------------------------------------------------------------------
  Foreign Scan on public.ft1 t1
    Output: c1, c2, c3, c4, c5, c6, c7, c8
-   Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" WHERE (("C 1" = 1))
+   Remote SQL: SELECT "C 1" a10, c2 a11, c3 a12, c4 a13, c5 a14, c6 a15, c7 a16, c8 a17 FROM "S 1"."T 1" WHERE (("C 1" = 1))
 (3 rows)
 
 -- once we try it enough times, should switch to generic plan
 EXPLAIN (VERBOSE, COSTS false) EXECUTE st4(1);
-                                              QUERY PLAN                                               
--------------------------------------------------------------------------------------------------------
+                                                              QUERY PLAN                                                               
+---------------------------------------------------------------------------------------------------------------------------------------
  Foreign Scan on public.ft1 t1
    Output: c1, c2, c3, c4, c5, c6, c7, c8
-   Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" WHERE (("C 1" = $1::integer))
+   Remote SQL: SELECT "C 1" a10, c2 a11, c3 a12, c4 a13, c5 a14, c6 a15, c7 a16, c8 a17 FROM "S 1"."T 1" WHERE (("C 1" = $1::integer))
 (3 rows)
 
 -- value of $1 should not be sent to remote
 PREPARE st5(user_enum,int) AS SELECT * FROM ft1 t1 WHERE c8 = $1 and c1 = $2;
 EXPLAIN (VERBOSE, COSTS false) EXECUTE st5('foo', 1);
-                                         QUERY PLAN                                          
----------------------------------------------------------------------------------------------
+                                                         QUERY PLAN                                                          
+-----------------------------------------------------------------------------------------------------------------------------
  Foreign Scan on public.ft1 t1
    Output: c1, c2, c3, c4, c5, c6, c7, c8
    Filter: (t1.c8 = 'foo'::user_enum)
-   Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" WHERE (("C 1" = 1))
+   Remote SQL: SELECT "C 1" a10, c2 a11, c3 a12, c4 a13, c5 a14, c6 a15, c7 a16, c8 a17 FROM "S 1"."T 1" WHERE (("C 1" = 1))
 (4 rows)
 
 EXPLAIN (VERBOSE, COSTS false) EXECUTE st5('foo', 1);
-                                         QUERY PLAN                                          
----------------------------------------------------------------------------------------------
+                                                         QUERY PLAN                                                          
+-----------------------------------------------------------------------------------------------------------------------------
  Foreign Scan on public.ft1 t1
    Output: c1, c2, c3, c4, c5, c6, c7, c8
    Filter: (t1.c8 = 'foo'::user_enum)
-   Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" WHERE (("C 1" = 1))
+   Remote SQL: SELECT "C 1" a10, c2 a11, c3 a12, c4 a13, c5 a14, c6 a15, c7 a16, c8 a17 FROM "S 1"."T 1" WHERE (("C 1" = 1))
 (4 rows)
 
 EXPLAIN (VERBOSE, COSTS false) EXECUTE st5('foo', 1);
-                                         QUERY PLAN                                          
----------------------------------------------------------------------------------------------
+                                                         QUERY PLAN                                                          
+-----------------------------------------------------------------------------------------------------------------------------
  Foreign Scan on public.ft1 t1
    Output: c1, c2, c3, c4, c5, c6, c7, c8
    Filter: (t1.c8 = 'foo'::user_enum)
-   Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" WHERE (("C 1" = 1))
+   Remote SQL: SELECT "C 1" a10, c2 a11, c3 a12, c4 a13, c5 a14, c6 a15, c7 a16, c8 a17 FROM "S 1"."T 1" WHERE (("C 1" = 1))
 (4 rows)
 
 EXPLAIN (VERBOSE, COSTS false) EXECUTE st5('foo', 1);
-                                         QUERY PLAN                                          
----------------------------------------------------------------------------------------------
+                                                         QUERY PLAN                                                          
+-----------------------------------------------------------------------------------------------------------------------------
  Foreign Scan on public.ft1 t1
    Output: c1, c2, c3, c4, c5, c6, c7, c8
    Filter: (t1.c8 = 'foo'::user_enum)
-   Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" WHERE (("C 1" = 1))
+   Remote SQL: SELECT "C 1" a10, c2 a11, c3 a12, c4 a13, c5 a14, c6 a15, c7 a16, c8 a17 FROM "S 1"."T 1" WHERE (("C 1" = 1))
 (4 rows)
 
 EXPLAIN (VERBOSE, COSTS false) EXECUTE st5('foo', 1);
-                                         QUERY PLAN                                          
----------------------------------------------------------------------------------------------
+                                                         QUERY PLAN                                                          
+-----------------------------------------------------------------------------------------------------------------------------
  Foreign Scan on public.ft1 t1
    Output: c1, c2, c3, c4, c5, c6, c7, c8
    Filter: (t1.c8 = 'foo'::user_enum)
-   Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" WHERE (("C 1" = 1))
+   Remote SQL: SELECT "C 1" a10, c2 a11, c3 a12, c4 a13, c5 a14, c6 a15, c7 a16, c8 a17 FROM "S 1"."T 1" WHERE (("C 1" = 1))
 (4 rows)
 
 EXPLAIN (VERBOSE, COSTS false) EXECUTE st5('foo', 1);
-                                              QUERY PLAN                                               
--------------------------------------------------------------------------------------------------------
+                                                              QUERY PLAN                                                               
+---------------------------------------------------------------------------------------------------------------------------------------
  Foreign Scan on public.ft1 t1
    Output: c1, c2, c3, c4, c5, c6, c7, c8
    Filter: (t1.c8 = $1)
-   Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" WHERE (("C 1" = $1::integer))
+   Remote SQL: SELECT "C 1" a10, c2 a11, c3 a12, c4 a13, c5 a14, c6 a15, c7 a16, c8 a17 FROM "S 1"."T 1" WHERE (("C 1" = $1::integer))
 (4 rows)
 
 EXECUTE st5('foo', 1);
@@ -868,14 +1467,14 @@ DEALLOCATE st5;
 -- System columns, except ctid, should not be sent to remote
 EXPLAIN (VERBOSE, COSTS false)
 SELECT * FROM ft1 t1 WHERE t1.tableoid = 'pg_class'::regclass LIMIT 1;
-                                  QUERY PLAN                                   
--------------------------------------------------------------------------------
+                                                  QUERY PLAN                                                   
+---------------------------------------------------------------------------------------------------------------
  Limit
    Output: c1, c2, c3, c4, c5, c6, c7, c8
    ->  Foreign Scan on public.ft1 t1
          Output: c1, c2, c3, c4, c5, c6, c7, c8
          Filter: (t1.tableoid = '1259'::oid)
-         Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1"
+         Remote SQL: SELECT "C 1" a10, c2 a11, c3 a12, c4 a13, c5 a14, c6 a15, c7 a16, c8 a17 FROM "S 1"."T 1"
 (6 rows)
 
 SELECT * FROM ft1 t1 WHERE t1.tableoid = 'ft1'::regclass LIMIT 1;
@@ -886,13 +1485,13 @@ SELECT * FROM ft1 t1 WHERE t1.tableoid = 'ft1'::regclass LIMIT 1;
 
 EXPLAIN (VERBOSE, COSTS false)
 SELECT tableoid::regclass, * FROM ft1 t1 LIMIT 1;
-                                  QUERY PLAN                                   
--------------------------------------------------------------------------------
+                                                  QUERY PLAN                                                   
+---------------------------------------------------------------------------------------------------------------
  Limit
    Output: ((tableoid)::regclass), c1, c2, c3, c4, c5, c6, c7, c8
    ->  Foreign Scan on public.ft1 t1
          Output: (tableoid)::regclass, c1, c2, c3, c4, c5, c6, c7, c8
-         Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1"
+         Remote SQL: SELECT "C 1" a10, c2 a11, c3 a12, c4 a13, c5 a14, c6 a15, c7 a16, c8 a17 FROM "S 1"."T 1"
 (5 rows)
 
 SELECT tableoid::regclass, * FROM ft1 t1 LIMIT 1;
@@ -903,11 +1502,11 @@ SELECT tableoid::regclass, * FROM ft1 t1 LIMIT 1;
 
 EXPLAIN (VERBOSE, COSTS false)
 SELECT * FROM ft1 t1 WHERE t1.ctid = '(0,2)';
-                                              QUERY PLAN                                               
--------------------------------------------------------------------------------------------------------
+                                                              QUERY PLAN                                                               
+---------------------------------------------------------------------------------------------------------------------------------------
  Foreign Scan on public.ft1 t1
    Output: c1, c2, c3, c4, c5, c6, c7, c8
-   Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" WHERE ((ctid = '(0,2)'::tid))
+   Remote SQL: SELECT "C 1" a10, c2 a11, c3 a12, c4 a13, c5 a14, c6 a15, c7 a16, c8 a17 FROM "S 1"."T 1" WHERE ((ctid = '(0,2)'::tid))
 (3 rows)
 
 SELECT * FROM ft1 t1 WHERE t1.ctid = '(0,2)';
@@ -918,13 +1517,13 @@ SELECT * FROM ft1 t1 WHERE t1.ctid = '(0,2)';
 
 EXPLAIN (VERBOSE, COSTS false)
 SELECT ctid, * FROM ft1 t1 LIMIT 1;
-                                     QUERY PLAN                                      
--------------------------------------------------------------------------------------
+                                                       QUERY PLAN                                                       
+------------------------------------------------------------------------------------------------------------------------
  Limit
    Output: ctid, c1, c2, c3, c4, c5, c6, c7, c8
    ->  Foreign Scan on public.ft1 t1
          Output: ctid, c1, c2, c3, c4, c5, c6, c7, c8
-         Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8, ctid FROM "S 1"."T 1"
+         Remote SQL: SELECT "C 1" a10, c2 a11, c3 a12, c4 a13, c5 a14, c6 a15, c7 a16, c8 a17, ctid a7 FROM "S 1"."T 1"
 (5 rows)
 
 SELECT ctid, * FROM ft1 t1 LIMIT 1;
@@ -987,7 +1586,7 @@ FETCH c;
 SAVEPOINT s;
 SELECT * FROM ft1 WHERE 1 / (c1 - 1) > 0;  -- ERROR
 ERROR:  division by zero
-CONTEXT:  Remote SQL command: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" WHERE (((1 / ("C 1" - 1)) > 0))
+CONTEXT:  Remote SQL command: SELECT "C 1" a10, c2 a11, c3 a12, c4 a13, c5 a14, c6 a15, c7 a16, c8 a17 FROM "S 1"."T 1" WHERE (((1 / ("C 1" - 1)) > 0))
 ROLLBACK TO s;
 FETCH c;
  c1 | c2 |  c3   |              c4              |            c5            | c6 |     c7     | c8  
@@ -1010,64 +1609,64 @@ create foreign table ft3 (f1 text collate "C", f2 text)
   server loopback options (table_name 'loct3');
 -- can be sent to remote
 explain (verbose, costs off) select * from ft3 where f1 = 'foo';
-                                QUERY PLAN                                
---------------------------------------------------------------------------
+                                   QUERY PLAN                                    
+---------------------------------------------------------------------------------
  Foreign Scan on public.ft3
    Output: f1, f2
-   Remote SQL: SELECT f1, f2 FROM public.loct3 WHERE ((f1 = 'foo'::text))
+   Remote SQL: SELECT f1 a9, f2 a10 FROM public.loct3 WHERE ((f1 = 'foo'::text))
 (3 rows)
 
 explain (verbose, costs off) select * from ft3 where f1 COLLATE "C" = 'foo';
-                                QUERY PLAN                                
---------------------------------------------------------------------------
+                                   QUERY PLAN                                    
+---------------------------------------------------------------------------------
  Foreign Scan on public.ft3
    Output: f1, f2
-   Remote SQL: SELECT f1, f2 FROM public.loct3 WHERE ((f1 = 'foo'::text))
+   Remote SQL: SELECT f1 a9, f2 a10 FROM public.loct3 WHERE ((f1 = 'foo'::text))
 (3 rows)
 
 explain (verbose, costs off) select * from ft3 where f2 = 'foo';
-                                QUERY PLAN                                
---------------------------------------------------------------------------
+                                   QUERY PLAN                                    
+---------------------------------------------------------------------------------
  Foreign Scan on public.ft3
    Output: f1, f2
-   Remote SQL: SELECT f1, f2 FROM public.loct3 WHERE ((f2 = 'foo'::text))
+   Remote SQL: SELECT f1 a9, f2 a10 FROM public.loct3 WHERE ((f2 = 'foo'::text))
 (3 rows)
 
 -- can't be sent to remote
 explain (verbose, costs off) select * from ft3 where f1 COLLATE "POSIX" = 'foo';
-                  QUERY PLAN                   
------------------------------------------------
+                      QUERY PLAN                      
+------------------------------------------------------
  Foreign Scan on public.ft3
    Output: f1, f2
    Filter: ((ft3.f1)::text = 'foo'::text)
-   Remote SQL: SELECT f1, f2 FROM public.loct3
+   Remote SQL: SELECT f1 a9, f2 a10 FROM public.loct3
 (4 rows)
 
 explain (verbose, costs off) select * from ft3 where f1 = 'foo' COLLATE "C";
-                  QUERY PLAN                   
------------------------------------------------
+                      QUERY PLAN                      
+------------------------------------------------------
  Foreign Scan on public.ft3
    Output: f1, f2
    Filter: (ft3.f1 = 'foo'::text COLLATE "C")
-   Remote SQL: SELECT f1, f2 FROM public.loct3
+   Remote SQL: SELECT f1 a9, f2 a10 FROM public.loct3
 (4 rows)
 
 explain (verbose, costs off) select * from ft3 where f2 COLLATE "C" = 'foo';
-                  QUERY PLAN                   
------------------------------------------------
+                      QUERY PLAN                      
+------------------------------------------------------
  Foreign Scan on public.ft3
    Output: f1, f2
    Filter: ((ft3.f2)::text = 'foo'::text)
-   Remote SQL: SELECT f1, f2 FROM public.loct3
+   Remote SQL: SELECT f1 a9, f2 a10 FROM public.loct3
 (4 rows)
 
 explain (verbose, costs off) select * from ft3 where f2 = 'foo' COLLATE "C";
-                  QUERY PLAN                   
------------------------------------------------
+                      QUERY PLAN                      
+------------------------------------------------------
  Foreign Scan on public.ft3
    Output: f1, f2
    Filter: (ft3.f2 = 'foo'::text COLLATE "C")
-   Remote SQL: SELECT f1, f2 FROM public.loct3
+   Remote SQL: SELECT f1 a9, f2 a10 FROM public.loct3
 (4 rows)
 
 -- ===================================================================
@@ -1085,7 +1684,7 @@ INSERT INTO ft2 (c1,c2,c3) SELECT c1+1000,c2+100, c3 || c3 FROM ft2 LIMIT 20;
                Output: ((ft2_1.c1 + 1000)), ((ft2_1.c2 + 100)), ((ft2_1.c3 || ft2_1.c3))
                ->  Foreign Scan on public.ft2 ft2_1
                      Output: (ft2_1.c1 + 1000), (ft2_1.c2 + 100), (ft2_1.c3 || ft2_1.c3)
-                     Remote SQL: SELECT "C 1", c2, c3 FROM "S 1"."T 1"
+                     Remote SQL: SELECT "C 1" a9, c2 a10, c3 a12 FROM "S 1"."T 1"
 (9 rows)
 
 INSERT INTO ft2 (c1,c2,c3) SELECT c1+1000,c2+100, c3 || c3 FROM ft2 LIMIT 20;
@@ -1210,35 +1809,27 @@ UPDATE ft2 SET c2 = c2 + 400, c3 = c3 || '_update7' WHERE c1 % 10 = 7 RETURNING
 EXPLAIN (verbose, costs off)
 UPDATE ft2 SET c2 = ft2.c2 + 500, c3 = ft2.c3 || '_update9', c7 = DEFAULT
   FROM ft1 WHERE ft1.c1 = ft2.c2 AND ft1.c1 % 10 = 9;
-                                                                            QUERY PLAN                                                                             
--------------------------------------------------------------------------------------------------------------------------------------------------------------------
+                                                                                                                                                                                                                                                                       QUERY PLAN                                                                                                                                                                                                                                                                       
+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
  Update on public.ft2
    Remote SQL: UPDATE "S 1"."T 1" SET c2 = $2, c3 = $3, c7 = $4 WHERE ctid = $1
-   ->  Hash Join
+   ->  Foreign Scan
          Output: ft2.c1, (ft2.c2 + 500), NULL::integer, (ft2.c3 || '_update9'::text), ft2.c4, ft2.c5, ft2.c6, 'ft2       '::character(10), ft2.c8, ft2.ctid, ft1.*
-         Hash Cond: (ft2.c2 = ft1.c1)
-         ->  Foreign Scan on public.ft2
-               Output: ft2.c1, ft2.c2, ft2.c3, ft2.c4, ft2.c5, ft2.c6, ft2.c8, ft2.ctid
-               Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c8, ctid FROM "S 1"."T 1" FOR UPDATE
-         ->  Hash
-               Output: ft1.*, ft1.c1
-               ->  Foreign Scan on public.ft1
-                     Output: ft1.*, ft1.c1
-                     Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" WHERE ((("C 1" % 10) = 9))
-(13 rows)
+         Remote SQL: SELECT l.a1, l.a2, l.a3, l.a4, l.a5, l.a6, l.a7, l.a8, r.a1 FROM (SELECT l.a9, l.a10, l.a12, l.a13, l.a14, l.a15, l.a17, l.a7 FROM (SELECT "C 1" a9, c2 a10, c3 a12, c4 a13, c5 a14, c6 a15, c8 a17, ctid a7 FROM "S 1"."T 1" FOR UPDATE) l) l (a1, a2, a3, a4, a5, a6, a7, a8) INNER JOIN (SELECT ROW(r.a10, r.a11, r.a12, r.a13, r.a14, r.a15, r.a16, r.a17), r.a10 FROM (SELECT "C 1" a10, c2 a11, c3 a12, c4 a13, c5 a14, c6 a15, c7 a16, c8 a17 FROM "S 1"."T 1" WHERE ((("C 1" % 10) = 9))) r) r (a1, a2) ON ((l.a2 = r.a2))
+(5 rows)
 
 UPDATE ft2 SET c2 = ft2.c2 + 500, c3 = ft2.c3 || '_update9', c7 = DEFAULT
   FROM ft1 WHERE ft1.c1 = ft2.c2 AND ft1.c1 % 10 = 9;
 EXPLAIN (verbose, costs off)
   DELETE FROM ft2 WHERE c1 % 10 = 5 RETURNING c1, c4;
-                                       QUERY PLAN                                       
-----------------------------------------------------------------------------------------
+                                        QUERY PLAN                                         
+-------------------------------------------------------------------------------------------
  Delete on public.ft2
    Output: c1, c4
-   Remote SQL: DELETE FROM "S 1"."T 1" WHERE ctid = $1 RETURNING "C 1", c4
+   Remote SQL: DELETE FROM "S 1"."T 1" WHERE ctid = $1 RETURNING "C 1" a9, c4 a13
    ->  Foreign Scan on public.ft2
          Output: ctid
-         Remote SQL: SELECT ctid FROM "S 1"."T 1" WHERE ((("C 1" % 10) = 5)) FOR UPDATE
+         Remote SQL: SELECT ctid a7 FROM "S 1"."T 1" WHERE ((("C 1" % 10) = 5)) FOR UPDATE
 (6 rows)
 
 DELETE FROM ft2 WHERE c1 % 10 = 5 RETURNING c1, c4;
@@ -1351,22 +1942,14 @@ DELETE FROM ft2 WHERE c1 % 10 = 5 RETURNING c1, c4;
 
 EXPLAIN (verbose, costs off)
 DELETE FROM ft2 USING ft1 WHERE ft1.c1 = ft2.c2 AND ft1.c1 % 10 = 2;
-                                                      QUERY PLAN                                                      
-----------------------------------------------------------------------------------------------------------------------
+                                                                                                                                                                                        QUERY PLAN                                                                                                                                                                                         
+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
  Delete on public.ft2
    Remote SQL: DELETE FROM "S 1"."T 1" WHERE ctid = $1
-   ->  Hash Join
+   ->  Foreign Scan
          Output: ft2.ctid, ft1.*
-         Hash Cond: (ft2.c2 = ft1.c1)
-         ->  Foreign Scan on public.ft2
-               Output: ft2.ctid, ft2.c2
-               Remote SQL: SELECT c2, ctid FROM "S 1"."T 1" FOR UPDATE
-         ->  Hash
-               Output: ft1.*, ft1.c1
-               ->  Foreign Scan on public.ft1
-                     Output: ft1.*, ft1.c1
-                     Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" WHERE ((("C 1" % 10) = 2))
-(13 rows)
+         Remote SQL: SELECT l.a1, r.a1 FROM (SELECT l.a7, l.a10 FROM (SELECT c2 a10, ctid a7 FROM "S 1"."T 1" FOR UPDATE) l) l (a1, a2) INNER JOIN (SELECT ROW(r.a10, r.a11, r.a12, r.a13, r.a14, r.a15, r.a16, r.a17), r.a10 FROM (SELECT "C 1" a10, c2 a11, c3 a12, c4 a13, c5 a14, c6 a15, c7 a16, c8 a17 FROM "S 1"."T 1" WHERE ((("C 1" % 10) = 2))) r) r (a1, a2) ON ((l.a2 = r.a2))
+(5 rows)
 
 DELETE FROM ft2 USING ft1 WHERE ft1.c1 = ft2.c2 AND ft1.c1 % 10 = 2;
 SELECT c1,c2,c3,c4 FROM ft2 ORDER BY c1;
@@ -3027,386 +3610,6 @@ NOTICE:  NEW: (13,"test triggered !")
 (1 row)
 
 -- ===================================================================
--- test inheritance features
--- ===================================================================
-CREATE TABLE a (aa TEXT);
-CREATE TABLE loct (aa TEXT, bb TEXT);
-CREATE FOREIGN TABLE b (bb TEXT) INHERITS (a)
-  SERVER loopback OPTIONS (table_name 'loct');
-INSERT INTO a(aa) VALUES('aaa');
-INSERT INTO a(aa) VALUES('aaaa');
-INSERT INTO a(aa) VALUES('aaaaa');
-INSERT INTO b(aa) VALUES('bbb');
-INSERT INTO b(aa) VALUES('bbbb');
-INSERT INTO b(aa) VALUES('bbbbb');
-SELECT tableoid::regclass, * FROM a;
- tableoid |  aa   
-----------+-------
- a        | aaa
- a        | aaaa
- a        | aaaaa
- b        | bbb
- b        | bbbb
- b        | bbbbb
-(6 rows)
-
-SELECT tableoid::regclass, * FROM b;
- tableoid |  aa   | bb 
-----------+-------+----
- b        | bbb   | 
- b        | bbbb  | 
- b        | bbbbb | 
-(3 rows)
-
-SELECT tableoid::regclass, * FROM ONLY a;
- tableoid |  aa   
-----------+-------
- a        | aaa
- a        | aaaa
- a        | aaaaa
-(3 rows)
-
-UPDATE a SET aa = 'zzzzzz' WHERE aa LIKE 'aaaa%';
-SELECT tableoid::regclass, * FROM a;
- tableoid |   aa   
-----------+--------
- a        | aaa
- a        | zzzzzz
- a        | zzzzzz
- b        | bbb
- b        | bbbb
- b        | bbbbb
-(6 rows)
-
-SELECT tableoid::regclass, * FROM b;
- tableoid |  aa   | bb 
-----------+-------+----
- b        | bbb   | 
- b        | bbbb  | 
- b        | bbbbb | 
-(3 rows)
-
-SELECT tableoid::regclass, * FROM ONLY a;
- tableoid |   aa   
-----------+--------
- a        | aaa
- a        | zzzzzz
- a        | zzzzzz
-(3 rows)
-
-UPDATE b SET aa = 'new';
-SELECT tableoid::regclass, * FROM a;
- tableoid |   aa   
-----------+--------
- a        | aaa
- a        | zzzzzz
- a        | zzzzzz
- b        | new
- b        | new
- b        | new
-(6 rows)
-
-SELECT tableoid::regclass, * FROM b;
- tableoid | aa  | bb 
-----------+-----+----
- b        | new | 
- b        | new | 
- b        | new | 
-(3 rows)
-
-SELECT tableoid::regclass, * FROM ONLY a;
- tableoid |   aa   
-----------+--------
- a        | aaa
- a        | zzzzzz
- a        | zzzzzz
-(3 rows)
-
-UPDATE a SET aa = 'newtoo';
-SELECT tableoid::regclass, * FROM a;
- tableoid |   aa   
-----------+--------
- a        | newtoo
- a        | newtoo
- a        | newtoo
- b        | newtoo
- b        | newtoo
- b        | newtoo
-(6 rows)
-
-SELECT tableoid::regclass, * FROM b;
- tableoid |   aa   | bb 
-----------+--------+----
- b        | newtoo | 
- b        | newtoo | 
- b        | newtoo | 
-(3 rows)
-
-SELECT tableoid::regclass, * FROM ONLY a;
- tableoid |   aa   
-----------+--------
- a        | newtoo
- a        | newtoo
- a        | newtoo
-(3 rows)
-
-DELETE FROM a;
-SELECT tableoid::regclass, * FROM a;
- tableoid | aa 
-----------+----
-(0 rows)
-
-SELECT tableoid::regclass, * FROM b;
- tableoid | aa | bb 
-----------+----+----
-(0 rows)
-
-SELECT tableoid::regclass, * FROM ONLY a;
- tableoid | aa 
-----------+----
-(0 rows)
-
-DROP TABLE a CASCADE;
-NOTICE:  drop cascades to foreign table b
-DROP TABLE loct;
--- Check SELECT FOR UPDATE/SHARE with an inherited source table
-create table loct1 (f1 int, f2 int, f3 int);
-create table loct2 (f1 int, f2 int, f3 int);
-create table foo (f1 int, f2 int);
-create foreign table foo2 (f3 int) inherits (foo)
-  server loopback options (table_name 'loct1');
-create table bar (f1 int, f2 int);
-create foreign table bar2 (f3 int) inherits (bar)
-  server loopback options (table_name 'loct2');
-insert into foo values(1,1);
-insert into foo values(3,3);
-insert into foo2 values(2,2,2);
-insert into foo2 values(4,4,4);
-insert into bar values(1,11);
-insert into bar values(2,22);
-insert into bar values(6,66);
-insert into bar2 values(3,33,33);
-insert into bar2 values(4,44,44);
-insert into bar2 values(7,77,77);
-explain (verbose, costs off)
-select * from bar where f1 in (select f1 from foo) for update;
-                                          QUERY PLAN                                          
-----------------------------------------------------------------------------------------------
- LockRows
-   Output: bar.f1, bar.f2, bar.ctid, bar.tableoid, bar.*, foo.ctid, foo.tableoid, foo.*
-   ->  Hash Join
-         Output: bar.f1, bar.f2, bar.ctid, bar.tableoid, bar.*, foo.ctid, foo.tableoid, foo.*
-         Hash Cond: (bar.f1 = foo.f1)
-         ->  Append
-               ->  Seq Scan on public.bar
-                     Output: bar.f1, bar.f2, bar.ctid, bar.tableoid, bar.*
-               ->  Foreign Scan on public.bar2
-                     Output: bar2.f1, bar2.f2, bar2.ctid, bar2.tableoid, bar2.*
-                     Remote SQL: SELECT f1, f2, f3, ctid FROM public.loct2 FOR UPDATE
-         ->  Hash
-               Output: foo.ctid, foo.tableoid, foo.*, foo.f1
-               ->  HashAggregate
-                     Output: foo.ctid, foo.tableoid, foo.*, foo.f1
-                     Group Key: foo.f1
-                     ->  Append
-                           ->  Seq Scan on public.foo
-                                 Output: foo.ctid, foo.tableoid, foo.*, foo.f1
-                           ->  Foreign Scan on public.foo2
-                                 Output: foo2.ctid, foo2.tableoid, foo2.*, foo2.f1
-                                 Remote SQL: SELECT f1, f2, f3, ctid FROM public.loct1
-(22 rows)
-
-select * from bar where f1 in (select f1 from foo) for update;
- f1 | f2 
-----+----
-  1 | 11
-  2 | 22
-  3 | 33
-  4 | 44
-(4 rows)
-
-explain (verbose, costs off)
-select * from bar where f1 in (select f1 from foo) for share;
-                                          QUERY PLAN                                          
-----------------------------------------------------------------------------------------------
- LockRows
-   Output: bar.f1, bar.f2, bar.ctid, bar.tableoid, bar.*, foo.ctid, foo.tableoid, foo.*
-   ->  Hash Join
-         Output: bar.f1, bar.f2, bar.ctid, bar.tableoid, bar.*, foo.ctid, foo.tableoid, foo.*
-         Hash Cond: (bar.f1 = foo.f1)
-         ->  Append
-               ->  Seq Scan on public.bar
-                     Output: bar.f1, bar.f2, bar.ctid, bar.tableoid, bar.*
-               ->  Foreign Scan on public.bar2
-                     Output: bar2.f1, bar2.f2, bar2.ctid, bar2.tableoid, bar2.*
-                     Remote SQL: SELECT f1, f2, f3, ctid FROM public.loct2 FOR SHARE
-         ->  Hash
-               Output: foo.ctid, foo.tableoid, foo.*, foo.f1
-               ->  HashAggregate
-                     Output: foo.ctid, foo.tableoid, foo.*, foo.f1
-                     Group Key: foo.f1
-                     ->  Append
-                           ->  Seq Scan on public.foo
-                                 Output: foo.ctid, foo.tableoid, foo.*, foo.f1
-                           ->  Foreign Scan on public.foo2
-                                 Output: foo2.ctid, foo2.tableoid, foo2.*, foo2.f1
-                                 Remote SQL: SELECT f1, f2, f3, ctid FROM public.loct1
-(22 rows)
-
-select * from bar where f1 in (select f1 from foo) for share;
- f1 | f2 
-----+----
-  1 | 11
-  2 | 22
-  3 | 33
-  4 | 44
-(4 rows)
-
--- Check UPDATE with inherited target and an inherited source table
-explain (verbose, costs off)
-update bar set f2 = f2 + 100 where f1 in (select f1 from foo);
-                                         QUERY PLAN                                          
----------------------------------------------------------------------------------------------
- Update on public.bar
-   Update on public.bar
-   Foreign Update on public.bar2
-     Remote SQL: UPDATE public.loct2 SET f2 = $2 WHERE ctid = $1
-   ->  Hash Join
-         Output: bar.f1, (bar.f2 + 100), bar.ctid, foo.ctid, foo.tableoid, foo.*
-         Hash Cond: (bar.f1 = foo.f1)
-         ->  Seq Scan on public.bar
-               Output: bar.f1, bar.f2, bar.ctid
-         ->  Hash
-               Output: foo.ctid, foo.tableoid, foo.*, foo.f1
-               ->  HashAggregate
-                     Output: foo.ctid, foo.tableoid, foo.*, foo.f1
-                     Group Key: foo.f1
-                     ->  Append
-                           ->  Seq Scan on public.foo
-                                 Output: foo.ctid, foo.tableoid, foo.*, foo.f1
-                           ->  Foreign Scan on public.foo2
-                                 Output: foo2.ctid, foo2.tableoid, foo2.*, foo2.f1
-                                 Remote SQL: SELECT f1, f2, f3, ctid FROM public.loct1
-   ->  Hash Join
-         Output: bar2.f1, (bar2.f2 + 100), bar2.f3, bar2.ctid, foo.ctid, foo.tableoid, foo.*
-         Hash Cond: (bar2.f1 = foo.f1)
-         ->  Foreign Scan on public.bar2
-               Output: bar2.f1, bar2.f2, bar2.f3, bar2.ctid
-               Remote SQL: SELECT f1, f2, f3, ctid FROM public.loct2 FOR UPDATE
-         ->  Hash
-               Output: foo.ctid, foo.tableoid, foo.*, foo.f1
-               ->  HashAggregate
-                     Output: foo.ctid, foo.tableoid, foo.*, foo.f1
-                     Group Key: foo.f1
-                     ->  Append
-                           ->  Seq Scan on public.foo
-                                 Output: foo.ctid, foo.tableoid, foo.*, foo.f1
-                           ->  Foreign Scan on public.foo2
-                                 Output: foo2.ctid, foo2.tableoid, foo2.*, foo2.f1
-                                 Remote SQL: SELECT f1, f2, f3, ctid FROM public.loct1
-(37 rows)
-
-update bar set f2 = f2 + 100 where f1 in (select f1 from foo);
-select tableoid::regclass, * from bar order by 1,2;
- tableoid | f1 | f2  
-----------+----+-----
- bar      |  1 | 111
- bar      |  2 | 122
- bar      |  6 |  66
- bar2     |  3 | 133
- bar2     |  4 | 144
- bar2     |  7 |  77
-(6 rows)
-
--- Check UPDATE with inherited target and an appendrel subquery
-explain (verbose, costs off)
-update bar set f2 = f2 + 100
-from
-  ( select f1 from foo union all select f1+3 from foo ) ss
-where bar.f1 = ss.f1;
-                                      QUERY PLAN                                      
---------------------------------------------------------------------------------------
- Update on public.bar
-   Update on public.bar
-   Foreign Update on public.bar2
-     Remote SQL: UPDATE public.loct2 SET f2 = $2 WHERE ctid = $1
-   ->  Hash Join
-         Output: bar.f1, (bar.f2 + 100), bar.ctid, (ROW(foo.f1))
-         Hash Cond: (foo.f1 = bar.f1)
-         ->  Append
-               ->  Seq Scan on public.foo
-                     Output: ROW(foo.f1), foo.f1
-               ->  Foreign Scan on public.foo2
-                     Output: ROW(foo2.f1), foo2.f1
-                     Remote SQL: SELECT f1 FROM public.loct1
-               ->  Seq Scan on public.foo foo_1
-                     Output: ROW((foo_1.f1 + 3)), (foo_1.f1 + 3)
-               ->  Foreign Scan on public.foo2 foo2_1
-                     Output: ROW((foo2_1.f1 + 3)), (foo2_1.f1 + 3)
-                     Remote SQL: SELECT f1 FROM public.loct1
-         ->  Hash
-               Output: bar.f1, bar.f2, bar.ctid
-               ->  Seq Scan on public.bar
-                     Output: bar.f1, bar.f2, bar.ctid
-   ->  Merge Join
-         Output: bar2.f1, (bar2.f2 + 100), bar2.f3, bar2.ctid, (ROW(foo.f1))
-         Merge Cond: (bar2.f1 = foo.f1)
-         ->  Sort
-               Output: bar2.f1, bar2.f2, bar2.f3, bar2.ctid
-               Sort Key: bar2.f1
-               ->  Foreign Scan on public.bar2
-                     Output: bar2.f1, bar2.f2, bar2.f3, bar2.ctid
-                     Remote SQL: SELECT f1, f2, f3, ctid FROM public.loct2 FOR UPDATE
-         ->  Sort
-               Output: (ROW(foo.f1)), foo.f1
-               Sort Key: foo.f1
-               ->  Append
-                     ->  Seq Scan on public.foo
-                           Output: ROW(foo.f1), foo.f1
-                     ->  Foreign Scan on public.foo2
-                           Output: ROW(foo2.f1), foo2.f1
-                           Remote SQL: SELECT f1 FROM public.loct1
-                     ->  Seq Scan on public.foo foo_1
-                           Output: ROW((foo_1.f1 + 3)), (foo_1.f1 + 3)
-                     ->  Foreign Scan on public.foo2 foo2_1
-                           Output: ROW((foo2_1.f1 + 3)), (foo2_1.f1 + 3)
-                           Remote SQL: SELECT f1 FROM public.loct1
-(45 rows)
-
-update bar set f2 = f2 + 100
-from
-  ( select f1 from foo union all select f1+3 from foo ) ss
-where bar.f1 = ss.f1;
-select tableoid::regclass, * from bar order by 1,2;
- tableoid | f1 | f2  
-----------+----+-----
- bar      |  1 | 211
- bar      |  2 | 222
- bar      |  6 | 166
- bar2     |  3 | 233
- bar2     |  4 | 244
- bar2     |  7 | 177
-(6 rows)
-
--- Test that WHERE CURRENT OF is not supported
-begin;
-declare c cursor for select * from bar where f1 = 7;
-fetch from c;
- f1 | f2  
-----+-----
-  7 | 177
-(1 row)
-
-update bar set f2 = null where current of c;
-ERROR:  WHERE CURRENT OF is not supported for this table type
-rollback;
-drop table foo cascade;
-NOTICE:  drop cascades to foreign table foo2
-drop table bar cascade;
-NOTICE:  drop cascades to foreign table bar2
-drop table loct1;
-drop table loct2;
--- ===================================================================
 -- test IMPORT FOREIGN SCHEMA
 -- ===================================================================
 CREATE SCHEMA import_source;
@@ -3636,3 +3839,6 @@ QUERY:  CREATE FOREIGN TABLE t5 (
 OPTIONS (schema_name 'import_source', table_name 't5');
 CONTEXT:  importing foreign table "t5"
 ROLLBACK;
+-- Cleanup
+DROP OWNED BY view_owner;
+DROP USER view_owner;
diff --git a/contrib/postgres_fdw/postgres_fdw.c b/contrib/postgres_fdw/postgres_fdw.c
index 478e124..8e89523 100644
--- a/contrib/postgres_fdw/postgres_fdw.c
+++ b/contrib/postgres_fdw/postgres_fdw.c
@@ -28,7 +28,6 @@
 #include "optimizer/pathnode.h"
 #include "optimizer/paths.h"
 #include "optimizer/planmain.h"
-#include "optimizer/prep.h"
 #include "optimizer/restrictinfo.h"
 #include "optimizer/var.h"
 #include "parser/parsetree.h"
@@ -47,41 +46,8 @@ PG_MODULE_MAGIC;
 #define DEFAULT_FDW_TUPLE_COST		0.01
 
 /*
- * FDW-specific planner information kept in RelOptInfo.fdw_private for a
- * foreign table.  This information is collected by postgresGetForeignRelSize.
- */
-typedef struct PgFdwRelationInfo
-{
-	/* baserestrictinfo clauses, broken down into safe and unsafe subsets. */
-	List	   *remote_conds;
-	List	   *local_conds;
-
-	/* Bitmap of attr numbers we need to fetch from the remote server. */
-	Bitmapset  *attrs_used;
-
-	/* Cost and selectivity of local_conds. */
-	QualCost	local_conds_cost;
-	Selectivity local_conds_sel;
-
-	/* Estimated size and cost for a scan with baserestrictinfo quals. */
-	double		rows;
-	int			width;
-	Cost		startup_cost;
-	Cost		total_cost;
-
-	/* Options extracted from catalogs. */
-	bool		use_remote_estimate;
-	Cost		fdw_startup_cost;
-	Cost		fdw_tuple_cost;
-
-	/* Cached catalog information. */
-	ForeignTable *table;
-	ForeignServer *server;
-	UserMapping *user;			/* only set in use_remote_estimate mode */
-} PgFdwRelationInfo;
-
-/*
- * Indexes of FDW-private information stored in fdw_private lists.
+ * Indexes of FDW-private information stored in the fdw_private list of a
+ * ForeignScan node that scans a simple foreign table for a SELECT statement.
  *
  * We store various information in ForeignScan.fdw_private to pass it from
  * planner to executor.  Currently we store:
@@ -98,7 +64,11 @@ enum FdwScanPrivateIndex
 	/* SQL statement to execute remotely (as a String node) */
 	FdwScanPrivateSelectSql,
 	/* Integer list of attribute numbers retrieved by the SELECT */
-	FdwScanPrivateRetrievedAttrs
+	FdwScanPrivateRetrievedAttrs,
+	/* OID of the foreign server for the scan (as an Integer node) */
+	FdwScanPrivateServerOid,
+	/* OID of the effective user for the scan (as an Integer node) */
+	FdwScanPrivateUserOid,
 };
 
 /*
@@ -128,7 +98,8 @@ enum FdwModifyPrivateIndex
  */
 typedef struct PgFdwScanState
 {
-	Relation	rel;			/* relcache entry for the foreign table */
+	const char *relname;		/* name of relation being scanned */
+	TupleDesc	tupdesc;		/* tuple descriptor of the scan */
 	AttInMetadata *attinmeta;	/* attribute datatype conversion metadata */
 
 	/* extracted fdw_private data */
@@ -194,6 +165,8 @@ typedef struct PgFdwAnalyzeState
 	AttInMetadata *attinmeta;	/* attribute datatype conversion metadata */
 	List	   *retrieved_attrs;	/* attr numbers retrieved by query */
 
+	char	   *query;			/* text of SELECT command */
+
 	/* collected sample rows */
 	HeapTuple  *rows;			/* array of size targrows */
 	int			targrows;		/* target # of sample rows */
@@ -214,7 +187,10 @@ typedef struct PgFdwAnalyzeState
  */
 typedef struct ConversionLocation
 {
-	Relation	rel;			/* foreign table's relcache entry */
+	const char *relname;		/* name of relation being processed, or NULL for
+								   a foreign join */
+	const char *query;			/* query being processed */
+	TupleDesc	tupdesc;		/* tuple descriptor for attribute names */
 	AttrNumber	cur_attno;		/* attribute number being processed, or 0 */
 } ConversionLocation;
 
@@ -288,6 +264,12 @@ static bool postgresAnalyzeForeignTable(Relation relation,
 							BlockNumber *totalpages);
 static List *postgresImportForeignSchema(ImportForeignSchemaStmt *stmt,
 							Oid serverOid);
+static void postgresGetForeignJoinPaths(PlannerInfo *root,
+						   RelOptInfo *joinrel,
+						   RelOptInfo *outerrel,
+						   RelOptInfo *innerrel,
+						   SpecialJoinInfo *sjinfo,
+						   List *restrictlist);
 
 /*
  * Helper functions
@@ -323,7 +305,9 @@ static void analyze_row_processor(PGresult *res, int row,
 					  PgFdwAnalyzeState *astate);
 static HeapTuple make_tuple_from_result_row(PGresult *res,
 						   int row,
-						   Relation rel,
+						   const char *relname,
+						   const char *query,
+						   TupleDesc tupdesc,
 						   AttInMetadata *attinmeta,
 						   List *retrieved_attrs,
 						   MemoryContext temp_context);
@@ -368,6 +352,9 @@ postgres_fdw_handler(PG_FUNCTION_ARGS)
 	/* Support functions for IMPORT FOREIGN SCHEMA */
 	routine->ImportForeignSchema = postgresImportForeignSchema;
 
+	/* Support functions for join push-down */
+	routine->GetForeignJoinPaths = postgresGetForeignJoinPaths;
+
 	PG_RETURN_POINTER(routine);
 }
 
@@ -383,7 +370,9 @@ postgresGetForeignRelSize(PlannerInfo *root,
 						  RelOptInfo *baserel,
 						  Oid foreigntableid)
 {
+	RangeTblEntry *rte;
 	PgFdwRelationInfo *fpinfo;
+	ForeignTable *table;
 	ListCell   *lc;
 
 	/*
@@ -394,8 +383,8 @@ postgresGetForeignRelSize(PlannerInfo *root,
 	baserel->fdw_private = (void *) fpinfo;
 
 	/* Look up foreign-table catalog info. */
-	fpinfo->table = GetForeignTable(foreigntableid);
-	fpinfo->server = GetForeignServer(fpinfo->table->serverid);
+	table = GetForeignTable(foreigntableid);
+	fpinfo->server = GetForeignServer(table->serverid);
 
 	/*
 	 * Extract user-settable option values.  Note that per-table setting of
@@ -416,7 +405,7 @@ postgresGetForeignRelSize(PlannerInfo *root,
 		else if (strcmp(def->defname, "fdw_tuple_cost") == 0)
 			fpinfo->fdw_tuple_cost = strtod(defGetString(def), NULL);
 	}
-	foreach(lc, fpinfo->table->options)
+	foreach(lc, table->options)
 	{
 		DefElem    *def = (DefElem *) lfirst(lc);
 
@@ -428,20 +417,12 @@ postgresGetForeignRelSize(PlannerInfo *root,
 	}
 
 	/*
-	 * If the table or the server is configured to use remote estimates,
-	 * identify which user to do remote access as during planning.  This
+	 * Identify which user to do remote access as during planning.  This
 	 * should match what ExecCheckRTEPerms() does.  If we fail due to lack of
 	 * permissions, the query would have failed at runtime anyway.
 	 */
-	if (fpinfo->use_remote_estimate)
-	{
-		RangeTblEntry *rte = planner_rt_fetch(baserel->relid, root);
-		Oid			userid = rte->checkAsUser ? rte->checkAsUser : GetUserId();
-
-		fpinfo->user = GetUserMapping(userid, fpinfo->server->serverid);
-	}
-	else
-		fpinfo->user = NULL;
+	rte = planner_rt_fetch(baserel->relid, root);
+	fpinfo->userid = rte->checkAsUser ? rte->checkAsUser : GetUserId();
 
 	/*
 	 * Identify which baserestrictinfo clauses can be sent to the remote
@@ -463,10 +444,9 @@ postgresGetForeignRelSize(PlannerInfo *root,
 				   &fpinfo->attrs_used);
 	foreach(lc, fpinfo->local_conds)
 	{
-		RestrictInfo *rinfo = (RestrictInfo *) lfirst(lc);
+		Expr *expr = (Expr *) lfirst(lc);
 
-		pull_varattnos((Node *) rinfo->clause, baserel->relid,
-					   &fpinfo->attrs_used);
+		pull_varattnos((Node *) expr, baserel->relid, &fpinfo->attrs_used);
 	}
 
 	/*
@@ -752,6 +732,8 @@ postgresGetForeignPlan(PlannerInfo *root,
 	List	   *retrieved_attrs;
 	StringInfoData sql;
 	ListCell   *lc;
+	List	   *fdw_ps_tlist = NIL;
+	ForeignScan *scan;
 
 	/*
 	 * Separate the scan_clauses into those that can be executed remotely and
@@ -768,9 +750,6 @@ postgresGetForeignPlan(PlannerInfo *root,
 	 *
 	 * This code must match "extract_actual_clauses(scan_clauses, false)"
 	 * except for the additional decision about remote versus local execution.
-	 * Note however that we only strip the RestrictInfo nodes from the
-	 * local_exprs list, since appendWhereClause expects a list of
-	 * RestrictInfos.
 	 */
 	foreach(lc, scan_clauses)
 	{
@@ -783,11 +762,11 @@ postgresGetForeignPlan(PlannerInfo *root,
 			continue;
 
 		if (list_member_ptr(fpinfo->remote_conds, rinfo))
-			remote_conds = lappend(remote_conds, rinfo);
+			remote_conds = lappend(remote_conds, rinfo->clause);
 		else if (list_member_ptr(fpinfo->local_conds, rinfo))
 			local_exprs = lappend(local_exprs, rinfo->clause);
 		else if (is_foreign_expr(root, baserel, rinfo->clause))
-			remote_conds = lappend(remote_conds, rinfo);
+			remote_conds = lappend(remote_conds, rinfo->clause);
 		else
 			local_exprs = lappend(local_exprs, rinfo->clause);
 	}
@@ -797,68 +776,17 @@ postgresGetForeignPlan(PlannerInfo *root,
 	 * expressions to be sent as parameters.
 	 */
 	initStringInfo(&sql);
-	deparseSelectSql(&sql, root, baserel, fpinfo->attrs_used,
-					 &retrieved_attrs);
-	if (remote_conds)
-		appendWhereClause(&sql, root, baserel, remote_conds,
-						  true, &params_list);
-
-	/*
-	 * Add FOR UPDATE/SHARE if appropriate.  We apply locking during the
-	 * initial row fetch, rather than later on as is done for local tables.
-	 * The extra roundtrips involved in trying to duplicate the local
-	 * semantics exactly don't seem worthwhile (see also comments for
-	 * RowMarkType).
-	 *
-	 * Note: because we actually run the query as a cursor, this assumes that
-	 * DECLARE CURSOR ... FOR UPDATE is supported, which it isn't before 8.3.
-	 */
-	if (baserel->relid == root->parse->resultRelation &&
-		(root->parse->commandType == CMD_UPDATE ||
-		 root->parse->commandType == CMD_DELETE))
-	{
-		/* Relation is UPDATE/DELETE target, so use FOR UPDATE */
-		appendStringInfoString(&sql, " FOR UPDATE");
-	}
-	else
-	{
-		PlanRowMark *rc = get_plan_rowmark(root->rowMarks, baserel->relid);
-
-		if (rc)
-		{
-			/*
-			 * Relation is specified as a FOR UPDATE/SHARE target, so handle
-			 * that.  (But we could also see LCS_NONE, meaning this isn't a
-			 * target relation after all.)
-			 *
-			 * For now, just ignore any [NO] KEY specification, since (a) it's
-			 * not clear what that means for a remote table that we don't have
-			 * complete information about, and (b) it wouldn't work anyway on
-			 * older remote servers.  Likewise, we don't worry about NOWAIT.
-			 */
-			switch (rc->strength)
-			{
-				case LCS_NONE:
-					/* No locking needed */
-					break;
-				case LCS_FORKEYSHARE:
-				case LCS_FORSHARE:
-					appendStringInfoString(&sql, " FOR SHARE");
-					break;
-				case LCS_FORNOKEYUPDATE:
-				case LCS_FORUPDATE:
-					appendStringInfoString(&sql, " FOR UPDATE");
-					break;
-			}
-		}
-	}
+	deparseSelectSql(&sql, root, baserel, fpinfo->attrs_used, remote_conds,
+					 &params_list, &fdw_ps_tlist, &retrieved_attrs);
 
 	/*
-	 * Build the fdw_private list that will be available to the executor.
+	 * Build the fdw_private list that will be available in the executor.
 	 * Items in the list must match enum FdwScanPrivateIndex, above.
 	 */
-	fdw_private = list_make2(makeString(sql.data),
-							 retrieved_attrs);
+	fdw_private = list_make4(makeString(sql.data),
+							 retrieved_attrs,
+							 makeInteger(fpinfo->server->serverid),
+							 makeInteger(fpinfo->userid));
 
 	/*
 	 * Create the ForeignScan node from target list, local filtering
@@ -868,11 +796,18 @@ postgresGetForeignPlan(PlannerInfo *root,
 	 * field of the finished plan node; we can't keep them in private state
 	 * because then they wouldn't be subject to later planner processing.
 	 */
-	return make_foreignscan(tlist,
+	scan = make_foreignscan(tlist,
 							local_exprs,
 							scan_relid,
 							params_list,
 							fdw_private);
+
+	/*
+	 * Set fdw_ps_tlist to handle tuples generated by this scan.
+	 */
+	scan->fdw_ps_tlist = fdw_ps_tlist;
+
+	return scan;
 }
 
 /*
@@ -885,9 +820,8 @@ postgresBeginForeignScan(ForeignScanState *node, int eflags)
 	ForeignScan *fsplan = (ForeignScan *) node->ss.ps.plan;
 	EState	   *estate = node->ss.ps.state;
 	PgFdwScanState *fsstate;
-	RangeTblEntry *rte;
+	Oid			serverid;
 	Oid			userid;
-	ForeignTable *table;
 	ForeignServer *server;
 	UserMapping *user;
 	int			numParams;
@@ -907,22 +841,13 @@ postgresBeginForeignScan(ForeignScanState *node, int eflags)
 	node->fdw_state = (void *) fsstate;
 
 	/*
-	 * Identify which user to do the remote access as.  This should match what
-	 * ExecCheckRTEPerms() does.
-	 */
-	rte = rt_fetch(fsplan->scan.scanrelid, estate->es_range_table);
-	userid = rte->checkAsUser ? rte->checkAsUser : GetUserId();
-
-	/* Get info about foreign table. */
-	fsstate->rel = node->ss.ss_currentRelation;
-	table = GetForeignTable(RelationGetRelid(fsstate->rel));
-	server = GetForeignServer(table->serverid);
-	user = GetUserMapping(userid, server->serverid);
-
-	/*
 	 * Get connection to the foreign server.  Connection manager will
 	 * establish new connection if necessary.
 	 */
+	serverid = intVal(list_nth(fsplan->fdw_private, FdwScanPrivateServerOid));
+	userid = intVal(list_nth(fsplan->fdw_private, FdwScanPrivateUserOid));
+	server = GetForeignServer(serverid);
+	user = GetUserMapping(userid, server->serverid);
 	fsstate->conn = GetConnection(server, user, false);
 
 	/* Assign a unique ID for my cursor */
@@ -932,8 +857,8 @@ postgresBeginForeignScan(ForeignScanState *node, int eflags)
 	/* Get private info created by planner functions. */
 	fsstate->query = strVal(list_nth(fsplan->fdw_private,
 									 FdwScanPrivateSelectSql));
-	fsstate->retrieved_attrs = (List *) list_nth(fsplan->fdw_private,
-											   FdwScanPrivateRetrievedAttrs);
+	fsstate->retrieved_attrs = list_nth(fsplan->fdw_private,
+										FdwScanPrivateRetrievedAttrs);
 
 	/* Create contexts for batches of tuples and per-tuple temp workspace. */
 	fsstate->batch_cxt = AllocSetContextCreate(estate->es_query_cxt,
@@ -947,8 +872,18 @@ postgresBeginForeignScan(ForeignScanState *node, int eflags)
 											  ALLOCSET_SMALL_INITSIZE,
 											  ALLOCSET_SMALL_MAXSIZE);
 
-	/* Get info we'll need for input data conversion. */
-	fsstate->attinmeta = TupleDescGetAttInMetadata(RelationGetDescr(fsstate->rel));
+	/* Get info we'll need for input data conversion and error reporting. */
+	if (fsplan->scan.scanrelid > 0)
+	{
+		fsstate->relname = RelationGetRelationName(node->ss.ss_currentRelation);
+		fsstate->tupdesc = RelationGetDescr(node->ss.ss_currentRelation);
+	}
+	else
+	{
+		fsstate->relname = NULL;
+		fsstate->tupdesc = node->ss.ss_ScanTupleSlot->tts_tupleDescriptor;
+	}
+	fsstate->attinmeta = TupleDescGetAttInMetadata(fsstate->tupdesc);
 
 	/* Prepare for output conversion of parameters used in remote query. */
 	numParams = list_length(fsplan->fdw_exprs);
@@ -1726,10 +1661,12 @@ estimate_path_cost_size(PlannerInfo *root,
 	 */
 	if (fpinfo->use_remote_estimate)
 	{
+		List	   *remote_conds;
 		List	   *remote_join_conds;
 		List	   *local_join_conds;
-		StringInfoData sql;
 		List	   *retrieved_attrs;
+		StringInfoData sql;
+		UserMapping *user;
 		PGconn	   *conn;
 		Selectivity local_sel;
 		QualCost	local_cost;
@@ -1741,24 +1678,24 @@ estimate_path_cost_size(PlannerInfo *root,
 		classifyConditions(root, baserel, join_conds,
 						   &remote_join_conds, &local_join_conds);
 
+		remote_conds = copyObject(fpinfo->remote_conds);
+		remote_conds = list_concat(remote_conds, remote_join_conds);
+
 		/*
 		 * Construct EXPLAIN query including the desired SELECT, FROM, and
 		 * WHERE clauses.  Params and other-relation Vars are replaced by
 		 * dummy values.
+		 * We pass NULL for params_list and fdw_ps_tlist because EXPLAIN does
+		 * not need them.
 		 */
 		initStringInfo(&sql);
 		appendStringInfoString(&sql, "EXPLAIN ");
-		deparseSelectSql(&sql, root, baserel, fpinfo->attrs_used,
-						 &retrieved_attrs);
-		if (fpinfo->remote_conds)
-			appendWhereClause(&sql, root, baserel, fpinfo->remote_conds,
-							  true, NULL);
-		if (remote_join_conds)
-			appendWhereClause(&sql, root, baserel, remote_join_conds,
-							  (fpinfo->remote_conds == NIL), NULL);
+		deparseSelectSql(&sql, root, baserel, fpinfo->attrs_used, remote_conds,
+						 NULL, NULL, &retrieved_attrs);
 
 		/* Get the remote estimate */
-		conn = GetConnection(fpinfo->server, fpinfo->user, false);
+		user = GetUserMapping(fpinfo->userid, fpinfo->server->serverid);
+		conn = GetConnection(fpinfo->server, user, false);
 		get_remote_estimate(sql.data, conn, &rows, &width,
 							&startup_cost, &total_cost);
 		ReleaseConnection(conn);
@@ -2055,7 +1992,9 @@ fetch_more_data(ForeignScanState *node)
 		{
 			fsstate->tuples[i] =
 				make_tuple_from_result_row(res, i,
-										   fsstate->rel,
+										   fsstate->relname,
+										   fsstate->query,
+										   fsstate->tupdesc,
 										   fsstate->attinmeta,
 										   fsstate->retrieved_attrs,
 										   fsstate->temp_cxt);
@@ -2273,7 +2212,9 @@ store_returning_result(PgFdwModifyState *fmstate,
 		HeapTuple	newtup;
 
 		newtup = make_tuple_from_result_row(res, 0,
-											fmstate->rel,
+										RelationGetRelationName(fmstate->rel),
+											fmstate->query,
+											RelationGetDescr(fmstate->rel),
 											fmstate->attinmeta,
 											fmstate->retrieved_attrs,
 											fmstate->temp_cxt);
@@ -2423,6 +2364,7 @@ postgresAcquireSampleRowsFunc(Relation relation, int elevel,
 	initStringInfo(&sql);
 	appendStringInfo(&sql, "DECLARE c%u CURSOR FOR ", cursor_number);
 	deparseAnalyzeSql(&sql, relation, &astate.retrieved_attrs);
+	astate.query = sql.data;
 
 	/* In what follows, do not risk leaking any PGresults. */
 	PG_TRY();
@@ -2565,7 +2507,9 @@ analyze_row_processor(PGresult *res, int row, PgFdwAnalyzeState *astate)
 		oldcontext = MemoryContextSwitchTo(astate->anl_cxt);
 
 		astate->rows[pos] = make_tuple_from_result_row(res, row,
-													   astate->rel,
+										   RelationGetRelationName(astate->rel),
+													   astate->query,
+											   RelationGetDescr(astate->rel),
 													   astate->attinmeta,
 													 astate->retrieved_attrs,
 													   astate->temp_cxt);
@@ -2839,6 +2783,258 @@ postgresImportForeignSchema(ImportForeignSchemaStmt *stmt, Oid serverOid)
 }
 
 /*
+ * Construct PgFdwRelationInfo from two join sources
+ */
+static PgFdwRelationInfo *
+merge_fpinfo(RelOptInfo *outerrel, RelOptInfo *innerrel, JoinType jointype)
+{
+	PgFdwRelationInfo *fpinfo_o;
+	PgFdwRelationInfo *fpinfo_i;
+	PgFdwRelationInfo *fpinfo;
+
+	fpinfo_o = (PgFdwRelationInfo *) outerrel->fdw_private;
+	fpinfo_i = (PgFdwRelationInfo *) innerrel->fdw_private;
+
+	fpinfo = (PgFdwRelationInfo *) palloc0(sizeof(PgFdwRelationInfo));
+
+	/* A join relation inherits the conditions of its source relations */
+	fpinfo->remote_conds = list_concat(copyObject(fpinfo_o->remote_conds),
+									   copyObject(fpinfo_i->remote_conds));
+	fpinfo->local_conds = list_concat(copyObject(fpinfo_o->local_conds),
+									  copyObject(fpinfo_i->local_conds));
+
+	/* attrs_used is only meaningful for a simple foreign table scan */
+	fpinfo->attrs_used = NULL;
+
+	/* rows and width will be set later */
+	fpinfo->rows = 0;
+	fpinfo->width = 0;
+
+	/*
+	 * TODO estimate more accurately
+	 */
+	fpinfo->local_conds_cost.startup = fpinfo_o->local_conds_cost.startup +
+									   fpinfo_i->local_conds_cost.startup;
+	fpinfo->local_conds_cost.per_tuple = fpinfo_o->local_conds_cost.per_tuple +
+										 fpinfo_i->local_conds_cost.per_tuple;
+	fpinfo->local_conds_sel = fpinfo_o->local_conds_sel *
+							  fpinfo_i->local_conds_sel;
+	fpinfo->use_remote_estimate = false;
+	fpinfo->fdw_startup_cost = fpinfo_o->fdw_startup_cost +
+							   fpinfo_i->fdw_startup_cost;
+	fpinfo->fdw_tuple_cost = fpinfo_o->fdw_tuple_cost +
+							 fpinfo_i->fdw_tuple_cost;
+
+	/* total_cost will be set later. */
+	fpinfo->startup_cost = fpinfo_o->startup_cost + fpinfo_i->startup_cost;
+	fpinfo->total_cost = 0;
+
+	/* Both sides share the same serverid and userid; take the outer's */
+	fpinfo->server = fpinfo_o->server;
+	fpinfo->userid = fpinfo_o->userid;
+
+	fpinfo->outerrel = outerrel;
+	fpinfo->innerrel = innerrel;
+	fpinfo->jointype = jointype;
+
+	/* joinclauses and otherclauses will be set later */
+
+	return fpinfo;
+}
+
+/*
+ * postgresGetForeignJoinPaths
+ *		Add possible ForeignPath to joinrel.
+ *
+ * A join that satisfies all of the conditions below can be pushed down to
+ * the remote PostgreSQL server.
+ *
+ * 1) Join type is INNER or OUTER (one of LEFT/RIGHT/FULL)
+ * 2) Both outer and inner portions are safe to push down
+ * 3) All foreign tables in the join belong to the same foreign server
+ * 4) All foreign tables are accessed as the same user
+ * 5) All join conditions are safe to push down
+ * 6) No relation has a local filter (this could be relaxed for INNER JOIN
+ * with no volatile function/operator, but for now we take the safer way)
+ */
+static void
+postgresGetForeignJoinPaths(PlannerInfo *root,
+							RelOptInfo *joinrel,
+							RelOptInfo *outerrel,
+							RelOptInfo *innerrel,
+							SpecialJoinInfo *sjinfo,
+							List *restrictlist)
+{
+	PgFdwRelationInfo *fpinfo;
+	PgFdwRelationInfo *fpinfo_o;
+	PgFdwRelationInfo *fpinfo_i;
+	JoinType		jointype = !sjinfo ? JOIN_INNER : sjinfo->jointype;
+	ForeignPath	   *joinpath;
+	double			rows;
+	Cost			startup_cost;
+	Cost			total_cost;
+
+	ListCell	   *lc;
+	List		   *joinclauses;
+	List		   *otherclauses;
+
+	/*
+	 * We support all outer joins in addition to inner join.  CROSS JOIN is
+	 * an INNER JOIN with no conditions internally, so will be checked later.
+	 */
+	if (jointype != JOIN_INNER && jointype != JOIN_LEFT &&
+		jointype != JOIN_RIGHT && jointype != JOIN_FULL)
+	{
+		ereport(DEBUG3, (errmsg("unsupported join type (SEMI, ANTI)")));
+		return;
+	}
+
+	/*
+	 * Having valid PgFdwRelationInfo in RelOptInfo.fdw_private indicates that
+	 * the scan of that relation can be pushed down.  If either side lacks a
+	 * PgFdwRelationInfo, give up on pushing down this join relation.
+	 */
+	if (!outerrel->fdw_private)
+	{
+		ereport(DEBUG3, (errmsg("outer is not safe to push-down")));
+		return;
+	}
+	if (!innerrel->fdw_private)
+	{
+		ereport(DEBUG3, (errmsg("inner is not safe to push-down")));
+		return;
+	}
+	fpinfo_o = (PgFdwRelationInfo *) outerrel->fdw_private;
+	fpinfo_i = (PgFdwRelationInfo *) innerrel->fdw_private;
+
+	/*
+	 * All relations in the join must belong to the same server.  Having a
+	 * valid fdw_private means that all relations under that relation belong
+	 * to the server recorded in it, so all we need to do is compare the
+	 * serverid of the outer and inner relations.
+	 */
+	if (fpinfo_o->server->serverid != fpinfo_i->server->serverid)
+	{
+		ereport(DEBUG3, (errmsg("server mismatch")));
+		return;
+	}
+
+	/*
+	 * The effective userid of all source relations must be identical.
+	 * Having a valid fdw_private means that all relations under that relation
+	 * are accessed as the same user, so all we need to do is compare the
+	 * userid of the outer and inner relations.
+	 */
+	if (fpinfo_o->userid != fpinfo_i->userid)
+	{
+		ereport(DEBUG3, (errmsg("userid mismatch")));
+		return;
+	}
+
+	/*
+	 * No source relation can have local conditions.  This could be relaxed
+	 * if the join is an inner join and the local conditions contain no
+	 * volatile function/operator, but for now we leave that as a future
+	 * enhancement.
+	 */
+	if (fpinfo_o->local_conds != NULL || fpinfo_i->local_conds != NULL)
+	{
+		ereport(DEBUG3, (errmsg("join with local filter")));
+		return;
+	}
+
+	/*
+	 * Separate restrictlist into two lists, join conditions and remote filters.
+	 */
+	joinclauses = restrictlist;
+	if (IS_OUTER_JOIN(jointype))
+	{
+		extract_actual_join_clauses(joinclauses, &joinclauses, &otherclauses);
+	}
+	else
+	{
+		joinclauses = extract_actual_clauses(joinclauses, false);
+		otherclauses = NIL;
+	}
+
+	/*
+	 * Note that CROSS JOIN (cartesian product) is transformed to JOIN_INNER
+	 * with empty joinclauses.  Pushing down a CROSS JOIN usually transfers
+	 * more rows than retrieving each table separately, so we don't push down
+	 * such joins.
+	 */
+	if (jointype == JOIN_INNER && joinclauses == NIL)
+	{
+		ereport(DEBUG3, (errmsg("unsupported join type (CROSS)")));
+		return;
+	}
+
+	/*
+	 * Join condition must be safe to push down.
+	 */
+	foreach(lc, joinclauses)
+	{
+		Expr *expr = (Expr *) lfirst(lc);
+
+		if (!is_foreign_expr(root, joinrel, expr))
+		{
+			ereport(DEBUG3, (errmsg("join quals contain unsafe conditions")));
+			return;
+		}
+	}
+
+	/*
+	 * Other conditions for the join must also be safe to push down.
+	 */
+	foreach(lc, otherclauses)
+	{
+		Expr *expr = (Expr *) lfirst(lc);
+
+		if (!is_foreign_expr(root, joinrel, expr))
+		{
+			ereport(DEBUG3, (errmsg("filter contains unsafe conditions")));
+			return;
+		}
+	}
+
+	/* At this point we know the join can be pushed down to the remote side. */
+
+	/* Construct fpinfo for the join relation */
+	fpinfo = merge_fpinfo(outerrel, innerrel, jointype);
+	fpinfo->rows = joinrel->rows;
+	fpinfo->width = joinrel->width;
+	fpinfo->total_cost = fpinfo->startup_cost +
+						 fpinfo->fdw_tuple_cost * fpinfo->rows;
+	fpinfo->joinclauses = joinclauses;
+	fpinfo->otherclauses = otherclauses;
+	joinrel->fdw_private = fpinfo;
+
+	/* TODO determine more accurate cost and rows of the join. */
+	rows = joinrel->rows;
+	startup_cost = fpinfo->startup_cost;
+	total_cost = fpinfo->total_cost;
+
+	/*
+	 * Create a new join path and add it to the joinrel which represents a join
+	 * between foreign tables.
+	 */
+	joinpath = create_foreignscan_path(root,
+									   joinrel,
+									   rows,
+									   startup_cost,
+									   total_cost,
+									   NIL,		/* no pathkeys */
+									   NULL,	/* no required_outer */
+									   NIL);	/* no fdw_private */
+
+	/* Add generated path into joinrel by add_path(). */
+	add_path(joinrel, (Path *) joinpath);
+	elog(DEBUG3, "join path added");
+
+	/* TODO consider parameterized paths */
+}
+
+/*
  * Create a tuple from the specified row of the PGresult.
  *
  * rel is the local representation of the foreign table, attinmeta is
@@ -2849,13 +3045,14 @@ postgresImportForeignSchema(ImportForeignSchemaStmt *stmt, Oid serverOid)
 static HeapTuple
 make_tuple_from_result_row(PGresult *res,
 						   int row,
-						   Relation rel,
+						   const char *relname,
+						   const char *query,
+						   TupleDesc tupdesc,
 						   AttInMetadata *attinmeta,
 						   List *retrieved_attrs,
 						   MemoryContext temp_context)
 {
 	HeapTuple	tuple;
-	TupleDesc	tupdesc = RelationGetDescr(rel);
 	Datum	   *values;
 	bool	   *nulls;
 	ItemPointer ctid = NULL;
@@ -2882,7 +3079,9 @@ make_tuple_from_result_row(PGresult *res,
 	/*
 	 * Set up and install callback to report where conversion error occurs.
 	 */
-	errpos.rel = rel;
+	errpos.relname = relname;
+	errpos.query = query;
+	errpos.tupdesc = tupdesc;
 	errpos.cur_attno = 0;
 	errcallback.callback = conversion_error_callback;
 	errcallback.arg = (void *) &errpos;
@@ -2966,11 +3165,39 @@ make_tuple_from_result_row(PGresult *res,
 static void
 conversion_error_callback(void *arg)
 {
+	const char *attname;
+	const char *relname;
 	ConversionLocation *errpos = (ConversionLocation *) arg;
-	TupleDesc	tupdesc = RelationGetDescr(errpos->rel);
+	TupleDesc	tupdesc = errpos->tupdesc;
+	StringInfoData buf;
+
+	if (errpos->relname)
+	{
+		/* error occurred in a scan against a foreign table */
+		initStringInfo(&buf);
+		if (errpos->cur_attno > 0)
+			appendStringInfo(&buf, "column \"%s\"",
+					 NameStr(tupdesc->attrs[errpos->cur_attno - 1]->attname));
+		else if (errpos->cur_attno == SelfItemPointerAttributeNumber)
+			appendStringInfoString(&buf, "column \"ctid\"");
+		attname = buf.data;
+
+		initStringInfo(&buf);
+		appendStringInfo(&buf, "foreign table \"%s\"", errpos->relname);
+		relname = buf.data;
+	}
+	else
+	{
+		/* error occurred in a scan against a foreign join */
+		initStringInfo(&buf);
+		appendStringInfo(&buf, "column %d", errpos->cur_attno - 1);
+		attname = buf.data;
+
+		initStringInfo(&buf);
+		appendStringInfo(&buf, "foreign join \"%s\"", errpos->query);
+		relname = buf.data;
+	}
 
 	if (errpos->cur_attno > 0 && errpos->cur_attno <= tupdesc->natts)
-		errcontext("column \"%s\" of foreign table \"%s\"",
-				   NameStr(tupdesc->attrs[errpos->cur_attno - 1]->attname),
-				   RelationGetRelationName(errpos->rel));
+		errcontext("%s of %s", attname, relname);
 }
diff --git a/contrib/postgres_fdw/postgres_fdw.h b/contrib/postgres_fdw/postgres_fdw.h
index 950c6f7..0d05e5d 100644
--- a/contrib/postgres_fdw/postgres_fdw.h
+++ b/contrib/postgres_fdw/postgres_fdw.h
@@ -16,10 +16,52 @@
 #include "foreign/foreign.h"
 #include "lib/stringinfo.h"
 #include "nodes/relation.h"
+#include "nodes/plannodes.h"
 #include "utils/relcache.h"
 
 #include "libpq-fe.h"
 
+/*
+ * FDW-specific planner information kept in RelOptInfo.fdw_private for a
+ * foreign table or a foreign join.  This information is collected by
+ * postgresGetForeignRelSize, or calculated from join source relations.
+ */
+typedef struct PgFdwRelationInfo
+{
+	/* baserestrictinfo clauses, broken down into safe and unsafe subsets. */
+	List	   *remote_conds;
+	List	   *local_conds;
+
+	/* Bitmap of attr numbers we need to fetch from the remote server. */
+	Bitmapset  *attrs_used;
+
+	/* Cost and selectivity of local_conds. */
+	QualCost	local_conds_cost;
+	Selectivity local_conds_sel;
+
+	/* Estimated size and cost for a scan with baserestrictinfo quals. */
+	double		rows;
+	int			width;
+	Cost		startup_cost;
+	Cost		total_cost;
+
+	/* Options extracted from catalogs. */
+	bool		use_remote_estimate;
+	Cost		fdw_startup_cost;
+	Cost		fdw_tuple_cost;
+
+	/* Cached catalog information. */
+	ForeignServer *server;
+	Oid			userid;
+
+	/* Join information */
+	RelOptInfo *outerrel;
+	RelOptInfo *innerrel;
+	JoinType	jointype;
+	List	   *joinclauses;
+	List	   *otherclauses;
+} PgFdwRelationInfo;
+
 /* in postgres_fdw.c */
 extern int	set_transmission_modes(void);
 extern void reset_transmission_modes(int nestlevel);
@@ -51,13 +93,30 @@ extern void deparseSelectSql(StringInfo buf,
 				 PlannerInfo *root,
 				 RelOptInfo *baserel,
 				 Bitmapset *attrs_used,
+				 List *remote_conds,
+				 List **params_list,
+				 List **fdw_ps_tlist,
 				 List **retrieved_attrs);
-extern void appendWhereClause(StringInfo buf,
+extern void appendConditions(StringInfo buf,
 				  PlannerInfo *root,
 				  RelOptInfo *baserel,
+				  List *outertlist,
+				  List *innertlist,
 				  List *exprs,
-				  bool is_first,
+				  const char *prefix,
 				  List **params);
+extern void deparseJoinSql(StringInfo sql,
+			   PlannerInfo *root,
+			   RelOptInfo *baserel,
+			   RelOptInfo *outerrel,
+			   RelOptInfo *innerrel,
+			   const char *sql_o,
+			   const char *sql_i,
+			   JoinType jointype,
+			   List *joinclauses,
+			   List *otherclauses,
+			   List **fdw_ps_tlist,
+			   List **retrieved_attrs);
 extern void deparseInsertSql(StringInfo buf, PlannerInfo *root,
 				 Index rtindex, Relation rel,
 				 List *targetAttrs, List *returningList,
diff --git a/contrib/postgres_fdw/sql/postgres_fdw.sql b/contrib/postgres_fdw/sql/postgres_fdw.sql
index 4a23457..05bd2f6 100644
--- a/contrib/postgres_fdw/sql/postgres_fdw.sql
+++ b/contrib/postgres_fdw/sql/postgres_fdw.sql
@@ -11,12 +11,17 @@ DO $d$
             OPTIONS (dbname '$$||current_database()||$$',
                      port '$$||current_setting('port')||$$'
             )$$;
+        EXECUTE $$CREATE SERVER loopback2 FOREIGN DATA WRAPPER postgres_fdw
+            OPTIONS (dbname '$$||current_database()||$$',
+                     port '$$||current_setting('port')||$$'
+            )$$;
     END;
 $d$;
 
 CREATE USER MAPPING FOR public SERVER testserver1
 	OPTIONS (user 'value', password 'value');
 CREATE USER MAPPING FOR CURRENT_USER SERVER loopback;
+CREATE USER MAPPING FOR CURRENT_USER SERVER loopback2;
 
 -- ===================================================================
 -- create objects used through FDW loopback server
@@ -39,6 +44,18 @@ CREATE TABLE "S 1"."T 2" (
 	c2 text,
 	CONSTRAINT t2_pkey PRIMARY KEY (c1)
 );
+CREATE TABLE "S 1"."T 3" (
+	c1 int NOT NULL,
+	c2 int NOT NULL,
+	c3 text,
+	CONSTRAINT t3_pkey PRIMARY KEY (c1)
+);
+CREATE TABLE "S 1"."T 4" (
+	c1 int NOT NULL,
+	c2 int NOT NULL,
+	c4 text,
+	CONSTRAINT t4_pkey PRIMARY KEY (c1)
+);
 
 INSERT INTO "S 1"."T 1"
 	SELECT id,
@@ -54,9 +71,23 @@ INSERT INTO "S 1"."T 2"
 	SELECT id,
 	       'AAA' || to_char(id, 'FM000')
 	FROM generate_series(1, 100) id;
+INSERT INTO "S 1"."T 3"
+	SELECT id,
+	       id + 1,
+	       'AAA' || to_char(id, 'FM000')
+	FROM generate_series(1, 100) id;
+DELETE FROM "S 1"."T 3" WHERE c1 % 2 != 0;	-- delete for outer join tests
+INSERT INTO "S 1"."T 4"
+	SELECT id,
+	       id + 1,
+	       'AAA' || to_char(id, 'FM000')
+	FROM generate_series(1, 100) id;
+DELETE FROM "S 1"."T 4" WHERE c1 % 3 != 0;	-- delete for outer join tests
 
 ANALYZE "S 1"."T 1";
 ANALYZE "S 1"."T 2";
+ANALYZE "S 1"."T 3";
+ANALYZE "S 1"."T 4";
 
 -- ===================================================================
 -- create foreign tables
@@ -87,6 +118,29 @@ CREATE FOREIGN TABLE ft2 (
 ) SERVER loopback;
 ALTER FOREIGN TABLE ft2 DROP COLUMN cx;
 
+CREATE FOREIGN TABLE ft4 (
+	c1 int NOT NULL,
+	c2 int NOT NULL,
+	c3 text
+) SERVER loopback OPTIONS (schema_name 'S 1', table_name 'T 3');
+
+CREATE FOREIGN TABLE ft5 (
+	c1 int NOT NULL,
+	c2 int NOT NULL,
+	c3 text
+) SERVER loopback OPTIONS (schema_name 'S 1', table_name 'T 4');
+
+CREATE FOREIGN TABLE ft6 (
+	c1 int NOT NULL,
+	c2 int NOT NULL,
+	c3 text
+) SERVER loopback2 OPTIONS (schema_name 'S 1', table_name 'T 4');
+CREATE USER view_owner;
+GRANT ALL ON ft5 TO view_owner;
+CREATE VIEW v_ft5 AS SELECT * FROM ft5;
+ALTER VIEW v_ft5 OWNER TO view_owner;
+CREATE USER MAPPING FOR view_owner SERVER loopback;
+
 -- ===================================================================
 -- tests for validator
 -- ===================================================================
@@ -158,8 +212,6 @@ EXPLAIN (VERBOSE, COSTS false) SELECT * FROM ft1 t1 WHERE c1 = 102 FOR SHARE;
 SELECT * FROM ft1 t1 WHERE c1 = 102 FOR SHARE;
 -- aggregate
 SELECT COUNT(*) FROM ft1 t1;
--- join two tables
-SELECT t1.c1 FROM ft1 t1 JOIN ft2 t2 ON (t1.c1 = t2.c1) ORDER BY t1.c3, t1.c1 OFFSET 100 LIMIT 10;
 -- subquery
 SELECT * FROM ft1 t1 WHERE t1.c3 IN (SELECT c3 FROM ft2 t2 WHERE c1 <= 10) ORDER BY c1;
 -- subquery+MAX
@@ -216,6 +268,82 @@ SELECT * FROM ft1 WHERE c1 = ANY (ARRAY(SELECT c1 FROM ft2 WHERE c1 < 5));
 SELECT * FROM ft2 WHERE c1 = ANY (ARRAY(SELECT c1 FROM ft1 WHERE c1 < 5));
 
 -- ===================================================================
+-- JOIN queries
+-- ===================================================================
+-- join two tables
+EXPLAIN (COSTS false, VERBOSE)
+SELECT t1.c1, t2.c1 FROM ft1 t1 JOIN ft2 t2 ON (t1.c1 = t2.c1) ORDER BY t1.c3, t1.c1 OFFSET 100 LIMIT 10;
+SELECT t1.c1, t2.c1 FROM ft1 t1 JOIN ft2 t2 ON (t1.c1 = t2.c1) ORDER BY t1.c3, t1.c1 OFFSET 100 LIMIT 10;
+-- join three tables
+EXPLAIN (COSTS false, VERBOSE)
+SELECT t1.c1, t2.c1, t3.c1 FROM ft1 t1 JOIN ft2 t2 ON (t1.c1 = t2.c1) JOIN ft4 t3 ON (t3.c1 = t1.c1) ORDER BY t1.c3, t1.c1 OFFSET 10 LIMIT 10;
+SELECT t1.c1, t2.c1, t3.c1 FROM ft1 t1 JOIN ft2 t2 ON (t1.c1 = t2.c1) JOIN ft4 t3 ON (t3.c1 = t1.c1) ORDER BY t1.c3, t1.c1 OFFSET 10 LIMIT 10;
+-- left outer join
+EXPLAIN (COSTS false, VERBOSE)
+SELECT t1.c1, t2.c1 FROM ft4 t1 LEFT JOIN ft5 t2 ON (t1.c1 = t2.c1) ORDER BY t1.c1, t2.c1 OFFSET 10 LIMIT 10;
+SELECT t1.c1, t2.c1 FROM ft4 t1 LEFT JOIN ft5 t2 ON (t1.c1 = t2.c1) ORDER BY t1.c1, t2.c1 OFFSET 10 LIMIT 10;
+-- right outer join
+SET enable_mergejoin = off; -- palnner choose MergeJoin even it has higher costs, so disable it for testing.
+EXPLAIN (COSTS false, VERBOSE)
+SELECT t1.c1, t2.c1 FROM ft4 t1 RIGHT JOIN ft5 t2 ON (t1.c1 = t2.c1) ORDER BY t2.c1, t2.c1 OFFSET 10 LIMIT 10;
+SELECT t1.c1, t2.c1 FROM ft4 t1 RIGHT JOIN ft5 t2 ON (t1.c1 = t2.c1) ORDER BY t2.c1, t2.c1 OFFSET 10 LIMIT 10;
+SET enable_mergejoin = on;
+-- full outer join
+EXPLAIN (COSTS false, VERBOSE)
+SELECT t1.c1, t2.c1 FROM ft4 t1 FULL JOIN ft5 t2 ON (t1.c1 = t2.c1) ORDER BY t1.c1, t2.c1 OFFSET 45 LIMIT 10;
+SELECT t1.c1, t2.c1 FROM ft4 t1 FULL JOIN ft5 t2 ON (t1.c1 = t2.c1) ORDER BY t1.c1, t2.c1 OFFSET 45 LIMIT 10;
+-- full outer join + WHERE clause, only matched rows
+EXPLAIN (COSTS false, VERBOSE)
+SELECT t1.c1, t2.c1 FROM ft4 t1 FULL JOIN ft5 t2 ON (t1.c1 = t2.c1) WHERE (t1.c1 = t2.c1 OR t1.c1 IS NULL) ORDER BY t1.c1, t2.c1 OFFSET 10 LIMIT 10;
+SELECT t1.c1, t2.c1 FROM ft4 t1 FULL JOIN ft5 t2 ON (t1.c1 = t2.c1) WHERE (t1.c1 = t2.c1 OR t1.c1 IS NULL) ORDER BY t1.c1, t2.c1 OFFSET 10 LIMIT 10;
+-- join at WHERE clause 
+SET enable_mergejoin = off; -- palnner choose MergeJoin even it has higher costs, so disable it for testing.
+EXPLAIN (COSTS false, VERBOSE)
+SELECT t1.c1, t2.c1 FROM ft4 t1 LEFT JOIN ft5 t2 ON true WHERE (t1.c1 = t2.c1) ORDER BY t1.c1, t2.c1 OFFSET 10 LIMIT 10;
+SELECT t1.c1, t2.c1 FROM ft4 t1 LEFT JOIN ft5 t2 ON true WHERE (t1.c1 = t2.c1) ORDER BY t1.c1, t2.c1 OFFSET 10 LIMIT 10;
+SET enable_mergejoin = on;
+-- join in CTE
+EXPLAIN (COSTS false, VERBOSE)
+WITH t (c1_1, c1_3, c2_1) AS (SELECT t1.c1, t1.c3, t2.c1 FROM ft1 t1 JOIN ft2 t2 ON (t1.c1 = t2.c1)) SELECT c1_1, c2_1 FROM t ORDER BY c1_3, c1_1 OFFSET 100 LIMIT 10;
+WITH t (c1_1, c1_3, c2_1) AS (SELECT t1.c1, t1.c3, t2.c1 FROM ft1 t1 JOIN ft2 t2 ON (t1.c1 = t2.c1)) SELECT c1_1, c2_1 FROM t ORDER BY c1_3, c1_1 OFFSET 100 LIMIT 10;
+-- ctid with whole-row reference
+EXPLAIN (COSTS false, VERBOSE)
+SELECT t1.ctid, t1, t2, t1.c1 FROM ft1 t1 JOIN ft2 t2 ON (t1.c1 = t2.c1) ORDER BY t1.c3, t1.c1 OFFSET 100 LIMIT 10;
+SELECT t1.ctid, t1, t2, t1.c1 FROM ft1 t1 JOIN ft2 t2 ON (t1.c1 = t2.c1) ORDER BY t1.c3, t1.c1 OFFSET 100 LIMIT 10;
+-- partially unsafe to push down, not pushed down
+EXPLAIN (COSTS false, VERBOSE)
+SELECT t1.c1 FROM ft1 t1 JOIN ft2 t2 ON t2.c1 = t2.c1 JOIN ft4 t3 ON t2.c1 = t3.c1 ORDER BY t1.c1 OFFSET 10 LIMIT 10;
+SELECT t1.c1 FROM ft1 t1 JOIN ft2 t2 ON t2.c1 = t2.c1 JOIN ft4 t3 ON t2.c1 = t3.c1 ORDER BY t1.c1 OFFSET 10 LIMIT 10;
+-- SEMI JOIN, not pushed down
+EXPLAIN (COSTS false, VERBOSE)
+SELECT t1.c1 FROM ft1 t1 WHERE EXISTS (SELECT 1 FROM ft2 t2 WHERE t1.c1 = t2.c1) ORDER BY t1.c1 OFFSET 100 LIMIT 10;
+SELECT t1.c1 FROM ft1 t1 WHERE EXISTS (SELECT 1 FROM ft2 t2 WHERE t1.c1 = t2.c1) ORDER BY t1.c1 OFFSET 100 LIMIT 10;
+-- ANTI JOIN, not pushed down
+EXPLAIN (COSTS false, VERBOSE)
+SELECT t1.c1 FROM ft1 t1 WHERE NOT EXISTS (SELECT 1 FROM ft2 t2 WHERE t1.c1 = t2.c2) ORDER BY t1.c1 OFFSET 100 LIMIT 10;
+SELECT t1.c1 FROM ft1 t1 WHERE NOT EXISTS (SELECT 1 FROM ft2 t2 WHERE t1.c1 = t2.c2) ORDER BY t1.c1 OFFSET 100 LIMIT 10;
+-- CROSS JOIN, not pushed down
+EXPLAIN (COSTS false, VERBOSE)
+SELECT t1.c1, t2.c1 FROM ft1 t1 CROSS JOIN ft2 t2 ORDER BY t1.c1, t2.c1 OFFSET 100 LIMIT 10;
+SELECT t1.c1, t2.c1 FROM ft1 t1 CROSS JOIN ft2 t2 ORDER BY t1.c1, t2.c1 OFFSET 100 LIMIT 10;
+-- different server
+EXPLAIN (COSTS false, VERBOSE)
+SELECT t1.c1, t2.c1 FROM ft5 t1 JOIN ft6 t2 ON (t1.c1 = t2.c1) ORDER BY t1.c1, t2.c1 OFFSET 100 LIMIT 10;
+SELECT t1.c1, t2.c1 FROM ft5 t1 JOIN ft6 t2 ON (t1.c1 = t2.c1) ORDER BY t1.c1, t2.c1 OFFSET 100 LIMIT 10;
+-- different effective user for permission check
+EXPLAIN (COSTS false, VERBOSE)
+SELECT t1.c1, t2.c1 FROM ft5 t1 JOIN v_ft5 t2 ON (t1.c1 = t2.c1) ORDER BY t1.c1, t2.c1 OFFSET 100 LIMIT 10;
+SELECT t1.c1, t2.c1 FROM ft5 t1 JOIN v_ft5 t2 ON (t1.c1 = t2.c1) ORDER BY t1.c1, t2.c1 OFFSET 100 LIMIT 10;
+-- unsafe join conditions
+EXPLAIN (COSTS false, VERBOSE)
+SELECT t1.c1, t2.c1 FROM ft1 t1 JOIN ft2 t2 ON (t1.c8 = t2.c8) ORDER BY t1.c3, t1.c1 OFFSET 100 LIMIT 10;
+SELECT t1.c1, t2.c1 FROM ft1 t1 JOIN ft2 t2 ON (t1.c8 = t2.c8) ORDER BY t1.c3, t1.c1 OFFSET 100 LIMIT 10;
+-- local filter (unsafe conditions on one side)
+EXPLAIN (COSTS false, VERBOSE)
+SELECT t1.c1, t2.c1 FROM ft1 t1 JOIN ft2 t2 ON (t1.c1 = t2.c1) WHERE t1.c8 = 'foo' ORDER BY t1.c3, t1.c1 OFFSET 100 LIMIT 10;
+SELECT t1.c1, t2.c1 FROM ft1 t1 JOIN ft2 t2 ON (t1.c1 = t2.c1) WHERE t1.c8 = 'foo' ORDER BY t1.c3, t1.c1 OFFSET 100 LIMIT 10;
+
+-- ===================================================================
 -- parameterized queries
 -- ===================================================================
 -- simple join
@@ -666,116 +794,6 @@ UPDATE rem1 SET f2 = 'testo';
 INSERT INTO rem1(f2) VALUES ('test') RETURNING ctid;
 
 -- ===================================================================
--- test inheritance features
--- ===================================================================
-
-CREATE TABLE a (aa TEXT);
-CREATE TABLE loct (aa TEXT, bb TEXT);
-CREATE FOREIGN TABLE b (bb TEXT) INHERITS (a)
-  SERVER loopback OPTIONS (table_name 'loct');
-
-INSERT INTO a(aa) VALUES('aaa');
-INSERT INTO a(aa) VALUES('aaaa');
-INSERT INTO a(aa) VALUES('aaaaa');
-
-INSERT INTO b(aa) VALUES('bbb');
-INSERT INTO b(aa) VALUES('bbbb');
-INSERT INTO b(aa) VALUES('bbbbb');
-
-SELECT tableoid::regclass, * FROM a;
-SELECT tableoid::regclass, * FROM b;
-SELECT tableoid::regclass, * FROM ONLY a;
-
-UPDATE a SET aa = 'zzzzzz' WHERE aa LIKE 'aaaa%';
-
-SELECT tableoid::regclass, * FROM a;
-SELECT tableoid::regclass, * FROM b;
-SELECT tableoid::regclass, * FROM ONLY a;
-
-UPDATE b SET aa = 'new';
-
-SELECT tableoid::regclass, * FROM a;
-SELECT tableoid::regclass, * FROM b;
-SELECT tableoid::regclass, * FROM ONLY a;
-
-UPDATE a SET aa = 'newtoo';
-
-SELECT tableoid::regclass, * FROM a;
-SELECT tableoid::regclass, * FROM b;
-SELECT tableoid::regclass, * FROM ONLY a;
-
-DELETE FROM a;
-
-SELECT tableoid::regclass, * FROM a;
-SELECT tableoid::regclass, * FROM b;
-SELECT tableoid::regclass, * FROM ONLY a;
-
-DROP TABLE a CASCADE;
-DROP TABLE loct;
-
--- Check SELECT FOR UPDATE/SHARE with an inherited source table
-create table loct1 (f1 int, f2 int, f3 int);
-create table loct2 (f1 int, f2 int, f3 int);
-
-create table foo (f1 int, f2 int);
-create foreign table foo2 (f3 int) inherits (foo)
-  server loopback options (table_name 'loct1');
-create table bar (f1 int, f2 int);
-create foreign table bar2 (f3 int) inherits (bar)
-  server loopback options (table_name 'loct2');
-
-insert into foo values(1,1);
-insert into foo values(3,3);
-insert into foo2 values(2,2,2);
-insert into foo2 values(4,4,4);
-insert into bar values(1,11);
-insert into bar values(2,22);
-insert into bar values(6,66);
-insert into bar2 values(3,33,33);
-insert into bar2 values(4,44,44);
-insert into bar2 values(7,77,77);
-
-explain (verbose, costs off)
-select * from bar where f1 in (select f1 from foo) for update;
-select * from bar where f1 in (select f1 from foo) for update;
-
-explain (verbose, costs off)
-select * from bar where f1 in (select f1 from foo) for share;
-select * from bar where f1 in (select f1 from foo) for share;
-
--- Check UPDATE with inherited target and an inherited source table
-explain (verbose, costs off)
-update bar set f2 = f2 + 100 where f1 in (select f1 from foo);
-update bar set f2 = f2 + 100 where f1 in (select f1 from foo);
-
-select tableoid::regclass, * from bar order by 1,2;
-
--- Check UPDATE with inherited target and an appendrel subquery
-explain (verbose, costs off)
-update bar set f2 = f2 + 100
-from
-  ( select f1 from foo union all select f1+3 from foo ) ss
-where bar.f1 = ss.f1;
-update bar set f2 = f2 + 100
-from
-  ( select f1 from foo union all select f1+3 from foo ) ss
-where bar.f1 = ss.f1;
-
-select tableoid::regclass, * from bar order by 1,2;
-
--- Test that WHERE CURRENT OF is not supported
-begin;
-declare c cursor for select * from bar where f1 = 7;
-fetch from c;
-update bar set f2 = null where current of c;
-rollback;
-
-drop table foo cascade;
-drop table bar cascade;
-drop table loct1;
-drop table loct2;
-
--- ===================================================================
 -- test IMPORT FOREIGN SCHEMA
 -- ===================================================================
 
@@ -831,3 +849,7 @@ DROP TYPE "Colors" CASCADE;
 IMPORT FOREIGN SCHEMA import_source LIMIT TO (t5)
   FROM SERVER loopback INTO import_dest5;  -- ERROR
 ROLLBACK;
+
+-- Cleanup
+DROP OWNED BY view_owner;
+DROP USER view_owner;

#19

Shigeru Hanada

shigeru.hanada@gmail.com

almost 11 years ago

In reply to: Shigeru Hanada (#18)

1 attachment(s)

Sorry, the documentation portion was not in the v8 patch. Please use the v9
patch instead.

2015-04-07 15:53 GMT+09:00 Shigeru Hanada <shigeru.hanada@gmail.com>:

Hi KaiGai-san,

Thanks for the review. Attached is the v8 patch of foreign join support for postgres_fdw.

In addition to addressing your comments, I removed the useless code that retrieved the ForeignPath from the outer/inner RelOptInfo and stored it in ForeignPath#fdw_private. postgres_fdw’s join push-down no longer depends on a ForeignPath existing under the join relation. This means we can support the case Robert mentioned before: the whole “(huge JOIN large) JOIN small” can be pushed down even if “(huge JOIN large)” is dominated by another join path.

2015-04-07 13:46 GMT+09:00 Kouhei Kaigai <kaigai@ak.jp.nec.com>:

Thanks for your dedicated efforts on the remote-join feature.
Below are the comments from my side.

* Bug - mixture of ctid system column and whole-row reference
When the "ctid" system column is required, deparseSelectSql()
adds a ctid reference at the base relation scan level.
On the other hand, a whole-row reference is transformed into
a reference to the underlying relation. This works fine if
no system column is specified. However, a system column reference
changes the tuple layout from the expected one,
which eventually leads to an error.

I found the bug too. As you suggested, deparseProjectionSql() is the place to fix it.

postgres=# select ft1.ctid,ft1 from ft1,ft2 where a=b;
ERROR: malformed record literal: "(2,2,bbb,"(0,2)")"
DETAIL: Too many columns.
CONTEXT: column "" of foreign table "foreign join"
STATEMENT: select ft1.ctid,ft1 from ft1,ft2 where a=b;

postgres=# explain verbose
select ft1.ctid,ft1 from ft1,ft2 where a=b;
QUERY PLAN
--------------------------------------------------------------------------------
Foreign Scan (cost=200.00..208.35 rows=835 width=70)
Output: ft1.ctid, ft1.*
Remote SQL: SELECT l.a1, l.a2 FROM (SELECT l.a7, l, l.a10 FROM (SELECT id a9,
a a10, atext a11, ctid a7 FROM public.t1) l) l (a1, a2, a3) INNER JOIN (SELECT
b a10 FROM public.t2) r (a1) ON ((l.a3 = r.a1))

"l" of the first SELECT represents a whole-row reference.
However, the underlying SELECT contains system columns in its target
list.

Is it possible to construct a query like this?
SELECT l.a1, l.a2 FROM (SELECT (id,a,atext), ctid) l (a1, a2) ...
^^^^^^^^^^

Yes, a simple relation reference such as "l" is not sufficient for this purpose. But putting columns in parentheses would not work when a user column is referenced in the original query.

I implemented deparseProjectionSql to use a ROW(...) expression for a whole-row reference in the target list, in addition to ordinary column references for the columns actually used and ctid.
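
For readers following along, the ROW(...) construction can be sketched outside the backend roughly as below. This is only an illustration, not the patch's code: build_wholerow_expr and ALIAS_OFFSET are hypothetical names, and the offset of 8 is an assumption inferred from the "ctid a7" alias in the plan output quoted in this message (ctid has attnum -1). The patch itself computes aliases as attnum - FirstLowInvalidHeapAttributeNumber.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Assumption: shifts attnums so that system columns also get positive aliases. */
#define ALIAS_OFFSET 8
#define TO_ALIAS(attnum) ((attnum) + ALIAS_OFFSET)

/*
 * Write "ROW(<side>.aN, ...)" for the given attribute numbers into out.
 * Returns the string length, or -1 if the buffer is too small.
 * (Hypothetical helper; the patch builds this with StringInfo instead.)
 */
static int
build_wholerow_expr(char *out, size_t outlen, char side,
                    const int *attnums, int natts)
{
    size_t used;
    int    n;
    int    i;

    assert(out != NULL && outlen > 0);

    n = snprintf(out, outlen, "ROW(");
    if (n < 0 || (size_t) n >= outlen)
        return -1;
    used = (size_t) n;

    for (i = 0; i < natts; i++)
    {
        /* Separate entries with ", " and alias each column as <side>.aN */
        n = snprintf(out + used, outlen - used, "%s%c.a%d",
                     i > 0 ? ", " : "", side, TO_ALIAS(attnums[i]));
        if (n < 0 || (size_t) n >= outlen - used)
            return -1;
        used += (size_t) n;
    }

    n = snprintf(out + used, outlen - used, ")");
    if (n < 0 || (size_t) n >= outlen - used)
        return -1;
    used += (size_t) n;

    return (int) used;
}
```

Called with side 'l' and user columns 1..3, this would produce an expression of the same shape as the ROW(l.a10, l.a11, ...) seen in the Remote SQL of the regression test.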

Please see the test case for mixed use of ctid and a whole-row reference in postgres_fdw’s regression tests. Now a whole-row reference in the remote query looks like this:

-- ctid with whole-row reference
EXPLAIN (COSTS false, VERBOSE)
SELECT t1.ctid, t1, t2 FROM ft1 t1 JOIN ft2 t2 ON (t1.c1 = t2.c1) ORDER BY t1.c3, t1.c1 OFFSET 100 LIMIT 10;

---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Limit
Output: t1.ctid, t1.*, t2.*, t1.c3, t1.c1
-> Sort
Output: t1.ctid, t1.*, t2.*, t1.c3, t1.c1
Sort Key: t1.c3, t1.c1
-> Foreign Scan
Output: t1.ctid, t1.*, t2.*, t1.c3, t1.c1
Remote SQL: SELECT l.a1, l.a2, l.a3, l.a4, r.a1 FROM (SELECT l.a7, ROW(l.a10, l.a11, l.a12, l.a13, l.a14, l.a15, l.a16, l.a17), l.a12, l.a10 FROM (SELECT "C 1" a10, c2 a11, c3 a12, c4 a13, c5 a14, c6 a15, c7 a1
(8 rows)

In fact, l.a12 and l.a10 (for t1.c3 and t1.c1) are redundant in the transferred data, but IMO this simplifies the deparsing code.

* postgresGetForeignJoinPaths()
It walks root->simple_rel_array to check whether
all the relations involved are managed by the same foreign
server with the same credentials.
There may be a more graceful way to do this. Pay attention to
the fdw_private field of innerrel/outerrel. If they have
a valid fdw_private, it means the FDW driver (postgres_fdw)
considers that all the underlying relation scans/joins can
run on the remote server.
So, all we need to check is whether the server-id and user-id
of both relations are identical or not.

Exactly. I fixed the code so that it no longer loops.
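
A minimal sketch of the simplified check described above, assuming each side's planner info already records which server and user it was planned for. RemoteRelInfo and can_join_remotely are hypothetical stand-ins for illustration; the patch keeps this information in PgFdwRelationInfo (server, userid) hanging off RelOptInfo.fdw_private.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef unsigned int Oid;       /* stand-in for PostgreSQL's Oid type */

/* Hypothetical, trimmed-down stand-in for the per-relation FDW info. */
typedef struct RemoteRelInfo
{
    Oid serverid;               /* foreign server the relation lives on */
    Oid userid;                 /* user whose mapping is used to connect */
} RemoteRelInfo;

/*
 * A join is a push-down candidate only if both sides already carry valid
 * FDW info (non-NULL here) and agree on server and user.  No walk over
 * all base relations is needed, because each side's info was validated
 * when that side was planned.
 */
static bool
can_join_remotely(const RemoteRelInfo *outer, const RemoteRelInfo *inner)
{
    if (outer == NULL || inner == NULL)
        return false;

    return outer->serverid == inner->serverid &&
           outer->userid == inner->userid;
}
```

A NULL argument models a side whose fdw_private was never set, for example a local table or a join that was itself rejected for push-down.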

* merge_fpinfo()
It seems to me fpinfo->rows should be joinrel->rows, and
fpinfo->width should also be joinrel->width.
No need for special intelligence here, is there?

Oops. They are a vestige of my struggle, during which I disabled the SELECT clause optimization (omitting unused columns). Now width and rows are inherited from joinrel. Besides that, fdw_startup_cost and fdw_tuple_cost seemed wrong, so I changed them to use a simple sum rather than an average.

* explain output

The EXPLAIN output may be insufficient to tell what it
is actually trying to do.

postgres=# explain select * from ft1,ft2 where a = b;
QUERY PLAN
--------------------------------------------------------
Foreign Scan (cost=200.00..212.80 rows=1280 width=80)
(1 row)

Even though it is not an immediate request, it seems to me
better to show users the joined relations and the remote ON/WHERE
clauses here.

Like this?

Foreign Scan on ft1 INNER JOIN ft2 ON ft1.a = ft2.b (cost=200.00..212.80 rows=1280 width=80)
…

It might produce a very long line when joining many tables, because it contains most of the remote query other than the SELECT clause, but I prefer the detailed form. Another idea is to print “Join Cond” and “Remote Filter” as separate EXPLAIN items.
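
To make the proposed node label concrete, here is a small sketch of how such a line could be assembled. format_join_label is a hypothetical helper written for this illustration only; it does not correspond to any function in the patch, which does not implement this change.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Format "Foreign Scan on <outer> <jointype> JOIN <inner> ON <cond>".
 * Returns the string length, or -1 if the buffer is too small.
 * (Hypothetical helper for illustration only.)
 */
static int
format_join_label(char *out, size_t outlen,
                  const char *outer, const char *jointype,
                  const char *inner, const char *cond)
{
    int n;

    assert(out != NULL && outlen > 0);

    n = snprintf(out, outlen, "Foreign Scan on %s %s JOIN %s ON %s",
                 outer, jointype, inner, cond);
    if (n < 0 || (size_t) n >= outlen)
        return -1;
    return n;
}
```

With many joined tables the label grows with every relation and condition, which is the trade-off against separate “Join Cond” and “Remote Filter” items.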

Note that the v8 patch doesn’t contain this change yet!

--
Shigeru HANADA

Attachments:

foreign_join_v9.patch (application/octet-stream)

diff --git a/contrib/postgres_fdw/deparse.c b/contrib/postgres_fdw/deparse.c
index 94fab18..0b29159 100644
--- a/contrib/postgres_fdw/deparse.c
+++ b/contrib/postgres_fdw/deparse.c
@@ -44,8 +44,11 @@
 #include "catalog/pg_proc.h"
 #include "catalog/pg_type.h"
 #include "commands/defrem.h"
+#include "nodes/makefuncs.h"
 #include "nodes/nodeFuncs.h"
+#include "nodes/plannodes.h"
 #include "optimizer/clauses.h"
+#include "optimizer/prep.h"
 #include "optimizer/var.h"
 #include "parser/parsetree.h"
 #include "utils/builtins.h"
@@ -89,6 +92,8 @@ typedef struct deparse_expr_cxt
 	RelOptInfo *foreignrel;		/* the foreign relation we are planning for */
 	StringInfo	buf;			/* output buffer to append to */
 	List	  **params_list;	/* exprs that will become remote Params */
+	List	   *outertlist;		/* outer child's target list */
+	List	   *innertlist;		/* inner child's target list */
 } deparse_expr_cxt;
 
 /*
@@ -137,12 +142,19 @@ static void printRemoteParam(int paramindex, Oid paramtype, int32 paramtypmod,
 static void printRemotePlaceholder(Oid paramtype, int32 paramtypmod,
 					   deparse_expr_cxt *context);
 
+/*
+ * convert absolute attnum to relative one.  This would be handy for handling
+ * attnum for attrs_used and column aliases.
+ */
+#define TO_RELATIVE(x)	((x) - FirstLowInvalidHeapAttributeNumber)
+
 
 /*
  * Examine each qual clause in input_conds, and classify them into two groups,
  * which are returned as two lists:
  *	- remote_conds contains expressions that can be evaluated remotely
  *	- local_conds contains expressions that can't be evaluated remotely
+ * Note that each element is an Expr, which was stripped from its RestrictInfo.
  */
 void
 classifyConditions(PlannerInfo *root,
@@ -161,9 +173,9 @@ classifyConditions(PlannerInfo *root,
 		RestrictInfo *ri = (RestrictInfo *) lfirst(lc);
 
 		if (is_foreign_expr(root, baserel, ri->clause))
-			*remote_conds = lappend(*remote_conds, ri);
+			*remote_conds = lappend(*remote_conds, ri->clause);
 		else
-			*local_conds = lappend(*local_conds, ri);
+			*local_conds = lappend(*local_conds, ri->clause);
 	}
 }
 
@@ -250,7 +262,7 @@ foreign_expr_walker(Node *node,
 				 * Param's collation, ie it's not safe for it to have a
 				 * non-default collation.
 				 */
-				if (var->varno == glob_cxt->foreignrel->relid &&
+				if (bms_is_member(var->varno, glob_cxt->foreignrel->relids) &&
 					var->varlevelsup == 0)
 				{
 					/* Var belongs to foreign table */
@@ -681,12 +693,57 @@ deparseSelectSql(StringInfo buf,
 				 PlannerInfo *root,
 				 RelOptInfo *baserel,
 				 Bitmapset *attrs_used,
+				 List *remote_conds,
+				 List **params_list,
+				 List **fdw_ps_tlist,
 				 List **retrieved_attrs)
 {
+	PgFdwRelationInfo  *fpinfo = (PgFdwRelationInfo *) baserel->fdw_private;
 	RangeTblEntry *rte = planner_rt_fetch(baserel->relid, root);
 	Relation	rel;
 
 	/*
+	 * If given relation was a join relation, recursively construct statement
+	 * by putting each outer and inner relations in FROM clause as a subquery
+	 * with aliasing.
+	 */
+	if (baserel->reloptkind == RELOPT_JOINREL)
+	{
+		RelOptInfo		   *rel_o = fpinfo->outerrel;
+		RelOptInfo		   *rel_i = fpinfo->innerrel;
+		PgFdwRelationInfo  *fpinfo_o = (PgFdwRelationInfo *) rel_o->fdw_private;
+		PgFdwRelationInfo  *fpinfo_i = (PgFdwRelationInfo *) rel_i->fdw_private;
+		StringInfoData		sql_o;
+		StringInfoData		sql_i;
+		List			   *ret_attrs_tmp;	/* not used */
+
+		/*
+		 * Deparse query for outer and inner relation, and combine them into
+		 * a query.
+		 */
+		initStringInfo(&sql_o);
+		deparseSelectSql(&sql_o, root, rel_o, fpinfo_o->attrs_used,
+						 fpinfo_o->remote_conds, params_list,
+						 fdw_ps_tlist, &ret_attrs_tmp);
+		initStringInfo(&sql_i);
+		deparseSelectSql(&sql_i, root, rel_i, fpinfo_i->attrs_used,
+						 fpinfo_i->remote_conds, params_list,
+						 fdw_ps_tlist, &ret_attrs_tmp);
+
+		deparseJoinSql(buf, root, baserel,
+					   fpinfo->outerrel,
+					   fpinfo->innerrel,
+					   sql_o.data,
+					   sql_i.data,
+					   fpinfo->jointype,
+					   fpinfo->joinclauses,
+					   fpinfo->otherclauses,
+					   fdw_ps_tlist,
+					   retrieved_attrs);
+		return;
+	}
+
+	/*
 	 * Core code already has some lock on each rel being planned, so we can
 	 * use NoLock here.
 	 */
@@ -705,6 +762,65 @@ deparseSelectSql(StringInfo buf,
 	appendStringInfoString(buf, " FROM ");
 	deparseRelation(buf, rel);
 
+	/*
+	 * Construct WHERE clause
+	 */
+	if (remote_conds)
+		appendConditions(buf, root, baserel, NULL, NULL, remote_conds,
+						 " WHERE ", params_list);
+
+	/*
+	 * Add FOR UPDATE/SHARE if appropriate.  We apply locking during the
+	 * initial row fetch, rather than later on as is done for local tables.
+	 * The extra roundtrips involved in trying to duplicate the local
+	 * semantics exactly don't seem worthwhile (see also comments for
+	 * RowMarkType).
+	 *
+	 * Note: because we actually run the query as a cursor, this assumes
+	 * that DECLARE CURSOR ... FOR UPDATE is supported, which it isn't
+	 * before 8.3.
+	 */
+	if (baserel->relid == root->parse->resultRelation &&
+		(root->parse->commandType == CMD_UPDATE ||
+		 root->parse->commandType == CMD_DELETE))
+	{
+		/* Relation is UPDATE/DELETE target, so use FOR UPDATE */
+		appendStringInfoString(buf, " FOR UPDATE");
+	}
+	else
+	{
+		PlanRowMark *rc = get_plan_rowmark(root->rowMarks, baserel->relid);
+
+		if (rc)
+		{
+			/*
+			 * Relation is specified as a FOR UPDATE/SHARE target, so handle
+			 * that.  (But we could also see LCS_NONE, meaning this isn't a
+			 * target relation after all.)
+			 *
+			 * For now, just ignore any [NO] KEY specification, since (a)
+			 * it's not clear what that means for a remote table that we
+			 * don't have complete information about, and (b) it wouldn't
+			 * work anyway on older remote servers.  Likewise, we don't
+			 * worry about NOWAIT.
+			 */
+			switch (rc->strength)
+			{
+				case LCS_NONE:
+					/* No locking needed */
+					break;
+				case LCS_FORKEYSHARE:
+				case LCS_FORSHARE:
+					appendStringInfoString(buf, " FOR SHARE");
+					break;
+				case LCS_FORNOKEYUPDATE:
+				case LCS_FORUPDATE:
+					appendStringInfoString(buf, " FOR UPDATE");
+					break;
+			}
+		}
+	}
+
 	heap_close(rel, NoLock);
 }
 
@@ -731,8 +847,7 @@ deparseTargetList(StringInfo buf,
 	*retrieved_attrs = NIL;
 
 	/* If there's a whole-row reference, we'll need all the columns. */
-	have_wholerow = bms_is_member(0 - FirstLowInvalidHeapAttributeNumber,
-								  attrs_used);
+	have_wholerow = bms_is_member(TO_RELATIVE(0), attrs_used);
 
 	first = true;
 	for (i = 1; i <= tupdesc->natts; i++)
@@ -743,15 +858,14 @@ deparseTargetList(StringInfo buf,
 		if (attr->attisdropped)
 			continue;
 
-		if (have_wholerow ||
-			bms_is_member(i - FirstLowInvalidHeapAttributeNumber,
-						  attrs_used))
+		if (have_wholerow || bms_is_member(TO_RELATIVE(i), attrs_used))
 		{
 			if (!first)
 				appendStringInfoString(buf, ", ");
 			first = false;
 
 			deparseColumnRef(buf, rtindex, i, root);
+			appendStringInfo(buf, " a%d", TO_RELATIVE(i));
 
 			*retrieved_attrs = lappend_int(*retrieved_attrs, i);
 		}
@@ -761,17 +875,17 @@ deparseTargetList(StringInfo buf,
 	 * Add ctid if needed.  We currently don't support retrieving any other
 	 * system columns.
 	 */
-	if (bms_is_member(SelfItemPointerAttributeNumber - FirstLowInvalidHeapAttributeNumber,
-					  attrs_used))
+	if (bms_is_member(TO_RELATIVE(SelfItemPointerAttributeNumber), attrs_used))
 	{
 		if (!first)
 			appendStringInfoString(buf, ", ");
 		first = false;
 
-		appendStringInfoString(buf, "ctid");
+		appendStringInfo(buf, "ctid a%d",
+						 TO_RELATIVE(SelfItemPointerAttributeNumber));
 
 		*retrieved_attrs = lappend_int(*retrieved_attrs,
-									   SelfItemPointerAttributeNumber);
+										   SelfItemPointerAttributeNumber);
 	}
 
 	/* Don't generate bad syntax if no undropped columns */
@@ -780,7 +894,8 @@ deparseTargetList(StringInfo buf,
 }
 
 /*
- * Deparse WHERE clauses in given list of RestrictInfos and append them to buf.
+ * Deparse conditions, such as WHERE clause and ON clause of JOIN, in given
+ * list of Expr and append them to buf.
  *
  * baserel is the foreign table we're planning for.
  *
@@ -794,12 +909,14 @@ deparseTargetList(StringInfo buf,
  * so Params and other-relation Vars should be replaced by dummy values.
  */
 void
-appendWhereClause(StringInfo buf,
-				  PlannerInfo *root,
-				  RelOptInfo *baserel,
-				  List *exprs,
-				  bool is_first,
-				  List **params)
+appendConditions(StringInfo buf,
+				 PlannerInfo *root,
+				 RelOptInfo *baserel,
+				 List *outertlist,
+				 List *innertlist,
+				 List *exprs,
+				 const char *prefix,
+				 List **params)
 {
 	deparse_expr_cxt context;
 	int			nestlevel;
@@ -813,31 +930,319 @@ appendWhereClause(StringInfo buf,
 	context.foreignrel = baserel;
 	context.buf = buf;
 	context.params_list = params;
+	context.outertlist = outertlist;
+	context.innertlist = innertlist;
 
 	/* Make sure any constants in the exprs are printed portably */
 	nestlevel = set_transmission_modes();
 
 	foreach(lc, exprs)
 	{
-		RestrictInfo *ri = (RestrictInfo *) lfirst(lc);
+		Expr	   *expr = (Expr *) lfirst(lc);
 
 		/* Connect expressions with "AND" and parenthesize each condition. */
-		if (is_first)
-			appendStringInfoString(buf, " WHERE ");
-		else
-			appendStringInfoString(buf, " AND ");
+		if (prefix)
+			appendStringInfo(buf, "%s", prefix);
 
 		appendStringInfoChar(buf, '(');
-		deparseExpr(ri->clause, &context);
+		deparseExpr(expr, &context);
 		appendStringInfoChar(buf, ')');
 
-		is_first = false;
+		prefix= " AND ";
 	}
 
 	reset_transmission_modes(nestlevel);
 }
 
 /*
+ * Returns position index (start with 1) of given var in given target list, or
+ * 0 when not found.
+ */
+static int
+find_var_pos(Var *node, List *tlist)
+{
+	int		pos = 1;
+	ListCell *lc;
+
+	foreach(lc, tlist)
+	{
+		Var *var = (Var *) lfirst(lc);
+
+		if (equal(var, node))
+		{
+			return pos;
+		}
+		pos++;
+	}
+
+	return 0;
+}
+
+/*
+ * Deparse given Var into buf.
+ */
+static void
+deparseJoinVar(Var *node, deparse_expr_cxt *context)
+{
+	char		side;
+	int			pos;
+
+	pos = find_var_pos(node, context->outertlist);
+	if (pos > 0)
+		side = 'l';
+	else
+	{
+		side = 'r';
+		pos = find_var_pos(node, context->innertlist);
+	}
+	Assert(pos > 0);
+	Assert(side == 'l' || side == 'r');
+
+	/*
+	 * We treat whole-row reference same as ordinary attribute references,
+	 * because such transformation should be done in lower level.
+	 */
+	appendStringInfo(context->buf, "%c.a%d", side, pos);
+}
+
+/*
+ * Deparse column alias list for a subquery in FROM clause.
+ */
+static void
+deparseColumnAliases(StringInfo buf, List *tlist)
+{
+	int			pos;
+	ListCell   *lc;
+
+	pos = 1;
+	foreach(lc, tlist)
+	{
+		/* Deparse column alias for the subquery */
+		if (pos > 1)
+			appendStringInfoString(buf, ", ");
+		appendStringInfo(buf, "a%d", pos);
+		pos++;
+	}
+}
+
+/*
+ * Deparse "wrapper" SQL for a query which contains whole-row reference or ctid.
+ * If the SQL is simple enough, just return it.
+ *
+ * Projecting a whole-row value from individual column values is done by
+ * ExecProjection for the results of a scan against an ordinary table, but join
+ * push-down omits ExecProjection calls, so we need to do it in the remote SQL.
+ */
+static const char *
+deparseProjectionSql(PlannerInfo *root,
+					 RelOptInfo *baserel,
+					 const char *sql,
+					 char side)
+{
+	StringInfoData wholerow;
+	StringInfoData buf;
+	ListCell   *lc;
+	bool		first;
+	bool		have_wholerow = false;
+	bool		have_ctid = false;
+
+	/*
+	 * We have nothing to do if the targetlist contains no special reference,
+	 * such as whole-row and ctid.
+	 */
+	foreach(lc, baserel->reltargetlist)
+	{
+		Var		   *var = (Var *) lfirst(lc);
+		if (var->varattno == 0)
+		{
+			have_wholerow = true;
+			if (have_ctid)
+				break;
+		}
+		else if (var->varattno == SelfItemPointerAttributeNumber)
+		{
+			have_ctid = true;
+			if (have_wholerow)
+				break;
+		}
+	}
+	if (!have_wholerow && !have_ctid)
+		return sql;
+
+	/*
+	 * Construct whole-row reference with ROW() syntax
+	 */
+	if (have_wholerow)
+	{
+		RangeTblEntry *rte;
+		Relation		rel;
+		TupleDesc		tupdesc;
+		int				i;
+
+		/* Obtain TupleDesc for deparsing all valid columns */
+		rte = planner_rt_fetch(baserel->relid, root);
+		rel = heap_open(rte->relid, NoLock);
+		tupdesc = rel->rd_att;
+
+		/* Print all valid columns in ROW() to generate whole-row value */
+		initStringInfo(&wholerow);
+		appendStringInfoString(&wholerow, "ROW(");
+		first = true;
+		for (i = 1; i <= tupdesc->natts; i++)
+		{
+			Form_pg_attribute attr = tupdesc->attrs[i - 1];
+
+			/* Ignore dropped columns. */
+			if (attr->attisdropped)
+				continue;
+
+			if (!first)
+				appendStringInfoString(&wholerow, ", ");
+			first = false;
+
+			appendStringInfo(&wholerow, "%c.a%d", side, TO_RELATIVE(i));
+		}
+		appendStringInfoString(&wholerow, ")");
+
+		heap_close(rel, NoLock);
+	}
+
+	/*
+	 * Construct a SELECT statement which has the original query in its FROM
+	 * clause and the target list entries in its SELECT clause.  The numbers
+	 * used in column aliases are attnum - FirstLowInvalidHeapAttributeNumber,
+	 * which keeps every number positive even for system columns, whose attnum
+	 * is negative.
+	 */
+	initStringInfo(&buf);
+	appendStringInfoString(&buf, "SELECT ");
+	first = true;
+	foreach(lc, baserel->reltargetlist)
+	{
+		Var *var = (Var *) lfirst(lc);
+
+		if (!first)
+			appendStringInfoString(&buf, ", ");
+
+		if (var->varattno == 0)
+			appendStringInfo(&buf, "%s", wholerow.data);
+		else
+			appendStringInfo(&buf, "%c.a%d", side, TO_RELATIVE(var->varattno));
+
+		first = false;
+	}
+	appendStringInfo(&buf, " FROM (%s) %c", sql, side);
+
+	return buf.data;
+}
+
+/*
+ * Construct a SELECT statement which contains a join clause.
+ *
+ * We also create a TargetEntry list of the columns being retrieved, which is
+ * returned in *fdw_ps_tlist.
+ *
+ * outerrel and sql_o are the relation and the remote query statement of the
+ * outer child relation; innerrel and sql_i are those of the inner child
+ * relation.  jointype and joinclauses describe the join itself.  fdw_ps_tlist
+ * is an output parameter that passes the target list of the pseudo scan back
+ * to the caller.
+ */
+void
+deparseJoinSql(StringInfo buf,
+			   PlannerInfo *root,
+			   RelOptInfo *baserel,
+			   RelOptInfo *outerrel,
+			   RelOptInfo *innerrel,
+			   const char *sql_o,
+			   const char *sql_i,
+			   JoinType jointype,
+			   List *joinclauses,
+			   List *otherclauses,
+			   List **fdw_ps_tlist,
+			   List **retrieved_attrs)
+{
+	StringInfoData selbuf;		/* buffer for SELECT clause */
+	StringInfoData abuf_o;		/* buffer for column alias list of outer */
+	StringInfoData abuf_i;		/* buffer for column alias list of inner */
+	int			i;
+	ListCell   *lc;
+	const char *jointype_str;
+	deparse_expr_cxt context;
+
+	context.root = root;
+	context.foreignrel = baserel;
+	context.buf = &selbuf;
+	context.params_list = NULL;
+	context.outertlist = outerrel->reltargetlist;
+	context.innertlist = innerrel->reltargetlist;
+
+	jointype_str = jointype == JOIN_INNER ? "INNER" :
+				   jointype == JOIN_LEFT ? "LEFT" :
+				   jointype == JOIN_RIGHT ? "RIGHT" :
+				   jointype == JOIN_FULL ? "FULL" : "";
+
+	*retrieved_attrs = NIL;
+
+	/* print SELECT clause of the join scan */
+	initStringInfo(&selbuf);
+	i = 0;
+	foreach(lc, baserel->reltargetlist)
+	{
+		Var		   *var = (Var *) lfirst(lc);
+		TargetEntry *tle;
+
+		if (i > 0)
+			appendStringInfoString(&selbuf, ", ");
+		deparseJoinVar(var, &context);
+
+		tle = makeTargetEntry((Expr *) copyObject(var),
+							  i + 1, pstrdup(""), false);
+		if (fdw_ps_tlist)
+			*fdw_ps_tlist = lappend(*fdw_ps_tlist, copyObject(tle));
+
+		*retrieved_attrs = lappend_int(*retrieved_attrs, i + 1);
+
+		i++;
+	}
+	if (i == 0)
+		appendStringInfoString(&selbuf, "NULL");
+
+	/*
+	 * Do pseudo-projection for an underlying scan on a foreign table, if a) the
+	 * relation is a base relation, and b) its targetlist contains a whole-row
+	 * reference or ctid (deparseProjectionSql checks the latter condition).
+	 */
+	if (outerrel->reloptkind == RELOPT_BASEREL)
+		sql_o = deparseProjectionSql(root, outerrel, sql_o, 'l');
+	if (innerrel->reloptkind == RELOPT_BASEREL)
+		sql_i = deparseProjectionSql(root, innerrel, sql_i, 'r');
+
+	/* Deparse column alias portion of subquery in FROM clause. */
+	initStringInfo(&abuf_o);
+	deparseColumnAliases(&abuf_o, outerrel->reltargetlist);
+	initStringInfo(&abuf_i);
+	deparseColumnAliases(&abuf_i, innerrel->reltargetlist);
+
+	/* Construct SELECT statement */
+	appendStringInfo(buf, "SELECT %s FROM", selbuf.data);
+	appendStringInfo(buf, " (%s) l (%s) %s JOIN (%s) r (%s)",
+					 sql_o, abuf_o.data, jointype_str, sql_i, abuf_i.data);
+	/* Append ON clause */
+	if (joinclauses)
+		appendConditions(buf, root, baserel,
+						 outerrel->reltargetlist, innerrel->reltargetlist,
+						 joinclauses,
+						 " ON ", NULL);
+	/* Append WHERE clause */
+	if (otherclauses)
+		appendConditions(buf, root, baserel,
+						 outerrel->reltargetlist, innerrel->reltargetlist,
+						 otherclauses,
+						 " WHERE ", NULL);
+}
+
+/*
  * deparse remote INSERT statement
  *
  * The statement text is appended to buf, and we also create an integer List
@@ -976,8 +1381,7 @@ deparseReturningList(StringInfo buf, PlannerInfo *root,
 	if (trig_after_row)
 	{
 		/* whole-row reference acquires all non-system columns */
-		attrs_used =
-			bms_make_singleton(0 - FirstLowInvalidHeapAttributeNumber);
+		attrs_used = bms_make_singleton(TO_RELATIVE(0));
 	}
 
 	if (returningList != NIL)
@@ -1261,6 +1665,8 @@ deparseExpr(Expr *node, deparse_expr_cxt *context)
 /*
  * Deparse given Var node into context->buf.
  *
+ * If the foreign relation is a join relation, delegate to deparseJoinVar.
+ *
  * If the Var belongs to the foreign relation, just print its remote name.
  * Otherwise, it's effectively a Param (and will in fact be a Param at
  * run time).  Handle it the same way we handle plain Params --- see
@@ -1271,39 +1677,46 @@ deparseVar(Var *node, deparse_expr_cxt *context)
 {
 	StringInfo	buf = context->buf;
 
-	if (node->varno == context->foreignrel->relid &&
-		node->varlevelsup == 0)
+	if (context->foreignrel->reloptkind == RELOPT_JOINREL)
 	{
-		/* Var belongs to foreign table */
-		deparseColumnRef(buf, node->varno, node->varattno, context->root);
+		deparseJoinVar(node, context);
 	}
 	else
 	{
-		/* Treat like a Param */
-		if (context->params_list)
+		if (node->varno == context->foreignrel->relid &&
+			node->varlevelsup == 0)
 		{
-			int			pindex = 0;
-			ListCell   *lc;
-
-			/* find its index in params_list */
-			foreach(lc, *context->params_list)
+			/* Var belongs to foreign table */
+			deparseColumnRef(buf, node->varno, node->varattno, context->root);
+		}
+		else
+		{
+			/* Treat like a Param */
+			if (context->params_list)
 			{
-				pindex++;
-				if (equal(node, (Node *) lfirst(lc)))
-					break;
+				int			pindex = 0;
+				ListCell   *lc;
+
+				/* find its index in params_list */
+				foreach(lc, *context->params_list)
+				{
+					pindex++;
+					if (equal(node, (Node *) lfirst(lc)))
+						break;
+				}
+				if (lc == NULL)
+				{
+					/* not in list, so add it */
+					pindex++;
+					*context->params_list = lappend(*context->params_list, node);
+				}
+
+				printRemoteParam(pindex, node->vartype, node->vartypmod, context);
 			}
-			if (lc == NULL)
+			else
 			{
-				/* not in list, so add it */
-				pindex++;
-				*context->params_list = lappend(*context->params_list, node);
+				printRemotePlaceholder(node->vartype, node->vartypmod, context);
 			}
-
-			printRemoteParam(pindex, node->vartype, node->vartypmod, context);
-		}
-		else
-		{
-			printRemotePlaceholder(node->vartype, node->vartypmod, context);
 		}
 	}
 }
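For readers following the deparse.c changes above, here is a minimal standalone sketch of the string shape that deparseJoinSql and deparseColumnAliases produce: each child query becomes a subquery aliased "l" or "r", with a positional column alias list a1..aN. It does not use PostgreSQL's StringInfo or planner structures. The names sketch_column_aliases and sketch_join_sql, the fixed-size buffers, and the pre-rendered select list and ON clause are assumptions made only for this illustration, not part of the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: write "a1, a2, ..., aN" into out, mirroring what
 * deparseColumnAliases does for a target list of ncols entries.
 */
static void
sketch_column_aliases(char *out, size_t outlen, int ncols)
{
	size_t		used = 0;
	int			pos;

	assert(outlen > 0);
	out[0] = '\0';
	for (pos = 1; pos <= ncols && used < outlen; pos++)
	{
		int			n = snprintf(out + used, outlen - used, "%sa%d",
								 pos > 1 ? ", " : "", pos);

		if (n < 0)
			break;
		used += (size_t) n;
	}
}

/*
 * Hypothetical helper: assemble the join query in the same layout as the
 * patch: SELECT ... FROM (outer) l (aliases) <type> JOIN (inner) r (aliases),
 * optionally followed by an ON clause.  Returns the length written, or -1
 * if the output buffer is too small.
 */
static int
sketch_join_sql(char *out, size_t outlen,
				const char *select_list,
				const char *sql_o, int ncols_o,
				const char *sql_i, int ncols_i,
				const char *jointype_str,
				const char *on_clause)
{
	char		abuf_o[256];	/* assumed large enough for this sketch */
	char		abuf_i[256];
	int			n;

	sketch_column_aliases(abuf_o, sizeof(abuf_o), ncols_o);
	sketch_column_aliases(abuf_i, sizeof(abuf_i), ncols_i);

	n = snprintf(out, outlen,
				 "SELECT %s FROM (%s) l (%s) %s JOIN (%s) r (%s)",
				 select_list, sql_o, abuf_o, jointype_str, sql_i, abuf_i);
	if (n < 0 || (size_t) n >= outlen)
		return -1;

	if (on_clause != NULL)
	{
		int			m = snprintf(out + n, outlen - (size_t) n,
								 " ON %s", on_clause);

		if (m < 0 || (size_t) m >= outlen - (size_t) n)
			return -1;
		n += m;
	}
	return n;
}
```

Calling sketch_join_sql with select list "l.a1, r.a2", two-column child queries "SELECT c1, c2 FROM t3" and "SELECT c1, c2 FROM t4" (made-up table names), join type "LEFT" and ON clause "((l.a1 = r.a1))" yields "SELECT l.a1, r.a2 FROM (SELECT c1, c2 FROM t3) l (a1, a2) LEFT JOIN (SELECT c1, c2 FROM t4) r (a1, a2) ON ((l.a1 = r.a1))". Wrapping each side in an aliased subquery is what lets deparseJoinVar refer to any column as side.aN without knowing the remote column names.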
diff --git a/contrib/postgres_fdw/expected/postgres_fdw.out b/contrib/postgres_fdw/expected/postgres_fdw.out
index 783cb41..19c1115 100644
--- a/contrib/postgres_fdw/expected/postgres_fdw.out
+++ b/contrib/postgres_fdw/expected/postgres_fdw.out
@@ -9,11 +9,16 @@ DO $d$
             OPTIONS (dbname '$$||current_database()||$$',
                      port '$$||current_setting('port')||$$'
             )$$;
+        EXECUTE $$CREATE SERVER loopback2 FOREIGN DATA WRAPPER postgres_fdw
+            OPTIONS (dbname '$$||current_database()||$$',
+                     port '$$||current_setting('port')||$$'
+            )$$;
     END;
 $d$;
 CREATE USER MAPPING FOR public SERVER testserver1
 	OPTIONS (user 'value', password 'value');
 CREATE USER MAPPING FOR CURRENT_USER SERVER loopback;
+CREATE USER MAPPING FOR CURRENT_USER SERVER loopback2;
 -- ===================================================================
 -- create objects used through FDW loopback server
 -- ===================================================================
@@ -35,6 +40,18 @@ CREATE TABLE "S 1"."T 2" (
 	c2 text,
 	CONSTRAINT t2_pkey PRIMARY KEY (c1)
 );
+CREATE TABLE "S 1"."T 3" (
+	c1 int NOT NULL,
+	c2 int NOT NULL,
+	c3 text,
+	CONSTRAINT t3_pkey PRIMARY KEY (c1)
+);
+CREATE TABLE "S 1"."T 4" (
+	c1 int NOT NULL,
+	c2 int NOT NULL,
+	c4 text,
+	CONSTRAINT t4_pkey PRIMARY KEY (c1)
+);
 INSERT INTO "S 1"."T 1"
 	SELECT id,
 	       id % 10,
@@ -49,8 +66,22 @@ INSERT INTO "S 1"."T 2"
 	SELECT id,
 	       'AAA' || to_char(id, 'FM000')
 	FROM generate_series(1, 100) id;
+INSERT INTO "S 1"."T 3"
+	SELECT id,
+	       id + 1,
+	       'AAA' || to_char(id, 'FM000')
+	FROM generate_series(1, 100) id;
+DELETE FROM "S 1"."T 3" WHERE c1 % 2 != 0;	-- delete for outer join tests
+INSERT INTO "S 1"."T 4"
+	SELECT id,
+	       id + 1,
+	       'AAA' || to_char(id, 'FM000')
+	FROM generate_series(1, 100) id;
+DELETE FROM "S 1"."T 4" WHERE c1 % 3 != 0;	-- delete for outer join tests
 ANALYZE "S 1"."T 1";
 ANALYZE "S 1"."T 2";
+ANALYZE "S 1"."T 3";
+ANALYZE "S 1"."T 4";
 -- ===================================================================
 -- create foreign tables
 -- ===================================================================
@@ -78,6 +109,26 @@ CREATE FOREIGN TABLE ft2 (
 	c8 user_enum
 ) SERVER loopback;
 ALTER FOREIGN TABLE ft2 DROP COLUMN cx;
+CREATE FOREIGN TABLE ft4 (
+	c1 int NOT NULL,
+	c2 int NOT NULL,
+	c3 text
+) SERVER loopback OPTIONS (schema_name 'S 1', table_name 'T 3');
+CREATE FOREIGN TABLE ft5 (
+	c1 int NOT NULL,
+	c2 int NOT NULL,
+	c3 text
+) SERVER loopback OPTIONS (schema_name 'S 1', table_name 'T 4');
+CREATE FOREIGN TABLE ft6 (
+	c1 int NOT NULL,
+	c2 int NOT NULL,
+	c3 text
+) SERVER loopback2 OPTIONS (schema_name 'S 1', table_name 'T 4');
+CREATE USER view_owner;
+GRANT ALL ON ft5 TO view_owner;
+CREATE VIEW v_ft5 AS SELECT * FROM ft5;
+ALTER VIEW v_ft5 OWNER TO view_owner;
+CREATE USER MAPPING FOR view_owner SERVER loopback;
 -- ===================================================================
 -- tests for validator
 -- ===================================================================
@@ -119,12 +170,15 @@ ALTER FOREIGN TABLE ft2 OPTIONS (schema_name 'S 1', table_name 'T 1');
 ALTER FOREIGN TABLE ft1 ALTER COLUMN c1 OPTIONS (column_name 'C 1');
 ALTER FOREIGN TABLE ft2 ALTER COLUMN c1 OPTIONS (column_name 'C 1');
 \det+
-                             List of foreign tables
- Schema | Table |  Server  |              FDW Options              | Description 
---------+-------+----------+---------------------------------------+-------------
- public | ft1   | loopback | (schema_name 'S 1', table_name 'T 1') | 
- public | ft2   | loopback | (schema_name 'S 1', table_name 'T 1') | 
-(2 rows)
+                              List of foreign tables
+ Schema | Table |  Server   |              FDW Options              | Description 
+--------+-------+-----------+---------------------------------------+-------------
+ public | ft1   | loopback  | (schema_name 'S 1', table_name 'T 1') | 
+ public | ft2   | loopback  | (schema_name 'S 1', table_name 'T 1') | 
+ public | ft4   | loopback  | (schema_name 'S 1', table_name 'T 3') | 
+ public | ft5   | loopback  | (schema_name 'S 1', table_name 'T 4') | 
+ public | ft6   | loopback2 | (schema_name 'S 1', table_name 'T 4') | 
+(5 rows)
 
 -- Now we should be able to run ANALYZE.
 -- To exercise multiple code paths, we use local stats on ft1
@@ -160,8 +214,8 @@ SELECT * FROM ft1 ORDER BY c3, c1 OFFSET 100 LIMIT 10;
 (10 rows)
 
 EXPLAIN (VERBOSE, COSTS false) SELECT * FROM ft1 t1 ORDER BY t1.c3, t1.c1 OFFSET 100 LIMIT 10;
-                                     QUERY PLAN                                      
--------------------------------------------------------------------------------------
+                                                     QUERY PLAN                                                      
+---------------------------------------------------------------------------------------------------------------------
  Limit
    Output: c1, c2, c3, c4, c5, c6, c7, c8
    ->  Sort
@@ -169,7 +223,7 @@ EXPLAIN (VERBOSE, COSTS false) SELECT * FROM ft1 t1 ORDER BY t1.c3, t1.c1 OFFSET
          Sort Key: t1.c3, t1.c1
          ->  Foreign Scan on public.ft1 t1
                Output: c1, c2, c3, c4, c5, c6, c7, c8
-               Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1"
+               Remote SQL: SELECT "C 1" a10, c2 a11, c3 a12, c4 a13, c5 a14, c6 a15, c7 a16, c8 a17 FROM "S 1"."T 1"
 (8 rows)
 
 SELECT * FROM ft1 t1 ORDER BY t1.c3, t1.c1 OFFSET 100 LIMIT 10;
@@ -189,8 +243,8 @@ SELECT * FROM ft1 t1 ORDER BY t1.c3, t1.c1 OFFSET 100 LIMIT 10;
 
 -- whole-row reference
 EXPLAIN (VERBOSE, COSTS false) SELECT t1 FROM ft1 t1 ORDER BY t1.c3, t1.c1 OFFSET 100 LIMIT 10;
-                                     QUERY PLAN                                      
--------------------------------------------------------------------------------------
+                                                     QUERY PLAN                                                      
+---------------------------------------------------------------------------------------------------------------------
  Limit
    Output: t1.*, c3, c1
    ->  Sort
@@ -198,7 +252,7 @@ EXPLAIN (VERBOSE, COSTS false) SELECT t1 FROM ft1 t1 ORDER BY t1.c3, t1.c1 OFFSE
          Sort Key: t1.c3, t1.c1
          ->  Foreign Scan on public.ft1 t1
                Output: t1.*, c3, c1
-               Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1"
+               Remote SQL: SELECT "C 1" a10, c2 a11, c3 a12, c4 a13, c5 a14, c6 a15, c7 a16, c8 a17 FROM "S 1"."T 1"
 (8 rows)
 
 SELECT t1 FROM ft1 t1 ORDER BY t1.c3, t1.c1 OFFSET 100 LIMIT 10;
@@ -224,11 +278,11 @@ SELECT * FROM ft1 WHERE false;
 
 -- with WHERE clause
 EXPLAIN (VERBOSE, COSTS false) SELECT * FROM ft1 t1 WHERE t1.c1 = 101 AND t1.c6 = '1' AND t1.c7 >= '1';
-                                                                   QUERY PLAN                                                                   
-------------------------------------------------------------------------------------------------------------------------------------------------
+                                                                                   QUERY PLAN                                                                                   
+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
  Foreign Scan on public.ft1 t1
    Output: c1, c2, c3, c4, c5, c6, c7, c8
-   Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" WHERE ((c7 >= '1'::bpchar)) AND (("C 1" = 101)) AND ((c6 = '1'::text))
+   Remote SQL: SELECT "C 1" a10, c2 a11, c3 a12, c4 a13, c5 a14, c6 a15, c7 a16, c8 a17 FROM "S 1"."T 1" WHERE ((c7 >= '1'::bpchar)) AND (("C 1" = 101)) AND ((c6 = '1'::text))
 (3 rows)
 
 SELECT * FROM ft1 t1 WHERE t1.c1 = 101 AND t1.c6 = '1' AND t1.c7 >= '1';
@@ -239,13 +293,13 @@ SELECT * FROM ft1 t1 WHERE t1.c1 = 101 AND t1.c6 = '1' AND t1.c7 >= '1';
 
 -- with FOR UPDATE/SHARE
 EXPLAIN (VERBOSE, COSTS false) SELECT * FROM ft1 t1 WHERE c1 = 101 FOR UPDATE;
-                                                   QUERY PLAN                                                   
-----------------------------------------------------------------------------------------------------------------
+                                                                   QUERY PLAN                                                                   
+------------------------------------------------------------------------------------------------------------------------------------------------
  LockRows
    Output: c1, c2, c3, c4, c5, c6, c7, c8, t1.*
    ->  Foreign Scan on public.ft1 t1
          Output: c1, c2, c3, c4, c5, c6, c7, c8, t1.*
-         Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" WHERE (("C 1" = 101)) FOR UPDATE
+         Remote SQL: SELECT "C 1" a10, c2 a11, c3 a12, c4 a13, c5 a14, c6 a15, c7 a16, c8 a17 FROM "S 1"."T 1" WHERE (("C 1" = 101)) FOR UPDATE
 (5 rows)
 
 SELECT * FROM ft1 t1 WHERE c1 = 101 FOR UPDATE;
@@ -255,13 +309,13 @@ SELECT * FROM ft1 t1 WHERE c1 = 101 FOR UPDATE;
 (1 row)
 
 EXPLAIN (VERBOSE, COSTS false) SELECT * FROM ft1 t1 WHERE c1 = 102 FOR SHARE;
-                                                  QUERY PLAN                                                   
----------------------------------------------------------------------------------------------------------------
+                                                                  QUERY PLAN                                                                   
+-----------------------------------------------------------------------------------------------------------------------------------------------
  LockRows
    Output: c1, c2, c3, c4, c5, c6, c7, c8, t1.*
    ->  Foreign Scan on public.ft1 t1
          Output: c1, c2, c3, c4, c5, c6, c7, c8, t1.*
-         Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" WHERE (("C 1" = 102)) FOR SHARE
+         Remote SQL: SELECT "C 1" a10, c2 a11, c3 a12, c4 a13, c5 a14, c6 a15, c7 a16, c8 a17 FROM "S 1"."T 1" WHERE (("C 1" = 102)) FOR SHARE
 (5 rows)
 
 SELECT * FROM ft1 t1 WHERE c1 = 102 FOR SHARE;
@@ -277,22 +331,6 @@ SELECT COUNT(*) FROM ft1 t1;
   1000
 (1 row)
 
--- join two tables
-SELECT t1.c1 FROM ft1 t1 JOIN ft2 t2 ON (t1.c1 = t2.c1) ORDER BY t1.c3, t1.c1 OFFSET 100 LIMIT 10;
- c1  
------
- 101
- 102
- 103
- 104
- 105
- 106
- 107
- 108
- 109
- 110
-(10 rows)
-
 -- subquery
 SELECT * FROM ft1 t1 WHERE t1.c3 IN (SELECT c3 FROM ft2 t2 WHERE c1 <= 10) ORDER BY c1;
  c1 | c2 |  c3   |              c4              |            c5            | c6 |     c7     | c8  
@@ -353,153 +391,148 @@ CREATE OPERATOR === (
     NEGATOR = !==
 );
 EXPLAIN (VERBOSE, COSTS false) SELECT * FROM ft1 t1 WHERE t1.c1 = postgres_fdw_abs(t1.c2);
-                               QUERY PLAN                                
--------------------------------------------------------------------------
+                                               QUERY PLAN                                                
+---------------------------------------------------------------------------------------------------------
  Foreign Scan on public.ft1 t1
    Output: c1, c2, c3, c4, c5, c6, c7, c8
    Filter: (t1.c1 = postgres_fdw_abs(t1.c2))
-   Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1"
+   Remote SQL: SELECT "C 1" a10, c2 a11, c3 a12, c4 a13, c5 a14, c6 a15, c7 a16, c8 a17 FROM "S 1"."T 1"
 (4 rows)
 
 EXPLAIN (VERBOSE, COSTS false) SELECT * FROM ft1 t1 WHERE t1.c1 === t1.c2;
-                               QUERY PLAN                                
--------------------------------------------------------------------------
+                                               QUERY PLAN                                                
+---------------------------------------------------------------------------------------------------------
  Foreign Scan on public.ft1 t1
    Output: c1, c2, c3, c4, c5, c6, c7, c8
    Filter: (t1.c1 === t1.c2)
-   Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1"
+   Remote SQL: SELECT "C 1" a10, c2 a11, c3 a12, c4 a13, c5 a14, c6 a15, c7 a16, c8 a17 FROM "S 1"."T 1"
 (4 rows)
 
 EXPLAIN (VERBOSE, COSTS false) SELECT * FROM ft1 t1 WHERE t1.c1 = abs(t1.c2);
-                                            QUERY PLAN                                             
----------------------------------------------------------------------------------------------------
+                                                            QUERY PLAN                                                             
+-----------------------------------------------------------------------------------------------------------------------------------
  Foreign Scan on public.ft1 t1
    Output: c1, c2, c3, c4, c5, c6, c7, c8
-   Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" WHERE (("C 1" = abs(c2)))
+   Remote SQL: SELECT "C 1" a10, c2 a11, c3 a12, c4 a13, c5 a14, c6 a15, c7 a16, c8 a17 FROM "S 1"."T 1" WHERE (("C 1" = abs(c2)))
 (3 rows)
 
 EXPLAIN (VERBOSE, COSTS false) SELECT * FROM ft1 t1 WHERE t1.c1 = t1.c2;
-                                          QUERY PLAN                                          
-----------------------------------------------------------------------------------------------
+                                                          QUERY PLAN                                                          
+------------------------------------------------------------------------------------------------------------------------------
  Foreign Scan on public.ft1 t1
    Output: c1, c2, c3, c4, c5, c6, c7, c8
-   Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" WHERE (("C 1" = c2))
+   Remote SQL: SELECT "C 1" a10, c2 a11, c3 a12, c4 a13, c5 a14, c6 a15, c7 a16, c8 a17 FROM "S 1"."T 1" WHERE (("C 1" = c2))
 (3 rows)
 
 -- ===================================================================
 -- WHERE with remotely-executable conditions
 -- ===================================================================
 EXPLAIN (VERBOSE, COSTS false) SELECT * FROM ft1 t1 WHERE t1.c1 = 1;         -- Var, OpExpr(b), Const
-                                         QUERY PLAN                                          
----------------------------------------------------------------------------------------------
+                                                         QUERY PLAN                                                          
+-----------------------------------------------------------------------------------------------------------------------------
  Foreign Scan on public.ft1 t1
    Output: c1, c2, c3, c4, c5, c6, c7, c8
-   Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" WHERE (("C 1" = 1))
+   Remote SQL: SELECT "C 1" a10, c2 a11, c3 a12, c4 a13, c5 a14, c6 a15, c7 a16, c8 a17 FROM "S 1"."T 1" WHERE (("C 1" = 1))
 (3 rows)
 
 EXPLAIN (VERBOSE, COSTS false) SELECT * FROM ft1 t1 WHERE t1.c1 = 100 AND t1.c2 = 0; -- BoolExpr
-                                                  QUERY PLAN                                                  
---------------------------------------------------------------------------------------------------------------
+                                                                  QUERY PLAN                                                                  
+----------------------------------------------------------------------------------------------------------------------------------------------
  Foreign Scan on public.ft1 t1
    Output: c1, c2, c3, c4, c5, c6, c7, c8
-   Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" WHERE (("C 1" = 100)) AND ((c2 = 0))
+   Remote SQL: SELECT "C 1" a10, c2 a11, c3 a12, c4 a13, c5 a14, c6 a15, c7 a16, c8 a17 FROM "S 1"."T 1" WHERE (("C 1" = 100)) AND ((c2 = 0))
 (3 rows)
 
 EXPLAIN (VERBOSE, COSTS false) SELECT * FROM ft1 t1 WHERE c1 IS NULL;        -- NullTest
-                                           QUERY PLAN                                            
--------------------------------------------------------------------------------------------------
+                                                           QUERY PLAN                                                            
+---------------------------------------------------------------------------------------------------------------------------------
  Foreign Scan on public.ft1 t1
    Output: c1, c2, c3, c4, c5, c6, c7, c8
-   Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" WHERE (("C 1" IS NULL))
+   Remote SQL: SELECT "C 1" a10, c2 a11, c3 a12, c4 a13, c5 a14, c6 a15, c7 a16, c8 a17 FROM "S 1"."T 1" WHERE (("C 1" IS NULL))
 (3 rows)
 
 EXPLAIN (VERBOSE, COSTS false) SELECT * FROM ft1 t1 WHERE c1 IS NOT NULL;    -- NullTest
-                                             QUERY PLAN                                              
------------------------------------------------------------------------------------------------------
+                                                             QUERY PLAN                                                              
+-------------------------------------------------------------------------------------------------------------------------------------
  Foreign Scan on public.ft1 t1
    Output: c1, c2, c3, c4, c5, c6, c7, c8
-   Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" WHERE (("C 1" IS NOT NULL))
+   Remote SQL: SELECT "C 1" a10, c2 a11, c3 a12, c4 a13, c5 a14, c6 a15, c7 a16, c8 a17 FROM "S 1"."T 1" WHERE (("C 1" IS NOT NULL))
 (3 rows)
 
 EXPLAIN (VERBOSE, COSTS false) SELECT * FROM ft1 t1 WHERE round(abs(c1), 0) = 1; -- FuncExpr
-                                                     QUERY PLAN                                                      
----------------------------------------------------------------------------------------------------------------------
+                                                                     QUERY PLAN                                                                      
+-----------------------------------------------------------------------------------------------------------------------------------------------------
  Foreign Scan on public.ft1 t1
    Output: c1, c2, c3, c4, c5, c6, c7, c8
-   Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" WHERE ((round(abs("C 1"), 0) = 1::numeric))
+   Remote SQL: SELECT "C 1" a10, c2 a11, c3 a12, c4 a13, c5 a14, c6 a15, c7 a16, c8 a17 FROM "S 1"."T 1" WHERE ((round(abs("C 1"), 0) = 1::numeric))
 (3 rows)
 
 EXPLAIN (VERBOSE, COSTS false) SELECT * FROM ft1 t1 WHERE c1 = -c1;          -- OpExpr(l)
-                                             QUERY PLAN                                              
------------------------------------------------------------------------------------------------------
+                                                             QUERY PLAN                                                              
+-------------------------------------------------------------------------------------------------------------------------------------
  Foreign Scan on public.ft1 t1
    Output: c1, c2, c3, c4, c5, c6, c7, c8
-   Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" WHERE (("C 1" = (- "C 1")))
+   Remote SQL: SELECT "C 1" a10, c2 a11, c3 a12, c4 a13, c5 a14, c6 a15, c7 a16, c8 a17 FROM "S 1"."T 1" WHERE (("C 1" = (- "C 1")))
 (3 rows)
 
 EXPLAIN (VERBOSE, COSTS false) SELECT * FROM ft1 t1 WHERE 1 = c1!;           -- OpExpr(r)
-                                                QUERY PLAN                                                
-----------------------------------------------------------------------------------------------------------
+                                                                QUERY PLAN                                                                
+------------------------------------------------------------------------------------------------------------------------------------------
  Foreign Scan on public.ft1 t1
    Output: c1, c2, c3, c4, c5, c6, c7, c8
-   Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" WHERE ((1::numeric = ("C 1" !)))
+   Remote SQL: SELECT "C 1" a10, c2 a11, c3 a12, c4 a13, c5 a14, c6 a15, c7 a16, c8 a17 FROM "S 1"."T 1" WHERE ((1::numeric = ("C 1" !)))
 (3 rows)
 
 EXPLAIN (VERBOSE, COSTS false) SELECT * FROM ft1 t1 WHERE (c1 IS NOT NULL) IS DISTINCT FROM (c1 IS NOT NULL); -- DistinctExpr
-                                                                 QUERY PLAN                                                                 
---------------------------------------------------------------------------------------------------------------------------------------------
+                                                                                 QUERY PLAN                                                                                 
+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------
  Foreign Scan on public.ft1 t1
    Output: c1, c2, c3, c4, c5, c6, c7, c8
-   Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" WHERE ((("C 1" IS NOT NULL) IS DISTINCT FROM ("C 1" IS NOT NULL)))
+   Remote SQL: SELECT "C 1" a10, c2 a11, c3 a12, c4 a13, c5 a14, c6 a15, c7 a16, c8 a17 FROM "S 1"."T 1" WHERE ((("C 1" IS NOT NULL) IS DISTINCT FROM ("C 1" IS NOT NULL)))
 (3 rows)
 
 EXPLAIN (VERBOSE, COSTS false) SELECT * FROM ft1 t1 WHERE c1 = ANY(ARRAY[c2, 1, c1 + 0]); -- ScalarArrayOpExpr
-                                                        QUERY PLAN                                                         
----------------------------------------------------------------------------------------------------------------------------
+                                                                        QUERY PLAN                                                                         
+-----------------------------------------------------------------------------------------------------------------------------------------------------------
  Foreign Scan on public.ft1 t1
    Output: c1, c2, c3, c4, c5, c6, c7, c8
-   Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" WHERE (("C 1" = ANY (ARRAY[c2, 1, ("C 1" + 0)])))
+   Remote SQL: SELECT "C 1" a10, c2 a11, c3 a12, c4 a13, c5 a14, c6 a15, c7 a16, c8 a17 FROM "S 1"."T 1" WHERE (("C 1" = ANY (ARRAY[c2, 1, ("C 1" + 0)])))
 (3 rows)
 
 EXPLAIN (VERBOSE, COSTS false) SELECT * FROM ft1 t1 WHERE c1 = (ARRAY[c1,c2,3])[1]; -- ArrayRef
-                                                      QUERY PLAN                                                      
-----------------------------------------------------------------------------------------------------------------------
+                                                                      QUERY PLAN                                                                      
+------------------------------------------------------------------------------------------------------------------------------------------------------
  Foreign Scan on public.ft1 t1
    Output: c1, c2, c3, c4, c5, c6, c7, c8
-   Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" WHERE (("C 1" = ((ARRAY["C 1", c2, 3])[1])))
+   Remote SQL: SELECT "C 1" a10, c2 a11, c3 a12, c4 a13, c5 a14, c6 a15, c7 a16, c8 a17 FROM "S 1"."T 1" WHERE (("C 1" = ((ARRAY["C 1", c2, 3])[1])))
 (3 rows)
 
 EXPLAIN (VERBOSE, COSTS false) SELECT * FROM ft1 t1 WHERE c6 = E'foo''s\\bar';  -- check special chars
-                                                 QUERY PLAN                                                  
--------------------------------------------------------------------------------------------------------------
+                                                                 QUERY PLAN                                                                  
+---------------------------------------------------------------------------------------------------------------------------------------------
  Foreign Scan on public.ft1 t1
    Output: c1, c2, c3, c4, c5, c6, c7, c8
-   Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" WHERE ((c6 = E'foo''s\\bar'::text))
+   Remote SQL: SELECT "C 1" a10, c2 a11, c3 a12, c4 a13, c5 a14, c6 a15, c7 a16, c8 a17 FROM "S 1"."T 1" WHERE ((c6 = E'foo''s\\bar'::text))
 (3 rows)
 
 EXPLAIN (VERBOSE, COSTS false) SELECT * FROM ft1 t1 WHERE c8 = 'foo';  -- can't be sent to remote
-                               QUERY PLAN                                
--------------------------------------------------------------------------
+                                               QUERY PLAN                                                
+---------------------------------------------------------------------------------------------------------
  Foreign Scan on public.ft1 t1
    Output: c1, c2, c3, c4, c5, c6, c7, c8
    Filter: (t1.c8 = 'foo'::user_enum)
-   Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1"
+   Remote SQL: SELECT "C 1" a10, c2 a11, c3 a12, c4 a13, c5 a14, c6 a15, c7 a16, c8 a17 FROM "S 1"."T 1"
 (4 rows)
 
 -- parameterized remote path
 EXPLAIN (VERBOSE, COSTS false)
   SELECT * FROM ft2 a, ft2 b WHERE a.c1 = 47 AND b.c1 = a.c2;
-                                                 QUERY PLAN                                                  
--------------------------------------------------------------------------------------------------------------
- Nested Loop
+                                                                                                                                                                                                               QUERY PLAN                                                                                                                                                                                                                
+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+ Foreign Scan
    Output: a.c1, a.c2, a.c3, a.c4, a.c5, a.c6, a.c7, a.c8, b.c1, b.c2, b.c3, b.c4, b.c5, b.c6, b.c7, b.c8
-   ->  Foreign Scan on public.ft2 a
-         Output: a.c1, a.c2, a.c3, a.c4, a.c5, a.c6, a.c7, a.c8
-         Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" WHERE (("C 1" = 47))
-   ->  Foreign Scan on public.ft2 b
-         Output: b.c1, b.c2, b.c3, b.c4, b.c5, b.c6, b.c7, b.c8
-         Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" WHERE (($1::integer = "C 1"))
-(8 rows)
+   Remote SQL: SELECT l.a1, l.a2, l.a3, l.a4, l.a5, l.a6, l.a7, l.a8, r.a1, r.a2, r.a3, r.a4, r.a5, r.a6, r.a7, r.a8 FROM (SELECT "C 1" a9, c2 a10, c3 a12, c4 a13, c5 a14, c6 a15, c7 a16, c8 a17 FROM "S 1"."T 1" WHERE (("C 1" = 47))) l (a1, a2, a3, a4, a5, a6, a7, a8) INNER JOIN (SELECT "C 1" a9, c2 a10, c3 a12, c4 a13, c5 a14, c6 a15, c7 a16, c8 a17 FROM "S 1"."T 1") r (a1, a2, a3, a4, a5, a6, a7, a8) ON ((l.a2 = r.a1))
+(3 rows)
 
 SELECT * FROM ft2 a, ft2 b WHERE a.c1 = 47 AND b.c1 = a.c2;
  c1 | c2 |  c3   |              c4              |            c5            | c6 |     c7     | c8  | c1 | c2 |  c3   |              c4              |            c5            | c6 |     c7     | c8  
@@ -511,18 +544,18 @@ SELECT * FROM ft2 a, ft2 b WHERE a.c1 = 47 AND b.c1 = a.c2;
 EXPLAIN (VERBOSE, COSTS false)
   SELECT * FROM ft2 a, ft2 b
   WHERE a.c2 = 6 AND b.c1 = a.c1 AND a.c8 = 'foo' AND b.c7 = upper(a.c7);
-                                                 QUERY PLAN                                                  
--------------------------------------------------------------------------------------------------------------
+                                                                 QUERY PLAN                                                                 
+--------------------------------------------------------------------------------------------------------------------------------------------
  Nested Loop
    Output: a.c1, a.c2, a.c3, a.c4, a.c5, a.c6, a.c7, a.c8, b.c1, b.c2, b.c3, b.c4, b.c5, b.c6, b.c7, b.c8
    ->  Foreign Scan on public.ft2 a
          Output: a.c1, a.c2, a.c3, a.c4, a.c5, a.c6, a.c7, a.c8
          Filter: (a.c8 = 'foo'::user_enum)
-         Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" WHERE ((c2 = 6))
+         Remote SQL: SELECT "C 1" a9, c2 a10, c3 a12, c4 a13, c5 a14, c6 a15, c7 a16, c8 a17 FROM "S 1"."T 1" WHERE ((c2 = 6))
    ->  Foreign Scan on public.ft2 b
          Output: b.c1, b.c2, b.c3, b.c4, b.c5, b.c6, b.c7, b.c8
          Filter: (upper((a.c7)::text) = (b.c7)::text)
-         Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" WHERE (($1::integer = "C 1"))
+         Remote SQL: SELECT "C 1" a9, c2 a10, c3 a12, c4 a13, c5 a14, c6 a15, c7 a16, c8 a17 FROM "S 1"."T 1" WHERE (($1::integer = "C 1"))
 (10 rows)
 
 SELECT * FROM ft2 a, ft2 b
@@ -651,21 +684,587 @@ SELECT * FROM ft2 WHERE c1 = ANY (ARRAY(SELECT c1 FROM ft1 WHERE c1 < 5));
 (4 rows)
 
 -- ===================================================================
+-- JOIN queries
+-- ===================================================================
+-- join two tables
+EXPLAIN (COSTS false, VERBOSE)
+SELECT t1.c1, t2.c1 FROM ft1 t1 JOIN ft2 t2 ON (t1.c1 = t2.c1) ORDER BY t1.c3, t1.c1 OFFSET 100 LIMIT 10;
+                                                                                      QUERY PLAN                                                                                       
+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+ Limit
+   Output: t1.c1, t2.c1, t1.c3
+   ->  Sort
+         Output: t1.c1, t2.c1, t1.c3
+         Sort Key: t1.c3, t1.c1
+         ->  Foreign Scan
+               Output: t1.c1, t2.c1, t1.c3
+               Remote SQL: SELECT l.a1, l.a2, r.a1 FROM (SELECT "C 1" a10, c3 a12 FROM "S 1"."T 1") l (a1, a2) INNER JOIN (SELECT "C 1" a9 FROM "S 1"."T 1") r (a1) ON ((l.a1 = r.a1))
+(8 rows)
+
+SELECT t1.c1, t2.c1 FROM ft1 t1 JOIN ft2 t2 ON (t1.c1 = t2.c1) ORDER BY t1.c3, t1.c1 OFFSET 100 LIMIT 10;
+ c1  | c1  
+-----+-----
+ 101 | 101
+ 102 | 102
+ 103 | 103
+ 104 | 104
+ 105 | 105
+ 106 | 106
+ 107 | 107
+ 108 | 108
+ 109 | 109
+ 110 | 110
+(10 rows)
+
+-- join three tables
+EXPLAIN (COSTS false, VERBOSE)
+SELECT t1.c1, t2.c1, t3.c1 FROM ft1 t1 JOIN ft2 t2 ON (t1.c1 = t2.c1) JOIN ft4 t3 ON (t3.c1 = t1.c1) ORDER BY t1.c3, t1.c1 OFFSET 10 LIMIT 10;
+                                                                                                                                                   QUERY PLAN                                                                                                                                                   
+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+ Limit
+   Output: t1.c1, t2.c1, t3.c1, t1.c3
+   ->  Sort
+         Output: t1.c1, t2.c1, t3.c1, t1.c3
+         Sort Key: t1.c3, t1.c1
+         ->  Foreign Scan
+               Output: t1.c1, t2.c1, t3.c1, t1.c3
+               Remote SQL: SELECT l.a1, l.a2, l.a3, r.a1 FROM (SELECT l.a1, l.a2, r.a1 FROM (SELECT "C 1" a10, c3 a12 FROM "S 1"."T 1") l (a1, a2) INNER JOIN (SELECT "C 1" a9 FROM "S 1"."T 1") r (a1) ON ((l.a1 = r.a1))) l (a1, a2, a3) INNER JOIN (SELECT c1 a9 FROM "S 1"."T 3") r (a1) ON ((l.a1 = r.a1))
+(8 rows)
+
+SELECT t1.c1, t2.c1, t3.c1 FROM ft1 t1 JOIN ft2 t2 ON (t1.c1 = t2.c1) JOIN ft4 t3 ON (t3.c1 = t1.c1) ORDER BY t1.c3, t1.c1 OFFSET 10 LIMIT 10;
+ c1 | c1 | c1 
+----+----+----
+ 22 | 22 | 22
+ 24 | 24 | 24
+ 26 | 26 | 26
+ 28 | 28 | 28
+ 30 | 30 | 30
+ 32 | 32 | 32
+ 34 | 34 | 34
+ 36 | 36 | 36
+ 38 | 38 | 38
+ 40 | 40 | 40
+(10 rows)
+
+-- left outer join
+EXPLAIN (COSTS false, VERBOSE)
+SELECT t1.c1, t2.c1 FROM ft4 t1 LEFT JOIN ft5 t2 ON (t1.c1 = t2.c1) ORDER BY t1.c1, t2.c1 OFFSET 10 LIMIT 10;
+                                                                         QUERY PLAN                                                                          
+-------------------------------------------------------------------------------------------------------------------------------------------------------------
+ Limit
+   Output: t1.c1, t2.c1
+   ->  Sort
+         Output: t1.c1, t2.c1
+         Sort Key: t1.c1, t2.c1
+         ->  Foreign Scan
+               Output: t1.c1, t2.c1
+               Remote SQL: SELECT l.a1, r.a1 FROM (SELECT c1 a9 FROM "S 1"."T 3") l (a1) LEFT JOIN (SELECT c1 a9 FROM "S 1"."T 4") r (a1) ON ((l.a1 = r.a1))
+(8 rows)
+
+SELECT t1.c1, t2.c1 FROM ft4 t1 LEFT JOIN ft5 t2 ON (t1.c1 = t2.c1) ORDER BY t1.c1, t2.c1 OFFSET 10 LIMIT 10;
+ c1 | c1 
+----+----
+ 22 |   
+ 24 | 24
+ 26 |   
+ 28 |   
+ 30 | 30
+ 32 |   
+ 34 |   
+ 36 | 36
+ 38 |   
+ 40 |   
+(10 rows)
+
+-- right outer join
+SET enable_mergejoin = off; -- planner chooses MergeJoin even though it has a higher cost, so disable it for testing.
+EXPLAIN (COSTS false, VERBOSE)
+SELECT t1.c1, t2.c1 FROM ft4 t1 RIGHT JOIN ft5 t2 ON (t1.c1 = t2.c1) ORDER BY t2.c1, t2.c1 OFFSET 10 LIMIT 10;
+                                                                         QUERY PLAN                                                                          
+-------------------------------------------------------------------------------------------------------------------------------------------------------------
+ Limit
+   Output: t1.c1, t2.c1
+   ->  Sort
+         Output: t1.c1, t2.c1
+         Sort Key: t2.c1
+         ->  Foreign Scan
+               Output: t1.c1, t2.c1
+               Remote SQL: SELECT l.a1, r.a1 FROM (SELECT c1 a9 FROM "S 1"."T 4") l (a1) LEFT JOIN (SELECT c1 a9 FROM "S 1"."T 3") r (a1) ON ((r.a1 = l.a1))
+(8 rows)
+
+SELECT t1.c1, t2.c1 FROM ft4 t1 RIGHT JOIN ft5 t2 ON (t1.c1 = t2.c1) ORDER BY t2.c1, t2.c1 OFFSET 10 LIMIT 10;
+ c1 | c1 
+----+----
+    | 33
+ 36 | 36
+    | 39
+ 42 | 42
+    | 45
+ 48 | 48
+    | 51
+ 54 | 54
+    | 57
+ 60 | 60
+(10 rows)
+
+SET enable_mergejoin = on;
+-- full outer join
+EXPLAIN (COSTS false, VERBOSE)
+SELECT t1.c1, t2.c1 FROM ft4 t1 FULL JOIN ft5 t2 ON (t1.c1 = t2.c1) ORDER BY t1.c1, t2.c1 OFFSET 45 LIMIT 10;
+                                                                         QUERY PLAN                                                                          
+-------------------------------------------------------------------------------------------------------------------------------------------------------------
+ Limit
+   Output: t1.c1, t2.c1
+   ->  Sort
+         Output: t1.c1, t2.c1
+         Sort Key: t1.c1, t2.c1
+         ->  Foreign Scan
+               Output: t1.c1, t2.c1
+               Remote SQL: SELECT l.a1, r.a1 FROM (SELECT c1 a9 FROM "S 1"."T 3") l (a1) FULL JOIN (SELECT c1 a9 FROM "S 1"."T 4") r (a1) ON ((l.a1 = r.a1))
+(8 rows)
+
+SELECT t1.c1, t2.c1 FROM ft4 t1 FULL JOIN ft5 t2 ON (t1.c1 = t2.c1) ORDER BY t1.c1, t2.c1 OFFSET 45 LIMIT 10;
+ c1  | c1 
+-----+----
+  92 |   
+  94 |   
+  96 | 96
+  98 |   
+ 100 |   
+     |  3
+     |  9
+     | 15
+     | 21
+     | 27
+(10 rows)
+
+-- full outer join + WHERE clause, only matched rows
+EXPLAIN (COSTS false, VERBOSE)
+SELECT t1.c1, t2.c1 FROM ft4 t1 FULL JOIN ft5 t2 ON (t1.c1 = t2.c1) WHERE (t1.c1 = t2.c1 OR t1.c1 IS NULL) ORDER BY t1.c1, t2.c1 OFFSET 10 LIMIT 10;
+                                                                                              QUERY PLAN                                                                                               
+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+ Limit
+   Output: t1.c1, t2.c1
+   ->  Sort
+         Output: t1.c1, t2.c1
+         Sort Key: t1.c1, t2.c1
+         ->  Foreign Scan
+               Output: t1.c1, t2.c1
+               Remote SQL: SELECT l.a1, r.a1 FROM (SELECT c1 a9 FROM "S 1"."T 3") l (a1) FULL JOIN (SELECT c1 a9 FROM "S 1"."T 4") r (a1) ON ((l.a1 = r.a1)) WHERE (((l.a1 = r.a1) OR (l.a1 IS NULL)))
+(8 rows)
+
+SELECT t1.c1, t2.c1 FROM ft4 t1 FULL JOIN ft5 t2 ON (t1.c1 = t2.c1) WHERE (t1.c1 = t2.c1 OR t1.c1 IS NULL) ORDER BY t1.c1, t2.c1 OFFSET 10 LIMIT 10;
+ c1 | c1 
+----+----
+ 66 | 66
+ 72 | 72
+ 78 | 78
+ 84 | 84
+ 90 | 90
+ 96 | 96
+    |  3
+    |  9
+    | 15
+    | 21
+(10 rows)
+
+-- join at WHERE clause 
+SET enable_mergejoin = off; -- planner chooses MergeJoin even though it has a higher cost, so disable it for testing.
+EXPLAIN (COSTS false, VERBOSE)
+SELECT t1.c1, t2.c1 FROM ft4 t1 LEFT JOIN ft5 t2 ON true WHERE (t1.c1 = t2.c1) ORDER BY t1.c1, t2.c1 OFFSET 10 LIMIT 10;
+                                                                          QUERY PLAN                                                                          
+--------------------------------------------------------------------------------------------------------------------------------------------------------------
+ Limit
+   Output: t1.c1, t2.c1
+   ->  Sort
+         Output: t1.c1, t2.c1
+         Sort Key: t1.c1
+         ->  Foreign Scan
+               Output: t1.c1, t2.c1
+               Remote SQL: SELECT l.a1, r.a1 FROM (SELECT c1 a9 FROM "S 1"."T 3") l (a1) INNER JOIN (SELECT c1 a9 FROM "S 1"."T 4") r (a1) ON ((l.a1 = r.a1))
+(8 rows)
+
+SELECT t1.c1, t2.c1 FROM ft4 t1 LEFT JOIN ft5 t2 ON true WHERE (t1.c1 = t2.c1) ORDER BY t1.c1, t2.c1 OFFSET 10 LIMIT 10;
+ c1 | c1 
+----+----
+ 66 | 66
+ 72 | 72
+ 78 | 78
+ 84 | 84
+ 90 | 90
+ 96 | 96
+(6 rows)
+
+SET enable_mergejoin = on;
+-- join in CTE
+EXPLAIN (COSTS false, VERBOSE)
+WITH t (c1_1, c1_3, c2_1) AS (SELECT t1.c1, t1.c3, t2.c1 FROM ft1 t1 JOIN ft2 t2 ON (t1.c1 = t2.c1)) SELECT c1_1, c2_1 FROM t ORDER BY c1_3, c1_1 OFFSET 100 LIMIT 10;
+                                                                                    QUERY PLAN                                                                                     
+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+ Limit
+   Output: t.c1_1, t.c2_1, t.c1_3
+   CTE t
+     ->  Foreign Scan
+           Output: t1.c1, t1.c3, t2.c1
+           Remote SQL: SELECT l.a1, l.a2, r.a1 FROM (SELECT "C 1" a10, c3 a12 FROM "S 1"."T 1") l (a1, a2) INNER JOIN (SELECT "C 1" a9 FROM "S 1"."T 1") r (a1) ON ((l.a1 = r.a1))
+   ->  Sort
+         Output: t.c1_1, t.c2_1, t.c1_3
+         Sort Key: t.c1_3, t.c1_1
+         ->  CTE Scan on t
+               Output: t.c1_1, t.c2_1, t.c1_3
+(11 rows)
+
+WITH t (c1_1, c1_3, c2_1) AS (SELECT t1.c1, t1.c3, t2.c1 FROM ft1 t1 JOIN ft2 t2 ON (t1.c1 = t2.c1)) SELECT c1_1, c2_1 FROM t ORDER BY c1_3, c1_1 OFFSET 100 LIMIT 10;
+ c1_1 | c2_1 
+------+------
+  101 |  101
+  102 |  102
+  103 |  103
+  104 |  104
+  105 |  105
+  106 |  106
+  107 |  107
+  108 |  108
+  109 |  109
+  110 |  110
+(10 rows)
+
+-- ctid with whole-row reference
+EXPLAIN (COSTS false, VERBOSE)
+SELECT t1.ctid, t1, t2, t1.c1 FROM ft1 t1 JOIN ft2 t2 ON (t1.c1 = t2.c1) ORDER BY t1.c3, t1.c1 OFFSET 100 LIMIT 10;
+                                                                                                                                                                                                                                                   QUERY PLAN                                                                                                                                                                                                                                                    
+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+ Limit
+   Output: t1.ctid, t1.*, t2.*, t1.c1, t1.c3
+   ->  Sort
+         Output: t1.ctid, t1.*, t2.*, t1.c1, t1.c3
+         Sort Key: t1.c3, t1.c1
+         ->  Foreign Scan
+               Output: t1.ctid, t1.*, t2.*, t1.c1, t1.c3
+               Remote SQL: SELECT l.a1, l.a2, l.a3, l.a4, r.a1 FROM (SELECT l.a7, ROW(l.a10, l.a11, l.a12, l.a13, l.a14, l.a15, l.a16, l.a17), l.a10, l.a12 FROM (SELECT "C 1" a10, c2 a11, c3 a12, c4 a13, c5 a14, c6 a15, c7 a16, c8 a17, ctid a7 FROM "S 1"."T 1") l) l (a1, a2, a3, a4) INNER JOIN (SELECT ROW(r.a9, r.a10, r.a12, r.a13, r.a14, r.a15, r.a16, r.a17), r.a9 FROM (SELECT "C 1" a9, c2 a10, c3 a12, c4 a13, c5 a14, c6 a15, c7 a16, c8 a17 FROM "S 1"."T 1") r) r (a1, a2) ON ((l.a3 = r.a2))
+(8 rows)
+
+SELECT t1.ctid, t1, t2, t1.c1 FROM ft1 t1 JOIN ft2 t2 ON (t1.c1 = t2.c1) ORDER BY t1.c3, t1.c1 OFFSET 100 LIMIT 10;
+  ctid  |                                             t1                                             |                                             t2                                             | c1  
+--------+--------------------------------------------------------------------------------------------+--------------------------------------------------------------------------------------------+-----
+ (1,4)  | (101,1,00101,"Fri Jan 02 00:00:00 1970 PST","Fri Jan 02 00:00:00 1970",1,"1         ",foo) | (101,1,00101,"Fri Jan 02 00:00:00 1970 PST","Fri Jan 02 00:00:00 1970",1,"1         ",foo) | 101
+ (1,5)  | (102,2,00102,"Sat Jan 03 00:00:00 1970 PST","Sat Jan 03 00:00:00 1970",2,"2         ",foo) | (102,2,00102,"Sat Jan 03 00:00:00 1970 PST","Sat Jan 03 00:00:00 1970",2,"2         ",foo) | 102
+ (1,6)  | (103,3,00103,"Sun Jan 04 00:00:00 1970 PST","Sun Jan 04 00:00:00 1970",3,"3         ",foo) | (103,3,00103,"Sun Jan 04 00:00:00 1970 PST","Sun Jan 04 00:00:00 1970",3,"3         ",foo) | 103
+ (1,7)  | (104,4,00104,"Mon Jan 05 00:00:00 1970 PST","Mon Jan 05 00:00:00 1970",4,"4         ",foo) | (104,4,00104,"Mon Jan 05 00:00:00 1970 PST","Mon Jan 05 00:00:00 1970",4,"4         ",foo) | 104
+ (1,8)  | (105,5,00105,"Tue Jan 06 00:00:00 1970 PST","Tue Jan 06 00:00:00 1970",5,"5         ",foo) | (105,5,00105,"Tue Jan 06 00:00:00 1970 PST","Tue Jan 06 00:00:00 1970",5,"5         ",foo) | 105
+ (1,9)  | (106,6,00106,"Wed Jan 07 00:00:00 1970 PST","Wed Jan 07 00:00:00 1970",6,"6         ",foo) | (106,6,00106,"Wed Jan 07 00:00:00 1970 PST","Wed Jan 07 00:00:00 1970",6,"6         ",foo) | 106
+ (1,10) | (107,7,00107,"Thu Jan 08 00:00:00 1970 PST","Thu Jan 08 00:00:00 1970",7,"7         ",foo) | (107,7,00107,"Thu Jan 08 00:00:00 1970 PST","Thu Jan 08 00:00:00 1970",7,"7         ",foo) | 107
+ (1,11) | (108,8,00108,"Fri Jan 09 00:00:00 1970 PST","Fri Jan 09 00:00:00 1970",8,"8         ",foo) | (108,8,00108,"Fri Jan 09 00:00:00 1970 PST","Fri Jan 09 00:00:00 1970",8,"8         ",foo) | 108
+ (1,12) | (109,9,00109,"Sat Jan 10 00:00:00 1970 PST","Sat Jan 10 00:00:00 1970",9,"9         ",foo) | (109,9,00109,"Sat Jan 10 00:00:00 1970 PST","Sat Jan 10 00:00:00 1970",9,"9         ",foo) | 109
+ (1,13) | (110,0,00110,"Sun Jan 11 00:00:00 1970 PST","Sun Jan 11 00:00:00 1970",0,"0         ",foo) | (110,0,00110,"Sun Jan 11 00:00:00 1970 PST","Sun Jan 11 00:00:00 1970",0,"0         ",foo) | 110
+(10 rows)
+
+-- partially unsafe to push down, not pushed down
+EXPLAIN (COSTS false, VERBOSE)
+SELECT t1.c1 FROM ft1 t1 JOIN ft2 t2 ON t2.c1 = t2.c1 JOIN ft4 t3 ON t2.c1 = t3.c1 ORDER BY t1.c1 OFFSET 10 LIMIT 10;
+                                                                                          QUERY PLAN                                                                                           
+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+ Limit
+   Output: t1.c1
+   ->  Sort
+         Output: t1.c1
+         Sort Key: t1.c1
+         ->  Nested Loop
+               Output: t1.c1
+               ->  Foreign Scan on public.ft1 t1
+                     Output: t1.c1
+                     Remote SQL: SELECT "C 1" a10 FROM "S 1"."T 1"
+               ->  Materialize
+                     ->  Foreign Scan
+                           Remote SQL: SELECT NULL FROM (SELECT "C 1" a9 FROM "S 1"."T 1" WHERE (("C 1" = "C 1"))) l (a1) INNER JOIN (SELECT c1 a9 FROM "S 1"."T 3") r (a1) ON ((l.a1 = r.a1))
+(13 rows)
+
+SELECT t1.c1 FROM ft1 t1 JOIN ft2 t2 ON t2.c1 = t2.c1 JOIN ft4 t3 ON t2.c1 = t3.c1 ORDER BY t1.c1 OFFSET 10 LIMIT 10;
+ c1 
+----
+  1
+  1
+  1
+  1
+  1
+  1
+  1
+  1
+  1
+  1
+(10 rows)
+
+-- SEMI JOIN, not pushed down
+EXPLAIN (COSTS false, VERBOSE)
+SELECT t1.c1 FROM ft1 t1 WHERE EXISTS (SELECT 1 FROM ft2 t2 WHERE t1.c1 = t2.c1) ORDER BY t1.c1 OFFSET 100 LIMIT 10;
+                                  QUERY PLAN                                  
+------------------------------------------------------------------------------
+ Limit
+   Output: t1.c1
+   ->  Sort
+         Output: t1.c1
+         Sort Key: t1.c1
+         ->  Hash Join
+               Output: t1.c1
+               Hash Cond: (t1.c1 = t2.c1)
+               ->  Foreign Scan on public.ft1 t1
+                     Output: t1.c1
+                     Remote SQL: SELECT "C 1" a10 FROM "S 1"."T 1"
+               ->  Hash
+                     Output: t2.c1
+                     ->  HashAggregate
+                           Output: t2.c1
+                           Group Key: t2.c1
+                           ->  Foreign Scan on public.ft2 t2
+                                 Output: t2.c1
+                                 Remote SQL: SELECT "C 1" a9 FROM "S 1"."T 1"
+(19 rows)
+
+SELECT t1.c1 FROM ft1 t1 WHERE EXISTS (SELECT 1 FROM ft2 t2 WHERE t1.c1 = t2.c1) ORDER BY t1.c1 OFFSET 100 LIMIT 10;
+ c1  
+-----
+ 101
+ 102
+ 103
+ 104
+ 105
+ 106
+ 107
+ 108
+ 109
+ 110
+(10 rows)
+
+-- ANTI JOIN, not pushed down
+EXPLAIN (COSTS false, VERBOSE)
+SELECT t1.c1 FROM ft1 t1 WHERE NOT EXISTS (SELECT 1 FROM ft2 t2 WHERE t1.c1 = t2.c2) ORDER BY t1.c1 OFFSET 100 LIMIT 10;
+                              QUERY PLAN                              
+----------------------------------------------------------------------
+ Limit
+   Output: t1.c1
+   ->  Sort
+         Output: t1.c1
+         Sort Key: t1.c1
+         ->  Hash Anti Join
+               Output: t1.c1
+               Hash Cond: (t1.c1 = t2.c2)
+               ->  Foreign Scan on public.ft1 t1
+                     Output: t1.c1
+                     Remote SQL: SELECT "C 1" a10 FROM "S 1"."T 1"
+               ->  Hash
+                     Output: t2.c2
+                     ->  Foreign Scan on public.ft2 t2
+                           Output: t2.c2
+                           Remote SQL: SELECT c2 a10 FROM "S 1"."T 1"
+(16 rows)
+
+SELECT t1.c1 FROM ft1 t1 WHERE NOT EXISTS (SELECT 1 FROM ft2 t2 WHERE t1.c1 = t2.c2) ORDER BY t1.c1 OFFSET 100 LIMIT 10;
+ c1  
+-----
+ 110
+ 111
+ 112
+ 113
+ 114
+ 115
+ 116
+ 117
+ 118
+ 119
+(10 rows)
+
+-- CROSS JOIN, not pushed down
+EXPLAIN (COSTS false, VERBOSE)
+SELECT t1.c1, t2.c1 FROM ft1 t1 CROSS JOIN ft2 t2 ORDER BY t1.c1, t2.c1 OFFSET 100 LIMIT 10;
+                               QUERY PLAN                               
+------------------------------------------------------------------------
+ Limit
+   Output: t1.c1, t2.c1
+   ->  Sort
+         Output: t1.c1, t2.c1
+         Sort Key: t1.c1, t2.c1
+         ->  Nested Loop
+               Output: t1.c1, t2.c1
+               ->  Foreign Scan on public.ft1 t1
+                     Output: t1.c1
+                     Remote SQL: SELECT "C 1" a10 FROM "S 1"."T 1"
+               ->  Materialize
+                     Output: t2.c1
+                     ->  Foreign Scan on public.ft2 t2
+                           Output: t2.c1
+                           Remote SQL: SELECT "C 1" a9 FROM "S 1"."T 1"
+(15 rows)
+
+SELECT t1.c1, t2.c1 FROM ft1 t1 CROSS JOIN ft2 t2 ORDER BY t1.c1, t2.c1 OFFSET 100 LIMIT 10;
+ c1 | c1  
+----+-----
+  1 | 101
+  1 | 102
+  1 | 103
+  1 | 104
+  1 | 105
+  1 | 106
+  1 | 107
+  1 | 108
+  1 | 109
+  1 | 110
+(10 rows)
+
+-- different server
+EXPLAIN (COSTS false, VERBOSE)
+SELECT t1.c1, t2.c1 FROM ft5 t1 JOIN ft6 t2 ON (t1.c1 = t2.c1) ORDER BY t1.c1, t2.c1 OFFSET 100 LIMIT 10;
+                          QUERY PLAN                           
+---------------------------------------------------------------
+ Limit
+   Output: t1.c1, t2.c1
+   ->  Merge Join
+         Output: t1.c1, t2.c1
+         Merge Cond: (t1.c1 = t2.c1)
+         ->  Sort
+               Output: t1.c1
+               Sort Key: t1.c1
+               ->  Foreign Scan on public.ft5 t1
+                     Output: t1.c1
+                     Remote SQL: SELECT c1 a9 FROM "S 1"."T 4"
+         ->  Sort
+               Output: t2.c1
+               Sort Key: t2.c1
+               ->  Foreign Scan on public.ft6 t2
+                     Output: t2.c1
+                     Remote SQL: SELECT c1 a9 FROM "S 1"."T 4"
+(17 rows)
+
+SELECT t1.c1, t2.c1 FROM ft5 t1 JOIN ft6 t2 ON (t1.c1 = t2.c1) ORDER BY t1.c1, t2.c1 OFFSET 100 LIMIT 10;
+ c1 | c1 
+----+----
+(0 rows)
+
+-- different effective user for permission check
+EXPLAIN (COSTS false, VERBOSE)
+SELECT t1.c1, t2.c1 FROM ft5 t1 JOIN v_ft5 t2 ON (t1.c1 = t2.c1) ORDER BY t1.c1, t2.c1 OFFSET 100 LIMIT 10;
+                          QUERY PLAN                           
+---------------------------------------------------------------
+ Limit
+   Output: t1.c1, ft5.c1
+   ->  Merge Join
+         Output: t1.c1, ft5.c1
+         Merge Cond: (t1.c1 = ft5.c1)
+         ->  Sort
+               Output: t1.c1
+               Sort Key: t1.c1
+               ->  Foreign Scan on public.ft5 t1
+                     Output: t1.c1
+                     Remote SQL: SELECT c1 a9 FROM "S 1"."T 4"
+         ->  Sort
+               Output: ft5.c1
+               Sort Key: ft5.c1
+               ->  Foreign Scan on public.ft5
+                     Output: ft5.c1
+                     Remote SQL: SELECT c1 a9 FROM "S 1"."T 4"
+(17 rows)
+
+SELECT t1.c1, t2.c1 FROM ft5 t1 JOIN v_ft5 t2 ON (t1.c1 = t2.c1) ORDER BY t1.c1, t2.c1 OFFSET 100 LIMIT 10;
+ c1 | c1 
+----+----
+(0 rows)
+
+-- unsafe join conditions
+EXPLAIN (COSTS false, VERBOSE)
+SELECT t1.c1, t2.c1 FROM ft1 t1 JOIN ft2 t2 ON (t1.c8 = t2.c8) ORDER BY t1.c3, t1.c1 OFFSET 100 LIMIT 10;
+                                       QUERY PLAN                                        
+-----------------------------------------------------------------------------------------
+ Limit
+   Output: t1.c1, t2.c1, t1.c3
+   ->  Sort
+         Output: t1.c1, t2.c1, t1.c3
+         Sort Key: t1.c3, t1.c1
+         ->  Merge Join
+               Output: t1.c1, t2.c1, t1.c3
+               Merge Cond: (t1.c8 = t2.c8)
+               ->  Sort
+                     Output: t1.c1, t1.c3, t1.c8
+                     Sort Key: t1.c8
+                     ->  Foreign Scan on public.ft1 t1
+                           Output: t1.c1, t1.c3, t1.c8
+                           Remote SQL: SELECT "C 1" a10, c3 a12, c8 a17 FROM "S 1"."T 1"
+               ->  Sort
+                     Output: t2.c1, t2.c8
+                     Sort Key: t2.c8
+                     ->  Foreign Scan on public.ft2 t2
+                           Output: t2.c1, t2.c8
+                           Remote SQL: SELECT "C 1" a9, c8 a17 FROM "S 1"."T 1"
+(20 rows)
+
+SELECT t1.c1, t2.c1 FROM ft1 t1 JOIN ft2 t2 ON (t1.c8 = t2.c8) ORDER BY t1.c3, t1.c1 OFFSET 100 LIMIT 10;
+ c1 | c1  
+----+-----
+  1 | 102
+  1 | 103
+  1 | 104
+  1 | 105
+  1 | 106
+  1 | 107
+  1 | 108
+  1 | 109
+  1 | 110
+  1 |   1
+(10 rows)
+
+-- local filter (unsafe conditions on one side)
+EXPLAIN (COSTS false, VERBOSE)
+SELECT t1.c1, t2.c1 FROM ft1 t1 JOIN ft2 t2 ON (t1.c1 = t2.c1) WHERE t1.c8 = 'foo' ORDER BY t1.c3, t1.c1 OFFSET 100 LIMIT 10;
+                                       QUERY PLAN                                        
+-----------------------------------------------------------------------------------------
+ Limit
+   Output: t1.c1, t2.c1, t1.c3
+   ->  Sort
+         Output: t1.c1, t2.c1, t1.c3
+         Sort Key: t1.c3, t1.c1
+         ->  Hash Join
+               Output: t1.c1, t2.c1, t1.c3
+               Hash Cond: (t2.c1 = t1.c1)
+               ->  Foreign Scan on public.ft2 t2
+                     Output: t2.c1
+                     Remote SQL: SELECT "C 1" a9 FROM "S 1"."T 1"
+               ->  Hash
+                     Output: t1.c1, t1.c3
+                     ->  Foreign Scan on public.ft1 t1
+                           Output: t1.c1, t1.c3
+                           Filter: (t1.c8 = 'foo'::user_enum)
+                           Remote SQL: SELECT "C 1" a10, c3 a12, c8 a17 FROM "S 1"."T 1"
+(17 rows)
+
+SELECT t1.c1, t2.c1 FROM ft1 t1 JOIN ft2 t2 ON (t1.c1 = t2.c1) WHERE t1.c8 = 'foo' ORDER BY t1.c3, t1.c1 OFFSET 100 LIMIT 10;
+ c1  | c1  
+-----+-----
+ 101 | 101
+ 102 | 102
+ 103 | 103
+ 104 | 104
+ 105 | 105
+ 106 | 106
+ 107 | 107
+ 108 | 108
+ 109 | 109
+ 110 | 110
+(10 rows)
+
+-- ===================================================================
 -- parameterized queries
 -- ===================================================================
 -- simple join
 PREPARE st1(int, int) AS SELECT t1.c3, t2.c3 FROM ft1 t1, ft2 t2 WHERE t1.c1 = $1 AND t2.c1 = $2;
 EXPLAIN (VERBOSE, COSTS false) EXECUTE st1(1, 2);
-                             QUERY PLAN                             
---------------------------------------------------------------------
+                               QUERY PLAN                               
+------------------------------------------------------------------------
  Nested Loop
    Output: t1.c3, t2.c3
    ->  Foreign Scan on public.ft1 t1
          Output: t1.c3
-         Remote SQL: SELECT c3 FROM "S 1"."T 1" WHERE (("C 1" = 1))
+         Remote SQL: SELECT c3 a12 FROM "S 1"."T 1" WHERE (("C 1" = 1))
    ->  Foreign Scan on public.ft2 t2
          Output: t2.c3
-         Remote SQL: SELECT c3 FROM "S 1"."T 1" WHERE (("C 1" = 2))
+         Remote SQL: SELECT c3 a12 FROM "S 1"."T 1" WHERE (("C 1" = 2))
 (8 rows)
 
 EXECUTE st1(1, 1);
@@ -683,8 +1282,8 @@ EXECUTE st1(101, 101);
 -- subquery using stable function (can't be sent to remote)
 PREPARE st2(int) AS SELECT * FROM ft1 t1 WHERE t1.c1 < $2 AND t1.c3 IN (SELECT c3 FROM ft2 t2 WHERE c1 > $1 AND date(c4) = '1970-01-17'::date) ORDER BY c1;
 EXPLAIN (VERBOSE, COSTS false) EXECUTE st2(10, 20);
-                                                QUERY PLAN                                                
-----------------------------------------------------------------------------------------------------------
+                                                                QUERY PLAN                                                                
+------------------------------------------------------------------------------------------------------------------------------------------
  Sort
    Output: t1.c1, t1.c2, t1.c3, t1.c4, t1.c5, t1.c6, t1.c7, t1.c8
    Sort Key: t1.c1
@@ -693,13 +1292,13 @@ EXPLAIN (VERBOSE, COSTS false) EXECUTE st2(10, 20);
          Join Filter: (t1.c3 = t2.c3)
          ->  Foreign Scan on public.ft1 t1
                Output: t1.c1, t1.c2, t1.c3, t1.c4, t1.c5, t1.c6, t1.c7, t1.c8
-               Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" WHERE (("C 1" < 20))
+               Remote SQL: SELECT "C 1" a10, c2 a11, c3 a12, c4 a13, c5 a14, c6 a15, c7 a16, c8 a17 FROM "S 1"."T 1" WHERE (("C 1" < 20))
          ->  Materialize
                Output: t2.c3
                ->  Foreign Scan on public.ft2 t2
                      Output: t2.c3
                      Filter: (date(t2.c4) = '01-17-1970'::date)
-                     Remote SQL: SELECT c3, c4 FROM "S 1"."T 1" WHERE (("C 1" > 10))
+                     Remote SQL: SELECT c3 a12, c4 a13 FROM "S 1"."T 1" WHERE (("C 1" > 10))
 (15 rows)
 
 EXECUTE st2(10, 20);
@@ -717,8 +1316,8 @@ EXECUTE st2(101, 121);
 -- subquery using immutable function (can be sent to remote)
 PREPARE st3(int) AS SELECT * FROM ft1 t1 WHERE t1.c1 < $2 AND t1.c3 IN (SELECT c3 FROM ft2 t2 WHERE c1 > $1 AND date(c5) = '1970-01-17'::date) ORDER BY c1;
 EXPLAIN (VERBOSE, COSTS false) EXECUTE st3(10, 20);
-                                                      QUERY PLAN                                                       
------------------------------------------------------------------------------------------------------------------------
+                                                                QUERY PLAN                                                                
+------------------------------------------------------------------------------------------------------------------------------------------
  Sort
    Output: t1.c1, t1.c2, t1.c3, t1.c4, t1.c5, t1.c6, t1.c7, t1.c8
    Sort Key: t1.c1
@@ -727,12 +1326,12 @@ EXPLAIN (VERBOSE, COSTS false) EXECUTE st3(10, 20);
          Join Filter: (t1.c3 = t2.c3)
          ->  Foreign Scan on public.ft1 t1
                Output: t1.c1, t1.c2, t1.c3, t1.c4, t1.c5, t1.c6, t1.c7, t1.c8
-               Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" WHERE (("C 1" < 20))
+               Remote SQL: SELECT "C 1" a10, c2 a11, c3 a12, c4 a13, c5 a14, c6 a15, c7 a16, c8 a17 FROM "S 1"."T 1" WHERE (("C 1" < 20))
          ->  Materialize
                Output: t2.c3
                ->  Foreign Scan on public.ft2 t2
                      Output: t2.c3
-                     Remote SQL: SELECT c3 FROM "S 1"."T 1" WHERE (("C 1" > 10)) AND ((date(c5) = '1970-01-17'::date))
+                     Remote SQL: SELECT c3 a12 FROM "S 1"."T 1" WHERE (("C 1" > 10)) AND ((date(c5) = '1970-01-17'::date))
 (14 rows)
 
 EXECUTE st3(10, 20);
@@ -749,108 +1348,108 @@ EXECUTE st3(20, 30);
 -- custom plan should be chosen initially
 PREPARE st4(int) AS SELECT * FROM ft1 t1 WHERE t1.c1 = $1;
 EXPLAIN (VERBOSE, COSTS false) EXECUTE st4(1);
-                                         QUERY PLAN                                          
----------------------------------------------------------------------------------------------
+                                                         QUERY PLAN                                                          
+-----------------------------------------------------------------------------------------------------------------------------
  Foreign Scan on public.ft1 t1
    Output: c1, c2, c3, c4, c5, c6, c7, c8
-   Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" WHERE (("C 1" = 1))
+   Remote SQL: SELECT "C 1" a10, c2 a11, c3 a12, c4 a13, c5 a14, c6 a15, c7 a16, c8 a17 FROM "S 1"."T 1" WHERE (("C 1" = 1))
 (3 rows)
 
 EXPLAIN (VERBOSE, COSTS false) EXECUTE st4(1);
-                                         QUERY PLAN                                          
----------------------------------------------------------------------------------------------
+                                                         QUERY PLAN                                                          
+-----------------------------------------------------------------------------------------------------------------------------
  Foreign Scan on public.ft1 t1
    Output: c1, c2, c3, c4, c5, c6, c7, c8
-   Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" WHERE (("C 1" = 1))
+   Remote SQL: SELECT "C 1" a10, c2 a11, c3 a12, c4 a13, c5 a14, c6 a15, c7 a16, c8 a17 FROM "S 1"."T 1" WHERE (("C 1" = 1))
 (3 rows)
 
 EXPLAIN (VERBOSE, COSTS false) EXECUTE st4(1);
-                                         QUERY PLAN                                          
----------------------------------------------------------------------------------------------
+                                                         QUERY PLAN                                                          
+-----------------------------------------------------------------------------------------------------------------------------
  Foreign Scan on public.ft1 t1
    Output: c1, c2, c3, c4, c5, c6, c7, c8
-   Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" WHERE (("C 1" = 1))
+   Remote SQL: SELECT "C 1" a10, c2 a11, c3 a12, c4 a13, c5 a14, c6 a15, c7 a16, c8 a17 FROM "S 1"."T 1" WHERE (("C 1" = 1))
 (3 rows)
 
 EXPLAIN (VERBOSE, COSTS false) EXECUTE st4(1);
-                                         QUERY PLAN                                          
----------------------------------------------------------------------------------------------
+                                                         QUERY PLAN                                                          
+-----------------------------------------------------------------------------------------------------------------------------
  Foreign Scan on public.ft1 t1
    Output: c1, c2, c3, c4, c5, c6, c7, c8
-   Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" WHERE (("C 1" = 1))
+   Remote SQL: SELECT "C 1" a10, c2 a11, c3 a12, c4 a13, c5 a14, c6 a15, c7 a16, c8 a17 FROM "S 1"."T 1" WHERE (("C 1" = 1))
 (3 rows)
 
 EXPLAIN (VERBOSE, COSTS false) EXECUTE st4(1);
-                                         QUERY PLAN                                          
----------------------------------------------------------------------------------------------
+                                                         QUERY PLAN                                                          
+-----------------------------------------------------------------------------------------------------------------------------
  Foreign Scan on public.ft1 t1
    Output: c1, c2, c3, c4, c5, c6, c7, c8
-   Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" WHERE (("C 1" = 1))
+   Remote SQL: SELECT "C 1" a10, c2 a11, c3 a12, c4 a13, c5 a14, c6 a15, c7 a16, c8 a17 FROM "S 1"."T 1" WHERE (("C 1" = 1))
 (3 rows)
 
 -- once we try it enough times, should switch to generic plan
 EXPLAIN (VERBOSE, COSTS false) EXECUTE st4(1);
-                                              QUERY PLAN                                               
--------------------------------------------------------------------------------------------------------
+                                                              QUERY PLAN                                                               
+---------------------------------------------------------------------------------------------------------------------------------------
  Foreign Scan on public.ft1 t1
    Output: c1, c2, c3, c4, c5, c6, c7, c8
-   Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" WHERE (("C 1" = $1::integer))
+   Remote SQL: SELECT "C 1" a10, c2 a11, c3 a12, c4 a13, c5 a14, c6 a15, c7 a16, c8 a17 FROM "S 1"."T 1" WHERE (("C 1" = $1::integer))
 (3 rows)
 
 -- value of $1 should not be sent to remote
 PREPARE st5(user_enum,int) AS SELECT * FROM ft1 t1 WHERE c8 = $1 and c1 = $2;
 EXPLAIN (VERBOSE, COSTS false) EXECUTE st5('foo', 1);
-                                         QUERY PLAN                                          
----------------------------------------------------------------------------------------------
+                                                         QUERY PLAN                                                          
+-----------------------------------------------------------------------------------------------------------------------------
  Foreign Scan on public.ft1 t1
    Output: c1, c2, c3, c4, c5, c6, c7, c8
    Filter: (t1.c8 = 'foo'::user_enum)
-   Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" WHERE (("C 1" = 1))
+   Remote SQL: SELECT "C 1" a10, c2 a11, c3 a12, c4 a13, c5 a14, c6 a15, c7 a16, c8 a17 FROM "S 1"."T 1" WHERE (("C 1" = 1))
 (4 rows)
 
 EXPLAIN (VERBOSE, COSTS false) EXECUTE st5('foo', 1);
-                                         QUERY PLAN                                          
----------------------------------------------------------------------------------------------
+                                                         QUERY PLAN                                                          
+-----------------------------------------------------------------------------------------------------------------------------
  Foreign Scan on public.ft1 t1
    Output: c1, c2, c3, c4, c5, c6, c7, c8
    Filter: (t1.c8 = 'foo'::user_enum)
-   Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" WHERE (("C 1" = 1))
+   Remote SQL: SELECT "C 1" a10, c2 a11, c3 a12, c4 a13, c5 a14, c6 a15, c7 a16, c8 a17 FROM "S 1"."T 1" WHERE (("C 1" = 1))
 (4 rows)
 
 EXPLAIN (VERBOSE, COSTS false) EXECUTE st5('foo', 1);
-                                         QUERY PLAN                                          
----------------------------------------------------------------------------------------------
+                                                         QUERY PLAN                                                          
+-----------------------------------------------------------------------------------------------------------------------------
  Foreign Scan on public.ft1 t1
    Output: c1, c2, c3, c4, c5, c6, c7, c8
    Filter: (t1.c8 = 'foo'::user_enum)
-   Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" WHERE (("C 1" = 1))
+   Remote SQL: SELECT "C 1" a10, c2 a11, c3 a12, c4 a13, c5 a14, c6 a15, c7 a16, c8 a17 FROM "S 1"."T 1" WHERE (("C 1" = 1))
 (4 rows)
 
 EXPLAIN (VERBOSE, COSTS false) EXECUTE st5('foo', 1);
-                                         QUERY PLAN                                          
----------------------------------------------------------------------------------------------
+                                                         QUERY PLAN                                                          
+-----------------------------------------------------------------------------------------------------------------------------
  Foreign Scan on public.ft1 t1
    Output: c1, c2, c3, c4, c5, c6, c7, c8
    Filter: (t1.c8 = 'foo'::user_enum)
-   Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" WHERE (("C 1" = 1))
+   Remote SQL: SELECT "C 1" a10, c2 a11, c3 a12, c4 a13, c5 a14, c6 a15, c7 a16, c8 a17 FROM "S 1"."T 1" WHERE (("C 1" = 1))
 (4 rows)
 
 EXPLAIN (VERBOSE, COSTS false) EXECUTE st5('foo', 1);
-                                         QUERY PLAN                                          
----------------------------------------------------------------------------------------------
+                                                         QUERY PLAN                                                          
+-----------------------------------------------------------------------------------------------------------------------------
  Foreign Scan on public.ft1 t1
    Output: c1, c2, c3, c4, c5, c6, c7, c8
    Filter: (t1.c8 = 'foo'::user_enum)
-   Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" WHERE (("C 1" = 1))
+   Remote SQL: SELECT "C 1" a10, c2 a11, c3 a12, c4 a13, c5 a14, c6 a15, c7 a16, c8 a17 FROM "S 1"."T 1" WHERE (("C 1" = 1))
 (4 rows)
 
 EXPLAIN (VERBOSE, COSTS false) EXECUTE st5('foo', 1);
-                                              QUERY PLAN                                               
--------------------------------------------------------------------------------------------------------
+                                                              QUERY PLAN                                                               
+---------------------------------------------------------------------------------------------------------------------------------------
  Foreign Scan on public.ft1 t1
    Output: c1, c2, c3, c4, c5, c6, c7, c8
    Filter: (t1.c8 = $1)
-   Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" WHERE (("C 1" = $1::integer))
+   Remote SQL: SELECT "C 1" a10, c2 a11, c3 a12, c4 a13, c5 a14, c6 a15, c7 a16, c8 a17 FROM "S 1"."T 1" WHERE (("C 1" = $1::integer))
 (4 rows)
 
 EXECUTE st5('foo', 1);
@@ -868,14 +1467,14 @@ DEALLOCATE st5;
 -- System columns, except ctid, should not be sent to remote
 EXPLAIN (VERBOSE, COSTS false)
 SELECT * FROM ft1 t1 WHERE t1.tableoid = 'pg_class'::regclass LIMIT 1;
-                                  QUERY PLAN                                   
--------------------------------------------------------------------------------
+                                                  QUERY PLAN                                                   
+---------------------------------------------------------------------------------------------------------------
  Limit
    Output: c1, c2, c3, c4, c5, c6, c7, c8
    ->  Foreign Scan on public.ft1 t1
          Output: c1, c2, c3, c4, c5, c6, c7, c8
          Filter: (t1.tableoid = '1259'::oid)
-         Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1"
+         Remote SQL: SELECT "C 1" a10, c2 a11, c3 a12, c4 a13, c5 a14, c6 a15, c7 a16, c8 a17 FROM "S 1"."T 1"
 (6 rows)
 
 SELECT * FROM ft1 t1 WHERE t1.tableoid = 'ft1'::regclass LIMIT 1;
@@ -886,13 +1485,13 @@ SELECT * FROM ft1 t1 WHERE t1.tableoid = 'ft1'::regclass LIMIT 1;
 
 EXPLAIN (VERBOSE, COSTS false)
 SELECT tableoid::regclass, * FROM ft1 t1 LIMIT 1;
-                                  QUERY PLAN                                   
--------------------------------------------------------------------------------
+                                                  QUERY PLAN                                                   
+---------------------------------------------------------------------------------------------------------------
  Limit
    Output: ((tableoid)::regclass), c1, c2, c3, c4, c5, c6, c7, c8
    ->  Foreign Scan on public.ft1 t1
          Output: (tableoid)::regclass, c1, c2, c3, c4, c5, c6, c7, c8
-         Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1"
+         Remote SQL: SELECT "C 1" a10, c2 a11, c3 a12, c4 a13, c5 a14, c6 a15, c7 a16, c8 a17 FROM "S 1"."T 1"
 (5 rows)
 
 SELECT tableoid::regclass, * FROM ft1 t1 LIMIT 1;
@@ -903,11 +1502,11 @@ SELECT tableoid::regclass, * FROM ft1 t1 LIMIT 1;
 
 EXPLAIN (VERBOSE, COSTS false)
 SELECT * FROM ft1 t1 WHERE t1.ctid = '(0,2)';
-                                              QUERY PLAN                                               
--------------------------------------------------------------------------------------------------------
+                                                              QUERY PLAN                                                               
+---------------------------------------------------------------------------------------------------------------------------------------
  Foreign Scan on public.ft1 t1
    Output: c1, c2, c3, c4, c5, c6, c7, c8
-   Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" WHERE ((ctid = '(0,2)'::tid))
+   Remote SQL: SELECT "C 1" a10, c2 a11, c3 a12, c4 a13, c5 a14, c6 a15, c7 a16, c8 a17 FROM "S 1"."T 1" WHERE ((ctid = '(0,2)'::tid))
 (3 rows)
 
 SELECT * FROM ft1 t1 WHERE t1.ctid = '(0,2)';
@@ -918,13 +1517,13 @@ SELECT * FROM ft1 t1 WHERE t1.ctid = '(0,2)';
 
 EXPLAIN (VERBOSE, COSTS false)
 SELECT ctid, * FROM ft1 t1 LIMIT 1;
-                                     QUERY PLAN                                      
--------------------------------------------------------------------------------------
+                                                       QUERY PLAN                                                       
+------------------------------------------------------------------------------------------------------------------------
  Limit
    Output: ctid, c1, c2, c3, c4, c5, c6, c7, c8
    ->  Foreign Scan on public.ft1 t1
          Output: ctid, c1, c2, c3, c4, c5, c6, c7, c8
-         Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8, ctid FROM "S 1"."T 1"
+         Remote SQL: SELECT "C 1" a10, c2 a11, c3 a12, c4 a13, c5 a14, c6 a15, c7 a16, c8 a17, ctid a7 FROM "S 1"."T 1"
 (5 rows)
 
 SELECT ctid, * FROM ft1 t1 LIMIT 1;
@@ -987,7 +1586,7 @@ FETCH c;
 SAVEPOINT s;
 SELECT * FROM ft1 WHERE 1 / (c1 - 1) > 0;  -- ERROR
 ERROR:  division by zero
-CONTEXT:  Remote SQL command: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" WHERE (((1 / ("C 1" - 1)) > 0))
+CONTEXT:  Remote SQL command: SELECT "C 1" a10, c2 a11, c3 a12, c4 a13, c5 a14, c6 a15, c7 a16, c8 a17 FROM "S 1"."T 1" WHERE (((1 / ("C 1" - 1)) > 0))
 ROLLBACK TO s;
 FETCH c;
  c1 | c2 |  c3   |              c4              |            c5            | c6 |     c7     | c8  
@@ -1010,64 +1609,64 @@ create foreign table ft3 (f1 text collate "C", f2 text)
   server loopback options (table_name 'loct3');
 -- can be sent to remote
 explain (verbose, costs off) select * from ft3 where f1 = 'foo';
-                                QUERY PLAN                                
---------------------------------------------------------------------------
+                                   QUERY PLAN                                    
+---------------------------------------------------------------------------------
  Foreign Scan on public.ft3
    Output: f1, f2
-   Remote SQL: SELECT f1, f2 FROM public.loct3 WHERE ((f1 = 'foo'::text))
+   Remote SQL: SELECT f1 a9, f2 a10 FROM public.loct3 WHERE ((f1 = 'foo'::text))
 (3 rows)
 
 explain (verbose, costs off) select * from ft3 where f1 COLLATE "C" = 'foo';
-                                QUERY PLAN                                
---------------------------------------------------------------------------
+                                   QUERY PLAN                                    
+---------------------------------------------------------------------------------
  Foreign Scan on public.ft3
    Output: f1, f2
-   Remote SQL: SELECT f1, f2 FROM public.loct3 WHERE ((f1 = 'foo'::text))
+   Remote SQL: SELECT f1 a9, f2 a10 FROM public.loct3 WHERE ((f1 = 'foo'::text))
 (3 rows)
 
 explain (verbose, costs off) select * from ft3 where f2 = 'foo';
-                                QUERY PLAN                                
---------------------------------------------------------------------------
+                                   QUERY PLAN                                    
+---------------------------------------------------------------------------------
  Foreign Scan on public.ft3
    Output: f1, f2
-   Remote SQL: SELECT f1, f2 FROM public.loct3 WHERE ((f2 = 'foo'::text))
+   Remote SQL: SELECT f1 a9, f2 a10 FROM public.loct3 WHERE ((f2 = 'foo'::text))
 (3 rows)
 
 -- can't be sent to remote
 explain (verbose, costs off) select * from ft3 where f1 COLLATE "POSIX" = 'foo';
-                  QUERY PLAN                   
------------------------------------------------
+                      QUERY PLAN                      
+------------------------------------------------------
  Foreign Scan on public.ft3
    Output: f1, f2
    Filter: ((ft3.f1)::text = 'foo'::text)
-   Remote SQL: SELECT f1, f2 FROM public.loct3
+   Remote SQL: SELECT f1 a9, f2 a10 FROM public.loct3
 (4 rows)
 
 explain (verbose, costs off) select * from ft3 where f1 = 'foo' COLLATE "C";
-                  QUERY PLAN                   
------------------------------------------------
+                      QUERY PLAN                      
+------------------------------------------------------
  Foreign Scan on public.ft3
    Output: f1, f2
    Filter: (ft3.f1 = 'foo'::text COLLATE "C")
-   Remote SQL: SELECT f1, f2 FROM public.loct3
+   Remote SQL: SELECT f1 a9, f2 a10 FROM public.loct3
 (4 rows)
 
 explain (verbose, costs off) select * from ft3 where f2 COLLATE "C" = 'foo';
-                  QUERY PLAN                   
------------------------------------------------
+                      QUERY PLAN                      
+------------------------------------------------------
  Foreign Scan on public.ft3
    Output: f1, f2
    Filter: ((ft3.f2)::text = 'foo'::text)
-   Remote SQL: SELECT f1, f2 FROM public.loct3
+   Remote SQL: SELECT f1 a9, f2 a10 FROM public.loct3
 (4 rows)
 
 explain (verbose, costs off) select * from ft3 where f2 = 'foo' COLLATE "C";
-                  QUERY PLAN                   
------------------------------------------------
+                      QUERY PLAN                      
+------------------------------------------------------
  Foreign Scan on public.ft3
    Output: f1, f2
    Filter: (ft3.f2 = 'foo'::text COLLATE "C")
-   Remote SQL: SELECT f1, f2 FROM public.loct3
+   Remote SQL: SELECT f1 a9, f2 a10 FROM public.loct3
 (4 rows)
 
 -- ===================================================================
@@ -1085,7 +1684,7 @@ INSERT INTO ft2 (c1,c2,c3) SELECT c1+1000,c2+100, c3 || c3 FROM ft2 LIMIT 20;
                Output: ((ft2_1.c1 + 1000)), ((ft2_1.c2 + 100)), ((ft2_1.c3 || ft2_1.c3))
                ->  Foreign Scan on public.ft2 ft2_1
                      Output: (ft2_1.c1 + 1000), (ft2_1.c2 + 100), (ft2_1.c3 || ft2_1.c3)
-                     Remote SQL: SELECT "C 1", c2, c3 FROM "S 1"."T 1"
+                     Remote SQL: SELECT "C 1" a9, c2 a10, c3 a12 FROM "S 1"."T 1"
 (9 rows)
 
 INSERT INTO ft2 (c1,c2,c3) SELECT c1+1000,c2+100, c3 || c3 FROM ft2 LIMIT 20;
@@ -1210,35 +1809,27 @@ UPDATE ft2 SET c2 = c2 + 400, c3 = c3 || '_update7' WHERE c1 % 10 = 7 RETURNING
 EXPLAIN (verbose, costs off)
 UPDATE ft2 SET c2 = ft2.c2 + 500, c3 = ft2.c3 || '_update9', c7 = DEFAULT
   FROM ft1 WHERE ft1.c1 = ft2.c2 AND ft1.c1 % 10 = 9;
-                                                                            QUERY PLAN                                                                             
--------------------------------------------------------------------------------------------------------------------------------------------------------------------
+                                                                                                                                                                                                                                                                       QUERY PLAN                                                                                                                                                                                                                                                                       
+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
  Update on public.ft2
    Remote SQL: UPDATE "S 1"."T 1" SET c2 = $2, c3 = $3, c7 = $4 WHERE ctid = $1
-   ->  Hash Join
+   ->  Foreign Scan
          Output: ft2.c1, (ft2.c2 + 500), NULL::integer, (ft2.c3 || '_update9'::text), ft2.c4, ft2.c5, ft2.c6, 'ft2       '::character(10), ft2.c8, ft2.ctid, ft1.*
-         Hash Cond: (ft2.c2 = ft1.c1)
-         ->  Foreign Scan on public.ft2
-               Output: ft2.c1, ft2.c2, ft2.c3, ft2.c4, ft2.c5, ft2.c6, ft2.c8, ft2.ctid
-               Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c8, ctid FROM "S 1"."T 1" FOR UPDATE
-         ->  Hash
-               Output: ft1.*, ft1.c1
-               ->  Foreign Scan on public.ft1
-                     Output: ft1.*, ft1.c1
-                     Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" WHERE ((("C 1" % 10) = 9))
-(13 rows)
+         Remote SQL: SELECT l.a1, l.a2, l.a3, l.a4, l.a5, l.a6, l.a7, l.a8, r.a1 FROM (SELECT l.a9, l.a10, l.a12, l.a13, l.a14, l.a15, l.a17, l.a7 FROM (SELECT "C 1" a9, c2 a10, c3 a12, c4 a13, c5 a14, c6 a15, c8 a17, ctid a7 FROM "S 1"."T 1" FOR UPDATE) l) l (a1, a2, a3, a4, a5, a6, a7, a8) INNER JOIN (SELECT ROW(r.a10, r.a11, r.a12, r.a13, r.a14, r.a15, r.a16, r.a17), r.a10 FROM (SELECT "C 1" a10, c2 a11, c3 a12, c4 a13, c5 a14, c6 a15, c7 a16, c8 a17 FROM "S 1"."T 1" WHERE ((("C 1" % 10) = 9))) r) r (a1, a2) ON ((l.a2 = r.a2))
+(5 rows)
 
 UPDATE ft2 SET c2 = ft2.c2 + 500, c3 = ft2.c3 || '_update9', c7 = DEFAULT
   FROM ft1 WHERE ft1.c1 = ft2.c2 AND ft1.c1 % 10 = 9;
 EXPLAIN (verbose, costs off)
   DELETE FROM ft2 WHERE c1 % 10 = 5 RETURNING c1, c4;
-                                       QUERY PLAN                                       
-----------------------------------------------------------------------------------------
+                                        QUERY PLAN                                         
+-------------------------------------------------------------------------------------------
  Delete on public.ft2
    Output: c1, c4
-   Remote SQL: DELETE FROM "S 1"."T 1" WHERE ctid = $1 RETURNING "C 1", c4
+   Remote SQL: DELETE FROM "S 1"."T 1" WHERE ctid = $1 RETURNING "C 1" a9, c4 a13
    ->  Foreign Scan on public.ft2
          Output: ctid
-         Remote SQL: SELECT ctid FROM "S 1"."T 1" WHERE ((("C 1" % 10) = 5)) FOR UPDATE
+         Remote SQL: SELECT ctid a7 FROM "S 1"."T 1" WHERE ((("C 1" % 10) = 5)) FOR UPDATE
 (6 rows)
 
 DELETE FROM ft2 WHERE c1 % 10 = 5 RETURNING c1, c4;
@@ -1351,22 +1942,14 @@ DELETE FROM ft2 WHERE c1 % 10 = 5 RETURNING c1, c4;
 
 EXPLAIN (verbose, costs off)
 DELETE FROM ft2 USING ft1 WHERE ft1.c1 = ft2.c2 AND ft1.c1 % 10 = 2;
-                                                      QUERY PLAN                                                      
-----------------------------------------------------------------------------------------------------------------------
+                                                                                                                                                                                        QUERY PLAN                                                                                                                                                                                         
+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
  Delete on public.ft2
    Remote SQL: DELETE FROM "S 1"."T 1" WHERE ctid = $1
-   ->  Hash Join
+   ->  Foreign Scan
          Output: ft2.ctid, ft1.*
-         Hash Cond: (ft2.c2 = ft1.c1)
-         ->  Foreign Scan on public.ft2
-               Output: ft2.ctid, ft2.c2
-               Remote SQL: SELECT c2, ctid FROM "S 1"."T 1" FOR UPDATE
-         ->  Hash
-               Output: ft1.*, ft1.c1
-               ->  Foreign Scan on public.ft1
-                     Output: ft1.*, ft1.c1
-                     Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" WHERE ((("C 1" % 10) = 2))
-(13 rows)
+         Remote SQL: SELECT l.a1, r.a1 FROM (SELECT l.a7, l.a10 FROM (SELECT c2 a10, ctid a7 FROM "S 1"."T 1" FOR UPDATE) l) l (a1, a2) INNER JOIN (SELECT ROW(r.a10, r.a11, r.a12, r.a13, r.a14, r.a15, r.a16, r.a17), r.a10 FROM (SELECT "C 1" a10, c2 a11, c3 a12, c4 a13, c5 a14, c6 a15, c7 a16, c8 a17 FROM "S 1"."T 1" WHERE ((("C 1" % 10) = 2))) r) r (a1, a2) ON ((l.a2 = r.a2))
+(5 rows)
 
 DELETE FROM ft2 USING ft1 WHERE ft1.c1 = ft2.c2 AND ft1.c1 % 10 = 2;
 SELECT c1,c2,c3,c4 FROM ft2 ORDER BY c1;
@@ -3027,386 +3610,6 @@ NOTICE:  NEW: (13,"test triggered !")
 (1 row)
 
 -- ===================================================================
--- test inheritance features
--- ===================================================================
-CREATE TABLE a (aa TEXT);
-CREATE TABLE loct (aa TEXT, bb TEXT);
-CREATE FOREIGN TABLE b (bb TEXT) INHERITS (a)
-  SERVER loopback OPTIONS (table_name 'loct');
-INSERT INTO a(aa) VALUES('aaa');
-INSERT INTO a(aa) VALUES('aaaa');
-INSERT INTO a(aa) VALUES('aaaaa');
-INSERT INTO b(aa) VALUES('bbb');
-INSERT INTO b(aa) VALUES('bbbb');
-INSERT INTO b(aa) VALUES('bbbbb');
-SELECT tableoid::regclass, * FROM a;
- tableoid |  aa   
-----------+-------
- a        | aaa
- a        | aaaa
- a        | aaaaa
- b        | bbb
- b        | bbbb
- b        | bbbbb
-(6 rows)
-
-SELECT tableoid::regclass, * FROM b;
- tableoid |  aa   | bb 
-----------+-------+----
- b        | bbb   | 
- b        | bbbb  | 
- b        | bbbbb | 
-(3 rows)
-
-SELECT tableoid::regclass, * FROM ONLY a;
- tableoid |  aa   
-----------+-------
- a        | aaa
- a        | aaaa
- a        | aaaaa
-(3 rows)
-
-UPDATE a SET aa = 'zzzzzz' WHERE aa LIKE 'aaaa%';
-SELECT tableoid::regclass, * FROM a;
- tableoid |   aa   
-----------+--------
- a        | aaa
- a        | zzzzzz
- a        | zzzzzz
- b        | bbb
- b        | bbbb
- b        | bbbbb
-(6 rows)
-
-SELECT tableoid::regclass, * FROM b;
- tableoid |  aa   | bb 
-----------+-------+----
- b        | bbb   | 
- b        | bbbb  | 
- b        | bbbbb | 
-(3 rows)
-
-SELECT tableoid::regclass, * FROM ONLY a;
- tableoid |   aa   
-----------+--------
- a        | aaa
- a        | zzzzzz
- a        | zzzzzz
-(3 rows)
-
-UPDATE b SET aa = 'new';
-SELECT tableoid::regclass, * FROM a;
- tableoid |   aa   
-----------+--------
- a        | aaa
- a        | zzzzzz
- a        | zzzzzz
- b        | new
- b        | new
- b        | new
-(6 rows)
-
-SELECT tableoid::regclass, * FROM b;
- tableoid | aa  | bb 
-----------+-----+----
- b        | new | 
- b        | new | 
- b        | new | 
-(3 rows)
-
-SELECT tableoid::regclass, * FROM ONLY a;
- tableoid |   aa   
-----------+--------
- a        | aaa
- a        | zzzzzz
- a        | zzzzzz
-(3 rows)
-
-UPDATE a SET aa = 'newtoo';
-SELECT tableoid::regclass, * FROM a;
- tableoid |   aa   
-----------+--------
- a        | newtoo
- a        | newtoo
- a        | newtoo
- b        | newtoo
- b        | newtoo
- b        | newtoo
-(6 rows)
-
-SELECT tableoid::regclass, * FROM b;
- tableoid |   aa   | bb 
-----------+--------+----
- b        | newtoo | 
- b        | newtoo | 
- b        | newtoo | 
-(3 rows)
-
-SELECT tableoid::regclass, * FROM ONLY a;
- tableoid |   aa   
-----------+--------
- a        | newtoo
- a        | newtoo
- a        | newtoo
-(3 rows)
-
-DELETE FROM a;
-SELECT tableoid::regclass, * FROM a;
- tableoid | aa 
-----------+----
-(0 rows)
-
-SELECT tableoid::regclass, * FROM b;
- tableoid | aa | bb 
-----------+----+----
-(0 rows)
-
-SELECT tableoid::regclass, * FROM ONLY a;
- tableoid | aa 
-----------+----
-(0 rows)
-
-DROP TABLE a CASCADE;
-NOTICE:  drop cascades to foreign table b
-DROP TABLE loct;
--- Check SELECT FOR UPDATE/SHARE with an inherited source table
-create table loct1 (f1 int, f2 int, f3 int);
-create table loct2 (f1 int, f2 int, f3 int);
-create table foo (f1 int, f2 int);
-create foreign table foo2 (f3 int) inherits (foo)
-  server loopback options (table_name 'loct1');
-create table bar (f1 int, f2 int);
-create foreign table bar2 (f3 int) inherits (bar)
-  server loopback options (table_name 'loct2');
-insert into foo values(1,1);
-insert into foo values(3,3);
-insert into foo2 values(2,2,2);
-insert into foo2 values(4,4,4);
-insert into bar values(1,11);
-insert into bar values(2,22);
-insert into bar values(6,66);
-insert into bar2 values(3,33,33);
-insert into bar2 values(4,44,44);
-insert into bar2 values(7,77,77);
-explain (verbose, costs off)
-select * from bar where f1 in (select f1 from foo) for update;
-                                          QUERY PLAN                                          
-----------------------------------------------------------------------------------------------
- LockRows
-   Output: bar.f1, bar.f2, bar.ctid, bar.tableoid, bar.*, foo.ctid, foo.tableoid, foo.*
-   ->  Hash Join
-         Output: bar.f1, bar.f2, bar.ctid, bar.tableoid, bar.*, foo.ctid, foo.tableoid, foo.*
-         Hash Cond: (bar.f1 = foo.f1)
-         ->  Append
-               ->  Seq Scan on public.bar
-                     Output: bar.f1, bar.f2, bar.ctid, bar.tableoid, bar.*
-               ->  Foreign Scan on public.bar2
-                     Output: bar2.f1, bar2.f2, bar2.ctid, bar2.tableoid, bar2.*
-                     Remote SQL: SELECT f1, f2, f3, ctid FROM public.loct2 FOR UPDATE
-         ->  Hash
-               Output: foo.ctid, foo.tableoid, foo.*, foo.f1
-               ->  HashAggregate
-                     Output: foo.ctid, foo.tableoid, foo.*, foo.f1
-                     Group Key: foo.f1
-                     ->  Append
-                           ->  Seq Scan on public.foo
-                                 Output: foo.ctid, foo.tableoid, foo.*, foo.f1
-                           ->  Foreign Scan on public.foo2
-                                 Output: foo2.ctid, foo2.tableoid, foo2.*, foo2.f1
-                                 Remote SQL: SELECT f1, f2, f3, ctid FROM public.loct1
-(22 rows)
-
-select * from bar where f1 in (select f1 from foo) for update;
- f1 | f2 
-----+----
-  1 | 11
-  2 | 22
-  3 | 33
-  4 | 44
-(4 rows)
-
-explain (verbose, costs off)
-select * from bar where f1 in (select f1 from foo) for share;
-                                          QUERY PLAN                                          
-----------------------------------------------------------------------------------------------
- LockRows
-   Output: bar.f1, bar.f2, bar.ctid, bar.tableoid, bar.*, foo.ctid, foo.tableoid, foo.*
-   ->  Hash Join
-         Output: bar.f1, bar.f2, bar.ctid, bar.tableoid, bar.*, foo.ctid, foo.tableoid, foo.*
-         Hash Cond: (bar.f1 = foo.f1)
-         ->  Append
-               ->  Seq Scan on public.bar
-                     Output: bar.f1, bar.f2, bar.ctid, bar.tableoid, bar.*
-               ->  Foreign Scan on public.bar2
-                     Output: bar2.f1, bar2.f2, bar2.ctid, bar2.tableoid, bar2.*
-                     Remote SQL: SELECT f1, f2, f3, ctid FROM public.loct2 FOR SHARE
-         ->  Hash
-               Output: foo.ctid, foo.tableoid, foo.*, foo.f1
-               ->  HashAggregate
-                     Output: foo.ctid, foo.tableoid, foo.*, foo.f1
-                     Group Key: foo.f1
-                     ->  Append
-                           ->  Seq Scan on public.foo
-                                 Output: foo.ctid, foo.tableoid, foo.*, foo.f1
-                           ->  Foreign Scan on public.foo2
-                                 Output: foo2.ctid, foo2.tableoid, foo2.*, foo2.f1
-                                 Remote SQL: SELECT f1, f2, f3, ctid FROM public.loct1
-(22 rows)
-
-select * from bar where f1 in (select f1 from foo) for share;
- f1 | f2 
-----+----
-  1 | 11
-  2 | 22
-  3 | 33
-  4 | 44
-(4 rows)
-
--- Check UPDATE with inherited target and an inherited source table
-explain (verbose, costs off)
-update bar set f2 = f2 + 100 where f1 in (select f1 from foo);
-                                         QUERY PLAN                                          
----------------------------------------------------------------------------------------------
- Update on public.bar
-   Update on public.bar
-   Foreign Update on public.bar2
-     Remote SQL: UPDATE public.loct2 SET f2 = $2 WHERE ctid = $1
-   ->  Hash Join
-         Output: bar.f1, (bar.f2 + 100), bar.ctid, foo.ctid, foo.tableoid, foo.*
-         Hash Cond: (bar.f1 = foo.f1)
-         ->  Seq Scan on public.bar
-               Output: bar.f1, bar.f2, bar.ctid
-         ->  Hash
-               Output: foo.ctid, foo.tableoid, foo.*, foo.f1
-               ->  HashAggregate
-                     Output: foo.ctid, foo.tableoid, foo.*, foo.f1
-                     Group Key: foo.f1
-                     ->  Append
-                           ->  Seq Scan on public.foo
-                                 Output: foo.ctid, foo.tableoid, foo.*, foo.f1
-                           ->  Foreign Scan on public.foo2
-                                 Output: foo2.ctid, foo2.tableoid, foo2.*, foo2.f1
-                                 Remote SQL: SELECT f1, f2, f3, ctid FROM public.loct1
-   ->  Hash Join
-         Output: bar2.f1, (bar2.f2 + 100), bar2.f3, bar2.ctid, foo.ctid, foo.tableoid, foo.*
-         Hash Cond: (bar2.f1 = foo.f1)
-         ->  Foreign Scan on public.bar2
-               Output: bar2.f1, bar2.f2, bar2.f3, bar2.ctid
-               Remote SQL: SELECT f1, f2, f3, ctid FROM public.loct2 FOR UPDATE
-         ->  Hash
-               Output: foo.ctid, foo.tableoid, foo.*, foo.f1
-               ->  HashAggregate
-                     Output: foo.ctid, foo.tableoid, foo.*, foo.f1
-                     Group Key: foo.f1
-                     ->  Append
-                           ->  Seq Scan on public.foo
-                                 Output: foo.ctid, foo.tableoid, foo.*, foo.f1
-                           ->  Foreign Scan on public.foo2
-                                 Output: foo2.ctid, foo2.tableoid, foo2.*, foo2.f1
-                                 Remote SQL: SELECT f1, f2, f3, ctid FROM public.loct1
-(37 rows)
-
-update bar set f2 = f2 + 100 where f1 in (select f1 from foo);
-select tableoid::regclass, * from bar order by 1,2;
- tableoid | f1 | f2  
-----------+----+-----
- bar      |  1 | 111
- bar      |  2 | 122
- bar      |  6 |  66
- bar2     |  3 | 133
- bar2     |  4 | 144
- bar2     |  7 |  77
-(6 rows)
-
--- Check UPDATE with inherited target and an appendrel subquery
-explain (verbose, costs off)
-update bar set f2 = f2 + 100
-from
-  ( select f1 from foo union all select f1+3 from foo ) ss
-where bar.f1 = ss.f1;
-                                      QUERY PLAN                                      
---------------------------------------------------------------------------------------
- Update on public.bar
-   Update on public.bar
-   Foreign Update on public.bar2
-     Remote SQL: UPDATE public.loct2 SET f2 = $2 WHERE ctid = $1
-   ->  Hash Join
-         Output: bar.f1, (bar.f2 + 100), bar.ctid, (ROW(foo.f1))
-         Hash Cond: (foo.f1 = bar.f1)
-         ->  Append
-               ->  Seq Scan on public.foo
-                     Output: ROW(foo.f1), foo.f1
-               ->  Foreign Scan on public.foo2
-                     Output: ROW(foo2.f1), foo2.f1
-                     Remote SQL: SELECT f1 FROM public.loct1
-               ->  Seq Scan on public.foo foo_1
-                     Output: ROW((foo_1.f1 + 3)), (foo_1.f1 + 3)
-               ->  Foreign Scan on public.foo2 foo2_1
-                     Output: ROW((foo2_1.f1 + 3)), (foo2_1.f1 + 3)
-                     Remote SQL: SELECT f1 FROM public.loct1
-         ->  Hash
-               Output: bar.f1, bar.f2, bar.ctid
-               ->  Seq Scan on public.bar
-                     Output: bar.f1, bar.f2, bar.ctid
-   ->  Merge Join
-         Output: bar2.f1, (bar2.f2 + 100), bar2.f3, bar2.ctid, (ROW(foo.f1))
-         Merge Cond: (bar2.f1 = foo.f1)
-         ->  Sort
-               Output: bar2.f1, bar2.f2, bar2.f3, bar2.ctid
-               Sort Key: bar2.f1
-               ->  Foreign Scan on public.bar2
-                     Output: bar2.f1, bar2.f2, bar2.f3, bar2.ctid
-                     Remote SQL: SELECT f1, f2, f3, ctid FROM public.loct2 FOR UPDATE
-         ->  Sort
-               Output: (ROW(foo.f1)), foo.f1
-               Sort Key: foo.f1
-               ->  Append
-                     ->  Seq Scan on public.foo
-                           Output: ROW(foo.f1), foo.f1
-                     ->  Foreign Scan on public.foo2
-                           Output: ROW(foo2.f1), foo2.f1
-                           Remote SQL: SELECT f1 FROM public.loct1
-                     ->  Seq Scan on public.foo foo_1
-                           Output: ROW((foo_1.f1 + 3)), (foo_1.f1 + 3)
-                     ->  Foreign Scan on public.foo2 foo2_1
-                           Output: ROW((foo2_1.f1 + 3)), (foo2_1.f1 + 3)
-                           Remote SQL: SELECT f1 FROM public.loct1
-(45 rows)
-
-update bar set f2 = f2 + 100
-from
-  ( select f1 from foo union all select f1+3 from foo ) ss
-where bar.f1 = ss.f1;
-select tableoid::regclass, * from bar order by 1,2;
- tableoid | f1 | f2  
-----------+----+-----
- bar      |  1 | 211
- bar      |  2 | 222
- bar      |  6 | 166
- bar2     |  3 | 233
- bar2     |  4 | 244
- bar2     |  7 | 177
-(6 rows)
-
--- Test that WHERE CURRENT OF is not supported
-begin;
-declare c cursor for select * from bar where f1 = 7;
-fetch from c;
- f1 | f2  
-----+-----
-  7 | 177
-(1 row)
-
-update bar set f2 = null where current of c;
-ERROR:  WHERE CURRENT OF is not supported for this table type
-rollback;
-drop table foo cascade;
-NOTICE:  drop cascades to foreign table foo2
-drop table bar cascade;
-NOTICE:  drop cascades to foreign table bar2
-drop table loct1;
-drop table loct2;
--- ===================================================================
 -- test IMPORT FOREIGN SCHEMA
 -- ===================================================================
 CREATE SCHEMA import_source;
@@ -3636,3 +3839,6 @@ QUERY:  CREATE FOREIGN TABLE t5 (
 OPTIONS (schema_name 'import_source', table_name 't5');
 CONTEXT:  importing foreign table "t5"
 ROLLBACK;
+-- Cleanup
+DROP OWNED BY view_owner;
+DROP USER view_owner;
diff --git a/contrib/postgres_fdw/postgres_fdw.c b/contrib/postgres_fdw/postgres_fdw.c
index 478e124..8e89523 100644
--- a/contrib/postgres_fdw/postgres_fdw.c
+++ b/contrib/postgres_fdw/postgres_fdw.c
@@ -28,7 +28,6 @@
 #include "optimizer/pathnode.h"
 #include "optimizer/paths.h"
 #include "optimizer/planmain.h"
-#include "optimizer/prep.h"
 #include "optimizer/restrictinfo.h"
 #include "optimizer/var.h"
 #include "parser/parsetree.h"
@@ -47,41 +46,8 @@ PG_MODULE_MAGIC;
 #define DEFAULT_FDW_TUPLE_COST		0.01
 
 /*
- * FDW-specific planner information kept in RelOptInfo.fdw_private for a
- * foreign table.  This information is collected by postgresGetForeignRelSize.
- */
-typedef struct PgFdwRelationInfo
-{
-	/* baserestrictinfo clauses, broken down into safe and unsafe subsets. */
-	List	   *remote_conds;
-	List	   *local_conds;
-
-	/* Bitmap of attr numbers we need to fetch from the remote server. */
-	Bitmapset  *attrs_used;
-
-	/* Cost and selectivity of local_conds. */
-	QualCost	local_conds_cost;
-	Selectivity local_conds_sel;
-
-	/* Estimated size and cost for a scan with baserestrictinfo quals. */
-	double		rows;
-	int			width;
-	Cost		startup_cost;
-	Cost		total_cost;
-
-	/* Options extracted from catalogs. */
-	bool		use_remote_estimate;
-	Cost		fdw_startup_cost;
-	Cost		fdw_tuple_cost;
-
-	/* Cached catalog information. */
-	ForeignTable *table;
-	ForeignServer *server;
-	UserMapping *user;			/* only set in use_remote_estimate mode */
-} PgFdwRelationInfo;
-
-/*
- * Indexes of FDW-private information stored in fdw_private lists.
+ * Indexes of FDW-private information stored in fdw_private of ForeignScan of
+ * a simple foreign table scan for a SELECT statement.
  *
  * We store various information in ForeignScan.fdw_private to pass it from
  * planner to executor.  Currently we store:
@@ -98,7 +64,11 @@ enum FdwScanPrivateIndex
 	/* SQL statement to execute remotely (as a String node) */
 	FdwScanPrivateSelectSql,
 	/* Integer list of attribute numbers retrieved by the SELECT */
-	FdwScanPrivateRetrievedAttrs
+	FdwScanPrivateRetrievedAttrs,
+	/* OID of the foreign server for the scan (as an Integer node) */
+	FdwScanPrivateServerOid,
+	/* OID of the effective user for the scan (as an Integer node) */
+	FdwScanPrivateUserOid,
 };
 
 /*
@@ -128,7 +98,8 @@ enum FdwModifyPrivateIndex
  */
 typedef struct PgFdwScanState
 {
-	Relation	rel;			/* relcache entry for the foreign table */
+	const char *relname;		/* name of relation being scanned */
+	TupleDesc	tupdesc;		/* tuple descriptor of the scan */
 	AttInMetadata *attinmeta;	/* attribute datatype conversion metadata */
 
 	/* extracted fdw_private data */
@@ -194,6 +165,8 @@ typedef struct PgFdwAnalyzeState
 	AttInMetadata *attinmeta;	/* attribute datatype conversion metadata */
 	List	   *retrieved_attrs;	/* attr numbers retrieved by query */
 
+	char	   *query;			/* text of SELECT command */
+
 	/* collected sample rows */
 	HeapTuple  *rows;			/* array of size targrows */
 	int			targrows;		/* target # of sample rows */
@@ -214,7 +187,10 @@ typedef struct PgFdwAnalyzeState
  */
 typedef struct ConversionLocation
 {
-	Relation	rel;			/* foreign table's relcache entry */
+	const char *relname;		/* name of relation being processed, or NULL for
+								   a foreign join */
+	const char *query;			/* query being processed */
+	TupleDesc	tupdesc;		/* tuple descriptor for attribute names */
 	AttrNumber	cur_attno;		/* attribute number being processed, or 0 */
 } ConversionLocation;
 
@@ -288,6 +264,12 @@ static bool postgresAnalyzeForeignTable(Relation relation,
 							BlockNumber *totalpages);
 static List *postgresImportForeignSchema(ImportForeignSchemaStmt *stmt,
 							Oid serverOid);
+static void postgresGetForeignJoinPaths(PlannerInfo *root,
+						   RelOptInfo *joinrel,
+						   RelOptInfo *outerrel,
+						   RelOptInfo *innerrel,
+						   SpecialJoinInfo *sjinfo,
+						   List *restrictlist);
 
 /*
  * Helper functions
@@ -323,7 +305,9 @@ static void analyze_row_processor(PGresult *res, int row,
 					  PgFdwAnalyzeState *astate);
 static HeapTuple make_tuple_from_result_row(PGresult *res,
 						   int row,
-						   Relation rel,
+						   const char *relname,
+						   const char *query,
+						   TupleDesc tupdesc,
 						   AttInMetadata *attinmeta,
 						   List *retrieved_attrs,
 						   MemoryContext temp_context);
@@ -368,6 +352,9 @@ postgres_fdw_handler(PG_FUNCTION_ARGS)
 	/* Support functions for IMPORT FOREIGN SCHEMA */
 	routine->ImportForeignSchema = postgresImportForeignSchema;
 
+	/* Support functions for join push-down */
+	routine->GetForeignJoinPaths = postgresGetForeignJoinPaths;
+
 	PG_RETURN_POINTER(routine);
 }
 
@@ -383,7 +370,9 @@ postgresGetForeignRelSize(PlannerInfo *root,
 						  RelOptInfo *baserel,
 						  Oid foreigntableid)
 {
+	RangeTblEntry *rte;
 	PgFdwRelationInfo *fpinfo;
+	ForeignTable *table;
 	ListCell   *lc;
 
 	/*
@@ -394,8 +383,8 @@ postgresGetForeignRelSize(PlannerInfo *root,
 	baserel->fdw_private = (void *) fpinfo;
 
 	/* Look up foreign-table catalog info. */
-	fpinfo->table = GetForeignTable(foreigntableid);
-	fpinfo->server = GetForeignServer(fpinfo->table->serverid);
+	table = GetForeignTable(foreigntableid);
+	fpinfo->server = GetForeignServer(table->serverid);
 
 	/*
 	 * Extract user-settable option values.  Note that per-table setting of
@@ -416,7 +405,7 @@ postgresGetForeignRelSize(PlannerInfo *root,
 		else if (strcmp(def->defname, "fdw_tuple_cost") == 0)
 			fpinfo->fdw_tuple_cost = strtod(defGetString(def), NULL);
 	}
-	foreach(lc, fpinfo->table->options)
+	foreach(lc, table->options)
 	{
 		DefElem    *def = (DefElem *) lfirst(lc);
 
@@ -428,20 +417,12 @@ postgresGetForeignRelSize(PlannerInfo *root,
 	}
 
 	/*
-	 * If the table or the server is configured to use remote estimates,
-	 * identify which user to do remote access as during planning.  This
+	 * Identify which user to do remote access as during planning.  This
 	 * should match what ExecCheckRTEPerms() does.  If we fail due to lack of
 	 * permissions, the query would have failed at runtime anyway.
 	 */
-	if (fpinfo->use_remote_estimate)
-	{
-		RangeTblEntry *rte = planner_rt_fetch(baserel->relid, root);
-		Oid			userid = rte->checkAsUser ? rte->checkAsUser : GetUserId();
-
-		fpinfo->user = GetUserMapping(userid, fpinfo->server->serverid);
-	}
-	else
-		fpinfo->user = NULL;
+	rte = planner_rt_fetch(baserel->relid, root);
+	fpinfo->userid = rte->checkAsUser ? rte->checkAsUser : GetUserId();
 
 	/*
 	 * Identify which baserestrictinfo clauses can be sent to the remote
@@ -463,10 +444,9 @@ postgresGetForeignRelSize(PlannerInfo *root,
 				   &fpinfo->attrs_used);
 	foreach(lc, fpinfo->local_conds)
 	{
-		RestrictInfo *rinfo = (RestrictInfo *) lfirst(lc);
+		Expr *expr = (Expr *) lfirst(lc);
 
-		pull_varattnos((Node *) rinfo->clause, baserel->relid,
-					   &fpinfo->attrs_used);
+		pull_varattnos((Node *) expr, baserel->relid, &fpinfo->attrs_used);
 	}
 
 	/*
@@ -752,6 +732,8 @@ postgresGetForeignPlan(PlannerInfo *root,
 	List	   *retrieved_attrs;
 	StringInfoData sql;
 	ListCell   *lc;
+	List	   *fdw_ps_tlist = NIL;
+	ForeignScan *scan;
 
 	/*
 	 * Separate the scan_clauses into those that can be executed remotely and
@@ -768,9 +750,6 @@ postgresGetForeignPlan(PlannerInfo *root,
 	 *
 	 * This code must match "extract_actual_clauses(scan_clauses, false)"
 	 * except for the additional decision about remote versus local execution.
-	 * Note however that we only strip the RestrictInfo nodes from the
-	 * local_exprs list, since appendWhereClause expects a list of
-	 * RestrictInfos.
 	 */
 	foreach(lc, scan_clauses)
 	{
@@ -783,11 +762,11 @@ postgresGetForeignPlan(PlannerInfo *root,
 			continue;
 
 		if (list_member_ptr(fpinfo->remote_conds, rinfo))
-			remote_conds = lappend(remote_conds, rinfo);
+			remote_conds = lappend(remote_conds, rinfo->clause);
 		else if (list_member_ptr(fpinfo->local_conds, rinfo))
 			local_exprs = lappend(local_exprs, rinfo->clause);
 		else if (is_foreign_expr(root, baserel, rinfo->clause))
-			remote_conds = lappend(remote_conds, rinfo);
+			remote_conds = lappend(remote_conds, rinfo->clause);
 		else
 			local_exprs = lappend(local_exprs, rinfo->clause);
 	}
@@ -797,68 +776,17 @@ postgresGetForeignPlan(PlannerInfo *root,
 	 * expressions to be sent as parameters.
 	 */
 	initStringInfo(&sql);
-	deparseSelectSql(&sql, root, baserel, fpinfo->attrs_used,
-					 &retrieved_attrs);
-	if (remote_conds)
-		appendWhereClause(&sql, root, baserel, remote_conds,
-						  true, &params_list);
-
-	/*
-	 * Add FOR UPDATE/SHARE if appropriate.  We apply locking during the
-	 * initial row fetch, rather than later on as is done for local tables.
-	 * The extra roundtrips involved in trying to duplicate the local
-	 * semantics exactly don't seem worthwhile (see also comments for
-	 * RowMarkType).
-	 *
-	 * Note: because we actually run the query as a cursor, this assumes that
-	 * DECLARE CURSOR ... FOR UPDATE is supported, which it isn't before 8.3.
-	 */
-	if (baserel->relid == root->parse->resultRelation &&
-		(root->parse->commandType == CMD_UPDATE ||
-		 root->parse->commandType == CMD_DELETE))
-	{
-		/* Relation is UPDATE/DELETE target, so use FOR UPDATE */
-		appendStringInfoString(&sql, " FOR UPDATE");
-	}
-	else
-	{
-		PlanRowMark *rc = get_plan_rowmark(root->rowMarks, baserel->relid);
-
-		if (rc)
-		{
-			/*
-			 * Relation is specified as a FOR UPDATE/SHARE target, so handle
-			 * that.  (But we could also see LCS_NONE, meaning this isn't a
-			 * target relation after all.)
-			 *
-			 * For now, just ignore any [NO] KEY specification, since (a) it's
-			 * not clear what that means for a remote table that we don't have
-			 * complete information about, and (b) it wouldn't work anyway on
-			 * older remote servers.  Likewise, we don't worry about NOWAIT.
-			 */
-			switch (rc->strength)
-			{
-				case LCS_NONE:
-					/* No locking needed */
-					break;
-				case LCS_FORKEYSHARE:
-				case LCS_FORSHARE:
-					appendStringInfoString(&sql, " FOR SHARE");
-					break;
-				case LCS_FORNOKEYUPDATE:
-				case LCS_FORUPDATE:
-					appendStringInfoString(&sql, " FOR UPDATE");
-					break;
-			}
-		}
-	}
+	deparseSelectSql(&sql, root, baserel, fpinfo->attrs_used, remote_conds,
+					 &params_list, &fdw_ps_tlist, &retrieved_attrs);
 
 	/*
-	 * Build the fdw_private list that will be available to the executor.
+	 * Build the fdw_private list that will be available in the executor.
 	 * Items in the list must match enum FdwScanPrivateIndex, above.
 	 */
-	fdw_private = list_make2(makeString(sql.data),
-							 retrieved_attrs);
+	fdw_private = list_make4(makeString(sql.data),
+							 retrieved_attrs,
+							 makeInteger(fpinfo->server->serverid),
+							 makeInteger(fpinfo->userid));
 
 	/*
 	 * Create the ForeignScan node from target list, local filtering
@@ -868,11 +796,18 @@ postgresGetForeignPlan(PlannerInfo *root,
 	 * field of the finished plan node; we can't keep them in private state
 	 * because then they wouldn't be subject to later planner processing.
 	 */
-	return make_foreignscan(tlist,
+	scan = make_foreignscan(tlist,
 							local_exprs,
 							scan_relid,
 							params_list,
 							fdw_private);
+
+	/*
+	 * Set fdw_ps_tlist to handle tuples generated by this scan.
+	 */
+	scan->fdw_ps_tlist = fdw_ps_tlist;
+
+	return scan;
 }
 
 /*
@@ -885,9 +820,8 @@ postgresBeginForeignScan(ForeignScanState *node, int eflags)
 	ForeignScan *fsplan = (ForeignScan *) node->ss.ps.plan;
 	EState	   *estate = node->ss.ps.state;
 	PgFdwScanState *fsstate;
-	RangeTblEntry *rte;
+	Oid			serverid;
 	Oid			userid;
-	ForeignTable *table;
 	ForeignServer *server;
 	UserMapping *user;
 	int			numParams;
@@ -907,22 +841,13 @@ postgresBeginForeignScan(ForeignScanState *node, int eflags)
 	node->fdw_state = (void *) fsstate;
 
 	/*
-	 * Identify which user to do the remote access as.  This should match what
-	 * ExecCheckRTEPerms() does.
-	 */
-	rte = rt_fetch(fsplan->scan.scanrelid, estate->es_range_table);
-	userid = rte->checkAsUser ? rte->checkAsUser : GetUserId();
-
-	/* Get info about foreign table. */
-	fsstate->rel = node->ss.ss_currentRelation;
-	table = GetForeignTable(RelationGetRelid(fsstate->rel));
-	server = GetForeignServer(table->serverid);
-	user = GetUserMapping(userid, server->serverid);
-
-	/*
 	 * Get connection to the foreign server.  Connection manager will
 	 * establish new connection if necessary.
 	 */
+	serverid = intVal(list_nth(fsplan->fdw_private, FdwScanPrivateServerOid));
+	userid = intVal(list_nth(fsplan->fdw_private, FdwScanPrivateUserOid));
+	server = GetForeignServer(serverid);
+	user = GetUserMapping(userid, server->serverid);
 	fsstate->conn = GetConnection(server, user, false);
 
 	/* Assign a unique ID for my cursor */
@@ -932,8 +857,8 @@ postgresBeginForeignScan(ForeignScanState *node, int eflags)
 	/* Get private info created by planner functions. */
 	fsstate->query = strVal(list_nth(fsplan->fdw_private,
 									 FdwScanPrivateSelectSql));
-	fsstate->retrieved_attrs = (List *) list_nth(fsplan->fdw_private,
-											   FdwScanPrivateRetrievedAttrs);
+	fsstate->retrieved_attrs = list_nth(fsplan->fdw_private,
+										FdwScanPrivateRetrievedAttrs);
 
 	/* Create contexts for batches of tuples and per-tuple temp workspace. */
 	fsstate->batch_cxt = AllocSetContextCreate(estate->es_query_cxt,
@@ -947,8 +872,18 @@ postgresBeginForeignScan(ForeignScanState *node, int eflags)
 											  ALLOCSET_SMALL_INITSIZE,
 											  ALLOCSET_SMALL_MAXSIZE);
 
-	/* Get info we'll need for input data conversion. */
-	fsstate->attinmeta = TupleDescGetAttInMetadata(RelationGetDescr(fsstate->rel));
+	/* Get info we'll need for input data conversion and error reporting. */
+	if (fsplan->scan.scanrelid > 0)
+	{
+		fsstate->relname = RelationGetRelationName(node->ss.ss_currentRelation);
+		fsstate->tupdesc = RelationGetDescr(node->ss.ss_currentRelation);
+	}
+	else
+	{
+		fsstate->relname = NULL;
+		fsstate->tupdesc = node->ss.ss_ScanTupleSlot->tts_tupleDescriptor;
+	}
+	fsstate->attinmeta = TupleDescGetAttInMetadata(fsstate->tupdesc);
 
 	/* Prepare for output conversion of parameters used in remote query. */
 	numParams = list_length(fsplan->fdw_exprs);
@@ -1726,10 +1661,12 @@ estimate_path_cost_size(PlannerInfo *root,
 	 */
 	if (fpinfo->use_remote_estimate)
 	{
+		List	   *remote_conds;
 		List	   *remote_join_conds;
 		List	   *local_join_conds;
-		StringInfoData sql;
 		List	   *retrieved_attrs;
+		StringInfoData sql;
+		UserMapping *user;
 		PGconn	   *conn;
 		Selectivity local_sel;
 		QualCost	local_cost;
@@ -1741,24 +1678,24 @@ estimate_path_cost_size(PlannerInfo *root,
 		classifyConditions(root, baserel, join_conds,
 						   &remote_join_conds, &local_join_conds);
 
+		remote_conds = copyObject(fpinfo->remote_conds);
+		remote_conds = list_concat(remote_conds, remote_join_conds);
+
 		/*
 		 * Construct EXPLAIN query including the desired SELECT, FROM, and
 		 * WHERE clauses.  Params and other-relation Vars are replaced by
 		 * dummy values.
+		 * We pass NULL for params_list and fdw_ps_tlist because they are
+		 * unnecessary for EXPLAIN.
 		 */
 		initStringInfo(&sql);
 		appendStringInfoString(&sql, "EXPLAIN ");
-		deparseSelectSql(&sql, root, baserel, fpinfo->attrs_used,
-						 &retrieved_attrs);
-		if (fpinfo->remote_conds)
-			appendWhereClause(&sql, root, baserel, fpinfo->remote_conds,
-							  true, NULL);
-		if (remote_join_conds)
-			appendWhereClause(&sql, root, baserel, remote_join_conds,
-							  (fpinfo->remote_conds == NIL), NULL);
+		deparseSelectSql(&sql, root, baserel, fpinfo->attrs_used, remote_conds,
+						 NULL, NULL, &retrieved_attrs);
 
 		/* Get the remote estimate */
-		conn = GetConnection(fpinfo->server, fpinfo->user, false);
+		user = GetUserMapping(fpinfo->userid, fpinfo->server->serverid);
+		conn = GetConnection(fpinfo->server, user, false);
 		get_remote_estimate(sql.data, conn, &rows, &width,
 							&startup_cost, &total_cost);
 		ReleaseConnection(conn);
@@ -2055,7 +1992,9 @@ fetch_more_data(ForeignScanState *node)
 		{
 			fsstate->tuples[i] =
 				make_tuple_from_result_row(res, i,
-										   fsstate->rel,
+										   fsstate->relname,
+										   fsstate->query,
+										   fsstate->tupdesc,
 										   fsstate->attinmeta,
 										   fsstate->retrieved_attrs,
 										   fsstate->temp_cxt);
@@ -2273,7 +2212,9 @@ store_returning_result(PgFdwModifyState *fmstate,
 		HeapTuple	newtup;
 
 		newtup = make_tuple_from_result_row(res, 0,
-											fmstate->rel,
+										RelationGetRelationName(fmstate->rel),
+											fmstate->query,
+											RelationGetDescr(fmstate->rel),
 											fmstate->attinmeta,
 											fmstate->retrieved_attrs,
 											fmstate->temp_cxt);
@@ -2423,6 +2364,7 @@ postgresAcquireSampleRowsFunc(Relation relation, int elevel,
 	initStringInfo(&sql);
 	appendStringInfo(&sql, "DECLARE c%u CURSOR FOR ", cursor_number);
 	deparseAnalyzeSql(&sql, relation, &astate.retrieved_attrs);
+	astate.query = sql.data;
 
 	/* In what follows, do not risk leaking any PGresults. */
 	PG_TRY();
@@ -2565,7 +2507,9 @@ analyze_row_processor(PGresult *res, int row, PgFdwAnalyzeState *astate)
 		oldcontext = MemoryContextSwitchTo(astate->anl_cxt);
 
 		astate->rows[pos] = make_tuple_from_result_row(res, row,
-													   astate->rel,
+										   RelationGetRelationName(astate->rel),
+													   astate->query,
+											   RelationGetDescr(astate->rel),
 													   astate->attinmeta,
 													 astate->retrieved_attrs,
 													   astate->temp_cxt);
@@ -2839,6 +2783,258 @@ postgresImportForeignSchema(ImportForeignSchemaStmt *stmt, Oid serverOid)
 }
 
 /*
+ * Construct PgFdwRelationInfo from two join sources
+ */
+static PgFdwRelationInfo *
+merge_fpinfo(RelOptInfo *outerrel, RelOptInfo *innerrel, JoinType jointype)
+{
+	PgFdwRelationInfo *fpinfo_o;
+	PgFdwRelationInfo *fpinfo_i;
+	PgFdwRelationInfo *fpinfo;
+
+	fpinfo_o = (PgFdwRelationInfo *) outerrel->fdw_private;
+	fpinfo_i = (PgFdwRelationInfo *) innerrel->fdw_private;
+
+	fpinfo = (PgFdwRelationInfo *) palloc0(sizeof(PgFdwRelationInfo));
+
+	/* The join relation inherits the conditions of both source relations */
+	fpinfo->remote_conds = list_concat(copyObject(fpinfo_o->remote_conds),
+									   copyObject(fpinfo_i->remote_conds));
+	fpinfo->local_conds = list_concat(copyObject(fpinfo_o->local_conds),
+									  copyObject(fpinfo_i->local_conds));
+
+	/* attrs_used is used only for simple foreign table scans */
+	fpinfo->attrs_used = NULL;
+
+	/* rows and width will be set later */
+	fpinfo->rows = 0;
+	fpinfo->width = 0;
+
+	/*
+	 * TODO estimate more accurately
+	 */
+	fpinfo->local_conds_cost.startup = fpinfo_o->local_conds_cost.startup +
+									   fpinfo_i->local_conds_cost.startup;
+	fpinfo->local_conds_cost.per_tuple = fpinfo_o->local_conds_cost.per_tuple +
+										 fpinfo_i->local_conds_cost.per_tuple;
+	fpinfo->local_conds_sel = fpinfo_o->local_conds_sel *
+							  fpinfo_i->local_conds_sel;
+	fpinfo->use_remote_estimate = false;
+	fpinfo->fdw_startup_cost = fpinfo_o->fdw_startup_cost +
+							   fpinfo_i->fdw_startup_cost;
+	fpinfo->fdw_tuple_cost = fpinfo_o->fdw_tuple_cost +
+							 fpinfo_i->fdw_tuple_cost;
+
+	/* total_cost will be set later. */
+	fpinfo->startup_cost = fpinfo_o->startup_cost + fpinfo_i->startup_cost;
+	fpinfo->total_cost = 0;
+
+	/* server and userid are identical on both sides; take the outer's */
+	fpinfo->server = fpinfo_o->server;
+	fpinfo->userid = fpinfo_o->userid;
+
+	fpinfo->outerrel = outerrel;
+	fpinfo->innerrel = innerrel;
+	fpinfo->jointype = jointype;
+
+	/* joinclauses and otherclauses will be set later */
+
+	return fpinfo;
+}
+
+/*
+ * postgresGetForeignJoinPaths
+ *		Add possible ForeignPath to joinrel.
+ *
+ * A join can be pushed down to the remote PostgreSQL server if it satisfies
+ * all of the conditions below.
+ *
+ * 1) Join type is INNER or OUTER (one of LEFT/RIGHT/FULL)
+ * 2) Both outer and inner portions are safe to push down
+ * 3) All foreign tables in the join belong to the same foreign server
+ * 4) All foreign tables are accessed as the same user
+ * 5) All join conditions are safe to push down
+ * 6) No relation has a local filter (this could be relaxed for INNER JOIN
+ *    with no volatile functions/operators, but for now we take the safer way)
+ */
+static void
+postgresGetForeignJoinPaths(PlannerInfo *root,
+							RelOptInfo *joinrel,
+							RelOptInfo *outerrel,
+							RelOptInfo *innerrel,
+							SpecialJoinInfo *sjinfo,
+							List *restrictlist)
+{
+	PgFdwRelationInfo *fpinfo;
+	PgFdwRelationInfo *fpinfo_o;
+	PgFdwRelationInfo *fpinfo_i;
+	JoinType		jointype = !sjinfo ? JOIN_INNER : sjinfo->jointype;
+	ForeignPath	   *joinpath;
+	double			rows;
+	Cost			startup_cost;
+	Cost			total_cost;
+
+	ListCell	   *lc;
+	List		   *joinclauses;
+	List		   *otherclauses;
+
+	/*
+	 * We support all outer joins in addition to inner joins.  CROSS JOIN is
+	 * internally an INNER JOIN with no conditions, so it is checked later.
+	 */
+	if (jointype != JOIN_INNER && jointype != JOIN_LEFT &&
+		jointype != JOIN_RIGHT && jointype != JOIN_FULL)
+	{
+		ereport(DEBUG3, (errmsg("unsupported join type (SEMI, ANTI)")));
+		return;
+	}
+
+	/*
+	 * A valid PgFdwRelationInfo in RelOptInfo.fdw_private indicates that the
+	 * scan of that relation can be pushed down.  If either side lacks one,
+	 * give up on pushing down this join.
+	 */
+	if (!outerrel->fdw_private)
+	{
+		ereport(DEBUG3, (errmsg("outer relation is not safe to push down")));
+		return;
+	}
+	if (!innerrel->fdw_private)
+	{
+		ereport(DEBUG3, (errmsg("inner is not safe to push-down")));
+		return;
+	}
+	fpinfo_o = (PgFdwRelationInfo *) outerrel->fdw_private;
+	fpinfo_i = (PgFdwRelationInfo *) innerrel->fdw_private;
+
+	/*
+	 * All relations in the join must belong to the same server.  Having a
+	 * valid fdw_private means that all relations under that relation belong
+	 * to the server recorded in it, so all we need to do is compare the
+	 * serverid of the outer and inner relations.
+	 */
+	if (fpinfo_o->server->serverid != fpinfo_i->server->serverid)
+	{
+		ereport(DEBUG3, (errmsg("server unmatch")));
+		return;
+	}
+
+	/*
+	 * The effective userid of all source relations must be identical.
+	 * Having a valid fdw_private means that all relations under that relation
+	 * are accessed as the same user, so all we need to do is compare the
+	 * userid of the outer and inner relations.
+	 */
+	if (fpinfo_o->userid != fpinfo_i->userid)
+	{
+		ereport(DEBUG3, (errmsg("unmatch userid")));
+		return;
+	}
+
+	/*
+	 * No source relation can have local conditions.  This can be relaxed
+	 * if the join is an inner join and the local conditions don't contain
+	 * volatile functions/operators, but for now we leave that as a future
+	 * enhancement.
+	 */
+	if (fpinfo_o->local_conds != NULL || fpinfo_i->local_conds != NULL)
+	{
+		ereport(DEBUG3, (errmsg("join with local filter")));
+		return;
+	}
+
+	/*
+	 * Separate restrictlist into two lists, join conditions and remote filters.
+	 */
+	joinclauses = restrictlist;
+	if (IS_OUTER_JOIN(jointype))
+	{
+		extract_actual_join_clauses(joinclauses, &joinclauses, &otherclauses);
+	}
+	else
+	{
+		joinclauses = extract_actual_clauses(joinclauses, false);
+		otherclauses = NIL;
+	}
+
+	/*
+	 * Note that CROSS JOIN (cartesian product) is transformed to JOIN_INNER
+	 * with empty joinclauses.  Pushing down a CROSS JOIN usually transfers
+	 * more rows than retrieving each table separately, so we don't push down
+	 * such joins.
+	 */
+	if (jointype == JOIN_INNER && joinclauses == NIL)
+	{
+		ereport(DEBUG3, (errmsg("unsupported join type (CROSS)")));
+		return;
+	}
+
+	/*
+	 * Join condition must be safe to push down.
+	 */
+	foreach(lc, joinclauses)
+	{
+		Expr *expr = (Expr *) lfirst(lc);
+
+		if (!is_foreign_expr(root, joinrel, expr))
+		{
+			ereport(DEBUG3, (errmsg("join quals contains unsafe conditions")));
+			return;
+		}
+	}
+
+	/*
+	 * Other condition for the join must be safe to push down.
+	 */
+	foreach(lc, otherclauses)
+	{
+		Expr *expr = (Expr *) lfirst(lc);
+
+		if (!is_foreign_expr(root, joinrel, expr))
+		{
+			ereport(DEBUG3, (errmsg("filter contains unsafe conditions")));
+			return;
+		}
+	}
+
+	/* Here we know that this join can be pushed-down to remote side. */
+
+	/* Construct fpinfo for the join relation */
+	fpinfo = merge_fpinfo(outerrel, innerrel, jointype); 
+	fpinfo->rows = joinrel->rows;
+	fpinfo->width = joinrel->width;
+	fpinfo->total_cost = fpinfo->startup_cost +
+						 fpinfo->fdw_tuple_cost * fpinfo->rows;
+	fpinfo->joinclauses = joinclauses;
+	fpinfo->otherclauses = otherclauses;
+	joinrel->fdw_private = fpinfo;
+
+	/* TODO determine more accurate cost and rows of the join. */
+	rows = joinrel->rows;
+	startup_cost = fpinfo->startup_cost;
+	total_cost = fpinfo->total_cost;
+
+	/*
+	 * Create a new join path and add it to the joinrel which represents a join
+	 * between foreign tables.
+	 */
+	joinpath = create_foreignscan_path(root,
+									   joinrel,
+									   rows,
+									   startup_cost,
+									   total_cost,
+									   NIL,		/* no pathkeys */
+									   NULL,	/* no required_outer */
+									   NIL);	/* no fdw_private */
+
+	/* Add generated path into joinrel by add_path(). */
+	add_path(joinrel, (Path *) joinpath);
+	elog(DEBUG3, "join path added");
+
+	/* TODO consider parameterized paths */
+}
+
+/*
  * Create a tuple from the specified row of the PGresult.
  *
  * rel is the local representation of the foreign table, attinmeta is
@@ -2849,13 +3045,14 @@ postgresImportForeignSchema(ImportForeignSchemaStmt *stmt, Oid serverOid)
 static HeapTuple
 make_tuple_from_result_row(PGresult *res,
 						   int row,
-						   Relation rel,
+						   const char *relname,
+						   const char *query,
+						   TupleDesc tupdesc,
 						   AttInMetadata *attinmeta,
 						   List *retrieved_attrs,
 						   MemoryContext temp_context)
 {
 	HeapTuple	tuple;
-	TupleDesc	tupdesc = RelationGetDescr(rel);
 	Datum	   *values;
 	bool	   *nulls;
 	ItemPointer ctid = NULL;
@@ -2882,7 +3079,9 @@ make_tuple_from_result_row(PGresult *res,
 	/*
 	 * Set up and install callback to report where conversion error occurs.
 	 */
-	errpos.rel = rel;
+	errpos.relname = relname;
+	errpos.query = query;
+	errpos.tupdesc = tupdesc;
 	errpos.cur_attno = 0;
 	errcallback.callback = conversion_error_callback;
 	errcallback.arg = (void *) &errpos;
@@ -2966,11 +3165,39 @@ make_tuple_from_result_row(PGresult *res,
 static void
 conversion_error_callback(void *arg)
 {
+	const char *attname;
+	const char *relname;
 	ConversionLocation *errpos = (ConversionLocation *) arg;
-	TupleDesc	tupdesc = RelationGetDescr(errpos->rel);
+	TupleDesc	tupdesc = errpos->tupdesc;
+	StringInfoData buf;
+
+	if (errpos->relname)
+	{
+		/* error occurred in a scan against a foreign table */ 
+		initStringInfo(&buf);
+		if (errpos->cur_attno > 0)
+			appendStringInfo(&buf, "column \"%s\"",
+					 NameStr(tupdesc->attrs[errpos->cur_attno - 1]->attname));
+		else if (errpos->cur_attno == SelfItemPointerAttributeNumber)
+			appendStringInfoString(&buf, "column \"ctid\"");
+		attname = buf.data;
+
+		initStringInfo(&buf);
+		appendStringInfo(&buf, "foreign table \"%s\"", errpos->relname);
+		relname = buf.data;
+	}
+	else
+	{
+		/* error occurred in a scan against a foreign join */ 
+		initStringInfo(&buf);
+		appendStringInfo(&buf, "column %d", errpos->cur_attno - 1);
+		attname = buf.data;
+
+		initStringInfo(&buf);
+		appendStringInfo(&buf, "foreign join \"%s\"", errpos->query);
+		relname = buf.data;
+	}
 
 	if (errpos->cur_attno > 0 && errpos->cur_attno <= tupdesc->natts)
-		errcontext("column \"%s\" of foreign table \"%s\"",
-				   NameStr(tupdesc->attrs[errpos->cur_attno - 1]->attname),
-				   RelationGetRelationName(errpos->rel));
+		errcontext("%s of %s", attname, relname);
 }
diff --git a/contrib/postgres_fdw/postgres_fdw.h b/contrib/postgres_fdw/postgres_fdw.h
index 950c6f7..0d05e5d 100644
--- a/contrib/postgres_fdw/postgres_fdw.h
+++ b/contrib/postgres_fdw/postgres_fdw.h
@@ -16,10 +16,52 @@
 #include "foreign/foreign.h"
 #include "lib/stringinfo.h"
 #include "nodes/relation.h"
+#include "nodes/plannodes.h"
 #include "utils/relcache.h"
 
 #include "libpq-fe.h"
 
+/*
+ * FDW-specific planner information kept in RelOptInfo.fdw_private for a
+ * foreign table or a foreign join.  This information is collected by
+ * postgresGetForeignRelSize, or calculated from join source relations.
+ */
+typedef struct PgFdwRelationInfo
+{
+	/* baserestrictinfo clauses, broken down into safe and unsafe subsets. */
+	List	   *remote_conds;
+	List	   *local_conds;
+
+	/* Bitmap of attr numbers we need to fetch from the remote server. */
+	Bitmapset  *attrs_used;
+
+	/* Cost and selectivity of local_conds. */
+	QualCost	local_conds_cost;
+	Selectivity local_conds_sel;
+
+	/* Estimated size and cost for a scan with baserestrictinfo quals. */
+	double		rows;
+	int			width;
+	Cost		startup_cost;
+	Cost		total_cost;
+
+	/* Options extracted from catalogs. */
+	bool		use_remote_estimate;
+	Cost		fdw_startup_cost;
+	Cost		fdw_tuple_cost;
+
+	/* Cached catalog information. */
+	ForeignServer *server;
+	Oid			userid;
+
+	/* Join information */
+	RelOptInfo *outerrel;
+	RelOptInfo *innerrel;
+	JoinType	jointype;
+	List	   *joinclauses;
+	List	   *otherclauses;
+} PgFdwRelationInfo;
+
 /* in postgres_fdw.c */
 extern int	set_transmission_modes(void);
 extern void reset_transmission_modes(int nestlevel);
@@ -51,13 +93,30 @@ extern void deparseSelectSql(StringInfo buf,
 				 PlannerInfo *root,
 				 RelOptInfo *baserel,
 				 Bitmapset *attrs_used,
+				 List *remote_conds,
+				 List **params_list,
+				 List **fdw_ps_tlist,
 				 List **retrieved_attrs);
-extern void appendWhereClause(StringInfo buf,
+extern void appendConditions(StringInfo buf,
 				  PlannerInfo *root,
 				  RelOptInfo *baserel,
+				  List *outertlist,
+				  List *innertlist,
 				  List *exprs,
-				  bool is_first,
+				  const char *prefix,
 				  List **params);
+extern void deparseJoinSql(StringInfo sql,
+			   PlannerInfo *root,
+			   RelOptInfo *baserel,
+			   RelOptInfo *outerrel,
+			   RelOptInfo *innerrel,
+			   const char *sql_o,
+			   const char *sql_i,
+			   JoinType jointype,
+			   List *joinclauses,
+			   List *otherclauses,
+			   List **fdw_ps_tlist,
+			   List **retrieved_attrs);
 extern void deparseInsertSql(StringInfo buf, PlannerInfo *root,
 				 Index rtindex, Relation rel,
 				 List *targetAttrs, List *returningList,
diff --git a/contrib/postgres_fdw/sql/postgres_fdw.sql b/contrib/postgres_fdw/sql/postgres_fdw.sql
index 4a23457..05bd2f6 100644
--- a/contrib/postgres_fdw/sql/postgres_fdw.sql
+++ b/contrib/postgres_fdw/sql/postgres_fdw.sql
@@ -11,12 +11,17 @@ DO $d$
             OPTIONS (dbname '$$||current_database()||$$',
                      port '$$||current_setting('port')||$$'
             )$$;
+        EXECUTE $$CREATE SERVER loopback2 FOREIGN DATA WRAPPER postgres_fdw
+            OPTIONS (dbname '$$||current_database()||$$',
+                     port '$$||current_setting('port')||$$'
+            )$$;
     END;
 $d$;
 
 CREATE USER MAPPING FOR public SERVER testserver1
 	OPTIONS (user 'value', password 'value');
 CREATE USER MAPPING FOR CURRENT_USER SERVER loopback;
+CREATE USER MAPPING FOR CURRENT_USER SERVER loopback2;
 
 -- ===================================================================
 -- create objects used through FDW loopback server
@@ -39,6 +44,18 @@ CREATE TABLE "S 1"."T 2" (
 	c2 text,
 	CONSTRAINT t2_pkey PRIMARY KEY (c1)
 );
+CREATE TABLE "S 1"."T 3" (
+	c1 int NOT NULL,
+	c2 int NOT NULL,
+	c3 text,
+	CONSTRAINT t3_pkey PRIMARY KEY (c1)
+);
+CREATE TABLE "S 1"."T 4" (
+	c1 int NOT NULL,
+	c2 int NOT NULL,
+	c4 text,
+	CONSTRAINT t4_pkey PRIMARY KEY (c1)
+);
 
 INSERT INTO "S 1"."T 1"
 	SELECT id,
@@ -54,9 +71,23 @@ INSERT INTO "S 1"."T 2"
 	SELECT id,
 	       'AAA' || to_char(id, 'FM000')
 	FROM generate_series(1, 100) id;
+INSERT INTO "S 1"."T 3"
+	SELECT id,
+	       id + 1,
+	       'AAA' || to_char(id, 'FM000')
+	FROM generate_series(1, 100) id;
+DELETE FROM "S 1"."T 3" WHERE c1 % 2 != 0;	-- delete for outer join tests
+INSERT INTO "S 1"."T 4"
+	SELECT id,
+	       id + 1,
+	       'AAA' || to_char(id, 'FM000')
+	FROM generate_series(1, 100) id;
+DELETE FROM "S 1"."T 4" WHERE c1 % 3 != 0;	-- delete for outer join tests
 
 ANALYZE "S 1"."T 1";
 ANALYZE "S 1"."T 2";
+ANALYZE "S 1"."T 3";
+ANALYZE "S 1"."T 4";
 
 -- ===================================================================
 -- create foreign tables
@@ -87,6 +118,29 @@ CREATE FOREIGN TABLE ft2 (
 ) SERVER loopback;
 ALTER FOREIGN TABLE ft2 DROP COLUMN cx;
 
+CREATE FOREIGN TABLE ft4 (
+	c1 int NOT NULL,
+	c2 int NOT NULL,
+	c3 text
+) SERVER loopback OPTIONS (schema_name 'S 1', table_name 'T 3');
+
+CREATE FOREIGN TABLE ft5 (
+	c1 int NOT NULL,
+	c2 int NOT NULL,
+	c3 text
+) SERVER loopback OPTIONS (schema_name 'S 1', table_name 'T 4');
+
+CREATE FOREIGN TABLE ft6 (
+	c1 int NOT NULL,
+	c2 int NOT NULL,
+	c3 text
+) SERVER loopback2 OPTIONS (schema_name 'S 1', table_name 'T 4');
+CREATE USER view_owner;
+GRANT ALL ON ft5 TO view_owner;
+CREATE VIEW v_ft5 AS SELECT * FROM ft5;
+ALTER VIEW v_ft5 OWNER TO view_owner;
+CREATE USER MAPPING FOR view_owner SERVER loopback;
+
 -- ===================================================================
 -- tests for validator
 -- ===================================================================
@@ -158,8 +212,6 @@ EXPLAIN (VERBOSE, COSTS false) SELECT * FROM ft1 t1 WHERE c1 = 102 FOR SHARE;
 SELECT * FROM ft1 t1 WHERE c1 = 102 FOR SHARE;
 -- aggregate
 SELECT COUNT(*) FROM ft1 t1;
--- join two tables
-SELECT t1.c1 FROM ft1 t1 JOIN ft2 t2 ON (t1.c1 = t2.c1) ORDER BY t1.c3, t1.c1 OFFSET 100 LIMIT 10;
 -- subquery
 SELECT * FROM ft1 t1 WHERE t1.c3 IN (SELECT c3 FROM ft2 t2 WHERE c1 <= 10) ORDER BY c1;
 -- subquery+MAX
@@ -216,6 +268,82 @@ SELECT * FROM ft1 WHERE c1 = ANY (ARRAY(SELECT c1 FROM ft2 WHERE c1 < 5));
 SELECT * FROM ft2 WHERE c1 = ANY (ARRAY(SELECT c1 FROM ft1 WHERE c1 < 5));
 
 -- ===================================================================
+-- JOIN queries
+-- ===================================================================
+-- join two tables
+EXPLAIN (COSTS false, VERBOSE)
+SELECT t1.c1, t2.c1 FROM ft1 t1 JOIN ft2 t2 ON (t1.c1 = t2.c1) ORDER BY t1.c3, t1.c1 OFFSET 100 LIMIT 10;
+SELECT t1.c1, t2.c1 FROM ft1 t1 JOIN ft2 t2 ON (t1.c1 = t2.c1) ORDER BY t1.c3, t1.c1 OFFSET 100 LIMIT 10;
+-- join three tables
+EXPLAIN (COSTS false, VERBOSE)
+SELECT t1.c1, t2.c1, t3.c1 FROM ft1 t1 JOIN ft2 t2 ON (t1.c1 = t2.c1) JOIN ft4 t3 ON (t3.c1 = t1.c1) ORDER BY t1.c3, t1.c1 OFFSET 10 LIMIT 10;
+SELECT t1.c1, t2.c1, t3.c1 FROM ft1 t1 JOIN ft2 t2 ON (t1.c1 = t2.c1) JOIN ft4 t3 ON (t3.c1 = t1.c1) ORDER BY t1.c3, t1.c1 OFFSET 10 LIMIT 10;
+-- left outer join
+EXPLAIN (COSTS false, VERBOSE)
+SELECT t1.c1, t2.c1 FROM ft4 t1 LEFT JOIN ft5 t2 ON (t1.c1 = t2.c1) ORDER BY t1.c1, t2.c1 OFFSET 10 LIMIT 10;
+SELECT t1.c1, t2.c1 FROM ft4 t1 LEFT JOIN ft5 t2 ON (t1.c1 = t2.c1) ORDER BY t1.c1, t2.c1 OFFSET 10 LIMIT 10;
+-- right outer join
+SET enable_mergejoin = off; -- planner chooses MergeJoin even though it has higher cost, so disable it for testing.
+EXPLAIN (COSTS false, VERBOSE)
+SELECT t1.c1, t2.c1 FROM ft4 t1 RIGHT JOIN ft5 t2 ON (t1.c1 = t2.c1) ORDER BY t2.c1, t2.c1 OFFSET 10 LIMIT 10;
+SELECT t1.c1, t2.c1 FROM ft4 t1 RIGHT JOIN ft5 t2 ON (t1.c1 = t2.c1) ORDER BY t2.c1, t2.c1 OFFSET 10 LIMIT 10;
+SET enable_mergejoin = on;
+-- full outer join
+EXPLAIN (COSTS false, VERBOSE)
+SELECT t1.c1, t2.c1 FROM ft4 t1 FULL JOIN ft5 t2 ON (t1.c1 = t2.c1) ORDER BY t1.c1, t2.c1 OFFSET 45 LIMIT 10;
+SELECT t1.c1, t2.c1 FROM ft4 t1 FULL JOIN ft5 t2 ON (t1.c1 = t2.c1) ORDER BY t1.c1, t2.c1 OFFSET 45 LIMIT 10;
+-- full outer join + WHERE clause, only matched rows
+EXPLAIN (COSTS false, VERBOSE)
+SELECT t1.c1, t2.c1 FROM ft4 t1 FULL JOIN ft5 t2 ON (t1.c1 = t2.c1) WHERE (t1.c1 = t2.c1 OR t1.c1 IS NULL) ORDER BY t1.c1, t2.c1 OFFSET 10 LIMIT 10;
+SELECT t1.c1, t2.c1 FROM ft4 t1 FULL JOIN ft5 t2 ON (t1.c1 = t2.c1) WHERE (t1.c1 = t2.c1 OR t1.c1 IS NULL) ORDER BY t1.c1, t2.c1 OFFSET 10 LIMIT 10;
+-- join at WHERE clause 
+SET enable_mergejoin = off; -- planner chooses MergeJoin even though it has higher cost, so disable it for testing.
+EXPLAIN (COSTS false, VERBOSE)
+SELECT t1.c1, t2.c1 FROM ft4 t1 LEFT JOIN ft5 t2 ON true WHERE (t1.c1 = t2.c1) ORDER BY t1.c1, t2.c1 OFFSET 10 LIMIT 10;
+SELECT t1.c1, t2.c1 FROM ft4 t1 LEFT JOIN ft5 t2 ON true WHERE (t1.c1 = t2.c1) ORDER BY t1.c1, t2.c1 OFFSET 10 LIMIT 10;
+SET enable_mergejoin = on;
+-- join in CTE
+EXPLAIN (COSTS false, VERBOSE)
+WITH t (c1_1, c1_3, c2_1) AS (SELECT t1.c1, t1.c3, t2.c1 FROM ft1 t1 JOIN ft2 t2 ON (t1.c1 = t2.c1)) SELECT c1_1, c2_1 FROM t ORDER BY c1_3, c1_1 OFFSET 100 LIMIT 10;
+WITH t (c1_1, c1_3, c2_1) AS (SELECT t1.c1, t1.c3, t2.c1 FROM ft1 t1 JOIN ft2 t2 ON (t1.c1 = t2.c1)) SELECT c1_1, c2_1 FROM t ORDER BY c1_3, c1_1 OFFSET 100 LIMIT 10;
+-- ctid with whole-row reference
+EXPLAIN (COSTS false, VERBOSE)
+SELECT t1.ctid, t1, t2, t1.c1 FROM ft1 t1 JOIN ft2 t2 ON (t1.c1 = t2.c1) ORDER BY t1.c3, t1.c1 OFFSET 100 LIMIT 10;
+SELECT t1.ctid, t1, t2, t1.c1 FROM ft1 t1 JOIN ft2 t2 ON (t1.c1 = t2.c1) ORDER BY t1.c3, t1.c1 OFFSET 100 LIMIT 10;
+-- partially unsafe to push down, not pushed down
+EXPLAIN (COSTS false, VERBOSE)
+SELECT t1.c1 FROM ft1 t1 JOIN ft2 t2 ON t2.c1 = t2.c1 JOIN ft4 t3 ON t2.c1 = t3.c1 ORDER BY t1.c1 OFFSET 10 LIMIT 10;
+SELECT t1.c1 FROM ft1 t1 JOIN ft2 t2 ON t2.c1 = t2.c1 JOIN ft4 t3 ON t2.c1 = t3.c1 ORDER BY t1.c1 OFFSET 10 LIMIT 10;
+-- SEMI JOIN, not pushed down
+EXPLAIN (COSTS false, VERBOSE)
+SELECT t1.c1 FROM ft1 t1 WHERE EXISTS (SELECT 1 FROM ft2 t2 WHERE t1.c1 = t2.c1) ORDER BY t1.c1 OFFSET 100 LIMIT 10;
+SELECT t1.c1 FROM ft1 t1 WHERE EXISTS (SELECT 1 FROM ft2 t2 WHERE t1.c1 = t2.c1) ORDER BY t1.c1 OFFSET 100 LIMIT 10;
+-- ANTI JOIN, not pushed down
+EXPLAIN (COSTS false, VERBOSE)
+SELECT t1.c1 FROM ft1 t1 WHERE NOT EXISTS (SELECT 1 FROM ft2 t2 WHERE t1.c1 = t2.c2) ORDER BY t1.c1 OFFSET 100 LIMIT 10;
+SELECT t1.c1 FROM ft1 t1 WHERE NOT EXISTS (SELECT 1 FROM ft2 t2 WHERE t1.c1 = t2.c2) ORDER BY t1.c1 OFFSET 100 LIMIT 10;
+-- CROSS JOIN, not pushed down
+EXPLAIN (COSTS false, VERBOSE)
+SELECT t1.c1, t2.c1 FROM ft1 t1 CROSS JOIN ft2 t2 ORDER BY t1.c1, t2.c1 OFFSET 100 LIMIT 10;
+SELECT t1.c1, t2.c1 FROM ft1 t1 CROSS JOIN ft2 t2 ORDER BY t1.c1, t2.c1 OFFSET 100 LIMIT 10;
+-- different server
+EXPLAIN (COSTS false, VERBOSE)
+SELECT t1.c1, t2.c1 FROM ft5 t1 JOIN ft6 t2 ON (t1.c1 = t2.c1) ORDER BY t1.c1, t2.c1 OFFSET 100 LIMIT 10;
+SELECT t1.c1, t2.c1 FROM ft5 t1 JOIN ft6 t2 ON (t1.c1 = t2.c1) ORDER BY t1.c1, t2.c1 OFFSET 100 LIMIT 10;
+-- different effective user for permission check
+EXPLAIN (COSTS false, VERBOSE)
+SELECT t1.c1, t2.c1 FROM ft5 t1 JOIN v_ft5 t2 ON (t1.c1 = t2.c1) ORDER BY t1.c1, t2.c1 OFFSET 100 LIMIT 10;
+SELECT t1.c1, t2.c1 FROM ft5 t1 JOIN v_ft5 t2 ON (t1.c1 = t2.c1) ORDER BY t1.c1, t2.c1 OFFSET 100 LIMIT 10;
+-- unsafe join conditions
+EXPLAIN (COSTS false, VERBOSE)
+SELECT t1.c1, t2.c1 FROM ft1 t1 JOIN ft2 t2 ON (t1.c8 = t2.c8) ORDER BY t1.c3, t1.c1 OFFSET 100 LIMIT 10;
+SELECT t1.c1, t2.c1 FROM ft1 t1 JOIN ft2 t2 ON (t1.c8 = t2.c8) ORDER BY t1.c3, t1.c1 OFFSET 100 LIMIT 10;
+-- local filter (unsafe conditions on one side)
+EXPLAIN (COSTS false, VERBOSE)
+SELECT t1.c1, t2.c1 FROM ft1 t1 JOIN ft2 t2 ON (t1.c1 = t2.c1) WHERE t1.c8 = 'foo' ORDER BY t1.c3, t1.c1 OFFSET 100 LIMIT 10;
+SELECT t1.c1, t2.c1 FROM ft1 t1 JOIN ft2 t2 ON (t1.c1 = t2.c1) WHERE t1.c8 = 'foo' ORDER BY t1.c3, t1.c1 OFFSET 100 LIMIT 10;
+
+-- ===================================================================
 -- parameterized queries
 -- ===================================================================
 -- simple join
@@ -666,116 +794,6 @@ UPDATE rem1 SET f2 = 'testo';
 INSERT INTO rem1(f2) VALUES ('test') RETURNING ctid;
 
 -- ===================================================================
--- test inheritance features
--- ===================================================================
-
-CREATE TABLE a (aa TEXT);
-CREATE TABLE loct (aa TEXT, bb TEXT);
-CREATE FOREIGN TABLE b (bb TEXT) INHERITS (a)
-  SERVER loopback OPTIONS (table_name 'loct');
-
-INSERT INTO a(aa) VALUES('aaa');
-INSERT INTO a(aa) VALUES('aaaa');
-INSERT INTO a(aa) VALUES('aaaaa');
-
-INSERT INTO b(aa) VALUES('bbb');
-INSERT INTO b(aa) VALUES('bbbb');
-INSERT INTO b(aa) VALUES('bbbbb');
-
-SELECT tableoid::regclass, * FROM a;
-SELECT tableoid::regclass, * FROM b;
-SELECT tableoid::regclass, * FROM ONLY a;
-
-UPDATE a SET aa = 'zzzzzz' WHERE aa LIKE 'aaaa%';
-
-SELECT tableoid::regclass, * FROM a;
-SELECT tableoid::regclass, * FROM b;
-SELECT tableoid::regclass, * FROM ONLY a;
-
-UPDATE b SET aa = 'new';
-
-SELECT tableoid::regclass, * FROM a;
-SELECT tableoid::regclass, * FROM b;
-SELECT tableoid::regclass, * FROM ONLY a;
-
-UPDATE a SET aa = 'newtoo';
-
-SELECT tableoid::regclass, * FROM a;
-SELECT tableoid::regclass, * FROM b;
-SELECT tableoid::regclass, * FROM ONLY a;
-
-DELETE FROM a;
-
-SELECT tableoid::regclass, * FROM a;
-SELECT tableoid::regclass, * FROM b;
-SELECT tableoid::regclass, * FROM ONLY a;
-
-DROP TABLE a CASCADE;
-DROP TABLE loct;
-
--- Check SELECT FOR UPDATE/SHARE with an inherited source table
-create table loct1 (f1 int, f2 int, f3 int);
-create table loct2 (f1 int, f2 int, f3 int);
-
-create table foo (f1 int, f2 int);
-create foreign table foo2 (f3 int) inherits (foo)
-  server loopback options (table_name 'loct1');
-create table bar (f1 int, f2 int);
-create foreign table bar2 (f3 int) inherits (bar)
-  server loopback options (table_name 'loct2');
-
-insert into foo values(1,1);
-insert into foo values(3,3);
-insert into foo2 values(2,2,2);
-insert into foo2 values(4,4,4);
-insert into bar values(1,11);
-insert into bar values(2,22);
-insert into bar values(6,66);
-insert into bar2 values(3,33,33);
-insert into bar2 values(4,44,44);
-insert into bar2 values(7,77,77);
-
-explain (verbose, costs off)
-select * from bar where f1 in (select f1 from foo) for update;
-select * from bar where f1 in (select f1 from foo) for update;
-
-explain (verbose, costs off)
-select * from bar where f1 in (select f1 from foo) for share;
-select * from bar where f1 in (select f1 from foo) for share;
-
--- Check UPDATE with inherited target and an inherited source table
-explain (verbose, costs off)
-update bar set f2 = f2 + 100 where f1 in (select f1 from foo);
-update bar set f2 = f2 + 100 where f1 in (select f1 from foo);
-
-select tableoid::regclass, * from bar order by 1,2;
-
--- Check UPDATE with inherited target and an appendrel subquery
-explain (verbose, costs off)
-update bar set f2 = f2 + 100
-from
-  ( select f1 from foo union all select f1+3 from foo ) ss
-where bar.f1 = ss.f1;
-update bar set f2 = f2 + 100
-from
-  ( select f1 from foo union all select f1+3 from foo ) ss
-where bar.f1 = ss.f1;
-
-select tableoid::regclass, * from bar order by 1,2;
-
--- Test that WHERE CURRENT OF is not supported
-begin;
-declare c cursor for select * from bar where f1 = 7;
-fetch from c;
-update bar set f2 = null where current of c;
-rollback;
-
-drop table foo cascade;
-drop table bar cascade;
-drop table loct1;
-drop table loct2;
-
--- ===================================================================
 -- test IMPORT FOREIGN SCHEMA
 -- ===================================================================
 
@@ -831,3 +849,7 @@ DROP TYPE "Colors" CASCADE;
 IMPORT FOREIGN SCHEMA import_source LIMIT TO (t5)
   FROM SERVER loopback INTO import_dest5;  -- ERROR
 ROLLBACK;
+
+-- Cleanup
+DROP OWNED BY view_owner;
+DROP USER view_owner;
diff --git a/doc/src/sgml/postgres-fdw.sgml b/doc/src/sgml/postgres-fdw.sgml
index 43adb61..4a0159b 100644
--- a/doc/src/sgml/postgres-fdw.sgml
+++ b/doc/src/sgml/postgres-fdw.sgml
@@ -406,11 +406,27 @@
   <title>Remote Query Optimization</title>
 
   <para>
-   <filename>postgres_fdw</> attempts to optimize remote queries to reduce
-   the amount of data transferred from foreign servers.  This is done by
-   sending query <literal>WHERE</> clauses to the remote server for
-   execution, and by not retrieving table columns that are not needed for
-   the current query.  To reduce the risk of misexecution of queries,
+   <filename>postgres_fdw</filename> attempts to optimize remote queries to
+   reduce the amount of data transferred from foreign servers.
+   This is done in several ways.
+  </para>
+
+  <para>
+   For the <literal>SELECT</> clause, <filename>postgres_fdw</filename>
+   retrieves only the columns that the query actually needs.
+  </para>
+
+  <para>
+   If the <literal>FROM</> clause contains multiple foreign tables managed
+   by the same server and accessed as the same user,
+   <filename>postgres_fdw</> tries to join those foreign tables on the remote
+   side whenever it can.
+   To reduce the risk of misexecution of queries, <filename>postgres_fdw</>
+   does not send a join to the remote server when its join conditions might
+   have different semantics on the remote side.
+  </para>
+
+  <para>
    <literal>WHERE</> clauses are not sent to the remote server unless they use
    only built-in data types, operators, and functions.  Operators and
    functions in the clauses must be <literal>IMMUTABLE</> as well.

#20

Kouhei Kaigai

kaigai@ak.jp.nec.com

almost 11 years ago

In reply to: Shigeru Hanada (#19)

Hanada-san,

In addition to your comments, I removed useless code that retrieved ForeignPath
from the outer/inner RelOptInfo and stored them into ForeignPath#fdw_private. Now
postgres_fdw’s join push-down no longer depends on the existence of a ForeignPath
under the join relation. This means that we can support the case Robert mentioned
before: the whole "(huge JOIN large) JOIN small" can be pushed down even if
"(huge JOIN large)" is dominated by another join path.

Yes, it's definitely a reasonable design, and it fits the intention of the interface.
I should have pointed that out from the beginning. :-)

"l" of the first SELECT represents a whole-row reference.
However, underlying SELECT contains system columns in its target-
list.

Is it possible to construct a query like this?
SELECT l.a1, l.a2 FROM (SELECT (id,a,atext), ctid) l (a1, a2) ...
^^^^^^^^^^

Yes, a simple relation reference such as "l" is not sufficient for the purpose.
But putting columns in parentheses would not work when a user column is referenced
in the original query.

I implemented deparseProjectionSql to use ROW(...) expression for a whole-row
reference in the target list, in addition to ordinary column references for
actually used columns and ctid.

Please see the test case for mixed use of ctid and whole-row reference added to
postgres_fdw’s regression tests. Now a whole-row reference in the remote query
looks like this:

It seems to me that deparseProjectionSql() works properly.

-- ctid with whole-row reference
EXPLAIN (COSTS false, VERBOSE)
SELECT t1.ctid, t1, t2 FROM ft1 t1 JOIN ft2 t2 ON (t1.c1 = t2.c1) ORDER BY t1.c3,
t1.c1 OFFSET 100 LIMIT 10;

----------------------------------------------------------------------------
 Limit
   Output: t1.ctid, t1.*, t2.*, t1.c3, t1.c1
   ->  Sort
         Output: t1.ctid, t1.*, t2.*, t1.c3, t1.c1
         Sort Key: t1.c3, t1.c1
         ->  Foreign Scan
               Output: t1.ctid, t1.*, t2.*, t1.c3, t1.c1
               Remote SQL: SELECT l.a1, l.a2, l.a3, l.a4, r.a1 FROM (SELECT l.a7, ROW(l.a10, l.a11, l.a12, l.a13, l.a14, l.a15, l.a16, l.a17), l.a12, l.a10 FROM (SELECT "C 1" a10, c2 a11, c3 a12, c4 a13, c5 a14, c6 a15, c7 a1
(8 rows)

In fact l.a12 and l.a10, for t1.c3 and t1.c1, are redundant in transferred data,
but IMO this would simplify the code for deparsing.

I agree. Even if we can reduce networking cost a little, tuple
construction takes CPU cycles. Your decision is fair enough for
me.

* merge_fpinfo()
It seems to me fpinfo->rows should be joinrel->rows, and
fpinfo->width also should be joinrel->width.
There is no need for special intelligence here, is there?

Oops. They are a vestige of my struggle, which disabled the SELECT clause
optimization (omitting unused columns). Now width and rows are inherited from
joinrel. Besides that, fdw_startup_cost and fdw_tuple_cost seemed wrong, so I
fixed them to use a simple sum, not an average.

fpinfo->fdw_startup_cost represents the cost of opening a connection to the
remote PostgreSQL server, doesn't it?

postgres_fdw.c:1757 says as follows:

/*
* Add some additional cost factors to account for connection overhead
* (fdw_startup_cost), transferring data across the network
* (fdw_tuple_cost per retrieved row), and local manipulation of the data
* (cpu_tuple_cost per retrieved row).
*/

If so, does a ForeignScan that involves 100 underlying relations incur 100
times the network overhead on startup? Probably not.
I think the average is better than the sum, and the max of them would reflect
the cost more correctly.
Also, fdw_tuple_cost represents the cost of data transfer over the network.
I think a weighted average is the best strategy, like:
fpinfo->fdw_tuple_cost =
(fpinfo_o->width / (fpinfo_o->width + fpinfo_i->width)) * fpinfo_o->fdw_tuple_cost +
(fpinfo_i->width / (fpinfo_o->width + fpinfo_i->width)) * fpinfo_i->fdw_tuple_cost;

That's just my suggestion. Please apply whichever way you think is best.
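
As a rough illustration of the weighted-average idea, here is a minimal standalone sketch. RelCostInfo and weighted_tuple_cost() are hypothetical names used only for this example; they stand in for the width and fdw_tuple_cost fields of PgFdwRelationInfo and are not part of the patch. Since width is an int in that struct, the sketch converts to double before dividing (plain integer division would truncate each weight to 0), and it assumes that falling back to the outer side's cost is acceptable when both widths are zero.

```c
/* Hypothetical stand-in for the two PgFdwRelationInfo fields used here. */
typedef struct RelCostInfo
{
	int		width;			/* estimated row width in bytes */
	double	fdw_tuple_cost;	/* per-tuple transfer cost (Cost is a double) */
} RelCostInfo;

/*
 * Width-weighted average of the per-tuple costs of the two join inputs.
 * The widths are ints, so the total is computed as a double to avoid
 * integer division truncating the weights.
 */
static double
weighted_tuple_cost(const RelCostInfo *outer, const RelCostInfo *inner)
{
	double	total_width = (double) outer->width + (double) inner->width;

	/* Assumption: with no width information, just use the outer cost. */
	if (total_width <= 0.0)
		return outer->fdw_tuple_cost;

	return (outer->width / total_width) * outer->fdw_tuple_cost +
		   (inner->width / total_width) * inner->fdw_tuple_cost;
}
```

When both inputs carry the same fdw_tuple_cost, the result is that same cost regardless of the widths.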

* explain output

The EXPLAIN output may be a bit insufficient for understanding what the plan
actually tries to do.

postgres=# explain select * from ft1,ft2 where a = b;
QUERY PLAN
--------------------------------------------------------
Foreign Scan (cost=200.00..212.80 rows=1280 width=80)
(1 row)

Even though it is not an immediate request, it seems to me
better to show users joined relations and remote ON/WHERE
clause here.

Like this?

Foreign Scan on ft1 INNER JOIN ft2 ON ft1.a = ft2.b (cost=200.00..212.80
rows=1280 width=80)
…

No. This line is produced by ExplainScanTarget(), so that would require the
backend to have knowledge of individual FDWs.
Rather than the backend, postgresExplainForeignScan() should produce
the output.

It might produce a very long line when many tables are joined, because it
contains most of the remote query other than the SELECT clause, but I prefer
the detailed form. Another idea is to print “Join Cond” and “Remote Filter” as
separate EXPLAIN items.

It would be good if postgres_fdw could show the names of the relations involved
in the join at each level, and the join cond/remote filter individually.

Note that v8 patch doesn’t contain this change yet!

It is a "nice to have" feature. So, I don't think the first commit needs
to support this. Just a suggestion in the next step.

* implementation suggestion

At the deparseJoinSql(),

+   /* print SELECT clause of the join scan */
+   initStringInfo(&selbuf);
+   i = 0;
+   foreach(lc, baserel->reltargetlist)
+   {
+       Var        *var = (Var *) lfirst(lc);
+       TargetEntry *tle;
+
+       if (i > 0)
+           appendStringInfoString(&selbuf, ", ");
+       deparseJoinVar(var, &context);
+
+       tle = makeTargetEntry((Expr *) copyObject(var),
+                             i + 1, pstrdup(""), false);
+       if (fdw_ps_tlist)
+           *fdw_ps_tlist = lappend(*fdw_ps_tlist, copyObject(tle));
+
+       *retrieved_attrs = lappend_int(*retrieved_attrs, i + 1);
+
+       i++;
+   }

The tle is a copy of the original target entry, and the var node is also
a copy. Why is the tle copied again on lappend()?
Also, NULL is acceptable as the 3rd argument of makeTargetEntry.

Thanks,
--
NEC Business Creation Division / PG-Strom Project
KaiGai Kohei <kaigai@ak.jp.nec.com>

#21

Shigeru HANADA

shigeru.hanada@gmail.com

almost 11 years ago

In reply to: Kouhei Kaigai (#20)

1 attachment(s)

Kaigai-san,

Thanks for your review.

On 2015/04/09 10:48, Kouhei Kaigai <kaigai@ak.jp.nec.com> wrote:
* merge_fpinfo()

It seems to me fpinfo->rows should be joinrel->rows, and
fpinfo->width also should be joinrel->width.
There is no need for special intelligence here, is there?

Oops. They are a vestige of my struggle, which disabled the SELECT clause
optimization (omitting unused columns). Now width and rows are inherited from
joinrel. Besides that, fdw_startup_cost and fdw_tuple_cost seemed wrong, so I
fixed them to use a simple sum, not an average.

fpinfo->fdw_startup_cost represents the cost of opening a connection to the
remote PostgreSQL server, doesn't it?

postgres_fdw.c:1757 says as follows:

/*
* Add some additional cost factors to account for connection overhead
* (fdw_startup_cost), transferring data across the network
* (fdw_tuple_cost per retrieved row), and local manipulation of the data
* (cpu_tuple_cost per retrieved row).
*/

If so, does a ForeignScan that involves 100 underlying relations incur 100
times the network overhead on startup? Probably not.
I think the average is better than the sum, and the max of them would reflect
the cost more correctly.

In my current opinion, no. Though I remember that I've written such comments before :P.

Connection establishment occurs only once, on the very first access to the server, so in use cases with long-lived sessions (via psql, connection pooling, etc.), taking connection overhead into account *every time* seems too pessimistic.

Instead, for practical cases, fdw_startup_cost should cover the overhead of constructing the query and getting its first response (ideally excluding the retrieval of actual data). These overheads are on the order of milliseconds. I’m not sure what value is appropriate for the default, but 100 seems not so bad.

Anyway, fdw_startup_cost is a per-server setting, the same as fdw_tuple_cost, and it should not be modified according to the width of the result, so using fpinfo_o->fdw_startup_cost would be OK.

Also, fdw_tuple_cost represents the cost of data transfer over the network.
I think a weighted average is the best strategy, like:
fpinfo->fdw_tuple_cost =
(fpinfo_o->width / (fpinfo_o->width + fpinfo_i->width)) * fpinfo_o->fdw_tuple_cost +
(fpinfo_i->width / (fpinfo_o->width + fpinfo_i->width)) * fpinfo_i->fdw_tuple_cost;

That's just my suggestion. Please apply whichever approach you think is best.

I can't agree with that strategy, because 1) a width of 0 causes a per-tuple cost of 0, and 2) fdw_tuple_cost never varies within a foreign server. Using fpinfo_o->fdw_tuple_cost (it must be identical to fpinfo_i->fdw_tuple_cost) seems reasonable. Thoughts?
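
To make the two proposals concrete, here is a minimal sketch in plain C. FdwCostInfo is a hypothetical, simplified stand-in for the cost-related fields of PgFdwRelationInfo, and both function names are invented for illustration; this is not code from the patch. The fallback for a zero total width is also an assumption of the sketch. The numbers used in the usage note are sample values only.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the cost fields discussed in this thread. */
typedef struct FdwCostInfo
{
	int		width;				/* estimated row width in bytes */
	double	fdw_startup_cost;	/* per-server setting */
	double	fdw_tuple_cost;		/* per-server setting */
} FdwCostInfo;

/*
 * Width-weighted average of the per-tuple cost, as suggested in the review.
 * The division is done in floating point.  Falling back to the outer value
 * when the total width is zero is an assumption made by this sketch.
 */
static double
tuple_cost_weighted(const FdwCostInfo *outer, const FdwCostInfo *inner)
{
	double		total;

	assert(outer != NULL && inner != NULL);
	total = (double) outer->width + (double) inner->width;
	if (total == 0.0)
		return outer->fdw_tuple_cost;

	return (outer->width / total) * outer->fdw_tuple_cost +
		(inner->width / total) * inner->fdw_tuple_cost;
}

/*
 * Simple inheritance, as argued for in the reply: both children belong to
 * the same foreign server, so their per-server settings are identical and
 * the outer side's values can be reused.  Width comes from the joinrel.
 */
static void
merge_cost_info(FdwCostInfo *join,
				const FdwCostInfo *outer,
				const FdwCostInfo *inner,
				int joinrel_width)
{
	assert(join != NULL && outer != NULL && inner != NULL);
	/* Same server, hence same settings on both sides. */
	assert(outer->fdw_startup_cost == inner->fdw_startup_cost);
	assert(outer->fdw_tuple_cost == inner->fdw_tuple_cost);

	join->width = joinrel_width;
	join->fdw_startup_cost = outer->fdw_startup_cost;
	join->fdw_tuple_cost = outer->fdw_tuple_cost;
}
```

With both children on one server (say a startup cost of 100 and a tuple cost of 0.01 on each side), the weighted average collapses to the shared value, so the simpler inheritance gives the same answer without depending on the widths at all.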

* explain output

The EXPLAIN output may be insufficient to tell what the scan
actually tries to do.

postgres=# explain select * from ft1,ft2 where a = b;
QUERY PLAN
--------------------------------------------------------
Foreign Scan (cost=200.00..212.80 rows=1280 width=80)
(1 row)

Even though it is not an immediate request, it seems better to me
to show users the joined relations and the remote ON/WHERE
clauses here.

Like this?

Foreign Scan on ft1 INNER JOIN ft2 ON ft1.a = ft2.b (cost=200.00..212.80
rows=1280 width=80)
…

No. This line is produced by ExplainScanTarget(), so it would require the
backend to have knowledge of individual FDWs.
Rather than the backend, postgresExplainForeignScan() should produce
the output.

Agreed. Additional FDW output such as “Relations”, “Join type”, and “Join conditions” would be possible.

It might produce a very long line when many tables are joined, because it
contains most of the remote query other than the SELECT clause, but I prefer the
detailed form. Another idea is to print “Join Cond” and “Remote Filter” as separate
EXPLAIN items.

It would be good if postgres_fdw could generate the names of the relations involved
in the join at each level, and the join cond/remote filter individually.

Note that v8 patch doesn’t contain this change yet!

It is a "nice to have" feature, so I don't think the first commit needs
to support it. It is just a suggestion for the next step.

Agreed.
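
As a rough illustration of the "Relations" idea discussed above, the following sketch builds a nested description string for a join, one level at a time. It is plain C with no PostgreSQL dependencies; describe_join() and its output format are hypothetical and are not part of the v10 patch, which does not implement this EXPLAIN output.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Hypothetical helper: writes "(outer) JOINTYPE JOIN (inner)" into dst.
 * Returns the number of characters that the full description requires
 * (as snprintf does), so the caller can detect truncation.
 */
static int
describe_join(char *dst, size_t dstlen,
			  const char *outer_desc,
			  const char *jointype,
			  const char *inner_desc)
{
	assert(dst != NULL && dstlen > 0);
	assert(outer_desc != NULL && jointype != NULL && inner_desc != NULL);

	return snprintf(dst, dstlen, "(%s) %s JOIN (%s)",
					outer_desc, jointype, inner_desc);
}
```

Calling describe_join(buf, sizeof(buf), "ft1", "INNER", "ft2") fills buf with "(ft1) INNER JOIN (ft2)". Passing that result as the outer description of a second call nests it, so each join level stays visible in a single EXPLAIN item.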

* implementation suggestion

In deparseJoinSql():

+   /* print SELECT clause of the join scan */
+   initStringInfo(&selbuf);
+   i = 0;
+   foreach(lc, baserel->reltargetlist)
+   {
+       Var        *var = (Var *) lfirst(lc);
+       TargetEntry *tle;
+
+       if (i > 0)
+           appendStringInfoString(&selbuf, ", ");
+       deparseJoinVar(var, &context);
+
+       tle = makeTargetEntry((Expr *) copyObject(var),
+                             i + 1, pstrdup(""), false);
+       if (fdw_ps_tlist)
+           *fdw_ps_tlist = lappend(*fdw_ps_tlist, copyObject(tle));
+
+       *retrieved_attrs = lappend_int(*retrieved_attrs, i + 1);
+
+       i++;
+   }

The tle is already a copy of the original target entry, and its Var node is also
a copy. Why is the tle copied again on lappend()?
Also, NULL is acceptable as the 3rd argument of makeTargetEntry().

Good catch. Fixed.

--
Shigeru HANADA
shigeru.hanada@gmail.com

Attachments:

foreign_join_v10.patch (application/octet-stream)

diff --git a/contrib/postgres_fdw/deparse.c b/contrib/postgres_fdw/deparse.c
index 94fab18..dc09929 100644
--- a/contrib/postgres_fdw/deparse.c
+++ b/contrib/postgres_fdw/deparse.c
@@ -44,8 +44,11 @@
 #include "catalog/pg_proc.h"
 #include "catalog/pg_type.h"
 #include "commands/defrem.h"
+#include "nodes/makefuncs.h"
 #include "nodes/nodeFuncs.h"
+#include "nodes/plannodes.h"
 #include "optimizer/clauses.h"
+#include "optimizer/prep.h"
 #include "optimizer/var.h"
 #include "parser/parsetree.h"
 #include "utils/builtins.h"
@@ -89,6 +92,8 @@ typedef struct deparse_expr_cxt
 	RelOptInfo *foreignrel;		/* the foreign relation we are planning for */
 	StringInfo	buf;			/* output buffer to append to */
 	List	  **params_list;	/* exprs that will become remote Params */
+	List	   *outertlist;		/* outer child's target list */
+	List	   *innertlist;		/* inner child's target list */
 } deparse_expr_cxt;
 
 /*
@@ -137,12 +142,19 @@ static void printRemoteParam(int paramindex, Oid paramtype, int32 paramtypmod,
 static void printRemotePlaceholder(Oid paramtype, int32 paramtypmod,
 					   deparse_expr_cxt *context);
 
+/*
+ * convert absolute attnum to relative one.  This would be handy for handling
+ * attnum for attrs_used and column aliases.
+ */
+#define TO_RELATIVE(x)	((x) - FirstLowInvalidHeapAttributeNumber)
+
 
 /*
  * Examine each qual clause in input_conds, and classify them into two groups,
  * which are returned as two lists:
  *	- remote_conds contains expressions that can be evaluated remotely
  *	- local_conds contains expressions that can't be evaluated remotely
+ * Note that each element is Expr, which was stripped from RestrictInfo, 
  */
 void
 classifyConditions(PlannerInfo *root,
@@ -161,9 +173,9 @@ classifyConditions(PlannerInfo *root,
 		RestrictInfo *ri = (RestrictInfo *) lfirst(lc);
 
 		if (is_foreign_expr(root, baserel, ri->clause))
-			*remote_conds = lappend(*remote_conds, ri);
+			*remote_conds = lappend(*remote_conds, ri->clause);
 		else
-			*local_conds = lappend(*local_conds, ri);
+			*local_conds = lappend(*local_conds, ri->clause);
 	}
 }
 
@@ -250,7 +262,7 @@ foreign_expr_walker(Node *node,
 				 * Param's collation, ie it's not safe for it to have a
 				 * non-default collation.
 				 */
-				if (var->varno == glob_cxt->foreignrel->relid &&
+				if (bms_is_member(var->varno, glob_cxt->foreignrel->relids) &&
 					var->varlevelsup == 0)
 				{
 					/* Var belongs to foreign table */
@@ -681,12 +693,57 @@ deparseSelectSql(StringInfo buf,
 				 PlannerInfo *root,
 				 RelOptInfo *baserel,
 				 Bitmapset *attrs_used,
+				 List *remote_conds,
+				 List **params_list,
+				 List **fdw_ps_tlist,
 				 List **retrieved_attrs)
 {
+	PgFdwRelationInfo  *fpinfo = (PgFdwRelationInfo *) baserel->fdw_private;
 	RangeTblEntry *rte = planner_rt_fetch(baserel->relid, root);
 	Relation	rel;
 
 	/*
+	 * If given relation was a join relation, recursively construct statement
+	 * by putting each outer and inner relations in FROM clause as a subquery
+	 * with aliasing.
+	 */
+	if (baserel->reloptkind == RELOPT_JOINREL)
+	{
+		RelOptInfo		   *rel_o = fpinfo->outerrel;
+		RelOptInfo		   *rel_i = fpinfo->innerrel;
+		PgFdwRelationInfo  *fpinfo_o = (PgFdwRelationInfo *) rel_o->fdw_private;
+		PgFdwRelationInfo  *fpinfo_i = (PgFdwRelationInfo *) rel_i->fdw_private;
+		StringInfoData		sql_o;
+		StringInfoData		sql_i;
+		List			   *ret_attrs_tmp;	/* not used */
+
+		/*
+		 * Deparse query for outer and inner relation, and combine them into
+		 * a query.
+		 */
+		initStringInfo(&sql_o);
+		deparseSelectSql(&sql_o, root, rel_o, fpinfo_o->attrs_used,
+						 fpinfo_o->remote_conds, params_list,
+						 fdw_ps_tlist, &ret_attrs_tmp);
+		initStringInfo(&sql_i);
+		deparseSelectSql(&sql_i, root, rel_i, fpinfo_i->attrs_used,
+						 fpinfo_i->remote_conds, params_list,
+						 fdw_ps_tlist, &ret_attrs_tmp);
+
+		deparseJoinSql(buf, root, baserel,
+					   fpinfo->outerrel,
+					   fpinfo->innerrel,
+					   sql_o.data,
+					   sql_i.data,
+					   fpinfo->jointype,
+					   fpinfo->joinclauses,
+					   fpinfo->otherclauses,
+					   fdw_ps_tlist,
+					   retrieved_attrs);
+		return;
+	}
+
+	/*
 	 * Core code already has some lock on each rel being planned, so we can
 	 * use NoLock here.
 	 */
@@ -705,6 +762,65 @@ deparseSelectSql(StringInfo buf,
 	appendStringInfoString(buf, " FROM ");
 	deparseRelation(buf, rel);
 
+	/*
+	 * Construct WHERE clause
+	 */
+	if (remote_conds)
+		appendConditions(buf, root, baserel, NULL, NULL, remote_conds,
+						 " WHERE ", params_list);
+
+	/*
+	 * Add FOR UPDATE/SHARE if appropriate.  We apply locking during the
+	 * initial row fetch, rather than later on as is done for local tables.
+	 * The extra roundtrips involved in trying to duplicate the local
+	 * semantics exactly don't seem worthwhile (see also comments for
+	 * RowMarkType).
+	 *
+	 * Note: because we actually run the query as a cursor, this assumes
+	 * that DECLARE CURSOR ... FOR UPDATE is supported, which it isn't
+	 * before 8.3.
+	 */
+	if (baserel->relid == root->parse->resultRelation &&
+		(root->parse->commandType == CMD_UPDATE ||
+		 root->parse->commandType == CMD_DELETE))
+	{
+		/* Relation is UPDATE/DELETE target, so use FOR UPDATE */
+		appendStringInfoString(buf, " FOR UPDATE");
+	}
+	else
+	{
+		PlanRowMark *rc = get_plan_rowmark(root->rowMarks, baserel->relid);
+
+		if (rc)
+		{
+			/*
+			 * Relation is specified as a FOR UPDATE/SHARE target, so handle
+			 * that.  (But we could also see LCS_NONE, meaning this isn't a
+			 * target relation after all.)
+			 *
+			 * For now, just ignore any [NO] KEY specification, since (a)
+			 * it's not clear what that means for a remote table that we
+			 * don't have complete information about, and (b) it wouldn't
+			 * work anyway on older remote servers.  Likewise, we don't
+			 * worry about NOWAIT.
+			 */
+			switch (rc->strength)
+			{
+				case LCS_NONE:
+					/* No locking needed */
+					break;
+				case LCS_FORKEYSHARE:
+				case LCS_FORSHARE:
+					appendStringInfoString(buf, " FOR SHARE");
+					break;
+				case LCS_FORNOKEYUPDATE:
+				case LCS_FORUPDATE:
+					appendStringInfoString(buf, " FOR UPDATE");
+					break;
+			}
+		}
+	}
+
 	heap_close(rel, NoLock);
 }
 
@@ -731,8 +847,7 @@ deparseTargetList(StringInfo buf,
 	*retrieved_attrs = NIL;
 
 	/* If there's a whole-row reference, we'll need all the columns. */
-	have_wholerow = bms_is_member(0 - FirstLowInvalidHeapAttributeNumber,
-								  attrs_used);
+	have_wholerow = bms_is_member(TO_RELATIVE(0), attrs_used);
 
 	first = true;
 	for (i = 1; i <= tupdesc->natts; i++)
@@ -743,15 +858,14 @@ deparseTargetList(StringInfo buf,
 		if (attr->attisdropped)
 			continue;
 
-		if (have_wholerow ||
-			bms_is_member(i - FirstLowInvalidHeapAttributeNumber,
-						  attrs_used))
+		if (have_wholerow || bms_is_member(TO_RELATIVE(i), attrs_used))
 		{
 			if (!first)
 				appendStringInfoString(buf, ", ");
 			first = false;
 
 			deparseColumnRef(buf, rtindex, i, root);
+			appendStringInfo(buf, " a%d", TO_RELATIVE(i));
 
 			*retrieved_attrs = lappend_int(*retrieved_attrs, i);
 		}
@@ -761,17 +875,17 @@ deparseTargetList(StringInfo buf,
 	 * Add ctid if needed.  We currently don't support retrieving any other
 	 * system columns.
 	 */
-	if (bms_is_member(SelfItemPointerAttributeNumber - FirstLowInvalidHeapAttributeNumber,
-					  attrs_used))
+	if (bms_is_member(TO_RELATIVE(SelfItemPointerAttributeNumber), attrs_used))
 	{
 		if (!first)
 			appendStringInfoString(buf, ", ");
 		first = false;
 
-		appendStringInfoString(buf, "ctid");
+		appendStringInfo(buf, "ctid a%d",
+						 TO_RELATIVE(SelfItemPointerAttributeNumber));
 
 		*retrieved_attrs = lappend_int(*retrieved_attrs,
-									   SelfItemPointerAttributeNumber);
+										   SelfItemPointerAttributeNumber);
 	}
 
 	/* Don't generate bad syntax if no undropped columns */
@@ -780,7 +894,8 @@ deparseTargetList(StringInfo buf,
 }
 
 /*
- * Deparse WHERE clauses in given list of RestrictInfos and append them to buf.
+ * Deparse conditions, such as WHERE clause and ON clause of JOIN, in given
+ * list of Expr and append them to buf.
  *
  * baserel is the foreign table we're planning for.
  *
@@ -794,12 +909,14 @@ deparseTargetList(StringInfo buf,
  * so Params and other-relation Vars should be replaced by dummy values.
  */
 void
-appendWhereClause(StringInfo buf,
-				  PlannerInfo *root,
-				  RelOptInfo *baserel,
-				  List *exprs,
-				  bool is_first,
-				  List **params)
+appendConditions(StringInfo buf,
+				 PlannerInfo *root,
+				 RelOptInfo *baserel,
+				 List *outertlist,
+				 List *innertlist,
+				 List *exprs,
+				 const char *prefix,
+				 List **params)
 {
 	deparse_expr_cxt context;
 	int			nestlevel;
@@ -813,31 +930,318 @@ appendWhereClause(StringInfo buf,
 	context.foreignrel = baserel;
 	context.buf = buf;
 	context.params_list = params;
+	context.outertlist = outertlist;
+	context.innertlist = innertlist;
 
 	/* Make sure any constants in the exprs are printed portably */
 	nestlevel = set_transmission_modes();
 
 	foreach(lc, exprs)
 	{
-		RestrictInfo *ri = (RestrictInfo *) lfirst(lc);
+		Expr	   *expr = (Expr *) lfirst(lc);
 
 		/* Connect expressions with "AND" and parenthesize each condition. */
-		if (is_first)
-			appendStringInfoString(buf, " WHERE ");
-		else
-			appendStringInfoString(buf, " AND ");
+		if (prefix)
+			appendStringInfo(buf, "%s", prefix);
 
 		appendStringInfoChar(buf, '(');
-		deparseExpr(ri->clause, &context);
+		deparseExpr(expr, &context);
 		appendStringInfoChar(buf, ')');
 
-		is_first = false;
+		prefix= " AND ";
 	}
 
 	reset_transmission_modes(nestlevel);
 }
 
 /*
+ * Returns position index (start with 1) of given var in given target list, or
+ * 0 when not found.
+ */
+static int
+find_var_pos(Var *node, List *tlist)
+{
+	int		pos = 1;
+	ListCell *lc;
+
+	foreach(lc, tlist)
+	{
+		Var *var = (Var *) lfirst(lc);
+
+		if (equal(var, node))
+		{
+			return pos;
+		}
+		pos++;
+	}
+
+	return 0;
+}
+
+/*
+ * Deparse given Var into buf.
+ */
+static void
+deparseJoinVar(Var *node, deparse_expr_cxt *context)
+{
+	char		side;
+	int			pos;
+
+	pos = find_var_pos(node, context->outertlist);
+	if (pos > 0)
+		side = 'l';
+	else
+	{
+		side = 'r';
+		pos = find_var_pos(node, context->innertlist);
+	}
+	Assert(pos > 0);
+	Assert(side == 'l' || side == 'r');
+
+	/*
+	 * We treat whole-row reference same as ordinary attribute references,
+	 * because such transformation should be done in lower level.
+	 */
+	appendStringInfo(context->buf, "%c.a%d", side, pos);
+}
+
+/*
+ * Deparse column alias list for a subquery in FROM clause.
+ */
+static void
+deparseColumnAliases(StringInfo buf, List *tlist)
+{
+	int			pos;
+	ListCell   *lc;
+
+	pos = 1;
+	foreach(lc, tlist)
+	{
+		/* Deparse column alias for the subquery */
+		if (pos > 1)
+			appendStringInfoString(buf, ", ");
+		appendStringInfo(buf, "a%d", pos);
+		pos++;
+	}
+}
+
+/*
+ * Deparse "wrapper" SQL for a query which contains whole-row reference or ctid.
+ * If the SQL is enough simple, just return it.
+ *
+ * Projecting whole-row value from each column value is done by ExecProjection
+ * for results of a scan against an ordinary tables, but join push-down omits
+ * ExecProjection calls so we need to do it in the remote SQL.
+ */
+static const char *
+deparseProjectionSql(PlannerInfo *root,
+					 RelOptInfo *baserel,
+					 const char *sql,
+					 char side)
+{
+	StringInfoData wholerow;
+	StringInfoData buf;
+	ListCell   *lc;
+	bool		first;
+	bool		have_wholerow = false;
+	bool		have_ctid = false;
+
+	/*
+	 * We have nothing to do if the targetlist contains no special reference,
+	 * such as whole-row and ctid.
+	 */
+	foreach(lc, baserel->reltargetlist)
+	{
+		Var		   *var = (Var *) lfirst(lc);
+		if (var->varattno == 0)
+		{
+			have_wholerow = true;
+			if (have_ctid)
+				break;
+		}
+		else if (var->varattno == SelfItemPointerAttributeNumber)
+		{
+			have_ctid = true;
+			if (have_wholerow)
+				break;
+		}
+	}
+	if (!have_wholerow && !have_ctid)
+		return sql;
+
+	/*
+	 * Construct whole-row reference with ROW() syntax
+	 */
+	if (have_wholerow)
+	{
+		RangeTblEntry *rte;
+		Relation		rel;
+		TupleDesc		tupdesc;
+		int				i;
+
+		/* Obtain TupleDesc for deparsing all valid columns */
+		rte = planner_rt_fetch(baserel->relid, root);
+		rel = heap_open(rte->relid, NoLock);
+		tupdesc = rel->rd_att;
+
+		/* Print all valid columns in ROW() to generate whole-row value */
+		initStringInfo(&wholerow);
+		appendStringInfoString(&wholerow, "ROW(");
+		first = true;
+		for (i = 1; i <= tupdesc->natts; i++)
+		{
+			Form_pg_attribute attr = tupdesc->attrs[i - 1];
+
+			/* Ignore dropped columns. */
+			if (attr->attisdropped)
+				continue;
+
+			if (!first)
+				appendStringInfoString(&wholerow, ", ");
+			first = false;
+
+			appendStringInfo(&wholerow, "%c.a%d", side, TO_RELATIVE(i));
+		}
+		appendStringInfoString(&wholerow, ")");
+
+		heap_close(rel, NoLock);
+	}
+
+	/*
+	 * Construct a SELECT statement which has the original query in its FROM
+	 * clause, and have target list entries in its SELECT clause.  The number
+	 * used in column aliases are attnum - FirstLowInvalidHeapAttributeNumber in
+	 * order to make all numbers positive even for system columns which have
+	 * minus value as attnum.
+	 */
+	initStringInfo(&buf);
+	appendStringInfoString(&buf, "SELECT ");
+	first = true;
+	foreach(lc, baserel->reltargetlist)
+	{
+		Var *var = (Var *) lfirst(lc);
+
+		if (!first)
+			appendStringInfoString(&buf, ", ");
+	
+		if (var->varattno == 0)
+			appendStringInfo(&buf, "%s", wholerow.data);
+		else
+			appendStringInfo(&buf, "%c.a%d", side, TO_RELATIVE(var->varattno));
+
+		first = false;
+	}
+	appendStringInfo(&buf, " FROM (%s) %c", sql, side);
+
+	return buf.data;
+}
+
+/*
+ * Construct a SELECT statement which contains join clause.
+ *
+ * We also create an TargetEntry List of the columns being retrieved, which is
+ * returned to *fdw_ps_tlist.
+ *
+ * path_o, tl_o, sql_o are respectively path, targetlist, and remote query
+ * statement of the outer child relation.  postfix _i means those for the inner
+ * child relation.  jointype and joinclauses are information of join method.
+ * fdw_ps_tlist is output parameter to pass target list of the pseudo scan to
+ * caller.
+ */
+void
+deparseJoinSql(StringInfo buf,
+			   PlannerInfo *root,
+			   RelOptInfo *baserel,
+			   RelOptInfo *outerrel,
+			   RelOptInfo *innerrel,
+			   const char *sql_o,
+			   const char *sql_i,
+			   JoinType jointype,
+			   List *joinclauses,
+			   List *otherclauses,
+			   List **fdw_ps_tlist,
+			   List **retrieved_attrs)
+{
+	StringInfoData selbuf;		/* buffer for SELECT clause */
+	StringInfoData abuf_o;		/* buffer for column alias list of outer */
+	StringInfoData abuf_i;		/* buffer for column alias list of inner */
+	int			i;
+	ListCell   *lc;
+	const char *jointype_str;
+	deparse_expr_cxt context;
+
+	context.root = root;
+	context.foreignrel = baserel;
+	context.buf = &selbuf;
+	context.params_list = NULL;
+	context.outertlist = outerrel->reltargetlist;
+	context.innertlist = innerrel->reltargetlist;
+
+	jointype_str = jointype == JOIN_INNER ? "INNER" :
+				   jointype == JOIN_LEFT ? "LEFT" :
+				   jointype == JOIN_RIGHT ? "RIGHT" :
+				   jointype == JOIN_FULL ? "FULL" : "";
+
+	*retrieved_attrs = NIL;
+
+	/* print SELECT clause of the join scan */
+	initStringInfo(&selbuf);
+	i = 0;
+	foreach(lc, baserel->reltargetlist)
+	{
+		Var		   *var = (Var *) lfirst(lc);
+		TargetEntry *tle;
+
+		if (i > 0)
+			appendStringInfoString(&selbuf, ", ");
+		deparseJoinVar(var, &context);
+
+		tle = makeTargetEntry((Expr *) var, i + 1, NULL, false);
+		if (fdw_ps_tlist)
+			*fdw_ps_tlist = lappend(*fdw_ps_tlist, tle);
+
+		*retrieved_attrs = lappend_int(*retrieved_attrs, i + 1);
+
+		i++;
+	}
+	if (i == 0)
+		appendStringInfoString(&selbuf, "NULL");
+
+	/*
+	 * Do pseudo-projection for an underlying scan on a foreign table, if a) the
+	 * relation is a base relation, and b) its targetlist contains whole-row
+	 * reference.
+	 */
+	if (outerrel->reloptkind == RELOPT_BASEREL)
+		sql_o = deparseProjectionSql(root, outerrel, sql_o, 'l');
+	if (innerrel->reloptkind == RELOPT_BASEREL)
+		sql_i = deparseProjectionSql(root, innerrel, sql_i, 'r');
+
+	/* Deparse column alias portion of subquery in FROM clause. */
+	initStringInfo(&abuf_o);
+	deparseColumnAliases(&abuf_o, outerrel->reltargetlist);
+	initStringInfo(&abuf_i);
+	deparseColumnAliases(&abuf_i, innerrel->reltargetlist);
+
+	/* Construct SELECT statement */
+	appendStringInfo(buf, "SELECT %s FROM", selbuf.data);
+	appendStringInfo(buf, " (%s) l (%s) %s JOIN (%s) r (%s)",
+					 sql_o, abuf_o.data, jointype_str, sql_i, abuf_i.data);
+	/* Append ON clause */
+	if (joinclauses)
+		appendConditions(buf, root, baserel,
+						 outerrel->reltargetlist, innerrel->reltargetlist,
+						 joinclauses,
+						 " ON ", NULL);
+	/* Append WHERE clause */
+	if (otherclauses)
+		appendConditions(buf, root, baserel,
+						 outerrel->reltargetlist, innerrel->reltargetlist,
+						 otherclauses,
+						 " WHERE ", NULL);
+}
+
+/*
  * deparse remote INSERT statement
  *
  * The statement text is appended to buf, and we also create an integer List
@@ -976,8 +1380,7 @@ deparseReturningList(StringInfo buf, PlannerInfo *root,
 	if (trig_after_row)
 	{
 		/* whole-row reference acquires all non-system columns */
-		attrs_used =
-			bms_make_singleton(0 - FirstLowInvalidHeapAttributeNumber);
+		attrs_used = bms_make_singleton(TO_RELATIVE(0));
 	}
 
 	if (returningList != NIL)
@@ -1261,6 +1664,8 @@ deparseExpr(Expr *node, deparse_expr_cxt *context)
 /*
  * Deparse given Var node into context->buf.
  *
+ * If context has valid innerrel, this is invoked for a join conditions.
+ *
  * If the Var belongs to the foreign relation, just print its remote name.
  * Otherwise, it's effectively a Param (and will in fact be a Param at
  * run time).  Handle it the same way we handle plain Params --- see
@@ -1271,39 +1676,46 @@ deparseVar(Var *node, deparse_expr_cxt *context)
 {
 	StringInfo	buf = context->buf;
 
-	if (node->varno == context->foreignrel->relid &&
-		node->varlevelsup == 0)
+	if (context->foreignrel->reloptkind == RELOPT_JOINREL)
 	{
-		/* Var belongs to foreign table */
-		deparseColumnRef(buf, node->varno, node->varattno, context->root);
+		deparseJoinVar(node, context);
 	}
 	else
 	{
-		/* Treat like a Param */
-		if (context->params_list)
+		if (node->varno == context->foreignrel->relid &&
+			node->varlevelsup == 0)
 		{
-			int			pindex = 0;
-			ListCell   *lc;
-
-			/* find its index in params_list */
-			foreach(lc, *context->params_list)
+			/* Var belongs to foreign table */
+			deparseColumnRef(buf, node->varno, node->varattno, context->root);
+		}
+		else
+		{
+			/* Treat like a Param */
+			if (context->params_list)
 			{
-				pindex++;
-				if (equal(node, (Node *) lfirst(lc)))
-					break;
+				int			pindex = 0;
+				ListCell   *lc;
+
+				/* find its index in params_list */
+				foreach(lc, *context->params_list)
+				{
+					pindex++;
+					if (equal(node, (Node *) lfirst(lc)))
+						break;
+				}
+				if (lc == NULL)
+				{
+					/* not in list, so add it */
+					pindex++;
+					*context->params_list = lappend(*context->params_list, node);
+				}
+
+				printRemoteParam(pindex, node->vartype, node->vartypmod, context);
 			}
-			if (lc == NULL)
+			else
 			{
-				/* not in list, so add it */
-				pindex++;
-				*context->params_list = lappend(*context->params_list, node);
+				printRemotePlaceholder(node->vartype, node->vartypmod, context);
 			}
-
-			printRemoteParam(pindex, node->vartype, node->vartypmod, context);
-		}
-		else
-		{
-			printRemotePlaceholder(node->vartype, node->vartypmod, context);
 		}
 	}
 }
diff --git a/contrib/postgres_fdw/expected/postgres_fdw.out b/contrib/postgres_fdw/expected/postgres_fdw.out
index 783cb41..19c1115 100644
--- a/contrib/postgres_fdw/expected/postgres_fdw.out
+++ b/contrib/postgres_fdw/expected/postgres_fdw.out
@@ -9,11 +9,16 @@ DO $d$
             OPTIONS (dbname '$$||current_database()||$$',
                      port '$$||current_setting('port')||$$'
             )$$;
+        EXECUTE $$CREATE SERVER loopback2 FOREIGN DATA WRAPPER postgres_fdw
+            OPTIONS (dbname '$$||current_database()||$$',
+                     port '$$||current_setting('port')||$$'
+            )$$;
     END;
 $d$;
 CREATE USER MAPPING FOR public SERVER testserver1
 	OPTIONS (user 'value', password 'value');
 CREATE USER MAPPING FOR CURRENT_USER SERVER loopback;
+CREATE USER MAPPING FOR CURRENT_USER SERVER loopback2;
 -- ===================================================================
 -- create objects used through FDW loopback server
 -- ===================================================================
@@ -35,6 +40,18 @@ CREATE TABLE "S 1"."T 2" (
 	c2 text,
 	CONSTRAINT t2_pkey PRIMARY KEY (c1)
 );
+CREATE TABLE "S 1"."T 3" (
+	c1 int NOT NULL,
+	c2 int NOT NULL,
+	c3 text,
+	CONSTRAINT t3_pkey PRIMARY KEY (c1)
+);
+CREATE TABLE "S 1"."T 4" (
+	c1 int NOT NULL,
+	c2 int NOT NULL,
+	c4 text,
+	CONSTRAINT t4_pkey PRIMARY KEY (c1)
+);
 INSERT INTO "S 1"."T 1"
 	SELECT id,
 	       id % 10,
@@ -49,8 +66,22 @@ INSERT INTO "S 1"."T 2"
 	SELECT id,
 	       'AAA' || to_char(id, 'FM000')
 	FROM generate_series(1, 100) id;
+INSERT INTO "S 1"."T 3"
+	SELECT id,
+	       id + 1,
+	       'AAA' || to_char(id, 'FM000')
+	FROM generate_series(1, 100) id;
+DELETE FROM "S 1"."T 3" WHERE c1 % 2 != 0;	-- delete for outer join tests
+INSERT INTO "S 1"."T 4"
+	SELECT id,
+	       id + 1,
+	       'AAA' || to_char(id, 'FM000')
+	FROM generate_series(1, 100) id;
+DELETE FROM "S 1"."T 4" WHERE c1 % 3 != 0;	-- delete for outer join tests
 ANALYZE "S 1"."T 1";
 ANALYZE "S 1"."T 2";
+ANALYZE "S 1"."T 3";
+ANALYZE "S 1"."T 4";
 -- ===================================================================
 -- create foreign tables
 -- ===================================================================
@@ -78,6 +109,26 @@ CREATE FOREIGN TABLE ft2 (
 	c8 user_enum
 ) SERVER loopback;
 ALTER FOREIGN TABLE ft2 DROP COLUMN cx;
+CREATE FOREIGN TABLE ft4 (
+	c1 int NOT NULL,
+	c2 int NOT NULL,
+	c3 text
+) SERVER loopback OPTIONS (schema_name 'S 1', table_name 'T 3');
+CREATE FOREIGN TABLE ft5 (
+	c1 int NOT NULL,
+	c2 int NOT NULL,
+	c3 text
+) SERVER loopback OPTIONS (schema_name 'S 1', table_name 'T 4');
+CREATE FOREIGN TABLE ft6 (
+	c1 int NOT NULL,
+	c2 int NOT NULL,
+	c3 text
+) SERVER loopback2 OPTIONS (schema_name 'S 1', table_name 'T 4');
+CREATE USER view_owner;
+GRANT ALL ON ft5 TO view_owner;
+CREATE VIEW v_ft5 AS SELECT * FROM ft5;
+ALTER VIEW v_ft5 OWNER TO view_owner;
+CREATE USER MAPPING FOR view_owner SERVER loopback;
 -- ===================================================================
 -- tests for validator
 -- ===================================================================
@@ -119,12 +170,15 @@ ALTER FOREIGN TABLE ft2 OPTIONS (schema_name 'S 1', table_name 'T 1');
 ALTER FOREIGN TABLE ft1 ALTER COLUMN c1 OPTIONS (column_name 'C 1');
 ALTER FOREIGN TABLE ft2 ALTER COLUMN c1 OPTIONS (column_name 'C 1');
 \det+
-                             List of foreign tables
- Schema | Table |  Server  |              FDW Options              | Description 
---------+-------+----------+---------------------------------------+-------------
- public | ft1   | loopback | (schema_name 'S 1', table_name 'T 1') | 
- public | ft2   | loopback | (schema_name 'S 1', table_name 'T 1') | 
-(2 rows)
+                              List of foreign tables
+ Schema | Table |  Server   |              FDW Options              | Description 
+--------+-------+-----------+---------------------------------------+-------------
+ public | ft1   | loopback  | (schema_name 'S 1', table_name 'T 1') | 
+ public | ft2   | loopback  | (schema_name 'S 1', table_name 'T 1') | 
+ public | ft4   | loopback  | (schema_name 'S 1', table_name 'T 3') | 
+ public | ft5   | loopback  | (schema_name 'S 1', table_name 'T 4') | 
+ public | ft6   | loopback2 | (schema_name 'S 1', table_name 'T 4') | 
+(5 rows)
 
 -- Now we should be able to run ANALYZE.
 -- To exercise multiple code paths, we use local stats on ft1
@@ -160,8 +214,8 @@ SELECT * FROM ft1 ORDER BY c3, c1 OFFSET 100 LIMIT 10;
 (10 rows)
 
 EXPLAIN (VERBOSE, COSTS false) SELECT * FROM ft1 t1 ORDER BY t1.c3, t1.c1 OFFSET 100 LIMIT 10;
-                                     QUERY PLAN                                      
--------------------------------------------------------------------------------------
+                                                     QUERY PLAN                                                      
+---------------------------------------------------------------------------------------------------------------------
  Limit
    Output: c1, c2, c3, c4, c5, c6, c7, c8
    ->  Sort
@@ -169,7 +223,7 @@ EXPLAIN (VERBOSE, COSTS false) SELECT * FROM ft1 t1 ORDER BY t1.c3, t1.c1 OFFSET
          Sort Key: t1.c3, t1.c1
          ->  Foreign Scan on public.ft1 t1
                Output: c1, c2, c3, c4, c5, c6, c7, c8
-               Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1"
+               Remote SQL: SELECT "C 1" a10, c2 a11, c3 a12, c4 a13, c5 a14, c6 a15, c7 a16, c8 a17 FROM "S 1"."T 1"
 (8 rows)
 
 SELECT * FROM ft1 t1 ORDER BY t1.c3, t1.c1 OFFSET 100 LIMIT 10;
@@ -189,8 +243,8 @@ SELECT * FROM ft1 t1 ORDER BY t1.c3, t1.c1 OFFSET 100 LIMIT 10;
 
 -- whole-row reference
 EXPLAIN (VERBOSE, COSTS false) SELECT t1 FROM ft1 t1 ORDER BY t1.c3, t1.c1 OFFSET 100 LIMIT 10;
-                                     QUERY PLAN                                      
--------------------------------------------------------------------------------------
+                                                     QUERY PLAN                                                      
+---------------------------------------------------------------------------------------------------------------------
  Limit
    Output: t1.*, c3, c1
    ->  Sort
@@ -198,7 +252,7 @@ EXPLAIN (VERBOSE, COSTS false) SELECT t1 FROM ft1 t1 ORDER BY t1.c3, t1.c1 OFFSE
          Sort Key: t1.c3, t1.c1
          ->  Foreign Scan on public.ft1 t1
                Output: t1.*, c3, c1
-               Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1"
+               Remote SQL: SELECT "C 1" a10, c2 a11, c3 a12, c4 a13, c5 a14, c6 a15, c7 a16, c8 a17 FROM "S 1"."T 1"
 (8 rows)
 
 SELECT t1 FROM ft1 t1 ORDER BY t1.c3, t1.c1 OFFSET 100 LIMIT 10;
@@ -224,11 +278,11 @@ SELECT * FROM ft1 WHERE false;
 
 -- with WHERE clause
 EXPLAIN (VERBOSE, COSTS false) SELECT * FROM ft1 t1 WHERE t1.c1 = 101 AND t1.c6 = '1' AND t1.c7 >= '1';
-                                                                   QUERY PLAN                                                                   
-------------------------------------------------------------------------------------------------------------------------------------------------
+                                                                                   QUERY PLAN                                                                                   
+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
  Foreign Scan on public.ft1 t1
    Output: c1, c2, c3, c4, c5, c6, c7, c8
-   Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" WHERE ((c7 >= '1'::bpchar)) AND (("C 1" = 101)) AND ((c6 = '1'::text))
+   Remote SQL: SELECT "C 1" a10, c2 a11, c3 a12, c4 a13, c5 a14, c6 a15, c7 a16, c8 a17 FROM "S 1"."T 1" WHERE ((c7 >= '1'::bpchar)) AND (("C 1" = 101)) AND ((c6 = '1'::text))
 (3 rows)
 
 SELECT * FROM ft1 t1 WHERE t1.c1 = 101 AND t1.c6 = '1' AND t1.c7 >= '1';
@@ -239,13 +293,13 @@ SELECT * FROM ft1 t1 WHERE t1.c1 = 101 AND t1.c6 = '1' AND t1.c7 >= '1';
 
 -- with FOR UPDATE/SHARE
 EXPLAIN (VERBOSE, COSTS false) SELECT * FROM ft1 t1 WHERE c1 = 101 FOR UPDATE;
-                                                   QUERY PLAN                                                   
-----------------------------------------------------------------------------------------------------------------
+                                                                   QUERY PLAN                                                                   
+------------------------------------------------------------------------------------------------------------------------------------------------
  LockRows
    Output: c1, c2, c3, c4, c5, c6, c7, c8, t1.*
    ->  Foreign Scan on public.ft1 t1
          Output: c1, c2, c3, c4, c5, c6, c7, c8, t1.*
-         Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" WHERE (("C 1" = 101)) FOR UPDATE
+         Remote SQL: SELECT "C 1" a10, c2 a11, c3 a12, c4 a13, c5 a14, c6 a15, c7 a16, c8 a17 FROM "S 1"."T 1" WHERE (("C 1" = 101)) FOR UPDATE
 (5 rows)
 
 SELECT * FROM ft1 t1 WHERE c1 = 101 FOR UPDATE;
@@ -255,13 +309,13 @@ SELECT * FROM ft1 t1 WHERE c1 = 101 FOR UPDATE;
 (1 row)
 
 EXPLAIN (VERBOSE, COSTS false) SELECT * FROM ft1 t1 WHERE c1 = 102 FOR SHARE;
-                                                  QUERY PLAN                                                   
----------------------------------------------------------------------------------------------------------------
+                                                                  QUERY PLAN                                                                   
+-----------------------------------------------------------------------------------------------------------------------------------------------
  LockRows
    Output: c1, c2, c3, c4, c5, c6, c7, c8, t1.*
    ->  Foreign Scan on public.ft1 t1
          Output: c1, c2, c3, c4, c5, c6, c7, c8, t1.*
-         Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" WHERE (("C 1" = 102)) FOR SHARE
+         Remote SQL: SELECT "C 1" a10, c2 a11, c3 a12, c4 a13, c5 a14, c6 a15, c7 a16, c8 a17 FROM "S 1"."T 1" WHERE (("C 1" = 102)) FOR SHARE
 (5 rows)
 
 SELECT * FROM ft1 t1 WHERE c1 = 102 FOR SHARE;
@@ -277,22 +331,6 @@ SELECT COUNT(*) FROM ft1 t1;
   1000
 (1 row)
 
--- join two tables
-SELECT t1.c1 FROM ft1 t1 JOIN ft2 t2 ON (t1.c1 = t2.c1) ORDER BY t1.c3, t1.c1 OFFSET 100 LIMIT 10;
- c1  
------
- 101
- 102
- 103
- 104
- 105
- 106
- 107
- 108
- 109
- 110
-(10 rows)
-
 -- subquery
 SELECT * FROM ft1 t1 WHERE t1.c3 IN (SELECT c3 FROM ft2 t2 WHERE c1 <= 10) ORDER BY c1;
  c1 | c2 |  c3   |              c4              |            c5            | c6 |     c7     | c8  
@@ -353,153 +391,148 @@ CREATE OPERATOR === (
     NEGATOR = !==
 );
 EXPLAIN (VERBOSE, COSTS false) SELECT * FROM ft1 t1 WHERE t1.c1 = postgres_fdw_abs(t1.c2);
-                               QUERY PLAN                                
--------------------------------------------------------------------------
+                                               QUERY PLAN                                                
+---------------------------------------------------------------------------------------------------------
  Foreign Scan on public.ft1 t1
    Output: c1, c2, c3, c4, c5, c6, c7, c8
    Filter: (t1.c1 = postgres_fdw_abs(t1.c2))
-   Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1"
+   Remote SQL: SELECT "C 1" a10, c2 a11, c3 a12, c4 a13, c5 a14, c6 a15, c7 a16, c8 a17 FROM "S 1"."T 1"
 (4 rows)
 
 EXPLAIN (VERBOSE, COSTS false) SELECT * FROM ft1 t1 WHERE t1.c1 === t1.c2;
-                               QUERY PLAN                                
--------------------------------------------------------------------------
+                                               QUERY PLAN                                                
+---------------------------------------------------------------------------------------------------------
  Foreign Scan on public.ft1 t1
    Output: c1, c2, c3, c4, c5, c6, c7, c8
    Filter: (t1.c1 === t1.c2)
-   Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1"
+   Remote SQL: SELECT "C 1" a10, c2 a11, c3 a12, c4 a13, c5 a14, c6 a15, c7 a16, c8 a17 FROM "S 1"."T 1"
 (4 rows)
 
 EXPLAIN (VERBOSE, COSTS false) SELECT * FROM ft1 t1 WHERE t1.c1 = abs(t1.c2);
-                                            QUERY PLAN                                             
----------------------------------------------------------------------------------------------------
+                                                            QUERY PLAN                                                             
+-----------------------------------------------------------------------------------------------------------------------------------
  Foreign Scan on public.ft1 t1
    Output: c1, c2, c3, c4, c5, c6, c7, c8
-   Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" WHERE (("C 1" = abs(c2)))
+   Remote SQL: SELECT "C 1" a10, c2 a11, c3 a12, c4 a13, c5 a14, c6 a15, c7 a16, c8 a17 FROM "S 1"."T 1" WHERE (("C 1" = abs(c2)))
 (3 rows)
 
 EXPLAIN (VERBOSE, COSTS false) SELECT * FROM ft1 t1 WHERE t1.c1 = t1.c2;
-                                          QUERY PLAN                                          
-----------------------------------------------------------------------------------------------
+                                                          QUERY PLAN                                                          
+------------------------------------------------------------------------------------------------------------------------------
  Foreign Scan on public.ft1 t1
    Output: c1, c2, c3, c4, c5, c6, c7, c8
-   Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" WHERE (("C 1" = c2))
+   Remote SQL: SELECT "C 1" a10, c2 a11, c3 a12, c4 a13, c5 a14, c6 a15, c7 a16, c8 a17 FROM "S 1"."T 1" WHERE (("C 1" = c2))
 (3 rows)
 
 -- ===================================================================
 -- WHERE with remotely-executable conditions
 -- ===================================================================
 EXPLAIN (VERBOSE, COSTS false) SELECT * FROM ft1 t1 WHERE t1.c1 = 1;         -- Var, OpExpr(b), Const
-                                         QUERY PLAN                                          
----------------------------------------------------------------------------------------------
+                                                         QUERY PLAN                                                          
+-----------------------------------------------------------------------------------------------------------------------------
  Foreign Scan on public.ft1 t1
    Output: c1, c2, c3, c4, c5, c6, c7, c8
-   Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" WHERE (("C 1" = 1))
+   Remote SQL: SELECT "C 1" a10, c2 a11, c3 a12, c4 a13, c5 a14, c6 a15, c7 a16, c8 a17 FROM "S 1"."T 1" WHERE (("C 1" = 1))
 (3 rows)
 
 EXPLAIN (VERBOSE, COSTS false) SELECT * FROM ft1 t1 WHERE t1.c1 = 100 AND t1.c2 = 0; -- BoolExpr
-                                                  QUERY PLAN                                                  
---------------------------------------------------------------------------------------------------------------
+                                                                  QUERY PLAN                                                                  
+----------------------------------------------------------------------------------------------------------------------------------------------
  Foreign Scan on public.ft1 t1
    Output: c1, c2, c3, c4, c5, c6, c7, c8
-   Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" WHERE (("C 1" = 100)) AND ((c2 = 0))
+   Remote SQL: SELECT "C 1" a10, c2 a11, c3 a12, c4 a13, c5 a14, c6 a15, c7 a16, c8 a17 FROM "S 1"."T 1" WHERE (("C 1" = 100)) AND ((c2 = 0))
 (3 rows)
 
 EXPLAIN (VERBOSE, COSTS false) SELECT * FROM ft1 t1 WHERE c1 IS NULL;        -- NullTest
-                                           QUERY PLAN                                            
--------------------------------------------------------------------------------------------------
+                                                           QUERY PLAN                                                            
+---------------------------------------------------------------------------------------------------------------------------------
  Foreign Scan on public.ft1 t1
    Output: c1, c2, c3, c4, c5, c6, c7, c8
-   Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" WHERE (("C 1" IS NULL))
+   Remote SQL: SELECT "C 1" a10, c2 a11, c3 a12, c4 a13, c5 a14, c6 a15, c7 a16, c8 a17 FROM "S 1"."T 1" WHERE (("C 1" IS NULL))
 (3 rows)
 
 EXPLAIN (VERBOSE, COSTS false) SELECT * FROM ft1 t1 WHERE c1 IS NOT NULL;    -- NullTest
-                                             QUERY PLAN                                              
------------------------------------------------------------------------------------------------------
+                                                             QUERY PLAN                                                              
+-------------------------------------------------------------------------------------------------------------------------------------
  Foreign Scan on public.ft1 t1
    Output: c1, c2, c3, c4, c5, c6, c7, c8
-   Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" WHERE (("C 1" IS NOT NULL))
+   Remote SQL: SELECT "C 1" a10, c2 a11, c3 a12, c4 a13, c5 a14, c6 a15, c7 a16, c8 a17 FROM "S 1"."T 1" WHERE (("C 1" IS NOT NULL))
 (3 rows)
 
 EXPLAIN (VERBOSE, COSTS false) SELECT * FROM ft1 t1 WHERE round(abs(c1), 0) = 1; -- FuncExpr
-                                                     QUERY PLAN                                                      
----------------------------------------------------------------------------------------------------------------------
+                                                                     QUERY PLAN                                                                      
+-----------------------------------------------------------------------------------------------------------------------------------------------------
  Foreign Scan on public.ft1 t1
    Output: c1, c2, c3, c4, c5, c6, c7, c8
-   Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" WHERE ((round(abs("C 1"), 0) = 1::numeric))
+   Remote SQL: SELECT "C 1" a10, c2 a11, c3 a12, c4 a13, c5 a14, c6 a15, c7 a16, c8 a17 FROM "S 1"."T 1" WHERE ((round(abs("C 1"), 0) = 1::numeric))
 (3 rows)
 
 EXPLAIN (VERBOSE, COSTS false) SELECT * FROM ft1 t1 WHERE c1 = -c1;          -- OpExpr(l)
-                                             QUERY PLAN                                              
------------------------------------------------------------------------------------------------------
+                                                             QUERY PLAN                                                              
+-------------------------------------------------------------------------------------------------------------------------------------
  Foreign Scan on public.ft1 t1
    Output: c1, c2, c3, c4, c5, c6, c7, c8
-   Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" WHERE (("C 1" = (- "C 1")))
+   Remote SQL: SELECT "C 1" a10, c2 a11, c3 a12, c4 a13, c5 a14, c6 a15, c7 a16, c8 a17 FROM "S 1"."T 1" WHERE (("C 1" = (- "C 1")))
 (3 rows)
 
 EXPLAIN (VERBOSE, COSTS false) SELECT * FROM ft1 t1 WHERE 1 = c1!;           -- OpExpr(r)
-                                                QUERY PLAN                                                
-----------------------------------------------------------------------------------------------------------
+                                                                QUERY PLAN                                                                
+------------------------------------------------------------------------------------------------------------------------------------------
  Foreign Scan on public.ft1 t1
    Output: c1, c2, c3, c4, c5, c6, c7, c8
-   Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" WHERE ((1::numeric = ("C 1" !)))
+   Remote SQL: SELECT "C 1" a10, c2 a11, c3 a12, c4 a13, c5 a14, c6 a15, c7 a16, c8 a17 FROM "S 1"."T 1" WHERE ((1::numeric = ("C 1" !)))
 (3 rows)
 
 EXPLAIN (VERBOSE, COSTS false) SELECT * FROM ft1 t1 WHERE (c1 IS NOT NULL) IS DISTINCT FROM (c1 IS NOT NULL); -- DistinctExpr
-                                                                 QUERY PLAN                                                                 
---------------------------------------------------------------------------------------------------------------------------------------------
+                                                                                 QUERY PLAN                                                                                 
+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------
  Foreign Scan on public.ft1 t1
    Output: c1, c2, c3, c4, c5, c6, c7, c8
-   Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" WHERE ((("C 1" IS NOT NULL) IS DISTINCT FROM ("C 1" IS NOT NULL)))
+   Remote SQL: SELECT "C 1" a10, c2 a11, c3 a12, c4 a13, c5 a14, c6 a15, c7 a16, c8 a17 FROM "S 1"."T 1" WHERE ((("C 1" IS NOT NULL) IS DISTINCT FROM ("C 1" IS NOT NULL)))
 (3 rows)
 
 EXPLAIN (VERBOSE, COSTS false) SELECT * FROM ft1 t1 WHERE c1 = ANY(ARRAY[c2, 1, c1 + 0]); -- ScalarArrayOpExpr
-                                                        QUERY PLAN                                                         
----------------------------------------------------------------------------------------------------------------------------
+                                                                        QUERY PLAN                                                                         
+-----------------------------------------------------------------------------------------------------------------------------------------------------------
  Foreign Scan on public.ft1 t1
    Output: c1, c2, c3, c4, c5, c6, c7, c8
-   Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" WHERE (("C 1" = ANY (ARRAY[c2, 1, ("C 1" + 0)])))
+   Remote SQL: SELECT "C 1" a10, c2 a11, c3 a12, c4 a13, c5 a14, c6 a15, c7 a16, c8 a17 FROM "S 1"."T 1" WHERE (("C 1" = ANY (ARRAY[c2, 1, ("C 1" + 0)])))
 (3 rows)
 
 EXPLAIN (VERBOSE, COSTS false) SELECT * FROM ft1 t1 WHERE c1 = (ARRAY[c1,c2,3])[1]; -- ArrayRef
-                                                      QUERY PLAN                                                      
-----------------------------------------------------------------------------------------------------------------------
+                                                                      QUERY PLAN                                                                      
+------------------------------------------------------------------------------------------------------------------------------------------------------
  Foreign Scan on public.ft1 t1
    Output: c1, c2, c3, c4, c5, c6, c7, c8
-   Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" WHERE (("C 1" = ((ARRAY["C 1", c2, 3])[1])))
+   Remote SQL: SELECT "C 1" a10, c2 a11, c3 a12, c4 a13, c5 a14, c6 a15, c7 a16, c8 a17 FROM "S 1"."T 1" WHERE (("C 1" = ((ARRAY["C 1", c2, 3])[1])))
 (3 rows)
 
 EXPLAIN (VERBOSE, COSTS false) SELECT * FROM ft1 t1 WHERE c6 = E'foo''s\\bar';  -- check special chars
-                                                 QUERY PLAN                                                  
--------------------------------------------------------------------------------------------------------------
+                                                                 QUERY PLAN                                                                  
+---------------------------------------------------------------------------------------------------------------------------------------------
  Foreign Scan on public.ft1 t1
    Output: c1, c2, c3, c4, c5, c6, c7, c8
-   Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" WHERE ((c6 = E'foo''s\\bar'::text))
+   Remote SQL: SELECT "C 1" a10, c2 a11, c3 a12, c4 a13, c5 a14, c6 a15, c7 a16, c8 a17 FROM "S 1"."T 1" WHERE ((c6 = E'foo''s\\bar'::text))
 (3 rows)
 
 EXPLAIN (VERBOSE, COSTS false) SELECT * FROM ft1 t1 WHERE c8 = 'foo';  -- can't be sent to remote
-                               QUERY PLAN                                
--------------------------------------------------------------------------
+                                               QUERY PLAN                                                
+---------------------------------------------------------------------------------------------------------
  Foreign Scan on public.ft1 t1
    Output: c1, c2, c3, c4, c5, c6, c7, c8
    Filter: (t1.c8 = 'foo'::user_enum)
-   Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1"
+   Remote SQL: SELECT "C 1" a10, c2 a11, c3 a12, c4 a13, c5 a14, c6 a15, c7 a16, c8 a17 FROM "S 1"."T 1"
 (4 rows)
 
 -- parameterized remote path
 EXPLAIN (VERBOSE, COSTS false)
   SELECT * FROM ft2 a, ft2 b WHERE a.c1 = 47 AND b.c1 = a.c2;
-                                                 QUERY PLAN                                                  
--------------------------------------------------------------------------------------------------------------
- Nested Loop
+                                                                                                                                                                                                               QUERY PLAN                                                                                                                                                                                                                
+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+ Foreign Scan
    Output: a.c1, a.c2, a.c3, a.c4, a.c5, a.c6, a.c7, a.c8, b.c1, b.c2, b.c3, b.c4, b.c5, b.c6, b.c7, b.c8
-   ->  Foreign Scan on public.ft2 a
-         Output: a.c1, a.c2, a.c3, a.c4, a.c5, a.c6, a.c7, a.c8
-         Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" WHERE (("C 1" = 47))
-   ->  Foreign Scan on public.ft2 b
-         Output: b.c1, b.c2, b.c3, b.c4, b.c5, b.c6, b.c7, b.c8
-         Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" WHERE (($1::integer = "C 1"))
-(8 rows)
+   Remote SQL: SELECT l.a1, l.a2, l.a3, l.a4, l.a5, l.a6, l.a7, l.a8, r.a1, r.a2, r.a3, r.a4, r.a5, r.a6, r.a7, r.a8 FROM (SELECT "C 1" a9, c2 a10, c3 a12, c4 a13, c5 a14, c6 a15, c7 a16, c8 a17 FROM "S 1"."T 1" WHERE (("C 1" = 47))) l (a1, a2, a3, a4, a5, a6, a7, a8) INNER JOIN (SELECT "C 1" a9, c2 a10, c3 a12, c4 a13, c5 a14, c6 a15, c7 a16, c8 a17 FROM "S 1"."T 1") r (a1, a2, a3, a4, a5, a6, a7, a8) ON ((l.a2 = r.a1))
+(3 rows)
 
 SELECT * FROM ft2 a, ft2 b WHERE a.c1 = 47 AND b.c1 = a.c2;
  c1 | c2 |  c3   |              c4              |            c5            | c6 |     c7     | c8  | c1 | c2 |  c3   |              c4              |            c5            | c6 |     c7     | c8  
@@ -511,18 +544,18 @@ SELECT * FROM ft2 a, ft2 b WHERE a.c1 = 47 AND b.c1 = a.c2;
 EXPLAIN (VERBOSE, COSTS false)
   SELECT * FROM ft2 a, ft2 b
   WHERE a.c2 = 6 AND b.c1 = a.c1 AND a.c8 = 'foo' AND b.c7 = upper(a.c7);
-                                                 QUERY PLAN                                                  
--------------------------------------------------------------------------------------------------------------
+                                                                 QUERY PLAN                                                                 
+--------------------------------------------------------------------------------------------------------------------------------------------
  Nested Loop
    Output: a.c1, a.c2, a.c3, a.c4, a.c5, a.c6, a.c7, a.c8, b.c1, b.c2, b.c3, b.c4, b.c5, b.c6, b.c7, b.c8
    ->  Foreign Scan on public.ft2 a
          Output: a.c1, a.c2, a.c3, a.c4, a.c5, a.c6, a.c7, a.c8
          Filter: (a.c8 = 'foo'::user_enum)
-         Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" WHERE ((c2 = 6))
+         Remote SQL: SELECT "C 1" a9, c2 a10, c3 a12, c4 a13, c5 a14, c6 a15, c7 a16, c8 a17 FROM "S 1"."T 1" WHERE ((c2 = 6))
    ->  Foreign Scan on public.ft2 b
          Output: b.c1, b.c2, b.c3, b.c4, b.c5, b.c6, b.c7, b.c8
          Filter: (upper((a.c7)::text) = (b.c7)::text)
-         Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" WHERE (($1::integer = "C 1"))
+         Remote SQL: SELECT "C 1" a9, c2 a10, c3 a12, c4 a13, c5 a14, c6 a15, c7 a16, c8 a17 FROM "S 1"."T 1" WHERE (($1::integer = "C 1"))
 (10 rows)
 
 SELECT * FROM ft2 a, ft2 b
@@ -651,21 +684,587 @@ SELECT * FROM ft2 WHERE c1 = ANY (ARRAY(SELECT c1 FROM ft1 WHERE c1 < 5));
 (4 rows)
 
 -- ===================================================================
+-- JOIN queries
+-- ===================================================================
+-- join two tables
+EXPLAIN (COSTS false, VERBOSE)
+SELECT t1.c1, t2.c1 FROM ft1 t1 JOIN ft2 t2 ON (t1.c1 = t2.c1) ORDER BY t1.c3, t1.c1 OFFSET 100 LIMIT 10;
+                                                                                      QUERY PLAN                                                                                       
+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+ Limit
+   Output: t1.c1, t2.c1, t1.c3
+   ->  Sort
+         Output: t1.c1, t2.c1, t1.c3
+         Sort Key: t1.c3, t1.c1
+         ->  Foreign Scan
+               Output: t1.c1, t2.c1, t1.c3
+               Remote SQL: SELECT l.a1, l.a2, r.a1 FROM (SELECT "C 1" a10, c3 a12 FROM "S 1"."T 1") l (a1, a2) INNER JOIN (SELECT "C 1" a9 FROM "S 1"."T 1") r (a1) ON ((l.a1 = r.a1))
+(8 rows)
+
+SELECT t1.c1, t2.c1 FROM ft1 t1 JOIN ft2 t2 ON (t1.c1 = t2.c1) ORDER BY t1.c3, t1.c1 OFFSET 100 LIMIT 10;
+ c1  | c1  
+-----+-----
+ 101 | 101
+ 102 | 102
+ 103 | 103
+ 104 | 104
+ 105 | 105
+ 106 | 106
+ 107 | 107
+ 108 | 108
+ 109 | 109
+ 110 | 110
+(10 rows)
+
+-- join three tables
+EXPLAIN (COSTS false, VERBOSE)
+SELECT t1.c1, t2.c1, t3.c1 FROM ft1 t1 JOIN ft2 t2 ON (t1.c1 = t2.c1) JOIN ft4 t3 ON (t3.c1 = t1.c1) ORDER BY t1.c3, t1.c1 OFFSET 10 LIMIT 10;
+                                                                                                                                                   QUERY PLAN                                                                                                                                                   
+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+ Limit
+   Output: t1.c1, t2.c1, t3.c1, t1.c3
+   ->  Sort
+         Output: t1.c1, t2.c1, t3.c1, t1.c3
+         Sort Key: t1.c3, t1.c1
+         ->  Foreign Scan
+               Output: t1.c1, t2.c1, t3.c1, t1.c3
+               Remote SQL: SELECT l.a1, l.a2, l.a3, r.a1 FROM (SELECT l.a1, l.a2, r.a1 FROM (SELECT "C 1" a10, c3 a12 FROM "S 1"."T 1") l (a1, a2) INNER JOIN (SELECT "C 1" a9 FROM "S 1"."T 1") r (a1) ON ((l.a1 = r.a1))) l (a1, a2, a3) INNER JOIN (SELECT c1 a9 FROM "S 1"."T 3") r (a1) ON ((l.a1 = r.a1))
+(8 rows)
+
+SELECT t1.c1, t2.c1, t3.c1 FROM ft1 t1 JOIN ft2 t2 ON (t1.c1 = t2.c1) JOIN ft4 t3 ON (t3.c1 = t1.c1) ORDER BY t1.c3, t1.c1 OFFSET 10 LIMIT 10;
+ c1 | c1 | c1 
+----+----+----
+ 22 | 22 | 22
+ 24 | 24 | 24
+ 26 | 26 | 26
+ 28 | 28 | 28
+ 30 | 30 | 30
+ 32 | 32 | 32
+ 34 | 34 | 34
+ 36 | 36 | 36
+ 38 | 38 | 38
+ 40 | 40 | 40
+(10 rows)
+
+-- left outer join
+EXPLAIN (COSTS false, VERBOSE)
+SELECT t1.c1, t2.c1 FROM ft4 t1 LEFT JOIN ft5 t2 ON (t1.c1 = t2.c1) ORDER BY t1.c1, t2.c1 OFFSET 10 LIMIT 10;
+                                                                         QUERY PLAN                                                                          
+-------------------------------------------------------------------------------------------------------------------------------------------------------------
+ Limit
+   Output: t1.c1, t2.c1
+   ->  Sort
+         Output: t1.c1, t2.c1
+         Sort Key: t1.c1, t2.c1
+         ->  Foreign Scan
+               Output: t1.c1, t2.c1
+               Remote SQL: SELECT l.a1, r.a1 FROM (SELECT c1 a9 FROM "S 1"."T 3") l (a1) LEFT JOIN (SELECT c1 a9 FROM "S 1"."T 4") r (a1) ON ((l.a1 = r.a1))
+(8 rows)
+
+SELECT t1.c1, t2.c1 FROM ft4 t1 LEFT JOIN ft5 t2 ON (t1.c1 = t2.c1) ORDER BY t1.c1, t2.c1 OFFSET 10 LIMIT 10;
+ c1 | c1 
+----+----
+ 22 |   
+ 24 | 24
+ 26 |   
+ 28 |   
+ 30 | 30
+ 32 |   
+ 34 |   
+ 36 | 36
+ 38 |   
+ 40 |   
+(10 rows)
+
+-- right outer join
+SET enable_mergejoin = off; -- planner chooses MergeJoin even though it has higher cost, so disable it for testing.
+EXPLAIN (COSTS false, VERBOSE)
+SELECT t1.c1, t2.c1 FROM ft4 t1 RIGHT JOIN ft5 t2 ON (t1.c1 = t2.c1) ORDER BY t2.c1, t2.c1 OFFSET 10 LIMIT 10;
+                                                                         QUERY PLAN                                                                          
+-------------------------------------------------------------------------------------------------------------------------------------------------------------
+ Limit
+   Output: t1.c1, t2.c1
+   ->  Sort
+         Output: t1.c1, t2.c1
+         Sort Key: t2.c1
+         ->  Foreign Scan
+               Output: t1.c1, t2.c1
+               Remote SQL: SELECT l.a1, r.a1 FROM (SELECT c1 a9 FROM "S 1"."T 4") l (a1) LEFT JOIN (SELECT c1 a9 FROM "S 1"."T 3") r (a1) ON ((r.a1 = l.a1))
+(8 rows)
+
+SELECT t1.c1, t2.c1 FROM ft4 t1 RIGHT JOIN ft5 t2 ON (t1.c1 = t2.c1) ORDER BY t2.c1, t2.c1 OFFSET 10 LIMIT 10;
+ c1 | c1 
+----+----
+    | 33
+ 36 | 36
+    | 39
+ 42 | 42
+    | 45
+ 48 | 48
+    | 51
+ 54 | 54
+    | 57
+ 60 | 60
+(10 rows)
+
+SET enable_mergejoin = on;
+-- full outer join
+EXPLAIN (COSTS false, VERBOSE)
+SELECT t1.c1, t2.c1 FROM ft4 t1 FULL JOIN ft5 t2 ON (t1.c1 = t2.c1) ORDER BY t1.c1, t2.c1 OFFSET 45 LIMIT 10;
+                                                                         QUERY PLAN                                                                          
+-------------------------------------------------------------------------------------------------------------------------------------------------------------
+ Limit
+   Output: t1.c1, t2.c1
+   ->  Sort
+         Output: t1.c1, t2.c1
+         Sort Key: t1.c1, t2.c1
+         ->  Foreign Scan
+               Output: t1.c1, t2.c1
+               Remote SQL: SELECT l.a1, r.a1 FROM (SELECT c1 a9 FROM "S 1"."T 3") l (a1) FULL JOIN (SELECT c1 a9 FROM "S 1"."T 4") r (a1) ON ((l.a1 = r.a1))
+(8 rows)
+
+SELECT t1.c1, t2.c1 FROM ft4 t1 FULL JOIN ft5 t2 ON (t1.c1 = t2.c1) ORDER BY t1.c1, t2.c1 OFFSET 45 LIMIT 10;
+ c1  | c1 
+-----+----
+  92 |   
+  94 |   
+  96 | 96
+  98 |   
+ 100 |   
+     |  3
+     |  9
+     | 15
+     | 21
+     | 27
+(10 rows)
+
+-- full outer join + WHERE clause, only matched rows
+EXPLAIN (COSTS false, VERBOSE)
+SELECT t1.c1, t2.c1 FROM ft4 t1 FULL JOIN ft5 t2 ON (t1.c1 = t2.c1) WHERE (t1.c1 = t2.c1 OR t1.c1 IS NULL) ORDER BY t1.c1, t2.c1 OFFSET 10 LIMIT 10;
+                                                                                              QUERY PLAN                                                                                               
+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+ Limit
+   Output: t1.c1, t2.c1
+   ->  Sort
+         Output: t1.c1, t2.c1
+         Sort Key: t1.c1, t2.c1
+         ->  Foreign Scan
+               Output: t1.c1, t2.c1
+               Remote SQL: SELECT l.a1, r.a1 FROM (SELECT c1 a9 FROM "S 1"."T 3") l (a1) FULL JOIN (SELECT c1 a9 FROM "S 1"."T 4") r (a1) ON ((l.a1 = r.a1)) WHERE (((l.a1 = r.a1) OR (l.a1 IS NULL)))
+(8 rows)
+
+SELECT t1.c1, t2.c1 FROM ft4 t1 FULL JOIN ft5 t2 ON (t1.c1 = t2.c1) WHERE (t1.c1 = t2.c1 OR t1.c1 IS NULL) ORDER BY t1.c1, t2.c1 OFFSET 10 LIMIT 10;
+ c1 | c1 
+----+----
+ 66 | 66
+ 72 | 72
+ 78 | 78
+ 84 | 84
+ 90 | 90
+ 96 | 96
+    |  3
+    |  9
+    | 15
+    | 21
+(10 rows)
+
+-- join at WHERE clause 
+SET enable_mergejoin = off; -- planner chooses MergeJoin even though it has higher cost, so disable it for testing.
+EXPLAIN (COSTS false, VERBOSE)
+SELECT t1.c1, t2.c1 FROM ft4 t1 LEFT JOIN ft5 t2 ON true WHERE (t1.c1 = t2.c1) ORDER BY t1.c1, t2.c1 OFFSET 10 LIMIT 10;
+                                                                          QUERY PLAN                                                                          
+--------------------------------------------------------------------------------------------------------------------------------------------------------------
+ Limit
+   Output: t1.c1, t2.c1
+   ->  Sort
+         Output: t1.c1, t2.c1
+         Sort Key: t1.c1
+         ->  Foreign Scan
+               Output: t1.c1, t2.c1
+               Remote SQL: SELECT l.a1, r.a1 FROM (SELECT c1 a9 FROM "S 1"."T 3") l (a1) INNER JOIN (SELECT c1 a9 FROM "S 1"."T 4") r (a1) ON ((l.a1 = r.a1))
+(8 rows)
+
+SELECT t1.c1, t2.c1 FROM ft4 t1 LEFT JOIN ft5 t2 ON true WHERE (t1.c1 = t2.c1) ORDER BY t1.c1, t2.c1 OFFSET 10 LIMIT 10;
+ c1 | c1 
+----+----
+ 66 | 66
+ 72 | 72
+ 78 | 78
+ 84 | 84
+ 90 | 90
+ 96 | 96
+(6 rows)
+
+SET enable_mergejoin = on;
+-- join in CTE
+EXPLAIN (COSTS false, VERBOSE)
+WITH t (c1_1, c1_3, c2_1) AS (SELECT t1.c1, t1.c3, t2.c1 FROM ft1 t1 JOIN ft2 t2 ON (t1.c1 = t2.c1)) SELECT c1_1, c2_1 FROM t ORDER BY c1_3, c1_1 OFFSET 100 LIMIT 10;
+                                                                                    QUERY PLAN                                                                                     
+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+ Limit
+   Output: t.c1_1, t.c2_1, t.c1_3
+   CTE t
+     ->  Foreign Scan
+           Output: t1.c1, t1.c3, t2.c1
+           Remote SQL: SELECT l.a1, l.a2, r.a1 FROM (SELECT "C 1" a10, c3 a12 FROM "S 1"."T 1") l (a1, a2) INNER JOIN (SELECT "C 1" a9 FROM "S 1"."T 1") r (a1) ON ((l.a1 = r.a1))
+   ->  Sort
+         Output: t.c1_1, t.c2_1, t.c1_3
+         Sort Key: t.c1_3, t.c1_1
+         ->  CTE Scan on t
+               Output: t.c1_1, t.c2_1, t.c1_3
+(11 rows)
+
+WITH t (c1_1, c1_3, c2_1) AS (SELECT t1.c1, t1.c3, t2.c1 FROM ft1 t1 JOIN ft2 t2 ON (t1.c1 = t2.c1)) SELECT c1_1, c2_1 FROM t ORDER BY c1_3, c1_1 OFFSET 100 LIMIT 10;
+ c1_1 | c2_1 
+------+------
+  101 |  101
+  102 |  102
+  103 |  103
+  104 |  104
+  105 |  105
+  106 |  106
+  107 |  107
+  108 |  108
+  109 |  109
+  110 |  110
+(10 rows)
+
+-- ctid with whole-row reference
+EXPLAIN (COSTS false, VERBOSE)
+SELECT t1.ctid, t1, t2, t1.c1 FROM ft1 t1 JOIN ft2 t2 ON (t1.c1 = t2.c1) ORDER BY t1.c3, t1.c1 OFFSET 100 LIMIT 10;
+                                                                                                                                                                                                                                                   QUERY PLAN                                                                                                                                                                                                                                                    
+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+ Limit
+   Output: t1.ctid, t1.*, t2.*, t1.c1, t1.c3
+   ->  Sort
+         Output: t1.ctid, t1.*, t2.*, t1.c1, t1.c3
+         Sort Key: t1.c3, t1.c1
+         ->  Foreign Scan
+               Output: t1.ctid, t1.*, t2.*, t1.c1, t1.c3
+               Remote SQL: SELECT l.a1, l.a2, l.a3, l.a4, r.a1 FROM (SELECT l.a7, ROW(l.a10, l.a11, l.a12, l.a13, l.a14, l.a15, l.a16, l.a17), l.a10, l.a12 FROM (SELECT "C 1" a10, c2 a11, c3 a12, c4 a13, c5 a14, c6 a15, c7 a16, c8 a17, ctid a7 FROM "S 1"."T 1") l) l (a1, a2, a3, a4) INNER JOIN (SELECT ROW(r.a9, r.a10, r.a12, r.a13, r.a14, r.a15, r.a16, r.a17), r.a9 FROM (SELECT "C 1" a9, c2 a10, c3 a12, c4 a13, c5 a14, c6 a15, c7 a16, c8 a17 FROM "S 1"."T 1") r) r (a1, a2) ON ((l.a3 = r.a2))
+(8 rows)
+
+SELECT t1.ctid, t1, t2, t1.c1 FROM ft1 t1 JOIN ft2 t2 ON (t1.c1 = t2.c1) ORDER BY t1.c3, t1.c1 OFFSET 100 LIMIT 10;
+  ctid  |                                             t1                                             |                                             t2                                             | c1  
+--------+--------------------------------------------------------------------------------------------+--------------------------------------------------------------------------------------------+-----
+ (1,4)  | (101,1,00101,"Fri Jan 02 00:00:00 1970 PST","Fri Jan 02 00:00:00 1970",1,"1         ",foo) | (101,1,00101,"Fri Jan 02 00:00:00 1970 PST","Fri Jan 02 00:00:00 1970",1,"1         ",foo) | 101
+ (1,5)  | (102,2,00102,"Sat Jan 03 00:00:00 1970 PST","Sat Jan 03 00:00:00 1970",2,"2         ",foo) | (102,2,00102,"Sat Jan 03 00:00:00 1970 PST","Sat Jan 03 00:00:00 1970",2,"2         ",foo) | 102
+ (1,6)  | (103,3,00103,"Sun Jan 04 00:00:00 1970 PST","Sun Jan 04 00:00:00 1970",3,"3         ",foo) | (103,3,00103,"Sun Jan 04 00:00:00 1970 PST","Sun Jan 04 00:00:00 1970",3,"3         ",foo) | 103
+ (1,7)  | (104,4,00104,"Mon Jan 05 00:00:00 1970 PST","Mon Jan 05 00:00:00 1970",4,"4         ",foo) | (104,4,00104,"Mon Jan 05 00:00:00 1970 PST","Mon Jan 05 00:00:00 1970",4,"4         ",foo) | 104
+ (1,8)  | (105,5,00105,"Tue Jan 06 00:00:00 1970 PST","Tue Jan 06 00:00:00 1970",5,"5         ",foo) | (105,5,00105,"Tue Jan 06 00:00:00 1970 PST","Tue Jan 06 00:00:00 1970",5,"5         ",foo) | 105
+ (1,9)  | (106,6,00106,"Wed Jan 07 00:00:00 1970 PST","Wed Jan 07 00:00:00 1970",6,"6         ",foo) | (106,6,00106,"Wed Jan 07 00:00:00 1970 PST","Wed Jan 07 00:00:00 1970",6,"6         ",foo) | 106
+ (1,10) | (107,7,00107,"Thu Jan 08 00:00:00 1970 PST","Thu Jan 08 00:00:00 1970",7,"7         ",foo) | (107,7,00107,"Thu Jan 08 00:00:00 1970 PST","Thu Jan 08 00:00:00 1970",7,"7         ",foo) | 107
+ (1,11) | (108,8,00108,"Fri Jan 09 00:00:00 1970 PST","Fri Jan 09 00:00:00 1970",8,"8         ",foo) | (108,8,00108,"Fri Jan 09 00:00:00 1970 PST","Fri Jan 09 00:00:00 1970",8,"8         ",foo) | 108
+ (1,12) | (109,9,00109,"Sat Jan 10 00:00:00 1970 PST","Sat Jan 10 00:00:00 1970",9,"9         ",foo) | (109,9,00109,"Sat Jan 10 00:00:00 1970 PST","Sat Jan 10 00:00:00 1970",9,"9         ",foo) | 109
+ (1,13) | (110,0,00110,"Sun Jan 11 00:00:00 1970 PST","Sun Jan 11 00:00:00 1970",0,"0         ",foo) | (110,0,00110,"Sun Jan 11 00:00:00 1970 PST","Sun Jan 11 00:00:00 1970",0,"0         ",foo) | 110
+(10 rows)
+
+-- partially unsafe to push down, not pushed down
+EXPLAIN (COSTS false, VERBOSE)
+SELECT t1.c1 FROM ft1 t1 JOIN ft2 t2 ON t2.c1 = t2.c1 JOIN ft4 t3 ON t2.c1 = t3.c1 ORDER BY t1.c1 OFFSET 10 LIMIT 10;
+                                                                                          QUERY PLAN                                                                                           
+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+ Limit
+   Output: t1.c1
+   ->  Sort
+         Output: t1.c1
+         Sort Key: t1.c1
+         ->  Nested Loop
+               Output: t1.c1
+               ->  Foreign Scan on public.ft1 t1
+                     Output: t1.c1
+                     Remote SQL: SELECT "C 1" a10 FROM "S 1"."T 1"
+               ->  Materialize
+                     ->  Foreign Scan
+                           Remote SQL: SELECT NULL FROM (SELECT "C 1" a9 FROM "S 1"."T 1" WHERE (("C 1" = "C 1"))) l (a1) INNER JOIN (SELECT c1 a9 FROM "S 1"."T 3") r (a1) ON ((l.a1 = r.a1))
+(13 rows)
+
+SELECT t1.c1 FROM ft1 t1 JOIN ft2 t2 ON t2.c1 = t2.c1 JOIN ft4 t3 ON t2.c1 = t3.c1 ORDER BY t1.c1 OFFSET 10 LIMIT 10;
+ c1 
+----
+  1
+  1
+  1
+  1
+  1
+  1
+  1
+  1
+  1
+  1
+(10 rows)
+
+-- SEMI JOIN, not pushed down
+EXPLAIN (COSTS false, VERBOSE)
+SELECT t1.c1 FROM ft1 t1 WHERE EXISTS (SELECT 1 FROM ft2 t2 WHERE t1.c1 = t2.c1) ORDER BY t1.c1 OFFSET 100 LIMIT 10;
+                                  QUERY PLAN                                  
+------------------------------------------------------------------------------
+ Limit
+   Output: t1.c1
+   ->  Sort
+         Output: t1.c1
+         Sort Key: t1.c1
+         ->  Hash Join
+               Output: t1.c1
+               Hash Cond: (t1.c1 = t2.c1)
+               ->  Foreign Scan on public.ft1 t1
+                     Output: t1.c1
+                     Remote SQL: SELECT "C 1" a10 FROM "S 1"."T 1"
+               ->  Hash
+                     Output: t2.c1
+                     ->  HashAggregate
+                           Output: t2.c1
+                           Group Key: t2.c1
+                           ->  Foreign Scan on public.ft2 t2
+                                 Output: t2.c1
+                                 Remote SQL: SELECT "C 1" a9 FROM "S 1"."T 1"
+(19 rows)
+
+SELECT t1.c1 FROM ft1 t1 WHERE EXISTS (SELECT 1 FROM ft2 t2 WHERE t1.c1 = t2.c1) ORDER BY t1.c1 OFFSET 100 LIMIT 10;
+ c1  
+-----
+ 101
+ 102
+ 103
+ 104
+ 105
+ 106
+ 107
+ 108
+ 109
+ 110
+(10 rows)
+
+-- ANTI JOIN, not pushed down
+EXPLAIN (COSTS false, VERBOSE)
+SELECT t1.c1 FROM ft1 t1 WHERE NOT EXISTS (SELECT 1 FROM ft2 t2 WHERE t1.c1 = t2.c2) ORDER BY t1.c1 OFFSET 100 LIMIT 10;
+                              QUERY PLAN                              
+----------------------------------------------------------------------
+ Limit
+   Output: t1.c1
+   ->  Sort
+         Output: t1.c1
+         Sort Key: t1.c1
+         ->  Hash Anti Join
+               Output: t1.c1
+               Hash Cond: (t1.c1 = t2.c2)
+               ->  Foreign Scan on public.ft1 t1
+                     Output: t1.c1
+                     Remote SQL: SELECT "C 1" a10 FROM "S 1"."T 1"
+               ->  Hash
+                     Output: t2.c2
+                     ->  Foreign Scan on public.ft2 t2
+                           Output: t2.c2
+                           Remote SQL: SELECT c2 a10 FROM "S 1"."T 1"
+(16 rows)
+
+SELECT t1.c1 FROM ft1 t1 WHERE NOT EXISTS (SELECT 1 FROM ft2 t2 WHERE t1.c1 = t2.c2) ORDER BY t1.c1 OFFSET 100 LIMIT 10;
+ c1  
+-----
+ 110
+ 111
+ 112
+ 113
+ 114
+ 115
+ 116
+ 117
+ 118
+ 119
+(10 rows)
+
+-- CROSS JOIN, not pushed down
+EXPLAIN (COSTS false, VERBOSE)
+SELECT t1.c1, t2.c1 FROM ft1 t1 CROSS JOIN ft2 t2 ORDER BY t1.c1, t2.c1 OFFSET 100 LIMIT 10;
+                               QUERY PLAN                               
+------------------------------------------------------------------------
+ Limit
+   Output: t1.c1, t2.c1
+   ->  Sort
+         Output: t1.c1, t2.c1
+         Sort Key: t1.c1, t2.c1
+         ->  Nested Loop
+               Output: t1.c1, t2.c1
+               ->  Foreign Scan on public.ft1 t1
+                     Output: t1.c1
+                     Remote SQL: SELECT "C 1" a10 FROM "S 1"."T 1"
+               ->  Materialize
+                     Output: t2.c1
+                     ->  Foreign Scan on public.ft2 t2
+                           Output: t2.c1
+                           Remote SQL: SELECT "C 1" a9 FROM "S 1"."T 1"
+(15 rows)
+
+SELECT t1.c1, t2.c1 FROM ft1 t1 CROSS JOIN ft2 t2 ORDER BY t1.c1, t2.c1 OFFSET 100 LIMIT 10;
+ c1 | c1  
+----+-----
+  1 | 101
+  1 | 102
+  1 | 103
+  1 | 104
+  1 | 105
+  1 | 106
+  1 | 107
+  1 | 108
+  1 | 109
+  1 | 110
+(10 rows)
+
+-- different server
+EXPLAIN (COSTS false, VERBOSE)
+SELECT t1.c1, t2.c1 FROM ft5 t1 JOIN ft6 t2 ON (t1.c1 = t2.c1) ORDER BY t1.c1, t2.c1 OFFSET 100 LIMIT 10;
+                          QUERY PLAN                           
+---------------------------------------------------------------
+ Limit
+   Output: t1.c1, t2.c1
+   ->  Merge Join
+         Output: t1.c1, t2.c1
+         Merge Cond: (t1.c1 = t2.c1)
+         ->  Sort
+               Output: t1.c1
+               Sort Key: t1.c1
+               ->  Foreign Scan on public.ft5 t1
+                     Output: t1.c1
+                     Remote SQL: SELECT c1 a9 FROM "S 1"."T 4"
+         ->  Sort
+               Output: t2.c1
+               Sort Key: t2.c1
+               ->  Foreign Scan on public.ft6 t2
+                     Output: t2.c1
+                     Remote SQL: SELECT c1 a9 FROM "S 1"."T 4"
+(17 rows)
+
+SELECT t1.c1, t2.c1 FROM ft5 t1 JOIN ft6 t2 ON (t1.c1 = t2.c1) ORDER BY t1.c1, t2.c1 OFFSET 100 LIMIT 10;
+ c1 | c1 
+----+----
+(0 rows)
+
+-- different effective user for permission check
+EXPLAIN (COSTS false, VERBOSE)
+SELECT t1.c1, t2.c1 FROM ft5 t1 JOIN v_ft5 t2 ON (t1.c1 = t2.c1) ORDER BY t1.c1, t2.c1 OFFSET 100 LIMIT 10;
+                          QUERY PLAN                           
+---------------------------------------------------------------
+ Limit
+   Output: t1.c1, ft5.c1
+   ->  Merge Join
+         Output: t1.c1, ft5.c1
+         Merge Cond: (t1.c1 = ft5.c1)
+         ->  Sort
+               Output: t1.c1
+               Sort Key: t1.c1
+               ->  Foreign Scan on public.ft5 t1
+                     Output: t1.c1
+                     Remote SQL: SELECT c1 a9 FROM "S 1"."T 4"
+         ->  Sort
+               Output: ft5.c1
+               Sort Key: ft5.c1
+               ->  Foreign Scan on public.ft5
+                     Output: ft5.c1
+                     Remote SQL: SELECT c1 a9 FROM "S 1"."T 4"
+(17 rows)
+
+SELECT t1.c1, t2.c1 FROM ft5 t1 JOIN v_ft5 t2 ON (t1.c1 = t2.c1) ORDER BY t1.c1, t2.c1 OFFSET 100 LIMIT 10;
+ c1 | c1 
+----+----
+(0 rows)
+
+-- unsafe join conditions
+EXPLAIN (COSTS false, VERBOSE)
+SELECT t1.c1, t2.c1 FROM ft1 t1 JOIN ft2 t2 ON (t1.c8 = t2.c8) ORDER BY t1.c3, t1.c1 OFFSET 100 LIMIT 10;
+                                       QUERY PLAN                                        
+-----------------------------------------------------------------------------------------
+ Limit
+   Output: t1.c1, t2.c1, t1.c3
+   ->  Sort
+         Output: t1.c1, t2.c1, t1.c3
+         Sort Key: t1.c3, t1.c1
+         ->  Merge Join
+               Output: t1.c1, t2.c1, t1.c3
+               Merge Cond: (t1.c8 = t2.c8)
+               ->  Sort
+                     Output: t1.c1, t1.c3, t1.c8
+                     Sort Key: t1.c8
+                     ->  Foreign Scan on public.ft1 t1
+                           Output: t1.c1, t1.c3, t1.c8
+                           Remote SQL: SELECT "C 1" a10, c3 a12, c8 a17 FROM "S 1"."T 1"
+               ->  Sort
+                     Output: t2.c1, t2.c8
+                     Sort Key: t2.c8
+                     ->  Foreign Scan on public.ft2 t2
+                           Output: t2.c1, t2.c8
+                           Remote SQL: SELECT "C 1" a9, c8 a17 FROM "S 1"."T 1"
+(20 rows)
+
+SELECT t1.c1, t2.c1 FROM ft1 t1 JOIN ft2 t2 ON (t1.c8 = t2.c8) ORDER BY t1.c3, t1.c1 OFFSET 100 LIMIT 10;
+ c1 | c1  
+----+-----
+  1 | 102
+  1 | 103
+  1 | 104
+  1 | 105
+  1 | 106
+  1 | 107
+  1 | 108
+  1 | 109
+  1 | 110
+  1 |   1
+(10 rows)
+
+-- local filter (unsafe conditions on one side)
+EXPLAIN (COSTS false, VERBOSE)
+SELECT t1.c1, t2.c1 FROM ft1 t1 JOIN ft2 t2 ON (t1.c1 = t2.c1) WHERE t1.c8 = 'foo' ORDER BY t1.c3, t1.c1 OFFSET 100 LIMIT 10;
+                                       QUERY PLAN                                        
+-----------------------------------------------------------------------------------------
+ Limit
+   Output: t1.c1, t2.c1, t1.c3
+   ->  Sort
+         Output: t1.c1, t2.c1, t1.c3
+         Sort Key: t1.c3, t1.c1
+         ->  Hash Join
+               Output: t1.c1, t2.c1, t1.c3
+               Hash Cond: (t2.c1 = t1.c1)
+               ->  Foreign Scan on public.ft2 t2
+                     Output: t2.c1
+                     Remote SQL: SELECT "C 1" a9 FROM "S 1"."T 1"
+               ->  Hash
+                     Output: t1.c1, t1.c3
+                     ->  Foreign Scan on public.ft1 t1
+                           Output: t1.c1, t1.c3
+                           Filter: (t1.c8 = 'foo'::user_enum)
+                           Remote SQL: SELECT "C 1" a10, c3 a12, c8 a17 FROM "S 1"."T 1"
+(17 rows)
+
+SELECT t1.c1, t2.c1 FROM ft1 t1 JOIN ft2 t2 ON (t1.c1 = t2.c1) WHERE t1.c8 = 'foo' ORDER BY t1.c3, t1.c1 OFFSET 100 LIMIT 10;
+ c1  | c1  
+-----+-----
+ 101 | 101
+ 102 | 102
+ 103 | 103
+ 104 | 104
+ 105 | 105
+ 106 | 106
+ 107 | 107
+ 108 | 108
+ 109 | 109
+ 110 | 110
+(10 rows)
+
+-- ===================================================================
 -- parameterized queries
 -- ===================================================================
 -- simple join
 PREPARE st1(int, int) AS SELECT t1.c3, t2.c3 FROM ft1 t1, ft2 t2 WHERE t1.c1 = $1 AND t2.c1 = $2;
 EXPLAIN (VERBOSE, COSTS false) EXECUTE st1(1, 2);
-                             QUERY PLAN                             
---------------------------------------------------------------------
+                               QUERY PLAN                               
+------------------------------------------------------------------------
  Nested Loop
    Output: t1.c3, t2.c3
    ->  Foreign Scan on public.ft1 t1
          Output: t1.c3
-         Remote SQL: SELECT c3 FROM "S 1"."T 1" WHERE (("C 1" = 1))
+         Remote SQL: SELECT c3 a12 FROM "S 1"."T 1" WHERE (("C 1" = 1))
    ->  Foreign Scan on public.ft2 t2
          Output: t2.c3
-         Remote SQL: SELECT c3 FROM "S 1"."T 1" WHERE (("C 1" = 2))
+         Remote SQL: SELECT c3 a12 FROM "S 1"."T 1" WHERE (("C 1" = 2))
 (8 rows)
 
 EXECUTE st1(1, 1);
@@ -683,8 +1282,8 @@ EXECUTE st1(101, 101);
 -- subquery using stable function (can't be sent to remote)
 PREPARE st2(int) AS SELECT * FROM ft1 t1 WHERE t1.c1 < $2 AND t1.c3 IN (SELECT c3 FROM ft2 t2 WHERE c1 > $1 AND date(c4) = '1970-01-17'::date) ORDER BY c1;
 EXPLAIN (VERBOSE, COSTS false) EXECUTE st2(10, 20);
-                                                QUERY PLAN                                                
-----------------------------------------------------------------------------------------------------------
+                                                                QUERY PLAN                                                                
+------------------------------------------------------------------------------------------------------------------------------------------
  Sort
    Output: t1.c1, t1.c2, t1.c3, t1.c4, t1.c5, t1.c6, t1.c7, t1.c8
    Sort Key: t1.c1
@@ -693,13 +1292,13 @@ EXPLAIN (VERBOSE, COSTS false) EXECUTE st2(10, 20);
          Join Filter: (t1.c3 = t2.c3)
          ->  Foreign Scan on public.ft1 t1
                Output: t1.c1, t1.c2, t1.c3, t1.c4, t1.c5, t1.c6, t1.c7, t1.c8
-               Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" WHERE (("C 1" < 20))
+               Remote SQL: SELECT "C 1" a10, c2 a11, c3 a12, c4 a13, c5 a14, c6 a15, c7 a16, c8 a17 FROM "S 1"."T 1" WHERE (("C 1" < 20))
          ->  Materialize
                Output: t2.c3
                ->  Foreign Scan on public.ft2 t2
                      Output: t2.c3
                      Filter: (date(t2.c4) = '01-17-1970'::date)
-                     Remote SQL: SELECT c3, c4 FROM "S 1"."T 1" WHERE (("C 1" > 10))
+                     Remote SQL: SELECT c3 a12, c4 a13 FROM "S 1"."T 1" WHERE (("C 1" > 10))
 (15 rows)
 
 EXECUTE st2(10, 20);
@@ -717,8 +1316,8 @@ EXECUTE st2(101, 121);
 -- subquery using immutable function (can be sent to remote)
 PREPARE st3(int) AS SELECT * FROM ft1 t1 WHERE t1.c1 < $2 AND t1.c3 IN (SELECT c3 FROM ft2 t2 WHERE c1 > $1 AND date(c5) = '1970-01-17'::date) ORDER BY c1;
 EXPLAIN (VERBOSE, COSTS false) EXECUTE st3(10, 20);
-                                                      QUERY PLAN                                                       
------------------------------------------------------------------------------------------------------------------------
+                                                                QUERY PLAN                                                                
+------------------------------------------------------------------------------------------------------------------------------------------
  Sort
    Output: t1.c1, t1.c2, t1.c3, t1.c4, t1.c5, t1.c6, t1.c7, t1.c8
    Sort Key: t1.c1
@@ -727,12 +1326,12 @@ EXPLAIN (VERBOSE, COSTS false) EXECUTE st3(10, 20);
          Join Filter: (t1.c3 = t2.c3)
          ->  Foreign Scan on public.ft1 t1
                Output: t1.c1, t1.c2, t1.c3, t1.c4, t1.c5, t1.c6, t1.c7, t1.c8
-               Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" WHERE (("C 1" < 20))
+               Remote SQL: SELECT "C 1" a10, c2 a11, c3 a12, c4 a13, c5 a14, c6 a15, c7 a16, c8 a17 FROM "S 1"."T 1" WHERE (("C 1" < 20))
          ->  Materialize
                Output: t2.c3
                ->  Foreign Scan on public.ft2 t2
                      Output: t2.c3
-                     Remote SQL: SELECT c3 FROM "S 1"."T 1" WHERE (("C 1" > 10)) AND ((date(c5) = '1970-01-17'::date))
+                     Remote SQL: SELECT c3 a12 FROM "S 1"."T 1" WHERE (("C 1" > 10)) AND ((date(c5) = '1970-01-17'::date))
 (14 rows)
 
 EXECUTE st3(10, 20);
@@ -749,108 +1348,108 @@ EXECUTE st3(20, 30);
 -- custom plan should be chosen initially
 PREPARE st4(int) AS SELECT * FROM ft1 t1 WHERE t1.c1 = $1;
 EXPLAIN (VERBOSE, COSTS false) EXECUTE st4(1);
-                                         QUERY PLAN                                          
----------------------------------------------------------------------------------------------
+                                                         QUERY PLAN                                                          
+-----------------------------------------------------------------------------------------------------------------------------
  Foreign Scan on public.ft1 t1
    Output: c1, c2, c3, c4, c5, c6, c7, c8
-   Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" WHERE (("C 1" = 1))
+   Remote SQL: SELECT "C 1" a10, c2 a11, c3 a12, c4 a13, c5 a14, c6 a15, c7 a16, c8 a17 FROM "S 1"."T 1" WHERE (("C 1" = 1))
 (3 rows)
 
 EXPLAIN (VERBOSE, COSTS false) EXECUTE st4(1);
-                                         QUERY PLAN                                          
----------------------------------------------------------------------------------------------
+                                                         QUERY PLAN                                                          
+-----------------------------------------------------------------------------------------------------------------------------
  Foreign Scan on public.ft1 t1
    Output: c1, c2, c3, c4, c5, c6, c7, c8
-   Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" WHERE (("C 1" = 1))
+   Remote SQL: SELECT "C 1" a10, c2 a11, c3 a12, c4 a13, c5 a14, c6 a15, c7 a16, c8 a17 FROM "S 1"."T 1" WHERE (("C 1" = 1))
 (3 rows)
 
 EXPLAIN (VERBOSE, COSTS false) EXECUTE st4(1);
-                                         QUERY PLAN                                          
----------------------------------------------------------------------------------------------
+                                                         QUERY PLAN                                                          
+-----------------------------------------------------------------------------------------------------------------------------
  Foreign Scan on public.ft1 t1
    Output: c1, c2, c3, c4, c5, c6, c7, c8
-   Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" WHERE (("C 1" = 1))
+   Remote SQL: SELECT "C 1" a10, c2 a11, c3 a12, c4 a13, c5 a14, c6 a15, c7 a16, c8 a17 FROM "S 1"."T 1" WHERE (("C 1" = 1))
 (3 rows)
 
 EXPLAIN (VERBOSE, COSTS false) EXECUTE st4(1);
-                                         QUERY PLAN                                          
----------------------------------------------------------------------------------------------
+                                                         QUERY PLAN                                                          
+-----------------------------------------------------------------------------------------------------------------------------
  Foreign Scan on public.ft1 t1
    Output: c1, c2, c3, c4, c5, c6, c7, c8
-   Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" WHERE (("C 1" = 1))
+   Remote SQL: SELECT "C 1" a10, c2 a11, c3 a12, c4 a13, c5 a14, c6 a15, c7 a16, c8 a17 FROM "S 1"."T 1" WHERE (("C 1" = 1))
 (3 rows)
 
 EXPLAIN (VERBOSE, COSTS false) EXECUTE st4(1);
-                                         QUERY PLAN                                          
----------------------------------------------------------------------------------------------
+                                                         QUERY PLAN                                                          
+-----------------------------------------------------------------------------------------------------------------------------
  Foreign Scan on public.ft1 t1
    Output: c1, c2, c3, c4, c5, c6, c7, c8
-   Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" WHERE (("C 1" = 1))
+   Remote SQL: SELECT "C 1" a10, c2 a11, c3 a12, c4 a13, c5 a14, c6 a15, c7 a16, c8 a17 FROM "S 1"."T 1" WHERE (("C 1" = 1))
 (3 rows)
 
 -- once we try it enough times, should switch to generic plan
 EXPLAIN (VERBOSE, COSTS false) EXECUTE st4(1);
-                                              QUERY PLAN                                               
--------------------------------------------------------------------------------------------------------
+                                                              QUERY PLAN                                                               
+---------------------------------------------------------------------------------------------------------------------------------------
  Foreign Scan on public.ft1 t1
    Output: c1, c2, c3, c4, c5, c6, c7, c8
-   Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" WHERE (("C 1" = $1::integer))
+   Remote SQL: SELECT "C 1" a10, c2 a11, c3 a12, c4 a13, c5 a14, c6 a15, c7 a16, c8 a17 FROM "S 1"."T 1" WHERE (("C 1" = $1::integer))
 (3 rows)
 
 -- value of $1 should not be sent to remote
 PREPARE st5(user_enum,int) AS SELECT * FROM ft1 t1 WHERE c8 = $1 and c1 = $2;
 EXPLAIN (VERBOSE, COSTS false) EXECUTE st5('foo', 1);
-                                         QUERY PLAN                                          
----------------------------------------------------------------------------------------------
+                                                         QUERY PLAN                                                          
+-----------------------------------------------------------------------------------------------------------------------------
  Foreign Scan on public.ft1 t1
    Output: c1, c2, c3, c4, c5, c6, c7, c8
    Filter: (t1.c8 = 'foo'::user_enum)
-   Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" WHERE (("C 1" = 1))
+   Remote SQL: SELECT "C 1" a10, c2 a11, c3 a12, c4 a13, c5 a14, c6 a15, c7 a16, c8 a17 FROM "S 1"."T 1" WHERE (("C 1" = 1))
 (4 rows)
 
 EXPLAIN (VERBOSE, COSTS false) EXECUTE st5('foo', 1);
-                                         QUERY PLAN                                          
----------------------------------------------------------------------------------------------
+                                                         QUERY PLAN                                                          
+-----------------------------------------------------------------------------------------------------------------------------
  Foreign Scan on public.ft1 t1
    Output: c1, c2, c3, c4, c5, c6, c7, c8
    Filter: (t1.c8 = 'foo'::user_enum)
-   Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" WHERE (("C 1" = 1))
+   Remote SQL: SELECT "C 1" a10, c2 a11, c3 a12, c4 a13, c5 a14, c6 a15, c7 a16, c8 a17 FROM "S 1"."T 1" WHERE (("C 1" = 1))
 (4 rows)
 
 EXPLAIN (VERBOSE, COSTS false) EXECUTE st5('foo', 1);
-                                         QUERY PLAN                                          
----------------------------------------------------------------------------------------------
+                                                         QUERY PLAN                                                          
+-----------------------------------------------------------------------------------------------------------------------------
  Foreign Scan on public.ft1 t1
    Output: c1, c2, c3, c4, c5, c6, c7, c8
    Filter: (t1.c8 = 'foo'::user_enum)
-   Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" WHERE (("C 1" = 1))
+   Remote SQL: SELECT "C 1" a10, c2 a11, c3 a12, c4 a13, c5 a14, c6 a15, c7 a16, c8 a17 FROM "S 1"."T 1" WHERE (("C 1" = 1))
 (4 rows)
 
 EXPLAIN (VERBOSE, COSTS false) EXECUTE st5('foo', 1);
-                                         QUERY PLAN                                          
----------------------------------------------------------------------------------------------
+                                                         QUERY PLAN                                                          
+-----------------------------------------------------------------------------------------------------------------------------
  Foreign Scan on public.ft1 t1
    Output: c1, c2, c3, c4, c5, c6, c7, c8
    Filter: (t1.c8 = 'foo'::user_enum)
-   Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" WHERE (("C 1" = 1))
+   Remote SQL: SELECT "C 1" a10, c2 a11, c3 a12, c4 a13, c5 a14, c6 a15, c7 a16, c8 a17 FROM "S 1"."T 1" WHERE (("C 1" = 1))
 (4 rows)
 
 EXPLAIN (VERBOSE, COSTS false) EXECUTE st5('foo', 1);
-                                         QUERY PLAN                                          
----------------------------------------------------------------------------------------------
+                                                         QUERY PLAN                                                          
+-----------------------------------------------------------------------------------------------------------------------------
  Foreign Scan on public.ft1 t1
    Output: c1, c2, c3, c4, c5, c6, c7, c8
    Filter: (t1.c8 = 'foo'::user_enum)
-   Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" WHERE (("C 1" = 1))
+   Remote SQL: SELECT "C 1" a10, c2 a11, c3 a12, c4 a13, c5 a14, c6 a15, c7 a16, c8 a17 FROM "S 1"."T 1" WHERE (("C 1" = 1))
 (4 rows)
 
 EXPLAIN (VERBOSE, COSTS false) EXECUTE st5('foo', 1);
-                                              QUERY PLAN                                               
--------------------------------------------------------------------------------------------------------
+                                                              QUERY PLAN                                                               
+---------------------------------------------------------------------------------------------------------------------------------------
  Foreign Scan on public.ft1 t1
    Output: c1, c2, c3, c4, c5, c6, c7, c8
    Filter: (t1.c8 = $1)
-   Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" WHERE (("C 1" = $1::integer))
+   Remote SQL: SELECT "C 1" a10, c2 a11, c3 a12, c4 a13, c5 a14, c6 a15, c7 a16, c8 a17 FROM "S 1"."T 1" WHERE (("C 1" = $1::integer))
 (4 rows)
 
 EXECUTE st5('foo', 1);
@@ -868,14 +1467,14 @@ DEALLOCATE st5;
 -- System columns, except ctid, should not be sent to remote
 EXPLAIN (VERBOSE, COSTS false)
 SELECT * FROM ft1 t1 WHERE t1.tableoid = 'pg_class'::regclass LIMIT 1;
-                                  QUERY PLAN                                   
--------------------------------------------------------------------------------
+                                                  QUERY PLAN                                                   
+---------------------------------------------------------------------------------------------------------------
  Limit
    Output: c1, c2, c3, c4, c5, c6, c7, c8
    ->  Foreign Scan on public.ft1 t1
          Output: c1, c2, c3, c4, c5, c6, c7, c8
          Filter: (t1.tableoid = '1259'::oid)
-         Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1"
+         Remote SQL: SELECT "C 1" a10, c2 a11, c3 a12, c4 a13, c5 a14, c6 a15, c7 a16, c8 a17 FROM "S 1"."T 1"
 (6 rows)
 
 SELECT * FROM ft1 t1 WHERE t1.tableoid = 'ft1'::regclass LIMIT 1;
@@ -886,13 +1485,13 @@ SELECT * FROM ft1 t1 WHERE t1.tableoid = 'ft1'::regclass LIMIT 1;
 
 EXPLAIN (VERBOSE, COSTS false)
 SELECT tableoid::regclass, * FROM ft1 t1 LIMIT 1;
-                                  QUERY PLAN                                   
--------------------------------------------------------------------------------
+                                                  QUERY PLAN                                                   
+---------------------------------------------------------------------------------------------------------------
  Limit
    Output: ((tableoid)::regclass), c1, c2, c3, c4, c5, c6, c7, c8
    ->  Foreign Scan on public.ft1 t1
          Output: (tableoid)::regclass, c1, c2, c3, c4, c5, c6, c7, c8
-         Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1"
+         Remote SQL: SELECT "C 1" a10, c2 a11, c3 a12, c4 a13, c5 a14, c6 a15, c7 a16, c8 a17 FROM "S 1"."T 1"
 (5 rows)
 
 SELECT tableoid::regclass, * FROM ft1 t1 LIMIT 1;
@@ -903,11 +1502,11 @@ SELECT tableoid::regclass, * FROM ft1 t1 LIMIT 1;
 
 EXPLAIN (VERBOSE, COSTS false)
 SELECT * FROM ft1 t1 WHERE t1.ctid = '(0,2)';
-                                              QUERY PLAN                                               
--------------------------------------------------------------------------------------------------------
+                                                              QUERY PLAN                                                               
+---------------------------------------------------------------------------------------------------------------------------------------
  Foreign Scan on public.ft1 t1
    Output: c1, c2, c3, c4, c5, c6, c7, c8
-   Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" WHERE ((ctid = '(0,2)'::tid))
+   Remote SQL: SELECT "C 1" a10, c2 a11, c3 a12, c4 a13, c5 a14, c6 a15, c7 a16, c8 a17 FROM "S 1"."T 1" WHERE ((ctid = '(0,2)'::tid))
 (3 rows)
 
 SELECT * FROM ft1 t1 WHERE t1.ctid = '(0,2)';
@@ -918,13 +1517,13 @@ SELECT * FROM ft1 t1 WHERE t1.ctid = '(0,2)';
 
 EXPLAIN (VERBOSE, COSTS false)
 SELECT ctid, * FROM ft1 t1 LIMIT 1;
-                                     QUERY PLAN                                      
--------------------------------------------------------------------------------------
+                                                       QUERY PLAN                                                       
+------------------------------------------------------------------------------------------------------------------------
  Limit
    Output: ctid, c1, c2, c3, c4, c5, c6, c7, c8
    ->  Foreign Scan on public.ft1 t1
          Output: ctid, c1, c2, c3, c4, c5, c6, c7, c8
-         Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8, ctid FROM "S 1"."T 1"
+         Remote SQL: SELECT "C 1" a10, c2 a11, c3 a12, c4 a13, c5 a14, c6 a15, c7 a16, c8 a17, ctid a7 FROM "S 1"."T 1"
 (5 rows)
 
 SELECT ctid, * FROM ft1 t1 LIMIT 1;
@@ -987,7 +1586,7 @@ FETCH c;
 SAVEPOINT s;
 SELECT * FROM ft1 WHERE 1 / (c1 - 1) > 0;  -- ERROR
 ERROR:  division by zero
-CONTEXT:  Remote SQL command: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" WHERE (((1 / ("C 1" - 1)) > 0))
+CONTEXT:  Remote SQL command: SELECT "C 1" a10, c2 a11, c3 a12, c4 a13, c5 a14, c6 a15, c7 a16, c8 a17 FROM "S 1"."T 1" WHERE (((1 / ("C 1" - 1)) > 0))
 ROLLBACK TO s;
 FETCH c;
  c1 | c2 |  c3   |              c4              |            c5            | c6 |     c7     | c8  
@@ -1010,64 +1609,64 @@ create foreign table ft3 (f1 text collate "C", f2 text)
   server loopback options (table_name 'loct3');
 -- can be sent to remote
 explain (verbose, costs off) select * from ft3 where f1 = 'foo';
-                                QUERY PLAN                                
---------------------------------------------------------------------------
+                                   QUERY PLAN                                    
+---------------------------------------------------------------------------------
  Foreign Scan on public.ft3
    Output: f1, f2
-   Remote SQL: SELECT f1, f2 FROM public.loct3 WHERE ((f1 = 'foo'::text))
+   Remote SQL: SELECT f1 a9, f2 a10 FROM public.loct3 WHERE ((f1 = 'foo'::text))
 (3 rows)
 
 explain (verbose, costs off) select * from ft3 where f1 COLLATE "C" = 'foo';
-                                QUERY PLAN                                
---------------------------------------------------------------------------
+                                   QUERY PLAN                                    
+---------------------------------------------------------------------------------
  Foreign Scan on public.ft3
    Output: f1, f2
-   Remote SQL: SELECT f1, f2 FROM public.loct3 WHERE ((f1 = 'foo'::text))
+   Remote SQL: SELECT f1 a9, f2 a10 FROM public.loct3 WHERE ((f1 = 'foo'::text))
 (3 rows)
 
 explain (verbose, costs off) select * from ft3 where f2 = 'foo';
-                                QUERY PLAN                                
---------------------------------------------------------------------------
+                                   QUERY PLAN                                    
+---------------------------------------------------------------------------------
  Foreign Scan on public.ft3
    Output: f1, f2
-   Remote SQL: SELECT f1, f2 FROM public.loct3 WHERE ((f2 = 'foo'::text))
+   Remote SQL: SELECT f1 a9, f2 a10 FROM public.loct3 WHERE ((f2 = 'foo'::text))
 (3 rows)
 
 -- can't be sent to remote
 explain (verbose, costs off) select * from ft3 where f1 COLLATE "POSIX" = 'foo';
-                  QUERY PLAN                   
------------------------------------------------
+                      QUERY PLAN                      
+------------------------------------------------------
  Foreign Scan on public.ft3
    Output: f1, f2
    Filter: ((ft3.f1)::text = 'foo'::text)
-   Remote SQL: SELECT f1, f2 FROM public.loct3
+   Remote SQL: SELECT f1 a9, f2 a10 FROM public.loct3
 (4 rows)
 
 explain (verbose, costs off) select * from ft3 where f1 = 'foo' COLLATE "C";
-                  QUERY PLAN                   
------------------------------------------------
+                      QUERY PLAN                      
+------------------------------------------------------
  Foreign Scan on public.ft3
    Output: f1, f2
    Filter: (ft3.f1 = 'foo'::text COLLATE "C")
-   Remote SQL: SELECT f1, f2 FROM public.loct3
+   Remote SQL: SELECT f1 a9, f2 a10 FROM public.loct3
 (4 rows)
 
 explain (verbose, costs off) select * from ft3 where f2 COLLATE "C" = 'foo';
-                  QUERY PLAN                   
------------------------------------------------
+                      QUERY PLAN                      
+------------------------------------------------------
  Foreign Scan on public.ft3
    Output: f1, f2
    Filter: ((ft3.f2)::text = 'foo'::text)
-   Remote SQL: SELECT f1, f2 FROM public.loct3
+   Remote SQL: SELECT f1 a9, f2 a10 FROM public.loct3
 (4 rows)
 
 explain (verbose, costs off) select * from ft3 where f2 = 'foo' COLLATE "C";
-                  QUERY PLAN                   
------------------------------------------------
+                      QUERY PLAN                      
+------------------------------------------------------
  Foreign Scan on public.ft3
    Output: f1, f2
    Filter: (ft3.f2 = 'foo'::text COLLATE "C")
-   Remote SQL: SELECT f1, f2 FROM public.loct3
+   Remote SQL: SELECT f1 a9, f2 a10 FROM public.loct3
 (4 rows)
 
 -- ===================================================================
@@ -1085,7 +1684,7 @@ INSERT INTO ft2 (c1,c2,c3) SELECT c1+1000,c2+100, c3 || c3 FROM ft2 LIMIT 20;
                Output: ((ft2_1.c1 + 1000)), ((ft2_1.c2 + 100)), ((ft2_1.c3 || ft2_1.c3))
                ->  Foreign Scan on public.ft2 ft2_1
                      Output: (ft2_1.c1 + 1000), (ft2_1.c2 + 100), (ft2_1.c3 || ft2_1.c3)
-                     Remote SQL: SELECT "C 1", c2, c3 FROM "S 1"."T 1"
+                     Remote SQL: SELECT "C 1" a9, c2 a10, c3 a12 FROM "S 1"."T 1"
 (9 rows)
 
 INSERT INTO ft2 (c1,c2,c3) SELECT c1+1000,c2+100, c3 || c3 FROM ft2 LIMIT 20;
@@ -1210,35 +1809,27 @@ UPDATE ft2 SET c2 = c2 + 400, c3 = c3 || '_update7' WHERE c1 % 10 = 7 RETURNING
 EXPLAIN (verbose, costs off)
 UPDATE ft2 SET c2 = ft2.c2 + 500, c3 = ft2.c3 || '_update9', c7 = DEFAULT
   FROM ft1 WHERE ft1.c1 = ft2.c2 AND ft1.c1 % 10 = 9;
-                                                                            QUERY PLAN                                                                             
--------------------------------------------------------------------------------------------------------------------------------------------------------------------
+                                                                                                                                                                                                                                                                       QUERY PLAN                                                                                                                                                                                                                                                                       
+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
  Update on public.ft2
    Remote SQL: UPDATE "S 1"."T 1" SET c2 = $2, c3 = $3, c7 = $4 WHERE ctid = $1
-   ->  Hash Join
+   ->  Foreign Scan
          Output: ft2.c1, (ft2.c2 + 500), NULL::integer, (ft2.c3 || '_update9'::text), ft2.c4, ft2.c5, ft2.c6, 'ft2       '::character(10), ft2.c8, ft2.ctid, ft1.*
-         Hash Cond: (ft2.c2 = ft1.c1)
-         ->  Foreign Scan on public.ft2
-               Output: ft2.c1, ft2.c2, ft2.c3, ft2.c4, ft2.c5, ft2.c6, ft2.c8, ft2.ctid
-               Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c8, ctid FROM "S 1"."T 1" FOR UPDATE
-         ->  Hash
-               Output: ft1.*, ft1.c1
-               ->  Foreign Scan on public.ft1
-                     Output: ft1.*, ft1.c1
-                     Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" WHERE ((("C 1" % 10) = 9))
-(13 rows)
+         Remote SQL: SELECT l.a1, l.a2, l.a3, l.a4, l.a5, l.a6, l.a7, l.a8, r.a1 FROM (SELECT l.a9, l.a10, l.a12, l.a13, l.a14, l.a15, l.a17, l.a7 FROM (SELECT "C 1" a9, c2 a10, c3 a12, c4 a13, c5 a14, c6 a15, c8 a17, ctid a7 FROM "S 1"."T 1" FOR UPDATE) l) l (a1, a2, a3, a4, a5, a6, a7, a8) INNER JOIN (SELECT ROW(r.a10, r.a11, r.a12, r.a13, r.a14, r.a15, r.a16, r.a17), r.a10 FROM (SELECT "C 1" a10, c2 a11, c3 a12, c4 a13, c5 a14, c6 a15, c7 a16, c8 a17 FROM "S 1"."T 1" WHERE ((("C 1" % 10) = 9))) r) r (a1, a2) ON ((l.a2 = r.a2))
+(5 rows)
 
 UPDATE ft2 SET c2 = ft2.c2 + 500, c3 = ft2.c3 || '_update9', c7 = DEFAULT
   FROM ft1 WHERE ft1.c1 = ft2.c2 AND ft1.c1 % 10 = 9;
 EXPLAIN (verbose, costs off)
   DELETE FROM ft2 WHERE c1 % 10 = 5 RETURNING c1, c4;
-                                       QUERY PLAN                                       
-----------------------------------------------------------------------------------------
+                                        QUERY PLAN                                         
+-------------------------------------------------------------------------------------------
  Delete on public.ft2
    Output: c1, c4
-   Remote SQL: DELETE FROM "S 1"."T 1" WHERE ctid = $1 RETURNING "C 1", c4
+   Remote SQL: DELETE FROM "S 1"."T 1" WHERE ctid = $1 RETURNING "C 1" a9, c4 a13
    ->  Foreign Scan on public.ft2
          Output: ctid
-         Remote SQL: SELECT ctid FROM "S 1"."T 1" WHERE ((("C 1" % 10) = 5)) FOR UPDATE
+         Remote SQL: SELECT ctid a7 FROM "S 1"."T 1" WHERE ((("C 1" % 10) = 5)) FOR UPDATE
 (6 rows)
 
 DELETE FROM ft2 WHERE c1 % 10 = 5 RETURNING c1, c4;
@@ -1351,22 +1942,14 @@ DELETE FROM ft2 WHERE c1 % 10 = 5 RETURNING c1, c4;
 
 EXPLAIN (verbose, costs off)
 DELETE FROM ft2 USING ft1 WHERE ft1.c1 = ft2.c2 AND ft1.c1 % 10 = 2;
-                                                      QUERY PLAN                                                      
-----------------------------------------------------------------------------------------------------------------------
+                                                                                                                                                                                        QUERY PLAN                                                                                                                                                                                         
+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
  Delete on public.ft2
    Remote SQL: DELETE FROM "S 1"."T 1" WHERE ctid = $1
-   ->  Hash Join
+   ->  Foreign Scan
          Output: ft2.ctid, ft1.*
-         Hash Cond: (ft2.c2 = ft1.c1)
-         ->  Foreign Scan on public.ft2
-               Output: ft2.ctid, ft2.c2
-               Remote SQL: SELECT c2, ctid FROM "S 1"."T 1" FOR UPDATE
-         ->  Hash
-               Output: ft1.*, ft1.c1
-               ->  Foreign Scan on public.ft1
-                     Output: ft1.*, ft1.c1
-                     Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" WHERE ((("C 1" % 10) = 2))
-(13 rows)
+         Remote SQL: SELECT l.a1, r.a1 FROM (SELECT l.a7, l.a10 FROM (SELECT c2 a10, ctid a7 FROM "S 1"."T 1" FOR UPDATE) l) l (a1, a2) INNER JOIN (SELECT ROW(r.a10, r.a11, r.a12, r.a13, r.a14, r.a15, r.a16, r.a17), r.a10 FROM (SELECT "C 1" a10, c2 a11, c3 a12, c4 a13, c5 a14, c6 a15, c7 a16, c8 a17 FROM "S 1"."T 1" WHERE ((("C 1" % 10) = 2))) r) r (a1, a2) ON ((l.a2 = r.a2))
+(5 rows)
 
 DELETE FROM ft2 USING ft1 WHERE ft1.c1 = ft2.c2 AND ft1.c1 % 10 = 2;
 SELECT c1,c2,c3,c4 FROM ft2 ORDER BY c1;
@@ -3027,386 +3610,6 @@ NOTICE:  NEW: (13,"test triggered !")
 (1 row)
 
 -- ===================================================================
--- test inheritance features
--- ===================================================================
-CREATE TABLE a (aa TEXT);
-CREATE TABLE loct (aa TEXT, bb TEXT);
-CREATE FOREIGN TABLE b (bb TEXT) INHERITS (a)
-  SERVER loopback OPTIONS (table_name 'loct');
-INSERT INTO a(aa) VALUES('aaa');
-INSERT INTO a(aa) VALUES('aaaa');
-INSERT INTO a(aa) VALUES('aaaaa');
-INSERT INTO b(aa) VALUES('bbb');
-INSERT INTO b(aa) VALUES('bbbb');
-INSERT INTO b(aa) VALUES('bbbbb');
-SELECT tableoid::regclass, * FROM a;
- tableoid |  aa   
-----------+-------
- a        | aaa
- a        | aaaa
- a        | aaaaa
- b        | bbb
- b        | bbbb
- b        | bbbbb
-(6 rows)
-
-SELECT tableoid::regclass, * FROM b;
- tableoid |  aa   | bb 
-----------+-------+----
- b        | bbb   | 
- b        | bbbb  | 
- b        | bbbbb | 
-(3 rows)
-
-SELECT tableoid::regclass, * FROM ONLY a;
- tableoid |  aa   
-----------+-------
- a        | aaa
- a        | aaaa
- a        | aaaaa
-(3 rows)
-
-UPDATE a SET aa = 'zzzzzz' WHERE aa LIKE 'aaaa%';
-SELECT tableoid::regclass, * FROM a;
- tableoid |   aa   
-----------+--------
- a        | aaa
- a        | zzzzzz
- a        | zzzzzz
- b        | bbb
- b        | bbbb
- b        | bbbbb
-(6 rows)
-
-SELECT tableoid::regclass, * FROM b;
- tableoid |  aa   | bb 
-----------+-------+----
- b        | bbb   | 
- b        | bbbb  | 
- b        | bbbbb | 
-(3 rows)
-
-SELECT tableoid::regclass, * FROM ONLY a;
- tableoid |   aa   
-----------+--------
- a        | aaa
- a        | zzzzzz
- a        | zzzzzz
-(3 rows)
-
-UPDATE b SET aa = 'new';
-SELECT tableoid::regclass, * FROM a;
- tableoid |   aa   
-----------+--------
- a        | aaa
- a        | zzzzzz
- a        | zzzzzz
- b        | new
- b        | new
- b        | new
-(6 rows)
-
-SELECT tableoid::regclass, * FROM b;
- tableoid | aa  | bb 
-----------+-----+----
- b        | new | 
- b        | new | 
- b        | new | 
-(3 rows)
-
-SELECT tableoid::regclass, * FROM ONLY a;
- tableoid |   aa   
-----------+--------
- a        | aaa
- a        | zzzzzz
- a        | zzzzzz
-(3 rows)
-
-UPDATE a SET aa = 'newtoo';
-SELECT tableoid::regclass, * FROM a;
- tableoid |   aa   
-----------+--------
- a        | newtoo
- a        | newtoo
- a        | newtoo
- b        | newtoo
- b        | newtoo
- b        | newtoo
-(6 rows)
-
-SELECT tableoid::regclass, * FROM b;
- tableoid |   aa   | bb 
-----------+--------+----
- b        | newtoo | 
- b        | newtoo | 
- b        | newtoo | 
-(3 rows)
-
-SELECT tableoid::regclass, * FROM ONLY a;
- tableoid |   aa   
-----------+--------
- a        | newtoo
- a        | newtoo
- a        | newtoo
-(3 rows)
-
-DELETE FROM a;
-SELECT tableoid::regclass, * FROM a;
- tableoid | aa 
-----------+----
-(0 rows)
-
-SELECT tableoid::regclass, * FROM b;
- tableoid | aa | bb 
-----------+----+----
-(0 rows)
-
-SELECT tableoid::regclass, * FROM ONLY a;
- tableoid | aa 
-----------+----
-(0 rows)
-
-DROP TABLE a CASCADE;
-NOTICE:  drop cascades to foreign table b
-DROP TABLE loct;
--- Check SELECT FOR UPDATE/SHARE with an inherited source table
-create table loct1 (f1 int, f2 int, f3 int);
-create table loct2 (f1 int, f2 int, f3 int);
-create table foo (f1 int, f2 int);
-create foreign table foo2 (f3 int) inherits (foo)
-  server loopback options (table_name 'loct1');
-create table bar (f1 int, f2 int);
-create foreign table bar2 (f3 int) inherits (bar)
-  server loopback options (table_name 'loct2');
-insert into foo values(1,1);
-insert into foo values(3,3);
-insert into foo2 values(2,2,2);
-insert into foo2 values(4,4,4);
-insert into bar values(1,11);
-insert into bar values(2,22);
-insert into bar values(6,66);
-insert into bar2 values(3,33,33);
-insert into bar2 values(4,44,44);
-insert into bar2 values(7,77,77);
-explain (verbose, costs off)
-select * from bar where f1 in (select f1 from foo) for update;
-                                          QUERY PLAN                                          
-----------------------------------------------------------------------------------------------
- LockRows
-   Output: bar.f1, bar.f2, bar.ctid, bar.tableoid, bar.*, foo.ctid, foo.tableoid, foo.*
-   ->  Hash Join
-         Output: bar.f1, bar.f2, bar.ctid, bar.tableoid, bar.*, foo.ctid, foo.tableoid, foo.*
-         Hash Cond: (bar.f1 = foo.f1)
-         ->  Append
-               ->  Seq Scan on public.bar
-                     Output: bar.f1, bar.f2, bar.ctid, bar.tableoid, bar.*
-               ->  Foreign Scan on public.bar2
-                     Output: bar2.f1, bar2.f2, bar2.ctid, bar2.tableoid, bar2.*
-                     Remote SQL: SELECT f1, f2, f3, ctid FROM public.loct2 FOR UPDATE
-         ->  Hash
-               Output: foo.ctid, foo.tableoid, foo.*, foo.f1
-               ->  HashAggregate
-                     Output: foo.ctid, foo.tableoid, foo.*, foo.f1
-                     Group Key: foo.f1
-                     ->  Append
-                           ->  Seq Scan on public.foo
-                                 Output: foo.ctid, foo.tableoid, foo.*, foo.f1
-                           ->  Foreign Scan on public.foo2
-                                 Output: foo2.ctid, foo2.tableoid, foo2.*, foo2.f1
-                                 Remote SQL: SELECT f1, f2, f3, ctid FROM public.loct1
-(22 rows)
-
-select * from bar where f1 in (select f1 from foo) for update;
- f1 | f2 
-----+----
-  1 | 11
-  2 | 22
-  3 | 33
-  4 | 44
-(4 rows)
-
-explain (verbose, costs off)
-select * from bar where f1 in (select f1 from foo) for share;
-                                          QUERY PLAN                                          
-----------------------------------------------------------------------------------------------
- LockRows
-   Output: bar.f1, bar.f2, bar.ctid, bar.tableoid, bar.*, foo.ctid, foo.tableoid, foo.*
-   ->  Hash Join
-         Output: bar.f1, bar.f2, bar.ctid, bar.tableoid, bar.*, foo.ctid, foo.tableoid, foo.*
-         Hash Cond: (bar.f1 = foo.f1)
-         ->  Append
-               ->  Seq Scan on public.bar
-                     Output: bar.f1, bar.f2, bar.ctid, bar.tableoid, bar.*
-               ->  Foreign Scan on public.bar2
-                     Output: bar2.f1, bar2.f2, bar2.ctid, bar2.tableoid, bar2.*
-                     Remote SQL: SELECT f1, f2, f3, ctid FROM public.loct2 FOR SHARE
-         ->  Hash
-               Output: foo.ctid, foo.tableoid, foo.*, foo.f1
-               ->  HashAggregate
-                     Output: foo.ctid, foo.tableoid, foo.*, foo.f1
-                     Group Key: foo.f1
-                     ->  Append
-                           ->  Seq Scan on public.foo
-                                 Output: foo.ctid, foo.tableoid, foo.*, foo.f1
-                           ->  Foreign Scan on public.foo2
-                                 Output: foo2.ctid, foo2.tableoid, foo2.*, foo2.f1
-                                 Remote SQL: SELECT f1, f2, f3, ctid FROM public.loct1
-(22 rows)
-
-select * from bar where f1 in (select f1 from foo) for share;
- f1 | f2 
-----+----
-  1 | 11
-  2 | 22
-  3 | 33
-  4 | 44
-(4 rows)
-
--- Check UPDATE with inherited target and an inherited source table
-explain (verbose, costs off)
-update bar set f2 = f2 + 100 where f1 in (select f1 from foo);
-                                         QUERY PLAN                                          
----------------------------------------------------------------------------------------------
- Update on public.bar
-   Update on public.bar
-   Foreign Update on public.bar2
-     Remote SQL: UPDATE public.loct2 SET f2 = $2 WHERE ctid = $1
-   ->  Hash Join
-         Output: bar.f1, (bar.f2 + 100), bar.ctid, foo.ctid, foo.tableoid, foo.*
-         Hash Cond: (bar.f1 = foo.f1)
-         ->  Seq Scan on public.bar
-               Output: bar.f1, bar.f2, bar.ctid
-         ->  Hash
-               Output: foo.ctid, foo.tableoid, foo.*, foo.f1
-               ->  HashAggregate
-                     Output: foo.ctid, foo.tableoid, foo.*, foo.f1
-                     Group Key: foo.f1
-                     ->  Append
-                           ->  Seq Scan on public.foo
-                                 Output: foo.ctid, foo.tableoid, foo.*, foo.f1
-                           ->  Foreign Scan on public.foo2
-                                 Output: foo2.ctid, foo2.tableoid, foo2.*, foo2.f1
-                                 Remote SQL: SELECT f1, f2, f3, ctid FROM public.loct1
-   ->  Hash Join
-         Output: bar2.f1, (bar2.f2 + 100), bar2.f3, bar2.ctid, foo.ctid, foo.tableoid, foo.*
-         Hash Cond: (bar2.f1 = foo.f1)
-         ->  Foreign Scan on public.bar2
-               Output: bar2.f1, bar2.f2, bar2.f3, bar2.ctid
-               Remote SQL: SELECT f1, f2, f3, ctid FROM public.loct2 FOR UPDATE
-         ->  Hash
-               Output: foo.ctid, foo.tableoid, foo.*, foo.f1
-               ->  HashAggregate
-                     Output: foo.ctid, foo.tableoid, foo.*, foo.f1
-                     Group Key: foo.f1
-                     ->  Append
-                           ->  Seq Scan on public.foo
-                                 Output: foo.ctid, foo.tableoid, foo.*, foo.f1
-                           ->  Foreign Scan on public.foo2
-                                 Output: foo2.ctid, foo2.tableoid, foo2.*, foo2.f1
-                                 Remote SQL: SELECT f1, f2, f3, ctid FROM public.loct1
-(37 rows)
-
-update bar set f2 = f2 + 100 where f1 in (select f1 from foo);
-select tableoid::regclass, * from bar order by 1,2;
- tableoid | f1 | f2  
-----------+----+-----
- bar      |  1 | 111
- bar      |  2 | 122
- bar      |  6 |  66
- bar2     |  3 | 133
- bar2     |  4 | 144
- bar2     |  7 |  77
-(6 rows)
-
--- Check UPDATE with inherited target and an appendrel subquery
-explain (verbose, costs off)
-update bar set f2 = f2 + 100
-from
-  ( select f1 from foo union all select f1+3 from foo ) ss
-where bar.f1 = ss.f1;
-                                      QUERY PLAN                                      
---------------------------------------------------------------------------------------
- Update on public.bar
-   Update on public.bar
-   Foreign Update on public.bar2
-     Remote SQL: UPDATE public.loct2 SET f2 = $2 WHERE ctid = $1
-   ->  Hash Join
-         Output: bar.f1, (bar.f2 + 100), bar.ctid, (ROW(foo.f1))
-         Hash Cond: (foo.f1 = bar.f1)
-         ->  Append
-               ->  Seq Scan on public.foo
-                     Output: ROW(foo.f1), foo.f1
-               ->  Foreign Scan on public.foo2
-                     Output: ROW(foo2.f1), foo2.f1
-                     Remote SQL: SELECT f1 FROM public.loct1
-               ->  Seq Scan on public.foo foo_1
-                     Output: ROW((foo_1.f1 + 3)), (foo_1.f1 + 3)
-               ->  Foreign Scan on public.foo2 foo2_1
-                     Output: ROW((foo2_1.f1 + 3)), (foo2_1.f1 + 3)
-                     Remote SQL: SELECT f1 FROM public.loct1
-         ->  Hash
-               Output: bar.f1, bar.f2, bar.ctid
-               ->  Seq Scan on public.bar
-                     Output: bar.f1, bar.f2, bar.ctid
-   ->  Merge Join
-         Output: bar2.f1, (bar2.f2 + 100), bar2.f3, bar2.ctid, (ROW(foo.f1))
-         Merge Cond: (bar2.f1 = foo.f1)
-         ->  Sort
-               Output: bar2.f1, bar2.f2, bar2.f3, bar2.ctid
-               Sort Key: bar2.f1
-               ->  Foreign Scan on public.bar2
-                     Output: bar2.f1, bar2.f2, bar2.f3, bar2.ctid
-                     Remote SQL: SELECT f1, f2, f3, ctid FROM public.loct2 FOR UPDATE
-         ->  Sort
-               Output: (ROW(foo.f1)), foo.f1
-               Sort Key: foo.f1
-               ->  Append
-                     ->  Seq Scan on public.foo
-                           Output: ROW(foo.f1), foo.f1
-                     ->  Foreign Scan on public.foo2
-                           Output: ROW(foo2.f1), foo2.f1
-                           Remote SQL: SELECT f1 FROM public.loct1
-                     ->  Seq Scan on public.foo foo_1
-                           Output: ROW((foo_1.f1 + 3)), (foo_1.f1 + 3)
-                     ->  Foreign Scan on public.foo2 foo2_1
-                           Output: ROW((foo2_1.f1 + 3)), (foo2_1.f1 + 3)
-                           Remote SQL: SELECT f1 FROM public.loct1
-(45 rows)
-
-update bar set f2 = f2 + 100
-from
-  ( select f1 from foo union all select f1+3 from foo ) ss
-where bar.f1 = ss.f1;
-select tableoid::regclass, * from bar order by 1,2;
- tableoid | f1 | f2  
-----------+----+-----
- bar      |  1 | 211
- bar      |  2 | 222
- bar      |  6 | 166
- bar2     |  3 | 233
- bar2     |  4 | 244
- bar2     |  7 | 177
-(6 rows)
-
--- Test that WHERE CURRENT OF is not supported
-begin;
-declare c cursor for select * from bar where f1 = 7;
-fetch from c;
- f1 | f2  
-----+-----
-  7 | 177
-(1 row)
-
-update bar set f2 = null where current of c;
-ERROR:  WHERE CURRENT OF is not supported for this table type
-rollback;
-drop table foo cascade;
-NOTICE:  drop cascades to foreign table foo2
-drop table bar cascade;
-NOTICE:  drop cascades to foreign table bar2
-drop table loct1;
-drop table loct2;
--- ===================================================================
 -- test IMPORT FOREIGN SCHEMA
 -- ===================================================================
 CREATE SCHEMA import_source;
@@ -3636,3 +3839,6 @@ QUERY:  CREATE FOREIGN TABLE t5 (
 OPTIONS (schema_name 'import_source', table_name 't5');
 CONTEXT:  importing foreign table "t5"
 ROLLBACK;
+-- Cleanup
+DROP OWNED BY view_owner;
+DROP USER view_owner;
diff --git a/contrib/postgres_fdw/postgres_fdw.c b/contrib/postgres_fdw/postgres_fdw.c
index 478e124..e184056 100644
--- a/contrib/postgres_fdw/postgres_fdw.c
+++ b/contrib/postgres_fdw/postgres_fdw.c
@@ -28,7 +28,6 @@
 #include "optimizer/pathnode.h"
 #include "optimizer/paths.h"
 #include "optimizer/planmain.h"
-#include "optimizer/prep.h"
 #include "optimizer/restrictinfo.h"
 #include "optimizer/var.h"
 #include "parser/parsetree.h"
@@ -47,41 +46,8 @@ PG_MODULE_MAGIC;
 #define DEFAULT_FDW_TUPLE_COST		0.01
 
 /*
- * FDW-specific planner information kept in RelOptInfo.fdw_private for a
- * foreign table.  This information is collected by postgresGetForeignRelSize.
- */
-typedef struct PgFdwRelationInfo
-{
-	/* baserestrictinfo clauses, broken down into safe and unsafe subsets. */
-	List	   *remote_conds;
-	List	   *local_conds;
-
-	/* Bitmap of attr numbers we need to fetch from the remote server. */
-	Bitmapset  *attrs_used;
-
-	/* Cost and selectivity of local_conds. */
-	QualCost	local_conds_cost;
-	Selectivity local_conds_sel;
-
-	/* Estimated size and cost for a scan with baserestrictinfo quals. */
-	double		rows;
-	int			width;
-	Cost		startup_cost;
-	Cost		total_cost;
-
-	/* Options extracted from catalogs. */
-	bool		use_remote_estimate;
-	Cost		fdw_startup_cost;
-	Cost		fdw_tuple_cost;
-
-	/* Cached catalog information. */
-	ForeignTable *table;
-	ForeignServer *server;
-	UserMapping *user;			/* only set in use_remote_estimate mode */
-} PgFdwRelationInfo;
-
-/*
- * Indexes of FDW-private information stored in fdw_private lists.
+ * Indexes of FDW-private information stored in ForeignScan.fdw_private for a
+ * simple foreign table scan of a SELECT statement.
  *
  * We store various information in ForeignScan.fdw_private to pass it from
  * planner to executor.  Currently we store:
@@ -98,7 +64,11 @@ enum FdwScanPrivateIndex
 	/* SQL statement to execute remotely (as a String node) */
 	FdwScanPrivateSelectSql,
 	/* Integer list of attribute numbers retrieved by the SELECT */
-	FdwScanPrivateRetrievedAttrs
+	FdwScanPrivateRetrievedAttrs,
+	/* Integer value of the foreign server's OID for the scan */
+	FdwScanPrivateServerOid,
+	/* Integer value of the effective user's OID for the scan */
+	FdwScanPrivateUserOid,
 };
 
 /*
@@ -128,7 +98,8 @@ enum FdwModifyPrivateIndex
  */
 typedef struct PgFdwScanState
 {
-	Relation	rel;			/* relcache entry for the foreign table */
+	const char *relname;		/* name of relation being scanned */
+	TupleDesc	tupdesc;		/* tuple descriptor of the scan */
 	AttInMetadata *attinmeta;	/* attribute datatype conversion metadata */
 
 	/* extracted fdw_private data */
@@ -194,6 +165,8 @@ typedef struct PgFdwAnalyzeState
 	AttInMetadata *attinmeta;	/* attribute datatype conversion metadata */
 	List	   *retrieved_attrs;	/* attr numbers retrieved by query */
 
+	char	   *query;			/* text of SELECT command */
+
 	/* collected sample rows */
 	HeapTuple  *rows;			/* array of size targrows */
 	int			targrows;		/* target # of sample rows */
@@ -214,7 +187,10 @@ typedef struct PgFdwAnalyzeState
  */
 typedef struct ConversionLocation
 {
-	Relation	rel;			/* foreign table's relcache entry */
+	const char *relname;		/* name of relation being processed, or NULL for
+								   a foreign join */
+	const char *query;			/* query being processed */
+	TupleDesc	tupdesc;		/* tuple descriptor for attribute names */
 	AttrNumber	cur_attno;		/* attribute number being processed, or 0 */
 } ConversionLocation;
 
@@ -288,6 +264,12 @@ static bool postgresAnalyzeForeignTable(Relation relation,
 							BlockNumber *totalpages);
 static List *postgresImportForeignSchema(ImportForeignSchemaStmt *stmt,
 							Oid serverOid);
+static void postgresGetForeignJoinPaths(PlannerInfo *root,
+						   RelOptInfo *joinrel,
+						   RelOptInfo *outerrel,
+						   RelOptInfo *innerrel,
+						   SpecialJoinInfo *sjinfo,
+						   List *restrictlist);
 
 /*
  * Helper functions
@@ -323,7 +305,9 @@ static void analyze_row_processor(PGresult *res, int row,
 					  PgFdwAnalyzeState *astate);
 static HeapTuple make_tuple_from_result_row(PGresult *res,
 						   int row,
-						   Relation rel,
+						   const char *relname,
+						   const char *query,
+						   TupleDesc tupdesc,
 						   AttInMetadata *attinmeta,
 						   List *retrieved_attrs,
 						   MemoryContext temp_context);
@@ -368,6 +352,9 @@ postgres_fdw_handler(PG_FUNCTION_ARGS)
 	/* Support functions for IMPORT FOREIGN SCHEMA */
 	routine->ImportForeignSchema = postgresImportForeignSchema;
 
+	/* Support functions for join push-down */
+	routine->GetForeignJoinPaths = postgresGetForeignJoinPaths;
+
 	PG_RETURN_POINTER(routine);
 }
 
@@ -383,7 +370,9 @@ postgresGetForeignRelSize(PlannerInfo *root,
 						  RelOptInfo *baserel,
 						  Oid foreigntableid)
 {
+	RangeTblEntry *rte;
 	PgFdwRelationInfo *fpinfo;
+	ForeignTable *table;
 	ListCell   *lc;
 
 	/*
@@ -394,8 +383,8 @@ postgresGetForeignRelSize(PlannerInfo *root,
 	baserel->fdw_private = (void *) fpinfo;
 
 	/* Look up foreign-table catalog info. */
-	fpinfo->table = GetForeignTable(foreigntableid);
-	fpinfo->server = GetForeignServer(fpinfo->table->serverid);
+	table = GetForeignTable(foreigntableid);
+	fpinfo->server = GetForeignServer(table->serverid);
 
 	/*
 	 * Extract user-settable option values.  Note that per-table setting of
@@ -416,7 +405,7 @@ postgresGetForeignRelSize(PlannerInfo *root,
 		else if (strcmp(def->defname, "fdw_tuple_cost") == 0)
 			fpinfo->fdw_tuple_cost = strtod(defGetString(def), NULL);
 	}
-	foreach(lc, fpinfo->table->options)
+	foreach(lc, table->options)
 	{
 		DefElem    *def = (DefElem *) lfirst(lc);
 
@@ -428,20 +417,12 @@ postgresGetForeignRelSize(PlannerInfo *root,
 	}
 
 	/*
-	 * If the table or the server is configured to use remote estimates,
-	 * identify which user to do remote access as during planning.  This
+	 * Identify which user to do remote access as during planning.  This
 	 * should match what ExecCheckRTEPerms() does.  If we fail due to lack of
 	 * permissions, the query would have failed at runtime anyway.
 	 */
-	if (fpinfo->use_remote_estimate)
-	{
-		RangeTblEntry *rte = planner_rt_fetch(baserel->relid, root);
-		Oid			userid = rte->checkAsUser ? rte->checkAsUser : GetUserId();
-
-		fpinfo->user = GetUserMapping(userid, fpinfo->server->serverid);
-	}
-	else
-		fpinfo->user = NULL;
+	rte = planner_rt_fetch(baserel->relid, root);
+	fpinfo->userid = rte->checkAsUser ? rte->checkAsUser : GetUserId();
 
 	/*
 	 * Identify which baserestrictinfo clauses can be sent to the remote
@@ -463,10 +444,9 @@ postgresGetForeignRelSize(PlannerInfo *root,
 				   &fpinfo->attrs_used);
 	foreach(lc, fpinfo->local_conds)
 	{
-		RestrictInfo *rinfo = (RestrictInfo *) lfirst(lc);
+		Expr *expr = (Expr *) lfirst(lc);
 
-		pull_varattnos((Node *) rinfo->clause, baserel->relid,
-					   &fpinfo->attrs_used);
+		pull_varattnos((Node *) expr, baserel->relid, &fpinfo->attrs_used);
 	}
 
 	/*
@@ -752,6 +732,8 @@ postgresGetForeignPlan(PlannerInfo *root,
 	List	   *retrieved_attrs;
 	StringInfoData sql;
 	ListCell   *lc;
+	List	   *fdw_ps_tlist = NIL;
+	ForeignScan *scan;
 
 	/*
 	 * Separate the scan_clauses into those that can be executed remotely and
@@ -768,9 +750,6 @@ postgresGetForeignPlan(PlannerInfo *root,
 	 *
 	 * This code must match "extract_actual_clauses(scan_clauses, false)"
 	 * except for the additional decision about remote versus local execution.
-	 * Note however that we only strip the RestrictInfo nodes from the
-	 * local_exprs list, since appendWhereClause expects a list of
-	 * RestrictInfos.
 	 */
 	foreach(lc, scan_clauses)
 	{
@@ -783,11 +762,11 @@ postgresGetForeignPlan(PlannerInfo *root,
 			continue;
 
 		if (list_member_ptr(fpinfo->remote_conds, rinfo))
-			remote_conds = lappend(remote_conds, rinfo);
+			remote_conds = lappend(remote_conds, rinfo->clause);
 		else if (list_member_ptr(fpinfo->local_conds, rinfo))
 			local_exprs = lappend(local_exprs, rinfo->clause);
 		else if (is_foreign_expr(root, baserel, rinfo->clause))
-			remote_conds = lappend(remote_conds, rinfo);
+			remote_conds = lappend(remote_conds, rinfo->clause);
 		else
 			local_exprs = lappend(local_exprs, rinfo->clause);
 	}
@@ -797,68 +776,17 @@ postgresGetForeignPlan(PlannerInfo *root,
 	 * expressions to be sent as parameters.
 	 */
 	initStringInfo(&sql);
-	deparseSelectSql(&sql, root, baserel, fpinfo->attrs_used,
-					 &retrieved_attrs);
-	if (remote_conds)
-		appendWhereClause(&sql, root, baserel, remote_conds,
-						  true, &params_list);
-
-	/*
-	 * Add FOR UPDATE/SHARE if appropriate.  We apply locking during the
-	 * initial row fetch, rather than later on as is done for local tables.
-	 * The extra roundtrips involved in trying to duplicate the local
-	 * semantics exactly don't seem worthwhile (see also comments for
-	 * RowMarkType).
-	 *
-	 * Note: because we actually run the query as a cursor, this assumes that
-	 * DECLARE CURSOR ... FOR UPDATE is supported, which it isn't before 8.3.
-	 */
-	if (baserel->relid == root->parse->resultRelation &&
-		(root->parse->commandType == CMD_UPDATE ||
-		 root->parse->commandType == CMD_DELETE))
-	{
-		/* Relation is UPDATE/DELETE target, so use FOR UPDATE */
-		appendStringInfoString(&sql, " FOR UPDATE");
-	}
-	else
-	{
-		PlanRowMark *rc = get_plan_rowmark(root->rowMarks, baserel->relid);
-
-		if (rc)
-		{
-			/*
-			 * Relation is specified as a FOR UPDATE/SHARE target, so handle
-			 * that.  (But we could also see LCS_NONE, meaning this isn't a
-			 * target relation after all.)
-			 *
-			 * For now, just ignore any [NO] KEY specification, since (a) it's
-			 * not clear what that means for a remote table that we don't have
-			 * complete information about, and (b) it wouldn't work anyway on
-			 * older remote servers.  Likewise, we don't worry about NOWAIT.
-			 */
-			switch (rc->strength)
-			{
-				case LCS_NONE:
-					/* No locking needed */
-					break;
-				case LCS_FORKEYSHARE:
-				case LCS_FORSHARE:
-					appendStringInfoString(&sql, " FOR SHARE");
-					break;
-				case LCS_FORNOKEYUPDATE:
-				case LCS_FORUPDATE:
-					appendStringInfoString(&sql, " FOR UPDATE");
-					break;
-			}
-		}
-	}
+	deparseSelectSql(&sql, root, baserel, fpinfo->attrs_used, remote_conds,
+					 &params_list, &fdw_ps_tlist, &retrieved_attrs);
 
 	/*
-	 * Build the fdw_private list that will be available to the executor.
+	 * Build the fdw_private list that will be available in the executor.
 	 * Items in the list must match enum FdwScanPrivateIndex, above.
 	 */
-	fdw_private = list_make2(makeString(sql.data),
-							 retrieved_attrs);
+	fdw_private = list_make4(makeString(sql.data),
+							 retrieved_attrs,
+							 makeInteger(fpinfo->server->serverid),
+							 makeInteger(fpinfo->userid));
 
 	/*
 	 * Create the ForeignScan node from target list, local filtering
@@ -868,11 +796,18 @@ postgresGetForeignPlan(PlannerInfo *root,
 	 * field of the finished plan node; we can't keep them in private state
 	 * because then they wouldn't be subject to later planner processing.
 	 */
-	return make_foreignscan(tlist,
+	scan = make_foreignscan(tlist,
 							local_exprs,
 							scan_relid,
 							params_list,
 							fdw_private);
+
+	/*
+	 * Set fdw_ps_tlist to handle tuples generated by this scan.
+	 */
+	scan->fdw_ps_tlist = fdw_ps_tlist;
+
+	return scan;
 }
 
 /*
@@ -885,9 +820,8 @@ postgresBeginForeignScan(ForeignScanState *node, int eflags)
 	ForeignScan *fsplan = (ForeignScan *) node->ss.ps.plan;
 	EState	   *estate = node->ss.ps.state;
 	PgFdwScanState *fsstate;
-	RangeTblEntry *rte;
+	Oid			serverid;
 	Oid			userid;
-	ForeignTable *table;
 	ForeignServer *server;
 	UserMapping *user;
 	int			numParams;
@@ -907,22 +841,13 @@ postgresBeginForeignScan(ForeignScanState *node, int eflags)
 	node->fdw_state = (void *) fsstate;
 
 	/*
-	 * Identify which user to do the remote access as.  This should match what
-	 * ExecCheckRTEPerms() does.
-	 */
-	rte = rt_fetch(fsplan->scan.scanrelid, estate->es_range_table);
-	userid = rte->checkAsUser ? rte->checkAsUser : GetUserId();
-
-	/* Get info about foreign table. */
-	fsstate->rel = node->ss.ss_currentRelation;
-	table = GetForeignTable(RelationGetRelid(fsstate->rel));
-	server = GetForeignServer(table->serverid);
-	user = GetUserMapping(userid, server->serverid);
-
-	/*
 	 * Get connection to the foreign server.  Connection manager will
 	 * establish new connection if necessary.
 	 */
+	serverid = intVal(list_nth(fsplan->fdw_private, FdwScanPrivateServerOid));
+	userid = intVal(list_nth(fsplan->fdw_private, FdwScanPrivateUserOid));
+	server = GetForeignServer(serverid);
+	user = GetUserMapping(userid, server->serverid);
 	fsstate->conn = GetConnection(server, user, false);
 
 	/* Assign a unique ID for my cursor */
@@ -932,8 +857,8 @@ postgresBeginForeignScan(ForeignScanState *node, int eflags)
 	/* Get private info created by planner functions. */
 	fsstate->query = strVal(list_nth(fsplan->fdw_private,
 									 FdwScanPrivateSelectSql));
-	fsstate->retrieved_attrs = (List *) list_nth(fsplan->fdw_private,
-											   FdwScanPrivateRetrievedAttrs);
+	fsstate->retrieved_attrs = list_nth(fsplan->fdw_private,
+										FdwScanPrivateRetrievedAttrs);
 
 	/* Create contexts for batches of tuples and per-tuple temp workspace. */
 	fsstate->batch_cxt = AllocSetContextCreate(estate->es_query_cxt,
@@ -947,8 +872,18 @@ postgresBeginForeignScan(ForeignScanState *node, int eflags)
 											  ALLOCSET_SMALL_INITSIZE,
 											  ALLOCSET_SMALL_MAXSIZE);
 
-	/* Get info we'll need for input data conversion. */
-	fsstate->attinmeta = TupleDescGetAttInMetadata(RelationGetDescr(fsstate->rel));
+	/* Get info we'll need for input data conversion and error reporting. */
+	if (fsplan->scan.scanrelid > 0)
+	{
+		fsstate->relname = RelationGetRelationName(node->ss.ss_currentRelation);
+		fsstate->tupdesc = RelationGetDescr(node->ss.ss_currentRelation);
+	}
+	else
+	{
+		fsstate->relname = NULL;
+		fsstate->tupdesc = node->ss.ss_ScanTupleSlot->tts_tupleDescriptor;
+	}
+	fsstate->attinmeta = TupleDescGetAttInMetadata(fsstate->tupdesc);
 
 	/* Prepare for output conversion of parameters used in remote query. */
 	numParams = list_length(fsplan->fdw_exprs);
@@ -1726,10 +1661,12 @@ estimate_path_cost_size(PlannerInfo *root,
 	 */
 	if (fpinfo->use_remote_estimate)
 	{
+		List	   *remote_conds;
 		List	   *remote_join_conds;
 		List	   *local_join_conds;
-		StringInfoData sql;
 		List	   *retrieved_attrs;
+		StringInfoData sql;
+		UserMapping *user;
 		PGconn	   *conn;
 		Selectivity local_sel;
 		QualCost	local_cost;
@@ -1741,24 +1678,24 @@ estimate_path_cost_size(PlannerInfo *root,
 		classifyConditions(root, baserel, join_conds,
 						   &remote_join_conds, &local_join_conds);
 
+		remote_conds = copyObject(fpinfo->remote_conds);
+		remote_conds = list_concat(remote_conds, remote_join_conds);
+
 		/*
 		 * Construct EXPLAIN query including the desired SELECT, FROM, and
 		 * WHERE clauses.  Params and other-relation Vars are replaced by
 		 * dummy values.
+		 * Here we pass NULL for params_list and fdw_ps_tlist because they are
+		 * unnecessary for EXPLAIN.
 		 */
 		initStringInfo(&sql);
 		appendStringInfoString(&sql, "EXPLAIN ");
-		deparseSelectSql(&sql, root, baserel, fpinfo->attrs_used,
-						 &retrieved_attrs);
-		if (fpinfo->remote_conds)
-			appendWhereClause(&sql, root, baserel, fpinfo->remote_conds,
-							  true, NULL);
-		if (remote_join_conds)
-			appendWhereClause(&sql, root, baserel, remote_join_conds,
-							  (fpinfo->remote_conds == NIL), NULL);
+		deparseSelectSql(&sql, root, baserel, fpinfo->attrs_used, remote_conds,
+						 NULL, NULL, &retrieved_attrs);
 
 		/* Get the remote estimate */
-		conn = GetConnection(fpinfo->server, fpinfo->user, false);
+		user = GetUserMapping(fpinfo->userid, fpinfo->server->serverid);
+		conn = GetConnection(fpinfo->server, user, false);
 		get_remote_estimate(sql.data, conn, &rows, &width,
 							&startup_cost, &total_cost);
 		ReleaseConnection(conn);
@@ -2055,7 +1992,9 @@ fetch_more_data(ForeignScanState *node)
 		{
 			fsstate->tuples[i] =
 				make_tuple_from_result_row(res, i,
-										   fsstate->rel,
+										   fsstate->relname,
+										   fsstate->query,
+										   fsstate->tupdesc,
 										   fsstate->attinmeta,
 										   fsstate->retrieved_attrs,
 										   fsstate->temp_cxt);
@@ -2273,7 +2212,9 @@ store_returning_result(PgFdwModifyState *fmstate,
 		HeapTuple	newtup;
 
 		newtup = make_tuple_from_result_row(res, 0,
-											fmstate->rel,
+										RelationGetRelationName(fmstate->rel),
+											fmstate->query,
+											RelationGetDescr(fmstate->rel),
 											fmstate->attinmeta,
 											fmstate->retrieved_attrs,
 											fmstate->temp_cxt);
@@ -2423,6 +2364,7 @@ postgresAcquireSampleRowsFunc(Relation relation, int elevel,
 	initStringInfo(&sql);
 	appendStringInfo(&sql, "DECLARE c%u CURSOR FOR ", cursor_number);
 	deparseAnalyzeSql(&sql, relation, &astate.retrieved_attrs);
+	astate.query = sql.data;
 
 	/* In what follows, do not risk leaking any PGresults. */
 	PG_TRY();
@@ -2565,7 +2507,9 @@ analyze_row_processor(PGresult *res, int row, PgFdwAnalyzeState *astate)
 		oldcontext = MemoryContextSwitchTo(astate->anl_cxt);
 
 		astate->rows[pos] = make_tuple_from_result_row(res, row,
-													   astate->rel,
+										   RelationGetRelationName(astate->rel),
+													   astate->query,
+											   RelationGetDescr(astate->rel),
 													   astate->attinmeta,
 													 astate->retrieved_attrs,
 													   astate->temp_cxt);
@@ -2839,6 +2783,268 @@ postgresImportForeignSchema(ImportForeignSchemaStmt *stmt, Oid serverOid)
 }
 
 /*
+ * Construct PgFdwRelationInfo from two join sources
+ */
+static PgFdwRelationInfo *
+merge_fpinfo(RelOptInfo *outerrel,
+			 RelOptInfo *innerrel,
+			 JoinType jointype,
+			 double rows,
+			 int width)
+{
+	PgFdwRelationInfo *fpinfo_o;
+	PgFdwRelationInfo *fpinfo_i;
+	PgFdwRelationInfo *fpinfo;
+
+	fpinfo_o = (PgFdwRelationInfo *) outerrel->fdw_private;
+	fpinfo_i = (PgFdwRelationInfo *) innerrel->fdw_private;
+
+	fpinfo = (PgFdwRelationInfo *) palloc0(sizeof(PgFdwRelationInfo));
+
+	/* The join relation inherits the conditions of its source relations */
+	fpinfo->remote_conds = list_concat(copyObject(fpinfo_o->remote_conds),
+									   copyObject(fpinfo_i->remote_conds));
+	fpinfo->local_conds = list_concat(copyObject(fpinfo_o->local_conds),
+									  copyObject(fpinfo_i->local_conds));
+
+	/* Only for simple foreign table scan */
+	fpinfo->attrs_used = NULL;
+
+	/* rows and width are supplied by the caller */
+	fpinfo->rows = rows;
+	fpinfo->width = width;
+
+	/* A join has local conditions for both outer and inner, so sum them up. */
+	fpinfo->local_conds_cost.startup = fpinfo_o->local_conds_cost.startup +
+									   fpinfo_i->local_conds_cost.startup;
+	fpinfo->local_conds_cost.per_tuple = fpinfo_o->local_conds_cost.per_tuple +
+										 fpinfo_i->local_conds_cost.per_tuple;
+
+	/* Don't consider correlation between local filters. */
+	fpinfo->local_conds_sel = fpinfo_o->local_conds_sel *
+							  fpinfo_i->local_conds_sel;
+
+	fpinfo->use_remote_estimate = false;
+
+	/*
+	 * These two come from the default or a per-server setting, so outer and
+	 * inner must have the same value.
+	 */
+	fpinfo->fdw_startup_cost = fpinfo_o->fdw_startup_cost;
+	fpinfo->fdw_tuple_cost = fpinfo_o->fdw_tuple_cost;
+
+	/*
+	 * TODO estimate more accurately
+	 */
+	fpinfo->startup_cost = fpinfo->fdw_startup_cost +
+						   fpinfo->local_conds_cost.startup;
+	fpinfo->total_cost = fpinfo->startup_cost +
+						 (fpinfo->fdw_tuple_cost +
+						  fpinfo->local_conds_cost.per_tuple +
+						  cpu_tuple_cost) * fpinfo->rows;
+
+	/* server and userid are identical for outer and inner */
+	fpinfo->server = fpinfo_o->server;
+	fpinfo->userid = fpinfo_o->userid;
+
+	fpinfo->outerrel = outerrel;
+	fpinfo->innerrel = innerrel;
+	fpinfo->jointype = jointype;
+
+	/* joinclauses and otherclauses will be set later */
+
+	return fpinfo;
+}
+
+/*
+ * postgresGetForeignJoinPaths
+ *		Add possible ForeignPath to joinrel.
+ *
+ * Joins that satisfy the conditions below can be pushed down to the remote
+ * PostgreSQL server.
+ *
+ * 1) Join type is INNER or OUTER (one of LEFT/RIGHT/FULL)
+ * 2) Both outer and inner portions are safe to push down
+ * 3) All foreign tables in the join belong to the same foreign server
+ * 4) All foreign tables are accessed as the same user
+ * 5) All join conditions are safe to push down
+ * 6) No relation has a local filter (this can be relaxed for INNER JOIN with
+ * no volatile function/operator, but for now we take the safer approach)
+ */
+static void
+postgresGetForeignJoinPaths(PlannerInfo *root,
+							RelOptInfo *joinrel,
+							RelOptInfo *outerrel,
+							RelOptInfo *innerrel,
+							SpecialJoinInfo *sjinfo,
+							List *restrictlist)
+{
+	PgFdwRelationInfo *fpinfo;
+	PgFdwRelationInfo *fpinfo_o;
+	PgFdwRelationInfo *fpinfo_i;
+	JoinType		jointype = !sjinfo ? JOIN_INNER : sjinfo->jointype;
+	ForeignPath	   *joinpath;
+	double			rows;
+	Cost			startup_cost;
+	Cost			total_cost;
+
+	ListCell	   *lc;
+	List		   *joinclauses;
+	List		   *otherclauses;
+
+	/*
+	 * We support all outer joins in addition to inner joins.  CROSS JOIN is
+	 * an INNER JOIN with no conditions internally, so it is checked later.
+	 */
+	if (jointype != JOIN_INNER && jointype != JOIN_LEFT &&
+		jointype != JOIN_RIGHT && jointype != JOIN_FULL)
+	{
+		ereport(DEBUG3, (errmsg("unsupported join type (SEMI, ANTI)")));
+		return;
+	}
+
+	/*
+	 * Having a valid PgFdwRelationInfo in RelOptInfo.fdw_private indicates
+	 * that a scan of the relation can be pushed down.  If either relation
+	 * lacks a PgFdwRelationInfo, give up on pushing down this join.
+	 */
+	if (!outerrel->fdw_private)
+	{
+		ereport(DEBUG3, (errmsg("outer is not safe to push-down")));
+		return;
+	}
+	if (!innerrel->fdw_private)
+	{
+		ereport(DEBUG3, (errmsg("inner is not safe to push-down")));
+		return;
+	}
+	fpinfo_o = (PgFdwRelationInfo *) outerrel->fdw_private;
+	fpinfo_i = (PgFdwRelationInfo *) innerrel->fdw_private;
+
+	/*
+	 * All relations in the join must belong to the same server.  Having a
+	 * valid fdw_private means that all relations under that rel belong to
+	 * the server recorded in the fdw_private, so we only need to compare the
+	 * serverid of the outer and inner relations.
+	 */
+	if (fpinfo_o->server->serverid != fpinfo_i->server->serverid)
+	{
+		ereport(DEBUG3, (errmsg("server mismatch")));
+		return;
+	}
+
+	/*
+	 * The effective userid of all source relations must be identical.
+	 * Having a valid fdw_private means that all relations under that rel are
+	 * accessed as the same user, so we only need to compare the userid of
+	 * the outer and inner relations.
+	 */
+	if (fpinfo_o->userid != fpinfo_i->userid)
+	{
+		ereport(DEBUG3, (errmsg("userid mismatch")));
+		return;
+	}
+
+	/*
+	 * No source relation can have local conditions.  This can be relaxed
+	 * if the join is an inner join and the local conditions don't contain
+	 * volatile functions/operators, but for now we leave it as a future
+	 * enhancement.
+	 */
+	if (fpinfo_o->local_conds != NULL || fpinfo_i->local_conds != NULL)
+	{
+		ereport(DEBUG3, (errmsg("join with local filter")));
+		return;
+	}
+
+	/*
+	 * Separate restrictlist into two lists, join conditions and remote filters.
+	 */
+	joinclauses = restrictlist;
+	if (IS_OUTER_JOIN(jointype))
+	{
+		extract_actual_join_clauses(joinclauses, &joinclauses, &otherclauses);
+	}
+	else
+	{
+		joinclauses = extract_actual_clauses(joinclauses, false);
+		otherclauses = NIL;
+	}
+
+	/*
+	 * Note that CROSS JOIN (cartesian product) is transformed to JOIN_INNER
+	 * with empty joinclauses.  Pushing down a CROSS JOIN usually transfers
+	 * more rows than retrieving each table separately, so we don't push down
+	 * such joins.
+	 */
+	if (jointype == JOIN_INNER && joinclauses == NIL)
+	{
+		ereport(DEBUG3, (errmsg("unsupported join type (CROSS)")));
+		return;
+	}
+
+	/*
+	 * Join conditions must be safe to push down.
+	 */
+	foreach(lc, joinclauses)
+	{
+		Expr *expr = (Expr *) lfirst(lc);
+
+		if (!is_foreign_expr(root, joinrel, expr))
+		{
+			ereport(DEBUG3, (errmsg("join quals contain unsafe conditions")));
+			return;
+		}
+	}
+
+	/*
+	 * Other conditions of the join must be safe to push down.
+	 */
+	foreach(lc, otherclauses)
+	{
+		Expr *expr = (Expr *) lfirst(lc);
+
+		if (!is_foreign_expr(root, joinrel, expr))
+		{
+			ereport(DEBUG3, (errmsg("filter contains unsafe conditions")));
+			return;
+		}
+	}
+
+	/* Here we know that this join can be pushed down to the remote side. */
+
+	/* Construct fpinfo for the join relation */
+	fpinfo = merge_fpinfo(outerrel, innerrel, jointype, joinrel->rows, joinrel->width);
+	fpinfo->joinclauses = joinclauses;
+	fpinfo->otherclauses = otherclauses;
+	joinrel->fdw_private = fpinfo;
+
+	/* TODO determine more accurate cost and rows of the join. */
+	rows = joinrel->rows;
+	startup_cost = fpinfo->startup_cost;
+	total_cost = fpinfo->total_cost;
+
+	/*
+	 * Create a new join path and add it to the joinrel which represents a join
+	 * between foreign tables.
+	 */
+	joinpath = create_foreignscan_path(root,
+									   joinrel,
+									   rows,
+									   startup_cost,
+									   total_cost,
+									   NIL,		/* no pathkeys */
+									   NULL,	/* no required_outer */
+									   NIL);	/* no fdw_private */
+
+	/* Add generated path into joinrel by add_path(). */
+	add_path(joinrel, (Path *) joinpath);
+	elog(DEBUG3, "join path added");
+
+	/* TODO consider parameterized paths */
+}
+
+/*
  * Create a tuple from the specified row of the PGresult.
  *
  * rel is the local representation of the foreign table, attinmeta is
@@ -2849,13 +3055,14 @@ postgresImportForeignSchema(ImportForeignSchemaStmt *stmt, Oid serverOid)
 static HeapTuple
 make_tuple_from_result_row(PGresult *res,
 						   int row,
-						   Relation rel,
+						   const char *relname,
+						   const char *query,
+						   TupleDesc tupdesc,
 						   AttInMetadata *attinmeta,
 						   List *retrieved_attrs,
 						   MemoryContext temp_context)
 {
 	HeapTuple	tuple;
-	TupleDesc	tupdesc = RelationGetDescr(rel);
 	Datum	   *values;
 	bool	   *nulls;
 	ItemPointer ctid = NULL;
@@ -2882,7 +3089,9 @@ make_tuple_from_result_row(PGresult *res,
 	/*
 	 * Set up and install callback to report where conversion error occurs.
 	 */
-	errpos.rel = rel;
+	errpos.relname = relname;
+	errpos.query = query;
+	errpos.tupdesc = tupdesc;
 	errpos.cur_attno = 0;
 	errcallback.callback = conversion_error_callback;
 	errcallback.arg = (void *) &errpos;
@@ -2966,11 +3175,39 @@ make_tuple_from_result_row(PGresult *res,
 static void
 conversion_error_callback(void *arg)
 {
+	const char *attname;
+	const char *relname;
 	ConversionLocation *errpos = (ConversionLocation *) arg;
-	TupleDesc	tupdesc = RelationGetDescr(errpos->rel);
+	TupleDesc	tupdesc = errpos->tupdesc;
+	StringInfoData buf;
+
+	if (errpos->relname)
+	{
+		/* error occurred in a scan against a foreign table */
+		initStringInfo(&buf);
+		if (errpos->cur_attno > 0)
+			appendStringInfo(&buf, "column \"%s\"",
+					 NameStr(tupdesc->attrs[errpos->cur_attno - 1]->attname));
+		else if (errpos->cur_attno == SelfItemPointerAttributeNumber)
+			appendStringInfoString(&buf, "column \"ctid\"");
+		attname = buf.data;
+
+		initStringInfo(&buf);
+		appendStringInfo(&buf, "foreign table \"%s\"", errpos->relname);
+		relname = buf.data;
+	}
+	else
+	{
+		/* error occurred in a scan against a foreign join */
+		initStringInfo(&buf);
+		appendStringInfo(&buf, "column %d", errpos->cur_attno - 1);
+		attname = buf.data;
+
+		initStringInfo(&buf);
+		appendStringInfo(&buf, "foreign join \"%s\"", errpos->query);
+		relname = buf.data;
+	}
 
 	if (errpos->cur_attno > 0 && errpos->cur_attno <= tupdesc->natts)
-		errcontext("column \"%s\" of foreign table \"%s\"",
-				   NameStr(tupdesc->attrs[errpos->cur_attno - 1]->attname),
-				   RelationGetRelationName(errpos->rel));
+		errcontext("%s of %s", attname, relname);
 }
diff --git a/contrib/postgres_fdw/postgres_fdw.h b/contrib/postgres_fdw/postgres_fdw.h
index 950c6f7..0d05e5d 100644
--- a/contrib/postgres_fdw/postgres_fdw.h
+++ b/contrib/postgres_fdw/postgres_fdw.h
@@ -16,10 +16,52 @@
 #include "foreign/foreign.h"
 #include "lib/stringinfo.h"
 #include "nodes/relation.h"
+#include "nodes/plannodes.h"
 #include "utils/relcache.h"
 
 #include "libpq-fe.h"
 
+/*
+ * FDW-specific planner information kept in RelOptInfo.fdw_private for a
+ * foreign table or a foreign join.  This information is collected by
+ * postgresGetForeignRelSize, or calculated from join source relations.
+ */
+typedef struct PgFdwRelationInfo
+{
+	/* baserestrictinfo clauses, broken down into safe and unsafe subsets. */
+	List	   *remote_conds;
+	List	   *local_conds;
+
+	/* Bitmap of attr numbers we need to fetch from the remote server. */
+	Bitmapset  *attrs_used;
+
+	/* Cost and selectivity of local_conds. */
+	QualCost	local_conds_cost;
+	Selectivity local_conds_sel;
+
+	/* Estimated size and cost for a scan with baserestrictinfo quals. */
+	double		rows;
+	int			width;
+	Cost		startup_cost;
+	Cost		total_cost;
+
+	/* Options extracted from catalogs. */
+	bool		use_remote_estimate;
+	Cost		fdw_startup_cost;
+	Cost		fdw_tuple_cost;
+
+	/* Cached catalog information. */
+	ForeignServer *server;
+	Oid			userid;
+
+	/* Join information */
+	RelOptInfo *outerrel;
+	RelOptInfo *innerrel;
+	JoinType	jointype;
+	List	   *joinclauses;
+	List	   *otherclauses;
+} PgFdwRelationInfo;
+
 /* in postgres_fdw.c */
 extern int	set_transmission_modes(void);
 extern void reset_transmission_modes(int nestlevel);
@@ -51,13 +93,30 @@ extern void deparseSelectSql(StringInfo buf,
 				 PlannerInfo *root,
 				 RelOptInfo *baserel,
 				 Bitmapset *attrs_used,
+				 List *remote_conds,
+				 List **params_list,
+				 List **fdw_ps_tlist,
 				 List **retrieved_attrs);
-extern void appendWhereClause(StringInfo buf,
+extern void appendConditions(StringInfo buf,
 				  PlannerInfo *root,
 				  RelOptInfo *baserel,
+				  List *outertlist,
+				  List *innertlist,
 				  List *exprs,
-				  bool is_first,
+				  const char *prefix,
 				  List **params);
+extern void deparseJoinSql(StringInfo sql,
+			   PlannerInfo *root,
+			   RelOptInfo *baserel,
+			   RelOptInfo *outerrel,
+			   RelOptInfo *innerrel,
+			   const char *sql_o,
+			   const char *sql_i,
+			   JoinType jointype,
+			   List *joinclauses,
+			   List *otherclauses,
+			   List **fdw_ps_tlist,
+			   List **retrieved_attrs);
 extern void deparseInsertSql(StringInfo buf, PlannerInfo *root,
 				 Index rtindex, Relation rel,
 				 List *targetAttrs, List *returningList,
diff --git a/contrib/postgres_fdw/sql/postgres_fdw.sql b/contrib/postgres_fdw/sql/postgres_fdw.sql
index 4a23457..05bd2f6 100644
--- a/contrib/postgres_fdw/sql/postgres_fdw.sql
+++ b/contrib/postgres_fdw/sql/postgres_fdw.sql
@@ -11,12 +11,17 @@ DO $d$
             OPTIONS (dbname '$$||current_database()||$$',
                      port '$$||current_setting('port')||$$'
             )$$;
+        EXECUTE $$CREATE SERVER loopback2 FOREIGN DATA WRAPPER postgres_fdw
+            OPTIONS (dbname '$$||current_database()||$$',
+                     port '$$||current_setting('port')||$$'
+            )$$;
     END;
 $d$;
 
 CREATE USER MAPPING FOR public SERVER testserver1
 	OPTIONS (user 'value', password 'value');
 CREATE USER MAPPING FOR CURRENT_USER SERVER loopback;
+CREATE USER MAPPING FOR CURRENT_USER SERVER loopback2;
 
 -- ===================================================================
 -- create objects used through FDW loopback server
@@ -39,6 +44,18 @@ CREATE TABLE "S 1"."T 2" (
 	c2 text,
 	CONSTRAINT t2_pkey PRIMARY KEY (c1)
 );
+CREATE TABLE "S 1"."T 3" (
+	c1 int NOT NULL,
+	c2 int NOT NULL,
+	c3 text,
+	CONSTRAINT t3_pkey PRIMARY KEY (c1)
+);
+CREATE TABLE "S 1"."T 4" (
+	c1 int NOT NULL,
+	c2 int NOT NULL,
+	c4 text,
+	CONSTRAINT t4_pkey PRIMARY KEY (c1)
+);
 
 INSERT INTO "S 1"."T 1"
 	SELECT id,
@@ -54,9 +71,23 @@ INSERT INTO "S 1"."T 2"
 	SELECT id,
 	       'AAA' || to_char(id, 'FM000')
 	FROM generate_series(1, 100) id;
+INSERT INTO "S 1"."T 3"
+	SELECT id,
+	       id + 1,
+	       'AAA' || to_char(id, 'FM000')
+	FROM generate_series(1, 100) id;
+DELETE FROM "S 1"."T 3" WHERE c1 % 2 != 0;	-- delete for outer join tests
+INSERT INTO "S 1"."T 4"
+	SELECT id,
+	       id + 1,
+	       'AAA' || to_char(id, 'FM000')
+	FROM generate_series(1, 100) id;
+DELETE FROM "S 1"."T 4" WHERE c1 % 3 != 0;	-- delete for outer join tests
 
 ANALYZE "S 1"."T 1";
 ANALYZE "S 1"."T 2";
+ANALYZE "S 1"."T 3";
+ANALYZE "S 1"."T 4";
 
 -- ===================================================================
 -- create foreign tables
@@ -87,6 +118,29 @@ CREATE FOREIGN TABLE ft2 (
 ) SERVER loopback;
 ALTER FOREIGN TABLE ft2 DROP COLUMN cx;
 
+CREATE FOREIGN TABLE ft4 (
+	c1 int NOT NULL,
+	c2 int NOT NULL,
+	c3 text
+) SERVER loopback OPTIONS (schema_name 'S 1', table_name 'T 3');
+
+CREATE FOREIGN TABLE ft5 (
+	c1 int NOT NULL,
+	c2 int NOT NULL,
+	c3 text
+) SERVER loopback OPTIONS (schema_name 'S 1', table_name 'T 4');
+
+CREATE FOREIGN TABLE ft6 (
+	c1 int NOT NULL,
+	c2 int NOT NULL,
+	c3 text
+) SERVER loopback2 OPTIONS (schema_name 'S 1', table_name 'T 4');
+CREATE USER view_owner;
+GRANT ALL ON ft5 TO view_owner;
+CREATE VIEW v_ft5 AS SELECT * FROM ft5;
+ALTER VIEW v_ft5 OWNER TO view_owner;
+CREATE USER MAPPING FOR view_owner SERVER loopback;
+
 -- ===================================================================
 -- tests for validator
 -- ===================================================================
@@ -158,8 +212,6 @@ EXPLAIN (VERBOSE, COSTS false) SELECT * FROM ft1 t1 WHERE c1 = 102 FOR SHARE;
 SELECT * FROM ft1 t1 WHERE c1 = 102 FOR SHARE;
 -- aggregate
 SELECT COUNT(*) FROM ft1 t1;
--- join two tables
-SELECT t1.c1 FROM ft1 t1 JOIN ft2 t2 ON (t1.c1 = t2.c1) ORDER BY t1.c3, t1.c1 OFFSET 100 LIMIT 10;
 -- subquery
 SELECT * FROM ft1 t1 WHERE t1.c3 IN (SELECT c3 FROM ft2 t2 WHERE c1 <= 10) ORDER BY c1;
 -- subquery+MAX
@@ -216,6 +268,82 @@ SELECT * FROM ft1 WHERE c1 = ANY (ARRAY(SELECT c1 FROM ft2 WHERE c1 < 5));
 SELECT * FROM ft2 WHERE c1 = ANY (ARRAY(SELECT c1 FROM ft1 WHERE c1 < 5));
 
 -- ===================================================================
+-- JOIN queries
+-- ===================================================================
+-- join two tables
+EXPLAIN (COSTS false, VERBOSE)
+SELECT t1.c1, t2.c1 FROM ft1 t1 JOIN ft2 t2 ON (t1.c1 = t2.c1) ORDER BY t1.c3, t1.c1 OFFSET 100 LIMIT 10;
+SELECT t1.c1, t2.c1 FROM ft1 t1 JOIN ft2 t2 ON (t1.c1 = t2.c1) ORDER BY t1.c3, t1.c1 OFFSET 100 LIMIT 10;
+-- join three tables
+EXPLAIN (COSTS false, VERBOSE)
+SELECT t1.c1, t2.c1, t3.c1 FROM ft1 t1 JOIN ft2 t2 ON (t1.c1 = t2.c1) JOIN ft4 t3 ON (t3.c1 = t1.c1) ORDER BY t1.c3, t1.c1 OFFSET 10 LIMIT 10;
+SELECT t1.c1, t2.c1, t3.c1 FROM ft1 t1 JOIN ft2 t2 ON (t1.c1 = t2.c1) JOIN ft4 t3 ON (t3.c1 = t1.c1) ORDER BY t1.c3, t1.c1 OFFSET 10 LIMIT 10;
+-- left outer join
+EXPLAIN (COSTS false, VERBOSE)
+SELECT t1.c1, t2.c1 FROM ft4 t1 LEFT JOIN ft5 t2 ON (t1.c1 = t2.c1) ORDER BY t1.c1, t2.c1 OFFSET 10 LIMIT 10;
+SELECT t1.c1, t2.c1 FROM ft4 t1 LEFT JOIN ft5 t2 ON (t1.c1 = t2.c1) ORDER BY t1.c1, t2.c1 OFFSET 10 LIMIT 10;
+-- right outer join
+SET enable_mergejoin = off; -- palnner choose MergeJoin even it has higher costs, so disable it for testing.
+EXPLAIN (COSTS false, VERBOSE)
+SELECT t1.c1, t2.c1 FROM ft4 t1 RIGHT JOIN ft5 t2 ON (t1.c1 = t2.c1) ORDER BY t2.c1, t2.c1 OFFSET 10 LIMIT 10;
+SELECT t1.c1, t2.c1 FROM ft4 t1 RIGHT JOIN ft5 t2 ON (t1.c1 = t2.c1) ORDER BY t2.c1, t2.c1 OFFSET 10 LIMIT 10;
+SET enable_mergejoin = on;
+-- full outer join
+EXPLAIN (COSTS false, VERBOSE)
+SELECT t1.c1, t2.c1 FROM ft4 t1 FULL JOIN ft5 t2 ON (t1.c1 = t2.c1) ORDER BY t1.c1, t2.c1 OFFSET 45 LIMIT 10;
+SELECT t1.c1, t2.c1 FROM ft4 t1 FULL JOIN ft5 t2 ON (t1.c1 = t2.c1) ORDER BY t1.c1, t2.c1 OFFSET 45 LIMIT 10;
+-- full outer join + WHERE clause, only matched rows
+EXPLAIN (COSTS false, VERBOSE)
+SELECT t1.c1, t2.c1 FROM ft4 t1 FULL JOIN ft5 t2 ON (t1.c1 = t2.c1) WHERE (t1.c1 = t2.c1 OR t1.c1 IS NULL) ORDER BY t1.c1, t2.c1 OFFSET 10 LIMIT 10;
+SELECT t1.c1, t2.c1 FROM ft4 t1 FULL JOIN ft5 t2 ON (t1.c1 = t2.c1) WHERE (t1.c1 = t2.c1 OR t1.c1 IS NULL) ORDER BY t1.c1, t2.c1 OFFSET 10 LIMIT 10;
+-- join at WHERE clause 
+SET enable_mergejoin = off; -- palnner choose MergeJoin even it has higher costs, so disable it for testing.
+EXPLAIN (COSTS false, VERBOSE)
+SELECT t1.c1, t2.c1 FROM ft4 t1 LEFT JOIN ft5 t2 ON true WHERE (t1.c1 = t2.c1) ORDER BY t1.c1, t2.c1 OFFSET 10 LIMIT 10;
+SELECT t1.c1, t2.c1 FROM ft4 t1 LEFT JOIN ft5 t2 ON true WHERE (t1.c1 = t2.c1) ORDER BY t1.c1, t2.c1 OFFSET 10 LIMIT 10;
+SET enable_mergejoin = on;
+-- join in CTE
+EXPLAIN (COSTS false, VERBOSE)
+WITH t (c1_1, c1_3, c2_1) AS (SELECT t1.c1, t1.c3, t2.c1 FROM ft1 t1 JOIN ft2 t2 ON (t1.c1 = t2.c1)) SELECT c1_1, c2_1 FROM t ORDER BY c1_3, c1_1 OFFSET 100 LIMIT 10;
+WITH t (c1_1, c1_3, c2_1) AS (SELECT t1.c1, t1.c3, t2.c1 FROM ft1 t1 JOIN ft2 t2 ON (t1.c1 = t2.c1)) SELECT c1_1, c2_1 FROM t ORDER BY c1_3, c1_1 OFFSET 100 LIMIT 10;
+-- ctid with whole-row reference
+EXPLAIN (COSTS false, VERBOSE)
+SELECT t1.ctid, t1, t2, t1.c1 FROM ft1 t1 JOIN ft2 t2 ON (t1.c1 = t2.c1) ORDER BY t1.c3, t1.c1 OFFSET 100 LIMIT 10;
+SELECT t1.ctid, t1, t2, t1.c1 FROM ft1 t1 JOIN ft2 t2 ON (t1.c1 = t2.c1) ORDER BY t1.c3, t1.c1 OFFSET 100 LIMIT 10;
+-- partially unsafe to push down, not pushed down
+EXPLAIN (COSTS false, VERBOSE)
+SELECT t1.c1 FROM ft1 t1 JOIN ft2 t2 ON t2.c1 = t2.c1 JOIN ft4 t3 ON t2.c1 = t3.c1 ORDER BY t1.c1 OFFSET 10 LIMIT 10;
+SELECT t1.c1 FROM ft1 t1 JOIN ft2 t2 ON t2.c1 = t2.c1 JOIN ft4 t3 ON t2.c1 = t3.c1 ORDER BY t1.c1 OFFSET 10 LIMIT 10;
+-- SEMI JOIN, not pushed down
+EXPLAIN (COSTS false, VERBOSE)
+SELECT t1.c1 FROM ft1 t1 WHERE EXISTS (SELECT 1 FROM ft2 t2 WHERE t1.c1 = t2.c1) ORDER BY t1.c1 OFFSET 100 LIMIT 10;
+SELECT t1.c1 FROM ft1 t1 WHERE EXISTS (SELECT 1 FROM ft2 t2 WHERE t1.c1 = t2.c1) ORDER BY t1.c1 OFFSET 100 LIMIT 10;
+-- ANTI JOIN, not pushed down
+EXPLAIN (COSTS false, VERBOSE)
+SELECT t1.c1 FROM ft1 t1 WHERE NOT EXISTS (SELECT 1 FROM ft2 t2 WHERE t1.c1 = t2.c2) ORDER BY t1.c1 OFFSET 100 LIMIT 10;
+SELECT t1.c1 FROM ft1 t1 WHERE NOT EXISTS (SELECT 1 FROM ft2 t2 WHERE t1.c1 = t2.c2) ORDER BY t1.c1 OFFSET 100 LIMIT 10;
+-- CROSS JOIN, not pushed down
+EXPLAIN (COSTS false, VERBOSE)
+SELECT t1.c1, t2.c1 FROM ft1 t1 CROSS JOIN ft2 t2 ORDER BY t1.c1, t2.c1 OFFSET 100 LIMIT 10;
+SELECT t1.c1, t2.c1 FROM ft1 t1 CROSS JOIN ft2 t2 ORDER BY t1.c1, t2.c1 OFFSET 100 LIMIT 10;
+-- different server
+EXPLAIN (COSTS false, VERBOSE)
+SELECT t1.c1, t2.c1 FROM ft5 t1 JOIN ft6 t2 ON (t1.c1 = t2.c1) ORDER BY t1.c1, t2.c1 OFFSET 100 LIMIT 10;
+SELECT t1.c1, t2.c1 FROM ft5 t1 JOIN ft6 t2 ON (t1.c1 = t2.c1) ORDER BY t1.c1, t2.c1 OFFSET 100 LIMIT 10;
+-- different effective user for permission check
+EXPLAIN (COSTS false, VERBOSE)
+SELECT t1.c1, t2.c1 FROM ft5 t1 JOIN v_ft5 t2 ON (t1.c1 = t2.c1) ORDER BY t1.c1, t2.c1 OFFSET 100 LIMIT 10;
+SELECT t1.c1, t2.c1 FROM ft5 t1 JOIN v_ft5 t2 ON (t1.c1 = t2.c1) ORDER BY t1.c1, t2.c1 OFFSET 100 LIMIT 10;
+-- unsafe join conditions
+EXPLAIN (COSTS false, VERBOSE)
+SELECT t1.c1, t2.c1 FROM ft1 t1 JOIN ft2 t2 ON (t1.c8 = t2.c8) ORDER BY t1.c3, t1.c1 OFFSET 100 LIMIT 10;
+SELECT t1.c1, t2.c1 FROM ft1 t1 JOIN ft2 t2 ON (t1.c8 = t2.c8) ORDER BY t1.c3, t1.c1 OFFSET 100 LIMIT 10;
+-- local filter (unsafe conditions on one side)
+EXPLAIN (COSTS false, VERBOSE)
+SELECT t1.c1, t2.c1 FROM ft1 t1 JOIN ft2 t2 ON (t1.c1 = t2.c1) WHERE t1.c8 = 'foo' ORDER BY t1.c3, t1.c1 OFFSET 100 LIMIT 10;
+SELECT t1.c1, t2.c1 FROM ft1 t1 JOIN ft2 t2 ON (t1.c1 = t2.c1) WHERE t1.c8 = 'foo' ORDER BY t1.c3, t1.c1 OFFSET 100 LIMIT 10;
+
+-- ===================================================================
 -- parameterized queries
 -- ===================================================================
 -- simple join
@@ -666,116 +794,6 @@ UPDATE rem1 SET f2 = 'testo';
 INSERT INTO rem1(f2) VALUES ('test') RETURNING ctid;
 
 -- ===================================================================
--- test inheritance features
--- ===================================================================
-
-CREATE TABLE a (aa TEXT);
-CREATE TABLE loct (aa TEXT, bb TEXT);
-CREATE FOREIGN TABLE b (bb TEXT) INHERITS (a)
-  SERVER loopback OPTIONS (table_name 'loct');
-
-INSERT INTO a(aa) VALUES('aaa');
-INSERT INTO a(aa) VALUES('aaaa');
-INSERT INTO a(aa) VALUES('aaaaa');
-
-INSERT INTO b(aa) VALUES('bbb');
-INSERT INTO b(aa) VALUES('bbbb');
-INSERT INTO b(aa) VALUES('bbbbb');
-
-SELECT tableoid::regclass, * FROM a;
-SELECT tableoid::regclass, * FROM b;
-SELECT tableoid::regclass, * FROM ONLY a;
-
-UPDATE a SET aa = 'zzzzzz' WHERE aa LIKE 'aaaa%';
-
-SELECT tableoid::regclass, * FROM a;
-SELECT tableoid::regclass, * FROM b;
-SELECT tableoid::regclass, * FROM ONLY a;
-
-UPDATE b SET aa = 'new';
-
-SELECT tableoid::regclass, * FROM a;
-SELECT tableoid::regclass, * FROM b;
-SELECT tableoid::regclass, * FROM ONLY a;
-
-UPDATE a SET aa = 'newtoo';
-
-SELECT tableoid::regclass, * FROM a;
-SELECT tableoid::regclass, * FROM b;
-SELECT tableoid::regclass, * FROM ONLY a;
-
-DELETE FROM a;
-
-SELECT tableoid::regclass, * FROM a;
-SELECT tableoid::regclass, * FROM b;
-SELECT tableoid::regclass, * FROM ONLY a;
-
-DROP TABLE a CASCADE;
-DROP TABLE loct;
-
--- Check SELECT FOR UPDATE/SHARE with an inherited source table
-create table loct1 (f1 int, f2 int, f3 int);
-create table loct2 (f1 int, f2 int, f3 int);
-
-create table foo (f1 int, f2 int);
-create foreign table foo2 (f3 int) inherits (foo)
-  server loopback options (table_name 'loct1');
-create table bar (f1 int, f2 int);
-create foreign table bar2 (f3 int) inherits (bar)
-  server loopback options (table_name 'loct2');
-
-insert into foo values(1,1);
-insert into foo values(3,3);
-insert into foo2 values(2,2,2);
-insert into foo2 values(4,4,4);
-insert into bar values(1,11);
-insert into bar values(2,22);
-insert into bar values(6,66);
-insert into bar2 values(3,33,33);
-insert into bar2 values(4,44,44);
-insert into bar2 values(7,77,77);
-
-explain (verbose, costs off)
-select * from bar where f1 in (select f1 from foo) for update;
-select * from bar where f1 in (select f1 from foo) for update;
-
-explain (verbose, costs off)
-select * from bar where f1 in (select f1 from foo) for share;
-select * from bar where f1 in (select f1 from foo) for share;
-
--- Check UPDATE with inherited target and an inherited source table
-explain (verbose, costs off)
-update bar set f2 = f2 + 100 where f1 in (select f1 from foo);
-update bar set f2 = f2 + 100 where f1 in (select f1 from foo);
-
-select tableoid::regclass, * from bar order by 1,2;
-
--- Check UPDATE with inherited target and an appendrel subquery
-explain (verbose, costs off)
-update bar set f2 = f2 + 100
-from
-  ( select f1 from foo union all select f1+3 from foo ) ss
-where bar.f1 = ss.f1;
-update bar set f2 = f2 + 100
-from
-  ( select f1 from foo union all select f1+3 from foo ) ss
-where bar.f1 = ss.f1;
-
-select tableoid::regclass, * from bar order by 1,2;
-
--- Test that WHERE CURRENT OF is not supported
-begin;
-declare c cursor for select * from bar where f1 = 7;
-fetch from c;
-update bar set f2 = null where current of c;
-rollback;
-
-drop table foo cascade;
-drop table bar cascade;
-drop table loct1;
-drop table loct2;
-
--- ===================================================================
 -- test IMPORT FOREIGN SCHEMA
 -- ===================================================================
 
@@ -831,3 +849,7 @@ DROP TYPE "Colors" CASCADE;
 IMPORT FOREIGN SCHEMA import_source LIMIT TO (t5)
   FROM SERVER loopback INTO import_dest5;  -- ERROR
 ROLLBACK;
+
+-- Cleanup
+DROP OWNED BY view_owner;
+DROP USER view_owner;
diff --git a/doc/src/sgml/postgres-fdw.sgml b/doc/src/sgml/postgres-fdw.sgml
index 43adb61..4a0159b 100644
--- a/doc/src/sgml/postgres-fdw.sgml
+++ b/doc/src/sgml/postgres-fdw.sgml
@@ -406,11 +406,27 @@
   <title>Remote Query Optimization</title>
 
   <para>
-   <filename>postgres_fdw</> attempts to optimize remote queries to reduce
-   the amount of data transferred from foreign servers.  This is done by
-   sending query <literal>WHERE</> clauses to the remote server for
-   execution, and by not retrieving table columns that are not needed for
-   the current query.  To reduce the risk of misexecution of queries,
+   <filename>postgres_fdw</filename> attempts to optimize remote queries to
+   reduce the amount of data transferred from foreign servers.
+   This is done in several ways.
+  </para>
+
+  <para>
+   For the <literal>SELECT</> clause, <filename>postgres_fdw</filename>
+   retrieves only the columns that the query actually needs.
+  </para>
+
+  <para>
+   If the <literal>FROM</> clause contains multiple foreign tables managed
+   by the same server and accessed as the same user,
+   <filename>postgres_fdw</> tries to join those tables on the remote side
+   whenever it can.
+   To reduce the risk of misexecution of queries, <filename>postgres_fdw</>
+   does not send a join to the remote server when the join conditions might
+   have different semantics on the remote side.
+  </para>
+
+  <para>
    <literal>WHERE</> clauses are not sent to the remote server unless they use
    only built-in data types, operators, and functions.  Operators and
    functions in the clauses must be <literal>IMMUTABLE</> as well.
diff --git a/doc/src/sgml/release-9.4.sgml b/doc/src/sgml/release-9.4.sgml
index 066c8d4..3e15bb6 100644
--- a/doc/src/sgml/release-9.4.sgml
+++ b/doc/src/sgml/release-9.4.sgml
@@ -491,7 +491,7 @@ Branch: REL9_4_STABLE [b337d9657] 2015-01-15 20:52:18 +0200
     <listitem>
      <para>
       Fix incorrect replay of WAL parameter change records that report
-      changes in the <varname>wal_log_hints</> setting (Petr Jalinek)
+      changes in the <varname>wal_log_hints</> setting (Petr Jelinek)
      </para>
     </listitem>
 
diff --git a/src/backend/catalog/index.c b/src/backend/catalog/index.c
index 351dcb2..d8ff554 100644
--- a/src/backend/catalog/index.c
+++ b/src/backend/catalog/index.c
@@ -3316,7 +3316,7 @@ reindex_index(Oid indexId, bool skip_constraint_checks, char persistence)
  * REINDEX_REL_FORCE_INDEXES_UNLOGGED: if true, set the persistence of the
  * rebuilt indexes to unlogged.
  *
- * REINDEX_REL_FORCE_INDEXES_LOGGED: if true, set the persistence of the
+ * REINDEX_REL_FORCE_INDEXES_PERMANENT: if true, set the persistence of the
  * rebuilt indexes to permanent.
  *
  * Returns true if any indexes were rebuilt (including toast table's index
diff --git a/src/backend/commands/alter.c b/src/backend/commands/alter.c
index 3ddd7ec..d28758c 100644
--- a/src/backend/commands/alter.c
+++ b/src/backend/commands/alter.c
@@ -446,7 +446,6 @@ ExecAlterObjectSchemaStmt(AlterObjectSchemaStmt *stmt,
 				Relation	relation;
 				Oid			classId;
 				Oid			nspOid;
-				ObjectAddress address;
 
 				address = get_object_address(stmt->objectType,
 											 stmt->object,
diff --git a/src/backend/commands/event_trigger.c b/src/backend/commands/event_trigger.c
index 0026e53..f07fd06 100644
--- a/src/backend/commands/event_trigger.c
+++ b/src/backend/commands/event_trigger.c
@@ -1416,7 +1416,7 @@ pg_event_trigger_dropped_objects(PG_FUNCTION_ARGS)
 	if (!currentEventTriggerState ||
 		!currentEventTriggerState->in_sql_drop)
 		ereport(ERROR,
-				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+				(errcode(ERRCODE_E_R_I_E_EVENT_TRIGGER_PROTOCOL_VIOLATED),
 		 errmsg("%s can only be called in a sql_drop event trigger function",
 				"pg_event_trigger_dropped_objects()")));
 
@@ -1536,7 +1536,7 @@ pg_event_trigger_table_rewrite_oid(PG_FUNCTION_ARGS)
 	if (!currentEventTriggerState ||
 		currentEventTriggerState->table_rewrite_oid == InvalidOid)
 		ereport(ERROR,
-				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+				(errcode(ERRCODE_E_R_I_E_EVENT_TRIGGER_PROTOCOL_VIOLATED),
 		 errmsg("%s can only be called in a table_rewrite event trigger function",
 				"pg_event_trigger_table_rewrite_oid()")));
 
@@ -1557,7 +1557,7 @@ pg_event_trigger_table_rewrite_reason(PG_FUNCTION_ARGS)
 	if (!currentEventTriggerState ||
 		currentEventTriggerState->table_rewrite_reason == 0)
 		ereport(ERROR,
-				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+				(errcode(ERRCODE_E_R_I_E_EVENT_TRIGGER_PROTOCOL_VIOLATED),
 		 errmsg("%s can only be called in a table_rewrite event trigger function",
 				"pg_event_trigger_table_rewrite_reason()")));
 
diff --git a/src/backend/postmaster/autovacuum.c b/src/backend/postmaster/autovacuum.c
index db46134..be4cd1d 100644
--- a/src/backend/postmaster/autovacuum.c
+++ b/src/backend/postmaster/autovacuum.c
@@ -507,6 +507,10 @@ AutoVacLauncherMain(int argc, char *argv[])
 		/* Now we can allow interrupts again */
 		RESUME_INTERRUPTS();
 
+		/* if in shutdown mode, no need for anything further; just go away */
+		if (got_SIGTERM)
+			goto shutdown;
+
 		/*
 		 * Sleep at least 1 second after any error.  We don't want to be
 		 * filling the error logs as fast as we can.
@@ -542,10 +546,14 @@ AutoVacLauncherMain(int argc, char *argv[])
 	SetConfigOption("default_transaction_isolation", "read committed",
 					PGC_SUSET, PGC_S_OVERRIDE);
 
-	/* in emergency mode, just start a worker and go away */
+	/*
+	 * In emergency mode, just start a worker (unless shutdown was requested)
+	 * and go away.
+	 */
 	if (!AutoVacuumingActive())
 	{
-		do_start_worker();
+		if (!got_SIGTERM)
+			do_start_worker();
 		proc_exit(0);			/* done */
 	}
 
@@ -560,7 +568,8 @@ AutoVacLauncherMain(int argc, char *argv[])
 	 */
 	rebuild_database_list(InvalidOid);
 
-	for (;;)
+	/* loop until shutdown request */
+	while (!got_SIGTERM)
 	{
 		struct timeval nap;
 		TimestampTz current_time = 0;
@@ -758,6 +767,7 @@ AutoVacLauncherMain(int argc, char *argv[])
 	}
 
 	/* Normal exit from the autovac launcher is here */
+shutdown:
 	ereport(LOG,
 			(errmsg("autovacuum launcher shutting down")));
 	AutoVacuumShmem->av_launcherpid = 0;
diff --git a/src/backend/utils/adt/varlena.c b/src/backend/utils/adt/varlena.c
index d874a1a..18cbdd0 100644
--- a/src/backend/utils/adt/varlena.c
+++ b/src/backend/utils/adt/varlena.c
@@ -34,9 +34,6 @@
 #include "utils/pg_locale.h"
 #include "utils/sortsupport.h"
 
-#ifdef DEBUG_ABBREV_KEYS
-#define DEBUG_elog_output	DEBUG1
-#endif
 
 /* GUC variable */
 int			bytea_output = BYTEA_OUTPUT_HEX;
@@ -2149,11 +2146,13 @@ bttext_abbrev_abort(int memtupcount, SortSupport ssup)
 	 * time there are differences within full key strings not captured in
 	 * abbreviations.
 	 */
-#ifdef DEBUG_ABBREV_KEYS
+#ifdef TRACE_SORT
+	if (trace_sort)
 	{
 		double norm_abbrev_card = abbrev_distinct / (double) memtupcount;
 
-		elog(DEBUG_elog_output, "abbrev_distinct after %d: %f (key_distinct: %f, norm_abbrev_card: %f, prop_card: %f)",
+		elog(LOG, "bttext_abbrev: abbrev_distinct after %d: %f "
+			 "(key_distinct: %f, norm_abbrev_card: %f, prop_card: %f)",
 			 memtupcount, abbrev_distinct, key_distinct, norm_abbrev_card,
 			 tss->prop_card);
 	}
@@ -2219,11 +2218,11 @@ bttext_abbrev_abort(int memtupcount, SortSupport ssup)
 	 * of moderately high to high abbreviated cardinality.  There is little to
 	 * lose but much to gain, which our strategy reflects.
 	 */
-#ifdef DEBUG_ABBREV_KEYS
-	elog(DEBUG_elog_output, "would have aborted abbreviation due to worst-case at %d. abbrev_distinct: %f, key_distinct: %f, prop_card: %f",
-		 memtupcount, abbrev_distinct, key_distinct, tss->prop_card);
-	/* Actually abort only when debugging is disabled */
-	return false;
+#ifdef TRACE_SORT
+	if (trace_sort)
+		elog(LOG, "bttext_abbrev: aborted abbreviation at %d "
+			 "(abbrev_distinct: %f, key_distinct: %f, prop_card: %f)",
+			 memtupcount, abbrev_distinct, key_distinct, tss->prop_card);
 #endif
 
 	return true;
diff --git a/src/backend/utils/errcodes.txt b/src/backend/utils/errcodes.txt
index 6a113b8..6cc3ed9 100644
--- a/src/backend/utils/errcodes.txt
+++ b/src/backend/utils/errcodes.txt
@@ -278,6 +278,7 @@ Section: Class 39 - External Routine Invocation Exception
 39004    E    ERRCODE_E_R_I_E_NULL_VALUE_NOT_ALLOWED                         null_value_not_allowed
 39P01    E    ERRCODE_E_R_I_E_TRIGGER_PROTOCOL_VIOLATED                      trigger_protocol_violated
 39P02    E    ERRCODE_E_R_I_E_SRF_PROTOCOL_VIOLATED                          srf_protocol_violated
+39P03    E    ERRCODE_E_R_I_E_EVENT_TRIGGER_PROTOCOL_VIOLATED                event_trigger_protocol_violated
 
 Section: Class 3B - Savepoint Exception
 
diff --git a/src/bin/pg_rewind/nls.mk b/src/bin/pg_rewind/nls.mk
index e43f3b9..6561226 100644
--- a/src/bin/pg_rewind/nls.mk
+++ b/src/bin/pg_rewind/nls.mk
@@ -1,9 +1,9 @@
 # src/bin/pg_rewind/nls.mk
 CATALOG_NAME     = pg_rewind
 AVAIL_LANGUAGES  =
-GETTEXT_FILES    = copy_fetch.c datapagemap.c fetch.c filemap.c libpq_fetch.c logging.c parsexlog.c pg_rewind.c timeline.c ../../common/fe_memutils.c ../../../src/backend/access/transam/xlogreader.c
+GETTEXT_FILES    = copy_fetch.c datapagemap.c fetch.c file_ops.c filemap.c libpq_fetch.c logging.c parsexlog.c pg_rewind.c timeline.c ../../common/fe_memutils.c ../../common/restricted_token.c ../../../src/backend/access/transam/xlogreader.c
 
-GETTEXT_TRIGGERS = pg_log pg_fatal report_invalid_record:2
+GETTEXT_TRIGGERS = pg_log:2 pg_fatal report_invalid_record:2
 GETTEXT_FLAGS    = pg_log:2:c-format \
     pg_fatal:1:c-format \
     report_invalid_record:2:c-format
diff --git a/src/bin/pg_rewind/parsexlog.c b/src/bin/pg_rewind/parsexlog.c
index 3cf96ab..715aaab 100644
--- a/src/bin/pg_rewind/parsexlog.c
+++ b/src/bin/pg_rewind/parsexlog.c
@@ -229,7 +229,7 @@ findLastCheckpoint(const char *datadir, XLogRecPtr forkptr, TimeLineID tli,
 }
 
 /* XLogreader callback function, to read a WAL page */
-int
+static int
 SimpleXLogPageRead(XLogReaderState *xlogreader, XLogRecPtr targetPagePtr,
 				   int reqLen, XLogRecPtr targetRecPtr, char *readBuf,
 				   TimeLineID *pageTLI)
diff --git a/src/bin/pg_rewind/pg_rewind.c b/src/bin/pg_rewind/pg_rewind.c
index dda3a79..93341a3 100644
--- a/src/bin/pg_rewind/pg_rewind.c
+++ b/src/bin/pg_rewind/pg_rewind.c
@@ -24,6 +24,7 @@
 #include "access/xlog_internal.h"
 #include "catalog/catversion.h"
 #include "catalog/pg_control.h"
+#include "common/restricted_token.h"
 #include "getopt_long.h"
 #include "storage/bufpage.h"
 
@@ -102,6 +103,7 @@ main(int argc, char **argv)
 	TimeLineID	endtli;
 	ControlFileData ControlFile_new;
 
+	set_pglocale_pgservice(argv[0], PG_TEXTDOMAIN("pg_rewind"));
 	progname = get_progname(argv[0]);
 
 	/* Process command-line arguments */
@@ -155,25 +157,40 @@ main(int argc, char **argv)
 	/* No source given? Show usage */
 	if (datadir_source == NULL && connstr_source == NULL)
 	{
-		pg_fatal("no source specified (--source-pgdata or --source-server)\n");
-		pg_fatal("Try \"%s --help\" for more information.\n", progname);
+		fprintf(stderr, _("no source specified (--source-pgdata or --source-server)\n"));
+		fprintf(stderr, _("Try \"%s --help\" for more information.\n"), progname);
 		exit(1);
 	}
 
 	if (datadir_target == NULL)
 	{
-		pg_fatal("no target data directory specified (--target-pgdata)\n");
+		fprintf(stderr, _("no target data directory specified (--target-pgdata)\n"));
 		fprintf(stderr, _("Try \"%s --help\" for more information.\n"), progname);
 		exit(1);
 	}
 
 	if (argc != optind)
 	{
-		pg_fatal("%s: invalid arguments\n", progname);
+		fprintf(stderr, _("invalid arguments\n"));
 		fprintf(stderr, _("Try \"%s --help\" for more information.\n"), progname);
 		exit(1);
 	}
 
+	/*
+	 * Don't allow pg_rewind to be run as root, to avoid overwriting the
+	 * ownership of files in the data directory. We need only check for root
+	 * -- any other user won't have sufficient permissions to modify files in
+	 * the data directory.
+	 */
+#ifndef WIN32
+	if (geteuid() == 0)
+		pg_fatal("cannot be executed by \"root\"\n"
+				 "You must run %s as the PostgreSQL superuser.\n",
+				 progname);
+#endif
+
+	get_restricted_token(progname);
+
 	/* Connect to remote server */
 	if (connstr_source)
 		libpqConnect(connstr_source);
diff --git a/src/interfaces/libpq/fe-auth.c b/src/interfaces/libpq/fe-auth.c
index 8927df4..08cc906 100644
--- a/src/interfaces/libpq/fe-auth.c
+++ b/src/interfaces/libpq/fe-auth.c
@@ -236,10 +236,10 @@ pg_SSPI_error(PGconn *conn, const char *mprefix, SECURITY_STATUS r)
 
 	if (FormatMessage(FORMAT_MESSAGE_FROM_SYSTEM, NULL, r, 0,
 					  sysmsg, sizeof(sysmsg), NULL) == 0)
-		printfPQExpBuffer(&conn->errorMessage, "%s: SSPI error %x",
+		printfPQExpBuffer(&conn->errorMessage, "%s: SSPI error %x\n",
 						  mprefix, (unsigned int) r);
 	else
-		printfPQExpBuffer(&conn->errorMessage, "%s: %s (%x)",
+		printfPQExpBuffer(&conn->errorMessage, "%s: %s (%x)\n",
 						  mprefix, sysmsg, (unsigned int) r);
 }
 
diff --git a/src/interfaces/libpq/fe-connect.c b/src/interfaces/libpq/fe-connect.c
index fa8a33f..e7c7a25 100644
--- a/src/interfaces/libpq/fe-connect.c
+++ b/src/interfaces/libpq/fe-connect.c
@@ -4061,6 +4061,16 @@ parseServiceFile(const char *serviceFile,
 				}
 				*val++ = '\0';
 
+				if (strcmp(key, "service") == 0)
+				{
+					printfPQExpBuffer(errorMessage,
+									  libpq_gettext("nested service specifications not supported in service file \"%s\", line %d\n"),
+									  serviceFile,
+									  linenr);
+					fclose(f);
+					return 3;
+				}
+
 				/*
 				 * Set the parameter --- but don't override any previous
 				 * explicit setting.

#22

Kouhei Kaigai

kaigai@ak.jp.nec.com

almost 11 years ago

In reply to: Shigeru HANADA (#21)

On 2015/04/09 10:48, Kouhei Kaigai <kaigai@ak.jp.nec.com> wrote:
* merge_fpinfo()

It seems to me fpinfo->rows should be joinrel->rows, and
fpinfo->width should also be joinrel->width.
There is no need for special intelligence here, is there?

Oops. They are vestiges of my struggle, during which I disabled the SELECT clause
optimization (omitting unused columns). Now width and rows are inherited from joinrel.

Besides that, fdw_startup_cost and fdw_tuple_cost seemed wrong, so I fixed them to
use a simple sum, not an average.

fpinfo->fdw_startup_cost represents the cost of opening a connection to the remote
PostgreSQL server, doesn't it?

The comment at postgres_fdw.c:1757 says:

/*
* Add some additional cost factors to account for connection overhead
* (fdw_startup_cost), transferring data across the network
* (fdw_tuple_cost per retrieved row), and local manipulation of the data
* (cpu_tuple_cost per retrieved row).
*/

If so, does a ForeignScan that involves 100 underlying relations perform 100
times as many heavy network operations on startup? Probably not.
I think an average is better than a sum, and the maximum of them would reflect
the cost more accurately.

In my current opinion, no. Though I remember that I've written such comments
before :P.

Connection establishment occurs only once for the very first access to the server,
so in the use cases with long-lived session (via psql, connection pooling, etc.),
taking connection overhead into account *every time* seems too pessimistic.

Instead, for practical cases, fdw_startup_cost should cover the overhead of
constructing the query and getting its first response (ideally excluding the
retrieval of actual data). These overheads are on the order of milliseconds. I'm
not sure what value is appropriate for the default, but 100 doesn't seem bad.

Anyway, fdw_startup_cost is a per-server setting, just like fdw_tuple_cost, and it
should not be adjusted according to the width of the result, so using
fpinfo_o->fdw_startup_cost would be OK.

Indeed, I forgot the connection cache mechanism. As long as we define
fdw_startup_cost as you mentioned, it seems to me your logic is heuristically
reasonable.

Also, fdw_tuple_cost represents the cost of data transfer over the network.
I think a weighted average is the best strategy, like:

    fpinfo->fdw_tuple_cost =
        (fpinfo_o->width / (fpinfo_o->width + fpinfo_i->width)) * fpinfo_o->fdw_tuple_cost +
        (fpinfo_i->width / (fpinfo_o->width + fpinfo_i->width)) * fpinfo_i->fdw_tuple_cost;

That's just my suggestion. Please apply whichever way you think best.

I can't agree with that strategy, because 1) a width of 0 causes a per-tuple cost of 0,
and 2) fdw_tuple_cost never varies within a foreign server. Using fpinfo_o->fdw_tuple_cost
(it must be identical to fpinfo_i->fdw_tuple_cost) seems reasonable. Thoughts?

OK, you are right.
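The two objections can be checked with a small standalone sketch. The `RelCost` struct and both helper functions below are hypothetical stand-ins rather than postgres_fdw code; they model only the width and fdw_tuple_cost fields discussed above, and the "return 0 when both widths are 0" branch is an assumption added to avoid dividing by zero.

```c
#include <assert.h>

/* Hypothetical stand-in for the two PgFdwRelationInfo fields discussed above. */
typedef struct RelCost
{
    int     width;           /* estimated row width in bytes */
    double  fdw_tuple_cost;  /* per-server, per-tuple transfer cost */
} RelCost;

/* Width-weighted average of the two per-tuple costs (the suggested strategy). */
static double
weighted_tuple_cost(const RelCost *outer, const RelCost *inner)
{
    double  total = (double) outer->width + (double) inner->width;

    /* Assumption: with both widths at 0 the weights are undefined; report 0. */
    if (total == 0.0)
        return 0.0;

    return (outer->width / total) * outer->fdw_tuple_cost +
           (inner->width / total) * inner->fdw_tuple_cost;
}

/* Alternative from the reply: both sides share one server, so reuse its cost. */
static double
inherited_tuple_cost(const RelCost *outer, const RelCost *inner)
{
    /* Only relations on the same server are joined remotely. */
    assert(outer->fdw_tuple_cost == inner->fdw_tuple_cost);
    return outer->fdw_tuple_cost;
}
```

When both sides carry the same fdw_tuple_cost, the weighted average only reproduces that value (up to rounding), so the weighting adds nothing; when the widths are 0 it loses the cost entirely, whereas the inherited value is unaffected.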

I think it is time to hand the patch over to the committers for review.
So, let me mark it "ready for committers".

Thanks,
--
NEC Business Creation Division / PG-Strom Project
KaiGai Kohei <kaigai@ak.jp.nec.com>

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


#23

Shigeru HANADA

shigeru.hanada@gmail.com

almost 11 years ago

In reply to: Kouhei Kaigai (#22)

2 attachment(s)

Hi Kaigai-san,

Thanks for the further review. I found two bugs in the v10 patch, so
I've fixed them and wrapped up the v11 patch here.

* Fix bug about illegal column order

A scan against a base relation returns columns in column-definition order, but the target list might require a different order. In the usual case this is resolved by tuple projection, but join push-down skips that step, so the remote query itself has to produce the required order.

Before this fix, deparseProjectionSql() was called only for queries that have ctid or a whole-row reference in their target list, but that was over-optimization. We always need to call it, because an ordinary column list might also need reordering. Checking whether the order already matches is possible, but it does not seem worth the effort.

Another way to resolve this issue is to reorder the SELECT clause of the base relation's query when it is the source of a join, but that requires stepping back in planning, so I chose the fix above.

"three tables join" test case is also changed to check this behavior.
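As a rough illustration of the reordering idea, the following standalone sketch wraps a base-relation query in an outer SELECT that lists columns in target-list order. `build_projection` and its parameters are hypothetical; this is not the patch's deparseProjectionSql(), whose actual output format is not shown in this message.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: emit "SELECT a.x, a.y FROM (subquery) a".
 * cols[] is in column-definition order; order[] holds the indexes into
 * cols[] in the order the target list wants them.
 * Returns 0 on success, -1 if buf is too small.
 */
static int
build_projection(char *buf, size_t buflen, const char *subquery,
                 const char *alias, const char *const *cols,
                 const int *order, int norder)
{
    size_t  used;
    int     n;

    assert(norder > 0);

    n = snprintf(buf, buflen, "SELECT ");
    if (n < 0 || (size_t) n >= buflen)
        return -1;
    used = (size_t) n;

    /* Emit the columns in the order requested by the target list. */
    for (int i = 0; i < norder; i++)
    {
        n = snprintf(buf + used, buflen - used, "%s%s.%s",
                     i > 0 ? ", " : "", alias, cols[order[i]]);
        if (n < 0 || (size_t) n >= buflen - used)
            return -1;
        used += (size_t) n;
    }

    /* The base-relation query becomes an aliased subquery. */
    n = snprintf(buf + used, buflen - used, " FROM (%s) %s", subquery, alias);
    if (n < 0 || (size_t) n >= buflen - used)
        return -1;
    return 0;
}
```

Calling it with columns {"c1", "c2", "c3"} and order {2, 0} yields an outer list of c3 followed by c1, regardless of the order in which the inner query returns them.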

* Fix bug of duplicate fdw_ps_tlist contents.

I had coded deparseSelectSql to pass fdw_ps_tlist to its recursive calls for the underlying RelOptInfos, but that causes redundant entries in fdw_ps_tlist when more than two foreign tables are joined. I changed it to pass NULL as fdw_ps_tlist for the recursive calls of deparseSelectSql.
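The effect can be modelled with a toy join tree. Everything in this sketch (`Rel`, `collect_tlist`, the rule that only join relations append entries) is an assumption made for illustration, not code from the patch; it only shows why sharing one list across recursion levels duplicates entries once there are two or more join levels.

```c
#include <assert.h>
#include <stddef.h>

/* Toy relation: a base relation has no children, a join has both. */
typedef struct Rel
{
    const struct Rel *outer;   /* NULL for a base relation */
    const struct Rel *inner;   /* NULL for a base relation */
    int               ncols;   /* number of output columns */
} Rel;

/*
 * Walk the join tree the way a recursive deparser would.  Each join level
 * appends its output columns to *tlist_len when tlist_len is not NULL.
 * share_with_children != 0 models passing the list down to the children;
 * share_with_children == 0 models passing NULL down, as in the fix.
 */
static void
collect_tlist(const Rel *rel, int *tlist_len, int share_with_children)
{
    assert(rel != NULL);

    if (rel->outer == NULL || rel->inner == NULL)
        return;                 /* base relation: nothing to append */

    collect_tlist(rel->outer, share_with_children ? tlist_len : NULL,
                  share_with_children);
    collect_tlist(rel->inner, share_with_children ? tlist_len : NULL,
                  share_with_children);

    if (tlist_len != NULL)
        *tlist_len += rel->ncols;
}
```

With three one-column tables joined as (ft1 JOIN ft2) JOIN ft3, sharing the list gives 2 + 3 = 5 entries, while passing NULL to the children gives the expected 3; a two-table join gives 2 either way, which matches the report that only joins of more than two tables were affected.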

* Fix typos

Please review the v11 patch, and mark it as “ready for committer” if it’s ok.

In addition to the essential features, I tried to implement relation listing in EXPLAIN output.

The attached explain_forein_join.patch adds the capability to show the join combination of a ForeignScan in EXPLAIN output as an additional item “Relations”. I thought that using an array to list relations would be a good way too, but I chose a single string value because users would also like to know the order and types of the joins.
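To show why a single string can carry both join order and join type where a flat array of names cannot, here is a standalone sketch that renders a toy join tree. The `JoinNode` struct, `describe_relations`, and the parenthesized output format are all hypothetical; the actual format produced by the attached patch is not shown in this message.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Toy join tree node: a leaf has relname set, a join has jointype set. */
typedef struct JoinNode
{
    const char            *relname;   /* leaf only */
    const char            *jointype;  /* join only, e.g. "INNER" or "LEFT" */
    const struct JoinNode *outer;     /* NULL for a leaf */
    const struct JoinNode *inner;     /* NULL for a leaf */
} JoinNode;

/*
 * Render the tree as "(outer) TYPE JOIN (inner)", recursively.
 * Returns the number of characters written, or -1 if buf is too small.
 */
static int
describe_relations(const JoinNode *node, char *buf, size_t buflen)
{
    int     used;
    int     n;

    assert(node != NULL);

    if (node->outer == NULL || node->inner == NULL)
    {
        n = snprintf(buf, buflen, "%s", node->relname);
        return (n < 0 || (size_t) n >= buflen) ? -1 : n;
    }

    if (buflen < 2)
        return -1;
    buf[0] = '(';
    used = 1;

    n = describe_relations(node->outer, buf + used, buflen - used);
    if (n < 0)
        return -1;
    used += n;

    n = snprintf(buf + used, buflen - used, ") %s JOIN (", node->jointype);
    if (n < 0 || (size_t) n >= buflen - used)
        return -1;
    used += n;

    n = describe_relations(node->inner, buf + used, buflen - used);
    if (n < 0)
        return -1;
    used += n;

    n = snprintf(buf + used, buflen - used, ")");
    if (n < 0 || (size_t) n >= buflen - used)
        return -1;
    return used + n;
}
```

For a tree in which ft1 is left-joined to ft2 and the result is inner-joined to ft3, the nesting of parentheses records which join happens first, which an array of the three names alone would not.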

--
Shigeru HANADA
shigeru.hanada@gmail.com

Attachments:

foreign_join_v11.patch (application/octet-stream)

diff --git a/contrib/postgres_fdw/deparse.c b/contrib/postgres_fdw/deparse.c
index 94fab18..7e27313 100644
--- a/contrib/postgres_fdw/deparse.c
+++ b/contrib/postgres_fdw/deparse.c
@@ -44,8 +44,11 @@
 #include "catalog/pg_proc.h"
 #include "catalog/pg_type.h"
 #include "commands/defrem.h"
+#include "nodes/makefuncs.h"
 #include "nodes/nodeFuncs.h"
+#include "nodes/plannodes.h"
 #include "optimizer/clauses.h"
+#include "optimizer/prep.h"
 #include "optimizer/var.h"
 #include "parser/parsetree.h"
 #include "utils/builtins.h"
@@ -89,6 +92,8 @@ typedef struct deparse_expr_cxt
 	RelOptInfo *foreignrel;		/* the foreign relation we are planning for */
 	StringInfo	buf;			/* output buffer to append to */
 	List	  **params_list;	/* exprs that will become remote Params */
+	List	   *outertlist;		/* outer child's target list */
+	List	   *innertlist;		/* inner child's target list */
 } deparse_expr_cxt;
 
 /*
@@ -137,12 +142,19 @@ static void printRemoteParam(int paramindex, Oid paramtype, int32 paramtypmod,
 static void printRemotePlaceholder(Oid paramtype, int32 paramtypmod,
 					   deparse_expr_cxt *context);
 
+/*
+ * convert absolute attnum to relative one.  This would be handy for handling
+ * attnum for attrs_used and column aliases.
+ */
+#define TO_RELATIVE(x)	((x) - FirstLowInvalidHeapAttributeNumber)
+
 
 /*
  * Examine each qual clause in input_conds, and classify them into two groups,
  * which are returned as two lists:
  *	- remote_conds contains expressions that can be evaluated remotely
  *	- local_conds contains expressions that can't be evaluated remotely
+ * Note that each element is an Expr that has been stripped of its RestrictInfo.
  */
 void
 classifyConditions(PlannerInfo *root,
@@ -161,9 +173,9 @@ classifyConditions(PlannerInfo *root,
 		RestrictInfo *ri = (RestrictInfo *) lfirst(lc);
 
 		if (is_foreign_expr(root, baserel, ri->clause))
-			*remote_conds = lappend(*remote_conds, ri);
+			*remote_conds = lappend(*remote_conds, ri->clause);
 		else
-			*local_conds = lappend(*local_conds, ri);
+			*local_conds = lappend(*local_conds, ri->clause);
 	}
 }
 
@@ -250,7 +262,7 @@ foreign_expr_walker(Node *node,
 				 * Param's collation, ie it's not safe for it to have a
 				 * non-default collation.
 				 */
-				if (var->varno == glob_cxt->foreignrel->relid &&
+				if (bms_is_member(var->varno, glob_cxt->foreignrel->relids) &&
 					var->varlevelsup == 0)
 				{
 					/* Var belongs to foreign table */
@@ -681,12 +693,61 @@ deparseSelectSql(StringInfo buf,
 				 PlannerInfo *root,
 				 RelOptInfo *baserel,
 				 Bitmapset *attrs_used,
+				 List *remote_conds,
+				 List **params_list,
+				 List **fdw_ps_tlist,
 				 List **retrieved_attrs)
 {
+	PgFdwRelationInfo  *fpinfo = (PgFdwRelationInfo *) baserel->fdw_private;
 	RangeTblEntry *rte = planner_rt_fetch(baserel->relid, root);
 	Relation	rel;
 
 	/*
+	 * If given relation was a join relation, recursively construct statement
+	 * by putting each outer and inner relations in FROM clause as a subquery
+	 * with aliasing.
+	 */
+	if (baserel->reloptkind == RELOPT_JOINREL)
+	{
+		RelOptInfo		   *rel_o = fpinfo->outerrel;
+		RelOptInfo		   *rel_i = fpinfo->innerrel;
+		PgFdwRelationInfo  *fpinfo_o = (PgFdwRelationInfo *) rel_o->fdw_private;
+		PgFdwRelationInfo  *fpinfo_i = (PgFdwRelationInfo *) rel_i->fdw_private;
+		StringInfoData		sql_o;
+		StringInfoData		sql_i;
+		List			   *ret_attrs_tmp;	/* not used */
+
+		/*
+		 * Deparse query for outer and inner relation, and combine them into
+		 * a query.
+		 *
+		 * Here we don't pass fdw_ps_tlist because targets of underlying
+		 * relations are already put in joinrel->reltargetlist, and
+		 * deparseJoinRel() takes all care about it.
+		 */
+		initStringInfo(&sql_o);
+		deparseSelectSql(&sql_o, root, rel_o, fpinfo_o->attrs_used,
+						 fpinfo_o->remote_conds, params_list,
+						 NULL, &ret_attrs_tmp);
+		initStringInfo(&sql_i);
+		deparseSelectSql(&sql_i, root, rel_i, fpinfo_i->attrs_used,
+						 fpinfo_i->remote_conds, params_list,
+						 NULL, &ret_attrs_tmp);
+
+		deparseJoinSql(buf, root, baserel,
+					   fpinfo->outerrel,
+					   fpinfo->innerrel,
+					   sql_o.data,
+					   sql_i.data,
+					   fpinfo->jointype,
+					   fpinfo->joinclauses,
+					   fpinfo->otherclauses,
+					   fdw_ps_tlist,
+					   retrieved_attrs);
+		return;
+	}
+
+	/*
 	 * Core code already has some lock on each rel being planned, so we can
 	 * use NoLock here.
 	 */
@@ -705,6 +766,65 @@ deparseSelectSql(StringInfo buf,
 	appendStringInfoString(buf, " FROM ");
 	deparseRelation(buf, rel);
 
+	/*
+	 * Construct WHERE clause
+	 */
+	if (remote_conds)
+		appendConditions(buf, root, baserel, NULL, NULL, remote_conds,
+						 " WHERE ", params_list);
+
+	/*
+	 * Add FOR UPDATE/SHARE if appropriate.  We apply locking during the
+	 * initial row fetch, rather than later on as is done for local tables.
+	 * The extra roundtrips involved in trying to duplicate the local
+	 * semantics exactly don't seem worthwhile (see also comments for
+	 * RowMarkType).
+	 *
+	 * Note: because we actually run the query as a cursor, this assumes
+	 * that DECLARE CURSOR ... FOR UPDATE is supported, which it isn't
+	 * before 8.3.
+	 */
+	if (baserel->relid == root->parse->resultRelation &&
+		(root->parse->commandType == CMD_UPDATE ||
+		 root->parse->commandType == CMD_DELETE))
+	{
+		/* Relation is UPDATE/DELETE target, so use FOR UPDATE */
+		appendStringInfoString(buf, " FOR UPDATE");
+	}
+	else
+	{
+		PlanRowMark *rc = get_plan_rowmark(root->rowMarks, baserel->relid);
+
+		if (rc)
+		{
+			/*
+			 * Relation is specified as a FOR UPDATE/SHARE target, so handle
+			 * that.  (But we could also see LCS_NONE, meaning this isn't a
+			 * target relation after all.)
+			 *
+			 * For now, just ignore any [NO] KEY specification, since (a)
+			 * it's not clear what that means for a remote table that we
+			 * don't have complete information about, and (b) it wouldn't
+			 * work anyway on older remote servers.  Likewise, we don't
+			 * worry about NOWAIT.
+			 */
+			switch (rc->strength)
+			{
+				case LCS_NONE:
+					/* No locking needed */
+					break;
+				case LCS_FORKEYSHARE:
+				case LCS_FORSHARE:
+					appendStringInfoString(buf, " FOR SHARE");
+					break;
+				case LCS_FORNOKEYUPDATE:
+				case LCS_FORUPDATE:
+					appendStringInfoString(buf, " FOR UPDATE");
+					break;
+			}
+		}
+	}
+
 	heap_close(rel, NoLock);
 }
 
@@ -731,8 +851,7 @@ deparseTargetList(StringInfo buf,
 	*retrieved_attrs = NIL;
 
 	/* If there's a whole-row reference, we'll need all the columns. */
-	have_wholerow = bms_is_member(0 - FirstLowInvalidHeapAttributeNumber,
-								  attrs_used);
+	have_wholerow = bms_is_member(TO_RELATIVE(0), attrs_used);
 
 	first = true;
 	for (i = 1; i <= tupdesc->natts; i++)
@@ -743,15 +862,14 @@ deparseTargetList(StringInfo buf,
 		if (attr->attisdropped)
 			continue;
 
-		if (have_wholerow ||
-			bms_is_member(i - FirstLowInvalidHeapAttributeNumber,
-						  attrs_used))
+		if (have_wholerow || bms_is_member(TO_RELATIVE(i), attrs_used))
 		{
 			if (!first)
 				appendStringInfoString(buf, ", ");
 			first = false;
 
 			deparseColumnRef(buf, rtindex, i, root);
+			appendStringInfo(buf, " a%d", TO_RELATIVE(i));
 
 			*retrieved_attrs = lappend_int(*retrieved_attrs, i);
 		}
@@ -761,17 +879,17 @@ deparseTargetList(StringInfo buf,
 	 * Add ctid if needed.  We currently don't support retrieving any other
 	 * system columns.
 	 */
-	if (bms_is_member(SelfItemPointerAttributeNumber - FirstLowInvalidHeapAttributeNumber,
-					  attrs_used))
+	if (bms_is_member(TO_RELATIVE(SelfItemPointerAttributeNumber), attrs_used))
 	{
 		if (!first)
 			appendStringInfoString(buf, ", ");
 		first = false;
 
-		appendStringInfoString(buf, "ctid");
+		appendStringInfo(buf, "ctid a%d",
+						 TO_RELATIVE(SelfItemPointerAttributeNumber));
 
 		*retrieved_attrs = lappend_int(*retrieved_attrs,
-									   SelfItemPointerAttributeNumber);
+										   SelfItemPointerAttributeNumber);
 	}
 
 	/* Don't generate bad syntax if no undropped columns */
@@ -780,7 +898,8 @@ deparseTargetList(StringInfo buf,
 }
 
 /*
- * Deparse WHERE clauses in given list of RestrictInfos and append them to buf.
+ * Deparse conditions, such as a WHERE clause or the ON clause of a JOIN, from
+ * the given list of Exprs and append them to buf.
  *
  * baserel is the foreign table we're planning for.
  *
@@ -794,12 +913,14 @@ deparseTargetList(StringInfo buf,
  * so Params and other-relation Vars should be replaced by dummy values.
  */
 void
-appendWhereClause(StringInfo buf,
-				  PlannerInfo *root,
-				  RelOptInfo *baserel,
-				  List *exprs,
-				  bool is_first,
-				  List **params)
+appendConditions(StringInfo buf,
+				 PlannerInfo *root,
+				 RelOptInfo *baserel,
+				 List *outertlist,
+				 List *innertlist,
+				 List *exprs,
+				 const char *prefix,
+				 List **params)
 {
 	deparse_expr_cxt context;
 	int			nestlevel;
@@ -813,31 +934,310 @@ appendWhereClause(StringInfo buf,
 	context.foreignrel = baserel;
 	context.buf = buf;
 	context.params_list = params;
+	context.outertlist = outertlist;
+	context.innertlist = innertlist;
 
 	/* Make sure any constants in the exprs are printed portably */
 	nestlevel = set_transmission_modes();
 
 	foreach(lc, exprs)
 	{
-		RestrictInfo *ri = (RestrictInfo *) lfirst(lc);
+		Expr	   *expr = (Expr *) lfirst(lc);
 
 		/* Connect expressions with "AND" and parenthesize each condition. */
-		if (is_first)
-			appendStringInfoString(buf, " WHERE ");
-		else
-			appendStringInfoString(buf, " AND ");
+		if (prefix)
+			appendStringInfo(buf, "%s", prefix);
 
 		appendStringInfoChar(buf, '(');
-		deparseExpr(ri->clause, &context);
+		deparseExpr(expr, &context);
 		appendStringInfoChar(buf, ')');
 
-		is_first = false;
+		prefix = " AND ";
 	}
 
 	reset_transmission_modes(nestlevel);
 }
 
 /*
+ * Returns the 1-based position of the given Var in the given target list, or
+ * 0 if it is not found.
+ */
+static int
+find_var_pos(Var *node, List *tlist)
+{
+	int		pos = 1;
+	ListCell *lc;
+
+	foreach(lc, tlist)
+	{
+		Var *var = (Var *) lfirst(lc);
+
+		if (equal(var, node))
+		{
+			return pos;
+		}
+		pos++;
+	}
+
+	return 0;
+}
+
+/*
+ * Deparse given Var into buf.
+ */
+static void
+deparseJoinVar(Var *node, deparse_expr_cxt *context)
+{
+	char		side;
+	int			pos;
+
+	pos = find_var_pos(node, context->outertlist);
+	if (pos > 0)
+		side = 'l';
+	else
+	{
+		side = 'r';
+		pos = find_var_pos(node, context->innertlist);
+	}
+	Assert(pos > 0);
+	Assert(side == 'l' || side == 'r');
+
+	/*
+	 * We treat a whole-row reference the same as an ordinary attribute
+	 * reference, because that transformation is expected at a lower level.
+	 */
+	appendStringInfo(context->buf, "%c.a%d", side, pos);
+}
+
+/*
+ * Deparse column alias list for a subquery in FROM clause.
+ */
+static void
+deparseColumnAliases(StringInfo buf, List *tlist)
+{
+	int			pos;
+	ListCell   *lc;
+
+	pos = 1;
+	foreach(lc, tlist)
+	{
+		/* Deparse column alias for the subquery */
+		if (pos > 1)
+			appendStringInfoString(buf, ", ");
+		appendStringInfo(buf, "a%d", pos);
+		pos++;
+	}
+}
+
+/*
+ * Deparse "wrapper" SQL for a query which projects the target list in the
+ * proper order and with the proper contents.  Note that this treatment is
+ * necessary only for queries used in the FROM clause of a join query.
+ *
+ * Even if the SQL is simple enough (no ctid, no whole-row reference), the
+ * order of the output columns might differ from the underlying scan, so we
+ * always need to wrap the queries used as join sources.
+ *
+ */
+static const char *
+deparseProjectionSql(PlannerInfo *root,
+					 RelOptInfo *baserel,
+					 const char *sql,
+					 char side)
+{
+	StringInfoData wholerow;
+	StringInfoData buf;
+	ListCell   *lc;
+	bool		first;
+	bool		have_wholerow = false;
+
+	/*
+	 * Check whether the targetlist contains a whole-row reference
+	 * (varattno == 0); if so, it is expanded with ROW() syntax below.
+	 */
+	foreach(lc, baserel->reltargetlist)
+	{
+		Var		   *var = (Var *) lfirst(lc);
+		if (var->varattno == 0)
+		{
+			have_wholerow = true;
+			break;
+		}
+	}
+
+	/*
+	 * Construct whole-row reference with ROW() syntax
+	 */
+	if (have_wholerow)
+	{
+		RangeTblEntry *rte;
+		Relation		rel;
+		TupleDesc		tupdesc;
+		int				i;
+
+		/* Obtain TupleDesc for deparsing all valid columns */
+		rte = planner_rt_fetch(baserel->relid, root);
+		rel = heap_open(rte->relid, NoLock);
+		tupdesc = rel->rd_att;
+
+		/* Print all valid columns in ROW() to generate whole-row value */
+		initStringInfo(&wholerow);
+		appendStringInfoString(&wholerow, "ROW(");
+		first = true;
+		for (i = 1; i <= tupdesc->natts; i++)
+		{
+			Form_pg_attribute attr = tupdesc->attrs[i - 1];
+
+			/* Ignore dropped columns. */
+			if (attr->attisdropped)
+				continue;
+
+			if (!first)
+				appendStringInfoString(&wholerow, ", ");
+			first = false;
+
+			appendStringInfo(&wholerow, "%c.a%d", side, TO_RELATIVE(i));
+		}
+		appendStringInfoString(&wholerow, ")");
+
+		heap_close(rel, NoLock);
+	}
+
+	/*
+	 * Construct a SELECT statement which has the original query in its FROM
+	 * clause and the target list entries in its SELECT clause.  The numbers
+	 * used in column aliases are attnum - FirstLowInvalidHeapAttributeNumber,
+	 * so that all numbers are positive even for system columns, which have
+	 * negative attnums.
+	 */
+	initStringInfo(&buf);
+	appendStringInfoString(&buf, "SELECT ");
+	first = true;
+	foreach(lc, baserel->reltargetlist)
+	{
+		Var *var = (Var *) lfirst(lc);
+
+		if (!first)
+			appendStringInfoString(&buf, ", ");
+
+		if (var->varattno == 0)
+			appendStringInfo(&buf, "%s", wholerow.data);
+		else
+			appendStringInfo(&buf, "%c.a%d", side, TO_RELATIVE(var->varattno));
+
+		first = false;
+	}
+	appendStringInfo(&buf, " FROM (%s) %c", sql, side);
+
+	return buf.data;
+}
+
+/*
+ * Construct a SELECT statement which contains a join clause.
+ *
+ * We also create a TargetEntry list of the columns being retrieved, which is
+ * returned in *fdw_ps_tlist.
+ *
+ * outerrel and sql_o are the RelOptInfo and the remote query statement of the
+ * outer child relation; innerrel and sql_i are those of the inner child
+ * relation.  jointype, joinclauses and otherclauses describe the join itself.
+ * fdw_ps_tlist is an output parameter that passes the target list of the
+ * pseudo scan back to the caller.
+ */
+void
+deparseJoinSql(StringInfo buf,
+			   PlannerInfo *root,
+			   RelOptInfo *baserel,
+			   RelOptInfo *outerrel,
+			   RelOptInfo *innerrel,
+			   const char *sql_o,
+			   const char *sql_i,
+			   JoinType jointype,
+			   List *joinclauses,
+			   List *otherclauses,
+			   List **fdw_ps_tlist,
+			   List **retrieved_attrs)
+{
+	StringInfoData selbuf;		/* buffer for SELECT clause */
+	StringInfoData abuf_o;		/* buffer for column alias list of outer */
+	StringInfoData abuf_i;		/* buffer for column alias list of inner */
+	int			i;
+	ListCell   *lc;
+	const char *jointype_str;
+	deparse_expr_cxt context;
+
+	context.root = root;
+	context.foreignrel = baserel;
+	context.buf = &selbuf;
+	context.params_list = NULL;
+	context.outertlist = outerrel->reltargetlist;
+	context.innertlist = innerrel->reltargetlist;
+
+	jointype_str = jointype == JOIN_INNER ? "INNER" :
+				   jointype == JOIN_LEFT ? "LEFT" :
+				   jointype == JOIN_RIGHT ? "RIGHT" :
+				   jointype == JOIN_FULL ? "FULL" : "";
+
+	*retrieved_attrs = NIL;
+
+	/* print SELECT clause of the join scan */
+	initStringInfo(&selbuf);
+	i = 0;
+	foreach(lc, baserel->reltargetlist)
+	{
+		Var		   *var = (Var *) lfirst(lc);
+		TargetEntry *tle;
+
+		if (i > 0)
+			appendStringInfoString(&selbuf, ", ");
+		deparseJoinVar(var, &context);
+
+		tle = makeTargetEntry((Expr *) var, i + 1, NULL, false);
+		if (fdw_ps_tlist)
+			*fdw_ps_tlist = lappend(*fdw_ps_tlist, tle);
+
+		*retrieved_attrs = lappend_int(*retrieved_attrs, i + 1);
+
+		i++;
+	}
+	if (i == 0)
+		appendStringInfoString(&selbuf, "NULL");
+
+	/*
+	 * Do pseudo-projection for an underlying scan on a foreign table if the
+	 * relation is a base relation.  This is done even without a whole-row
+	 * reference, because the output column order may differ from the scan.
+	 */
+	if (outerrel->reloptkind == RELOPT_BASEREL)
+		sql_o = deparseProjectionSql(root, outerrel, sql_o, 'l');
+	if (innerrel->reloptkind == RELOPT_BASEREL)
+		sql_i = deparseProjectionSql(root, innerrel, sql_i, 'r');
+
+	/* Deparse column alias portion of subquery in FROM clause. */
+	initStringInfo(&abuf_o);
+	deparseColumnAliases(&abuf_o, outerrel->reltargetlist);
+	initStringInfo(&abuf_i);
+	deparseColumnAliases(&abuf_i, innerrel->reltargetlist);
+
+	/* Construct SELECT statement */
+	appendStringInfo(buf, "SELECT %s FROM", selbuf.data);
+	appendStringInfo(buf, " (%s) l (%s) %s JOIN (%s) r (%s)",
+					 sql_o, abuf_o.data, jointype_str, sql_i, abuf_i.data);
+	/* Append ON clause */
+	if (joinclauses)
+		appendConditions(buf, root, baserel,
+						 outerrel->reltargetlist, innerrel->reltargetlist,
+						 joinclauses,
+						 " ON ", NULL);
+	/* Append WHERE clause */
+	if (otherclauses)
+		appendConditions(buf, root, baserel,
+						 outerrel->reltargetlist, innerrel->reltargetlist,
+						 otherclauses,
+						 " WHERE ", NULL);
+}
+
+/*
  * deparse remote INSERT statement
  *
  * The statement text is appended to buf, and we also create an integer List
@@ -976,8 +1376,7 @@ deparseReturningList(StringInfo buf, PlannerInfo *root,
 	if (trig_after_row)
 	{
 		/* whole-row reference acquires all non-system columns */
-		attrs_used =
-			bms_make_singleton(0 - FirstLowInvalidHeapAttributeNumber);
+		attrs_used = bms_make_singleton(TO_RELATIVE(0));
 	}
 
 	if (returningList != NIL)
@@ -1261,6 +1660,8 @@ deparseExpr(Expr *node, deparse_expr_cxt *context)
 /*
  * Deparse given Var node into context->buf.
  *
+ * If the foreign relation is a join relation, delegate to deparseJoinVar().
+ *
  * If the Var belongs to the foreign relation, just print its remote name.
  * Otherwise, it's effectively a Param (and will in fact be a Param at
  * run time).  Handle it the same way we handle plain Params --- see
@@ -1271,39 +1672,46 @@ deparseVar(Var *node, deparse_expr_cxt *context)
 {
 	StringInfo	buf = context->buf;
 
-	if (node->varno == context->foreignrel->relid &&
-		node->varlevelsup == 0)
+	if (context->foreignrel->reloptkind == RELOPT_JOINREL)
 	{
-		/* Var belongs to foreign table */
-		deparseColumnRef(buf, node->varno, node->varattno, context->root);
+		deparseJoinVar(node, context);
 	}
 	else
 	{
-		/* Treat like a Param */
-		if (context->params_list)
+		if (node->varno == context->foreignrel->relid &&
+			node->varlevelsup == 0)
 		{
-			int			pindex = 0;
-			ListCell   *lc;
-
-			/* find its index in params_list */
-			foreach(lc, *context->params_list)
+			/* Var belongs to foreign table */
+			deparseColumnRef(buf, node->varno, node->varattno, context->root);
+		}
+		else
+		{
+			/* Treat like a Param */
+			if (context->params_list)
 			{
-				pindex++;
-				if (equal(node, (Node *) lfirst(lc)))
-					break;
+				int			pindex = 0;
+				ListCell   *lc;
+
+				/* find its index in params_list */
+				foreach(lc, *context->params_list)
+				{
+					pindex++;
+					if (equal(node, (Node *) lfirst(lc)))
+						break;
+				}
+				if (lc == NULL)
+				{
+					/* not in list, so add it */
+					pindex++;
+					*context->params_list = lappend(*context->params_list, node);
+				}
+
+				printRemoteParam(pindex, node->vartype, node->vartypmod, context);
 			}
-			if (lc == NULL)
+			else
 			{
-				/* not in list, so add it */
-				pindex++;
-				*context->params_list = lappend(*context->params_list, node);
+				printRemotePlaceholder(node->vartype, node->vartypmod, context);
 			}
-
-			printRemoteParam(pindex, node->vartype, node->vartypmod, context);
-		}
-		else
-		{
-			printRemotePlaceholder(node->vartype, node->vartypmod, context);
 		}
 	}
 }
diff --git a/contrib/postgres_fdw/expected/postgres_fdw.out b/contrib/postgres_fdw/expected/postgres_fdw.out
index 783cb41..530525e 100644
--- a/contrib/postgres_fdw/expected/postgres_fdw.out
+++ b/contrib/postgres_fdw/expected/postgres_fdw.out
@@ -9,11 +9,16 @@ DO $d$
             OPTIONS (dbname '$$||current_database()||$$',
                      port '$$||current_setting('port')||$$'
             )$$;
+        EXECUTE $$CREATE SERVER loopback2 FOREIGN DATA WRAPPER postgres_fdw
+            OPTIONS (dbname '$$||current_database()||$$',
+                     port '$$||current_setting('port')||$$'
+            )$$;
     END;
 $d$;
 CREATE USER MAPPING FOR public SERVER testserver1
 	OPTIONS (user 'value', password 'value');
 CREATE USER MAPPING FOR CURRENT_USER SERVER loopback;
+CREATE USER MAPPING FOR CURRENT_USER SERVER loopback2;
 -- ===================================================================
 -- create objects used through FDW loopback server
 -- ===================================================================
@@ -35,6 +40,18 @@ CREATE TABLE "S 1"."T 2" (
 	c2 text,
 	CONSTRAINT t2_pkey PRIMARY KEY (c1)
 );
+CREATE TABLE "S 1"."T 3" (
+	c1 int NOT NULL,
+	c2 int NOT NULL,
+	c3 text,
+	CONSTRAINT t3_pkey PRIMARY KEY (c1)
+);
+CREATE TABLE "S 1"."T 4" (
+	c1 int NOT NULL,
+	c2 int NOT NULL,
+	c4 text,
+	CONSTRAINT t4_pkey PRIMARY KEY (c1)
+);
 INSERT INTO "S 1"."T 1"
 	SELECT id,
 	       id % 10,
@@ -49,8 +66,22 @@ INSERT INTO "S 1"."T 2"
 	SELECT id,
 	       'AAA' || to_char(id, 'FM000')
 	FROM generate_series(1, 100) id;
+INSERT INTO "S 1"."T 3"
+	SELECT id,
+	       id + 1,
+	       'AAA' || to_char(id, 'FM000')
+	FROM generate_series(1, 100) id;
+DELETE FROM "S 1"."T 3" WHERE c1 % 2 != 0;	-- delete for outer join tests
+INSERT INTO "S 1"."T 4"
+	SELECT id,
+	       id + 1,
+	       'AAA' || to_char(id, 'FM000')
+	FROM generate_series(1, 100) id;
+DELETE FROM "S 1"."T 4" WHERE c1 % 3 != 0;	-- delete for outer join tests
 ANALYZE "S 1"."T 1";
 ANALYZE "S 1"."T 2";
+ANALYZE "S 1"."T 3";
+ANALYZE "S 1"."T 4";
 -- ===================================================================
 -- create foreign tables
 -- ===================================================================
@@ -78,6 +109,26 @@ CREATE FOREIGN TABLE ft2 (
 	c8 user_enum
 ) SERVER loopback;
 ALTER FOREIGN TABLE ft2 DROP COLUMN cx;
+CREATE FOREIGN TABLE ft4 (
+	c1 int NOT NULL,
+	c2 int NOT NULL,
+	c3 text
+) SERVER loopback OPTIONS (schema_name 'S 1', table_name 'T 3');
+CREATE FOREIGN TABLE ft5 (
+	c1 int NOT NULL,
+	c2 int NOT NULL,
+	c3 text
+) SERVER loopback OPTIONS (schema_name 'S 1', table_name 'T 4');
+CREATE FOREIGN TABLE ft6 (
+	c1 int NOT NULL,
+	c2 int NOT NULL,
+	c3 text
+) SERVER loopback2 OPTIONS (schema_name 'S 1', table_name 'T 4');
+CREATE USER view_owner;
+GRANT ALL ON ft5 TO view_owner;
+CREATE VIEW v_ft5 AS SELECT * FROM ft5;
+ALTER VIEW v_ft5 OWNER TO view_owner;
+CREATE USER MAPPING FOR view_owner SERVER loopback;
 -- ===================================================================
 -- tests for validator
 -- ===================================================================
@@ -119,12 +170,15 @@ ALTER FOREIGN TABLE ft2 OPTIONS (schema_name 'S 1', table_name 'T 1');
 ALTER FOREIGN TABLE ft1 ALTER COLUMN c1 OPTIONS (column_name 'C 1');
 ALTER FOREIGN TABLE ft2 ALTER COLUMN c1 OPTIONS (column_name 'C 1');
 \det+
-                             List of foreign tables
- Schema | Table |  Server  |              FDW Options              | Description 
---------+-------+----------+---------------------------------------+-------------
- public | ft1   | loopback | (schema_name 'S 1', table_name 'T 1') | 
- public | ft2   | loopback | (schema_name 'S 1', table_name 'T 1') | 
-(2 rows)
+                              List of foreign tables
+ Schema | Table |  Server   |              FDW Options              | Description 
+--------+-------+-----------+---------------------------------------+-------------
+ public | ft1   | loopback  | (schema_name 'S 1', table_name 'T 1') | 
+ public | ft2   | loopback  | (schema_name 'S 1', table_name 'T 1') | 
+ public | ft4   | loopback  | (schema_name 'S 1', table_name 'T 3') | 
+ public | ft5   | loopback  | (schema_name 'S 1', table_name 'T 4') | 
+ public | ft6   | loopback2 | (schema_name 'S 1', table_name 'T 4') | 
+(5 rows)
 
 -- Now we should be able to run ANALYZE.
 -- To exercise multiple code paths, we use local stats on ft1
@@ -160,8 +214,8 @@ SELECT * FROM ft1 ORDER BY c3, c1 OFFSET 100 LIMIT 10;
 (10 rows)
 
 EXPLAIN (VERBOSE, COSTS false) SELECT * FROM ft1 t1 ORDER BY t1.c3, t1.c1 OFFSET 100 LIMIT 10;
-                                     QUERY PLAN                                      
--------------------------------------------------------------------------------------
+                                                     QUERY PLAN                                                      
+---------------------------------------------------------------------------------------------------------------------
  Limit
    Output: c1, c2, c3, c4, c5, c6, c7, c8
    ->  Sort
@@ -169,7 +223,7 @@ EXPLAIN (VERBOSE, COSTS false) SELECT * FROM ft1 t1 ORDER BY t1.c3, t1.c1 OFFSET
          Sort Key: t1.c3, t1.c1
          ->  Foreign Scan on public.ft1 t1
                Output: c1, c2, c3, c4, c5, c6, c7, c8
-               Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1"
+               Remote SQL: SELECT "C 1" a10, c2 a11, c3 a12, c4 a13, c5 a14, c6 a15, c7 a16, c8 a17 FROM "S 1"."T 1"
 (8 rows)
 
 SELECT * FROM ft1 t1 ORDER BY t1.c3, t1.c1 OFFSET 100 LIMIT 10;
@@ -189,8 +243,8 @@ SELECT * FROM ft1 t1 ORDER BY t1.c3, t1.c1 OFFSET 100 LIMIT 10;
 
 -- whole-row reference
 EXPLAIN (VERBOSE, COSTS false) SELECT t1 FROM ft1 t1 ORDER BY t1.c3, t1.c1 OFFSET 100 LIMIT 10;
-                                     QUERY PLAN                                      
--------------------------------------------------------------------------------------
+                                                     QUERY PLAN                                                      
+---------------------------------------------------------------------------------------------------------------------
  Limit
    Output: t1.*, c3, c1
    ->  Sort
@@ -198,7 +252,7 @@ EXPLAIN (VERBOSE, COSTS false) SELECT t1 FROM ft1 t1 ORDER BY t1.c3, t1.c1 OFFSE
          Sort Key: t1.c3, t1.c1
          ->  Foreign Scan on public.ft1 t1
                Output: t1.*, c3, c1
-               Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1"
+               Remote SQL: SELECT "C 1" a10, c2 a11, c3 a12, c4 a13, c5 a14, c6 a15, c7 a16, c8 a17 FROM "S 1"."T 1"
 (8 rows)
 
 SELECT t1 FROM ft1 t1 ORDER BY t1.c3, t1.c1 OFFSET 100 LIMIT 10;
@@ -224,11 +278,11 @@ SELECT * FROM ft1 WHERE false;
 
 -- with WHERE clause
 EXPLAIN (VERBOSE, COSTS false) SELECT * FROM ft1 t1 WHERE t1.c1 = 101 AND t1.c6 = '1' AND t1.c7 >= '1';
-                                                                   QUERY PLAN                                                                   
-------------------------------------------------------------------------------------------------------------------------------------------------
+                                                                                   QUERY PLAN                                                                                   
+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
  Foreign Scan on public.ft1 t1
    Output: c1, c2, c3, c4, c5, c6, c7, c8
-   Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" WHERE ((c7 >= '1'::bpchar)) AND (("C 1" = 101)) AND ((c6 = '1'::text))
+   Remote SQL: SELECT "C 1" a10, c2 a11, c3 a12, c4 a13, c5 a14, c6 a15, c7 a16, c8 a17 FROM "S 1"."T 1" WHERE ((c7 >= '1'::bpchar)) AND (("C 1" = 101)) AND ((c6 = '1'::text))
 (3 rows)
 
 SELECT * FROM ft1 t1 WHERE t1.c1 = 101 AND t1.c6 = '1' AND t1.c7 >= '1';
@@ -239,13 +293,13 @@ SELECT * FROM ft1 t1 WHERE t1.c1 = 101 AND t1.c6 = '1' AND t1.c7 >= '1';
 
 -- with FOR UPDATE/SHARE
 EXPLAIN (VERBOSE, COSTS false) SELECT * FROM ft1 t1 WHERE c1 = 101 FOR UPDATE;
-                                                   QUERY PLAN                                                   
-----------------------------------------------------------------------------------------------------------------
+                                                                   QUERY PLAN                                                                   
+------------------------------------------------------------------------------------------------------------------------------------------------
  LockRows
    Output: c1, c2, c3, c4, c5, c6, c7, c8, t1.*
    ->  Foreign Scan on public.ft1 t1
          Output: c1, c2, c3, c4, c5, c6, c7, c8, t1.*
-         Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" WHERE (("C 1" = 101)) FOR UPDATE
+         Remote SQL: SELECT "C 1" a10, c2 a11, c3 a12, c4 a13, c5 a14, c6 a15, c7 a16, c8 a17 FROM "S 1"."T 1" WHERE (("C 1" = 101)) FOR UPDATE
 (5 rows)
 
 SELECT * FROM ft1 t1 WHERE c1 = 101 FOR UPDATE;
@@ -255,13 +309,13 @@ SELECT * FROM ft1 t1 WHERE c1 = 101 FOR UPDATE;
 (1 row)
 
 EXPLAIN (VERBOSE, COSTS false) SELECT * FROM ft1 t1 WHERE c1 = 102 FOR SHARE;
-                                                  QUERY PLAN                                                   
----------------------------------------------------------------------------------------------------------------
+                                                                  QUERY PLAN                                                                   
+-----------------------------------------------------------------------------------------------------------------------------------------------
  LockRows
    Output: c1, c2, c3, c4, c5, c6, c7, c8, t1.*
    ->  Foreign Scan on public.ft1 t1
          Output: c1, c2, c3, c4, c5, c6, c7, c8, t1.*
-         Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" WHERE (("C 1" = 102)) FOR SHARE
+         Remote SQL: SELECT "C 1" a10, c2 a11, c3 a12, c4 a13, c5 a14, c6 a15, c7 a16, c8 a17 FROM "S 1"."T 1" WHERE (("C 1" = 102)) FOR SHARE
 (5 rows)
 
 SELECT * FROM ft1 t1 WHERE c1 = 102 FOR SHARE;
@@ -277,22 +331,6 @@ SELECT COUNT(*) FROM ft1 t1;
   1000
 (1 row)
 
--- join two tables
-SELECT t1.c1 FROM ft1 t1 JOIN ft2 t2 ON (t1.c1 = t2.c1) ORDER BY t1.c3, t1.c1 OFFSET 100 LIMIT 10;
- c1  
------
- 101
- 102
- 103
- 104
- 105
- 106
- 107
- 108
- 109
- 110
-(10 rows)
-
 -- subquery
 SELECT * FROM ft1 t1 WHERE t1.c3 IN (SELECT c3 FROM ft2 t2 WHERE c1 <= 10) ORDER BY c1;
  c1 | c2 |  c3   |              c4              |            c5            | c6 |     c7     | c8  
@@ -353,153 +391,148 @@ CREATE OPERATOR === (
     NEGATOR = !==
 );
 EXPLAIN (VERBOSE, COSTS false) SELECT * FROM ft1 t1 WHERE t1.c1 = postgres_fdw_abs(t1.c2);
-                               QUERY PLAN                                
--------------------------------------------------------------------------
+                                               QUERY PLAN                                                
+---------------------------------------------------------------------------------------------------------
  Foreign Scan on public.ft1 t1
    Output: c1, c2, c3, c4, c5, c6, c7, c8
    Filter: (t1.c1 = postgres_fdw_abs(t1.c2))
-   Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1"
+   Remote SQL: SELECT "C 1" a10, c2 a11, c3 a12, c4 a13, c5 a14, c6 a15, c7 a16, c8 a17 FROM "S 1"."T 1"
 (4 rows)
 
 EXPLAIN (VERBOSE, COSTS false) SELECT * FROM ft1 t1 WHERE t1.c1 === t1.c2;
-                               QUERY PLAN                                
--------------------------------------------------------------------------
+                                               QUERY PLAN                                                
+---------------------------------------------------------------------------------------------------------
  Foreign Scan on public.ft1 t1
    Output: c1, c2, c3, c4, c5, c6, c7, c8
    Filter: (t1.c1 === t1.c2)
-   Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1"
+   Remote SQL: SELECT "C 1" a10, c2 a11, c3 a12, c4 a13, c5 a14, c6 a15, c7 a16, c8 a17 FROM "S 1"."T 1"
 (4 rows)
 
 EXPLAIN (VERBOSE, COSTS false) SELECT * FROM ft1 t1 WHERE t1.c1 = abs(t1.c2);
-                                            QUERY PLAN                                             
----------------------------------------------------------------------------------------------------
+                                                            QUERY PLAN                                                             
+-----------------------------------------------------------------------------------------------------------------------------------
  Foreign Scan on public.ft1 t1
    Output: c1, c2, c3, c4, c5, c6, c7, c8
-   Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" WHERE (("C 1" = abs(c2)))
+   Remote SQL: SELECT "C 1" a10, c2 a11, c3 a12, c4 a13, c5 a14, c6 a15, c7 a16, c8 a17 FROM "S 1"."T 1" WHERE (("C 1" = abs(c2)))
 (3 rows)
 
 EXPLAIN (VERBOSE, COSTS false) SELECT * FROM ft1 t1 WHERE t1.c1 = t1.c2;
-                                          QUERY PLAN                                          
-----------------------------------------------------------------------------------------------
+                                                          QUERY PLAN                                                          
+------------------------------------------------------------------------------------------------------------------------------
  Foreign Scan on public.ft1 t1
    Output: c1, c2, c3, c4, c5, c6, c7, c8
-   Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" WHERE (("C 1" = c2))
+   Remote SQL: SELECT "C 1" a10, c2 a11, c3 a12, c4 a13, c5 a14, c6 a15, c7 a16, c8 a17 FROM "S 1"."T 1" WHERE (("C 1" = c2))
 (3 rows)
 
 -- ===================================================================
 -- WHERE with remotely-executable conditions
 -- ===================================================================
 EXPLAIN (VERBOSE, COSTS false) SELECT * FROM ft1 t1 WHERE t1.c1 = 1;         -- Var, OpExpr(b), Const
-                                         QUERY PLAN                                          
----------------------------------------------------------------------------------------------
+                                                         QUERY PLAN                                                          
+-----------------------------------------------------------------------------------------------------------------------------
  Foreign Scan on public.ft1 t1
    Output: c1, c2, c3, c4, c5, c6, c7, c8
-   Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" WHERE (("C 1" = 1))
+   Remote SQL: SELECT "C 1" a10, c2 a11, c3 a12, c4 a13, c5 a14, c6 a15, c7 a16, c8 a17 FROM "S 1"."T 1" WHERE (("C 1" = 1))
 (3 rows)
 
 EXPLAIN (VERBOSE, COSTS false) SELECT * FROM ft1 t1 WHERE t1.c1 = 100 AND t1.c2 = 0; -- BoolExpr
-                                                  QUERY PLAN                                                  
---------------------------------------------------------------------------------------------------------------
+                                                                  QUERY PLAN                                                                  
+----------------------------------------------------------------------------------------------------------------------------------------------
  Foreign Scan on public.ft1 t1
    Output: c1, c2, c3, c4, c5, c6, c7, c8
-   Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" WHERE (("C 1" = 100)) AND ((c2 = 0))
+   Remote SQL: SELECT "C 1" a10, c2 a11, c3 a12, c4 a13, c5 a14, c6 a15, c7 a16, c8 a17 FROM "S 1"."T 1" WHERE (("C 1" = 100)) AND ((c2 = 0))
 (3 rows)
 
 EXPLAIN (VERBOSE, COSTS false) SELECT * FROM ft1 t1 WHERE c1 IS NULL;        -- NullTest
-                                           QUERY PLAN                                            
--------------------------------------------------------------------------------------------------
+                                                           QUERY PLAN                                                            
+---------------------------------------------------------------------------------------------------------------------------------
  Foreign Scan on public.ft1 t1
    Output: c1, c2, c3, c4, c5, c6, c7, c8
-   Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" WHERE (("C 1" IS NULL))
+   Remote SQL: SELECT "C 1" a10, c2 a11, c3 a12, c4 a13, c5 a14, c6 a15, c7 a16, c8 a17 FROM "S 1"."T 1" WHERE (("C 1" IS NULL))
 (3 rows)
 
 EXPLAIN (VERBOSE, COSTS false) SELECT * FROM ft1 t1 WHERE c1 IS NOT NULL;    -- NullTest
-                                             QUERY PLAN                                              
------------------------------------------------------------------------------------------------------
+                                                             QUERY PLAN                                                              
+-------------------------------------------------------------------------------------------------------------------------------------
  Foreign Scan on public.ft1 t1
    Output: c1, c2, c3, c4, c5, c6, c7, c8
-   Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" WHERE (("C 1" IS NOT NULL))
+   Remote SQL: SELECT "C 1" a10, c2 a11, c3 a12, c4 a13, c5 a14, c6 a15, c7 a16, c8 a17 FROM "S 1"."T 1" WHERE (("C 1" IS NOT NULL))
 (3 rows)
 
 EXPLAIN (VERBOSE, COSTS false) SELECT * FROM ft1 t1 WHERE round(abs(c1), 0) = 1; -- FuncExpr
-                                                     QUERY PLAN                                                      
----------------------------------------------------------------------------------------------------------------------
+                                                                     QUERY PLAN                                                                      
+-----------------------------------------------------------------------------------------------------------------------------------------------------
  Foreign Scan on public.ft1 t1
    Output: c1, c2, c3, c4, c5, c6, c7, c8
-   Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" WHERE ((round(abs("C 1"), 0) = 1::numeric))
+   Remote SQL: SELECT "C 1" a10, c2 a11, c3 a12, c4 a13, c5 a14, c6 a15, c7 a16, c8 a17 FROM "S 1"."T 1" WHERE ((round(abs("C 1"), 0) = 1::numeric))
 (3 rows)
 
 EXPLAIN (VERBOSE, COSTS false) SELECT * FROM ft1 t1 WHERE c1 = -c1;          -- OpExpr(l)
-                                             QUERY PLAN                                              
------------------------------------------------------------------------------------------------------
+                                                             QUERY PLAN                                                              
+-------------------------------------------------------------------------------------------------------------------------------------
  Foreign Scan on public.ft1 t1
    Output: c1, c2, c3, c4, c5, c6, c7, c8
-   Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" WHERE (("C 1" = (- "C 1")))
+   Remote SQL: SELECT "C 1" a10, c2 a11, c3 a12, c4 a13, c5 a14, c6 a15, c7 a16, c8 a17 FROM "S 1"."T 1" WHERE (("C 1" = (- "C 1")))
 (3 rows)
 
 EXPLAIN (VERBOSE, COSTS false) SELECT * FROM ft1 t1 WHERE 1 = c1!;           -- OpExpr(r)
-                                                QUERY PLAN                                                
-----------------------------------------------------------------------------------------------------------
+                                                                QUERY PLAN                                                                
+------------------------------------------------------------------------------------------------------------------------------------------
  Foreign Scan on public.ft1 t1
    Output: c1, c2, c3, c4, c5, c6, c7, c8
-   Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" WHERE ((1::numeric = ("C 1" !)))
+   Remote SQL: SELECT "C 1" a10, c2 a11, c3 a12, c4 a13, c5 a14, c6 a15, c7 a16, c8 a17 FROM "S 1"."T 1" WHERE ((1::numeric = ("C 1" !)))
 (3 rows)
 
 EXPLAIN (VERBOSE, COSTS false) SELECT * FROM ft1 t1 WHERE (c1 IS NOT NULL) IS DISTINCT FROM (c1 IS NOT NULL); -- DistinctExpr
-                                                                 QUERY PLAN                                                                 
---------------------------------------------------------------------------------------------------------------------------------------------
+                                                                                 QUERY PLAN                                                                                 
+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------
  Foreign Scan on public.ft1 t1
    Output: c1, c2, c3, c4, c5, c6, c7, c8
-   Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" WHERE ((("C 1" IS NOT NULL) IS DISTINCT FROM ("C 1" IS NOT NULL)))
+   Remote SQL: SELECT "C 1" a10, c2 a11, c3 a12, c4 a13, c5 a14, c6 a15, c7 a16, c8 a17 FROM "S 1"."T 1" WHERE ((("C 1" IS NOT NULL) IS DISTINCT FROM ("C 1" IS NOT NULL)))
 (3 rows)
 
 EXPLAIN (VERBOSE, COSTS false) SELECT * FROM ft1 t1 WHERE c1 = ANY(ARRAY[c2, 1, c1 + 0]); -- ScalarArrayOpExpr
-                                                        QUERY PLAN                                                         
----------------------------------------------------------------------------------------------------------------------------
+                                                                        QUERY PLAN                                                                         
+-----------------------------------------------------------------------------------------------------------------------------------------------------------
  Foreign Scan on public.ft1 t1
    Output: c1, c2, c3, c4, c5, c6, c7, c8
-   Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" WHERE (("C 1" = ANY (ARRAY[c2, 1, ("C 1" + 0)])))
+   Remote SQL: SELECT "C 1" a10, c2 a11, c3 a12, c4 a13, c5 a14, c6 a15, c7 a16, c8 a17 FROM "S 1"."T 1" WHERE (("C 1" = ANY (ARRAY[c2, 1, ("C 1" + 0)])))
 (3 rows)
 
 EXPLAIN (VERBOSE, COSTS false) SELECT * FROM ft1 t1 WHERE c1 = (ARRAY[c1,c2,3])[1]; -- ArrayRef
-                                                      QUERY PLAN                                                      
-----------------------------------------------------------------------------------------------------------------------
+                                                                      QUERY PLAN                                                                      
+------------------------------------------------------------------------------------------------------------------------------------------------------
  Foreign Scan on public.ft1 t1
    Output: c1, c2, c3, c4, c5, c6, c7, c8
-   Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" WHERE (("C 1" = ((ARRAY["C 1", c2, 3])[1])))
+   Remote SQL: SELECT "C 1" a10, c2 a11, c3 a12, c4 a13, c5 a14, c6 a15, c7 a16, c8 a17 FROM "S 1"."T 1" WHERE (("C 1" = ((ARRAY["C 1", c2, 3])[1])))
 (3 rows)
 
 EXPLAIN (VERBOSE, COSTS false) SELECT * FROM ft1 t1 WHERE c6 = E'foo''s\\bar';  -- check special chars
-                                                 QUERY PLAN                                                  
--------------------------------------------------------------------------------------------------------------
+                                                                 QUERY PLAN                                                                  
+---------------------------------------------------------------------------------------------------------------------------------------------
  Foreign Scan on public.ft1 t1
    Output: c1, c2, c3, c4, c5, c6, c7, c8
-   Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" WHERE ((c6 = E'foo''s\\bar'::text))
+   Remote SQL: SELECT "C 1" a10, c2 a11, c3 a12, c4 a13, c5 a14, c6 a15, c7 a16, c8 a17 FROM "S 1"."T 1" WHERE ((c6 = E'foo''s\\bar'::text))
 (3 rows)
 
 EXPLAIN (VERBOSE, COSTS false) SELECT * FROM ft1 t1 WHERE c8 = 'foo';  -- can't be sent to remote
-                               QUERY PLAN                                
--------------------------------------------------------------------------
+                                               QUERY PLAN                                                
+---------------------------------------------------------------------------------------------------------
  Foreign Scan on public.ft1 t1
    Output: c1, c2, c3, c4, c5, c6, c7, c8
    Filter: (t1.c8 = 'foo'::user_enum)
-   Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1"
+   Remote SQL: SELECT "C 1" a10, c2 a11, c3 a12, c4 a13, c5 a14, c6 a15, c7 a16, c8 a17 FROM "S 1"."T 1"
 (4 rows)
 
 -- parameterized remote path
 EXPLAIN (VERBOSE, COSTS false)
   SELECT * FROM ft2 a, ft2 b WHERE a.c1 = 47 AND b.c1 = a.c2;
-                                                 QUERY PLAN                                                  
--------------------------------------------------------------------------------------------------------------
- Nested Loop
+                                                                                                                                                                                                                                                                                     QUERY PLAN                                                                                                                                                                                                                                                                                      
+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+ Foreign Scan
    Output: a.c1, a.c2, a.c3, a.c4, a.c5, a.c6, a.c7, a.c8, b.c1, b.c2, b.c3, b.c4, b.c5, b.c6, b.c7, b.c8
-   ->  Foreign Scan on public.ft2 a
-         Output: a.c1, a.c2, a.c3, a.c4, a.c5, a.c6, a.c7, a.c8
-         Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" WHERE (("C 1" = 47))
-   ->  Foreign Scan on public.ft2 b
-         Output: b.c1, b.c2, b.c3, b.c4, b.c5, b.c6, b.c7, b.c8
-         Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" WHERE (($1::integer = "C 1"))
-(8 rows)
+   Remote SQL: SELECT l.a1, l.a2, l.a3, l.a4, l.a5, l.a6, l.a7, l.a8, r.a1, r.a2, r.a3, r.a4, r.a5, r.a6, r.a7, r.a8 FROM (SELECT l.a9, l.a10, l.a12, l.a13, l.a14, l.a15, l.a16, l.a17 FROM (SELECT "C 1" a9, c2 a10, c3 a12, c4 a13, c5 a14, c6 a15, c7 a16, c8 a17 FROM "S 1"."T 1" WHERE (("C 1" = 47))) l) l (a1, a2, a3, a4, a5, a6, a7, a8) INNER JOIN (SELECT r.a9, r.a10, r.a12, r.a13, r.a14, r.a15, r.a16, r.a17 FROM (SELECT "C 1" a9, c2 a10, c3 a12, c4 a13, c5 a14, c6 a15, c7 a16, c8 a17 FROM "S 1"."T 1") r) r (a1, a2, a3, a4, a5, a6, a7, a8) ON ((l.a2 = r.a1))
+(3 rows)
 
 SELECT * FROM ft2 a, ft2 b WHERE a.c1 = 47 AND b.c1 = a.c2;
  c1 | c2 |  c3   |              c4              |            c5            | c6 |     c7     | c8  | c1 | c2 |  c3   |              c4              |            c5            | c6 |     c7     | c8  
@@ -511,18 +544,18 @@ SELECT * FROM ft2 a, ft2 b WHERE a.c1 = 47 AND b.c1 = a.c2;
 EXPLAIN (VERBOSE, COSTS false)
   SELECT * FROM ft2 a, ft2 b
   WHERE a.c2 = 6 AND b.c1 = a.c1 AND a.c8 = 'foo' AND b.c7 = upper(a.c7);
-                                                 QUERY PLAN                                                  
--------------------------------------------------------------------------------------------------------------
+                                                                 QUERY PLAN                                                                 
+--------------------------------------------------------------------------------------------------------------------------------------------
  Nested Loop
    Output: a.c1, a.c2, a.c3, a.c4, a.c5, a.c6, a.c7, a.c8, b.c1, b.c2, b.c3, b.c4, b.c5, b.c6, b.c7, b.c8
    ->  Foreign Scan on public.ft2 a
          Output: a.c1, a.c2, a.c3, a.c4, a.c5, a.c6, a.c7, a.c8
          Filter: (a.c8 = 'foo'::user_enum)
-         Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" WHERE ((c2 = 6))
+         Remote SQL: SELECT "C 1" a9, c2 a10, c3 a12, c4 a13, c5 a14, c6 a15, c7 a16, c8 a17 FROM "S 1"."T 1" WHERE ((c2 = 6))
    ->  Foreign Scan on public.ft2 b
          Output: b.c1, b.c2, b.c3, b.c4, b.c5, b.c6, b.c7, b.c8
          Filter: (upper((a.c7)::text) = (b.c7)::text)
-         Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" WHERE (($1::integer = "C 1"))
+         Remote SQL: SELECT "C 1" a9, c2 a10, c3 a12, c4 a13, c5 a14, c6 a15, c7 a16, c8 a17 FROM "S 1"."T 1" WHERE (($1::integer = "C 1"))
 (10 rows)
 
 SELECT * FROM ft2 a, ft2 b
@@ -651,21 +684,587 @@ SELECT * FROM ft2 WHERE c1 = ANY (ARRAY(SELECT c1 FROM ft1 WHERE c1 < 5));
 (4 rows)
 
 -- ===================================================================
+-- JOIN queries
+-- ===================================================================
+-- join two tables
+EXPLAIN (COSTS false, VERBOSE)
+SELECT t1.c1, t2.c1 FROM ft1 t1 JOIN ft2 t2 ON (t1.c1 = t2.c1) ORDER BY t1.c3, t1.c1 OFFSET 100 LIMIT 10;
+                                                                                                               QUERY PLAN                                                                                                                
+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+ Limit
+   Output: t1.c1, t2.c1, t1.c3
+   ->  Sort
+         Output: t1.c1, t2.c1, t1.c3
+         Sort Key: t1.c3, t1.c1
+         ->  Foreign Scan
+               Output: t1.c1, t2.c1, t1.c3
+               Remote SQL: SELECT l.a1, l.a2, r.a1 FROM (SELECT l.a10, l.a12 FROM (SELECT "C 1" a10, c3 a12 FROM "S 1"."T 1") l) l (a1, a2) INNER JOIN (SELECT r.a9 FROM (SELECT "C 1" a9 FROM "S 1"."T 1") r) r (a1) ON ((l.a1 = r.a1))
+(8 rows)
+
+SELECT t1.c1, t2.c1 FROM ft1 t1 JOIN ft2 t2 ON (t1.c1 = t2.c1) ORDER BY t1.c3, t1.c1 OFFSET 100 LIMIT 10;
+ c1  | c1  
+-----+-----
+ 101 | 101
+ 102 | 102
+ 103 | 103
+ 104 | 104
+ 105 | 105
+ 106 | 106
+ 107 | 107
+ 108 | 108
+ 109 | 109
+ 110 | 110
+(10 rows)
+
+-- join three tables
+EXPLAIN (COSTS false, VERBOSE)
+SELECT t1.c1, t2.c2, t3.c3 FROM ft1 t1 JOIN ft2 t2 ON (t1.c1 = t2.c1) JOIN ft4 t3 ON (t3.c1 = t1.c1) ORDER BY t1.c3, t1.c1 OFFSET 10 LIMIT 10;
+                                                                                                                                                                                                              QUERY PLAN                                                                                                                                                                                                               
+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+ Limit
+   Output: t1.c1, t2.c2, t3.c3, t1.c3
+   ->  Sort
+         Output: t1.c1, t2.c2, t3.c3, t1.c3
+         Sort Key: t1.c3, t1.c1
+         ->  Foreign Scan
+               Output: t1.c1, t2.c2, t3.c3, t1.c3
+               Remote SQL: SELECT l.a1, l.a2, l.a3, r.a1 FROM (SELECT l.a1, l.a2, r.a1, r.a2 FROM (SELECT l.a10, l.a12 FROM (SELECT "C 1" a10, c3 a12 FROM "S 1"."T 1") l) l (a1, a2) INNER JOIN (SELECT r.a10, r.a9 FROM (SELECT "C 1" a9, c2 a10 FROM "S 1"."T 1") r) r (a1, a2) ON ((l.a1 = r.a2))) l (a1, a2, a3, a4) INNER JOIN (SELECT r.a11, r.a9 FROM (SELECT c1 a9, c3 a11 FROM "S 1"."T 3") r) r (a1, a2) ON ((l.a1 = r.a2))
+(8 rows)
+
+SELECT t1.c1, t2.c2, t3.c3 FROM ft1 t1 JOIN ft2 t2 ON (t1.c1 = t2.c1) JOIN ft4 t3 ON (t3.c1 = t1.c1) ORDER BY t1.c3, t1.c1 OFFSET 10 LIMIT 10;
+ c1 | c2 |   c3   
+----+----+--------
+ 22 |  2 | AAA022
+ 24 |  4 | AAA024
+ 26 |  6 | AAA026
+ 28 |  8 | AAA028
+ 30 |  0 | AAA030
+ 32 |  2 | AAA032
+ 34 |  4 | AAA034
+ 36 |  6 | AAA036
+ 38 |  8 | AAA038
+ 40 |  0 | AAA040
+(10 rows)
+
+-- left outer join
+EXPLAIN (COSTS false, VERBOSE)
+SELECT t1.c1, t2.c1 FROM ft4 t1 LEFT JOIN ft5 t2 ON (t1.c1 = t2.c1) ORDER BY t1.c1, t2.c1 OFFSET 10 LIMIT 10;
+                                                                                              QUERY PLAN                                                                                               
+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+ Limit
+   Output: t1.c1, t2.c1
+   ->  Sort
+         Output: t1.c1, t2.c1
+         Sort Key: t1.c1, t2.c1
+         ->  Foreign Scan
+               Output: t1.c1, t2.c1
+               Remote SQL: SELECT l.a1, r.a1 FROM (SELECT l.a9 FROM (SELECT c1 a9 FROM "S 1"."T 3") l) l (a1) LEFT JOIN (SELECT r.a9 FROM (SELECT c1 a9 FROM "S 1"."T 4") r) r (a1) ON ((l.a1 = r.a1))
+(8 rows)
+
+SELECT t1.c1, t2.c1 FROM ft4 t1 LEFT JOIN ft5 t2 ON (t1.c1 = t2.c1) ORDER BY t1.c1, t2.c1 OFFSET 10 LIMIT 10;
+ c1 | c1 
+----+----
+ 22 |   
+ 24 | 24
+ 26 |   
+ 28 |   
+ 30 | 30
+ 32 |   
+ 34 |   
+ 36 | 36
+ 38 |   
+ 40 |   
+(10 rows)
+
+-- right outer join
+SET enable_mergejoin = off; -- the planner chooses MergeJoin even though it has a higher cost, so disable it for testing.
+EXPLAIN (COSTS false, VERBOSE)
+SELECT t1.c1, t2.c1 FROM ft4 t1 RIGHT JOIN ft5 t2 ON (t1.c1 = t2.c1) ORDER BY t2.c1, t2.c1 OFFSET 10 LIMIT 10;
+                                                                                              QUERY PLAN                                                                                               
+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+ Limit
+   Output: t1.c1, t2.c1
+   ->  Sort
+         Output: t1.c1, t2.c1
+         Sort Key: t2.c1
+         ->  Foreign Scan
+               Output: t1.c1, t2.c1
+               Remote SQL: SELECT l.a1, r.a1 FROM (SELECT l.a9 FROM (SELECT c1 a9 FROM "S 1"."T 4") l) l (a1) LEFT JOIN (SELECT r.a9 FROM (SELECT c1 a9 FROM "S 1"."T 3") r) r (a1) ON ((r.a1 = l.a1))
+(8 rows)
+
+SELECT t1.c1, t2.c1 FROM ft4 t1 RIGHT JOIN ft5 t2 ON (t1.c1 = t2.c1) ORDER BY t2.c1, t2.c1 OFFSET 10 LIMIT 10;
+ c1 | c1 
+----+----
+    | 33
+ 36 | 36
+    | 39
+ 42 | 42
+    | 45
+ 48 | 48
+    | 51
+ 54 | 54
+    | 57
+ 60 | 60
+(10 rows)
+
+SET enable_mergejoin = on;
+-- full outer join
+EXPLAIN (COSTS false, VERBOSE)
+SELECT t1.c1, t2.c1 FROM ft4 t1 FULL JOIN ft5 t2 ON (t1.c1 = t2.c1) ORDER BY t1.c1, t2.c1 OFFSET 45 LIMIT 10;
+                                                                                              QUERY PLAN                                                                                               
+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+ Limit
+   Output: t1.c1, t2.c1
+   ->  Sort
+         Output: t1.c1, t2.c1
+         Sort Key: t1.c1, t2.c1
+         ->  Foreign Scan
+               Output: t1.c1, t2.c1
+               Remote SQL: SELECT l.a1, r.a1 FROM (SELECT l.a9 FROM (SELECT c1 a9 FROM "S 1"."T 3") l) l (a1) FULL JOIN (SELECT r.a9 FROM (SELECT c1 a9 FROM "S 1"."T 4") r) r (a1) ON ((l.a1 = r.a1))
+(8 rows)
+
+SELECT t1.c1, t2.c1 FROM ft4 t1 FULL JOIN ft5 t2 ON (t1.c1 = t2.c1) ORDER BY t1.c1, t2.c1 OFFSET 45 LIMIT 10;
+ c1  | c1 
+-----+----
+  92 |   
+  94 |   
+  96 | 96
+  98 |   
+ 100 |   
+     |  3
+     |  9
+     | 15
+     | 21
+     | 27
+(10 rows)
+
+-- full outer join + WHERE clause, only matched rows
+EXPLAIN (COSTS false, VERBOSE)
+SELECT t1.c1, t2.c1 FROM ft4 t1 FULL JOIN ft5 t2 ON (t1.c1 = t2.c1) WHERE (t1.c1 = t2.c1 OR t1.c1 IS NULL) ORDER BY t1.c1, t2.c1 OFFSET 10 LIMIT 10;
+                                                                                                                   QUERY PLAN                                                                                                                    
+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+ Limit
+   Output: t1.c1, t2.c1
+   ->  Sort
+         Output: t1.c1, t2.c1
+         Sort Key: t1.c1, t2.c1
+         ->  Foreign Scan
+               Output: t1.c1, t2.c1
+               Remote SQL: SELECT l.a1, r.a1 FROM (SELECT l.a9 FROM (SELECT c1 a9 FROM "S 1"."T 3") l) l (a1) FULL JOIN (SELECT r.a9 FROM (SELECT c1 a9 FROM "S 1"."T 4") r) r (a1) ON ((l.a1 = r.a1)) WHERE (((l.a1 = r.a1) OR (l.a1 IS NULL)))
+(8 rows)
+
+SELECT t1.c1, t2.c1 FROM ft4 t1 FULL JOIN ft5 t2 ON (t1.c1 = t2.c1) WHERE (t1.c1 = t2.c1 OR t1.c1 IS NULL) ORDER BY t1.c1, t2.c1 OFFSET 10 LIMIT 10;
+ c1 | c1 
+----+----
+ 66 | 66
+ 72 | 72
+ 78 | 78
+ 84 | 84
+ 90 | 90
+ 96 | 96
+    |  3
+    |  9
+    | 15
+    | 21
+(10 rows)
+
+-- join condition in WHERE clause
+SET enable_mergejoin = off; -- the planner chooses MergeJoin even though it has a higher cost, so disable it for testing.
+EXPLAIN (COSTS false, VERBOSE)
+SELECT t1.c1, t2.c1 FROM ft4 t1 LEFT JOIN ft5 t2 ON true WHERE (t1.c1 = t2.c1) ORDER BY t1.c1, t2.c1 OFFSET 10 LIMIT 10;
+                                                                                               QUERY PLAN                                                                                               
+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+ Limit
+   Output: t1.c1, t2.c1
+   ->  Sort
+         Output: t1.c1, t2.c1
+         Sort Key: t1.c1
+         ->  Foreign Scan
+               Output: t1.c1, t2.c1
+               Remote SQL: SELECT l.a1, r.a1 FROM (SELECT l.a9 FROM (SELECT c1 a9 FROM "S 1"."T 3") l) l (a1) INNER JOIN (SELECT r.a9 FROM (SELECT c1 a9 FROM "S 1"."T 4") r) r (a1) ON ((l.a1 = r.a1))
+(8 rows)
+
+SELECT t1.c1, t2.c1 FROM ft4 t1 LEFT JOIN ft5 t2 ON true WHERE (t1.c1 = t2.c1) ORDER BY t1.c1, t2.c1 OFFSET 10 LIMIT 10;
+ c1 | c1 
+----+----
+ 66 | 66
+ 72 | 72
+ 78 | 78
+ 84 | 84
+ 90 | 90
+ 96 | 96
+(6 rows)
+
+SET enable_mergejoin = on;
+-- join in CTE
+EXPLAIN (COSTS false, VERBOSE)
+WITH t (c1_1, c1_3, c2_1) AS (SELECT t1.c1, t1.c3, t2.c1 FROM ft1 t1 JOIN ft2 t2 ON (t1.c1 = t2.c1)) SELECT c1_1, c2_1 FROM t ORDER BY c1_3, c1_1 OFFSET 100 LIMIT 10;
+                                                                                                             QUERY PLAN                                                                                                              
+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+ Limit
+   Output: t.c1_1, t.c2_1, t.c1_3
+   CTE t
+     ->  Foreign Scan
+           Output: t1.c1, t1.c3, t2.c1
+           Remote SQL: SELECT l.a1, l.a2, r.a1 FROM (SELECT l.a10, l.a12 FROM (SELECT "C 1" a10, c3 a12 FROM "S 1"."T 1") l) l (a1, a2) INNER JOIN (SELECT r.a9 FROM (SELECT "C 1" a9 FROM "S 1"."T 1") r) r (a1) ON ((l.a1 = r.a1))
+   ->  Sort
+         Output: t.c1_1, t.c2_1, t.c1_3
+         Sort Key: t.c1_3, t.c1_1
+         ->  CTE Scan on t
+               Output: t.c1_1, t.c2_1, t.c1_3
+(11 rows)
+
+WITH t (c1_1, c1_3, c2_1) AS (SELECT t1.c1, t1.c3, t2.c1 FROM ft1 t1 JOIN ft2 t2 ON (t1.c1 = t2.c1)) SELECT c1_1, c2_1 FROM t ORDER BY c1_3, c1_1 OFFSET 100 LIMIT 10;
+ c1_1 | c2_1 
+------+------
+  101 |  101
+  102 |  102
+  103 |  103
+  104 |  104
+  105 |  105
+  106 |  106
+  107 |  107
+  108 |  108
+  109 |  109
+  110 |  110
+(10 rows)
+
+-- ctid with whole-row reference
+EXPLAIN (COSTS false, VERBOSE)
+SELECT t1.ctid, t1, t2, t1.c1 FROM ft1 t1 JOIN ft2 t2 ON (t1.c1 = t2.c1) ORDER BY t1.c3, t1.c1 OFFSET 100 LIMIT 10;
+                                                                                                                                                                                                                                                   QUERY PLAN                                                                                                                                                                                                                                                    
+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+ Limit
+   Output: t1.ctid, t1.*, t2.*, t1.c1, t1.c3
+   ->  Sort
+         Output: t1.ctid, t1.*, t2.*, t1.c1, t1.c3
+         Sort Key: t1.c3, t1.c1
+         ->  Foreign Scan
+               Output: t1.ctid, t1.*, t2.*, t1.c1, t1.c3
+               Remote SQL: SELECT l.a1, l.a2, l.a3, l.a4, r.a1 FROM (SELECT l.a7, ROW(l.a10, l.a11, l.a12, l.a13, l.a14, l.a15, l.a16, l.a17), l.a10, l.a12 FROM (SELECT "C 1" a10, c2 a11, c3 a12, c4 a13, c5 a14, c6 a15, c7 a16, c8 a17, ctid a7 FROM "S 1"."T 1") l) l (a1, a2, a3, a4) INNER JOIN (SELECT ROW(r.a9, r.a10, r.a12, r.a13, r.a14, r.a15, r.a16, r.a17), r.a9 FROM (SELECT "C 1" a9, c2 a10, c3 a12, c4 a13, c5 a14, c6 a15, c7 a16, c8 a17 FROM "S 1"."T 1") r) r (a1, a2) ON ((l.a3 = r.a2))
+(8 rows)
+
+SELECT t1.ctid, t1, t2, t1.c1 FROM ft1 t1 JOIN ft2 t2 ON (t1.c1 = t2.c1) ORDER BY t1.c3, t1.c1 OFFSET 100 LIMIT 10;
+  ctid  |                                             t1                                             |                                             t2                                             | c1  
+--------+--------------------------------------------------------------------------------------------+--------------------------------------------------------------------------------------------+-----
+ (1,4)  | (101,1,00101,"Fri Jan 02 00:00:00 1970 PST","Fri Jan 02 00:00:00 1970",1,"1         ",foo) | (101,1,00101,"Fri Jan 02 00:00:00 1970 PST","Fri Jan 02 00:00:00 1970",1,"1         ",foo) | 101
+ (1,5)  | (102,2,00102,"Sat Jan 03 00:00:00 1970 PST","Sat Jan 03 00:00:00 1970",2,"2         ",foo) | (102,2,00102,"Sat Jan 03 00:00:00 1970 PST","Sat Jan 03 00:00:00 1970",2,"2         ",foo) | 102
+ (1,6)  | (103,3,00103,"Sun Jan 04 00:00:00 1970 PST","Sun Jan 04 00:00:00 1970",3,"3         ",foo) | (103,3,00103,"Sun Jan 04 00:00:00 1970 PST","Sun Jan 04 00:00:00 1970",3,"3         ",foo) | 103
+ (1,7)  | (104,4,00104,"Mon Jan 05 00:00:00 1970 PST","Mon Jan 05 00:00:00 1970",4,"4         ",foo) | (104,4,00104,"Mon Jan 05 00:00:00 1970 PST","Mon Jan 05 00:00:00 1970",4,"4         ",foo) | 104
+ (1,8)  | (105,5,00105,"Tue Jan 06 00:00:00 1970 PST","Tue Jan 06 00:00:00 1970",5,"5         ",foo) | (105,5,00105,"Tue Jan 06 00:00:00 1970 PST","Tue Jan 06 00:00:00 1970",5,"5         ",foo) | 105
+ (1,9)  | (106,6,00106,"Wed Jan 07 00:00:00 1970 PST","Wed Jan 07 00:00:00 1970",6,"6         ",foo) | (106,6,00106,"Wed Jan 07 00:00:00 1970 PST","Wed Jan 07 00:00:00 1970",6,"6         ",foo) | 106
+ (1,10) | (107,7,00107,"Thu Jan 08 00:00:00 1970 PST","Thu Jan 08 00:00:00 1970",7,"7         ",foo) | (107,7,00107,"Thu Jan 08 00:00:00 1970 PST","Thu Jan 08 00:00:00 1970",7,"7         ",foo) | 107
+ (1,11) | (108,8,00108,"Fri Jan 09 00:00:00 1970 PST","Fri Jan 09 00:00:00 1970",8,"8         ",foo) | (108,8,00108,"Fri Jan 09 00:00:00 1970 PST","Fri Jan 09 00:00:00 1970",8,"8         ",foo) | 108
+ (1,12) | (109,9,00109,"Sat Jan 10 00:00:00 1970 PST","Sat Jan 10 00:00:00 1970",9,"9         ",foo) | (109,9,00109,"Sat Jan 10 00:00:00 1970 PST","Sat Jan 10 00:00:00 1970",9,"9         ",foo) | 109
+ (1,13) | (110,0,00110,"Sun Jan 11 00:00:00 1970 PST","Sun Jan 11 00:00:00 1970",0,"0         ",foo) | (110,0,00110,"Sun Jan 11 00:00:00 1970 PST","Sun Jan 11 00:00:00 1970",0,"0         ",foo) | 110
+(10 rows)
+
+-- partially unsafe to push down, not pushed down
+EXPLAIN (COSTS false, VERBOSE)
+SELECT t1.c1 FROM ft1 t1 JOIN ft2 t2 ON t2.c1 = t2.c1 JOIN ft4 t3 ON t2.c1 = t3.c1 ORDER BY t1.c1 OFFSET 10 LIMIT 10;
+                                                                                                               QUERY PLAN                                                                                                                
+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+ Limit
+   Output: t1.c1
+   ->  Sort
+         Output: t1.c1
+         Sort Key: t1.c1
+         ->  Nested Loop
+               Output: t1.c1
+               ->  Foreign Scan on public.ft1 t1
+                     Output: t1.c1
+                     Remote SQL: SELECT "C 1" a10 FROM "S 1"."T 1"
+               ->  Materialize
+                     ->  Foreign Scan
+                           Remote SQL: SELECT NULL FROM (SELECT l.a9 FROM (SELECT "C 1" a9 FROM "S 1"."T 1" WHERE (("C 1" = "C 1"))) l) l (a1) INNER JOIN (SELECT r.a9 FROM (SELECT c1 a9 FROM "S 1"."T 3") r) r (a1) ON ((l.a1 = r.a1))
+(13 rows)
+
+SELECT t1.c1 FROM ft1 t1 JOIN ft2 t2 ON t2.c1 = t2.c1 JOIN ft4 t3 ON t2.c1 = t3.c1 ORDER BY t1.c1 OFFSET 10 LIMIT 10;
+ c1 
+----
+  1
+  1
+  1
+  1
+  1
+  1
+  1
+  1
+  1
+  1
+(10 rows)
+
+-- SEMI JOIN, not pushed down
+EXPLAIN (COSTS false, VERBOSE)
+SELECT t1.c1 FROM ft1 t1 WHERE EXISTS (SELECT 1 FROM ft2 t2 WHERE t1.c1 = t2.c1) ORDER BY t1.c1 OFFSET 100 LIMIT 10;
+                                  QUERY PLAN                                  
+------------------------------------------------------------------------------
+ Limit
+   Output: t1.c1
+   ->  Sort
+         Output: t1.c1
+         Sort Key: t1.c1
+         ->  Hash Join
+               Output: t1.c1
+               Hash Cond: (t1.c1 = t2.c1)
+               ->  Foreign Scan on public.ft1 t1
+                     Output: t1.c1
+                     Remote SQL: SELECT "C 1" a10 FROM "S 1"."T 1"
+               ->  Hash
+                     Output: t2.c1
+                     ->  HashAggregate
+                           Output: t2.c1
+                           Group Key: t2.c1
+                           ->  Foreign Scan on public.ft2 t2
+                                 Output: t2.c1
+                                 Remote SQL: SELECT "C 1" a9 FROM "S 1"."T 1"
+(19 rows)
+
+SELECT t1.c1 FROM ft1 t1 WHERE EXISTS (SELECT 1 FROM ft2 t2 WHERE t1.c1 = t2.c1) ORDER BY t1.c1 OFFSET 100 LIMIT 10;
+ c1  
+-----
+ 101
+ 102
+ 103
+ 104
+ 105
+ 106
+ 107
+ 108
+ 109
+ 110
+(10 rows)
+
+-- ANTI JOIN, not pushed down
+EXPLAIN (COSTS false, VERBOSE)
+SELECT t1.c1 FROM ft1 t1 WHERE NOT EXISTS (SELECT 1 FROM ft2 t2 WHERE t1.c1 = t2.c2) ORDER BY t1.c1 OFFSET 100 LIMIT 10;
+                              QUERY PLAN                              
+----------------------------------------------------------------------
+ Limit
+   Output: t1.c1
+   ->  Sort
+         Output: t1.c1
+         Sort Key: t1.c1
+         ->  Hash Anti Join
+               Output: t1.c1
+               Hash Cond: (t1.c1 = t2.c2)
+               ->  Foreign Scan on public.ft1 t1
+                     Output: t1.c1
+                     Remote SQL: SELECT "C 1" a10 FROM "S 1"."T 1"
+               ->  Hash
+                     Output: t2.c2
+                     ->  Foreign Scan on public.ft2 t2
+                           Output: t2.c2
+                           Remote SQL: SELECT c2 a10 FROM "S 1"."T 1"
+(16 rows)
+
+SELECT t1.c1 FROM ft1 t1 WHERE NOT EXISTS (SELECT 1 FROM ft2 t2 WHERE t1.c1 = t2.c2) ORDER BY t1.c1 OFFSET 100 LIMIT 10;
+ c1  
+-----
+ 110
+ 111
+ 112
+ 113
+ 114
+ 115
+ 116
+ 117
+ 118
+ 119
+(10 rows)
+
+-- CROSS JOIN, not pushed down
+EXPLAIN (COSTS false, VERBOSE)
+SELECT t1.c1, t2.c1 FROM ft1 t1 CROSS JOIN ft2 t2 ORDER BY t1.c1, t2.c1 OFFSET 100 LIMIT 10;
+                               QUERY PLAN                               
+------------------------------------------------------------------------
+ Limit
+   Output: t1.c1, t2.c1
+   ->  Sort
+         Output: t1.c1, t2.c1
+         Sort Key: t1.c1, t2.c1
+         ->  Nested Loop
+               Output: t1.c1, t2.c1
+               ->  Foreign Scan on public.ft1 t1
+                     Output: t1.c1
+                     Remote SQL: SELECT "C 1" a10 FROM "S 1"."T 1"
+               ->  Materialize
+                     Output: t2.c1
+                     ->  Foreign Scan on public.ft2 t2
+                           Output: t2.c1
+                           Remote SQL: SELECT "C 1" a9 FROM "S 1"."T 1"
+(15 rows)
+
+SELECT t1.c1, t2.c1 FROM ft1 t1 CROSS JOIN ft2 t2 ORDER BY t1.c1, t2.c1 OFFSET 100 LIMIT 10;
+ c1 | c1  
+----+-----
+  1 | 101
+  1 | 102
+  1 | 103
+  1 | 104
+  1 | 105
+  1 | 106
+  1 | 107
+  1 | 108
+  1 | 109
+  1 | 110
+(10 rows)
+
+-- different server
+EXPLAIN (COSTS false, VERBOSE)
+SELECT t1.c1, t2.c1 FROM ft5 t1 JOIN ft6 t2 ON (t1.c1 = t2.c1) ORDER BY t1.c1, t2.c1 OFFSET 100 LIMIT 10;
+                          QUERY PLAN                           
+---------------------------------------------------------------
+ Limit
+   Output: t1.c1, t2.c1
+   ->  Merge Join
+         Output: t1.c1, t2.c1
+         Merge Cond: (t1.c1 = t2.c1)
+         ->  Sort
+               Output: t1.c1
+               Sort Key: t1.c1
+               ->  Foreign Scan on public.ft5 t1
+                     Output: t1.c1
+                     Remote SQL: SELECT c1 a9 FROM "S 1"."T 4"
+         ->  Sort
+               Output: t2.c1
+               Sort Key: t2.c1
+               ->  Foreign Scan on public.ft6 t2
+                     Output: t2.c1
+                     Remote SQL: SELECT c1 a9 FROM "S 1"."T 4"
+(17 rows)
+
+SELECT t1.c1, t2.c1 FROM ft5 t1 JOIN ft6 t2 ON (t1.c1 = t2.c1) ORDER BY t1.c1, t2.c1 OFFSET 100 LIMIT 10;
+ c1 | c1 
+----+----
+(0 rows)
+
+-- different effective user for permission check
+EXPLAIN (COSTS false, VERBOSE)
+SELECT t1.c1, t2.c1 FROM ft5 t1 JOIN v_ft5 t2 ON (t1.c1 = t2.c1) ORDER BY t1.c1, t2.c1 OFFSET 100 LIMIT 10;
+                          QUERY PLAN                           
+---------------------------------------------------------------
+ Limit
+   Output: t1.c1, ft5.c1
+   ->  Merge Join
+         Output: t1.c1, ft5.c1
+         Merge Cond: (t1.c1 = ft5.c1)
+         ->  Sort
+               Output: t1.c1
+               Sort Key: t1.c1
+               ->  Foreign Scan on public.ft5 t1
+                     Output: t1.c1
+                     Remote SQL: SELECT c1 a9 FROM "S 1"."T 4"
+         ->  Sort
+               Output: ft5.c1
+               Sort Key: ft5.c1
+               ->  Foreign Scan on public.ft5
+                     Output: ft5.c1
+                     Remote SQL: SELECT c1 a9 FROM "S 1"."T 4"
+(17 rows)
+
+SELECT t1.c1, t2.c1 FROM ft5 t1 JOIN v_ft5 t2 ON (t1.c1 = t2.c1) ORDER BY t1.c1, t2.c1 OFFSET 100 LIMIT 10;
+ c1 | c1 
+----+----
+(0 rows)
+
+-- unsafe join conditions
+EXPLAIN (COSTS false, VERBOSE)
+SELECT t1.c1, t2.c1 FROM ft1 t1 JOIN ft2 t2 ON (t1.c8 = t2.c8) ORDER BY t1.c3, t1.c1 OFFSET 100 LIMIT 10;
+                                       QUERY PLAN                                        
+-----------------------------------------------------------------------------------------
+ Limit
+   Output: t1.c1, t2.c1, t1.c3
+   ->  Sort
+         Output: t1.c1, t2.c1, t1.c3
+         Sort Key: t1.c3, t1.c1
+         ->  Merge Join
+               Output: t1.c1, t2.c1, t1.c3
+               Merge Cond: (t1.c8 = t2.c8)
+               ->  Sort
+                     Output: t1.c1, t1.c3, t1.c8
+                     Sort Key: t1.c8
+                     ->  Foreign Scan on public.ft1 t1
+                           Output: t1.c1, t1.c3, t1.c8
+                           Remote SQL: SELECT "C 1" a10, c3 a12, c8 a17 FROM "S 1"."T 1"
+               ->  Sort
+                     Output: t2.c1, t2.c8
+                     Sort Key: t2.c8
+                     ->  Foreign Scan on public.ft2 t2
+                           Output: t2.c1, t2.c8
+                           Remote SQL: SELECT "C 1" a9, c8 a17 FROM "S 1"."T 1"
+(20 rows)
+
+SELECT t1.c1, t2.c1 FROM ft1 t1 JOIN ft2 t2 ON (t1.c8 = t2.c8) ORDER BY t1.c3, t1.c1 OFFSET 100 LIMIT 10;
+ c1 | c1  
+----+-----
+  1 | 102
+  1 | 103
+  1 | 104
+  1 | 105
+  1 | 106
+  1 | 107
+  1 | 108
+  1 | 109
+  1 | 110
+  1 |   1
+(10 rows)
+
+-- local filter (unsafe conditions on one side)
+EXPLAIN (COSTS false, VERBOSE)
+SELECT t1.c1, t2.c1 FROM ft1 t1 JOIN ft2 t2 ON (t1.c1 = t2.c1) WHERE t1.c8 = 'foo' ORDER BY t1.c3, t1.c1 OFFSET 100 LIMIT 10;
+                                       QUERY PLAN                                        
+-----------------------------------------------------------------------------------------
+ Limit
+   Output: t1.c1, t2.c1, t1.c3
+   ->  Sort
+         Output: t1.c1, t2.c1, t1.c3
+         Sort Key: t1.c3, t1.c1
+         ->  Hash Join
+               Output: t1.c1, t2.c1, t1.c3
+               Hash Cond: (t2.c1 = t1.c1)
+               ->  Foreign Scan on public.ft2 t2
+                     Output: t2.c1
+                     Remote SQL: SELECT "C 1" a9 FROM "S 1"."T 1"
+               ->  Hash
+                     Output: t1.c1, t1.c3
+                     ->  Foreign Scan on public.ft1 t1
+                           Output: t1.c1, t1.c3
+                           Filter: (t1.c8 = 'foo'::user_enum)
+                           Remote SQL: SELECT "C 1" a10, c3 a12, c8 a17 FROM "S 1"."T 1"
+(17 rows)
+
+SELECT t1.c1, t2.c1 FROM ft1 t1 JOIN ft2 t2 ON (t1.c1 = t2.c1) WHERE t1.c8 = 'foo' ORDER BY t1.c3, t1.c1 OFFSET 100 LIMIT 10;
+ c1  | c1  
+-----+-----
+ 101 | 101
+ 102 | 102
+ 103 | 103
+ 104 | 104
+ 105 | 105
+ 106 | 106
+ 107 | 107
+ 108 | 108
+ 109 | 109
+ 110 | 110
+(10 rows)
+
+-- ===================================================================
 -- parameterized queries
 -- ===================================================================
 -- simple join
 PREPARE st1(int, int) AS SELECT t1.c3, t2.c3 FROM ft1 t1, ft2 t2 WHERE t1.c1 = $1 AND t2.c1 = $2;
 EXPLAIN (VERBOSE, COSTS false) EXECUTE st1(1, 2);
-                             QUERY PLAN                             
---------------------------------------------------------------------
+                               QUERY PLAN                               
+------------------------------------------------------------------------
  Nested Loop
    Output: t1.c3, t2.c3
    ->  Foreign Scan on public.ft1 t1
          Output: t1.c3
-         Remote SQL: SELECT c3 FROM "S 1"."T 1" WHERE (("C 1" = 1))
+         Remote SQL: SELECT c3 a12 FROM "S 1"."T 1" WHERE (("C 1" = 1))
    ->  Foreign Scan on public.ft2 t2
          Output: t2.c3
-         Remote SQL: SELECT c3 FROM "S 1"."T 1" WHERE (("C 1" = 2))
+         Remote SQL: SELECT c3 a12 FROM "S 1"."T 1" WHERE (("C 1" = 2))
 (8 rows)
 
 EXECUTE st1(1, 1);
@@ -683,8 +1282,8 @@ EXECUTE st1(101, 101);
 -- subquery using stable function (can't be sent to remote)
 PREPARE st2(int) AS SELECT * FROM ft1 t1 WHERE t1.c1 < $2 AND t1.c3 IN (SELECT c3 FROM ft2 t2 WHERE c1 > $1 AND date(c4) = '1970-01-17'::date) ORDER BY c1;
 EXPLAIN (VERBOSE, COSTS false) EXECUTE st2(10, 20);
-                                                QUERY PLAN                                                
-----------------------------------------------------------------------------------------------------------
+                                                                QUERY PLAN                                                                
+------------------------------------------------------------------------------------------------------------------------------------------
  Sort
    Output: t1.c1, t1.c2, t1.c3, t1.c4, t1.c5, t1.c6, t1.c7, t1.c8
    Sort Key: t1.c1
@@ -693,13 +1292,13 @@ EXPLAIN (VERBOSE, COSTS false) EXECUTE st2(10, 20);
          Join Filter: (t1.c3 = t2.c3)
          ->  Foreign Scan on public.ft1 t1
                Output: t1.c1, t1.c2, t1.c3, t1.c4, t1.c5, t1.c6, t1.c7, t1.c8
-               Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" WHERE (("C 1" < 20))
+               Remote SQL: SELECT "C 1" a10, c2 a11, c3 a12, c4 a13, c5 a14, c6 a15, c7 a16, c8 a17 FROM "S 1"."T 1" WHERE (("C 1" < 20))
          ->  Materialize
                Output: t2.c3
                ->  Foreign Scan on public.ft2 t2
                      Output: t2.c3
                      Filter: (date(t2.c4) = '01-17-1970'::date)
-                     Remote SQL: SELECT c3, c4 FROM "S 1"."T 1" WHERE (("C 1" > 10))
+                     Remote SQL: SELECT c3 a12, c4 a13 FROM "S 1"."T 1" WHERE (("C 1" > 10))
 (15 rows)
 
 EXECUTE st2(10, 20);
@@ -717,8 +1316,8 @@ EXECUTE st2(101, 121);
 -- subquery using immutable function (can be sent to remote)
 PREPARE st3(int) AS SELECT * FROM ft1 t1 WHERE t1.c1 < $2 AND t1.c3 IN (SELECT c3 FROM ft2 t2 WHERE c1 > $1 AND date(c5) = '1970-01-17'::date) ORDER BY c1;
 EXPLAIN (VERBOSE, COSTS false) EXECUTE st3(10, 20);
-                                                      QUERY PLAN                                                       
------------------------------------------------------------------------------------------------------------------------
+                                                                QUERY PLAN                                                                
+------------------------------------------------------------------------------------------------------------------------------------------
  Sort
    Output: t1.c1, t1.c2, t1.c3, t1.c4, t1.c5, t1.c6, t1.c7, t1.c8
    Sort Key: t1.c1
@@ -727,12 +1326,12 @@ EXPLAIN (VERBOSE, COSTS false) EXECUTE st3(10, 20);
          Join Filter: (t1.c3 = t2.c3)
          ->  Foreign Scan on public.ft1 t1
                Output: t1.c1, t1.c2, t1.c3, t1.c4, t1.c5, t1.c6, t1.c7, t1.c8
-               Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" WHERE (("C 1" < 20))
+               Remote SQL: SELECT "C 1" a10, c2 a11, c3 a12, c4 a13, c5 a14, c6 a15, c7 a16, c8 a17 FROM "S 1"."T 1" WHERE (("C 1" < 20))
          ->  Materialize
                Output: t2.c3
                ->  Foreign Scan on public.ft2 t2
                      Output: t2.c3
-                     Remote SQL: SELECT c3 FROM "S 1"."T 1" WHERE (("C 1" > 10)) AND ((date(c5) = '1970-01-17'::date))
+                     Remote SQL: SELECT c3 a12 FROM "S 1"."T 1" WHERE (("C 1" > 10)) AND ((date(c5) = '1970-01-17'::date))
 (14 rows)
 
 EXECUTE st3(10, 20);
@@ -749,108 +1348,108 @@ EXECUTE st3(20, 30);
 -- custom plan should be chosen initially
 PREPARE st4(int) AS SELECT * FROM ft1 t1 WHERE t1.c1 = $1;
 EXPLAIN (VERBOSE, COSTS false) EXECUTE st4(1);
-                                         QUERY PLAN                                          
----------------------------------------------------------------------------------------------
+                                                         QUERY PLAN                                                          
+-----------------------------------------------------------------------------------------------------------------------------
  Foreign Scan on public.ft1 t1
    Output: c1, c2, c3, c4, c5, c6, c7, c8
-   Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" WHERE (("C 1" = 1))
+   Remote SQL: SELECT "C 1" a10, c2 a11, c3 a12, c4 a13, c5 a14, c6 a15, c7 a16, c8 a17 FROM "S 1"."T 1" WHERE (("C 1" = 1))
 (3 rows)
 
 EXPLAIN (VERBOSE, COSTS false) EXECUTE st4(1);
-                                         QUERY PLAN                                          
----------------------------------------------------------------------------------------------
+                                                         QUERY PLAN                                                          
+-----------------------------------------------------------------------------------------------------------------------------
  Foreign Scan on public.ft1 t1
    Output: c1, c2, c3, c4, c5, c6, c7, c8
-   Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" WHERE (("C 1" = 1))
+   Remote SQL: SELECT "C 1" a10, c2 a11, c3 a12, c4 a13, c5 a14, c6 a15, c7 a16, c8 a17 FROM "S 1"."T 1" WHERE (("C 1" = 1))
 (3 rows)
 
 EXPLAIN (VERBOSE, COSTS false) EXECUTE st4(1);
-                                         QUERY PLAN                                          
----------------------------------------------------------------------------------------------
+                                                         QUERY PLAN                                                          
+-----------------------------------------------------------------------------------------------------------------------------
  Foreign Scan on public.ft1 t1
    Output: c1, c2, c3, c4, c5, c6, c7, c8
-   Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" WHERE (("C 1" = 1))
+   Remote SQL: SELECT "C 1" a10, c2 a11, c3 a12, c4 a13, c5 a14, c6 a15, c7 a16, c8 a17 FROM "S 1"."T 1" WHERE (("C 1" = 1))
 (3 rows)
 
 EXPLAIN (VERBOSE, COSTS false) EXECUTE st4(1);
-                                         QUERY PLAN                                          
----------------------------------------------------------------------------------------------
+                                                         QUERY PLAN                                                          
+-----------------------------------------------------------------------------------------------------------------------------
  Foreign Scan on public.ft1 t1
    Output: c1, c2, c3, c4, c5, c6, c7, c8
-   Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" WHERE (("C 1" = 1))
+   Remote SQL: SELECT "C 1" a10, c2 a11, c3 a12, c4 a13, c5 a14, c6 a15, c7 a16, c8 a17 FROM "S 1"."T 1" WHERE (("C 1" = 1))
 (3 rows)
 
 EXPLAIN (VERBOSE, COSTS false) EXECUTE st4(1);
-                                         QUERY PLAN                                          
----------------------------------------------------------------------------------------------
+                                                         QUERY PLAN                                                          
+-----------------------------------------------------------------------------------------------------------------------------
  Foreign Scan on public.ft1 t1
    Output: c1, c2, c3, c4, c5, c6, c7, c8
-   Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" WHERE (("C 1" = 1))
+   Remote SQL: SELECT "C 1" a10, c2 a11, c3 a12, c4 a13, c5 a14, c6 a15, c7 a16, c8 a17 FROM "S 1"."T 1" WHERE (("C 1" = 1))
 (3 rows)
 
 -- once we try it enough times, should switch to generic plan
 EXPLAIN (VERBOSE, COSTS false) EXECUTE st4(1);
-                                              QUERY PLAN                                               
--------------------------------------------------------------------------------------------------------
+                                                              QUERY PLAN                                                               
+---------------------------------------------------------------------------------------------------------------------------------------
  Foreign Scan on public.ft1 t1
    Output: c1, c2, c3, c4, c5, c6, c7, c8
-   Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" WHERE (("C 1" = $1::integer))
+   Remote SQL: SELECT "C 1" a10, c2 a11, c3 a12, c4 a13, c5 a14, c6 a15, c7 a16, c8 a17 FROM "S 1"."T 1" WHERE (("C 1" = $1::integer))
 (3 rows)
 
 -- value of $1 should not be sent to remote
 PREPARE st5(user_enum,int) AS SELECT * FROM ft1 t1 WHERE c8 = $1 and c1 = $2;
 EXPLAIN (VERBOSE, COSTS false) EXECUTE st5('foo', 1);
-                                         QUERY PLAN                                          
----------------------------------------------------------------------------------------------
+                                                         QUERY PLAN                                                          
+-----------------------------------------------------------------------------------------------------------------------------
  Foreign Scan on public.ft1 t1
    Output: c1, c2, c3, c4, c5, c6, c7, c8
    Filter: (t1.c8 = 'foo'::user_enum)
-   Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" WHERE (("C 1" = 1))
+   Remote SQL: SELECT "C 1" a10, c2 a11, c3 a12, c4 a13, c5 a14, c6 a15, c7 a16, c8 a17 FROM "S 1"."T 1" WHERE (("C 1" = 1))
 (4 rows)
 
 EXPLAIN (VERBOSE, COSTS false) EXECUTE st5('foo', 1);
-                                         QUERY PLAN                                          
----------------------------------------------------------------------------------------------
+                                                         QUERY PLAN                                                          
+-----------------------------------------------------------------------------------------------------------------------------
  Foreign Scan on public.ft1 t1
    Output: c1, c2, c3, c4, c5, c6, c7, c8
    Filter: (t1.c8 = 'foo'::user_enum)
-   Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" WHERE (("C 1" = 1))
+   Remote SQL: SELECT "C 1" a10, c2 a11, c3 a12, c4 a13, c5 a14, c6 a15, c7 a16, c8 a17 FROM "S 1"."T 1" WHERE (("C 1" = 1))
 (4 rows)
 
 EXPLAIN (VERBOSE, COSTS false) EXECUTE st5('foo', 1);
-                                         QUERY PLAN                                          
----------------------------------------------------------------------------------------------
+                                                         QUERY PLAN                                                          
+-----------------------------------------------------------------------------------------------------------------------------
  Foreign Scan on public.ft1 t1
    Output: c1, c2, c3, c4, c5, c6, c7, c8
    Filter: (t1.c8 = 'foo'::user_enum)
-   Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" WHERE (("C 1" = 1))
+   Remote SQL: SELECT "C 1" a10, c2 a11, c3 a12, c4 a13, c5 a14, c6 a15, c7 a16, c8 a17 FROM "S 1"."T 1" WHERE (("C 1" = 1))
 (4 rows)
 
 EXPLAIN (VERBOSE, COSTS false) EXECUTE st5('foo', 1);
-                                         QUERY PLAN                                          
----------------------------------------------------------------------------------------------
+                                                         QUERY PLAN                                                          
+-----------------------------------------------------------------------------------------------------------------------------
  Foreign Scan on public.ft1 t1
    Output: c1, c2, c3, c4, c5, c6, c7, c8
    Filter: (t1.c8 = 'foo'::user_enum)
-   Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" WHERE (("C 1" = 1))
+   Remote SQL: SELECT "C 1" a10, c2 a11, c3 a12, c4 a13, c5 a14, c6 a15, c7 a16, c8 a17 FROM "S 1"."T 1" WHERE (("C 1" = 1))
 (4 rows)
 
 EXPLAIN (VERBOSE, COSTS false) EXECUTE st5('foo', 1);
-                                         QUERY PLAN                                          
----------------------------------------------------------------------------------------------
+                                                         QUERY PLAN                                                          
+-----------------------------------------------------------------------------------------------------------------------------
  Foreign Scan on public.ft1 t1
    Output: c1, c2, c3, c4, c5, c6, c7, c8
    Filter: (t1.c8 = 'foo'::user_enum)
-   Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" WHERE (("C 1" = 1))
+   Remote SQL: SELECT "C 1" a10, c2 a11, c3 a12, c4 a13, c5 a14, c6 a15, c7 a16, c8 a17 FROM "S 1"."T 1" WHERE (("C 1" = 1))
 (4 rows)
 
 EXPLAIN (VERBOSE, COSTS false) EXECUTE st5('foo', 1);
-                                              QUERY PLAN                                               
--------------------------------------------------------------------------------------------------------
+                                                              QUERY PLAN                                                               
+---------------------------------------------------------------------------------------------------------------------------------------
  Foreign Scan on public.ft1 t1
    Output: c1, c2, c3, c4, c5, c6, c7, c8
    Filter: (t1.c8 = $1)
-   Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" WHERE (("C 1" = $1::integer))
+   Remote SQL: SELECT "C 1" a10, c2 a11, c3 a12, c4 a13, c5 a14, c6 a15, c7 a16, c8 a17 FROM "S 1"."T 1" WHERE (("C 1" = $1::integer))
 (4 rows)
 
 EXECUTE st5('foo', 1);
@@ -868,14 +1467,14 @@ DEALLOCATE st5;
 -- System columns, except ctid, should not be sent to remote
 EXPLAIN (VERBOSE, COSTS false)
 SELECT * FROM ft1 t1 WHERE t1.tableoid = 'pg_class'::regclass LIMIT 1;
-                                  QUERY PLAN                                   
--------------------------------------------------------------------------------
+                                                  QUERY PLAN                                                   
+---------------------------------------------------------------------------------------------------------------
  Limit
    Output: c1, c2, c3, c4, c5, c6, c7, c8
    ->  Foreign Scan on public.ft1 t1
          Output: c1, c2, c3, c4, c5, c6, c7, c8
          Filter: (t1.tableoid = '1259'::oid)
-         Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1"
+         Remote SQL: SELECT "C 1" a10, c2 a11, c3 a12, c4 a13, c5 a14, c6 a15, c7 a16, c8 a17 FROM "S 1"."T 1"
 (6 rows)
 
 SELECT * FROM ft1 t1 WHERE t1.tableoid = 'ft1'::regclass LIMIT 1;
@@ -886,13 +1485,13 @@ SELECT * FROM ft1 t1 WHERE t1.tableoid = 'ft1'::regclass LIMIT 1;
 
 EXPLAIN (VERBOSE, COSTS false)
 SELECT tableoid::regclass, * FROM ft1 t1 LIMIT 1;
-                                  QUERY PLAN                                   
--------------------------------------------------------------------------------
+                                                  QUERY PLAN                                                   
+---------------------------------------------------------------------------------------------------------------
  Limit
    Output: ((tableoid)::regclass), c1, c2, c3, c4, c5, c6, c7, c8
    ->  Foreign Scan on public.ft1 t1
          Output: (tableoid)::regclass, c1, c2, c3, c4, c5, c6, c7, c8
-         Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1"
+         Remote SQL: SELECT "C 1" a10, c2 a11, c3 a12, c4 a13, c5 a14, c6 a15, c7 a16, c8 a17 FROM "S 1"."T 1"
 (5 rows)
 
 SELECT tableoid::regclass, * FROM ft1 t1 LIMIT 1;
@@ -903,11 +1502,11 @@ SELECT tableoid::regclass, * FROM ft1 t1 LIMIT 1;
 
 EXPLAIN (VERBOSE, COSTS false)
 SELECT * FROM ft1 t1 WHERE t1.ctid = '(0,2)';
-                                              QUERY PLAN                                               
--------------------------------------------------------------------------------------------------------
+                                                              QUERY PLAN                                                               
+---------------------------------------------------------------------------------------------------------------------------------------
  Foreign Scan on public.ft1 t1
    Output: c1, c2, c3, c4, c5, c6, c7, c8
-   Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" WHERE ((ctid = '(0,2)'::tid))
+   Remote SQL: SELECT "C 1" a10, c2 a11, c3 a12, c4 a13, c5 a14, c6 a15, c7 a16, c8 a17 FROM "S 1"."T 1" WHERE ((ctid = '(0,2)'::tid))
 (3 rows)
 
 SELECT * FROM ft1 t1 WHERE t1.ctid = '(0,2)';
@@ -918,13 +1517,13 @@ SELECT * FROM ft1 t1 WHERE t1.ctid = '(0,2)';
 
 EXPLAIN (VERBOSE, COSTS false)
 SELECT ctid, * FROM ft1 t1 LIMIT 1;
-                                     QUERY PLAN                                      
--------------------------------------------------------------------------------------
+                                                       QUERY PLAN                                                       
+------------------------------------------------------------------------------------------------------------------------
  Limit
    Output: ctid, c1, c2, c3, c4, c5, c6, c7, c8
    ->  Foreign Scan on public.ft1 t1
          Output: ctid, c1, c2, c3, c4, c5, c6, c7, c8
-         Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8, ctid FROM "S 1"."T 1"
+         Remote SQL: SELECT "C 1" a10, c2 a11, c3 a12, c4 a13, c5 a14, c6 a15, c7 a16, c8 a17, ctid a7 FROM "S 1"."T 1"
 (5 rows)
 
 SELECT ctid, * FROM ft1 t1 LIMIT 1;
@@ -987,7 +1586,7 @@ FETCH c;
 SAVEPOINT s;
 SELECT * FROM ft1 WHERE 1 / (c1 - 1) > 0;  -- ERROR
 ERROR:  division by zero
-CONTEXT:  Remote SQL command: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" WHERE (((1 / ("C 1" - 1)) > 0))
+CONTEXT:  Remote SQL command: SELECT "C 1" a10, c2 a11, c3 a12, c4 a13, c5 a14, c6 a15, c7 a16, c8 a17 FROM "S 1"."T 1" WHERE (((1 / ("C 1" - 1)) > 0))
 ROLLBACK TO s;
 FETCH c;
  c1 | c2 |  c3   |              c4              |            c5            | c6 |     c7     | c8  
@@ -1010,64 +1609,64 @@ create foreign table ft3 (f1 text collate "C", f2 text)
   server loopback options (table_name 'loct3');
 -- can be sent to remote
 explain (verbose, costs off) select * from ft3 where f1 = 'foo';
-                                QUERY PLAN                                
---------------------------------------------------------------------------
+                                   QUERY PLAN                                    
+---------------------------------------------------------------------------------
  Foreign Scan on public.ft3
    Output: f1, f2
-   Remote SQL: SELECT f1, f2 FROM public.loct3 WHERE ((f1 = 'foo'::text))
+   Remote SQL: SELECT f1 a9, f2 a10 FROM public.loct3 WHERE ((f1 = 'foo'::text))
 (3 rows)
 
 explain (verbose, costs off) select * from ft3 where f1 COLLATE "C" = 'foo';
-                                QUERY PLAN                                
---------------------------------------------------------------------------
+                                   QUERY PLAN                                    
+---------------------------------------------------------------------------------
  Foreign Scan on public.ft3
    Output: f1, f2
-   Remote SQL: SELECT f1, f2 FROM public.loct3 WHERE ((f1 = 'foo'::text))
+   Remote SQL: SELECT f1 a9, f2 a10 FROM public.loct3 WHERE ((f1 = 'foo'::text))
 (3 rows)
 
 explain (verbose, costs off) select * from ft3 where f2 = 'foo';
-                                QUERY PLAN                                
---------------------------------------------------------------------------
+                                   QUERY PLAN                                    
+---------------------------------------------------------------------------------
  Foreign Scan on public.ft3
    Output: f1, f2
-   Remote SQL: SELECT f1, f2 FROM public.loct3 WHERE ((f2 = 'foo'::text))
+   Remote SQL: SELECT f1 a9, f2 a10 FROM public.loct3 WHERE ((f2 = 'foo'::text))
 (3 rows)
 
 -- can't be sent to remote
 explain (verbose, costs off) select * from ft3 where f1 COLLATE "POSIX" = 'foo';
-                  QUERY PLAN                   
------------------------------------------------
+                      QUERY PLAN                      
+------------------------------------------------------
  Foreign Scan on public.ft3
    Output: f1, f2
    Filter: ((ft3.f1)::text = 'foo'::text)
-   Remote SQL: SELECT f1, f2 FROM public.loct3
+   Remote SQL: SELECT f1 a9, f2 a10 FROM public.loct3
 (4 rows)
 
 explain (verbose, costs off) select * from ft3 where f1 = 'foo' COLLATE "C";
-                  QUERY PLAN                   
------------------------------------------------
+                      QUERY PLAN                      
+------------------------------------------------------
  Foreign Scan on public.ft3
    Output: f1, f2
    Filter: (ft3.f1 = 'foo'::text COLLATE "C")
-   Remote SQL: SELECT f1, f2 FROM public.loct3
+   Remote SQL: SELECT f1 a9, f2 a10 FROM public.loct3
 (4 rows)
 
 explain (verbose, costs off) select * from ft3 where f2 COLLATE "C" = 'foo';
-                  QUERY PLAN                   
------------------------------------------------
+                      QUERY PLAN                      
+------------------------------------------------------
  Foreign Scan on public.ft3
    Output: f1, f2
    Filter: ((ft3.f2)::text = 'foo'::text)
-   Remote SQL: SELECT f1, f2 FROM public.loct3
+   Remote SQL: SELECT f1 a9, f2 a10 FROM public.loct3
 (4 rows)
 
 explain (verbose, costs off) select * from ft3 where f2 = 'foo' COLLATE "C";
-                  QUERY PLAN                   
------------------------------------------------
+                      QUERY PLAN                      
+------------------------------------------------------
  Foreign Scan on public.ft3
    Output: f1, f2
    Filter: (ft3.f2 = 'foo'::text COLLATE "C")
-   Remote SQL: SELECT f1, f2 FROM public.loct3
+   Remote SQL: SELECT f1 a9, f2 a10 FROM public.loct3
 (4 rows)
 
 -- ===================================================================
@@ -1085,7 +1684,7 @@ INSERT INTO ft2 (c1,c2,c3) SELECT c1+1000,c2+100, c3 || c3 FROM ft2 LIMIT 20;
                Output: ((ft2_1.c1 + 1000)), ((ft2_1.c2 + 100)), ((ft2_1.c3 || ft2_1.c3))
                ->  Foreign Scan on public.ft2 ft2_1
                      Output: (ft2_1.c1 + 1000), (ft2_1.c2 + 100), (ft2_1.c3 || ft2_1.c3)
-                     Remote SQL: SELECT "C 1", c2, c3 FROM "S 1"."T 1"
+                     Remote SQL: SELECT "C 1" a9, c2 a10, c3 a12 FROM "S 1"."T 1"
 (9 rows)
 
 INSERT INTO ft2 (c1,c2,c3) SELECT c1+1000,c2+100, c3 || c3 FROM ft2 LIMIT 20;
@@ -1210,35 +1809,27 @@ UPDATE ft2 SET c2 = c2 + 400, c3 = c3 || '_update7' WHERE c1 % 10 = 7 RETURNING
 EXPLAIN (verbose, costs off)
 UPDATE ft2 SET c2 = ft2.c2 + 500, c3 = ft2.c3 || '_update9', c7 = DEFAULT
   FROM ft1 WHERE ft1.c1 = ft2.c2 AND ft1.c1 % 10 = 9;
-                                                                            QUERY PLAN                                                                             
--------------------------------------------------------------------------------------------------------------------------------------------------------------------
+                                                                                                                                                                                                                                                                       QUERY PLAN                                                                                                                                                                                                                                                                       
+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
  Update on public.ft2
    Remote SQL: UPDATE "S 1"."T 1" SET c2 = $2, c3 = $3, c7 = $4 WHERE ctid = $1
-   ->  Hash Join
+   ->  Foreign Scan
          Output: ft2.c1, (ft2.c2 + 500), NULL::integer, (ft2.c3 || '_update9'::text), ft2.c4, ft2.c5, ft2.c6, 'ft2       '::character(10), ft2.c8, ft2.ctid, ft1.*
-         Hash Cond: (ft2.c2 = ft1.c1)
-         ->  Foreign Scan on public.ft2
-               Output: ft2.c1, ft2.c2, ft2.c3, ft2.c4, ft2.c5, ft2.c6, ft2.c8, ft2.ctid
-               Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c8, ctid FROM "S 1"."T 1" FOR UPDATE
-         ->  Hash
-               Output: ft1.*, ft1.c1
-               ->  Foreign Scan on public.ft1
-                     Output: ft1.*, ft1.c1
-                     Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" WHERE ((("C 1" % 10) = 9))
-(13 rows)
+         Remote SQL: SELECT l.a1, l.a2, l.a3, l.a4, l.a5, l.a6, l.a7, l.a8, r.a1 FROM (SELECT l.a9, l.a10, l.a12, l.a13, l.a14, l.a15, l.a17, l.a7 FROM (SELECT "C 1" a9, c2 a10, c3 a12, c4 a13, c5 a14, c6 a15, c8 a17, ctid a7 FROM "S 1"."T 1" FOR UPDATE) l) l (a1, a2, a3, a4, a5, a6, a7, a8) INNER JOIN (SELECT ROW(r.a10, r.a11, r.a12, r.a13, r.a14, r.a15, r.a16, r.a17), r.a10 FROM (SELECT "C 1" a10, c2 a11, c3 a12, c4 a13, c5 a14, c6 a15, c7 a16, c8 a17 FROM "S 1"."T 1" WHERE ((("C 1" % 10) = 9))) r) r (a1, a2) ON ((l.a2 = r.a2))
+(5 rows)
 
 UPDATE ft2 SET c2 = ft2.c2 + 500, c3 = ft2.c3 || '_update9', c7 = DEFAULT
   FROM ft1 WHERE ft1.c1 = ft2.c2 AND ft1.c1 % 10 = 9;
 EXPLAIN (verbose, costs off)
   DELETE FROM ft2 WHERE c1 % 10 = 5 RETURNING c1, c4;
-                                       QUERY PLAN                                       
-----------------------------------------------------------------------------------------
+                                        QUERY PLAN                                         
+-------------------------------------------------------------------------------------------
  Delete on public.ft2
    Output: c1, c4
-   Remote SQL: DELETE FROM "S 1"."T 1" WHERE ctid = $1 RETURNING "C 1", c4
+   Remote SQL: DELETE FROM "S 1"."T 1" WHERE ctid = $1 RETURNING "C 1" a9, c4 a13
    ->  Foreign Scan on public.ft2
          Output: ctid
-         Remote SQL: SELECT ctid FROM "S 1"."T 1" WHERE ((("C 1" % 10) = 5)) FOR UPDATE
+         Remote SQL: SELECT ctid a7 FROM "S 1"."T 1" WHERE ((("C 1" % 10) = 5)) FOR UPDATE
 (6 rows)
 
 DELETE FROM ft2 WHERE c1 % 10 = 5 RETURNING c1, c4;
@@ -1351,22 +1942,14 @@ DELETE FROM ft2 WHERE c1 % 10 = 5 RETURNING c1, c4;
 
 EXPLAIN (verbose, costs off)
 DELETE FROM ft2 USING ft1 WHERE ft1.c1 = ft2.c2 AND ft1.c1 % 10 = 2;
-                                                      QUERY PLAN                                                      
-----------------------------------------------------------------------------------------------------------------------
+                                                                                                                                                                                        QUERY PLAN                                                                                                                                                                                         
+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
  Delete on public.ft2
    Remote SQL: DELETE FROM "S 1"."T 1" WHERE ctid = $1
-   ->  Hash Join
+   ->  Foreign Scan
          Output: ft2.ctid, ft1.*
-         Hash Cond: (ft2.c2 = ft1.c1)
-         ->  Foreign Scan on public.ft2
-               Output: ft2.ctid, ft2.c2
-               Remote SQL: SELECT c2, ctid FROM "S 1"."T 1" FOR UPDATE
-         ->  Hash
-               Output: ft1.*, ft1.c1
-               ->  Foreign Scan on public.ft1
-                     Output: ft1.*, ft1.c1
-                     Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" WHERE ((("C 1" % 10) = 2))
-(13 rows)
+         Remote SQL: SELECT l.a1, r.a1 FROM (SELECT l.a7, l.a10 FROM (SELECT c2 a10, ctid a7 FROM "S 1"."T 1" FOR UPDATE) l) l (a1, a2) INNER JOIN (SELECT ROW(r.a10, r.a11, r.a12, r.a13, r.a14, r.a15, r.a16, r.a17), r.a10 FROM (SELECT "C 1" a10, c2 a11, c3 a12, c4 a13, c5 a14, c6 a15, c7 a16, c8 a17 FROM "S 1"."T 1" WHERE ((("C 1" % 10) = 2))) r) r (a1, a2) ON ((l.a2 = r.a2))
+(5 rows)
 
 DELETE FROM ft2 USING ft1 WHERE ft1.c1 = ft2.c2 AND ft1.c1 % 10 = 2;
 SELECT c1,c2,c3,c4 FROM ft2 ORDER BY c1;
@@ -3027,386 +3610,6 @@ NOTICE:  NEW: (13,"test triggered !")
 (1 row)
 
 -- ===================================================================
--- test inheritance features
--- ===================================================================
-CREATE TABLE a (aa TEXT);
-CREATE TABLE loct (aa TEXT, bb TEXT);
-CREATE FOREIGN TABLE b (bb TEXT) INHERITS (a)
-  SERVER loopback OPTIONS (table_name 'loct');
-INSERT INTO a(aa) VALUES('aaa');
-INSERT INTO a(aa) VALUES('aaaa');
-INSERT INTO a(aa) VALUES('aaaaa');
-INSERT INTO b(aa) VALUES('bbb');
-INSERT INTO b(aa) VALUES('bbbb');
-INSERT INTO b(aa) VALUES('bbbbb');
-SELECT tableoid::regclass, * FROM a;
- tableoid |  aa   
-----------+-------
- a        | aaa
- a        | aaaa
- a        | aaaaa
- b        | bbb
- b        | bbbb
- b        | bbbbb
-(6 rows)
-
-SELECT tableoid::regclass, * FROM b;
- tableoid |  aa   | bb 
-----------+-------+----
- b        | bbb   | 
- b        | bbbb  | 
- b        | bbbbb | 
-(3 rows)
-
-SELECT tableoid::regclass, * FROM ONLY a;
- tableoid |  aa   
-----------+-------
- a        | aaa
- a        | aaaa
- a        | aaaaa
-(3 rows)
-
-UPDATE a SET aa = 'zzzzzz' WHERE aa LIKE 'aaaa%';
-SELECT tableoid::regclass, * FROM a;
- tableoid |   aa   
-----------+--------
- a        | aaa
- a        | zzzzzz
- a        | zzzzzz
- b        | bbb
- b        | bbbb
- b        | bbbbb
-(6 rows)
-
-SELECT tableoid::regclass, * FROM b;
- tableoid |  aa   | bb 
-----------+-------+----
- b        | bbb   | 
- b        | bbbb  | 
- b        | bbbbb | 
-(3 rows)
-
-SELECT tableoid::regclass, * FROM ONLY a;
- tableoid |   aa   
-----------+--------
- a        | aaa
- a        | zzzzzz
- a        | zzzzzz
-(3 rows)
-
-UPDATE b SET aa = 'new';
-SELECT tableoid::regclass, * FROM a;
- tableoid |   aa   
-----------+--------
- a        | aaa
- a        | zzzzzz
- a        | zzzzzz
- b        | new
- b        | new
- b        | new
-(6 rows)
-
-SELECT tableoid::regclass, * FROM b;
- tableoid | aa  | bb 
-----------+-----+----
- b        | new | 
- b        | new | 
- b        | new | 
-(3 rows)
-
-SELECT tableoid::regclass, * FROM ONLY a;
- tableoid |   aa   
-----------+--------
- a        | aaa
- a        | zzzzzz
- a        | zzzzzz
-(3 rows)
-
-UPDATE a SET aa = 'newtoo';
-SELECT tableoid::regclass, * FROM a;
- tableoid |   aa   
-----------+--------
- a        | newtoo
- a        | newtoo
- a        | newtoo
- b        | newtoo
- b        | newtoo
- b        | newtoo
-(6 rows)
-
-SELECT tableoid::regclass, * FROM b;
- tableoid |   aa   | bb 
-----------+--------+----
- b        | newtoo | 
- b        | newtoo | 
- b        | newtoo | 
-(3 rows)
-
-SELECT tableoid::regclass, * FROM ONLY a;
- tableoid |   aa   
-----------+--------
- a        | newtoo
- a        | newtoo
- a        | newtoo
-(3 rows)
-
-DELETE FROM a;
-SELECT tableoid::regclass, * FROM a;
- tableoid | aa 
-----------+----
-(0 rows)
-
-SELECT tableoid::regclass, * FROM b;
- tableoid | aa | bb 
-----------+----+----
-(0 rows)
-
-SELECT tableoid::regclass, * FROM ONLY a;
- tableoid | aa 
-----------+----
-(0 rows)
-
-DROP TABLE a CASCADE;
-NOTICE:  drop cascades to foreign table b
-DROP TABLE loct;
--- Check SELECT FOR UPDATE/SHARE with an inherited source table
-create table loct1 (f1 int, f2 int, f3 int);
-create table loct2 (f1 int, f2 int, f3 int);
-create table foo (f1 int, f2 int);
-create foreign table foo2 (f3 int) inherits (foo)
-  server loopback options (table_name 'loct1');
-create table bar (f1 int, f2 int);
-create foreign table bar2 (f3 int) inherits (bar)
-  server loopback options (table_name 'loct2');
-insert into foo values(1,1);
-insert into foo values(3,3);
-insert into foo2 values(2,2,2);
-insert into foo2 values(4,4,4);
-insert into bar values(1,11);
-insert into bar values(2,22);
-insert into bar values(6,66);
-insert into bar2 values(3,33,33);
-insert into bar2 values(4,44,44);
-insert into bar2 values(7,77,77);
-explain (verbose, costs off)
-select * from bar where f1 in (select f1 from foo) for update;
-                                          QUERY PLAN                                          
-----------------------------------------------------------------------------------------------
- LockRows
-   Output: bar.f1, bar.f2, bar.ctid, bar.tableoid, bar.*, foo.ctid, foo.tableoid, foo.*
-   ->  Hash Join
-         Output: bar.f1, bar.f2, bar.ctid, bar.tableoid, bar.*, foo.ctid, foo.tableoid, foo.*
-         Hash Cond: (bar.f1 = foo.f1)
-         ->  Append
-               ->  Seq Scan on public.bar
-                     Output: bar.f1, bar.f2, bar.ctid, bar.tableoid, bar.*
-               ->  Foreign Scan on public.bar2
-                     Output: bar2.f1, bar2.f2, bar2.ctid, bar2.tableoid, bar2.*
-                     Remote SQL: SELECT f1, f2, f3, ctid FROM public.loct2 FOR UPDATE
-         ->  Hash
-               Output: foo.ctid, foo.tableoid, foo.*, foo.f1
-               ->  HashAggregate
-                     Output: foo.ctid, foo.tableoid, foo.*, foo.f1
-                     Group Key: foo.f1
-                     ->  Append
-                           ->  Seq Scan on public.foo
-                                 Output: foo.ctid, foo.tableoid, foo.*, foo.f1
-                           ->  Foreign Scan on public.foo2
-                                 Output: foo2.ctid, foo2.tableoid, foo2.*, foo2.f1
-                                 Remote SQL: SELECT f1, f2, f3, ctid FROM public.loct1
-(22 rows)
-
-select * from bar where f1 in (select f1 from foo) for update;
- f1 | f2 
-----+----
-  1 | 11
-  2 | 22
-  3 | 33
-  4 | 44
-(4 rows)
-
-explain (verbose, costs off)
-select * from bar where f1 in (select f1 from foo) for share;
-                                          QUERY PLAN                                          
-----------------------------------------------------------------------------------------------
- LockRows
-   Output: bar.f1, bar.f2, bar.ctid, bar.tableoid, bar.*, foo.ctid, foo.tableoid, foo.*
-   ->  Hash Join
-         Output: bar.f1, bar.f2, bar.ctid, bar.tableoid, bar.*, foo.ctid, foo.tableoid, foo.*
-         Hash Cond: (bar.f1 = foo.f1)
-         ->  Append
-               ->  Seq Scan on public.bar
-                     Output: bar.f1, bar.f2, bar.ctid, bar.tableoid, bar.*
-               ->  Foreign Scan on public.bar2
-                     Output: bar2.f1, bar2.f2, bar2.ctid, bar2.tableoid, bar2.*
-                     Remote SQL: SELECT f1, f2, f3, ctid FROM public.loct2 FOR SHARE
-         ->  Hash
-               Output: foo.ctid, foo.tableoid, foo.*, foo.f1
-               ->  HashAggregate
-                     Output: foo.ctid, foo.tableoid, foo.*, foo.f1
-                     Group Key: foo.f1
-                     ->  Append
-                           ->  Seq Scan on public.foo
-                                 Output: foo.ctid, foo.tableoid, foo.*, foo.f1
-                           ->  Foreign Scan on public.foo2
-                                 Output: foo2.ctid, foo2.tableoid, foo2.*, foo2.f1
-                                 Remote SQL: SELECT f1, f2, f3, ctid FROM public.loct1
-(22 rows)
-
-select * from bar where f1 in (select f1 from foo) for share;
- f1 | f2 
-----+----
-  1 | 11
-  2 | 22
-  3 | 33
-  4 | 44
-(4 rows)
-
--- Check UPDATE with inherited target and an inherited source table
-explain (verbose, costs off)
-update bar set f2 = f2 + 100 where f1 in (select f1 from foo);
-                                         QUERY PLAN                                          
----------------------------------------------------------------------------------------------
- Update on public.bar
-   Update on public.bar
-   Foreign Update on public.bar2
-     Remote SQL: UPDATE public.loct2 SET f2 = $2 WHERE ctid = $1
-   ->  Hash Join
-         Output: bar.f1, (bar.f2 + 100), bar.ctid, foo.ctid, foo.tableoid, foo.*
-         Hash Cond: (bar.f1 = foo.f1)
-         ->  Seq Scan on public.bar
-               Output: bar.f1, bar.f2, bar.ctid
-         ->  Hash
-               Output: foo.ctid, foo.tableoid, foo.*, foo.f1
-               ->  HashAggregate
-                     Output: foo.ctid, foo.tableoid, foo.*, foo.f1
-                     Group Key: foo.f1
-                     ->  Append
-                           ->  Seq Scan on public.foo
-                                 Output: foo.ctid, foo.tableoid, foo.*, foo.f1
-                           ->  Foreign Scan on public.foo2
-                                 Output: foo2.ctid, foo2.tableoid, foo2.*, foo2.f1
-                                 Remote SQL: SELECT f1, f2, f3, ctid FROM public.loct1
-   ->  Hash Join
-         Output: bar2.f1, (bar2.f2 + 100), bar2.f3, bar2.ctid, foo.ctid, foo.tableoid, foo.*
-         Hash Cond: (bar2.f1 = foo.f1)
-         ->  Foreign Scan on public.bar2
-               Output: bar2.f1, bar2.f2, bar2.f3, bar2.ctid
-               Remote SQL: SELECT f1, f2, f3, ctid FROM public.loct2 FOR UPDATE
-         ->  Hash
-               Output: foo.ctid, foo.tableoid, foo.*, foo.f1
-               ->  HashAggregate
-                     Output: foo.ctid, foo.tableoid, foo.*, foo.f1
-                     Group Key: foo.f1
-                     ->  Append
-                           ->  Seq Scan on public.foo
-                                 Output: foo.ctid, foo.tableoid, foo.*, foo.f1
-                           ->  Foreign Scan on public.foo2
-                                 Output: foo2.ctid, foo2.tableoid, foo2.*, foo2.f1
-                                 Remote SQL: SELECT f1, f2, f3, ctid FROM public.loct1
-(37 rows)
-
-update bar set f2 = f2 + 100 where f1 in (select f1 from foo);
-select tableoid::regclass, * from bar order by 1,2;
- tableoid | f1 | f2  
-----------+----+-----
- bar      |  1 | 111
- bar      |  2 | 122
- bar      |  6 |  66
- bar2     |  3 | 133
- bar2     |  4 | 144
- bar2     |  7 |  77
-(6 rows)
-
--- Check UPDATE with inherited target and an appendrel subquery
-explain (verbose, costs off)
-update bar set f2 = f2 + 100
-from
-  ( select f1 from foo union all select f1+3 from foo ) ss
-where bar.f1 = ss.f1;
-                                      QUERY PLAN                                      
---------------------------------------------------------------------------------------
- Update on public.bar
-   Update on public.bar
-   Foreign Update on public.bar2
-     Remote SQL: UPDATE public.loct2 SET f2 = $2 WHERE ctid = $1
-   ->  Hash Join
-         Output: bar.f1, (bar.f2 + 100), bar.ctid, (ROW(foo.f1))
-         Hash Cond: (foo.f1 = bar.f1)
-         ->  Append
-               ->  Seq Scan on public.foo
-                     Output: ROW(foo.f1), foo.f1
-               ->  Foreign Scan on public.foo2
-                     Output: ROW(foo2.f1), foo2.f1
-                     Remote SQL: SELECT f1 FROM public.loct1
-               ->  Seq Scan on public.foo foo_1
-                     Output: ROW((foo_1.f1 + 3)), (foo_1.f1 + 3)
-               ->  Foreign Scan on public.foo2 foo2_1
-                     Output: ROW((foo2_1.f1 + 3)), (foo2_1.f1 + 3)
-                     Remote SQL: SELECT f1 FROM public.loct1
-         ->  Hash
-               Output: bar.f1, bar.f2, bar.ctid
-               ->  Seq Scan on public.bar
-                     Output: bar.f1, bar.f2, bar.ctid
-   ->  Merge Join
-         Output: bar2.f1, (bar2.f2 + 100), bar2.f3, bar2.ctid, (ROW(foo.f1))
-         Merge Cond: (bar2.f1 = foo.f1)
-         ->  Sort
-               Output: bar2.f1, bar2.f2, bar2.f3, bar2.ctid
-               Sort Key: bar2.f1
-               ->  Foreign Scan on public.bar2
-                     Output: bar2.f1, bar2.f2, bar2.f3, bar2.ctid
-                     Remote SQL: SELECT f1, f2, f3, ctid FROM public.loct2 FOR UPDATE
-         ->  Sort
-               Output: (ROW(foo.f1)), foo.f1
-               Sort Key: foo.f1
-               ->  Append
-                     ->  Seq Scan on public.foo
-                           Output: ROW(foo.f1), foo.f1
-                     ->  Foreign Scan on public.foo2
-                           Output: ROW(foo2.f1), foo2.f1
-                           Remote SQL: SELECT f1 FROM public.loct1
-                     ->  Seq Scan on public.foo foo_1
-                           Output: ROW((foo_1.f1 + 3)), (foo_1.f1 + 3)
-                     ->  Foreign Scan on public.foo2 foo2_1
-                           Output: ROW((foo2_1.f1 + 3)), (foo2_1.f1 + 3)
-                           Remote SQL: SELECT f1 FROM public.loct1
-(45 rows)
-
-update bar set f2 = f2 + 100
-from
-  ( select f1 from foo union all select f1+3 from foo ) ss
-where bar.f1 = ss.f1;
-select tableoid::regclass, * from bar order by 1,2;
- tableoid | f1 | f2  
-----------+----+-----
- bar      |  1 | 211
- bar      |  2 | 222
- bar      |  6 | 166
- bar2     |  3 | 233
- bar2     |  4 | 244
- bar2     |  7 | 177
-(6 rows)
-
--- Test that WHERE CURRENT OF is not supported
-begin;
-declare c cursor for select * from bar where f1 = 7;
-fetch from c;
- f1 | f2  
-----+-----
-  7 | 177
-(1 row)
-
-update bar set f2 = null where current of c;
-ERROR:  WHERE CURRENT OF is not supported for this table type
-rollback;
-drop table foo cascade;
-NOTICE:  drop cascades to foreign table foo2
-drop table bar cascade;
-NOTICE:  drop cascades to foreign table bar2
-drop table loct1;
-drop table loct2;
--- ===================================================================
 -- test IMPORT FOREIGN SCHEMA
 -- ===================================================================
 CREATE SCHEMA import_source;
@@ -3636,3 +3839,6 @@ QUERY:  CREATE FOREIGN TABLE t5 (
 OPTIONS (schema_name 'import_source', table_name 't5');
 CONTEXT:  importing foreign table "t5"
 ROLLBACK;
+-- Cleanup
+DROP OWNED BY view_owner;
+DROP USER view_owner;
diff --git a/contrib/postgres_fdw/postgres_fdw.c b/contrib/postgres_fdw/postgres_fdw.c
index 478e124..5e5ccb7 100644
--- a/contrib/postgres_fdw/postgres_fdw.c
+++ b/contrib/postgres_fdw/postgres_fdw.c
@@ -28,7 +28,6 @@
 #include "optimizer/pathnode.h"
 #include "optimizer/paths.h"
 #include "optimizer/planmain.h"
-#include "optimizer/prep.h"
 #include "optimizer/restrictinfo.h"
 #include "optimizer/var.h"
 #include "parser/parsetree.h"
@@ -47,41 +46,8 @@ PG_MODULE_MAGIC;
 #define DEFAULT_FDW_TUPLE_COST		0.01
 
 /*
- * FDW-specific planner information kept in RelOptInfo.fdw_private for a
- * foreign table.  This information is collected by postgresGetForeignRelSize.
- */
-typedef struct PgFdwRelationInfo
-{
-	/* baserestrictinfo clauses, broken down into safe and unsafe subsets. */
-	List	   *remote_conds;
-	List	   *local_conds;
-
-	/* Bitmap of attr numbers we need to fetch from the remote server. */
-	Bitmapset  *attrs_used;
-
-	/* Cost and selectivity of local_conds. */
-	QualCost	local_conds_cost;
-	Selectivity local_conds_sel;
-
-	/* Estimated size and cost for a scan with baserestrictinfo quals. */
-	double		rows;
-	int			width;
-	Cost		startup_cost;
-	Cost		total_cost;
-
-	/* Options extracted from catalogs. */
-	bool		use_remote_estimate;
-	Cost		fdw_startup_cost;
-	Cost		fdw_tuple_cost;
-
-	/* Cached catalog information. */
-	ForeignTable *table;
-	ForeignServer *server;
-	UserMapping *user;			/* only set in use_remote_estimate mode */
-} PgFdwRelationInfo;
-
-/*
- * Indexes of FDW-private information stored in fdw_private lists.
+ * Indexes of FDW-private information stored in fdw_private of ForeignScan of
+ * a simple foreign table scan for a SELECT statement.
  *
  * We store various information in ForeignScan.fdw_private to pass it from
  * planner to executor.  Currently we store:
@@ -98,7 +64,11 @@ enum FdwScanPrivateIndex
 	/* SQL statement to execute remotely (as a String node) */
 	FdwScanPrivateSelectSql,
 	/* Integer list of attribute numbers retrieved by the SELECT */
-	FdwScanPrivateRetrievedAttrs
+	FdwScanPrivateRetrievedAttrs,
+	/* OID of the foreign server for the scan (as an Integer node) */
+	FdwScanPrivateServerOid,
+	/* OID of the effective user for the scan (as an Integer node) */
+	FdwScanPrivateUserOid,
 };
 
 /*
@@ -128,7 +98,8 @@ enum FdwModifyPrivateIndex
  */
 typedef struct PgFdwScanState
 {
-	Relation	rel;			/* relcache entry for the foreign table */
+	const char *relname;		/* name of relation being scanned */
+	TupleDesc	tupdesc;		/* tuple descriptor of the scan */
 	AttInMetadata *attinmeta;	/* attribute datatype conversion metadata */
 
 	/* extracted fdw_private data */
@@ -194,6 +165,8 @@ typedef struct PgFdwAnalyzeState
 	AttInMetadata *attinmeta;	/* attribute datatype conversion metadata */
 	List	   *retrieved_attrs;	/* attr numbers retrieved by query */
 
+	char	   *query;			/* text of SELECT command */
+
 	/* collected sample rows */
 	HeapTuple  *rows;			/* array of size targrows */
 	int			targrows;		/* target # of sample rows */
@@ -214,7 +187,10 @@ typedef struct PgFdwAnalyzeState
  */
 typedef struct ConversionLocation
 {
-	Relation	rel;			/* foreign table's relcache entry */
+	const char *relname;		/* name of relation being processed, or NULL for
+								   a foreign join */
+	const char *query;			/* query being processed */
+	TupleDesc	tupdesc;		/* tuple descriptor for attribute names */
 	AttrNumber	cur_attno;		/* attribute number being processed, or 0 */
 } ConversionLocation;
 
@@ -288,6 +264,12 @@ static bool postgresAnalyzeForeignTable(Relation relation,
 							BlockNumber *totalpages);
 static List *postgresImportForeignSchema(ImportForeignSchemaStmt *stmt,
 							Oid serverOid);
+static void postgresGetForeignJoinPaths(PlannerInfo *root,
+						   RelOptInfo *joinrel,
+						   RelOptInfo *outerrel,
+						   RelOptInfo *innerrel,
+						   SpecialJoinInfo *sjinfo,
+						   List *restrictlist);
 
 /*
  * Helper functions
@@ -323,12 +305,40 @@ static void analyze_row_processor(PGresult *res, int row,
 					  PgFdwAnalyzeState *astate);
 static HeapTuple make_tuple_from_result_row(PGresult *res,
 						   int row,
-						   Relation rel,
+						   const char *relname,
+						   const char *query,
+						   TupleDesc tupdesc,
 						   AttInMetadata *attinmeta,
 						   List *retrieved_attrs,
 						   MemoryContext temp_context);
 static void conversion_error_callback(void *arg);
 
+/*
+ * Render a Bitmapset as a comma-separated list of integers.
+ * For debugging purposes only.
+ * XXX Can this become a member of bitmapset.c?
+ */
+static char *
+bms_to_str(Bitmapset *bmp)
+{
+	StringInfoData buf;
+	bool		first = true;
+	int			x;
+
+	initStringInfo(&buf);
+
+	x = -1;
+	while ((x = bms_next_member(bmp, x)) >= 0)
+	{
+		if (!first)
+			appendStringInfoString(&buf, ", ");
+		appendStringInfo(&buf, "%d", x);
+
+		first = false;
+	}
+
+	return buf.data;
+}
 
 /*
  * Foreign-data wrapper handler function: return a struct with pointers
@@ -368,6 +378,9 @@ postgres_fdw_handler(PG_FUNCTION_ARGS)
 	/* Support functions for IMPORT FOREIGN SCHEMA */
 	routine->ImportForeignSchema = postgresImportForeignSchema;
 
+	/* Support functions for join push-down */
+	routine->GetForeignJoinPaths = postgresGetForeignJoinPaths;
+
 	PG_RETURN_POINTER(routine);
 }
 
@@ -383,7 +396,9 @@ postgresGetForeignRelSize(PlannerInfo *root,
 						  RelOptInfo *baserel,
 						  Oid foreigntableid)
 {
+	RangeTblEntry *rte;
 	PgFdwRelationInfo *fpinfo;
+	ForeignTable *table;
 	ListCell   *lc;
 
 	/*
@@ -394,8 +409,8 @@ postgresGetForeignRelSize(PlannerInfo *root,
 	baserel->fdw_private = (void *) fpinfo;
 
 	/* Look up foreign-table catalog info. */
-	fpinfo->table = GetForeignTable(foreigntableid);
-	fpinfo->server = GetForeignServer(fpinfo->table->serverid);
+	table = GetForeignTable(foreigntableid);
+	fpinfo->server = GetForeignServer(table->serverid);
 
 	/*
 	 * Extract user-settable option values.  Note that per-table setting of
@@ -416,7 +431,7 @@ postgresGetForeignRelSize(PlannerInfo *root,
 		else if (strcmp(def->defname, "fdw_tuple_cost") == 0)
 			fpinfo->fdw_tuple_cost = strtod(defGetString(def), NULL);
 	}
-	foreach(lc, fpinfo->table->options)
+	foreach(lc, table->options)
 	{
 		DefElem    *def = (DefElem *) lfirst(lc);
 
@@ -428,20 +443,12 @@ postgresGetForeignRelSize(PlannerInfo *root,
 	}
 
 	/*
-	 * If the table or the server is configured to use remote estimates,
-	 * identify which user to do remote access as during planning.  This
+	 * Identify which user to do remote access as during planning.  This
 	 * should match what ExecCheckRTEPerms() does.  If we fail due to lack of
 	 * permissions, the query would have failed at runtime anyway.
 	 */
-	if (fpinfo->use_remote_estimate)
-	{
-		RangeTblEntry *rte = planner_rt_fetch(baserel->relid, root);
-		Oid			userid = rte->checkAsUser ? rte->checkAsUser : GetUserId();
-
-		fpinfo->user = GetUserMapping(userid, fpinfo->server->serverid);
-	}
-	else
-		fpinfo->user = NULL;
+	rte = planner_rt_fetch(baserel->relid, root);
+	fpinfo->userid = rte->checkAsUser ? rte->checkAsUser : GetUserId();
 
 	/*
 	 * Identify which baserestrictinfo clauses can be sent to the remote
@@ -463,10 +470,9 @@ postgresGetForeignRelSize(PlannerInfo *root,
 				   &fpinfo->attrs_used);
 	foreach(lc, fpinfo->local_conds)
 	{
-		RestrictInfo *rinfo = (RestrictInfo *) lfirst(lc);
+		Expr	   *expr = (Expr *) lfirst(lc);
 
-		pull_varattnos((Node *) rinfo->clause, baserel->relid,
-					   &fpinfo->attrs_used);
+		pull_varattnos((Node *) expr, baserel->relid, &fpinfo->attrs_used);
 	}
 
 	/*
@@ -752,6 +758,8 @@ postgresGetForeignPlan(PlannerInfo *root,
 	List	   *retrieved_attrs;
 	StringInfoData sql;
 	ListCell   *lc;
+	List	   *fdw_ps_tlist = NIL;
+	ForeignScan *scan;
 
 	/*
 	 * Separate the scan_clauses into those that can be executed remotely and
@@ -768,9 +776,6 @@ postgresGetForeignPlan(PlannerInfo *root,
 	 *
 	 * This code must match "extract_actual_clauses(scan_clauses, false)"
 	 * except for the additional decision about remote versus local execution.
-	 * Note however that we only strip the RestrictInfo nodes from the
-	 * local_exprs list, since appendWhereClause expects a list of
-	 * RestrictInfos.
 	 */
 	foreach(lc, scan_clauses)
 	{
@@ -783,11 +788,11 @@ postgresGetForeignPlan(PlannerInfo *root,
 			continue;
 
 		if (list_member_ptr(fpinfo->remote_conds, rinfo))
-			remote_conds = lappend(remote_conds, rinfo);
+			remote_conds = lappend(remote_conds, rinfo->clause);
 		else if (list_member_ptr(fpinfo->local_conds, rinfo))
 			local_exprs = lappend(local_exprs, rinfo->clause);
 		else if (is_foreign_expr(root, baserel, rinfo->clause))
-			remote_conds = lappend(remote_conds, rinfo);
+			remote_conds = lappend(remote_conds, rinfo->clause);
 		else
 			local_exprs = lappend(local_exprs, rinfo->clause);
 	}
@@ -797,68 +802,17 @@ postgresGetForeignPlan(PlannerInfo *root,
 	 * expressions to be sent as parameters.
 	 */
 	initStringInfo(&sql);
-	deparseSelectSql(&sql, root, baserel, fpinfo->attrs_used,
-					 &retrieved_attrs);
-	if (remote_conds)
-		appendWhereClause(&sql, root, baserel, remote_conds,
-						  true, &params_list);
-
-	/*
-	 * Add FOR UPDATE/SHARE if appropriate.  We apply locking during the
-	 * initial row fetch, rather than later on as is done for local tables.
-	 * The extra roundtrips involved in trying to duplicate the local
-	 * semantics exactly don't seem worthwhile (see also comments for
-	 * RowMarkType).
-	 *
-	 * Note: because we actually run the query as a cursor, this assumes that
-	 * DECLARE CURSOR ... FOR UPDATE is supported, which it isn't before 8.3.
-	 */
-	if (baserel->relid == root->parse->resultRelation &&
-		(root->parse->commandType == CMD_UPDATE ||
-		 root->parse->commandType == CMD_DELETE))
-	{
-		/* Relation is UPDATE/DELETE target, so use FOR UPDATE */
-		appendStringInfoString(&sql, " FOR UPDATE");
-	}
-	else
-	{
-		PlanRowMark *rc = get_plan_rowmark(root->rowMarks, baserel->relid);
-
-		if (rc)
-		{
-			/*
-			 * Relation is specified as a FOR UPDATE/SHARE target, so handle
-			 * that.  (But we could also see LCS_NONE, meaning this isn't a
-			 * target relation after all.)
-			 *
-			 * For now, just ignore any [NO] KEY specification, since (a) it's
-			 * not clear what that means for a remote table that we don't have
-			 * complete information about, and (b) it wouldn't work anyway on
-			 * older remote servers.  Likewise, we don't worry about NOWAIT.
-			 */
-			switch (rc->strength)
-			{
-				case LCS_NONE:
-					/* No locking needed */
-					break;
-				case LCS_FORKEYSHARE:
-				case LCS_FORSHARE:
-					appendStringInfoString(&sql, " FOR SHARE");
-					break;
-				case LCS_FORNOKEYUPDATE:
-				case LCS_FORUPDATE:
-					appendStringInfoString(&sql, " FOR UPDATE");
-					break;
-			}
-		}
-	}
+	deparseSelectSql(&sql, root, baserel, fpinfo->attrs_used, remote_conds,
+					 &params_list, &fdw_ps_tlist, &retrieved_attrs);
 
 	/*
-	 * Build the fdw_private list that will be available to the executor.
+	 * Build the fdw_private list that will be available in the executor.
 	 * Items in the list must match enum FdwScanPrivateIndex, above.
 	 */
-	fdw_private = list_make2(makeString(sql.data),
-							 retrieved_attrs);
+	fdw_private = list_make4(makeString(sql.data),
+							 retrieved_attrs,
+							 makeInteger(fpinfo->server->serverid),
+							 makeInteger(fpinfo->userid));
 
 	/*
 	 * Create the ForeignScan node from target list, local filtering
@@ -868,11 +822,18 @@ postgresGetForeignPlan(PlannerInfo *root,
 	 * field of the finished plan node; we can't keep them in private state
 	 * because then they wouldn't be subject to later planner processing.
 	 */
-	return make_foreignscan(tlist,
+	scan = make_foreignscan(tlist,
 							local_exprs,
 							scan_relid,
 							params_list,
 							fdw_private);
+
+	/*
+	 * Set fdw_ps_tlist to handle tuples generated by this scan.
+	 */
+	scan->fdw_ps_tlist = fdw_ps_tlist;
+
+	return scan;
 }
 
 /*
@@ -885,9 +846,8 @@ postgresBeginForeignScan(ForeignScanState *node, int eflags)
 	ForeignScan *fsplan = (ForeignScan *) node->ss.ps.plan;
 	EState	   *estate = node->ss.ps.state;
 	PgFdwScanState *fsstate;
-	RangeTblEntry *rte;
+	Oid			serverid;
 	Oid			userid;
-	ForeignTable *table;
 	ForeignServer *server;
 	UserMapping *user;
 	int			numParams;
@@ -907,22 +867,13 @@ postgresBeginForeignScan(ForeignScanState *node, int eflags)
 	node->fdw_state = (void *) fsstate;
 
 	/*
-	 * Identify which user to do the remote access as.  This should match what
-	 * ExecCheckRTEPerms() does.
-	 */
-	rte = rt_fetch(fsplan->scan.scanrelid, estate->es_range_table);
-	userid = rte->checkAsUser ? rte->checkAsUser : GetUserId();
-
-	/* Get info about foreign table. */
-	fsstate->rel = node->ss.ss_currentRelation;
-	table = GetForeignTable(RelationGetRelid(fsstate->rel));
-	server = GetForeignServer(table->serverid);
-	user = GetUserMapping(userid, server->serverid);
-
-	/*
 	 * Get connection to the foreign server.  Connection manager will
 	 * establish new connection if necessary.
 	 */
+	serverid = intVal(list_nth(fsplan->fdw_private, FdwScanPrivateServerOid));
+	userid = intVal(list_nth(fsplan->fdw_private, FdwScanPrivateUserOid));
+	server = GetForeignServer(serverid);
+	user = GetUserMapping(userid, server->serverid);
 	fsstate->conn = GetConnection(server, user, false);
 
 	/* Assign a unique ID for my cursor */
@@ -932,8 +883,8 @@ postgresBeginForeignScan(ForeignScanState *node, int eflags)
 	/* Get private info created by planner functions. */
 	fsstate->query = strVal(list_nth(fsplan->fdw_private,
 									 FdwScanPrivateSelectSql));
-	fsstate->retrieved_attrs = (List *) list_nth(fsplan->fdw_private,
-											   FdwScanPrivateRetrievedAttrs);
+	fsstate->retrieved_attrs = list_nth(fsplan->fdw_private,
+										FdwScanPrivateRetrievedAttrs);
 
 	/* Create contexts for batches of tuples and per-tuple temp workspace. */
 	fsstate->batch_cxt = AllocSetContextCreate(estate->es_query_cxt,
@@ -947,8 +898,18 @@ postgresBeginForeignScan(ForeignScanState *node, int eflags)
 											  ALLOCSET_SMALL_INITSIZE,
 											  ALLOCSET_SMALL_MAXSIZE);
 
-	/* Get info we'll need for input data conversion. */
-	fsstate->attinmeta = TupleDescGetAttInMetadata(RelationGetDescr(fsstate->rel));
+	/* Get info we'll need for input data conversion and error reporting. */
+	if (fsplan->scan.scanrelid > 0)
+	{
+		fsstate->relname = RelationGetRelationName(node->ss.ss_currentRelation);
+		fsstate->tupdesc = RelationGetDescr(node->ss.ss_currentRelation);
+	}
+	else
+	{
+		fsstate->relname = NULL;
+		fsstate->tupdesc = node->ss.ss_ScanTupleSlot->tts_tupleDescriptor;
+	}
+	fsstate->attinmeta = TupleDescGetAttInMetadata(fsstate->tupdesc);
 
 	/* Prepare for output conversion of parameters used in remote query. */
 	numParams = list_length(fsplan->fdw_exprs);
@@ -1726,10 +1687,12 @@ estimate_path_cost_size(PlannerInfo *root,
 	 */
 	if (fpinfo->use_remote_estimate)
 	{
+		List	   *remote_conds;
 		List	   *remote_join_conds;
 		List	   *local_join_conds;
-		StringInfoData sql;
 		List	   *retrieved_attrs;
+		StringInfoData sql;
+		UserMapping *user;
 		PGconn	   *conn;
 		Selectivity local_sel;
 		QualCost	local_cost;
@@ -1741,24 +1704,24 @@ estimate_path_cost_size(PlannerInfo *root,
 		classifyConditions(root, baserel, join_conds,
 						   &remote_join_conds, &local_join_conds);
 
+		remote_conds = copyObject(fpinfo->remote_conds);
+		remote_conds = list_concat(remote_conds, remote_join_conds);
+
 		/*
 		 * Construct EXPLAIN query including the desired SELECT, FROM, and
 		 * WHERE clauses.  Params and other-relation Vars are replaced by
 		 * dummy values.
+		 * Here we discard params_list and fdw_ps_tlist (by passing NULL)
+		 * because they are unnecessary for EXPLAIN.
 		 */
 		initStringInfo(&sql);
 		appendStringInfoString(&sql, "EXPLAIN ");
-		deparseSelectSql(&sql, root, baserel, fpinfo->attrs_used,
-						 &retrieved_attrs);
-		if (fpinfo->remote_conds)
-			appendWhereClause(&sql, root, baserel, fpinfo->remote_conds,
-							  true, NULL);
-		if (remote_join_conds)
-			appendWhereClause(&sql, root, baserel, remote_join_conds,
-							  (fpinfo->remote_conds == NIL), NULL);
+		deparseSelectSql(&sql, root, baserel, fpinfo->attrs_used, remote_conds,
+						 NULL, NULL, &retrieved_attrs);
 
 		/* Get the remote estimate */
-		conn = GetConnection(fpinfo->server, fpinfo->user, false);
+		user = GetUserMapping(fpinfo->userid, fpinfo->server->serverid);
+		conn = GetConnection(fpinfo->server, user, false);
 		get_remote_estimate(sql.data, conn, &rows, &width,
 							&startup_cost, &total_cost);
 		ReleaseConnection(conn);
@@ -2055,7 +2018,9 @@ fetch_more_data(ForeignScanState *node)
 		{
 			fsstate->tuples[i] =
 				make_tuple_from_result_row(res, i,
-										   fsstate->rel,
+										   fsstate->relname,
+										   fsstate->query,
+										   fsstate->tupdesc,
 										   fsstate->attinmeta,
 										   fsstate->retrieved_attrs,
 										   fsstate->temp_cxt);
@@ -2273,7 +2238,9 @@ store_returning_result(PgFdwModifyState *fmstate,
 		HeapTuple	newtup;
 
 		newtup = make_tuple_from_result_row(res, 0,
-											fmstate->rel,
+										RelationGetRelationName(fmstate->rel),
+											fmstate->query,
+											RelationGetDescr(fmstate->rel),
 											fmstate->attinmeta,
 											fmstate->retrieved_attrs,
 											fmstate->temp_cxt);
@@ -2423,6 +2390,7 @@ postgresAcquireSampleRowsFunc(Relation relation, int elevel,
 	initStringInfo(&sql);
 	appendStringInfo(&sql, "DECLARE c%u CURSOR FOR ", cursor_number);
 	deparseAnalyzeSql(&sql, relation, &astate.retrieved_attrs);
+	astate.query = sql.data;
 
 	/* In what follows, do not risk leaking any PGresults. */
 	PG_TRY();
@@ -2565,7 +2533,9 @@ analyze_row_processor(PGresult *res, int row, PgFdwAnalyzeState *astate)
 		oldcontext = MemoryContextSwitchTo(astate->anl_cxt);
 
 		astate->rows[pos] = make_tuple_from_result_row(res, row,
-													   astate->rel,
+										   RelationGetRelationName(astate->rel),
+													   astate->query,
+											   RelationGetDescr(astate->rel),
 													   astate->attinmeta,
 													 astate->retrieved_attrs,
 													   astate->temp_cxt);
@@ -2839,6 +2809,269 @@ postgresImportForeignSchema(ImportForeignSchemaStmt *stmt, Oid serverOid)
 }
 
 /*
+ * Construct PgFdwRelationInfo from two join sources
+ */
+static PgFdwRelationInfo *
+merge_fpinfo(RelOptInfo *outerrel,
+			 RelOptInfo *innerrel,
+			 JoinType jointype,
+			 double rows,
+			 int width)
+{
+	PgFdwRelationInfo *fpinfo_o;
+	PgFdwRelationInfo *fpinfo_i;
+	PgFdwRelationInfo *fpinfo;
+
+	fpinfo_o = (PgFdwRelationInfo *) outerrel->fdw_private;
+	fpinfo_i = (PgFdwRelationInfo *) innerrel->fdw_private;
+
+	fpinfo = (PgFdwRelationInfo *) palloc0(sizeof(PgFdwRelationInfo));
+
+	/* The join relation inherits the conditions of both source relations */
+	fpinfo->remote_conds = list_concat(copyObject(fpinfo_o->remote_conds),
+									   copyObject(fpinfo_i->remote_conds));
+	fpinfo->local_conds = list_concat(copyObject(fpinfo_o->local_conds),
+									  copyObject(fpinfo_i->local_conds));
+
+	/* attrs_used is only meaningful for a simple foreign table scan */
+	fpinfo->attrs_used = NULL;
+
+	/* rows and width are supplied by the caller */
+	fpinfo->rows = rows;
+	fpinfo->width = width;
+
+	/* A join has local conditions from both outer and inner, so sum their costs. */
+	fpinfo->local_conds_cost.startup = fpinfo_o->local_conds_cost.startup +
+									   fpinfo_i->local_conds_cost.startup;
+	fpinfo->local_conds_cost.per_tuple = fpinfo_o->local_conds_cost.per_tuple +
+										 fpinfo_i->local_conds_cost.per_tuple;
+
+	/* Don't consider correlation between local filters. */
+	fpinfo->local_conds_sel = fpinfo_o->local_conds_sel *
+							  fpinfo_i->local_conds_sel;
+
+	fpinfo->use_remote_estimate = false;
+
+	/*
+	 * These two come from the default or a per-server setting, so outer and
+	 * inner must have the same value.
+	 */
+	fpinfo->fdw_startup_cost = fpinfo_o->fdw_startup_cost;
+	fpinfo->fdw_tuple_cost = fpinfo_o->fdw_tuple_cost;
+
+	/*
+	 * TODO estimate more accurately
+	 */
+	fpinfo->startup_cost = fpinfo->fdw_startup_cost +
+						   fpinfo->local_conds_cost.startup;
+	fpinfo->total_cost = fpinfo->startup_cost +
+						 (fpinfo->fdw_tuple_cost +
+						  fpinfo->local_conds_cost.per_tuple +
+						  cpu_tuple_cost) * fpinfo->rows;
+
+	/* serverid and userid are identical for outer and inner (caller checks) */
+	fpinfo->server = fpinfo_o->server;
+	fpinfo->userid = fpinfo_o->userid;
+
+	fpinfo->outerrel = outerrel;
+	fpinfo->innerrel = innerrel;
+	fpinfo->jointype = jointype;
+
+	/* joinclauses and otherclauses will be set later */
+
+	return fpinfo;
+}
+
+/*
+ * postgresGetForeignJoinPaths
+ *		Add possible ForeignPath to joinrel.
+ *
+ * Joins that satisfy all of the conditions below can be pushed down to the
+ * remote PostgreSQL server.
+ *
+ * 1) Join type is INNER or OUTER (one of LEFT/RIGHT/FULL)
+ * 2) Both outer and inner portions are safe to push down
+ * 3) All foreign tables in the join belong to the same foreign server
+ * 4) All foreign tables are accessed as the same user
+ * 5) All join conditions are safe to push down
+ * 6) No relation has a local filter (this could be relaxed for INNER JOIN with
+ * no volatile function/operator, but for now we take the safer approach)
+ */
+static void
+postgresGetForeignJoinPaths(PlannerInfo *root,
+							RelOptInfo *joinrel,
+							RelOptInfo *outerrel,
+							RelOptInfo *innerrel,
+							SpecialJoinInfo *sjinfo,
+							List *restrictlist)
+{
+	PgFdwRelationInfo *fpinfo;
+	PgFdwRelationInfo *fpinfo_o;
+	PgFdwRelationInfo *fpinfo_i;
+	JoinType		jointype = !sjinfo ? JOIN_INNER : sjinfo->jointype;
+	ForeignPath	   *joinpath;
+	double			rows;
+	Cost			startup_cost;
+	Cost			total_cost;
+
+	ListCell	   *lc;
+	List		   *joinclauses;
+	List		   *otherclauses;
+
+	/*
+	 * We support all outer joins in addition to inner joins.  CROSS JOIN is
+	 * internally an INNER JOIN with no conditions, so it is checked later.
+	 */
+	if (jointype != JOIN_INNER && jointype != JOIN_LEFT &&
+		jointype != JOIN_RIGHT && jointype != JOIN_FULL)
+	{
+		ereport(DEBUG3, (errmsg("unsupported join type (SEMI, ANTI)")));
+		return;
+	}
+
+	/*
+	 * A valid PgFdwRelationInfo in RelOptInfo.fdw_private indicates that the
+	 * scan of that relation can be pushed down.  If either side lacks one,
+	 * give up on pushing down this join relation.
+	 */
+	if (!outerrel->fdw_private)
+	{
+		ereport(DEBUG3, (errmsg("outer is not safe to push-down")));
+		return;
+	}
+	if (!innerrel->fdw_private)
+	{
+		ereport(DEBUG3, (errmsg("inner is not safe to push-down")));
+		return;
+	}
+	fpinfo_o = (PgFdwRelationInfo *) outerrel->fdw_private;
+	fpinfo_i = (PgFdwRelationInfo *) innerrel->fdw_private;
+
+	/*
+	 * All relations in the join must belong to the same server.  A valid
+	 * fdw_private means that all relations under that side already belong to
+	 * the server recorded in it, so it is enough to compare the serverid of
+	 * the outer and inner relations.
+	 */
+	if (fpinfo_o->server->serverid != fpinfo_i->server->serverid)
+	{
+		ereport(DEBUG3, (errmsg("server mismatch")));
+		return;
+	}
+
+	/*
+	 * The effective userid of all source relations must be identical.
+	 * A valid fdw_private means that all relations under that side are already
+	 * accessed as the same user, so it is enough to compare the userid of the
+	 * outer and inner relations.
+	 */
+	if (fpinfo_o->userid != fpinfo_i->userid)
+	{
+		ereport(DEBUG3, (errmsg("userid mismatch")));
+		return;
+	}
+
+	/*
+	 * No source relation can have local conditions.  This could be relaxed
+	 * if the join is an inner join and the local conditions contain no
+	 * volatile function/operator, but for now we leave that as a future
+	 * enhancement.
+	 */
+	if (fpinfo_o->local_conds != NIL || fpinfo_i->local_conds != NIL)
+	{
+		ereport(DEBUG3, (errmsg("join with local filter")));
+		return;
+	}
+
+	/*
+	 * Separate restrictlist into two lists, join conditions and remote filters.
+	 */
+	joinclauses = restrictlist;
+	if (IS_OUTER_JOIN(jointype))
+	{
+		extract_actual_join_clauses(joinclauses, &joinclauses, &otherclauses);
+	}
+	else
+	{
+		joinclauses = extract_actual_clauses(joinclauses, false);
+		otherclauses = NIL;
+	}
+
+	/*
+	 * Note that CROSS JOIN (cartesian product) is transformed to JOIN_INNER
+	 * with empty joinclauses.  Pushing down a CROSS JOIN usually transfers more
+	 * rows than retrieving each table separately, so we don't push down
+	 * such joins.
+	 */
+	if (jointype == JOIN_INNER && joinclauses == NIL)
+	{
+		ereport(DEBUG3, (errmsg("unsupported join type (CROSS)")));
+		return;
+	}
+
+	/*
+	 * Join condition must be safe to push down.
+	 */
+	foreach(lc, joinclauses)
+	{
+		Expr *expr = (Expr *) lfirst(lc);
+
+		if (!is_foreign_expr(root, joinrel, expr))
+		{
+			ereport(DEBUG3, (errmsg("join quals contain unsafe conditions")));
+			return;
+		}
+	}
+
+	/*
+	 * Every other condition of the join must also be safe to push down.
+	 */
+	foreach(lc, otherclauses)
+	{
+		Expr *expr = (Expr *) lfirst(lc);
+
+		if (!is_foreign_expr(root, joinrel, expr))
+		{
+			ereport(DEBUG3, (errmsg("filter contains unsafe conditions")));
+			return;
+		}
+	}
+
+	/* At this point we know the join can be pushed down to the remote side. */
+
+	/* Construct fpinfo for the join relation */
+	fpinfo = merge_fpinfo(outerrel, innerrel, jointype, joinrel->rows, joinrel->width);
+	fpinfo->joinclauses = joinclauses;
+	fpinfo->otherclauses = otherclauses;
+	joinrel->fdw_private = fpinfo;
+
+	/* TODO determine more accurate cost and rows of the join. */
+	rows = joinrel->rows;
+	startup_cost = fpinfo->startup_cost;
+	total_cost = fpinfo->total_cost;
+
+	/*
+	 * Create a new join path and add it to the joinrel which represents a join
+	 * between foreign tables.
+	 */
+	joinpath = create_foreignscan_path(root,
+									   joinrel,
+									   rows,
+									   startup_cost,
+									   total_cost,
+									   NIL,		/* no pathkeys */
+									   NULL,	/* no required_outer */
+									   NIL);	/* no fdw_private */
+
+	/* Add generated path into joinrel by add_path(). */
+	add_path(joinrel, (Path *) joinpath);
+	elog(DEBUG3, "join path added for (%s) join (%s)",
+		 bms_to_str(outerrel->relids), bms_to_str(innerrel->relids));
+
+	/* TODO consider parameterized paths */
+}
+
+/*
  * Create a tuple from the specified row of the PGresult.
  *
  * rel is the local representation of the foreign table, attinmeta is
@@ -2849,13 +3082,14 @@ postgresImportForeignSchema(ImportForeignSchemaStmt *stmt, Oid serverOid)
 static HeapTuple
 make_tuple_from_result_row(PGresult *res,
 						   int row,
-						   Relation rel,
+						   const char *relname,
+						   const char *query,
+						   TupleDesc tupdesc,
 						   AttInMetadata *attinmeta,
 						   List *retrieved_attrs,
 						   MemoryContext temp_context)
 {
 	HeapTuple	tuple;
-	TupleDesc	tupdesc = RelationGetDescr(rel);
 	Datum	   *values;
 	bool	   *nulls;
 	ItemPointer ctid = NULL;
@@ -2882,7 +3116,9 @@ make_tuple_from_result_row(PGresult *res,
 	/*
 	 * Set up and install callback to report where conversion error occurs.
 	 */
-	errpos.rel = rel;
+	errpos.relname = relname;
+	errpos.query = query;
+	errpos.tupdesc = tupdesc;
 	errpos.cur_attno = 0;
 	errcallback.callback = conversion_error_callback;
 	errcallback.arg = (void *) &errpos;
@@ -2966,11 +3202,39 @@ make_tuple_from_result_row(PGresult *res,
 static void
 conversion_error_callback(void *arg)
 {
+	const char *attname;
+	const char *relname;
 	ConversionLocation *errpos = (ConversionLocation *) arg;
-	TupleDesc	tupdesc = RelationGetDescr(errpos->rel);
+	TupleDesc	tupdesc = errpos->tupdesc;
+	StringInfoData buf;
+
+	if (errpos->relname)
+	{
+		/* error occurred in a scan against a foreign table */
+		initStringInfo(&buf);
+		if (errpos->cur_attno > 0)
+			appendStringInfo(&buf, "column \"%s\"",
+					 NameStr(tupdesc->attrs[errpos->cur_attno - 1]->attname));
+		else if (errpos->cur_attno == SelfItemPointerAttributeNumber)
+			appendStringInfoString(&buf, "column \"ctid\"");
+		attname = buf.data;
+
+		initStringInfo(&buf);
+		appendStringInfo(&buf, "foreign table \"%s\"", errpos->relname);
+		relname = buf.data;
+	}
+	else
+	{
+		/* error occurred in a scan against a foreign join */
+		initStringInfo(&buf);
+		appendStringInfo(&buf, "column %d", errpos->cur_attno - 1);
+		attname = buf.data;
+
+		initStringInfo(&buf);
+		appendStringInfo(&buf, "foreign join \"%s\"", errpos->query);
+		relname = buf.data;
+	}
 
 	if (errpos->cur_attno > 0 && errpos->cur_attno <= tupdesc->natts)
-		errcontext("column \"%s\" of foreign table \"%s\"",
-				   NameStr(tupdesc->attrs[errpos->cur_attno - 1]->attname),
-				   RelationGetRelationName(errpos->rel));
+		errcontext("%s of %s", attname, relname);
 }
diff --git a/contrib/postgres_fdw/postgres_fdw.h b/contrib/postgres_fdw/postgres_fdw.h
index 950c6f7..0d05e5d 100644
--- a/contrib/postgres_fdw/postgres_fdw.h
+++ b/contrib/postgres_fdw/postgres_fdw.h
@@ -16,10 +16,52 @@
 #include "foreign/foreign.h"
 #include "lib/stringinfo.h"
 #include "nodes/relation.h"
+#include "nodes/plannodes.h"
 #include "utils/relcache.h"
 
 #include "libpq-fe.h"
 
+/*
+ * FDW-specific planner information kept in RelOptInfo.fdw_private for a
+ * foreign table or a foreign join.  This information is collected by
+ * postgresGetForeignRelSize, or calculated from join source relations.
+ */
+typedef struct PgFdwRelationInfo
+{
+	/* baserestrictinfo clauses, broken down into safe and unsafe subsets. */
+	List	   *remote_conds;
+	List	   *local_conds;
+
+	/* Bitmap of attr numbers we need to fetch from the remote server. */
+	Bitmapset  *attrs_used;
+
+	/* Cost and selectivity of local_conds. */
+	QualCost	local_conds_cost;
+	Selectivity local_conds_sel;
+
+	/* Estimated size and cost for a scan with baserestrictinfo quals. */
+	double		rows;
+	int			width;
+	Cost		startup_cost;
+	Cost		total_cost;
+
+	/* Options extracted from catalogs. */
+	bool		use_remote_estimate;
+	Cost		fdw_startup_cost;
+	Cost		fdw_tuple_cost;
+
+	/* Cached catalog information. */
+	ForeignServer *server;
+	Oid			userid;
+
+	/* Join information */
+	RelOptInfo *outerrel;
+	RelOptInfo *innerrel;
+	JoinType	jointype;
+	List	   *joinclauses;
+	List	   *otherclauses;
+} PgFdwRelationInfo;
+
 /* in postgres_fdw.c */
 extern int	set_transmission_modes(void);
 extern void reset_transmission_modes(int nestlevel);
@@ -51,13 +93,30 @@ extern void deparseSelectSql(StringInfo buf,
 				 PlannerInfo *root,
 				 RelOptInfo *baserel,
 				 Bitmapset *attrs_used,
+				 List *remote_conds,
+				 List **params_list,
+				 List **fdw_ps_tlist,
 				 List **retrieved_attrs);
-extern void appendWhereClause(StringInfo buf,
+extern void appendConditions(StringInfo buf,
 				  PlannerInfo *root,
 				  RelOptInfo *baserel,
+				  List *outertlist,
+				  List *innertlist,
 				  List *exprs,
-				  bool is_first,
+				  const char *prefix,
 				  List **params);
+extern void deparseJoinSql(StringInfo sql,
+			   PlannerInfo *root,
+			   RelOptInfo *baserel,
+			   RelOptInfo *outerrel,
+			   RelOptInfo *innerrel,
+			   const char *sql_o,
+			   const char *sql_i,
+			   JoinType jointype,
+			   List *joinclauses,
+			   List *otherclauses,
+			   List **fdw_ps_tlist,
+			   List **retrieved_attrs);
 extern void deparseInsertSql(StringInfo buf, PlannerInfo *root,
 				 Index rtindex, Relation rel,
 				 List *targetAttrs, List *returningList,
diff --git a/contrib/postgres_fdw/sql/postgres_fdw.sql b/contrib/postgres_fdw/sql/postgres_fdw.sql
index 4a23457..b0c9a8d 100644
--- a/contrib/postgres_fdw/sql/postgres_fdw.sql
+++ b/contrib/postgres_fdw/sql/postgres_fdw.sql
@@ -11,12 +11,17 @@ DO $d$
             OPTIONS (dbname '$$||current_database()||$$',
                      port '$$||current_setting('port')||$$'
             )$$;
+        EXECUTE $$CREATE SERVER loopback2 FOREIGN DATA WRAPPER postgres_fdw
+            OPTIONS (dbname '$$||current_database()||$$',
+                     port '$$||current_setting('port')||$$'
+            )$$;
     END;
 $d$;
 
 CREATE USER MAPPING FOR public SERVER testserver1
 	OPTIONS (user 'value', password 'value');
 CREATE USER MAPPING FOR CURRENT_USER SERVER loopback;
+CREATE USER MAPPING FOR CURRENT_USER SERVER loopback2;
 
 -- ===================================================================
 -- create objects used through FDW loopback server
@@ -39,6 +44,18 @@ CREATE TABLE "S 1"."T 2" (
 	c2 text,
 	CONSTRAINT t2_pkey PRIMARY KEY (c1)
 );
+CREATE TABLE "S 1"."T 3" (
+	c1 int NOT NULL,
+	c2 int NOT NULL,
+	c3 text,
+	CONSTRAINT t3_pkey PRIMARY KEY (c1)
+);
+CREATE TABLE "S 1"."T 4" (
+	c1 int NOT NULL,
+	c2 int NOT NULL,
+	c4 text,
+	CONSTRAINT t4_pkey PRIMARY KEY (c1)
+);
 
 INSERT INTO "S 1"."T 1"
 	SELECT id,
@@ -54,9 +71,23 @@ INSERT INTO "S 1"."T 2"
 	SELECT id,
 	       'AAA' || to_char(id, 'FM000')
 	FROM generate_series(1, 100) id;
+INSERT INTO "S 1"."T 3"
+	SELECT id,
+	       id + 1,
+	       'AAA' || to_char(id, 'FM000')
+	FROM generate_series(1, 100) id;
+DELETE FROM "S 1"."T 3" WHERE c1 % 2 != 0;	-- delete for outer join tests
+INSERT INTO "S 1"."T 4"
+	SELECT id,
+	       id + 1,
+	       'AAA' || to_char(id, 'FM000')
+	FROM generate_series(1, 100) id;
+DELETE FROM "S 1"."T 4" WHERE c1 % 3 != 0;	-- delete for outer join tests
 
 ANALYZE "S 1"."T 1";
 ANALYZE "S 1"."T 2";
+ANALYZE "S 1"."T 3";
+ANALYZE "S 1"."T 4";
 
 -- ===================================================================
 -- create foreign tables
@@ -87,6 +118,29 @@ CREATE FOREIGN TABLE ft2 (
 ) SERVER loopback;
 ALTER FOREIGN TABLE ft2 DROP COLUMN cx;
 
+CREATE FOREIGN TABLE ft4 (
+	c1 int NOT NULL,
+	c2 int NOT NULL,
+	c3 text
+) SERVER loopback OPTIONS (schema_name 'S 1', table_name 'T 3');
+
+CREATE FOREIGN TABLE ft5 (
+	c1 int NOT NULL,
+	c2 int NOT NULL,
+	c3 text
+) SERVER loopback OPTIONS (schema_name 'S 1', table_name 'T 4');
+
+CREATE FOREIGN TABLE ft6 (
+	c1 int NOT NULL,
+	c2 int NOT NULL,
+	c3 text
+) SERVER loopback2 OPTIONS (schema_name 'S 1', table_name 'T 4');
+CREATE USER view_owner;
+GRANT ALL ON ft5 TO view_owner;
+CREATE VIEW v_ft5 AS SELECT * FROM ft5;
+ALTER VIEW v_ft5 OWNER TO view_owner;
+CREATE USER MAPPING FOR view_owner SERVER loopback;
+
 -- ===================================================================
 -- tests for validator
 -- ===================================================================
@@ -158,8 +212,6 @@ EXPLAIN (VERBOSE, COSTS false) SELECT * FROM ft1 t1 WHERE c1 = 102 FOR SHARE;
 SELECT * FROM ft1 t1 WHERE c1 = 102 FOR SHARE;
 -- aggregate
 SELECT COUNT(*) FROM ft1 t1;
--- join two tables
-SELECT t1.c1 FROM ft1 t1 JOIN ft2 t2 ON (t1.c1 = t2.c1) ORDER BY t1.c3, t1.c1 OFFSET 100 LIMIT 10;
 -- subquery
 SELECT * FROM ft1 t1 WHERE t1.c3 IN (SELECT c3 FROM ft2 t2 WHERE c1 <= 10) ORDER BY c1;
 -- subquery+MAX
@@ -216,6 +268,82 @@ SELECT * FROM ft1 WHERE c1 = ANY (ARRAY(SELECT c1 FROM ft2 WHERE c1 < 5));
 SELECT * FROM ft2 WHERE c1 = ANY (ARRAY(SELECT c1 FROM ft1 WHERE c1 < 5));
 
 -- ===================================================================
+-- JOIN queries
+-- ===================================================================
+-- join two tables
+EXPLAIN (COSTS false, VERBOSE)
+SELECT t1.c1, t2.c1 FROM ft1 t1 JOIN ft2 t2 ON (t1.c1 = t2.c1) ORDER BY t1.c3, t1.c1 OFFSET 100 LIMIT 10;
+SELECT t1.c1, t2.c1 FROM ft1 t1 JOIN ft2 t2 ON (t1.c1 = t2.c1) ORDER BY t1.c3, t1.c1 OFFSET 100 LIMIT 10;
+-- join three tables
+EXPLAIN (COSTS false, VERBOSE)
+SELECT t1.c1, t2.c2, t3.c3 FROM ft1 t1 JOIN ft2 t2 ON (t1.c1 = t2.c1) JOIN ft4 t3 ON (t3.c1 = t1.c1) ORDER BY t1.c3, t1.c1 OFFSET 10 LIMIT 10;
+SELECT t1.c1, t2.c2, t3.c3 FROM ft1 t1 JOIN ft2 t2 ON (t1.c1 = t2.c1) JOIN ft4 t3 ON (t3.c1 = t1.c1) ORDER BY t1.c3, t1.c1 OFFSET 10 LIMIT 10;
+-- left outer join
+EXPLAIN (COSTS false, VERBOSE)
+SELECT t1.c1, t2.c1 FROM ft4 t1 LEFT JOIN ft5 t2 ON (t1.c1 = t2.c1) ORDER BY t1.c1, t2.c1 OFFSET 10 LIMIT 10;
+SELECT t1.c1, t2.c1 FROM ft4 t1 LEFT JOIN ft5 t2 ON (t1.c1 = t2.c1) ORDER BY t1.c1, t2.c1 OFFSET 10 LIMIT 10;
+-- right outer join
+SET enable_mergejoin = off; -- planner choose MergeJoin even it has higher costs, so disable it for testing.
+EXPLAIN (COSTS false, VERBOSE)
+SELECT t1.c1, t2.c1 FROM ft4 t1 RIGHT JOIN ft5 t2 ON (t1.c1 = t2.c1) ORDER BY t2.c1, t2.c1 OFFSET 10 LIMIT 10;
+SELECT t1.c1, t2.c1 FROM ft4 t1 RIGHT JOIN ft5 t2 ON (t1.c1 = t2.c1) ORDER BY t2.c1, t2.c1 OFFSET 10 LIMIT 10;
+SET enable_mergejoin = on;
+-- full outer join
+EXPLAIN (COSTS false, VERBOSE)
+SELECT t1.c1, t2.c1 FROM ft4 t1 FULL JOIN ft5 t2 ON (t1.c1 = t2.c1) ORDER BY t1.c1, t2.c1 OFFSET 45 LIMIT 10;
+SELECT t1.c1, t2.c1 FROM ft4 t1 FULL JOIN ft5 t2 ON (t1.c1 = t2.c1) ORDER BY t1.c1, t2.c1 OFFSET 45 LIMIT 10;
+-- full outer join + WHERE clause, only matched rows
+EXPLAIN (COSTS false, VERBOSE)
+SELECT t1.c1, t2.c1 FROM ft4 t1 FULL JOIN ft5 t2 ON (t1.c1 = t2.c1) WHERE (t1.c1 = t2.c1 OR t1.c1 IS NULL) ORDER BY t1.c1, t2.c1 OFFSET 10 LIMIT 10;
+SELECT t1.c1, t2.c1 FROM ft4 t1 FULL JOIN ft5 t2 ON (t1.c1 = t2.c1) WHERE (t1.c1 = t2.c1 OR t1.c1 IS NULL) ORDER BY t1.c1, t2.c1 OFFSET 10 LIMIT 10;
+-- join at WHERE clause 
+SET enable_mergejoin = off; -- planner choose MergeJoin even it has higher costs, so disable it for testing.
+EXPLAIN (COSTS false, VERBOSE)
+SELECT t1.c1, t2.c1 FROM ft4 t1 LEFT JOIN ft5 t2 ON true WHERE (t1.c1 = t2.c1) ORDER BY t1.c1, t2.c1 OFFSET 10 LIMIT 10;
+SELECT t1.c1, t2.c1 FROM ft4 t1 LEFT JOIN ft5 t2 ON true WHERE (t1.c1 = t2.c1) ORDER BY t1.c1, t2.c1 OFFSET 10 LIMIT 10;
+SET enable_mergejoin = on;
+-- join in CTE
+EXPLAIN (COSTS false, VERBOSE)
+WITH t (c1_1, c1_3, c2_1) AS (SELECT t1.c1, t1.c3, t2.c1 FROM ft1 t1 JOIN ft2 t2 ON (t1.c1 = t2.c1)) SELECT c1_1, c2_1 FROM t ORDER BY c1_3, c1_1 OFFSET 100 LIMIT 10;
+WITH t (c1_1, c1_3, c2_1) AS (SELECT t1.c1, t1.c3, t2.c1 FROM ft1 t1 JOIN ft2 t2 ON (t1.c1 = t2.c1)) SELECT c1_1, c2_1 FROM t ORDER BY c1_3, c1_1 OFFSET 100 LIMIT 10;
+-- ctid with whole-row reference
+EXPLAIN (COSTS false, VERBOSE)
+SELECT t1.ctid, t1, t2, t1.c1 FROM ft1 t1 JOIN ft2 t2 ON (t1.c1 = t2.c1) ORDER BY t1.c3, t1.c1 OFFSET 100 LIMIT 10;
+SELECT t1.ctid, t1, t2, t1.c1 FROM ft1 t1 JOIN ft2 t2 ON (t1.c1 = t2.c1) ORDER BY t1.c3, t1.c1 OFFSET 100 LIMIT 10;
+-- partially unsafe to push down, not pushed down
+EXPLAIN (COSTS false, VERBOSE)
+SELECT t1.c1 FROM ft1 t1 JOIN ft2 t2 ON t2.c1 = t2.c1 JOIN ft4 t3 ON t2.c1 = t3.c1 ORDER BY t1.c1 OFFSET 10 LIMIT 10;
+SELECT t1.c1 FROM ft1 t1 JOIN ft2 t2 ON t2.c1 = t2.c1 JOIN ft4 t3 ON t2.c1 = t3.c1 ORDER BY t1.c1 OFFSET 10 LIMIT 10;
+-- SEMI JOIN, not pushed down
+EXPLAIN (COSTS false, VERBOSE)
+SELECT t1.c1 FROM ft1 t1 WHERE EXISTS (SELECT 1 FROM ft2 t2 WHERE t1.c1 = t2.c1) ORDER BY t1.c1 OFFSET 100 LIMIT 10;
+SELECT t1.c1 FROM ft1 t1 WHERE EXISTS (SELECT 1 FROM ft2 t2 WHERE t1.c1 = t2.c1) ORDER BY t1.c1 OFFSET 100 LIMIT 10;
+-- ANTI JOIN, not pushed down
+EXPLAIN (COSTS false, VERBOSE)
+SELECT t1.c1 FROM ft1 t1 WHERE NOT EXISTS (SELECT 1 FROM ft2 t2 WHERE t1.c1 = t2.c2) ORDER BY t1.c1 OFFSET 100 LIMIT 10;
+SELECT t1.c1 FROM ft1 t1 WHERE NOT EXISTS (SELECT 1 FROM ft2 t2 WHERE t1.c1 = t2.c2) ORDER BY t1.c1 OFFSET 100 LIMIT 10;
+-- CROSS JOIN, not pushed down
+EXPLAIN (COSTS false, VERBOSE)
+SELECT t1.c1, t2.c1 FROM ft1 t1 CROSS JOIN ft2 t2 ORDER BY t1.c1, t2.c1 OFFSET 100 LIMIT 10;
+SELECT t1.c1, t2.c1 FROM ft1 t1 CROSS JOIN ft2 t2 ORDER BY t1.c1, t2.c1 OFFSET 100 LIMIT 10;
+-- different server
+EXPLAIN (COSTS false, VERBOSE)
+SELECT t1.c1, t2.c1 FROM ft5 t1 JOIN ft6 t2 ON (t1.c1 = t2.c1) ORDER BY t1.c1, t2.c1 OFFSET 100 LIMIT 10;
+SELECT t1.c1, t2.c1 FROM ft5 t1 JOIN ft6 t2 ON (t1.c1 = t2.c1) ORDER BY t1.c1, t2.c1 OFFSET 100 LIMIT 10;
+-- different effective user for permission check
+EXPLAIN (COSTS false, VERBOSE)
+SELECT t1.c1, t2.c1 FROM ft5 t1 JOIN v_ft5 t2 ON (t1.c1 = t2.c1) ORDER BY t1.c1, t2.c1 OFFSET 100 LIMIT 10;
+SELECT t1.c1, t2.c1 FROM ft5 t1 JOIN v_ft5 t2 ON (t1.c1 = t2.c1) ORDER BY t1.c1, t2.c1 OFFSET 100 LIMIT 10;
+-- unsafe join conditions
+EXPLAIN (COSTS false, VERBOSE)
+SELECT t1.c1, t2.c1 FROM ft1 t1 JOIN ft2 t2 ON (t1.c8 = t2.c8) ORDER BY t1.c3, t1.c1 OFFSET 100 LIMIT 10;
+SELECT t1.c1, t2.c1 FROM ft1 t1 JOIN ft2 t2 ON (t1.c8 = t2.c8) ORDER BY t1.c3, t1.c1 OFFSET 100 LIMIT 10;
+-- local filter (unsafe conditions on one side)
+EXPLAIN (COSTS false, VERBOSE)
+SELECT t1.c1, t2.c1 FROM ft1 t1 JOIN ft2 t2 ON (t1.c1 = t2.c1) WHERE t1.c8 = 'foo' ORDER BY t1.c3, t1.c1 OFFSET 100 LIMIT 10;
+SELECT t1.c1, t2.c1 FROM ft1 t1 JOIN ft2 t2 ON (t1.c1 = t2.c1) WHERE t1.c8 = 'foo' ORDER BY t1.c3, t1.c1 OFFSET 100 LIMIT 10;
+
+-- ===================================================================
 -- parameterized queries
 -- ===================================================================
 -- simple join
@@ -666,116 +794,6 @@ UPDATE rem1 SET f2 = 'testo';
 INSERT INTO rem1(f2) VALUES ('test') RETURNING ctid;
 
 -- ===================================================================
--- test inheritance features
--- ===================================================================
-
-CREATE TABLE a (aa TEXT);
-CREATE TABLE loct (aa TEXT, bb TEXT);
-CREATE FOREIGN TABLE b (bb TEXT) INHERITS (a)
-  SERVER loopback OPTIONS (table_name 'loct');
-
-INSERT INTO a(aa) VALUES('aaa');
-INSERT INTO a(aa) VALUES('aaaa');
-INSERT INTO a(aa) VALUES('aaaaa');
-
-INSERT INTO b(aa) VALUES('bbb');
-INSERT INTO b(aa) VALUES('bbbb');
-INSERT INTO b(aa) VALUES('bbbbb');
-
-SELECT tableoid::regclass, * FROM a;
-SELECT tableoid::regclass, * FROM b;
-SELECT tableoid::regclass, * FROM ONLY a;
-
-UPDATE a SET aa = 'zzzzzz' WHERE aa LIKE 'aaaa%';
-
-SELECT tableoid::regclass, * FROM a;
-SELECT tableoid::regclass, * FROM b;
-SELECT tableoid::regclass, * FROM ONLY a;
-
-UPDATE b SET aa = 'new';
-
-SELECT tableoid::regclass, * FROM a;
-SELECT tableoid::regclass, * FROM b;
-SELECT tableoid::regclass, * FROM ONLY a;
-
-UPDATE a SET aa = 'newtoo';
-
-SELECT tableoid::regclass, * FROM a;
-SELECT tableoid::regclass, * FROM b;
-SELECT tableoid::regclass, * FROM ONLY a;
-
-DELETE FROM a;
-
-SELECT tableoid::regclass, * FROM a;
-SELECT tableoid::regclass, * FROM b;
-SELECT tableoid::regclass, * FROM ONLY a;
-
-DROP TABLE a CASCADE;
-DROP TABLE loct;
-
--- Check SELECT FOR UPDATE/SHARE with an inherited source table
-create table loct1 (f1 int, f2 int, f3 int);
-create table loct2 (f1 int, f2 int, f3 int);
-
-create table foo (f1 int, f2 int);
-create foreign table foo2 (f3 int) inherits (foo)
-  server loopback options (table_name 'loct1');
-create table bar (f1 int, f2 int);
-create foreign table bar2 (f3 int) inherits (bar)
-  server loopback options (table_name 'loct2');
-
-insert into foo values(1,1);
-insert into foo values(3,3);
-insert into foo2 values(2,2,2);
-insert into foo2 values(4,4,4);
-insert into bar values(1,11);
-insert into bar values(2,22);
-insert into bar values(6,66);
-insert into bar2 values(3,33,33);
-insert into bar2 values(4,44,44);
-insert into bar2 values(7,77,77);
-
-explain (verbose, costs off)
-select * from bar where f1 in (select f1 from foo) for update;
-select * from bar where f1 in (select f1 from foo) for update;
-
-explain (verbose, costs off)
-select * from bar where f1 in (select f1 from foo) for share;
-select * from bar where f1 in (select f1 from foo) for share;
-
--- Check UPDATE with inherited target and an inherited source table
-explain (verbose, costs off)
-update bar set f2 = f2 + 100 where f1 in (select f1 from foo);
-update bar set f2 = f2 + 100 where f1 in (select f1 from foo);
-
-select tableoid::regclass, * from bar order by 1,2;
-
--- Check UPDATE with inherited target and an appendrel subquery
-explain (verbose, costs off)
-update bar set f2 = f2 + 100
-from
-  ( select f1 from foo union all select f1+3 from foo ) ss
-where bar.f1 = ss.f1;
-update bar set f2 = f2 + 100
-from
-  ( select f1 from foo union all select f1+3 from foo ) ss
-where bar.f1 = ss.f1;
-
-select tableoid::regclass, * from bar order by 1,2;
-
--- Test that WHERE CURRENT OF is not supported
-begin;
-declare c cursor for select * from bar where f1 = 7;
-fetch from c;
-update bar set f2 = null where current of c;
-rollback;
-
-drop table foo cascade;
-drop table bar cascade;
-drop table loct1;
-drop table loct2;
-
--- ===================================================================
 -- test IMPORT FOREIGN SCHEMA
 -- ===================================================================
 
@@ -831,3 +849,7 @@ DROP TYPE "Colors" CASCADE;
 IMPORT FOREIGN SCHEMA import_source LIMIT TO (t5)
   FROM SERVER loopback INTO import_dest5;  -- ERROR
 ROLLBACK;
+
+-- Cleanup
+DROP OWNED BY view_owner;
+DROP USER view_owner;
diff --git a/doc/src/sgml/postgres-fdw.sgml b/doc/src/sgml/postgres-fdw.sgml
index 43adb61..fb39c38 100644
--- a/doc/src/sgml/postgres-fdw.sgml
+++ b/doc/src/sgml/postgres-fdw.sgml
@@ -406,11 +406,27 @@
   <title>Remote Query Optimization</title>
 
   <para>
-   <filename>postgres_fdw</> attempts to optimize remote queries to reduce
-   the amount of data transferred from foreign servers.  This is done by
-   sending query <literal>WHERE</> clauses to the remote server for
-   execution, and by not retrieving table columns that are not needed for
-   the current query.  To reduce the risk of misexecution of queries,
+   <filename>postgres_fdw</filename> attempts to optimize remote queries to
+   reduce the amount of data transferred from foreign servers.
+   This is done by various ways.
+  </para>
+
+  <para>
+   For <literal>SELECT</> clause, <filename>postgres_fdw</filename> sends only
+   actually necessary columns in it.
+  </para>
+
+  <para>
+   If <literal>FROM</> clause contains multiple foreign tables managed
+   by the same server and accessed with identical user,
+   <filename>postgres_fdw</> tries to join foreign tables on the remote side as
+   much as it can.
+   To reduce risk of misexecution of queries, <filename>postgres_fdw</>
+   gives up sending joins to remote when join conditions might have different
+   semantics on the remote side.
+  </para>
+
+  <para>
    <literal>WHERE</> clauses are not sent to the remote server unless they use
    only built-in data types, operators, and functions.  Operators and
    functions in the clauses must be <literal>IMMUTABLE</> as well.

explain_foreign_join.patch (application/octet-stream)

diff --git a/contrib/postgres_fdw/deparse.c b/contrib/postgres_fdw/deparse.c
index 7e27313..43726a0 100644
--- a/contrib/postgres_fdw/deparse.c
+++ b/contrib/postgres_fdw/deparse.c
@@ -141,6 +141,7 @@ static void printRemoteParam(int paramindex, Oid paramtype, int32 paramtypmod,
 				 deparse_expr_cxt *context);
 static void printRemotePlaceholder(Oid paramtype, int32 paramtypmod,
 					   deparse_expr_cxt *context);
+static const char *get_jointype_name(JoinType jointype);
 
 /*
  * convert absolute attnum to relative one.  This would be handy for handling
@@ -696,12 +697,16 @@ deparseSelectSql(StringInfo buf,
 				 List *remote_conds,
 				 List **params_list,
 				 List **fdw_ps_tlist,
-				 List **retrieved_attrs)
+				 List **retrieved_attrs,
+				 StringInfo relations)
 {
 	PgFdwRelationInfo  *fpinfo = (PgFdwRelationInfo *) baserel->fdw_private;
 	RangeTblEntry *rte = planner_rt_fetch(baserel->relid, root);
 	Relation	rel;
 
+	if (relations)
+		initStringInfo(relations);
+
 	/*
 	 * If given relation was a join relation, recursively construct statement
 	 * by putting each outer and inner relations in FROM clause as a subquery
@@ -716,6 +721,9 @@ deparseSelectSql(StringInfo buf,
 		StringInfoData		sql_o;
 		StringInfoData		sql_i;
 		List			   *ret_attrs_tmp;	/* not used */
+		StringInfoData		relations_o;
+		StringInfoData		relations_i;
+		const char		   *jointype_str;
 
 		/*
 		 * Deparse query for outer and inner relation, and combine them into
@@ -728,11 +736,17 @@ deparseSelectSql(StringInfo buf,
 		initStringInfo(&sql_o);
 		deparseSelectSql(&sql_o, root, rel_o, fpinfo_o->attrs_used,
 						 fpinfo_o->remote_conds, params_list,
-						 NULL, &ret_attrs_tmp);
+						 NULL, &ret_attrs_tmp, &relations_o);
 		initStringInfo(&sql_i);
 		deparseSelectSql(&sql_i, root, rel_i, fpinfo_i->attrs_used,
 						 fpinfo_i->remote_conds, params_list,
-						 NULL, &ret_attrs_tmp);
+						 NULL, &ret_attrs_tmp, &relations_i);
+
+		/* For EXPLAIN output */
+		jointype_str = get_jointype_name(fpinfo->jointype);
+		if (relations)
+			appendStringInfo(relations, "(%s) %s JOIN (%s)",
+							 relations_o.data, jointype_str, relations_i.data);
 
 		deparseJoinSql(buf, root, baserel,
 					   fpinfo->outerrel,
@@ -765,6 +779,8 @@ deparseSelectSql(StringInfo buf,
 	 */
 	appendStringInfoString(buf, " FROM ");
 	deparseRelation(buf, rel);
+	if (relations)
+		deparseRelation(relations, rel);
 
 	/*
 	 * Construct WHERE clause
@@ -1132,6 +1148,15 @@ deparseProjectionSql(PlannerInfo *root,
 	return buf.data;
 }
 
+static const char *
+get_jointype_name(JoinType jointype)
+{
+	return jointype == JOIN_INNER ? "INNER" :
+		   jointype == JOIN_LEFT ? "LEFT" :
+		   jointype == JOIN_RIGHT ? "RIGHT" :
+		   jointype == JOIN_FULL ? "FULL" : "";
+}
+
 /*
  * Construct a SELECT statement which contains join clause.
  *
@@ -1173,11 +1198,7 @@ deparseJoinSql(StringInfo buf,
 	context.outertlist = outerrel->reltargetlist;
 	context.innertlist = innerrel->reltargetlist;
 
-	jointype_str = jointype == JOIN_INNER ? "INNER" :
-				   jointype == JOIN_LEFT ? "LEFT" :
-				   jointype == JOIN_RIGHT ? "RIGHT" :
-				   jointype == JOIN_FULL ? "FULL" : "";
-
+	jointype_str = get_jointype_name(jointype);
 	*retrieved_attrs = NIL;
 
 	/* print SELECT clause of the join scan */
diff --git a/contrib/postgres_fdw/expected/postgres_fdw.out b/contrib/postgres_fdw/expected/postgres_fdw.out
index 530525e..cec25fd 100644
--- a/contrib/postgres_fdw/expected/postgres_fdw.out
+++ b/contrib/postgres_fdw/expected/postgres_fdw.out
@@ -531,8 +531,9 @@ EXPLAIN (VERBOSE, COSTS false)
 -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
  Foreign Scan
    Output: a.c1, a.c2, a.c3, a.c4, a.c5, a.c6, a.c7, a.c8, b.c1, b.c2, b.c3, b.c4, b.c5, b.c6, b.c7, b.c8
+   Relations: ("S 1"."T 1") INNER JOIN ("S 1"."T 1")
    Remote SQL: SELECT l.a1, l.a2, l.a3, l.a4, l.a5, l.a6, l.a7, l.a8, r.a1, r.a2, r.a3, r.a4, r.a5, r.a6, r.a7, r.a8 FROM (SELECT l.a9, l.a10, l.a12, l.a13, l.a14, l.a15, l.a16, l.a17 FROM (SELECT "C 1" a9, c2 a10, c3 a12, c4 a13, c5 a14, c6 a15, c7 a16, c8 a17 FROM "S 1"."T 1" WHERE (("C 1" = 47))) l) l (a1, a2, a3, a4, a5, a6, a7, a8) INNER JOIN (SELECT r.a9, r.a10, r.a12, r.a13, r.a14, r.a15, r.a16, r.a17 FROM (SELECT "C 1" a9, c2 a10, c3 a12, c4 a13, c5 a14, c6 a15, c7 a16, c8 a17 FROM "S 1"."T 1") r) r (a1, a2, a3, a4, a5, a6, a7, a8) ON ((l.a2 = r.a1))
-(3 rows)
+(4 rows)
 
 SELECT * FROM ft2 a, ft2 b WHERE a.c1 = 47 AND b.c1 = a.c2;
  c1 | c2 |  c3   |              c4              |            c5            | c6 |     c7     | c8  | c1 | c2 |  c3   |              c4              |            c5            | c6 |     c7     | c8  
@@ -698,8 +699,9 @@ SELECT t1.c1, t2.c1 FROM ft1 t1 JOIN ft2 t2 ON (t1.c1 = t2.c1) ORDER BY t1.c3, t
          Sort Key: t1.c3, t1.c1
          ->  Foreign Scan
                Output: t1.c1, t2.c1, t1.c3
+               Relations: ("S 1"."T 1") INNER JOIN ("S 1"."T 1")
                Remote SQL: SELECT l.a1, l.a2, r.a1 FROM (SELECT l.a10, l.a12 FROM (SELECT "C 1" a10, c3 a12 FROM "S 1"."T 1") l) l (a1, a2) INNER JOIN (SELECT r.a9 FROM (SELECT "C 1" a9 FROM "S 1"."T 1") r) r (a1) ON ((l.a1 = r.a1))
-(8 rows)
+(9 rows)
 
 SELECT t1.c1, t2.c1 FROM ft1 t1 JOIN ft2 t2 ON (t1.c1 = t2.c1) ORDER BY t1.c3, t1.c1 OFFSET 100 LIMIT 10;
  c1  | c1  
@@ -728,8 +730,9 @@ SELECT t1.c1, t2.c2, t3.c3 FROM ft1 t1 JOIN ft2 t2 ON (t1.c1 = t2.c1) JOIN ft4 t
          Sort Key: t1.c3, t1.c1
          ->  Foreign Scan
                Output: t1.c1, t2.c2, t3.c3, t1.c3
+               Relations: (("S 1"."T 1") INNER JOIN ("S 1"."T 1")) INNER JOIN ("S 1"."T 3")
                Remote SQL: SELECT l.a1, l.a2, l.a3, r.a1 FROM (SELECT l.a1, l.a2, r.a1, r.a2 FROM (SELECT l.a10, l.a12 FROM (SELECT "C 1" a10, c3 a12 FROM "S 1"."T 1") l) l (a1, a2) INNER JOIN (SELECT r.a10, r.a9 FROM (SELECT "C 1" a9, c2 a10 FROM "S 1"."T 1") r) r (a1, a2) ON ((l.a1 = r.a2))) l (a1, a2, a3, a4) INNER JOIN (SELECT r.a11, r.a9 FROM (SELECT c1 a9, c3 a11 FROM "S 1"."T 3") r) r (a1, a2) ON ((l.a1 = r.a2))
-(8 rows)
+(9 rows)
 
 SELECT t1.c1, t2.c2, t3.c3 FROM ft1 t1 JOIN ft2 t2 ON (t1.c1 = t2.c1) JOIN ft4 t3 ON (t3.c1 = t1.c1) ORDER BY t1.c3, t1.c1 OFFSET 10 LIMIT 10;
  c1 | c2 |   c3   
@@ -758,8 +761,9 @@ SELECT t1.c1, t2.c1 FROM ft4 t1 LEFT JOIN ft5 t2 ON (t1.c1 = t2.c1) ORDER BY t1.
          Sort Key: t1.c1, t2.c1
          ->  Foreign Scan
                Output: t1.c1, t2.c1
+               Relations: ("S 1"."T 3") LEFT JOIN ("S 1"."T 4")
                Remote SQL: SELECT l.a1, r.a1 FROM (SELECT l.a9 FROM (SELECT c1 a9 FROM "S 1"."T 3") l) l (a1) LEFT JOIN (SELECT r.a9 FROM (SELECT c1 a9 FROM "S 1"."T 4") r) r (a1) ON ((l.a1 = r.a1))
-(8 rows)
+(9 rows)
 
 SELECT t1.c1, t2.c1 FROM ft4 t1 LEFT JOIN ft5 t2 ON (t1.c1 = t2.c1) ORDER BY t1.c1, t2.c1 OFFSET 10 LIMIT 10;
  c1 | c1 
@@ -789,8 +793,9 @@ SELECT t1.c1, t2.c1 FROM ft4 t1 RIGHT JOIN ft5 t2 ON (t1.c1 = t2.c1) ORDER BY t2
          Sort Key: t2.c1
          ->  Foreign Scan
                Output: t1.c1, t2.c1
+               Relations: ("S 1"."T 4") LEFT JOIN ("S 1"."T 3")
                Remote SQL: SELECT l.a1, r.a1 FROM (SELECT l.a9 FROM (SELECT c1 a9 FROM "S 1"."T 4") l) l (a1) LEFT JOIN (SELECT r.a9 FROM (SELECT c1 a9 FROM "S 1"."T 3") r) r (a1) ON ((r.a1 = l.a1))
-(8 rows)
+(9 rows)
 
 SELECT t1.c1, t2.c1 FROM ft4 t1 RIGHT JOIN ft5 t2 ON (t1.c1 = t2.c1) ORDER BY t2.c1, t2.c1 OFFSET 10 LIMIT 10;
  c1 | c1 
@@ -820,8 +825,9 @@ SELECT t1.c1, t2.c1 FROM ft4 t1 FULL JOIN ft5 t2 ON (t1.c1 = t2.c1) ORDER BY t1.
          Sort Key: t1.c1, t2.c1
          ->  Foreign Scan
                Output: t1.c1, t2.c1
+               Relations: ("S 1"."T 3") FULL JOIN ("S 1"."T 4")
                Remote SQL: SELECT l.a1, r.a1 FROM (SELECT l.a9 FROM (SELECT c1 a9 FROM "S 1"."T 3") l) l (a1) FULL JOIN (SELECT r.a9 FROM (SELECT c1 a9 FROM "S 1"."T 4") r) r (a1) ON ((l.a1 = r.a1))
-(8 rows)
+(9 rows)
 
 SELECT t1.c1, t2.c1 FROM ft4 t1 FULL JOIN ft5 t2 ON (t1.c1 = t2.c1) ORDER BY t1.c1, t2.c1 OFFSET 45 LIMIT 10;
  c1  | c1 
@@ -850,8 +856,9 @@ SELECT t1.c1, t2.c1 FROM ft4 t1 FULL JOIN ft5 t2 ON (t1.c1 = t2.c1) WHERE (t1.c1
          Sort Key: t1.c1, t2.c1
          ->  Foreign Scan
                Output: t1.c1, t2.c1
+               Relations: ("S 1"."T 3") FULL JOIN ("S 1"."T 4")
                Remote SQL: SELECT l.a1, r.a1 FROM (SELECT l.a9 FROM (SELECT c1 a9 FROM "S 1"."T 3") l) l (a1) FULL JOIN (SELECT r.a9 FROM (SELECT c1 a9 FROM "S 1"."T 4") r) r (a1) ON ((l.a1 = r.a1)) WHERE (((l.a1 = r.a1) OR (l.a1 IS NULL)))
-(8 rows)
+(9 rows)
 
 SELECT t1.c1, t2.c1 FROM ft4 t1 FULL JOIN ft5 t2 ON (t1.c1 = t2.c1) WHERE (t1.c1 = t2.c1 OR t1.c1 IS NULL) ORDER BY t1.c1, t2.c1 OFFSET 10 LIMIT 10;
  c1 | c1 
@@ -881,8 +888,9 @@ SELECT t1.c1, t2.c1 FROM ft4 t1 LEFT JOIN ft5 t2 ON true WHERE (t1.c1 = t2.c1) O
          Sort Key: t1.c1
          ->  Foreign Scan
                Output: t1.c1, t2.c1
+               Relations: ("S 1"."T 3") INNER JOIN ("S 1"."T 4")
                Remote SQL: SELECT l.a1, r.a1 FROM (SELECT l.a9 FROM (SELECT c1 a9 FROM "S 1"."T 3") l) l (a1) INNER JOIN (SELECT r.a9 FROM (SELECT c1 a9 FROM "S 1"."T 4") r) r (a1) ON ((l.a1 = r.a1))
-(8 rows)
+(9 rows)
 
 SELECT t1.c1, t2.c1 FROM ft4 t1 LEFT JOIN ft5 t2 ON true WHERE (t1.c1 = t2.c1) ORDER BY t1.c1, t2.c1 OFFSET 10 LIMIT 10;
  c1 | c1 
@@ -906,13 +914,14 @@ WITH t (c1_1, c1_3, c2_1) AS (SELECT t1.c1, t1.c3, t2.c1 FROM ft1 t1 JOIN ft2 t2
    CTE t
      ->  Foreign Scan
            Output: t1.c1, t1.c3, t2.c1
+           Relations: ("S 1"."T 1") INNER JOIN ("S 1"."T 1")
            Remote SQL: SELECT l.a1, l.a2, r.a1 FROM (SELECT l.a10, l.a12 FROM (SELECT "C 1" a10, c3 a12 FROM "S 1"."T 1") l) l (a1, a2) INNER JOIN (SELECT r.a9 FROM (SELECT "C 1" a9 FROM "S 1"."T 1") r) r (a1) ON ((l.a1 = r.a1))
    ->  Sort
          Output: t.c1_1, t.c2_1, t.c1_3
          Sort Key: t.c1_3, t.c1_1
          ->  CTE Scan on t
                Output: t.c1_1, t.c2_1, t.c1_3
-(11 rows)
+(12 rows)
 
 WITH t (c1_1, c1_3, c2_1) AS (SELECT t1.c1, t1.c3, t2.c1 FROM ft1 t1 JOIN ft2 t2 ON (t1.c1 = t2.c1)) SELECT c1_1, c2_1 FROM t ORDER BY c1_3, c1_1 OFFSET 100 LIMIT 10;
  c1_1 | c2_1 
@@ -941,8 +950,9 @@ SELECT t1.ctid, t1, t2, t1.c1 FROM ft1 t1 JOIN ft2 t2 ON (t1.c1 = t2.c1) ORDER B
          Sort Key: t1.c3, t1.c1
          ->  Foreign Scan
                Output: t1.ctid, t1.*, t2.*, t1.c1, t1.c3
+               Relations: ("S 1"."T 1") INNER JOIN ("S 1"."T 1")
                Remote SQL: SELECT l.a1, l.a2, l.a3, l.a4, r.a1 FROM (SELECT l.a7, ROW(l.a10, l.a11, l.a12, l.a13, l.a14, l.a15, l.a16, l.a17), l.a10, l.a12 FROM (SELECT "C 1" a10, c2 a11, c3 a12, c4 a13, c5 a14, c6 a15, c7 a16, c8 a17, ctid a7 FROM "S 1"."T 1") l) l (a1, a2, a3, a4) INNER JOIN (SELECT ROW(r.a9, r.a10, r.a12, r.a13, r.a14, r.a15, r.a16, r.a17), r.a9 FROM (SELECT "C 1" a9, c2 a10, c3 a12, c4 a13, c5 a14, c6 a15, c7 a16, c8 a17 FROM "S 1"."T 1") r) r (a1, a2) ON ((l.a3 = r.a2))
-(8 rows)
+(9 rows)
 
 SELECT t1.ctid, t1, t2, t1.c1 FROM ft1 t1 JOIN ft2 t2 ON (t1.c1 = t2.c1) ORDER BY t1.c3, t1.c1 OFFSET 100 LIMIT 10;
   ctid  |                                             t1                                             |                                             t2                                             | c1  
@@ -976,8 +986,9 @@ SELECT t1.c1 FROM ft1 t1 JOIN ft2 t2 ON t2.c1 = t2.c1 JOIN ft4 t3 ON t2.c1 = t3.
                      Remote SQL: SELECT "C 1" a10 FROM "S 1"."T 1"
                ->  Materialize
                      ->  Foreign Scan
+                           Relations: ("S 1"."T 1") INNER JOIN ("S 1"."T 3")
                            Remote SQL: SELECT NULL FROM (SELECT l.a9 FROM (SELECT "C 1" a9 FROM "S 1"."T 1" WHERE (("C 1" = "C 1"))) l) l (a1) INNER JOIN (SELECT r.a9 FROM (SELECT c1 a9 FROM "S 1"."T 3") r) r (a1) ON ((l.a1 = r.a1))
-(13 rows)
+(14 rows)
 
 SELECT t1.c1 FROM ft1 t1 JOIN ft2 t2 ON t2.c1 = t2.c1 JOIN ft4 t3 ON t2.c1 = t3.c1 ORDER BY t1.c1 OFFSET 10 LIMIT 10;
  c1 
@@ -1815,8 +1826,9 @@ UPDATE ft2 SET c2 = ft2.c2 + 500, c3 = ft2.c3 || '_update9', c7 = DEFAULT
    Remote SQL: UPDATE "S 1"."T 1" SET c2 = $2, c3 = $3, c7 = $4 WHERE ctid = $1
    ->  Foreign Scan
          Output: ft2.c1, (ft2.c2 + 500), NULL::integer, (ft2.c3 || '_update9'::text), ft2.c4, ft2.c5, ft2.c6, 'ft2       '::character(10), ft2.c8, ft2.ctid, ft1.*
+         Relations: ("S 1"."T 1") INNER JOIN ("S 1"."T 1")
          Remote SQL: SELECT l.a1, l.a2, l.a3, l.a4, l.a5, l.a6, l.a7, l.a8, r.a1 FROM (SELECT l.a9, l.a10, l.a12, l.a13, l.a14, l.a15, l.a17, l.a7 FROM (SELECT "C 1" a9, c2 a10, c3 a12, c4 a13, c5 a14, c6 a15, c8 a17, ctid a7 FROM "S 1"."T 1" FOR UPDATE) l) l (a1, a2, a3, a4, a5, a6, a7, a8) INNER JOIN (SELECT ROW(r.a10, r.a11, r.a12, r.a13, r.a14, r.a15, r.a16, r.a17), r.a10 FROM (SELECT "C 1" a10, c2 a11, c3 a12, c4 a13, c5 a14, c6 a15, c7 a16, c8 a17 FROM "S 1"."T 1" WHERE ((("C 1" % 10) = 9))) r) r (a1, a2) ON ((l.a2 = r.a2))
-(5 rows)
+(6 rows)
 
 UPDATE ft2 SET c2 = ft2.c2 + 500, c3 = ft2.c3 || '_update9', c7 = DEFAULT
   FROM ft1 WHERE ft1.c1 = ft2.c2 AND ft1.c1 % 10 = 9;
@@ -1948,8 +1960,9 @@ DELETE FROM ft2 USING ft1 WHERE ft1.c1 = ft2.c2 AND ft1.c1 % 10 = 2;
    Remote SQL: DELETE FROM "S 1"."T 1" WHERE ctid = $1
    ->  Foreign Scan
          Output: ft2.ctid, ft1.*
+         Relations: ("S 1"."T 1") INNER JOIN ("S 1"."T 1")
          Remote SQL: SELECT l.a1, r.a1 FROM (SELECT l.a7, l.a10 FROM (SELECT c2 a10, ctid a7 FROM "S 1"."T 1" FOR UPDATE) l) l (a1, a2) INNER JOIN (SELECT ROW(r.a10, r.a11, r.a12, r.a13, r.a14, r.a15, r.a16, r.a17), r.a10 FROM (SELECT "C 1" a10, c2 a11, c3 a12, c4 a13, c5 a14, c6 a15, c7 a16, c8 a17 FROM "S 1"."T 1" WHERE ((("C 1" % 10) = 2))) r) r (a1, a2) ON ((l.a2 = r.a2))
-(5 rows)
+(6 rows)
 
 DELETE FROM ft2 USING ft1 WHERE ft1.c1 = ft2.c2 AND ft1.c1 % 10 = 2;
 SELECT c1,c2,c3,c4 FROM ft2 ORDER BY c1;
diff --git a/contrib/postgres_fdw/postgres_fdw.c b/contrib/postgres_fdw/postgres_fdw.c
index 5e5ccb7..bedf83c 100644
--- a/contrib/postgres_fdw/postgres_fdw.c
+++ b/contrib/postgres_fdw/postgres_fdw.c
@@ -69,6 +69,8 @@ enum FdwScanPrivateIndex
 	FdwScanPrivateServerOid,
 	/* Integer value of effective userid for the scan */
 	FdwScanPrivateUserOid,
+	/* Names of relation scanned, added when the scan is join */
+	FdwScanPrivateRelations,
 };
 
 /*
@@ -760,6 +762,7 @@ postgresGetForeignPlan(PlannerInfo *root,
 	ListCell   *lc;
 	List	   *fdw_ps_tlist = NIL;
 	ForeignScan *scan;
+	StringInfoData relations;
 
 	/*
 	 * Separate the scan_clauses into those that can be executed remotely and
@@ -803,7 +806,7 @@ postgresGetForeignPlan(PlannerInfo *root,
 	 */
 	initStringInfo(&sql);
 	deparseSelectSql(&sql, root, baserel, fpinfo->attrs_used, remote_conds,
-					 &params_list, &fdw_ps_tlist, &retrieved_attrs);
+					 &params_list, &fdw_ps_tlist, &retrieved_attrs, &relations);
 
 	/*
 	 * Build the fdw_private list that will be available in the executor.
@@ -813,6 +816,8 @@ postgresGetForeignPlan(PlannerInfo *root,
 							 retrieved_attrs,
 							 makeInteger(fpinfo->server->serverid),
 							 makeInteger(fpinfo->userid));
+	if (baserel->reloptkind == RELOPT_JOINREL)
+		fdw_private = lappend(fdw_private, makeString(relations.data));
 
 	/*
 	 * Create the ForeignScan node from target list, local filtering
@@ -1625,10 +1630,25 @@ postgresExplainForeignScan(ForeignScanState *node, ExplainState *es)
 {
 	List	   *fdw_private;
 	char	   *sql;
+	char	   *relations;
 
+	fdw_private = ((ForeignScan *) node->ss.ps.plan)->fdw_private;
+
+	/*
+	 * Add names of relation handled by the foreign scan when the scan is a
+	 * join
+	 */
+	if (list_length(fdw_private) > FdwScanPrivateRelations)
+	{
+		relations = strVal(list_nth(fdw_private, FdwScanPrivateRelations));
+		ExplainPropertyText("Relations", relations, es);
+	}
+
+	/*
+	 * Add remote query, when VERBOSE option is specified.
+	 */
 	if (es->verbose)
 	{
-		fdw_private = ((ForeignScan *) node->ss.ps.plan)->fdw_private;
 		sql = strVal(list_nth(fdw_private, FdwScanPrivateSelectSql));
 		ExplainPropertyText("Remote SQL", sql, es);
 	}
@@ -1717,7 +1737,7 @@ estimate_path_cost_size(PlannerInfo *root,
 		initStringInfo(&sql);
 		appendStringInfoString(&sql, "EXPLAIN ");
 		deparseSelectSql(&sql, root, baserel, fpinfo->attrs_used, remote_conds,
-						 NULL, NULL, &retrieved_attrs);
+						 NULL, NULL, &retrieved_attrs, NULL);
 
 		/* Get the remote estimate */
 		user = GetUserMapping(fpinfo->userid, fpinfo->server->serverid);
diff --git a/contrib/postgres_fdw/postgres_fdw.h b/contrib/postgres_fdw/postgres_fdw.h
index 0d05e5d..d6b16d8 100644
--- a/contrib/postgres_fdw/postgres_fdw.h
+++ b/contrib/postgres_fdw/postgres_fdw.h
@@ -96,7 +96,8 @@ extern void deparseSelectSql(StringInfo buf,
 				 List *remote_conds,
 				 List **params_list,
 				 List **fdw_ps_tlist,
-				 List **retrieved_attrs);
+				 List **retrieved_attrs,
+				 StringInfo relations);
 extern void appendConditions(StringInfo buf,
 				  PlannerInfo *root,
 				  RelOptInfo *baserel,

#24

Kouhei Kaigai

kaigai@ak.jp.nec.com

over 10 years ago

In reply to: Shigeru HANADA (#23)

Hanada-san,

Thanks for the further review, but I found two bugs in the v10 patch.
I’ve fixed them and wrapped up the v11 patch here.

* Fix bug about illegal column order

A scan against a base relation returns columns in the order of the column
definitions, but its target list might require a different order. In usual cases
this is resolved by tuple projection, but a pushed-down join skips that step, so
we need to handle it in the remote query.

Before this fix, deparseProjectionSql() was called only for queries which have
ctid or a whole-row reference in their target list, but that was over-optimization.
We always need to call it, because an ordinary column list might also require
reordering. Checking the ordering is not impossible, but it does not seem worth
the effort.

Another way to resolve this issue is to reorder the SELECT clause of the query for
a base relation if it is a source of a join, but that requires stepping back in
planning, so the fix above was chosen.

The "three tables join" test case is also changed to check this behavior.

Sorry for my oversight. Yep, a Var node that references a join relation cannot
assume any predefined column order, unlike with base relations.
The only reasonable way I know is to rely on the targetlist of the RelOptInfo,
which contains all the columns referenced in the later stage.
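
To illustrate the column-order point above, here is a minimal sketch in plain C. It is not taken from the patch: the helper build_projection() and its array-based inputs are hypothetical stand-ins for what deparseProjectionSql() does with real planner structures. It only shows the idea of emitting columns in target-list order rather than in table-definition order.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper (not part of postgres_fdw): build a SELECT list that
 * emits columns in the order the target list wants them.
 *
 * colnames[] holds the columns in table-definition order; tlist[i] is the
 * index into colnames[] of the column wanted at output position i.
 */
static void
build_projection(char *out, size_t outlen,
                 const char *const *colnames, int ncols,
                 const int *tlist, int ntlist)
{
    size_t      used = 0;

    assert(outlen > 0);
    out[0] = '\0';

    for (int i = 0; i < ntlist; i++)
    {
        int         n;

        /* every target-list entry must point at an existing column */
        assert(tlist[i] >= 0 && tlist[i] < ncols);

        n = snprintf(out + used, outlen - used, "%s%s",
                     i > 0 ? ", " : "", colnames[tlist[i]]);

        /* fail loudly instead of silently truncating the list */
        assert(n >= 0 && (size_t) n < outlen - used);
        used += (size_t) n;
    }
}
```

With columns defined as c1, c2, c3 and a target list asking for positions {2, 0}, the helper writes "c3, c1", which is the reordering that local tuple projection would otherwise have done.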

* Fix bug of duplicate fdw_ps_tlist contents.

I had coded deparseSelectSql to pass fdw_ps_tlist on to the recursive
deparseSelectSql call for each underlying RelOptInfo, but that causes redundant
entries in fdw_ps_tlist when more than two foreign tables are joined. I changed
it to pass NULL as fdw_ps_tlist in the recursive calls of deparseSelectSql.

That's reasonable, and it also brings a performance benefit: a descriptor
constructed from the ps_tlist will match the expected result tuple, which
allows us to avoid unnecessary projection.
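
The duplicate-entry fix can be sketched as follows. This is an illustration under simplifying assumptions, not the patch code: Rel, ColList and collect_output() are hypothetical, and each relation simply carries the list of columns it outputs. The point is that the recursion still visits every underlying relation, but only the top-level call receives a non-NULL list to append to.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_COLS 16

/* Hypothetical stand-in for a base relation or join relation. */
typedef struct Rel
{
    const struct Rel *outer;    /* NULL for a base relation */
    const struct Rel *inner;    /* NULL for a base relation */
    int         ncols;
    const char *cols[MAX_COLS]; /* columns this relation outputs */
} Rel;

/* Hypothetical stand-in for the list collected at the top level. */
typedef struct ColList
{
    int         n;
    const char *items[MAX_COLS];
} ColList;

/*
 * Walk the join tree and return the number of relations visited.  Output
 * columns are appended only when tlist is non-NULL; the recursive calls pass
 * NULL, so the columns of underlying relations are not appended a second time.
 */
static int
collect_output(const Rel *rel, ColList *tlist)
{
    int         visited = 1;

    if (rel->outer != NULL && rel->inner != NULL)
    {
        visited += collect_output(rel->outer, NULL);
        visited += collect_output(rel->inner, NULL);
    }

    if (tlist != NULL)
    {
        for (int i = 0; i < rel->ncols; i++)
        {
            assert(tlist->n < MAX_COLS);
            tlist->items[tlist->n++] = rel->cols[i];
        }
    }

    return visited;
}
```

For a three-table join, passing the same list pointer down the recursion would append the columns of the inner join and of each base relation on top of the top-level columns; with NULL in the recursive calls the list holds only the top-level entries.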

* Fix typos

Please review the v11 patch, and mark it as “ready for committer” if it’s ok.

It's OK with me, and I'd like other people to review it so that it can get committed.

In addition to the essential features, I tried to implement relation listing in
EXPLAIN output.

The attached explain_foreign_join.patch adds the capability to show the join
combination of a ForeignScan in EXPLAIN output as an additional item “Relations”.
I thought that using an array to list the relations would be a good way too, but
I chose a single string value because users would like to know the order and type
of the joins too.
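
As a rough sketch of how such a single string can carry both the order and the type of joins, the following self-contained C code composes a nested "(outer) TYPE JOIN (inner)" description recursively. The names RelNode, JoinKind, Buf and describe_relations() are hypothetical and only mirror the shape of the patch's recursion; the real code works on RelOptInfo and StringInfo.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the planner's JoinType. */
typedef enum { J_NONE, J_INNER, J_LEFT, J_RIGHT, J_FULL } JoinKind;

/* Hypothetical join-tree node: base relations have outer == inner == NULL. */
typedef struct RelNode
{
    const char *name;           /* label of a base relation */
    JoinKind    jointype;       /* meaningful for join nodes only */
    const struct RelNode *outer;
    const struct RelNode *inner;
} RelNode;

/* Small fixed-size string buffer standing in for StringInfo. */
typedef struct Buf
{
    char        data[256];
    size_t      len;
} Buf;

static void
buf_append(Buf *b, const char *s)
{
    size_t      n = strlen(s);

    assert(b->len + n < sizeof(b->data));   /* no silent truncation */
    memcpy(b->data + b->len, s, n + 1);
    b->len += n;
}

static const char *
join_kind_name(JoinKind k)
{
    switch (k)
    {
        case J_INNER: return "INNER";
        case J_LEFT:  return "LEFT";
        case J_RIGHT: return "RIGHT";
        case J_FULL:  return "FULL";
        default:      return "";
    }
}

/* Append "(outer) TYPE JOIN (inner)" for joins, or the name for base rels. */
static void
describe_relations(const RelNode *rel, Buf *out)
{
    if (rel->outer != NULL && rel->inner != NULL)
    {
        buf_append(out, "(");
        describe_relations(rel->outer, out);
        buf_append(out, ") ");
        buf_append(out, join_kind_name(rel->jointype));
        buf_append(out, " JOIN (");
        describe_relations(rel->inner, out);
        buf_append(out, ")");
    }
    else
        buf_append(out, rel->name);
}
```

Because each join wraps both sides in parentheses, a three-way join reads as a join of a join, so the nesting order stays visible in the one-line output.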

This is a bit different from what I expected... I expected it to display the
names of the local foreign tables (and their aliases), not the remote ones,
because all the local join logic displays local foreign table names.
It is easy to adjust, isn't it? Probably all you need to do is put the local
relation name into the text buffer (at deparseSelectSql) instead of the deparsed
remote relation.

Thanks,
--
NEC Business Creation Division / PG-Strom Project
KaiGai Kohei <kaigai@ak.jp.nec.com>

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#25

Shigeru HANADA

shigeru.hanada@gmail.com

over 10 years ago

In reply to: Kouhei Kaigai (#24)

1 attachment(s)

KaiGai-san,

On 2015/04/14 14:04, Kouhei Kaigai <kaigai@ak.jp.nec.com> wrote:

* Fix typos

Please review the v11 patch, and mark it as “ready for committer” if it’s ok.

It's OK with me, and I'd like other people to review it so that it can get committed.

Thanks!

In addition to the essential features, I tried to implement relation listing in
EXPLAIN output.

The attached explain_foreign_join.patch adds the capability to show the join
combination of a ForeignScan in EXPLAIN output as an additional item “Relations”.
I thought that using an array to list the relations would be a good way too, but
I chose a single string value because users would like to know the order and type
of the joins too.

This is a bit different from what I expected... I expected it to display the
names of the local foreign tables (and their aliases), not the remote ones,
because all the local join logic displays local foreign table names.
It is easy to adjust, isn't it? Probably all you need to do is put the local
relation name into the text buffer (at deparseSelectSql) instead of the deparsed
remote relation.

Oops, that’s right. Attached is the revised version. I chose the fully qualified name, schema.relname [alias], for the output. Building it wastes some cycles during planning when the plan is not for EXPLAIN, but it seems difficult to get the list of relation names in the ExplainForeignScan() phase, because the planning information is gone by that time.
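
The "schema.relname [alias]" format can be sketched in isolation as below. This is a simplified illustration, not the patch code: format_relation_label() is a hypothetical helper, it takes plain strings instead of looking names up from the range table entry, and it omits the identifier quoting that the patch applies with quote_identifier().

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: write "schema.relname" and append " alias" only when
 * an alias is present and differs from the relation name.  Identifier
 * quoting is intentionally left out of this sketch.
 */
static void
format_relation_label(char *out, size_t outlen,
                      const char *schema, const char *relname,
                      const char *alias)
{
    int         n;

    if (alias != NULL && *alias != '\0' && strcmp(alias, relname) != 0)
        n = snprintf(out, outlen, "%s.%s %s", schema, relname, alias);
    else
        n = snprintf(out, outlen, "%s.%s", schema, relname);

    /* fail loudly instead of silently truncating the label */
    assert(n >= 0 && (size_t) n < outlen);
}
```

Called with ("public", "ft1", "t1") this yields "public.ft1 t1", while an alias equal to the relation name yields just "public.ft1", which matches the shape of the Relations lines in the revised expected output.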

--
Shigeru HANADA
shigeru.hanada@gmail.com

Attachments:

explain_foreign_join_v2.patch (application/octet-stream)

diff --git a/contrib/postgres_fdw/deparse.c b/contrib/postgres_fdw/deparse.c
index 7e27313..abf55c5 100644
--- a/contrib/postgres_fdw/deparse.c
+++ b/contrib/postgres_fdw/deparse.c
@@ -141,6 +141,7 @@ static void printRemoteParam(int paramindex, Oid paramtype, int32 paramtypmod,
 				 deparse_expr_cxt *context);
 static void printRemotePlaceholder(Oid paramtype, int32 paramtypmod,
 					   deparse_expr_cxt *context);
+static const char *get_jointype_name(JoinType jointype);
 
 /*
  * convert absolute attnum to relative one.  This would be handy for handling
@@ -696,12 +697,16 @@ deparseSelectSql(StringInfo buf,
 				 List *remote_conds,
 				 List **params_list,
 				 List **fdw_ps_tlist,
-				 List **retrieved_attrs)
+				 List **retrieved_attrs,
+				 StringInfo relations)
 {
 	PgFdwRelationInfo  *fpinfo = (PgFdwRelationInfo *) baserel->fdw_private;
 	RangeTblEntry *rte = planner_rt_fetch(baserel->relid, root);
 	Relation	rel;
 
+	if (relations)
+		initStringInfo(relations);
+
 	/*
 	 * If given relation was a join relation, recursively construct statement
 	 * by putting each outer and inner relations in FROM clause as a subquery
@@ -716,6 +721,9 @@ deparseSelectSql(StringInfo buf,
 		StringInfoData		sql_o;
 		StringInfoData		sql_i;
 		List			   *ret_attrs_tmp;	/* not used */
+		StringInfoData		relations_o;
+		StringInfoData		relations_i;
+		const char		   *jointype_str;
 
 		/*
 		 * Deparse query for outer and inner relation, and combine them into
@@ -728,11 +736,17 @@ deparseSelectSql(StringInfo buf,
 		initStringInfo(&sql_o);
 		deparseSelectSql(&sql_o, root, rel_o, fpinfo_o->attrs_used,
 						 fpinfo_o->remote_conds, params_list,
-						 NULL, &ret_attrs_tmp);
+						 NULL, &ret_attrs_tmp, &relations_o);
 		initStringInfo(&sql_i);
 		deparseSelectSql(&sql_i, root, rel_i, fpinfo_i->attrs_used,
 						 fpinfo_i->remote_conds, params_list,
-						 NULL, &ret_attrs_tmp);
+						 NULL, &ret_attrs_tmp, &relations_i);
+
+		/* For EXPLAIN output */
+		jointype_str = get_jointype_name(fpinfo->jointype);
+		if (relations)
+			appendStringInfo(relations, "(%s) %s JOIN (%s)",
+							 relations_o.data, jointype_str, relations_i.data);
 
 		deparseJoinSql(buf, root, baserel,
 					   fpinfo->outerrel,
@@ -767,6 +781,28 @@ deparseSelectSql(StringInfo buf,
 	deparseRelation(buf, rel);
 
 	/*
+	 * Return local relation name for EXPLAIN output.
+	 * We can't know VERBOSE option is specified or not, so always add schema
+	 * name.
+	 */
+	if (relations)
+	{
+		const char	   *namespace;
+		const char	   *relname;
+		const char	   *refname;
+
+		namespace = get_namespace_name(get_rel_namespace(rte->relid));
+		relname = get_rel_name(rte->relid);
+		refname = rte->eref->aliasname;
+		appendStringInfo(relations, "%s.%s",
+						 quote_identifier(namespace),
+						 quote_identifier(relname));
+		if (*refname && strcmp(refname, relname) != 0)
+			appendStringInfo(relations, " %s",
+							 quote_identifier(rte->eref->aliasname));
+	}
+
+	/*
 	 * Construct WHERE clause
 	 */
 	if (remote_conds)
@@ -1132,6 +1168,15 @@ deparseProjectionSql(PlannerInfo *root,
 	return buf.data;
 }
 
+static const char *
+get_jointype_name(JoinType jointype)
+{
+	return jointype == JOIN_INNER ? "INNER" :
+		   jointype == JOIN_LEFT ? "LEFT" :
+		   jointype == JOIN_RIGHT ? "RIGHT" :
+		   jointype == JOIN_FULL ? "FULL" : "";
+}
+
 /*
  * Construct a SELECT statement which contains join clause.
  *
@@ -1173,11 +1218,7 @@ deparseJoinSql(StringInfo buf,
 	context.outertlist = outerrel->reltargetlist;
 	context.innertlist = innerrel->reltargetlist;
 
-	jointype_str = jointype == JOIN_INNER ? "INNER" :
-				   jointype == JOIN_LEFT ? "LEFT" :
-				   jointype == JOIN_RIGHT ? "RIGHT" :
-				   jointype == JOIN_FULL ? "FULL" : "";
-
+	jointype_str = get_jointype_name(jointype);
 	*retrieved_attrs = NIL;
 
 	/* print SELECT clause of the join scan */
diff --git a/contrib/postgres_fdw/expected/postgres_fdw.out b/contrib/postgres_fdw/expected/postgres_fdw.out
index 530525e..58f24c0 100644
--- a/contrib/postgres_fdw/expected/postgres_fdw.out
+++ b/contrib/postgres_fdw/expected/postgres_fdw.out
@@ -531,8 +531,9 @@ EXPLAIN (VERBOSE, COSTS false)
 -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
  Foreign Scan
    Output: a.c1, a.c2, a.c3, a.c4, a.c5, a.c6, a.c7, a.c8, b.c1, b.c2, b.c3, b.c4, b.c5, b.c6, b.c7, b.c8
+   Relations: (public.ft2 a) INNER JOIN (public.ft2 b)
    Remote SQL: SELECT l.a1, l.a2, l.a3, l.a4, l.a5, l.a6, l.a7, l.a8, r.a1, r.a2, r.a3, r.a4, r.a5, r.a6, r.a7, r.a8 FROM (SELECT l.a9, l.a10, l.a12, l.a13, l.a14, l.a15, l.a16, l.a17 FROM (SELECT "C 1" a9, c2 a10, c3 a12, c4 a13, c5 a14, c6 a15, c7 a16, c8 a17 FROM "S 1"."T 1" WHERE (("C 1" = 47))) l) l (a1, a2, a3, a4, a5, a6, a7, a8) INNER JOIN (SELECT r.a9, r.a10, r.a12, r.a13, r.a14, r.a15, r.a16, r.a17 FROM (SELECT "C 1" a9, c2 a10, c3 a12, c4 a13, c5 a14, c6 a15, c7 a16, c8 a17 FROM "S 1"."T 1") r) r (a1, a2, a3, a4, a5, a6, a7, a8) ON ((l.a2 = r.a1))
-(3 rows)
+(4 rows)
 
 SELECT * FROM ft2 a, ft2 b WHERE a.c1 = 47 AND b.c1 = a.c2;
  c1 | c2 |  c3   |              c4              |            c5            | c6 |     c7     | c8  | c1 | c2 |  c3   |              c4              |            c5            | c6 |     c7     | c8  
@@ -698,8 +699,9 @@ SELECT t1.c1, t2.c1 FROM ft1 t1 JOIN ft2 t2 ON (t1.c1 = t2.c1) ORDER BY t1.c3, t
          Sort Key: t1.c3, t1.c1
          ->  Foreign Scan
                Output: t1.c1, t2.c1, t1.c3
+               Relations: (public.ft1 t1) INNER JOIN (public.ft2 t2)
                Remote SQL: SELECT l.a1, l.a2, r.a1 FROM (SELECT l.a10, l.a12 FROM (SELECT "C 1" a10, c3 a12 FROM "S 1"."T 1") l) l (a1, a2) INNER JOIN (SELECT r.a9 FROM (SELECT "C 1" a9 FROM "S 1"."T 1") r) r (a1) ON ((l.a1 = r.a1))
-(8 rows)
+(9 rows)
 
 SELECT t1.c1, t2.c1 FROM ft1 t1 JOIN ft2 t2 ON (t1.c1 = t2.c1) ORDER BY t1.c3, t1.c1 OFFSET 100 LIMIT 10;
  c1  | c1  
@@ -728,8 +730,9 @@ SELECT t1.c1, t2.c2, t3.c3 FROM ft1 t1 JOIN ft2 t2 ON (t1.c1 = t2.c1) JOIN ft4 t
          Sort Key: t1.c3, t1.c1
          ->  Foreign Scan
                Output: t1.c1, t2.c2, t3.c3, t1.c3
+               Relations: ((public.ft1 t1) INNER JOIN (public.ft2 t2)) INNER JOIN (public.ft4 t3)
                Remote SQL: SELECT l.a1, l.a2, l.a3, r.a1 FROM (SELECT l.a1, l.a2, r.a1, r.a2 FROM (SELECT l.a10, l.a12 FROM (SELECT "C 1" a10, c3 a12 FROM "S 1"."T 1") l) l (a1, a2) INNER JOIN (SELECT r.a10, r.a9 FROM (SELECT "C 1" a9, c2 a10 FROM "S 1"."T 1") r) r (a1, a2) ON ((l.a1 = r.a2))) l (a1, a2, a3, a4) INNER JOIN (SELECT r.a11, r.a9 FROM (SELECT c1 a9, c3 a11 FROM "S 1"."T 3") r) r (a1, a2) ON ((l.a1 = r.a2))
-(8 rows)
+(9 rows)
 
 SELECT t1.c1, t2.c2, t3.c3 FROM ft1 t1 JOIN ft2 t2 ON (t1.c1 = t2.c1) JOIN ft4 t3 ON (t3.c1 = t1.c1) ORDER BY t1.c3, t1.c1 OFFSET 10 LIMIT 10;
  c1 | c2 |   c3   
@@ -758,8 +761,9 @@ SELECT t1.c1, t2.c1 FROM ft4 t1 LEFT JOIN ft5 t2 ON (t1.c1 = t2.c1) ORDER BY t1.
          Sort Key: t1.c1, t2.c1
          ->  Foreign Scan
                Output: t1.c1, t2.c1
+               Relations: (public.ft4 t1) LEFT JOIN (public.ft5 t2)
                Remote SQL: SELECT l.a1, r.a1 FROM (SELECT l.a9 FROM (SELECT c1 a9 FROM "S 1"."T 3") l) l (a1) LEFT JOIN (SELECT r.a9 FROM (SELECT c1 a9 FROM "S 1"."T 4") r) r (a1) ON ((l.a1 = r.a1))
-(8 rows)
+(9 rows)
 
 SELECT t1.c1, t2.c1 FROM ft4 t1 LEFT JOIN ft5 t2 ON (t1.c1 = t2.c1) ORDER BY t1.c1, t2.c1 OFFSET 10 LIMIT 10;
  c1 | c1 
@@ -789,8 +793,9 @@ SELECT t1.c1, t2.c1 FROM ft4 t1 RIGHT JOIN ft5 t2 ON (t1.c1 = t2.c1) ORDER BY t2
          Sort Key: t2.c1
          ->  Foreign Scan
                Output: t1.c1, t2.c1
+               Relations: (public.ft5 t2) LEFT JOIN (public.ft4 t1)
                Remote SQL: SELECT l.a1, r.a1 FROM (SELECT l.a9 FROM (SELECT c1 a9 FROM "S 1"."T 4") l) l (a1) LEFT JOIN (SELECT r.a9 FROM (SELECT c1 a9 FROM "S 1"."T 3") r) r (a1) ON ((r.a1 = l.a1))
-(8 rows)
+(9 rows)
 
 SELECT t1.c1, t2.c1 FROM ft4 t1 RIGHT JOIN ft5 t2 ON (t1.c1 = t2.c1) ORDER BY t2.c1, t2.c1 OFFSET 10 LIMIT 10;
  c1 | c1 
@@ -820,8 +825,9 @@ SELECT t1.c1, t2.c1 FROM ft4 t1 FULL JOIN ft5 t2 ON (t1.c1 = t2.c1) ORDER BY t1.
          Sort Key: t1.c1, t2.c1
          ->  Foreign Scan
                Output: t1.c1, t2.c1
+               Relations: (public.ft4 t1) FULL JOIN (public.ft5 t2)
                Remote SQL: SELECT l.a1, r.a1 FROM (SELECT l.a9 FROM (SELECT c1 a9 FROM "S 1"."T 3") l) l (a1) FULL JOIN (SELECT r.a9 FROM (SELECT c1 a9 FROM "S 1"."T 4") r) r (a1) ON ((l.a1 = r.a1))
-(8 rows)
+(9 rows)
 
 SELECT t1.c1, t2.c1 FROM ft4 t1 FULL JOIN ft5 t2 ON (t1.c1 = t2.c1) ORDER BY t1.c1, t2.c1 OFFSET 45 LIMIT 10;
  c1  | c1 
@@ -850,8 +856,9 @@ SELECT t1.c1, t2.c1 FROM ft4 t1 FULL JOIN ft5 t2 ON (t1.c1 = t2.c1) WHERE (t1.c1
          Sort Key: t1.c1, t2.c1
          ->  Foreign Scan
                Output: t1.c1, t2.c1
+               Relations: (public.ft4 t1) FULL JOIN (public.ft5 t2)
                Remote SQL: SELECT l.a1, r.a1 FROM (SELECT l.a9 FROM (SELECT c1 a9 FROM "S 1"."T 3") l) l (a1) FULL JOIN (SELECT r.a9 FROM (SELECT c1 a9 FROM "S 1"."T 4") r) r (a1) ON ((l.a1 = r.a1)) WHERE (((l.a1 = r.a1) OR (l.a1 IS NULL)))
-(8 rows)
+(9 rows)
 
 SELECT t1.c1, t2.c1 FROM ft4 t1 FULL JOIN ft5 t2 ON (t1.c1 = t2.c1) WHERE (t1.c1 = t2.c1 OR t1.c1 IS NULL) ORDER BY t1.c1, t2.c1 OFFSET 10 LIMIT 10;
  c1 | c1 
@@ -881,8 +888,9 @@ SELECT t1.c1, t2.c1 FROM ft4 t1 LEFT JOIN ft5 t2 ON true WHERE (t1.c1 = t2.c1) O
          Sort Key: t1.c1
          ->  Foreign Scan
                Output: t1.c1, t2.c1
+               Relations: (public.ft4 t1) INNER JOIN (public.ft5 t2)
                Remote SQL: SELECT l.a1, r.a1 FROM (SELECT l.a9 FROM (SELECT c1 a9 FROM "S 1"."T 3") l) l (a1) INNER JOIN (SELECT r.a9 FROM (SELECT c1 a9 FROM "S 1"."T 4") r) r (a1) ON ((l.a1 = r.a1))
-(8 rows)
+(9 rows)
 
 SELECT t1.c1, t2.c1 FROM ft4 t1 LEFT JOIN ft5 t2 ON true WHERE (t1.c1 = t2.c1) ORDER BY t1.c1, t2.c1 OFFSET 10 LIMIT 10;
  c1 | c1 
@@ -906,13 +914,14 @@ WITH t (c1_1, c1_3, c2_1) AS (SELECT t1.c1, t1.c3, t2.c1 FROM ft1 t1 JOIN ft2 t2
    CTE t
      ->  Foreign Scan
            Output: t1.c1, t1.c3, t2.c1
+           Relations: (public.ft1 t1) INNER JOIN (public.ft2 t2)
            Remote SQL: SELECT l.a1, l.a2, r.a1 FROM (SELECT l.a10, l.a12 FROM (SELECT "C 1" a10, c3 a12 FROM "S 1"."T 1") l) l (a1, a2) INNER JOIN (SELECT r.a9 FROM (SELECT "C 1" a9 FROM "S 1"."T 1") r) r (a1) ON ((l.a1 = r.a1))
    ->  Sort
          Output: t.c1_1, t.c2_1, t.c1_3
          Sort Key: t.c1_3, t.c1_1
          ->  CTE Scan on t
                Output: t.c1_1, t.c2_1, t.c1_3
-(11 rows)
+(12 rows)
 
 WITH t (c1_1, c1_3, c2_1) AS (SELECT t1.c1, t1.c3, t2.c1 FROM ft1 t1 JOIN ft2 t2 ON (t1.c1 = t2.c1)) SELECT c1_1, c2_1 FROM t ORDER BY c1_3, c1_1 OFFSET 100 LIMIT 10;
  c1_1 | c2_1 
@@ -941,8 +950,9 @@ SELECT t1.ctid, t1, t2, t1.c1 FROM ft1 t1 JOIN ft2 t2 ON (t1.c1 = t2.c1) ORDER B
          Sort Key: t1.c3, t1.c1
          ->  Foreign Scan
                Output: t1.ctid, t1.*, t2.*, t1.c1, t1.c3
+               Relations: (public.ft1 t1) INNER JOIN (public.ft2 t2)
                Remote SQL: SELECT l.a1, l.a2, l.a3, l.a4, r.a1 FROM (SELECT l.a7, ROW(l.a10, l.a11, l.a12, l.a13, l.a14, l.a15, l.a16, l.a17), l.a10, l.a12 FROM (SELECT "C 1" a10, c2 a11, c3 a12, c4 a13, c5 a14, c6 a15, c7 a16, c8 a17, ctid a7 FROM "S 1"."T 1") l) l (a1, a2, a3, a4) INNER JOIN (SELECT ROW(r.a9, r.a10, r.a12, r.a13, r.a14, r.a15, r.a16, r.a17), r.a9 FROM (SELECT "C 1" a9, c2 a10, c3 a12, c4 a13, c5 a14, c6 a15, c7 a16, c8 a17 FROM "S 1"."T 1") r) r (a1, a2) ON ((l.a3 = r.a2))
-(8 rows)
+(9 rows)
 
 SELECT t1.ctid, t1, t2, t1.c1 FROM ft1 t1 JOIN ft2 t2 ON (t1.c1 = t2.c1) ORDER BY t1.c3, t1.c1 OFFSET 100 LIMIT 10;
   ctid  |                                             t1                                             |                                             t2                                             | c1  
@@ -976,8 +986,9 @@ SELECT t1.c1 FROM ft1 t1 JOIN ft2 t2 ON t2.c1 = t2.c1 JOIN ft4 t3 ON t2.c1 = t3.
                      Remote SQL: SELECT "C 1" a10 FROM "S 1"."T 1"
                ->  Materialize
                      ->  Foreign Scan
+                           Relations: (public.ft2 t2) INNER JOIN (public.ft4 t3)
                            Remote SQL: SELECT NULL FROM (SELECT l.a9 FROM (SELECT "C 1" a9 FROM "S 1"."T 1" WHERE (("C 1" = "C 1"))) l) l (a1) INNER JOIN (SELECT r.a9 FROM (SELECT c1 a9 FROM "S 1"."T 3") r) r (a1) ON ((l.a1 = r.a1))
-(13 rows)
+(14 rows)
 
 SELECT t1.c1 FROM ft1 t1 JOIN ft2 t2 ON t2.c1 = t2.c1 JOIN ft4 t3 ON t2.c1 = t3.c1 ORDER BY t1.c1 OFFSET 10 LIMIT 10;
  c1 
@@ -1815,8 +1826,9 @@ UPDATE ft2 SET c2 = ft2.c2 + 500, c3 = ft2.c3 || '_update9', c7 = DEFAULT
    Remote SQL: UPDATE "S 1"."T 1" SET c2 = $2, c3 = $3, c7 = $4 WHERE ctid = $1
    ->  Foreign Scan
          Output: ft2.c1, (ft2.c2 + 500), NULL::integer, (ft2.c3 || '_update9'::text), ft2.c4, ft2.c5, ft2.c6, 'ft2       '::character(10), ft2.c8, ft2.ctid, ft1.*
+         Relations: (public.ft2) INNER JOIN (public.ft1)
          Remote SQL: SELECT l.a1, l.a2, l.a3, l.a4, l.a5, l.a6, l.a7, l.a8, r.a1 FROM (SELECT l.a9, l.a10, l.a12, l.a13, l.a14, l.a15, l.a17, l.a7 FROM (SELECT "C 1" a9, c2 a10, c3 a12, c4 a13, c5 a14, c6 a15, c8 a17, ctid a7 FROM "S 1"."T 1" FOR UPDATE) l) l (a1, a2, a3, a4, a5, a6, a7, a8) INNER JOIN (SELECT ROW(r.a10, r.a11, r.a12, r.a13, r.a14, r.a15, r.a16, r.a17), r.a10 FROM (SELECT "C 1" a10, c2 a11, c3 a12, c4 a13, c5 a14, c6 a15, c7 a16, c8 a17 FROM "S 1"."T 1" WHERE ((("C 1" % 10) = 9))) r) r (a1, a2) ON ((l.a2 = r.a2))
-(5 rows)
+(6 rows)
 
 UPDATE ft2 SET c2 = ft2.c2 + 500, c3 = ft2.c3 || '_update9', c7 = DEFAULT
   FROM ft1 WHERE ft1.c1 = ft2.c2 AND ft1.c1 % 10 = 9;
@@ -1948,8 +1960,9 @@ DELETE FROM ft2 USING ft1 WHERE ft1.c1 = ft2.c2 AND ft1.c1 % 10 = 2;
    Remote SQL: DELETE FROM "S 1"."T 1" WHERE ctid = $1
    ->  Foreign Scan
          Output: ft2.ctid, ft1.*
+         Relations: (public.ft2) INNER JOIN (public.ft1)
          Remote SQL: SELECT l.a1, r.a1 FROM (SELECT l.a7, l.a10 FROM (SELECT c2 a10, ctid a7 FROM "S 1"."T 1" FOR UPDATE) l) l (a1, a2) INNER JOIN (SELECT ROW(r.a10, r.a11, r.a12, r.a13, r.a14, r.a15, r.a16, r.a17), r.a10 FROM (SELECT "C 1" a10, c2 a11, c3 a12, c4 a13, c5 a14, c6 a15, c7 a16, c8 a17 FROM "S 1"."T 1" WHERE ((("C 1" % 10) = 2))) r) r (a1, a2) ON ((l.a2 = r.a2))
-(5 rows)
+(6 rows)
 
 DELETE FROM ft2 USING ft1 WHERE ft1.c1 = ft2.c2 AND ft1.c1 % 10 = 2;
 SELECT c1,c2,c3,c4 FROM ft2 ORDER BY c1;
diff --git a/contrib/postgres_fdw/postgres_fdw.c b/contrib/postgres_fdw/postgres_fdw.c
index 5e5ccb7..bedf83c 100644
--- a/contrib/postgres_fdw/postgres_fdw.c
+++ b/contrib/postgres_fdw/postgres_fdw.c
@@ -69,6 +69,8 @@ enum FdwScanPrivateIndex
 	FdwScanPrivateServerOid,
 	/* Integer value of effective userid for the scan */
 	FdwScanPrivateUserOid,
+	/* Names of relation scanned, added when the scan is join */
+	FdwScanPrivateRelations,
 };
 
 /*
@@ -760,6 +762,7 @@ postgresGetForeignPlan(PlannerInfo *root,
 	ListCell   *lc;
 	List	   *fdw_ps_tlist = NIL;
 	ForeignScan *scan;
+	StringInfoData relations;
 
 	/*
 	 * Separate the scan_clauses into those that can be executed remotely and
@@ -803,7 +806,7 @@ postgresGetForeignPlan(PlannerInfo *root,
 	 */
 	initStringInfo(&sql);
 	deparseSelectSql(&sql, root, baserel, fpinfo->attrs_used, remote_conds,
-					 &params_list, &fdw_ps_tlist, &retrieved_attrs);
+					 &params_list, &fdw_ps_tlist, &retrieved_attrs, &relations);
 
 	/*
 	 * Build the fdw_private list that will be available in the executor.
@@ -813,6 +816,8 @@ postgresGetForeignPlan(PlannerInfo *root,
 							 retrieved_attrs,
 							 makeInteger(fpinfo->server->serverid),
 							 makeInteger(fpinfo->userid));
+	if (baserel->reloptkind == RELOPT_JOINREL)
+		fdw_private = lappend(fdw_private, makeString(relations.data));
 
 	/*
 	 * Create the ForeignScan node from target list, local filtering
@@ -1625,10 +1630,25 @@ postgresExplainForeignScan(ForeignScanState *node, ExplainState *es)
 {
 	List	   *fdw_private;
 	char	   *sql;
+	char	   *relations;
 
+	fdw_private = ((ForeignScan *) node->ss.ps.plan)->fdw_private;
+
+	/*
+	 * Add names of relation handled by the foreign scan when the scan is a
+	 * join
+	 */
+	if (list_length(fdw_private) > FdwScanPrivateRelations)
+	{
+		relations = strVal(list_nth(fdw_private, FdwScanPrivateRelations));
+		ExplainPropertyText("Relations", relations, es);
+	}
+
+	/*
+	 * Add remote query, when VERBOSE option is specified.
+	 */
 	if (es->verbose)
 	{
-		fdw_private = ((ForeignScan *) node->ss.ps.plan)->fdw_private;
 		sql = strVal(list_nth(fdw_private, FdwScanPrivateSelectSql));
 		ExplainPropertyText("Remote SQL", sql, es);
 	}
@@ -1717,7 +1737,7 @@ estimate_path_cost_size(PlannerInfo *root,
 		initStringInfo(&sql);
 		appendStringInfoString(&sql, "EXPLAIN ");
 		deparseSelectSql(&sql, root, baserel, fpinfo->attrs_used, remote_conds,
-						 NULL, NULL, &retrieved_attrs);
+						 NULL, NULL, &retrieved_attrs, NULL);
 
 		/* Get the remote estimate */
 		user = GetUserMapping(fpinfo->userid, fpinfo->server->serverid);
diff --git a/contrib/postgres_fdw/postgres_fdw.h b/contrib/postgres_fdw/postgres_fdw.h
index 0d05e5d..d6b16d8 100644
--- a/contrib/postgres_fdw/postgres_fdw.h
+++ b/contrib/postgres_fdw/postgres_fdw.h
@@ -96,7 +96,8 @@ extern void deparseSelectSql(StringInfo buf,
 				 List *remote_conds,
 				 List **params_list,
 				 List **fdw_ps_tlist,
-				 List **retrieved_attrs);
+				 List **retrieved_attrs,
+				 StringInfo relations);
 extern void appendConditions(StringInfo buf,
 				  PlannerInfo *root,
 				  RelOptInfo *baserel,

#26

Kouhei Kaigai

kaigai@ak.jp.nec.com

over 10 years ago

In reply to: Shigeru HANADA (#25)

The attached explain_forein_join.patch adds the capability to show the join
combination of a ForeignScan in EXPLAIN output as an additional item,
“Relations”. I thought that using an array to list the relations would also be
a good way, but I chose a single string value because users would like to know
the order and type of the joins too.

This is a bit different from what I expected. I expected it to display the
names of the local foreign tables (and their aliases), not the remote ones,
because all the local join logic displays local foreign table names.
It is easy to adjust, isn't it? Probably all you need to do is put the local
relation name into the text buffer (in deparseSelectSql) instead of the
deparsed remote relation.

Oops, that’s right. Attached is the revised version. I chose the fully
qualified name, schema.relname [alias], for the output. It wastes some cycles
during planning when the plan is not for EXPLAIN, but it seems difficult to get
the list of relation names in the ExplainForeignScan() phase, because the
planning information has gone away by that time.

I understand. The private data structure of postgres_fdw is not designed
to keep tree-structured data (like a relation join tree), so this seems to me
a straightforward way to implement the feature.
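
To make the shape of the “Relations” string concrete, here is a minimal
sketch of how such a label could be built once, at planning time, from a join
tree. RelNode, append_str() and describe_relations() are hypothetical names
for illustration only; they are not taken from the patch or from PostgreSQL,
and the real code works on RelOptInfo and StringInfo instead.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical join tree node; NOT a PostgreSQL structure. */
typedef enum { REL_BASE, REL_JOIN } RelKind;

typedef struct RelNode
{
    RelKind     kind;
    const char *name;            /* REL_BASE: "schema.relname [alias]" */
    const char *jointype;        /* REL_JOIN: "INNER", "LEFT", "FULL" */
    const struct RelNode *left;  /* REL_JOIN only */
    const struct RelNode *right; /* REL_JOIN only */
} RelNode;

/* Append s to the NUL-terminated string in buf; -1 if it does not fit. */
int
append_str(char *buf, size_t cap, const char *s)
{
    size_t used = strlen(buf);
    size_t need = strlen(s);

    if (used + need + 1 > cap)
        return -1;
    memcpy(buf + used, s, need + 1);
    return 0;
}

/*
 * Append a label for node to buf. Each join operand is wrapped in
 * parentheses, so the nesting (join order) stays visible in the text.
 * Returns 0 on success, -1 if buf is too small.
 */
int
describe_relations(char *buf, size_t cap, const RelNode *node)
{
    assert(buf != NULL && node != NULL);

    if (node->kind == REL_BASE)
        return append_str(buf, cap, node->name);

    if (append_str(buf, cap, "(") != 0 ||
        describe_relations(buf, cap, node->left) != 0 ||
        append_str(buf, cap, ") ") != 0 ||
        append_str(buf, cap, node->jointype) != 0 ||
        append_str(buf, cap, " JOIN (") != 0 ||
        describe_relations(buf, cap, node->right) != 0 ||
        append_str(buf, cap, ")") != 0)
        return -1;
    return 0;
}
```

With base nodes named "public.ft1 t1" and "public.ft2 t2" joined by "INNER",
this yields the same form as the regression output in the patch:
(public.ft1 t1) INNER JOIN (public.ft2 t2). A single string keeps both the
join order and the join types, which a flat array of names would lose.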

I have a small suggestion. This patch makes deparseSelectSql initialize
the StringInfoData if one is supplied; however, that is usually the job of
the caller, not the callee.
In this case, I would prefer to call initStringInfo(&relations) next to the
initStringInfo(&sql) line in postgresGetForeignPlan. To my mind, it is
a bit strange to pass an uninitialized StringInfoData in order to get a text
form back.

@@ -803,7 +806,7 @@ postgresGetForeignPlan(PlannerInfo *root,
     */
    initStringInfo(&sql);
    deparseSelectSql(&sql, root, baserel, fpinfo->attrs_used, remote_conds,
-                    &params_list, &fdw_ps_tlist, &retrieved_attrs);
+                    &params_list, &fdw_ps_tlist, &retrieved_attrs, &relations);

    /*
     * Build the fdw_private list that will be available in the executor.
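
The caller-initializes convention suggested here can be sketched in
isolation. TextBuf, textbuf_init(), textbuf_append() and deparse_sketch() are
hypothetical stand-ins written for this example; they only imitate the role
of StringInfoData, initStringInfo() and deparseSelectSql() and are not the
PostgreSQL implementations.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for StringInfoData: a growable text buffer. */
typedef struct TextBuf
{
    char   *data;
    size_t  len;
    size_t  cap;
} TextBuf;

/* Caller-side initialization, playing the role of initStringInfo(). */
void
textbuf_init(TextBuf *buf)
{
    assert(buf != NULL);
    buf->cap = 64;
    buf->len = 0;
    buf->data = malloc(buf->cap);
    if (buf->data == NULL)
        abort();
    buf->data[0] = '\0';
}

/* Append s, doubling the allocation until it fits. */
void
textbuf_append(TextBuf *buf, const char *s)
{
    size_t n = strlen(s);

    assert(buf != NULL && buf->data != NULL);
    while (buf->len + n + 1 > buf->cap)
    {
        char *grown = realloc(buf->data, buf->cap * 2);

        if (grown == NULL)
            abort();
        buf->data = grown;
        buf->cap *= 2;
    }
    memcpy(buf->data + buf->len, s, n + 1);
    buf->len += n;
}

/*
 * Callee: only appends. It never initializes either buffer, and it fills
 * relations only when the caller asked for it by passing a non-NULL pointer.
 */
void
deparse_sketch(TextBuf *sql, TextBuf *relations)
{
    textbuf_append(sql, "SELECT 1");
    if (relations != NULL)
        textbuf_append(relations,
                       "(public.ft1 t1) INNER JOIN (public.ft2 t2)");
}
```

A caller that wants the label calls textbuf_init() on both buffers before
deparse_sketch(&sql, &relations); a caller that only needs the SQL text
passes NULL for the second argument, which mirrors how the patch passes NULL
from estimate_path_cost_size. Keeping initialization on the caller's side
means the callee has a single contract for every buffer it receives.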

Also, could you merge the EXPLAIN output feature into the main patch?
I see no reason to keep this feature separate.

Thanks,
--
NEC Business Creation Division / PG-Strom Project
KaiGai Kohei <kaigai@ak.jp.nec.com>

-----Original Message-----
From: pgsql-hackers-owner@postgresql.org
[mailto:pgsql-hackers-owner@postgresql.org] On Behalf Of Shigeru HANADA
Sent: Tuesday, April 14, 2015 7:49 PM
To: Kaigai Kouhei(海外浩平)
Cc: Ashutosh Bapat; Robert Haas; Tom Lane; Thom Brown;
pgsql-hackers@postgreSQL.org
Subject: Re: Custom/Foreign-Join-APIs (Re: [HACKERS] [v9.5] Custom Plan API)

KaiGai-san,

On 2015/04/14 14:04, Kouhei Kaigai <kaigai@ak.jp.nec.com> wrote:

* Fix typos

Please review the v11 patch, and mark it as “ready for committer” if it’s ok.

It looks OK to me, and I would like other people to review it so that it can
get committed.

Thanks!

In addition to the essential features, I tried to implement relation listing
in EXPLAIN output.

The attached explain_forein_join.patch adds the capability to show the join
combination of a ForeignScan in EXPLAIN output as an additional item,
“Relations”. I thought that using an array to list the relations would also be
a good way, but I chose a single string value because users would like to know
the order and type of the joins too.

This is a bit different from what I expected. I expected it to display the
names of the local foreign tables (and their aliases), not the remote ones,
because all the local join logic displays local foreign table names.
It is easy to adjust, isn't it? Probably all you need to do is put the local
relation name into the text buffer (in deparseSelectSql) instead of the
deparsed remote relation.

Oops, that’s right. Attached is the revised version. I chose the fully
qualified name, schema.relname [alias], for the output. It wastes some cycles
during planning when the plan is not for EXPLAIN, but it seems difficult to get
the list of relation names in the ExplainForeignScan() phase, because the
planning information has gone away by that time.

--
Shigeru HANADA
shigeru.hanada@gmail.com

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


#27

Shigeru HANADA

shigeru.hanada@gmail.com

over 10 years ago

In reply to: Kouhei Kaigai (#26)

1 attachment(s)

Kaigai-san,

On 2015/04/15 22:33, Kouhei Kaigai <kaigai@ak.jp.nec.com> wrote:

Oops, that’s right. Attached is the revised version. I chose the fully
qualified name, schema.relname [alias], for the output. It wastes some cycles
during planning when the plan is not for EXPLAIN, but it seems difficult to get
the list of relation names in the ExplainForeignScan() phase, because the
planning information has gone away by that time.

I understand. The private data structure of postgres_fdw is not designed
to keep tree-structured data (like a relation join tree), so this seems to me
a straightforward way to implement the feature.

I have a small suggestion. This patch makes deparseSelectSql initialize
the StringInfoData if one is supplied; however, that is usually the job of
the caller, not the callee.
In this case, I would prefer to call initStringInfo(&relations) next to the
initStringInfo(&sql) line in postgresGetForeignPlan. To my mind, it is
a bit strange to pass an uninitialized StringInfoData in order to get a text
form back.

@@ -803,7 +806,7 @@ postgresGetForeignPlan(PlannerInfo *root,
     */
    initStringInfo(&sql);
    deparseSelectSql(&sql, root, baserel, fpinfo->attrs_used, remote_conds,
-                    &params_list, &fdw_ps_tlist, &retrieved_attrs);
+                    &params_list, &fdw_ps_tlist, &retrieved_attrs, &relations);

    /*
     * Build the fdw_private list that will be available in the executor.

Agreed. If the caller passes a buffer, the caller should initialize it. In addition to your idea, I added a check that the RelOptInfo is a JOINREL, because a BASEREL doesn’t need relations for its EXPLAIN output.
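
As a rough illustration of that check: the label becomes an optional trailing
entry of the private list, and the EXPLAIN side detects it by the list length
alone. The names below (ScanKind, PRIV_SELECT_SQL, PRIV_RELATIONS,
build_private(), lookup_relations()) are hypothetical and use a plain array
of strings; the patch itself uses fdw_private, lappend() and list_nth().

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical slot numbers; the optional entry must come last. */
enum PrivateIndex
{
    PRIV_SELECT_SQL,    /* always present */
    PRIV_RELATIONS      /* present only when the scan is a join */
};

typedef enum { SCAN_BASEREL, SCAN_JOINREL } ScanKind;

/*
 * Planner side: fill out[] (room for at least 2 entries) and return the
 * number of entries. The relations label is appended only for a join.
 */
size_t
build_private(const char **out, const char *sql,
              const char *relations, ScanKind kind)
{
    size_t n = 0;

    assert(out != NULL && sql != NULL);
    out[n++] = sql;
    if (kind == SCAN_JOINREL)
        out[n++] = relations;
    return n;
}

/*
 * EXPLAIN side: the label exists only if the list is long enough to hold
 * it; otherwise return NULL and print nothing.
 */
const char *
lookup_relations(const char **items, size_t n)
{
    assert(items != NULL);
    return (n > PRIV_RELATIONS) ? items[PRIV_RELATIONS] : NULL;
}
```

For a base relation build_private() returns 1 and lookup_relations() returns
NULL, so no “Relations” line would be printed; for a join it returns 2 and
the label is found. The design relies on the optional entry being the last
one, since any entry added after it would shift the length test.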

Also, could you merge the EXPLAIN output feature into the main patch?
I see no reason to keep this feature separate.

I merged the EXPLAIN patch into the foreign_join patch.

v12 is now the latest patch.

--
Shigeru HANADA
shigeru.hanada@gmail.com

Attachments:

foreign_join_v12.patch (application/octet-stream)

diff --git a/config/c-compiler.m4 b/config/c-compiler.m4
index 38aab11..4ef0de6 100644
--- a/config/c-compiler.m4
+++ b/config/c-compiler.m4
@@ -473,3 +473,34 @@ AC_DEFUN([PGAC_HAVE_GCC__ATOMIC_INT64_CAS],
 if test x"$pgac_cv_gcc_atomic_int64_cas" = x"yes"; then
   AC_DEFINE(HAVE_GCC__ATOMIC_INT64_CAS, 1, [Define to 1 if you have __atomic_compare_exchange_n(int64 *, int *, int64).])
 fi])# PGAC_HAVE_GCC__ATOMIC_INT64_CAS
+
+# PGAC_SSE42_CRC32_INTRINSICS
+# -----------------------
+# Check if the compiler supports the x86 CRC instructions added in SSE 4.2,
+# using the _mm_crc32_u8 and _mm_crc32_u32 intrinsic functions. (We don't
+# test the 8-byte variant, _mm_crc32_u64, but it is assumed to be present if
+# the other ones are, on x86-64 platforms)
+#
+# An optional compiler flag can be passed as argument (e.g. -msse4.2). If the
+# intrinsics are supported, sets pgac_sse42_crc32_intrinsics, and CFLAGS_SSE42.
+AC_DEFUN([PGAC_SSE42_CRC32_INTRINSICS],
+[define([Ac_cachevar], [AS_TR_SH([pgac_cv_sse42_crc32_intrinsics_$1])])dnl
+AC_CACHE_CHECK([for _mm_crc32_u8 and _mm_crc32_u32 with CFLAGS=$1], [Ac_cachevar],
+[pgac_save_CFLAGS=$CFLAGS
+CFLAGS="$pgac_save_CFLAGS $1"
+ac_save_c_werror_flag=$ac_c_werror_flag
+ac_c_werror_flag=yes
+AC_TRY_LINK([#include <nmmintrin.h>],
+  [unsigned int crc = 0;
+   crc = _mm_crc32_u8(crc, 0);
+   crc = _mm_crc32_u32(crc, 0);],
+  [Ac_cachevar=yes],
+  [Ac_cachevar=no])
+ac_c_werror_flag=$ac_save_c_werror_flag
+CFLAGS="$pgac_save_CFLAGS"])
+if test x"$Ac_cachevar" = x"yes"; then
+  CFLAGS_SSE42="$1"
+  pgac_sse42_crc32_intrinsics=yes
+fi
+undefine([Ac_cachevar])dnl
+])# PGAC_SSE42_CRC32_INTRINSICS
diff --git a/configure b/configure
index 640ffc7..7c0bd0c 100755
--- a/configure
+++ b/configure
@@ -650,6 +650,8 @@ MSGMERGE
 MSGFMT_FLAGS
 MSGFMT
 HAVE_POSIX_SIGNALS
+PG_CRC32C_OBJS
+CFLAGS_SSE42
 LDAP_LIBS_BE
 LDAP_LIBS_FE
 PTHREAD_CFLAGS
@@ -14095,6 +14097,242 @@ $as_echo "#define HAVE_GCC__ATOMIC_INT64_CAS 1" >>confdefs.h
 
 fi
 
+
+# Check for x86 cpuid instruction
+{ $as_echo "$as_me:${as_lineno-$LINENO}: checking for __get_cpuid" >&5
+$as_echo_n "checking for __get_cpuid... " >&6; }
+if ${pgac_cv__get_cpuid+:} false; then :
+  $as_echo_n "(cached) " >&6
+else
+  cat confdefs.h - <<_ACEOF >conftest.$ac_ext
+/* end confdefs.h.  */
+#include <cpuid.h>
+int
+main ()
+{
+unsigned int exx[4] = {0, 0, 0, 0};
+  __get_cpuid(1, &exx[0], &exx[1], &exx[2], &exx[3]);
+
+  ;
+  return 0;
+}
+_ACEOF
+if ac_fn_c_try_link "$LINENO"; then :
+  pgac_cv__get_cpuid="yes"
+else
+  pgac_cv__get_cpuid="no"
+fi
+rm -f core conftest.err conftest.$ac_objext \
+    conftest$ac_exeext conftest.$ac_ext
+fi
+{ $as_echo "$as_me:${as_lineno-$LINENO}: result: $pgac_cv__get_cpuid" >&5
+$as_echo "$pgac_cv__get_cpuid" >&6; }
+if test x"$pgac_cv__get_cpuid" = x"yes"; then
+
+$as_echo "#define HAVE__GET_CPUID 1" >>confdefs.h
+
+fi
+
+{ $as_echo "$as_me:${as_lineno-$LINENO}: checking for __cpuid" >&5
+$as_echo_n "checking for __cpuid... " >&6; }
+if ${pgac_cv__cpuid+:} false; then :
+  $as_echo_n "(cached) " >&6
+else
+  cat confdefs.h - <<_ACEOF >conftest.$ac_ext
+/* end confdefs.h.  */
+#include <intrin.h>
+int
+main ()
+{
+unsigned int exx[4] = {0, 0, 0, 0};
+  __get_cpuid(exx[0], 1);
+
+  ;
+  return 0;
+}
+_ACEOF
+if ac_fn_c_try_link "$LINENO"; then :
+  pgac_cv__cpuid="yes"
+else
+  pgac_cv__cpuid="no"
+fi
+rm -f core conftest.err conftest.$ac_objext \
+    conftest$ac_exeext conftest.$ac_ext
+fi
+{ $as_echo "$as_me:${as_lineno-$LINENO}: result: $pgac_cv__cpuid" >&5
+$as_echo "$pgac_cv__cpuid" >&6; }
+if test x"$pgac_cv__cpuid" = x"yes"; then
+
+$as_echo "#define HAVE__CPUID 1" >>confdefs.h
+
+fi
+
+# Check for Intel SSE 4.2 intrinsics to do CRC calculations.
+#
+# First check if the _mm_crc32_u8 and _mm_crc32_u64 intrinsics can be used
+# with the default compiler flags. If not, check if adding the -msse4.2
+# flag helps. CFLAGS_SSE42 is set to -msse4.2 if that's required.
+{ $as_echo "$as_me:${as_lineno-$LINENO}: checking for _mm_crc32_u8 and _mm_crc32_u32 with CFLAGS=" >&5
+$as_echo_n "checking for _mm_crc32_u8 and _mm_crc32_u32 with CFLAGS=... " >&6; }
+if ${pgac_cv_sse42_crc32_intrinsics_+:} false; then :
+  $as_echo_n "(cached) " >&6
+else
+  pgac_save_CFLAGS=$CFLAGS
+CFLAGS="$pgac_save_CFLAGS "
+ac_save_c_werror_flag=$ac_c_werror_flag
+ac_c_werror_flag=yes
+cat confdefs.h - <<_ACEOF >conftest.$ac_ext
+/* end confdefs.h.  */
+#include <nmmintrin.h>
+int
+main ()
+{
+unsigned int crc = 0;
+   crc = _mm_crc32_u8(crc, 0);
+   crc = _mm_crc32_u32(crc, 0);
+  ;
+  return 0;
+}
+_ACEOF
+if ac_fn_c_try_link "$LINENO"; then :
+  pgac_cv_sse42_crc32_intrinsics_=yes
+else
+  pgac_cv_sse42_crc32_intrinsics_=no
+fi
+rm -f core conftest.err conftest.$ac_objext \
+    conftest$ac_exeext conftest.$ac_ext
+ac_c_werror_flag=$ac_save_c_werror_flag
+CFLAGS="$pgac_save_CFLAGS"
+fi
+{ $as_echo "$as_me:${as_lineno-$LINENO}: result: $pgac_cv_sse42_crc32_intrinsics_" >&5
+$as_echo "$pgac_cv_sse42_crc32_intrinsics_" >&6; }
+if test x"$pgac_cv_sse42_crc32_intrinsics_" = x"yes"; then
+  CFLAGS_SSE42=""
+  pgac_sse42_crc32_intrinsics=yes
+fi
+
+if test x"$pgac_sse42_crc32_intrinsics" != x"yes"; then
+  { $as_echo "$as_me:${as_lineno-$LINENO}: checking for _mm_crc32_u8 and _mm_crc32_u32 with CFLAGS=-msse4.2" >&5
+$as_echo_n "checking for _mm_crc32_u8 and _mm_crc32_u32 with CFLAGS=-msse4.2... " >&6; }
+if ${pgac_cv_sse42_crc32_intrinsics__msse4_2+:} false; then :
+  $as_echo_n "(cached) " >&6
+else
+  pgac_save_CFLAGS=$CFLAGS
+CFLAGS="$pgac_save_CFLAGS -msse4.2"
+ac_save_c_werror_flag=$ac_c_werror_flag
+ac_c_werror_flag=yes
+cat confdefs.h - <<_ACEOF >conftest.$ac_ext
+/* end confdefs.h.  */
+#include <nmmintrin.h>
+int
+main ()
+{
+unsigned int crc = 0;
+   crc = _mm_crc32_u8(crc, 0);
+   crc = _mm_crc32_u32(crc, 0);
+  ;
+  return 0;
+}
+_ACEOF
+if ac_fn_c_try_link "$LINENO"; then :
+  pgac_cv_sse42_crc32_intrinsics__msse4_2=yes
+else
+  pgac_cv_sse42_crc32_intrinsics__msse4_2=no
+fi
+rm -f core conftest.err conftest.$ac_objext \
+    conftest$ac_exeext conftest.$ac_ext
+ac_c_werror_flag=$ac_save_c_werror_flag
+CFLAGS="$pgac_save_CFLAGS"
+fi
+{ $as_echo "$as_me:${as_lineno-$LINENO}: result: $pgac_cv_sse42_crc32_intrinsics__msse4_2" >&5
+$as_echo "$pgac_cv_sse42_crc32_intrinsics__msse4_2" >&6; }
+if test x"$pgac_cv_sse42_crc32_intrinsics__msse4_2" = x"yes"; then
+  CFLAGS_SSE42="-msse4.2"
+  pgac_sse42_crc32_intrinsics=yes
+fi
+
+fi
+
+
+# Are we targeting a processor that supports SSE 4.2? gcc, clang and icc all
+# define __SSE4_2__ in that case.
+cat confdefs.h - <<_ACEOF >conftest.$ac_ext
+/* end confdefs.h.  */
+
+int
+main ()
+{
+
+#ifndef __SSE4_2__
+#error __SSE4_2__ not defined
+#endif
+
+  ;
+  return 0;
+}
+_ACEOF
+if ac_fn_c_try_compile "$LINENO"; then :
+  SSE4_2_TARGETED=1
+fi
+rm -f core conftest.err conftest.$ac_objext conftest.$ac_ext
+
+# Select CRC-32C implementation.
+#
+# If we are targeting a processor that has SSE 4.2 instructions, we can use the
+# special CRC instructions for calculating CRC-32C. If we're not targeting such
+# a processor, but we can nevertheless produce code that uses the SSE
+# intrinsics, perhaps with some extra CFLAGS, compile both implementations and
+# select which one to use at runtime, depending on whether SSE 4.2 is supported
+# by the processor we're running on.
+#
+# You can override this logic by setting the appropriate USE_*_CRC32 flag to 1
+# in the template or configure command line.
+if test x"$USE_SSE42_CRC32C" = x"" && test x"$USE_SSE42_CRC32C_WITH_RUNTIME_CHECK" = x"" && test x"$USE_SLICING_BY_8_CRC32C" = x""; then
+  if test x"$pgac_sse42_crc32_intrinsics" = x"yes" && test x"$SSE4_2_TARGETED" = x"1" ; then
+    USE_SSE42_CRC32C=1
+  else
+    # the CPUID instruction is needed for the runtime check.
+    if test x"$pgac_sse42_crc32_intrinsics" = x"yes" && (test x"$pgac_cv__get_cpuid" = x"yes" || test x"$pgac_cv__cpuid" = x"yes"); then
+      USE_SSE42_CRC32C_WITH_RUNTIME_CHECK=1
+    else
+      # fall back to slicing-by-8 algorithm which doesn't require any special
+      # CPU support.
+      USE_SLICING_BY_8_CRC32C=1
+    fi
+  fi
+fi
+
+# Set PG_CRC32C_OBJS appropriately depending on the selected implementation.
+{ $as_echo "$as_me:${as_lineno-$LINENO}: checking which CRC-32C implementation to use" >&5
+$as_echo_n "checking which CRC-32C implementation to use... " >&6; }
+if test x"$USE_SSE42_CRC32C" = x"1"; then
+
+$as_echo "#define USE_SSE42_CRC32C 1" >>confdefs.h
+
+  PG_CRC32C_OBJS="pg_crc32c_sse42.o"
+  { $as_echo "$as_me:${as_lineno-$LINENO}: result: SSE 4.2" >&5
+$as_echo "SSE 4.2" >&6; }
+else
+  if test x"$USE_SSE42_CRC32C_WITH_RUNTIME_CHECK" = x"1"; then
+
+$as_echo "#define USE_SSE42_CRC32C_WITH_RUNTIME_CHECK 1" >>confdefs.h
+
+    PG_CRC32C_OBJS="pg_crc32c_sse42.o pg_crc32c_sb8.o pg_crc32c_choose.o"
+    { $as_echo "$as_me:${as_lineno-$LINENO}: result: SSE 4.2 with runtime check" >&5
+$as_echo "SSE 4.2 with runtime check" >&6; }
+  else
+
+$as_echo "#define USE_SLICING_BY_8_CRC32C 1" >>confdefs.h
+
+    PG_CRC32C_OBJS="pg_crc32c_sb8.o"
+    { $as_echo "$as_me:${as_lineno-$LINENO}: result: slicing-by-8" >&5
+$as_echo "slicing-by-8" >&6; }
+  fi
+fi
+
+
+
+# Check that POSIX signals are available if thread safety is enabled.
 if test "$PORTNAME" != "win32"
 then
 { $as_echo "$as_me:${as_lineno-$LINENO}: checking for POSIX signal interface" >&5
diff --git a/configure.in b/configure.in
index 1f958cf..1cd9e1e 100644
--- a/configure.in
+++ b/configure.in
@@ -1790,6 +1790,96 @@ PGAC_HAVE_GCC__SYNC_INT64_CAS
 PGAC_HAVE_GCC__ATOMIC_INT32_CAS
 PGAC_HAVE_GCC__ATOMIC_INT64_CAS
 
+
+# Check for x86 cpuid instruction
+AC_CACHE_CHECK([for __get_cpuid], [pgac_cv__get_cpuid],
+[AC_TRY_LINK([#include <cpuid.h>],
+  [unsigned int exx[4] = {0, 0, 0, 0};
+  __get_cpuid(1, &exx[0], &exx[1], &exx[2], &exx[3]);
+  ],
+  [pgac_cv__get_cpuid="yes"],
+  [pgac_cv__get_cpuid="no"])])
+if test x"$pgac_cv__get_cpuid" = x"yes"; then
+  AC_DEFINE(HAVE__GET_CPUID, 1, [Define to 1 if you have __get_cpuid.])
+fi
+
+AC_CACHE_CHECK([for __cpuid], [pgac_cv__cpuid],
+[AC_TRY_LINK([#include <intrin.h>],
+  [unsigned int exx[4] = {0, 0, 0, 0};
+  __get_cpuid(exx[0], 1);
+  ],
+  [pgac_cv__cpuid="yes"],
+  [pgac_cv__cpuid="no"])])
+if test x"$pgac_cv__cpuid" = x"yes"; then
+  AC_DEFINE(HAVE__CPUID, 1, [Define to 1 if you have __cpuid.])
+fi
+
+# Check for Intel SSE 4.2 intrinsics to do CRC calculations.
+#
+# First check if the _mm_crc32_u8 and _mm_crc32_u64 intrinsics can be used
+# with the default compiler flags. If not, check if adding the -msse4.2
+# flag helps. CFLAGS_SSE42 is set to -msse4.2 if that's required.
+PGAC_SSE42_CRC32_INTRINSICS([])
+if test x"$pgac_sse42_crc32_intrinsics" != x"yes"; then
+  PGAC_SSE42_CRC32_INTRINSICS([-msse4.2])
+fi
+AC_SUBST(CFLAGS_SSE42)
+
+# Are we targeting a processor that supports SSE 4.2? gcc, clang and icc all
+# define __SSE4_2__ in that case.
+AC_TRY_COMPILE([], [
+#ifndef __SSE4_2__
+#error __SSE4_2__ not defined
+#endif
+], [SSE4_2_TARGETED=1])
+
+# Select CRC-32C implementation.
+#
+# If we are targeting a processor that has SSE 4.2 instructions, we can use the
+# special CRC instructions for calculating CRC-32C. If we're not targeting such
+# a processor, but we can nevertheless produce code that uses the SSE
+# intrinsics, perhaps with some extra CFLAGS, compile both implementations and
+# select which one to use at runtime, depending on whether SSE 4.2 is supported
+# by the processor we're running on.
+#
+# You can override this logic by setting the appropriate USE_*_CRC32 flag to 1
+# in the template or configure command line.
+if test x"$USE_SSE42_CRC32C" = x"" && test x"$USE_SSE42_CRC32C_WITH_RUNTIME_CHECK" = x"" && test x"$USE_SLICING_BY_8_CRC32C" = x""; then
+  if test x"$pgac_sse42_crc32_intrinsics" = x"yes" && test x"$SSE4_2_TARGETED" = x"1" ; then
+    USE_SSE42_CRC32C=1
+  else
+    # the CPUID instruction is needed for the runtime check.
+    if test x"$pgac_sse42_crc32_intrinsics" = x"yes" && (test x"$pgac_cv__get_cpuid" = x"yes" || test x"$pgac_cv__cpuid" = x"yes"); then
+      USE_SSE42_CRC32C_WITH_RUNTIME_CHECK=1
+    else
+      # fall back to slicing-by-8 algorithm which doesn't require any special
+      # CPU support.
+      USE_SLICING_BY_8_CRC32C=1
+    fi
+  fi
+fi
+
+# Set PG_CRC32C_OBJS appropriately depending on the selected implementation.
+AC_MSG_CHECKING([which CRC-32C implementation to use])
+if test x"$USE_SSE42_CRC32C" = x"1"; then
+  AC_DEFINE(USE_SSE42_CRC32C, 1, [Define to 1 to use Intel SSE 4.2 CRC instructions.])
+  PG_CRC32C_OBJS="pg_crc32c_sse42.o"
+  AC_MSG_RESULT(SSE 4.2)
+else
+  if test x"$USE_SSE42_CRC32C_WITH_RUNTIME_CHECK" = x"1"; then
+    AC_DEFINE(USE_SSE42_CRC32C_WITH_RUNTIME_CHECK, 1, [Define to 1 to use Intel SSE 4.2 CRC instructions with a runtime check.])
+    PG_CRC32C_OBJS="pg_crc32c_sse42.o pg_crc32c_sb8.o pg_crc32c_choose.o"
+    AC_MSG_RESULT(SSE 4.2 with runtime check)
+  else
+    AC_DEFINE(USE_SLICING_BY_8_CRC32C, 1, [Define to 1 to use the software slicing-by-8 CRC-32C implementation.])
+    PG_CRC32C_OBJS="pg_crc32c_sb8.o"
+    AC_MSG_RESULT(slicing-by-8)
+  fi
+fi
+AC_SUBST(PG_CRC32C_OBJS)
+
+
+# Check that POSIX signals are available if thread safety is enabled.
 if test "$PORTNAME" != "win32"
 then
 PGAC_FUNC_POSIX_SIGNALS
diff --git a/contrib/Makefile b/contrib/Makefile
index d63e441..cc60d68 100644
--- a/contrib/Makefile
+++ b/contrib/Makefile
@@ -36,8 +36,6 @@ SUBDIRS = \
 		pg_test_fsync	\
 		pg_test_timing	\
 		pg_trgm		\
-		pg_upgrade	\
-		pg_upgrade_support \
 		pgcrypto	\
 		pgrowlocks	\
 		pgstattuple	\
diff --git a/contrib/hstore/hstore_gist.c b/contrib/hstore/hstore_gist.c
index f375f5d..06f3c93 100644
--- a/contrib/hstore/hstore_gist.c
+++ b/contrib/hstore/hstore_gist.c
@@ -6,7 +6,7 @@
 #include "access/gist.h"
 #include "access/skey.h"
 #include "catalog/pg_type.h"
-#include "common/pg_crc.h"
+#include "utils/pg_crc.h"
 
 #include "hstore.h"
 
diff --git a/contrib/ltree/crc32.c b/contrib/ltree/crc32.c
index 9e04037..1c08d26 100644
--- a/contrib/ltree/crc32.c
+++ b/contrib/ltree/crc32.c
@@ -20,7 +20,7 @@
 #define TOLOWER(x)	(x)
 #endif
 
-#include "common/pg_crc.h"
+#include "utils/pg_crc.h"
 #include "crc32.h"
 
 unsigned int
diff --git a/contrib/pg_trgm/trgm_op.c b/contrib/pg_trgm/trgm_op.c
index 5ec7f26..1a71a2b 100644
--- a/contrib/pg_trgm/trgm_op.c
+++ b/contrib/pg_trgm/trgm_op.c
@@ -10,6 +10,7 @@
 #include "catalog/pg_type.h"
 #include "tsearch/ts_locale.h"
 #include "utils/memutils.h"
+#include "utils/pg_crc.h"
 
 PG_MODULE_MAGIC;
 
diff --git a/contrib/pg_upgrade/.gitignore b/contrib/pg_upgrade/.gitignore
deleted file mode 100644
index d24ec60..0000000
--- a/contrib/pg_upgrade/.gitignore
+++ /dev/null
@@ -1,8 +0,0 @@
-/pg_upgrade
-# Generated by test suite
-/analyze_new_cluster.sh
-/delete_old_cluster.sh
-/analyze_new_cluster.bat
-/delete_old_cluster.bat
-/log/
-/tmp_check/
diff --git a/contrib/pg_upgrade/IMPLEMENTATION b/contrib/pg_upgrade/IMPLEMENTATION
deleted file mode 100644
index a0cfcf1..0000000
--- a/contrib/pg_upgrade/IMPLEMENTATION
+++ /dev/null
@@ -1,100 +0,0 @@
-contrib/pg_upgrade/IMPLEMENTATION
-
-------------------------------------------------------------------------------
-PG_UPGRADE: IN-PLACE UPGRADES FOR POSTGRESQL
-------------------------------------------------------------------------------
-
-Upgrading a PostgreSQL database from one major release to another can be
-an expensive process. For minor upgrades, you can simply install new
-executables and forget about upgrading existing data. But for major
-upgrades, you have to export all of your data using pg_dump, install the
-new release, run initdb to create a new cluster, and then import your
-old data. If you have a lot of data, that can take a considerable amount
-of time. If you have too much data, you may have to buy more storage
-since you need enough room to hold the original data plus the exported
-data.  pg_upgrade can reduce the amount of time and disk space required
-for many upgrades.
-
-The URL http://momjian.us/main/writings/pgsql/pg_upgrade.pdf contains a
-presentation about pg_upgrade internals that mirrors the text
-description below.
-
-------------------------------------------------------------------------------
-WHAT IT DOES
-------------------------------------------------------------------------------
-
-pg_upgrade is a tool that performs an in-place upgrade of existing
-data. Some upgrades change the on-disk representation of data;
-pg_upgrade cannot help in those upgrades.  However, many upgrades do
-not change the on-disk representation of a user-defined table.  In those
-cases, pg_upgrade can move existing user-defined tables from the old
-database cluster into the new cluster.
-
-There are two factors that determine whether an in-place upgrade is
-practical.
-
-Every table in a cluster shares the same on-disk representation of the
-table headers and trailers and the on-disk representation of tuple
-headers. If this changes between the old version of PostgreSQL and the
-new version, pg_upgrade cannot move existing tables to the new cluster;
-you will have to pg_dump the old data and then import that data into the
-new cluster.
-
-Second, all data types should have the same binary representation
-between the two major PostgreSQL versions.
-
-------------------------------------------------------------------------------
-HOW IT WORKS
-------------------------------------------------------------------------------
-
-To use pg_upgrade during an upgrade, start by installing a fresh
-cluster using the newest version in a new directory. When you've
-finished installation, the new cluster will contain the new executables
-and the usual template0, template1, and postgres, but no user-defined
-tables. At this point, you can shut down the old and new postmasters and
-invoke pg_upgrade.
-
-When pg_upgrade starts, it ensures that all required executables are
-present and contain the expected version numbers. The verification
-process also checks the old and new $PGDATA directories to ensure that
-the expected files and subdirectories are in place.  If the verification
-process succeeds, pg_upgrade starts the old postmaster and runs
-pg_dumpall --schema-only to capture the metadata contained in the old
-cluster. The script produced by pg_dumpall will be used in a later step
-to recreate all user-defined objects in the new cluster.
-
-Note that the script produced by pg_dumpall will only recreate
-user-defined objects, not system-defined objects.  The new cluster will
-contain the system-defined objects created by the latest version of
-PostgreSQL.
-
-Once pg_upgrade has extracted the metadata from the old cluster, it
-performs a number of bookkeeping tasks required to 'sync up' the new
-cluster with the existing data.
-
-First, pg_upgrade copies the commit status information and 'next
-transaction ID' from the old cluster to the new cluster. This is the
-steps ensures that the proper tuples are visible from the new cluster.
-Remember, pg_upgrade does not export/import the content of user-defined
-tables so the transaction IDs in the new cluster must match the
-transaction IDs in the old data. pg_upgrade also copies the starting
-address for write-ahead logs from the old cluster to the new cluster.
-
-Now pg_upgrade begins reconstructing the metadata obtained from the old
-cluster using the first part of the pg_dumpall output.
-
-Next, pg_upgrade executes the remainder of the script produced earlier
-by pg_dumpall --- this script effectively creates the complete
-user-defined metadata from the old cluster to the new cluster.  It
-preserves the relfilenode numbers so TOAST and other references
-to relfilenodes in user data is preserved.  (See binary-upgrade usage
-in pg_dump).
-
-Finally, pg_upgrade links or copies each user-defined table and its
-supporting indexes and toast tables from the old cluster to the new
-cluster.
-
-An important feature of the pg_upgrade design is that it leaves the
-original cluster intact --- if a problem occurs during the upgrade, you
-can still run the previous version, after renaming the tablespaces back
-to the original names.
diff --git a/contrib/pg_upgrade/Makefile b/contrib/pg_upgrade/Makefile
deleted file mode 100644
index 87da4b8..0000000
--- a/contrib/pg_upgrade/Makefile
+++ /dev/null
@@ -1,34 +0,0 @@
-# contrib/pg_upgrade/Makefile
-
-PGFILEDESC = "pg_upgrade - an in-place binary upgrade utility"
-PGAPPICON = win32
-
-PROGRAM  = pg_upgrade
-OBJS = check.o controldata.o dump.o exec.o file.o function.o info.o \
-       option.o page.o parallel.o pg_upgrade.o relfilenode.o server.o \
-       tablespace.o util.o version.o $(WIN32RES)
-
-PG_CPPFLAGS  = -DFRONTEND -DDLSUFFIX=\"$(DLSUFFIX)\" -I$(srcdir) -I$(libpq_srcdir)
-PG_LIBS = $(libpq_pgport)
-
-EXTRA_CLEAN = analyze_new_cluster.sh delete_old_cluster.sh log/ tmp_check/ \
-              pg_upgrade_dump_globals.sql \
-              pg_upgrade_dump_*.custom pg_upgrade_*.log
-
-ifdef USE_PGXS
-PG_CONFIG = pg_config
-PGXS := $(shell $(PG_CONFIG) --pgxs)
-include $(PGXS)
-else
-subdir = contrib/pg_upgrade
-top_builddir = ../..
-include $(top_builddir)/src/Makefile.global
-include $(top_srcdir)/contrib/contrib-global.mk
-endif
-
-check: test.sh all
-	MAKE=$(MAKE) bindir=$(bindir) libdir=$(libdir) EXTRA_REGRESS_OPTS="$(EXTRA_REGRESS_OPTS)" $(SHELL) $< --install
-
-# disabled because it upsets the build farm
-#installcheck: test.sh
-#	MAKE=$(MAKE) bindir=$(bindir) libdir=$(libdir) $(SHELL) $<
diff --git a/contrib/pg_upgrade/TESTING b/contrib/pg_upgrade/TESTING
deleted file mode 100644
index 359688c..0000000
--- a/contrib/pg_upgrade/TESTING
+++ /dev/null
@@ -1,83 +0,0 @@
-contrib/pg_upgrade/TESTING
-
-The most effective way to test pg_upgrade, aside from testing on user
-data, is by upgrading the PostgreSQL regression database.
-
-This testing process first requires the creation of a valid regression
-database dump.  Such files contain most database features and are
-specific to each major version of Postgres.
-
-Here are the steps needed to create a regression database dump file:
-
-1)  Create and populate the regression database in the old cluster
-    This database can be created by running 'make installcheck' from
-    src/test/regression.
-
-2)  Use pg_dump to dump out the regression database.  Use the new
-    cluster's pg_dump on the old database to minimize whitespace
-    differences in the diff.
-
-3)  Adjust the regression database dump file
-
-    a)  Perform the load/dump twice
-        This fixes problems with the ordering of COPY columns for
-        inherited tables.
-
-    b)  Change CREATE FUNCTION shared object paths to use '$libdir'
-        The old and new cluster will have different shared object paths.
-
-    c)  Fix any wrapping format differences
-        Commands like CREATE TRIGGER and ALTER TABLE sometimes have
-        differences.
-
-    d)  For pre-9.0, change CREATE OR REPLACE LANGUAGE to CREATE LANGUAGE
-
-    e)  For pre-9.0, remove 'regex_flavor'
-
-    f)  For pre-9.0, adjust extra_float_digits
-        Postgres 9.0 pg_dump uses extra_float_digits=-2 for pre-9.0
-        databases, and extra_float_digits=-3 for >= 9.0 databases.
-        It is necessary to modify 9.0 pg_dump to always use -3, and
-        modify the pre-9.0 old server to accept extra_float_digits=-3.
-
-Once the dump is created, it can be repeatedly loaded into the old
-database, upgraded, and dumped out of the new database, and then
-compared to the original version. To test the dump file, perform these
-steps:
-
-1)  Create the old and new clusters in different directories.
-
-2)  Copy the regression shared object files into the appropriate /lib
-    directory for old and new clusters.
-
-3)  Create the regression database in the old server.
-
-4)  Load the dump file created above into the regression database;
-    check for errors while loading.
-
-5)  Upgrade the old database to the new major version, as outlined in
-    the pg_upgrade manual section.
-
-6)  Use pg_dump to dump out the regression database in the new cluster.
-
-7)  Diff the regression database dump file with the regression dump
-    file loaded into the old server.
-
-The shell script test.sh in this directory performs more or less this
-procedure.  You can invoke it by running
-
-    make check
-
-or by running
-
-    make installcheck
-
-if "make install" (or "make install-world") were done beforehand.
-When invoked without arguments, it will run an upgrade from the
-version in this source tree to a new instance of the same version.  To
-test an upgrade from a different version, invoke it like this:
-
-    make installcheck oldbindir=...otherversion/bin oldsrc=...somewhere/postgresql
-
-In this case, you will have to manually eyeball the resulting dump
-diff for version-specific differences, as explained above.
diff --git a/contrib/pg_upgrade/check.c b/contrib/pg_upgrade/check.c
deleted file mode 100644
index 6a498c3..0000000
--- a/contrib/pg_upgrade/check.c
+++ /dev/null
@@ -1,1016 +0,0 @@
-/*
- *	check.c
- *
- *	server checks and output routines
- *
- *	Copyright (c) 2010-2015, PostgreSQL Global Development Group
- *	contrib/pg_upgrade/check.c
- */
-
-#include "postgres_fe.h"
-
-#include "catalog/pg_authid.h"
-#include "mb/pg_wchar.h"
-#include "pg_upgrade.h"
-
-
-static void check_new_cluster_is_empty(void);
-static void check_databases_are_compatible(void);
-static void check_locale_and_encoding(DbInfo *olddb, DbInfo *newdb);
-static bool equivalent_locale(int category, const char *loca, const char *locb);
-static void check_is_install_user(ClusterInfo *cluster);
-static void check_for_prepared_transactions(ClusterInfo *cluster);
-static void check_for_isn_and_int8_passing_mismatch(ClusterInfo *cluster);
-static void check_for_reg_data_type_usage(ClusterInfo *cluster);
-static void check_for_jsonb_9_4_usage(ClusterInfo *cluster);
-static void get_bin_version(ClusterInfo *cluster);
-static char *get_canonical_locale_name(int category, const char *locale);
-
-
-/*
- * fix_path_separator
- * For non-Windows, just return the argument.
- * For Windows convert any forward slash to a backslash
- * such as is suitable for arguments to builtin commands
- * like RMDIR and DEL.
- */
-static char *
-fix_path_separator(char *path)
-{
-#ifdef WIN32
-
-	char	   *result;
-	char	   *c;
-
-	result = pg_strdup(path);
-
-	for (c = result; *c != '\0'; c++)
-		if (*c == '/')
-			*c = '\\';
-
-	return result;
-#else
-
-	return path;
-#endif
-}
-
-void
-output_check_banner(bool live_check)
-{
-	if (user_opts.check && live_check)
-	{
-		pg_log(PG_REPORT, "Performing Consistency Checks on Old Live Server\n");
-		pg_log(PG_REPORT, "------------------------------------------------\n");
-	}
-	else
-	{
-		pg_log(PG_REPORT, "Performing Consistency Checks\n");
-		pg_log(PG_REPORT, "-----------------------------\n");
-	}
-}
-
-
-void
-check_and_dump_old_cluster(bool live_check)
-{
-	/* -- OLD -- */
-
-	if (!live_check)
-		start_postmaster(&old_cluster, true);
-
-	get_pg_database_relfilenode(&old_cluster);
-
-	/* Extract a list of databases and tables from the old cluster */
-	get_db_and_rel_infos(&old_cluster);
-
-	init_tablespaces();
-
-	get_loadable_libraries();
-
-
-	/*
-	 * Check for various failure cases
-	 */
-	check_is_install_user(&old_cluster);
-	check_for_prepared_transactions(&old_cluster);
-	check_for_reg_data_type_usage(&old_cluster);
-	check_for_isn_and_int8_passing_mismatch(&old_cluster);
-	if (GET_MAJOR_VERSION(old_cluster.major_version) == 904 &&
-		old_cluster.controldata.cat_ver < JSONB_FORMAT_CHANGE_CAT_VER)
-		check_for_jsonb_9_4_usage(&old_cluster);
-
-	/* Pre-PG 9.4 had a different 'line' data type internal format */
-	if (GET_MAJOR_VERSION(old_cluster.major_version) <= 903)
-		old_9_3_check_for_line_data_type_usage(&old_cluster);
-
-	/* Pre-PG 9.0 had no large object permissions */
-	if (GET_MAJOR_VERSION(old_cluster.major_version) <= 804)
-		new_9_0_populate_pg_largeobject_metadata(&old_cluster, true);
-
-	/*
-	 * While not a check option, we do this now because this is the only time
-	 * the old server is running.
-	 */
-	if (!user_opts.check)
-		generate_old_dump();
-
-	if (!live_check)
-		stop_postmaster(false);
-}
-
-
-void
-check_new_cluster(void)
-{
-	get_db_and_rel_infos(&new_cluster);
-
-	check_new_cluster_is_empty();
-	check_databases_are_compatible();
-
-	check_loadable_libraries();
-
-	if (user_opts.transfer_mode == TRANSFER_MODE_LINK)
-		check_hard_link();
-
-	check_is_install_user(&new_cluster);
-
-	check_for_prepared_transactions(&new_cluster);
-}
-
-
-void
-report_clusters_compatible(void)
-{
-	if (user_opts.check)
-	{
-		pg_log(PG_REPORT, "\n*Clusters are compatible*\n");
-		/* stops new cluster */
-		stop_postmaster(false);
-		exit(0);
-	}
-
-	pg_log(PG_REPORT, "\n"
-		   "If pg_upgrade fails after this point, you must re-initdb the\n"
-		   "new cluster before continuing.\n");
-}
-
-
-void
-issue_warnings(void)
-{
-	/* Create dummy large object permissions for old < PG 9.0? */
-	if (GET_MAJOR_VERSION(old_cluster.major_version) <= 804)
-	{
-		start_postmaster(&new_cluster, true);
-		new_9_0_populate_pg_largeobject_metadata(&new_cluster, false);
-		stop_postmaster(false);
-	}
-}
-
-
-void
-output_completion_banner(char *analyze_script_file_name,
-						 char *deletion_script_file_name)
-{
-	/* Did we copy the free space files? */
-	if (GET_MAJOR_VERSION(old_cluster.major_version) >= 804)
-		pg_log(PG_REPORT,
-			   "Optimizer statistics are not transferred by pg_upgrade so,\n"
-			   "once you start the new server, consider running:\n"
-			   "    %s\n\n", analyze_script_file_name);
-	else
-		pg_log(PG_REPORT,
-			   "Optimizer statistics and free space information are not transferred\n"
-		"by pg_upgrade so, once you start the new server, consider running:\n"
-			   "    %s\n\n", analyze_script_file_name);
-
-
-	if (deletion_script_file_name)
-		pg_log(PG_REPORT,
-			"Running this script will delete the old cluster's data files:\n"
-			   "    %s\n",
-			   deletion_script_file_name);
-	else
-		pg_log(PG_REPORT,
-			   "Could not create a script to delete the old cluster's data\n"
-		  "files because user-defined tablespaces exist in the old cluster\n"
-		"directory.  The old cluster's contents must be deleted manually.\n");
-}
-
-
-void
-check_cluster_versions(void)
-{
-	prep_status("Checking cluster versions");
-
-	/* get old and new cluster versions */
-	old_cluster.major_version = get_major_server_version(&old_cluster);
-	new_cluster.major_version = get_major_server_version(&new_cluster);
-
-	/*
-	 * We allow upgrades from/to the same major version for alpha/beta
-	 * upgrades
-	 */
-
-	if (GET_MAJOR_VERSION(old_cluster.major_version) < 804)
-		pg_fatal("This utility can only upgrade from PostgreSQL version 8.4 and later.\n");
-
-	/* Only current PG version is supported as a target */
-	if (GET_MAJOR_VERSION(new_cluster.major_version) != GET_MAJOR_VERSION(PG_VERSION_NUM))
-		pg_fatal("This utility can only upgrade to PostgreSQL version %s.\n",
-				 PG_MAJORVERSION);
-
-	/*
-	 * We can't allow downgrading because we use the target pg_dump, and
-	 * pg_dump cannot operate on newer database versions, only current and
-	 * older versions.
-	 */
-	if (old_cluster.major_version > new_cluster.major_version)
-		pg_fatal("This utility cannot be used to downgrade to older major PostgreSQL versions.\n");
-
-	/* get old and new binary versions */
-	get_bin_version(&old_cluster);
-	get_bin_version(&new_cluster);
-
-	/* Ensure binaries match the designated data directories */
-	if (GET_MAJOR_VERSION(old_cluster.major_version) !=
-		GET_MAJOR_VERSION(old_cluster.bin_version))
-		pg_fatal("Old cluster data and binary directories are from different major versions.\n");
-	if (GET_MAJOR_VERSION(new_cluster.major_version) !=
-		GET_MAJOR_VERSION(new_cluster.bin_version))
-		pg_fatal("New cluster data and binary directories are from different major versions.\n");
-
-	check_ok();
-}
-
-
-void
-check_cluster_compatibility(bool live_check)
-{
-	/* get/check pg_control data of servers */
-	get_control_data(&old_cluster, live_check);
-	get_control_data(&new_cluster, false);
-	check_control_data(&old_cluster.controldata, &new_cluster.controldata);
-
-	/* Is it 9.0 but without tablespace directories? */
-	if (GET_MAJOR_VERSION(new_cluster.major_version) == 900 &&
-		new_cluster.controldata.cat_ver < TABLE_SPACE_SUBDIRS_CAT_VER)
-		pg_fatal("This utility can only upgrade to PostgreSQL version 9.0 after 2010-01-11\n"
-				 "because of backend API changes made during development.\n");
-
-	/* We read the real port number for PG >= 9.1 */
-	if (live_check && GET_MAJOR_VERSION(old_cluster.major_version) < 901 &&
-		old_cluster.port == DEF_PGUPORT)
-		pg_fatal("When checking a pre-PG 9.1 live old server, "
-				 "you must specify the old server's port number.\n");
-
-	if (live_check && old_cluster.port == new_cluster.port)
-		pg_fatal("When checking a live server, "
-				 "the old and new port numbers must be different.\n");
-}
-
-
-/*
- * check_locale_and_encoding()
- *
- * Check that locale and encoding of a database in the old and new clusters
- * are compatible.
- */
-static void
-check_locale_and_encoding(DbInfo *olddb, DbInfo *newdb)
-{
-	if (olddb->db_encoding != newdb->db_encoding)
-		pg_fatal("encodings for database \"%s\" do not match:  old \"%s\", new \"%s\"\n",
-				 olddb->db_name,
-				 pg_encoding_to_char(olddb->db_encoding),
-				 pg_encoding_to_char(newdb->db_encoding));
-	if (!equivalent_locale(LC_COLLATE, olddb->db_collate, newdb->db_collate))
-		pg_fatal("lc_collate values for database \"%s\" do not match:  old \"%s\", new \"%s\"\n",
-				 olddb->db_name, olddb->db_collate, newdb->db_collate);
-	if (!equivalent_locale(LC_CTYPE, olddb->db_ctype, newdb->db_ctype))
-		pg_fatal("lc_ctype values for database \"%s\" do not match:  old \"%s\", new \"%s\"\n",
-				 olddb->db_name, olddb->db_ctype, newdb->db_ctype);
-}
-
-/*
- * equivalent_locale()
- *
- * Best effort locale-name comparison.  Return false if we are not 100% sure
- * the locales are equivalent.
- *
- * Note: The encoding parts of the names are ignored. This function is
- * currently used to compare locale names stored in pg_database, and
- * pg_database contains a separate encoding field. That's compared directly
- * in check_locale_and_encoding().
- */
-static bool
-equivalent_locale(int category, const char *loca, const char *locb)
-{
-	const char *chara;
-	const char *charb;
-	char	   *canona;
-	char	   *canonb;
-	int			lena;
-	int			lenb;
-
-	/*
-	 * If the names are equal, the locales are equivalent. Checking this
-	 * first avoids calling setlocale() in the common case that the names
-	 * are equal. That's a good thing, if setlocale() is buggy, for example.
-	 */
-	if (pg_strcasecmp(loca, locb) == 0)
-		return true;
-
-	/*
-	 * Not identical. Canonicalize both names, remove the encoding parts,
-	 * and try again.
-	 */
-	canona = get_canonical_locale_name(category, loca);
-	chara = strrchr(canona, '.');
-	lena = chara ? (chara - canona) : strlen(canona);
-
-	canonb = get_canonical_locale_name(category, locb);
-	charb = strrchr(canonb, '.');
-	lenb = charb ? (charb - canonb) : strlen(canonb);
-
-	if (lena == lenb && pg_strncasecmp(canona, canonb, lena) == 0)
-		return true;
-
-	return false;
-}
-
-
-static void
-check_new_cluster_is_empty(void)
-{
-	int			dbnum;
-
-	for (dbnum = 0; dbnum < new_cluster.dbarr.ndbs; dbnum++)
-	{
-		int			relnum;
-		RelInfoArr *rel_arr = &new_cluster.dbarr.dbs[dbnum].rel_arr;
-
-		for (relnum = 0; relnum < rel_arr->nrels;
-			 relnum++)
-		{
-			/* pg_largeobject and its index should be skipped */
-			if (strcmp(rel_arr->rels[relnum].nspname, "pg_catalog") != 0)
-				pg_fatal("New cluster database \"%s\" is not empty\n",
-						 new_cluster.dbarr.dbs[dbnum].db_name);
-		}
-	}
-}
-
-/*
- * Check that every database that already exists in the new cluster is
- * compatible with the corresponding database in the old one.
- */
-static void
-check_databases_are_compatible(void)
-{
-	int			newdbnum;
-	int			olddbnum;
-	DbInfo	   *newdbinfo;
-	DbInfo	   *olddbinfo;
-
-	for (newdbnum = 0; newdbnum < new_cluster.dbarr.ndbs; newdbnum++)
-	{
-		newdbinfo = &new_cluster.dbarr.dbs[newdbnum];
-
-		/* Find the corresponding database in the old cluster */
-		for (olddbnum = 0; olddbnum < old_cluster.dbarr.ndbs; olddbnum++)
-		{
-			olddbinfo = &old_cluster.dbarr.dbs[olddbnum];
-			if (strcmp(newdbinfo->db_name, olddbinfo->db_name) == 0)
-			{
-				check_locale_and_encoding(olddbinfo, newdbinfo);
-				break;
-			}
-		}
-	}
-}
-
-
-/*
- * create_script_for_cluster_analyze()
- *
- *	This incrementally generates better optimizer statistics
- */
-void
-create_script_for_cluster_analyze(char **analyze_script_file_name)
-{
-	FILE	   *script = NULL;
-	char	   *user_specification = "";
-
-	prep_status("Creating script to analyze new cluster");
-
-	if (os_info.user_specified)
-		user_specification = psprintf("-U \"%s\" ", os_info.user);
-
-	*analyze_script_file_name = psprintf("%sanalyze_new_cluster.%s",
-										 SCRIPT_PREFIX, SCRIPT_EXT);
-
-	if ((script = fopen_priv(*analyze_script_file_name, "w")) == NULL)
-		pg_fatal("Could not open file \"%s\": %s\n",
-				 *analyze_script_file_name, getErrorText(errno));
-
-#ifndef WIN32
-	/* add shebang header */
-	fprintf(script, "#!/bin/sh\n\n");
-#else
-	/* suppress command echoing */
-	fprintf(script, "@echo off\n");
-#endif
-
-	fprintf(script, "echo %sThis script will generate minimal optimizer statistics rapidly%s\n",
-			ECHO_QUOTE, ECHO_QUOTE);
-	fprintf(script, "echo %sso your system is usable, and then gather statistics twice more%s\n",
-			ECHO_QUOTE, ECHO_QUOTE);
-	fprintf(script, "echo %swith increasing accuracy.  When it is done, your system will%s\n",
-			ECHO_QUOTE, ECHO_QUOTE);
-	fprintf(script, "echo %shave the default level of optimizer statistics.%s\n",
-			ECHO_QUOTE, ECHO_QUOTE);
-	fprintf(script, "echo%s\n\n", ECHO_BLANK);
-
-	fprintf(script, "echo %sIf you have used ALTER TABLE to modify the statistics target for%s\n",
-			ECHO_QUOTE, ECHO_QUOTE);
-	fprintf(script, "echo %sany tables, you might want to remove them and restore them after%s\n",
-			ECHO_QUOTE, ECHO_QUOTE);
-	fprintf(script, "echo %srunning this script because they will delay fast statistics generation.%s\n",
-			ECHO_QUOTE, ECHO_QUOTE);
-	fprintf(script, "echo%s\n\n", ECHO_BLANK);
-
-	fprintf(script, "echo %sIf you would like default statistics as quickly as possible, cancel%s\n",
-			ECHO_QUOTE, ECHO_QUOTE);
-	fprintf(script, "echo %sthis script and run:%s\n",
-			ECHO_QUOTE, ECHO_QUOTE);
-	fprintf(script, "echo %s    \"%s/vacuumdb\" %s--all %s%s\n", ECHO_QUOTE,
-			new_cluster.bindir, user_specification,
-	/* Did we copy the free space files? */
-			(GET_MAJOR_VERSION(old_cluster.major_version) >= 804) ?
-			"--analyze-only" : "--analyze", ECHO_QUOTE);
-	fprintf(script, "echo%s\n\n", ECHO_BLANK);
-
-	fprintf(script, "\"%s/vacuumdb\" %s--all --analyze-in-stages\n",
-			new_cluster.bindir, user_specification);
-	/* Did we copy the free space files? */
-	if (GET_MAJOR_VERSION(old_cluster.major_version) < 804)
-		fprintf(script, "\"%s/vacuumdb\" %s--all\n", new_cluster.bindir,
-				user_specification);
-
-	fprintf(script, "echo%s\n\n", ECHO_BLANK);
-	fprintf(script, "echo %sDone%s\n",
-			ECHO_QUOTE, ECHO_QUOTE);
-
-	fclose(script);
-
-#ifndef WIN32
-	if (chmod(*analyze_script_file_name, S_IRWXU) != 0)
-		pg_fatal("Could not add execute permission to file \"%s\": %s\n",
-				 *analyze_script_file_name, getErrorText(errno));
-#endif
-
-	if (os_info.user_specified)
-		pg_free(user_specification);
-
-	check_ok();
-}
-
-
-/*
- * create_script_for_old_cluster_deletion()
- *
- *	This is particularly useful for tablespace deletion.
- */
-void
-create_script_for_old_cluster_deletion(char **deletion_script_file_name)
-{
-	FILE	   *script = NULL;
-	int			tblnum;
-	char		old_cluster_pgdata[MAXPGPATH];
-
-	*deletion_script_file_name = psprintf("%sdelete_old_cluster.%s",
-										  SCRIPT_PREFIX, SCRIPT_EXT);
-
-	/*
-	 * Some users (oddly) create tablespaces inside the cluster data
-	 * directory.  We can't create a proper old cluster delete script in that
-	 * case.
-	 */
-	strlcpy(old_cluster_pgdata, old_cluster.pgdata, MAXPGPATH);
-	canonicalize_path(old_cluster_pgdata);
-	for (tblnum = 0; tblnum < os_info.num_old_tablespaces; tblnum++)
-	{
-		char		old_tablespace_dir[MAXPGPATH];
-
-		strlcpy(old_tablespace_dir, os_info.old_tablespaces[tblnum], MAXPGPATH);
-		canonicalize_path(old_tablespace_dir);
-		if (path_is_prefix_of_path(old_cluster_pgdata, old_tablespace_dir))
-		{
-			/* Unlink file in case it is left over from a previous run. */
-			unlink(*deletion_script_file_name);
-			pg_free(*deletion_script_file_name);
-			*deletion_script_file_name = NULL;
-			return;
-		}
-	}
-
-	prep_status("Creating script to delete old cluster");
-
-	if ((script = fopen_priv(*deletion_script_file_name, "w")) == NULL)
-		pg_fatal("Could not open file \"%s\": %s\n",
-				 *deletion_script_file_name, getErrorText(errno));
-
-#ifndef WIN32
-	/* add shebang header */
-	fprintf(script, "#!/bin/sh\n\n");
-#endif
-
-	/* delete old cluster's default tablespace */
-	fprintf(script, RMDIR_CMD " \"%s\"\n", fix_path_separator(old_cluster.pgdata));
-
-	/* delete old cluster's alternate tablespaces */
-	for (tblnum = 0; tblnum < os_info.num_old_tablespaces; tblnum++)
-	{
-		/*
-		 * Do the old cluster's per-database directories share a directory
-		 * with a new version-specific tablespace?
-		 */
-		if (strlen(old_cluster.tablespace_suffix) == 0)
-		{
-			/* delete per-database directories */
-			int			dbnum;
-
-			fprintf(script, "\n");
-			/* remove PG_VERSION? */
-			if (GET_MAJOR_VERSION(old_cluster.major_version) <= 804)
-				fprintf(script, RM_CMD " %s%cPG_VERSION\n",
-						fix_path_separator(os_info.old_tablespaces[tblnum]),
-						PATH_SEPARATOR);
-
-			for (dbnum = 0; dbnum < old_cluster.dbarr.ndbs; dbnum++)
-				fprintf(script, RMDIR_CMD " \"%s%c%d\"\n",
-						fix_path_separator(os_info.old_tablespaces[tblnum]),
-						PATH_SEPARATOR, old_cluster.dbarr.dbs[dbnum].db_oid);
-		}
-		else
-		{
-			char	   *suffix_path = pg_strdup(old_cluster.tablespace_suffix);
-
-			/*
-			 * Simply delete the tablespace directory, which might be ".old"
-			 * or a version-specific subdirectory.
-			 */
-			fprintf(script, RMDIR_CMD " \"%s%s\"\n",
-					fix_path_separator(os_info.old_tablespaces[tblnum]),
-					fix_path_separator(suffix_path));
-			pfree(suffix_path);
-		}
-	}
-
-	fclose(script);
-
-#ifndef WIN32
-	if (chmod(*deletion_script_file_name, S_IRWXU) != 0)
-		pg_fatal("Could not add execute permission to file \"%s\": %s\n",
-				 *deletion_script_file_name, getErrorText(errno));
-#endif
-
-	check_ok();
-}
-
-
-/*
- *	check_is_install_user()
- *
- *	Check we are the install user, and that the new cluster
- *	has no other users.
- */
-static void
-check_is_install_user(ClusterInfo *cluster)
-{
-	PGresult   *res;
-	PGconn	   *conn = connectToServer(cluster, "template1");
-
-	prep_status("Checking database user is the install user");
-
-	/* Can't use pg_authid because only superusers can view it. */
-	res = executeQueryOrDie(conn,
-							"SELECT rolsuper, oid "
-							"FROM pg_catalog.pg_roles "
-							"WHERE rolname = current_user");
-
-	/*
-	 * We only allow the install user in the new cluster (see comment below)
-	 * and we preserve pg_authid.oid, so this must be the install user in
-	 * the old cluster too.
-	 */
-	if (PQntuples(res) != 1 ||
-		atooid(PQgetvalue(res, 0, 1)) != BOOTSTRAP_SUPERUSERID)
-		pg_fatal("database user \"%s\" is not the install user\n",
-				 os_info.user);
-
-	PQclear(res);
-
-	res = executeQueryOrDie(conn,
-							"SELECT COUNT(*) "
-							"FROM pg_catalog.pg_roles ");
-
-	if (PQntuples(res) != 1)
-		pg_fatal("could not determine the number of users\n");
-
-	/*
-	 * We only allow the install user in the new cluster because other defined
-	 * users might match users defined in the old cluster and generate an
-	 * error during pg_dump restore.
-	 */
-	if (cluster == &new_cluster && atooid(PQgetvalue(res, 0, 0)) != 1)
-		pg_fatal("Only the install user can be defined in the new cluster.\n");
-
-	PQclear(res);
-
-	PQfinish(conn);
-
-	check_ok();
-}
-
-
-/*
- *	check_for_prepared_transactions()
- *
- *	Make sure there are no prepared transactions because the storage format
- *	might have changed.
- */
-static void
-check_for_prepared_transactions(ClusterInfo *cluster)
-{
-	PGresult   *res;
-	PGconn	   *conn = connectToServer(cluster, "template1");
-
-	prep_status("Checking for prepared transactions");
-
-	res = executeQueryOrDie(conn,
-							"SELECT * "
-							"FROM pg_catalog.pg_prepared_xacts");
-
-	if (PQntuples(res) != 0)
-		pg_fatal("The %s cluster contains prepared transactions\n",
-				 CLUSTER_NAME(cluster));
-
-	PQclear(res);
-
-	PQfinish(conn);
-
-	check_ok();
-}
-
-
-/*
- *	check_for_isn_and_int8_passing_mismatch()
- *
- *	contrib/isn relies on data type int8, and in 8.4 int8 can now be passed
- *	by value.  The schema dumps the CREATE TYPE PASSEDBYVALUE setting so
- *	it must match for the old and new servers.
- */
-static void
-check_for_isn_and_int8_passing_mismatch(ClusterInfo *cluster)
-{
-	int			dbnum;
-	FILE	   *script = NULL;
-	bool		found = false;
-	char		output_path[MAXPGPATH];
-
-	prep_status("Checking for contrib/isn with bigint-passing mismatch");
-
-	if (old_cluster.controldata.float8_pass_by_value ==
-		new_cluster.controldata.float8_pass_by_value)
-	{
-		/* no mismatch */
-		check_ok();
-		return;
-	}
-
-	snprintf(output_path, sizeof(output_path),
-			 "contrib_isn_and_int8_pass_by_value.txt");
-
-	for (dbnum = 0; dbnum < cluster->dbarr.ndbs; dbnum++)
-	{
-		PGresult   *res;
-		bool		db_used = false;
-		int			ntups;
-		int			rowno;
-		int			i_nspname,
-					i_proname;
-		DbInfo	   *active_db = &cluster->dbarr.dbs[dbnum];
-		PGconn	   *conn = connectToServer(cluster, active_db->db_name);
-
-		/* Find any functions coming from contrib/isn */
-		res = executeQueryOrDie(conn,
-								"SELECT n.nspname, p.proname "
-								"FROM	pg_catalog.pg_proc p, "
-								"		pg_catalog.pg_namespace n "
-								"WHERE	p.pronamespace = n.oid AND "
-								"		p.probin = '$libdir/isn'");
-
-		ntups = PQntuples(res);
-		i_nspname = PQfnumber(res, "nspname");
-		i_proname = PQfnumber(res, "proname");
-		for (rowno = 0; rowno < ntups; rowno++)
-		{
-			found = true;
-			if (script == NULL && (script = fopen_priv(output_path, "w")) == NULL)
-				pg_fatal("Could not open file \"%s\": %s\n",
-						 output_path, getErrorText(errno));
-			if (!db_used)
-			{
-				fprintf(script, "Database: %s\n", active_db->db_name);
-				db_used = true;
-			}
-			fprintf(script, "  %s.%s\n",
-					PQgetvalue(res, rowno, i_nspname),
-					PQgetvalue(res, rowno, i_proname));
-		}
-
-		PQclear(res);
-
-		PQfinish(conn);
-	}
-
-	if (script)
-		fclose(script);
-
-	if (found)
-	{
-		pg_log(PG_REPORT, "fatal\n");
-		pg_fatal("Your installation contains \"contrib/isn\" functions which rely on the\n"
-		  "bigint data type.  Your old and new clusters pass bigint values\n"
-		"differently so this cluster cannot currently be upgraded.  You can\n"
-				 "manually upgrade databases that use \"contrib/isn\" facilities and remove\n"
-				 "\"contrib/isn\" from the old cluster and restart the upgrade.  A list of\n"
-				 "the problem functions is in the file:\n"
-				 "    %s\n\n", output_path);
-	}
-	else
-		check_ok();
-}
-
-
-/*
- * check_for_reg_data_type_usage()
- *	pg_upgrade only preserves these system values:
- *		pg_class.oid
- *		pg_type.oid
- *		pg_enum.oid
- *
- *	Many of the reg* data types reference system catalog info that is
- *	not preserved, and hence these data types cannot be used in user
- *	tables upgraded by pg_upgrade.
- */
-static void
-check_for_reg_data_type_usage(ClusterInfo *cluster)
-{
-	int			dbnum;
-	FILE	   *script = NULL;
-	bool		found = false;
-	char		output_path[MAXPGPATH];
-
-	prep_status("Checking for reg* system OID user data types");
-
-	snprintf(output_path, sizeof(output_path), "tables_using_reg.txt");
-
-	for (dbnum = 0; dbnum < cluster->dbarr.ndbs; dbnum++)
-	{
-		PGresult   *res;
-		bool		db_used = false;
-		int			ntups;
-		int			rowno;
-		int			i_nspname,
-					i_relname,
-					i_attname;
-		DbInfo	   *active_db = &cluster->dbarr.dbs[dbnum];
-		PGconn	   *conn = connectToServer(cluster, active_db->db_name);
-
-		/*
-		 * While several relkinds don't store any data, e.g. views, they can
-		 * be used to define data types of other columns, so we check all
-		 * relkinds.
-		 */
-		res = executeQueryOrDie(conn,
-								"SELECT n.nspname, c.relname, a.attname "
-								"FROM	pg_catalog.pg_class c, "
-								"		pg_catalog.pg_namespace n, "
-								"		pg_catalog.pg_attribute a "
-								"WHERE	c.oid = a.attrelid AND "
-								"		NOT a.attisdropped AND "
-								"		a.atttypid IN ( "
-		  "			'pg_catalog.regproc'::pg_catalog.regtype, "
-								"			'pg_catalog.regprocedure'::pg_catalog.regtype, "
-		  "			'pg_catalog.regoper'::pg_catalog.regtype, "
-								"			'pg_catalog.regoperator'::pg_catalog.regtype, "
-		/* regclass.oid is preserved, so 'regclass' is OK */
-		/* regtype.oid is preserved, so 'regtype' is OK */
-		"			'pg_catalog.regconfig'::pg_catalog.regtype, "
-								"			'pg_catalog.regdictionary'::pg_catalog.regtype) AND "
-								"		c.relnamespace = n.oid AND "
-							  "		n.nspname NOT IN ('pg_catalog', 'information_schema')");
-
-		ntups = PQntuples(res);
-		i_nspname = PQfnumber(res, "nspname");
-		i_relname = PQfnumber(res, "relname");
-		i_attname = PQfnumber(res, "attname");
-		for (rowno = 0; rowno < ntups; rowno++)
-		{
-			found = true;
-			if (script == NULL && (script = fopen_priv(output_path, "w")) == NULL)
-				pg_fatal("Could not open file \"%s\": %s\n",
-						 output_path, getErrorText(errno));
-			if (!db_used)
-			{
-				fprintf(script, "Database: %s\n", active_db->db_name);
-				db_used = true;
-			}
-			fprintf(script, "  %s.%s.%s\n",
-					PQgetvalue(res, rowno, i_nspname),
-					PQgetvalue(res, rowno, i_relname),
-					PQgetvalue(res, rowno, i_attname));
-		}
-
-		PQclear(res);
-
-		PQfinish(conn);
-	}
-
-	if (script)
-		fclose(script);
-
-	if (found)
-	{
-		pg_log(PG_REPORT, "fatal\n");
-		pg_fatal("Your installation contains one of the reg* data types in user tables.\n"
-		 "These data types reference system OIDs that are not preserved by\n"
-		"pg_upgrade, so this cluster cannot currently be upgraded.  You can\n"
-				 "remove the problem tables and restart the upgrade.  A list of the problem\n"
-				 "columns is in the file:\n"
-				 "    %s\n\n", output_path);
-	}
-	else
-		check_ok();
-}
-
-
-/*
- * check_for_jsonb_9_4_usage()
- *
- *	JSONB changed its storage format during 9.4 beta, so check for it.
- */
-static void
-check_for_jsonb_9_4_usage(ClusterInfo *cluster)
-{
-	int			dbnum;
-	FILE	   *script = NULL;
-	bool		found = false;
-	char		output_path[MAXPGPATH];
-
-	prep_status("Checking for JSONB user data types");
-
-	snprintf(output_path, sizeof(output_path), "tables_using_jsonb.txt");
-
-	for (dbnum = 0; dbnum < cluster->dbarr.ndbs; dbnum++)
-	{
-		PGresult   *res;
-		bool		db_used = false;
-		int			ntups;
-		int			rowno;
-		int			i_nspname,
-					i_relname,
-					i_attname;
-		DbInfo	   *active_db = &cluster->dbarr.dbs[dbnum];
-		PGconn	   *conn = connectToServer(cluster, active_db->db_name);
-
-		/*
-		 * While several relkinds don't store any data, e.g. views, they can
-		 * be used to define data types of other columns, so we check all
-		 * relkinds.
-		 */
-		res = executeQueryOrDie(conn,
-								"SELECT n.nspname, c.relname, a.attname "
-								"FROM	pg_catalog.pg_class c, "
-								"		pg_catalog.pg_namespace n, "
-								"		pg_catalog.pg_attribute a "
-								"WHERE	c.oid = a.attrelid AND "
-								"		NOT a.attisdropped AND "
-								"		a.atttypid = 'pg_catalog.jsonb'::pg_catalog.regtype AND "
-								"		c.relnamespace = n.oid AND "
-		/* exclude possible orphaned temp tables */
-								"  		n.nspname !~ '^pg_temp_' AND "
-							  "		n.nspname NOT IN ('pg_catalog', 'information_schema')");
-
-		ntups = PQntuples(res);
-		i_nspname = PQfnumber(res, "nspname");
-		i_relname = PQfnumber(res, "relname");
-		i_attname = PQfnumber(res, "attname");
-		for (rowno = 0; rowno < ntups; rowno++)
-		{
-			found = true;
-			if (script == NULL && (script = fopen_priv(output_path, "w")) == NULL)
-				pg_fatal("Could not open file \"%s\": %s\n",
-						 output_path, getErrorText(errno));
-			if (!db_used)
-			{
-				fprintf(script, "Database: %s\n", active_db->db_name);
-				db_used = true;
-			}
-			fprintf(script, "  %s.%s.%s\n",
-					PQgetvalue(res, rowno, i_nspname),
-					PQgetvalue(res, rowno, i_relname),
-					PQgetvalue(res, rowno, i_attname));
-		}
-
-		PQclear(res);
-
-		PQfinish(conn);
-	}
-
-	if (script)
-		fclose(script);
-
-	if (found)
-	{
-		pg_log(PG_REPORT, "fatal\n");
-		pg_fatal("Your installation contains one of the JSONB data types in user tables.\n"
-		 "The internal format of JSONB changed during 9.4 beta so this cluster cannot currently\n"
-				 "be upgraded.  You can remove the problem tables and restart the upgrade.  A list\n"
-				 "of the problem columns is in the file:\n"
-				 "    %s\n\n", output_path);
-	}
-	else
-		check_ok();
-}
-
-
-static void
-get_bin_version(ClusterInfo *cluster)
-{
-	char		cmd[MAXPGPATH],
-				cmd_output[MAX_STRING];
-	FILE	   *output;
-	int			pre_dot,
-				post_dot;
-
-	snprintf(cmd, sizeof(cmd), "\"%s/pg_ctl\" --version", cluster->bindir);
-
-	if ((output = popen(cmd, "r")) == NULL ||
-		fgets(cmd_output, sizeof(cmd_output), output) == NULL)
-		pg_fatal("Could not get pg_ctl version data using %s: %s\n",
-				 cmd, getErrorText(errno));
-
-	pclose(output);
-
-	/* Remove trailing newline */
-	if (strchr(cmd_output, '\n') != NULL)
-		*strchr(cmd_output, '\n') = '\0';
-
-	if (sscanf(cmd_output, "%*s %*s %d.%d", &pre_dot, &post_dot) != 2)
-		pg_fatal("could not get version from %s\n", cmd);
-
-	cluster->bin_version = (pre_dot * 100 + post_dot) * 100;
-}
-
-
-/*
- * get_canonical_locale_name
- *
- * Send the locale name to the system, and hope we get back a canonical
- * version.  This should match the backend's check_locale() function.
- */
-static char *
-get_canonical_locale_name(int category, const char *locale)
-{
-	char	   *save;
-	char	   *res;
-
-	/* get the current setting, so we can restore it. */
-	save = setlocale(category, NULL);
-	if (!save)
-		pg_fatal("failed to get the current locale\n");
-
-	/* 'save' may be pointing at a modifiable scratch variable, so copy it. */
-	save = pg_strdup(save);
-
-	/* set the locale with setlocale, to see if it accepts it. */
-	res = setlocale(category, locale);
-
-	if (!res)
-		pg_fatal("failed to get system locale name for \"%s\"\n", locale);
-
-	res = pg_strdup(res);
-
-	/* restore old value. */
-	if (!setlocale(category, save))
-		pg_fatal("failed to restore old locale \"%s\"\n", save);
-
-	pg_free(save);
-
-	return res;
-}
diff --git a/contrib/pg_upgrade/controldata.c b/contrib/pg_upgrade/controldata.c
deleted file mode 100644
index 0e70b6f..0000000
--- a/contrib/pg_upgrade/controldata.c
+++ /dev/null
@@ -1,606 +0,0 @@
-/*
- *	controldata.c
- *
- *	controldata functions
- *
- *	Copyright (c) 2010-2015, PostgreSQL Global Development Group
- *	contrib/pg_upgrade/controldata.c
- */
-
-#include "postgres_fe.h"
-
-#include "pg_upgrade.h"
-
-#include <ctype.h>
-
-/*
- * get_control_data()
- *
- * gets pg_control information in "ctrl". Assumes that bindir and
- * datadir are valid absolute paths to postgresql bin and pgdata
- * directories respectively *and* pg_resetxlog is version compatible
- * with datadir. The main purpose of this function is to get pg_control
- * data in a version independent manner.
- *
- * The approach taken here is to invoke pg_resetxlog with -n option
- * and then pipe its output. With little string parsing we get the
- * pg_control data.  pg_resetxlog cannot be run while the server is running
- * so we use pg_controldata;  pg_controldata doesn't provide all the fields
- * we need to actually perform the upgrade, but it provides enough for
- * check mode.  We do not implement pg_resetxlog -n because it is hard to
- * return valid xid data for a running server.
- */
-void
-get_control_data(ClusterInfo *cluster, bool live_check)
-{
-	char		cmd[MAXPGPATH];
-	char		bufin[MAX_STRING];
-	FILE	   *output;
-	char	   *p;
-	bool		got_xid = false;
-	bool		got_oid = false;
-	bool		got_nextxlogfile = false;
-	bool		got_multi = false;
-	bool		got_mxoff = false;
-	bool		got_oldestmulti = false;
-	bool		got_log_id = false;
-	bool		got_log_seg = false;
-	bool		got_tli = false;
-	bool		got_align = false;
-	bool		got_blocksz = false;
-	bool		got_largesz = false;
-	bool		got_walsz = false;
-	bool		got_walseg = false;
-	bool		got_ident = false;
-	bool		got_index = false;
-	bool		got_toast = false;
-	bool		got_large_object = false;
-	bool		got_date_is_int = false;
-	bool		got_float8_pass_by_value = false;
-	bool		got_data_checksum_version = false;
-	char	   *lc_collate = NULL;
-	char	   *lc_ctype = NULL;
-	char	   *lc_monetary = NULL;
-	char	   *lc_numeric = NULL;
-	char	   *lc_time = NULL;
-	char	   *lang = NULL;
-	char	   *language = NULL;
-	char	   *lc_all = NULL;
-	char	   *lc_messages = NULL;
-	uint32		logid = 0;
-	uint32		segno = 0;
-	uint32		tli = 0;
-
-
-	/*
-	 * Because we test the pg_resetxlog output as strings, it has to be in
-	 * English.  Copied from pg_regress.c.
-	 */
-	if (getenv("LC_COLLATE"))
-		lc_collate = pg_strdup(getenv("LC_COLLATE"));
-	if (getenv("LC_CTYPE"))
-		lc_ctype = pg_strdup(getenv("LC_CTYPE"));
-	if (getenv("LC_MONETARY"))
-		lc_monetary = pg_strdup(getenv("LC_MONETARY"));
-	if (getenv("LC_NUMERIC"))
-		lc_numeric = pg_strdup(getenv("LC_NUMERIC"));
-	if (getenv("LC_TIME"))
-		lc_time = pg_strdup(getenv("LC_TIME"));
-	if (getenv("LANG"))
-		lang = pg_strdup(getenv("LANG"));
-	if (getenv("LANGUAGE"))
-		language = pg_strdup(getenv("LANGUAGE"));
-	if (getenv("LC_ALL"))
-		lc_all = pg_strdup(getenv("LC_ALL"));
-	if (getenv("LC_MESSAGES"))
-		lc_messages = pg_strdup(getenv("LC_MESSAGES"));
-
-	pg_putenv("LC_COLLATE", NULL);
-	pg_putenv("LC_CTYPE", NULL);
-	pg_putenv("LC_MONETARY", NULL);
-	pg_putenv("LC_NUMERIC", NULL);
-	pg_putenv("LC_TIME", NULL);
-	pg_putenv("LANG",
-#ifndef WIN32
-			  NULL);
-#else
-	/* On Windows the default locale cannot be English, so force it */
-			  "en");
-#endif
-	pg_putenv("LANGUAGE", NULL);
-	pg_putenv("LC_ALL", NULL);
-	pg_putenv("LC_MESSAGES", "C");
-
-	snprintf(cmd, sizeof(cmd), "\"%s/%s \"%s\"",
-			 cluster->bindir,
-			 live_check ? "pg_controldata\"" : "pg_resetxlog\" -n",
-			 cluster->pgdata);
-	fflush(stdout);
-	fflush(stderr);
-
-	if ((output = popen(cmd, "r")) == NULL)
-		pg_fatal("Could not get control data using %s: %s\n",
-				 cmd, getErrorText(errno));
-
-	/* Only in <= 9.2 */
-	if (GET_MAJOR_VERSION(cluster->major_version) <= 902)
-	{
-		cluster->controldata.data_checksum_version = 0;
-		got_data_checksum_version = true;
-	}
-
-	/* we have the result of cmd in "output". so parse it line by line now */
-	while (fgets(bufin, sizeof(bufin), output))
-	{
-		pg_log(PG_VERBOSE, "%s", bufin);
-
-		if ((p = strstr(bufin, "pg_control version number:")) != NULL)
-		{
-			p = strchr(p, ':');
-
-			if (p == NULL || strlen(p) <= 1)
-				pg_fatal("%d: pg_resetxlog problem\n", __LINE__);
-
-			p++;				/* remove ':' char */
-			cluster->controldata.ctrl_ver = str2uint(p);
-		}
-		else if ((p = strstr(bufin, "Catalog version number:")) != NULL)
-		{
-			p = strchr(p, ':');
-
-			if (p == NULL || strlen(p) <= 1)
-				pg_fatal("%d: controldata retrieval problem\n", __LINE__);
-
-			p++;				/* remove ':' char */
-			cluster->controldata.cat_ver = str2uint(p);
-		}
-		else if ((p = strstr(bufin, "First log segment after reset:")) != NULL)
-		{
-			/* Skip the colon and any whitespace after it */
-			p = strchr(p, ':');
-			if (p == NULL || strlen(p) <= 1)
-				pg_fatal("%d: controldata retrieval problem\n", __LINE__);
-			p = strpbrk(p, "01234567890ABCDEF");
-			if (p == NULL || strlen(p) <= 1)
-				pg_fatal("%d: controldata retrieval problem\n", __LINE__);
-
-			/* Make sure it looks like a valid WAL file name */
-			if (strspn(p, "0123456789ABCDEF") != 24)
-				pg_fatal("%d: controldata retrieval problem\n", __LINE__);
-
-			strlcpy(cluster->controldata.nextxlogfile, p, 25);
-			got_nextxlogfile = true;
-		}
-		else if ((p = strstr(bufin, "First log file ID after reset:")) != NULL)
-		{
-			p = strchr(p, ':');
-
-			if (p == NULL || strlen(p) <= 1)
-				pg_fatal("%d: controldata retrieval problem\n", __LINE__);
-
-			p++;				/* remove ':' char */
-			logid = str2uint(p);
-			got_log_id = true;
-		}
-		else if ((p = strstr(bufin, "First log file segment after reset:")) != NULL)
-		{
-			p = strchr(p, ':');
-
-			if (p == NULL || strlen(p) <= 1)
-				pg_fatal("%d: controldata retrieval problem\n", __LINE__);
-
-			p++;				/* remove ':' char */
-			segno = str2uint(p);
-			got_log_seg = true;
-		}
-		else if ((p = strstr(bufin, "Latest checkpoint's TimeLineID:")) != NULL)
-		{
-			p = strchr(p, ':');
-
-			if (p == NULL || strlen(p) <= 1)
-				pg_fatal("%d: controldata retrieval problem\n", __LINE__);
-
-			p++;				/* remove ':' char */
-			cluster->controldata.chkpnt_tli = str2uint(p);
-			got_tli = true;
-		}
-		else if ((p = strstr(bufin, "Latest checkpoint's NextXID:")) != NULL)
-		{
-			p = strchr(p, ':');
-
-			if (p == NULL || strlen(p) <= 1)
-				pg_fatal("%d: controldata retrieval problem\n", __LINE__);
-
-			p++;				/* remove ':' char */
-			cluster->controldata.chkpnt_nxtepoch = str2uint(p);
-
-			p = strchr(p, '/');
-			if (p == NULL || strlen(p) <= 1)
-				pg_fatal("%d: controldata retrieval problem\n", __LINE__);
-
-			p++;				/* remove '/' char */
-			cluster->controldata.chkpnt_nxtxid = str2uint(p);
-			got_xid = true;
-		}
-		else if ((p = strstr(bufin, "Latest checkpoint's NextOID:")) != NULL)
-		{
-			p = strchr(p, ':');
-
-			if (p == NULL || strlen(p) <= 1)
-				pg_fatal("%d: controldata retrieval problem\n", __LINE__);
-
-			p++;				/* remove ':' char */
-			cluster->controldata.chkpnt_nxtoid = str2uint(p);
-			got_oid = true;
-		}
-		else if ((p = strstr(bufin, "Latest checkpoint's NextMultiXactId:")) != NULL)
-		{
-			p = strchr(p, ':');
-
-			if (p == NULL || strlen(p) <= 1)
-				pg_fatal("%d: controldata retrieval problem\n", __LINE__);
-
-			p++;				/* remove ':' char */
-			cluster->controldata.chkpnt_nxtmulti = str2uint(p);
-			got_multi = true;
-		}
-		else if ((p = strstr(bufin, "Latest checkpoint's oldestMultiXid:")) != NULL)
-		{
-			p = strchr(p, ':');
-
-			if (p == NULL || strlen(p) <= 1)
-				pg_fatal("%d: controldata retrieval problem\n", __LINE__);
-
-			p++;				/* remove ':' char */
-			cluster->controldata.chkpnt_oldstMulti = str2uint(p);
-			got_oldestmulti = true;
-		}
-		else if ((p = strstr(bufin, "Latest checkpoint's NextMultiOffset:")) != NULL)
-		{
-			p = strchr(p, ':');
-
-			if (p == NULL || strlen(p) <= 1)
-				pg_fatal("%d: controldata retrieval problem\n", __LINE__);
-
-			p++;				/* remove ':' char */
-			cluster->controldata.chkpnt_nxtmxoff = str2uint(p);
-			got_mxoff = true;
-		}
-		else if ((p = strstr(bufin, "Maximum data alignment:")) != NULL)
-		{
-			p = strchr(p, ':');
-
-			if (p == NULL || strlen(p) <= 1)
-				pg_fatal("%d: controldata retrieval problem\n", __LINE__);
-
-			p++;				/* remove ':' char */
-			cluster->controldata.align = str2uint(p);
-			got_align = true;
-		}
-		else if ((p = strstr(bufin, "Database block size:")) != NULL)
-		{
-			p = strchr(p, ':');
-
-			if (p == NULL || strlen(p) <= 1)
-				pg_fatal("%d: controldata retrieval problem\n", __LINE__);
-
-			p++;				/* remove ':' char */
-			cluster->controldata.blocksz = str2uint(p);
-			got_blocksz = true;
-		}
-		else if ((p = strstr(bufin, "Blocks per segment of large relation:")) != NULL)
-		{
-			p = strchr(p, ':');
-
-			if (p == NULL || strlen(p) <= 1)
-				pg_fatal("%d: controldata retrieval problem\n", __LINE__);
-
-			p++;				/* remove ':' char */
-			cluster->controldata.largesz = str2uint(p);
-			got_largesz = true;
-		}
-		else if ((p = strstr(bufin, "WAL block size:")) != NULL)
-		{
-			p = strchr(p, ':');
-
-			if (p == NULL || strlen(p) <= 1)
-				pg_fatal("%d: controldata retrieval problem\n", __LINE__);
-
-			p++;				/* remove ':' char */
-			cluster->controldata.walsz = str2uint(p);
-			got_walsz = true;
-		}
-		else if ((p = strstr(bufin, "Bytes per WAL segment:")) != NULL)
-		{
-			p = strchr(p, ':');
-
-			if (p == NULL || strlen(p) <= 1)
-				pg_fatal("%d: controldata retrieval problem\n", __LINE__);
-
-			p++;				/* remove ':' char */
-			cluster->controldata.walseg = str2uint(p);
-			got_walseg = true;
-		}
-		else if ((p = strstr(bufin, "Maximum length of identifiers:")) != NULL)
-		{
-			p = strchr(p, ':');
-
-			if (p == NULL || strlen(p) <= 1)
-				pg_fatal("%d: controldata retrieval problem\n", __LINE__);
-
-			p++;				/* remove ':' char */
-			cluster->controldata.ident = str2uint(p);
-			got_ident = true;
-		}
-		else if ((p = strstr(bufin, "Maximum columns in an index:")) != NULL)
-		{
-			p = strchr(p, ':');
-
-			if (p == NULL || strlen(p) <= 1)
-				pg_fatal("%d: controldata retrieval problem\n", __LINE__);
-
-			p++;				/* remove ':' char */
-			cluster->controldata.index = str2uint(p);
-			got_index = true;
-		}
-		else if ((p = strstr(bufin, "Maximum size of a TOAST chunk:")) != NULL)
-		{
-			p = strchr(p, ':');
-
-			if (p == NULL || strlen(p) <= 1)
-				pg_fatal("%d: controldata retrieval problem\n", __LINE__);
-
-			p++;				/* remove ':' char */
-			cluster->controldata.toast = str2uint(p);
-			got_toast = true;
-		}
-		else if ((p = strstr(bufin, "Size of a large-object chunk:")) != NULL)
-		{
-			p = strchr(p, ':');
-
-			if (p == NULL || strlen(p) <= 1)
-				pg_fatal("%d: controldata retrieval problem\n", __LINE__);
-
-			p++;				/* remove ':' char */
-			cluster->controldata.large_object = str2uint(p);
-			got_large_object = true;
-		}
-		else if ((p = strstr(bufin, "Date/time type storage:")) != NULL)
-		{
-			p = strchr(p, ':');
-
-			if (p == NULL || strlen(p) <= 1)
-				pg_fatal("%d: controldata retrieval problem\n", __LINE__);
-
-			p++;				/* remove ':' char */
-			cluster->controldata.date_is_int = strstr(p, "64-bit integers") != NULL;
-			got_date_is_int = true;
-		}
-		else if ((p = strstr(bufin, "Float8 argument passing:")) != NULL)
-		{
-			p = strchr(p, ':');
-
-			if (p == NULL || strlen(p) <= 1)
-				pg_fatal("%d: controldata retrieval problem\n", __LINE__);
-
-			p++;				/* remove ':' char */
-			/* used later for contrib check */
-			cluster->controldata.float8_pass_by_value = strstr(p, "by value") != NULL;
-			got_float8_pass_by_value = true;
-		}
-		else if ((p = strstr(bufin, "checksum")) != NULL)
-		{
-			p = strchr(p, ':');
-
-			if (p == NULL || strlen(p) <= 1)
-				pg_fatal("%d: controldata retrieval problem\n", __LINE__);
-
-			p++;				/* remove ':' char */
-			/* used later for contrib check */
-			cluster->controldata.data_checksum_version = str2uint(p);
-			got_data_checksum_version = true;
-		}
-	}
-
-	if (output)
-		pclose(output);
-
-	/*
-	 * Restore environment variables
-	 */
-	pg_putenv("LC_COLLATE", lc_collate);
-	pg_putenv("LC_CTYPE", lc_ctype);
-	pg_putenv("LC_MONETARY", lc_monetary);
-	pg_putenv("LC_NUMERIC", lc_numeric);
-	pg_putenv("LC_TIME", lc_time);
-	pg_putenv("LANG", lang);
-	pg_putenv("LANGUAGE", language);
-	pg_putenv("LC_ALL", lc_all);
-	pg_putenv("LC_MESSAGES", lc_messages);
-
-	pg_free(lc_collate);
-	pg_free(lc_ctype);
-	pg_free(lc_monetary);
-	pg_free(lc_numeric);
-	pg_free(lc_time);
-	pg_free(lang);
-	pg_free(language);
-	pg_free(lc_all);
-	pg_free(lc_messages);
-
-	/*
-	 * Before 9.3, pg_resetxlog reported the xlogid and segno of the first log
-	 * file after reset as separate lines. Starting with 9.3, it reports the
-	 * WAL file name. If the old cluster is older than 9.3, we construct the
-	 * WAL file name from the xlogid and segno.
-	 */
-	if (GET_MAJOR_VERSION(cluster->major_version) <= 902)
-	{
-		if (got_log_id && got_log_seg)
-		{
-			snprintf(cluster->controldata.nextxlogfile, 25, "%08X%08X%08X",
-					 tli, logid, segno);
-			got_nextxlogfile = true;
-		}
-	}
-
-	/* verify that we got all the mandatory pg_control data */
-	if (!got_xid || !got_oid ||
-		!got_multi || !got_mxoff ||
-		(!got_oldestmulti &&
-		 cluster->controldata.cat_ver >= MULTIXACT_FORMATCHANGE_CAT_VER) ||
-		(!live_check && !got_nextxlogfile) ||
-		!got_tli ||
-		!got_align || !got_blocksz || !got_largesz || !got_walsz ||
-		!got_walseg || !got_ident || !got_index || !got_toast ||
-		(!got_large_object &&
-		 cluster->controldata.ctrl_ver >= LARGE_OBJECT_SIZE_PG_CONTROL_VER) ||
-		!got_date_is_int || !got_float8_pass_by_value || !got_data_checksum_version)
-	{
-		pg_log(PG_REPORT,
-			   "The %s cluster lacks some required control information:\n",
-			   CLUSTER_NAME(cluster));
-
-		if (!got_xid)
-			pg_log(PG_REPORT, "  checkpoint next XID\n");
-
-		if (!got_oid)
-			pg_log(PG_REPORT, "  latest checkpoint next OID\n");
-
-		if (!got_multi)
-			pg_log(PG_REPORT, "  latest checkpoint next MultiXactId\n");
-
-		if (!got_mxoff)
-			pg_log(PG_REPORT, "  latest checkpoint next MultiXactOffset\n");
-
-		if (!got_oldestmulti &&
-			cluster->controldata.cat_ver >= MULTIXACT_FORMATCHANGE_CAT_VER)
-			pg_log(PG_REPORT, "  latest checkpoint oldest MultiXactId\n");
-
-		if (!live_check && !got_nextxlogfile)
-			pg_log(PG_REPORT, "  first WAL segment after reset\n");
-
-		if (!got_tli)
-			pg_log(PG_REPORT, "  latest checkpoint timeline ID\n");
-
-		if (!got_align)
-			pg_log(PG_REPORT, "  maximum alignment\n");
-
-		if (!got_blocksz)
-			pg_log(PG_REPORT, "  block size\n");
-
-		if (!got_largesz)
-			pg_log(PG_REPORT, "  large relation segment size\n");
-
-		if (!got_walsz)
-			pg_log(PG_REPORT, "  WAL block size\n");
-
-		if (!got_walseg)
-			pg_log(PG_REPORT, "  WAL segment size\n");
-
-		if (!got_ident)
-			pg_log(PG_REPORT, "  maximum identifier length\n");
-
-		if (!got_index)
-			pg_log(PG_REPORT, "  maximum number of indexed columns\n");
-
-		if (!got_toast)
-			pg_log(PG_REPORT, "  maximum TOAST chunk size\n");
-
-		if (!got_large_object &&
-			cluster->controldata.ctrl_ver >= LARGE_OBJECT_SIZE_PG_CONTROL_VER)
-			pg_log(PG_REPORT, "  large-object chunk size\n");
-
-		if (!got_date_is_int)
-			pg_log(PG_REPORT, "  dates/times are integers?\n");
-
-		if (!got_float8_pass_by_value)
-			pg_log(PG_REPORT, "  float8 argument passing method\n");
-
-		/* value added in Postgres 9.3 */
-		if (!got_data_checksum_version)
-			pg_log(PG_REPORT, "  data checksum version\n");
-
-		pg_fatal("Cannot continue without required control information, terminating\n");
-	}
-}
-
-
-/*
- * check_control_data()
- *
- * check to make sure the control data settings are compatible
- */
-void
-check_control_data(ControlData *oldctrl,
-				   ControlData *newctrl)
-{
-	if (oldctrl->align == 0 || oldctrl->align != newctrl->align)
-		pg_fatal("old and new pg_controldata alignments are invalid or do not match\n"
-			   "Likely one cluster is a 32-bit install, the other 64-bit\n");
-
-	if (oldctrl->blocksz == 0 || oldctrl->blocksz != newctrl->blocksz)
-		pg_fatal("old and new pg_controldata block sizes are invalid or do not match\n");
-
-	if (oldctrl->largesz == 0 || oldctrl->largesz != newctrl->largesz)
-		pg_fatal("old and new pg_controldata maximum relation segement sizes are invalid or do not match\n");
-
-	if (oldctrl->walsz == 0 || oldctrl->walsz != newctrl->walsz)
-		pg_fatal("old and new pg_controldata WAL block sizes are invalid or do not match\n");
-
-	if (oldctrl->walseg == 0 || oldctrl->walseg != newctrl->walseg)
-		pg_fatal("old and new pg_controldata WAL segment sizes are invalid or do not match\n");
-
-	if (oldctrl->ident == 0 || oldctrl->ident != newctrl->ident)
-		pg_fatal("old and new pg_controldata maximum identifier lengths are invalid or do not match\n");
-
-	if (oldctrl->index == 0 || oldctrl->index != newctrl->index)
-		pg_fatal("old and new pg_controldata maximum indexed columns are invalid or do not match\n");
-
-	if (oldctrl->toast == 0 || oldctrl->toast != newctrl->toast)
-		pg_fatal("old and new pg_controldata maximum TOAST chunk sizes are invalid or do not match\n");
-
-	/* large_object added in 9.5, so it might not exist in the old cluster */
-	if (oldctrl->large_object != 0 &&
-		oldctrl->large_object != newctrl->large_object)
-		pg_fatal("old and new pg_controldata large-object chunk sizes are invalid or do not match\n");
-
-	if (oldctrl->date_is_int != newctrl->date_is_int)
-		pg_fatal("old and new pg_controldata date/time storage types do not match\n");
-
-	/*
-	 * We might eventually allow upgrades from checksum to no-checksum
-	 * clusters.
-	 */
-	if (oldctrl->data_checksum_version == 0 &&
-		newctrl->data_checksum_version != 0)
-		pg_fatal("old cluster does not use data checksums but the new one does\n");
-	else if (oldctrl->data_checksum_version != 0 &&
-			 newctrl->data_checksum_version == 0)
-		pg_fatal("old cluster uses data checksums but the new one does not\n");
-	else if (oldctrl->data_checksum_version != newctrl->data_checksum_version)
-		pg_fatal("old and new cluster pg_controldata checksum versions do not match\n");
-}
-
-
-void
-disable_old_cluster(void)
-{
-	char		old_path[MAXPGPATH],
-				new_path[MAXPGPATH];
-
-	/* rename pg_control so old server cannot be accidentally started */
-	prep_status("Adding \".old\" suffix to old global/pg_control");
-
-	snprintf(old_path, sizeof(old_path), "%s/global/pg_control", old_cluster.pgdata);
-	snprintf(new_path, sizeof(new_path), "%s/global/pg_control.old", old_cluster.pgdata);
-	if (pg_mv_file(old_path, new_path) != 0)
-		pg_fatal("Unable to rename %s to %s.\n", old_path, new_path);
-	check_ok();
-
-	pg_log(PG_REPORT, "\n"
-		   "If you want to start the old cluster, you will need to remove\n"
-		   "the \".old\" suffix from %s/global/pg_control.old.\n"
-		 "Because \"link\" mode was used, the old cluster cannot be safely\n"
-	"started once the new cluster has been started.\n\n", old_cluster.pgdata);
-}
diff --git a/contrib/pg_upgrade/dump.c b/contrib/pg_upgrade/dump.c
deleted file mode 100644
index 941c4bb..0000000
--- a/contrib/pg_upgrade/dump.c
+++ /dev/null
@@ -1,139 +0,0 @@
-/*
- *	dump.c
- *
- *	dump functions
- *
- *	Copyright (c) 2010-2015, PostgreSQL Global Development Group
- *	contrib/pg_upgrade/dump.c
- */
-
-#include "postgres_fe.h"
-
-#include "pg_upgrade.h"
-
-#include <sys/types.h>
-#include "catalog/binary_upgrade.h"
-
-
-void
-generate_old_dump(void)
-{
-	int			dbnum;
-	mode_t		old_umask;
-
-	prep_status("Creating dump of global objects");
-
-	/* run new pg_dumpall binary for globals */
-	exec_prog(UTILITY_LOG_FILE, NULL, true,
-			  "\"%s/pg_dumpall\" %s --globals-only --quote-all-identifiers "
-			  "--binary-upgrade %s -f %s",
-			  new_cluster.bindir, cluster_conn_opts(&old_cluster),
-			  log_opts.verbose ? "--verbose" : "",
-			  GLOBALS_DUMP_FILE);
-	check_ok();
-
-	prep_status("Creating dump of database schemas\n");
-
-	/*
-	 * Set umask for this function, all functions it calls, and all
-	 * subprocesses/threads it creates.  We can't use fopen_priv() as Windows
-	 * uses threads and umask is process-global.
-	 */
-	old_umask = umask(S_IRWXG | S_IRWXO);
-
-	/* create per-db dump files */
-	for (dbnum = 0; dbnum < old_cluster.dbarr.ndbs; dbnum++)
-	{
-		char		sql_file_name[MAXPGPATH],
-					log_file_name[MAXPGPATH];
-		DbInfo	   *old_db = &old_cluster.dbarr.dbs[dbnum];
-
-		pg_log(PG_STATUS, "%s", old_db->db_name);
-		snprintf(sql_file_name, sizeof(sql_file_name), DB_DUMP_FILE_MASK, old_db->db_oid);
-		snprintf(log_file_name, sizeof(log_file_name), DB_DUMP_LOG_FILE_MASK, old_db->db_oid);
-
-		parallel_exec_prog(log_file_name, NULL,
-				   "\"%s/pg_dump\" %s --schema-only --quote-all-identifiers "
-				  "--binary-upgrade --format=custom %s --file=\"%s\" \"%s\"",
-						 new_cluster.bindir, cluster_conn_opts(&old_cluster),
-						   log_opts.verbose ? "--verbose" : "",
-						   sql_file_name, old_db->db_name);
-	}
-
-	/* reap all children */
-	while (reap_child(true) == true)
-		;
-
-	umask(old_umask);
-
-	end_progress_output();
-	check_ok();
-}
-
-
-/*
- * It is possible for there to be a mismatch in the need for TOAST tables
- * between the old and new servers, e.g. some pre-9.1 tables didn't need
- * TOAST tables but will need them in 9.1+.  (There are also opposite cases,
- * but these are handled by setting binary_upgrade_next_toast_pg_class_oid.)
- *
- * We can't allow the TOAST table to be created by pg_dump with a
- * pg_dump-assigned oid because it might conflict with a later table that
- * uses that oid, causing a "file exists" error for pg_class conflicts, and
- * a "duplicate oid" error for pg_type conflicts.  (TOAST tables need pg_type
- * entries.)
- *
- * Therefore, a backend in binary-upgrade mode will not create a TOAST
- * table unless an OID as passed in via pg_upgrade_support functions.
- * This function is called after the restore and uses ALTER TABLE to
- * auto-create any needed TOAST tables which will not conflict with
- * restored oids.
- */
-void
-optionally_create_toast_tables(void)
-{
-	int			dbnum;
-
-	prep_status("Creating newly-required TOAST tables");
-
-	for (dbnum = 0; dbnum < new_cluster.dbarr.ndbs; dbnum++)
-	{
-		PGresult   *res;
-		int			ntups;
-		int			rowno;
-		int			i_nspname,
-					i_relname;
-		DbInfo	   *active_db = &new_cluster.dbarr.dbs[dbnum];
-		PGconn	   *conn = connectToServer(&new_cluster, active_db->db_name);
-
-		res = executeQueryOrDie(conn,
-								"SELECT n.nspname, c.relname "
-								"FROM	pg_catalog.pg_class c, "
-								"		pg_catalog.pg_namespace n "
-								"WHERE	c.relnamespace = n.oid AND "
-							  "		n.nspname NOT IN ('pg_catalog', 'information_schema') AND "
-								"c.relkind IN ('r', 'm') AND "
-								"c.reltoastrelid = 0");
-
-		ntups = PQntuples(res);
-		i_nspname = PQfnumber(res, "nspname");
-		i_relname = PQfnumber(res, "relname");
-		for (rowno = 0; rowno < ntups; rowno++)
-		{
-			/* enable auto-oid-numbered TOAST creation if needed */
-			PQclear(executeQueryOrDie(conn, "SELECT binary_upgrade.set_next_toast_pg_class_oid('%d'::pg_catalog.oid);",
-					OPTIONALLY_CREATE_TOAST_OID));
-
-			/* dummy command that also triggers check for required TOAST table */
-			PQclear(executeQueryOrDie(conn, "ALTER TABLE %s.%s RESET (binary_upgrade_dummy_option);",
-					quote_identifier(PQgetvalue(res, rowno, i_nspname)),
-					quote_identifier(PQgetvalue(res, rowno, i_relname))));
-		}
-
-		PQclear(res);
-
-		PQfinish(conn);
-	}
-
-	check_ok();
-}
diff --git a/contrib/pg_upgrade/exec.c b/contrib/pg_upgrade/exec.c
deleted file mode 100644
index bf87419..0000000
--- a/contrib/pg_upgrade/exec.c
+++ /dev/null
@@ -1,379 +0,0 @@
-/*
- *	exec.c
- *
- *	execution functions
- *
- *	Copyright (c) 2010-2015, PostgreSQL Global Development Group
- *	contrib/pg_upgrade/exec.c
- */
-
-#include "postgres_fe.h"
-
-#include "pg_upgrade.h"
-
-#include <fcntl.h>
-#include <sys/types.h>
-
-static void check_data_dir(const char *pg_data);
-static void check_bin_dir(ClusterInfo *cluster);
-static void validate_exec(const char *dir, const char *cmdName);
-
-#ifdef WIN32
-static int	win32_check_directory_write_permissions(void);
-#endif
-
-
-/*
- * exec_prog()
- *		Execute an external program with stdout/stderr redirected, and report
- *		errors
- *
- * Formats a command from the given argument list, logs it to the log file,
- * and attempts to execute that command.  If the command executes
- * successfully, exec_prog() returns true.
- *
- * If the command fails, an error message is saved to the specified log_file.
- * If throw_error is true, this raises a PG_FATAL error and pg_upgrade
- * terminates; otherwise it is just reported as PG_REPORT and exec_prog()
- * returns false.
- *
- * The code requires it be called first from the primary thread on Windows.
- */
-bool
-exec_prog(const char *log_file, const char *opt_log_file,
-		  bool throw_error, const char *fmt,...)
-{
-	int			result = 0;
-	int			written;
-
-#define MAXCMDLEN (2 * MAXPGPATH)
-	char		cmd[MAXCMDLEN];
-	FILE	   *log;
-	va_list		ap;
-
-#ifdef WIN32
-	static DWORD mainThreadId = 0;
-
-	/* We assume we are called from the primary thread first */
-	if (mainThreadId == 0)
-		mainThreadId = GetCurrentThreadId();
-#endif
-
-	written = 0;
-	va_start(ap, fmt);
-	written += vsnprintf(cmd + written, MAXCMDLEN - written, fmt, ap);
-	va_end(ap);
-	if (written >= MAXCMDLEN)
-		pg_fatal("command too long\n");
-	written += snprintf(cmd + written, MAXCMDLEN - written,
-						" >> \"%s\" 2>&1", log_file);
-	if (written >= MAXCMDLEN)
-		pg_fatal("command too long\n");
-
-	pg_log(PG_VERBOSE, "%s\n", cmd);
-
-#ifdef WIN32
-
-	/*
-	 * For some reason, Windows issues a file-in-use error if we write data to
-	 * the log file from a non-primary thread just before we create a
-	 * subprocess that also writes to the same log file.  One fix is to sleep
-	 * for 100ms.  A cleaner fix is to write to the log file _after_ the
-	 * subprocess has completed, so we do this only when writing from a
-	 * non-primary thread.  fflush(), running system() twice, and pre-creating
-	 * the file do not seem to help.
-	 */
-	if (mainThreadId != GetCurrentThreadId())
-		result = system(cmd);
-#endif
-
-	log = fopen(log_file, "a");
-
-#ifdef WIN32
-	{
-		/*
-		 * "pg_ctl -w stop" might have reported that the server has stopped
-		 * because the postmaster.pid file has been removed, but "pg_ctl -w
-		 * start" might still be in the process of closing and might still be
-		 * holding its stdout and -l log file descriptors open.  Therefore,
-		 * try to open the log file a few more times.
-		 */
-		int			iter;
-
-		for (iter = 0; iter < 4 && log == NULL; iter++)
-		{
-			pg_usleep(1000000); /* 1 sec */
-			log = fopen(log_file, "a");
-		}
-	}
-#endif
-
-	if (log == NULL)
-		pg_fatal("cannot write to log file %s\n", log_file);
-
-#ifdef WIN32
-	/* Are we printing "command:" before its output? */
-	if (mainThreadId == GetCurrentThreadId())
-		fprintf(log, "\n\n");
-#endif
-	fprintf(log, "command: %s\n", cmd);
-#ifdef WIN32
-	/* Are we printing "command:" after its output? */
-	if (mainThreadId != GetCurrentThreadId())
-		fprintf(log, "\n\n");
-#endif
-
-	/*
-	 * In Windows, we must close the log file at this point so the file is not
-	 * open while the command is running, or we get a share violation.
-	 */
-	fclose(log);
-
-#ifdef WIN32
-	/* see comment above */
-	if (mainThreadId == GetCurrentThreadId())
-#endif
-		result = system(cmd);
-
-	if (result != 0)
-	{
-		/* we might be on a progress status line, so go to the next line */
-		report_status(PG_REPORT, "\n*failure*");
-		fflush(stdout);
-
-		pg_log(PG_VERBOSE, "There were problems executing \"%s\"\n", cmd);
-		if (opt_log_file)
-			pg_log(throw_error ? PG_FATAL : PG_REPORT,
-				   "Consult the last few lines of \"%s\" or \"%s\" for\n"
-				   "the probable cause of the failure.\n",
-				   log_file, opt_log_file);
-		else
-			pg_log(throw_error ? PG_FATAL : PG_REPORT,
-				   "Consult the last few lines of \"%s\" for\n"
-				   "the probable cause of the failure.\n",
-				   log_file);
-	}
-
-#ifndef WIN32
-
-	/*
-	 * We can't do this on Windows because it will keep the "pg_ctl start"
-	 * output filename open until the server stops, so we do the \n\n above on
-	 * that platform.  We use a unique filename for "pg_ctl start" that is
-	 * never reused while the server is running, so it works fine.  We could
-	 * log these commands to a third file, but that just adds complexity.
-	 */
-	if ((log = fopen(log_file, "a")) == NULL)
-		pg_fatal("cannot write to log file %s\n", log_file);
-	fprintf(log, "\n\n");
-	fclose(log);
-#endif
-
-	return result == 0;
-}
-
-
-/*
- * pid_lock_file_exists()
- *
- * Checks whether the postmaster.pid file exists.
- */
-bool
-pid_lock_file_exists(const char *datadir)
-{
-	char		path[MAXPGPATH];
-	int			fd;
-
-	snprintf(path, sizeof(path), "%s/postmaster.pid", datadir);
-
-	if ((fd = open(path, O_RDONLY, 0)) < 0)
-	{
-		/* ENOTDIR means we will throw a more useful error later */
-		if (errno != ENOENT && errno != ENOTDIR)
-			pg_fatal("could not open file \"%s\" for reading: %s\n",
-					 path, getErrorText(errno));
-
-		return false;
-	}
-
-	close(fd);
-	return true;
-}
-
-
-/*
- * verify_directories()
- *
- * does all the hectic work of verifying directories and executables
- * of old and new server.
- *
- * NOTE: May update the values of all parameters
- */
-void
-verify_directories(void)
-{
-#ifndef WIN32
-	if (access(".", R_OK | W_OK | X_OK) != 0)
-#else
-	if (win32_check_directory_write_permissions() != 0)
-#endif
-		pg_fatal("You must have read and write access in the current directory.\n");
-
-	check_bin_dir(&old_cluster);
-	check_data_dir(old_cluster.pgdata);
-	check_bin_dir(&new_cluster);
-	check_data_dir(new_cluster.pgdata);
-}
-
-
-#ifdef WIN32
-/*
- * win32_check_directory_write_permissions()
- *
- *	access() on WIN32 can't check directory permissions, so we have to
- *	optionally create, then delete a file to check.
- *		http://msdn.microsoft.com/en-us/library/1w06ktdy%28v=vs.80%29.aspx
- */
-static int
-win32_check_directory_write_permissions(void)
-{
-	int			fd;
-
-	/*
-	 * We open a file we would normally create anyway.  We do this even in
-	 * 'check' mode, which isn't ideal, but this is the best we can do.
-	 */
-	if ((fd = open(GLOBALS_DUMP_FILE, O_RDWR | O_CREAT, S_IRUSR | S_IWUSR)) < 0)
-		return -1;
-	close(fd);
-
-	return unlink(GLOBALS_DUMP_FILE);
-}
-#endif
-
-
-/*
- * check_data_dir()
- *
- *	This function validates the given cluster directory - we search for a
- *	small set of subdirectories that we expect to find in a valid $PGDATA
- *	directory.  If any of the subdirectories are missing (or secured against
- *	us) we display an error message and exit()
- *
- */
-static void
-check_data_dir(const char *pg_data)
-{
-	char		subDirName[MAXPGPATH];
-	int			subdirnum;
-
-	/* start check with top-most directory */
-	const char *requiredSubdirs[] = {"", "base", "global", "pg_clog",
-		"pg_multixact", "pg_subtrans", "pg_tblspc", "pg_twophase",
-	"pg_xlog"};
-
-	for (subdirnum = 0;
-		 subdirnum < sizeof(requiredSubdirs) / sizeof(requiredSubdirs[0]);
-		 ++subdirnum)
-	{
-		struct stat statBuf;
-
-		snprintf(subDirName, sizeof(subDirName), "%s%s%s", pg_data,
-		/* Win32 can't stat() a directory with a trailing slash. */
-				 *requiredSubdirs[subdirnum] ? "/" : "",
-				 requiredSubdirs[subdirnum]);
-
-		if (stat(subDirName, &statBuf) != 0)
-			report_status(PG_FATAL, "check for \"%s\" failed: %s\n",
-						  subDirName, getErrorText(errno));
-		else if (!S_ISDIR(statBuf.st_mode))
-			report_status(PG_FATAL, "%s is not a directory\n",
-						  subDirName);
-	}
-}
-
-
-/*
- * check_bin_dir()
- *
- *	This function searches for the executables that we expect to find
- *	in the binaries directory.  If we find that a required executable
- *	is missing (or secured against us), we display an error message and
- *	exit().
- */
-static void
-check_bin_dir(ClusterInfo *cluster)
-{
-	struct stat statBuf;
-
-	/* check bindir */
-	if (stat(cluster->bindir, &statBuf) != 0)
-		report_status(PG_FATAL, "check for \"%s\" failed: %s\n",
-					  cluster->bindir, getErrorText(errno));
-	else if (!S_ISDIR(statBuf.st_mode))
-		report_status(PG_FATAL, "%s is not a directory\n",
-					  cluster->bindir);
-
-	validate_exec(cluster->bindir, "postgres");
-	validate_exec(cluster->bindir, "pg_ctl");
-	validate_exec(cluster->bindir, "pg_resetxlog");
-	if (cluster == &new_cluster)
-	{
-		/* these are only needed in the new cluster */
-		validate_exec(cluster->bindir, "psql");
-		validate_exec(cluster->bindir, "pg_dump");
-		validate_exec(cluster->bindir, "pg_dumpall");
-	}
-}
-
-
-/*
- * validate_exec()
- *
- * validate "path" as an executable file
- */
-static void
-validate_exec(const char *dir, const char *cmdName)
-{
-	char		path[MAXPGPATH];
-	struct stat buf;
-
-	snprintf(path, sizeof(path), "%s/%s", dir, cmdName);
-
-#ifdef WIN32
-	/* Windows requires a .exe suffix for stat() */
-	if (strlen(path) <= strlen(EXE_EXT) ||
-		pg_strcasecmp(path + strlen(path) - strlen(EXE_EXT), EXE_EXT) != 0)
-		strlcat(path, EXE_EXT, sizeof(path));
-#endif
-
-	/*
-	 * Ensure that the file exists and is a regular file.
-	 */
-	if (stat(path, &buf) < 0)
-		pg_fatal("check for \"%s\" failed: %s\n",
-				 path, getErrorText(errno));
-	else if (!S_ISREG(buf.st_mode))
-		pg_fatal("check for \"%s\" failed: not an executable file\n",
-				 path);
-
-	/*
-	 * Ensure that the file is both executable and readable (required for
-	 * dynamic loading).
-	 */
-#ifndef WIN32
-	if (access(path, R_OK) != 0)
-#else
-	if ((buf.st_mode & S_IRUSR) == 0)
-#endif
-		pg_fatal("check for \"%s\" failed: cannot read file (permission denied)\n",
-				 path);
-
-#ifndef WIN32
-	if (access(path, X_OK) != 0)
-#else
-	if ((buf.st_mode & S_IXUSR) == 0)
-#endif
-		pg_fatal("check for \"%s\" failed: cannot execute (permission denied)\n",
-				 path);
-}
diff --git a/contrib/pg_upgrade/file.c b/contrib/pg_upgrade/file.c
deleted file mode 100644
index 5a8d17a..0000000
--- a/contrib/pg_upgrade/file.c
+++ /dev/null
@@ -1,250 +0,0 @@
-/*
- *	file.c
- *
- *	file system operations
- *
- *	Copyright (c) 2010-2015, PostgreSQL Global Development Group
- *	contrib/pg_upgrade/file.c
- */
-
-#include "postgres_fe.h"
-
-#include "pg_upgrade.h"
-
-#include <fcntl.h>
-
-
-
-#ifndef WIN32
-static int	copy_file(const char *fromfile, const char *tofile, bool force);
-#else
-static int	win32_pghardlink(const char *src, const char *dst);
-#endif
-
-
-/*
- * copyAndUpdateFile()
- *
- *	Copies a relation file from src to dst.  If pageConverter is non-NULL, this function
- *	uses that pageConverter to do a page-by-page conversion.
- */
-const char *
-copyAndUpdateFile(pageCnvCtx *pageConverter,
-				  const char *src, const char *dst, bool force)
-{
-	if (pageConverter == NULL)
-	{
-		if (pg_copy_file(src, dst, force) == -1)
-			return getErrorText(errno);
-		else
-			return NULL;
-	}
-	else
-	{
-		/*
-		 * We have a pageConverter object - that implies that the
-		 * PageLayoutVersion differs between the two clusters so we have to
-		 * perform a page-by-page conversion.
-		 *
-		 * If the pageConverter can convert the entire file at once, invoke
-		 * that plugin function, otherwise, read each page in the relation
-		 * file and call the convertPage plugin function.
-		 */
-
-#ifdef PAGE_CONVERSION
-		if (pageConverter->convertFile)
-			return pageConverter->convertFile(pageConverter->pluginData,
-											  dst, src);
-		else
-#endif
-		{
-			int			src_fd;
-			int			dstfd;
-			char		buf[BLCKSZ];
-			ssize_t		bytesRead;
-			const char *msg = NULL;
-
-			if ((src_fd = open(src, O_RDONLY, 0)) < 0)
-				return "could not open source file";
-
-			if ((dstfd = open(dst, O_RDWR | O_CREAT | O_EXCL, S_IRUSR | S_IWUSR)) < 0)
-			{
-				close(src_fd);
-				return "could not create destination file";
-			}
-
-			while ((bytesRead = read(src_fd, buf, BLCKSZ)) == BLCKSZ)
-			{
-#ifdef PAGE_CONVERSION
-				if ((msg = pageConverter->convertPage(pageConverter->pluginData, buf, buf)) != NULL)
-					break;
-#endif
-				if (write(dstfd, buf, BLCKSZ) != BLCKSZ)
-				{
-					msg = "could not write new page to destination";
-					break;
-				}
-			}
-
-			close(src_fd);
-			close(dstfd);
-
-			if (msg)
-				return msg;
-			else if (bytesRead != 0)
-				return "found partial page in source file";
-			else
-				return NULL;
-		}
-	}
-}
-
-
-/*
- * linkAndUpdateFile()
- *
- * Creates a hard link between the given relation files. We use
- * this function to perform a true in-place update. If the on-disk
- * format of the new cluster is bit-for-bit compatible with the on-disk
- * format of the old cluster, we can simply link each relation
- * instead of copying the data from the old cluster to the new cluster.
- */
-const char *
-linkAndUpdateFile(pageCnvCtx *pageConverter,
-				  const char *src, const char *dst)
-{
-	if (pageConverter != NULL)
-		return "Cannot in-place update this cluster, page-by-page conversion is required";
-
-	if (pg_link_file(src, dst) == -1)
-		return getErrorText(errno);
-	else
-		return NULL;
-}
-
-
-#ifndef WIN32
-static int
-copy_file(const char *srcfile, const char *dstfile, bool force)
-{
-#define COPY_BUF_SIZE (50 * BLCKSZ)
-
-	int			src_fd;
-	int			dest_fd;
-	char	   *buffer;
-	int			ret = 0;
-	int			save_errno = 0;
-
-	if ((srcfile == NULL) || (dstfile == NULL))
-	{
-		errno = EINVAL;
-		return -1;
-	}
-
-	if ((src_fd = open(srcfile, O_RDONLY, 0)) < 0)
-		return -1;
-
-	if ((dest_fd = open(dstfile, O_RDWR | O_CREAT | (force ? 0 : O_EXCL), S_IRUSR | S_IWUSR)) < 0)
-	{
-		save_errno = errno;
-
-		if (src_fd != 0)
-			close(src_fd);
-
-		errno = save_errno;
-		return -1;
-	}
-
-	buffer = (char *) pg_malloc(COPY_BUF_SIZE);
-
-	/* perform data copying i.e read src source, write to destination */
-	while (true)
-	{
-		ssize_t		nbytes = read(src_fd, buffer, COPY_BUF_SIZE);
-
-		if (nbytes < 0)
-		{
-			save_errno = errno;
-			ret = -1;
-			break;
-		}
-
-		if (nbytes == 0)
-			break;
-
-		errno = 0;
-
-		if (write(dest_fd, buffer, nbytes) != nbytes)
-		{
-			/* if write didn't set errno, assume problem is no disk space */
-			if (errno == 0)
-				errno = ENOSPC;
-			save_errno = errno;
-			ret = -1;
-			break;
-		}
-	}
-
-	pg_free(buffer);
-
-	if (src_fd != 0)
-		close(src_fd);
-
-	if (dest_fd != 0)
-		close(dest_fd);
-
-	if (save_errno != 0)
-		errno = save_errno;
-
-	return ret;
-}
-#endif
-
-
-void
-check_hard_link(void)
-{
-	char		existing_file[MAXPGPATH];
-	char		new_link_file[MAXPGPATH];
-
-	snprintf(existing_file, sizeof(existing_file), "%s/PG_VERSION", old_cluster.pgdata);
-	snprintf(new_link_file, sizeof(new_link_file), "%s/PG_VERSION.linktest", new_cluster.pgdata);
-	unlink(new_link_file);		/* might fail */
-
-	if (pg_link_file(existing_file, new_link_file) == -1)
-	{
-		pg_fatal("Could not create hard link between old and new data directories: %s\n"
-				 "In link mode the old and new data directories must be on the same file system volume.\n",
-				 getErrorText(errno));
-	}
-	unlink(new_link_file);
-}
-
-#ifdef WIN32
-static int
-win32_pghardlink(const char *src, const char *dst)
-{
-	/*
-	 * CreateHardLinkA returns zero for failure
-	 * http://msdn.microsoft.com/en-us/library/aa363860(VS.85).aspx
-	 */
-	if (CreateHardLinkA(dst, src, NULL) == 0)
-		return -1;
-	else
-		return 0;
-}
-#endif
-
-
-/* fopen() file with no group/other permissions */
-FILE *
-fopen_priv(const char *path, const char *mode)
-{
-	mode_t		old_umask = umask(S_IRWXG | S_IRWXO);
-	FILE	   *fp;
-
-	fp = fopen(path, mode);
-	umask(old_umask);
-
-	return fp;
-}
diff --git a/contrib/pg_upgrade/function.c b/contrib/pg_upgrade/function.c
deleted file mode 100644
index deffe04..0000000
--- a/contrib/pg_upgrade/function.c
+++ /dev/null
@@ -1,353 +0,0 @@
-/*
- *	function.c
- *
- *	server-side function support
- *
- *	Copyright (c) 2010-2015, PostgreSQL Global Development Group
- *	contrib/pg_upgrade/function.c
- */
-
-#include "postgres_fe.h"
-
-#include "pg_upgrade.h"
-
-#include "access/transam.h"
-
-#define PG_UPGRADE_SUPPORT	"$libdir/pg_upgrade_support"
-
-/*
- * install_support_functions_in_new_db()
- *
- * pg_upgrade requires some support functions that enable it to modify
- * backend behavior.
- */
-void
-install_support_functions_in_new_db(const char *db_name)
-{
-	PGconn	   *conn = connectToServer(&new_cluster, db_name);
-
-	/* suppress NOTICE of dropped objects */
-	PQclear(executeQueryOrDie(conn,
-							  "SET client_min_messages = warning;"));
-	PQclear(executeQueryOrDie(conn,
-						   "DROP SCHEMA IF EXISTS binary_upgrade CASCADE;"));
-	PQclear(executeQueryOrDie(conn,
-							  "RESET client_min_messages;"));
-
-	PQclear(executeQueryOrDie(conn,
-							  "CREATE SCHEMA binary_upgrade;"));
-
-	PQclear(executeQueryOrDie(conn,
-							  "CREATE OR REPLACE FUNCTION "
-							  "binary_upgrade.set_next_pg_type_oid(OID) "
-							  "RETURNS VOID "
-							  "AS '$libdir/pg_upgrade_support' "
-							  "LANGUAGE C STRICT;"));
-	PQclear(executeQueryOrDie(conn,
-							  "CREATE OR REPLACE FUNCTION "
-							"binary_upgrade.set_next_array_pg_type_oid(OID) "
-							  "RETURNS VOID "
-							  "AS '$libdir/pg_upgrade_support' "
-							  "LANGUAGE C STRICT;"));
-	PQclear(executeQueryOrDie(conn,
-							  "CREATE OR REPLACE FUNCTION "
-							"binary_upgrade.set_next_toast_pg_type_oid(OID) "
-							  "RETURNS VOID "
-							  "AS '$libdir/pg_upgrade_support' "
-							  "LANGUAGE C STRICT;"));
-	PQclear(executeQueryOrDie(conn,
-							  "CREATE OR REPLACE FUNCTION "
-							"binary_upgrade.set_next_heap_pg_class_oid(OID) "
-							  "RETURNS VOID "
-							  "AS '$libdir/pg_upgrade_support' "
-							  "LANGUAGE C STRICT;"));
-	PQclear(executeQueryOrDie(conn,
-							  "CREATE OR REPLACE FUNCTION "
-						   "binary_upgrade.set_next_index_pg_class_oid(OID) "
-							  "RETURNS VOID "
-							  "AS '$libdir/pg_upgrade_support' "
-							  "LANGUAGE C STRICT;"));
-	PQclear(executeQueryOrDie(conn,
-							  "CREATE OR REPLACE FUNCTION "
-						   "binary_upgrade.set_next_toast_pg_class_oid(OID) "
-							  "RETURNS VOID "
-							  "AS '$libdir/pg_upgrade_support' "
-							  "LANGUAGE C STRICT;"));
-	PQclear(executeQueryOrDie(conn,
-							  "CREATE OR REPLACE FUNCTION "
-							  "binary_upgrade.set_next_pg_enum_oid(OID) "
-							  "RETURNS VOID "
-							  "AS '$libdir/pg_upgrade_support' "
-							  "LANGUAGE C STRICT;"));
-	PQclear(executeQueryOrDie(conn,
-							  "CREATE OR REPLACE FUNCTION "
-							  "binary_upgrade.set_next_pg_authid_oid(OID) "
-							  "RETURNS VOID "
-							  "AS '$libdir/pg_upgrade_support' "
-							  "LANGUAGE C STRICT;"));
-	PQclear(executeQueryOrDie(conn,
-							  "CREATE OR REPLACE FUNCTION "
-							  "binary_upgrade.create_empty_extension(text, text, bool, text, oid[], text[], text[]) "
-							  "RETURNS VOID "
-							  "AS '$libdir/pg_upgrade_support' "
-							  "LANGUAGE C;"));
-	PQfinish(conn);
-}
-
-
-void
-uninstall_support_functions_from_new_cluster(void)
-{
-	int			dbnum;
-
-	prep_status("Removing support functions from new cluster");
-
-	for (dbnum = 0; dbnum < new_cluster.dbarr.ndbs; dbnum++)
-	{
-		DbInfo	   *new_db = &new_cluster.dbarr.dbs[dbnum];
-		PGconn	   *conn = connectToServer(&new_cluster, new_db->db_name);
-
-		/* suppress NOTICE of dropped objects */
-		PQclear(executeQueryOrDie(conn,
-								  "SET client_min_messages = warning;"));
-		PQclear(executeQueryOrDie(conn,
-								  "DROP SCHEMA binary_upgrade CASCADE;"));
-		PQclear(executeQueryOrDie(conn,
-								  "RESET client_min_messages;"));
-		PQfinish(conn);
-	}
-	check_ok();
-}
-
-
-/*
- * get_loadable_libraries()
- *
- *	Fetch the names of all old libraries containing C-language functions.
- *	We will later check that they all exist in the new installation.
- */
-void
-get_loadable_libraries(void)
-{
-	PGresult  **ress;
-	int			totaltups;
-	int			dbnum;
-	bool		found_public_plpython_handler = false;
-
-	ress = (PGresult **) pg_malloc(old_cluster.dbarr.ndbs * sizeof(PGresult *));
-	totaltups = 0;
-
-	/* Fetch all library names, removing duplicates within each DB */
-	for (dbnum = 0; dbnum < old_cluster.dbarr.ndbs; dbnum++)
-	{
-		DbInfo	   *active_db = &old_cluster.dbarr.dbs[dbnum];
-		PGconn	   *conn = connectToServer(&old_cluster, active_db->db_name);
-
-		/*
-		 * Fetch all libraries referenced in this DB.  We can't exclude the
-		 * "pg_catalog" schema because, while such functions are not
-		 * explicitly dumped by pg_dump, they do reference implicit objects
-		 * that pg_dump does dump, e.g. CREATE LANGUAGE plperl.
-		 */
-		ress[dbnum] = executeQueryOrDie(conn,
-										"SELECT DISTINCT probin "
-										"FROM	pg_catalog.pg_proc "
-										"WHERE	prolang = 13 /* C */ AND "
-										"probin IS NOT NULL AND "
-										"oid >= %u;",
-										FirstNormalObjectId);
-		totaltups += PQntuples(ress[dbnum]);
-
-		/*
-		 * Systems that install plpython before 8.1 have
-		 * plpython_call_handler() defined in the "public" schema, causing
-		 * pg_dump to dump it.  However that function still references
-		 * "plpython" (no "2"), so it throws an error on restore.  This code
-		 * checks for the problem function, reports affected databases to the
-		 * user and explains how to remove them. 8.1 git commit:
-		 * e0dedd0559f005d60c69c9772163e69c204bac69
-		 * http://archives.postgresql.org/pgsql-hackers/2012-03/msg01101.php
-		 * http://archives.postgresql.org/pgsql-bugs/2012-05/msg00206.php
-		 */
-		if (GET_MAJOR_VERSION(old_cluster.major_version) < 901)
-		{
-			PGresult   *res;
-
-			res = executeQueryOrDie(conn,
-									"SELECT 1 "
-						   "FROM	pg_catalog.pg_proc JOIN pg_namespace "
-							 "		ON pronamespace = pg_namespace.oid "
-							   "WHERE proname = 'plpython_call_handler' AND "
-									"nspname = 'public' AND "
-									"prolang = 13 /* C */ AND "
-									"probin = '$libdir/plpython' AND "
-									"pg_proc.oid >= %u;",
-									FirstNormalObjectId);
-			if (PQntuples(res) > 0)
-			{
-				if (!found_public_plpython_handler)
-				{
-					pg_log(PG_WARNING,
-						   "\nThe old cluster has a \"plpython_call_handler\" function defined\n"
-						   "in the \"public\" schema which is a duplicate of the one defined\n"
-						   "in the \"pg_catalog\" schema.  You can confirm this by executing\n"
-						   "in psql:\n"
-						   "\n"
-						   "    \\df *.plpython_call_handler\n"
-						   "\n"
-						   "The \"public\" schema version of this function was created by a\n"
-						   "pre-8.1 install of plpython, and must be removed for pg_upgrade\n"
-						   "to complete because it references a now-obsolete \"plpython\"\n"
-						   "shared object file.  You can remove the \"public\" schema version\n"
-					   "of this function by running the following command:\n"
-						   "\n"
-						 "    DROP FUNCTION public.plpython_call_handler()\n"
-						   "\n"
-						   "in each affected database:\n"
-						   "\n");
-				}
-				pg_log(PG_WARNING, "    %s\n", active_db->db_name);
-				found_public_plpython_handler = true;
-			}
-			PQclear(res);
-		}
-
-		PQfinish(conn);
-	}
-
-	if (found_public_plpython_handler)
-		pg_fatal("Remove the problem functions from the old cluster to continue.\n");
-
-	totaltups++;				/* reserve for pg_upgrade_support */
-
-	/* Allocate what's certainly enough space */
-	os_info.libraries = (char **) pg_malloc(totaltups * sizeof(char *));
-
-	/*
-	 * Now remove duplicates across DBs.  This is pretty inefficient code, but
-	 * there probably aren't enough entries to matter.
-	 */
-	totaltups = 0;
-	os_info.libraries[totaltups++] = pg_strdup(PG_UPGRADE_SUPPORT);
-
-	for (dbnum = 0; dbnum < old_cluster.dbarr.ndbs; dbnum++)
-	{
-		PGresult   *res = ress[dbnum];
-		int			ntups;
-		int			rowno;
-
-		ntups = PQntuples(res);
-		for (rowno = 0; rowno < ntups; rowno++)
-		{
-			char	   *lib = PQgetvalue(res, rowno, 0);
-			bool		dup = false;
-			int			n;
-
-			for (n = 0; n < totaltups; n++)
-			{
-				if (strcmp(lib, os_info.libraries[n]) == 0)
-				{
-					dup = true;
-					break;
-				}
-			}
-			if (!dup)
-				os_info.libraries[totaltups++] = pg_strdup(lib);
-		}
-
-		PQclear(res);
-	}
-
-	os_info.num_libraries = totaltups;
-
-	pg_free(ress);
-}
-
-
-/*
- * check_loadable_libraries()
- *
- *	Check that the new cluster contains all required libraries.
- *	We do this by actually trying to LOAD each one, thereby testing
- *	compatibility as well as presence.
- */
-void
-check_loadable_libraries(void)
-{
-	PGconn	   *conn = connectToServer(&new_cluster, "template1");
-	int			libnum;
-	FILE	   *script = NULL;
-	bool		found = false;
-	char		output_path[MAXPGPATH];
-
-	prep_status("Checking for presence of required libraries");
-
-	snprintf(output_path, sizeof(output_path), "loadable_libraries.txt");
-
-	for (libnum = 0; libnum < os_info.num_libraries; libnum++)
-	{
-		char	   *lib = os_info.libraries[libnum];
-		int			llen = strlen(lib);
-		char		cmd[7 + 2 * MAXPGPATH + 1];
-		PGresult   *res;
-
-		/*
-		 * In Postgres 9.0, Python 3 support was added, and to do that, a
-		 * plpython2u language was created with library name plpython2.so as a
-		 * symbolic link to plpython.so.  In Postgres 9.1, only the
-		 * plpython2.so library was created, and both plpythonu and plpython2u
-		 * point to it.  For this reason, any reference to library name
-		 * "plpython" in an old PG <= 9.1 cluster must look for "plpython2" in
-		 * the new cluster.
-		 *
-		 * For this case, we could check pg_pltemplate, but that only works
-		 * for languages, and does not help with function shared objects, so
-		 * we just do a general fix.
-		 */
-		if (GET_MAJOR_VERSION(old_cluster.major_version) < 901 &&
-			strcmp(lib, "$libdir/plpython") == 0)
-		{
-			lib = "$libdir/plpython2";
-			llen = strlen(lib);
-		}
-
-		strcpy(cmd, "LOAD '");
-		PQescapeStringConn(conn, cmd + strlen(cmd), lib, llen, NULL);
-		strcat(cmd, "'");
-
-		res = PQexec(conn, cmd);
-
-		if (PQresultStatus(res) != PGRES_COMMAND_OK)
-		{
-			found = true;
-
-			/* exit and report missing support library with special message */
-			if (strcmp(lib, PG_UPGRADE_SUPPORT) == 0)
-				pg_fatal("The pg_upgrade_support module must be created and installed in the new cluster.\n");
-
-			if (script == NULL && (script = fopen_priv(output_path, "w")) == NULL)
-				pg_fatal("Could not open file \"%s\": %s\n",
-						 output_path, getErrorText(errno));
-			fprintf(script, "Could not load library \"%s\"\n%s\n",
-					lib,
-					PQerrorMessage(conn));
-		}
-
-		PQclear(res);
-	}
-
-	PQfinish(conn);
-
-	if (found)
-	{
-		fclose(script);
-		pg_log(PG_REPORT, "fatal\n");
-		pg_fatal("Your installation references loadable libraries that are missing from the\n"
-				 "new installation.  You can add these libraries to the new installation,\n"
-				 "or remove the functions using them from the old installation.  A list of\n"
-				 "problem libraries is in the file:\n"
-				 "    %s\n\n", output_path);
-	}
-	else
-		check_ok();
-}
diff --git a/contrib/pg_upgrade/info.c b/contrib/pg_upgrade/info.c
deleted file mode 100644
index 1254934..0000000
--- a/contrib/pg_upgrade/info.c
+++ /dev/null
@@ -1,535 +0,0 @@
-/*
- *	info.c
- *
- *	information support functions
- *
- *	Copyright (c) 2010-2015, PostgreSQL Global Development Group
- *	contrib/pg_upgrade/info.c
- */
-
-#include "postgres_fe.h"
-
-#include "pg_upgrade.h"
-
-#include "access/transam.h"
-
-
-static void create_rel_filename_map(const char *old_data, const char *new_data,
-						const DbInfo *old_db, const DbInfo *new_db,
-						const RelInfo *old_rel, const RelInfo *new_rel,
-						FileNameMap *map);
-static void free_db_and_rel_infos(DbInfoArr *db_arr);
-static void get_db_infos(ClusterInfo *cluster);
-static void get_rel_infos(ClusterInfo *cluster, DbInfo *dbinfo);
-static void free_rel_infos(RelInfoArr *rel_arr);
-static void print_db_infos(DbInfoArr *dbinfo);
-static void print_rel_infos(RelInfoArr *rel_arr);
-
-
-/*
- * gen_db_file_maps()
- *
- * generates database mappings for "old_db" and "new_db". Returns a malloc'ed
- * array of mappings. nmaps is a return parameter which refers to the number
- * of mappings.
- */
-FileNameMap *
-gen_db_file_maps(DbInfo *old_db, DbInfo *new_db,
-				 int *nmaps, const char *old_pgdata, const char *new_pgdata)
-{
-	FileNameMap *maps;
-	int			old_relnum, new_relnum;
-	int			num_maps = 0;
-
-	maps = (FileNameMap *) pg_malloc(sizeof(FileNameMap) *
-									 old_db->rel_arr.nrels);
-
-	/*
-	 * The old database shouldn't have more relations than the new one.
-	 * We force the new cluster to have a TOAST table if the old table
-	 * had one.
-	 */
-	if (old_db->rel_arr.nrels > new_db->rel_arr.nrels)
-		pg_fatal("old and new databases \"%s\" have a mismatched number of relations\n",
-				 old_db->db_name);
-
-	/* Drive the loop using new_relnum, which might be higher. */
-	for (old_relnum = new_relnum = 0; new_relnum < new_db->rel_arr.nrels;
-		 new_relnum++)
-	{
-		RelInfo    *old_rel;
-		RelInfo    *new_rel = &new_db->rel_arr.rels[new_relnum];
-
-		/*
-		 * It is possible that the new cluster has a TOAST table for a table
-		 * that didn't need one in the old cluster, e.g. 9.0 to 9.1 changed the
-		 * NUMERIC length computation.  Therefore, if we have a TOAST table
-		 * in the new cluster that doesn't match, skip over it and continue
-		 * processing.  It is possible this TOAST table used an OID that was
-		 * reserved in the old cluster, but we have no way of testing that,
-		 * and we would have already gotten an error at the new cluster schema
-		 * creation stage.  Fortunately, since we only restore the OID counter
-		 * after schema restore, and restore in OID order via pg_dump, a
-		 * conflict would only happen if the new TOAST table had a very low
-		 * OID.  However, TOAST tables created long after initial table
-		 * creation can have any OID, particularly after OID wraparound.
-		 */
-		if (old_relnum == old_db->rel_arr.nrels)
-		{
-			if (strcmp(new_rel->nspname, "pg_toast") == 0)
-				continue;
-			else
-				pg_fatal("Extra non-TOAST relation found in database \"%s\": new OID %d\n",
-						 old_db->db_name, new_rel->reloid);
-		}
-
-		old_rel = &old_db->rel_arr.rels[old_relnum];
-
-		if (old_rel->reloid != new_rel->reloid)
-		{
-			if (strcmp(new_rel->nspname, "pg_toast") == 0)
-				continue;
-			else
-				pg_fatal("Mismatch of relation OID in database \"%s\": old OID %d, new OID %d\n",
-						 old_db->db_name, old_rel->reloid, new_rel->reloid);
-		}
-
-		/*
-		 * TOAST table names initially match the heap pg_class oid. In
-		 * pre-8.4, TOAST table names change during CLUSTER; in pre-9.0, TOAST
-		 * table names change during ALTER TABLE ALTER COLUMN SET TYPE. In >=
-		 * 9.0, TOAST relation names always use heap table oids, hence we
-		 * cannot check relation names when upgrading from pre-9.0. Clusters
-		 * upgraded to 9.0 will get matching TOAST names. If index names don't
-		 * match primary key constraint names, this will fail because pg_dump
-		 * dumps constraint names and pg_upgrade checks index names.
-		 */
-		if (strcmp(old_rel->nspname, new_rel->nspname) != 0 ||
-			((GET_MAJOR_VERSION(old_cluster.major_version) >= 900 ||
-			  strcmp(old_rel->nspname, "pg_toast") != 0) &&
-			 strcmp(old_rel->relname, new_rel->relname) != 0))
-			pg_fatal("Mismatch of relation names in database \"%s\": "
-					 "old name \"%s.%s\", new name \"%s.%s\"\n",
-					 old_db->db_name, old_rel->nspname, old_rel->relname,
-					 new_rel->nspname, new_rel->relname);
-
-		create_rel_filename_map(old_pgdata, new_pgdata, old_db, new_db,
-								old_rel, new_rel, maps + num_maps);
-		num_maps++;
-		old_relnum++;
-	}
-
-	/* Did we fail to exhaust the old array? */
-	if (old_relnum != old_db->rel_arr.nrels)
-		pg_fatal("old and new databases \"%s\" have a mismatched number of relations\n",
-				 old_db->db_name);
-
-	*nmaps = num_maps;
-	return maps;
-}
-
-
-/*
- * create_rel_filename_map()
- *
- * fills a file node map structure and returns it in "map".
- */
-static void
-create_rel_filename_map(const char *old_data, const char *new_data,
-						const DbInfo *old_db, const DbInfo *new_db,
-						const RelInfo *old_rel, const RelInfo *new_rel,
-						FileNameMap *map)
-{
-	if (strlen(old_rel->tablespace) == 0)
-	{
-		/*
-		 * relation belongs to the default tablespace, hence relfiles should
-		 * exist in the data directories.
-		 */
-		map->old_tablespace = old_data;
-		map->new_tablespace = new_data;
-		map->old_tablespace_suffix = "/base";
-		map->new_tablespace_suffix = "/base";
-	}
-	else
-	{
-		/* relation belongs to a tablespace, so use the tablespace location */
-		map->old_tablespace = old_rel->tablespace;
-		map->new_tablespace = new_rel->tablespace;
-		map->old_tablespace_suffix = old_cluster.tablespace_suffix;
-		map->new_tablespace_suffix = new_cluster.tablespace_suffix;
-	}
-
-	map->old_db_oid = old_db->db_oid;
-	map->new_db_oid = new_db->db_oid;
-
-	/*
-	 * old_relfilenode might differ from pg_class.oid (and hence
-	 * new_relfilenode) because of CLUSTER, REINDEX, or VACUUM FULL.
-	 */
-	map->old_relfilenode = old_rel->relfilenode;
-
-	/* new_relfilenode will match old and new pg_class.oid */
-	map->new_relfilenode = new_rel->relfilenode;
-
-	/* used only for logging and error reporing, old/new are identical */
-	map->nspname = old_rel->nspname;
-	map->relname = old_rel->relname;
-}
-
-
-void
-print_maps(FileNameMap *maps, int n_maps, const char *db_name)
-{
-	if (log_opts.verbose)
-	{
-		int			mapnum;
-
-		pg_log(PG_VERBOSE, "mappings for database \"%s\":\n", db_name);
-
-		for (mapnum = 0; mapnum < n_maps; mapnum++)
-			pg_log(PG_VERBOSE, "%s.%s: %u to %u\n",
-				   maps[mapnum].nspname, maps[mapnum].relname,
-				   maps[mapnum].old_relfilenode,
-				   maps[mapnum].new_relfilenode);
-
-		pg_log(PG_VERBOSE, "\n\n");
-	}
-}
-
-
-/*
- * get_db_and_rel_infos()
- *
- * higher level routine to generate dbinfos for the database running
- * on the given "port". Assumes that server is already running.
- */
-void
-get_db_and_rel_infos(ClusterInfo *cluster)
-{
-	int			dbnum;
-
-	if (cluster->dbarr.dbs != NULL)
-		free_db_and_rel_infos(&cluster->dbarr);
-
-	get_db_infos(cluster);
-
-	for (dbnum = 0; dbnum < cluster->dbarr.ndbs; dbnum++)
-		get_rel_infos(cluster, &cluster->dbarr.dbs[dbnum]);
-
-	pg_log(PG_VERBOSE, "\n%s databases:\n", CLUSTER_NAME(cluster));
-	if (log_opts.verbose)
-		print_db_infos(&cluster->dbarr);
-}
-
-
-/*
- * get_db_infos()
- *
- * Scans pg_database system catalog and populates all user
- * databases.
- */
-static void
-get_db_infos(ClusterInfo *cluster)
-{
-	PGconn	   *conn = connectToServer(cluster, "template1");
-	PGresult   *res;
-	int			ntups;
-	int			tupnum;
-	DbInfo	   *dbinfos;
-	int			i_datname,
-				i_oid,
-				i_encoding,
-				i_datcollate,
-				i_datctype,
-				i_spclocation;
-	char		query[QUERY_ALLOC];
-
-	snprintf(query, sizeof(query),
-			 "SELECT d.oid, d.datname, d.encoding, d.datcollate, d.datctype, "
-			 "%s AS spclocation "
-			 "FROM pg_catalog.pg_database d "
-			 " LEFT OUTER JOIN pg_catalog.pg_tablespace t "
-			 " ON d.dattablespace = t.oid "
-			 "WHERE d.datallowconn = true "
-	/* we don't preserve pg_database.oid so we sort by name */
-			 "ORDER BY 2",
-	/* 9.2 removed the spclocation column */
-			 (GET_MAJOR_VERSION(cluster->major_version) <= 901) ?
-			 "t.spclocation" : "pg_catalog.pg_tablespace_location(t.oid)");
-
-	res = executeQueryOrDie(conn, "%s", query);
-
-	i_oid = PQfnumber(res, "oid");
-	i_datname = PQfnumber(res, "datname");
-	i_encoding = PQfnumber(res, "encoding");
-	i_datcollate = PQfnumber(res, "datcollate");
-	i_datctype = PQfnumber(res, "datctype");
-	i_spclocation = PQfnumber(res, "spclocation");
-
-	ntups = PQntuples(res);
-	dbinfos = (DbInfo *) pg_malloc(sizeof(DbInfo) * ntups);
-
-	for (tupnum = 0; tupnum < ntups; tupnum++)
-	{
-		dbinfos[tupnum].db_oid = atooid(PQgetvalue(res, tupnum, i_oid));
-		dbinfos[tupnum].db_name = pg_strdup(PQgetvalue(res, tupnum, i_datname));
-		dbinfos[tupnum].db_encoding = atoi(PQgetvalue(res, tupnum, i_encoding));
-		dbinfos[tupnum].db_collate = pg_strdup(PQgetvalue(res, tupnum, i_datcollate));
-		dbinfos[tupnum].db_ctype = pg_strdup(PQgetvalue(res, tupnum, i_datctype));
-		snprintf(dbinfos[tupnum].db_tablespace, sizeof(dbinfos[tupnum].db_tablespace), "%s",
-				 PQgetvalue(res, tupnum, i_spclocation));
-	}
-	PQclear(res);
-
-	PQfinish(conn);
-
-	cluster->dbarr.dbs = dbinfos;
-	cluster->dbarr.ndbs = ntups;
-}
-
-
-/*
- * get_rel_infos()
- *
- * gets the relinfos for all the user tables of the database referred
- * by "db".
- *
- * NOTE: we assume that relations/entities with oids greater than
- * FirstNormalObjectId belongs to the user
- */
-static void
-get_rel_infos(ClusterInfo *cluster, DbInfo *dbinfo)
-{
-	PGconn	   *conn = connectToServer(cluster,
-									   dbinfo->db_name);
-	PGresult   *res;
-	RelInfo    *relinfos;
-	int			ntups;
-	int			relnum;
-	int			num_rels = 0;
-	char	   *nspname = NULL;
-	char	   *relname = NULL;
-	char	   *tablespace = NULL;
-	int			i_spclocation,
-				i_nspname,
-				i_relname,
-				i_oid,
-				i_relfilenode,
-				i_reltablespace;
-	char		query[QUERY_ALLOC];
-	char	   *last_namespace = NULL,
-			   *last_tablespace = NULL;
-
-	/*
-	 * pg_largeobject contains user data that does not appear in pg_dump
-	 * --schema-only output, so we have to copy that system table heap and
-	 * index.  We could grab the pg_largeobject oids from template1, but it is
-	 * easy to treat it as a normal table. Order by oid so we can join old/new
-	 * structures efficiently.
-	 */
-
-	snprintf(query, sizeof(query),
-		/* get regular heap */
-			"WITH regular_heap (reloid) AS ( "
-			"	SELECT c.oid "
-			"	FROM pg_catalog.pg_class c JOIN pg_catalog.pg_namespace n "
-			"		   ON c.relnamespace = n.oid "
-			"	LEFT OUTER JOIN pg_catalog.pg_index i "
-			"		   ON c.oid = i.indexrelid "
-			"	WHERE relkind IN ('r', 'm', 'i', 'S') AND "
-		/*
-		 * pg_dump only dumps valid indexes;  testing indisready is necessary in
-		 * 9.2, and harmless in earlier/later versions.
-		 */
-			"		i.indisvalid IS DISTINCT FROM false AND "
-			"		i.indisready IS DISTINCT FROM false AND "
-		/* exclude possible orphaned temp tables */
-			"	  ((n.nspname !~ '^pg_temp_' AND "
-			"	    n.nspname !~ '^pg_toast_temp_' AND "
-		/* skip pg_toast because toast index have relkind == 'i', not 't' */
-			"	    n.nspname NOT IN ('pg_catalog', 'information_schema', "
-			"							'binary_upgrade', 'pg_toast') AND "
-			"		  c.oid >= %u) OR "
-			"	  (n.nspname = 'pg_catalog' AND "
-	"    relname IN ('pg_largeobject', 'pg_largeobject_loid_pn_index'%s) ))), "
-		/*
-		 * We have to gather the TOAST tables in later steps because we
-		 * can't schema-qualify TOAST tables.
-		 */
-		 /* get TOAST heap */
-			"	toast_heap (reloid) AS ( "
-			"	SELECT reltoastrelid "
-			"	FROM regular_heap JOIN pg_catalog.pg_class c "
-			"		ON regular_heap.reloid = c.oid "
-			"		AND c.reltoastrelid != %u), "
-		 /* get indexes on regular and TOAST heap */
-			"	all_index (reloid) AS ( "
-			"	SELECT indexrelid "
-			"	FROM pg_index "
-			"	WHERE indisvalid "
-			"    AND indrelid IN (SELECT reltoastrelid "
-			"        FROM (SELECT reloid FROM regular_heap "
-			"			   UNION ALL "
-			"			   SELECT reloid FROM toast_heap) all_heap "
-			"            JOIN pg_catalog.pg_class c "
-			"            ON all_heap.reloid = c.oid "
-			"            AND c.reltoastrelid != %u)) "
-		/* get all rels */
-			"SELECT c.oid, n.nspname, c.relname, "
-			"	c.relfilenode, c.reltablespace, %s "
-			"FROM (SELECT reloid FROM regular_heap "
-			"	   UNION ALL "
-			"	   SELECT reloid FROM toast_heap  "
-			"	   UNION ALL "
-			"	   SELECT reloid FROM all_index) all_rels "
-			"  JOIN pg_catalog.pg_class c "
-			"		ON all_rels.reloid = c.oid "
-			"  JOIN pg_catalog.pg_namespace n "
-			"	   ON c.relnamespace = n.oid "
-			"  LEFT OUTER JOIN pg_catalog.pg_tablespace t "
-			"	   ON c.reltablespace = t.oid "
-	/* we preserve pg_class.oid so we sort by it to match old/new */
-			"ORDER BY 1;",
-			FirstNormalObjectId,
-	/* does pg_largeobject_metadata need to be migrated? */
-			(GET_MAJOR_VERSION(old_cluster.major_version) <= 804) ?
-	"" : ", 'pg_largeobject_metadata', 'pg_largeobject_metadata_oid_index'",
-	InvalidOid, InvalidOid,
-	/* 9.2 removed the spclocation column */
-			(GET_MAJOR_VERSION(cluster->major_version) <= 901) ?
-			"t.spclocation" : "pg_catalog.pg_tablespace_location(t.oid) AS spclocation");
-
-	res = executeQueryOrDie(conn, "%s", query);
-
-	ntups = PQntuples(res);
-
-	relinfos = (RelInfo *) pg_malloc(sizeof(RelInfo) * ntups);
-
-	i_oid = PQfnumber(res, "oid");
-	i_nspname = PQfnumber(res, "nspname");
-	i_relname = PQfnumber(res, "relname");
-	i_relfilenode = PQfnumber(res, "relfilenode");
-	i_reltablespace = PQfnumber(res, "reltablespace");
-	i_spclocation = PQfnumber(res, "spclocation");
-
-	for (relnum = 0; relnum < ntups; relnum++)
-	{
-		RelInfo    *curr = &relinfos[num_rels++];
-
-		curr->reloid = atooid(PQgetvalue(res, relnum, i_oid));
-
-		nspname = PQgetvalue(res, relnum, i_nspname);
-		curr->nsp_alloc = false;
-
-		/*
-		 * Many of the namespace and tablespace strings are identical, so we
-		 * try to reuse the allocated string pointers where possible to reduce
-		 * memory consumption.
-		 */
-		/* Can we reuse the previous string allocation? */
-		if (last_namespace && strcmp(nspname, last_namespace) == 0)
-			curr->nspname = last_namespace;
-		else
-		{
-			last_namespace = curr->nspname = pg_strdup(nspname);
-			curr->nsp_alloc = true;
-		}
-
-		relname = PQgetvalue(res, relnum, i_relname);
-		curr->relname = pg_strdup(relname);
-
-		curr->relfilenode = atooid(PQgetvalue(res, relnum, i_relfilenode));
-		curr->tblsp_alloc = false;
-
-		/* Is the tablespace oid non-zero? */
-		if (atooid(PQgetvalue(res, relnum, i_reltablespace)) != 0)
-		{
-			/*
-			 * The tablespace location might be "", meaning the cluster
-			 * default location, i.e. pg_default or pg_global.
-			 */
-			tablespace = PQgetvalue(res, relnum, i_spclocation);
-
-			/* Can we reuse the previous string allocation? */
-			if (last_tablespace && strcmp(tablespace, last_tablespace) == 0)
-				curr->tablespace = last_tablespace;
-			else
-			{
-				last_tablespace = curr->tablespace = pg_strdup(tablespace);
-				curr->tblsp_alloc = true;
-			}
-		}
-		else
-			/* A zero reltablespace oid indicates the database tablespace. */
-			curr->tablespace = dbinfo->db_tablespace;
-	}
-	PQclear(res);
-
-	PQfinish(conn);
-
-	dbinfo->rel_arr.rels = relinfos;
-	dbinfo->rel_arr.nrels = num_rels;
-}
-
-
-static void
-free_db_and_rel_infos(DbInfoArr *db_arr)
-{
-	int			dbnum;
-
-	for (dbnum = 0; dbnum < db_arr->ndbs; dbnum++)
-	{
-		free_rel_infos(&db_arr->dbs[dbnum].rel_arr);
-		pg_free(db_arr->dbs[dbnum].db_name);
-	}
-	pg_free(db_arr->dbs);
-	db_arr->dbs = NULL;
-	db_arr->ndbs = 0;
-}
-
-
-static void
-free_rel_infos(RelInfoArr *rel_arr)
-{
-	int			relnum;
-
-	for (relnum = 0; relnum < rel_arr->nrels; relnum++)
-	{
-		if (rel_arr->rels[relnum].nsp_alloc)
-			pg_free(rel_arr->rels[relnum].nspname);
-		pg_free(rel_arr->rels[relnum].relname);
-		if (rel_arr->rels[relnum].tblsp_alloc)
-			pg_free(rel_arr->rels[relnum].tablespace);
-	}
-	pg_free(rel_arr->rels);
-	rel_arr->nrels = 0;
-}
-
-
-static void
-print_db_infos(DbInfoArr *db_arr)
-{
-	int			dbnum;
-
-	for (dbnum = 0; dbnum < db_arr->ndbs; dbnum++)
-	{
-		pg_log(PG_VERBOSE, "Database: %s\n", db_arr->dbs[dbnum].db_name);
-		print_rel_infos(&db_arr->dbs[dbnum].rel_arr);
-		pg_log(PG_VERBOSE, "\n\n");
-	}
-}
-
-
-static void
-print_rel_infos(RelInfoArr *rel_arr)
-{
-	int			relnum;
-
-	for (relnum = 0; relnum < rel_arr->nrels; relnum++)
-		pg_log(PG_VERBOSE, "relname: %s.%s: reloid: %u reltblspace: %s\n",
-			   rel_arr->rels[relnum].nspname,
-			   rel_arr->rels[relnum].relname,
-			   rel_arr->rels[relnum].reloid,
-			   rel_arr->rels[relnum].tablespace);
-}
diff --git a/contrib/pg_upgrade/option.c b/contrib/pg_upgrade/option.c
deleted file mode 100644
index 742d133..0000000
--- a/contrib/pg_upgrade/option.c
+++ /dev/null
@@ -1,518 +0,0 @@
-/*
- *	opt.c
- *
- *	options functions
- *
- *	Copyright (c) 2010-2015, PostgreSQL Global Development Group
- *	contrib/pg_upgrade/option.c
- */
-
-#include "postgres_fe.h"
-
-#include "miscadmin.h"
-#include "getopt_long.h"
-
-#include "pg_upgrade.h"
-
-#include <time.h>
-#include <sys/types.h>
-#ifdef WIN32
-#include <io.h>
-#endif
-
-
-static void usage(void);
-static void check_required_directory(char **dirpath, char **configpath,
-				   char *envVarName, char *cmdLineOption, char *description);
-#define FIX_DEFAULT_READ_ONLY "-c default_transaction_read_only=false"
-
-
-UserOpts	user_opts;
-
-
-/*
- * parseCommandLine()
- *
- *	Parses the command line (argc, argv[]) and loads structures
- */
-void
-parseCommandLine(int argc, char *argv[])
-{
-	static struct option long_options[] = {
-		{"old-datadir", required_argument, NULL, 'd'},
-		{"new-datadir", required_argument, NULL, 'D'},
-		{"old-bindir", required_argument, NULL, 'b'},
-		{"new-bindir", required_argument, NULL, 'B'},
-		{"old-options", required_argument, NULL, 'o'},
-		{"new-options", required_argument, NULL, 'O'},
-		{"old-port", required_argument, NULL, 'p'},
-		{"new-port", required_argument, NULL, 'P'},
-
-		{"username", required_argument, NULL, 'U'},
-		{"check", no_argument, NULL, 'c'},
-		{"link", no_argument, NULL, 'k'},
-		{"retain", no_argument, NULL, 'r'},
-		{"jobs", required_argument, NULL, 'j'},
-		{"verbose", no_argument, NULL, 'v'},
-		{NULL, 0, NULL, 0}
-	};
-	int			option;			/* Command line option */
-	int			optindex = 0;	/* used by getopt_long */
-	int			os_user_effective_id;
-	FILE	   *fp;
-	char	  **filename;
-	time_t		run_time = time(NULL);
-
-	user_opts.transfer_mode = TRANSFER_MODE_COPY;
-
-	os_info.progname = get_progname(argv[0]);
-
-	/* Process libpq env. variables; load values here for usage() output */
-	old_cluster.port = getenv("PGPORTOLD") ? atoi(getenv("PGPORTOLD")) : DEF_PGUPORT;
-	new_cluster.port = getenv("PGPORTNEW") ? atoi(getenv("PGPORTNEW")) : DEF_PGUPORT;
-
-	os_user_effective_id = get_user_info(&os_info.user);
-	/* we override just the database user name;  we got the OS id above */
-	if (getenv("PGUSER"))
-	{
-		pg_free(os_info.user);
-		/* must save value, getenv()'s pointer is not stable */
-		os_info.user = pg_strdup(getenv("PGUSER"));
-	}
-
-	if (argc > 1)
-	{
-		if (strcmp(argv[1], "--help") == 0 || strcmp(argv[1], "-?") == 0)
-		{
-			usage();
-			exit(0);
-		}
-		if (strcmp(argv[1], "--version") == 0 || strcmp(argv[1], "-V") == 0)
-		{
-			puts("pg_upgrade (PostgreSQL) " PG_VERSION);
-			exit(0);
-		}
-	}
-
-	/* Allow help and version to be run as root, so do the test here. */
-	if (os_user_effective_id == 0)
-		pg_fatal("%s: cannot be run as root\n", os_info.progname);
-
-	if ((log_opts.internal = fopen_priv(INTERNAL_LOG_FILE, "a")) == NULL)
-		pg_fatal("cannot write to log file %s\n", INTERNAL_LOG_FILE);
-
-	while ((option = getopt_long(argc, argv, "d:D:b:B:cj:ko:O:p:P:rU:v",
-								 long_options, &optindex)) != -1)
-	{
-		switch (option)
-		{
-			case 'b':
-				old_cluster.bindir = pg_strdup(optarg);
-				break;
-
-			case 'B':
-				new_cluster.bindir = pg_strdup(optarg);
-				break;
-
-			case 'c':
-				user_opts.check = true;
-				break;
-
-			case 'd':
-				old_cluster.pgdata = pg_strdup(optarg);
-				old_cluster.pgconfig = pg_strdup(optarg);
-				break;
-
-			case 'D':
-				new_cluster.pgdata = pg_strdup(optarg);
-				new_cluster.pgconfig = pg_strdup(optarg);
-				break;
-
-			case 'j':
-				user_opts.jobs = atoi(optarg);
-				break;
-
-			case 'k':
-				user_opts.transfer_mode = TRANSFER_MODE_LINK;
-				break;
-
-			case 'o':
-				/* append option? */
-				if (!old_cluster.pgopts)
-					old_cluster.pgopts = pg_strdup(optarg);
-				else
-				{
-					char *old_pgopts = old_cluster.pgopts;
-
-					old_cluster.pgopts = psprintf("%s %s", old_pgopts, optarg);
-					free(old_pgopts);
-				}
-				break;
-
-			case 'O':
-				/* append option? */
-				if (!new_cluster.pgopts)
-					new_cluster.pgopts = pg_strdup(optarg);
-				else
-				{
-					char *new_pgopts = new_cluster.pgopts;
-
-					new_cluster.pgopts = psprintf("%s %s", new_pgopts, optarg);
-					free(new_pgopts);
-				}
-				break;
-
-				/*
-				 * Someday, the port number option could be removed and passed
-				 * using -o/-O, but that requires postmaster -C to be
-				 * supported on all old/new versions (added in PG 9.2).
-				 */
-			case 'p':
-				if ((old_cluster.port = atoi(optarg)) <= 0)
-				{
-					pg_fatal("invalid old port number\n");
-					exit(1);
-				}
-				break;
-
-			case 'P':
-				if ((new_cluster.port = atoi(optarg)) <= 0)
-				{
-					pg_fatal("invalid new port number\n");
-					exit(1);
-				}
-				break;
-
-			case 'r':
-				log_opts.retain = true;
-				break;
-
-			case 'U':
-				pg_free(os_info.user);
-				os_info.user = pg_strdup(optarg);
-				os_info.user_specified = true;
-
-				/*
-				 * Push the user name into the environment so pre-9.1
-				 * pg_ctl/libpq uses it.
-				 */
-				pg_putenv("PGUSER", os_info.user);
-				break;
-
-			case 'v':
-				pg_log(PG_REPORT, "Running in verbose mode\n");
-				log_opts.verbose = true;
-				break;
-
-			default:
-				pg_fatal("Try \"%s --help\" for more information.\n",
-						 os_info.progname);
-				break;
-		}
-	}
-
-	/* label start of upgrade in logfiles */
-	for (filename = output_files; *filename != NULL; filename++)
-	{
-		if ((fp = fopen_priv(*filename, "a")) == NULL)
-			pg_fatal("cannot write to log file %s\n", *filename);
-
-		/* Start with newline because we might be appending to a file. */
-		fprintf(fp, "\n"
-		"-----------------------------------------------------------------\n"
-				"  pg_upgrade run on %s"
-				"-----------------------------------------------------------------\n\n",
-				ctime(&run_time));
-		fclose(fp);
-	}
-
-	/* Turn off read-only mode;  add prefix to PGOPTIONS? */
-	if (getenv("PGOPTIONS"))
-	{
-		char	   *pgoptions = psprintf("%s %s", FIX_DEFAULT_READ_ONLY,
-										 getenv("PGOPTIONS"));
-
-		pg_putenv("PGOPTIONS", pgoptions);
-		pfree(pgoptions);
-	}
-	else
-		pg_putenv("PGOPTIONS", FIX_DEFAULT_READ_ONLY);
-
-	/* Get values from env if not already set */
-	check_required_directory(&old_cluster.bindir, NULL, "PGBINOLD", "-b",
-							 "old cluster binaries reside");
-	check_required_directory(&new_cluster.bindir, NULL, "PGBINNEW", "-B",
-							 "new cluster binaries reside");
-	check_required_directory(&old_cluster.pgdata, &old_cluster.pgconfig,
-							 "PGDATAOLD", "-d", "old cluster data resides");
-	check_required_directory(&new_cluster.pgdata, &new_cluster.pgconfig,
-							 "PGDATANEW", "-D", "new cluster data resides");
-
-#ifdef WIN32
-	/*
-	 * On Windows, initdb --sync-only will fail with a "Permission denied"
-	 * error on file pg_upgrade_utility.log if pg_upgrade is run inside
-	 * the new cluster directory, so we do a check here.
-	 */
-	{
-		char	cwd[MAXPGPATH], new_cluster_pgdata[MAXPGPATH];
-
-		strlcpy(new_cluster_pgdata, new_cluster.pgdata, MAXPGPATH);
-		canonicalize_path(new_cluster_pgdata);
-
-		if (!getcwd(cwd, MAXPGPATH))
-			pg_fatal("cannot find current directory\n");
-		canonicalize_path(cwd);
-		if (path_is_prefix_of_path(new_cluster_pgdata, cwd))
-			pg_fatal("cannot run pg_upgrade from inside the new cluster data directory on Windows\n");
-	}
-#endif
-}
-
-
-static void
-usage(void)
-{
-	printf(_("pg_upgrade upgrades a PostgreSQL cluster to a different major version.\n\
-\nUsage:\n\
-  pg_upgrade [OPTION]...\n\
-\n\
-Options:\n\
-  -b, --old-bindir=BINDIR       old cluster executable directory\n\
-  -B, --new-bindir=BINDIR       new cluster executable directory\n\
-  -c, --check                   check clusters only, don't change any data\n\
-  -d, --old-datadir=DATADIR     old cluster data directory\n\
-  -D, --new-datadir=DATADIR     new cluster data directory\n\
-  -j, --jobs                    number of simultaneous processes or threads to use\n\
-  -k, --link                    link instead of copying files to new cluster\n\
-  -o, --old-options=OPTIONS     old cluster options to pass to the server\n\
-  -O, --new-options=OPTIONS     new cluster options to pass to the server\n\
-  -p, --old-port=PORT           old cluster port number (default %d)\n\
-  -P, --new-port=PORT           new cluster port number (default %d)\n\
-  -r, --retain                  retain SQL and log files after success\n\
-  -U, --username=NAME           cluster superuser (default \"%s\")\n\
-  -v, --verbose                 enable verbose internal logging\n\
-  -V, --version                 display version information, then exit\n\
-  -?, --help                    show this help, then exit\n\
-\n\
-Before running pg_upgrade you must:\n\
-  create a new database cluster (using the new version of initdb)\n\
-  shutdown the postmaster servicing the old cluster\n\
-  shutdown the postmaster servicing the new cluster\n\
-\n\
-When you run pg_upgrade, you must provide the following information:\n\
-  the data directory for the old cluster  (-d DATADIR)\n\
-  the data directory for the new cluster  (-D DATADIR)\n\
-  the \"bin\" directory for the old version (-b BINDIR)\n\
-  the \"bin\" directory for the new version (-B BINDIR)\n\
-\n\
-For example:\n\
-  pg_upgrade -d oldCluster/data -D newCluster/data -b oldCluster/bin -B newCluster/bin\n\
-or\n"), old_cluster.port, new_cluster.port, os_info.user);
-#ifndef WIN32
-	printf(_("\
-  $ export PGDATAOLD=oldCluster/data\n\
-  $ export PGDATANEW=newCluster/data\n\
-  $ export PGBINOLD=oldCluster/bin\n\
-  $ export PGBINNEW=newCluster/bin\n\
-  $ pg_upgrade\n"));
-#else
-	printf(_("\
-  C:\\> set PGDATAOLD=oldCluster/data\n\
-  C:\\> set PGDATANEW=newCluster/data\n\
-  C:\\> set PGBINOLD=oldCluster/bin\n\
-  C:\\> set PGBINNEW=newCluster/bin\n\
-  C:\\> pg_upgrade\n"));
-#endif
-	printf(_("\nReport bugs to <pgsql-bugs@postgresql.org>.\n"));
-}
-
-
-/*
- * check_required_directory()
- *
- * Checks a directory option.
- *	dirpath		  - the directory name supplied on the command line
- *	configpath	  - optional configuration directory
- *	envVarName	  - the name of an environment variable to get if dirpath is NULL
- *	cmdLineOption - the command line option corresponds to this directory (-o, -O, -n, -N)
- *	description   - a description of this directory option
- *
- * We use the last two arguments to construct a meaningful error message if the
- * user hasn't provided the required directory name.
- */
-static void
-check_required_directory(char **dirpath, char **configpath,
-						 char *envVarName, char *cmdLineOption,
-						 char *description)
-{
-	if (*dirpath == NULL || strlen(*dirpath) == 0)
-	{
-		const char *envVar;
-
-		if ((envVar = getenv(envVarName)) && strlen(envVar))
-		{
-			*dirpath = pg_strdup(envVar);
-			if (configpath)
-				*configpath = pg_strdup(envVar);
-		}
-		else
-			pg_fatal("You must identify the directory where the %s.\n"
-					 "Please use the %s command-line option or the %s environment variable.\n",
-					 description, cmdLineOption, envVarName);
-	}
-
-	/*
-	 * Trim off any trailing path separators because we construct paths by
-	 * appending to this path.
-	 */
-#ifndef WIN32
-	if ((*dirpath)[strlen(*dirpath) - 1] == '/')
-#else
-	if ((*dirpath)[strlen(*dirpath) - 1] == '/' ||
-		(*dirpath)[strlen(*dirpath) - 1] == '\\')
-#endif
-		(*dirpath)[strlen(*dirpath) - 1] = 0;
-}
-
-/*
- * adjust_data_dir
- *
- * If a configuration-only directory was specified, find the real data dir
- * by quering the running server.  This has limited checking because we
- * can't check for a running server because we can't find postmaster.pid.
- */
-void
-adjust_data_dir(ClusterInfo *cluster)
-{
-	char		filename[MAXPGPATH];
-	char		cmd[MAXPGPATH],
-				cmd_output[MAX_STRING];
-	FILE	   *fp,
-			   *output;
-
-	/* If there is no postgresql.conf, it can't be a config-only dir */
-	snprintf(filename, sizeof(filename), "%s/postgresql.conf", cluster->pgconfig);
-	if ((fp = fopen(filename, "r")) == NULL)
-		return;
-	fclose(fp);
-
-	/* If PG_VERSION exists, it can't be a config-only dir */
-	snprintf(filename, sizeof(filename), "%s/PG_VERSION", cluster->pgconfig);
-	if ((fp = fopen(filename, "r")) != NULL)
-	{
-		fclose(fp);
-		return;
-	}
-
-	/* Must be a configuration directory, so find the real data directory. */
-
-	prep_status("Finding the real data directory for the %s cluster",
-				CLUSTER_NAME(cluster));
-
-	/*
-	 * We don't have a data directory yet, so we can't check the PG version,
-	 * so this might fail --- only works for PG 9.2+.   If this fails,
-	 * pg_upgrade will fail anyway because the data files will not be found.
-	 */
-	snprintf(cmd, sizeof(cmd), "\"%s/postgres\" -D \"%s\" -C data_directory",
-			 cluster->bindir, cluster->pgconfig);
-
-	if ((output = popen(cmd, "r")) == NULL ||
-		fgets(cmd_output, sizeof(cmd_output), output) == NULL)
-		pg_fatal("Could not get data directory using %s: %s\n",
-				 cmd, getErrorText(errno));
-
-	pclose(output);
-
-	/* Remove trailing newline */
-	if (strchr(cmd_output, '\n') != NULL)
-		*strchr(cmd_output, '\n') = '\0';
-
-	cluster->pgdata = pg_strdup(cmd_output);
-
-	check_ok();
-}
-
-
-/*
- * get_sock_dir
- *
- * Identify the socket directory to use for this cluster.  If we're doing
- * a live check (old cluster only), we need to find out where the postmaster
- * is listening.  Otherwise, we're going to put the socket into the current
- * directory.
- */
-void
-get_sock_dir(ClusterInfo *cluster, bool live_check)
-{
-#ifdef HAVE_UNIX_SOCKETS
-
-	/*
-	 * sockdir and port were added to postmaster.pid in PG 9.1. Pre-9.1 cannot
-	 * process pg_ctl -w for sockets in non-default locations.
-	 */
-	if (GET_MAJOR_VERSION(cluster->major_version) >= 901)
-	{
-		if (!live_check)
-		{
-			/* Use the current directory for the socket */
-			cluster->sockdir = pg_malloc(MAXPGPATH);
-			if (!getcwd(cluster->sockdir, MAXPGPATH))
-				pg_fatal("cannot find current directory\n");
-		}
-		else
-		{
-			/*
-			 * If we are doing a live check, we will use the old cluster's
-			 * Unix domain socket directory so we can connect to the live
-			 * server.
-			 */
-			unsigned short orig_port = cluster->port;
-			char		filename[MAXPGPATH],
-						line[MAXPGPATH];
-			FILE	   *fp;
-			int			lineno;
-
-			snprintf(filename, sizeof(filename), "%s/postmaster.pid",
-					 cluster->pgdata);
-			if ((fp = fopen(filename, "r")) == NULL)
-				pg_fatal("Cannot open file %s: %m\n", filename);
-
-			for (lineno = 1;
-			   lineno <= Max(LOCK_FILE_LINE_PORT, LOCK_FILE_LINE_SOCKET_DIR);
-				 lineno++)
-			{
-				if (fgets(line, sizeof(line), fp) == NULL)
-					pg_fatal("Cannot read line %d from %s: %m\n", lineno, filename);
-
-				/* potentially overwrite user-supplied value */
-				if (lineno == LOCK_FILE_LINE_PORT)
-					sscanf(line, "%hu", &old_cluster.port);
-				if (lineno == LOCK_FILE_LINE_SOCKET_DIR)
-				{
-					cluster->sockdir = pg_strdup(line);
-					/* strip off newline */
-					if (strchr(cluster->sockdir, '\n') != NULL)
-						*strchr(cluster->sockdir, '\n') = '\0';
-				}
-			}
-			fclose(fp);
-
-			/* warn of port number correction */
-			if (orig_port != DEF_PGUPORT && old_cluster.port != orig_port)
-				pg_log(PG_WARNING, "User-supplied old port number %hu corrected to %hu\n",
-					   orig_port, cluster->port);
-		}
-	}
-	else
-
-		/*
-		 * Can't get sockdir and pg_ctl -w can't use a non-default, use
-		 * default
-		 */
-		cluster->sockdir = NULL;
-#else							/* !HAVE_UNIX_SOCKETS */
-	cluster->sockdir = NULL;
-#endif
-}
diff --git a/contrib/pg_upgrade/page.c b/contrib/pg_upgrade/page.c
deleted file mode 100644
index 1cfc10f..0000000
--- a/contrib/pg_upgrade/page.c
+++ /dev/null
@@ -1,164 +0,0 @@
-/*
- *	page.c
- *
- *	per-page conversion operations
- *
- *	Copyright (c) 2010-2015, PostgreSQL Global Development Group
- *	contrib/pg_upgrade/page.c
- */
-
-#include "postgres_fe.h"
-
-#include "pg_upgrade.h"
-
-#include "storage/bufpage.h"
-
-
-#ifdef PAGE_CONVERSION
-
-
-static void getPageVersion(
-			   uint16 *version, const char *pathName);
-static pageCnvCtx *loadConverterPlugin(
-					uint16 newPageVersion, uint16 oldPageVersion);
-
-
-/*
- * setupPageConverter()
- *
- *	This function determines the PageLayoutVersion of the old cluster and
- *	the PageLayoutVersion of the new cluster.  If the versions differ, this
- *	function loads a converter plugin and returns a pointer to a pageCnvCtx
- *	object (in *result) that knows how to convert pages from the old format
- *	to the new format.  If the versions are identical, this function just
- *	returns a NULL pageCnvCtx pointer to indicate that page-by-page conversion
- *	is not required.
- */
-pageCnvCtx *
-setupPageConverter(void)
-{
-	uint16		oldPageVersion;
-	uint16		newPageVersion;
-	pageCnvCtx *converter;
-	const char *msg;
-	char		dstName[MAXPGPATH];
-	char		srcName[MAXPGPATH];
-
-	snprintf(dstName, sizeof(dstName), "%s/global/%u", new_cluster.pgdata,
-			 new_cluster.pg_database_oid);
-	snprintf(srcName, sizeof(srcName), "%s/global/%u", old_cluster.pgdata,
-			 old_cluster.pg_database_oid);
-
-	getPageVersion(&oldPageVersion, srcName);
-	getPageVersion(&newPageVersion, dstName);
-
-	/*
-	 * If the old cluster and new cluster use the same page layouts, then we
-	 * don't need a page converter.
-	 */
-	if (newPageVersion != oldPageVersion)
-	{
-		/*
-		 * The clusters use differing page layouts, see if we can find a
-		 * plugin that knows how to convert from the old page layout to the
-		 * new page layout.
-		 */
-
-		if ((converter = loadConverterPlugin(newPageVersion, oldPageVersion)) == NULL)
-			pg_fatal("could not find plugin to convert from old page layout to new page layout\n");
-
-		return converter;
-	}
-	else
-		return NULL;
-}
-
-
-/*
- * getPageVersion()
- *
- *	Retrieves the PageLayoutVersion for the given relation.
- *
- *	Returns NULL on success (and stores the PageLayoutVersion at *version),
- *	if an error occurs, this function returns an error message (in the form
- *	of a null-terminated string).
- */
-static void
-getPageVersion(uint16 *version, const char *pathName)
-{
-	int			relfd;
-	PageHeaderData page;
-	ssize_t		bytesRead;
-
-	if ((relfd = open(pathName, O_RDONLY, 0)) < 0)
-		pg_fatal("could not open relation %s\n", pathName);
-
-	if ((bytesRead = read(relfd, &page, sizeof(page))) != sizeof(page))
-		pg_fatal("could not read page header of %s\n", pathName);
-
-	*version = PageGetPageLayoutVersion(&page);
-
-	close(relfd);
-
-	return;
-}
-
-
-/*
- * loadConverterPlugin()
- *
- *	This function loads a page-converter plugin library and grabs a
- *	pointer to each of the (interesting) functions provided by that
- *	plugin.  The name of the plugin library is derived from the given
- *	newPageVersion and oldPageVersion.  If a plugin is found, this
- *	function returns a pointer to a pageCnvCtx object (which will contain
- *	a collection of plugin function pointers). If the required plugin
- *	is not found, this function returns NULL.
- */
-static pageCnvCtx *
-loadConverterPlugin(uint16 newPageVersion, uint16 oldPageVersion)
-{
-	char		pluginName[MAXPGPATH];
-	void	   *plugin;
-
-	/*
-	 * Try to find a plugin that can convert pages of oldPageVersion into
-	 * pages of newPageVersion.  For example, if we oldPageVersion = 3 and
-	 * newPageVersion is 4, we search for a plugin named:
-	 * plugins/convertLayout_3_to_4.dll
-	 */
-
-	/*
-	 * FIXME: we are searching for plugins relative to the current directory,
-	 * we should really search relative to our own executable instead.
-	 */
-	snprintf(pluginName, sizeof(pluginName), "./plugins/convertLayout_%d_to_%d%s",
-			 oldPageVersion, newPageVersion, DLSUFFIX);
-
-	if ((plugin = pg_dlopen(pluginName)) == NULL)
-		return NULL;
-	else
-	{
-		pageCnvCtx *result = (pageCnvCtx *) pg_malloc(sizeof(*result));
-
-		result->old.PageVersion = oldPageVersion;
-		result->new.PageVersion = newPageVersion;
-
-		result->startup = (pluginStartup) pg_dlsym(plugin, "init");
-		result->convertFile = (pluginConvertFile) pg_dlsym(plugin, "convertFile");
-		result->convertPage = (pluginConvertPage) pg_dlsym(plugin, "convertPage");
-		result->shutdown = (pluginShutdown) pg_dlsym(plugin, "fini");
-		result->pluginData = NULL;
-
-		/*
-		 * If the plugin has exported an initializer, go ahead and invoke it.
-		 */
-		if (result->startup)
-			result->startup(MIGRATOR_API_VERSION, &result->pluginVersion,
-						newPageVersion, oldPageVersion, &result->pluginData);
-
-		return result;
-	}
-}
-
-#endif
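
For readers following the removed loadConverterPlugin() code above, the plugin lookup comes down to formatting a path from the two page-layout versions. A minimal sketch of that step, under stated assumptions: build_plugin_name() and SKETCH_DLSUFFIX are hypothetical names used only for illustration (the removed code uses the platform's DLSUFFIX), and the truncation check is an addition of this sketch rather than something the removed code performed.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for the platform-specific DLSUFFIX macro. */
#define SKETCH_DLSUFFIX ".so"

/*
 * Format the converter plugin path for old_ver -> new_ver into buf.
 * Returns 0 on success, -1 if the name did not fit (assumed policy).
 */
static int
build_plugin_name(char *buf, size_t buflen, unsigned old_ver, unsigned new_ver)
{
	int			n;

	assert(buf != NULL && buflen > 0);

	n = snprintf(buf, buflen, "./plugins/convertLayout_%u_to_%u%s",
				 old_ver, new_ver, SKETCH_DLSUFFIX);

	/* snprintf reports the length it wanted; compare it with the buffer */
	return (n >= 0 && (size_t) n < buflen) ? 0 : -1;
}
```

As the FIXME in the removed code notes, a path relative to the current directory is fragile; a real implementation would probably resolve it against the executable's location.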
diff --git a/contrib/pg_upgrade/parallel.c b/contrib/pg_upgrade/parallel.c
deleted file mode 100644
index 6da9965..0000000
--- a/contrib/pg_upgrade/parallel.c
+++ /dev/null
@@ -1,357 +0,0 @@
-/*
- *	parallel.c
- *
- *	multi-process support
- *
- *	Copyright (c) 2010-2015, PostgreSQL Global Development Group
- *	contrib/pg_upgrade/parallel.c
- */
-
-#include "postgres_fe.h"
-
-#include "pg_upgrade.h"
-
-#include <stdlib.h>
-#include <string.h>
-#include <sys/types.h>
-#include <sys/wait.h>
-
-#ifdef WIN32
-#include <io.h>
-#endif
-
-static int	parallel_jobs;
-
-#ifdef WIN32
-/*
- *	Array holding all active threads.  There can't be any gaps/zeros so
- *	it can be passed to WaitForMultipleObjects().  We use two arrays
- *	so the thread_handles array can be passed to WaitForMultipleObjects().
- */
-HANDLE	   *thread_handles;
-
-typedef struct
-{
-	char	   *log_file;
-	char	   *opt_log_file;
-	char	   *cmd;
-} exec_thread_arg;
-
-typedef struct
-{
-	DbInfoArr  *old_db_arr;
-	DbInfoArr  *new_db_arr;
-	char	   *old_pgdata;
-	char	   *new_pgdata;
-	char	   *old_tablespace;
-} transfer_thread_arg;
-
-exec_thread_arg **exec_thread_args;
-transfer_thread_arg **transfer_thread_args;
-
-/* track current thread_args struct so reap_child() can be used for all cases */
-void	  **cur_thread_args;
-
-DWORD		win32_exec_prog(exec_thread_arg *args);
-DWORD		win32_transfer_all_new_dbs(transfer_thread_arg *args);
-#endif
-
-/*
- *	parallel_exec_prog
- *
- *	This has the same API as exec_prog, except it does parallel execution,
- *	and therefore must throw errors and doesn't return an error status.
- */
-void
-parallel_exec_prog(const char *log_file, const char *opt_log_file,
-				   const char *fmt,...)
-{
-	va_list		args;
-	char		cmd[MAX_STRING];
-
-#ifndef WIN32
-	pid_t		child;
-#else
-	HANDLE		child;
-	exec_thread_arg *new_arg;
-#endif
-
-	va_start(args, fmt);
-	vsnprintf(cmd, sizeof(cmd), fmt, args);
-	va_end(args);
-
-	if (user_opts.jobs <= 1)
-		/* throw_error must be true to allow jobs */
-		exec_prog(log_file, opt_log_file, true, "%s", cmd);
-	else
-	{
-		/* parallel */
-#ifdef WIN32
-		if (thread_handles == NULL)
-			thread_handles = pg_malloc(user_opts.jobs * sizeof(HANDLE));
-
-		if (exec_thread_args == NULL)
-		{
-			int			i;
-
-			exec_thread_args = pg_malloc(user_opts.jobs * sizeof(exec_thread_arg *));
-
-			/*
-			 * For safety and performance, we keep the args allocated during
-			 * the entire life of the process, and we don't free the args in a
-			 * thread different from the one that allocated it.
-			 */
-			for (i = 0; i < user_opts.jobs; i++)
-				exec_thread_args[i] = pg_malloc0(sizeof(exec_thread_arg));
-		}
-
-		cur_thread_args = (void **) exec_thread_args;
-#endif
-		/* harvest any dead children */
-		while (reap_child(false) == true)
-			;
-
-		/* must we wait for a dead child? */
-		if (parallel_jobs >= user_opts.jobs)
-			reap_child(true);
-
-		/* set this before we start the job */
-		parallel_jobs++;
-
-		/* Ensure stdio state is quiesced before forking */
-		fflush(NULL);
-
-#ifndef WIN32
-		child = fork();
-		if (child == 0)
-			/* use _exit to skip atexit() functions */
-			_exit(!exec_prog(log_file, opt_log_file, true, "%s", cmd));
-		else if (child < 0)
-			/* fork failed */
-			pg_fatal("could not create worker process: %s\n", strerror(errno));
-#else
-		/* empty array elements are always at the end */
-		new_arg = exec_thread_args[parallel_jobs - 1];
-
-		/* Can only pass one pointer into the function, so use a struct */
-		if (new_arg->log_file)
-			pg_free(new_arg->log_file);
-		new_arg->log_file = pg_strdup(log_file);
-		if (new_arg->opt_log_file)
-			pg_free(new_arg->opt_log_file);
-		new_arg->opt_log_file = opt_log_file ? pg_strdup(opt_log_file) : NULL;
-		if (new_arg->cmd)
-			pg_free(new_arg->cmd);
-		new_arg->cmd = pg_strdup(cmd);
-
-		child = (HANDLE) _beginthreadex(NULL, 0, (void *) win32_exec_prog,
-										new_arg, 0, NULL);
-		if (child == 0)
-			pg_fatal("could not create worker thread: %s\n", strerror(errno));
-
-		thread_handles[parallel_jobs - 1] = child;
-#endif
-	}
-
-	return;
-}
-
-
-#ifdef WIN32
-DWORD
-win32_exec_prog(exec_thread_arg *args)
-{
-	int			ret;
-
-	ret = !exec_prog(args->log_file, args->opt_log_file, true, "%s", args->cmd);
-
-	/* terminates thread */
-	return ret;
-}
-#endif
-
-
-/*
- *	parallel_transfer_all_new_dbs
- *
- *	This has the same API as transfer_all_new_dbs, except it does parallel execution
- *	by transferring multiple tablespaces in parallel
- */
-void
-parallel_transfer_all_new_dbs(DbInfoArr *old_db_arr, DbInfoArr *new_db_arr,
-							  char *old_pgdata, char *new_pgdata,
-							  char *old_tablespace)
-{
-#ifndef WIN32
-	pid_t		child;
-#else
-	HANDLE		child;
-	transfer_thread_arg *new_arg;
-#endif
-
-	if (user_opts.jobs <= 1)
-		/* throw_error must be true to allow jobs */
-		transfer_all_new_dbs(old_db_arr, new_db_arr, old_pgdata, new_pgdata, NULL);
-	else
-	{
-		/* parallel */
-#ifdef WIN32
-		if (thread_handles == NULL)
-			thread_handles = pg_malloc(user_opts.jobs * sizeof(HANDLE));
-
-		if (transfer_thread_args == NULL)
-		{
-			int			i;
-
-			transfer_thread_args = pg_malloc(user_opts.jobs * sizeof(transfer_thread_arg *));
-
-			/*
-			 * For safety and performance, we keep the args allocated during
-			 * the entire life of the process, and we don't free the args in a
-			 * thread different from the one that allocated it.
-			 */
-			for (i = 0; i < user_opts.jobs; i++)
-				transfer_thread_args[i] = pg_malloc0(sizeof(transfer_thread_arg));
-		}
-
-		cur_thread_args = (void **) transfer_thread_args;
-#endif
-		/* harvest any dead children */
-		while (reap_child(false) == true)
-			;
-
-		/* must we wait for a dead child? */
-		if (parallel_jobs >= user_opts.jobs)
-			reap_child(true);
-
-		/* set this before we start the job */
-		parallel_jobs++;
-
-		/* Ensure stdio state is quiesced before forking */
-		fflush(NULL);
-
-#ifndef WIN32
-		child = fork();
-		if (child == 0)
-		{
-			transfer_all_new_dbs(old_db_arr, new_db_arr, old_pgdata, new_pgdata,
-								 old_tablespace);
-			/* if we take another exit path, it will be non-zero */
-			/* use _exit to skip atexit() functions */
-			_exit(0);
-		}
-		else if (child < 0)
-			/* fork failed */
-			pg_fatal("could not create worker process: %s\n", strerror(errno));
-#else
-		/* empty array elements are always at the end */
-		new_arg = transfer_thread_args[parallel_jobs - 1];
-
-		/* Can only pass one pointer into the function, so use a struct */
-		new_arg->old_db_arr = old_db_arr;
-		new_arg->new_db_arr = new_db_arr;
-		if (new_arg->old_pgdata)
-			pg_free(new_arg->old_pgdata);
-		new_arg->old_pgdata = pg_strdup(old_pgdata);
-		if (new_arg->new_pgdata)
-			pg_free(new_arg->new_pgdata);
-		new_arg->new_pgdata = pg_strdup(new_pgdata);
-		if (new_arg->old_tablespace)
-			pg_free(new_arg->old_tablespace);
-		new_arg->old_tablespace = old_tablespace ? pg_strdup(old_tablespace) : NULL;
-
-		child = (HANDLE) _beginthreadex(NULL, 0, (void *) win32_transfer_all_new_dbs,
-										new_arg, 0, NULL);
-		if (child == 0)
-			pg_fatal("could not create worker thread: %s\n", strerror(errno));
-
-		thread_handles[parallel_jobs - 1] = child;
-#endif
-	}
-
-	return;
-}
-
-
-#ifdef WIN32
-DWORD
-win32_transfer_all_new_dbs(transfer_thread_arg *args)
-{
-	transfer_all_new_dbs(args->old_db_arr, args->new_db_arr, args->old_pgdata,
-						 args->new_pgdata, args->old_tablespace);
-
-	/* terminates thread */
-	return 0;
-}
-#endif
-
-
-/*
- *	collect status from a completed worker child
- */
-bool
-reap_child(bool wait_for_child)
-{
-#ifndef WIN32
-	int			work_status;
-	int			ret;
-#else
-	int			thread_num;
-	DWORD		res;
-#endif
-
-	if (user_opts.jobs <= 1 || parallel_jobs == 0)
-		return false;
-
-#ifndef WIN32
-	ret = waitpid(-1, &work_status, wait_for_child ? 0 : WNOHANG);
-
-	/* no children or, for WNOHANG, no dead children */
-	if (ret <= 0 || !WIFEXITED(work_status))
-		return false;
-
-	if (WEXITSTATUS(work_status) != 0)
-		pg_fatal("child worker exited abnormally: %s\n", strerror(errno));
-#else
-	/* wait for one to finish */
-	thread_num = WaitForMultipleObjects(parallel_jobs, thread_handles,
-										false, wait_for_child ? INFINITE : 0);
-
-	if (thread_num == WAIT_TIMEOUT || thread_num == WAIT_FAILED)
-		return false;
-
-	/* compute thread index in thread_handles */
-	thread_num -= WAIT_OBJECT_0;
-
-	/* get the result */
-	GetExitCodeThread(thread_handles[thread_num], &res);
-	if (res != 0)
-		pg_fatal("child worker exited abnormally: %s\n", strerror(errno));
-
-	/* dispose of handle to stop leaks */
-	CloseHandle(thread_handles[thread_num]);
-
-	/* Move last slot into dead child's position */
-	if (thread_num != parallel_jobs - 1)
-	{
-		void	   *tmp_args;
-
-		thread_handles[thread_num] = thread_handles[parallel_jobs - 1];
-
-		/*
-		 * Move last active thread arg struct into the now-dead slot, and the
-		 * now-dead slot to the end for reuse by the next thread. Though the
-		 * thread struct is in use by another thread, we can safely swap the
-		 * struct pointers within the array.
-		 */
-		tmp_args = cur_thread_args[thread_num];
-		cur_thread_args[thread_num] = cur_thread_args[parallel_jobs - 1];
-		cur_thread_args[parallel_jobs - 1] = tmp_args;
-	}
-#endif
-
-	/* do this after job has been removed */
-	parallel_jobs--;
-
-	return true;
-}
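
The Windows branch of the removed reap_child() above keeps thread_handles gap-free by moving the last active slot into the finished one and parking the finished argument struct at the end for reuse. A minimal, portable sketch of that bookkeeping follows; release_slot() is a hypothetical name, and plain int values stand in for Windows HANDLEs so the example can run anywhere.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Remove slot "dead" from the first *nactive entries.
 * The last active handle is moved into the dead position so the handle
 * array has no gaps; the argument pointers are swapped so the dead
 * slot's struct ends up just past the active range, ready for reuse.
 */
static void
release_slot(int *handles, void **args, size_t *nactive, size_t dead)
{
	size_t		last;

	assert(*nactive > 0 && dead < *nactive);
	last = *nactive - 1;

	if (dead != last)
	{
		void	   *tmp = args[dead];

		handles[dead] = handles[last];
		args[dead] = args[last];
		args[last] = tmp;
	}

	/* shrink the active range only after the slot has been vacated */
	(*nactive)--;
}
```

Swapping pointers rather than copying structs matters here: a running thread keeps a pointer to its own argument struct, so the struct itself must not move.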
diff --git a/contrib/pg_upgrade/pg_upgrade.c b/contrib/pg_upgrade/pg_upgrade.c
deleted file mode 100644
index eb48da7..0000000
--- a/contrib/pg_upgrade/pg_upgrade.c
+++ /dev/null
@@ -1,643 +0,0 @@
-/*
- *	pg_upgrade.c
- *
- *	main source file
- *
- *	Copyright (c) 2010-2015, PostgreSQL Global Development Group
- *	contrib/pg_upgrade/pg_upgrade.c
- */
-
-/*
- *	To simplify the upgrade process, we force certain system values to be
- *	identical between old and new clusters:
- *
- *	We control all assignments of pg_class.oid (and relfilenode) so toast
- *	oids are the same between old and new clusters.  This is important
- *	because toast oids are stored as toast pointers in user tables.
- *
- *	While pg_class.oid and pg_class.relfilenode are initially the same
- *	in a cluster, they can diverge due to CLUSTER, REINDEX, or VACUUM
- *	FULL.  In the new cluster, pg_class.oid and pg_class.relfilenode will
- *	be the same and will match the old pg_class.oid value.  Because of
- *	this, old/new pg_class.relfilenode values will not match if CLUSTER,
- *	REINDEX, or VACUUM FULL have been performed in the old cluster.
- *
- *	We control all assignments of pg_type.oid because these oids are stored
- *	in user composite type values.
- *
- *	We control all assignments of pg_enum.oid because these oids are stored
- *	in user tables as enum values.
- *
- *	We control all assignments of pg_authid.oid because these oids are stored
- *	in pg_largeobject_metadata.
- */
-
-
-
-#include "postgres_fe.h"
-
-#include "pg_upgrade.h"
-#include "common/restricted_token.h"
-
-#ifdef HAVE_LANGINFO_H
-#include <langinfo.h>
-#endif
-
-static void prepare_new_cluster(void);
-static void prepare_new_databases(void);
-static void create_new_objects(void);
-static void copy_clog_xlog_xid(void);
-static void set_frozenxids(bool minmxid_only);
-static void setup(char *argv0, bool *live_check);
-static void cleanup(void);
-
-ClusterInfo old_cluster,
-			new_cluster;
-OSInfo		os_info;
-
-char	   *output_files[] = {
-	SERVER_LOG_FILE,
-#ifdef WIN32
-	/* unique file for pg_ctl start */
-	SERVER_START_LOG_FILE,
-#endif
-	UTILITY_LOG_FILE,
-	INTERNAL_LOG_FILE,
-	NULL
-};
-
-
-int
-main(int argc, char **argv)
-{
-	char	   *analyze_script_file_name = NULL;
-	char	   *deletion_script_file_name = NULL;
-	bool		live_check = false;
-
-	parseCommandLine(argc, argv);
-
-	get_restricted_token(os_info.progname);
-
-	adjust_data_dir(&old_cluster);
-	adjust_data_dir(&new_cluster);
-
-	setup(argv[0], &live_check);
-
-	output_check_banner(live_check);
-
-	check_cluster_versions();
-
-	get_sock_dir(&old_cluster, live_check);
-	get_sock_dir(&new_cluster, false);
-
-	check_cluster_compatibility(live_check);
-
-	check_and_dump_old_cluster(live_check);
-
-
-	/* -- NEW -- */
-	start_postmaster(&new_cluster, true);
-
-	check_new_cluster();
-	report_clusters_compatible();
-
-	pg_log(PG_REPORT, "\nPerforming Upgrade\n");
-	pg_log(PG_REPORT, "------------------\n");
-
-	prepare_new_cluster();
-
-	stop_postmaster(false);
-
-	/*
-	 * Destructive Changes to New Cluster
-	 */
-
-	copy_clog_xlog_xid();
-
-	/* New now using xids of the old system */
-
-	/* -- NEW -- */
-	start_postmaster(&new_cluster, true);
-
-	prepare_new_databases();
-
-	create_new_objects();
-
-	stop_postmaster(false);
-
-	/*
-	 * Most failures happen in create_new_objects(), which has completed at
-	 * this point.  We do this here because it is just before linking, which
-	 * will link the old and new cluster data files, preventing the old
-	 * cluster from being safely started once the new cluster is started.
-	 */
-	if (user_opts.transfer_mode == TRANSFER_MODE_LINK)
-		disable_old_cluster();
-
-	transfer_all_new_tablespaces(&old_cluster.dbarr, &new_cluster.dbarr,
-								 old_cluster.pgdata, new_cluster.pgdata);
-
-	/*
-	 * Assuming OIDs are only used in system tables, there is no need to
-	 * restore the OID counter because we have not transferred any OIDs from
-	 * the old system, but we do it anyway just in case.  We do it late here
-	 * because there is no need to have the schema load use new oids.
-	 */
-	prep_status("Setting next OID for new cluster");
-	exec_prog(UTILITY_LOG_FILE, NULL, true,
-			  "\"%s/pg_resetxlog\" -o %u \"%s\"",
-			  new_cluster.bindir, old_cluster.controldata.chkpnt_nxtoid,
-			  new_cluster.pgdata);
-	check_ok();
-
-	prep_status("Sync data directory to disk");
-	exec_prog(UTILITY_LOG_FILE, NULL, true,
-			  "\"%s/initdb\" --sync-only \"%s\"", new_cluster.bindir,
-			  new_cluster.pgdata);
-	check_ok();
-
-	create_script_for_cluster_analyze(&analyze_script_file_name);
-	create_script_for_old_cluster_deletion(&deletion_script_file_name);
-
-	issue_warnings();
-
-	pg_log(PG_REPORT, "\nUpgrade Complete\n");
-	pg_log(PG_REPORT, "----------------\n");
-
-	output_completion_banner(analyze_script_file_name,
-							 deletion_script_file_name);
-
-	pg_free(analyze_script_file_name);
-	pg_free(deletion_script_file_name);
-
-	cleanup();
-
-	return 0;
-}
-
-
-static void
-setup(char *argv0, bool *live_check)
-{
-	char		exec_path[MAXPGPATH];	/* full path to my executable */
-
-	/*
-	 * make sure the user has a clean environment, otherwise, we may confuse
-	 * libpq when we connect to one (or both) of the servers.
-	 */
-	check_pghost_envvar();
-
-	verify_directories();
-
-	/* no postmasters should be running, except for a live check */
-	if (pid_lock_file_exists(old_cluster.pgdata))
-	{
-		/*
-		 * If we have a postmaster.pid file, try to start the server.  If it
-		 * starts, the pid file was stale, so stop the server.  If it doesn't
-		 * start, assume the server is running.  If the pid file is left over
-		 * from a server crash, this also allows any committed transactions
-		 * stored in the WAL to be replayed so they are not lost, because WAL
-		 * files are not transferred from old to new servers.
-		 */
-		if (start_postmaster(&old_cluster, false))
-			stop_postmaster(false);
-		else
-		{
-			if (!user_opts.check)
-				pg_fatal("There seems to be a postmaster servicing the old cluster.\n"
-						 "Please shutdown that postmaster and try again.\n");
-			else
-				*live_check = true;
-		}
-	}
-
-	/* same goes for the new postmaster */
-	if (pid_lock_file_exists(new_cluster.pgdata))
-	{
-		if (start_postmaster(&new_cluster, false))
-			stop_postmaster(false);
-		else
-			pg_fatal("There seems to be a postmaster servicing the new cluster.\n"
-					 "Please shutdown that postmaster and try again.\n");
-	}
-
-	/* get path to pg_upgrade executable */
-	if (find_my_exec(argv0, exec_path) < 0)
-		pg_fatal("Could not get path name to pg_upgrade: %s\n", getErrorText(errno));
-
-	/* Trim off program name and keep just path */
-	*last_dir_separator(exec_path) = '\0';
-	canonicalize_path(exec_path);
-	os_info.exec_path = pg_strdup(exec_path);
-}
-
-
-static void
-prepare_new_cluster(void)
-{
-	/*
-	 * It would make more sense to freeze after loading the schema, but that
-	 * would cause us to lose the frozenxids restored by the load. We use
-	 * --analyze so autovacuum doesn't update statistics later
-	 */
-	prep_status("Analyzing all rows in the new cluster");
-	exec_prog(UTILITY_LOG_FILE, NULL, true,
-			  "\"%s/vacuumdb\" %s --all --analyze %s",
-			  new_cluster.bindir, cluster_conn_opts(&new_cluster),
-			  log_opts.verbose ? "--verbose" : "");
-	check_ok();
-
-	/*
-	 * We do freeze after analyze so pg_statistic is also frozen. template0 is
-	 * not frozen here, but data rows were frozen by initdb, and we set its
-	 * datfrozenxid, relfrozenxids, and relminmxid later to match the new xid
-	 * counter.
-	 */
-	prep_status("Freezing all rows on the new cluster");
-	exec_prog(UTILITY_LOG_FILE, NULL, true,
-			  "\"%s/vacuumdb\" %s --all --freeze %s",
-			  new_cluster.bindir, cluster_conn_opts(&new_cluster),
-			  log_opts.verbose ? "--verbose" : "");
-	check_ok();
-
-	get_pg_database_relfilenode(&new_cluster);
-}
-
-
-static void
-prepare_new_databases(void)
-{
-	/*
-	 * We set autovacuum_freeze_max_age to its maximum value so autovacuum
-	 * does not launch here and delete clog files, before the frozen xids are
-	 * set.
-	 */
-
-	set_frozenxids(false);
-
-	prep_status("Restoring global objects in the new cluster");
-
-	/*
-	 * Install support functions in the global-object restore database to
-	 * preserve pg_authid.oid.  pg_dumpall uses 'template0' as its template
-	 * database so objects we add into 'template1' are not propagated.  They
-	 * are removed on pg_upgrade exit.
-	 */
-	install_support_functions_in_new_db("template1");
-
-	/*
-	 * We have to create the databases first so we can install support
-	 * functions in all the other databases.  Ideally we could create the
-	 * support functions in template1 but pg_dumpall creates database using
-	 * the template0 template.
-	 */
-	exec_prog(UTILITY_LOG_FILE, NULL, true,
-			  "\"%s/psql\" " EXEC_PSQL_ARGS " %s -f \"%s\"",
-			  new_cluster.bindir, cluster_conn_opts(&new_cluster),
-			  GLOBALS_DUMP_FILE);
-	check_ok();
-
-	/* we load this to get a current list of databases */
-	get_db_and_rel_infos(&new_cluster);
-}
-
-
-static void
-create_new_objects(void)
-{
-	int			dbnum;
-
-	prep_status("Adding support functions to new cluster");
-
-	/*
-	 * Technically, we only need to install these support functions in new
-	 * databases that also exist in the old cluster, but for completeness we
-	 * process all new databases.
-	 */
-	for (dbnum = 0; dbnum < new_cluster.dbarr.ndbs; dbnum++)
-	{
-		DbInfo	   *new_db = &new_cluster.dbarr.dbs[dbnum];
-
-		/* skip db we already installed */
-		if (strcmp(new_db->db_name, "template1") != 0)
-			install_support_functions_in_new_db(new_db->db_name);
-	}
-	check_ok();
-
-	prep_status("Restoring database schemas in the new cluster\n");
-
-	for (dbnum = 0; dbnum < old_cluster.dbarr.ndbs; dbnum++)
-	{
-		char		sql_file_name[MAXPGPATH],
-					log_file_name[MAXPGPATH];
-		DbInfo	   *old_db = &old_cluster.dbarr.dbs[dbnum];
-
-		pg_log(PG_STATUS, "%s", old_db->db_name);
-		snprintf(sql_file_name, sizeof(sql_file_name), DB_DUMP_FILE_MASK, old_db->db_oid);
-		snprintf(log_file_name, sizeof(log_file_name), DB_DUMP_LOG_FILE_MASK, old_db->db_oid);
-
-		/*
-		 * pg_dump only produces its output at the end, so there is little
-		 * parallelism if using the pipe.
-		 */
-		parallel_exec_prog(log_file_name,
-						   NULL,
-						   "\"%s/pg_restore\" %s --exit-on-error --verbose --dbname \"%s\" \"%s\"",
-						   new_cluster.bindir,
-						   cluster_conn_opts(&new_cluster),
-						   old_db->db_name,
-						   sql_file_name);
-	}
-
-	/* reap all children */
-	while (reap_child(true) == true)
-		;
-
-	end_progress_output();
-	check_ok();
-
-	/*
-	 * We don't have minmxids for databases or relations in pre-9.3
-	 * clusters, so set those after we have restored the schemas.
-	 */
-	if (GET_MAJOR_VERSION(old_cluster.major_version) < 903)
-		set_frozenxids(true);
-
-	optionally_create_toast_tables();
-
-	/* regenerate now that we have objects in the databases */
-	get_db_and_rel_infos(&new_cluster);
-
-	uninstall_support_functions_from_new_cluster();
-}
-
-/*
- * Delete the given subdirectory contents from the new cluster
- */
-static void
-remove_new_subdir(char *subdir, bool rmtopdir)
-{
-	char		new_path[MAXPGPATH];
-
-	prep_status("Deleting files from new %s", subdir);
-
-	snprintf(new_path, sizeof(new_path), "%s/%s", new_cluster.pgdata, subdir);
-	if (!rmtree(new_path, rmtopdir))
-		pg_fatal("could not delete directory \"%s\"\n", new_path);
-
-	check_ok();
-}
-
-/*
- * Copy the files from the old cluster into it
- */
-static void
-copy_subdir_files(char *subdir)
-{
-	char		old_path[MAXPGPATH];
-	char		new_path[MAXPGPATH];
-
-	remove_new_subdir(subdir, true);
-
-	snprintf(old_path, sizeof(old_path), "%s/%s", old_cluster.pgdata, subdir);
-	snprintf(new_path, sizeof(new_path), "%s/%s", new_cluster.pgdata, subdir);
-
-	prep_status("Copying old %s to new server", subdir);
-
-	exec_prog(UTILITY_LOG_FILE, NULL, true,
-#ifndef WIN32
-			  "cp -Rf \"%s\" \"%s\"",
-#else
-	/* flags: everything, no confirm, quiet, overwrite read-only */
-			  "xcopy /e /y /q /r \"%s\" \"%s\\\"",
-#endif
-			  old_path, new_path);
-
-	check_ok();
-}
-
-static void
-copy_clog_xlog_xid(void)
-{
-	/* copy old commit logs to new data dir */
-	copy_subdir_files("pg_clog");
-
-	/* set the next transaction id and epoch of the new cluster */
-	prep_status("Setting next transaction ID and epoch for new cluster");
-	exec_prog(UTILITY_LOG_FILE, NULL, true,
-			  "\"%s/pg_resetxlog\" -f -x %u \"%s\"",
-			  new_cluster.bindir, old_cluster.controldata.chkpnt_nxtxid,
-			  new_cluster.pgdata);
-	exec_prog(UTILITY_LOG_FILE, NULL, true,
-			  "\"%s/pg_resetxlog\" -f -e %u \"%s\"",
-			  new_cluster.bindir, old_cluster.controldata.chkpnt_nxtepoch,
-			  new_cluster.pgdata);
-	/* must reset commit timestamp limits also */
-	exec_prog(UTILITY_LOG_FILE, NULL, true,
-			  "\"%s/pg_resetxlog\" -f -c %u,%u \"%s\"",
-			  new_cluster.bindir,
-			  old_cluster.controldata.chkpnt_nxtxid,
-			  old_cluster.controldata.chkpnt_nxtxid,
-			  new_cluster.pgdata);
-	check_ok();
-
-	/*
-	 * If the old server is before the MULTIXACT_FORMATCHANGE_CAT_VER change
-	 * (see pg_upgrade.h) and the new server is after, then we don't copy
-	 * pg_multixact files, but we need to reset pg_control so that the new
-	 * server doesn't attempt to read multis older than the cutoff value.
-	 */
-	if (old_cluster.controldata.cat_ver >= MULTIXACT_FORMATCHANGE_CAT_VER &&
-		new_cluster.controldata.cat_ver >= MULTIXACT_FORMATCHANGE_CAT_VER)
-	{
-		copy_subdir_files("pg_multixact/offsets");
-		copy_subdir_files("pg_multixact/members");
-
-		prep_status("Setting next multixact ID and offset for new cluster");
-
-		/*
-		 * we preserve all files and contents, so we must preserve both "next"
-		 * counters here and the oldest multi present on system.
-		 */
-		exec_prog(UTILITY_LOG_FILE, NULL, true,
-				  "\"%s/pg_resetxlog\" -O %u -m %u,%u \"%s\"",
-				  new_cluster.bindir,
-				  old_cluster.controldata.chkpnt_nxtmxoff,
-				  old_cluster.controldata.chkpnt_nxtmulti,
-				  old_cluster.controldata.chkpnt_oldstMulti,
-				  new_cluster.pgdata);
-		check_ok();
-	}
-	else if (new_cluster.controldata.cat_ver >= MULTIXACT_FORMATCHANGE_CAT_VER)
-	{
-		/*
-		 * Remove offsets/0000 file created by initdb that no longer matches
-		 * the new multi-xid value.  "members" starts at zero so no need to
-		 * remove it.
-		 */
-		remove_new_subdir("pg_multixact/offsets", false);
-
-		prep_status("Setting oldest multixact ID on new cluster");
-
-		/*
-		 * We don't preserve files in this case, but it's important that the
-		 * oldest multi is set to the latest value used by the old system, so
-		 * that multixact.c returns the empty set for multis that might be
-		 * present on disk.  We set next multi to the value following that; it
-		 * might end up wrapped around (i.e. 0) if the old cluster had
-		 * next=MaxMultiXactId, but multixact.c can cope with that just fine.
-		 */
-		exec_prog(UTILITY_LOG_FILE, NULL, true,
-				  "\"%s/pg_resetxlog\" -m %u,%u \"%s\"",
-				  new_cluster.bindir,
-				  old_cluster.controldata.chkpnt_nxtmulti + 1,
-				  old_cluster.controldata.chkpnt_nxtmulti,
-				  new_cluster.pgdata);
-		check_ok();
-	}
-
-	/* now reset the wal archives in the new cluster */
-	prep_status("Resetting WAL archives");
-	exec_prog(UTILITY_LOG_FILE, NULL, true,
-			  "\"%s/pg_resetxlog\" -l %s \"%s\"", new_cluster.bindir,
-			  old_cluster.controldata.nextxlogfile,
-			  new_cluster.pgdata);
-	check_ok();
-}
-
-
-/*
- *	set_frozenxids()
- *
- *	We have frozen all xids, so set datfrozenxid, relfrozenxid, and
- *	relminmxid to be the old cluster's xid counter, which we just set
- *	in the new cluster.  User-table frozenxid and minmxid values will
- *	be set by pg_dump --binary-upgrade, but objects not set by the pg_dump
- *	must have proper frozen counters.
- */
-static
-void
-set_frozenxids(bool minmxid_only)
-{
-	int			dbnum;
-	PGconn	   *conn,
-			   *conn_template1;
-	PGresult   *dbres;
-	int			ntups;
-	int			i_datname;
-	int			i_datallowconn;
-
-	if (!minmxid_only)
-		prep_status("Setting frozenxid and minmxid counters in new cluster");
-	else
-		prep_status("Setting minmxid counter in new cluster");
-
-	conn_template1 = connectToServer(&new_cluster, "template1");
-
-	if (!minmxid_only)
-		/* set pg_database.datfrozenxid */
-		PQclear(executeQueryOrDie(conn_template1,
-								  "UPDATE pg_catalog.pg_database "
-								  "SET	datfrozenxid = '%u'",
-								  old_cluster.controldata.chkpnt_nxtxid));
-
-	/* set pg_database.datminmxid */
-	PQclear(executeQueryOrDie(conn_template1,
-							  "UPDATE pg_catalog.pg_database "
-							  "SET	datminmxid = '%u'",
-							  old_cluster.controldata.chkpnt_nxtmulti));
-
-	/* get database names */
-	dbres = executeQueryOrDie(conn_template1,
-							  "SELECT	datname, datallowconn "
-							  "FROM	pg_catalog.pg_database");
-
-	i_datname = PQfnumber(dbres, "datname");
-	i_datallowconn = PQfnumber(dbres, "datallowconn");
-
-	ntups = PQntuples(dbres);
-	for (dbnum = 0; dbnum < ntups; dbnum++)
-	{
-		char	   *datname = PQgetvalue(dbres, dbnum, i_datname);
-		char	   *datallowconn = PQgetvalue(dbres, dbnum, i_datallowconn);
-
-		/*
-		 * We must update databases where datallowconn = false, e.g.
-		 * template0, because autovacuum increments their datfrozenxids,
-		 * relfrozenxids, and relminmxid even if autovacuum is turned off,
-		 * and even though all the data rows are already frozen.  To enable
-		 * this, we temporarily change datallowconn.
-		 */
-		if (strcmp(datallowconn, "f") == 0)
-			PQclear(executeQueryOrDie(conn_template1,
-								"ALTER DATABASE %s ALLOW_CONNECTIONS = true",
-									  quote_identifier(datname)));
-
-		conn = connectToServer(&new_cluster, datname);
-
-		if (!minmxid_only)
-			/* set pg_class.relfrozenxid */
-			PQclear(executeQueryOrDie(conn,
-									  "UPDATE	pg_catalog.pg_class "
-									  "SET	relfrozenxid = '%u' "
-			/* only heap, materialized view, and TOAST are vacuumed */
-									  "WHERE	relkind IN ('r', 'm', 't')",
-									  old_cluster.controldata.chkpnt_nxtxid));
-
-		/* set pg_class.relminmxid */
-		PQclear(executeQueryOrDie(conn,
-								  "UPDATE	pg_catalog.pg_class "
-								  "SET	relminmxid = '%u' "
-		/* only heap, materialized view, and TOAST are vacuumed */
-								  "WHERE	relkind IN ('r', 'm', 't')",
-								  old_cluster.controldata.chkpnt_nxtmulti));
-		PQfinish(conn);
-
-		/* Reset datallowconn flag */
-		if (strcmp(datallowconn, "f") == 0)
-			PQclear(executeQueryOrDie(conn_template1,
-							   "ALTER DATABASE %s ALLOW_CONNECTIONS = false",
-									  quote_identifier(datname)));
-	}
-
-	PQclear(dbres);
-
-	PQfinish(conn_template1);
-
-	check_ok();
-}
-
-
-static void
-cleanup(void)
-{
-	fclose(log_opts.internal);
-
-	/* Remove dump and log files? */
-	if (!log_opts.retain)
-	{
-		int			dbnum;
-		char	  **filename;
-
-		for (filename = output_files; *filename != NULL; filename++)
-			unlink(*filename);
-
-		/* remove dump files */
-		unlink(GLOBALS_DUMP_FILE);
-
-		if (old_cluster.dbarr.dbs)
-			for (dbnum = 0; dbnum < old_cluster.dbarr.ndbs; dbnum++)
-			{
-				char		sql_file_name[MAXPGPATH],
-							log_file_name[MAXPGPATH];
-				DbInfo	   *old_db = &old_cluster.dbarr.dbs[dbnum];
-
-				snprintf(sql_file_name, sizeof(sql_file_name), DB_DUMP_FILE_MASK, old_db->db_oid);
-				unlink(sql_file_name);
-
-				snprintf(log_file_name, sizeof(log_file_name), DB_DUMP_LOG_FILE_MASK, old_db->db_oid);
-				unlink(log_file_name);
-			}
-	}
-}
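
The multixact handling in the removed copy_clog_xlog_xid() above depends on which side of the pg_multixact format change each cluster's catalog version falls. The decision can be sketched as a small pure function. This is only an illustration: MultixactAction and choose_multixact_action() are hypothetical names, while the constant is the MULTIXACT_FORMATCHANGE_CAT_VER value from pg_upgrade.h in this same patch.

```c
/* Value of MULTIXACT_FORMATCHANGE_CAT_VER in the removed pg_upgrade.h. */
#define SKETCH_MULTIXACT_FORMATCHANGE_CAT_VER 201301231U

typedef enum
{
	MXACT_COPY_FILES,			/* copy offsets/members, keep both counters */
	MXACT_RESET_OLDEST,			/* drop initdb's offsets, reset oldest multi */
	MXACT_LEAVE_ALONE			/* neither branch applies */
} MultixactAction;

/* Mirror the if / else-if structure of the removed code. */
static MultixactAction
choose_multixact_action(unsigned int old_cat_ver, unsigned int new_cat_ver)
{
	if (old_cat_ver >= SKETCH_MULTIXACT_FORMATCHANGE_CAT_VER &&
		new_cat_ver >= SKETCH_MULTIXACT_FORMATCHANGE_CAT_VER)
		return MXACT_COPY_FILES;
	else if (new_cat_ver >= SKETCH_MULTIXACT_FORMATCHANGE_CAT_VER)
		return MXACT_RESET_OLDEST;

	return MXACT_LEAVE_ALONE;
}
```

Only the first case preserves the old cluster's pg_multixact files; in the second case the files are not copied and pg_resetxlog is used to move the oldest-multi cutoff past anything the old cluster could have written.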
diff --git a/contrib/pg_upgrade/pg_upgrade.h b/contrib/pg_upgrade/pg_upgrade.h
deleted file mode 100644
index f6b13c0..0000000
--- a/contrib/pg_upgrade/pg_upgrade.h
+++ /dev/null
@@ -1,483 +0,0 @@
-/*
- *	pg_upgrade.h
- *
- *	Copyright (c) 2010-2015, PostgreSQL Global Development Group
- *	contrib/pg_upgrade/pg_upgrade.h
- */
-
-#include <unistd.h>
-#include <assert.h>
-#include <sys/stat.h>
-#include <sys/time.h>
-
-#include "libpq-fe.h"
-
-/* Use port in the private/dynamic port number range */
-#define DEF_PGUPORT			50432
-
-/* Allocate for null byte */
-#define USER_NAME_SIZE		128
-
-#define MAX_STRING			1024
-#define LINE_ALLOC			4096
-#define QUERY_ALLOC			8192
-
-#define MIGRATOR_API_VERSION	1
-
-#define MESSAGE_WIDTH		60
-
-#define GET_MAJOR_VERSION(v)	((v) / 100)
-
-/* contains both global db information and CREATE DATABASE commands */
-#define GLOBALS_DUMP_FILE	"pg_upgrade_dump_globals.sql"
-#define DB_DUMP_FILE_MASK	"pg_upgrade_dump_%u.custom"
-
-#define DB_DUMP_LOG_FILE_MASK	"pg_upgrade_dump_%u.log"
-#define SERVER_LOG_FILE		"pg_upgrade_server.log"
-#define UTILITY_LOG_FILE	"pg_upgrade_utility.log"
-#define INTERNAL_LOG_FILE	"pg_upgrade_internal.log"
-
-extern char *output_files[];
-
-/*
- * WIN32 files do not accept writes from multiple processes
- *
- * On Win32, we can't send both pg_upgrade output and command output to the
- * same file because we get the error: "The process cannot access the file
- * because it is being used by another process." so send the pg_ctl
- * command-line output to a new file, rather than into the server log file.
- * Ideally we could use UTILITY_LOG_FILE for this, but some Windows platforms
- * keep the pg_ctl output file open by the running postmaster, even after
- * pg_ctl exits.
- *
- * We could use the Windows pgwin32_open() flags to allow shared file
- * writes but it is unclear how all other tools would use those flags, so
- * we just avoid it and log a little differently on Windows;  we adjust
- * the error message appropriately.
- */
-#ifndef WIN32
-#define SERVER_START_LOG_FILE	SERVER_LOG_FILE
-#define SERVER_STOP_LOG_FILE	SERVER_LOG_FILE
-#else
-#define SERVER_START_LOG_FILE	"pg_upgrade_server_start.log"
-/*
- *	"pg_ctl start" keeps SERVER_START_LOG_FILE and SERVER_LOG_FILE open
- *	while the server is running, so we use UTILITY_LOG_FILE for "pg_ctl
- *	stop".
- */
-#define SERVER_STOP_LOG_FILE	UTILITY_LOG_FILE
-#endif
-
-
-#ifndef WIN32
-#define pg_copy_file		copy_file
-#define pg_mv_file			rename
-#define pg_link_file		link
-#define PATH_SEPARATOR		'/'
-#define RM_CMD				"rm -f"
-#define RMDIR_CMD			"rm -rf"
-#define SCRIPT_PREFIX		"./"
-#define SCRIPT_EXT			"sh"
-#define ECHO_QUOTE	"'"
-#define ECHO_BLANK	""
-#else
-#define pg_copy_file		CopyFile
-#define pg_mv_file			pgrename
-#define pg_link_file		win32_pghardlink
-#define PATH_SEPARATOR		'\\'
-#define RM_CMD				"DEL /q"
-#define RMDIR_CMD			"RMDIR /s/q"
-#define SCRIPT_PREFIX		""
-#define SCRIPT_EXT			"bat"
-#define EXE_EXT				".exe"
-#define ECHO_QUOTE	""
-#define ECHO_BLANK	"."
-#endif
-
-#define CLUSTER_NAME(cluster)	((cluster) == &old_cluster ? "old" : \
-								 (cluster) == &new_cluster ? "new" : "none")
-
-#define atooid(x)  ((Oid) strtoul((x), NULL, 10))
-
-/* OID system catalog preservation added during PG 9.0 development */
-#define TABLE_SPACE_SUBDIRS_CAT_VER 201001111
-/* postmaster/postgres -b (binary_upgrade) flag added during PG 9.1 development */
-#define BINARY_UPGRADE_SERVER_FLAG_CAT_VER 201104251
-/*
- *	Visibility map changed with this 9.2 commit,
- *	8f9fe6edce358f7904e0db119416b4d1080a83aa; pick later catalog version.
- */
-#define VISIBILITY_MAP_CRASHSAFE_CAT_VER 201107031
-
-/*
- * pg_multixact format changed in 9.3 commit 0ac5ad5134f2769ccbaefec73844f85,
- * ("Improve concurrency of foreign key locking") which also updated catalog
- * version to this value.  pg_upgrade behavior depends on whether old and new
- * server versions are both newer than this, or only the new one is.
- */
-#define MULTIXACT_FORMATCHANGE_CAT_VER 201301231
-
-/*
- * large object chunk size added to pg_controldata,
- * commit 5f93c37805e7485488480916b4585e098d3cc883
- */
-#define LARGE_OBJECT_SIZE_PG_CONTROL_VER 942
-
-/*
- * change in JSONB format during 9.4 beta
- */
-#define JSONB_FORMAT_CHANGE_CAT_VER 201409291
-
-/*
- * Each relation is represented by a relinfo structure.
- */
-typedef struct
-{
-	/* Can't use NAMEDATALEN;  not guaranteed to fit on client */
-	char	   *nspname;		/* namespace name */
-	char	   *relname;		/* relation name */
-	Oid			reloid;			/* relation oid */
-	Oid			relfilenode;	/* relation relfile node */
-	/* relation tablespace path, or "" for the cluster default */
-	char	   *tablespace;
-	bool		nsp_alloc;
-	bool		tblsp_alloc;
-} RelInfo;
-
-typedef struct
-{
-	RelInfo    *rels;
-	int			nrels;
-} RelInfoArr;
-
-/*
- * The following structure represents a relation mapping.
- */
-typedef struct
-{
-	const char *old_tablespace;
-	const char *new_tablespace;
-	const char *old_tablespace_suffix;
-	const char *new_tablespace_suffix;
-	Oid			old_db_oid;
-	Oid			new_db_oid;
-
-	/*
-	 * old/new relfilenodes might differ for pg_largeobject(_metadata) indexes
-	 * due to VACUUM FULL or REINDEX.  Other relfilenodes are preserved.
-	 */
-	Oid			old_relfilenode;
-	Oid			new_relfilenode;
-	/* the rest are used only for logging and error reporting */
-	char	   *nspname;		/* namespaces */
-	char	   *relname;
-} FileNameMap;
-
-/*
- * Structure to store database information
- */
-typedef struct
-{
-	Oid			db_oid;			/* oid of the database */
-	char	   *db_name;		/* database name */
-	char		db_tablespace[MAXPGPATH];		/* database default tablespace
-												 * path */
-	char	   *db_collate;
-	char	   *db_ctype;
-	int			db_encoding;
-	RelInfoArr	rel_arr;		/* array of all user relinfos */
-} DbInfo;
-
-typedef struct
-{
-	DbInfo	   *dbs;			/* array of db infos */
-	int			ndbs;			/* number of db infos */
-} DbInfoArr;
-
-/*
- * The following structure is used to hold pg_control information.
- * Rather than using the backend's control structure we use our own
- * structure to avoid pg_control version issues between releases.
- */
-typedef struct
-{
-	uint32		ctrl_ver;
-	uint32		cat_ver;
-	char		nextxlogfile[25];
-	uint32		chkpnt_tli;
-	uint32		chkpnt_nxtxid;
-	uint32		chkpnt_nxtepoch;
-	uint32		chkpnt_nxtoid;
-	uint32		chkpnt_nxtmulti;
-	uint32		chkpnt_nxtmxoff;
-	uint32		chkpnt_oldstMulti;
-	uint32		align;
-	uint32		blocksz;
-	uint32		largesz;
-	uint32		walsz;
-	uint32		walseg;
-	uint32		ident;
-	uint32		index;
-	uint32		toast;
-	uint32		large_object;
-	bool		date_is_int;
-	bool		float8_pass_by_value;
-	bool		data_checksum_version;
-} ControlData;
-
-/*
- * Enumeration to denote link modes
- */
-typedef enum
-{
-	TRANSFER_MODE_COPY,
-	TRANSFER_MODE_LINK
-} transferMode;
-
-/*
- * Enumeration to denote pg_log modes
- */
-typedef enum
-{
-	PG_VERBOSE,
-	PG_STATUS,
-	PG_REPORT,
-	PG_WARNING,
-	PG_FATAL
-} eLogType;
-
-
-typedef long pgpid_t;
-
-
-/*
- * cluster
- *
- *	information about each cluster
- */
-typedef struct
-{
-	ControlData controldata;	/* pg_control information */
-	DbInfoArr	dbarr;			/* dbinfos array */
-	char	   *pgdata;			/* pathname for cluster's $PGDATA directory */
-	char	   *pgconfig;		/* pathname for cluster's config file
-								 * directory */
-	char	   *bindir;			/* pathname for cluster's executable directory */
-	char	   *pgopts;			/* options to pass to the server, like pg_ctl
-								 * -o */
-	char	   *sockdir;		/* directory for Unix Domain socket, if any */
-	unsigned short port;		/* port number where postmaster is waiting */
-	uint32		major_version;	/* PG_VERSION of cluster */
-	char		major_version_str[64];	/* string PG_VERSION of cluster */
-	uint32		bin_version;	/* version returned from pg_ctl */
-	Oid			pg_database_oid;	/* OID of pg_database relation */
-	const char *tablespace_suffix;		/* directory specification */
-} ClusterInfo;
-
-
-/*
- *	LogOpts
-*/
-typedef struct
-{
-	FILE	   *internal;		/* internal log FILE */
-	bool		verbose;		/* TRUE -> be verbose in messages */
-	bool		retain;			/* retain log files on success */
-} LogOpts;
-
-
-/*
- *	UserOpts
-*/
-typedef struct
-{
-	bool		check;			/* TRUE -> ask user for permission to make
-								 * changes */
-	transferMode transfer_mode; /* copy files or link them? */
-	int			jobs;
-} UserOpts;
-
-
-/*
- * OSInfo
- */
-typedef struct
-{
-	const char *progname;		/* complete pathname for this program */
-	char	   *exec_path;		/* full path to my executable */
-	char	   *user;			/* username for clusters */
-	bool		user_specified; /* user specified on command-line */
-	char	  **old_tablespaces;	/* tablespaces */
-	int			num_old_tablespaces;
-	char	  **libraries;		/* loadable libraries */
-	int			num_libraries;
-	ClusterInfo *running_cluster;
-} OSInfo;
-
-
-/*
- * Global variables
- */
-extern LogOpts log_opts;
-extern UserOpts user_opts;
-extern ClusterInfo old_cluster,
-			new_cluster;
-extern OSInfo os_info;
-
-
-/* check.c */
-
-void		output_check_banner(bool live_check);
-void check_and_dump_old_cluster(bool live_check);
-void		check_new_cluster(void);
-void		report_clusters_compatible(void);
-void		issue_warnings(void);
-void output_completion_banner(char *analyze_script_file_name,
-						 char *deletion_script_file_name);
-void		check_cluster_versions(void);
-void		check_cluster_compatibility(bool live_check);
-void		create_script_for_old_cluster_deletion(char **deletion_script_file_name);
-void		create_script_for_cluster_analyze(char **analyze_script_file_name);
-
-
-/* controldata.c */
-
-void		get_control_data(ClusterInfo *cluster, bool live_check);
-void		check_control_data(ControlData *oldctrl, ControlData *newctrl);
-void		disable_old_cluster(void);
-
-
-/* dump.c */
-
-void		generate_old_dump(void);
-void		optionally_create_toast_tables(void);
-
-
-/* exec.c */
-
-#define EXEC_PSQL_ARGS "--echo-queries --set ON_ERROR_STOP=on --no-psqlrc --dbname=template1"
-
-bool		exec_prog(const char *log_file, const char *opt_log_file,
-		  bool throw_error, const char *fmt,...) pg_attribute_printf(4, 5);
-void		verify_directories(void);
-bool		pid_lock_file_exists(const char *datadir);
-
-
-/* file.c */
-
-#ifdef PAGE_CONVERSION
-typedef const char *(*pluginStartup) (uint16 migratorVersion,
-								uint16 *pluginVersion, uint16 newPageVersion,
-								   uint16 oldPageVersion, void **pluginData);
-typedef const char *(*pluginConvertFile) (void *pluginData,
-								   const char *dstName, const char *srcName);
-typedef const char *(*pluginConvertPage) (void *pluginData,
-								   const char *dstPage, const char *srcPage);
-typedef const char *(*pluginShutdown) (void *pluginData);
-
-typedef struct
-{
-	uint16		oldPageVersion; /* Page layout version of the old cluster		*/
-	uint16		newPageVersion; /* Page layout version of the new cluster		*/
-	uint16		pluginVersion;	/* API version of converter plugin */
-	void	   *pluginData;		/* Plugin data (set by plugin) */
-	pluginStartup startup;		/* Pointer to plugin's startup function */
-	pluginConvertFile convertFile;		/* Pointer to plugin's file converter
-										 * function */
-	pluginConvertPage convertPage;		/* Pointer to plugin's page converter
-										 * function */
-	pluginShutdown shutdown;	/* Pointer to plugin's shutdown function */
-} pageCnvCtx;
-
-const pageCnvCtx *setupPageConverter(void);
-#else
-/* dummy */
-typedef void *pageCnvCtx;
-#endif
-
-const char *copyAndUpdateFile(pageCnvCtx *pageConverter, const char *src,
-				  const char *dst, bool force);
-const char *linkAndUpdateFile(pageCnvCtx *pageConverter, const char *src,
-				  const char *dst);
-
-void		check_hard_link(void);
-FILE	   *fopen_priv(const char *path, const char *mode);
-
-/* function.c */
-
-void		install_support_functions_in_new_db(const char *db_name);
-void		uninstall_support_functions_from_new_cluster(void);
-void		get_loadable_libraries(void);
-void		check_loadable_libraries(void);
-
-/* info.c */
-
-FileNameMap *gen_db_file_maps(DbInfo *old_db,
-				 DbInfo *new_db, int *nmaps, const char *old_pgdata,
-				 const char *new_pgdata);
-void		get_db_and_rel_infos(ClusterInfo *cluster);
-void print_maps(FileNameMap *maps, int n,
-		   const char *db_name);
-
-/* option.c */
-
-void		parseCommandLine(int argc, char *argv[]);
-void		adjust_data_dir(ClusterInfo *cluster);
-void		get_sock_dir(ClusterInfo *cluster, bool live_check);
-
-/* relfilenode.c */
-
-void		get_pg_database_relfilenode(ClusterInfo *cluster);
-void transfer_all_new_tablespaces(DbInfoArr *old_db_arr,
-				  DbInfoArr *new_db_arr, char *old_pgdata, char *new_pgdata);
-void transfer_all_new_dbs(DbInfoArr *old_db_arr,
-				   DbInfoArr *new_db_arr, char *old_pgdata, char *new_pgdata,
-					 char *old_tablespace);
-
-/* tablespace.c */
-
-void		init_tablespaces(void);
-
-
-/* server.c */
-
-PGconn	   *connectToServer(ClusterInfo *cluster, const char *db_name);
-PGresult   *executeQueryOrDie(PGconn *conn, const char *fmt,...) pg_attribute_printf(2, 3);
-
-char	   *cluster_conn_opts(ClusterInfo *cluster);
-
-bool		start_postmaster(ClusterInfo *cluster, bool throw_error);
-void		stop_postmaster(bool fast);
-uint32		get_major_server_version(ClusterInfo *cluster);
-void		check_pghost_envvar(void);
-
-
-/* util.c */
-
-char	   *quote_identifier(const char *s);
-int			get_user_info(char **user_name_p);
-void		check_ok(void);
-void		report_status(eLogType type, const char *fmt,...) pg_attribute_printf(2, 3);
-void		pg_log(eLogType type, const char *fmt,...) pg_attribute_printf(2, 3);
-void		pg_fatal(const char *fmt,...) pg_attribute_printf(1, 2) pg_attribute_noreturn();
-void		end_progress_output(void);
-void		prep_status(const char *fmt,...) pg_attribute_printf(1, 2);
-void		check_ok(void);
-const char *getErrorText(int errNum);
-unsigned int str2uint(const char *str);
-void		pg_putenv(const char *var, const char *val);
-
-
-/* version.c */
-
-void new_9_0_populate_pg_largeobject_metadata(ClusterInfo *cluster,
-										 bool check_mode);
-void old_9_3_check_for_line_data_type_usage(ClusterInfo *cluster);
-
-/* parallel.c */
-void parallel_exec_prog(const char *log_file, const char *opt_log_file,
-				   const char *fmt,...) pg_attribute_printf(3, 4);
-void parallel_transfer_all_new_dbs(DbInfoArr *old_db_arr, DbInfoArr *new_db_arr,
-							  char *old_pgdata, char *new_pgdata,
-							  char *old_tablespace);
-bool		reap_child(bool wait_for_child);
diff --git a/contrib/pg_upgrade/relfilenode.c b/contrib/pg_upgrade/relfilenode.c
deleted file mode 100644
index 423802b..0000000
--- a/contrib/pg_upgrade/relfilenode.c
+++ /dev/null
@@ -1,294 +0,0 @@
-/*
- *	relfilenode.c
- *
- *	relfilenode functions
- *
- *	Copyright (c) 2010-2015, PostgreSQL Global Development Group
- *	contrib/pg_upgrade/relfilenode.c
- */
-
-#include "postgres_fe.h"
-
-#include "pg_upgrade.h"
-
-#include "catalog/pg_class.h"
-#include "access/transam.h"
-
-
-static void transfer_single_new_db(pageCnvCtx *pageConverter,
-					   FileNameMap *maps, int size, char *old_tablespace);
-static void transfer_relfile(pageCnvCtx *pageConverter, FileNameMap *map,
-				 const char *suffix);
-
-
-/*
- * transfer_all_new_tablespaces()
- *
- * Responsible for upgrading all databases.  Invokes routines to generate mappings and then
- * physically link the databases.
- */
-void
-transfer_all_new_tablespaces(DbInfoArr *old_db_arr, DbInfoArr *new_db_arr,
-							 char *old_pgdata, char *new_pgdata)
-{
-	pg_log(PG_REPORT, "%s user relation files\n",
-	  user_opts.transfer_mode == TRANSFER_MODE_LINK ? "Linking" : "Copying");
-
-	/*
-	 * Transferring files by tablespace is tricky because a single database can
-	 * use multiple tablespaces.  For non-parallel mode, we just pass a NULL
-	 * tablespace path, which matches all tablespaces.  In parallel mode, we
-	 * pass the default tablespace and all user-created tablespaces and let
-	 * those operations happen in parallel.
-	 */
-	if (user_opts.jobs <= 1)
-		parallel_transfer_all_new_dbs(old_db_arr, new_db_arr, old_pgdata,
-									  new_pgdata, NULL);
-	else
-	{
-		int			tblnum;
-
-		/* transfer default tablespace */
-		parallel_transfer_all_new_dbs(old_db_arr, new_db_arr, old_pgdata,
-									  new_pgdata, old_pgdata);
-
-		for (tblnum = 0; tblnum < os_info.num_old_tablespaces; tblnum++)
-			parallel_transfer_all_new_dbs(old_db_arr,
-										  new_db_arr,
-										  old_pgdata,
-										  new_pgdata,
-										  os_info.old_tablespaces[tblnum]);
-		/* reap all children */
-		while (reap_child(true) == true)
-			;
-	}
-
-	end_progress_output();
-	check_ok();
-
-	return;
-}
-
-
-/*
- * transfer_all_new_dbs()
- *
- * Responsible for upgrading all databases.  Invokes routines to generate mappings and then
- * physically link the databases.
- */
-void
-transfer_all_new_dbs(DbInfoArr *old_db_arr, DbInfoArr *new_db_arr,
-					 char *old_pgdata, char *new_pgdata, char *old_tablespace)
-{
-	int			old_dbnum,
-				new_dbnum;
-
-	/* Scan the old cluster databases and transfer their files */
-	for (old_dbnum = new_dbnum = 0;
-		 old_dbnum < old_db_arr->ndbs;
-		 old_dbnum++, new_dbnum++)
-	{
-		DbInfo	   *old_db = &old_db_arr->dbs[old_dbnum],
-				   *new_db = NULL;
-		FileNameMap *mappings;
-		int			n_maps;
-		pageCnvCtx *pageConverter = NULL;
-
-		/*
-		 * Advance past any databases that exist in the new cluster but not in
-		 * the old, e.g. "postgres".  (The user might have removed the
-		 * 'postgres' database from the old cluster.)
-		 */
-		for (; new_dbnum < new_db_arr->ndbs; new_dbnum++)
-		{
-			new_db = &new_db_arr->dbs[new_dbnum];
-			if (strcmp(old_db->db_name, new_db->db_name) == 0)
-				break;
-		}
-
-		if (new_dbnum >= new_db_arr->ndbs)
-			pg_fatal("old database \"%s\" not found in the new cluster\n",
-					 old_db->db_name);
-
-		mappings = gen_db_file_maps(old_db, new_db, &n_maps, old_pgdata,
-									new_pgdata);
-		if (n_maps)
-		{
-			print_maps(mappings, n_maps, new_db->db_name);
-
-#ifdef PAGE_CONVERSION
-			pageConverter = setupPageConverter();
-#endif
-			transfer_single_new_db(pageConverter, mappings, n_maps,
-								   old_tablespace);
-		}
-		/* We allocate something even for n_maps == 0 */
-		pg_free(mappings);
-	}
-
-	return;
-}
-
-
-/*
- * get_pg_database_relfilenode()
- *
- *	Retrieves the relfilenode for a few system-catalog tables.  We need these
- *	relfilenodes later in the upgrade process.
- */
-void
-get_pg_database_relfilenode(ClusterInfo *cluster)
-{
-	PGconn	   *conn = connectToServer(cluster, "template1");
-	PGresult   *res;
-	int			i_relfile;
-
-	res = executeQueryOrDie(conn,
-							"SELECT c.relname, c.relfilenode "
-							"FROM	pg_catalog.pg_class c, "
-							"		pg_catalog.pg_namespace n "
-							"WHERE	c.relnamespace = n.oid AND "
-							"		n.nspname = 'pg_catalog' AND "
-							"		c.relname = 'pg_database' "
-							"ORDER BY c.relname");
-
-	i_relfile = PQfnumber(res, "relfilenode");
-	cluster->pg_database_oid = atooid(PQgetvalue(res, 0, i_relfile));
-
-	PQclear(res);
-	PQfinish(conn);
-}
-
-
-/*
- * transfer_single_new_db()
- *
- * create links for mappings stored in "maps" array.
- */
-static void
-transfer_single_new_db(pageCnvCtx *pageConverter,
-					   FileNameMap *maps, int size, char *old_tablespace)
-{
-	int			mapnum;
-	bool		vm_crashsafe_match = true;
-
-	/*
-	 * Do the old and new cluster disagree on the crash safety of the vm
-	 * files?  If so, do not copy them.
-	 */
-	if (old_cluster.controldata.cat_ver < VISIBILITY_MAP_CRASHSAFE_CAT_VER &&
-		new_cluster.controldata.cat_ver >= VISIBILITY_MAP_CRASHSAFE_CAT_VER)
-		vm_crashsafe_match = false;
-
-	for (mapnum = 0; mapnum < size; mapnum++)
-	{
-		if (old_tablespace == NULL ||
-			strcmp(maps[mapnum].old_tablespace, old_tablespace) == 0)
-		{
-			/* transfer primary file */
-			transfer_relfile(pageConverter, &maps[mapnum], "");
-
-			/* fsm/vm files added in PG 8.4 */
-			if (GET_MAJOR_VERSION(old_cluster.major_version) >= 804)
-			{
-				/*
-				 * Copy/link any fsm and vm files, if they exist
-				 */
-				transfer_relfile(pageConverter, &maps[mapnum], "_fsm");
-				if (vm_crashsafe_match)
-					transfer_relfile(pageConverter, &maps[mapnum], "_vm");
-			}
-		}
-	}
-}
-
-
-/*
- * transfer_relfile()
- *
- * Copy or link file from old cluster to new one.
- */
-static void
-transfer_relfile(pageCnvCtx *pageConverter, FileNameMap *map,
-				 const char *type_suffix)
-{
-	const char *msg;
-	char		old_file[MAXPGPATH];
-	char		new_file[MAXPGPATH];
-	int			fd;
-	int			segno;
-	char		extent_suffix[65];
-
-	/*
-	 * Now copy/link any related segments as well. Remember, PG breaks large
-	 * files into 1GB segments; the first segment has no extension, and
-	 * subsequent segments are named relfilenode.1, relfilenode.2, etc.
-	 */
-	for (segno = 0;; segno++)
-	{
-		if (segno == 0)
-			extent_suffix[0] = '\0';
-		else
-			snprintf(extent_suffix, sizeof(extent_suffix), ".%d", segno);
-
-		snprintf(old_file, sizeof(old_file), "%s%s/%u/%u%s%s",
-				 map->old_tablespace,
-				 map->old_tablespace_suffix,
-				 map->old_db_oid,
-				 map->old_relfilenode,
-				 type_suffix,
-				 extent_suffix);
-		snprintf(new_file, sizeof(new_file), "%s%s/%u/%u%s%s",
-				 map->new_tablespace,
-				 map->new_tablespace_suffix,
-				 map->new_db_oid,
-				 map->new_relfilenode,
-				 type_suffix,
-				 extent_suffix);
-
-		/* Is it an extent, fsm, or vm file? */
-		if (type_suffix[0] != '\0' || segno != 0)
-		{
-			/* Did file open fail? */
-			if ((fd = open(old_file, O_RDONLY, 0)) == -1)
-			{
-				/* File does not exist?  That's OK, just return */
-				if (errno == ENOENT)
-					return;
-				else
-					pg_fatal("error while checking for file existence \"%s.%s\" (\"%s\" to \"%s\"): %s\n",
-							 map->nspname, map->relname, old_file, new_file,
-							 getErrorText(errno));
-			}
-			close(fd);
-		}
-
-		unlink(new_file);
-
-		/* Copying files might take some time, so give feedback. */
-		pg_log(PG_STATUS, "%s", old_file);
-
-		if ((user_opts.transfer_mode == TRANSFER_MODE_LINK) && (pageConverter != NULL))
-			pg_fatal("This upgrade requires page-by-page conversion, "
-					 "you must use copy mode instead of link mode.\n");
-
-		if (user_opts.transfer_mode == TRANSFER_MODE_COPY)
-		{
-			pg_log(PG_VERBOSE, "copying \"%s\" to \"%s\"\n", old_file, new_file);
-
-			if ((msg = copyAndUpdateFile(pageConverter, old_file, new_file, true)) != NULL)
-				pg_fatal("error while copying relation \"%s.%s\" (\"%s\" to \"%s\"): %s\n",
-						 map->nspname, map->relname, old_file, new_file, msg);
-		}
-		else
-		{
-			pg_log(PG_VERBOSE, "linking \"%s\" to \"%s\"\n", old_file, new_file);
-
-			if ((msg = linkAndUpdateFile(pageConverter, old_file, new_file)) != NULL)
-				pg_fatal("error while creating link for relation \"%s.%s\" (\"%s\" to \"%s\"): %s\n",
-						 map->nspname, map->relname, old_file, new_file, msg);
-		}
-	}
-
-	return;
-}
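The path construction in transfer_relfile() above can be sketched in isolation as follows. This is only an illustration of the naming scheme described in the comment (first segment has no extension, later segments get ".N", and fork files carry a "_fsm" or "_vm" suffix before the segment number). The helper name build_segment_path() is hypothetical and not part of pg_upgrade, and the example tablespace path, "/base" suffix, and OIDs in the usage note are assumed values.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper (not in pg_upgrade): build the on-disk name of one
 * relation segment using the same format string as transfer_relfile().
 *
 * type_suffix is "" for the main fork, or "_fsm" / "_vm" for the other forks.
 * segno 0 has no extension; segment N > 0 is named "<relfilenode>.N".
 */
static void
build_segment_path(char *dst, size_t dstlen,
				   const char *tablespace, const char *tablespace_suffix,
				   unsigned int db_oid, unsigned int relfilenode,
				   const char *type_suffix, int segno)
{
	char		extent_suffix[65];

	if (segno == 0)
		extent_suffix[0] = '\0';
	else
		snprintf(extent_suffix, sizeof(extent_suffix), ".%d", segno);

	snprintf(dst, dstlen, "%s%s/%u/%u%s%s",
			 tablespace, tablespace_suffix,
			 db_oid, relfilenode,
			 type_suffix, extent_suffix);
}
```

For instance, with an assumed tablespace of "/data", suffix "/base", database OID 16384 and relfilenode 24576, the main fork's first segment would be "/data/base/16384/24576" and the third segment of its free space map would be "/data/base/16384/24576_fsm.2".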
diff --git a/contrib/pg_upgrade/server.c b/contrib/pg_upgrade/server.c
deleted file mode 100644
index c5f66f0..0000000
--- a/contrib/pg_upgrade/server.c
+++ /dev/null
@@ -1,350 +0,0 @@
-/*
- *	server.c
- *
- *	database server functions
- *
- *	Copyright (c) 2010-2015, PostgreSQL Global Development Group
- *	contrib/pg_upgrade/server.c
- */
-
-#include "postgres_fe.h"
-
-#include "pg_upgrade.h"
-
-
-static PGconn *get_db_conn(ClusterInfo *cluster, const char *db_name);
-
-
-/*
- * connectToServer()
- *
- *	Connects to the desired database on the designated server.
- *	If the connection attempt fails, this function logs an error
- *	message and calls exit() to kill the program.
- */
-PGconn *
-connectToServer(ClusterInfo *cluster, const char *db_name)
-{
-	PGconn	   *conn = get_db_conn(cluster, db_name);
-
-	if (conn == NULL || PQstatus(conn) != CONNECTION_OK)
-	{
-		pg_log(PG_REPORT, "connection to database failed: %s\n",
-			   PQerrorMessage(conn));
-
-		if (conn)
-			PQfinish(conn);
-
-		printf("Failure, exiting\n");
-		exit(1);
-	}
-
-	return conn;
-}
-
-
-/*
- * get_db_conn()
- *
- * get database connection, using named database + standard params for cluster
- */
-static PGconn *
-get_db_conn(ClusterInfo *cluster, const char *db_name)
-{
-	char		conn_opts[2 * NAMEDATALEN + MAXPGPATH + 100];
-
-	if (cluster->sockdir)
-		snprintf(conn_opts, sizeof(conn_opts),
-				 "dbname = '%s' user = '%s' host = '%s' port = %d",
-				 db_name, os_info.user, cluster->sockdir, cluster->port);
-	else
-		snprintf(conn_opts, sizeof(conn_opts),
-				 "dbname = '%s' user = '%s' port = %d",
-				 db_name, os_info.user, cluster->port);
-
-	return PQconnectdb(conn_opts);
-}
-
-
-/*
- * cluster_conn_opts()
- *
- * Return standard command-line options for connecting to this cluster when
- * using psql, pg_dump, etc.  Ideally this would match what get_db_conn()
- * sets, but the utilities we need aren't very consistent about the treatment
- * of database name options, so we leave that out.
- *
- * Note result is in static storage, so use it right away.
- */
-char *
-cluster_conn_opts(ClusterInfo *cluster)
-{
-	static char conn_opts[MAXPGPATH + NAMEDATALEN + 100];
-
-	if (cluster->sockdir)
-		snprintf(conn_opts, sizeof(conn_opts),
-				 "--host \"%s\" --port %d --username \"%s\"",
-				 cluster->sockdir, cluster->port, os_info.user);
-	else
-		snprintf(conn_opts, sizeof(conn_opts),
-				 "--port %d --username \"%s\"",
-				 cluster->port, os_info.user);
-
-	return conn_opts;
-}
-
-
-/*
- * executeQueryOrDie()
- *
- *	Formats a query string from the given arguments and executes the
- *	resulting query.  If the query fails, this function logs an error
- *	message and calls exit() to kill the program.
- */
-PGresult *
-executeQueryOrDie(PGconn *conn, const char *fmt,...)
-{
-	static char query[QUERY_ALLOC];
-	va_list		args;
-	PGresult   *result;
-	ExecStatusType status;
-
-	va_start(args, fmt);
-	vsnprintf(query, sizeof(query), fmt, args);
-	va_end(args);
-
-	pg_log(PG_VERBOSE, "executing: %s\n", query);
-	result = PQexec(conn, query);
-	status = PQresultStatus(result);
-
-	if ((status != PGRES_TUPLES_OK) && (status != PGRES_COMMAND_OK))
-	{
-		pg_log(PG_REPORT, "SQL command failed\n%s\n%s\n", query,
-			   PQerrorMessage(conn));
-		PQclear(result);
-		PQfinish(conn);
-		printf("Failure, exiting\n");
-		exit(1);
-	}
-	else
-		return result;
-}
-
-
-/*
- * get_major_server_version()
- *
- * gets the version (in unsigned int form) for the given datadir. Assumes
- * that datadir is an absolute path to a valid pgdata directory. The version
- * is retrieved by reading the PG_VERSION file.
- */
-uint32
-get_major_server_version(ClusterInfo *cluster)
-{
-	FILE	   *version_fd;
-	char		ver_filename[MAXPGPATH];
-	int			integer_version = 0;
-	int			fractional_version = 0;
-
-	snprintf(ver_filename, sizeof(ver_filename), "%s/PG_VERSION",
-			 cluster->pgdata);
-	if ((version_fd = fopen(ver_filename, "r")) == NULL)
-		pg_fatal("could not open version file: %s\n", ver_filename);
-
-	if (fscanf(version_fd, "%63s", cluster->major_version_str) == 0 ||
-		sscanf(cluster->major_version_str, "%d.%d", &integer_version,
-			   &fractional_version) != 2)
-		pg_fatal("could not get version from %s\n", cluster->pgdata);
-
-	fclose(version_fd);
-
-	return (100 * integer_version + fractional_version) * 100;
-}
-
-
-static void
-stop_postmaster_atexit(void)
-{
-	stop_postmaster(true);
-}
-
-
-bool
-start_postmaster(ClusterInfo *cluster, bool throw_error)
-{
-	char		cmd[MAXPGPATH * 4 + 1000];
-	PGconn	   *conn;
-	static bool exit_hook_registered = false;
-	bool		pg_ctl_return = false;
-	char		socket_string[MAXPGPATH + 200];
-
-	if (!exit_hook_registered)
-	{
-		atexit(stop_postmaster_atexit);
-		exit_hook_registered = true;
-	}
-
-	socket_string[0] = '\0';
-
-#ifdef HAVE_UNIX_SOCKETS
-	/* prevent TCP/IP connections, restrict socket access */
-	strcat(socket_string,
-		   " -c listen_addresses='' -c unix_socket_permissions=0700");
-
-	/* Have a sockdir?	Tell the postmaster. */
-	if (cluster->sockdir)
-		snprintf(socket_string + strlen(socket_string),
-				 sizeof(socket_string) - strlen(socket_string),
-				 " -c %s='%s'",
-				 (GET_MAJOR_VERSION(cluster->major_version) < 903) ?
-				 "unix_socket_directory" : "unix_socket_directories",
-				 cluster->sockdir);
-#endif
-
-	/*
-	 * Since PG 9.1, we have used -b to disable autovacuum.  For earlier
-	 * releases, setting autovacuum=off disables cleanup vacuum and analyze,
-	 * but freeze vacuums can still happen, so we set autovacuum_freeze_max_age
-	 * to its maximum.  (autovacuum_multixact_freeze_max_age was introduced
-	 * after 9.1, so there is no need to set that.)  We assume all datfrozenxid
-	 * and relfrozenxid values are less than a gap of 2000000000 from the current
-	 * xid counter, so autovacuum will not touch them.
-	 *
-	 * Turn off durability requirements to improve object creation speed, and
-	 * we only modify the new cluster, so only use it there.  If there is a
-	 * crash, the new cluster has to be recreated anyway.  fsync=off is a big
-	 * win on ext4.
-	 */
-	snprintf(cmd, sizeof(cmd),
-		  "\"%s/pg_ctl\" -w -l \"%s\" -D \"%s\" -o \"-p %d%s%s %s%s\" start",
-		  cluster->bindir, SERVER_LOG_FILE, cluster->pgconfig, cluster->port,
-			 (cluster->controldata.cat_ver >=
-			  BINARY_UPGRADE_SERVER_FLAG_CAT_VER) ? " -b" :
-			 " -c autovacuum=off -c autovacuum_freeze_max_age=2000000000",
-			 (cluster == &new_cluster) ?
-	  " -c synchronous_commit=off -c fsync=off -c full_page_writes=off" : "",
-			 cluster->pgopts ? cluster->pgopts : "", socket_string);
-
-	/*
-	 * Don't throw an error right away, let connecting throw the error because
-	 * it might supply a reason for the failure.
-	 */
-	pg_ctl_return = exec_prog(SERVER_START_LOG_FILE,
-	/* pass both file names if they differ */
-							  (strcmp(SERVER_LOG_FILE,
-									  SERVER_START_LOG_FILE) != 0) ?
-							  SERVER_LOG_FILE : NULL,
-							  false,
-							  "%s", cmd);
-
-	/* Did it fail and we are just testing if the server could be started? */
-	if (!pg_ctl_return && !throw_error)
-		return false;
-
-	/*
-	 * We set this here to make sure atexit() shuts down the server, but only
-	 * if we started the server successfully.  We do it before checking for
-	 * connectivity in case the server started but there is a connectivity
-	 * failure.  If pg_ctl did not return success, we will exit below.
-	 *
-	 * Pre-9.1 servers do not have PQping(), so we could be leaving the server
-	 * running if authentication was misconfigured, so someday we might want
-	 * to be more aggressive about doing server shutdowns even if pg_ctl
-	 * fails, but now (2013-08-14) it seems prudent to be cautious.  We don't
-	 * want to shutdown a server that might have been accidentally started
-	 * during the upgrade.
-	 */
-	if (pg_ctl_return)
-		os_info.running_cluster = cluster;
-
-	/*
-	 * pg_ctl -w might have failed because the server couldn't be started, or
-	 * there might have been a connection problem in _checking_ if the server
-	 * has started.  Therefore, even if pg_ctl failed, we continue and test
-	 * for connectivity in case we get a connection reason for the failure.
-	 */
-	if ((conn = get_db_conn(cluster, "template1")) == NULL ||
-		PQstatus(conn) != CONNECTION_OK)
-	{
-		pg_log(PG_REPORT, "\nconnection to database failed: %s\n",
-			   PQerrorMessage(conn));
-		if (conn)
-			PQfinish(conn);
-		pg_fatal("could not connect to %s postmaster started with the command:\n"
-				 "%s\n",
-				 CLUSTER_NAME(cluster), cmd);
-	}
-	PQfinish(conn);
-
-	/*
-	 * If pg_ctl failed, and the connection didn't fail, and throw_error is
-	 * enabled, fail now.  This could happen if the server was already
-	 * running.
-	 */
-	if (!pg_ctl_return)
-		pg_fatal("pg_ctl failed to start the %s server, or connection failed\n",
-				 CLUSTER_NAME(cluster));
-
-	return true;
-}
-
-
-void
-stop_postmaster(bool fast)
-{
-	ClusterInfo *cluster;
-
-	if (os_info.running_cluster == &old_cluster)
-		cluster = &old_cluster;
-	else if (os_info.running_cluster == &new_cluster)
-		cluster = &new_cluster;
-	else
-		return;					/* no cluster running */
-
-	exec_prog(SERVER_STOP_LOG_FILE, NULL, !fast,
-			  "\"%s/pg_ctl\" -w -D \"%s\" -o \"%s\" %s stop",
-			  cluster->bindir, cluster->pgconfig,
-			  cluster->pgopts ? cluster->pgopts : "",
-			  fast ? "-m fast" : "");
-
-	os_info.running_cluster = NULL;
-}
-
-
-/*
- * check_pghost_envvar()
- *
- * Tests that PGHOST does not point to a non-local server
- */
-void
-check_pghost_envvar(void)
-{
-	PQconninfoOption *option;
-	PQconninfoOption *start;
-
-	/* Get valid libpq env vars from the PQconndefaults function */
-
-	start = PQconndefaults();
-
-	if (!start)
-		pg_fatal("out of memory\n");
-
-	for (option = start; option->keyword != NULL; option++)
-	{
-		if (option->envvar && (strcmp(option->envvar, "PGHOST") == 0 ||
-							   strcmp(option->envvar, "PGHOSTADDR") == 0))
-		{
-			const char *value = getenv(option->envvar);
-
-			if (value && strlen(value) > 0 &&
-			/* check for 'local' host values */
-				(strcmp(value, "localhost") != 0 && strcmp(value, "127.0.0.1") != 0 &&
-				 strcmp(value, "::1") != 0 && value[0] != '/'))
-				pg_fatal("libpq environment variable %s has a non-local server value: %s\n",
-						 option->envvar, value);
-		}
-	}
-
-	/* Free the memory that libpq allocated on our behalf */
-	PQconninfoFree(start);
-}
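As a reading aid for the version checks in these files: get_major_server_version() encodes a two-part "X.Y" version string as (100 * X + Y) * 100, and the GET_MAJOR_VERSION() macro from pg_upgrade.h divides that by 100, which is why comparisons such as ">= 804" or "< 903" appear in the code. The sketch below reproduces just that arithmetic. The function parse_major_version() is a hypothetical stand-in that parses a string instead of reading the PG_VERSION file, and it returns an error code rather than calling pg_fatal().

```c
#include <assert.h>
#include <stdio.h>

/* Same definition as in pg_upgrade.h */
#define GET_MAJOR_VERSION(v)	((v) / 100)

/*
 * Hypothetical stand-in for get_major_server_version(): parse "X.Y" and
 * store (100 * X + Y) * 100 in *version.  Returns 0 on success, -1 if the
 * string does not contain two dot-separated integers.
 */
static int
parse_major_version(const char *str, unsigned int *version)
{
	int			integer_version = 0;
	int			fractional_version = 0;

	if (sscanf(str, "%d.%d", &integer_version, &fractional_version) != 2)
		return -1;

	*version = (unsigned int) ((100 * integer_version + fractional_version) * 100);
	return 0;
}
```

With this encoding "9.4" becomes 90400, and GET_MAJOR_VERSION(90400) is 904, so a test like GET_MAJOR_VERSION(v) >= 804 reads as "8.4 or later".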
diff --git a/contrib/pg_upgrade/tablespace.c b/contrib/pg_upgrade/tablespace.c
deleted file mode 100644
index eecdf4b..0000000
--- a/contrib/pg_upgrade/tablespace.c
+++ /dev/null
@@ -1,124 +0,0 @@
-/*
- *	tablespace.c
- *
- *	tablespace functions
- *
- *	Copyright (c) 2010-2015, PostgreSQL Global Development Group
- *	contrib/pg_upgrade/tablespace.c
- */
-
-#include "postgres_fe.h"
-
-#include "pg_upgrade.h"
-
-#include <sys/types.h>
-
-static void get_tablespace_paths(void);
-static void set_tablespace_directory_suffix(ClusterInfo *cluster);
-
-
-void
-init_tablespaces(void)
-{
-	get_tablespace_paths();
-
-	set_tablespace_directory_suffix(&old_cluster);
-	set_tablespace_directory_suffix(&new_cluster);
-
-	if (os_info.num_old_tablespaces > 0 &&
-	strcmp(old_cluster.tablespace_suffix, new_cluster.tablespace_suffix) == 0)
-		pg_fatal("Cannot upgrade to/from the same system catalog version when\n"
-				 "using tablespaces.\n");
-}
-
-
-/*
- * get_tablespace_paths()
- *
- * Scans pg_tablespace and returns a malloc'ed array of all tablespace
- * paths. Its the caller's responsibility to free the array.
- */
-static void
-get_tablespace_paths(void)
-{
-	PGconn	   *conn = connectToServer(&old_cluster, "template1");
-	PGresult   *res;
-	int			tblnum;
-	int			i_spclocation;
-	char		query[QUERY_ALLOC];
-
-	snprintf(query, sizeof(query),
-			 "SELECT	%s "
-			 "FROM	pg_catalog.pg_tablespace "
-			 "WHERE	spcname != 'pg_default' AND "
-			 "		spcname != 'pg_global'",
-	/* 9.2 removed the spclocation column */
-			 (GET_MAJOR_VERSION(old_cluster.major_version) <= 901) ?
-	"spclocation" : "pg_catalog.pg_tablespace_location(oid) AS spclocation");
-
-	res = executeQueryOrDie(conn, "%s", query);
-
-	if ((os_info.num_old_tablespaces = PQntuples(res)) != 0)
-		os_info.old_tablespaces = (char **) pg_malloc(
-							   os_info.num_old_tablespaces * sizeof(char *));
-	else
-		os_info.old_tablespaces = NULL;
-
-	i_spclocation = PQfnumber(res, "spclocation");
-
-	for (tblnum = 0; tblnum < os_info.num_old_tablespaces; tblnum++)
-	{
-		struct stat statBuf;
-
-		os_info.old_tablespaces[tblnum] = pg_strdup(
-									 PQgetvalue(res, tblnum, i_spclocation));
-
-		/*
-		 * Check that the tablespace path exists and is a directory.
-		 * Effectively, this is checking only for tables/indexes in
-		 * non-existent tablespace directories.  Databases located in
-		 * non-existent tablespaces already throw a backend error.
-		 * Non-existent tablespace directories can occur when a data directory
-		 * that contains user tablespaces is moved as part of pg_upgrade
-		 * preparation and the symbolic links are not updated.
-		 */
-		if (stat(os_info.old_tablespaces[tblnum], &statBuf) != 0)
-		{
-			if (errno == ENOENT)
-				report_status(PG_FATAL,
-							  "tablespace directory \"%s\" does not exist\n",
-							  os_info.old_tablespaces[tblnum]);
-			else
-				report_status(PG_FATAL,
-						   "cannot stat() tablespace directory \"%s\": %s\n",
-					   os_info.old_tablespaces[tblnum], getErrorText(errno));
-		}
-		if (!S_ISDIR(statBuf.st_mode))
-			report_status(PG_FATAL,
-						  "tablespace path \"%s\" is not a directory\n",
-						  os_info.old_tablespaces[tblnum]);
-	}
-
-	PQclear(res);
-
-	PQfinish(conn);
-
-	return;
-}
-
-
-static void
-set_tablespace_directory_suffix(ClusterInfo *cluster)
-{
-	if (GET_MAJOR_VERSION(cluster->major_version) <= 804)
-		cluster->tablespace_suffix = pg_strdup("");
-	else
-	{
-		/* This cluster has a version-specific subdirectory */
-
-		/* The leading slash is needed to start a new directory. */
-		cluster->tablespace_suffix = psprintf("/PG_%s_%d",
-											  cluster->major_version_str,
-											  cluster->controldata.cat_ver);
-	}
-}
diff --git a/contrib/pg_upgrade/test.sh b/contrib/pg_upgrade/test.sh
deleted file mode 100644
index 75b6357..0000000
--- a/contrib/pg_upgrade/test.sh
+++ /dev/null
@@ -1,225 +0,0 @@
-#!/bin/sh
-
-# contrib/pg_upgrade/test.sh
-#
-# Test driver for pg_upgrade.  Initializes a new database cluster,
-# runs the regression tests (to put in some data), runs pg_dumpall,
-# runs pg_upgrade, runs pg_dumpall again, compares the dumps.
-#
-# Portions Copyright (c) 1996-2015, PostgreSQL Global Development Group
-# Portions Copyright (c) 1994, Regents of the University of California
-
-set -e
-
-: ${MAKE=make}
-
-# Guard against parallel make issues (see comments in pg_regress.c)
-unset MAKEFLAGS
-unset MAKELEVEL
-
-# Run a given "initdb" binary and overlay the regression testing
-# authentication configuration.
-standard_initdb() {
-	"$1" -N
-	../../src/test/regress/pg_regress --config-auth "$PGDATA"
-}
-
-# Establish how the server will listen for connections
-testhost=`uname -s`
-
-case $testhost in
-	MINGW*)
-		LISTEN_ADDRESSES="localhost"
-		PGHOST=localhost
-		;;
-	*)
-		LISTEN_ADDRESSES=""
-		# Select a socket directory.  The algorithm is from the "configure"
-		# script; the outcome mimics pg_regress.c:make_temp_sockdir().
-		PGHOST=$PG_REGRESS_SOCK_DIR
-		if [ "x$PGHOST" = x ]; then
-			{
-				dir=`(umask 077 &&
-					  mktemp -d /tmp/pg_upgrade_check-XXXXXX) 2>/dev/null` &&
-				[ -d "$dir" ]
-			} ||
-			{
-				dir=/tmp/pg_upgrade_check-$$-$RANDOM
-				(umask 077 && mkdir "$dir")
-			} ||
-			{
-				echo "could not create socket temporary directory in \"/tmp\""
-				exit 1
-			}
-
-			PGHOST=$dir
-			trap 'rm -rf "$PGHOST"' 0
-			trap 'exit 3' 1 2 13 15
-		fi
-		;;
-esac
-
-POSTMASTER_OPTS="-F -c listen_addresses=$LISTEN_ADDRESSES -k \"$PGHOST\""
-export PGHOST
-
-temp_root=$PWD/tmp_check
-
-if [ "$1" = '--install' ]; then
-	temp_install=$temp_root/install
-	bindir=$temp_install/$bindir
-	libdir=$temp_install/$libdir
-
-	"$MAKE" -s -C ../.. install DESTDIR="$temp_install"
-	"$MAKE" -s -C ../pg_upgrade_support install DESTDIR="$temp_install"
-	"$MAKE" -s -C . install DESTDIR="$temp_install"
-
-	# platform-specific magic to find the shared libraries; see pg_regress.c
-	LD_LIBRARY_PATH=$libdir:$LD_LIBRARY_PATH
-	export LD_LIBRARY_PATH
-	DYLD_LIBRARY_PATH=$libdir:$DYLD_LIBRARY_PATH
-	export DYLD_LIBRARY_PATH
-	LIBPATH=$libdir:$LIBPATH
-	export LIBPATH
-	PATH=$libdir:$PATH
-
-	# We need to make it use psql from our temporary installation,
-	# because otherwise the installcheck run below would try to
-	# use psql from the proper installation directory, which might
-	# be outdated or missing. But don't override anything else that's
-	# already in EXTRA_REGRESS_OPTS.
-	EXTRA_REGRESS_OPTS="$EXTRA_REGRESS_OPTS --psqldir='$bindir'"
-	export EXTRA_REGRESS_OPTS
-fi
-
-: ${oldbindir=$bindir}
-
-: ${oldsrc=../..}
-oldsrc=`cd "$oldsrc" && pwd`
-newsrc=`cd ../.. && pwd`
-
-PATH=$bindir:$PATH
-export PATH
-
-BASE_PGDATA=$temp_root/data
-PGDATA="$BASE_PGDATA.old"
-export PGDATA
-rm -rf "$BASE_PGDATA" "$PGDATA"
-
-logdir=$PWD/log
-rm -rf "$logdir"
-mkdir "$logdir"
-
-# Clear out any environment vars that might cause libpq to connect to
-# the wrong postmaster (cf pg_regress.c)
-#
-# Some shells, such as NetBSD's, return non-zero from unset if the variable
-# is already unset. Since we are operating under 'set -e', this causes the
-# script to fail. To guard against this, set them all to an empty string first.
-PGDATABASE="";        unset PGDATABASE
-PGUSER="";            unset PGUSER
-PGSERVICE="";         unset PGSERVICE
-PGSSLMODE="";         unset PGSSLMODE
-PGREQUIRESSL="";      unset PGREQUIRESSL
-PGCONNECT_TIMEOUT=""; unset PGCONNECT_TIMEOUT
-PGHOSTADDR="";        unset PGHOSTADDR
-
-# Select a non-conflicting port number, similarly to pg_regress.c
-PG_VERSION_NUM=`grep '#define PG_VERSION_NUM' "$newsrc"/src/include/pg_config.h | awk '{print $3}'`
-PGPORT=`expr $PG_VERSION_NUM % 16384 + 49152`
-export PGPORT
-
-i=0
-while psql -X postgres </dev/null 2>/dev/null
-do
-	i=`expr $i + 1`
-	if [ $i -eq 16 ]
-	then
-		echo port $PGPORT apparently in use
-		exit 1
-	fi
-	PGPORT=`expr $PGPORT + 1`
-	export PGPORT
-done
-
-# buildfarm may try to override port via EXTRA_REGRESS_OPTS ...
-EXTRA_REGRESS_OPTS="$EXTRA_REGRESS_OPTS --port=$PGPORT"
-export EXTRA_REGRESS_OPTS
-
-# enable echo so the user can see what is being executed
-set -x
-
-standard_initdb "$oldbindir"/initdb
-"$oldbindir"/pg_ctl start -l "$logdir/postmaster1.log" -o "$POSTMASTER_OPTS" -w
-if "$MAKE" -C "$oldsrc" installcheck; then
-	pg_dumpall -f "$temp_root"/dump1.sql || pg_dumpall1_status=$?
-	if [ "$newsrc" != "$oldsrc" ]; then
-		oldpgversion=`psql -A -t -d regression -c "SHOW server_version_num"`
-		fix_sql=""
-		case $oldpgversion in
-			804??)
-				fix_sql="UPDATE pg_proc SET probin = replace(probin::text, '$oldsrc', '$newsrc')::bytea WHERE probin LIKE '$oldsrc%'; DROP FUNCTION public.myfunc(integer);"
-				;;
-			900??)
-				fix_sql="SET bytea_output TO escape; UPDATE pg_proc SET probin = replace(probin::text, '$oldsrc', '$newsrc')::bytea WHERE probin LIKE '$oldsrc%';"
-				;;
-			901??)
-				fix_sql="UPDATE pg_proc SET probin = replace(probin, '$oldsrc', '$newsrc') WHERE probin LIKE '$oldsrc%';"
-				;;
-		esac
-		psql -d regression -c "$fix_sql;" || psql_fix_sql_status=$?
-
-		mv "$temp_root"/dump1.sql "$temp_root"/dump1.sql.orig
-		sed "s;$oldsrc;$newsrc;g" "$temp_root"/dump1.sql.orig >"$temp_root"/dump1.sql
-	fi
-else
-	make_installcheck_status=$?
-fi
-"$oldbindir"/pg_ctl -m fast stop
-if [ -n "$make_installcheck_status" ]; then
-	exit 1
-fi
-if [ -n "$psql_fix_sql_status" ]; then
-	exit 1
-fi
-if [ -n "$pg_dumpall1_status" ]; then
-	echo "pg_dumpall of pre-upgrade database cluster failed"
-	exit 1
-fi
-
-PGDATA=$BASE_PGDATA
-
-standard_initdb 'initdb'
-
-pg_upgrade $PG_UPGRADE_OPTS -d "${PGDATA}.old" -D "${PGDATA}" -b "$oldbindir" -B "$bindir" -p "$PGPORT" -P "$PGPORT"
-
-pg_ctl start -l "$logdir/postmaster2.log" -o "$POSTMASTER_OPTS" -w
-
-case $testhost in
-	MINGW*)	cmd /c analyze_new_cluster.bat ;;
-	*)		sh ./analyze_new_cluster.sh ;;
-esac
-
-pg_dumpall -f "$temp_root"/dump2.sql || pg_dumpall2_status=$?
-pg_ctl -m fast stop
-
-# no need to echo commands anymore
-set +x
-echo
-
-if [ -n "$pg_dumpall2_status" ]; then
-	echo "pg_dumpall of post-upgrade database cluster failed"
-	exit 1
-fi
-
-case $testhost in
-	MINGW*)	cmd /c delete_old_cluster.bat ;;
-	*)	    sh ./delete_old_cluster.sh ;;
-esac
-
-if diff -q "$temp_root"/dump1.sql "$temp_root"/dump2.sql; then
-	echo PASSED
-	exit 0
-else
-	echo "dumps were not identical"
-	exit 1
-fi
diff --git a/contrib/pg_upgrade/util.c b/contrib/pg_upgrade/util.c
deleted file mode 100644
index 6184cee..0000000
--- a/contrib/pg_upgrade/util.c
+++ /dev/null
@@ -1,298 +0,0 @@
-/*
- *	util.c
- *
- *	utility functions
- *
- *	Copyright (c) 2010-2015, PostgreSQL Global Development Group
- *	contrib/pg_upgrade/util.c
- */
-
-#include "postgres_fe.h"
-
-#include "common/username.h"
-#include "pg_upgrade.h"
-
-#include <signal.h>
-
-
-LogOpts		log_opts;
-
-static void pg_log_v(eLogType type, const char *fmt, va_list ap) pg_attribute_printf(2, 0);
-
-
-/*
- * report_status()
- *
- *	Displays the result of an operation (ok, failed, error message,...)
- */
-void
-report_status(eLogType type, const char *fmt,...)
-{
-	va_list		args;
-	char		message[MAX_STRING];
-
-	va_start(args, fmt);
-	vsnprintf(message, sizeof(message), fmt, args);
-	va_end(args);
-
-	pg_log(type, "%s\n", message);
-}
-
-
-/* force blank output for progress display */
-void
-end_progress_output(void)
-{
-	/*
-	 * In case nothing printed; pass a space so gcc doesn't complain about
-	 * empty format string.
-	 */
-	prep_status(" ");
-}
-
-
-/*
- * prep_status
- *
- *	Displays a message that describes an operation we are about to begin.
- *	We pad the message out to MESSAGE_WIDTH characters so that all of the "ok" and
- *	"failed" indicators line up nicely.
- *
- *	A typical sequence would look like this:
- *		prep_status("about to flarb the next %d files", fileCount );
- *
- *		if(( message = flarbFiles(fileCount)) == NULL)
- *		  report_status(PG_REPORT, "ok" );
- *		else
- *		  pg_log(PG_FATAL, "failed - %s\n", message );
- */
-void
-prep_status(const char *fmt,...)
-{
-	va_list		args;
-	char		message[MAX_STRING];
-
-	va_start(args, fmt);
-	vsnprintf(message, sizeof(message), fmt, args);
-	va_end(args);
-
-	if (strlen(message) > 0 && message[strlen(message) - 1] == '\n')
-		pg_log(PG_REPORT, "%s", message);
-	else
-		/* trim strings that don't end in a newline */
-		pg_log(PG_REPORT, "%-*s", MESSAGE_WIDTH, message);
-}
-
-
-static void
-pg_log_v(eLogType type, const char *fmt, va_list ap)
-{
-	char		message[QUERY_ALLOC];
-
-	vsnprintf(message, sizeof(message), fmt, ap);
-
-	/* PG_VERBOSE and PG_STATUS are only output in verbose mode */
-	/* fopen() on log_opts.internal might have failed, so check it */
-	if (((type != PG_VERBOSE && type != PG_STATUS) || log_opts.verbose) &&
-		log_opts.internal != NULL)
-	{
-		if (type == PG_STATUS)
-			/* status messages need two leading spaces and a newline */
-			fprintf(log_opts.internal, "  %s\n", message);
-		else
-			fprintf(log_opts.internal, "%s", message);
-		fflush(log_opts.internal);
-	}
-
-	switch (type)
-	{
-		case PG_VERBOSE:
-			if (log_opts.verbose)
-				printf("%s", _(message));
-			break;
-
-		case PG_STATUS:
-			/* for output to a display, do leading truncation and append \r */
-			if (isatty(fileno(stdout)))
-				/* -2 because we use a 2-space indent */
-				printf("  %s%-*.*s\r",
-				/* prefix with "..." if we do leading truncation */
-					   strlen(message) <= MESSAGE_WIDTH - 2 ? "" : "...",
-					   MESSAGE_WIDTH - 2, MESSAGE_WIDTH - 2,
-				/* optional leading truncation */
-					   strlen(message) <= MESSAGE_WIDTH - 2 ? message :
-					   message + strlen(message) - MESSAGE_WIDTH + 3 + 2);
-			else
-				printf("  %s\n", _(message));
-			break;
-
-		case PG_REPORT:
-		case PG_WARNING:
-			printf("%s", _(message));
-			break;
-
-		case PG_FATAL:
-			printf("\n%s", _(message));
-			printf("Failure, exiting\n");
-			exit(1);
-			break;
-
-		default:
-			break;
-	}
-	fflush(stdout);
-}
-
-
-void
-pg_log(eLogType type, const char *fmt,...)
-{
-	va_list		args;
-
-	va_start(args, fmt);
-	pg_log_v(type, fmt, args);
-	va_end(args);
-}
-
-
-void
-pg_fatal(const char *fmt,...)
-{
-	va_list		args;
-
-	va_start(args, fmt);
-	pg_log_v(PG_FATAL, fmt, args);
-	va_end(args);
-	printf("Failure, exiting\n");
-	exit(1);
-}
-
-
-void
-check_ok(void)
-{
-	/* all seems well */
-	report_status(PG_REPORT, "ok");
-	fflush(stdout);
-}
-
-
-/*
- * quote_identifier()
- *		Properly double-quote a SQL identifier.
- *
- * The result should be pg_free'd, but most callers don't bother because
- * memory leakage is not a big deal in this program.
- */
-char *
-quote_identifier(const char *s)
-{
-	char	   *result = pg_malloc(strlen(s) * 2 + 3);
-	char	   *r = result;
-
-	*r++ = '"';
-	while (*s)
-	{
-		if (*s == '"')
-			*r++ = *s;
-		*r++ = *s;
-		s++;
-	}
-	*r++ = '"';
-	*r++ = '\0';
-
-	return result;
-}
-
-
-/*
- * get_user_info()
- */
-int
-get_user_info(char **user_name_p)
-{
-	int			user_id;
-	const char *user_name;
-	char	   *errstr;
-
-#ifndef WIN32
-	user_id = geteuid();
-#else
-	user_id = 1;
-#endif
-
-	user_name = get_user_name(&errstr);
-	if (!user_name)
-		pg_fatal("%s\n", errstr);
-
-	/* make a copy */
-	*user_name_p = pg_strdup(user_name);
-
-	return user_id;
-}
-
-
-/*
- * getErrorText()
- *
- *	Returns the text of the error message for the given error number
- *
- *	This feature is factored into a separate function because it is
- *	system-dependent.
- */
-const char *
-getErrorText(int errNum)
-{
-#ifdef WIN32
-	_dosmaperr(GetLastError());
-#endif
-	return pg_strdup(strerror(errNum));
-}
-
-
-/*
- *	str2uint()
- *
- *	convert string to oid
- */
-unsigned int
-str2uint(const char *str)
-{
-	return strtoul(str, NULL, 10);
-}
-
-
-/*
- *	pg_putenv()
- *
- *	This is like putenv(), but takes two arguments.
- *	It also does unsetenv() if val is NULL.
- */
-void
-pg_putenv(const char *var, const char *val)
-{
-	if (val)
-	{
-#ifndef WIN32
-		char	   *envstr;
-
-		envstr = psprintf("%s=%s", var, val);
-		putenv(envstr);
-
-		/*
-		 * Do not free envstr because it becomes part of the environment on
-		 * some operating systems.  See port/unsetenv.c::unsetenv.
-		 */
-#else
-		SetEnvironmentVariableA(var, val);
-#endif
-	}
-	else
-	{
-#ifndef WIN32
-		unsetenv(var);
-#else
-		SetEnvironmentVariableA(var, "");
-#endif
-	}
-}
diff --git a/contrib/pg_upgrade/version.c b/contrib/pg_upgrade/version.c
deleted file mode 100644
index 4ae9511..0000000
--- a/contrib/pg_upgrade/version.c
+++ /dev/null
@@ -1,178 +0,0 @@
-/*
- *	version.c
- *
- *	Postgres-version-specific routines
- *
- *	Copyright (c) 2010-2015, PostgreSQL Global Development Group
- *	contrib/pg_upgrade/version.c
- */
-
-#include "postgres_fe.h"
-
-#include "pg_upgrade.h"
-
-
-
-/*
- * new_9_0_populate_pg_largeobject_metadata()
- *	new >= 9.0, old <= 8.4
- *	9.0 has a new pg_largeobject permission table
- */
-void
-new_9_0_populate_pg_largeobject_metadata(ClusterInfo *cluster, bool check_mode)
-{
-	int			dbnum;
-	FILE	   *script = NULL;
-	bool		found = false;
-	char		output_path[MAXPGPATH];
-
-	prep_status("Checking for large objects");
-
-	snprintf(output_path, sizeof(output_path), "pg_largeobject.sql");
-
-	for (dbnum = 0; dbnum < cluster->dbarr.ndbs; dbnum++)
-	{
-		PGresult   *res;
-		int			i_count;
-		DbInfo	   *active_db = &cluster->dbarr.dbs[dbnum];
-		PGconn	   *conn = connectToServer(cluster, active_db->db_name);
-
-		/* find if there are any large objects */
-		res = executeQueryOrDie(conn,
-								"SELECT count(*) "
-								"FROM	pg_catalog.pg_largeobject ");
-
-		i_count = PQfnumber(res, "count");
-		if (atoi(PQgetvalue(res, 0, i_count)) != 0)
-		{
-			found = true;
-			if (!check_mode)
-			{
-				if (script == NULL && (script = fopen_priv(output_path, "w")) == NULL)
-					pg_fatal("could not open file \"%s\": %s\n", output_path, getErrorText(errno));
-				fprintf(script, "\\connect %s\n",
-						quote_identifier(active_db->db_name));
-				fprintf(script,
-						"SELECT pg_catalog.lo_create(t.loid)\n"
-						"FROM (SELECT DISTINCT loid FROM pg_catalog.pg_largeobject) AS t;\n");
-			}
-		}
-
-		PQclear(res);
-		PQfinish(conn);
-	}
-
-	if (script)
-		fclose(script);
-
-	if (found)
-	{
-		report_status(PG_WARNING, "warning");
-		if (check_mode)
-			pg_log(PG_WARNING, "\n"
-				   "Your installation contains large objects.  The new database has an\n"
-				   "additional large object permission table.  After upgrading, you will be\n"
-				   "given a command to populate the pg_largeobject permission table with\n"
-				   "default permissions.\n\n");
-		else
-			pg_log(PG_WARNING, "\n"
-				   "Your installation contains large objects.  The new database has an\n"
-				   "additional large object permission table, so default permissions must be\n"
-				   "defined for all large objects.  The file\n"
-				   "    %s\n"
-				   "when executed by psql by the database superuser will set the default\n"
-				   "permissions.\n\n",
-				   output_path);
-	}
-	else
-		check_ok();
-}
-
-
-/*
- * old_9_3_check_for_line_data_type_usage()
- *	9.3 -> 9.4
- *	Fully implement the 'line' data type in 9.4, which previously returned
- *	"not enabled" by default and was only functionally enabled with a
- *	compile-time switch;  9.4 "line" has different binary and text
- *	representation formats;  checks tables and indexes.
- */
-void
-old_9_3_check_for_line_data_type_usage(ClusterInfo *cluster)
-{
-	int			dbnum;
-	FILE	   *script = NULL;
-	bool		found = false;
-	char		output_path[MAXPGPATH];
-
-	prep_status("Checking for invalid \"line\" user columns");
-
-	snprintf(output_path, sizeof(output_path), "tables_using_line.txt");
-
-	for (dbnum = 0; dbnum < cluster->dbarr.ndbs; dbnum++)
-	{
-		PGresult   *res;
-		bool		db_used = false;
-		int			ntups;
-		int			rowno;
-		int			i_nspname,
-					i_relname,
-					i_attname;
-		DbInfo	   *active_db = &cluster->dbarr.dbs[dbnum];
-		PGconn	   *conn = connectToServer(cluster, active_db->db_name);
-
-		res = executeQueryOrDie(conn,
-								"SELECT n.nspname, c.relname, a.attname "
-								"FROM	pg_catalog.pg_class c, "
-								"		pg_catalog.pg_namespace n, "
-								"		pg_catalog.pg_attribute a "
-								"WHERE	c.oid = a.attrelid AND "
-								"		NOT a.attisdropped AND "
-								"		a.atttypid = 'pg_catalog.line'::pg_catalog.regtype AND "
-								"		c.relnamespace = n.oid AND "
-		/* exclude possible orphaned temp tables */
-								"		n.nspname !~ '^pg_temp_' AND "
-						 "		n.nspname !~ '^pg_toast_temp_' AND "
-								"		n.nspname NOT IN ('pg_catalog', 'information_schema')");
-
-		ntups = PQntuples(res);
-		i_nspname = PQfnumber(res, "nspname");
-		i_relname = PQfnumber(res, "relname");
-		i_attname = PQfnumber(res, "attname");
-		for (rowno = 0; rowno < ntups; rowno++)
-		{
-			found = true;
-			if (script == NULL && (script = fopen_priv(output_path, "w")) == NULL)
-				pg_fatal("could not open file \"%s\": %s\n", output_path, getErrorText(errno));
-			if (!db_used)
-			{
-				fprintf(script, "Database: %s\n", active_db->db_name);
-				db_used = true;
-			}
-			fprintf(script, "  %s.%s.%s\n",
-					PQgetvalue(res, rowno, i_nspname),
-					PQgetvalue(res, rowno, i_relname),
-					PQgetvalue(res, rowno, i_attname));
-		}
-
-		PQclear(res);
-
-		PQfinish(conn);
-	}
-
-	if (script)
-		fclose(script);
-
-	if (found)
-	{
-		pg_log(PG_REPORT, "fatal\n");
-		pg_fatal("Your installation contains the \"line\" data type in user tables.  This\n"
-		"data type changed its internal and input/output format between your old\n"
-				 "and new clusters so this cluster cannot currently be upgraded.  You can\n"
-		"remove the problem tables and restart the upgrade.  A list of the problem\n"
-				 "columns is in the file:\n"
-				 "    %s\n\n", output_path);
-	}
-	else
-		check_ok();
-}
diff --git a/contrib/pg_upgrade_support/Makefile b/contrib/pg_upgrade_support/Makefile
deleted file mode 100644
index f7def16..0000000
--- a/contrib/pg_upgrade_support/Makefile
+++ /dev/null
@@ -1,16 +0,0 @@
-# contrib/pg_upgrade_support/Makefile
-
-PGFILEDESC = "pg_upgrade_support - server-side functions for pg_upgrade"
-
-MODULES = pg_upgrade_support
-
-ifdef USE_PGXS
-PG_CONFIG = pg_config
-PGXS := $(shell $(PG_CONFIG) --pgxs)
-include $(PGXS)
-else
-subdir = contrib/pg_upgrade_support
-top_builddir = ../..
-include $(top_builddir)/src/Makefile.global
-include $(top_srcdir)/contrib/contrib-global.mk
-endif
diff --git a/contrib/pg_upgrade_support/pg_upgrade_support.c b/contrib/pg_upgrade_support/pg_upgrade_support.c
deleted file mode 100644
index f477973..0000000
--- a/contrib/pg_upgrade_support/pg_upgrade_support.c
+++ /dev/null
@@ -1,190 +0,0 @@
-/*
- *	pg_upgrade_support.c
- *
- *	server-side functions to set backend global variables
- *	to control oid and relfilenode assignment, and do other special
- *	hacks needed for pg_upgrade.
- *
- *	Copyright (c) 2010-2015, PostgreSQL Global Development Group
- *	contrib/pg_upgrade_support/pg_upgrade_support.c
- */
-
-#include "postgres.h"
-
-#include "catalog/binary_upgrade.h"
-#include "catalog/namespace.h"
-#include "catalog/pg_type.h"
-#include "commands/extension.h"
-#include "miscadmin.h"
-#include "utils/array.h"
-#include "utils/builtins.h"
-
-/* THIS IS USED ONLY FOR PG >= 9.0 */
-
-#ifdef PG_MODULE_MAGIC
-PG_MODULE_MAGIC;
-#endif
-
-PG_FUNCTION_INFO_V1(set_next_pg_type_oid);
-PG_FUNCTION_INFO_V1(set_next_array_pg_type_oid);
-PG_FUNCTION_INFO_V1(set_next_toast_pg_type_oid);
-
-PG_FUNCTION_INFO_V1(set_next_heap_pg_class_oid);
-PG_FUNCTION_INFO_V1(set_next_index_pg_class_oid);
-PG_FUNCTION_INFO_V1(set_next_toast_pg_class_oid);
-
-PG_FUNCTION_INFO_V1(set_next_pg_enum_oid);
-PG_FUNCTION_INFO_V1(set_next_pg_authid_oid);
-
-PG_FUNCTION_INFO_V1(create_empty_extension);
-
-#define CHECK_IS_BINARY_UPGRADE 								\
-do { 															\
-	if (!IsBinaryUpgrade)										\
-		ereport(ERROR,											\
-				(errcode(ERRCODE_CANT_CHANGE_RUNTIME_PARAM),	\
-				 (errmsg("function can only be called when server is in binary upgrade mode")))); \
-} while (0)
-
-Datum
-set_next_pg_type_oid(PG_FUNCTION_ARGS)
-{
-	Oid			typoid = PG_GETARG_OID(0);
-
-	CHECK_IS_BINARY_UPGRADE;
-	binary_upgrade_next_pg_type_oid = typoid;
-
-	PG_RETURN_VOID();
-}
-
-Datum
-set_next_array_pg_type_oid(PG_FUNCTION_ARGS)
-{
-	Oid			typoid = PG_GETARG_OID(0);
-
-	CHECK_IS_BINARY_UPGRADE;
-	binary_upgrade_next_array_pg_type_oid = typoid;
-
-	PG_RETURN_VOID();
-}
-
-Datum
-set_next_toast_pg_type_oid(PG_FUNCTION_ARGS)
-{
-	Oid			typoid = PG_GETARG_OID(0);
-
-	CHECK_IS_BINARY_UPGRADE;
-	binary_upgrade_next_toast_pg_type_oid = typoid;
-
-	PG_RETURN_VOID();
-}
-
-Datum
-set_next_heap_pg_class_oid(PG_FUNCTION_ARGS)
-{
-	Oid			reloid = PG_GETARG_OID(0);
-
-	CHECK_IS_BINARY_UPGRADE;
-	binary_upgrade_next_heap_pg_class_oid = reloid;
-
-	PG_RETURN_VOID();
-}
-
-Datum
-set_next_index_pg_class_oid(PG_FUNCTION_ARGS)
-{
-	Oid			reloid = PG_GETARG_OID(0);
-
-	CHECK_IS_BINARY_UPGRADE;
-	binary_upgrade_next_index_pg_class_oid = reloid;
-
-	PG_RETURN_VOID();
-}
-
-Datum
-set_next_toast_pg_class_oid(PG_FUNCTION_ARGS)
-{
-	Oid			reloid = PG_GETARG_OID(0);
-
-	CHECK_IS_BINARY_UPGRADE;
-	binary_upgrade_next_toast_pg_class_oid = reloid;
-
-	PG_RETURN_VOID();
-}
-
-Datum
-set_next_pg_enum_oid(PG_FUNCTION_ARGS)
-{
-	Oid			enumoid = PG_GETARG_OID(0);
-
-	CHECK_IS_BINARY_UPGRADE;
-	binary_upgrade_next_pg_enum_oid = enumoid;
-
-	PG_RETURN_VOID();
-}
-
-Datum
-set_next_pg_authid_oid(PG_FUNCTION_ARGS)
-{
-	Oid			authoid = PG_GETARG_OID(0);
-
-	CHECK_IS_BINARY_UPGRADE;
-	binary_upgrade_next_pg_authid_oid = authoid;
-	PG_RETURN_VOID();
-}
-
-Datum
-create_empty_extension(PG_FUNCTION_ARGS)
-{
-	text	   *extName = PG_GETARG_TEXT_PP(0);
-	text	   *schemaName = PG_GETARG_TEXT_PP(1);
-	bool		relocatable = PG_GETARG_BOOL(2);
-	text	   *extVersion = PG_GETARG_TEXT_PP(3);
-	Datum		extConfig;
-	Datum		extCondition;
-	List	   *requiredExtensions;
-
-	CHECK_IS_BINARY_UPGRADE;
-
-	if (PG_ARGISNULL(4))
-		extConfig = PointerGetDatum(NULL);
-	else
-		extConfig = PG_GETARG_DATUM(4);
-
-	if (PG_ARGISNULL(5))
-		extCondition = PointerGetDatum(NULL);
-	else
-		extCondition = PG_GETARG_DATUM(5);
-
-	requiredExtensions = NIL;
-	if (!PG_ARGISNULL(6))
-	{
-		ArrayType  *textArray = PG_GETARG_ARRAYTYPE_P(6);
-		Datum	   *textDatums;
-		int			ndatums;
-		int			i;
-
-		deconstruct_array(textArray,
-						  TEXTOID, -1, false, 'i',
-						  &textDatums, NULL, &ndatums);
-		for (i = 0; i < ndatums; i++)
-		{
-			text	   *txtname = DatumGetTextPP(textDatums[i]);
-			char	   *extName = text_to_cstring(txtname);
-			Oid			extOid = get_extension_oid(extName, false);
-
-			requiredExtensions = lappend_oid(requiredExtensions, extOid);
-		}
-	}
-
-	InsertExtensionTuple(text_to_cstring(extName),
-						 GetUserId(),
-					   get_namespace_oid(text_to_cstring(schemaName), false),
-						 relocatable,
-						 text_to_cstring(extVersion),
-						 extConfig,
-						 extCondition,
-						 requiredExtensions);
-
-	PG_RETURN_VOID();
-}
diff --git a/contrib/postgres_fdw/deparse.c b/contrib/postgres_fdw/deparse.c
index 94fab18..5ec3d89 100644
--- a/contrib/postgres_fdw/deparse.c
+++ b/contrib/postgres_fdw/deparse.c
@@ -44,8 +44,11 @@
 #include "catalog/pg_proc.h"
 #include "catalog/pg_type.h"
 #include "commands/defrem.h"
+#include "nodes/makefuncs.h"
 #include "nodes/nodeFuncs.h"
+#include "nodes/plannodes.h"
 #include "optimizer/clauses.h"
+#include "optimizer/prep.h"
 #include "optimizer/var.h"
 #include "parser/parsetree.h"
 #include "utils/builtins.h"
@@ -89,6 +92,8 @@ typedef struct deparse_expr_cxt
 	RelOptInfo *foreignrel;		/* the foreign relation we are planning for */
 	StringInfo	buf;			/* output buffer to append to */
 	List	  **params_list;	/* exprs that will become remote Params */
+	List	   *outertlist;		/* outer child's target list */
+	List	   *innertlist;		/* inner child's target list */
 } deparse_expr_cxt;
 
 /*
@@ -136,6 +141,13 @@ static void printRemoteParam(int paramindex, Oid paramtype, int32 paramtypmod,
 				 deparse_expr_cxt *context);
 static void printRemotePlaceholder(Oid paramtype, int32 paramtypmod,
 					   deparse_expr_cxt *context);
+static const char *get_jointype_name(JoinType jointype);
+
+/*
+ * Convert an absolute attnum to a relative one.  This is handy when handling
+ * attnums for attrs_used and column aliases.
+ */
+#define TO_RELATIVE(x)	((x) - FirstLowInvalidHeapAttributeNumber)
 
 
 /*
@@ -143,6 +155,7 @@ static void printRemotePlaceholder(Oid paramtype, int32 paramtypmod,
  * which are returned as two lists:
  *	- remote_conds contains expressions that can be evaluated remotely
  *	- local_conds contains expressions that can't be evaluated remotely
+ * Note that each element is an Expr extracted from its RestrictInfo.
  */
 void
 classifyConditions(PlannerInfo *root,
@@ -161,9 +174,9 @@ classifyConditions(PlannerInfo *root,
 		RestrictInfo *ri = (RestrictInfo *) lfirst(lc);
 
 		if (is_foreign_expr(root, baserel, ri->clause))
-			*remote_conds = lappend(*remote_conds, ri);
+			*remote_conds = lappend(*remote_conds, ri->clause);
 		else
-			*local_conds = lappend(*local_conds, ri);
+			*local_conds = lappend(*local_conds, ri->clause);
 	}
 }
 
@@ -250,7 +263,7 @@ foreign_expr_walker(Node *node,
 				 * Param's collation, ie it's not safe for it to have a
 				 * non-default collation.
 				 */
-				if (var->varno == glob_cxt->foreignrel->relid &&
+				if (bms_is_member(var->varno, glob_cxt->foreignrel->relids) &&
 					var->varlevelsup == 0)
 				{
 					/* Var belongs to foreign table */
@@ -675,18 +688,83 @@ is_builtin(Oid oid)
  *
  * We also create an integer List of the columns being retrieved, which is
  * returned to *retrieved_attrs.
+ *
+ * relations is a string buffer for the "Relations" portion of EXPLAIN output,
+ * or NULL if the caller doesn't need it.  If non-NULL, it must already have
+ * been initialized by the caller.
  */
 void
 deparseSelectSql(StringInfo buf,
 				 PlannerInfo *root,
 				 RelOptInfo *baserel,
 				 Bitmapset *attrs_used,
-				 List **retrieved_attrs)
+				 List *remote_conds,
+				 List **params_list,
+				 List **fdw_ps_tlist,
+				 List **retrieved_attrs,
+				 StringInfo relations)
 {
+	PgFdwRelationInfo  *fpinfo = (PgFdwRelationInfo *) baserel->fdw_private;
 	RangeTblEntry *rte = planner_rt_fetch(baserel->relid, root);
 	Relation	rel;
 
 	/*
+	 * If the given relation is a join relation, recursively construct the
+	 * statement by putting the outer and inner relations in the FROM clause as
+	 * aliased subqueries.
+	 */
+	if (baserel->reloptkind == RELOPT_JOINREL)
+	{
+		RelOptInfo		   *rel_o = fpinfo->outerrel;
+		RelOptInfo		   *rel_i = fpinfo->innerrel;
+		PgFdwRelationInfo  *fpinfo_o = (PgFdwRelationInfo *) rel_o->fdw_private;
+		PgFdwRelationInfo  *fpinfo_i = (PgFdwRelationInfo *) rel_i->fdw_private;
+		StringInfoData		sql_o;
+		StringInfoData		sql_i;
+		List			   *ret_attrs_tmp;	/* not used */
+		StringInfoData		relations_o;
+		StringInfoData		relations_i;
+		const char		   *jointype_str;
+
+		/*
+		 * Deparse query for outer and inner relation, and combine them into
+		 * a query.
+		 *
+		 * We don't pass fdw_ps_tlist here because the targets of the underlying
+		 * relations are already in joinrel->reltargetlist, and deparseJoinSql()
+		 * takes care of them.
+		 */
+		initStringInfo(&sql_o);
+		initStringInfo(&relations_o);
+		deparseSelectSql(&sql_o, root, rel_o, fpinfo_o->attrs_used,
+						 fpinfo_o->remote_conds, params_list,
+						 NULL, &ret_attrs_tmp, &relations_o);
+		initStringInfo(&sql_i);
+		initStringInfo(&relations_i);
+		deparseSelectSql(&sql_i, root, rel_i, fpinfo_i->attrs_used,
+						 fpinfo_i->remote_conds, params_list,
+						 NULL, &ret_attrs_tmp, &relations_i);
+
+		/* For EXPLAIN output */
+		jointype_str = get_jointype_name(fpinfo->jointype);
+		if (relations)
+			appendStringInfo(relations, "(%s) %s JOIN (%s)",
+							 relations_o.data, jointype_str, relations_i.data);
+
+		deparseJoinSql(buf, root, baserel,
+					   fpinfo->outerrel,
+					   fpinfo->innerrel,
+					   sql_o.data,
+					   sql_i.data,
+					   fpinfo->jointype,
+					   fpinfo->joinclauses,
+					   fpinfo->otherclauses,
+					   fdw_ps_tlist,
+					   retrieved_attrs);
+		return;
+	}
+
+	/*
 	 * Core code already has some lock on each rel being planned, so we can
 	 * use NoLock here.
 	 */
@@ -705,6 +783,87 @@ deparseSelectSql(StringInfo buf,
 	appendStringInfoString(buf, " FROM ");
 	deparseRelation(buf, rel);
 
+	/*
+	 * Return local relation name for EXPLAIN output.
+	 * We can't know whether the VERBOSE option is specified, so always add the
+	 * schema name.
+	 */
+	if (relations)
+	{
+		const char	   *namespace;
+		const char	   *relname;
+		const char	   *refname;
+
+		namespace = get_namespace_name(get_rel_namespace(rte->relid));
+		relname = get_rel_name(rte->relid);
+		refname = rte->eref->aliasname;
+		appendStringInfo(relations, "%s.%s",
+						 quote_identifier(namespace),
+						 quote_identifier(relname));
+		if (*refname && strcmp(refname, relname) != 0)
+			appendStringInfo(relations, " %s",
+							 quote_identifier(rte->eref->aliasname));
+	}
+
+	/*
+	 * Construct WHERE clause
+	 */
+	if (remote_conds)
+		appendConditions(buf, root, baserel, NULL, NULL, remote_conds,
+						 " WHERE ", params_list);
+
+	/*
+	 * Add FOR UPDATE/SHARE if appropriate.  We apply locking during the
+	 * initial row fetch, rather than later on as is done for local tables.
+	 * The extra roundtrips involved in trying to duplicate the local
+	 * semantics exactly don't seem worthwhile (see also comments for
+	 * RowMarkType).
+	 *
+	 * Note: because we actually run the query as a cursor, this assumes
+	 * that DECLARE CURSOR ... FOR UPDATE is supported, which it isn't
+	 * before 8.3.
+	 */
+	if (baserel->relid == root->parse->resultRelation &&
+		(root->parse->commandType == CMD_UPDATE ||
+		 root->parse->commandType == CMD_DELETE))
+	{
+		/* Relation is UPDATE/DELETE target, so use FOR UPDATE */
+		appendStringInfoString(buf, " FOR UPDATE");
+	}
+	else
+	{
+		PlanRowMark *rc = get_plan_rowmark(root->rowMarks, baserel->relid);
+
+		if (rc)
+		{
+			/*
+			 * Relation is specified as a FOR UPDATE/SHARE target, so handle
+			 * that.  (But we could also see LCS_NONE, meaning this isn't a
+			 * target relation after all.)
+			 *
+			 * For now, just ignore any [NO] KEY specification, since (a)
+			 * it's not clear what that means for a remote table that we
+			 * don't have complete information about, and (b) it wouldn't
+			 * work anyway on older remote servers.  Likewise, we don't
+			 * worry about NOWAIT.
+			 */
+			switch (rc->strength)
+			{
+				case LCS_NONE:
+					/* No locking needed */
+					break;
+				case LCS_FORKEYSHARE:
+				case LCS_FORSHARE:
+					appendStringInfoString(buf, " FOR SHARE");
+					break;
+				case LCS_FORNOKEYUPDATE:
+				case LCS_FORUPDATE:
+					appendStringInfoString(buf, " FOR UPDATE");
+					break;
+			}
+		}
+	}
+
 	heap_close(rel, NoLock);
 }
 
@@ -731,8 +890,7 @@ deparseTargetList(StringInfo buf,
 	*retrieved_attrs = NIL;
 
 	/* If there's a whole-row reference, we'll need all the columns. */
-	have_wholerow = bms_is_member(0 - FirstLowInvalidHeapAttributeNumber,
-								  attrs_used);
+	have_wholerow = bms_is_member(TO_RELATIVE(0), attrs_used);
 
 	first = true;
 	for (i = 1; i <= tupdesc->natts; i++)
@@ -743,15 +901,14 @@ deparseTargetList(StringInfo buf,
 		if (attr->attisdropped)
 			continue;
 
-		if (have_wholerow ||
-			bms_is_member(i - FirstLowInvalidHeapAttributeNumber,
-						  attrs_used))
+		if (have_wholerow || bms_is_member(TO_RELATIVE(i), attrs_used))
 		{
 			if (!first)
 				appendStringInfoString(buf, ", ");
 			first = false;
 
 			deparseColumnRef(buf, rtindex, i, root);
+			appendStringInfo(buf, " a%d", TO_RELATIVE(i));
 
 			*retrieved_attrs = lappend_int(*retrieved_attrs, i);
 		}
@@ -761,17 +918,17 @@ deparseTargetList(StringInfo buf,
 	 * Add ctid if needed.  We currently don't support retrieving any other
 	 * system columns.
 	 */
-	if (bms_is_member(SelfItemPointerAttributeNumber - FirstLowInvalidHeapAttributeNumber,
-					  attrs_used))
+	if (bms_is_member(TO_RELATIVE(SelfItemPointerAttributeNumber), attrs_used))
 	{
 		if (!first)
 			appendStringInfoString(buf, ", ");
 		first = false;
 
-		appendStringInfoString(buf, "ctid");
+		appendStringInfo(buf, "ctid a%d",
+						 TO_RELATIVE(SelfItemPointerAttributeNumber));
 
 		*retrieved_attrs = lappend_int(*retrieved_attrs,
-									   SelfItemPointerAttributeNumber);
+										   SelfItemPointerAttributeNumber);
 	}
 
 	/* Don't generate bad syntax if no undropped columns */
@@ -780,7 +937,8 @@ deparseTargetList(StringInfo buf,
 }
 
 /*
- * Deparse WHERE clauses in given list of RestrictInfos and append them to buf.
+ * Deparse conditions, such as a WHERE clause or a JOIN's ON clause, from the
+ * given list of Exprs and append them to buf.
  *
  * baserel is the foreign table we're planning for.
  *
@@ -794,12 +952,14 @@ deparseTargetList(StringInfo buf,
  * so Params and other-relation Vars should be replaced by dummy values.
  */
 void
-appendWhereClause(StringInfo buf,
-				  PlannerInfo *root,
-				  RelOptInfo *baserel,
-				  List *exprs,
-				  bool is_first,
-				  List **params)
+appendConditions(StringInfo buf,
+				 PlannerInfo *root,
+				 RelOptInfo *baserel,
+				 List *outertlist,
+				 List *innertlist,
+				 List *exprs,
+				 const char *prefix,
+				 List **params)
 {
 	deparse_expr_cxt context;
 	int			nestlevel;
@@ -813,31 +973,315 @@ appendWhereClause(StringInfo buf,
 	context.foreignrel = baserel;
 	context.buf = buf;
 	context.params_list = params;
+	context.outertlist = outertlist;
+	context.innertlist = innertlist;
 
 	/* Make sure any constants in the exprs are printed portably */
 	nestlevel = set_transmission_modes();
 
 	foreach(lc, exprs)
 	{
-		RestrictInfo *ri = (RestrictInfo *) lfirst(lc);
+		Expr	   *expr = (Expr *) lfirst(lc);
 
 		/* Connect expressions with "AND" and parenthesize each condition. */
-		if (is_first)
-			appendStringInfoString(buf, " WHERE ");
-		else
-			appendStringInfoString(buf, " AND ");
+		if (prefix)
+			appendStringInfo(buf, "%s", prefix);
 
 		appendStringInfoChar(buf, '(');
-		deparseExpr(ri->clause, &context);
+		deparseExpr(expr, &context);
 		appendStringInfoChar(buf, ')');
 
-		is_first = false;
+		prefix = " AND ";
 	}
 
 	reset_transmission_modes(nestlevel);
 }
 
 /*
+ * Returns the position (starting from 1) of the given Var in the given target
+ * list, or 0 if it is not found.
+ */
+static int
+find_var_pos(Var *node, List *tlist)
+{
+	int		pos = 1;
+	ListCell *lc;
+
+	foreach(lc, tlist)
+	{
+		Var *var = (Var *) lfirst(lc);
+
+		if (equal(var, node))
+		{
+			return pos;
+		}
+		pos++;
+	}
+
+	return 0;
+}
+
+/*
+ * Deparse given Var into buf.
+ */
+static void
+deparseJoinVar(Var *node, deparse_expr_cxt *context)
+{
+	char		side;
+	int			pos;
+
+	pos = find_var_pos(node, context->outertlist);
+	if (pos > 0)
+		side = 'l';
+	else
+	{
+		side = 'r';
+		pos = find_var_pos(node, context->innertlist);
+	}
+	Assert(pos > 0);
+	Assert(side == 'l' || side == 'r');
+
+	/*
+	 * We treat whole-row references the same as ordinary attribute references,
+	 * because that transformation should already be done at a lower level.
+	 */
+	appendStringInfo(context->buf, "%c.a%d", side, pos);
+}
+
+/*
+ * Deparse column alias list for a subquery in FROM clause.
+ */
+static void
+deparseColumnAliases(StringInfo buf, List *tlist)
+{
+	int			pos;
+	ListCell   *lc;
+
+	pos = 1;
+	foreach(lc, tlist)
+	{
+		/* Deparse column alias for the subquery */
+		if (pos > 1)
+			appendStringInfoString(buf, ", ");
+		appendStringInfo(buf, "a%d", pos);
+		pos++;
+	}
+}
+
+/*
+ * Deparse "wrapper" SQL for a query, projecting the target list entries in
+ * the proper order and with the proper contents.  Note that this treatment is
+ * necessary only for queries used in the FROM clause of a join query.
+ *
+ * Even if the SQL is simple enough (no ctid, no whole-row reference), the
+ * order of the output columns might differ from that of the underlying scan,
+ * so we always need to wrap the queries used as join sources.
+ *
+ */
+static const char *
+deparseProjectionSql(PlannerInfo *root,
+					 RelOptInfo *baserel,
+					 const char *sql,
+					 char side)
+{
+	StringInfoData wholerow;
+	StringInfoData buf;
+	ListCell   *lc;
+	bool		first;
+	bool		have_wholerow = false;
+
+	/*
+	 * Check whether the targetlist contains a whole-row reference, which
+	 * needs special treatment below.
+	 */
+	foreach(lc, baserel->reltargetlist)
+	{
+		Var		   *var = (Var *) lfirst(lc);
+		if (var->varattno == 0)
+		{
+			have_wholerow = true;
+			break;
+		}
+	}
+
+	/*
+	 * Construct whole-row reference with ROW() syntax
+	 */
+	if (have_wholerow)
+	{
+		RangeTblEntry *rte;
+		Relation		rel;
+		TupleDesc		tupdesc;
+		int				i;
+
+		/* Obtain TupleDesc for deparsing all valid columns */
+		rte = planner_rt_fetch(baserel->relid, root);
+		rel = heap_open(rte->relid, NoLock);
+		tupdesc = rel->rd_att;
+
+		/* Print all valid columns in ROW() to generate whole-row value */
+		initStringInfo(&wholerow);
+		appendStringInfoString(&wholerow, "ROW(");
+		first = true;
+		for (i = 1; i <= tupdesc->natts; i++)
+		{
+			Form_pg_attribute attr = tupdesc->attrs[i - 1];
+
+			/* Ignore dropped columns. */
+			if (attr->attisdropped)
+				continue;
+
+			if (!first)
+				appendStringInfoString(&wholerow, ", ");
+			first = false;
+
+			appendStringInfo(&wholerow, "%c.a%d", side, TO_RELATIVE(i));
+		}
+		appendStringInfoString(&wholerow, ")");
+
+		heap_close(rel, NoLock);
+	}
+
+	/*
+	 * Construct a SELECT statement which has the original query in its FROM
+	 * clause and the target list entries in its SELECT clause.  The numbers
+	 * used in column aliases are attnum - FirstLowInvalidHeapAttributeNumber,
+	 * so that they are positive even for system columns, whose attnum is
+	 * negative.
+	 */
+	initStringInfo(&buf);
+	appendStringInfoString(&buf, "SELECT ");
+	first = true;
+	foreach(lc, baserel->reltargetlist)
+	{
+		Var *var = (Var *) lfirst(lc);
+
+		if (!first)
+			appendStringInfoString(&buf, ", ");
+
+		if (var->varattno == 0)
+			appendStringInfo(&buf, "%s", wholerow.data);
+		else
+			appendStringInfo(&buf, "%c.a%d", side, TO_RELATIVE(var->varattno));
+
+		first = false;
+	}
+	appendStringInfo(&buf, " FROM (%s) %c", sql, side);
+
+	return buf.data;
+}
+
+static const char *
+get_jointype_name(JoinType jointype)
+{
+	return jointype == JOIN_INNER ? "INNER" :
+		   jointype == JOIN_LEFT ? "LEFT" :
+		   jointype == JOIN_RIGHT ? "RIGHT" :
+		   jointype == JOIN_FULL ? "FULL" : "";
+}
+
+/*
+ * Construct a SELECT statement which contains join clause.
+ *
+ * We also create a TargetEntry list of the columns being retrieved, which is
+ * returned in *fdw_ps_tlist.
+ *
+ * outerrel and sql_o are the outer child relation and its remote query
+ * statement; innerrel and sql_i are the same for the inner child relation.
+ * jointype and joinclauses describe the join to be performed.
+ * fdw_ps_tlist is an output parameter that passes the target list of the
+ * pseudo scan back to the caller.
+ */
+void
+deparseJoinSql(StringInfo buf,
+			   PlannerInfo *root,
+			   RelOptInfo *baserel,
+			   RelOptInfo *outerrel,
+			   RelOptInfo *innerrel,
+			   const char *sql_o,
+			   const char *sql_i,
+			   JoinType jointype,
+			   List *joinclauses,
+			   List *otherclauses,
+			   List **fdw_ps_tlist,
+			   List **retrieved_attrs)
+{
+	StringInfoData selbuf;		/* buffer for SELECT clause */
+	StringInfoData abuf_o;		/* buffer for column alias list of outer */
+	StringInfoData abuf_i;		/* buffer for column alias list of inner */
+	int			i;
+	ListCell   *lc;
+	const char *jointype_str;
+	deparse_expr_cxt context;
+
+	context.root = root;
+	context.foreignrel = baserel;
+	context.buf = &selbuf;
+	context.params_list = NULL;
+	context.outertlist = outerrel->reltargetlist;
+	context.innertlist = innerrel->reltargetlist;
+
+	jointype_str = get_jointype_name(jointype);
+	*retrieved_attrs = NIL;
+
+	/* print SELECT clause of the join scan */
+	initStringInfo(&selbuf);
+	i = 0;
+	foreach(lc, baserel->reltargetlist)
+	{
+		Var		   *var = (Var *) lfirst(lc);
+		TargetEntry *tle;
+
+		if (i > 0)
+			appendStringInfoString(&selbuf, ", ");
+		deparseJoinVar(var, &context);
+
+		tle = makeTargetEntry((Expr *) var, i + 1, NULL, false);
+		if (fdw_ps_tlist)
+			*fdw_ps_tlist = lappend(*fdw_ps_tlist, tle);
+
+		*retrieved_attrs = lappend_int(*retrieved_attrs, i + 1);
+
+		i++;
+	}
+	if (i == 0)
+		appendStringInfoString(&selbuf, "NULL");
+
+	/*
+	 * Do pseudo-projection for an underlying scan on a foreign table, if a) the
+	 * relation is a base relation, and b) its targetlist contains whole-row
+	 * reference.
+	 */
+	if (outerrel->reloptkind == RELOPT_BASEREL)
+		sql_o = deparseProjectionSql(root, outerrel, sql_o, 'l');
+	if (innerrel->reloptkind == RELOPT_BASEREL)
+		sql_i = deparseProjectionSql(root, innerrel, sql_i, 'r');
+
+	/* Deparse column alias portion of subquery in FROM clause. */
+	initStringInfo(&abuf_o);
+	deparseColumnAliases(&abuf_o, outerrel->reltargetlist);
+	initStringInfo(&abuf_i);
+	deparseColumnAliases(&abuf_i, innerrel->reltargetlist);
+
+	/* Construct SELECT statement */
+	appendStringInfo(buf, "SELECT %s FROM", selbuf.data);
+	appendStringInfo(buf, " (%s) l (%s) %s JOIN (%s) r (%s)",
+					 sql_o, abuf_o.data, jointype_str, sql_i, abuf_i.data);
+	/* Append ON clause */
+	if (joinclauses)
+		appendConditions(buf, root, baserel,
+						 outerrel->reltargetlist, innerrel->reltargetlist,
+						 joinclauses,
+						 " ON ", NULL);
+	/* Append WHERE clause */
+	if (otherclauses)
+		appendConditions(buf, root, baserel,
+						 outerrel->reltargetlist, innerrel->reltargetlist,
+						 otherclauses,
+						 " WHERE ", NULL);
+}
+
+/*
  * deparse remote INSERT statement
  *
  * The statement text is appended to buf, and we also create an integer List
@@ -976,8 +1420,7 @@ deparseReturningList(StringInfo buf, PlannerInfo *root,
 	if (trig_after_row)
 	{
 		/* whole-row reference acquires all non-system columns */
-		attrs_used =
-			bms_make_singleton(0 - FirstLowInvalidHeapAttributeNumber);
+		attrs_used = bms_make_singleton(TO_RELATIVE(0));
 	}
 
 	if (returningList != NIL)
@@ -1261,6 +1704,8 @@ deparseExpr(Expr *node, deparse_expr_cxt *context)
 /*
  * Deparse given Var node into context->buf.
  *
+ * If foreignrel is a join relation, this is invoked for join conditions.
+ *
  * If the Var belongs to the foreign relation, just print its remote name.
  * Otherwise, it's effectively a Param (and will in fact be a Param at
  * run time).  Handle it the same way we handle plain Params --- see
@@ -1271,39 +1716,46 @@ deparseVar(Var *node, deparse_expr_cxt *context)
 {
 	StringInfo	buf = context->buf;
 
-	if (node->varno == context->foreignrel->relid &&
-		node->varlevelsup == 0)
+	if (context->foreignrel->reloptkind == RELOPT_JOINREL)
 	{
-		/* Var belongs to foreign table */
-		deparseColumnRef(buf, node->varno, node->varattno, context->root);
+		deparseJoinVar(node, context);
 	}
 	else
 	{
-		/* Treat like a Param */
-		if (context->params_list)
+		if (node->varno == context->foreignrel->relid &&
+			node->varlevelsup == 0)
 		{
-			int			pindex = 0;
-			ListCell   *lc;
-
-			/* find its index in params_list */
-			foreach(lc, *context->params_list)
+			/* Var belongs to foreign table */
+			deparseColumnRef(buf, node->varno, node->varattno, context->root);
+		}
+		else
+		{
+			/* Treat like a Param */
+			if (context->params_list)
 			{
-				pindex++;
-				if (equal(node, (Node *) lfirst(lc)))
-					break;
+				int			pindex = 0;
+				ListCell   *lc;
+
+				/* find its index in params_list */
+				foreach(lc, *context->params_list)
+				{
+					pindex++;
+					if (equal(node, (Node *) lfirst(lc)))
+						break;
+				}
+				if (lc == NULL)
+				{
+					/* not in list, so add it */
+					pindex++;
+					*context->params_list = lappend(*context->params_list, node);
+				}
+
+				printRemoteParam(pindex, node->vartype, node->vartypmod, context);
 			}
-			if (lc == NULL)
+			else
 			{
-				/* not in list, so add it */
-				pindex++;
-				*context->params_list = lappend(*context->params_list, node);
+				printRemotePlaceholder(node->vartype, node->vartypmod, context);
 			}
-
-			printRemoteParam(pindex, node->vartype, node->vartypmod, context);
-		}
-		else
-		{
-			printRemotePlaceholder(node->vartype, node->vartypmod, context);
 		}
 	}
 }
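
To make the shape of the remote join query easier to picture, here is a small standalone sketch. It is not part of the patch. It mimics, with plain C strings, how the column alias lists (a1, a2, ...) and the "(outer) l (...) JOIN (inner) r (...)" wrapper appear to be assembled by deparseColumnAliases() and deparseJoinSql(). The names build_alias_list and build_join_sql, the fixed-size buffers, and the pre-rendered select list and ON clause are assumptions made only for illustration; the patch itself works on StringInfo, RelOptInfo and target lists.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Write "a1, a2, ..., aN" into dst (hypothetical helper).
 * This follows the idea of deparseColumnAliases(): aliases are numbered by
 * position in the child's target list, starting from 1.
 */
static void
build_alias_list(char *dst, size_t dstlen, int ncols)
{
	size_t		used = 0;
	int			i;

	assert(dstlen > 0);
	dst[0] = '\0';
	for (i = 1; i <= ncols; i++)
	{
		int			n = snprintf(dst + used, dstlen - used, "%sa%d",
								 i > 1 ? ", " : "", i);

		/* Fail loudly instead of silently truncating the SQL text. */
		assert(n >= 0 && (size_t) n < dstlen - used);
		used += (size_t) n;
	}
}

/*
 * Compose "SELECT ... FROM (outer) l (aliases) <type> JOIN (inner) r (aliases)"
 * with an optional ON clause (hypothetical helper).  select_list and
 * on_clause are assumed to be already rendered with l.aN / r.aN references.
 */
static void
build_join_sql(char *dst, size_t dstlen,
			   const char *select_list,
			   const char *sql_o, int ncols_o,
			   const char *jointype,
			   const char *sql_i, int ncols_i,
			   const char *on_clause)
{
	char		aliases_o[256];
	char		aliases_i[256];
	int			n;

	build_alias_list(aliases_o, sizeof(aliases_o), ncols_o);
	build_alias_list(aliases_i, sizeof(aliases_i), ncols_i);

	n = snprintf(dst, dstlen,
				 "SELECT %s FROM (%s) l (%s) %s JOIN (%s) r (%s)%s%s",
				 select_list, sql_o, aliases_o, jointype, sql_i, aliases_i,
				 on_clause ? " ON " : "", on_clause ? on_clause : "");
	assert(n >= 0 && (size_t) n < dstlen);
}
```

Because each side of the join is wrapped as a subquery with its own alias list, the ON clause and the SELECT list only need the side ('l' or 'r') and the position in the child's target list, which is the pair that deparseJoinVar() emits.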
diff --git a/contrib/postgres_fdw/expected/postgres_fdw.out b/contrib/postgres_fdw/expected/postgres_fdw.out
index 783cb41..58f24c0 100644
--- a/contrib/postgres_fdw/expected/postgres_fdw.out
+++ b/contrib/postgres_fdw/expected/postgres_fdw.out
@@ -9,11 +9,16 @@ DO $d$
             OPTIONS (dbname '$$||current_database()||$$',
                      port '$$||current_setting('port')||$$'
             )$$;
+        EXECUTE $$CREATE SERVER loopback2 FOREIGN DATA WRAPPER postgres_fdw
+            OPTIONS (dbname '$$||current_database()||$$',
+                     port '$$||current_setting('port')||$$'
+            )$$;
     END;
 $d$;
 CREATE USER MAPPING FOR public SERVER testserver1
 	OPTIONS (user 'value', password 'value');
 CREATE USER MAPPING FOR CURRENT_USER SERVER loopback;
+CREATE USER MAPPING FOR CURRENT_USER SERVER loopback2;
 -- ===================================================================
 -- create objects used through FDW loopback server
 -- ===================================================================
@@ -35,6 +40,18 @@ CREATE TABLE "S 1"."T 2" (
 	c2 text,
 	CONSTRAINT t2_pkey PRIMARY KEY (c1)
 );
+CREATE TABLE "S 1"."T 3" (
+	c1 int NOT NULL,
+	c2 int NOT NULL,
+	c3 text,
+	CONSTRAINT t3_pkey PRIMARY KEY (c1)
+);
+CREATE TABLE "S 1"."T 4" (
+	c1 int NOT NULL,
+	c2 int NOT NULL,
+	c4 text,
+	CONSTRAINT t4_pkey PRIMARY KEY (c1)
+);
 INSERT INTO "S 1"."T 1"
 	SELECT id,
 	       id % 10,
@@ -49,8 +66,22 @@ INSERT INTO "S 1"."T 2"
 	SELECT id,
 	       'AAA' || to_char(id, 'FM000')
 	FROM generate_series(1, 100) id;
+INSERT INTO "S 1"."T 3"
+	SELECT id,
+	       id + 1,
+	       'AAA' || to_char(id, 'FM000')
+	FROM generate_series(1, 100) id;
+DELETE FROM "S 1"."T 3" WHERE c1 % 2 != 0;	-- delete for outer join tests
+INSERT INTO "S 1"."T 4"
+	SELECT id,
+	       id + 1,
+	       'AAA' || to_char(id, 'FM000')
+	FROM generate_series(1, 100) id;
+DELETE FROM "S 1"."T 4" WHERE c1 % 3 != 0;	-- delete for outer join tests
 ANALYZE "S 1"."T 1";
 ANALYZE "S 1"."T 2";
+ANALYZE "S 1"."T 3";
+ANALYZE "S 1"."T 4";
 -- ===================================================================
 -- create foreign tables
 -- ===================================================================
@@ -78,6 +109,26 @@ CREATE FOREIGN TABLE ft2 (
 	c8 user_enum
 ) SERVER loopback;
 ALTER FOREIGN TABLE ft2 DROP COLUMN cx;
+CREATE FOREIGN TABLE ft4 (
+	c1 int NOT NULL,
+	c2 int NOT NULL,
+	c3 text
+) SERVER loopback OPTIONS (schema_name 'S 1', table_name 'T 3');
+CREATE FOREIGN TABLE ft5 (
+	c1 int NOT NULL,
+	c2 int NOT NULL,
+	c3 text
+) SERVER loopback OPTIONS (schema_name 'S 1', table_name 'T 4');
+CREATE FOREIGN TABLE ft6 (
+	c1 int NOT NULL,
+	c2 int NOT NULL,
+	c3 text
+) SERVER loopback2 OPTIONS (schema_name 'S 1', table_name 'T 4');
+CREATE USER view_owner;
+GRANT ALL ON ft5 TO view_owner;
+CREATE VIEW v_ft5 AS SELECT * FROM ft5;
+ALTER VIEW v_ft5 OWNER TO view_owner;
+CREATE USER MAPPING FOR view_owner SERVER loopback;
 -- ===================================================================
 -- tests for validator
 -- ===================================================================
@@ -119,12 +170,15 @@ ALTER FOREIGN TABLE ft2 OPTIONS (schema_name 'S 1', table_name 'T 1');
 ALTER FOREIGN TABLE ft1 ALTER COLUMN c1 OPTIONS (column_name 'C 1');
 ALTER FOREIGN TABLE ft2 ALTER COLUMN c1 OPTIONS (column_name 'C 1');
 \det+
-                             List of foreign tables
- Schema | Table |  Server  |              FDW Options              | Description 
---------+-------+----------+---------------------------------------+-------------
- public | ft1   | loopback | (schema_name 'S 1', table_name 'T 1') | 
- public | ft2   | loopback | (schema_name 'S 1', table_name 'T 1') | 
-(2 rows)
+                              List of foreign tables
+ Schema | Table |  Server   |              FDW Options              | Description 
+--------+-------+-----------+---------------------------------------+-------------
+ public | ft1   | loopback  | (schema_name 'S 1', table_name 'T 1') | 
+ public | ft2   | loopback  | (schema_name 'S 1', table_name 'T 1') | 
+ public | ft4   | loopback  | (schema_name 'S 1', table_name 'T 3') | 
+ public | ft5   | loopback  | (schema_name 'S 1', table_name 'T 4') | 
+ public | ft6   | loopback2 | (schema_name 'S 1', table_name 'T 4') | 
+(5 rows)
 
 -- Now we should be able to run ANALYZE.
 -- To exercise multiple code paths, we use local stats on ft1
@@ -160,8 +214,8 @@ SELECT * FROM ft1 ORDER BY c3, c1 OFFSET 100 LIMIT 10;
 (10 rows)
 
 EXPLAIN (VERBOSE, COSTS false) SELECT * FROM ft1 t1 ORDER BY t1.c3, t1.c1 OFFSET 100 LIMIT 10;
-                                     QUERY PLAN                                      
--------------------------------------------------------------------------------------
+                                                     QUERY PLAN                                                      
+---------------------------------------------------------------------------------------------------------------------
  Limit
    Output: c1, c2, c3, c4, c5, c6, c7, c8
    ->  Sort
@@ -169,7 +223,7 @@ EXPLAIN (VERBOSE, COSTS false) SELECT * FROM ft1 t1 ORDER BY t1.c3, t1.c1 OFFSET
          Sort Key: t1.c3, t1.c1
          ->  Foreign Scan on public.ft1 t1
                Output: c1, c2, c3, c4, c5, c6, c7, c8
-               Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1"
+               Remote SQL: SELECT "C 1" a10, c2 a11, c3 a12, c4 a13, c5 a14, c6 a15, c7 a16, c8 a17 FROM "S 1"."T 1"
 (8 rows)
 
 SELECT * FROM ft1 t1 ORDER BY t1.c3, t1.c1 OFFSET 100 LIMIT 10;
@@ -189,8 +243,8 @@ SELECT * FROM ft1 t1 ORDER BY t1.c3, t1.c1 OFFSET 100 LIMIT 10;
 
 -- whole-row reference
 EXPLAIN (VERBOSE, COSTS false) SELECT t1 FROM ft1 t1 ORDER BY t1.c3, t1.c1 OFFSET 100 LIMIT 10;
-                                     QUERY PLAN                                      
--------------------------------------------------------------------------------------
+                                                     QUERY PLAN                                                      
+---------------------------------------------------------------------------------------------------------------------
  Limit
    Output: t1.*, c3, c1
    ->  Sort
@@ -198,7 +252,7 @@ EXPLAIN (VERBOSE, COSTS false) SELECT t1 FROM ft1 t1 ORDER BY t1.c3, t1.c1 OFFSE
          Sort Key: t1.c3, t1.c1
          ->  Foreign Scan on public.ft1 t1
                Output: t1.*, c3, c1
-               Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1"
+               Remote SQL: SELECT "C 1" a10, c2 a11, c3 a12, c4 a13, c5 a14, c6 a15, c7 a16, c8 a17 FROM "S 1"."T 1"
 (8 rows)
 
 SELECT t1 FROM ft1 t1 ORDER BY t1.c3, t1.c1 OFFSET 100 LIMIT 10;
@@ -224,11 +278,11 @@ SELECT * FROM ft1 WHERE false;
 
 -- with WHERE clause
 EXPLAIN (VERBOSE, COSTS false) SELECT * FROM ft1 t1 WHERE t1.c1 = 101 AND t1.c6 = '1' AND t1.c7 >= '1';
-                                                                   QUERY PLAN                                                                   
-------------------------------------------------------------------------------------------------------------------------------------------------
+                                                                                   QUERY PLAN                                                                                   
+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
  Foreign Scan on public.ft1 t1
    Output: c1, c2, c3, c4, c5, c6, c7, c8
-   Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" WHERE ((c7 >= '1'::bpchar)) AND (("C 1" = 101)) AND ((c6 = '1'::text))
+   Remote SQL: SELECT "C 1" a10, c2 a11, c3 a12, c4 a13, c5 a14, c6 a15, c7 a16, c8 a17 FROM "S 1"."T 1" WHERE ((c7 >= '1'::bpchar)) AND (("C 1" = 101)) AND ((c6 = '1'::text))
 (3 rows)
 
 SELECT * FROM ft1 t1 WHERE t1.c1 = 101 AND t1.c6 = '1' AND t1.c7 >= '1';
@@ -239,13 +293,13 @@ SELECT * FROM ft1 t1 WHERE t1.c1 = 101 AND t1.c6 = '1' AND t1.c7 >= '1';
 
 -- with FOR UPDATE/SHARE
 EXPLAIN (VERBOSE, COSTS false) SELECT * FROM ft1 t1 WHERE c1 = 101 FOR UPDATE;
-                                                   QUERY PLAN                                                   
-----------------------------------------------------------------------------------------------------------------
+                                                                   QUERY PLAN                                                                   
+------------------------------------------------------------------------------------------------------------------------------------------------
  LockRows
    Output: c1, c2, c3, c4, c5, c6, c7, c8, t1.*
    ->  Foreign Scan on public.ft1 t1
          Output: c1, c2, c3, c4, c5, c6, c7, c8, t1.*
-         Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" WHERE (("C 1" = 101)) FOR UPDATE
+         Remote SQL: SELECT "C 1" a10, c2 a11, c3 a12, c4 a13, c5 a14, c6 a15, c7 a16, c8 a17 FROM "S 1"."T 1" WHERE (("C 1" = 101)) FOR UPDATE
 (5 rows)
 
 SELECT * FROM ft1 t1 WHERE c1 = 101 FOR UPDATE;
@@ -255,13 +309,13 @@ SELECT * FROM ft1 t1 WHERE c1 = 101 FOR UPDATE;
 (1 row)
 
 EXPLAIN (VERBOSE, COSTS false) SELECT * FROM ft1 t1 WHERE c1 = 102 FOR SHARE;
-                                                  QUERY PLAN                                                   
----------------------------------------------------------------------------------------------------------------
+                                                                  QUERY PLAN                                                                   
+-----------------------------------------------------------------------------------------------------------------------------------------------
  LockRows
    Output: c1, c2, c3, c4, c5, c6, c7, c8, t1.*
    ->  Foreign Scan on public.ft1 t1
          Output: c1, c2, c3, c4, c5, c6, c7, c8, t1.*
-         Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" WHERE (("C 1" = 102)) FOR SHARE
+         Remote SQL: SELECT "C 1" a10, c2 a11, c3 a12, c4 a13, c5 a14, c6 a15, c7 a16, c8 a17 FROM "S 1"."T 1" WHERE (("C 1" = 102)) FOR SHARE
 (5 rows)
 
 SELECT * FROM ft1 t1 WHERE c1 = 102 FOR SHARE;
@@ -277,22 +331,6 @@ SELECT COUNT(*) FROM ft1 t1;
   1000
 (1 row)
 
--- join two tables
-SELECT t1.c1 FROM ft1 t1 JOIN ft2 t2 ON (t1.c1 = t2.c1) ORDER BY t1.c3, t1.c1 OFFSET 100 LIMIT 10;
- c1  
------
- 101
- 102
- 103
- 104
- 105
- 106
- 107
- 108
- 109
- 110
-(10 rows)
-
 -- subquery
 SELECT * FROM ft1 t1 WHERE t1.c3 IN (SELECT c3 FROM ft2 t2 WHERE c1 <= 10) ORDER BY c1;
  c1 | c2 |  c3   |              c4              |            c5            | c6 |     c7     | c8  
@@ -353,153 +391,149 @@ CREATE OPERATOR === (
     NEGATOR = !==
 );
 EXPLAIN (VERBOSE, COSTS false) SELECT * FROM ft1 t1 WHERE t1.c1 = postgres_fdw_abs(t1.c2);
-                               QUERY PLAN                                
--------------------------------------------------------------------------
+                                               QUERY PLAN                                                
+---------------------------------------------------------------------------------------------------------
  Foreign Scan on public.ft1 t1
    Output: c1, c2, c3, c4, c5, c6, c7, c8
    Filter: (t1.c1 = postgres_fdw_abs(t1.c2))
-   Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1"
+   Remote SQL: SELECT "C 1" a10, c2 a11, c3 a12, c4 a13, c5 a14, c6 a15, c7 a16, c8 a17 FROM "S 1"."T 1"
 (4 rows)
 
 EXPLAIN (VERBOSE, COSTS false) SELECT * FROM ft1 t1 WHERE t1.c1 === t1.c2;
-                               QUERY PLAN                                
--------------------------------------------------------------------------
+                                               QUERY PLAN                                                
+---------------------------------------------------------------------------------------------------------
  Foreign Scan on public.ft1 t1
    Output: c1, c2, c3, c4, c5, c6, c7, c8
    Filter: (t1.c1 === t1.c2)
-   Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1"
+   Remote SQL: SELECT "C 1" a10, c2 a11, c3 a12, c4 a13, c5 a14, c6 a15, c7 a16, c8 a17 FROM "S 1"."T 1"
 (4 rows)
 
 EXPLAIN (VERBOSE, COSTS false) SELECT * FROM ft1 t1 WHERE t1.c1 = abs(t1.c2);
-                                            QUERY PLAN                                             
----------------------------------------------------------------------------------------------------
+                                                            QUERY PLAN                                                             
+-----------------------------------------------------------------------------------------------------------------------------------
  Foreign Scan on public.ft1 t1
    Output: c1, c2, c3, c4, c5, c6, c7, c8
-   Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" WHERE (("C 1" = abs(c2)))
+   Remote SQL: SELECT "C 1" a10, c2 a11, c3 a12, c4 a13, c5 a14, c6 a15, c7 a16, c8 a17 FROM "S 1"."T 1" WHERE (("C 1" = abs(c2)))
 (3 rows)
 
 EXPLAIN (VERBOSE, COSTS false) SELECT * FROM ft1 t1 WHERE t1.c1 = t1.c2;
-                                          QUERY PLAN                                          
-----------------------------------------------------------------------------------------------
+                                                          QUERY PLAN                                                          
+------------------------------------------------------------------------------------------------------------------------------
  Foreign Scan on public.ft1 t1
    Output: c1, c2, c3, c4, c5, c6, c7, c8
-   Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" WHERE (("C 1" = c2))
+   Remote SQL: SELECT "C 1" a10, c2 a11, c3 a12, c4 a13, c5 a14, c6 a15, c7 a16, c8 a17 FROM "S 1"."T 1" WHERE (("C 1" = c2))
 (3 rows)
 
 -- ===================================================================
 -- WHERE with remotely-executable conditions
 -- ===================================================================
 EXPLAIN (VERBOSE, COSTS false) SELECT * FROM ft1 t1 WHERE t1.c1 = 1;         -- Var, OpExpr(b), Const
-                                         QUERY PLAN                                          
----------------------------------------------------------------------------------------------
+                                                         QUERY PLAN                                                          
+-----------------------------------------------------------------------------------------------------------------------------
  Foreign Scan on public.ft1 t1
    Output: c1, c2, c3, c4, c5, c6, c7, c8
-   Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" WHERE (("C 1" = 1))
+   Remote SQL: SELECT "C 1" a10, c2 a11, c3 a12, c4 a13, c5 a14, c6 a15, c7 a16, c8 a17 FROM "S 1"."T 1" WHERE (("C 1" = 1))
 (3 rows)
 
 EXPLAIN (VERBOSE, COSTS false) SELECT * FROM ft1 t1 WHERE t1.c1 = 100 AND t1.c2 = 0; -- BoolExpr
-                                                  QUERY PLAN                                                  
---------------------------------------------------------------------------------------------------------------
+                                                                  QUERY PLAN                                                                  
+----------------------------------------------------------------------------------------------------------------------------------------------
  Foreign Scan on public.ft1 t1
    Output: c1, c2, c3, c4, c5, c6, c7, c8
-   Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" WHERE (("C 1" = 100)) AND ((c2 = 0))
+   Remote SQL: SELECT "C 1" a10, c2 a11, c3 a12, c4 a13, c5 a14, c6 a15, c7 a16, c8 a17 FROM "S 1"."T 1" WHERE (("C 1" = 100)) AND ((c2 = 0))
 (3 rows)
 
 EXPLAIN (VERBOSE, COSTS false) SELECT * FROM ft1 t1 WHERE c1 IS NULL;        -- NullTest
-                                           QUERY PLAN                                            
--------------------------------------------------------------------------------------------------
+                                                           QUERY PLAN                                                            
+---------------------------------------------------------------------------------------------------------------------------------
  Foreign Scan on public.ft1 t1
    Output: c1, c2, c3, c4, c5, c6, c7, c8
-   Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" WHERE (("C 1" IS NULL))
+   Remote SQL: SELECT "C 1" a10, c2 a11, c3 a12, c4 a13, c5 a14, c6 a15, c7 a16, c8 a17 FROM "S 1"."T 1" WHERE (("C 1" IS NULL))
 (3 rows)
 
 EXPLAIN (VERBOSE, COSTS false) SELECT * FROM ft1 t1 WHERE c1 IS NOT NULL;    -- NullTest
-                                             QUERY PLAN                                              
------------------------------------------------------------------------------------------------------
+                                                             QUERY PLAN                                                              
+-------------------------------------------------------------------------------------------------------------------------------------
  Foreign Scan on public.ft1 t1
    Output: c1, c2, c3, c4, c5, c6, c7, c8
-   Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" WHERE (("C 1" IS NOT NULL))
+   Remote SQL: SELECT "C 1" a10, c2 a11, c3 a12, c4 a13, c5 a14, c6 a15, c7 a16, c8 a17 FROM "S 1"."T 1" WHERE (("C 1" IS NOT NULL))
 (3 rows)
 
 EXPLAIN (VERBOSE, COSTS false) SELECT * FROM ft1 t1 WHERE round(abs(c1), 0) = 1; -- FuncExpr
-                                                     QUERY PLAN                                                      
----------------------------------------------------------------------------------------------------------------------
+                                                                     QUERY PLAN                                                                      
+-----------------------------------------------------------------------------------------------------------------------------------------------------
  Foreign Scan on public.ft1 t1
    Output: c1, c2, c3, c4, c5, c6, c7, c8
-   Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" WHERE ((round(abs("C 1"), 0) = 1::numeric))
+   Remote SQL: SELECT "C 1" a10, c2 a11, c3 a12, c4 a13, c5 a14, c6 a15, c7 a16, c8 a17 FROM "S 1"."T 1" WHERE ((round(abs("C 1"), 0) = 1::numeric))
 (3 rows)
 
 EXPLAIN (VERBOSE, COSTS false) SELECT * FROM ft1 t1 WHERE c1 = -c1;          -- OpExpr(l)
-                                             QUERY PLAN                                              
------------------------------------------------------------------------------------------------------
+                                                             QUERY PLAN                                                              
+-------------------------------------------------------------------------------------------------------------------------------------
  Foreign Scan on public.ft1 t1
    Output: c1, c2, c3, c4, c5, c6, c7, c8
-   Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" WHERE (("C 1" = (- "C 1")))
+   Remote SQL: SELECT "C 1" a10, c2 a11, c3 a12, c4 a13, c5 a14, c6 a15, c7 a16, c8 a17 FROM "S 1"."T 1" WHERE (("C 1" = (- "C 1")))
 (3 rows)
 
 EXPLAIN (VERBOSE, COSTS false) SELECT * FROM ft1 t1 WHERE 1 = c1!;           -- OpExpr(r)
-                                                QUERY PLAN                                                
-----------------------------------------------------------------------------------------------------------
+                                                                QUERY PLAN                                                                
+------------------------------------------------------------------------------------------------------------------------------------------
  Foreign Scan on public.ft1 t1
    Output: c1, c2, c3, c4, c5, c6, c7, c8
-   Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" WHERE ((1::numeric = ("C 1" !)))
+   Remote SQL: SELECT "C 1" a10, c2 a11, c3 a12, c4 a13, c5 a14, c6 a15, c7 a16, c8 a17 FROM "S 1"."T 1" WHERE ((1::numeric = ("C 1" !)))
 (3 rows)
 
 EXPLAIN (VERBOSE, COSTS false) SELECT * FROM ft1 t1 WHERE (c1 IS NOT NULL) IS DISTINCT FROM (c1 IS NOT NULL); -- DistinctExpr
-                                                                 QUERY PLAN                                                                 
---------------------------------------------------------------------------------------------------------------------------------------------
+                                                                                 QUERY PLAN                                                                                 
+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------
  Foreign Scan on public.ft1 t1
    Output: c1, c2, c3, c4, c5, c6, c7, c8
-   Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" WHERE ((("C 1" IS NOT NULL) IS DISTINCT FROM ("C 1" IS NOT NULL)))
+   Remote SQL: SELECT "C 1" a10, c2 a11, c3 a12, c4 a13, c5 a14, c6 a15, c7 a16, c8 a17 FROM "S 1"."T 1" WHERE ((("C 1" IS NOT NULL) IS DISTINCT FROM ("C 1" IS NOT NULL)))
 (3 rows)
 
 EXPLAIN (VERBOSE, COSTS false) SELECT * FROM ft1 t1 WHERE c1 = ANY(ARRAY[c2, 1, c1 + 0]); -- ScalarArrayOpExpr
-                                                        QUERY PLAN                                                         
----------------------------------------------------------------------------------------------------------------------------
+                                                                        QUERY PLAN                                                                         
+-----------------------------------------------------------------------------------------------------------------------------------------------------------
  Foreign Scan on public.ft1 t1
    Output: c1, c2, c3, c4, c5, c6, c7, c8
-   Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" WHERE (("C 1" = ANY (ARRAY[c2, 1, ("C 1" + 0)])))
+   Remote SQL: SELECT "C 1" a10, c2 a11, c3 a12, c4 a13, c5 a14, c6 a15, c7 a16, c8 a17 FROM "S 1"."T 1" WHERE (("C 1" = ANY (ARRAY[c2, 1, ("C 1" + 0)])))
 (3 rows)
 
 EXPLAIN (VERBOSE, COSTS false) SELECT * FROM ft1 t1 WHERE c1 = (ARRAY[c1,c2,3])[1]; -- ArrayRef
-                                                      QUERY PLAN                                                      
-----------------------------------------------------------------------------------------------------------------------
+                                                                      QUERY PLAN                                                                      
+------------------------------------------------------------------------------------------------------------------------------------------------------
  Foreign Scan on public.ft1 t1
    Output: c1, c2, c3, c4, c5, c6, c7, c8
-   Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" WHERE (("C 1" = ((ARRAY["C 1", c2, 3])[1])))
+   Remote SQL: SELECT "C 1" a10, c2 a11, c3 a12, c4 a13, c5 a14, c6 a15, c7 a16, c8 a17 FROM "S 1"."T 1" WHERE (("C 1" = ((ARRAY["C 1", c2, 3])[1])))
 (3 rows)
 
 EXPLAIN (VERBOSE, COSTS false) SELECT * FROM ft1 t1 WHERE c6 = E'foo''s\\bar';  -- check special chars
-                                                 QUERY PLAN                                                  
--------------------------------------------------------------------------------------------------------------
+                                                                 QUERY PLAN                                                                  
+---------------------------------------------------------------------------------------------------------------------------------------------
  Foreign Scan on public.ft1 t1
    Output: c1, c2, c3, c4, c5, c6, c7, c8
-   Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" WHERE ((c6 = E'foo''s\\bar'::text))
+   Remote SQL: SELECT "C 1" a10, c2 a11, c3 a12, c4 a13, c5 a14, c6 a15, c7 a16, c8 a17 FROM "S 1"."T 1" WHERE ((c6 = E'foo''s\\bar'::text))
 (3 rows)
 
 EXPLAIN (VERBOSE, COSTS false) SELECT * FROM ft1 t1 WHERE c8 = 'foo';  -- can't be sent to remote
-                               QUERY PLAN                                
--------------------------------------------------------------------------
+                                               QUERY PLAN                                                
+---------------------------------------------------------------------------------------------------------
  Foreign Scan on public.ft1 t1
    Output: c1, c2, c3, c4, c5, c6, c7, c8
    Filter: (t1.c8 = 'foo'::user_enum)
-   Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1"
+   Remote SQL: SELECT "C 1" a10, c2 a11, c3 a12, c4 a13, c5 a14, c6 a15, c7 a16, c8 a17 FROM "S 1"."T 1"
 (4 rows)
 
 -- parameterized remote path
 EXPLAIN (VERBOSE, COSTS false)
   SELECT * FROM ft2 a, ft2 b WHERE a.c1 = 47 AND b.c1 = a.c2;
-                                                 QUERY PLAN                                                  
--------------------------------------------------------------------------------------------------------------
- Nested Loop
+                                                                                                                                                                                                                                                                                     QUERY PLAN                                                                                                                                                                                                                                                                                      
+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+ Foreign Scan
    Output: a.c1, a.c2, a.c3, a.c4, a.c5, a.c6, a.c7, a.c8, b.c1, b.c2, b.c3, b.c4, b.c5, b.c6, b.c7, b.c8
-   ->  Foreign Scan on public.ft2 a
-         Output: a.c1, a.c2, a.c3, a.c4, a.c5, a.c6, a.c7, a.c8
-         Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" WHERE (("C 1" = 47))
-   ->  Foreign Scan on public.ft2 b
-         Output: b.c1, b.c2, b.c3, b.c4, b.c5, b.c6, b.c7, b.c8
-         Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" WHERE (($1::integer = "C 1"))
-(8 rows)
+   Relations: (public.ft2 a) INNER JOIN (public.ft2 b)
+   Remote SQL: SELECT l.a1, l.a2, l.a3, l.a4, l.a5, l.a6, l.a7, l.a8, r.a1, r.a2, r.a3, r.a4, r.a5, r.a6, r.a7, r.a8 FROM (SELECT l.a9, l.a10, l.a12, l.a13, l.a14, l.a15, l.a16, l.a17 FROM (SELECT "C 1" a9, c2 a10, c3 a12, c4 a13, c5 a14, c6 a15, c7 a16, c8 a17 FROM "S 1"."T 1" WHERE (("C 1" = 47))) l) l (a1, a2, a3, a4, a5, a6, a7, a8) INNER JOIN (SELECT r.a9, r.a10, r.a12, r.a13, r.a14, r.a15, r.a16, r.a17 FROM (SELECT "C 1" a9, c2 a10, c3 a12, c4 a13, c5 a14, c6 a15, c7 a16, c8 a17 FROM "S 1"."T 1") r) r (a1, a2, a3, a4, a5, a6, a7, a8) ON ((l.a2 = r.a1))
+(4 rows)
 
 SELECT * FROM ft2 a, ft2 b WHERE a.c1 = 47 AND b.c1 = a.c2;
  c1 | c2 |  c3   |              c4              |            c5            | c6 |     c7     | c8  | c1 | c2 |  c3   |              c4              |            c5            | c6 |     c7     | c8  
@@ -511,18 +545,18 @@ SELECT * FROM ft2 a, ft2 b WHERE a.c1 = 47 AND b.c1 = a.c2;
 EXPLAIN (VERBOSE, COSTS false)
   SELECT * FROM ft2 a, ft2 b
   WHERE a.c2 = 6 AND b.c1 = a.c1 AND a.c8 = 'foo' AND b.c7 = upper(a.c7);
-                                                 QUERY PLAN                                                  
--------------------------------------------------------------------------------------------------------------
+                                                                 QUERY PLAN                                                                 
+--------------------------------------------------------------------------------------------------------------------------------------------
  Nested Loop
    Output: a.c1, a.c2, a.c3, a.c4, a.c5, a.c6, a.c7, a.c8, b.c1, b.c2, b.c3, b.c4, b.c5, b.c6, b.c7, b.c8
    ->  Foreign Scan on public.ft2 a
          Output: a.c1, a.c2, a.c3, a.c4, a.c5, a.c6, a.c7, a.c8
          Filter: (a.c8 = 'foo'::user_enum)
-         Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" WHERE ((c2 = 6))
+         Remote SQL: SELECT "C 1" a9, c2 a10, c3 a12, c4 a13, c5 a14, c6 a15, c7 a16, c8 a17 FROM "S 1"."T 1" WHERE ((c2 = 6))
    ->  Foreign Scan on public.ft2 b
          Output: b.c1, b.c2, b.c3, b.c4, b.c5, b.c6, b.c7, b.c8
          Filter: (upper((a.c7)::text) = (b.c7)::text)
-         Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" WHERE (($1::integer = "C 1"))
+         Remote SQL: SELECT "C 1" a9, c2 a10, c3 a12, c4 a13, c5 a14, c6 a15, c7 a16, c8 a17 FROM "S 1"."T 1" WHERE (($1::integer = "C 1"))
 (10 rows)
 
 SELECT * FROM ft2 a, ft2 b
@@ -651,21 +685,597 @@ SELECT * FROM ft2 WHERE c1 = ANY (ARRAY(SELECT c1 FROM ft1 WHERE c1 < 5));
 (4 rows)
 
 -- ===================================================================
+-- JOIN queries
+-- ===================================================================
+-- join two tables
+EXPLAIN (COSTS false, VERBOSE)
+SELECT t1.c1, t2.c1 FROM ft1 t1 JOIN ft2 t2 ON (t1.c1 = t2.c1) ORDER BY t1.c3, t1.c1 OFFSET 100 LIMIT 10;
+                                                                                                               QUERY PLAN                                                                                                                
+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+ Limit
+   Output: t1.c1, t2.c1, t1.c3
+   ->  Sort
+         Output: t1.c1, t2.c1, t1.c3
+         Sort Key: t1.c3, t1.c1
+         ->  Foreign Scan
+               Output: t1.c1, t2.c1, t1.c3
+               Relations: (public.ft1 t1) INNER JOIN (public.ft2 t2)
+               Remote SQL: SELECT l.a1, l.a2, r.a1 FROM (SELECT l.a10, l.a12 FROM (SELECT "C 1" a10, c3 a12 FROM "S 1"."T 1") l) l (a1, a2) INNER JOIN (SELECT r.a9 FROM (SELECT "C 1" a9 FROM "S 1"."T 1") r) r (a1) ON ((l.a1 = r.a1))
+(9 rows)
+
+SELECT t1.c1, t2.c1 FROM ft1 t1 JOIN ft2 t2 ON (t1.c1 = t2.c1) ORDER BY t1.c3, t1.c1 OFFSET 100 LIMIT 10;
+ c1  | c1  
+-----+-----
+ 101 | 101
+ 102 | 102
+ 103 | 103
+ 104 | 104
+ 105 | 105
+ 106 | 106
+ 107 | 107
+ 108 | 108
+ 109 | 109
+ 110 | 110
+(10 rows)
+
+-- join three tables
+EXPLAIN (COSTS false, VERBOSE)
+SELECT t1.c1, t2.c2, t3.c3 FROM ft1 t1 JOIN ft2 t2 ON (t1.c1 = t2.c1) JOIN ft4 t3 ON (t3.c1 = t1.c1) ORDER BY t1.c3, t1.c1 OFFSET 10 LIMIT 10;
+                                                                                                                                                                                                              QUERY PLAN                                                                                                                                                                                                               
+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+ Limit
+   Output: t1.c1, t2.c2, t3.c3, t1.c3
+   ->  Sort
+         Output: t1.c1, t2.c2, t3.c3, t1.c3
+         Sort Key: t1.c3, t1.c1
+         ->  Foreign Scan
+               Output: t1.c1, t2.c2, t3.c3, t1.c3
+               Relations: ((public.ft1 t1) INNER JOIN (public.ft2 t2)) INNER JOIN (public.ft4 t3)
+               Remote SQL: SELECT l.a1, l.a2, l.a3, r.a1 FROM (SELECT l.a1, l.a2, r.a1, r.a2 FROM (SELECT l.a10, l.a12 FROM (SELECT "C 1" a10, c3 a12 FROM "S 1"."T 1") l) l (a1, a2) INNER JOIN (SELECT r.a10, r.a9 FROM (SELECT "C 1" a9, c2 a10 FROM "S 1"."T 1") r) r (a1, a2) ON ((l.a1 = r.a2))) l (a1, a2, a3, a4) INNER JOIN (SELECT r.a11, r.a9 FROM (SELECT c1 a9, c3 a11 FROM "S 1"."T 3") r) r (a1, a2) ON ((l.a1 = r.a2))
+(9 rows)
+
+SELECT t1.c1, t2.c2, t3.c3 FROM ft1 t1 JOIN ft2 t2 ON (t1.c1 = t2.c1) JOIN ft4 t3 ON (t3.c1 = t1.c1) ORDER BY t1.c3, t1.c1 OFFSET 10 LIMIT 10;
+ c1 | c2 |   c3   
+----+----+--------
+ 22 |  2 | AAA022
+ 24 |  4 | AAA024
+ 26 |  6 | AAA026
+ 28 |  8 | AAA028
+ 30 |  0 | AAA030
+ 32 |  2 | AAA032
+ 34 |  4 | AAA034
+ 36 |  6 | AAA036
+ 38 |  8 | AAA038
+ 40 |  0 | AAA040
+(10 rows)
+
+-- left outer join
+EXPLAIN (COSTS false, VERBOSE)
+SELECT t1.c1, t2.c1 FROM ft4 t1 LEFT JOIN ft5 t2 ON (t1.c1 = t2.c1) ORDER BY t1.c1, t2.c1 OFFSET 10 LIMIT 10;
+                                                                                              QUERY PLAN                                                                                               
+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+ Limit
+   Output: t1.c1, t2.c1
+   ->  Sort
+         Output: t1.c1, t2.c1
+         Sort Key: t1.c1, t2.c1
+         ->  Foreign Scan
+               Output: t1.c1, t2.c1
+               Relations: (public.ft4 t1) LEFT JOIN (public.ft5 t2)
+               Remote SQL: SELECT l.a1, r.a1 FROM (SELECT l.a9 FROM (SELECT c1 a9 FROM "S 1"."T 3") l) l (a1) LEFT JOIN (SELECT r.a9 FROM (SELECT c1 a9 FROM "S 1"."T 4") r) r (a1) ON ((l.a1 = r.a1))
+(9 rows)
+
+SELECT t1.c1, t2.c1 FROM ft4 t1 LEFT JOIN ft5 t2 ON (t1.c1 = t2.c1) ORDER BY t1.c1, t2.c1 OFFSET 10 LIMIT 10;
+ c1 | c1 
+----+----
+ 22 |   
+ 24 | 24
+ 26 |   
+ 28 |   
+ 30 | 30
+ 32 |   
+ 34 |   
+ 36 | 36
+ 38 |   
+ 40 |   
+(10 rows)
+
+-- right outer join
+SET enable_mergejoin = off; -- planner choose MergeJoin even it has higher costs, so disable it for testing.
+EXPLAIN (COSTS false, VERBOSE)
+SELECT t1.c1, t2.c1 FROM ft4 t1 RIGHT JOIN ft5 t2 ON (t1.c1 = t2.c1) ORDER BY t2.c1, t2.c1 OFFSET 10 LIMIT 10;
+                                                                                              QUERY PLAN                                                                                               
+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+ Limit
+   Output: t1.c1, t2.c1
+   ->  Sort
+         Output: t1.c1, t2.c1
+         Sort Key: t2.c1
+         ->  Foreign Scan
+               Output: t1.c1, t2.c1
+               Relations: (public.ft5 t2) LEFT JOIN (public.ft4 t1)
+               Remote SQL: SELECT l.a1, r.a1 FROM (SELECT l.a9 FROM (SELECT c1 a9 FROM "S 1"."T 4") l) l (a1) LEFT JOIN (SELECT r.a9 FROM (SELECT c1 a9 FROM "S 1"."T 3") r) r (a1) ON ((r.a1 = l.a1))
+(9 rows)
+
+SELECT t1.c1, t2.c1 FROM ft4 t1 RIGHT JOIN ft5 t2 ON (t1.c1 = t2.c1) ORDER BY t2.c1, t2.c1 OFFSET 10 LIMIT 10;
+ c1 | c1 
+----+----
+    | 33
+ 36 | 36
+    | 39
+ 42 | 42
+    | 45
+ 48 | 48
+    | 51
+ 54 | 54
+    | 57
+ 60 | 60
+(10 rows)
+
+SET enable_mergejoin = on;
+-- full outer join
+EXPLAIN (COSTS false, VERBOSE)
+SELECT t1.c1, t2.c1 FROM ft4 t1 FULL JOIN ft5 t2 ON (t1.c1 = t2.c1) ORDER BY t1.c1, t2.c1 OFFSET 45 LIMIT 10;
+                                                                                              QUERY PLAN                                                                                               
+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+ Limit
+   Output: t1.c1, t2.c1
+   ->  Sort
+         Output: t1.c1, t2.c1
+         Sort Key: t1.c1, t2.c1
+         ->  Foreign Scan
+               Output: t1.c1, t2.c1
+               Relations: (public.ft4 t1) FULL JOIN (public.ft5 t2)
+               Remote SQL: SELECT l.a1, r.a1 FROM (SELECT l.a9 FROM (SELECT c1 a9 FROM "S 1"."T 3") l) l (a1) FULL JOIN (SELECT r.a9 FROM (SELECT c1 a9 FROM "S 1"."T 4") r) r (a1) ON ((l.a1 = r.a1))
+(9 rows)
+
+SELECT t1.c1, t2.c1 FROM ft4 t1 FULL JOIN ft5 t2 ON (t1.c1 = t2.c1) ORDER BY t1.c1, t2.c1 OFFSET 45 LIMIT 10;
+ c1  | c1 
+-----+----
+  92 |   
+  94 |   
+  96 | 96
+  98 |   
+ 100 |   
+     |  3
+     |  9
+     | 15
+     | 21
+     | 27
+(10 rows)
+
+-- full outer join + WHERE clause, only matched rows
+EXPLAIN (COSTS false, VERBOSE)
+SELECT t1.c1, t2.c1 FROM ft4 t1 FULL JOIN ft5 t2 ON (t1.c1 = t2.c1) WHERE (t1.c1 = t2.c1 OR t1.c1 IS NULL) ORDER BY t1.c1, t2.c1 OFFSET 10 LIMIT 10;
+                                                                                                                   QUERY PLAN                                                                                                                    
+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+ Limit
+   Output: t1.c1, t2.c1
+   ->  Sort
+         Output: t1.c1, t2.c1
+         Sort Key: t1.c1, t2.c1
+         ->  Foreign Scan
+               Output: t1.c1, t2.c1
+               Relations: (public.ft4 t1) FULL JOIN (public.ft5 t2)
+               Remote SQL: SELECT l.a1, r.a1 FROM (SELECT l.a9 FROM (SELECT c1 a9 FROM "S 1"."T 3") l) l (a1) FULL JOIN (SELECT r.a9 FROM (SELECT c1 a9 FROM "S 1"."T 4") r) r (a1) ON ((l.a1 = r.a1)) WHERE (((l.a1 = r.a1) OR (l.a1 IS NULL)))
+(9 rows)
+
+SELECT t1.c1, t2.c1 FROM ft4 t1 FULL JOIN ft5 t2 ON (t1.c1 = t2.c1) WHERE (t1.c1 = t2.c1 OR t1.c1 IS NULL) ORDER BY t1.c1, t2.c1 OFFSET 10 LIMIT 10;
+ c1 | c1 
+----+----
+ 66 | 66
+ 72 | 72
+ 78 | 78
+ 84 | 84
+ 90 | 90
+ 96 | 96
+    |  3
+    |  9
+    | 15
+    | 21
+(10 rows)
+
+-- join at WHERE clause 
+SET enable_mergejoin = off; -- planner choose MergeJoin even it has higher costs, so disable it for testing.
+EXPLAIN (COSTS false, VERBOSE)
+SELECT t1.c1, t2.c1 FROM ft4 t1 LEFT JOIN ft5 t2 ON true WHERE (t1.c1 = t2.c1) ORDER BY t1.c1, t2.c1 OFFSET 10 LIMIT 10;
+                                                                                               QUERY PLAN                                                                                               
+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+ Limit
+   Output: t1.c1, t2.c1
+   ->  Sort
+         Output: t1.c1, t2.c1
+         Sort Key: t1.c1
+         ->  Foreign Scan
+               Output: t1.c1, t2.c1
+               Relations: (public.ft4 t1) INNER JOIN (public.ft5 t2)
+               Remote SQL: SELECT l.a1, r.a1 FROM (SELECT l.a9 FROM (SELECT c1 a9 FROM "S 1"."T 3") l) l (a1) INNER JOIN (SELECT r.a9 FROM (SELECT c1 a9 FROM "S 1"."T 4") r) r (a1) ON ((l.a1 = r.a1))
+(9 rows)
+
+SELECT t1.c1, t2.c1 FROM ft4 t1 LEFT JOIN ft5 t2 ON true WHERE (t1.c1 = t2.c1) ORDER BY t1.c1, t2.c1 OFFSET 10 LIMIT 10;
+ c1 | c1 
+----+----
+ 66 | 66
+ 72 | 72
+ 78 | 78
+ 84 | 84
+ 90 | 90
+ 96 | 96
+(6 rows)
+
+SET enable_mergejoin = on;
+-- join in CTE
+EXPLAIN (COSTS false, VERBOSE)
+WITH t (c1_1, c1_3, c2_1) AS (SELECT t1.c1, t1.c3, t2.c1 FROM ft1 t1 JOIN ft2 t2 ON (t1.c1 = t2.c1)) SELECT c1_1, c2_1 FROM t ORDER BY c1_3, c1_1 OFFSET 100 LIMIT 10;
+                                                                                                             QUERY PLAN                                                                                                              
+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+ Limit
+   Output: t.c1_1, t.c2_1, t.c1_3
+   CTE t
+     ->  Foreign Scan
+           Output: t1.c1, t1.c3, t2.c1
+           Relations: (public.ft1 t1) INNER JOIN (public.ft2 t2)
+           Remote SQL: SELECT l.a1, l.a2, r.a1 FROM (SELECT l.a10, l.a12 FROM (SELECT "C 1" a10, c3 a12 FROM "S 1"."T 1") l) l (a1, a2) INNER JOIN (SELECT r.a9 FROM (SELECT "C 1" a9 FROM "S 1"."T 1") r) r (a1) ON ((l.a1 = r.a1))
+   ->  Sort
+         Output: t.c1_1, t.c2_1, t.c1_3
+         Sort Key: t.c1_3, t.c1_1
+         ->  CTE Scan on t
+               Output: t.c1_1, t.c2_1, t.c1_3
+(12 rows)
+
+WITH t (c1_1, c1_3, c2_1) AS (SELECT t1.c1, t1.c3, t2.c1 FROM ft1 t1 JOIN ft2 t2 ON (t1.c1 = t2.c1)) SELECT c1_1, c2_1 FROM t ORDER BY c1_3, c1_1 OFFSET 100 LIMIT 10;
+ c1_1 | c2_1 
+------+------
+  101 |  101
+  102 |  102
+  103 |  103
+  104 |  104
+  105 |  105
+  106 |  106
+  107 |  107
+  108 |  108
+  109 |  109
+  110 |  110
+(10 rows)
+
+-- ctid with whole-row reference
+EXPLAIN (COSTS false, VERBOSE)
+SELECT t1.ctid, t1, t2, t1.c1 FROM ft1 t1 JOIN ft2 t2 ON (t1.c1 = t2.c1) ORDER BY t1.c3, t1.c1 OFFSET 100 LIMIT 10;
+                                                                                                                                                                                                                                                   QUERY PLAN                                                                                                                                                                                                                                                    
+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+ Limit
+   Output: t1.ctid, t1.*, t2.*, t1.c1, t1.c3
+   ->  Sort
+         Output: t1.ctid, t1.*, t2.*, t1.c1, t1.c3
+         Sort Key: t1.c3, t1.c1
+         ->  Foreign Scan
+               Output: t1.ctid, t1.*, t2.*, t1.c1, t1.c3
+               Relations: (public.ft1 t1) INNER JOIN (public.ft2 t2)
+               Remote SQL: SELECT l.a1, l.a2, l.a3, l.a4, r.a1 FROM (SELECT l.a7, ROW(l.a10, l.a11, l.a12, l.a13, l.a14, l.a15, l.a16, l.a17), l.a10, l.a12 FROM (SELECT "C 1" a10, c2 a11, c3 a12, c4 a13, c5 a14, c6 a15, c7 a16, c8 a17, ctid a7 FROM "S 1"."T 1") l) l (a1, a2, a3, a4) INNER JOIN (SELECT ROW(r.a9, r.a10, r.a12, r.a13, r.a14, r.a15, r.a16, r.a17), r.a9 FROM (SELECT "C 1" a9, c2 a10, c3 a12, c4 a13, c5 a14, c6 a15, c7 a16, c8 a17 FROM "S 1"."T 1") r) r (a1, a2) ON ((l.a3 = r.a2))
+(9 rows)
+
+SELECT t1.ctid, t1, t2, t1.c1 FROM ft1 t1 JOIN ft2 t2 ON (t1.c1 = t2.c1) ORDER BY t1.c3, t1.c1 OFFSET 100 LIMIT 10;
+  ctid  |                                             t1                                             |                                             t2                                             | c1  
+--------+--------------------------------------------------------------------------------------------+--------------------------------------------------------------------------------------------+-----
+ (1,4)  | (101,1,00101,"Fri Jan 02 00:00:00 1970 PST","Fri Jan 02 00:00:00 1970",1,"1         ",foo) | (101,1,00101,"Fri Jan 02 00:00:00 1970 PST","Fri Jan 02 00:00:00 1970",1,"1         ",foo) | 101
+ (1,5)  | (102,2,00102,"Sat Jan 03 00:00:00 1970 PST","Sat Jan 03 00:00:00 1970",2,"2         ",foo) | (102,2,00102,"Sat Jan 03 00:00:00 1970 PST","Sat Jan 03 00:00:00 1970",2,"2         ",foo) | 102
+ (1,6)  | (103,3,00103,"Sun Jan 04 00:00:00 1970 PST","Sun Jan 04 00:00:00 1970",3,"3         ",foo) | (103,3,00103,"Sun Jan 04 00:00:00 1970 PST","Sun Jan 04 00:00:00 1970",3,"3         ",foo) | 103
+ (1,7)  | (104,4,00104,"Mon Jan 05 00:00:00 1970 PST","Mon Jan 05 00:00:00 1970",4,"4         ",foo) | (104,4,00104,"Mon Jan 05 00:00:00 1970 PST","Mon Jan 05 00:00:00 1970",4,"4         ",foo) | 104
+ (1,8)  | (105,5,00105,"Tue Jan 06 00:00:00 1970 PST","Tue Jan 06 00:00:00 1970",5,"5         ",foo) | (105,5,00105,"Tue Jan 06 00:00:00 1970 PST","Tue Jan 06 00:00:00 1970",5,"5         ",foo) | 105
+ (1,9)  | (106,6,00106,"Wed Jan 07 00:00:00 1970 PST","Wed Jan 07 00:00:00 1970",6,"6         ",foo) | (106,6,00106,"Wed Jan 07 00:00:00 1970 PST","Wed Jan 07 00:00:00 1970",6,"6         ",foo) | 106
+ (1,10) | (107,7,00107,"Thu Jan 08 00:00:00 1970 PST","Thu Jan 08 00:00:00 1970",7,"7         ",foo) | (107,7,00107,"Thu Jan 08 00:00:00 1970 PST","Thu Jan 08 00:00:00 1970",7,"7         ",foo) | 107
+ (1,11) | (108,8,00108,"Fri Jan 09 00:00:00 1970 PST","Fri Jan 09 00:00:00 1970",8,"8         ",foo) | (108,8,00108,"Fri Jan 09 00:00:00 1970 PST","Fri Jan 09 00:00:00 1970",8,"8         ",foo) | 108
+ (1,12) | (109,9,00109,"Sat Jan 10 00:00:00 1970 PST","Sat Jan 10 00:00:00 1970",9,"9         ",foo) | (109,9,00109,"Sat Jan 10 00:00:00 1970 PST","Sat Jan 10 00:00:00 1970",9,"9         ",foo) | 109
+ (1,13) | (110,0,00110,"Sun Jan 11 00:00:00 1970 PST","Sun Jan 11 00:00:00 1970",0,"0         ",foo) | (110,0,00110,"Sun Jan 11 00:00:00 1970 PST","Sun Jan 11 00:00:00 1970",0,"0         ",foo) | 110
+(10 rows)
+
+-- partially unsafe to push down, not pushed down
+EXPLAIN (COSTS false, VERBOSE)
+SELECT t1.c1 FROM ft1 t1 JOIN ft2 t2 ON t2.c1 = t2.c1 JOIN ft4 t3 ON t2.c1 = t3.c1 ORDER BY t1.c1 OFFSET 10 LIMIT 10;
+                                                                                                               QUERY PLAN                                                                                                                
+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+ Limit
+   Output: t1.c1
+   ->  Sort
+         Output: t1.c1
+         Sort Key: t1.c1
+         ->  Nested Loop
+               Output: t1.c1
+               ->  Foreign Scan on public.ft1 t1
+                     Output: t1.c1
+                     Remote SQL: SELECT "C 1" a10 FROM "S 1"."T 1"
+               ->  Materialize
+                     ->  Foreign Scan
+                           Relations: (public.ft2 t2) INNER JOIN (public.ft4 t3)
+                           Remote SQL: SELECT NULL FROM (SELECT l.a9 FROM (SELECT "C 1" a9 FROM "S 1"."T 1" WHERE (("C 1" = "C 1"))) l) l (a1) INNER JOIN (SELECT r.a9 FROM (SELECT c1 a9 FROM "S 1"."T 3") r) r (a1) ON ((l.a1 = r.a1))
+(14 rows)
+
+SELECT t1.c1 FROM ft1 t1 JOIN ft2 t2 ON t2.c1 = t2.c1 JOIN ft4 t3 ON t2.c1 = t3.c1 ORDER BY t1.c1 OFFSET 10 LIMIT 10;
+ c1 
+----
+  1
+  1
+  1
+  1
+  1
+  1
+  1
+  1
+  1
+  1
+(10 rows)
+
+-- SEMI JOIN, not pushed down
+EXPLAIN (COSTS false, VERBOSE)
+SELECT t1.c1 FROM ft1 t1 WHERE EXISTS (SELECT 1 FROM ft2 t2 WHERE t1.c1 = t2.c1) ORDER BY t1.c1 OFFSET 100 LIMIT 10;
+                                  QUERY PLAN                                  
+------------------------------------------------------------------------------
+ Limit
+   Output: t1.c1
+   ->  Sort
+         Output: t1.c1
+         Sort Key: t1.c1
+         ->  Hash Join
+               Output: t1.c1
+               Hash Cond: (t1.c1 = t2.c1)
+               ->  Foreign Scan on public.ft1 t1
+                     Output: t1.c1
+                     Remote SQL: SELECT "C 1" a10 FROM "S 1"."T 1"
+               ->  Hash
+                     Output: t2.c1
+                     ->  HashAggregate
+                           Output: t2.c1
+                           Group Key: t2.c1
+                           ->  Foreign Scan on public.ft2 t2
+                                 Output: t2.c1
+                                 Remote SQL: SELECT "C 1" a9 FROM "S 1"."T 1"
+(19 rows)
+
+SELECT t1.c1 FROM ft1 t1 WHERE EXISTS (SELECT 1 FROM ft2 t2 WHERE t1.c1 = t2.c1) ORDER BY t1.c1 OFFSET 100 LIMIT 10;
+ c1  
+-----
+ 101
+ 102
+ 103
+ 104
+ 105
+ 106
+ 107
+ 108
+ 109
+ 110
+(10 rows)
+
+-- ANTI JOIN, not pushed down
+EXPLAIN (COSTS false, VERBOSE)
+SELECT t1.c1 FROM ft1 t1 WHERE NOT EXISTS (SELECT 1 FROM ft2 t2 WHERE t1.c1 = t2.c2) ORDER BY t1.c1 OFFSET 100 LIMIT 10;
+                              QUERY PLAN                              
+----------------------------------------------------------------------
+ Limit
+   Output: t1.c1
+   ->  Sort
+         Output: t1.c1
+         Sort Key: t1.c1
+         ->  Hash Anti Join
+               Output: t1.c1
+               Hash Cond: (t1.c1 = t2.c2)
+               ->  Foreign Scan on public.ft1 t1
+                     Output: t1.c1
+                     Remote SQL: SELECT "C 1" a10 FROM "S 1"."T 1"
+               ->  Hash
+                     Output: t2.c2
+                     ->  Foreign Scan on public.ft2 t2
+                           Output: t2.c2
+                           Remote SQL: SELECT c2 a10 FROM "S 1"."T 1"
+(16 rows)
+
+SELECT t1.c1 FROM ft1 t1 WHERE NOT EXISTS (SELECT 1 FROM ft2 t2 WHERE t1.c1 = t2.c2) ORDER BY t1.c1 OFFSET 100 LIMIT 10;
+ c1  
+-----
+ 110
+ 111
+ 112
+ 113
+ 114
+ 115
+ 116
+ 117
+ 118
+ 119
+(10 rows)
+
+-- CROSS JOIN, not pushed down
+EXPLAIN (COSTS false, VERBOSE)
+SELECT t1.c1, t2.c1 FROM ft1 t1 CROSS JOIN ft2 t2 ORDER BY t1.c1, t2.c1 OFFSET 100 LIMIT 10;
+                               QUERY PLAN                               
+------------------------------------------------------------------------
+ Limit
+   Output: t1.c1, t2.c1
+   ->  Sort
+         Output: t1.c1, t2.c1
+         Sort Key: t1.c1, t2.c1
+         ->  Nested Loop
+               Output: t1.c1, t2.c1
+               ->  Foreign Scan on public.ft1 t1
+                     Output: t1.c1
+                     Remote SQL: SELECT "C 1" a10 FROM "S 1"."T 1"
+               ->  Materialize
+                     Output: t2.c1
+                     ->  Foreign Scan on public.ft2 t2
+                           Output: t2.c1
+                           Remote SQL: SELECT "C 1" a9 FROM "S 1"."T 1"
+(15 rows)
+
+SELECT t1.c1, t2.c1 FROM ft1 t1 CROSS JOIN ft2 t2 ORDER BY t1.c1, t2.c1 OFFSET 100 LIMIT 10;
+ c1 | c1  
+----+-----
+  1 | 101
+  1 | 102
+  1 | 103
+  1 | 104
+  1 | 105
+  1 | 106
+  1 | 107
+  1 | 108
+  1 | 109
+  1 | 110
+(10 rows)
+
+-- different server
+EXPLAIN (COSTS false, VERBOSE)
+SELECT t1.c1, t2.c1 FROM ft5 t1 JOIN ft6 t2 ON (t1.c1 = t2.c1) ORDER BY t1.c1, t2.c1 OFFSET 100 LIMIT 10;
+                          QUERY PLAN                           
+---------------------------------------------------------------
+ Limit
+   Output: t1.c1, t2.c1
+   ->  Merge Join
+         Output: t1.c1, t2.c1
+         Merge Cond: (t1.c1 = t2.c1)
+         ->  Sort
+               Output: t1.c1
+               Sort Key: t1.c1
+               ->  Foreign Scan on public.ft5 t1
+                     Output: t1.c1
+                     Remote SQL: SELECT c1 a9 FROM "S 1"."T 4"
+         ->  Sort
+               Output: t2.c1
+               Sort Key: t2.c1
+               ->  Foreign Scan on public.ft6 t2
+                     Output: t2.c1
+                     Remote SQL: SELECT c1 a9 FROM "S 1"."T 4"
+(17 rows)
+
+SELECT t1.c1, t2.c1 FROM ft5 t1 JOIN ft6 t2 ON (t1.c1 = t2.c1) ORDER BY t1.c1, t2.c1 OFFSET 100 LIMIT 10;
+ c1 | c1 
+----+----
+(0 rows)
+
+-- different effective user for permission check
+EXPLAIN (COSTS false, VERBOSE)
+SELECT t1.c1, t2.c1 FROM ft5 t1 JOIN v_ft5 t2 ON (t1.c1 = t2.c1) ORDER BY t1.c1, t2.c1 OFFSET 100 LIMIT 10;
+                          QUERY PLAN                           
+---------------------------------------------------------------
+ Limit
+   Output: t1.c1, ft5.c1
+   ->  Merge Join
+         Output: t1.c1, ft5.c1
+         Merge Cond: (t1.c1 = ft5.c1)
+         ->  Sort
+               Output: t1.c1
+               Sort Key: t1.c1
+               ->  Foreign Scan on public.ft5 t1
+                     Output: t1.c1
+                     Remote SQL: SELECT c1 a9 FROM "S 1"."T 4"
+         ->  Sort
+               Output: ft5.c1
+               Sort Key: ft5.c1
+               ->  Foreign Scan on public.ft5
+                     Output: ft5.c1
+                     Remote SQL: SELECT c1 a9 FROM "S 1"."T 4"
+(17 rows)
+
+SELECT t1.c1, t2.c1 FROM ft5 t1 JOIN v_ft5 t2 ON (t1.c1 = t2.c1) ORDER BY t1.c1, t2.c1 OFFSET 100 LIMIT 10;
+ c1 | c1 
+----+----
+(0 rows)
+
+-- unsafe join conditions
+EXPLAIN (COSTS false, VERBOSE)
+SELECT t1.c1, t2.c1 FROM ft1 t1 JOIN ft2 t2 ON (t1.c8 = t2.c8) ORDER BY t1.c3, t1.c1 OFFSET 100 LIMIT 10;
+                                       QUERY PLAN                                        
+-----------------------------------------------------------------------------------------
+ Limit
+   Output: t1.c1, t2.c1, t1.c3
+   ->  Sort
+         Output: t1.c1, t2.c1, t1.c3
+         Sort Key: t1.c3, t1.c1
+         ->  Merge Join
+               Output: t1.c1, t2.c1, t1.c3
+               Merge Cond: (t1.c8 = t2.c8)
+               ->  Sort
+                     Output: t1.c1, t1.c3, t1.c8
+                     Sort Key: t1.c8
+                     ->  Foreign Scan on public.ft1 t1
+                           Output: t1.c1, t1.c3, t1.c8
+                           Remote SQL: SELECT "C 1" a10, c3 a12, c8 a17 FROM "S 1"."T 1"
+               ->  Sort
+                     Output: t2.c1, t2.c8
+                     Sort Key: t2.c8
+                     ->  Foreign Scan on public.ft2 t2
+                           Output: t2.c1, t2.c8
+                           Remote SQL: SELECT "C 1" a9, c8 a17 FROM "S 1"."T 1"
+(20 rows)
+
+SELECT t1.c1, t2.c1 FROM ft1 t1 JOIN ft2 t2 ON (t1.c8 = t2.c8) ORDER BY t1.c3, t1.c1 OFFSET 100 LIMIT 10;
+ c1 | c1  
+----+-----
+  1 | 102
+  1 | 103
+  1 | 104
+  1 | 105
+  1 | 106
+  1 | 107
+  1 | 108
+  1 | 109
+  1 | 110
+  1 |   1
+(10 rows)
+
+-- local filter (unsafe conditions on one side)
+EXPLAIN (COSTS false, VERBOSE)
+SELECT t1.c1, t2.c1 FROM ft1 t1 JOIN ft2 t2 ON (t1.c1 = t2.c1) WHERE t1.c8 = 'foo' ORDER BY t1.c3, t1.c1 OFFSET 100 LIMIT 10;
+                                       QUERY PLAN                                        
+-----------------------------------------------------------------------------------------
+ Limit
+   Output: t1.c1, t2.c1, t1.c3
+   ->  Sort
+         Output: t1.c1, t2.c1, t1.c3
+         Sort Key: t1.c3, t1.c1
+         ->  Hash Join
+               Output: t1.c1, t2.c1, t1.c3
+               Hash Cond: (t2.c1 = t1.c1)
+               ->  Foreign Scan on public.ft2 t2
+                     Output: t2.c1
+                     Remote SQL: SELECT "C 1" a9 FROM "S 1"."T 1"
+               ->  Hash
+                     Output: t1.c1, t1.c3
+                     ->  Foreign Scan on public.ft1 t1
+                           Output: t1.c1, t1.c3
+                           Filter: (t1.c8 = 'foo'::user_enum)
+                           Remote SQL: SELECT "C 1" a10, c3 a12, c8 a17 FROM "S 1"."T 1"
+(17 rows)
+
+SELECT t1.c1, t2.c1 FROM ft1 t1 JOIN ft2 t2 ON (t1.c1 = t2.c1) WHERE t1.c8 = 'foo' ORDER BY t1.c3, t1.c1 OFFSET 100 LIMIT 10;
+ c1  | c1  
+-----+-----
+ 101 | 101
+ 102 | 102
+ 103 | 103
+ 104 | 104
+ 105 | 105
+ 106 | 106
+ 107 | 107
+ 108 | 108
+ 109 | 109
+ 110 | 110
+(10 rows)
+
+-- ===================================================================
 -- parameterized queries
 -- ===================================================================
 -- simple join
 PREPARE st1(int, int) AS SELECT t1.c3, t2.c3 FROM ft1 t1, ft2 t2 WHERE t1.c1 = $1 AND t2.c1 = $2;
 EXPLAIN (VERBOSE, COSTS false) EXECUTE st1(1, 2);
-                             QUERY PLAN                             
---------------------------------------------------------------------
+                               QUERY PLAN                               
+------------------------------------------------------------------------
  Nested Loop
    Output: t1.c3, t2.c3
    ->  Foreign Scan on public.ft1 t1
          Output: t1.c3
-         Remote SQL: SELECT c3 FROM "S 1"."T 1" WHERE (("C 1" = 1))
+         Remote SQL: SELECT c3 a12 FROM "S 1"."T 1" WHERE (("C 1" = 1))
    ->  Foreign Scan on public.ft2 t2
          Output: t2.c3
-         Remote SQL: SELECT c3 FROM "S 1"."T 1" WHERE (("C 1" = 2))
+         Remote SQL: SELECT c3 a12 FROM "S 1"."T 1" WHERE (("C 1" = 2))
 (8 rows)
 
 EXECUTE st1(1, 1);
@@ -683,8 +1293,8 @@ EXECUTE st1(101, 101);
 -- subquery using stable function (can't be sent to remote)
 PREPARE st2(int) AS SELECT * FROM ft1 t1 WHERE t1.c1 < $2 AND t1.c3 IN (SELECT c3 FROM ft2 t2 WHERE c1 > $1 AND date(c4) = '1970-01-17'::date) ORDER BY c1;
 EXPLAIN (VERBOSE, COSTS false) EXECUTE st2(10, 20);
-                                                QUERY PLAN                                                
-----------------------------------------------------------------------------------------------------------
+                                                                QUERY PLAN                                                                
+------------------------------------------------------------------------------------------------------------------------------------------
  Sort
    Output: t1.c1, t1.c2, t1.c3, t1.c4, t1.c5, t1.c6, t1.c7, t1.c8
    Sort Key: t1.c1
@@ -693,13 +1303,13 @@ EXPLAIN (VERBOSE, COSTS false) EXECUTE st2(10, 20);
          Join Filter: (t1.c3 = t2.c3)
          ->  Foreign Scan on public.ft1 t1
                Output: t1.c1, t1.c2, t1.c3, t1.c4, t1.c5, t1.c6, t1.c7, t1.c8
-               Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" WHERE (("C 1" < 20))
+               Remote SQL: SELECT "C 1" a10, c2 a11, c3 a12, c4 a13, c5 a14, c6 a15, c7 a16, c8 a17 FROM "S 1"."T 1" WHERE (("C 1" < 20))
          ->  Materialize
                Output: t2.c3
                ->  Foreign Scan on public.ft2 t2
                      Output: t2.c3
                      Filter: (date(t2.c4) = '01-17-1970'::date)
-                     Remote SQL: SELECT c3, c4 FROM "S 1"."T 1" WHERE (("C 1" > 10))
+                     Remote SQL: SELECT c3 a12, c4 a13 FROM "S 1"."T 1" WHERE (("C 1" > 10))
 (15 rows)
 
 EXECUTE st2(10, 20);
@@ -717,8 +1327,8 @@ EXECUTE st2(101, 121);
 -- subquery using immutable function (can be sent to remote)
 PREPARE st3(int) AS SELECT * FROM ft1 t1 WHERE t1.c1 < $2 AND t1.c3 IN (SELECT c3 FROM ft2 t2 WHERE c1 > $1 AND date(c5) = '1970-01-17'::date) ORDER BY c1;
 EXPLAIN (VERBOSE, COSTS false) EXECUTE st3(10, 20);
-                                                      QUERY PLAN                                                       
------------------------------------------------------------------------------------------------------------------------
+                                                                QUERY PLAN                                                                
+------------------------------------------------------------------------------------------------------------------------------------------
  Sort
    Output: t1.c1, t1.c2, t1.c3, t1.c4, t1.c5, t1.c6, t1.c7, t1.c8
    Sort Key: t1.c1
@@ -727,12 +1337,12 @@ EXPLAIN (VERBOSE, COSTS false) EXECUTE st3(10, 20);
          Join Filter: (t1.c3 = t2.c3)
          ->  Foreign Scan on public.ft1 t1
                Output: t1.c1, t1.c2, t1.c3, t1.c4, t1.c5, t1.c6, t1.c7, t1.c8
-               Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" WHERE (("C 1" < 20))
+               Remote SQL: SELECT "C 1" a10, c2 a11, c3 a12, c4 a13, c5 a14, c6 a15, c7 a16, c8 a17 FROM "S 1"."T 1" WHERE (("C 1" < 20))
          ->  Materialize
                Output: t2.c3
                ->  Foreign Scan on public.ft2 t2
                      Output: t2.c3
-                     Remote SQL: SELECT c3 FROM "S 1"."T 1" WHERE (("C 1" > 10)) AND ((date(c5) = '1970-01-17'::date))
+                     Remote SQL: SELECT c3 a12 FROM "S 1"."T 1" WHERE (("C 1" > 10)) AND ((date(c5) = '1970-01-17'::date))
 (14 rows)
 
 EXECUTE st3(10, 20);
@@ -749,108 +1359,108 @@ EXECUTE st3(20, 30);
 -- custom plan should be chosen initially
 PREPARE st4(int) AS SELECT * FROM ft1 t1 WHERE t1.c1 = $1;
 EXPLAIN (VERBOSE, COSTS false) EXECUTE st4(1);
-                                         QUERY PLAN                                          
----------------------------------------------------------------------------------------------
+                                                         QUERY PLAN                                                          
+-----------------------------------------------------------------------------------------------------------------------------
  Foreign Scan on public.ft1 t1
    Output: c1, c2, c3, c4, c5, c6, c7, c8
-   Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" WHERE (("C 1" = 1))
+   Remote SQL: SELECT "C 1" a10, c2 a11, c3 a12, c4 a13, c5 a14, c6 a15, c7 a16, c8 a17 FROM "S 1"."T 1" WHERE (("C 1" = 1))
 (3 rows)
 
 EXPLAIN (VERBOSE, COSTS false) EXECUTE st4(1);
-                                         QUERY PLAN                                          
----------------------------------------------------------------------------------------------
+                                                         QUERY PLAN                                                          
+-----------------------------------------------------------------------------------------------------------------------------
  Foreign Scan on public.ft1 t1
    Output: c1, c2, c3, c4, c5, c6, c7, c8
-   Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" WHERE (("C 1" = 1))
+   Remote SQL: SELECT "C 1" a10, c2 a11, c3 a12, c4 a13, c5 a14, c6 a15, c7 a16, c8 a17 FROM "S 1"."T 1" WHERE (("C 1" = 1))
 (3 rows)
 
 EXPLAIN (VERBOSE, COSTS false) EXECUTE st4(1);
-                                         QUERY PLAN                                          
----------------------------------------------------------------------------------------------
+                                                         QUERY PLAN                                                          
+-----------------------------------------------------------------------------------------------------------------------------
  Foreign Scan on public.ft1 t1
    Output: c1, c2, c3, c4, c5, c6, c7, c8
-   Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" WHERE (("C 1" = 1))
+   Remote SQL: SELECT "C 1" a10, c2 a11, c3 a12, c4 a13, c5 a14, c6 a15, c7 a16, c8 a17 FROM "S 1"."T 1" WHERE (("C 1" = 1))
 (3 rows)
 
 EXPLAIN (VERBOSE, COSTS false) EXECUTE st4(1);
-                                         QUERY PLAN                                          
----------------------------------------------------------------------------------------------
+                                                         QUERY PLAN                                                          
+-----------------------------------------------------------------------------------------------------------------------------
  Foreign Scan on public.ft1 t1
    Output: c1, c2, c3, c4, c5, c6, c7, c8
-   Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" WHERE (("C 1" = 1))
+   Remote SQL: SELECT "C 1" a10, c2 a11, c3 a12, c4 a13, c5 a14, c6 a15, c7 a16, c8 a17 FROM "S 1"."T 1" WHERE (("C 1" = 1))
 (3 rows)
 
 EXPLAIN (VERBOSE, COSTS false) EXECUTE st4(1);
-                                         QUERY PLAN                                          
----------------------------------------------------------------------------------------------
+                                                         QUERY PLAN                                                          
+-----------------------------------------------------------------------------------------------------------------------------
  Foreign Scan on public.ft1 t1
    Output: c1, c2, c3, c4, c5, c6, c7, c8
-   Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" WHERE (("C 1" = 1))
+   Remote SQL: SELECT "C 1" a10, c2 a11, c3 a12, c4 a13, c5 a14, c6 a15, c7 a16, c8 a17 FROM "S 1"."T 1" WHERE (("C 1" = 1))
 (3 rows)
 
 -- once we try it enough times, should switch to generic plan
 EXPLAIN (VERBOSE, COSTS false) EXECUTE st4(1);
-                                              QUERY PLAN                                               
--------------------------------------------------------------------------------------------------------
+                                                              QUERY PLAN                                                               
+---------------------------------------------------------------------------------------------------------------------------------------
  Foreign Scan on public.ft1 t1
    Output: c1, c2, c3, c4, c5, c6, c7, c8
-   Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" WHERE (("C 1" = $1::integer))
+   Remote SQL: SELECT "C 1" a10, c2 a11, c3 a12, c4 a13, c5 a14, c6 a15, c7 a16, c8 a17 FROM "S 1"."T 1" WHERE (("C 1" = $1::integer))
 (3 rows)
 
 -- value of $1 should not be sent to remote
 PREPARE st5(user_enum,int) AS SELECT * FROM ft1 t1 WHERE c8 = $1 and c1 = $2;
 EXPLAIN (VERBOSE, COSTS false) EXECUTE st5('foo', 1);
-                                         QUERY PLAN                                          
----------------------------------------------------------------------------------------------
+                                                         QUERY PLAN                                                          
+-----------------------------------------------------------------------------------------------------------------------------
  Foreign Scan on public.ft1 t1
    Output: c1, c2, c3, c4, c5, c6, c7, c8
    Filter: (t1.c8 = 'foo'::user_enum)
-   Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" WHERE (("C 1" = 1))
+   Remote SQL: SELECT "C 1" a10, c2 a11, c3 a12, c4 a13, c5 a14, c6 a15, c7 a16, c8 a17 FROM "S 1"."T 1" WHERE (("C 1" = 1))
 (4 rows)
 
 EXPLAIN (VERBOSE, COSTS false) EXECUTE st5('foo', 1);
-                                         QUERY PLAN                                          
----------------------------------------------------------------------------------------------
+                                                         QUERY PLAN                                                          
+-----------------------------------------------------------------------------------------------------------------------------
  Foreign Scan on public.ft1 t1
    Output: c1, c2, c3, c4, c5, c6, c7, c8
    Filter: (t1.c8 = 'foo'::user_enum)
-   Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" WHERE (("C 1" = 1))
+   Remote SQL: SELECT "C 1" a10, c2 a11, c3 a12, c4 a13, c5 a14, c6 a15, c7 a16, c8 a17 FROM "S 1"."T 1" WHERE (("C 1" = 1))
 (4 rows)
 
 EXPLAIN (VERBOSE, COSTS false) EXECUTE st5('foo', 1);
-                                         QUERY PLAN                                          
----------------------------------------------------------------------------------------------
+                                                         QUERY PLAN                                                          
+-----------------------------------------------------------------------------------------------------------------------------
  Foreign Scan on public.ft1 t1
    Output: c1, c2, c3, c4, c5, c6, c7, c8
    Filter: (t1.c8 = 'foo'::user_enum)
-   Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" WHERE (("C 1" = 1))
+   Remote SQL: SELECT "C 1" a10, c2 a11, c3 a12, c4 a13, c5 a14, c6 a15, c7 a16, c8 a17 FROM "S 1"."T 1" WHERE (("C 1" = 1))
 (4 rows)
 
 EXPLAIN (VERBOSE, COSTS false) EXECUTE st5('foo', 1);
-                                         QUERY PLAN                                          
----------------------------------------------------------------------------------------------
+                                                         QUERY PLAN                                                          
+-----------------------------------------------------------------------------------------------------------------------------
  Foreign Scan on public.ft1 t1
    Output: c1, c2, c3, c4, c5, c6, c7, c8
    Filter: (t1.c8 = 'foo'::user_enum)
-   Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" WHERE (("C 1" = 1))
+   Remote SQL: SELECT "C 1" a10, c2 a11, c3 a12, c4 a13, c5 a14, c6 a15, c7 a16, c8 a17 FROM "S 1"."T 1" WHERE (("C 1" = 1))
 (4 rows)
 
 EXPLAIN (VERBOSE, COSTS false) EXECUTE st5('foo', 1);
-                                         QUERY PLAN                                          
----------------------------------------------------------------------------------------------
+                                                         QUERY PLAN                                                          
+-----------------------------------------------------------------------------------------------------------------------------
  Foreign Scan on public.ft1 t1
    Output: c1, c2, c3, c4, c5, c6, c7, c8
    Filter: (t1.c8 = 'foo'::user_enum)
-   Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" WHERE (("C 1" = 1))
+   Remote SQL: SELECT "C 1" a10, c2 a11, c3 a12, c4 a13, c5 a14, c6 a15, c7 a16, c8 a17 FROM "S 1"."T 1" WHERE (("C 1" = 1))
 (4 rows)
 
 EXPLAIN (VERBOSE, COSTS false) EXECUTE st5('foo', 1);
-                                              QUERY PLAN                                               
--------------------------------------------------------------------------------------------------------
+                                                              QUERY PLAN                                                               
+---------------------------------------------------------------------------------------------------------------------------------------
  Foreign Scan on public.ft1 t1
    Output: c1, c2, c3, c4, c5, c6, c7, c8
    Filter: (t1.c8 = $1)
-   Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" WHERE (("C 1" = $1::integer))
+   Remote SQL: SELECT "C 1" a10, c2 a11, c3 a12, c4 a13, c5 a14, c6 a15, c7 a16, c8 a17 FROM "S 1"."T 1" WHERE (("C 1" = $1::integer))
 (4 rows)
 
 EXECUTE st5('foo', 1);
@@ -868,14 +1478,14 @@ DEALLOCATE st5;
 -- System columns, except ctid, should not be sent to remote
 EXPLAIN (VERBOSE, COSTS false)
 SELECT * FROM ft1 t1 WHERE t1.tableoid = 'pg_class'::regclass LIMIT 1;
-                                  QUERY PLAN                                   
--------------------------------------------------------------------------------
+                                                  QUERY PLAN                                                   
+---------------------------------------------------------------------------------------------------------------
  Limit
    Output: c1, c2, c3, c4, c5, c6, c7, c8
    ->  Foreign Scan on public.ft1 t1
          Output: c1, c2, c3, c4, c5, c6, c7, c8
          Filter: (t1.tableoid = '1259'::oid)
-         Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1"
+         Remote SQL: SELECT "C 1" a10, c2 a11, c3 a12, c4 a13, c5 a14, c6 a15, c7 a16, c8 a17 FROM "S 1"."T 1"
 (6 rows)
 
 SELECT * FROM ft1 t1 WHERE t1.tableoid = 'ft1'::regclass LIMIT 1;
@@ -886,13 +1496,13 @@ SELECT * FROM ft1 t1 WHERE t1.tableoid = 'ft1'::regclass LIMIT 1;
 
 EXPLAIN (VERBOSE, COSTS false)
 SELECT tableoid::regclass, * FROM ft1 t1 LIMIT 1;
-                                  QUERY PLAN                                   
--------------------------------------------------------------------------------
+                                                  QUERY PLAN                                                   
+---------------------------------------------------------------------------------------------------------------
  Limit
    Output: ((tableoid)::regclass), c1, c2, c3, c4, c5, c6, c7, c8
    ->  Foreign Scan on public.ft1 t1
          Output: (tableoid)::regclass, c1, c2, c3, c4, c5, c6, c7, c8
-         Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1"
+         Remote SQL: SELECT "C 1" a10, c2 a11, c3 a12, c4 a13, c5 a14, c6 a15, c7 a16, c8 a17 FROM "S 1"."T 1"
 (5 rows)
 
 SELECT tableoid::regclass, * FROM ft1 t1 LIMIT 1;
@@ -903,11 +1513,11 @@ SELECT tableoid::regclass, * FROM ft1 t1 LIMIT 1;
 
 EXPLAIN (VERBOSE, COSTS false)
 SELECT * FROM ft1 t1 WHERE t1.ctid = '(0,2)';
-                                              QUERY PLAN                                               
--------------------------------------------------------------------------------------------------------
+                                                              QUERY PLAN                                                               
+---------------------------------------------------------------------------------------------------------------------------------------
  Foreign Scan on public.ft1 t1
    Output: c1, c2, c3, c4, c5, c6, c7, c8
-   Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" WHERE ((ctid = '(0,2)'::tid))
+   Remote SQL: SELECT "C 1" a10, c2 a11, c3 a12, c4 a13, c5 a14, c6 a15, c7 a16, c8 a17 FROM "S 1"."T 1" WHERE ((ctid = '(0,2)'::tid))
 (3 rows)
 
 SELECT * FROM ft1 t1 WHERE t1.ctid = '(0,2)';
@@ -918,13 +1528,13 @@ SELECT * FROM ft1 t1 WHERE t1.ctid = '(0,2)';
 
 EXPLAIN (VERBOSE, COSTS false)
 SELECT ctid, * FROM ft1 t1 LIMIT 1;
-                                     QUERY PLAN                                      
--------------------------------------------------------------------------------------
+                                                       QUERY PLAN                                                       
+------------------------------------------------------------------------------------------------------------------------
  Limit
    Output: ctid, c1, c2, c3, c4, c5, c6, c7, c8
    ->  Foreign Scan on public.ft1 t1
          Output: ctid, c1, c2, c3, c4, c5, c6, c7, c8
-         Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8, ctid FROM "S 1"."T 1"
+         Remote SQL: SELECT "C 1" a10, c2 a11, c3 a12, c4 a13, c5 a14, c6 a15, c7 a16, c8 a17, ctid a7 FROM "S 1"."T 1"
 (5 rows)
 
 SELECT ctid, * FROM ft1 t1 LIMIT 1;
@@ -987,7 +1597,7 @@ FETCH c;
 SAVEPOINT s;
 SELECT * FROM ft1 WHERE 1 / (c1 - 1) > 0;  -- ERROR
 ERROR:  division by zero
-CONTEXT:  Remote SQL command: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" WHERE (((1 / ("C 1" - 1)) > 0))
+CONTEXT:  Remote SQL command: SELECT "C 1" a10, c2 a11, c3 a12, c4 a13, c5 a14, c6 a15, c7 a16, c8 a17 FROM "S 1"."T 1" WHERE (((1 / ("C 1" - 1)) > 0))
 ROLLBACK TO s;
 FETCH c;
  c1 | c2 |  c3   |              c4              |            c5            | c6 |     c7     | c8  
@@ -1010,64 +1620,64 @@ create foreign table ft3 (f1 text collate "C", f2 text)
   server loopback options (table_name 'loct3');
 -- can be sent to remote
 explain (verbose, costs off) select * from ft3 where f1 = 'foo';
-                                QUERY PLAN                                
---------------------------------------------------------------------------
+                                   QUERY PLAN                                    
+---------------------------------------------------------------------------------
  Foreign Scan on public.ft3
    Output: f1, f2
-   Remote SQL: SELECT f1, f2 FROM public.loct3 WHERE ((f1 = 'foo'::text))
+   Remote SQL: SELECT f1 a9, f2 a10 FROM public.loct3 WHERE ((f1 = 'foo'::text))
 (3 rows)
 
 explain (verbose, costs off) select * from ft3 where f1 COLLATE "C" = 'foo';
-                                QUERY PLAN                                
---------------------------------------------------------------------------
+                                   QUERY PLAN                                    
+---------------------------------------------------------------------------------
  Foreign Scan on public.ft3
    Output: f1, f2
-   Remote SQL: SELECT f1, f2 FROM public.loct3 WHERE ((f1 = 'foo'::text))
+   Remote SQL: SELECT f1 a9, f2 a10 FROM public.loct3 WHERE ((f1 = 'foo'::text))
 (3 rows)
 
 explain (verbose, costs off) select * from ft3 where f2 = 'foo';
-                                QUERY PLAN                                
---------------------------------------------------------------------------
+                                   QUERY PLAN                                    
+---------------------------------------------------------------------------------
  Foreign Scan on public.ft3
    Output: f1, f2
-   Remote SQL: SELECT f1, f2 FROM public.loct3 WHERE ((f2 = 'foo'::text))
+   Remote SQL: SELECT f1 a9, f2 a10 FROM public.loct3 WHERE ((f2 = 'foo'::text))
 (3 rows)
 
 -- can't be sent to remote
 explain (verbose, costs off) select * from ft3 where f1 COLLATE "POSIX" = 'foo';
-                  QUERY PLAN                   
------------------------------------------------
+                      QUERY PLAN                      
+------------------------------------------------------
  Foreign Scan on public.ft3
    Output: f1, f2
    Filter: ((ft3.f1)::text = 'foo'::text)
-   Remote SQL: SELECT f1, f2 FROM public.loct3
+   Remote SQL: SELECT f1 a9, f2 a10 FROM public.loct3
 (4 rows)
 
 explain (verbose, costs off) select * from ft3 where f1 = 'foo' COLLATE "C";
-                  QUERY PLAN                   
------------------------------------------------
+                      QUERY PLAN                      
+------------------------------------------------------
  Foreign Scan on public.ft3
    Output: f1, f2
    Filter: (ft3.f1 = 'foo'::text COLLATE "C")
-   Remote SQL: SELECT f1, f2 FROM public.loct3
+   Remote SQL: SELECT f1 a9, f2 a10 FROM public.loct3
 (4 rows)
 
 explain (verbose, costs off) select * from ft3 where f2 COLLATE "C" = 'foo';
-                  QUERY PLAN                   
------------------------------------------------
+                      QUERY PLAN                      
+------------------------------------------------------
  Foreign Scan on public.ft3
    Output: f1, f2
    Filter: ((ft3.f2)::text = 'foo'::text)
-   Remote SQL: SELECT f1, f2 FROM public.loct3
+   Remote SQL: SELECT f1 a9, f2 a10 FROM public.loct3
 (4 rows)
 
 explain (verbose, costs off) select * from ft3 where f2 = 'foo' COLLATE "C";
-                  QUERY PLAN                   
------------------------------------------------
+                      QUERY PLAN                      
+------------------------------------------------------
  Foreign Scan on public.ft3
    Output: f1, f2
    Filter: (ft3.f2 = 'foo'::text COLLATE "C")
-   Remote SQL: SELECT f1, f2 FROM public.loct3
+   Remote SQL: SELECT f1 a9, f2 a10 FROM public.loct3
 (4 rows)
 
 -- ===================================================================
@@ -1085,7 +1695,7 @@ INSERT INTO ft2 (c1,c2,c3) SELECT c1+1000,c2+100, c3 || c3 FROM ft2 LIMIT 20;
                Output: ((ft2_1.c1 + 1000)), ((ft2_1.c2 + 100)), ((ft2_1.c3 || ft2_1.c3))
                ->  Foreign Scan on public.ft2 ft2_1
                      Output: (ft2_1.c1 + 1000), (ft2_1.c2 + 100), (ft2_1.c3 || ft2_1.c3)
-                     Remote SQL: SELECT "C 1", c2, c3 FROM "S 1"."T 1"
+                     Remote SQL: SELECT "C 1" a9, c2 a10, c3 a12 FROM "S 1"."T 1"
 (9 rows)
 
 INSERT INTO ft2 (c1,c2,c3) SELECT c1+1000,c2+100, c3 || c3 FROM ft2 LIMIT 20;
@@ -1210,35 +1820,28 @@ UPDATE ft2 SET c2 = c2 + 400, c3 = c3 || '_update7' WHERE c1 % 10 = 7 RETURNING
 EXPLAIN (verbose, costs off)
 UPDATE ft2 SET c2 = ft2.c2 + 500, c3 = ft2.c3 || '_update9', c7 = DEFAULT
   FROM ft1 WHERE ft1.c1 = ft2.c2 AND ft1.c1 % 10 = 9;
-                                                                            QUERY PLAN                                                                             
--------------------------------------------------------------------------------------------------------------------------------------------------------------------
+                                                                                                                                                                                                                                                                       QUERY PLAN                                                                                                                                                                                                                                                                       
+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
  Update on public.ft2
    Remote SQL: UPDATE "S 1"."T 1" SET c2 = $2, c3 = $3, c7 = $4 WHERE ctid = $1
-   ->  Hash Join
+   ->  Foreign Scan
          Output: ft2.c1, (ft2.c2 + 500), NULL::integer, (ft2.c3 || '_update9'::text), ft2.c4, ft2.c5, ft2.c6, 'ft2       '::character(10), ft2.c8, ft2.ctid, ft1.*
-         Hash Cond: (ft2.c2 = ft1.c1)
-         ->  Foreign Scan on public.ft2
-               Output: ft2.c1, ft2.c2, ft2.c3, ft2.c4, ft2.c5, ft2.c6, ft2.c8, ft2.ctid
-               Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c8, ctid FROM "S 1"."T 1" FOR UPDATE
-         ->  Hash
-               Output: ft1.*, ft1.c1
-               ->  Foreign Scan on public.ft1
-                     Output: ft1.*, ft1.c1
-                     Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" WHERE ((("C 1" % 10) = 9))
-(13 rows)
+         Relations: (public.ft2) INNER JOIN (public.ft1)
+         Remote SQL: SELECT l.a1, l.a2, l.a3, l.a4, l.a5, l.a6, l.a7, l.a8, r.a1 FROM (SELECT l.a9, l.a10, l.a12, l.a13, l.a14, l.a15, l.a17, l.a7 FROM (SELECT "C 1" a9, c2 a10, c3 a12, c4 a13, c5 a14, c6 a15, c8 a17, ctid a7 FROM "S 1"."T 1" FOR UPDATE) l) l (a1, a2, a3, a4, a5, a6, a7, a8) INNER JOIN (SELECT ROW(r.a10, r.a11, r.a12, r.a13, r.a14, r.a15, r.a16, r.a17), r.a10 FROM (SELECT "C 1" a10, c2 a11, c3 a12, c4 a13, c5 a14, c6 a15, c7 a16, c8 a17 FROM "S 1"."T 1" WHERE ((("C 1" % 10) = 9))) r) r (a1, a2) ON ((l.a2 = r.a2))
+(6 rows)
 
 UPDATE ft2 SET c2 = ft2.c2 + 500, c3 = ft2.c3 || '_update9', c7 = DEFAULT
   FROM ft1 WHERE ft1.c1 = ft2.c2 AND ft1.c1 % 10 = 9;
 EXPLAIN (verbose, costs off)
   DELETE FROM ft2 WHERE c1 % 10 = 5 RETURNING c1, c4;
-                                       QUERY PLAN                                       
-----------------------------------------------------------------------------------------
+                                        QUERY PLAN                                         
+-------------------------------------------------------------------------------------------
  Delete on public.ft2
    Output: c1, c4
-   Remote SQL: DELETE FROM "S 1"."T 1" WHERE ctid = $1 RETURNING "C 1", c4
+   Remote SQL: DELETE FROM "S 1"."T 1" WHERE ctid = $1 RETURNING "C 1" a9, c4 a13
    ->  Foreign Scan on public.ft2
          Output: ctid
-         Remote SQL: SELECT ctid FROM "S 1"."T 1" WHERE ((("C 1" % 10) = 5)) FOR UPDATE
+         Remote SQL: SELECT ctid a7 FROM "S 1"."T 1" WHERE ((("C 1" % 10) = 5)) FOR UPDATE
 (6 rows)
 
 DELETE FROM ft2 WHERE c1 % 10 = 5 RETURNING c1, c4;
@@ -1351,22 +1954,15 @@ DELETE FROM ft2 WHERE c1 % 10 = 5 RETURNING c1, c4;
 
 EXPLAIN (verbose, costs off)
 DELETE FROM ft2 USING ft1 WHERE ft1.c1 = ft2.c2 AND ft1.c1 % 10 = 2;
-                                                      QUERY PLAN                                                      
-----------------------------------------------------------------------------------------------------------------------
+                                                                                                                                                                                        QUERY PLAN                                                                                                                                                                                         
+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
  Delete on public.ft2
    Remote SQL: DELETE FROM "S 1"."T 1" WHERE ctid = $1
-   ->  Hash Join
+   ->  Foreign Scan
          Output: ft2.ctid, ft1.*
-         Hash Cond: (ft2.c2 = ft1.c1)
-         ->  Foreign Scan on public.ft2
-               Output: ft2.ctid, ft2.c2
-               Remote SQL: SELECT c2, ctid FROM "S 1"."T 1" FOR UPDATE
-         ->  Hash
-               Output: ft1.*, ft1.c1
-               ->  Foreign Scan on public.ft1
-                     Output: ft1.*, ft1.c1
-                     Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" WHERE ((("C 1" % 10) = 2))
-(13 rows)
+         Relations: (public.ft2) INNER JOIN (public.ft1)
+         Remote SQL: SELECT l.a1, r.a1 FROM (SELECT l.a7, l.a10 FROM (SELECT c2 a10, ctid a7 FROM "S 1"."T 1" FOR UPDATE) l) l (a1, a2) INNER JOIN (SELECT ROW(r.a10, r.a11, r.a12, r.a13, r.a14, r.a15, r.a16, r.a17), r.a10 FROM (SELECT "C 1" a10, c2 a11, c3 a12, c4 a13, c5 a14, c6 a15, c7 a16, c8 a17 FROM "S 1"."T 1" WHERE ((("C 1" % 10) = 2))) r) r (a1, a2) ON ((l.a2 = r.a2))
+(6 rows)
 
 DELETE FROM ft2 USING ft1 WHERE ft1.c1 = ft2.c2 AND ft1.c1 % 10 = 2;
 SELECT c1,c2,c3,c4 FROM ft2 ORDER BY c1;
@@ -3027,386 +3623,6 @@ NOTICE:  NEW: (13,"test triggered !")
 (1 row)
 
 -- ===================================================================
--- test inheritance features
--- ===================================================================
-CREATE TABLE a (aa TEXT);
-CREATE TABLE loct (aa TEXT, bb TEXT);
-CREATE FOREIGN TABLE b (bb TEXT) INHERITS (a)
-  SERVER loopback OPTIONS (table_name 'loct');
-INSERT INTO a(aa) VALUES('aaa');
-INSERT INTO a(aa) VALUES('aaaa');
-INSERT INTO a(aa) VALUES('aaaaa');
-INSERT INTO b(aa) VALUES('bbb');
-INSERT INTO b(aa) VALUES('bbbb');
-INSERT INTO b(aa) VALUES('bbbbb');
-SELECT tableoid::regclass, * FROM a;
- tableoid |  aa   
-----------+-------
- a        | aaa
- a        | aaaa
- a        | aaaaa
- b        | bbb
- b        | bbbb
- b        | bbbbb
-(6 rows)
-
-SELECT tableoid::regclass, * FROM b;
- tableoid |  aa   | bb 
-----------+-------+----
- b        | bbb   | 
- b        | bbbb  | 
- b        | bbbbb | 
-(3 rows)
-
-SELECT tableoid::regclass, * FROM ONLY a;
- tableoid |  aa   
-----------+-------
- a        | aaa
- a        | aaaa
- a        | aaaaa
-(3 rows)
-
-UPDATE a SET aa = 'zzzzzz' WHERE aa LIKE 'aaaa%';
-SELECT tableoid::regclass, * FROM a;
- tableoid |   aa   
-----------+--------
- a        | aaa
- a        | zzzzzz
- a        | zzzzzz
- b        | bbb
- b        | bbbb
- b        | bbbbb
-(6 rows)
-
-SELECT tableoid::regclass, * FROM b;
- tableoid |  aa   | bb 
-----------+-------+----
- b        | bbb   | 
- b        | bbbb  | 
- b        | bbbbb | 
-(3 rows)
-
-SELECT tableoid::regclass, * FROM ONLY a;
- tableoid |   aa   
-----------+--------
- a        | aaa
- a        | zzzzzz
- a        | zzzzzz
-(3 rows)
-
-UPDATE b SET aa = 'new';
-SELECT tableoid::regclass, * FROM a;
- tableoid |   aa   
-----------+--------
- a        | aaa
- a        | zzzzzz
- a        | zzzzzz
- b        | new
- b        | new
- b        | new
-(6 rows)
-
-SELECT tableoid::regclass, * FROM b;
- tableoid | aa  | bb 
-----------+-----+----
- b        | new | 
- b        | new | 
- b        | new | 
-(3 rows)
-
-SELECT tableoid::regclass, * FROM ONLY a;
- tableoid |   aa   
-----------+--------
- a        | aaa
- a        | zzzzzz
- a        | zzzzzz
-(3 rows)
-
-UPDATE a SET aa = 'newtoo';
-SELECT tableoid::regclass, * FROM a;
- tableoid |   aa   
-----------+--------
- a        | newtoo
- a        | newtoo
- a        | newtoo
- b        | newtoo
- b        | newtoo
- b        | newtoo
-(6 rows)
-
-SELECT tableoid::regclass, * FROM b;
- tableoid |   aa   | bb 
-----------+--------+----
- b        | newtoo | 
- b        | newtoo | 
- b        | newtoo | 
-(3 rows)
-
-SELECT tableoid::regclass, * FROM ONLY a;
- tableoid |   aa   
-----------+--------
- a        | newtoo
- a        | newtoo
- a        | newtoo
-(3 rows)
-
-DELETE FROM a;
-SELECT tableoid::regclass, * FROM a;
- tableoid | aa 
-----------+----
-(0 rows)
-
-SELECT tableoid::regclass, * FROM b;
- tableoid | aa | bb 
-----------+----+----
-(0 rows)
-
-SELECT tableoid::regclass, * FROM ONLY a;
- tableoid | aa 
-----------+----
-(0 rows)
-
-DROP TABLE a CASCADE;
-NOTICE:  drop cascades to foreign table b
-DROP TABLE loct;
--- Check SELECT FOR UPDATE/SHARE with an inherited source table
-create table loct1 (f1 int, f2 int, f3 int);
-create table loct2 (f1 int, f2 int, f3 int);
-create table foo (f1 int, f2 int);
-create foreign table foo2 (f3 int) inherits (foo)
-  server loopback options (table_name 'loct1');
-create table bar (f1 int, f2 int);
-create foreign table bar2 (f3 int) inherits (bar)
-  server loopback options (table_name 'loct2');
-insert into foo values(1,1);
-insert into foo values(3,3);
-insert into foo2 values(2,2,2);
-insert into foo2 values(4,4,4);
-insert into bar values(1,11);
-insert into bar values(2,22);
-insert into bar values(6,66);
-insert into bar2 values(3,33,33);
-insert into bar2 values(4,44,44);
-insert into bar2 values(7,77,77);
-explain (verbose, costs off)
-select * from bar where f1 in (select f1 from foo) for update;
-                                          QUERY PLAN                                          
-----------------------------------------------------------------------------------------------
- LockRows
-   Output: bar.f1, bar.f2, bar.ctid, bar.tableoid, bar.*, foo.ctid, foo.tableoid, foo.*
-   ->  Hash Join
-         Output: bar.f1, bar.f2, bar.ctid, bar.tableoid, bar.*, foo.ctid, foo.tableoid, foo.*
-         Hash Cond: (bar.f1 = foo.f1)
-         ->  Append
-               ->  Seq Scan on public.bar
-                     Output: bar.f1, bar.f2, bar.ctid, bar.tableoid, bar.*
-               ->  Foreign Scan on public.bar2
-                     Output: bar2.f1, bar2.f2, bar2.ctid, bar2.tableoid, bar2.*
-                     Remote SQL: SELECT f1, f2, f3, ctid FROM public.loct2 FOR UPDATE
-         ->  Hash
-               Output: foo.ctid, foo.tableoid, foo.*, foo.f1
-               ->  HashAggregate
-                     Output: foo.ctid, foo.tableoid, foo.*, foo.f1
-                     Group Key: foo.f1
-                     ->  Append
-                           ->  Seq Scan on public.foo
-                                 Output: foo.ctid, foo.tableoid, foo.*, foo.f1
-                           ->  Foreign Scan on public.foo2
-                                 Output: foo2.ctid, foo2.tableoid, foo2.*, foo2.f1
-                                 Remote SQL: SELECT f1, f2, f3, ctid FROM public.loct1
-(22 rows)
-
-select * from bar where f1 in (select f1 from foo) for update;
- f1 | f2 
-----+----
-  1 | 11
-  2 | 22
-  3 | 33
-  4 | 44
-(4 rows)
-
-explain (verbose, costs off)
-select * from bar where f1 in (select f1 from foo) for share;
-                                          QUERY PLAN                                          
-----------------------------------------------------------------------------------------------
- LockRows
-   Output: bar.f1, bar.f2, bar.ctid, bar.tableoid, bar.*, foo.ctid, foo.tableoid, foo.*
-   ->  Hash Join
-         Output: bar.f1, bar.f2, bar.ctid, bar.tableoid, bar.*, foo.ctid, foo.tableoid, foo.*
-         Hash Cond: (bar.f1 = foo.f1)
-         ->  Append
-               ->  Seq Scan on public.bar
-                     Output: bar.f1, bar.f2, bar.ctid, bar.tableoid, bar.*
-               ->  Foreign Scan on public.bar2
-                     Output: bar2.f1, bar2.f2, bar2.ctid, bar2.tableoid, bar2.*
-                     Remote SQL: SELECT f1, f2, f3, ctid FROM public.loct2 FOR SHARE
-         ->  Hash
-               Output: foo.ctid, foo.tableoid, foo.*, foo.f1
-               ->  HashAggregate
-                     Output: foo.ctid, foo.tableoid, foo.*, foo.f1
-                     Group Key: foo.f1
-                     ->  Append
-                           ->  Seq Scan on public.foo
-                                 Output: foo.ctid, foo.tableoid, foo.*, foo.f1
-                           ->  Foreign Scan on public.foo2
-                                 Output: foo2.ctid, foo2.tableoid, foo2.*, foo2.f1
-                                 Remote SQL: SELECT f1, f2, f3, ctid FROM public.loct1
-(22 rows)
-
-select * from bar where f1 in (select f1 from foo) for share;
- f1 | f2 
-----+----
-  1 | 11
-  2 | 22
-  3 | 33
-  4 | 44
-(4 rows)
-
--- Check UPDATE with inherited target and an inherited source table
-explain (verbose, costs off)
-update bar set f2 = f2 + 100 where f1 in (select f1 from foo);
-                                         QUERY PLAN                                          
----------------------------------------------------------------------------------------------
- Update on public.bar
-   Update on public.bar
-   Foreign Update on public.bar2
-     Remote SQL: UPDATE public.loct2 SET f2 = $2 WHERE ctid = $1
-   ->  Hash Join
-         Output: bar.f1, (bar.f2 + 100), bar.ctid, foo.ctid, foo.tableoid, foo.*
-         Hash Cond: (bar.f1 = foo.f1)
-         ->  Seq Scan on public.bar
-               Output: bar.f1, bar.f2, bar.ctid
-         ->  Hash
-               Output: foo.ctid, foo.tableoid, foo.*, foo.f1
-               ->  HashAggregate
-                     Output: foo.ctid, foo.tableoid, foo.*, foo.f1
-                     Group Key: foo.f1
-                     ->  Append
-                           ->  Seq Scan on public.foo
-                                 Output: foo.ctid, foo.tableoid, foo.*, foo.f1
-                           ->  Foreign Scan on public.foo2
-                                 Output: foo2.ctid, foo2.tableoid, foo2.*, foo2.f1
-                                 Remote SQL: SELECT f1, f2, f3, ctid FROM public.loct1
-   ->  Hash Join
-         Output: bar2.f1, (bar2.f2 + 100), bar2.f3, bar2.ctid, foo.ctid, foo.tableoid, foo.*
-         Hash Cond: (bar2.f1 = foo.f1)
-         ->  Foreign Scan on public.bar2
-               Output: bar2.f1, bar2.f2, bar2.f3, bar2.ctid
-               Remote SQL: SELECT f1, f2, f3, ctid FROM public.loct2 FOR UPDATE
-         ->  Hash
-               Output: foo.ctid, foo.tableoid, foo.*, foo.f1
-               ->  HashAggregate
-                     Output: foo.ctid, foo.tableoid, foo.*, foo.f1
-                     Group Key: foo.f1
-                     ->  Append
-                           ->  Seq Scan on public.foo
-                                 Output: foo.ctid, foo.tableoid, foo.*, foo.f1
-                           ->  Foreign Scan on public.foo2
-                                 Output: foo2.ctid, foo2.tableoid, foo2.*, foo2.f1
-                                 Remote SQL: SELECT f1, f2, f3, ctid FROM public.loct1
-(37 rows)
-
-update bar set f2 = f2 + 100 where f1 in (select f1 from foo);
-select tableoid::regclass, * from bar order by 1,2;
- tableoid | f1 | f2  
-----------+----+-----
- bar      |  1 | 111
- bar      |  2 | 122
- bar      |  6 |  66
- bar2     |  3 | 133
- bar2     |  4 | 144
- bar2     |  7 |  77
-(6 rows)
-
--- Check UPDATE with inherited target and an appendrel subquery
-explain (verbose, costs off)
-update bar set f2 = f2 + 100
-from
-  ( select f1 from foo union all select f1+3 from foo ) ss
-where bar.f1 = ss.f1;
-                                      QUERY PLAN                                      
---------------------------------------------------------------------------------------
- Update on public.bar
-   Update on public.bar
-   Foreign Update on public.bar2
-     Remote SQL: UPDATE public.loct2 SET f2 = $2 WHERE ctid = $1
-   ->  Hash Join
-         Output: bar.f1, (bar.f2 + 100), bar.ctid, (ROW(foo.f1))
-         Hash Cond: (foo.f1 = bar.f1)
-         ->  Append
-               ->  Seq Scan on public.foo
-                     Output: ROW(foo.f1), foo.f1
-               ->  Foreign Scan on public.foo2
-                     Output: ROW(foo2.f1), foo2.f1
-                     Remote SQL: SELECT f1 FROM public.loct1
-               ->  Seq Scan on public.foo foo_1
-                     Output: ROW((foo_1.f1 + 3)), (foo_1.f1 + 3)
-               ->  Foreign Scan on public.foo2 foo2_1
-                     Output: ROW((foo2_1.f1 + 3)), (foo2_1.f1 + 3)
-                     Remote SQL: SELECT f1 FROM public.loct1
-         ->  Hash
-               Output: bar.f1, bar.f2, bar.ctid
-               ->  Seq Scan on public.bar
-                     Output: bar.f1, bar.f2, bar.ctid
-   ->  Merge Join
-         Output: bar2.f1, (bar2.f2 + 100), bar2.f3, bar2.ctid, (ROW(foo.f1))
-         Merge Cond: (bar2.f1 = foo.f1)
-         ->  Sort
-               Output: bar2.f1, bar2.f2, bar2.f3, bar2.ctid
-               Sort Key: bar2.f1
-               ->  Foreign Scan on public.bar2
-                     Output: bar2.f1, bar2.f2, bar2.f3, bar2.ctid
-                     Remote SQL: SELECT f1, f2, f3, ctid FROM public.loct2 FOR UPDATE
-         ->  Sort
-               Output: (ROW(foo.f1)), foo.f1
-               Sort Key: foo.f1
-               ->  Append
-                     ->  Seq Scan on public.foo
-                           Output: ROW(foo.f1), foo.f1
-                     ->  Foreign Scan on public.foo2
-                           Output: ROW(foo2.f1), foo2.f1
-                           Remote SQL: SELECT f1 FROM public.loct1
-                     ->  Seq Scan on public.foo foo_1
-                           Output: ROW((foo_1.f1 + 3)), (foo_1.f1 + 3)
-                     ->  Foreign Scan on public.foo2 foo2_1
-                           Output: ROW((foo2_1.f1 + 3)), (foo2_1.f1 + 3)
-                           Remote SQL: SELECT f1 FROM public.loct1
-(45 rows)
-
-update bar set f2 = f2 + 100
-from
-  ( select f1 from foo union all select f1+3 from foo ) ss
-where bar.f1 = ss.f1;
-select tableoid::regclass, * from bar order by 1,2;
- tableoid | f1 | f2  
-----------+----+-----
- bar      |  1 | 211
- bar      |  2 | 222
- bar      |  6 | 166
- bar2     |  3 | 233
- bar2     |  4 | 244
- bar2     |  7 | 177
-(6 rows)
-
--- Test that WHERE CURRENT OF is not supported
-begin;
-declare c cursor for select * from bar where f1 = 7;
-fetch from c;
- f1 | f2  
-----+-----
-  7 | 177
-(1 row)
-
-update bar set f2 = null where current of c;
-ERROR:  WHERE CURRENT OF is not supported for this table type
-rollback;
-drop table foo cascade;
-NOTICE:  drop cascades to foreign table foo2
-drop table bar cascade;
-NOTICE:  drop cascades to foreign table bar2
-drop table loct1;
-drop table loct2;
--- ===================================================================
 -- test IMPORT FOREIGN SCHEMA
 -- ===================================================================
 CREATE SCHEMA import_source;
@@ -3636,3 +3852,6 @@ QUERY:  CREATE FOREIGN TABLE t5 (
 OPTIONS (schema_name 'import_source', table_name 't5');
 CONTEXT:  importing foreign table "t5"
 ROLLBACK;
+-- Cleanup
+DROP OWNED BY view_owner;
+DROP USER view_owner;
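
A note for reading the expected-output changes above: the column aliases in the new Remote SQL lines (a9, a10, a12, ..., and a7 for ctid) look consistent with numbering each column as its attribute number minus FirstLowInvalidHeapAttributeNumber, so that system columns such as ctid (attno -1) also get a positive suffix. The gaps (c3 shown as a12 in ft2, "C 1" shown as a10 in ft1) would then presumably come from dropped columns in those foreign tables. This is inferred from the output only, not from the deparse code in this patch. The sketch below is just an illustration: column_alias() is a hypothetical helper, and the -8 constant is assumed to match the value in the 9.5-era headers.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Assumption: stands in for FirstLowInvalidHeapAttributeNumber, taken to be
 * -8 here.  Check access/sysattr.h in the tree you are building against.
 */
#define FIRST_LOW_INVALID_ATTNO (-8)

/*
 * Hypothetical helper (not part of the patch): write the alias that the
 * expected output appears to use for a column with attribute number attno.
 * User columns start at attno 1; ctid is attno -1.
 */
void
column_alias(int attno, char *buf, size_t buflen)
{
	snprintf(buf, buflen, "a%d", attno - FIRST_LOW_INVALID_ATTNO);
}
```

Under that assumption, attno 1 maps to a9 and ctid (attno -1) maps to a7, which matches the "C 1" a9 and ctid a7 pairs in the ft2 plans above.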
diff --git a/contrib/postgres_fdw/postgres_fdw.c b/contrib/postgres_fdw/postgres_fdw.c
index 478e124..de64627 100644
--- a/contrib/postgres_fdw/postgres_fdw.c
+++ b/contrib/postgres_fdw/postgres_fdw.c
@@ -28,7 +28,6 @@
 #include "optimizer/pathnode.h"
 #include "optimizer/paths.h"
 #include "optimizer/planmain.h"
-#include "optimizer/prep.h"
 #include "optimizer/restrictinfo.h"
 #include "optimizer/var.h"
 #include "parser/parsetree.h"
@@ -47,41 +46,8 @@ PG_MODULE_MAGIC;
 #define DEFAULT_FDW_TUPLE_COST		0.01
 
 /*
- * FDW-specific planner information kept in RelOptInfo.fdw_private for a
- * foreign table.  This information is collected by postgresGetForeignRelSize.
- */
-typedef struct PgFdwRelationInfo
-{
-	/* baserestrictinfo clauses, broken down into safe and unsafe subsets. */
-	List	   *remote_conds;
-	List	   *local_conds;
-
-	/* Bitmap of attr numbers we need to fetch from the remote server. */
-	Bitmapset  *attrs_used;
-
-	/* Cost and selectivity of local_conds. */
-	QualCost	local_conds_cost;
-	Selectivity local_conds_sel;
-
-	/* Estimated size and cost for a scan with baserestrictinfo quals. */
-	double		rows;
-	int			width;
-	Cost		startup_cost;
-	Cost		total_cost;
-
-	/* Options extracted from catalogs. */
-	bool		use_remote_estimate;
-	Cost		fdw_startup_cost;
-	Cost		fdw_tuple_cost;
-
-	/* Cached catalog information. */
-	ForeignTable *table;
-	ForeignServer *server;
-	UserMapping *user;			/* only set in use_remote_estimate mode */
-} PgFdwRelationInfo;
-
-/*
- * Indexes of FDW-private information stored in fdw_private lists.
+ * Indexes of FDW-private information stored in fdw_private of ForeignScan of
+ * a simple foreign table scan for a SELECT statement.
  *
  * We store various information in ForeignScan.fdw_private to pass it from
  * planner to executor.  Currently we store:
@@ -98,7 +64,13 @@ enum FdwScanPrivateIndex
 	/* SQL statement to execute remotely (as a String node) */
 	FdwScanPrivateSelectSql,
 	/* Integer list of attribute numbers retrieved by the SELECT */
-	FdwScanPrivateRetrievedAttrs
+	FdwScanPrivateRetrievedAttrs,
+	/* OID of the foreign server for the scan (as an Integer node) */
+	FdwScanPrivateServerOid,
+	/* OID of the effective user for the scan (as an Integer node) */
+	FdwScanPrivateUserOid,
+	/* Names of the relations scanned (as a String node); added only for a join */
+	FdwScanPrivateRelations,
 };
 
 /*
@@ -128,7 +100,8 @@ enum FdwModifyPrivateIndex
  */
 typedef struct PgFdwScanState
 {
-	Relation	rel;			/* relcache entry for the foreign table */
+	const char *relname;		/* name of relation being scanned */
+	TupleDesc	tupdesc;		/* tuple descriptor of the scan */
 	AttInMetadata *attinmeta;	/* attribute datatype conversion metadata */
 
 	/* extracted fdw_private data */
@@ -194,6 +167,8 @@ typedef struct PgFdwAnalyzeState
 	AttInMetadata *attinmeta;	/* attribute datatype conversion metadata */
 	List	   *retrieved_attrs;	/* attr numbers retrieved by query */
 
+	char	   *query;			/* text of SELECT command */
+
 	/* collected sample rows */
 	HeapTuple  *rows;			/* array of size targrows */
 	int			targrows;		/* target # of sample rows */
@@ -214,7 +189,10 @@ typedef struct PgFdwAnalyzeState
  */
 typedef struct ConversionLocation
 {
-	Relation	rel;			/* foreign table's relcache entry */
+	const char *relname;		/* name of relation being processed, or NULL for
+								   a foreign join */
+	const char *query;			/* query being processed */
+	TupleDesc	tupdesc;		/* tuple descriptor for attribute names */
 	AttrNumber	cur_attno;		/* attribute number being processed, or 0 */
 } ConversionLocation;
 
@@ -288,6 +266,12 @@ static bool postgresAnalyzeForeignTable(Relation relation,
 							BlockNumber *totalpages);
 static List *postgresImportForeignSchema(ImportForeignSchemaStmt *stmt,
 							Oid serverOid);
+static void postgresGetForeignJoinPaths(PlannerInfo *root,
+						   RelOptInfo *joinrel,
+						   RelOptInfo *outerrel,
+						   RelOptInfo *innerrel,
+						   SpecialJoinInfo *sjinfo,
+						   List *restrictlist);
 
 /*
  * Helper functions
@@ -323,12 +307,40 @@ static void analyze_row_processor(PGresult *res, int row,
 					  PgFdwAnalyzeState *astate);
 static HeapTuple make_tuple_from_result_row(PGresult *res,
 						   int row,
-						   Relation rel,
+						   const char *relname,
+						   const char *query,
+						   TupleDesc tupdesc,
 						   AttInMetadata *attinmeta,
 						   List *retrieved_attrs,
 						   MemoryContext temp_context);
 static void conversion_error_callback(void *arg);
 
+/*
+ * Describe a Bitmapset as a comma-separated list of integers.
+ * For debugging purposes only.
+ * XXX Could this be moved into bitmapset.c?
+ */
+static char *
+bms_to_str(Bitmapset *bmp)
+{
+	StringInfoData buf;
+	bool		first = true;
+	int			x;
+
+	initStringInfo(&buf);
+
+	x = -1;
+	while ((x = bms_next_member(bmp, x)) >= 0)
+	{
+		if (!first)
+			appendStringInfoString(&buf, ", ");
+		appendStringInfo(&buf, "%d", x);
+
+		first = false;
+	}
+
+	return buf.data;
+}
 
 /*
  * Foreign-data wrapper handler function: return a struct with pointers
@@ -368,6 +380,9 @@ postgres_fdw_handler(PG_FUNCTION_ARGS)
 	/* Support functions for IMPORT FOREIGN SCHEMA */
 	routine->ImportForeignSchema = postgresImportForeignSchema;
 
+	/* Support functions for join push-down */
+	routine->GetForeignJoinPaths = postgresGetForeignJoinPaths;
+
 	PG_RETURN_POINTER(routine);
 }
 
@@ -383,7 +398,9 @@ postgresGetForeignRelSize(PlannerInfo *root,
 						  RelOptInfo *baserel,
 						  Oid foreigntableid)
 {
+	RangeTblEntry *rte;
 	PgFdwRelationInfo *fpinfo;
+	ForeignTable *table;
 	ListCell   *lc;
 
 	/*
@@ -394,8 +411,8 @@ postgresGetForeignRelSize(PlannerInfo *root,
 	baserel->fdw_private = (void *) fpinfo;
 
 	/* Look up foreign-table catalog info. */
-	fpinfo->table = GetForeignTable(foreigntableid);
-	fpinfo->server = GetForeignServer(fpinfo->table->serverid);
+	table = GetForeignTable(foreigntableid);
+	fpinfo->server = GetForeignServer(table->serverid);
 
 	/*
 	 * Extract user-settable option values.  Note that per-table setting of
@@ -416,7 +433,7 @@ postgresGetForeignRelSize(PlannerInfo *root,
 		else if (strcmp(def->defname, "fdw_tuple_cost") == 0)
 			fpinfo->fdw_tuple_cost = strtod(defGetString(def), NULL);
 	}
-	foreach(lc, fpinfo->table->options)
+	foreach(lc, table->options)
 	{
 		DefElem    *def = (DefElem *) lfirst(lc);
 
@@ -428,20 +445,12 @@ postgresGetForeignRelSize(PlannerInfo *root,
 	}
 
 	/*
-	 * If the table or the server is configured to use remote estimates,
-	 * identify which user to do remote access as during planning.  This
+	 * Identify which user to do remote access as during planning.  This
 	 * should match what ExecCheckRTEPerms() does.  If we fail due to lack of
 	 * permissions, the query would have failed at runtime anyway.
 	 */
-	if (fpinfo->use_remote_estimate)
-	{
-		RangeTblEntry *rte = planner_rt_fetch(baserel->relid, root);
-		Oid			userid = rte->checkAsUser ? rte->checkAsUser : GetUserId();
-
-		fpinfo->user = GetUserMapping(userid, fpinfo->server->serverid);
-	}
-	else
-		fpinfo->user = NULL;
+	rte = planner_rt_fetch(baserel->relid, root);
+	fpinfo->userid = rte->checkAsUser ? rte->checkAsUser : GetUserId();
 
 	/*
 	 * Identify which baserestrictinfo clauses can be sent to the remote
@@ -463,10 +472,9 @@ postgresGetForeignRelSize(PlannerInfo *root,
 				   &fpinfo->attrs_used);
 	foreach(lc, fpinfo->local_conds)
 	{
-		RestrictInfo *rinfo = (RestrictInfo *) lfirst(lc);
+		Expr *expr = (Expr *) lfirst(lc);
 
-		pull_varattnos((Node *) rinfo->clause, baserel->relid,
-					   &fpinfo->attrs_used);
+		pull_varattnos((Node *) expr, baserel->relid, &fpinfo->attrs_used);
 	}
 
 	/*
@@ -752,6 +760,9 @@ postgresGetForeignPlan(PlannerInfo *root,
 	List	   *retrieved_attrs;
 	StringInfoData sql;
 	ListCell   *lc;
+	List	   *fdw_ps_tlist = NIL;
+	ForeignScan *scan;
+	StringInfoData relations;
 
 	/*
 	 * Separate the scan_clauses into those that can be executed remotely and
@@ -768,9 +779,6 @@ postgresGetForeignPlan(PlannerInfo *root,
 	 *
 	 * This code must match "extract_actual_clauses(scan_clauses, false)"
 	 * except for the additional decision about remote versus local execution.
-	 * Note however that we only strip the RestrictInfo nodes from the
-	 * local_exprs list, since appendWhereClause expects a list of
-	 * RestrictInfos.
 	 */
 	foreach(lc, scan_clauses)
 	{
@@ -783,82 +791,37 @@ postgresGetForeignPlan(PlannerInfo *root,
 			continue;
 
 		if (list_member_ptr(fpinfo->remote_conds, rinfo))
-			remote_conds = lappend(remote_conds, rinfo);
+			remote_conds = lappend(remote_conds, rinfo->clause);
 		else if (list_member_ptr(fpinfo->local_conds, rinfo))
 			local_exprs = lappend(local_exprs, rinfo->clause);
 		else if (is_foreign_expr(root, baserel, rinfo->clause))
-			remote_conds = lappend(remote_conds, rinfo);
+			remote_conds = lappend(remote_conds, rinfo->clause);
 		else
 			local_exprs = lappend(local_exprs, rinfo->clause);
 	}
 
 	/*
 	 * Build the query string to be sent for execution, and identify
-	 * expressions to be sent as parameters.
+	 * expressions to be sent as parameters.  If the relation to scan is a join
+	 * relation, also receive the relations string built by deparseSelectSql.
 	 */
 	initStringInfo(&sql);
-	deparseSelectSql(&sql, root, baserel, fpinfo->attrs_used,
-					 &retrieved_attrs);
-	if (remote_conds)
-		appendWhereClause(&sql, root, baserel, remote_conds,
-						  true, &params_list);
-
-	/*
-	 * Add FOR UPDATE/SHARE if appropriate.  We apply locking during the
-	 * initial row fetch, rather than later on as is done for local tables.
-	 * The extra roundtrips involved in trying to duplicate the local
-	 * semantics exactly don't seem worthwhile (see also comments for
-	 * RowMarkType).
-	 *
-	 * Note: because we actually run the query as a cursor, this assumes that
-	 * DECLARE CURSOR ... FOR UPDATE is supported, which it isn't before 8.3.
-	 */
-	if (baserel->relid == root->parse->resultRelation &&
-		(root->parse->commandType == CMD_UPDATE ||
-		 root->parse->commandType == CMD_DELETE))
-	{
-		/* Relation is UPDATE/DELETE target, so use FOR UPDATE */
-		appendStringInfoString(&sql, " FOR UPDATE");
-	}
-	else
-	{
-		PlanRowMark *rc = get_plan_rowmark(root->rowMarks, baserel->relid);
-
-		if (rc)
-		{
-			/*
-			 * Relation is specified as a FOR UPDATE/SHARE target, so handle
-			 * that.  (But we could also see LCS_NONE, meaning this isn't a
-			 * target relation after all.)
-			 *
-			 * For now, just ignore any [NO] KEY specification, since (a) it's
-			 * not clear what that means for a remote table that we don't have
-			 * complete information about, and (b) it wouldn't work anyway on
-			 * older remote servers.  Likewise, we don't worry about NOWAIT.
-			 */
-			switch (rc->strength)
-			{
-				case LCS_NONE:
-					/* No locking needed */
-					break;
-				case LCS_FORKEYSHARE:
-				case LCS_FORSHARE:
-					appendStringInfoString(&sql, " FOR SHARE");
-					break;
-				case LCS_FORNOKEYUPDATE:
-				case LCS_FORUPDATE:
-					appendStringInfoString(&sql, " FOR UPDATE");
-					break;
-			}
-		}
-	}
+	if (baserel->reloptkind == RELOPT_JOINREL)
+		initStringInfo(&relations);
+	deparseSelectSql(&sql, root, baserel, fpinfo->attrs_used, remote_conds,
+					 &params_list, &fdw_ps_tlist, &retrieved_attrs,
+					 baserel->reloptkind == RELOPT_JOINREL ? &relations : NULL);
 
 	/*
-	 * Build the fdw_private list that will be available to the executor.
+	 * Build the fdw_private list that will be available in the executor.
 	 * Items in the list must match enum FdwScanPrivateIndex, above.
 	 */
-	fdw_private = list_make2(makeString(sql.data),
-							 retrieved_attrs);
+	fdw_private = list_make4(makeString(sql.data),
+							 retrieved_attrs,
+							 makeInteger(fpinfo->server->serverid),
+							 makeInteger(fpinfo->userid));
+	if (baserel->reloptkind == RELOPT_JOINREL)
+		fdw_private = lappend(fdw_private, makeString(relations.data));
 
 	/*
 	 * Create the ForeignScan node from target list, local filtering
@@ -868,11 +831,18 @@ postgresGetForeignPlan(PlannerInfo *root,
 	 * field of the finished plan node; we can't keep them in private state
 	 * because then they wouldn't be subject to later planner processing.
 	 */
-	return make_foreignscan(tlist,
+	scan = make_foreignscan(tlist,
 							local_exprs,
 							scan_relid,
 							params_list,
 							fdw_private);
+
+	/*
+	 * Set fdw_ps_tlist to handle tuples generated by this scan.
+	 */
+	scan->fdw_ps_tlist = fdw_ps_tlist;
+
+	return scan;
 }
 
 /*
@@ -885,9 +855,8 @@ postgresBeginForeignScan(ForeignScanState *node, int eflags)
 	ForeignScan *fsplan = (ForeignScan *) node->ss.ps.plan;
 	EState	   *estate = node->ss.ps.state;
 	PgFdwScanState *fsstate;
-	RangeTblEntry *rte;
+	Oid			serverid;
 	Oid			userid;
-	ForeignTable *table;
 	ForeignServer *server;
 	UserMapping *user;
 	int			numParams;
@@ -907,22 +876,13 @@ postgresBeginForeignScan(ForeignScanState *node, int eflags)
 	node->fdw_state = (void *) fsstate;
 
 	/*
-	 * Identify which user to do the remote access as.  This should match what
-	 * ExecCheckRTEPerms() does.
-	 */
-	rte = rt_fetch(fsplan->scan.scanrelid, estate->es_range_table);
-	userid = rte->checkAsUser ? rte->checkAsUser : GetUserId();
-
-	/* Get info about foreign table. */
-	fsstate->rel = node->ss.ss_currentRelation;
-	table = GetForeignTable(RelationGetRelid(fsstate->rel));
-	server = GetForeignServer(table->serverid);
-	user = GetUserMapping(userid, server->serverid);
-
-	/*
 	 * Get connection to the foreign server.  Connection manager will
 	 * establish new connection if necessary.
 	 */
+	serverid = intVal(list_nth(fsplan->fdw_private, FdwScanPrivateServerOid));
+	userid = intVal(list_nth(fsplan->fdw_private, FdwScanPrivateUserOid));
+	server = GetForeignServer(serverid);
+	user = GetUserMapping(userid, server->serverid);
 	fsstate->conn = GetConnection(server, user, false);
 
 	/* Assign a unique ID for my cursor */
@@ -932,8 +892,8 @@ postgresBeginForeignScan(ForeignScanState *node, int eflags)
 	/* Get private info created by planner functions. */
 	fsstate->query = strVal(list_nth(fsplan->fdw_private,
 									 FdwScanPrivateSelectSql));
-	fsstate->retrieved_attrs = (List *) list_nth(fsplan->fdw_private,
-											   FdwScanPrivateRetrievedAttrs);
+	fsstate->retrieved_attrs = list_nth(fsplan->fdw_private,
+										FdwScanPrivateRetrievedAttrs);
 
 	/* Create contexts for batches of tuples and per-tuple temp workspace. */
 	fsstate->batch_cxt = AllocSetContextCreate(estate->es_query_cxt,
@@ -947,8 +907,18 @@ postgresBeginForeignScan(ForeignScanState *node, int eflags)
 											  ALLOCSET_SMALL_INITSIZE,
 											  ALLOCSET_SMALL_MAXSIZE);
 
-	/* Get info we'll need for input data conversion. */
-	fsstate->attinmeta = TupleDescGetAttInMetadata(RelationGetDescr(fsstate->rel));
+	/* Get info we'll need for input data conversion and error reports. */
+	if (fsplan->scan.scanrelid > 0)
+	{
+		fsstate->relname = RelationGetRelationName(node->ss.ss_currentRelation);
+		fsstate->tupdesc = RelationGetDescr(node->ss.ss_currentRelation);
+	}
+	else
+	{
+		fsstate->relname = NULL;
+		fsstate->tupdesc = node->ss.ss_ScanTupleSlot->tts_tupleDescriptor;
+	}
+	fsstate->attinmeta = TupleDescGetAttInMetadata(fsstate->tupdesc);
 
 	/* Prepare for output conversion of parameters used in remote query. */
 	numParams = list_length(fsplan->fdw_exprs);
@@ -1664,10 +1634,25 @@ postgresExplainForeignScan(ForeignScanState *node, ExplainState *es)
 {
 	List	   *fdw_private;
 	char	   *sql;
+	char	   *relations;
 
+	fdw_private = ((ForeignScan *) node->ss.ps.plan)->fdw_private;
+
+	/*
+	 * Add the names of the relations handled by the foreign scan when the
+	 * scan is a join.
+	 */
+	if (list_length(fdw_private) > FdwScanPrivateRelations)
+	{
+		relations = strVal(list_nth(fdw_private, FdwScanPrivateRelations));
+		ExplainPropertyText("Relations", relations, es);
+	}
+
+	/*
+	 * Add the remote query when the VERBOSE option is specified.
+	 */
 	if (es->verbose)
 	{
-		fdw_private = ((ForeignScan *) node->ss.ps.plan)->fdw_private;
 		sql = strVal(list_nth(fdw_private, FdwScanPrivateSelectSql));
 		ExplainPropertyText("Remote SQL", sql, es);
 	}
@@ -1726,10 +1711,12 @@ estimate_path_cost_size(PlannerInfo *root,
 	 */
 	if (fpinfo->use_remote_estimate)
 	{
+		List	   *remote_conds;
 		List	   *remote_join_conds;
 		List	   *local_join_conds;
-		StringInfoData sql;
 		List	   *retrieved_attrs;
+		StringInfoData sql;
+		UserMapping *user;
 		PGconn	   *conn;
 		Selectivity local_sel;
 		QualCost	local_cost;
@@ -1741,24 +1728,24 @@ estimate_path_cost_size(PlannerInfo *root,
 		classifyConditions(root, baserel, join_conds,
 						   &remote_join_conds, &local_join_conds);
 
+		remote_conds = copyObject(fpinfo->remote_conds);
+		remote_conds = list_concat(remote_conds, remote_join_conds);
+
 		/*
 		 * Construct EXPLAIN query including the desired SELECT, FROM, and
 		 * WHERE clauses.  Params and other-relation Vars are replaced by
 		 * dummy values.
+		 * We pass NULL for params_list and fdw_ps_tlist because they are
+		 * unnecessary for EXPLAIN.
 		 */
 		initStringInfo(&sql);
 		appendStringInfoString(&sql, "EXPLAIN ");
-		deparseSelectSql(&sql, root, baserel, fpinfo->attrs_used,
-						 &retrieved_attrs);
-		if (fpinfo->remote_conds)
-			appendWhereClause(&sql, root, baserel, fpinfo->remote_conds,
-							  true, NULL);
-		if (remote_join_conds)
-			appendWhereClause(&sql, root, baserel, remote_join_conds,
-							  (fpinfo->remote_conds == NIL), NULL);
+		deparseSelectSql(&sql, root, baserel, fpinfo->attrs_used, remote_conds,
+						 NULL, NULL, &retrieved_attrs, NULL);
 
 		/* Get the remote estimate */
-		conn = GetConnection(fpinfo->server, fpinfo->user, false);
+		user = GetUserMapping(fpinfo->userid, fpinfo->server->serverid);
+		conn = GetConnection(fpinfo->server, user, false);
 		get_remote_estimate(sql.data, conn, &rows, &width,
 							&startup_cost, &total_cost);
 		ReleaseConnection(conn);
@@ -2055,7 +2042,9 @@ fetch_more_data(ForeignScanState *node)
 		{
 			fsstate->tuples[i] =
 				make_tuple_from_result_row(res, i,
-										   fsstate->rel,
+										   fsstate->relname,
+										   fsstate->query,
+										   fsstate->tupdesc,
 										   fsstate->attinmeta,
 										   fsstate->retrieved_attrs,
 										   fsstate->temp_cxt);
@@ -2273,7 +2262,9 @@ store_returning_result(PgFdwModifyState *fmstate,
 		HeapTuple	newtup;
 
 		newtup = make_tuple_from_result_row(res, 0,
-											fmstate->rel,
+										RelationGetRelationName(fmstate->rel),
+											fmstate->query,
+											RelationGetDescr(fmstate->rel),
 											fmstate->attinmeta,
 											fmstate->retrieved_attrs,
 											fmstate->temp_cxt);
@@ -2423,6 +2414,7 @@ postgresAcquireSampleRowsFunc(Relation relation, int elevel,
 	initStringInfo(&sql);
 	appendStringInfo(&sql, "DECLARE c%u CURSOR FOR ", cursor_number);
 	deparseAnalyzeSql(&sql, relation, &astate.retrieved_attrs);
+	astate.query = sql.data;
 
 	/* In what follows, do not risk leaking any PGresults. */
 	PG_TRY();
@@ -2565,7 +2557,9 @@ analyze_row_processor(PGresult *res, int row, PgFdwAnalyzeState *astate)
 		oldcontext = MemoryContextSwitchTo(astate->anl_cxt);
 
 		astate->rows[pos] = make_tuple_from_result_row(res, row,
-													   astate->rel,
+										   RelationGetRelationName(astate->rel),
+													   astate->query,
+											   RelationGetDescr(astate->rel),
 													   astate->attinmeta,
 													 astate->retrieved_attrs,
 													   astate->temp_cxt);
@@ -2839,6 +2833,269 @@ postgresImportForeignSchema(ImportForeignSchemaStmt *stmt, Oid serverOid)
 }
 
 /*
+ * Construct PgFdwRelationInfo from two join sources
+ */
+static PgFdwRelationInfo *
+merge_fpinfo(RelOptInfo *outerrel,
+			 RelOptInfo *innerrel,
+			 JoinType jointype,
+			 double rows,
+			 int width)
+{
+	PgFdwRelationInfo *fpinfo_o;
+	PgFdwRelationInfo *fpinfo_i;
+	PgFdwRelationInfo *fpinfo;
+
+	fpinfo_o = (PgFdwRelationInfo *) outerrel->fdw_private;
+	fpinfo_i = (PgFdwRelationInfo *) innerrel->fdw_private;
+
+	fpinfo = (PgFdwRelationInfo *) palloc0(sizeof(PgFdwRelationInfo));
+
+	/* A join relation inherits the conditions of its source relations */
+	fpinfo->remote_conds = list_concat(copyObject(fpinfo_o->remote_conds),
+									   copyObject(fpinfo_i->remote_conds));
+	fpinfo->local_conds = list_concat(copyObject(fpinfo_o->local_conds),
+									  copyObject(fpinfo_i->local_conds));
+
+	/* attrs_used is only meaningful for a simple foreign table scan */
+	fpinfo->attrs_used = NULL;
+
+	/* rows and width are supplied by the caller */
+	fpinfo->rows = rows;
+	fpinfo->width = width;
+
+	/* Sum the local condition costs of the outer and inner relations. */
+	fpinfo->local_conds_cost.startup = fpinfo_o->local_conds_cost.startup +
+									   fpinfo_i->local_conds_cost.startup;
+	fpinfo->local_conds_cost.per_tuple = fpinfo_o->local_conds_cost.per_tuple +
+										 fpinfo_i->local_conds_cost.per_tuple;
+
+	/* Don't consider correlation between local filters. */
+	fpinfo->local_conds_sel = fpinfo_o->local_conds_sel *
+							  fpinfo_i->local_conds_sel;
+
+	fpinfo->use_remote_estimate = false;
+
+	/*
+	 * These two come from the default or a per-server setting, so outer and
+	 * inner must have the same value.
+	 */
+	fpinfo->fdw_startup_cost = fpinfo_o->fdw_startup_cost;
+	fpinfo->fdw_tuple_cost = fpinfo_o->fdw_tuple_cost;
+
+	/*
+	 * TODO estimate more accurately
+	 */
+	fpinfo->startup_cost = fpinfo->fdw_startup_cost +
+						   fpinfo->local_conds_cost.startup;
+	fpinfo->total_cost = fpinfo->startup_cost +
+						 (fpinfo->fdw_tuple_cost +
+						  fpinfo->local_conds_cost.per_tuple +
+						  cpu_tuple_cost) * fpinfo->rows;
+
+	/* server and userid are the same for outer and inner */
+	fpinfo->server = fpinfo_o->server;
+	fpinfo->userid = fpinfo_o->userid;
+
+	fpinfo->outerrel = outerrel;
+	fpinfo->innerrel = innerrel;
+	fpinfo->jointype = jointype;
+
+	/* joinclauses and otherclauses will be set later */
+
+	return fpinfo;
+}
+
+/*
+ * postgresGetForeignJoinPaths
+ *		Add possible ForeignPath to joinrel.
+ *
+ * Joins that satisfy all of the conditions below can be pushed down to the
+ * remote PostgreSQL server.
+ *
+ * 1) Join type is INNER or OUTER (one of LEFT/RIGHT/FULL)
+ * 2) Both outer and inner portions are safe to push-down
+ * 3) All foreign tables in the join belong to the same foreign server
+ * 4) All foreign tables are accessed as the same user
+ * 5) All join conditions are safe to push down
+ * 6) No relation has a local filter (this can be relaxed for INNER JOIN with
+ * no volatile function/operator, but for now we take the safer way)
+ */
+static void
+postgresGetForeignJoinPaths(PlannerInfo *root,
+							RelOptInfo *joinrel,
+							RelOptInfo *outerrel,
+							RelOptInfo *innerrel,
+							SpecialJoinInfo *sjinfo,
+							List *restrictlist)
+{
+	PgFdwRelationInfo *fpinfo;
+	PgFdwRelationInfo *fpinfo_o;
+	PgFdwRelationInfo *fpinfo_i;
+	JoinType		jointype = !sjinfo ? JOIN_INNER : sjinfo->jointype;
+	ForeignPath	   *joinpath;
+	double			rows;
+	Cost			startup_cost;
+	Cost			total_cost;
+
+	ListCell	   *lc;
+	List		   *joinclauses;
+	List		   *otherclauses;
+
+	/*
+	 * We support all outer joins in addition to inner join.  CROSS JOIN is
+	 * an INNER JOIN with no conditions internally, so it is checked later.
+	 */
+	if (jointype != JOIN_INNER && jointype != JOIN_LEFT &&
+		jointype != JOIN_RIGHT && jointype != JOIN_FULL)
+	{
+		ereport(DEBUG3, (errmsg("unsupported join type (SEMI, ANTI)")));
+		return;
+	}
+
+	/*
+	 * Having valid PgFdwRelationInfo in RelOptInfo#fdw_private indicates that
+	 * scanning against the relation can be pushed down.  If either of them
+	 * doesn't have PgFdwRelationInfo, give up pushing down this join relation.
+	 */
+	if (!outerrel->fdw_private)
+	{
+		ereport(DEBUG3, (errmsg("outer is not safe to push-down")));
+		return;
+	}
+	if (!innerrel->fdw_private)
+	{
+		ereport(DEBUG3, (errmsg("inner is not safe to push-down")));
+		return;
+	}
+	fpinfo_o = (PgFdwRelationInfo *) outerrel->fdw_private;
+	fpinfo_i = (PgFdwRelationInfo *) innerrel->fdw_private;
+
+	/*
+	 * All relations in the join must belong to the same server.  A valid
+	 * fdw_private means that every relation underneath belongs to the server
+	 * recorded in it, so all we need to do is compare the serverid of the
+	 * outer and inner relations.
+	 */
+	if (fpinfo_o->server->serverid != fpinfo_i->server->serverid)
+	{
+		ereport(DEBUG3, (errmsg("server mismatch")));
+		return;
+	}
+
+	/*
+	 * The effective userid of all source relations must be identical.
+	 * A valid fdw_private means that every relation underneath is accessed
+	 * as the same user, so all we need to do is compare the userid of the
+	 * outer and inner relations.
+	 */
+	if (fpinfo_o->userid != fpinfo_i->userid)
+	{
+		ereport(DEBUG3, (errmsg("userid mismatch")));
+		return;
+	}
+
+	/*
+	 * No source relation can have local conditions.  This can be relaxed
+	 * if the join is an inner join and local conditions don't contain
+	 * volatile function/operator, but as of now we leave it as future
+	 * enhancement.
+	 */
+	if (fpinfo_o->local_conds != NULL || fpinfo_i->local_conds != NULL)
+	{
+		ereport(DEBUG3, (errmsg("join with local filter")));
+		return;
+	}
+
+	/*
+	 * Separate restrictlist into two lists, join conditions and remote filters.
+	 */
+	joinclauses = restrictlist;
+	if (IS_OUTER_JOIN(jointype))
+	{
+		extract_actual_join_clauses(joinclauses, &joinclauses, &otherclauses);
+	}
+	else
+	{
+		joinclauses = extract_actual_clauses(joinclauses, false);
+		otherclauses = NIL;
+	}
+
+	/*
+	 * Note that CROSS JOIN (cartesian product) is transformed to JOIN_INNER
+	 * with empty joinclauses.  Pushing down a CROSS JOIN usually transfers
+	 * more rows than retrieving each table separately, so we don't push down
+	 * such joins.
+	 */
+	if (jointype == JOIN_INNER && joinclauses == NIL)
+	{
+		ereport(DEBUG3, (errmsg("unsupported join type (CROSS)")));
+		return;
+	}
+
+	/*
+	 * Join condition must be safe to push down.
+	 */
+	foreach(lc, joinclauses)
+	{
+		Expr *expr = (Expr *) lfirst(lc);
+
+		if (!is_foreign_expr(root, joinrel, expr))
+		{
+			ereport(DEBUG3, (errmsg("join quals contain unsafe conditions")));
+			return;
+		}
+	}
+
+	/*
+	 * Other conditions for the join must also be safe to push down.
+	 */
+	foreach(lc, otherclauses)
+	{
+		Expr *expr = (Expr *) lfirst(lc);
+
+		if (!is_foreign_expr(root, joinrel, expr))
+		{
+			ereport(DEBUG3, (errmsg("filter contains unsafe conditions")));
+			return;
+		}
+	}
+
+	/* Here we know that this join can be pushed down to the remote side. */
+
+	/* Construct fpinfo for the join relation */
+	fpinfo = merge_fpinfo(outerrel, innerrel, jointype, joinrel->rows, joinrel->width);
+	fpinfo->joinclauses = joinclauses;
+	fpinfo->otherclauses = otherclauses;
+	joinrel->fdw_private = fpinfo;
+
+	/* TODO determine more accurate cost and rows of the join. */
+	rows = joinrel->rows;
+	startup_cost = fpinfo->startup_cost;
+	total_cost = fpinfo->total_cost;
+
+	/*
+	 * Create a new join path and add it to the joinrel which represents a join
+	 * between foreign tables.
+	 */
+	joinpath = create_foreignscan_path(root,
+									   joinrel,
+									   rows,
+									   startup_cost,
+									   total_cost,
+									   NIL,		/* no pathkeys */
+									   NULL,	/* no required_outer */
+									   NIL);	/* no fdw_private */
+
+	/* Add generated path into joinrel by add_path(). */
+	add_path(joinrel, (Path *) joinpath);
+	elog(DEBUG3, "join path added for (%s) join (%s)",
+		 bms_to_str(outerrel->relids), bms_to_str(innerrel->relids));
+
+	/* TODO consider parameterized paths */
+}
+
+/*
  * Create a tuple from the specified row of the PGresult.
  *
  * rel is the local representation of the foreign table, attinmeta is
@@ -2849,13 +3106,14 @@ postgresImportForeignSchema(ImportForeignSchemaStmt *stmt, Oid serverOid)
 static HeapTuple
 make_tuple_from_result_row(PGresult *res,
 						   int row,
-						   Relation rel,
+						   const char *relname,
+						   const char *query,
+						   TupleDesc tupdesc,
 						   AttInMetadata *attinmeta,
 						   List *retrieved_attrs,
 						   MemoryContext temp_context)
 {
 	HeapTuple	tuple;
-	TupleDesc	tupdesc = RelationGetDescr(rel);
 	Datum	   *values;
 	bool	   *nulls;
 	ItemPointer ctid = NULL;
@@ -2882,7 +3140,9 @@ make_tuple_from_result_row(PGresult *res,
 	/*
 	 * Set up and install callback to report where conversion error occurs.
 	 */
-	errpos.rel = rel;
+	errpos.relname = relname;
+	errpos.query = query;
+	errpos.tupdesc = tupdesc;
 	errpos.cur_attno = 0;
 	errcallback.callback = conversion_error_callback;
 	errcallback.arg = (void *) &errpos;
@@ -2966,11 +3226,39 @@ make_tuple_from_result_row(PGresult *res,
 static void
 conversion_error_callback(void *arg)
 {
+	const char *attname;
+	const char *relname;
 	ConversionLocation *errpos = (ConversionLocation *) arg;
-	TupleDesc	tupdesc = RelationGetDescr(errpos->rel);
+	TupleDesc	tupdesc = errpos->tupdesc;
+	StringInfoData buf;
+
+	if (errpos->relname)
+	{
+		/* error occurred in a scan against a foreign table */
+		initStringInfo(&buf);
+		if (errpos->cur_attno > 0)
+			appendStringInfo(&buf, "column \"%s\"",
+					 NameStr(tupdesc->attrs[errpos->cur_attno - 1]->attname));
+		else if (errpos->cur_attno == SelfItemPointerAttributeNumber)
+			appendStringInfoString(&buf, "column \"ctid\"");
+		attname = buf.data;
+
+		initStringInfo(&buf);
+		appendStringInfo(&buf, "foreign table \"%s\"", errpos->relname);
+		relname = buf.data;
+	}
+	else
+	{
+		/* error occurred in a scan against a foreign join */
+		initStringInfo(&buf);
+		appendStringInfo(&buf, "column %d", errpos->cur_attno - 1);
+		attname = buf.data;
+
+		initStringInfo(&buf);
+		appendStringInfo(&buf, "foreign join \"%s\"", errpos->query);
+		relname = buf.data;
+	}
 
 	if (errpos->cur_attno > 0 && errpos->cur_attno <= tupdesc->natts)
-		errcontext("column \"%s\" of foreign table \"%s\"",
-				   NameStr(tupdesc->attrs[errpos->cur_attno - 1]->attname),
-				   RelationGetRelationName(errpos->rel));
+		errcontext("%s of %s", attname, relname);
 }
diff --git a/contrib/postgres_fdw/postgres_fdw.h b/contrib/postgres_fdw/postgres_fdw.h
index 950c6f7..d6b16d8 100644
--- a/contrib/postgres_fdw/postgres_fdw.h
+++ b/contrib/postgres_fdw/postgres_fdw.h
@@ -16,10 +16,52 @@
 #include "foreign/foreign.h"
 #include "lib/stringinfo.h"
 #include "nodes/relation.h"
+#include "nodes/plannodes.h"
 #include "utils/relcache.h"
 
 #include "libpq-fe.h"
 
+/*
+ * FDW-specific planner information kept in RelOptInfo.fdw_private for a
+ * foreign table or a foreign join.  This information is collected by
+ * postgresGetForeignRelSize, or calculated from join source relations.
+ */
+typedef struct PgFdwRelationInfo
+{
+	/* baserestrictinfo clauses, broken down into safe and unsafe subsets. */
+	List	   *remote_conds;
+	List	   *local_conds;
+
+	/* Bitmap of attr numbers we need to fetch from the remote server. */
+	Bitmapset  *attrs_used;
+
+	/* Cost and selectivity of local_conds. */
+	QualCost	local_conds_cost;
+	Selectivity local_conds_sel;
+
+	/* Estimated size and cost for a scan with baserestrictinfo quals. */
+	double		rows;
+	int			width;
+	Cost		startup_cost;
+	Cost		total_cost;
+
+	/* Options extracted from catalogs. */
+	bool		use_remote_estimate;
+	Cost		fdw_startup_cost;
+	Cost		fdw_tuple_cost;
+
+	/* Cached catalog information. */
+	ForeignServer *server;
+	Oid			userid;
+
+	/* Join information */
+	RelOptInfo *outerrel;
+	RelOptInfo *innerrel;
+	JoinType	jointype;
+	List	   *joinclauses;
+	List	   *otherclauses;
+} PgFdwRelationInfo;
+
 /* in postgres_fdw.c */
 extern int	set_transmission_modes(void);
 extern void reset_transmission_modes(int nestlevel);
@@ -51,13 +93,31 @@ extern void deparseSelectSql(StringInfo buf,
 				 PlannerInfo *root,
 				 RelOptInfo *baserel,
 				 Bitmapset *attrs_used,
-				 List **retrieved_attrs);
-extern void appendWhereClause(StringInfo buf,
+				 List *remote_conds,
+				 List **params_list,
+				 List **fdw_ps_tlist,
+				 List **retrieved_attrs,
+				 StringInfo relations);
+extern void appendConditions(StringInfo buf,
 				  PlannerInfo *root,
 				  RelOptInfo *baserel,
+				  List *outertlist,
+				  List *innertlist,
 				  List *exprs,
-				  bool is_first,
+				  const char *prefix,
 				  List **params);
+extern void deparseJoinSql(StringInfo sql,
+			   PlannerInfo *root,
+			   RelOptInfo *baserel,
+			   RelOptInfo *outerrel,
+			   RelOptInfo *innerrel,
+			   const char *sql_o,
+			   const char *sql_i,
+			   JoinType jointype,
+			   List *joinclauses,
+			   List *otherclauses,
+			   List **fdw_ps_tlist,
+			   List **retrieved_attrs);
 extern void deparseInsertSql(StringInfo buf, PlannerInfo *root,
 				 Index rtindex, Relation rel,
 				 List *targetAttrs, List *returningList,
diff --git a/contrib/postgres_fdw/sql/postgres_fdw.sql b/contrib/postgres_fdw/sql/postgres_fdw.sql
index 4a23457..b0c9a8d 100644
--- a/contrib/postgres_fdw/sql/postgres_fdw.sql
+++ b/contrib/postgres_fdw/sql/postgres_fdw.sql
@@ -11,12 +11,17 @@ DO $d$
             OPTIONS (dbname '$$||current_database()||$$',
                      port '$$||current_setting('port')||$$'
             )$$;
+        EXECUTE $$CREATE SERVER loopback2 FOREIGN DATA WRAPPER postgres_fdw
+            OPTIONS (dbname '$$||current_database()||$$',
+                     port '$$||current_setting('port')||$$'
+            )$$;
     END;
 $d$;
 
 CREATE USER MAPPING FOR public SERVER testserver1
 	OPTIONS (user 'value', password 'value');
 CREATE USER MAPPING FOR CURRENT_USER SERVER loopback;
+CREATE USER MAPPING FOR CURRENT_USER SERVER loopback2;
 
 -- ===================================================================
 -- create objects used through FDW loopback server
@@ -39,6 +44,18 @@ CREATE TABLE "S 1"."T 2" (
 	c2 text,
 	CONSTRAINT t2_pkey PRIMARY KEY (c1)
 );
+CREATE TABLE "S 1"."T 3" (
+	c1 int NOT NULL,
+	c2 int NOT NULL,
+	c3 text,
+	CONSTRAINT t3_pkey PRIMARY KEY (c1)
+);
+CREATE TABLE "S 1"."T 4" (
+	c1 int NOT NULL,
+	c2 int NOT NULL,
+	c4 text,
+	CONSTRAINT t4_pkey PRIMARY KEY (c1)
+);
 
 INSERT INTO "S 1"."T 1"
 	SELECT id,
@@ -54,9 +71,23 @@ INSERT INTO "S 1"."T 2"
 	SELECT id,
 	       'AAA' || to_char(id, 'FM000')
 	FROM generate_series(1, 100) id;
+INSERT INTO "S 1"."T 3"
+	SELECT id,
+	       id + 1,
+	       'AAA' || to_char(id, 'FM000')
+	FROM generate_series(1, 100) id;
+DELETE FROM "S 1"."T 3" WHERE c1 % 2 != 0;	-- delete for outer join tests
+INSERT INTO "S 1"."T 4"
+	SELECT id,
+	       id + 1,
+	       'AAA' || to_char(id, 'FM000')
+	FROM generate_series(1, 100) id;
+DELETE FROM "S 1"."T 4" WHERE c1 % 3 != 0;	-- delete for outer join tests
 
 ANALYZE "S 1"."T 1";
 ANALYZE "S 1"."T 2";
+ANALYZE "S 1"."T 3";
+ANALYZE "S 1"."T 4";
 
 -- ===================================================================
 -- create foreign tables
@@ -87,6 +118,29 @@ CREATE FOREIGN TABLE ft2 (
 ) SERVER loopback;
 ALTER FOREIGN TABLE ft2 DROP COLUMN cx;
 
+CREATE FOREIGN TABLE ft4 (
+	c1 int NOT NULL,
+	c2 int NOT NULL,
+	c3 text
+) SERVER loopback OPTIONS (schema_name 'S 1', table_name 'T 3');
+
+CREATE FOREIGN TABLE ft5 (
+	c1 int NOT NULL,
+	c2 int NOT NULL,
+	c3 text
+) SERVER loopback OPTIONS (schema_name 'S 1', table_name 'T 4');
+
+CREATE FOREIGN TABLE ft6 (
+	c1 int NOT NULL,
+	c2 int NOT NULL,
+	c3 text
+) SERVER loopback2 OPTIONS (schema_name 'S 1', table_name 'T 4');
+CREATE USER view_owner;
+GRANT ALL ON ft5 TO view_owner;
+CREATE VIEW v_ft5 AS SELECT * FROM ft5;
+ALTER VIEW v_ft5 OWNER TO view_owner;
+CREATE USER MAPPING FOR view_owner SERVER loopback;
+
 -- ===================================================================
 -- tests for validator
 -- ===================================================================
@@ -158,8 +212,6 @@ EXPLAIN (VERBOSE, COSTS false) SELECT * FROM ft1 t1 WHERE c1 = 102 FOR SHARE;
 SELECT * FROM ft1 t1 WHERE c1 = 102 FOR SHARE;
 -- aggregate
 SELECT COUNT(*) FROM ft1 t1;
--- join two tables
-SELECT t1.c1 FROM ft1 t1 JOIN ft2 t2 ON (t1.c1 = t2.c1) ORDER BY t1.c3, t1.c1 OFFSET 100 LIMIT 10;
 -- subquery
 SELECT * FROM ft1 t1 WHERE t1.c3 IN (SELECT c3 FROM ft2 t2 WHERE c1 <= 10) ORDER BY c1;
 -- subquery+MAX
@@ -216,6 +268,82 @@ SELECT * FROM ft1 WHERE c1 = ANY (ARRAY(SELECT c1 FROM ft2 WHERE c1 < 5));
 SELECT * FROM ft2 WHERE c1 = ANY (ARRAY(SELECT c1 FROM ft1 WHERE c1 < 5));
 
 -- ===================================================================
+-- JOIN queries
+-- ===================================================================
+-- join two tables
+EXPLAIN (COSTS false, VERBOSE)
+SELECT t1.c1, t2.c1 FROM ft1 t1 JOIN ft2 t2 ON (t1.c1 = t2.c1) ORDER BY t1.c3, t1.c1 OFFSET 100 LIMIT 10;
+SELECT t1.c1, t2.c1 FROM ft1 t1 JOIN ft2 t2 ON (t1.c1 = t2.c1) ORDER BY t1.c3, t1.c1 OFFSET 100 LIMIT 10;
+-- join three tables
+EXPLAIN (COSTS false, VERBOSE)
+SELECT t1.c1, t2.c2, t3.c3 FROM ft1 t1 JOIN ft2 t2 ON (t1.c1 = t2.c1) JOIN ft4 t3 ON (t3.c1 = t1.c1) ORDER BY t1.c3, t1.c1 OFFSET 10 LIMIT 10;
+SELECT t1.c1, t2.c2, t3.c3 FROM ft1 t1 JOIN ft2 t2 ON (t1.c1 = t2.c1) JOIN ft4 t3 ON (t3.c1 = t1.c1) ORDER BY t1.c3, t1.c1 OFFSET 10 LIMIT 10;
+-- left outer join
+EXPLAIN (COSTS false, VERBOSE)
+SELECT t1.c1, t2.c1 FROM ft4 t1 LEFT JOIN ft5 t2 ON (t1.c1 = t2.c1) ORDER BY t1.c1, t2.c1 OFFSET 10 LIMIT 10;
+SELECT t1.c1, t2.c1 FROM ft4 t1 LEFT JOIN ft5 t2 ON (t1.c1 = t2.c1) ORDER BY t1.c1, t2.c1 OFFSET 10 LIMIT 10;
+-- right outer join
+SET enable_mergejoin = off; -- planner choose MergeJoin even it has higher costs, so disable it for testing.
+EXPLAIN (COSTS false, VERBOSE)
+SELECT t1.c1, t2.c1 FROM ft4 t1 RIGHT JOIN ft5 t2 ON (t1.c1 = t2.c1) ORDER BY t2.c1, t2.c1 OFFSET 10 LIMIT 10;
+SELECT t1.c1, t2.c1 FROM ft4 t1 RIGHT JOIN ft5 t2 ON (t1.c1 = t2.c1) ORDER BY t2.c1, t2.c1 OFFSET 10 LIMIT 10;
+SET enable_mergejoin = on;
+-- full outer join
+EXPLAIN (COSTS false, VERBOSE)
+SELECT t1.c1, t2.c1 FROM ft4 t1 FULL JOIN ft5 t2 ON (t1.c1 = t2.c1) ORDER BY t1.c1, t2.c1 OFFSET 45 LIMIT 10;
+SELECT t1.c1, t2.c1 FROM ft4 t1 FULL JOIN ft5 t2 ON (t1.c1 = t2.c1) ORDER BY t1.c1, t2.c1 OFFSET 45 LIMIT 10;
+-- full outer join + WHERE clause, only matched rows
+EXPLAIN (COSTS false, VERBOSE)
+SELECT t1.c1, t2.c1 FROM ft4 t1 FULL JOIN ft5 t2 ON (t1.c1 = t2.c1) WHERE (t1.c1 = t2.c1 OR t1.c1 IS NULL) ORDER BY t1.c1, t2.c1 OFFSET 10 LIMIT 10;
+SELECT t1.c1, t2.c1 FROM ft4 t1 FULL JOIN ft5 t2 ON (t1.c1 = t2.c1) WHERE (t1.c1 = t2.c1 OR t1.c1 IS NULL) ORDER BY t1.c1, t2.c1 OFFSET 10 LIMIT 10;
+-- join at WHERE clause 
+SET enable_mergejoin = off; -- planner choose MergeJoin even it has higher costs, so disable it for testing.
+EXPLAIN (COSTS false, VERBOSE)
+SELECT t1.c1, t2.c1 FROM ft4 t1 LEFT JOIN ft5 t2 ON true WHERE (t1.c1 = t2.c1) ORDER BY t1.c1, t2.c1 OFFSET 10 LIMIT 10;
+SELECT t1.c1, t2.c1 FROM ft4 t1 LEFT JOIN ft5 t2 ON true WHERE (t1.c1 = t2.c1) ORDER BY t1.c1, t2.c1 OFFSET 10 LIMIT 10;
+SET enable_mergejoin = on;
+-- join in CTE
+EXPLAIN (COSTS false, VERBOSE)
+WITH t (c1_1, c1_3, c2_1) AS (SELECT t1.c1, t1.c3, t2.c1 FROM ft1 t1 JOIN ft2 t2 ON (t1.c1 = t2.c1)) SELECT c1_1, c2_1 FROM t ORDER BY c1_3, c1_1 OFFSET 100 LIMIT 10;
+WITH t (c1_1, c1_3, c2_1) AS (SELECT t1.c1, t1.c3, t2.c1 FROM ft1 t1 JOIN ft2 t2 ON (t1.c1 = t2.c1)) SELECT c1_1, c2_1 FROM t ORDER BY c1_3, c1_1 OFFSET 100 LIMIT 10;
+-- ctid with whole-row reference
+EXPLAIN (COSTS false, VERBOSE)
+SELECT t1.ctid, t1, t2, t1.c1 FROM ft1 t1 JOIN ft2 t2 ON (t1.c1 = t2.c1) ORDER BY t1.c3, t1.c1 OFFSET 100 LIMIT 10;
+SELECT t1.ctid, t1, t2, t1.c1 FROM ft1 t1 JOIN ft2 t2 ON (t1.c1 = t2.c1) ORDER BY t1.c3, t1.c1 OFFSET 100 LIMIT 10;
+-- partially unsafe to push down, not pushed down
+EXPLAIN (COSTS false, VERBOSE)
+SELECT t1.c1 FROM ft1 t1 JOIN ft2 t2 ON t2.c1 = t2.c1 JOIN ft4 t3 ON t2.c1 = t3.c1 ORDER BY t1.c1 OFFSET 10 LIMIT 10;
+SELECT t1.c1 FROM ft1 t1 JOIN ft2 t2 ON t2.c1 = t2.c1 JOIN ft4 t3 ON t2.c1 = t3.c1 ORDER BY t1.c1 OFFSET 10 LIMIT 10;
+-- SEMI JOIN, not pushed down
+EXPLAIN (COSTS false, VERBOSE)
+SELECT t1.c1 FROM ft1 t1 WHERE EXISTS (SELECT 1 FROM ft2 t2 WHERE t1.c1 = t2.c1) ORDER BY t1.c1 OFFSET 100 LIMIT 10;
+SELECT t1.c1 FROM ft1 t1 WHERE EXISTS (SELECT 1 FROM ft2 t2 WHERE t1.c1 = t2.c1) ORDER BY t1.c1 OFFSET 100 LIMIT 10;
+-- ANTI JOIN, not pushed down
+EXPLAIN (COSTS false, VERBOSE)
+SELECT t1.c1 FROM ft1 t1 WHERE NOT EXISTS (SELECT 1 FROM ft2 t2 WHERE t1.c1 = t2.c2) ORDER BY t1.c1 OFFSET 100 LIMIT 10;
+SELECT t1.c1 FROM ft1 t1 WHERE NOT EXISTS (SELECT 1 FROM ft2 t2 WHERE t1.c1 = t2.c2) ORDER BY t1.c1 OFFSET 100 LIMIT 10;
+-- CROSS JOIN, not pushed down
+EXPLAIN (COSTS false, VERBOSE)
+SELECT t1.c1, t2.c1 FROM ft1 t1 CROSS JOIN ft2 t2 ORDER BY t1.c1, t2.c1 OFFSET 100 LIMIT 10;
+SELECT t1.c1, t2.c1 FROM ft1 t1 CROSS JOIN ft2 t2 ORDER BY t1.c1, t2.c1 OFFSET 100 LIMIT 10;
+-- different server
+EXPLAIN (COSTS false, VERBOSE)
+SELECT t1.c1, t2.c1 FROM ft5 t1 JOIN ft6 t2 ON (t1.c1 = t2.c1) ORDER BY t1.c1, t2.c1 OFFSET 100 LIMIT 10;
+SELECT t1.c1, t2.c1 FROM ft5 t1 JOIN ft6 t2 ON (t1.c1 = t2.c1) ORDER BY t1.c1, t2.c1 OFFSET 100 LIMIT 10;
+-- different effective user for permission check
+EXPLAIN (COSTS false, VERBOSE)
+SELECT t1.c1, t2.c1 FROM ft5 t1 JOIN v_ft5 t2 ON (t1.c1 = t2.c1) ORDER BY t1.c1, t2.c1 OFFSET 100 LIMIT 10;
+SELECT t1.c1, t2.c1 FROM ft5 t1 JOIN v_ft5 t2 ON (t1.c1 = t2.c1) ORDER BY t1.c1, t2.c1 OFFSET 100 LIMIT 10;
+-- unsafe join conditions
+EXPLAIN (COSTS false, VERBOSE)
+SELECT t1.c1, t2.c1 FROM ft1 t1 JOIN ft2 t2 ON (t1.c8 = t2.c8) ORDER BY t1.c3, t1.c1 OFFSET 100 LIMIT 10;
+SELECT t1.c1, t2.c1 FROM ft1 t1 JOIN ft2 t2 ON (t1.c8 = t2.c8) ORDER BY t1.c3, t1.c1 OFFSET 100 LIMIT 10;
+-- local filter (unsafe conditions on one side)
+EXPLAIN (COSTS false, VERBOSE)
+SELECT t1.c1, t2.c1 FROM ft1 t1 JOIN ft2 t2 ON (t1.c1 = t2.c1) WHERE t1.c8 = 'foo' ORDER BY t1.c3, t1.c1 OFFSET 100 LIMIT 10;
+SELECT t1.c1, t2.c1 FROM ft1 t1 JOIN ft2 t2 ON (t1.c1 = t2.c1) WHERE t1.c8 = 'foo' ORDER BY t1.c3, t1.c1 OFFSET 100 LIMIT 10;
+
+-- ===================================================================
 -- parameterized queries
 -- ===================================================================
 -- simple join
@@ -666,116 +794,6 @@ UPDATE rem1 SET f2 = 'testo';
 INSERT INTO rem1(f2) VALUES ('test') RETURNING ctid;
 
 -- ===================================================================
--- test inheritance features
--- ===================================================================
-
-CREATE TABLE a (aa TEXT);
-CREATE TABLE loct (aa TEXT, bb TEXT);
-CREATE FOREIGN TABLE b (bb TEXT) INHERITS (a)
-  SERVER loopback OPTIONS (table_name 'loct');
-
-INSERT INTO a(aa) VALUES('aaa');
-INSERT INTO a(aa) VALUES('aaaa');
-INSERT INTO a(aa) VALUES('aaaaa');
-
-INSERT INTO b(aa) VALUES('bbb');
-INSERT INTO b(aa) VALUES('bbbb');
-INSERT INTO b(aa) VALUES('bbbbb');
-
-SELECT tableoid::regclass, * FROM a;
-SELECT tableoid::regclass, * FROM b;
-SELECT tableoid::regclass, * FROM ONLY a;
-
-UPDATE a SET aa = 'zzzzzz' WHERE aa LIKE 'aaaa%';
-
-SELECT tableoid::regclass, * FROM a;
-SELECT tableoid::regclass, * FROM b;
-SELECT tableoid::regclass, * FROM ONLY a;
-
-UPDATE b SET aa = 'new';
-
-SELECT tableoid::regclass, * FROM a;
-SELECT tableoid::regclass, * FROM b;
-SELECT tableoid::regclass, * FROM ONLY a;
-
-UPDATE a SET aa = 'newtoo';
-
-SELECT tableoid::regclass, * FROM a;
-SELECT tableoid::regclass, * FROM b;
-SELECT tableoid::regclass, * FROM ONLY a;
-
-DELETE FROM a;
-
-SELECT tableoid::regclass, * FROM a;
-SELECT tableoid::regclass, * FROM b;
-SELECT tableoid::regclass, * FROM ONLY a;
-
-DROP TABLE a CASCADE;
-DROP TABLE loct;
-
--- Check SELECT FOR UPDATE/SHARE with an inherited source table
-create table loct1 (f1 int, f2 int, f3 int);
-create table loct2 (f1 int, f2 int, f3 int);
-
-create table foo (f1 int, f2 int);
-create foreign table foo2 (f3 int) inherits (foo)
-  server loopback options (table_name 'loct1');
-create table bar (f1 int, f2 int);
-create foreign table bar2 (f3 int) inherits (bar)
-  server loopback options (table_name 'loct2');
-
-insert into foo values(1,1);
-insert into foo values(3,3);
-insert into foo2 values(2,2,2);
-insert into foo2 values(4,4,4);
-insert into bar values(1,11);
-insert into bar values(2,22);
-insert into bar values(6,66);
-insert into bar2 values(3,33,33);
-insert into bar2 values(4,44,44);
-insert into bar2 values(7,77,77);
-
-explain (verbose, costs off)
-select * from bar where f1 in (select f1 from foo) for update;
-select * from bar where f1 in (select f1 from foo) for update;
-
-explain (verbose, costs off)
-select * from bar where f1 in (select f1 from foo) for share;
-select * from bar where f1 in (select f1 from foo) for share;
-
--- Check UPDATE with inherited target and an inherited source table
-explain (verbose, costs off)
-update bar set f2 = f2 + 100 where f1 in (select f1 from foo);
-update bar set f2 = f2 + 100 where f1 in (select f1 from foo);
-
-select tableoid::regclass, * from bar order by 1,2;
-
--- Check UPDATE with inherited target and an appendrel subquery
-explain (verbose, costs off)
-update bar set f2 = f2 + 100
-from
-  ( select f1 from foo union all select f1+3 from foo ) ss
-where bar.f1 = ss.f1;
-update bar set f2 = f2 + 100
-from
-  ( select f1 from foo union all select f1+3 from foo ) ss
-where bar.f1 = ss.f1;
-
-select tableoid::regclass, * from bar order by 1,2;
-
--- Test that WHERE CURRENT OF is not supported
-begin;
-declare c cursor for select * from bar where f1 = 7;
-fetch from c;
-update bar set f2 = null where current of c;
-rollback;
-
-drop table foo cascade;
-drop table bar cascade;
-drop table loct1;
-drop table loct2;
-
--- ===================================================================
 -- test IMPORT FOREIGN SCHEMA
 -- ===================================================================
 
@@ -831,3 +849,7 @@ DROP TYPE "Colors" CASCADE;
 IMPORT FOREIGN SCHEMA import_source LIMIT TO (t5)
   FROM SERVER loopback INTO import_dest5;  -- ERROR
 ROLLBACK;
+
+-- Cleanup
+DROP OWNED BY view_owner;
+DROP USER view_owner;
diff --git a/doc/src/sgml/contrib.sgml b/doc/src/sgml/contrib.sgml
index 5773095..adc2184 100644
--- a/doc/src/sgml/contrib.sgml
+++ b/doc/src/sgml/contrib.sgml
@@ -204,7 +204,6 @@ pages.
  &pgstandby;
  &pgtestfsync;
  &pgtesttiming;
- &pgupgrade;
  &pgxlogdump;
  </sect1>
 </appendix>
diff --git a/doc/src/sgml/filelist.sgml b/doc/src/sgml/filelist.sgml
index ab935a6..2d7514c 100644
--- a/doc/src/sgml/filelist.sgml
+++ b/doc/src/sgml/filelist.sgml
@@ -136,7 +136,6 @@
 <!ENTITY pgtestfsync     SYSTEM "pgtestfsync.sgml">
 <!ENTITY pgtesttiming    SYSTEM "pgtesttiming.sgml">
 <!ENTITY pgtrgm          SYSTEM "pgtrgm.sgml">
-<!ENTITY pgupgrade       SYSTEM "pgupgrade.sgml">
 <!ENTITY pgxlogdump      SYSTEM "pg_xlogdump.sgml">
 <!ENTITY postgres-fdw    SYSTEM "postgres-fdw.sgml">
 <!ENTITY seg             SYSTEM "seg.sgml">
diff --git a/doc/src/sgml/pgupgrade.sgml b/doc/src/sgml/pgupgrade.sgml
deleted file mode 100644
index 0d79fb5..0000000
--- a/doc/src/sgml/pgupgrade.sgml
+++ /dev/null
@@ -1,724 +0,0 @@
-<!-- doc/src/sgml/pgupgrade.sgml -->
-
-<refentry id="pgupgrade">
- <indexterm zone="pgupgrade">
-  <primary>pg_upgrade</primary>
- </indexterm>
-
- <refmeta>
-  <refentrytitle><application>pg_upgrade</application></refentrytitle>
-  <manvolnum>1</manvolnum>
-  <refmiscinfo>Application</refmiscinfo>
- </refmeta>
-
- <refnamediv>
-  <refname>pg_upgrade</refname>
-  <refpurpose>upgrade a <productname>PostgreSQL</productname> server instance</refpurpose>
- </refnamediv>
-
- <refsynopsisdiv>
-  <cmdsynopsis>
-   <command>pg_upgrade</command>
-   <arg choice="plain"><option>-b</option></arg>
-   <arg choice="plain"><replaceable>oldbindir</replaceable></arg>
-   <arg choice="plain"><option>-B</option></arg>
-   <arg choice="plain"><replaceable>newbindir</replaceable></arg>
-   <arg choice="plain"><option>-d</option></arg>
-   <arg choice="plain"><replaceable>olddatadir</replaceable></arg>
-   <arg choice="plain"><option>-D</option></arg>
-   <arg choice="plain"><replaceable>newdatadir</replaceable></arg>
-   <arg rep="repeat"><replaceable>option</replaceable></arg>
-  </cmdsynopsis>
- </refsynopsisdiv>
-
- <refsect1>
-  <title>Description</title>
-
- <para>
-  <application>pg_upgrade</> (formerly called <application>pg_migrator</>) allows data
-  stored in <productname>PostgreSQL</> data files to be upgraded to a later <productname>PostgreSQL</>
-  major version without the data dump/reload typically required for
-  major version upgrades, e.g. from 8.4.7 to the current major release
-  of <productname>PostgreSQL</>.  It is not required for minor version upgrades, e.g. from
-  9.0.1 to 9.0.4.
- </para>
-
- <para>
-  Major PostgreSQL releases regularly add new features that often
-  change the layout of the system tables, but the internal data storage
-  format rarely changes.  <application>pg_upgrade</> uses this fact
-  to perform rapid upgrades by creating new system tables and simply
-  reusing the old user data files.  If a future major release ever
-  changes the data storage format in a way that makes the old data
-  format unreadable, <application>pg_upgrade</> will not be usable
-  for such upgrades.  (The community will attempt to avoid such
-  situations.)
- </para>
-
- <para>
-  <application>pg_upgrade</> does its best to
-  make sure the old and new clusters are binary-compatible, e.g.  by
-  checking for compatible compile-time settings, including 32/64-bit
-  binaries.  It is important that
-  any external modules are also binary compatible, though this cannot
-  be checked by <application>pg_upgrade</>.
- </para>
-
-  <para>
-   pg_upgrade supports upgrades from 8.4.X and later to the current
-   major release of <productname>PostgreSQL</>, including snapshot and alpha releases.
-  </para>
- </refsect1>
-
- <refsect1>
-  <title>Options</title>
-
-   <para>
-    <application>pg_upgrade</application> accepts the following command-line arguments:
-
-    <variablelist>
-
-     <varlistentry>
-      <term><option>-b</option> <replaceable>bindir</></term>
-      <term><option>--old-bindir=</option><replaceable>bindir</></term>
-      <listitem><para>the old PostgreSQL executable directory;
-      environment variable <envar>PGBINOLD</></para></listitem>
-     </varlistentry>
-
-     <varlistentry>
-      <term><option>-B</option> <replaceable>bindir</></term>
-      <term><option>--new-bindir=</option><replaceable>bindir</></term>
-      <listitem><para>the new PostgreSQL executable directory;
-      environment variable <envar>PGBINNEW</></para></listitem>
-     </varlistentry>
-
-     <varlistentry>
-      <term><option>-c</option></term>
-      <term><option>--check</option></term>
-      <listitem><para>check clusters only, don't change any data</para></listitem>
-     </varlistentry>
-
-     <varlistentry>
-      <term><option>-d</option> <replaceable>datadir</></term>
-      <term><option>--old-datadir=</option><replaceable>datadir</></term>
-      <listitem><para>the old cluster data directory; environment
-      variable <envar>PGDATAOLD</></para></listitem>
-     </varlistentry>
-
-     <varlistentry>
-      <term><option>-D</option> <replaceable>datadir</></term>
-      <term><option>--new-datadir=</option><replaceable>datadir</></term>
-      <listitem><para>the new cluster data directory; environment
-      variable <envar>PGDATANEW</></para></listitem>
-     </varlistentry>
-
-     <varlistentry>
-      <term><option>-j</option></term>
-      <term><option>--jobs</option></term>
-      <listitem><para>number of simultaneous processes or threads to use
-      </para></listitem>
-     </varlistentry>
-
-     <varlistentry>
-      <term><option>-k</option></term>
-      <term><option>--link</option></term>
-      <listitem><para>use hard links instead of copying files to the new
-      cluster (use junction points on Windows)</para></listitem>
-     </varlistentry>
-
-     <varlistentry>
-      <term><option>-o</option> <replaceable class="parameter">options</replaceable></term>
-      <term><option>--old-options</option> <replaceable class="parameter">options</replaceable></term>
-      <listitem><para>options to be passed directly to the
-      old <command>postgres</command> command;  multiple
-      option invocations are appended</para></listitem>
-     </varlistentry>
-
-     <varlistentry>
-      <term><option>-O</option> <replaceable class="parameter">options</replaceable></term>
-      <term><option>--new-options</option> <replaceable class="parameter">options</replaceable></term>
-      <listitem><para>options to be passed directly to the
-      new <command>postgres</command> command;  multiple
-      option invocations are appended</para></listitem>
-     </varlistentry>
-
-     <varlistentry>
-      <term><option>-p</option> <replaceable>port</></term>
-      <term><option>--old-port=</option><replaceable>port</></term>
-      <listitem><para>the old cluster port number; environment
-      variable <envar>PGPORTOLD</></para></listitem>
-     </varlistentry>
-
-     <varlistentry>
-      <term><option>-P</option> <replaceable>port</></term>
-      <term><option>--new-port=</option><replaceable>port</></term>
-      <listitem><para>the new cluster port number; environment
-      variable <envar>PGPORTNEW</></para></listitem>
-     </varlistentry>
-
-     <varlistentry>
-      <term><option>-r</option></term>
-      <term><option>--retain</option></term>
-      <listitem><para>retain SQL and log files even after successful completion
-      </para></listitem>
-     </varlistentry>
-
-     <varlistentry>
-      <term><option>-U</option> <replaceable>username</></term>
-      <term><option>--username=</option><replaceable>username</></term>
-      <listitem><para>cluster's install user name; environment
-      variable <envar>PGUSER</></para></listitem>
-     </varlistentry>
-
-     <varlistentry>
-      <term><option>-v</option></term>
-      <term><option>--verbose</option></term>
-      <listitem><para>enable verbose internal logging</para></listitem>
-     </varlistentry>
-
-     <varlistentry>
-      <term><option>-V</option></term>
-      <term><option>--version</option></term>
-      <listitem><para>display version information, then exit</para></listitem>
-     </varlistentry>
-
-     <varlistentry>
-      <term><option>-?</option></term>
-      <term><option>--help</option></term>
-      <listitem><para>show help, then exit</para></listitem>
-     </varlistentry>
-
-    </variablelist>
-   </para>
-
- </refsect1>
-
- <refsect1>
-  <title>Usage</title>
-
-  <para>
-   These are the steps to perform an upgrade
-   with <application>pg_upgrade</application>:
-  </para>
-
-  <procedure>
-   <step performance="optional">
-    <title>Optionally move the old cluster</title>
-
-    <para>
-     If you are using a version-specific installation directory, e.g.
-     <filename>/opt/PostgreSQL/9.1</>, you do not need to move the old cluster. The
-     graphical installers all use version-specific installation directories.
-    </para>
-
-    <para>
-     If your installation directory is not version-specific, e.g.
-     <filename>/usr/local/pgsql</>, it is necessary to move the current PostgreSQL install
-     directory so it does not interfere with the new <productname>PostgreSQL</> installation.
-     Once the current <productname>PostgreSQL</> server is shut down, it is safe to rename the
-     PostgreSQL installation directory; assuming the old directory is
-     <filename>/usr/local/pgsql</>, you can do:
-
-<programlisting>
-mv /usr/local/pgsql /usr/local/pgsql.old
-</programlisting>
-     to rename the directory.
-    </para>
-   </step>
-
-   <step>
-    <title>For source installs, build the new version</title>
-
-    <para>
-     Build the new PostgreSQL source with <command>configure</> flags that are compatible
-     with the old cluster. <application>pg_upgrade</> will check <command>pg_controldata</> to make
-     sure all settings are compatible before starting the upgrade.
-    </para>
-   </step>
-
-   <step>
-    <title>Install the new PostgreSQL binaries</title>
-
-    <para>
-     Install the new server's binaries and support files.
-    </para>
-
-    <para>
-     For source installs, if you wish to install the new server in a custom
-     location, use the <literal>prefix</literal> variable:
-
-<programlisting>
-make prefix=/usr/local/pgsql.new install
-</programlisting></para>
-   </step>
-
-   <step>
-    <title>Install pg_upgrade and pg_upgrade_support</title>
-
-    <para>
-     Install the <application>pg_upgrade</> binary and
-     <application>pg_upgrade_support</> library in the new PostgreSQL
-     installation.
-    </para>
-   </step>
-
-   <step>
-    <title>Initialize the new PostgreSQL cluster</title>
-
-    <para>
-     Initialize the new cluster using <command>initdb</command>.
-     Again, use compatible <command>initdb</command>
-     flags that match the old cluster. Many
-     prebuilt installers do this step automatically. There is no need to
-     start the new cluster.
-    </para>
-   </step>
-
-   <step>
-    <title>Install custom shared object files</title>
-
-    <para>
-     Install any custom shared object files (or DLLs) used by the old cluster
-     into the new cluster, e.g. <filename>pgcrypto.so</filename>,
-     whether they are from <filename>contrib</filename>
-     or some other source. Do not install the schema definitions, e.g.
-     <filename>pgcrypto.sql</>, because these will be upgraded from the old cluster.
-    </para>
-   </step>
-
-   <step>
-    <title>Adjust authentication</title>
-
-    <para>
-     <command>pg_upgrade</> will connect to the old and new servers several
-     times, so you might want to set authentication to <literal>peer</>
-     in <filename>pg_hba.conf</> or use a <filename>~/.pgpass</> file
-     (see <xref linkend="libpq-pgpass">).
-    </para>
-   </step>
-
-   <step>
-    <title>Stop both servers</title>
-
-    <para>
-     Make sure both database servers are stopped using, on Unix, e.g.:
-
-<programlisting>
-pg_ctl -D /opt/PostgreSQL/8.4 stop
-pg_ctl -D /opt/PostgreSQL/9.0 stop
-</programlisting>
-
-     or on Windows, using the proper service names:
-
-<programlisting>
-NET STOP postgresql-8.4
-NET STOP postgresql-9.0
-</programlisting>
-    </para>
-
-    <para>
-     Streaming replication and log-shipping standby servers can remain running until
-     a later step.
-    </para>
-   </step>
-
-   <step>
-    <title>Run <application>pg_upgrade</></title>
-
-    <para>
-     Always run the <application>pg_upgrade</> binary of the new server, not the old one.
-     <application>pg_upgrade</> requires the specification of the old and new cluster's
-     data and executable (<filename>bin</>) directories. You can also specify
-     user and port values, and whether you want the data linked instead of
-     copied (the default).
-    </para>
-
-    <para>
-     If you use link mode, the upgrade will be much faster (no file
-     copying) and use less disk space, but you will not be able to access
-     your old cluster
-     once you start the new cluster after the upgrade.  Link mode also
-     requires that the old and new cluster data directories be in the
-     same file system.  (Tablespaces and <filename>pg_xlog</> can be on
-     different file systems.)  See <literal>pg_upgrade --help</> for a full
-     list of options.
-    </para>
-
-    <para>
-     The <option>--jobs</> option allows multiple CPU cores to be used
-     for copying/linking of files and to dump and reload database schemas
-     in parallel;  a good place to start is the maximum of the number of
-     CPU cores and tablespaces.  This option can dramatically reduce the
-     time to upgrade a multi-database server running on a multiprocessor
-     machine.
-    </para>
-
-    <para>
-     For Windows users, you must be logged into an administrative account, and
-     then start a shell as the <literal>postgres</> user and set the proper path:
-
-<programlisting>
-RUNAS /USER:postgres "CMD.EXE"
-SET PATH=%PATH%;C:\Program Files\PostgreSQL\9.0\bin;
-</programlisting>
-
-     and then run <application>pg_upgrade</> with quoted directories, e.g.:
-
-<programlisting>
-pg_upgrade.exe
-        --old-datadir "C:/Program Files/PostgreSQL/8.4/data"
-        --new-datadir "C:/Program Files/PostgreSQL/9.0/data"
-        --old-bindir "C:/Program Files/PostgreSQL/8.4/bin"
-        --new-bindir "C:/Program Files/PostgreSQL/9.0/bin"
-</programlisting>
-
-     Once started, <command>pg_upgrade</> will verify the two clusters are compatible
-     and then do the upgrade. You can use <command>pg_upgrade --check</>
-     to perform only the checks, even if the old server is still
-     running. <command>pg_upgrade --check</> will also outline any
-     manual adjustments you will need to make after the upgrade.  If you
-     are going to be using link mode, you should use the <option>--link</>
-     option with <option>--check</option> to enable link-mode-specific checks.
-     <command>pg_upgrade</> requires write permission in the current directory.
-    </para>
-
-    <para>
-     Obviously, no one should be accessing the clusters during the
-     upgrade.  <application>pg_upgrade</> defaults to running servers
-     on port 50432 to avoid unintended client connections.
-     You can use the same port number for both clusters when doing an
-     upgrade because the old and new clusters will not be running at the
-     same time.  However, when checking an old running server, the old
-     and new port numbers must be different.
-    </para>
-
-    <para>
-     If an error occurs while restoring the database schema, <command>pg_upgrade</> will
-     exit and you will have to revert to the old cluster as outlined in <xref linkend="pgupgrade-step-revert">
-     below. To try <command>pg_upgrade</command> again, you will need to modify the old
-     cluster so the pg_upgrade schema restore succeeds. If the problem is a
-     contrib module, you might need to uninstall the contrib module from
-     the old cluster and install it in the new cluster after the upgrade,
-     assuming the module is not being used to store user data.
-    </para>
-   </step>
-
-   <step>
-    <title>Upgrade Streaming Replication and Log-Shipping standby
-    servers</title>
-
-    <para>
-     If you have Streaming Replication (<xref
-     linkend="streaming-replication">) or Log-Shipping (<xref
-     linkend="warm-standby">) standby servers, follow these steps to
-     upgrade them (before starting any servers):
-    </para>
-
-    <procedure>
-
-     <step>
-      <title>Install the new PostgreSQL binaries on standby servers</title>
-
-      <para>
-       Make sure the new binaries and support files are installed on all
-       standby servers.
-      </para>
-     </step>
-
-     <step>
-      <title>Make sure the new standby data directories do <emphasis>not</>
-      exist</title>
-
-      <para>
-       Make sure the new standby data directories do <emphasis>not</>
-       exist or are empty.  If <application>initdb</> was run, delete
-       the standby server data directories.
-      </para>
-     </step>
-
-     <step>
-      <title>Install custom shared object files</title>
-
-      <para>
-       Install the same custom shared object files on the new standbys
-       that you installed in the new master cluster.
-      </para>
-     </step>
-
-     <step>
-      <title>Stop standby servers</title>
-
-      <para>
-       If the standby servers are still running, stop them now using the
-       above instructions.
-      </para>
-     </step>
-
-     <step>
-      <title>Verify standby servers</title>
-
-      <para>
-       To prevent old standby servers from being modified, run
-       <application>pg_controldata</> against the primary and standby
-       clusters and verify that the <quote>Latest checkpoint location</>
-       values match in all clusters.  (This requires the standbys to be
-       shut down after the primary.)
-      </para>
-     </step>
-
-     <step>
-      <title>Save configuration files</title>
-
-      <para>
-       Save any configuration files from the standbys you need to keep,
-       e.g.  <filename>postgresql.conf</>, <literal>recovery.conf</>,
-       as these will be overwritten or removed in the next step.
-      </para>
-     </step>
-
-     <step>
-      <title>Start and stop the new master cluster</title>
-
-      <para>
-       In the new master cluster, change <varname>wal_level</> to
-       <literal>hot_standby</> in the <filename>postgresql.conf</> file
-       and then start and stop the cluster.
-      </para>
-     </step>
-
-     <step>
-      <title>Run <application>rsync</></title>
-
-      <para>
-       From a directory that is above the old and new database cluster
-       directories, run this for each slave:
-
-<programlisting>
-       rsync --archive --delete --hard-links --size-only old_pgdata new_pgdata remote_dir
-</programlisting>
-
-       where <option>old_pgdata</> and <option>new_pgdata</> are relative
-       to the current directory, and <option>remote_dir</> is
-       <emphasis>above</> the old and new cluster directories on
-       the standby server.  The old and new relative cluster paths
-       must match on the master and standby server.  Consult the
-       <application>rsync</> manual page for details on specifying the
-       remote directory, e.g. <literal>standbyhost:/opt/PostgreSQL/</>.
-       <application>rsync</> will be fast when <application>pg_upgrade</>'s
-       <option>--link</> mode is used because it will create hard links
-       on the remote server rather than transferring user data.
-      </para>
-
-      <para>
-       If you have tablespaces, you will need to run a similar
-       <application>rsync</> command for each tablespace directory.  If you
-       have relocated <filename>pg_xlog</> outside the data directories,
-       <application>rsync</> must be run on those directories too.
-      </para>
-     </step>
-
-     <step>
-      <title>Configure streaming replication and log-shipping standby
-      servers</title>
-
-      <para>
-       Configure the servers for log shipping.  (You do not need to run
-       <function>pg_start_backup()</> and <function>pg_stop_backup()</>
-       or take a file system backup as the slaves are still synchronized
-       with the master.)
-      </para>
-     </step>
-
-    </procedure>
-
-   </step>
-
-   <step>
-    <title>Restore <filename>pg_hba.conf</></title>
-
-    <para>
-     If you modified <filename>pg_hba.conf</>, restore its original settings.
-     It might also be necessary to adjust other configuration files in the new
-     cluster to match the old cluster, e.g. <filename>postgresql.conf</>.
-    </para>
-   </step>
-
-   <step>
-    <title>Start the new server</title>
-
-    <para>
-     The new server can now be safely started, and then any
-     <application>rsync</>'ed standby servers.
-    </para>
-   </step>
-
-   <step>
-    <title>Post-Upgrade processing</title>
-
-    <para>
-     If any post-upgrade processing is required, pg_upgrade will issue
-     warnings as it completes. It will also generate script files that must
-     be run by the administrator. The script files will connect to each
-     database that needs post-upgrade processing. Each script should be
-     run using:
-
-<programlisting>
-psql --username postgres --file script.sql postgres
-</programlisting>
-
-     The scripts can be run in any order and can be deleted once they have
-     been run.
-    </para>
-
-    <caution>
-    <para>
-     In general it is unsafe to access tables referenced in rebuild scripts
-     until the rebuild scripts have run to completion; doing so could yield
-     incorrect results or poor performance. Tables not referenced in rebuild
-     scripts can be accessed immediately.
-    </para>
-    </caution>
-   </step>
-
-   <step>
-    <title>Statistics</title>
-
-    <para>
-     Because optimizer statistics are not transferred by <command>pg_upgrade</>, you will
-     be instructed to run a command to regenerate that information at the end
-     of the upgrade.  You might need to set connection parameters to
-     match your new cluster.
-    </para>
-   </step>
-
-   <step>
-    <title>Delete old cluster</title>
-
-    <para>
-     Once you are satisfied with the upgrade, you can delete the old
-     cluster's data directories by running the script mentioned when
-     <command>pg_upgrade</command> completes. (Automatic deletion is not
-     possible if you have user-defined tablespaces inside the old data
-     directory.)  You can also delete the old installation directories
-     (e.g. <filename>bin</>, <filename>share</>).
-    </para>
-   </step>
-
-   <step id="pgupgrade-step-revert" performance="optional">
-    <title>Reverting to old cluster</title>
-
-    <para>
-     If, after running <command>pg_upgrade</command>, you wish to revert to the old cluster,
-     there are several options:
-
-     <itemizedlist>
-      <listitem>
-       <para>
-        If you ran <command>pg_upgrade</command>
-        with <option>--check</>, no modifications were made to the old
-        cluster and you can re-use it anytime.
-       </para>
-      </listitem>
-
-      <listitem>
-       <para>
-        If you ran <command>pg_upgrade</command>
-        with <option>--link</>, the data files are shared between the
-        old and new cluster. If you started the new cluster, the new
-        server has written to those shared files and it is unsafe to
-        use the old cluster.
-       </para>
-      </listitem>
-
-      <listitem>
-       <para>
-        If you ran <command>pg_upgrade</command> <emphasis>without</>
-        <option>--link</> or did not start the new server, the
-        old cluster was not modified except that, if linking
-        started, a <literal>.old</> suffix was appended to
-        <filename>$PGDATA/global/pg_control</>.  To reuse the old
-        cluster, possibly remove the <filename>.old</> suffix from
-        <filename>$PGDATA/global/pg_control</>; you can then restart the
-        old cluster.
-       </para>
-      </listitem>
-     </itemizedlist>
-    </para>
-   </step>
-  </procedure>
-
- </refsect1>
-
- <refsect1>
-  <title>Notes</title>
-
-  <para>
-   <application>pg_upgrade</> does not support upgrading of databases
-   containing these <type>reg*</> OID-referencing system data types:
-   <type>regproc</>, <type>regprocedure</>, <type>regoper</>,
-   <type>regoperator</>, <type>regconfig</>, and
-   <type>regdictionary</>.  (<type>regtype</> can be upgraded.)
-  </para>
-
-  <para>
-   All failure, rebuild, and reindex cases will be reported by
-   <application>pg_upgrade</> if they affect your installation;
-   post-upgrade scripts to rebuild tables and indexes will be
-   generated automatically.  If you are trying to automate the upgrade
-   of many clusters, you should find that clusters with identical database
-   schemas require the same post-upgrade steps for all cluster upgrades;
-   this is because the post-upgrade steps are based on the database
-   schemas, and not user data.
-  </para>
-
-  <para>
-   For deployment testing, create a schema-only copy of the old cluster,
-   insert dummy data, and upgrade that.
-  </para>
-
-  <para>
-   If you are upgrading a pre-<productname>PostgreSQL</> 9.2 cluster
-   that uses a configuration-file-only directory, you must pass the
-   real data directory location to <application>pg_upgrade</>, and
-   pass the configuration directory location to the server, e.g.
-   <literal>-d /real-data-directory -o '-D /configuration-directory'</>.
-  </para>
-
-  <para>
-   If using a pre-9.1 old server that is using a non-default Unix-domain
-   socket directory or a default that differs from the default of the
-   new cluster, set <envar>PGHOST</> to point to the old server's socket
-   location.  (This is not relevant on Windows.)
-  </para>
-
-  <para>
-   If you want to use link mode and you do not want your old cluster
-   to be modified when the new cluster is started, make a copy of the
-   old cluster and upgrade that in link mode. To make a valid copy
-   of the old cluster, use <command>rsync</> to create a dirty
-   copy of the old cluster while the server is running, then shut down
-   the old server and run <command>rsync --checksum</> again to update the
-   copy with any changes to make it consistent.  (<option>--checksum</>
-   is necessary because <command>rsync</> only has file modification-time
-   granularity of one second.)  You might want to exclude some
-   files, e.g. <filename>postmaster.pid</>, as documented in <xref
-   linkend="backup-lowlevel-base-backup">.  If your file system supports
-   file system snapshots or copy-on-write file copies, you can use that
-   to make a backup of the old cluster and tablespaces, though the snapshot
-   and copies must be created simultaneously or while the database server
-   is down.
-  </para>
-
- </refsect1>
-
- <refsect1>
-  <title>See Also</title>
-
-  <simplelist type="inline">
-   <member><xref linkend="app-initdb"></member>
-   <member><xref linkend="app-pg-ctl"></member>
-   <member><xref linkend="app-pgdump"></member>
-   <member><xref linkend="app-postgres"></member>
-  </simplelist>
- </refsect1>
-</refentry>
diff --git a/doc/src/sgml/postgres-fdw.sgml b/doc/src/sgml/postgres-fdw.sgml
index 43adb61..fb39c38 100644
--- a/doc/src/sgml/postgres-fdw.sgml
+++ b/doc/src/sgml/postgres-fdw.sgml
@@ -406,11 +406,27 @@
   <title>Remote Query Optimization</title>
 
   <para>
-   <filename>postgres_fdw</> attempts to optimize remote queries to reduce
-   the amount of data transferred from foreign servers.  This is done by
-   sending query <literal>WHERE</> clauses to the remote server for
-   execution, and by not retrieving table columns that are not needed for
-   the current query.  To reduce the risk of misexecution of queries,
+   <filename>postgres_fdw</filename> attempts to optimize remote queries to
+   reduce the amount of data transferred from foreign servers.
+   It does this in several ways.
+  </para>
+
+  <para>
+   In the <literal>SELECT</> list, <filename>postgres_fdw</filename> retrieves
+   only the columns that the current query actually needs.
+  </para>
+
+  <para>
+   If the <literal>FROM</> clause contains multiple foreign tables that are
+   managed by the same server and accessed as the same user,
+   <filename>postgres_fdw</> tries to perform as many of the joins between
+   them as possible on the remote side.
+   To reduce the risk of misexecution of queries, <filename>postgres_fdw</>
+   does not send a join to the remote server when its join conditions might
+   have different semantics there.
+  </para>
+
+  <para>
    <literal>WHERE</> clauses are not sent to the remote server unless they use
    only built-in data types, operators, and functions.  Operators and
    functions in the clauses must be <literal>IMMUTABLE</> as well.
diff --git a/doc/src/sgml/ref/allfiles.sgml b/doc/src/sgml/ref/allfiles.sgml
index 9ae6aec..211a3c4 100644
--- a/doc/src/sgml/ref/allfiles.sgml
+++ b/doc/src/sgml/ref/allfiles.sgml
@@ -193,6 +193,7 @@ Complete list of usable sgml source files in this directory.
 <!ENTITY pgResetxlog        SYSTEM "pg_resetxlog.sgml">
 <!ENTITY pgRestore          SYSTEM "pg_restore.sgml">
 <!ENTITY pgRewind           SYSTEM "pg_rewind.sgml">
+<!ENTITY pgupgrade          SYSTEM "pgupgrade.sgml">
 <!ENTITY postgres           SYSTEM "postgres-ref.sgml">
 <!ENTITY postmaster         SYSTEM "postmaster.sgml">
 <!ENTITY psqlRef            SYSTEM "psql-ref.sgml">
diff --git a/doc/src/sgml/ref/pgupgrade.sgml b/doc/src/sgml/ref/pgupgrade.sgml
new file mode 100644
index 0000000..ce5e308
--- /dev/null
+++ b/doc/src/sgml/ref/pgupgrade.sgml
@@ -0,0 +1,715 @@
+<!-- doc/src/sgml/ref/pgupgrade.sgml -->
+
+<refentry id="pgupgrade">
+ <indexterm zone="pgupgrade">
+  <primary>pg_upgrade</primary>
+ </indexterm>
+
+ <refmeta>
+  <refentrytitle><application>pg_upgrade</application></refentrytitle>
+  <manvolnum>1</manvolnum>
+  <refmiscinfo>Application</refmiscinfo>
+ </refmeta>
+
+ <refnamediv>
+  <refname>pg_upgrade</refname>
+  <refpurpose>upgrade a <productname>PostgreSQL</productname> server instance</refpurpose>
+ </refnamediv>
+
+ <refsynopsisdiv>
+  <cmdsynopsis>
+   <command>pg_upgrade</command>
+   <arg choice="plain"><option>-b</option></arg>
+   <arg choice="plain"><replaceable>oldbindir</replaceable></arg>
+   <arg choice="plain"><option>-B</option></arg>
+   <arg choice="plain"><replaceable>newbindir</replaceable></arg>
+   <arg choice="plain"><option>-d</option></arg>
+   <arg choice="plain"><replaceable>olddatadir</replaceable></arg>
+   <arg choice="plain"><option>-D</option></arg>
+   <arg choice="plain"><replaceable>newdatadir</replaceable></arg>
+   <arg rep="repeat"><replaceable>option</replaceable></arg>
+  </cmdsynopsis>
+ </refsynopsisdiv>
+
+ <refsect1>
+  <title>Description</title>
+
+ <para>
+  <application>pg_upgrade</> (formerly called <application>pg_migrator</>) allows data
+  stored in <productname>PostgreSQL</> data files to be upgraded to a later <productname>PostgreSQL</>
+  major version without the data dump/reload typically required for
+  major version upgrades, e.g. from 8.4.7 to the current major release
+  of <productname>PostgreSQL</>.  It is not required for minor version upgrades, e.g. from
+  9.0.1 to 9.0.4.
+ </para>
+
+ <para>
+  Major PostgreSQL releases regularly add new features that often
+  change the layout of the system tables, but the internal data storage
+  format rarely changes.  <application>pg_upgrade</> uses this fact
+  to perform rapid upgrades by creating new system tables and simply
+  reusing the old user data files.  If a future major release ever
+  changes the data storage format in a way that makes the old data
+  format unreadable, <application>pg_upgrade</> will not be usable
+  for such upgrades.  (The community will attempt to avoid such
+  situations.)
+ </para>
+
+ <para>
+  <application>pg_upgrade</> does its best to
+  make sure the old and new clusters are binary-compatible, e.g.  by
+  checking for compatible compile-time settings, including 32/64-bit
+  binaries.  It is important that
+  any external modules are also binary compatible, though this cannot
+  be checked by <application>pg_upgrade</>.
+ </para>
+
+  <para>
+   <application>pg_upgrade</> supports upgrades from 8.4.X and later to the current
+   major release of <productname>PostgreSQL</>, including snapshot and alpha releases.
+  </para>
+ </refsect1>
+
+ <refsect1>
+  <title>Options</title>
+
+   <para>
+    <application>pg_upgrade</application> accepts the following command-line arguments:
+
+    <variablelist>
+
+     <varlistentry>
+      <term><option>-b</option> <replaceable>bindir</></term>
+      <term><option>--old-bindir=</option><replaceable>bindir</></term>
+      <listitem><para>the old PostgreSQL executable directory;
+      environment variable <envar>PGBINOLD</></para></listitem>
+     </varlistentry>
+
+     <varlistentry>
+      <term><option>-B</option> <replaceable>bindir</></term>
+      <term><option>--new-bindir=</option><replaceable>bindir</></term>
+      <listitem><para>the new PostgreSQL executable directory;
+      environment variable <envar>PGBINNEW</></para></listitem>
+     </varlistentry>
+
+     <varlistentry>
+      <term><option>-c</option></term>
+      <term><option>--check</option></term>
+      <listitem><para>check clusters only, don't change any data</para></listitem>
+     </varlistentry>
+
+     <varlistentry>
+      <term><option>-d</option> <replaceable>datadir</></term>
+      <term><option>--old-datadir=</option><replaceable>datadir</></term>
+      <listitem><para>the old cluster data directory; environment
+      variable <envar>PGDATAOLD</></para></listitem>
+     </varlistentry>
+
+     <varlistentry>
+      <term><option>-D</option> <replaceable>datadir</></term>
+      <term><option>--new-datadir=</option><replaceable>datadir</></term>
+      <listitem><para>the new cluster data directory; environment
+      variable <envar>PGDATANEW</></para></listitem>
+     </varlistentry>
+
+     <varlistentry>
+      <term><option>-j</option></term>
+      <term><option>--jobs</option></term>
+      <listitem><para>number of simultaneous processes or threads to use
+      </para></listitem>
+     </varlistentry>
+
+     <varlistentry>
+      <term><option>-k</option></term>
+      <term><option>--link</option></term>
+      <listitem><para>use hard links instead of copying files to the new
+      cluster (use junction points on Windows)</para></listitem>
+     </varlistentry>
+
+     <varlistentry>
+      <term><option>-o</option> <replaceable class="parameter">options</replaceable></term>
+      <term><option>--old-options</option> <replaceable class="parameter">options</replaceable></term>
+      <listitem><para>options to be passed directly to the
+      old <command>postgres</command> command;  multiple
+      option invocations are appended</para></listitem>
+     </varlistentry>
+
+     <varlistentry>
+      <term><option>-O</option> <replaceable class="parameter">options</replaceable></term>
+      <term><option>--new-options</option> <replaceable class="parameter">options</replaceable></term>
+      <listitem><para>options to be passed directly to the
+      new <command>postgres</command> command;  multiple
+      option invocations are appended</para></listitem>
+     </varlistentry>
+
+     <varlistentry>
+      <term><option>-p</option> <replaceable>port</></term>
+      <term><option>--old-port=</option><replaceable>port</></term>
+      <listitem><para>the old cluster port number; environment
+      variable <envar>PGPORTOLD</></para></listitem>
+     </varlistentry>
+
+     <varlistentry>
+      <term><option>-P</option> <replaceable>port</></term>
+      <term><option>--new-port=</option><replaceable>port</></term>
+      <listitem><para>the new cluster port number; environment
+      variable <envar>PGPORTNEW</></para></listitem>
+     </varlistentry>
+
+     <varlistentry>
+      <term><option>-r</option></term>
+      <term><option>--retain</option></term>
+      <listitem><para>retain SQL and log files even after successful completion
+      </para></listitem>
+     </varlistentry>
+
+     <varlistentry>
+      <term><option>-U</option> <replaceable>username</></term>
+      <term><option>--username=</option><replaceable>username</></term>
+      <listitem><para>cluster's install user name; environment
+      variable <envar>PGUSER</></para></listitem>
+     </varlistentry>
+
+     <varlistentry>
+      <term><option>-v</option></term>
+      <term><option>--verbose</option></term>
+      <listitem><para>enable verbose internal logging</para></listitem>
+     </varlistentry>
+
+     <varlistentry>
+      <term><option>-V</option></term>
+      <term><option>--version</option></term>
+      <listitem><para>display version information, then exit</para></listitem>
+     </varlistentry>
+
+     <varlistentry>
+      <term><option>-?</option></term>
+      <term><option>--help</option></term>
+      <listitem><para>show help, then exit</para></listitem>
+     </varlistentry>
+
+    </variablelist>
+   </para>
+
+ </refsect1>
+
+ <refsect1>
+  <title>Usage</title>
+
+  <para>
+   These are the steps to perform an upgrade
+   with <application>pg_upgrade</application>:
+  </para>
+
+  <procedure>
+   <step performance="optional">
+    <title>Optionally move the old cluster</title>
+
+    <para>
+     If you are using a version-specific installation directory, e.g.
+     <filename>/opt/PostgreSQL/9.1</>, you do not need to move the old cluster. The
+     graphical installers all use version-specific installation directories.
+    </para>
+
+    <para>
+     If your installation directory is not version-specific, e.g.
+     <filename>/usr/local/pgsql</>, it is necessary to move the current PostgreSQL install
+     directory so it does not interfere with the new <productname>PostgreSQL</> installation.
+     Once the current <productname>PostgreSQL</> server is shut down, it is safe to rename the
+     PostgreSQL installation directory; assuming the old directory is
+     <filename>/usr/local/pgsql</>, you can do:
+
+<programlisting>
+mv /usr/local/pgsql /usr/local/pgsql.old
+</programlisting>
+     to rename the directory.
+    </para>
+   </step>
+
+   <step>
+    <title>For source installs, build the new version</title>
+
+    <para>
+     Build the new PostgreSQL source with <command>configure</> flags that are compatible
+     with the old cluster. <application>pg_upgrade</> will check <command>pg_controldata</> to make
+     sure all settings are compatible before starting the upgrade.
+    </para>
+   </step>
+
+   <step>
+    <title>Install the new PostgreSQL binaries</title>
+
+    <para>
+     Install the new server's binaries and support
+     files.  <application>pg_upgrade</> is included in a default installation.
+    </para>
+
+    <para>
+     For source installs, if you wish to install the new server in a custom
+     location, use the <literal>prefix</literal> variable:
+
+<programlisting>
+make prefix=/usr/local/pgsql.new install
+</programlisting></para>
+   </step>
+
+   <step>
+    <title>Initialize the new PostgreSQL cluster</title>
+
+    <para>
+     Initialize the new cluster using <command>initdb</command>.
+     Again, use compatible <command>initdb</command>
+     flags that match the old cluster. Many
+     prebuilt installers do this step automatically. There is no need to
+     start the new cluster.
+    </para>
+   </step>
+
+   <step>
+    <title>Install custom shared object files</title>
+
+    <para>
+     Install any custom shared object files (or DLLs) used by the old cluster
+     into the new cluster, e.g. <filename>pgcrypto.so</filename>,
+     whether they are from <filename>contrib</filename>
+     or some other source. Do not install the schema definitions, e.g.
+     <filename>pgcrypto.sql</>, because these will be upgraded from the old cluster.
+    </para>
+   </step>
+
+   <step>
+    <title>Adjust authentication</title>
+
+    <para>
+     <command>pg_upgrade</> will connect to the old and new servers several
+     times, so you might want to set authentication to <literal>peer</>
+     in <filename>pg_hba.conf</> or use a <filename>~/.pgpass</> file
+     (see <xref linkend="libpq-pgpass">).
+    </para>
+   </step>
+
+   <step>
+    <title>Stop both servers</title>
+
+    <para>
+     Make sure both database servers are stopped, e.g. on Unix:
+
+<programlisting>
+pg_ctl -D /opt/PostgreSQL/8.4 stop
+pg_ctl -D /opt/PostgreSQL/9.0 stop
+</programlisting>
+
+     or on Windows, using the proper service names:
+
+<programlisting>
+NET STOP postgresql-8.4
+NET STOP postgresql-9.0
+</programlisting>
+    </para>
+
+    <para>
+     Streaming replication and log-shipping standby servers can remain running until
+     a later step.
+    </para>
+   </step>
+
+   <step>
+    <title>Run <application>pg_upgrade</></title>
+
+    <para>
+     Always run the <application>pg_upgrade</> binary of the new server, not the old one.
+     <application>pg_upgrade</> requires the specification of the old and new cluster's
+     data and executable (<filename>bin</>) directories. You can also specify
+     user and port values, and whether you want the data linked instead of
+     copied (copying is the default).
+    </para>
+
+    <para>
+     If you use link mode, the upgrade will be much faster (no file
+     copying) and use less disk space, but you will not be able to access
+     your old cluster
+     once you start the new cluster after the upgrade.  Link mode also
+     requires that the old and new cluster data directories be in the
+     same file system.  (Tablespaces and <filename>pg_xlog</> can be on
+     different file systems.)  See <literal>pg_upgrade --help</> for a full
+     list of options.
+    </para>
+
+    <para>
+     The <option>--jobs</> option allows multiple CPU cores to be used
+     for copying/linking of files and to dump and reload database schemas
+     in parallel;  a good place to start is the maximum of the number of
+     CPU cores and tablespaces.  This option can dramatically reduce the
+     time to upgrade a multi-database server running on a multiprocessor
+     machine.
+    </para>
+
+    <para>
+     For Windows users, you must be logged into an administrative account, and
+     then start a shell as the <literal>postgres</> user and set the proper path:
+
+<programlisting>
+RUNAS /USER:postgres "CMD.EXE"
+SET PATH=%PATH%;C:\Program Files\PostgreSQL\9.0\bin;
+</programlisting>
+
+     and then run <application>pg_upgrade</> with quoted directories, e.g.:
+
+<programlisting>
+pg_upgrade.exe
+        --old-datadir "C:/Program Files/PostgreSQL/8.4/data"
+        --new-datadir "C:/Program Files/PostgreSQL/9.0/data"
+        --old-bindir "C:/Program Files/PostgreSQL/8.4/bin"
+        --new-bindir "C:/Program Files/PostgreSQL/9.0/bin"
+</programlisting>
+
+     Once started, <command>pg_upgrade</> will verify the two clusters are compatible
+     and then do the upgrade. You can use <command>pg_upgrade --check</>
+     to perform only the checks, even if the old server is still
+     running. <command>pg_upgrade --check</> will also outline any
+     manual adjustments you will need to make after the upgrade.  If you
+     are going to be using link mode, you should use the <option>--link</>
+     option with <option>--check</option> to enable link-mode-specific checks.
+     <command>pg_upgrade</> requires write permission in the current directory.
+    </para>
+
+    <para>
+     Obviously, no one should be accessing the clusters during the
+     upgrade.  <application>pg_upgrade</> defaults to running servers
+     on port 50432 to avoid unintended client connections.
+     You can use the same port number for both clusters when doing an
+     upgrade because the old and new clusters will not be running at the
+     same time.  However, when checking an old running server, the old
+     and new port numbers must be different.
+    </para>
+
+    <para>
+     If an error occurs while restoring the database schema, <command>pg_upgrade</> will
+     exit and you will have to revert to the old cluster as outlined in <xref linkend="pgupgrade-step-revert">
+     below. To try <command>pg_upgrade</command> again, you will need to modify the old
+     cluster so the pg_upgrade schema restore succeeds. If the problem is a
+     contrib module, you might need to uninstall the contrib module from
+     the old cluster and install it in the new cluster after the upgrade,
+     assuming the module is not being used to store user data.
+    </para>
+   </step>
+
+   <step>
+    <title>Upgrade Streaming Replication and Log-Shipping standby
+    servers</title>
+
+    <para>
+     If you have Streaming Replication (<xref
+     linkend="streaming-replication">) or Log-Shipping (<xref
+     linkend="warm-standby">) standby servers, follow these steps to
+     upgrade them (before starting any servers):
+    </para>
+
+    <procedure>
+
+     <step>
+      <title>Install the new PostgreSQL binaries on standby servers</title>
+
+      <para>
+       Make sure the new binaries and support files are installed on all
+       standby servers.
+      </para>
+     </step>
+
+     <step>
+      <title>Make sure the new standby data directories do <emphasis>not</>
+      exist</title>
+
+      <para>
+       Make sure the new standby data directories do <emphasis>not</>
+       exist or are empty.  If <application>initdb</> was run, delete
+       the standby server data directories.
+      </para>
+     </step>
+
+     <step>
+      <title>Install custom shared object files</title>
+
+      <para>
+       Install the same custom shared object files on the new standbys
+       that you installed in the new master cluster.
+      </para>
+     </step>
+
+     <step>
+      <title>Stop standby servers</title>
+
+      <para>
+       If the standby servers are still running, stop them now using the
+       above instructions.
+      </para>
+     </step>
+
+     <step>
+      <title>Verify standby servers</title>
+
+      <para>
+       To prevent old standby servers from being modified, run
+       <application>pg_controldata</> against the primary and standby
+       clusters and verify that the <quote>Latest checkpoint location</>
+       values match in all clusters.  (This requires the standbys to be
+       shut down after the primary.)
+      </para>
+     </step>
+
+     <step>
+      <title>Save configuration files</title>
+
+      <para>
+       Save any configuration files from the standbys you need to keep,
+       e.g.  <filename>postgresql.conf</>, <literal>recovery.conf</>,
+       as these will be overwritten or removed in the next step.
+      </para>
+     </step>
+
+     <step>
+      <title>Start and stop the new master cluster</title>
+
+      <para>
+       In the new master cluster, change <varname>wal_level</> to
+       <literal>hot_standby</> in the <filename>postgresql.conf</> file
+       and then start and stop the cluster.
+      </para>
+     </step>
+
+     <step>
+      <title>Run <application>rsync</></title>
+
+      <para>
+       From a directory that is above the old and new database cluster
+       directories, run this for each standby:
+
+<programlisting>
+       rsync --archive --delete --hard-links --size-only old_pgdata new_pgdata remote_dir
+</programlisting>
+
+       where <option>old_pgdata</> and <option>new_pgdata</> are relative
+       to the current directory, and <option>remote_dir</> is
+       <emphasis>above</> the old and new cluster directories on
+       the standby server.  The old and new relative cluster paths
+       must match on the master and standby server.  Consult the
+       <application>rsync</> manual page for details on specifying the
+       remote directory, e.g. <literal>standbyhost:/opt/PostgreSQL/</>.
+       <application>rsync</> will be fast when <application>pg_upgrade</>'s
+       <option>--link</> mode is used because it will create hard links
+       on the remote server rather than transferring user data.
+      </para>
+
+      <para>
+       If you have tablespaces, you will need to run a similar
+       <application>rsync</> command for each tablespace directory.  If you
+       have relocated <filename>pg_xlog</> outside the data directories,
+       <application>rsync</> must be run on those directories too.
+      </para>
+     </step>
+
+     <step>
+      <title>Configure streaming replication and log-shipping standby
+      servers</title>
+
+      <para>
+       Configure the servers for log shipping.  (You do not need to run
+       <function>pg_start_backup()</> and <function>pg_stop_backup()</>
+       or take a file system backup as the standbys are still synchronized
+       with the master.)
+      </para>
+     </step>
+
+    </procedure>
+
+   </step>
+
+   <step>
+    <title>Restore <filename>pg_hba.conf</></title>
+
+    <para>
+     If you modified <filename>pg_hba.conf</>, restore its original settings.
+     It might also be necessary to adjust other configuration files in the new
+     cluster to match the old cluster, e.g. <filename>postgresql.conf</>.
+    </para>
+   </step>
+
+   <step>
+    <title>Start the new server</title>
+
+    <para>
+     The new server can now be safely started, and then any
+     <application>rsync</>'ed standby servers.
+    </para>
+   </step>
+
+   <step>
+    <title>Post-Upgrade processing</title>
+
+    <para>
+     If any post-upgrade processing is required, <application>pg_upgrade</> will issue
+     warnings as it completes. It will also generate script files that must
+     be run by the administrator. The script files will connect to each
+     database that needs post-upgrade processing. Each script should be
+     run using:
+
+<programlisting>
+psql --username postgres --file script.sql postgres
+</programlisting>
+
+     The scripts can be run in any order and can be deleted once they have
+     been run.
+    </para>
+
+    <caution>
+    <para>
+     In general it is unsafe to access tables referenced in rebuild scripts
+     until the rebuild scripts have run to completion; doing so could yield
+     incorrect results or poor performance. Tables not referenced in rebuild
+     scripts can be accessed immediately.
+    </para>
+    </caution>
+   </step>
+
+   <step>
+    <title>Statistics</title>
+
+    <para>
+     Because optimizer statistics are not transferred by <command>pg_upgrade</>, you will
+     be instructed to run a command to regenerate that information at the end
+     of the upgrade.  You might need to set connection parameters to
+     match your new cluster.
+    </para>
+   </step>
+
+   <step>
+    <title>Delete old cluster</title>
+
+    <para>
+     Once you are satisfied with the upgrade, you can delete the old
+     cluster's data directories by running the script mentioned when
+     <command>pg_upgrade</command> completes. (Automatic deletion is not
+     possible if you have user-defined tablespaces inside the old data
+     directory.)  You can also delete the old installation directories
+     (e.g. <filename>bin</>, <filename>share</>).
+    </para>
+   </step>
+
+   <step id="pgupgrade-step-revert" performance="optional">
+    <title>Reverting to old cluster</title>
+
+    <para>
+     If, after running <command>pg_upgrade</command>, you wish to revert to the old cluster,
+     there are several options:
+
+     <itemizedlist>
+      <listitem>
+       <para>
+        If you ran <command>pg_upgrade</command>
+        with <option>--check</>, no modifications were made to the old
+        cluster and you can re-use it anytime.
+       </para>
+      </listitem>
+
+      <listitem>
+       <para>
+        If you ran <command>pg_upgrade</command>
+        with <option>--link</>, the data files are shared between the
+        old and new cluster. If you started the new cluster, the new
+        server has written to those shared files and it is unsafe to
+        use the old cluster.
+       </para>
+      </listitem>
+
+      <listitem>
+       <para>
+        If you ran <command>pg_upgrade</command> <emphasis>without</>
+        <option>--link</> or did not start the new server, the
+        old cluster was not modified except that, if linking
+        started, a <literal>.old</> suffix was appended to
+        <filename>$PGDATA/global/pg_control</>.  To reuse the old
+        cluster, remove the <filename>.old</> suffix from
+        <filename>$PGDATA/global/pg_control</> if present; you can then restart
+        the old cluster.
+       </para>
+      </listitem>
+     </itemizedlist>
+    </para>
+   </step>
+  </procedure>
+
+ </refsect1>
+
+ <refsect1>
+  <title>Notes</title>
+
+  <para>
+   <application>pg_upgrade</> does not support upgrading of databases
+   containing these <type>reg*</> OID-referencing system data types:
+   <type>regproc</>, <type>regprocedure</>, <type>regoper</>,
+   <type>regoperator</>, <type>regconfig</>, and
+   <type>regdictionary</>.  (<type>regtype</> can be upgraded.)
+  </para>
+
+  <para>
+   All failure, rebuild, and reindex cases will be reported by
+   <application>pg_upgrade</> if they affect your installation;
+   post-upgrade scripts to rebuild tables and indexes will be
+   generated automatically.  If you are trying to automate the upgrade
+   of many clusters, you should find that clusters with identical database
+   schemas require the same post-upgrade steps for all cluster upgrades;
+   this is because the post-upgrade steps are based on the database
+   schemas, and not user data.
+  </para>
+
+  <para>
+   For deployment testing, create a schema-only copy of the old cluster,
+   insert dummy data, and upgrade that.
+  </para>
+
+  <para>
+   If you are upgrading a pre-<productname>PostgreSQL</> 9.2 cluster
+   that uses a configuration-file-only directory, you must pass the
+   real data directory location to <application>pg_upgrade</>, and
+   pass the configuration directory location to the server, e.g.
+   <literal>-d /real-data-directory -o '-D /configuration-directory'</>.
+  </para>
+
+  <para>
+   If using a pre-9.1 old server that is using a non-default Unix-domain
+   socket directory or a default that differs from the default of the
+   new cluster, set <envar>PGHOST</> to point to the old server's socket
+   location.  (This is not relevant on Windows.)
+  </para>
+
+  <para>
+   If you want to use link mode and you do not want your old cluster
+   to be modified when the new cluster is started, make a copy of the
+   old cluster and upgrade that in link mode. To make a valid copy
+   of the old cluster, use <command>rsync</> to create a dirty
+   copy of the old cluster while the server is running, then shut down
+   the old server and run <command>rsync --checksum</> again to update the
+   copy with any changes to make it consistent.  (<option>--checksum</>
+   is necessary because <command>rsync</> only has file modification-time
+   granularity of one second.)  You might want to exclude some
+   files, e.g. <filename>postmaster.pid</>, as documented in <xref
+   linkend="backup-lowlevel-base-backup">.  If your file system supports
+   file system snapshots or copy-on-write file copies, you can use that
+   to make a backup of the old cluster and tablespaces, though the snapshot
+   and copies must be created simultaneously or while the database server
+   is down.
+  </para>
+
+ </refsect1>
+
+ <refsect1>
+  <title>See Also</title>
+
+  <simplelist type="inline">
+   <member><xref linkend="app-initdb"></member>
+   <member><xref linkend="app-pg-ctl"></member>
+   <member><xref linkend="app-pgdump"></member>
+   <member><xref linkend="app-postgres"></member>
+  </simplelist>
+ </refsect1>
+</refentry>
diff --git a/doc/src/sgml/reference.sgml b/doc/src/sgml/reference.sgml
index c1765ef..fb18d94 100644
--- a/doc/src/sgml/reference.sgml
+++ b/doc/src/sgml/reference.sgml
@@ -263,6 +263,7 @@
    &pgCtl;
    &pgResetxlog;
    &pgRewind;
+   &pgupgrade;
    &postgres;
    &postmaster;
 
diff --git a/src/Makefile.global.in b/src/Makefile.global.in
index 7c39d82..4b06fc2 100644
--- a/src/Makefile.global.in
+++ b/src/Makefile.global.in
@@ -225,6 +225,7 @@ GCC = @GCC@
 SUN_STUDIO_CC = @SUN_STUDIO_CC@
 CFLAGS = @CFLAGS@
 CFLAGS_VECTOR = @CFLAGS_VECTOR@
+CFLAGS_SSE42 = @CFLAGS_SSE42@
 
 # Kind-of compilers
 
@@ -548,6 +549,9 @@ endif
 
 LIBOBJS = @LIBOBJS@
 
+# files needed for the chosen CRC-32C implementation
+PG_CRC32C_OBJS = @PG_CRC32C_OBJS@
+
 LIBS := -lpgcommon -lpgport $(LIBS)
 
 # to make ws2_32.lib the last library
diff --git a/src/backend/access/transam/subtrans.c b/src/backend/access/transam/subtrans.c
index c87c8ca..4bc24d9 100644
--- a/src/backend/access/transam/subtrans.c
+++ b/src/backend/access/transam/subtrans.c
@@ -42,7 +42,7 @@
  * Note: because TransactionIds are 32 bits and wrap around at 0xFFFFFFFF,
  * SubTrans page numbering also wraps around at
  * 0xFFFFFFFF/SUBTRANS_XACTS_PER_PAGE, and segment numbering at
- * 0xFFFFFFFF/SUBTRANS_XACTS_PER_PAGE/SLRU_SEGMENTS_PER_PAGE.  We need take no
+ * 0xFFFFFFFF/SUBTRANS_XACTS_PER_PAGE/SLRU_PAGES_PER_SEGMENT.  We need take no
  * explicit notice of that fact in this module, except when comparing segment
  * and page numbers in TruncateSUBTRANS (see SubTransPagePrecedes).
  */
diff --git a/src/backend/access/transam/twophase.c b/src/backend/access/transam/twophase.c
index 4075a6f..b85a666 100644
--- a/src/backend/access/transam/twophase.c
+++ b/src/backend/access/transam/twophase.c
@@ -1023,8 +1023,8 @@ EndPrepare(GlobalTransaction gxact)
 	TwoPhaseFileHeader *hdr;
 	char		path[MAXPGPATH];
 	StateFileChunk *record;
-	pg_crc32	statefile_crc;
-	pg_crc32	bogus_crc;
+	pg_crc32c	statefile_crc;
+	pg_crc32c	bogus_crc;
 	int			fd;
 
 	/* Add the end sentinel to the list of 2PC records */
@@ -1034,7 +1034,7 @@ EndPrepare(GlobalTransaction gxact)
 	/* Go back and fill in total_len in the file header record */
 	hdr = (TwoPhaseFileHeader *) records.head->data;
 	Assert(hdr->magic == TWOPHASE_MAGIC);
-	hdr->total_len = records.total_len + sizeof(pg_crc32);
+	hdr->total_len = records.total_len + sizeof(pg_crc32c);
 
 	/*
 	 * If the file size exceeds MaxAllocSize, we won't be able to read it in
@@ -1082,7 +1082,7 @@ EndPrepare(GlobalTransaction gxact)
 	 */
 	bogus_crc = ~statefile_crc;
 
-	if ((write(fd, &bogus_crc, sizeof(pg_crc32))) != sizeof(pg_crc32))
+	if ((write(fd, &bogus_crc, sizeof(pg_crc32c))) != sizeof(pg_crc32c))
 	{
 		CloseTransientFile(fd);
 		ereport(ERROR,
@@ -1091,7 +1091,7 @@ EndPrepare(GlobalTransaction gxact)
 	}
 
 	/* Back up to prepare for rewriting the CRC */
-	if (lseek(fd, -((off_t) sizeof(pg_crc32)), SEEK_CUR) < 0)
+	if (lseek(fd, -((off_t) sizeof(pg_crc32c)), SEEK_CUR) < 0)
 	{
 		CloseTransientFile(fd);
 		ereport(ERROR,
@@ -1135,7 +1135,7 @@ EndPrepare(GlobalTransaction gxact)
 	/* If we crash now, we have prepared: WAL replay will fix things */
 
 	/* write correct CRC and close file */
-	if ((write(fd, &statefile_crc, sizeof(pg_crc32))) != sizeof(pg_crc32))
+	if ((write(fd, &statefile_crc, sizeof(pg_crc32c))) != sizeof(pg_crc32c))
 	{
 		CloseTransientFile(fd);
 		ereport(ERROR,
@@ -1223,7 +1223,7 @@ ReadTwoPhaseFile(TransactionId xid, bool give_warnings)
 	int			fd;
 	struct stat stat;
 	uint32		crc_offset;
-	pg_crc32	calc_crc,
+	pg_crc32c	calc_crc,
 				file_crc;
 
 	TwoPhaseFilePath(path, xid);
@@ -1258,14 +1258,14 @@ ReadTwoPhaseFile(TransactionId xid, bool give_warnings)
 
 	if (stat.st_size < (MAXALIGN(sizeof(TwoPhaseFileHeader)) +
 						MAXALIGN(sizeof(TwoPhaseRecordOnDisk)) +
-						sizeof(pg_crc32)) ||
+						sizeof(pg_crc32c)) ||
 		stat.st_size > MaxAllocSize)
 	{
 		CloseTransientFile(fd);
 		return NULL;
 	}
 
-	crc_offset = stat.st_size - sizeof(pg_crc32);
+	crc_offset = stat.st_size - sizeof(pg_crc32c);
 	if (crc_offset != MAXALIGN(crc_offset))
 	{
 		CloseTransientFile(fd);
@@ -1302,7 +1302,7 @@ ReadTwoPhaseFile(TransactionId xid, bool give_warnings)
 	COMP_CRC32C(calc_crc, buf, crc_offset);
 	FIN_CRC32C(calc_crc);
 
-	file_crc = *((pg_crc32 *) (buf + crc_offset));
+	file_crc = *((pg_crc32c *) (buf + crc_offset));
 
 	if (!EQ_CRC32C(calc_crc, file_crc))
 	{
@@ -1545,7 +1545,7 @@ void
 RecreateTwoPhaseFile(TransactionId xid, void *content, int len)
 {
 	char		path[MAXPGPATH];
-	pg_crc32	statefile_crc;
+	pg_crc32c	statefile_crc;
 	int			fd;
 
 	/* Recompute CRC */
@@ -1572,7 +1572,7 @@ RecreateTwoPhaseFile(TransactionId xid, void *content, int len)
 				(errcode_for_file_access(),
 				 errmsg("could not write two-phase state file: %m")));
 	}
-	if (write(fd, &statefile_crc, sizeof(pg_crc32)) != sizeof(pg_crc32))
+	if (write(fd, &statefile_crc, sizeof(pg_crc32c)) != sizeof(pg_crc32c))
 	{
 		CloseTransientFile(fd);
 		ereport(ERROR,
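Note on the twophase.c hunks above: they change only the declared type of the state-file checksum from pg_crc32 to pg_crc32c; the surrounding code already computes it with the COMP_CRC32C/FIN_CRC32C macros. For readers unfamiliar with the algorithm, here is a minimal, unoptimized sketch of CRC-32C (Castagnoli polynomial). It is an illustration only and not the implementation in this patch: the function name crc32c_bitwise and the macro CRC32C_POLY_REFLECTED are hypothetical, and the all-ones initial value and final inversion follow the common CRC-32C convention rather than anything shown in the diff.

```c
#include <stddef.h>
#include <stdint.h>

/* Reflected form of the Castagnoli polynomial used by CRC-32C. */
#define CRC32C_POLY_REFLECTED 0x82F63B78u

/*
 * Hypothetical helper, not part of the patch: compute CRC-32C over a
 * buffer one bit at a time.  Slow, but easy to compare against a
 * table-driven or hardware-assisted implementation.
 */
uint32_t
crc32c_bitwise(const void *data, size_t len)
{
	const unsigned char *p = data;
	uint32_t	crc = 0xFFFFFFFFu;	/* conventional initial value */

	while (len-- > 0)
	{
		crc ^= *p++;
		for (int bit = 0; bit < 8; bit++)
			crc = (crc >> 1) ^ ((crc & 1u) ? CRC32C_POLY_REFLECTED : 0u);
	}
	return crc ^ 0xFFFFFFFFu;	/* conventional final inversion */
}
```

Under these conventions, crc32c_bitwise("123456789", 9) returns 0xE3069283, the standard check value for CRC-32C, which makes the sketch usable as a cross-check for a faster implementation.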
diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index 5688268..2580996 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -862,7 +862,7 @@ XLogRecPtr
 XLogInsertRecord(XLogRecData *rdata, XLogRecPtr fpw_lsn)
 {
 	XLogCtlInsert *Insert = &XLogCtl->Insert;
-	pg_crc32	rdata_crc;
+	pg_crc32c	rdata_crc;
 	bool		inserted;
 	XLogRecord *rechdr = (XLogRecord *) rdata->data;
 	bool		isLogSwitch = (rechdr->xl_rmid == RM_XLOG_ID &&
@@ -4179,7 +4179,7 @@ WriteControlFile(void)
 static void
 ReadControlFile(void)
 {
-	pg_crc32	crc;
+	pg_crc32c	crc;
 	int			fd;
 
 	/*
@@ -4681,7 +4681,7 @@ BootStrapXLOG(void)
 	bool		use_existent;
 	uint64		sysidentifier;
 	struct timeval tv;
-	pg_crc32	crc;
+	pg_crc32c	crc;
 
 	/*
 	 * Select a hopefully-unique system identifier code for this installation.
@@ -7903,6 +7903,7 @@ CreateCheckPoint(int flags)
 	uint32		freespace;
 	XLogRecPtr	PriorRedoPtr;
 	XLogRecPtr	curInsert;
+	XLogRecPtr	prevPtr;
 	VirtualTransactionId *vxids;
 	int			nvxids;
 
@@ -7988,6 +7989,7 @@ CreateCheckPoint(int flags)
 	 */
 	WALInsertLockAcquireExclusive();
 	curInsert = XLogBytePosToRecPtr(Insert->CurrBytePos);
+	prevPtr = XLogBytePosToRecPtr(Insert->PrevBytePos);
 
 	/*
 	 * If this isn't a shutdown or forced checkpoint, and we have not inserted
@@ -7999,17 +8001,17 @@ CreateCheckPoint(int flags)
 	 * (Perhaps it'd make even more sense to checkpoint only when the previous
 	 * checkpoint record is in a different xlog page?)
 	 *
-	 * We have to make two tests to determine that nothing has happened since
-	 * the start of the last checkpoint: current insertion point must match
-	 * the end of the last checkpoint record, and its redo pointer must point
-	 * to itself.
+	 * If the previous checkpoint crossed a WAL segment, however, we create
+	 * the checkpoint anyway, to have the latest checkpoint fully contained in
+	 * the new segment. This is for a little bit of extra robustness: it's
+	 * better if you don't need to keep two WAL segments around to recover the
+	 * checkpoint.
 	 */
 	if ((flags & (CHECKPOINT_IS_SHUTDOWN | CHECKPOINT_END_OF_RECOVERY |
 				  CHECKPOINT_FORCE)) == 0)
 	{
-		if (curInsert == ControlFile->checkPoint +
-			MAXALIGN(SizeOfXLogRecord + sizeof(CheckPoint)) &&
-			ControlFile->checkPoint == ControlFile->checkPointCopy.redo)
+		if (prevPtr == ControlFile->checkPointCopy.redo &&
+			prevPtr / XLOG_SEG_SIZE == curInsert / XLOG_SEG_SIZE)
 		{
 			WALInsertLockRelease();
 			LWLockRelease(CheckpointLock);
diff --git a/src/backend/access/transam/xloginsert.c b/src/backend/access/transam/xloginsert.c
index 88209c3..618f879 100644
--- a/src/backend/access/transam/xloginsert.c
+++ b/src/backend/access/transam/xloginsert.c
@@ -459,7 +459,7 @@ XLogRecordAssemble(RmgrId rmid, uint8 info,
 	XLogRecData *rdt;
 	uint32		total_len = 0;
 	int			block_id;
-	pg_crc32	rdata_crc;
+	pg_crc32c	rdata_crc;
 	registered_buffer *prev_regbuf = NULL;
 	XLogRecData *rdt_datas_last;
 	XLogRecord *rechdr;
diff --git a/src/backend/access/transam/xlogreader.c b/src/backend/access/transam/xlogreader.c
index a4124d9..77be1b8 100644
--- a/src/backend/access/transam/xlogreader.c
+++ b/src/backend/access/transam/xlogreader.c
@@ -665,7 +665,7 @@ ValidXLogRecordHeader(XLogReaderState *state, XLogRecPtr RecPtr,
 static bool
 ValidXLogRecord(XLogReaderState *state, XLogRecord *record, XLogRecPtr recptr)
 {
-	pg_crc32	crc;
+	pg_crc32c	crc;
 
 	/* Calculate the CRC */
 	INIT_CRC32C(crc);
diff --git a/src/backend/catalog/heap.c b/src/backend/catalog/heap.c
index c518c50..d04e94d 100644
--- a/src/backend/catalog/heap.c
+++ b/src/backend/catalog/heap.c
@@ -76,7 +76,7 @@
 #include "utils/tqual.h"
 
 
-/* Potentially set by contrib/pg_upgrade_support functions */
+/* Potentially set by pg_upgrade_support functions */
 Oid			binary_upgrade_next_heap_pg_class_oid = InvalidOid;
 Oid			binary_upgrade_next_toast_pg_class_oid = InvalidOid;
 
diff --git a/src/backend/catalog/index.c b/src/backend/catalog/index.c
index d8ff554..ac3b785 100644
--- a/src/backend/catalog/index.c
+++ b/src/backend/catalog/index.c
@@ -69,7 +69,7 @@
 #include "utils/tqual.h"
 
 
-/* Potentially set by contrib/pg_upgrade_support functions */
+/* Potentially set by pg_upgrade_support functions */
 Oid			binary_upgrade_next_index_pg_class_oid = InvalidOid;
 
 /* state info for validate_index bulkdelete callback */
diff --git a/src/backend/catalog/pg_enum.c b/src/backend/catalog/pg_enum.c
index d87090a..c880486 100644
--- a/src/backend/catalog/pg_enum.c
+++ b/src/backend/catalog/pg_enum.c
@@ -31,7 +31,7 @@
 #include "utils/tqual.h"
 
 
-/* Potentially set by contrib/pg_upgrade_support functions */
+/* Potentially set by pg_upgrade_support functions */
 Oid			binary_upgrade_next_pg_enum_oid = InvalidOid;
 
 static void RenumberEnumType(Relation pg_enum, HeapTuple *existing, int nelems);
diff --git a/src/backend/catalog/pg_type.c b/src/backend/catalog/pg_type.c
index d1ed53f..32453c3 100644
--- a/src/backend/catalog/pg_type.c
+++ b/src/backend/catalog/pg_type.c
@@ -36,7 +36,7 @@
 #include "utils/rel.h"
 #include "utils/syscache.h"
 
-/* Potentially set by contrib/pg_upgrade_support functions */
+/* Potentially set by pg_upgrade_support functions */
 Oid			binary_upgrade_next_pg_type_oid = InvalidOid;
 
 /* ----------------------------------------------------------------
diff --git a/src/backend/catalog/toasting.c b/src/backend/catalog/toasting.c
index d14c33c..c99d353 100644
--- a/src/backend/catalog/toasting.c
+++ b/src/backend/catalog/toasting.c
@@ -32,7 +32,7 @@
 #include "utils/rel.h"
 #include "utils/syscache.h"
 
-/* Potentially set by contrib/pg_upgrade_support functions */
+/* Potentially set by pg_upgrade_support functions */
 Oid			binary_upgrade_next_toast_pg_type_oid = InvalidOid;
 
 static void CheckAndCreateToastTable(Oid relOid, Datum reloptions,
diff --git a/src/backend/commands/typecmds.c b/src/backend/commands/typecmds.c
index 67e2ae2..907ba11 100644
--- a/src/backend/commands/typecmds.c
+++ b/src/backend/commands/typecmds.c
@@ -80,7 +80,7 @@ typedef struct
 	/* atts[] is of allocated length RelationGetNumberOfAttributes(rel) */
 } RelToCheck;
 
-/* Potentially set by contrib/pg_upgrade_support functions */
+/* Potentially set by pg_upgrade_support functions */
 Oid			binary_upgrade_next_array_pg_type_oid = InvalidOid;
 
 static void makeRangeConstructors(const char *name, Oid namespace,
diff --git a/src/backend/commands/user.c b/src/backend/commands/user.c
index 75f1b3c..456c27e 100644
--- a/src/backend/commands/user.c
+++ b/src/backend/commands/user.c
@@ -38,7 +38,7 @@
 #include "utils/timestamp.h"
 #include "utils/tqual.h"
 
-/* Potentially set by contrib/pg_upgrade_support functions */
+/* Potentially set by pg_upgrade_support functions */
 Oid			binary_upgrade_next_pg_authid_oid = InvalidOid;
 
 
diff --git a/src/backend/replication/logical/snapbuild.c b/src/backend/replication/logical/snapbuild.c
index ff5ff26..82e7d98 100644
--- a/src/backend/replication/logical/snapbuild.c
+++ b/src/backend/replication/logical/snapbuild.c
@@ -1391,7 +1391,7 @@ typedef struct SnapBuildOnDisk
 
 	/* data not covered by checksum */
 	uint32		magic;
-	pg_crc32	checksum;
+	pg_crc32c	checksum;
 
 	/* data covered by checksum */
 
@@ -1634,7 +1634,7 @@ SnapBuildRestore(SnapBuild *builder, XLogRecPtr lsn)
 	char		path[MAXPGPATH];
 	Size		sz;
 	int			readBytes;
-	pg_crc32	checksum;
+	pg_crc32c	checksum;
 
 	/* no point in loading a snapshot if we're already there */
 	if (builder->state == SNAPBUILD_CONSISTENT)
diff --git a/src/backend/replication/slot.c b/src/backend/replication/slot.c
index fcd7ba1..16ea80b 100644
--- a/src/backend/replication/slot.c
+++ b/src/backend/replication/slot.c
@@ -56,7 +56,7 @@ typedef struct ReplicationSlotOnDisk
 
 	/* data not covered by checksum */
 	uint32		magic;
-	pg_crc32	checksum;
+	pg_crc32c	checksum;
 
 	/* data covered by checksum */
 	uint32		version;
@@ -1075,7 +1075,7 @@ RestoreSlotFromDisk(const char *name)
 	int			fd;
 	bool		restored = false;
 	int			readBytes;
-	pg_crc32	checksum;
+	pg_crc32c	checksum;
 
 	/* no need to lock here, no concurrent access allowed yet */
 
diff --git a/src/backend/utils/adt/Makefile b/src/backend/utils/adt/Makefile
index 20e5ff1..1f1bee7 100644
--- a/src/backend/utils/adt/Makefile
+++ b/src/backend/utils/adt/Makefile
@@ -25,7 +25,8 @@ OBJS = acl.o arrayfuncs.o array_selfuncs.o array_typanalyze.o \
 	jsonfuncs.o like.o lockfuncs.o mac.o misc.o nabstime.o name.o \
 	network.o network_gist.o network_selfuncs.o \
 	numeric.o numutils.o oid.o oracle_compat.o \
-	orderedsetaggs.o pg_locale.o pg_lsn.o pgstatfuncs.o \
+	orderedsetaggs.o pg_locale.o pg_lsn.o pg_upgrade_support.o \
+	pgstatfuncs.o \
 	pseudotypes.o quote.o rangetypes.o rangetypes_gist.o \
 	rangetypes_selfuncs.o rangetypes_spgist.o rangetypes_typanalyze.o \
 	regexp.o regproc.o ri_triggers.o rowtypes.o ruleutils.o \
diff --git a/src/backend/utils/adt/pg_upgrade_support.c b/src/backend/utils/adt/pg_upgrade_support.c
new file mode 100644
index 0000000..d69fa53
--- /dev/null
+++ b/src/backend/utils/adt/pg_upgrade_support.c
@@ -0,0 +1,183 @@
+/*
+ *	pg_upgrade_support.c
+ *
+ *	server-side functions to set backend global variables
+ *	to control oid and relfilenode assignment, and do other special
+ *	hacks needed for pg_upgrade.
+ *
+ *	Copyright (c) 2010-2015, PostgreSQL Global Development Group
+ *	src/backend/utils/adt/pg_upgrade_support.c
+ */
+
+#include "postgres.h"
+
+#include "catalog/binary_upgrade.h"
+#include "catalog/namespace.h"
+#include "catalog/pg_type.h"
+#include "commands/extension.h"
+#include "miscadmin.h"
+#include "utils/array.h"
+#include "utils/builtins.h"
+
+
+Datum binary_upgrade_set_next_pg_type_oid(PG_FUNCTION_ARGS);
+Datum binary_upgrade_set_next_array_pg_type_oid(PG_FUNCTION_ARGS);
+Datum binary_upgrade_set_next_toast_pg_type_oid(PG_FUNCTION_ARGS);
+Datum binary_upgrade_set_next_heap_pg_class_oid(PG_FUNCTION_ARGS);
+Datum binary_upgrade_set_next_index_pg_class_oid(PG_FUNCTION_ARGS);
+Datum binary_upgrade_set_next_toast_pg_class_oid(PG_FUNCTION_ARGS);
+Datum binary_upgrade_set_next_pg_enum_oid(PG_FUNCTION_ARGS);
+Datum binary_upgrade_set_next_pg_authid_oid(PG_FUNCTION_ARGS);
+Datum binary_upgrade_create_empty_extension(PG_FUNCTION_ARGS);
+
+
+#define CHECK_IS_BINARY_UPGRADE 								\
+do { 															\
+	if (!IsBinaryUpgrade)										\
+		ereport(ERROR,											\
+				(errcode(ERRCODE_CANT_CHANGE_RUNTIME_PARAM),	\
+				 (errmsg("function can only be called when server is in binary upgrade mode")))); \
+} while (0)
+
+Datum
+binary_upgrade_set_next_pg_type_oid(PG_FUNCTION_ARGS)
+{
+	Oid			typoid = PG_GETARG_OID(0);
+
+	CHECK_IS_BINARY_UPGRADE;
+	binary_upgrade_next_pg_type_oid = typoid;
+
+	PG_RETURN_VOID();
+}
+
+Datum
+binary_upgrade_set_next_array_pg_type_oid(PG_FUNCTION_ARGS)
+{
+	Oid			typoid = PG_GETARG_OID(0);
+
+	CHECK_IS_BINARY_UPGRADE;
+	binary_upgrade_next_array_pg_type_oid = typoid;
+
+	PG_RETURN_VOID();
+}
+
+Datum
+binary_upgrade_set_next_toast_pg_type_oid(PG_FUNCTION_ARGS)
+{
+	Oid			typoid = PG_GETARG_OID(0);
+
+	CHECK_IS_BINARY_UPGRADE;
+	binary_upgrade_next_toast_pg_type_oid = typoid;
+
+	PG_RETURN_VOID();
+}
+
+Datum
+binary_upgrade_set_next_heap_pg_class_oid(PG_FUNCTION_ARGS)
+{
+	Oid			reloid = PG_GETARG_OID(0);
+
+	CHECK_IS_BINARY_UPGRADE;
+	binary_upgrade_next_heap_pg_class_oid = reloid;
+
+	PG_RETURN_VOID();
+}
+
+Datum
+binary_upgrade_set_next_index_pg_class_oid(PG_FUNCTION_ARGS)
+{
+	Oid			reloid = PG_GETARG_OID(0);
+
+	CHECK_IS_BINARY_UPGRADE;
+	binary_upgrade_next_index_pg_class_oid = reloid;
+
+	PG_RETURN_VOID();
+}
+
+Datum
+binary_upgrade_set_next_toast_pg_class_oid(PG_FUNCTION_ARGS)
+{
+	Oid			reloid = PG_GETARG_OID(0);
+
+	CHECK_IS_BINARY_UPGRADE;
+	binary_upgrade_next_toast_pg_class_oid = reloid;
+
+	PG_RETURN_VOID();
+}
+
+Datum
+binary_upgrade_set_next_pg_enum_oid(PG_FUNCTION_ARGS)
+{
+	Oid			enumoid = PG_GETARG_OID(0);
+
+	CHECK_IS_BINARY_UPGRADE;
+	binary_upgrade_next_pg_enum_oid = enumoid;
+
+	PG_RETURN_VOID();
+}
+
+Datum
+binary_upgrade_set_next_pg_authid_oid(PG_FUNCTION_ARGS)
+{
+	Oid			authoid = PG_GETARG_OID(0);
+
+	CHECK_IS_BINARY_UPGRADE;
+	binary_upgrade_next_pg_authid_oid = authoid;
+	PG_RETURN_VOID();
+}
+
+Datum
+binary_upgrade_create_empty_extension(PG_FUNCTION_ARGS)
+{
+	text	   *extName = PG_GETARG_TEXT_PP(0);
+	text	   *schemaName = PG_GETARG_TEXT_PP(1);
+	bool		relocatable = PG_GETARG_BOOL(2);
+	text	   *extVersion = PG_GETARG_TEXT_PP(3);
+	Datum		extConfig;
+	Datum		extCondition;
+	List	   *requiredExtensions;
+
+	CHECK_IS_BINARY_UPGRADE;
+
+	if (PG_ARGISNULL(4))
+		extConfig = PointerGetDatum(NULL);
+	else
+		extConfig = PG_GETARG_DATUM(4);
+
+	if (PG_ARGISNULL(5))
+		extCondition = PointerGetDatum(NULL);
+	else
+		extCondition = PG_GETARG_DATUM(5);
+
+	requiredExtensions = NIL;
+	if (!PG_ARGISNULL(6))
+	{
+		ArrayType  *textArray = PG_GETARG_ARRAYTYPE_P(6);
+		Datum	   *textDatums;
+		int			ndatums;
+		int			i;
+
+		deconstruct_array(textArray,
+						  TEXTOID, -1, false, 'i',
+						  &textDatums, NULL, &ndatums);
+		for (i = 0; i < ndatums; i++)
+		{
+			text	   *txtname = DatumGetTextPP(textDatums[i]);
+			char	   *extName = text_to_cstring(txtname);
+			Oid			extOid = get_extension_oid(extName, false);
+
+			requiredExtensions = lappend_oid(requiredExtensions, extOid);
+		}
+	}
+
+	InsertExtensionTuple(text_to_cstring(extName),
+						 GetUserId(),
+					   get_namespace_oid(text_to_cstring(schemaName), false),
+						 relocatable,
+						 text_to_cstring(extVersion),
+						 extConfig,
+						 extCondition,
+						 requiredExtensions);
+
+	PG_RETURN_VOID();
+}
diff --git a/src/backend/utils/adt/tsgistidx.c b/src/backend/utils/adt/tsgistidx.c
index 25132be..0a1e664 100644
--- a/src/backend/utils/adt/tsgistidx.c
+++ b/src/backend/utils/adt/tsgistidx.c
@@ -17,6 +17,7 @@
 #include "access/gist.h"
 #include "access/tuptoaster.h"
 #include "tsearch/ts_utils.h"
+#include "utils/pg_crc.h"
 
 
 #define SIGLENINT  31			/* >121 => key will toast, so it will not work
diff --git a/src/backend/utils/adt/tsquery.c b/src/backend/utils/adt/tsquery.c
index acabd94..2c32ffe 100644
--- a/src/backend/utils/adt/tsquery.c
+++ b/src/backend/utils/adt/tsquery.c
@@ -14,13 +14,13 @@
 
 #include "postgres.h"
 
-#include "common/pg_crc.h"
 #include "libpq/pqformat.h"
 #include "miscadmin.h"
 #include "tsearch/ts_locale.h"
 #include "tsearch/ts_utils.h"
 #include "utils/builtins.h"
 #include "utils/memutils.h"
+#include "utils/pg_crc.h"
 
 
 struct TSQueryParserStateData
diff --git a/src/backend/utils/cache/relmapper.c b/src/backend/utils/cache/relmapper.c
index 48b8351..c151b92 100644
--- a/src/backend/utils/cache/relmapper.c
+++ b/src/backend/utils/cache/relmapper.c
@@ -86,7 +86,7 @@ typedef struct RelMapFile
 	int32		magic;			/* always RELMAPPER_FILEMAGIC */
 	int32		num_mappings;	/* number of valid RelMapping entries */
 	RelMapping	mappings[MAX_MAPPINGS];
-	pg_crc32	crc;			/* CRC of all above */
+	pg_crc32c	crc;			/* CRC of all above */
 	int32		pad;			/* to make the struct size be 512 exactly */
 } RelMapFile;
 
@@ -626,7 +626,7 @@ load_relmap_file(bool shared)
 {
 	RelMapFile *map;
 	char		mapfilename[MAXPGPATH];
-	pg_crc32	crc;
+	pg_crc32c	crc;
 	int			fd;
 
 	if (shared)
diff --git a/src/backend/utils/hash/Makefile b/src/backend/utils/hash/Makefile
index 05d347c..64eebd1 100644
--- a/src/backend/utils/hash/Makefile
+++ b/src/backend/utils/hash/Makefile
@@ -12,6 +12,6 @@ subdir = src/backend/utils/hash
 top_builddir = ../../../..
 include $(top_builddir)/src/Makefile.global
 
-OBJS = dynahash.o hashfn.o
+OBJS = dynahash.o hashfn.o pg_crc.o
 
 include $(top_srcdir)/src/backend/common.mk
diff --git a/src/backend/utils/hash/pg_crc.c b/src/backend/utils/hash/pg_crc.c
new file mode 100644
index 0000000..74c1618
--- /dev/null
+++ b/src/backend/utils/hash/pg_crc.c
@@ -0,0 +1,97 @@
+/*-------------------------------------------------------------------------
+ *
+ * pg_crc.c
+ *	  PostgreSQL CRC support
+ *
+ * See Ross Williams' excellent introduction
+ * A PAINLESS GUIDE TO CRC ERROR DETECTION ALGORITHMS, available from
+ * http://www.ross.net/crc/download/crc_v3.txt or several other net sites.
+ *
+ * Portions Copyright (c) 1996-2015, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ *
+ * IDENTIFICATION
+ *	  src/backend/utils/hash/pg_crc.c
+ *
+ *-------------------------------------------------------------------------
+ */
+
+#include "c.h"
+
+#include "utils/pg_crc.h"
+
+/*
+ * Lookup table for calculating CRC-32 using Sarwate's algorithm.
+ *
+ * This table is based on the polynomial
+ *	x^32+x^26+x^23+x^22+x^16+x^12+x^11+x^10+x^8+x^7+x^5+x^4+x^2+x+1.
+ * (This is the same polynomial used in Ethernet checksums, for instance.)
+ * Using Williams' terms, this is the "normal", not "reflected" version.
+ */
+const uint32 pg_crc32_table[256] = {
+	0x00000000, 0x77073096, 0xEE0E612C, 0x990951BA,
+	0x076DC419, 0x706AF48F, 0xE963A535, 0x9E6495A3,
+	0x0EDB8832, 0x79DCB8A4, 0xE0D5E91E, 0x97D2D988,
+	0x09B64C2B, 0x7EB17CBD, 0xE7B82D07, 0x90BF1D91,
+	0x1DB71064, 0x6AB020F2, 0xF3B97148, 0x84BE41DE,
+	0x1ADAD47D, 0x6DDDE4EB, 0xF4D4B551, 0x83D385C7,
+	0x136C9856, 0x646BA8C0, 0xFD62F97A, 0x8A65C9EC,
+	0x14015C4F, 0x63066CD9, 0xFA0F3D63, 0x8D080DF5,
+	0x3B6E20C8, 0x4C69105E, 0xD56041E4, 0xA2677172,
+	0x3C03E4D1, 0x4B04D447, 0xD20D85FD, 0xA50AB56B,
+	0x35B5A8FA, 0x42B2986C, 0xDBBBC9D6, 0xACBCF940,
+	0x32D86CE3, 0x45DF5C75, 0xDCD60DCF, 0xABD13D59,
+	0x26D930AC, 0x51DE003A, 0xC8D75180, 0xBFD06116,
+	0x21B4F4B5, 0x56B3C423, 0xCFBA9599, 0xB8BDA50F,
+	0x2802B89E, 0x5F058808, 0xC60CD9B2, 0xB10BE924,
+	0x2F6F7C87, 0x58684C11, 0xC1611DAB, 0xB6662D3D,
+	0x76DC4190, 0x01DB7106, 0x98D220BC, 0xEFD5102A,
+	0x71B18589, 0x06B6B51F, 0x9FBFE4A5, 0xE8B8D433,
+	0x7807C9A2, 0x0F00F934, 0x9609A88E, 0xE10E9818,
+	0x7F6A0DBB, 0x086D3D2D, 0x91646C97, 0xE6635C01,
+	0x6B6B51F4, 0x1C6C6162, 0x856530D8, 0xF262004E,
+	0x6C0695ED, 0x1B01A57B, 0x8208F4C1, 0xF50FC457,
+	0x65B0D9C6, 0x12B7E950, 0x8BBEB8EA, 0xFCB9887C,
+	0x62DD1DDF, 0x15DA2D49, 0x8CD37CF3, 0xFBD44C65,
+	0x4DB26158, 0x3AB551CE, 0xA3BC0074, 0xD4BB30E2,
+	0x4ADFA541, 0x3DD895D7, 0xA4D1C46D, 0xD3D6F4FB,
+	0x4369E96A, 0x346ED9FC, 0xAD678846, 0xDA60B8D0,
+	0x44042D73, 0x33031DE5, 0xAA0A4C5F, 0xDD0D7CC9,
+	0x5005713C, 0x270241AA, 0xBE0B1010, 0xC90C2086,
+	0x5768B525, 0x206F85B3, 0xB966D409, 0xCE61E49F,
+	0x5EDEF90E, 0x29D9C998, 0xB0D09822, 0xC7D7A8B4,
+	0x59B33D17, 0x2EB40D81, 0xB7BD5C3B, 0xC0BA6CAD,
+	0xEDB88320, 0x9ABFB3B6, 0x03B6E20C, 0x74B1D29A,
+	0xEAD54739, 0x9DD277AF, 0x04DB2615, 0x73DC1683,
+	0xE3630B12, 0x94643B84, 0x0D6D6A3E, 0x7A6A5AA8,
+	0xE40ECF0B, 0x9309FF9D, 0x0A00AE27, 0x7D079EB1,
+	0xF00F9344, 0x8708A3D2, 0x1E01F268, 0x6906C2FE,
+	0xF762575D, 0x806567CB, 0x196C3671, 0x6E6B06E7,
+	0xFED41B76, 0x89D32BE0, 0x10DA7A5A, 0x67DD4ACC,
+	0xF9B9DF6F, 0x8EBEEFF9, 0x17B7BE43, 0x60B08ED5,
+	0xD6D6A3E8, 0xA1D1937E, 0x38D8C2C4, 0x4FDFF252,
+	0xD1BB67F1, 0xA6BC5767, 0x3FB506DD, 0x48B2364B,
+	0xD80D2BDA, 0xAF0A1B4C, 0x36034AF6, 0x41047A60,
+	0xDF60EFC3, 0xA867DF55, 0x316E8EEF, 0x4669BE79,
+	0xCB61B38C, 0xBC66831A, 0x256FD2A0, 0x5268E236,
+	0xCC0C7795, 0xBB0B4703, 0x220216B9, 0x5505262F,
+	0xC5BA3BBE, 0xB2BD0B28, 0x2BB45A92, 0x5CB36A04,
+	0xC2D7FFA7, 0xB5D0CF31, 0x2CD99E8B, 0x5BDEAE1D,
+	0x9B64C2B0, 0xEC63F226, 0x756AA39C, 0x026D930A,
+	0x9C0906A9, 0xEB0E363F, 0x72076785, 0x05005713,
+	0x95BF4A82, 0xE2B87A14, 0x7BB12BAE, 0x0CB61B38,
+	0x92D28E9B, 0xE5D5BE0D, 0x7CDCEFB7, 0x0BDBDF21,
+	0x86D3D2D4, 0xF1D4E242, 0x68DDB3F8, 0x1FDA836E,
+	0x81BE16CD, 0xF6B9265B, 0x6FB077E1, 0x18B74777,
+	0x88085AE6, 0xFF0F6A70, 0x66063BCA, 0x11010B5C,
+	0x8F659EFF, 0xF862AE69, 0x616BFFD3, 0x166CCF45,
+	0xA00AE278, 0xD70DD2EE, 0x4E048354, 0x3903B3C2,
+	0xA7672661, 0xD06016F7, 0x4969474D, 0x3E6E77DB,
+	0xAED16A4A, 0xD9D65ADC, 0x40DF0B66, 0x37D83BF0,
+	0xA9BCAE53, 0xDEBB9EC5, 0x47B2CF7F, 0x30B5FFE9,
+	0xBDBDF21C, 0xCABAC28A, 0x53B39330, 0x24B4A3A6,
+	0xBAD03605, 0xCDD70693, 0x54DE5729, 0x23D967BF,
+	0xB3667A2E, 0xC4614AB8, 0x5D681B02, 0x2A6F2B94,
+	0xB40BBE37, 0xC30C8EA1, 0x5A05DF1B, 0x2D02EF8D
+};
diff --git a/src/bin/Makefile b/src/bin/Makefile
index bb77142..cc78798 100644
--- a/src/bin/Makefile
+++ b/src/bin/Makefile
@@ -23,6 +23,7 @@ SUBDIRS = \
 	pg_dump \
 	pg_resetxlog \
 	pg_rewind \
+	pg_upgrade \
 	pgbench \
 	psql \
 	scripts
diff --git a/src/bin/pg_controldata/pg_controldata.c b/src/bin/pg_controldata/pg_controldata.c
index a838bb5..d8cfe5e 100644
--- a/src/bin/pg_controldata/pg_controldata.c
+++ b/src/bin/pg_controldata/pg_controldata.c
@@ -92,7 +92,7 @@ main(int argc, char *argv[])
 	int			fd;
 	char		ControlFilePath[MAXPGPATH];
 	char	   *DataDir = NULL;
-	pg_crc32	crc;
+	pg_crc32c	crc;
 	time_t		time_tmp;
 	char		pgctime_str[128];
 	char		ckpttime_str[128];
diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
index 7da5c41..fe08c1b 100644
--- a/src/bin/pg_dump/pg_dump.c
+++ b/src/bin/pg_dump/pg_dump.c
@@ -3045,7 +3045,7 @@ binary_upgrade_set_type_oids_by_type_oid(Archive *fout,
 
 	appendPQExpBufferStr(upgrade_buffer, "\n-- For binary upgrade, must preserve pg_type oid\n");
 	appendPQExpBuffer(upgrade_buffer,
-	 "SELECT binary_upgrade.set_next_pg_type_oid('%u'::pg_catalog.oid);\n\n",
+	 "SELECT pg_catalog.binary_upgrade_set_next_pg_type_oid('%u'::pg_catalog.oid);\n\n",
 					  pg_type_oid);
 
 	/* we only support old >= 8.3 for binary upgrades */
@@ -3064,7 +3064,7 @@ binary_upgrade_set_type_oids_by_type_oid(Archive *fout,
 		appendPQExpBufferStr(upgrade_buffer,
 			   "\n-- For binary upgrade, must preserve pg_type array oid\n");
 		appendPQExpBuffer(upgrade_buffer,
-						  "SELECT binary_upgrade.set_next_array_pg_type_oid('%u'::pg_catalog.oid);\n\n",
+						  "SELECT pg_catalog.binary_upgrade_set_next_array_pg_type_oid('%u'::pg_catalog.oid);\n\n",
 						  pg_type_array_oid);
 	}
 
@@ -3106,7 +3106,7 @@ binary_upgrade_set_type_oids_by_rel_oid(Archive *fout,
 
 		appendPQExpBufferStr(upgrade_buffer, "\n-- For binary upgrade, must preserve pg_type toast oid\n");
 		appendPQExpBuffer(upgrade_buffer,
-						  "SELECT binary_upgrade.set_next_toast_pg_type_oid('%u'::pg_catalog.oid);\n\n",
+						  "SELECT pg_catalog.binary_upgrade_set_next_toast_pg_type_oid('%u'::pg_catalog.oid);\n\n",
 						  pg_type_toast_oid);
 
 		toast_set = true;
@@ -3146,7 +3146,7 @@ binary_upgrade_set_pg_class_oids(Archive *fout,
 	if (!is_index)
 	{
 		appendPQExpBuffer(upgrade_buffer,
-						  "SELECT binary_upgrade.set_next_heap_pg_class_oid('%u'::pg_catalog.oid);\n",
+						  "SELECT pg_catalog.binary_upgrade_set_next_heap_pg_class_oid('%u'::pg_catalog.oid);\n",
 						  pg_class_oid);
 		/* only tables have toast tables, not indexes */
 		if (OidIsValid(pg_class_reltoastrelid))
@@ -3161,18 +3161,18 @@ binary_upgrade_set_pg_class_oids(Archive *fout,
 			 */
 
 			appendPQExpBuffer(upgrade_buffer,
-							  "SELECT binary_upgrade.set_next_toast_pg_class_oid('%u'::pg_catalog.oid);\n",
+							  "SELECT pg_catalog.binary_upgrade_set_next_toast_pg_class_oid('%u'::pg_catalog.oid);\n",
 							  pg_class_reltoastrelid);
 
 			/* every toast table has an index */
 			appendPQExpBuffer(upgrade_buffer,
-							  "SELECT binary_upgrade.set_next_index_pg_class_oid('%u'::pg_catalog.oid);\n",
+							  "SELECT pg_catalog.binary_upgrade_set_next_index_pg_class_oid('%u'::pg_catalog.oid);\n",
 							  pg_index_indexrelid);
 		}
 	}
 	else
 		appendPQExpBuffer(upgrade_buffer,
-						  "SELECT binary_upgrade.set_next_index_pg_class_oid('%u'::pg_catalog.oid);\n",
+						  "SELECT pg_catalog.binary_upgrade_set_next_index_pg_class_oid('%u'::pg_catalog.oid);\n",
 						  pg_class_oid);
 
 	appendPQExpBufferChar(upgrade_buffer, '\n');
@@ -8352,7 +8352,7 @@ dumpExtension(Archive *fout, DumpOptions *dopt, ExtensionInfo *extinfo)
 		appendPQExpBuffer(q, "DROP EXTENSION IF EXISTS %s;\n", qextname);
 
 		appendPQExpBufferStr(q,
-							 "SELECT binary_upgrade.create_empty_extension(");
+							 "SELECT pg_catalog.binary_upgrade_create_empty_extension(");
 		appendStringLiteralAH(q, extinfo->dobj.name, fout);
 		appendPQExpBufferStr(q, ", ");
 		appendStringLiteralAH(q, extinfo->namespace, fout);
@@ -8530,7 +8530,7 @@ dumpEnumType(Archive *fout, DumpOptions *dopt, TypeInfo *tyinfo)
 			if (i == 0)
 				appendPQExpBufferStr(q, "\n-- For binary upgrade, must preserve pg_enum oids\n");
 			appendPQExpBuffer(q,
-							  "SELECT binary_upgrade.set_next_pg_enum_oid('%u'::pg_catalog.oid);\n",
+							  "SELECT pg_catalog.binary_upgrade_set_next_pg_enum_oid('%u'::pg_catalog.oid);\n",
 							  enum_oid);
 			appendPQExpBuffer(q, "ALTER TYPE %s.",
 							  fmtId(tyinfo->dobj.namespace->dobj.name));
diff --git a/src/bin/pg_dump/pg_dumpall.c b/src/bin/pg_dump/pg_dumpall.c
index 6a7a641..7169ad0 100644
--- a/src/bin/pg_dump/pg_dumpall.c
+++ b/src/bin/pg_dump/pg_dumpall.c
@@ -781,7 +781,7 @@ dumpRoles(PGconn *conn)
 		{
 			appendPQExpBufferStr(buf, "\n-- For binary upgrade, must preserve pg_authid.oid\n");
 			appendPQExpBuffer(buf,
-							  "SELECT binary_upgrade.set_next_pg_authid_oid('%u'::pg_catalog.oid);\n\n",
+							  "SELECT pg_catalog.binary_upgrade_set_next_pg_authid_oid('%u'::pg_catalog.oid);\n\n",
 							  auth_oid);
 		}
 
diff --git a/src/bin/pg_resetxlog/pg_resetxlog.c b/src/bin/pg_resetxlog/pg_resetxlog.c
index 3361111..a0805d8 100644
--- a/src/bin/pg_resetxlog/pg_resetxlog.c
+++ b/src/bin/pg_resetxlog/pg_resetxlog.c
@@ -465,7 +465,7 @@ ReadControlFile(void)
 	int			fd;
 	int			len;
 	char	   *buffer;
-	pg_crc32	crc;
+	pg_crc32c	crc;
 
 	if ((fd = open(XLOG_CONTROL_FILE, O_RDONLY | PG_BINARY, 0)) < 0)
 	{
@@ -1062,7 +1062,7 @@ WriteEmptyXLOG(void)
 	XLogPageHeader page;
 	XLogLongPageHeader longpage;
 	XLogRecord *record;
-	pg_crc32	crc;
+	pg_crc32c	crc;
 	char		path[MAXPGPATH];
 	int			fd;
 	int			nbytes;
diff --git a/src/bin/pg_rewind/RewindTest.pm b/src/bin/pg_rewind/RewindTest.pm
index 50cae2c..e6a5b9b 100644
--- a/src/bin/pg_rewind/RewindTest.pm
+++ b/src/bin/pg_rewind/RewindTest.pm
@@ -22,6 +22,9 @@ package RewindTest;
 # 5. run_pg_rewind - stops the old master (if it's still running) and runs
 # pg_rewind to synchronize it with the now-promoted standby server.
 #
+# 6. clean_rewind_test - stops both servers used in the test, if they're
+# still running.
+#
 # The test script can use the helper functions master_psql and standby_psql
 # to run psql against the master and standby servers, respectively. The
 # test script can also use the $connstr_master and $connstr_standby global
@@ -56,6 +59,7 @@ our @EXPORT = qw(
   create_standby
   promote_standby
   run_pg_rewind
+  clean_rewind_test
 );
 
 
@@ -262,9 +266,8 @@ recovery_target_timeline='latest'
 }
 
 # Clean up after the test. Stop both servers, if they're still running.
-END
+sub clean_rewind_test
 {
-	my $save_rc = $?;
 	if ($test_master_datadir)
 	{
 		system "pg_ctl -D $test_master_datadir -s -m immediate stop 2> /dev/null";
@@ -273,5 +276,12 @@ END
 	{
 		system "pg_ctl -D $test_standby_datadir -s -m immediate stop 2> /dev/null";
 	}
+}
+
+# Stop the test servers, just in case they're still running.
+END
+{
+	my $save_rc = $?;
+	clean_rewind_test();
 	$? = $save_rc;
 }
diff --git a/src/bin/pg_rewind/copy_fetch.c b/src/bin/pg_rewind/copy_fetch.c
index 887fec9..397bf20 100644
--- a/src/bin/pg_rewind/copy_fetch.c
+++ b/src/bin/pg_rewind/copy_fetch.c
@@ -75,17 +75,20 @@ recurse_dir(const char *datadir, const char *parentpath,
 
 		if (lstat(fullpath, &fst) < 0)
 		{
-			pg_log(PG_WARNING, "could not stat file \"%s\": %s",
-				   fullpath, strerror(errno));
-
-			/*
-			 * This is ok, if the new master is running and the file was just
-			 * removed. If it was a data file, there should be a WAL record of
-			 * the removal. If it was something else, it couldn't have been
-			 * critical anyway.
-			 *
-			 * TODO: But complain if we're processing the target dir!
-			 */
+			if (errno == ENOENT)
+			{
+				/*
+				 * File doesn't exist anymore. This is ok, if the new master
+				 * is running and the file was just removed. If it was a data
+				 * file, there should be a WAL record of the removal. If it
+				 * was something else, it couldn't have been critical anyway.
+				 *
+				 * TODO: But complain if we're processing the target dir!
+				 */
+			}
+			else
+				pg_fatal("could not stat file \"%s\": %s",
+						 fullpath, strerror(errno));
 		}
 
 		if (parentpath)
diff --git a/src/bin/pg_rewind/fetch.c b/src/bin/pg_rewind/fetch.c
index eb2dd24..0a1b117 100644
--- a/src/bin/pg_rewind/fetch.c
+++ b/src/bin/pg_rewind/fetch.c
@@ -26,10 +26,10 @@
 #include "filemap.h"
 
 void
-fetchRemoteFileList(void)
+fetchSourceFileList(void)
 {
 	if (datadir_source)
-		traverse_datadir(datadir_source, &process_remote_file);
+		traverse_datadir(datadir_source, &process_source_file);
 	else
 		libpqProcessFileList();
 }
diff --git a/src/bin/pg_rewind/fetch.h b/src/bin/pg_rewind/fetch.h
index d0e7dd3..185d5ea 100644
--- a/src/bin/pg_rewind/fetch.h
+++ b/src/bin/pg_rewind/fetch.h
@@ -25,7 +25,7 @@
  * Common interface. Calls the copy or libpq method depending on global
  * config options.
  */
-extern void fetchRemoteFileList(void);
+extern void fetchSourceFileList(void);
 extern char *fetchFile(char *filename, size_t *filesize);
 extern void executeFileMap(void);
 
diff --git a/src/bin/pg_rewind/filemap.c b/src/bin/pg_rewind/filemap.c
index ee6e6db..1a56866 100644
--- a/src/bin/pg_rewind/filemap.c
+++ b/src/bin/pg_rewind/filemap.c
@@ -51,14 +51,14 @@ filemap_create(void)
 }
 
 /*
- * Callback for processing remote file list.
+ * Callback for processing source file list.
  *
  * This is called once for every file in the source server. We decide what
  * action needs to be taken for the file, depending on whether the file
  * exists in the target and whether the size matches.
  */
 void
-process_remote_file(const char *path, file_type_t type, size_t newsize,
+process_source_file(const char *path, file_type_t type, size_t newsize,
 					const char *link_target)
 {
 	bool		exists;
@@ -97,7 +97,7 @@ process_remote_file(const char *path, file_type_t type, size_t newsize,
 
 	snprintf(localpath, sizeof(localpath), "%s/%s", datadir_target, path);
 
-	/* Does the corresponding local file exist? */
+	/* Does the corresponding file exist in the target data dir? */
 	if (lstat(localpath, &statbuf) < 0)
 	{
 		if (errno != ENOENT)
@@ -185,18 +185,19 @@ process_remote_file(const char *path, file_type_t type, size_t newsize,
 				 *
 				 * If it's smaller in the target, it means that it has been
 				 * truncated in the target, or enlarged in the source, or
-				 * both. If it was truncated locally, we need to copy the
-				 * missing tail from the remote system. If it was enlarged in
-				 * the remote system, there will be WAL records in the remote
+				 * both. If it was truncated in the target, we need to copy the
+				 * missing tail from the source system. If it was enlarged in
+				 * the source system, there will be WAL records in the source
 				 * system for the new blocks, so we wouldn't need to copy them
 				 * here. But we don't know which scenario we're dealing with,
 				 * and there's no harm in copying the missing blocks now, so
 				 * do it now.
 				 *
-				 * If it's the same size, do nothing here. Any locally
-				 * modified blocks will be copied based on parsing the local
-				 * WAL, and any remotely modified blocks will be updated after
-				 * rewinding, when the remote WAL is replayed.
+				 * If it's the same size, do nothing here. Any blocks modified
+				 * in the target will be copied based on parsing the target
+				 * system's WAL, and any blocks modified in the source will be
+				 * updated after rewinding, when the source system's WAL is
+				 * replayed.
 				 */
 				oldsize = statbuf.st_size;
 				if (oldsize < newsize)
@@ -233,14 +234,15 @@ process_remote_file(const char *path, file_type_t type, size_t newsize,
 }
 
 /*
- * Callback for processing local file list.
+ * Callback for processing target file list.
  *
- * All remote files must be already processed before calling this. This only
- * marks local files that didn't exist in the remote system for deletion.
+ * All source files must be already processed before calling this. This only
+ * marks target data directory's files that didn't exist in the source for
+ * deletion.
  */
 void
-process_local_file(const char *path, file_type_t type, size_t oldsize,
-				   const char *link_target)
+process_target_file(const char *path, file_type_t type, size_t oldsize,
+					const char *link_target)
 {
 	bool		exists;
 	char		localpath[MAXPGPATH];
@@ -266,7 +268,7 @@ process_local_file(const char *path, file_type_t type, size_t oldsize,
 		if (map->nlist == 0)
 		{
 			/* should not happen */
-			pg_fatal("remote file list is empty\n");
+			pg_fatal("source file list is empty\n");
 		}
 
 		filemap_list_to_array(map);
@@ -288,7 +290,7 @@ process_local_file(const char *path, file_type_t type, size_t oldsize,
 	exists = (bsearch(&key_ptr, map->array, map->narray, sizeof(file_entry_t *),
 					  path_cmp) != NULL);
 
-	/* Remove any file or folder that doesn't exist in the remote system. */
+	/* Remove any file or folder that doesn't exist in the source system. */
 	if (!exists)
 	{
 		entry = pg_malloc(sizeof(file_entry_t));
@@ -313,16 +315,16 @@ process_local_file(const char *path, file_type_t type, size_t oldsize,
 	else
 	{
 		/*
-		 * We already handled all files that exist in the remote system in
-		 * process_remote_file().
+		 * We already handled all files that exist in the source system in
+		 * process_source_file().
 		 */
 	}
 }
 
 /*
- * This callback gets called while we read the old WAL, for every block that
- * have changed in the local system. It makes note of all the changed blocks
- * in the pagemap of the file.
+ * This callback gets called while we read the WAL in the target, for every
+ * block that has changed in the target system. It makes note of all the
+ * changed blocks in the pagemap of the file.
  */
 void
 process_block_change(ForkNumber forknum, RelFileNode rnode, BlockNumber blkno)
@@ -388,8 +390,8 @@ process_block_change(ForkNumber forknum, RelFileNode rnode, BlockNumber blkno)
 	{
 		/*
 		 * If we don't have any record of this file in the file map, it means
-		 * that it's a relation that doesn't exist in the remote system, and
-		 * it was subsequently removed in the local system, too. We can safely
+		 * that it's a relation that doesn't exist in the source system, and
+		 * it was subsequently removed in the target system, too. We can safely
 		 * ignore it.
 		 */
 	}
diff --git a/src/bin/pg_rewind/filemap.h b/src/bin/pg_rewind/filemap.h
index 8fa1939..73113ec 100644
--- a/src/bin/pg_rewind/filemap.h
+++ b/src/bin/pg_rewind/filemap.h
@@ -62,8 +62,8 @@ typedef struct file_entry_t
 typedef struct filemap_t
 {
 	/*
-	 * New entries are accumulated to a linked list, in process_remote_file
-	 * and process_local_file.
+	 * New entries are accumulated to a linked list, in process_source_file
+	 * and process_target_file.
 	 */
 	file_entry_t *first;
 	file_entry_t *last;
@@ -94,9 +94,12 @@ extern void calculate_totals(void);
 extern void print_filemap(void);
 
 /* Functions for populating the filemap */
-extern void process_remote_file(const char *path, file_type_t type, size_t newsize, const char *link_target);
-extern void process_local_file(const char *path, file_type_t type, size_t newsize, const char *link_target);
-extern void process_block_change(ForkNumber forknum, RelFileNode rnode, BlockNumber blkno);
+extern void process_source_file(const char *path, file_type_t type,
+					size_t newsize, const char *link_target);
+extern void process_target_file(const char *path, file_type_t type,
+					size_t newsize, const char *link_target);
+extern void process_block_change(ForkNumber forknum, RelFileNode rnode,
+					 BlockNumber blkno);
 extern void filemap_finalize(void);
 
 #endif   /* FILEMAP_H */
diff --git a/src/bin/pg_rewind/libpq_fetch.c b/src/bin/pg_rewind/libpq_fetch.c
index e696554..14a8610 100644
--- a/src/bin/pg_rewind/libpq_fetch.c
+++ b/src/bin/pg_rewind/libpq_fetch.c
@@ -190,7 +190,7 @@ libpqProcessFileList(void)
 		else
 			type = FILE_TYPE_REGULAR;
 
-		process_remote_file(path, type, filesize, link_target);
+		process_source_file(path, type, filesize, link_target);
 	}
 }
 
diff --git a/src/bin/pg_rewind/logging.c b/src/bin/pg_rewind/logging.c
index 3e2dc76..0e05f96 100644
--- a/src/bin/pg_rewind/logging.c
+++ b/src/bin/pg_rewind/logging.c
@@ -76,6 +76,9 @@ pg_log(eLogType type, const char *fmt,...)
 }
 
 
+/*
+ * Print an error message, and exit.
+ */
 void
 pg_fatal(const char *fmt,...)
 {
diff --git a/src/bin/pg_rewind/parsexlog.c b/src/bin/pg_rewind/parsexlog.c
index 715aaab..69e28f2 100644
--- a/src/bin/pg_rewind/parsexlog.c
+++ b/src/bin/pg_rewind/parsexlog.c
@@ -314,39 +314,37 @@ extractPageInfo(XLogReaderState *record)
 	{
 		/*
 		 * New databases can be safely ignored. It won't be present in the
-		 * remote system, so it will be copied in toto. There's one
-		 * corner-case, though: if a new, different, database is also created
-		 * in the remote system, we'll see that the files already exist and
-		 * not copy them. That's OK, though; WAL replay of creating the new
-		 * database, from the remote WAL, will re-copy the new database,
-		 * overwriting the database created in the local system.
+		 * source system, so it will be deleted. There's one corner-case,
+		 * though: if a new, different, database is also created in the
+		 * source system, we'll see that the files already exist and not copy
+		 * them. That's OK, though; WAL replay of creating the new database,
+		 * from the source system's WAL, will re-copy the new database,
+		 * overwriting the database created in the target system.
 		 */
 	}
 	else if (rmid == RM_DBASE_ID && rminfo == XLOG_DBASE_DROP)
 	{
 		/*
 		 * An existing database was dropped. We'll see that the files don't
-		 * exist in local system, and copy them in toto from the remote
+		 * exist in the target data dir, and copy them in toto from the source
 		 * system. No need to do anything special here.
 		 */
 	}
 	else if (rmid == RM_SMGR_ID && rminfo == XLOG_SMGR_CREATE)
 	{
 		/*
-		 * We can safely ignore these. The local file will be removed, if it
-		 * doesn't exist in remote system. If a file with same name is created
-		 * in remote system, too, there will be WAL records for all the blocks
-		 * in it.
+		 * We can safely ignore these. The file will be removed from the
+		 * target, if it doesn't exist in source system. If a file with same
+		 * name is created in source system, too, there will be WAL records
+		 * for all the blocks in it.
 		 */
 	}
 	else if (rmid == RM_SMGR_ID && rminfo == XLOG_SMGR_TRUNCATE)
 	{
 		/*
-		 * We can safely ignore these. If a file is truncated locally, we'll
-		 * notice that when we compare the sizes, and will copy the missing
-		 * tail from remote system.
-		 *
-		 * TODO: But it would be nice to do some sanity cross-checking here..
+		 * We can safely ignore these. When we compare the sizes later on,
+		 * we'll notice that they differ, and copy the missing tail from
+		 * source system.
 		 */
 	}
 	else if (info & XLR_SPECIAL_REL_UPDATE)
diff --git a/src/bin/pg_rewind/pg_rewind.c b/src/bin/pg_rewind/pg_rewind.c
index 93341a3..d3ae767 100644
--- a/src/bin/pg_rewind/pg_rewind.c
+++ b/src/bin/pg_rewind/pg_rewind.c
@@ -263,13 +263,13 @@ main(int argc, char **argv)
 		   chkpttli);
 
 	/*
-	 * Build the filemap, by comparing the remote and local data directories.
+	 * Build the filemap, by comparing the source and target data directories.
 	 */
 	filemap_create();
 	pg_log(PG_PROGRESS, "reading source file list\n");
-	fetchRemoteFileList();
+	fetchSourceFileList();
 	pg_log(PG_PROGRESS, "reading target file list\n");
-	traverse_datadir(datadir_target, &process_local_file);
+	traverse_datadir(datadir_target, &process_target_file);
 
 	/*
 	 * Read the target WAL from last checkpoint before the point of fork, to
@@ -508,7 +508,7 @@ createBackupLabel(XLogRecPtr startpoint, TimeLineID starttli, XLogRecPtr checkpo
 static void
 checkControlFile(ControlFileData *ControlFile)
 {
-	pg_crc32	crc;
+	pg_crc32c	crc;
 
 	/* Calculate CRC */
 	INIT_CRC32C(crc);
diff --git a/src/bin/pg_rewind/t/001_basic.pl b/src/bin/pg_rewind/t/001_basic.pl
index ae26d01..a1d679f 100644
--- a/src/bin/pg_rewind/t/001_basic.pl
+++ b/src/bin/pg_rewind/t/001_basic.pl
@@ -78,6 +78,7 @@ in master, before promotion
 ),
 		'tail-copy');
 
+	RewindTest::clean_rewind_test();
 }
 
 # Run the test in both modes
diff --git a/src/bin/pg_rewind/t/002_databases.pl b/src/bin/pg_rewind/t/002_databases.pl
index 1cf9a3a..be1e194 100644
--- a/src/bin/pg_rewind/t/002_databases.pl
+++ b/src/bin/pg_rewind/t/002_databases.pl
@@ -40,6 +40,7 @@ standby_afterpromotion
 ),
 			   'database names');
 
+	RewindTest::clean_rewind_test();
 }
 
 # Run the test in both modes.
diff --git a/src/bin/pg_rewind/t/003_extrafiles.pl b/src/bin/pg_rewind/t/003_extrafiles.pl
index 218b865..ed50659 100644
--- a/src/bin/pg_rewind/t/003_extrafiles.pl
+++ b/src/bin/pg_rewind/t/003_extrafiles.pl
@@ -62,6 +62,8 @@ sub run_test
 			   "$test_master_datadir/tst_standby_dir/standby_subdir",
 			   "$test_master_datadir/tst_standby_dir/standby_subdir/standby_file3"],
 			  "file lists match");
+
+	RewindTest::clean_rewind_test();
 }
 
 # Run the test in both modes.
diff --git a/src/bin/pg_upgrade/.gitignore b/src/bin/pg_upgrade/.gitignore
new file mode 100644
index 0000000..d24ec60
--- /dev/null
+++ b/src/bin/pg_upgrade/.gitignore
@@ -0,0 +1,8 @@
+/pg_upgrade
+# Generated by test suite
+/analyze_new_cluster.sh
+/delete_old_cluster.sh
+/analyze_new_cluster.bat
+/delete_old_cluster.bat
+/log/
+/tmp_check/
diff --git a/src/bin/pg_upgrade/IMPLEMENTATION b/src/bin/pg_upgrade/IMPLEMENTATION
new file mode 100644
index 0000000..9b5ff72
--- /dev/null
+++ b/src/bin/pg_upgrade/IMPLEMENTATION
@@ -0,0 +1,98 @@
+------------------------------------------------------------------------------
+PG_UPGRADE: IN-PLACE UPGRADES FOR POSTGRESQL
+------------------------------------------------------------------------------
+
+Upgrading a PostgreSQL database from one major release to another can be
+an expensive process. For minor upgrades, you can simply install new
+executables and forget about upgrading existing data. But for major
+upgrades, you have to export all of your data using pg_dump, install the
+new release, run initdb to create a new cluster, and then import your
+old data. If you have a lot of data, that can take a considerable amount
+of time. If you have too much data, you may have to buy more storage
+since you need enough room to hold the original data plus the exported
+data.  pg_upgrade can reduce the amount of time and disk space required
+for many upgrades.
+
+The URL http://momjian.us/main/writings/pgsql/pg_upgrade.pdf contains a
+presentation about pg_upgrade internals that mirrors the text
+description below.
+
+------------------------------------------------------------------------------
+WHAT IT DOES
+------------------------------------------------------------------------------
+
+pg_upgrade is a tool that performs an in-place upgrade of existing
+data. Some upgrades change the on-disk representation of data;
+pg_upgrade cannot help in those upgrades.  However, many upgrades do
+not change the on-disk representation of a user-defined table.  In those
+cases, pg_upgrade can move existing user-defined tables from the old
+database cluster into the new cluster.
+
+There are two factors that determine whether an in-place upgrade is
+practical.
+
+First, every table in a cluster shares the same on-disk representation of the
+table headers and trailers and the on-disk representation of tuple
+headers. If this changes between the old version of PostgreSQL and the
+new version, pg_upgrade cannot move existing tables to the new cluster;
+you will have to pg_dump the old data and then import that data into the
+new cluster.
+
+Second, all data types should have the same binary representation
+between the two major PostgreSQL versions.
+
+------------------------------------------------------------------------------
+HOW IT WORKS
+------------------------------------------------------------------------------
+
+To use pg_upgrade during an upgrade, start by installing a fresh
+cluster using the newest version in a new directory. When you've
+finished installation, the new cluster will contain the new executables
+and the usual template0, template1, and postgres, but no user-defined
+tables. At this point, you can shut down the old and new postmasters and
+invoke pg_upgrade.
+
+When pg_upgrade starts, it ensures that all required executables are
+present and contain the expected version numbers. The verification
+process also checks the old and new $PGDATA directories to ensure that
+the expected files and subdirectories are in place.  If the verification
+process succeeds, pg_upgrade starts the old postmaster and runs
+pg_dumpall --schema-only to capture the metadata contained in the old
+cluster. The script produced by pg_dumpall will be used in a later step
+to recreate all user-defined objects in the new cluster.
+
+Note that the script produced by pg_dumpall will only recreate
+user-defined objects, not system-defined objects.  The new cluster will
+contain the system-defined objects created by the latest version of
+PostgreSQL.
+
+Once pg_upgrade has extracted the metadata from the old cluster, it
+performs a number of bookkeeping tasks required to 'sync up' the new
+cluster with the existing data.
+
+First, pg_upgrade copies the commit status information and 'next
+transaction ID' from the old cluster to the new cluster. This step
+ensures that the proper tuples are visible from the new cluster.
+Remember, pg_upgrade does not export/import the content of user-defined
+tables so the transaction IDs in the new cluster must match the
+transaction IDs in the old data. pg_upgrade also copies the starting
+address for write-ahead logs from the old cluster to the new cluster.
+
+Now pg_upgrade begins reconstructing the metadata obtained from the old
+cluster using the first part of the pg_dumpall output.
+
+Next, pg_upgrade executes the remainder of the script produced earlier
+by pg_dumpall --- this script effectively creates the complete
+user-defined metadata from the old cluster to the new cluster.  It
+preserves the relfilenode numbers so TOAST and other references
+to relfilenodes in user data are preserved.  (See binary-upgrade usage
+in pg_dump).
+
+Finally, pg_upgrade links or copies each user-defined table and its
+supporting indexes and toast tables from the old cluster to the new
+cluster.
+
+An important feature of the pg_upgrade design is that it leaves the
+original cluster intact --- if a problem occurs during the upgrade, you
+can still run the previous version, after renaming the tablespaces back
+to the original names.
diff --git a/src/bin/pg_upgrade/Makefile b/src/bin/pg_upgrade/Makefile
new file mode 100644
index 0000000..4eb20d6
--- /dev/null
+++ b/src/bin/pg_upgrade/Makefile
@@ -0,0 +1,42 @@
+# src/bin/pg_upgrade/Makefile
+
+PGFILEDESC = "pg_upgrade - an in-place binary upgrade utility"
+PGAPPICON = win32
+
+subdir = src/bin/pg_upgrade
+top_builddir = ../../..
+include $(top_builddir)/src/Makefile.global
+
+OBJS = check.o controldata.o dump.o exec.o file.o function.o info.o \
+       option.o page.o parallel.o pg_upgrade.o relfilenode.o server.o \
+       tablespace.o util.o version.o $(WIN32RES)
+
+override CPPFLAGS := -DFRONTEND -DDLSUFFIX=\"$(DLSUFFIX)\" -I$(srcdir) -I$(libpq_srcdir) $(CPPFLAGS)
+
+
+all: pg_upgrade
+
+pg_upgrade: $(OBJS) | submake-libpq submake-libpgport
+	$(CC) $(CFLAGS) $^ $(libpq_pgport) $(LDFLAGS) $(LDFLAGS_EX) $(LIBS) -o $@$(X)
+
+install: all installdirs
+	$(INSTALL_PROGRAM) pg_upgrade$(X) '$(DESTDIR)$(bindir)/pg_upgrade$(X)'
+
+installdirs:
+	$(MKDIR_P) '$(DESTDIR)$(bindir)'
+
+uninstall:
+	rm -f '$(DESTDIR)$(bindir)/pg_upgrade$(X)'
+
+clean distclean maintainer-clean:
+	rm -f pg_upgrade$(X) $(OBJS)
+	rm -rf analyze_new_cluster.sh delete_old_cluster.sh log/ tmp_check/ \
+	       pg_upgrade_dump_globals.sql \
+	       pg_upgrade_dump_*.custom pg_upgrade_*.log
+
+check: test.sh all
+	MAKE=$(MAKE) bindir=$(bindir) libdir=$(libdir) EXTRA_REGRESS_OPTS="$(EXTRA_REGRESS_OPTS)" $(SHELL) $< --install
+
+# disabled because it upsets the build farm
+#installcheck: test.sh
+#	MAKE=$(MAKE) bindir=$(bindir) libdir=$(libdir) $(SHELL) $<
diff --git a/src/bin/pg_upgrade/TESTING b/src/bin/pg_upgrade/TESTING
new file mode 100644
index 0000000..4ecfc57
--- /dev/null
+++ b/src/bin/pg_upgrade/TESTING
@@ -0,0 +1,81 @@
+The most effective way to test pg_upgrade, aside from testing on user
+data, is by upgrading the PostgreSQL regression database.
+
+This testing process first requires the creation of a valid regression
+database dump.  Such files contain most database features and are
+specific to each major version of Postgres.
+
+Here are the steps needed to create a regression database dump file:
+
+1)  Create and populate the regression database in the old cluster
+    This database can be created by running 'make installcheck' from
+    src/test/regression.
+
+2)  Use pg_dump to dump out the regression database.  Use the new
+    cluster's pg_dump on the old database to minimize whitespace
+    differences in the diff.
+
+3)  Adjust the regression database dump file
+
+    a)  Perform the load/dump twice
+        This fixes problems with the ordering of COPY columns for
+        inherited tables.
+
+    b)  Change CREATE FUNCTION shared object paths to use '$libdir'
+        The old and new cluster will have different shared object paths.
+
+    c)  Fix any wrapping format differences
+        Commands like CREATE TRIGGER and ALTER TABLE sometimes have
+        differences.
+
+    d)  For pre-9.0, change CREATE OR REPLACE LANGUAGE to CREATE LANGUAGE
+
+    e)  For pre-9.0, remove 'regex_flavor'
+
+    f)  For pre-9.0, adjust extra_float_digits
+        Postgres 9.0 pg_dump uses extra_float_digits=-2 for pre-9.0
+        databases, and extra_float_digits=-3 for >= 9.0 databases.
+        It is necessary to modify 9.0 pg_dump to always use -3, and
+        modify the pre-9.0 old server to accept extra_float_digits=-3.
+
+Once the dump is created, it can be repeatedly loaded into the old
+database, upgraded, and dumped out of the new database, and then
+compared to the original version. To test the dump file, perform these
+steps:
+
+1)  Create the old and new clusters in different directories.
+
+2)  Copy the regression shared object files into the appropriate /lib
+    directory for old and new clusters.
+
+3)  Create the regression database in the old server.
+
+4)  Load the dump file created above into the regression database;
+    check for errors while loading.
+
+5)  Upgrade the old database to the new major version, as outlined in
+    the pg_upgrade manual section.
+
+6)  Use pg_dump to dump out the regression database in the new cluster.
+
+7)  Diff the regression database dump file with the regression dump
+    file loaded into the old server.
+
+The shell script test.sh in this directory performs more or less this
+procedure.  You can invoke it by running
+
+    make check
+
+or by running
+
+    make installcheck
+
+if "make install" (or "make install-world") was done beforehand.
+When invoked without arguments, it will run an upgrade from the
+version in this source tree to a new instance of the same version.  To
+test an upgrade from a different version, invoke it like this:
+
+    make installcheck oldbindir=...otherversion/bin oldsrc=...somewhere/postgresql
+
+In this case, you will have to manually eyeball the resulting dump
+diff for version-specific differences, as explained above.
diff --git a/src/bin/pg_upgrade/check.c b/src/bin/pg_upgrade/check.c
new file mode 100644
index 0000000..647bf34
--- /dev/null
+++ b/src/bin/pg_upgrade/check.c
@@ -0,0 +1,1016 @@
+/*
+ *	check.c
+ *
+ *	server checks and output routines
+ *
+ *	Copyright (c) 2010-2015, PostgreSQL Global Development Group
+ *	src/bin/pg_upgrade/check.c
+ */
+
+#include "postgres_fe.h"
+
+#include "catalog/pg_authid.h"
+#include "mb/pg_wchar.h"
+#include "pg_upgrade.h"
+
+
+static void check_new_cluster_is_empty(void);
+static void check_databases_are_compatible(void);
+static void check_locale_and_encoding(DbInfo *olddb, DbInfo *newdb);
+static bool equivalent_locale(int category, const char *loca, const char *locb);
+static void check_is_install_user(ClusterInfo *cluster);
+static void check_for_prepared_transactions(ClusterInfo *cluster);
+static void check_for_isn_and_int8_passing_mismatch(ClusterInfo *cluster);
+static void check_for_reg_data_type_usage(ClusterInfo *cluster);
+static void check_for_jsonb_9_4_usage(ClusterInfo *cluster);
+static void get_bin_version(ClusterInfo *cluster);
+static char *get_canonical_locale_name(int category, const char *locale);
+
+
+/*
+ * fix_path_separator
+ * For non-Windows, just return the argument.
+ * For Windows convert any forward slash to a backslash
+ * such as is suitable for arguments to builtin commands
+ * like RMDIR and DEL.
+ */
+static char *
+fix_path_separator(char *path)
+{
+#ifdef WIN32
+
+	char	   *result;
+	char	   *c;
+
+	result = pg_strdup(path);
+
+	for (c = result; *c != '\0'; c++)
+		if (*c == '/')
+			*c = '\\';
+
+	return result;
+#else
+
+	return path;
+#endif
+}
+
+void
+output_check_banner(bool live_check)
+{
+	if (user_opts.check && live_check)
+	{
+		pg_log(PG_REPORT, "Performing Consistency Checks on Old Live Server\n");
+		pg_log(PG_REPORT, "------------------------------------------------\n");
+	}
+	else
+	{
+		pg_log(PG_REPORT, "Performing Consistency Checks\n");
+		pg_log(PG_REPORT, "-----------------------------\n");
+	}
+}
+
+
+void
+check_and_dump_old_cluster(bool live_check)
+{
+	/* -- OLD -- */
+
+	if (!live_check)
+		start_postmaster(&old_cluster, true);
+
+	get_pg_database_relfilenode(&old_cluster);
+
+	/* Extract a list of databases and tables from the old cluster */
+	get_db_and_rel_infos(&old_cluster);
+
+	init_tablespaces();
+
+	get_loadable_libraries();
+
+
+	/*
+	 * Check for various failure cases
+	 */
+	check_is_install_user(&old_cluster);
+	check_for_prepared_transactions(&old_cluster);
+	check_for_reg_data_type_usage(&old_cluster);
+	check_for_isn_and_int8_passing_mismatch(&old_cluster);
+	if (GET_MAJOR_VERSION(old_cluster.major_version) == 904 &&
+		old_cluster.controldata.cat_ver < JSONB_FORMAT_CHANGE_CAT_VER)
+		check_for_jsonb_9_4_usage(&old_cluster);
+
+	/* Pre-PG 9.4 had a different 'line' data type internal format */
+	if (GET_MAJOR_VERSION(old_cluster.major_version) <= 903)
+		old_9_3_check_for_line_data_type_usage(&old_cluster);
+
+	/* Pre-PG 9.0 had no large object permissions */
+	if (GET_MAJOR_VERSION(old_cluster.major_version) <= 804)
+		new_9_0_populate_pg_largeobject_metadata(&old_cluster, true);
+
+	/*
+	 * While not a check option, we do this now because this is the only time
+	 * the old server is running.
+	 */
+	if (!user_opts.check)
+		generate_old_dump();
+
+	if (!live_check)
+		stop_postmaster(false);
+}
+
+
+void
+check_new_cluster(void)
+{
+	get_db_and_rel_infos(&new_cluster);
+
+	check_new_cluster_is_empty();
+	check_databases_are_compatible();
+
+	check_loadable_libraries();
+
+	if (user_opts.transfer_mode == TRANSFER_MODE_LINK)
+		check_hard_link();
+
+	check_is_install_user(&new_cluster);
+
+	check_for_prepared_transactions(&new_cluster);
+}
+
+
+void
+report_clusters_compatible(void)
+{
+	if (user_opts.check)
+	{
+		pg_log(PG_REPORT, "\n*Clusters are compatible*\n");
+		/* stops new cluster */
+		stop_postmaster(false);
+		exit(0);
+	}
+
+	pg_log(PG_REPORT, "\n"
+		   "If pg_upgrade fails after this point, you must re-initdb the\n"
+		   "new cluster before continuing.\n");
+}
+
+
+void
+issue_warnings(void)
+{
+	/* Create dummy large object permissions for old < PG 9.0? */
+	if (GET_MAJOR_VERSION(old_cluster.major_version) <= 804)
+	{
+		start_postmaster(&new_cluster, true);
+		new_9_0_populate_pg_largeobject_metadata(&new_cluster, false);
+		stop_postmaster(false);
+	}
+}
+
+
+void
+output_completion_banner(char *analyze_script_file_name,
+						 char *deletion_script_file_name)
+{
+	/* Did we copy the free space files? */
+	if (GET_MAJOR_VERSION(old_cluster.major_version) >= 804)
+		pg_log(PG_REPORT,
+			   "Optimizer statistics are not transferred by pg_upgrade so,\n"
+			   "once you start the new server, consider running:\n"
+			   "    %s\n\n", analyze_script_file_name);
+	else
+		pg_log(PG_REPORT,
+			   "Optimizer statistics and free space information are not transferred\n"
+		"by pg_upgrade so, once you start the new server, consider running:\n"
+			   "    %s\n\n", analyze_script_file_name);
+
+
+	if (deletion_script_file_name)
+		pg_log(PG_REPORT,
+			"Running this script will delete the old cluster's data files:\n"
+			   "    %s\n",
+			   deletion_script_file_name);
+	else
+		pg_log(PG_REPORT,
+			   "Could not create a script to delete the old cluster's data\n"
+		  "files because user-defined tablespaces exist in the old cluster\n"
+		"directory.  The old cluster's contents must be deleted manually.\n");
+}
+
+
+void
+check_cluster_versions(void)
+{
+	prep_status("Checking cluster versions");
+
+	/* get old and new cluster versions */
+	old_cluster.major_version = get_major_server_version(&old_cluster);
+	new_cluster.major_version = get_major_server_version(&new_cluster);
+
+	/*
+	 * We allow upgrades from/to the same major version for alpha/beta
+	 * upgrades
+	 */
+
+	if (GET_MAJOR_VERSION(old_cluster.major_version) < 804)
+		pg_fatal("This utility can only upgrade from PostgreSQL version 8.4 and later.\n");
+
+	/* Only current PG version is supported as a target */
+	if (GET_MAJOR_VERSION(new_cluster.major_version) != GET_MAJOR_VERSION(PG_VERSION_NUM))
+		pg_fatal("This utility can only upgrade to PostgreSQL version %s.\n",
+				 PG_MAJORVERSION);
+
+	/*
+	 * We can't allow downgrading because we use the target pg_dump, and
+	 * pg_dump cannot operate on newer database versions, only current and
+	 * older versions.
+	 */
+	if (old_cluster.major_version > new_cluster.major_version)
+		pg_fatal("This utility cannot be used to downgrade to older major PostgreSQL versions.\n");
+
+	/* get old and new binary versions */
+	get_bin_version(&old_cluster);
+	get_bin_version(&new_cluster);
+
+	/* Ensure binaries match the designated data directories */
+	if (GET_MAJOR_VERSION(old_cluster.major_version) !=
+		GET_MAJOR_VERSION(old_cluster.bin_version))
+		pg_fatal("Old cluster data and binary directories are from different major versions.\n");
+	if (GET_MAJOR_VERSION(new_cluster.major_version) !=
+		GET_MAJOR_VERSION(new_cluster.bin_version))
+		pg_fatal("New cluster data and binary directories are from different major versions.\n");
+
+	check_ok();
+}
+
+
+void
+check_cluster_compatibility(bool live_check)
+{
+	/* get/check pg_control data of servers */
+	get_control_data(&old_cluster, live_check);
+	get_control_data(&new_cluster, false);
+	check_control_data(&old_cluster.controldata, &new_cluster.controldata);
+
+	/* Is it 9.0 but without tablespace directories? */
+	if (GET_MAJOR_VERSION(new_cluster.major_version) == 900 &&
+		new_cluster.controldata.cat_ver < TABLE_SPACE_SUBDIRS_CAT_VER)
+		pg_fatal("This utility can only upgrade to PostgreSQL version 9.0 after 2010-01-11\n"
+				 "because of backend API changes made during development.\n");
+
+	/* We read the real port number for PG >= 9.1 */
+	if (live_check && GET_MAJOR_VERSION(old_cluster.major_version) < 901 &&
+		old_cluster.port == DEF_PGUPORT)
+		pg_fatal("When checking a pre-PG 9.1 live old server, "
+				 "you must specify the old server's port number.\n");
+
+	if (live_check && old_cluster.port == new_cluster.port)
+		pg_fatal("When checking a live server, "
+				 "the old and new port numbers must be different.\n");
+}
+
+
+/*
+ * check_locale_and_encoding()
+ *
+ * Check that locale and encoding of a database in the old and new clusters
+ * are compatible.
+ */
+static void
+check_locale_and_encoding(DbInfo *olddb, DbInfo *newdb)
+{
+	if (olddb->db_encoding != newdb->db_encoding)
+		pg_fatal("encodings for database \"%s\" do not match:  old \"%s\", new \"%s\"\n",
+				 olddb->db_name,
+				 pg_encoding_to_char(olddb->db_encoding),
+				 pg_encoding_to_char(newdb->db_encoding));
+	if (!equivalent_locale(LC_COLLATE, olddb->db_collate, newdb->db_collate))
+		pg_fatal("lc_collate values for database \"%s\" do not match:  old \"%s\", new \"%s\"\n",
+				 olddb->db_name, olddb->db_collate, newdb->db_collate);
+	if (!equivalent_locale(LC_CTYPE, olddb->db_ctype, newdb->db_ctype))
+		pg_fatal("lc_ctype values for database \"%s\" do not match:  old \"%s\", new \"%s\"\n",
+				 olddb->db_name, olddb->db_ctype, newdb->db_ctype);
+}
+
+/*
+ * equivalent_locale()
+ *
+ * Best effort locale-name comparison.  Return false if we are not 100% sure
+ * the locales are equivalent.
+ *
+ * Note: The encoding parts of the names are ignored. This function is
+ * currently used to compare locale names stored in pg_database, and
+ * pg_database contains a separate encoding field. That's compared directly
+ * in check_locale_and_encoding().
+ */
+static bool
+equivalent_locale(int category, const char *loca, const char *locb)
+{
+	const char *chara;
+	const char *charb;
+	char	   *canona;
+	char	   *canonb;
+	int			lena;
+	int			lenb;
+
+	/*
+	 * If the names are equal, the locales are equivalent. Checking this
+	 * first avoids calling setlocale() in the common case that the names
+	 * are equal. That's a good thing, if setlocale() is buggy, for example.
+	 */
+	if (pg_strcasecmp(loca, locb) == 0)
+		return true;
+
+	/*
+	 * Not identical. Canonicalize both names, remove the encoding parts,
+	 * and try again.
+	 */
+	canona = get_canonical_locale_name(category, loca);
+	chara = strrchr(canona, '.');
+	lena = chara ? (chara - canona) : strlen(canona);
+
+	canonb = get_canonical_locale_name(category, locb);
+	charb = strrchr(canonb, '.');
+	lenb = charb ? (charb - canonb) : strlen(canonb);
+
+	if (lena == lenb && pg_strncasecmp(canona, canonb, lena) == 0)
+		return true;
+
+	return false;
+}
+
+
+static void
+check_new_cluster_is_empty(void)
+{
+	int			dbnum;
+
+	for (dbnum = 0; dbnum < new_cluster.dbarr.ndbs; dbnum++)
+	{
+		int			relnum;
+		RelInfoArr *rel_arr = &new_cluster.dbarr.dbs[dbnum].rel_arr;
+
+		for (relnum = 0; relnum < rel_arr->nrels;
+			 relnum++)
+		{
+			/* pg_largeobject and its index should be skipped */
+			if (strcmp(rel_arr->rels[relnum].nspname, "pg_catalog") != 0)
+				pg_fatal("New cluster database \"%s\" is not empty\n",
+						 new_cluster.dbarr.dbs[dbnum].db_name);
+		}
+	}
+}
+
+/*
+ * Check that every database that already exists in the new cluster is
+ * compatible with the corresponding database in the old one.
+ */
+static void
+check_databases_are_compatible(void)
+{
+	int			newdbnum;
+	int			olddbnum;
+	DbInfo	   *newdbinfo;
+	DbInfo	   *olddbinfo;
+
+	for (newdbnum = 0; newdbnum < new_cluster.dbarr.ndbs; newdbnum++)
+	{
+		newdbinfo = &new_cluster.dbarr.dbs[newdbnum];
+
+		/* Find the corresponding database in the old cluster */
+		for (olddbnum = 0; olddbnum < old_cluster.dbarr.ndbs; olddbnum++)
+		{
+			olddbinfo = &old_cluster.dbarr.dbs[olddbnum];
+			if (strcmp(newdbinfo->db_name, olddbinfo->db_name) == 0)
+			{
+				check_locale_and_encoding(olddbinfo, newdbinfo);
+				break;
+			}
+		}
+	}
+}
+
+
+/*
+ * create_script_for_cluster_analyze()
+ *
+ *	This incrementally generates better optimizer statistics
+ */
+void
+create_script_for_cluster_analyze(char **analyze_script_file_name)
+{
+	FILE	   *script = NULL;
+	char	   *user_specification = "";
+
+	prep_status("Creating script to analyze new cluster");
+
+	if (os_info.user_specified)
+		user_specification = psprintf("-U \"%s\" ", os_info.user);
+
+	*analyze_script_file_name = psprintf("%sanalyze_new_cluster.%s",
+										 SCRIPT_PREFIX, SCRIPT_EXT);
+
+	if ((script = fopen_priv(*analyze_script_file_name, "w")) == NULL)
+		pg_fatal("Could not open file \"%s\": %s\n",
+				 *analyze_script_file_name, getErrorText(errno));
+
+#ifndef WIN32
+	/* add shebang header */
+	fprintf(script, "#!/bin/sh\n\n");
+#else
+	/* suppress command echoing */
+	fprintf(script, "@echo off\n");
+#endif
+
+	fprintf(script, "echo %sThis script will generate minimal optimizer statistics rapidly%s\n",
+			ECHO_QUOTE, ECHO_QUOTE);
+	fprintf(script, "echo %sso your system is usable, and then gather statistics twice more%s\n",
+			ECHO_QUOTE, ECHO_QUOTE);
+	fprintf(script, "echo %swith increasing accuracy.  When it is done, your system will%s\n",
+			ECHO_QUOTE, ECHO_QUOTE);
+	fprintf(script, "echo %shave the default level of optimizer statistics.%s\n",
+			ECHO_QUOTE, ECHO_QUOTE);
+	fprintf(script, "echo%s\n\n", ECHO_BLANK);
+
+	fprintf(script, "echo %sIf you have used ALTER TABLE to modify the statistics target for%s\n",
+			ECHO_QUOTE, ECHO_QUOTE);
+	fprintf(script, "echo %sany tables, you might want to remove them and restore them after%s\n",
+			ECHO_QUOTE, ECHO_QUOTE);
+	fprintf(script, "echo %srunning this script because they will delay fast statistics generation.%s\n",
+			ECHO_QUOTE, ECHO_QUOTE);
+	fprintf(script, "echo%s\n\n", ECHO_BLANK);
+
+	fprintf(script, "echo %sIf you would like default statistics as quickly as possible, cancel%s\n",
+			ECHO_QUOTE, ECHO_QUOTE);
+	fprintf(script, "echo %sthis script and run:%s\n",
+			ECHO_QUOTE, ECHO_QUOTE);
+	fprintf(script, "echo %s    \"%s/vacuumdb\" %s--all %s%s\n", ECHO_QUOTE,
+			new_cluster.bindir, user_specification,
+	/* Did we copy the free space files? */
+			(GET_MAJOR_VERSION(old_cluster.major_version) >= 804) ?
+			"--analyze-only" : "--analyze", ECHO_QUOTE);
+	fprintf(script, "echo%s\n\n", ECHO_BLANK);
+
+	fprintf(script, "\"%s/vacuumdb\" %s--all --analyze-in-stages\n",
+			new_cluster.bindir, user_specification);
+	/* Did we copy the free space files? */
+	if (GET_MAJOR_VERSION(old_cluster.major_version) < 804)
+		fprintf(script, "\"%s/vacuumdb\" %s--all\n", new_cluster.bindir,
+				user_specification);
+
+	fprintf(script, "echo%s\n\n", ECHO_BLANK);
+	fprintf(script, "echo %sDone%s\n",
+			ECHO_QUOTE, ECHO_QUOTE);
+
+	fclose(script);
+
+#ifndef WIN32
+	if (chmod(*analyze_script_file_name, S_IRWXU) != 0)
+		pg_fatal("Could not add execute permission to file \"%s\": %s\n",
+				 *analyze_script_file_name, getErrorText(errno));
+#endif
+
+	if (os_info.user_specified)
+		pg_free(user_specification);
+
+	check_ok();
+}
+
+
+/*
+ * create_script_for_old_cluster_deletion()
+ *
+ *	This is particularly useful for tablespace deletion.
+ */
+void
+create_script_for_old_cluster_deletion(char **deletion_script_file_name)
+{
+	FILE	   *script = NULL;
+	int			tblnum;
+	char		old_cluster_pgdata[MAXPGPATH];
+
+	*deletion_script_file_name = psprintf("%sdelete_old_cluster.%s",
+										  SCRIPT_PREFIX, SCRIPT_EXT);
+
+	/*
+	 * Some users (oddly) create tablespaces inside the cluster data
+	 * directory.  We can't create a proper old cluster delete script in that
+	 * case.
+	 */
+	strlcpy(old_cluster_pgdata, old_cluster.pgdata, MAXPGPATH);
+	canonicalize_path(old_cluster_pgdata);
+	for (tblnum = 0; tblnum < os_info.num_old_tablespaces; tblnum++)
+	{
+		char		old_tablespace_dir[MAXPGPATH];
+
+		strlcpy(old_tablespace_dir, os_info.old_tablespaces[tblnum], MAXPGPATH);
+		canonicalize_path(old_tablespace_dir);
+		if (path_is_prefix_of_path(old_cluster_pgdata, old_tablespace_dir))
+		{
+			/* Unlink file in case it is left over from a previous run. */
+			unlink(*deletion_script_file_name);
+			pg_free(*deletion_script_file_name);
+			*deletion_script_file_name = NULL;
+			return;
+		}
+	}
+
+	prep_status("Creating script to delete old cluster");
+
+	if ((script = fopen_priv(*deletion_script_file_name, "w")) == NULL)
+		pg_fatal("Could not open file \"%s\": %s\n",
+				 *deletion_script_file_name, getErrorText(errno));
+
+#ifndef WIN32
+	/* add shebang header */
+	fprintf(script, "#!/bin/sh\n\n");
+#endif
+
+	/* delete old cluster's default tablespace */
+	fprintf(script, RMDIR_CMD " \"%s\"\n", fix_path_separator(old_cluster.pgdata));
+
+	/* delete old cluster's alternate tablespaces */
+	for (tblnum = 0; tblnum < os_info.num_old_tablespaces; tblnum++)
+	{
+		/*
+		 * Do the old cluster's per-database directories share a directory
+		 * with a new version-specific tablespace?
+		 */
+		if (strlen(old_cluster.tablespace_suffix) == 0)
+		{
+			/* delete per-database directories */
+			int			dbnum;
+
+			fprintf(script, "\n");
+			/* remove PG_VERSION? */
+			if (GET_MAJOR_VERSION(old_cluster.major_version) <= 804)
+				fprintf(script, RM_CMD " %s%cPG_VERSION\n",
+						fix_path_separator(os_info.old_tablespaces[tblnum]),
+						PATH_SEPARATOR);
+
+			for (dbnum = 0; dbnum < old_cluster.dbarr.ndbs; dbnum++)
+				fprintf(script, RMDIR_CMD " \"%s%c%d\"\n",
+						fix_path_separator(os_info.old_tablespaces[tblnum]),
+						PATH_SEPARATOR, old_cluster.dbarr.dbs[dbnum].db_oid);
+		}
+		else
+		{
+			char	   *suffix_path = pg_strdup(old_cluster.tablespace_suffix);
+
+			/*
+			 * Simply delete the tablespace directory, which might be ".old"
+			 * or a version-specific subdirectory.
+			 */
+			fprintf(script, RMDIR_CMD " \"%s%s\"\n",
+					fix_path_separator(os_info.old_tablespaces[tblnum]),
+					fix_path_separator(suffix_path));
+			pfree(suffix_path);
+		}
+	}
+
+	fclose(script);
+
+#ifndef WIN32
+	if (chmod(*deletion_script_file_name, S_IRWXU) != 0)
+		pg_fatal("Could not add execute permission to file \"%s\": %s\n",
+				 *deletion_script_file_name, getErrorText(errno));
+#endif
+
+	check_ok();
+}
+
+
+/*
+ *	check_is_install_user()
+ *
+ *	Check we are the install user, and that the new cluster
+ *	has no other users.
+ */
+static void
+check_is_install_user(ClusterInfo *cluster)
+{
+	PGresult   *res;
+	PGconn	   *conn = connectToServer(cluster, "template1");
+
+	prep_status("Checking database user is the install user");
+
+	/* Can't use pg_authid because only superusers can view it. */
+	res = executeQueryOrDie(conn,
+							"SELECT rolsuper, oid "
+							"FROM pg_catalog.pg_roles "
+							"WHERE rolname = current_user");
+
+	/*
+	 * We only allow the install user in the new cluster (see comment below)
+	 * and we preserve pg_authid.oid, so this must be the install user in
+	 * the old cluster too.
+	 */
+	if (PQntuples(res) != 1 ||
+		atooid(PQgetvalue(res, 0, 1)) != BOOTSTRAP_SUPERUSERID)
+		pg_fatal("database user \"%s\" is not the install user\n",
+				 os_info.user);
+
+	PQclear(res);
+
+	res = executeQueryOrDie(conn,
+							"SELECT COUNT(*) "
+							"FROM pg_catalog.pg_roles ");
+
+	if (PQntuples(res) != 1)
+		pg_fatal("could not determine the number of users\n");
+
+	/*
+	 * We only allow the install user in the new cluster because other defined
+	 * users might match users defined in the old cluster and generate an
+	 * error during pg_dump restore.
+	 */
+	if (cluster == &new_cluster && atoi(PQgetvalue(res, 0, 0)) != 1)
+		pg_fatal("Only the install user can be defined in the new cluster.\n");
+
+	PQclear(res);
+
+	PQfinish(conn);
+
+	check_ok();
+}
+
+
+/*
+ *	check_for_prepared_transactions()
+ *
+ *	Make sure there are no prepared transactions because the storage format
+ *	might have changed.
+ */
+static void
+check_for_prepared_transactions(ClusterInfo *cluster)
+{
+	PGresult   *res;
+	PGconn	   *conn = connectToServer(cluster, "template1");
+
+	prep_status("Checking for prepared transactions");
+
+	res = executeQueryOrDie(conn,
+							"SELECT * "
+							"FROM pg_catalog.pg_prepared_xacts");
+
+	if (PQntuples(res) != 0)
+		pg_fatal("The %s cluster contains prepared transactions\n",
+				 CLUSTER_NAME(cluster));
+
+	PQclear(res);
+
+	PQfinish(conn);
+
+	check_ok();
+}
+
+
+/*
+ *	check_for_isn_and_int8_passing_mismatch()
+ *
+ *	contrib/isn relies on data type int8, and in 8.4 int8 can now be passed
+ *	by value.  The schema dumps the CREATE TYPE PASSEDBYVALUE setting so
+ *	it must match for the old and new servers.
+ */
+static void
+check_for_isn_and_int8_passing_mismatch(ClusterInfo *cluster)
+{
+	int			dbnum;
+	FILE	   *script = NULL;
+	bool		found = false;
+	char		output_path[MAXPGPATH];
+
+	prep_status("Checking for contrib/isn with bigint-passing mismatch");
+
+	if (old_cluster.controldata.float8_pass_by_value ==
+		new_cluster.controldata.float8_pass_by_value)
+	{
+		/* no mismatch */
+		check_ok();
+		return;
+	}
+
+	snprintf(output_path, sizeof(output_path),
+			 "contrib_isn_and_int8_pass_by_value.txt");
+
+	for (dbnum = 0; dbnum < cluster->dbarr.ndbs; dbnum++)
+	{
+		PGresult   *res;
+		bool		db_used = false;
+		int			ntups;
+		int			rowno;
+		int			i_nspname,
+					i_proname;
+		DbInfo	   *active_db = &cluster->dbarr.dbs[dbnum];
+		PGconn	   *conn = connectToServer(cluster, active_db->db_name);
+
+		/* Find any functions coming from contrib/isn */
+		res = executeQueryOrDie(conn,
+								"SELECT n.nspname, p.proname "
+								"FROM	pg_catalog.pg_proc p, "
+								"		pg_catalog.pg_namespace n "
+								"WHERE	p.pronamespace = n.oid AND "
+								"		p.probin = '$libdir/isn'");
+
+		ntups = PQntuples(res);
+		i_nspname = PQfnumber(res, "nspname");
+		i_proname = PQfnumber(res, "proname");
+		for (rowno = 0; rowno < ntups; rowno++)
+		{
+			found = true;
+			if (script == NULL && (script = fopen_priv(output_path, "w")) == NULL)
+				pg_fatal("Could not open file \"%s\": %s\n",
+						 output_path, getErrorText(errno));
+			if (!db_used)
+			{
+				fprintf(script, "Database: %s\n", active_db->db_name);
+				db_used = true;
+			}
+			fprintf(script, "  %s.%s\n",
+					PQgetvalue(res, rowno, i_nspname),
+					PQgetvalue(res, rowno, i_proname));
+		}
+
+		PQclear(res);
+
+		PQfinish(conn);
+	}
+
+	if (script)
+		fclose(script);
+
+	if (found)
+	{
+		pg_log(PG_REPORT, "fatal\n");
+		pg_fatal("Your installation contains \"contrib/isn\" functions which rely on the\n"
+				 "bigint data type.  Your old and new clusters pass bigint values\n"
+				 "differently so this cluster cannot currently be upgraded.  You can\n"
+				 "manually upgrade databases that use \"contrib/isn\" facilities and remove\n"
+				 "\"contrib/isn\" from the old cluster and restart the upgrade.  A list of\n"
+				 "the problem functions is in the file:\n"
+				 "    %s\n\n", output_path);
+	}
+	else
+		check_ok();
+}
+
+
+/*
+ * check_for_reg_data_type_usage()
+ *	pg_upgrade only preserves these system values:
+ *		pg_class.oid
+ *		pg_type.oid
+ *		pg_enum.oid
+ *
+ *	Many of the reg* data types reference system catalog info that is
+ *	not preserved, and hence these data types cannot be used in user
+ *	tables upgraded by pg_upgrade.
+ */
+static void
+check_for_reg_data_type_usage(ClusterInfo *cluster)
+{
+	int			dbnum;
+	FILE	   *script = NULL;
+	bool		found = false;
+	char		output_path[MAXPGPATH];
+
+	prep_status("Checking for reg* system OID user data types");
+
+	snprintf(output_path, sizeof(output_path), "tables_using_reg.txt");
+
+	for (dbnum = 0; dbnum < cluster->dbarr.ndbs; dbnum++)
+	{
+		PGresult   *res;
+		bool		db_used = false;
+		int			ntups;
+		int			rowno;
+		int			i_nspname,
+					i_relname,
+					i_attname;
+		DbInfo	   *active_db = &cluster->dbarr.dbs[dbnum];
+		PGconn	   *conn = connectToServer(cluster, active_db->db_name);
+
+		/*
+		 * While several relkinds don't store any data, e.g. views, they can
+		 * be used to define data types of other columns, so we check all
+		 * relkinds.
+		 */
+		res = executeQueryOrDie(conn,
+								"SELECT n.nspname, c.relname, a.attname "
+								"FROM	pg_catalog.pg_class c, "
+								"		pg_catalog.pg_namespace n, "
+								"		pg_catalog.pg_attribute a "
+								"WHERE	c.oid = a.attrelid AND "
+								"		NOT a.attisdropped AND "
+								"		a.atttypid IN ( "
+								"			'pg_catalog.regproc'::pg_catalog.regtype, "
+								"			'pg_catalog.regprocedure'::pg_catalog.regtype, "
+								"			'pg_catalog.regoper'::pg_catalog.regtype, "
+								"			'pg_catalog.regoperator'::pg_catalog.regtype, "
+								/* regclass.oid is preserved, so 'regclass' is OK */
+								/* regtype.oid is preserved, so 'regtype' is OK */
+								"			'pg_catalog.regconfig'::pg_catalog.regtype, "
+								"			'pg_catalog.regdictionary'::pg_catalog.regtype) AND "
+								"		c.relnamespace = n.oid AND "
+								"		n.nspname NOT IN ('pg_catalog', 'information_schema')");
+
+		ntups = PQntuples(res);
+		i_nspname = PQfnumber(res, "nspname");
+		i_relname = PQfnumber(res, "relname");
+		i_attname = PQfnumber(res, "attname");
+		for (rowno = 0; rowno < ntups; rowno++)
+		{
+			found = true;
+			if (script == NULL && (script = fopen_priv(output_path, "w")) == NULL)
+				pg_fatal("Could not open file \"%s\": %s\n",
+						 output_path, getErrorText(errno));
+			if (!db_used)
+			{
+				fprintf(script, "Database: %s\n", active_db->db_name);
+				db_used = true;
+			}
+			fprintf(script, "  %s.%s.%s\n",
+					PQgetvalue(res, rowno, i_nspname),
+					PQgetvalue(res, rowno, i_relname),
+					PQgetvalue(res, rowno, i_attname));
+		}
+
+		PQclear(res);
+
+		PQfinish(conn);
+	}
+
+	if (script)
+		fclose(script);
+
+	if (found)
+	{
+		pg_log(PG_REPORT, "fatal\n");
+		pg_fatal("Your installation contains one of the reg* data types in user tables.\n"
+				 "These data types reference system OIDs that are not preserved by\n"
+				 "pg_upgrade, so this cluster cannot currently be upgraded.  You can\n"
+				 "remove the problem tables and restart the upgrade.  A list of the problem\n"
+				 "columns is in the file:\n"
+				 "    %s\n\n", output_path);
+	}
+	else
+		check_ok();
+}
+
+
+/*
+ * check_for_jsonb_9_4_usage()
+ *
+ *	JSONB changed its storage format during 9.4 beta, so check for it.
+ */
+static void
+check_for_jsonb_9_4_usage(ClusterInfo *cluster)
+{
+	int			dbnum;
+	FILE	   *script = NULL;
+	bool		found = false;
+	char		output_path[MAXPGPATH];
+
+	prep_status("Checking for JSONB user data types");
+
+	snprintf(output_path, sizeof(output_path), "tables_using_jsonb.txt");
+
+	for (dbnum = 0; dbnum < cluster->dbarr.ndbs; dbnum++)
+	{
+		PGresult   *res;
+		bool		db_used = false;
+		int			ntups;
+		int			rowno;
+		int			i_nspname,
+					i_relname,
+					i_attname;
+		DbInfo	   *active_db = &cluster->dbarr.dbs[dbnum];
+		PGconn	   *conn = connectToServer(cluster, active_db->db_name);
+
+		/*
+		 * While several relkinds don't store any data, e.g. views, they can
+		 * be used to define data types of other columns, so we check all
+		 * relkinds.
+		 */
+		res = executeQueryOrDie(conn,
+								"SELECT n.nspname, c.relname, a.attname "
+								"FROM	pg_catalog.pg_class c, "
+								"		pg_catalog.pg_namespace n, "
+								"		pg_catalog.pg_attribute a "
+								"WHERE	c.oid = a.attrelid AND "
+								"		NOT a.attisdropped AND "
+								"		a.atttypid = 'pg_catalog.jsonb'::pg_catalog.regtype AND "
+								"		c.relnamespace = n.oid AND "
+		/* exclude possible orphaned temp tables */
+								"  		n.nspname !~ '^pg_temp_' AND "
+							  "		n.nspname NOT IN ('pg_catalog', 'information_schema')");
+
+		ntups = PQntuples(res);
+		i_nspname = PQfnumber(res, "nspname");
+		i_relname = PQfnumber(res, "relname");
+		i_attname = PQfnumber(res, "attname");
+		for (rowno = 0; rowno < ntups; rowno++)
+		{
+			found = true;
+			if (script == NULL && (script = fopen_priv(output_path, "w")) == NULL)
+				pg_fatal("Could not open file \"%s\": %s\n",
+						 output_path, getErrorText(errno));
+			if (!db_used)
+			{
+				fprintf(script, "Database: %s\n", active_db->db_name);
+				db_used = true;
+			}
+			fprintf(script, "  %s.%s.%s\n",
+					PQgetvalue(res, rowno, i_nspname),
+					PQgetvalue(res, rowno, i_relname),
+					PQgetvalue(res, rowno, i_attname));
+		}
+
+		PQclear(res);
+
+		PQfinish(conn);
+	}
+
+	if (script)
+		fclose(script);
+
+	if (found)
+	{
+		pg_log(PG_REPORT, "fatal\n");
+		pg_fatal("Your installation contains one of the JSONB data types in user tables.\n"
+				 "The internal format of JSONB changed during 9.4 beta so this cluster cannot currently\n"
+				 "be upgraded.  You can remove the problem tables and restart the upgrade.  A list\n"
+				 "of the problem columns is in the file:\n"
+				 "    %s\n\n", output_path);
+	}
+	else
+		check_ok();
+}
+
+
+static void
+get_bin_version(ClusterInfo *cluster)
+{
+	char		cmd[MAXPGPATH],
+				cmd_output[MAX_STRING];
+	FILE	   *output;
+	int			pre_dot,
+				post_dot;
+
+	snprintf(cmd, sizeof(cmd), "\"%s/pg_ctl\" --version", cluster->bindir);
+
+	if ((output = popen(cmd, "r")) == NULL ||
+		fgets(cmd_output, sizeof(cmd_output), output) == NULL)
+		pg_fatal("Could not get pg_ctl version data using %s: %s\n",
+				 cmd, getErrorText(errno));
+
+	pclose(output);
+
+	/* Remove trailing newline */
+	if (strchr(cmd_output, '\n') != NULL)
+		*strchr(cmd_output, '\n') = '\0';
+
+	if (sscanf(cmd_output, "%*s %*s %d.%d", &pre_dot, &post_dot) != 2)
+		pg_fatal("could not get version from %s\n", cmd);
+
+	cluster->bin_version = (pre_dot * 100 + post_dot) * 100;
+}
+
+
+/*
+ * get_canonical_locale_name
+ *
+ * Send the locale name to the system, and hope we get back a canonical
+ * version.  This should match the backend's check_locale() function.
+ */
+static char *
+get_canonical_locale_name(int category, const char *locale)
+{
+	char	   *save;
+	char	   *res;
+
+	/* get the current setting, so we can restore it. */
+	save = setlocale(category, NULL);
+	if (!save)
+		pg_fatal("failed to get the current locale\n");
+
+	/* 'save' may be pointing at a modifiable scratch variable, so copy it. */
+	save = pg_strdup(save);
+
+	/* set the locale with setlocale, to see if it accepts it. */
+	res = setlocale(category, locale);
+
+	if (!res)
+		pg_fatal("failed to get system locale name for \"%s\"\n", locale);
+
+	res = pg_strdup(res);
+
+	/* restore old value. */
+	if (!setlocale(category, save))
+		pg_fatal("failed to restore old locale \"%s\"\n", save);
+
+	pg_free(save);
+
+	return res;
+}
diff --git a/src/bin/pg_upgrade/controldata.c b/src/bin/pg_upgrade/controldata.c
new file mode 100644
index 0000000..bf53db0
--- /dev/null
+++ b/src/bin/pg_upgrade/controldata.c
@@ -0,0 +1,606 @@
+/*
+ *	controldata.c
+ *
+ *	controldata functions
+ *
+ *	Copyright (c) 2010-2015, PostgreSQL Global Development Group
+ *	src/bin/pg_upgrade/controldata.c
+ */
+
+#include "postgres_fe.h"
+
+#include "pg_upgrade.h"
+
+#include <ctype.h>
+
+/*
+ * get_control_data()
+ *
+ * gets pg_control information in cluster->controldata. Assumes bindir and
+ * datadir are valid absolute paths to postgresql bin and pgdata
+ * directories respectively *and* pg_resetxlog is version compatible
+ * with datadir. The main purpose of this function is to get pg_control
+ * data in a version independent manner.
+ *
+ * The approach taken here is to invoke pg_resetxlog with the -n option
+ * and read its output through a pipe.  With a little string parsing we
+ * get the pg_control data.  pg_resetxlog cannot be run while the server
+ * is running, so in live-check mode we use pg_controldata instead;  it
+ * doesn't provide all the fields we need to actually perform the upgrade,
+ * but it provides enough for check mode.  pg_resetxlog -n is not supported
+ * on a running server because it is hard to return valid xid data for one.
+ */
+void
+get_control_data(ClusterInfo *cluster, bool live_check)
+{
+	char		cmd[MAXPGPATH];
+	char		bufin[MAX_STRING];
+	FILE	   *output;
+	char	   *p;
+	bool		got_xid = false;
+	bool		got_oid = false;
+	bool		got_nextxlogfile = false;
+	bool		got_multi = false;
+	bool		got_mxoff = false;
+	bool		got_oldestmulti = false;
+	bool		got_log_id = false;
+	bool		got_log_seg = false;
+	bool		got_tli = false;
+	bool		got_align = false;
+	bool		got_blocksz = false;
+	bool		got_largesz = false;
+	bool		got_walsz = false;
+	bool		got_walseg = false;
+	bool		got_ident = false;
+	bool		got_index = false;
+	bool		got_toast = false;
+	bool		got_large_object = false;
+	bool		got_date_is_int = false;
+	bool		got_float8_pass_by_value = false;
+	bool		got_data_checksum_version = false;
+	char	   *lc_collate = NULL;
+	char	   *lc_ctype = NULL;
+	char	   *lc_monetary = NULL;
+	char	   *lc_numeric = NULL;
+	char	   *lc_time = NULL;
+	char	   *lang = NULL;
+	char	   *language = NULL;
+	char	   *lc_all = NULL;
+	char	   *lc_messages = NULL;
+	uint32		logid = 0;
+	uint32		segno = 0;
+	uint32		tli = 0;
+
+
+	/*
+	 * Because we test the pg_resetxlog output as strings, it has to be in
+	 * English.  Copied from pg_regress.c.
+	 */
+	if (getenv("LC_COLLATE"))
+		lc_collate = pg_strdup(getenv("LC_COLLATE"));
+	if (getenv("LC_CTYPE"))
+		lc_ctype = pg_strdup(getenv("LC_CTYPE"));
+	if (getenv("LC_MONETARY"))
+		lc_monetary = pg_strdup(getenv("LC_MONETARY"));
+	if (getenv("LC_NUMERIC"))
+		lc_numeric = pg_strdup(getenv("LC_NUMERIC"));
+	if (getenv("LC_TIME"))
+		lc_time = pg_strdup(getenv("LC_TIME"));
+	if (getenv("LANG"))
+		lang = pg_strdup(getenv("LANG"));
+	if (getenv("LANGUAGE"))
+		language = pg_strdup(getenv("LANGUAGE"));
+	if (getenv("LC_ALL"))
+		lc_all = pg_strdup(getenv("LC_ALL"));
+	if (getenv("LC_MESSAGES"))
+		lc_messages = pg_strdup(getenv("LC_MESSAGES"));
+
+	pg_putenv("LC_COLLATE", NULL);
+	pg_putenv("LC_CTYPE", NULL);
+	pg_putenv("LC_MONETARY", NULL);
+	pg_putenv("LC_NUMERIC", NULL);
+	pg_putenv("LC_TIME", NULL);
+	pg_putenv("LANG",
+#ifndef WIN32
+			  NULL);
+#else
+	/* On Windows the default locale might not be English, so force it */
+			  "en");
+#endif
+	pg_putenv("LANGUAGE", NULL);
+	pg_putenv("LC_ALL", NULL);
+	pg_putenv("LC_MESSAGES", "C");
+
+	snprintf(cmd, sizeof(cmd), "\"%s/%s \"%s\"",
+			 cluster->bindir,
+			 live_check ? "pg_controldata\"" : "pg_resetxlog\" -n",
+			 cluster->pgdata);
+	fflush(stdout);
+	fflush(stderr);
+
+	if ((output = popen(cmd, "r")) == NULL)
+		pg_fatal("Could not get control data using %s: %s\n",
+				 cmd, getErrorText(errno));
+
+	/* Only in <= 9.2 */
+	if (GET_MAJOR_VERSION(cluster->major_version) <= 902)
+	{
+		cluster->controldata.data_checksum_version = 0;
+		got_data_checksum_version = true;
+	}
+
+	/* we have the result of cmd in "output". so parse it line by line now */
+	while (fgets(bufin, sizeof(bufin), output))
+	{
+		pg_log(PG_VERBOSE, "%s", bufin);
+
+		if ((p = strstr(bufin, "pg_control version number:")) != NULL)
+		{
+			p = strchr(p, ':');
+
+			if (p == NULL || strlen(p) <= 1)
+				pg_fatal("%d: controldata retrieval problem\n", __LINE__);
+
+			p++;				/* remove ':' char */
+			cluster->controldata.ctrl_ver = str2uint(p);
+		}
+		else if ((p = strstr(bufin, "Catalog version number:")) != NULL)
+		{
+			p = strchr(p, ':');
+
+			if (p == NULL || strlen(p) <= 1)
+				pg_fatal("%d: controldata retrieval problem\n", __LINE__);
+
+			p++;				/* remove ':' char */
+			cluster->controldata.cat_ver = str2uint(p);
+		}
+		else if ((p = strstr(bufin, "First log segment after reset:")) != NULL)
+		{
+			/* Skip the colon and any whitespace after it */
+			p = strchr(p, ':');
+			if (p == NULL || strlen(p) <= 1)
+				pg_fatal("%d: controldata retrieval problem\n", __LINE__);
+			p = strpbrk(p, "0123456789ABCDEF");
+			if (p == NULL || strlen(p) <= 1)
+				pg_fatal("%d: controldata retrieval problem\n", __LINE__);
+
+			/* Make sure it looks like a valid WAL file name */
+			if (strspn(p, "0123456789ABCDEF") != 24)
+				pg_fatal("%d: controldata retrieval problem\n", __LINE__);
+
+			strlcpy(cluster->controldata.nextxlogfile, p, 25);
+			got_nextxlogfile = true;
+		}
+		else if ((p = strstr(bufin, "First log file ID after reset:")) != NULL)
+		{
+			p = strchr(p, ':');
+
+			if (p == NULL || strlen(p) <= 1)
+				pg_fatal("%d: controldata retrieval problem\n", __LINE__);
+
+			p++;				/* remove ':' char */
+			logid = str2uint(p);
+			got_log_id = true;
+		}
+		else if ((p = strstr(bufin, "First log file segment after reset:")) != NULL)
+		{
+			p = strchr(p, ':');
+
+			if (p == NULL || strlen(p) <= 1)
+				pg_fatal("%d: controldata retrieval problem\n", __LINE__);
+
+			p++;				/* remove ':' char */
+			segno = str2uint(p);
+			got_log_seg = true;
+		}
+		else if ((p = strstr(bufin, "Latest checkpoint's TimeLineID:")) != NULL)
+		{
+			p = strchr(p, ':');
+
+			if (p == NULL || strlen(p) <= 1)
+				pg_fatal("%d: controldata retrieval problem\n", __LINE__);
+
+			p++;				/* remove ':' char */
+			cluster->controldata.chkpnt_tli = str2uint(p);
+			got_tli = true;
+		}
+		else if ((p = strstr(bufin, "Latest checkpoint's NextXID:")) != NULL)
+		{
+			p = strchr(p, ':');
+
+			if (p == NULL || strlen(p) <= 1)
+				pg_fatal("%d: controldata retrieval problem\n", __LINE__);
+
+			p++;				/* remove ':' char */
+			cluster->controldata.chkpnt_nxtepoch = str2uint(p);
+
+			p = strchr(p, '/');
+			if (p == NULL || strlen(p) <= 1)
+				pg_fatal("%d: controldata retrieval problem\n", __LINE__);
+
+			p++;				/* remove '/' char */
+			cluster->controldata.chkpnt_nxtxid = str2uint(p);
+			got_xid = true;
+		}
+		else if ((p = strstr(bufin, "Latest checkpoint's NextOID:")) != NULL)
+		{
+			p = strchr(p, ':');
+
+			if (p == NULL || strlen(p) <= 1)
+				pg_fatal("%d: controldata retrieval problem\n", __LINE__);
+
+			p++;				/* remove ':' char */
+			cluster->controldata.chkpnt_nxtoid = str2uint(p);
+			got_oid = true;
+		}
+		else if ((p = strstr(bufin, "Latest checkpoint's NextMultiXactId:")) != NULL)
+		{
+			p = strchr(p, ':');
+
+			if (p == NULL || strlen(p) <= 1)
+				pg_fatal("%d: controldata retrieval problem\n", __LINE__);
+
+			p++;				/* remove ':' char */
+			cluster->controldata.chkpnt_nxtmulti = str2uint(p);
+			got_multi = true;
+		}
+		else if ((p = strstr(bufin, "Latest checkpoint's oldestMultiXid:")) != NULL)
+		{
+			p = strchr(p, ':');
+
+			if (p == NULL || strlen(p) <= 1)
+				pg_fatal("%d: controldata retrieval problem\n", __LINE__);
+
+			p++;				/* remove ':' char */
+			cluster->controldata.chkpnt_oldstMulti = str2uint(p);
+			got_oldestmulti = true;
+		}
+		else if ((p = strstr(bufin, "Latest checkpoint's NextMultiOffset:")) != NULL)
+		{
+			p = strchr(p, ':');
+
+			if (p == NULL || strlen(p) <= 1)
+				pg_fatal("%d: controldata retrieval problem\n", __LINE__);
+
+			p++;				/* remove ':' char */
+			cluster->controldata.chkpnt_nxtmxoff = str2uint(p);
+			got_mxoff = true;
+		}
+		else if ((p = strstr(bufin, "Maximum data alignment:")) != NULL)
+		{
+			p = strchr(p, ':');
+
+			if (p == NULL || strlen(p) <= 1)
+				pg_fatal("%d: controldata retrieval problem\n", __LINE__);
+
+			p++;				/* remove ':' char */
+			cluster->controldata.align = str2uint(p);
+			got_align = true;
+		}
+		else if ((p = strstr(bufin, "Database block size:")) != NULL)
+		{
+			p = strchr(p, ':');
+
+			if (p == NULL || strlen(p) <= 1)
+				pg_fatal("%d: controldata retrieval problem\n", __LINE__);
+
+			p++;				/* remove ':' char */
+			cluster->controldata.blocksz = str2uint(p);
+			got_blocksz = true;
+		}
+		else if ((p = strstr(bufin, "Blocks per segment of large relation:")) != NULL)
+		{
+			p = strchr(p, ':');
+
+			if (p == NULL || strlen(p) <= 1)
+				pg_fatal("%d: controldata retrieval problem\n", __LINE__);
+
+			p++;				/* remove ':' char */
+			cluster->controldata.largesz = str2uint(p);
+			got_largesz = true;
+		}
+		else if ((p = strstr(bufin, "WAL block size:")) != NULL)
+		{
+			p = strchr(p, ':');
+
+			if (p == NULL || strlen(p) <= 1)
+				pg_fatal("%d: controldata retrieval problem\n", __LINE__);
+
+			p++;				/* remove ':' char */
+			cluster->controldata.walsz = str2uint(p);
+			got_walsz = true;
+		}
+		else if ((p = strstr(bufin, "Bytes per WAL segment:")) != NULL)
+		{
+			p = strchr(p, ':');
+
+			if (p == NULL || strlen(p) <= 1)
+				pg_fatal("%d: controldata retrieval problem\n", __LINE__);
+
+			p++;				/* remove ':' char */
+			cluster->controldata.walseg = str2uint(p);
+			got_walseg = true;
+		}
+		else if ((p = strstr(bufin, "Maximum length of identifiers:")) != NULL)
+		{
+			p = strchr(p, ':');
+
+			if (p == NULL || strlen(p) <= 1)
+				pg_fatal("%d: controldata retrieval problem\n", __LINE__);
+
+			p++;				/* remove ':' char */
+			cluster->controldata.ident = str2uint(p);
+			got_ident = true;
+		}
+		else if ((p = strstr(bufin, "Maximum columns in an index:")) != NULL)
+		{
+			p = strchr(p, ':');
+
+			if (p == NULL || strlen(p) <= 1)
+				pg_fatal("%d: controldata retrieval problem\n", __LINE__);
+
+			p++;				/* remove ':' char */
+			cluster->controldata.index = str2uint(p);
+			got_index = true;
+		}
+		else if ((p = strstr(bufin, "Maximum size of a TOAST chunk:")) != NULL)
+		{
+			p = strchr(p, ':');
+
+			if (p == NULL || strlen(p) <= 1)
+				pg_fatal("%d: controldata retrieval problem\n", __LINE__);
+
+			p++;				/* remove ':' char */
+			cluster->controldata.toast = str2uint(p);
+			got_toast = true;
+		}
+		else if ((p = strstr(bufin, "Size of a large-object chunk:")) != NULL)
+		{
+			p = strchr(p, ':');
+
+			if (p == NULL || strlen(p) <= 1)
+				pg_fatal("%d: controldata retrieval problem\n", __LINE__);
+
+			p++;				/* remove ':' char */
+			cluster->controldata.large_object = str2uint(p);
+			got_large_object = true;
+		}
+		else if ((p = strstr(bufin, "Date/time type storage:")) != NULL)
+		{
+			p = strchr(p, ':');
+
+			if (p == NULL || strlen(p) <= 1)
+				pg_fatal("%d: controldata retrieval problem\n", __LINE__);
+
+			p++;				/* remove ':' char */
+			cluster->controldata.date_is_int = strstr(p, "64-bit integers") != NULL;
+			got_date_is_int = true;
+		}
+		else if ((p = strstr(bufin, "Float8 argument passing:")) != NULL)
+		{
+			p = strchr(p, ':');
+
+			if (p == NULL || strlen(p) <= 1)
+				pg_fatal("%d: controldata retrieval problem\n", __LINE__);
+
+			p++;				/* remove ':' char */
+			/* used later for contrib check */
+			cluster->controldata.float8_pass_by_value = strstr(p, "by value") != NULL;
+			got_float8_pass_by_value = true;
+		}
+		else if ((p = strstr(bufin, "checksum")) != NULL)
+		{
+			p = strchr(p, ':');
+
+			if (p == NULL || strlen(p) <= 1)
+				pg_fatal("%d: controldata retrieval problem\n", __LINE__);
+
+			p++;				/* remove ':' char */
+			/* used later to compare checksum settings */
+			cluster->controldata.data_checksum_version = str2uint(p);
+			got_data_checksum_version = true;
+		}
+	}
+
+	if (output)
+		pclose(output);
+
+	/*
+	 * Restore environment variables
+	 */
+	pg_putenv("LC_COLLATE", lc_collate);
+	pg_putenv("LC_CTYPE", lc_ctype);
+	pg_putenv("LC_MONETARY", lc_monetary);
+	pg_putenv("LC_NUMERIC", lc_numeric);
+	pg_putenv("LC_TIME", lc_time);
+	pg_putenv("LANG", lang);
+	pg_putenv("LANGUAGE", language);
+	pg_putenv("LC_ALL", lc_all);
+	pg_putenv("LC_MESSAGES", lc_messages);
+
+	pg_free(lc_collate);
+	pg_free(lc_ctype);
+	pg_free(lc_monetary);
+	pg_free(lc_numeric);
+	pg_free(lc_time);
+	pg_free(lang);
+	pg_free(language);
+	pg_free(lc_all);
+	pg_free(lc_messages);
+
+	/*
+	 * Before 9.3, pg_resetxlog reported the xlogid and segno of the first log
+	 * file after reset as separate lines. Starting with 9.3, it reports the
+	 * WAL file name. If the old cluster is older than 9.3, we construct the
+	 * WAL file name from the xlogid and segno.
+	 */
+	if (GET_MAJOR_VERSION(cluster->major_version) <= 902)
+	{
+		if (got_log_id && got_log_seg)
+		{
+			snprintf(cluster->controldata.nextxlogfile, 25, "%08X%08X%08X",
+					 tli, logid, segno);
+			got_nextxlogfile = true;
+		}
+	}
+
+	/* verify that we got all the mandatory pg_control data */
+	if (!got_xid || !got_oid ||
+		!got_multi || !got_mxoff ||
+		(!got_oldestmulti &&
+		 cluster->controldata.cat_ver >= MULTIXACT_FORMATCHANGE_CAT_VER) ||
+		(!live_check && !got_nextxlogfile) ||
+		!got_tli ||
+		!got_align || !got_blocksz || !got_largesz || !got_walsz ||
+		!got_walseg || !got_ident || !got_index || !got_toast ||
+		(!got_large_object &&
+		 cluster->controldata.ctrl_ver >= LARGE_OBJECT_SIZE_PG_CONTROL_VER) ||
+		!got_date_is_int || !got_float8_pass_by_value || !got_data_checksum_version)
+	{
+		pg_log(PG_REPORT,
+			   "The %s cluster lacks some required control information:\n",
+			   CLUSTER_NAME(cluster));
+
+		if (!got_xid)
+			pg_log(PG_REPORT, "  checkpoint next XID\n");
+
+		if (!got_oid)
+			pg_log(PG_REPORT, "  latest checkpoint next OID\n");
+
+		if (!got_multi)
+			pg_log(PG_REPORT, "  latest checkpoint next MultiXactId\n");
+
+		if (!got_mxoff)
+			pg_log(PG_REPORT, "  latest checkpoint next MultiXactOffset\n");
+
+		if (!got_oldestmulti &&
+			cluster->controldata.cat_ver >= MULTIXACT_FORMATCHANGE_CAT_VER)
+			pg_log(PG_REPORT, "  latest checkpoint oldest MultiXactId\n");
+
+		if (!live_check && !got_nextxlogfile)
+			pg_log(PG_REPORT, "  first WAL segment after reset\n");
+
+		if (!got_tli)
+			pg_log(PG_REPORT, "  latest checkpoint timeline ID\n");
+
+		if (!got_align)
+			pg_log(PG_REPORT, "  maximum alignment\n");
+
+		if (!got_blocksz)
+			pg_log(PG_REPORT, "  block size\n");
+
+		if (!got_largesz)
+			pg_log(PG_REPORT, "  large relation segment size\n");
+
+		if (!got_walsz)
+			pg_log(PG_REPORT, "  WAL block size\n");
+
+		if (!got_walseg)
+			pg_log(PG_REPORT, "  WAL segment size\n");
+
+		if (!got_ident)
+			pg_log(PG_REPORT, "  maximum identifier length\n");
+
+		if (!got_index)
+			pg_log(PG_REPORT, "  maximum number of indexed columns\n");
+
+		if (!got_toast)
+			pg_log(PG_REPORT, "  maximum TOAST chunk size\n");
+
+		if (!got_large_object &&
+			cluster->controldata.ctrl_ver >= LARGE_OBJECT_SIZE_PG_CONTROL_VER)
+			pg_log(PG_REPORT, "  large-object chunk size\n");
+
+		if (!got_date_is_int)
+			pg_log(PG_REPORT, "  dates/times are integers?\n");
+
+		if (!got_float8_pass_by_value)
+			pg_log(PG_REPORT, "  float8 argument passing method\n");
+
+		/* value added in Postgres 9.3 */
+		if (!got_data_checksum_version)
+			pg_log(PG_REPORT, "  data checksum version\n");
+
+		pg_fatal("Cannot continue without required control information, terminating\n");
+	}
+}
+
+
+/*
+ * check_control_data()
+ *
+ * check to make sure the control data settings are compatible
+ */
+void
+check_control_data(ControlData *oldctrl,
+				   ControlData *newctrl)
+{
+	if (oldctrl->align == 0 || oldctrl->align != newctrl->align)
+		pg_fatal("old and new pg_controldata alignments are invalid or do not match\n"
+			   "Likely one cluster is a 32-bit install, the other 64-bit\n");
+
+	if (oldctrl->blocksz == 0 || oldctrl->blocksz != newctrl->blocksz)
+		pg_fatal("old and new pg_controldata block sizes are invalid or do not match\n");
+
+	if (oldctrl->largesz == 0 || oldctrl->largesz != newctrl->largesz)
+		pg_fatal("old and new pg_controldata maximum relation segment sizes are invalid or do not match\n");
+
+	if (oldctrl->walsz == 0 || oldctrl->walsz != newctrl->walsz)
+		pg_fatal("old and new pg_controldata WAL block sizes are invalid or do not match\n");
+
+	if (oldctrl->walseg == 0 || oldctrl->walseg != newctrl->walseg)
+		pg_fatal("old and new pg_controldata WAL segment sizes are invalid or do not match\n");
+
+	if (oldctrl->ident == 0 || oldctrl->ident != newctrl->ident)
+		pg_fatal("old and new pg_controldata maximum identifier lengths are invalid or do not match\n");
+
+	if (oldctrl->index == 0 || oldctrl->index != newctrl->index)
+		pg_fatal("old and new pg_controldata maximum indexed columns are invalid or do not match\n");
+
+	if (oldctrl->toast == 0 || oldctrl->toast != newctrl->toast)
+		pg_fatal("old and new pg_controldata maximum TOAST chunk sizes are invalid or do not match\n");
+
+	/* large_object added in 9.5, so it might not exist in the old cluster */
+	if (oldctrl->large_object != 0 &&
+		oldctrl->large_object != newctrl->large_object)
+		pg_fatal("old and new pg_controldata large-object chunk sizes are invalid or do not match\n");
+
+	if (oldctrl->date_is_int != newctrl->date_is_int)
+		pg_fatal("old and new pg_controldata date/time storage types do not match\n");
+
+	/*
+	 * We might eventually allow upgrades from checksum to no-checksum
+	 * clusters.
+	 */
+	if (oldctrl->data_checksum_version == 0 &&
+		newctrl->data_checksum_version != 0)
+		pg_fatal("old cluster does not use data checksums but the new one does\n");
+	else if (oldctrl->data_checksum_version != 0 &&
+			 newctrl->data_checksum_version == 0)
+		pg_fatal("old cluster uses data checksums but the new one does not\n");
+	else if (oldctrl->data_checksum_version != newctrl->data_checksum_version)
+		pg_fatal("old and new cluster pg_controldata checksum versions do not match\n");
+}
+
+
+void
+disable_old_cluster(void)
+{
+	char		old_path[MAXPGPATH],
+				new_path[MAXPGPATH];
+
+	/* rename pg_control so old server cannot be accidentally started */
+	prep_status("Adding \".old\" suffix to old global/pg_control");
+
+	snprintf(old_path, sizeof(old_path), "%s/global/pg_control", old_cluster.pgdata);
+	snprintf(new_path, sizeof(new_path), "%s/global/pg_control.old", old_cluster.pgdata);
+	if (pg_mv_file(old_path, new_path) != 0)
+		pg_fatal("Unable to rename %s to %s.\n", old_path, new_path);
+	check_ok();
+
+	pg_log(PG_REPORT, "\n"
+		   "If you want to start the old cluster, you will need to remove\n"
+		   "the \".old\" suffix from %s/global/pg_control.old.\n"
+		 "Because \"link\" mode was used, the old cluster cannot be safely\n"
+	"started once the new cluster has been started.\n\n", old_cluster.pgdata);
+}
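
Reviewer note on the pre-9.3 branch of get_control_data() above: the WAL file
name is assembled from three 8-digit uppercase hex fields (timeline ID, xlogid,
segno), giving 24 characters plus the terminator. A minimal standalone sketch
of that formatting follows. The helper name build_wal_file_name and its return
convention are assumptions for illustration and are not part of the patch.

```c
#include <stdio.h>
#include <stddef.h>

/*
 * Hypothetical helper (not in the patch): format a WAL file name the way
 * the pre-9.3 branch does, as TLI + xlogid + segno, each 8 hex digits.
 * Returns 0 on success, -1 if the result did not fit in the buffer.
 */
int
build_wal_file_name(char *buf, size_t buflen,
					unsigned int tli, unsigned int logid, unsigned int segno)
{
	int			n = snprintf(buf, buflen, "%08X%08X%08X", tli, logid, segno);

	/* snprintf returns the length it wanted to write; check for truncation */
	if (n < 0 || (size_t) n >= buflen)
		return -1;
	return 0;
}
```

The patch passes a literal 25 as the buffer size; a sizeof on the destination
field would express the same limit without the magic number.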
diff --git a/src/bin/pg_upgrade/dump.c b/src/bin/pg_upgrade/dump.c
new file mode 100644
index 0000000..2c20e84
--- /dev/null
+++ b/src/bin/pg_upgrade/dump.c
@@ -0,0 +1,139 @@
+/*
+ *	dump.c
+ *
+ *	dump functions
+ *
+ *	Copyright (c) 2010-2015, PostgreSQL Global Development Group
+ *	src/bin/pg_upgrade/dump.c
+ */
+
+#include "postgres_fe.h"
+
+#include "pg_upgrade.h"
+
+#include <sys/types.h>
+#include "catalog/binary_upgrade.h"
+
+
+void
+generate_old_dump(void)
+{
+	int			dbnum;
+	mode_t		old_umask;
+
+	prep_status("Creating dump of global objects");
+
+	/* run new pg_dumpall binary for globals */
+	exec_prog(UTILITY_LOG_FILE, NULL, true,
+			  "\"%s/pg_dumpall\" %s --globals-only --quote-all-identifiers "
+			  "--binary-upgrade %s -f %s",
+			  new_cluster.bindir, cluster_conn_opts(&old_cluster),
+			  log_opts.verbose ? "--verbose" : "",
+			  GLOBALS_DUMP_FILE);
+	check_ok();
+
+	prep_status("Creating dump of database schemas\n");
+
+	/*
+	 * Set umask for this function, all functions it calls, and all
+	 * subprocesses/threads it creates.  We can't use fopen_priv() as Windows
+	 * uses threads and umask is process-global.
+	 */
+	old_umask = umask(S_IRWXG | S_IRWXO);
+
+	/* create per-db dump files */
+	for (dbnum = 0; dbnum < old_cluster.dbarr.ndbs; dbnum++)
+	{
+		char		sql_file_name[MAXPGPATH],
+					log_file_name[MAXPGPATH];
+		DbInfo	   *old_db = &old_cluster.dbarr.dbs[dbnum];
+
+		pg_log(PG_STATUS, "%s", old_db->db_name);
+		snprintf(sql_file_name, sizeof(sql_file_name), DB_DUMP_FILE_MASK, old_db->db_oid);
+		snprintf(log_file_name, sizeof(log_file_name), DB_DUMP_LOG_FILE_MASK, old_db->db_oid);
+
+		parallel_exec_prog(log_file_name, NULL,
+				   "\"%s/pg_dump\" %s --schema-only --quote-all-identifiers "
+				  "--binary-upgrade --format=custom %s --file=\"%s\" \"%s\"",
+						 new_cluster.bindir, cluster_conn_opts(&old_cluster),
+						   log_opts.verbose ? "--verbose" : "",
+						   sql_file_name, old_db->db_name);
+	}
+
+	/* reap all children */
+	while (reap_child(true) == true)
+		;
+
+	umask(old_umask);
+
+	end_progress_output();
+	check_ok();
+}
+
+
+/*
+ * It is possible for there to be a mismatch in the need for TOAST tables
+ * between the old and new servers, e.g. some pre-9.1 tables didn't need
+ * TOAST tables but will need them in 9.1+.  (There are also opposite cases,
+ * but these are handled by setting binary_upgrade_next_toast_pg_class_oid.)
+ *
+ * We can't allow the TOAST table to be created by pg_dump with a
+ * pg_dump-assigned oid because it might conflict with a later table that
+ * uses that oid, causing a "file exists" error for pg_class conflicts, and
+ * a "duplicate oid" error for pg_type conflicts.  (TOAST tables need pg_type
+ * entries.)
+ *
+ * Therefore, a backend in binary-upgrade mode will not create a TOAST
+ * table unless an OID is passed in via pg_upgrade_support functions.
+ * This function is called after the restore and uses ALTER TABLE to
+ * auto-create any needed TOAST tables which will not conflict with
+ * restored oids.
+ */
+void
+optionally_create_toast_tables(void)
+{
+	int			dbnum;
+
+	prep_status("Creating newly-required TOAST tables");
+
+	for (dbnum = 0; dbnum < new_cluster.dbarr.ndbs; dbnum++)
+	{
+		PGresult   *res;
+		int			ntups;
+		int			rowno;
+		int			i_nspname,
+					i_relname;
+		DbInfo	   *active_db = &new_cluster.dbarr.dbs[dbnum];
+		PGconn	   *conn = connectToServer(&new_cluster, active_db->db_name);
+
+		res = executeQueryOrDie(conn,
+								"SELECT n.nspname, c.relname "
+								"FROM	pg_catalog.pg_class c, "
+								"		pg_catalog.pg_namespace n "
+								"WHERE	c.relnamespace = n.oid AND "
+							  "		n.nspname NOT IN ('pg_catalog', 'information_schema') AND "
+								"c.relkind IN ('r', 'm') AND "
+								"c.reltoastrelid = 0");
+
+		ntups = PQntuples(res);
+		i_nspname = PQfnumber(res, "nspname");
+		i_relname = PQfnumber(res, "relname");
+		for (rowno = 0; rowno < ntups; rowno++)
+		{
+			/* enable auto-oid-numbered TOAST creation if needed */
+			PQclear(executeQueryOrDie(conn, "SELECT pg_catalog.binary_upgrade_set_next_toast_pg_class_oid('%d'::pg_catalog.oid);",
+					OPTIONALLY_CREATE_TOAST_OID));
+
+			/* dummy command that also triggers check for required TOAST table */
+			PQclear(executeQueryOrDie(conn, "ALTER TABLE %s.%s RESET (binary_upgrade_dummy_option);",
+					quote_identifier(PQgetvalue(res, rowno, i_nspname)),
+					quote_identifier(PQgetvalue(res, rowno, i_relname))));
+		}
+
+		PQclear(res);
+
+		PQfinish(conn);
+	}
+
+	check_ok();
+}
diff --git a/src/bin/pg_upgrade/exec.c b/src/bin/pg_upgrade/exec.c
new file mode 100644
index 0000000..7d31912
--- /dev/null
+++ b/src/bin/pg_upgrade/exec.c
@@ -0,0 +1,379 @@
+/*
+ *	exec.c
+ *
+ *	execution functions
+ *
+ *	Copyright (c) 2010-2015, PostgreSQL Global Development Group
+ *	src/bin/pg_upgrade/exec.c
+ */
+
+#include "postgres_fe.h"
+
+#include "pg_upgrade.h"
+
+#include <fcntl.h>
+#include <sys/types.h>
+
+static void check_data_dir(const char *pg_data);
+static void check_bin_dir(ClusterInfo *cluster);
+static void validate_exec(const char *dir, const char *cmdName);
+
+#ifdef WIN32
+static int	win32_check_directory_write_permissions(void);
+#endif
+
+
+/*
+ * exec_prog()
+ *		Execute an external program with stdout/stderr redirected, and report
+ *		errors
+ *
+ * Formats a command from the given argument list, logs it to the log file,
+ * and attempts to execute that command.  If the command executes
+ * successfully, exec_prog() returns true.
+ *
+ * If the command fails, an error message is saved to the specified log_file.
+ * If throw_error is true, this raises a PG_FATAL error and pg_upgrade
+ * terminates; otherwise it is just reported as PG_REPORT and exec_prog()
+ * returns false.
+ *
+ * The code requires it be called first from the primary thread on Windows.
+ */
+bool
+exec_prog(const char *log_file, const char *opt_log_file,
+		  bool throw_error, const char *fmt,...)
+{
+	int			result = 0;
+	int			written;
+
+#define MAXCMDLEN (2 * MAXPGPATH)
+	char		cmd[MAXCMDLEN];
+	FILE	   *log;
+	va_list		ap;
+
+#ifdef WIN32
+	static DWORD mainThreadId = 0;
+
+	/* We assume we are called from the primary thread first */
+	if (mainThreadId == 0)
+		mainThreadId = GetCurrentThreadId();
+#endif
+
+	written = 0;
+	va_start(ap, fmt);
+	written += vsnprintf(cmd + written, MAXCMDLEN - written, fmt, ap);
+	va_end(ap);
+	if (written >= MAXCMDLEN)
+		pg_fatal("command too long\n");
+	written += snprintf(cmd + written, MAXCMDLEN - written,
+						" >> \"%s\" 2>&1", log_file);
+	if (written >= MAXCMDLEN)
+		pg_fatal("command too long\n");
+
+	pg_log(PG_VERBOSE, "%s\n", cmd);
+
+#ifdef WIN32
+
+	/*
+	 * For some reason, Windows issues a file-in-use error if we write data to
+	 * the log file from a non-primary thread just before we create a
+	 * subprocess that also writes to the same log file.  One fix is to sleep
+	 * for 100ms.  A cleaner fix is to write to the log file _after_ the
+	 * subprocess has completed, so we do this only when writing from a
+	 * non-primary thread.  fflush(), running system() twice, and pre-creating
+	 * the file do not seem to help.
+	 */
+	if (mainThreadId != GetCurrentThreadId())
+		result = system(cmd);
+#endif
+
+	log = fopen(log_file, "a");
+
+#ifdef WIN32
+	{
+		/*
+		 * "pg_ctl -w stop" might have reported that the server has stopped
+		 * because the postmaster.pid file has been removed, but "pg_ctl -w
+		 * start" might still be in the process of closing and might still be
+		 * holding its stdout and -l log file descriptors open.  Therefore,
+		 * try to open the log file a few more times.
+		 */
+		int			iter;
+
+		for (iter = 0; iter < 4 && log == NULL; iter++)
+		{
+			pg_usleep(1000000); /* 1 sec */
+			log = fopen(log_file, "a");
+		}
+	}
+#endif
+
+	if (log == NULL)
+		pg_fatal("cannot write to log file %s\n", log_file);
+
+#ifdef WIN32
+	/* Are we printing "command:" before its output? */
+	if (mainThreadId == GetCurrentThreadId())
+		fprintf(log, "\n\n");
+#endif
+	fprintf(log, "command: %s\n", cmd);
+#ifdef WIN32
+	/* Are we printing "command:" after its output? */
+	if (mainThreadId != GetCurrentThreadId())
+		fprintf(log, "\n\n");
+#endif
+
+	/*
+	 * In Windows, we must close the log file at this point so the file is not
+	 * open while the command is running, or we get a share violation.
+	 */
+	fclose(log);
+
+#ifdef WIN32
+	/* see comment above */
+	if (mainThreadId == GetCurrentThreadId())
+#endif
+		result = system(cmd);
+
+	if (result != 0)
+	{
+		/* we might be on a progress status line, so go to the next line */
+		report_status(PG_REPORT, "\n*failure*");
+		fflush(stdout);
+
+		pg_log(PG_VERBOSE, "There were problems executing \"%s\"\n", cmd);
+		if (opt_log_file)
+			pg_log(throw_error ? PG_FATAL : PG_REPORT,
+				   "Consult the last few lines of \"%s\" or \"%s\" for\n"
+				   "the probable cause of the failure.\n",
+				   log_file, opt_log_file);
+		else
+			pg_log(throw_error ? PG_FATAL : PG_REPORT,
+				   "Consult the last few lines of \"%s\" for\n"
+				   "the probable cause of the failure.\n",
+				   log_file);
+	}
+
+#ifndef WIN32
+
+	/*
+	 * We can't do this on Windows because it will keep the "pg_ctl start"
+	 * output filename open until the server stops, so we do the \n\n above on
+	 * that platform.  We use a unique filename for "pg_ctl start" that is
+	 * never reused while the server is running, so it works fine.  We could
+	 * log these commands to a third file, but that just adds complexity.
+	 */
+	if ((log = fopen(log_file, "a")) == NULL)
+		pg_fatal("cannot write to log file %s\n", log_file);
+	fprintf(log, "\n\n");
+	fclose(log);
+#endif
+
+	return result == 0;
+}
+
+
+/*
+ * pid_lock_file_exists()
+ *
+ * Checks whether the postmaster.pid file exists.
+ */
+bool
+pid_lock_file_exists(const char *datadir)
+{
+	char		path[MAXPGPATH];
+	int			fd;
+
+	snprintf(path, sizeof(path), "%s/postmaster.pid", datadir);
+
+	if ((fd = open(path, O_RDONLY, 0)) < 0)
+	{
+		/* ENOTDIR means we will throw a more useful error later */
+		if (errno != ENOENT && errno != ENOTDIR)
+			pg_fatal("could not open file \"%s\" for reading: %s\n",
+					 path, getErrorText(errno));
+
+		return false;
+	}
+
+	close(fd);
+	return true;
+}
+
+
+/*
+ * verify_directories()
+ *
+ * does all the hectic work of verifying directories and executables
+ * of old and new server.
+ *
+ * NOTE: May update the values of all parameters
+ */
+void
+verify_directories(void)
+{
+#ifndef WIN32
+	if (access(".", R_OK | W_OK | X_OK) != 0)
+#else
+	if (win32_check_directory_write_permissions() != 0)
+#endif
+		pg_fatal("You must have read and write access in the current directory.\n");
+
+	check_bin_dir(&old_cluster);
+	check_data_dir(old_cluster.pgdata);
+	check_bin_dir(&new_cluster);
+	check_data_dir(new_cluster.pgdata);
+}
+
+
+#ifdef WIN32
+/*
+ * win32_check_directory_write_permissions()
+ *
+ *	access() on WIN32 can't check directory permissions, so we have to
+ *	optionally create, then delete a file to check.
+ *		http://msdn.microsoft.com/en-us/library/1w06ktdy%28v=vs.80%29.aspx
+ */
+static int
+win32_check_directory_write_permissions(void)
+{
+	int			fd;
+
+	/*
+	 * We open a file we would normally create anyway.  We do this even in
+	 * 'check' mode, which isn't ideal, but this is the best we can do.
+	 */
+	if ((fd = open(GLOBALS_DUMP_FILE, O_RDWR | O_CREAT, S_IRUSR | S_IWUSR)) < 0)
+		return -1;
+	close(fd);
+
+	return unlink(GLOBALS_DUMP_FILE);
+}
+#endif
+
+
+/*
+ * check_data_dir()
+ *
+ *	This function validates the given cluster directory - we search for a
+ *	small set of subdirectories that we expect to find in a valid $PGDATA
+ *	directory.  If any of the subdirectories are missing (or secured against
+ *	us) we display an error message and exit()
+ *
+ */
+static void
+check_data_dir(const char *pg_data)
+{
+	char		subDirName[MAXPGPATH];
+	int			subdirnum;
+
+	/* start check with top-most directory */
+	const char *requiredSubdirs[] = {"", "base", "global", "pg_clog",
+		"pg_multixact", "pg_subtrans", "pg_tblspc", "pg_twophase",
+	"pg_xlog"};
+
+	for (subdirnum = 0;
+		 subdirnum < sizeof(requiredSubdirs) / sizeof(requiredSubdirs[0]);
+		 ++subdirnum)
+	{
+		struct stat statBuf;
+
+		snprintf(subDirName, sizeof(subDirName), "%s%s%s", pg_data,
+		/* Win32 can't stat() a directory with a trailing slash. */
+				 *requiredSubdirs[subdirnum] ? "/" : "",
+				 requiredSubdirs[subdirnum]);
+
+		if (stat(subDirName, &statBuf) != 0)
+			report_status(PG_FATAL, "check for \"%s\" failed: %s\n",
+						  subDirName, getErrorText(errno));
+		else if (!S_ISDIR(statBuf.st_mode))
+			report_status(PG_FATAL, "%s is not a directory\n",
+						  subDirName);
+	}
+}
+
+
+/*
+ * check_bin_dir()
+ *
+ *	This function searches for the executables that we expect to find
+ *	in the binaries directory.  If we find that a required executable
+ *	is missing (or secured against us), we display an error message and
+ *	exit().
+ */
+static void
+check_bin_dir(ClusterInfo *cluster)
+{
+	struct stat statBuf;
+
+	/* check bindir */
+	if (stat(cluster->bindir, &statBuf) != 0)
+		report_status(PG_FATAL, "check for \"%s\" failed: %s\n",
+					  cluster->bindir, getErrorText(errno));
+	else if (!S_ISDIR(statBuf.st_mode))
+		report_status(PG_FATAL, "%s is not a directory\n",
+					  cluster->bindir);
+
+	validate_exec(cluster->bindir, "postgres");
+	validate_exec(cluster->bindir, "pg_ctl");
+	validate_exec(cluster->bindir, "pg_resetxlog");
+	if (cluster == &new_cluster)
+	{
+		/* these are only needed in the new cluster */
+		validate_exec(cluster->bindir, "psql");
+		validate_exec(cluster->bindir, "pg_dump");
+		validate_exec(cluster->bindir, "pg_dumpall");
+	}
+}
+
+
+/*
+ * validate_exec()
+ *
+ * validate "path" as an executable file
+ */
+static void
+validate_exec(const char *dir, const char *cmdName)
+{
+	char		path[MAXPGPATH];
+	struct stat buf;
+
+	snprintf(path, sizeof(path), "%s/%s", dir, cmdName);
+
+#ifdef WIN32
+	/* Windows requires a .exe suffix for stat() */
+	if (strlen(path) <= strlen(EXE_EXT) ||
+		pg_strcasecmp(path + strlen(path) - strlen(EXE_EXT), EXE_EXT) != 0)
+		strlcat(path, EXE_EXT, sizeof(path));
+#endif
+
+	/*
+	 * Ensure that the file exists and is a regular file.
+	 */
+	if (stat(path, &buf) < 0)
+		pg_fatal("check for \"%s\" failed: %s\n",
+				 path, getErrorText(errno));
+	else if (!S_ISREG(buf.st_mode))
+		pg_fatal("check for \"%s\" failed: not an executable file\n",
+				 path);
+
+	/*
+	 * Ensure that the file is both executable and readable (required for
+	 * dynamic loading).
+	 */
+#ifndef WIN32
+	if (access(path, R_OK) != 0)
+#else
+	if ((buf.st_mode & S_IRUSR) == 0)
+#endif
+		pg_fatal("check for \"%s\" failed: cannot read file (permission denied)\n",
+				 path);
+
+#ifndef WIN32
+	if (access(path, X_OK) != 0)
+#else
+	if ((buf.st_mode & S_IXUSR) == 0)
+#endif
+		pg_fatal("check for \"%s\" failed: cannot execute (permission denied)\n",
+				 path);
+}
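
Reviewer note on exec_prog() above: the command string is built in two steps
(the caller's format, then the log redirection), and each step is checked for
truncation by comparing the running length against the buffer size. A minimal
standalone sketch of that pattern follows. The function name
build_logged_command and the bool return (instead of pg_fatal) are assumptions
for illustration only.

```c
#include <stdarg.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Hypothetical helper (not in the patch): format a command, then append
 * the stdout/stderr redirection to log_file.  Returns false if either
 * step would overflow the buffer.
 */
bool
build_logged_command(char *cmd, size_t cmdlen, const char *log_file,
					 const char *fmt, ...)
{
	va_list		ap;
	int			written;
	int			n;

	va_start(ap, fmt);
	written = vsnprintf(cmd, cmdlen, fmt, ap);
	va_end(ap);
	if (written < 0 || (size_t) written >= cmdlen)
		return false;			/* command itself was truncated */

	n = snprintf(cmd + written, cmdlen - written,
				 " >> \"%s\" 2>&1", log_file);
	if (n < 0 || (size_t) n >= cmdlen - written)
		return false;			/* redirection suffix was truncated */

	return true;
}
```

Checking after each step matters because the second snprintf call computes its
remaining space from the first result; an unchecked overflow in step one would
make that size argument wrap.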
diff --git a/src/bin/pg_upgrade/file.c b/src/bin/pg_upgrade/file.c
new file mode 100644
index 0000000..79d9390
--- /dev/null
+++ b/src/bin/pg_upgrade/file.c
@@ -0,0 +1,250 @@
+/*
+ *	file.c
+ *
+ *	file system operations
+ *
+ *	Copyright (c) 2010-2015, PostgreSQL Global Development Group
+ *	src/bin/pg_upgrade/file.c
+ */
+
+#include "postgres_fe.h"
+
+#include "pg_upgrade.h"
+
+#include <fcntl.h>
+
+
+
+#ifndef WIN32
+static int	copy_file(const char *fromfile, const char *tofile, bool force);
+#else
+static int	win32_pghardlink(const char *src, const char *dst);
+#endif
+
+
+/*
+ * copyAndUpdateFile()
+ *
+ *	Copies a relation file from src to dst.  If pageConverter is non-NULL, this function
+ *	uses that pageConverter to do a page-by-page conversion.
+ */
+const char *
+copyAndUpdateFile(pageCnvCtx *pageConverter,
+				  const char *src, const char *dst, bool force)
+{
+	if (pageConverter == NULL)
+	{
+		if (pg_copy_file(src, dst, force) == -1)
+			return getErrorText(errno);
+		else
+			return NULL;
+	}
+	else
+	{
+		/*
+		 * We have a pageConverter object - that implies that the
+		 * PageLayoutVersion differs between the two clusters so we have to
+		 * perform a page-by-page conversion.
+		 *
+		 * If the pageConverter can convert the entire file at once, invoke
+		 * that plugin function, otherwise, read each page in the relation
+		 * file and call the convertPage plugin function.
+		 */
+
+#ifdef PAGE_CONVERSION
+		if (pageConverter->convertFile)
+			return pageConverter->convertFile(pageConverter->pluginData,
+											  dst, src);
+		else
+#endif
+		{
+			int			src_fd;
+			int			dstfd;
+			char		buf[BLCKSZ];
+			ssize_t		bytesRead;
+			const char *msg = NULL;
+
+			if ((src_fd = open(src, O_RDONLY, 0)) < 0)
+				return "could not open source file";
+
+			if ((dstfd = open(dst, O_RDWR | O_CREAT | O_EXCL, S_IRUSR | S_IWUSR)) < 0)
+			{
+				close(src_fd);
+				return "could not create destination file";
+			}
+
+			while ((bytesRead = read(src_fd, buf, BLCKSZ)) == BLCKSZ)
+			{
+#ifdef PAGE_CONVERSION
+				if ((msg = pageConverter->convertPage(pageConverter->pluginData, buf, buf)) != NULL)
+					break;
+#endif
+				if (write(dstfd, buf, BLCKSZ) != BLCKSZ)
+				{
+					msg = "could not write new page to destination";
+					break;
+				}
+			}
+
+			close(src_fd);
+			close(dstfd);
+
+			if (msg)
+				return msg;
+			else if (bytesRead != 0)
+				return "found partial page in source file";
+			else
+				return NULL;
+		}
+	}
+}
+
+
+/*
+ * linkAndUpdateFile()
+ *
+ * Creates a hard link between the given relation files. We use
+ * this function to perform a true in-place update. If the on-disk
+ * format of the new cluster is bit-for-bit compatible with the on-disk
+ * format of the old cluster, we can simply link each relation
+ * instead of copying the data from the old cluster to the new cluster.
+ */
+const char *
+linkAndUpdateFile(pageCnvCtx *pageConverter,
+				  const char *src, const char *dst)
+{
+	if (pageConverter != NULL)
+		return "Cannot in-place update this cluster, page-by-page conversion is required";
+
+	if (pg_link_file(src, dst) == -1)
+		return getErrorText(errno);
+	else
+		return NULL;
+}
+
+
+#ifndef WIN32
+static int
+copy_file(const char *srcfile, const char *dstfile, bool force)
+{
+#define COPY_BUF_SIZE (50 * BLCKSZ)
+
+	int			src_fd;
+	int			dest_fd;
+	char	   *buffer;
+	int			ret = 0;
+	int			save_errno = 0;
+
+	if ((srcfile == NULL) || (dstfile == NULL))
+	{
+		errno = EINVAL;
+		return -1;
+	}
+
+	if ((src_fd = open(srcfile, O_RDONLY, 0)) < 0)
+		return -1;
+
+	if ((dest_fd = open(dstfile, O_RDWR | O_CREAT | (force ? 0 : O_EXCL), S_IRUSR | S_IWUSR)) < 0)
+	{
+		save_errno = errno;
+
+		if (src_fd != 0)
+			close(src_fd);
+
+		errno = save_errno;
+		return -1;
+	}
+
+	buffer = (char *) pg_malloc(COPY_BUF_SIZE);
+
+	/* perform data copying, i.e. read from source, write to destination */
+	while (true)
+	{
+		ssize_t		nbytes = read(src_fd, buffer, COPY_BUF_SIZE);
+
+		if (nbytes < 0)
+		{
+			save_errno = errno;
+			ret = -1;
+			break;
+		}
+
+		if (nbytes == 0)
+			break;
+
+		errno = 0;
+
+		if (write(dest_fd, buffer, nbytes) != nbytes)
+		{
+			/* if write didn't set errno, assume problem is no disk space */
+			if (errno == 0)
+				errno = ENOSPC;
+			save_errno = errno;
+			ret = -1;
+			break;
+		}
+	}
+
+	pg_free(buffer);
+
+	if (src_fd != 0)
+		close(src_fd);
+
+	if (dest_fd != 0)
+		close(dest_fd);
+
+	if (save_errno != 0)
+		errno = save_errno;
+
+	return ret;
+}
+#endif
+
+
+void
+check_hard_link(void)
+{
+	char		existing_file[MAXPGPATH];
+	char		new_link_file[MAXPGPATH];
+
+	snprintf(existing_file, sizeof(existing_file), "%s/PG_VERSION", old_cluster.pgdata);
+	snprintf(new_link_file, sizeof(new_link_file), "%s/PG_VERSION.linktest", new_cluster.pgdata);
+	unlink(new_link_file);		/* might fail */
+
+	if (pg_link_file(existing_file, new_link_file) == -1)
+	{
+		pg_fatal("Could not create hard link between old and new data directories: %s\n"
+				 "In link mode the old and new data directories must be on the same file system volume.\n",
+				 getErrorText(errno));
+	}
+	unlink(new_link_file);
+}
+
+#ifdef WIN32
+static int
+win32_pghardlink(const char *src, const char *dst)
+{
+	/*
+	 * CreateHardLinkA returns zero for failure
+	 * http://msdn.microsoft.com/en-us/library/aa363860(VS.85).aspx
+	 */
+	if (CreateHardLinkA(dst, src, NULL) == 0)
+		return -1;
+	else
+		return 0;
+}
+#endif
+
+
+/* fopen() file with no group/other permissions */
+FILE *
+fopen_priv(const char *path, const char *mode)
+{
+	mode_t		old_umask = umask(S_IRWXG | S_IRWXO);
+	FILE	   *fp;
+
+	fp = fopen(path, mode);
+	umask(old_umask);
+
+	return fp;
+}
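
Reviewer note on the page-by-page loop in copyAndUpdateFile() above: the loop
runs only while a full block is read, so whatever the final read returns tells
the caller whether the file ended cleanly (0) or held a trailing partial page.
A minimal standalone sketch follows. It swaps in stdio streams for the patch's
open()/read()/write() calls to stay portable, and SKETCH_PAGE_SIZE and
copy_whole_pages are hypothetical names standing in for BLCKSZ and the real
function. Unlike the patch, the sketch reports a read error separately from a
partial page.

```c
#include <stdio.h>
#include <stddef.h>

#define SKETCH_PAGE_SIZE 8192	/* assumption: stands in for BLCKSZ */

/*
 * Hypothetical helper (not in the patch): copy whole pages from src to dst.
 * Returns NULL on success or a static message on failure, following the
 * return convention of copyAndUpdateFile().
 */
const char *
copy_whole_pages(FILE *src, FILE *dst)
{
	char		buf[SKETCH_PAGE_SIZE];
	size_t		nread;

	/* keep going only while a complete page was read */
	while ((nread = fread(buf, 1, sizeof(buf), src)) == sizeof(buf))
	{
		if (fwrite(buf, 1, sizeof(buf), dst) != sizeof(buf))
			return "could not write new page to destination";
	}

	if (ferror(src))
		return "could not read source file";
	if (nread != 0)
		return "found partial page in source file";
	return NULL;
}
```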
diff --git a/src/bin/pg_upgrade/function.c b/src/bin/pg_upgrade/function.c
new file mode 100644
index 0000000..04492a5
--- /dev/null
+++ b/src/bin/pg_upgrade/function.c
@@ -0,0 +1,240 @@
+/*
+ *	function.c
+ *
+ *	server-side function support
+ *
+ *	Copyright (c) 2010-2015, PostgreSQL Global Development Group
+ *	src/bin/pg_upgrade/function.c
+ */
+
+#include "postgres_fe.h"
+
+#include "pg_upgrade.h"
+
+#include "access/transam.h"
+
+
+/*
+ * get_loadable_libraries()
+ *
+ *	Fetch the names of all old libraries containing C-language functions.
+ *	We will later check that they all exist in the new installation.
+ */
+void
+get_loadable_libraries(void)
+{
+	PGresult  **ress;
+	int			totaltups;
+	int			dbnum;
+	bool		found_public_plpython_handler = false;
+
+	ress = (PGresult **) pg_malloc(old_cluster.dbarr.ndbs * sizeof(PGresult *));
+	totaltups = 0;
+
+	/* Fetch all library names, removing duplicates within each DB */
+	for (dbnum = 0; dbnum < old_cluster.dbarr.ndbs; dbnum++)
+	{
+		DbInfo	   *active_db = &old_cluster.dbarr.dbs[dbnum];
+		PGconn	   *conn = connectToServer(&old_cluster, active_db->db_name);
+
+		/*
+		 * Fetch all libraries referenced in this DB.  We can't exclude the
+		 * "pg_catalog" schema because, while such functions are not
+		 * explicitly dumped by pg_dump, they do reference implicit objects
+		 * that pg_dump does dump, e.g. CREATE LANGUAGE plperl.
+		 */
+		ress[dbnum] = executeQueryOrDie(conn,
+										"SELECT DISTINCT probin "
+										"FROM	pg_catalog.pg_proc "
+										"WHERE	prolang = 13 /* C */ AND "
+										"probin IS NOT NULL AND "
+										"oid >= %u;",
+										FirstNormalObjectId);
+		totaltups += PQntuples(ress[dbnum]);
+
+		/*
+		 * Systems that install plpython before 8.1 have
+		 * plpython_call_handler() defined in the "public" schema, causing
+		 * pg_dump to dump it.  However that function still references
+		 * "plpython" (no "2"), so it throws an error on restore.  This code
+		 * checks for the problem function, reports affected databases to the
+		 * user and explains how to remove them. 8.1 git commit:
+		 * e0dedd0559f005d60c69c9772163e69c204bac69
+		 * http://archives.postgresql.org/pgsql-hackers/2012-03/msg01101.php
+		 * http://archives.postgresql.org/pgsql-bugs/2012-05/msg00206.php
+		 */
+		if (GET_MAJOR_VERSION(old_cluster.major_version) < 901)
+		{
+			PGresult   *res;
+
+			res = executeQueryOrDie(conn,
+									"SELECT 1 "
+						   "FROM	pg_catalog.pg_proc JOIN pg_namespace "
+							 "		ON pronamespace = pg_namespace.oid "
+							   "WHERE proname = 'plpython_call_handler' AND "
+									"nspname = 'public' AND "
+									"prolang = 13 /* C */ AND "
+									"probin = '$libdir/plpython' AND "
+									"pg_proc.oid >= %u;",
+									FirstNormalObjectId);
+			if (PQntuples(res) > 0)
+			{
+				if (!found_public_plpython_handler)
+				{
+					pg_log(PG_WARNING,
+						   "\nThe old cluster has a \"plpython_call_handler\" function defined\n"
+						   "in the \"public\" schema which is a duplicate of the one defined\n"
+						   "in the \"pg_catalog\" schema.  You can confirm this by executing\n"
+						   "in psql:\n"
+						   "\n"
+						   "    \\df *.plpython_call_handler\n"
+						   "\n"
+						   "The \"public\" schema version of this function was created by a\n"
+						   "pre-8.1 install of plpython, and must be removed for pg_upgrade\n"
+						   "to complete because it references a now-obsolete \"plpython\"\n"
+						   "shared object file.  You can remove the \"public\" schema version\n"
+					   "of this function by running the following command:\n"
+						   "\n"
+						 "    DROP FUNCTION public.plpython_call_handler()\n"
+						   "\n"
+						   "in each affected database:\n"
+						   "\n");
+				}
+				pg_log(PG_WARNING, "    %s\n", active_db->db_name);
+				found_public_plpython_handler = true;
+			}
+			PQclear(res);
+		}
+
+		PQfinish(conn);
+	}
+
+	if (found_public_plpython_handler)
+		pg_fatal("Remove the problem functions from the old cluster to continue.\n");
+
+	/* Allocate what's certainly enough space */
+	os_info.libraries = (char **) pg_malloc(totaltups * sizeof(char *));
+
+	/*
+	 * Now remove duplicates across DBs.  This is pretty inefficient code, but
+	 * there probably aren't enough entries to matter.
+	 */
+	totaltups = 0;
+
+	for (dbnum = 0; dbnum < old_cluster.dbarr.ndbs; dbnum++)
+	{
+		PGresult   *res = ress[dbnum];
+		int			ntups;
+		int			rowno;
+
+		ntups = PQntuples(res);
+		for (rowno = 0; rowno < ntups; rowno++)
+		{
+			char	   *lib = PQgetvalue(res, rowno, 0);
+			bool		dup = false;
+			int			n;
+
+			for (n = 0; n < totaltups; n++)
+			{
+				if (strcmp(lib, os_info.libraries[n]) == 0)
+				{
+					dup = true;
+					break;
+				}
+			}
+			if (!dup)
+				os_info.libraries[totaltups++] = pg_strdup(lib);
+		}
+
+		PQclear(res);
+	}
+
+	os_info.num_libraries = totaltups;
+
+	pg_free(ress);
+}
+
+
+/*
+ * check_loadable_libraries()
+ *
+ *	Check that the new cluster contains all required libraries.
+ *	We do this by actually trying to LOAD each one, thereby testing
+ *	compatibility as well as presence.
+ */
+void
+check_loadable_libraries(void)
+{
+	PGconn	   *conn = connectToServer(&new_cluster, "template1");
+	int			libnum;
+	FILE	   *script = NULL;
+	bool		found = false;
+	char		output_path[MAXPGPATH];
+
+	prep_status("Checking for presence of required libraries");
+
+	snprintf(output_path, sizeof(output_path), "loadable_libraries.txt");
+
+	for (libnum = 0; libnum < os_info.num_libraries; libnum++)
+	{
+		char	   *lib = os_info.libraries[libnum];
+		int			llen = strlen(lib);
+		char		cmd[7 + 2 * MAXPGPATH + 1];
+		PGresult   *res;
+
+		/*
+		 * In Postgres 9.0, Python 3 support was added, and to do that, a
+		 * plpython2u language was created with library name plpython2.so as a
+		 * symbolic link to plpython.so.  In Postgres 9.1, only the
+		 * plpython2.so library was created, and both plpythonu and plpython2u
+		 * point to it.  For this reason, any reference to library name
+		 * "plpython" in an old PG <= 9.0 cluster must look for "plpython2" in
+		 * the new cluster.
+		 *
+		 * For this case, we could check pg_pltemplate, but that only works
+		 * for languages, and does not help with function shared objects, so
+		 * we just do a general fix.
+		 */
+		if (GET_MAJOR_VERSION(old_cluster.major_version) < 901 &&
+			strcmp(lib, "$libdir/plpython") == 0)
+		{
+			lib = "$libdir/plpython2";
+			llen = strlen(lib);
+		}
+
+		strcpy(cmd, "LOAD '");
+		PQescapeStringConn(conn, cmd + strlen(cmd), lib, llen, NULL);
+		strcat(cmd, "'");
+
+		res = PQexec(conn, cmd);
+
+		if (PQresultStatus(res) != PGRES_COMMAND_OK)
+		{
+			found = true;
+
+			if (script == NULL && (script = fopen_priv(output_path, "w")) == NULL)
+				pg_fatal("Could not open file \"%s\": %s\n",
+						 output_path, getErrorText(errno));
+			fprintf(script, "Could not load library \"%s\"\n%s\n",
+					lib,
+					PQerrorMessage(conn));
+		}
+
+		PQclear(res);
+	}
+
+	PQfinish(conn);
+
+	if (found)
+	{
+		fclose(script);
+		pg_log(PG_REPORT, "fatal\n");
+		pg_fatal("Your installation references loadable libraries that are missing from the\n"
+				 "new installation.  You can add these libraries to the new installation,\n"
+				 "or remove the functions using them from the old installation.  A list of\n"
+				 "problem libraries is in the file:\n"
+				 "    %s\n\n", output_path);
+	}
+	else
+		check_ok();
+}
diff --git a/src/bin/pg_upgrade/info.c b/src/bin/pg_upgrade/info.c
new file mode 100644
index 0000000..c0a5601
--- /dev/null
+++ b/src/bin/pg_upgrade/info.c
@@ -0,0 +1,535 @@
+/*
+ *	info.c
+ *
+ *	information support functions
+ *
+ *	Copyright (c) 2010-2015, PostgreSQL Global Development Group
+ *	src/bin/pg_upgrade/info.c
+ */
+
+#include "postgres_fe.h"
+
+#include "pg_upgrade.h"
+
+#include "access/transam.h"
+
+
+static void create_rel_filename_map(const char *old_data, const char *new_data,
+						const DbInfo *old_db, const DbInfo *new_db,
+						const RelInfo *old_rel, const RelInfo *new_rel,
+						FileNameMap *map);
+static void free_db_and_rel_infos(DbInfoArr *db_arr);
+static void get_db_infos(ClusterInfo *cluster);
+static void get_rel_infos(ClusterInfo *cluster, DbInfo *dbinfo);
+static void free_rel_infos(RelInfoArr *rel_arr);
+static void print_db_infos(DbInfoArr *dbinfo);
+static void print_rel_infos(RelInfoArr *rel_arr);
+
+
+/*
+ * gen_db_file_maps()
+ *
+ * generates database mappings for "old_db" and "new_db". Returns a malloc'ed
+ * array of mappings. nmaps is a return parameter which refers to the number
+ * of mappings.
+ */
+FileNameMap *
+gen_db_file_maps(DbInfo *old_db, DbInfo *new_db,
+				 int *nmaps, const char *old_pgdata, const char *new_pgdata)
+{
+	FileNameMap *maps;
+	int			old_relnum, new_relnum;
+	int			num_maps = 0;
+
+	maps = (FileNameMap *) pg_malloc(sizeof(FileNameMap) *
+									 old_db->rel_arr.nrels);
+
+	/*
+	 * The old database shouldn't have more relations than the new one.
+	 * We force the new cluster to have a TOAST table if the old table
+	 * had one.
+	 */
+	if (old_db->rel_arr.nrels > new_db->rel_arr.nrels)
+		pg_fatal("old and new databases \"%s\" have a mismatched number of relations\n",
+				 old_db->db_name);
+
+	/* Drive the loop using new_relnum, which might be higher. */
+	for (old_relnum = new_relnum = 0; new_relnum < new_db->rel_arr.nrels;
+		 new_relnum++)
+	{
+		RelInfo    *old_rel;
+		RelInfo    *new_rel = &new_db->rel_arr.rels[new_relnum];
+
+		/*
+		 * It is possible that the new cluster has a TOAST table for a table
+		 * that didn't need one in the old cluster, e.g. 9.0 to 9.1 changed the
+		 * NUMERIC length computation.  Therefore, if we have a TOAST table
+		 * in the new cluster that doesn't match, skip over it and continue
+		 * processing.  It is possible this TOAST table used an OID that was
+		 * reserved in the old cluster, but we have no way of testing that,
+		 * and we would have already gotten an error at the new cluster schema
+		 * creation stage.  Fortunately, since we only restore the OID counter
+		 * after schema restore, and restore in OID order via pg_dump, a
+		 * conflict would only happen if the new TOAST table had a very low
+		 * OID.  However, TOAST tables created long after initial table
+		 * creation can have any OID, particularly after OID wraparound.
+		 */
+		if (old_relnum == old_db->rel_arr.nrels)
+		{
+			if (strcmp(new_rel->nspname, "pg_toast") == 0)
+				continue;
+			else
+				pg_fatal("Extra non-TOAST relation found in database \"%s\": new OID %u\n",
+						 old_db->db_name, new_rel->reloid);
+		}
+
+		old_rel = &old_db->rel_arr.rels[old_relnum];
+
+		if (old_rel->reloid != new_rel->reloid)
+		{
+			if (strcmp(new_rel->nspname, "pg_toast") == 0)
+				continue;
+			else
+				pg_fatal("Mismatch of relation OID in database \"%s\": old OID %u, new OID %u\n",
+						 old_db->db_name, old_rel->reloid, new_rel->reloid);
+		}
+
+		/*
+		 * TOAST table names initially match the heap pg_class oid. In
+		 * pre-8.4, TOAST table names change during CLUSTER; in pre-9.0, TOAST
+		 * table names change during ALTER TABLE ALTER COLUMN SET TYPE. In >=
+		 * 9.0, TOAST relation names always use heap table oids, hence we
+		 * cannot check relation names when upgrading from pre-9.0. Clusters
+		 * upgraded to 9.0 will get matching TOAST names. If index names don't
+		 * match primary key constraint names, this will fail because pg_dump
+		 * dumps constraint names and pg_upgrade checks index names.
+		 */
+		if (strcmp(old_rel->nspname, new_rel->nspname) != 0 ||
+			((GET_MAJOR_VERSION(old_cluster.major_version) >= 900 ||
+			  strcmp(old_rel->nspname, "pg_toast") != 0) &&
+			 strcmp(old_rel->relname, new_rel->relname) != 0))
+			pg_fatal("Mismatch of relation names in database \"%s\": "
+					 "old name \"%s.%s\", new name \"%s.%s\"\n",
+					 old_db->db_name, old_rel->nspname, old_rel->relname,
+					 new_rel->nspname, new_rel->relname);
+
+		create_rel_filename_map(old_pgdata, new_pgdata, old_db, new_db,
+								old_rel, new_rel, maps + num_maps);
+		num_maps++;
+		old_relnum++;
+	}
+
+	/* Did we fail to exhaust the old array? */
+	if (old_relnum != old_db->rel_arr.nrels)
+		pg_fatal("old and new databases \"%s\" have a mismatched number of relations\n",
+				 old_db->db_name);
+
+	*nmaps = num_maps;
+	return maps;
+}
+
+
+/*
+ * create_rel_filename_map()
+ *
+ * fills a file node map structure and returns it in "map".
+ */
+static void
+create_rel_filename_map(const char *old_data, const char *new_data,
+						const DbInfo *old_db, const DbInfo *new_db,
+						const RelInfo *old_rel, const RelInfo *new_rel,
+						FileNameMap *map)
+{
+	if (strlen(old_rel->tablespace) == 0)
+	{
+		/*
+		 * relation belongs to the default tablespace, hence relfiles should
+		 * exist in the data directories.
+		 */
+		map->old_tablespace = old_data;
+		map->new_tablespace = new_data;
+		map->old_tablespace_suffix = "/base";
+		map->new_tablespace_suffix = "/base";
+	}
+	else
+	{
+		/* relation belongs to a tablespace, so use the tablespace location */
+		map->old_tablespace = old_rel->tablespace;
+		map->new_tablespace = new_rel->tablespace;
+		map->old_tablespace_suffix = old_cluster.tablespace_suffix;
+		map->new_tablespace_suffix = new_cluster.tablespace_suffix;
+	}
+
+	map->old_db_oid = old_db->db_oid;
+	map->new_db_oid = new_db->db_oid;
+
+	/*
+	 * old_relfilenode might differ from pg_class.oid (and hence
+	 * new_relfilenode) because of CLUSTER, REINDEX, or VACUUM FULL.
+	 */
+	map->old_relfilenode = old_rel->relfilenode;
+
+	/* new_relfilenode will match old and new pg_class.oid */
+	map->new_relfilenode = new_rel->relfilenode;
+
+	/* used only for logging and error reporting, old/new are identical */
+	map->nspname = old_rel->nspname;
+	map->relname = old_rel->relname;
+}
+
+
+void
+print_maps(FileNameMap *maps, int n_maps, const char *db_name)
+{
+	if (log_opts.verbose)
+	{
+		int			mapnum;
+
+		pg_log(PG_VERBOSE, "mappings for database \"%s\":\n", db_name);
+
+		for (mapnum = 0; mapnum < n_maps; mapnum++)
+			pg_log(PG_VERBOSE, "%s.%s: %u to %u\n",
+				   maps[mapnum].nspname, maps[mapnum].relname,
+				   maps[mapnum].old_relfilenode,
+				   maps[mapnum].new_relfilenode);
+
+		pg_log(PG_VERBOSE, "\n\n");
+	}
+}
+
+
+/*
+ * get_db_and_rel_infos()
+ *
+ * higher level routine to generate dbinfos for the database running
+ * on the given "port". Assumes that server is already running.
+ */
+void
+get_db_and_rel_infos(ClusterInfo *cluster)
+{
+	int			dbnum;
+
+	if (cluster->dbarr.dbs != NULL)
+		free_db_and_rel_infos(&cluster->dbarr);
+
+	get_db_infos(cluster);
+
+	for (dbnum = 0; dbnum < cluster->dbarr.ndbs; dbnum++)
+		get_rel_infos(cluster, &cluster->dbarr.dbs[dbnum]);
+
+	pg_log(PG_VERBOSE, "\n%s databases:\n", CLUSTER_NAME(cluster));
+	if (log_opts.verbose)
+		print_db_infos(&cluster->dbarr);
+}
+
+
+/*
+ * get_db_infos()
+ *
+ * Scans pg_database system catalog and populates all user
+ * databases.
+ */
+static void
+get_db_infos(ClusterInfo *cluster)
+{
+	PGconn	   *conn = connectToServer(cluster, "template1");
+	PGresult   *res;
+	int			ntups;
+	int			tupnum;
+	DbInfo	   *dbinfos;
+	int			i_datname,
+				i_oid,
+				i_encoding,
+				i_datcollate,
+				i_datctype,
+				i_spclocation;
+	char		query[QUERY_ALLOC];
+
+	snprintf(query, sizeof(query),
+			 "SELECT d.oid, d.datname, d.encoding, d.datcollate, d.datctype, "
+			 "%s AS spclocation "
+			 "FROM pg_catalog.pg_database d "
+			 " LEFT OUTER JOIN pg_catalog.pg_tablespace t "
+			 " ON d.dattablespace = t.oid "
+			 "WHERE d.datallowconn = true "
+	/* we don't preserve pg_database.oid so we sort by name */
+			 "ORDER BY 2",
+	/* 9.2 removed the spclocation column */
+			 (GET_MAJOR_VERSION(cluster->major_version) <= 901) ?
+			 "t.spclocation" : "pg_catalog.pg_tablespace_location(t.oid)");
+
+	res = executeQueryOrDie(conn, "%s", query);
+
+	i_oid = PQfnumber(res, "oid");
+	i_datname = PQfnumber(res, "datname");
+	i_encoding = PQfnumber(res, "encoding");
+	i_datcollate = PQfnumber(res, "datcollate");
+	i_datctype = PQfnumber(res, "datctype");
+	i_spclocation = PQfnumber(res, "spclocation");
+
+	ntups = PQntuples(res);
+	dbinfos = (DbInfo *) pg_malloc(sizeof(DbInfo) * ntups);
+
+	for (tupnum = 0; tupnum < ntups; tupnum++)
+	{
+		dbinfos[tupnum].db_oid = atooid(PQgetvalue(res, tupnum, i_oid));
+		dbinfos[tupnum].db_name = pg_strdup(PQgetvalue(res, tupnum, i_datname));
+		dbinfos[tupnum].db_encoding = atoi(PQgetvalue(res, tupnum, i_encoding));
+		dbinfos[tupnum].db_collate = pg_strdup(PQgetvalue(res, tupnum, i_datcollate));
+		dbinfos[tupnum].db_ctype = pg_strdup(PQgetvalue(res, tupnum, i_datctype));
+		snprintf(dbinfos[tupnum].db_tablespace, sizeof(dbinfos[tupnum].db_tablespace), "%s",
+				 PQgetvalue(res, tupnum, i_spclocation));
+	}
+	PQclear(res);
+
+	PQfinish(conn);
+
+	cluster->dbarr.dbs = dbinfos;
+	cluster->dbarr.ndbs = ntups;
+}
+
+
+/*
+ * get_rel_infos()
+ *
+ * gets the relinfos for all the user tables of the database referred
+ * by "db".
+ *
+ * NOTE: we assume that relations/entities with oids greater than or equal to
+ * FirstNormalObjectId belong to the user
+ */
+static void
+get_rel_infos(ClusterInfo *cluster, DbInfo *dbinfo)
+{
+	PGconn	   *conn = connectToServer(cluster,
+									   dbinfo->db_name);
+	PGresult   *res;
+	RelInfo    *relinfos;
+	int			ntups;
+	int			relnum;
+	int			num_rels = 0;
+	char	   *nspname = NULL;
+	char	   *relname = NULL;
+	char	   *tablespace = NULL;
+	int			i_spclocation,
+				i_nspname,
+				i_relname,
+				i_oid,
+				i_relfilenode,
+				i_reltablespace;
+	char		query[QUERY_ALLOC];
+	char	   *last_namespace = NULL,
+			   *last_tablespace = NULL;
+
+	/*
+	 * pg_largeobject contains user data that does not appear in pg_dump
+	 * --schema-only output, so we have to copy that system table heap and
+	 * index.  We could grab the pg_largeobject oids from template1, but it is
+	 * easy to treat it as a normal table. Order by oid so we can join old/new
+	 * structures efficiently.
+	 */
+
+	snprintf(query, sizeof(query),
+		/* get regular heap */
+			"WITH regular_heap (reloid) AS ( "
+			"	SELECT c.oid "
+			"	FROM pg_catalog.pg_class c JOIN pg_catalog.pg_namespace n "
+			"		   ON c.relnamespace = n.oid "
+			"	LEFT OUTER JOIN pg_catalog.pg_index i "
+			"		   ON c.oid = i.indexrelid "
+			"	WHERE relkind IN ('r', 'm', 'i', 'S') AND "
+		/*
+		 * pg_dump only dumps valid indexes;  testing indisready is necessary in
+		 * 9.2, and harmless in earlier/later versions.
+		 */
+			"		i.indisvalid IS DISTINCT FROM false AND "
+			"		i.indisready IS DISTINCT FROM false AND "
+		/* exclude possible orphaned temp tables */
+			"	  ((n.nspname !~ '^pg_temp_' AND "
+			"	    n.nspname !~ '^pg_toast_temp_' AND "
+		/* skip pg_toast because toast indexes have relkind == 'i', not 't' */
+			"	    n.nspname NOT IN ('pg_catalog', 'information_schema', "
+			"							'binary_upgrade', 'pg_toast') AND "
+			"		  c.oid >= %u) OR "
+			"	  (n.nspname = 'pg_catalog' AND "
+	"    relname IN ('pg_largeobject', 'pg_largeobject_loid_pn_index'%s) ))), "
+		/*
+		 * We have to gather the TOAST tables in later steps because we
+		 * can't schema-qualify TOAST tables.
+		 */
+		 /* get TOAST heap */
+			"	toast_heap (reloid) AS ( "
+			"	SELECT reltoastrelid "
+			"	FROM regular_heap JOIN pg_catalog.pg_class c "
+			"		ON regular_heap.reloid = c.oid "
+			"		AND c.reltoastrelid != %u), "
+		 /* get indexes on regular and TOAST heap */
+			"	all_index (reloid) AS ( "
+			"	SELECT indexrelid "
+			"	FROM pg_index "
+			"	WHERE indisvalid "
+			"    AND indrelid IN (SELECT reltoastrelid "
+			"        FROM (SELECT reloid FROM regular_heap "
+			"			   UNION ALL "
+			"			   SELECT reloid FROM toast_heap) all_heap "
+			"            JOIN pg_catalog.pg_class c "
+			"            ON all_heap.reloid = c.oid "
+			"            AND c.reltoastrelid != %u)) "
+		/* get all rels */
+			"SELECT c.oid, n.nspname, c.relname, "
+			"	c.relfilenode, c.reltablespace, %s "
+			"FROM (SELECT reloid FROM regular_heap "
+			"	   UNION ALL "
+			"	   SELECT reloid FROM toast_heap  "
+			"	   UNION ALL "
+			"	   SELECT reloid FROM all_index) all_rels "
+			"  JOIN pg_catalog.pg_class c "
+			"		ON all_rels.reloid = c.oid "
+			"  JOIN pg_catalog.pg_namespace n "
+			"	   ON c.relnamespace = n.oid "
+			"  LEFT OUTER JOIN pg_catalog.pg_tablespace t "
+			"	   ON c.reltablespace = t.oid "
+	/* we preserve pg_class.oid so we sort by it to match old/new */
+			"ORDER BY 1;",
+			FirstNormalObjectId,
+	/* does pg_largeobject_metadata need to be migrated? */
+			(GET_MAJOR_VERSION(old_cluster.major_version) <= 804) ?
+	"" : ", 'pg_largeobject_metadata', 'pg_largeobject_metadata_oid_index'",
+	InvalidOid, InvalidOid,
+	/* 9.2 removed the spclocation column */
+			(GET_MAJOR_VERSION(cluster->major_version) <= 901) ?
+			"t.spclocation" : "pg_catalog.pg_tablespace_location(t.oid) AS spclocation");
+
+	res = executeQueryOrDie(conn, "%s", query);
+
+	ntups = PQntuples(res);
+
+	relinfos = (RelInfo *) pg_malloc(sizeof(RelInfo) * ntups);
+
+	i_oid = PQfnumber(res, "oid");
+	i_nspname = PQfnumber(res, "nspname");
+	i_relname = PQfnumber(res, "relname");
+	i_relfilenode = PQfnumber(res, "relfilenode");
+	i_reltablespace = PQfnumber(res, "reltablespace");
+	i_spclocation = PQfnumber(res, "spclocation");
+
+	for (relnum = 0; relnum < ntups; relnum++)
+	{
+		RelInfo    *curr = &relinfos[num_rels++];
+
+		curr->reloid = atooid(PQgetvalue(res, relnum, i_oid));
+
+		nspname = PQgetvalue(res, relnum, i_nspname);
+		curr->nsp_alloc = false;
+
+		/*
+		 * Many of the namespace and tablespace strings are identical, so we
+		 * try to reuse the allocated string pointers where possible to reduce
+		 * memory consumption.
+		 */
+		/* Can we reuse the previous string allocation? */
+		if (last_namespace && strcmp(nspname, last_namespace) == 0)
+			curr->nspname = last_namespace;
+		else
+		{
+			last_namespace = curr->nspname = pg_strdup(nspname);
+			curr->nsp_alloc = true;
+		}
+
+		relname = PQgetvalue(res, relnum, i_relname);
+		curr->relname = pg_strdup(relname);
+
+		curr->relfilenode = atooid(PQgetvalue(res, relnum, i_relfilenode));
+		curr->tblsp_alloc = false;
+
+		/* Is the tablespace oid non-zero? */
+		if (atooid(PQgetvalue(res, relnum, i_reltablespace)) != 0)
+		{
+			/*
+			 * The tablespace location might be "", meaning the cluster
+			 * default location, i.e. pg_default or pg_global.
+			 */
+			tablespace = PQgetvalue(res, relnum, i_spclocation);
+
+			/* Can we reuse the previous string allocation? */
+			if (last_tablespace && strcmp(tablespace, last_tablespace) == 0)
+				curr->tablespace = last_tablespace;
+			else
+			{
+				last_tablespace = curr->tablespace = pg_strdup(tablespace);
+				curr->tblsp_alloc = true;
+			}
+		}
+		else
+			/* A zero reltablespace oid indicates the database tablespace. */
+			curr->tablespace = dbinfo->db_tablespace;
+	}
+	PQclear(res);
+
+	PQfinish(conn);
+
+	dbinfo->rel_arr.rels = relinfos;
+	dbinfo->rel_arr.nrels = num_rels;
+}
+
+
+static void
+free_db_and_rel_infos(DbInfoArr *db_arr)
+{
+	int			dbnum;
+
+	for (dbnum = 0; dbnum < db_arr->ndbs; dbnum++)
+	{
+		free_rel_infos(&db_arr->dbs[dbnum].rel_arr);
+		pg_free(db_arr->dbs[dbnum].db_name);
+	}
+	pg_free(db_arr->dbs);
+	db_arr->dbs = NULL;
+	db_arr->ndbs = 0;
+}
+
+
+static void
+free_rel_infos(RelInfoArr *rel_arr)
+{
+	int			relnum;
+
+	for (relnum = 0; relnum < rel_arr->nrels; relnum++)
+	{
+		if (rel_arr->rels[relnum].nsp_alloc)
+			pg_free(rel_arr->rels[relnum].nspname);
+		pg_free(rel_arr->rels[relnum].relname);
+		if (rel_arr->rels[relnum].tblsp_alloc)
+			pg_free(rel_arr->rels[relnum].tablespace);
+	}
+	pg_free(rel_arr->rels);
+	rel_arr->nrels = 0;
+}
+
+
+static void
+print_db_infos(DbInfoArr *db_arr)
+{
+	int			dbnum;
+
+	for (dbnum = 0; dbnum < db_arr->ndbs; dbnum++)
+	{
+		pg_log(PG_VERBOSE, "Database: %s\n", db_arr->dbs[dbnum].db_name);
+		print_rel_infos(&db_arr->dbs[dbnum].rel_arr);
+		pg_log(PG_VERBOSE, "\n\n");
+	}
+}
+
+
+static void
+print_rel_infos(RelInfoArr *rel_arr)
+{
+	int			relnum;
+
+	for (relnum = 0; relnum < rel_arr->nrels; relnum++)
+		pg_log(PG_VERBOSE, "relname: %s.%s: reloid: %u reltblspace: %s\n",
+			   rel_arr->rels[relnum].nspname,
+			   rel_arr->rels[relnum].relname,
+			   rel_arr->rels[relnum].reloid,
+			   rel_arr->rels[relnum].tablespace);
+}
diff --git a/src/bin/pg_upgrade/option.c b/src/bin/pg_upgrade/option.c
new file mode 100644
index 0000000..b851056
--- /dev/null
+++ b/src/bin/pg_upgrade/option.c
@@ -0,0 +1,518 @@
+/*
+ *	option.c
+ *
+ *	options functions
+ *
+ *	Copyright (c) 2010-2015, PostgreSQL Global Development Group
+ *	src/bin/pg_upgrade/option.c
+ */
+
+#include "postgres_fe.h"
+
+#include "miscadmin.h"
+#include "getopt_long.h"
+
+#include "pg_upgrade.h"
+
+#include <time.h>
+#include <sys/types.h>
+#ifdef WIN32
+#include <io.h>
+#endif
+
+
+static void usage(void);
+static void check_required_directory(char **dirpath, char **configpath,
+				   char *envVarName, char *cmdLineOption, char *description);
+#define FIX_DEFAULT_READ_ONLY "-c default_transaction_read_only=false"
+
+
+UserOpts	user_opts;
+
+
+/*
+ * parseCommandLine()
+ *
+ *	Parses the command line (argc, argv[]) and loads structures
+ */
+void
+parseCommandLine(int argc, char *argv[])
+{
+	static struct option long_options[] = {
+		{"old-datadir", required_argument, NULL, 'd'},
+		{"new-datadir", required_argument, NULL, 'D'},
+		{"old-bindir", required_argument, NULL, 'b'},
+		{"new-bindir", required_argument, NULL, 'B'},
+		{"old-options", required_argument, NULL, 'o'},
+		{"new-options", required_argument, NULL, 'O'},
+		{"old-port", required_argument, NULL, 'p'},
+		{"new-port", required_argument, NULL, 'P'},
+
+		{"username", required_argument, NULL, 'U'},
+		{"check", no_argument, NULL, 'c'},
+		{"link", no_argument, NULL, 'k'},
+		{"retain", no_argument, NULL, 'r'},
+		{"jobs", required_argument, NULL, 'j'},
+		{"verbose", no_argument, NULL, 'v'},
+		{NULL, 0, NULL, 0}
+	};
+	int			option;			/* Command line option */
+	int			optindex = 0;	/* used by getopt_long */
+	int			os_user_effective_id;
+	FILE	   *fp;
+	char	  **filename;
+	time_t		run_time = time(NULL);
+
+	user_opts.transfer_mode = TRANSFER_MODE_COPY;
+
+	os_info.progname = get_progname(argv[0]);
+
+	/* Process libpq env. variables; load values here for usage() output */
+	old_cluster.port = getenv("PGPORTOLD") ? atoi(getenv("PGPORTOLD")) : DEF_PGUPORT;
+	new_cluster.port = getenv("PGPORTNEW") ? atoi(getenv("PGPORTNEW")) : DEF_PGUPORT;
+
+	os_user_effective_id = get_user_info(&os_info.user);
+	/* we override just the database user name;  we got the OS id above */
+	if (getenv("PGUSER"))
+	{
+		pg_free(os_info.user);
+		/* must save value, getenv()'s pointer is not stable */
+		os_info.user = pg_strdup(getenv("PGUSER"));
+	}
+
+	if (argc > 1)
+	{
+		if (strcmp(argv[1], "--help") == 0 || strcmp(argv[1], "-?") == 0)
+		{
+			usage();
+			exit(0);
+		}
+		if (strcmp(argv[1], "--version") == 0 || strcmp(argv[1], "-V") == 0)
+		{
+			puts("pg_upgrade (PostgreSQL) " PG_VERSION);
+			exit(0);
+		}
+	}
+
+	/* Allow help and version to be run as root, so do the test here. */
+	if (os_user_effective_id == 0)
+		pg_fatal("%s: cannot be run as root\n", os_info.progname);
+
+	if ((log_opts.internal = fopen_priv(INTERNAL_LOG_FILE, "a")) == NULL)
+		pg_fatal("cannot write to log file %s\n", INTERNAL_LOG_FILE);
+
+	while ((option = getopt_long(argc, argv, "d:D:b:B:cj:ko:O:p:P:rU:v",
+								 long_options, &optindex)) != -1)
+	{
+		switch (option)
+		{
+			case 'b':
+				old_cluster.bindir = pg_strdup(optarg);
+				break;
+
+			case 'B':
+				new_cluster.bindir = pg_strdup(optarg);
+				break;
+
+			case 'c':
+				user_opts.check = true;
+				break;
+
+			case 'd':
+				old_cluster.pgdata = pg_strdup(optarg);
+				old_cluster.pgconfig = pg_strdup(optarg);
+				break;
+
+			case 'D':
+				new_cluster.pgdata = pg_strdup(optarg);
+				new_cluster.pgconfig = pg_strdup(optarg);
+				break;
+
+			case 'j':
+				user_opts.jobs = atoi(optarg);
+				break;
+
+			case 'k':
+				user_opts.transfer_mode = TRANSFER_MODE_LINK;
+				break;
+
+			case 'o':
+				/* append option? */
+				if (!old_cluster.pgopts)
+					old_cluster.pgopts = pg_strdup(optarg);
+				else
+				{
+					char *old_pgopts = old_cluster.pgopts;
+
+					old_cluster.pgopts = psprintf("%s %s", old_pgopts, optarg);
+					free(old_pgopts);
+				}
+				break;
+
+			case 'O':
+				/* append option? */
+				if (!new_cluster.pgopts)
+					new_cluster.pgopts = pg_strdup(optarg);
+				else
+				{
+					char *new_pgopts = new_cluster.pgopts;
+
+					new_cluster.pgopts = psprintf("%s %s", new_pgopts, optarg);
+					free(new_pgopts);
+				}
+				break;
+
+				/*
+				 * Someday, the port number option could be removed and passed
+				 * using -o/-O, but that requires postmaster -C to be
+				 * supported on all old/new versions (added in PG 9.2).
+				 */
+			case 'p':
+				if ((old_cluster.port = atoi(optarg)) <= 0)
+				{
+					pg_fatal("invalid old port number\n");
+					exit(1);
+				}
+				break;
+
+			case 'P':
+				if ((new_cluster.port = atoi(optarg)) <= 0)
+				{
+					pg_fatal("invalid new port number\n");
+					exit(1);
+				}
+				break;
+
+			case 'r':
+				log_opts.retain = true;
+				break;
+
+			case 'U':
+				pg_free(os_info.user);
+				os_info.user = pg_strdup(optarg);
+				os_info.user_specified = true;
+
+				/*
+				 * Push the user name into the environment so pre-9.1
+				 * pg_ctl/libpq uses it.
+				 */
+				pg_putenv("PGUSER", os_info.user);
+				break;
+
+			case 'v':
+				pg_log(PG_REPORT, "Running in verbose mode\n");
+				log_opts.verbose = true;
+				break;
+
+			default:
+				pg_fatal("Try \"%s --help\" for more information.\n",
+						 os_info.progname);
+				break;
+		}
+	}
+
+	/* label start of upgrade in logfiles */
+	for (filename = output_files; *filename != NULL; filename++)
+	{
+		if ((fp = fopen_priv(*filename, "a")) == NULL)
+			pg_fatal("cannot write to log file %s\n", *filename);
+
+		/* Start with newline because we might be appending to a file. */
+		fprintf(fp, "\n"
+		"-----------------------------------------------------------------\n"
+				"  pg_upgrade run on %s"
+				"-----------------------------------------------------------------\n\n",
+				ctime(&run_time));
+		fclose(fp);
+	}
+
+	/* Turn off read-only mode;  add prefix to PGOPTIONS? */
+	if (getenv("PGOPTIONS"))
+	{
+		char	   *pgoptions = psprintf("%s %s", FIX_DEFAULT_READ_ONLY,
+										 getenv("PGOPTIONS"));
+
+		pg_putenv("PGOPTIONS", pgoptions);
+		pfree(pgoptions);
+	}
+	else
+		pg_putenv("PGOPTIONS", FIX_DEFAULT_READ_ONLY);
+
+	/* Get values from env if not already set */
+	check_required_directory(&old_cluster.bindir, NULL, "PGBINOLD", "-b",
+							 "old cluster binaries reside");
+	check_required_directory(&new_cluster.bindir, NULL, "PGBINNEW", "-B",
+							 "new cluster binaries reside");
+	check_required_directory(&old_cluster.pgdata, &old_cluster.pgconfig,
+							 "PGDATAOLD", "-d", "old cluster data resides");
+	check_required_directory(&new_cluster.pgdata, &new_cluster.pgconfig,
+							 "PGDATANEW", "-D", "new cluster data resides");
+
+#ifdef WIN32
+	/*
+	 * On Windows, initdb --sync-only will fail with a "Permission denied"
+	 * error on file pg_upgrade_utility.log if pg_upgrade is run inside
+	 * the new cluster directory, so we do a check here.
+	 */
+	{
+		char	cwd[MAXPGPATH], new_cluster_pgdata[MAXPGPATH];
+
+		strlcpy(new_cluster_pgdata, new_cluster.pgdata, MAXPGPATH);
+		canonicalize_path(new_cluster_pgdata);
+
+		if (!getcwd(cwd, MAXPGPATH))
+			pg_fatal("cannot find current directory\n");
+		canonicalize_path(cwd);
+		if (path_is_prefix_of_path(new_cluster_pgdata, cwd))
+			pg_fatal("cannot run pg_upgrade from inside the new cluster data directory on Windows\n");
+	}
+#endif
+}
+
+
+static void
+usage(void)
+{
+	printf(_("pg_upgrade upgrades a PostgreSQL cluster to a different major version.\n\
+\nUsage:\n\
+  pg_upgrade [OPTION]...\n\
+\n\
+Options:\n\
+  -b, --old-bindir=BINDIR       old cluster executable directory\n\
+  -B, --new-bindir=BINDIR       new cluster executable directory\n\
+  -c, --check                   check clusters only, don't change any data\n\
+  -d, --old-datadir=DATADIR     old cluster data directory\n\
+  -D, --new-datadir=DATADIR     new cluster data directory\n\
+  -j, --jobs                    number of simultaneous processes or threads to use\n\
+  -k, --link                    link instead of copying files to new cluster\n\
+  -o, --old-options=OPTIONS     old cluster options to pass to the server\n\
+  -O, --new-options=OPTIONS     new cluster options to pass to the server\n\
+  -p, --old-port=PORT           old cluster port number (default %d)\n\
+  -P, --new-port=PORT           new cluster port number (default %d)\n\
+  -r, --retain                  retain SQL and log files after success\n\
+  -U, --username=NAME           cluster superuser (default \"%s\")\n\
+  -v, --verbose                 enable verbose internal logging\n\
+  -V, --version                 display version information, then exit\n\
+  -?, --help                    show this help, then exit\n\
+\n\
+Before running pg_upgrade you must:\n\
+  create a new database cluster (using the new version of initdb)\n\
+  shutdown the postmaster servicing the old cluster\n\
+  shutdown the postmaster servicing the new cluster\n\
+\n\
+When you run pg_upgrade, you must provide the following information:\n\
+  the data directory for the old cluster  (-d DATADIR)\n\
+  the data directory for the new cluster  (-D DATADIR)\n\
+  the \"bin\" directory for the old version (-b BINDIR)\n\
+  the \"bin\" directory for the new version (-B BINDIR)\n\
+\n\
+For example:\n\
+  pg_upgrade -d oldCluster/data -D newCluster/data -b oldCluster/bin -B newCluster/bin\n\
+or\n"), old_cluster.port, new_cluster.port, os_info.user);
+#ifndef WIN32
+	printf(_("\
+  $ export PGDATAOLD=oldCluster/data\n\
+  $ export PGDATANEW=newCluster/data\n\
+  $ export PGBINOLD=oldCluster/bin\n\
+  $ export PGBINNEW=newCluster/bin\n\
+  $ pg_upgrade\n"));
+#else
+	printf(_("\
+  C:\\> set PGDATAOLD=oldCluster/data\n\
+  C:\\> set PGDATANEW=newCluster/data\n\
+  C:\\> set PGBINOLD=oldCluster/bin\n\
+  C:\\> set PGBINNEW=newCluster/bin\n\
+  C:\\> pg_upgrade\n"));
+#endif
+	printf(_("\nReport bugs to <pgsql-bugs@postgresql.org>.\n"));
+}
+
+
+/*
+ * check_required_directory()
+ *
+ * Checks a directory option.
+ *	dirpath		  - the directory name supplied on the command line
+ *	configpath	  - optional configuration directory
+ *	envVarName	  - the name of an environment variable to get if dirpath is NULL
+ *	cmdLineOption - the command line option that corresponds to this directory (-b, -B, -d, -D)
+ *	description   - a description of this directory option
+ *
+ * We use the last two arguments to construct a meaningful error message if the
+ * user hasn't provided the required directory name.
+ */
+static void
+check_required_directory(char **dirpath, char **configpath,
+						 char *envVarName, char *cmdLineOption,
+						 char *description)
+{
+	if (*dirpath == NULL || strlen(*dirpath) == 0)
+	{
+		const char *envVar;
+
+		if ((envVar = getenv(envVarName)) && strlen(envVar))
+		{
+			*dirpath = pg_strdup(envVar);
+			if (configpath)
+				*configpath = pg_strdup(envVar);
+		}
+		else
+			pg_fatal("You must identify the directory where the %s.\n"
+					 "Please use the %s command-line option or the %s environment variable.\n",
+					 description, cmdLineOption, envVarName);
+	}
+
+	/*
+	 * Trim off any trailing path separators because we construct paths by
+	 * appending to this path.
+	 */
+#ifndef WIN32
+	if ((*dirpath)[strlen(*dirpath) - 1] == '/')
+#else
+	if ((*dirpath)[strlen(*dirpath) - 1] == '/' ||
+		(*dirpath)[strlen(*dirpath) - 1] == '\\')
+#endif
+		(*dirpath)[strlen(*dirpath) - 1] = 0;
+}
+
+/*
+ * adjust_data_dir
+ *
+ * If a configuration-only directory was specified, find the real data dir
+ * by querying the running server.  This has limited checking because we
+ * can't check for a running server because we can't find postmaster.pid.
+ */
+void
+adjust_data_dir(ClusterInfo *cluster)
+{
+	char		filename[MAXPGPATH];
+	char		cmd[MAXPGPATH],
+				cmd_output[MAX_STRING];
+	FILE	   *fp,
+			   *output;
+
+	/* If there is no postgresql.conf, it can't be a config-only dir */
+	snprintf(filename, sizeof(filename), "%s/postgresql.conf", cluster->pgconfig);
+	if ((fp = fopen(filename, "r")) == NULL)
+		return;
+	fclose(fp);
+
+	/* If PG_VERSION exists, it can't be a config-only dir */
+	snprintf(filename, sizeof(filename), "%s/PG_VERSION", cluster->pgconfig);
+	if ((fp = fopen(filename, "r")) != NULL)
+	{
+		fclose(fp);
+		return;
+	}
+
+	/* Must be a configuration directory, so find the real data directory. */
+
+	prep_status("Finding the real data directory for the %s cluster",
+				CLUSTER_NAME(cluster));
+
+	/*
+	 * We don't have a data directory yet, so we can't check the PG version,
+	 * so this might fail --- only works for PG 9.2+.   If this fails,
+	 * pg_upgrade will fail anyway because the data files will not be found.
+	 */
+	snprintf(cmd, sizeof(cmd), "\"%s/postgres\" -D \"%s\" -C data_directory",
+			 cluster->bindir, cluster->pgconfig);
+
+	if ((output = popen(cmd, "r")) == NULL ||
+		fgets(cmd_output, sizeof(cmd_output), output) == NULL)
+		pg_fatal("Could not get data directory using %s: %s\n",
+				 cmd, getErrorText(errno));
+
+	pclose(output);
+
+	/* Remove trailing newline */
+	if (strchr(cmd_output, '\n') != NULL)
+		*strchr(cmd_output, '\n') = '\0';
+
+	cluster->pgdata = pg_strdup(cmd_output);
+
+	check_ok();
+}
+
+
+/*
+ * get_sock_dir
+ *
+ * Identify the socket directory to use for this cluster.  If we're doing
+ * a live check (old cluster only), we need to find out where the postmaster
+ * is listening.  Otherwise, we're going to put the socket into the current
+ * directory.
+ */
+void
+get_sock_dir(ClusterInfo *cluster, bool live_check)
+{
+#ifdef HAVE_UNIX_SOCKETS
+
+	/*
+	 * sockdir and port were added to postmaster.pid in PG 9.1. Pre-9.1 cannot
+	 * process pg_ctl -w for sockets in non-default locations.
+	 */
+	if (GET_MAJOR_VERSION(cluster->major_version) >= 901)
+	{
+		if (!live_check)
+		{
+			/* Use the current directory for the socket */
+			cluster->sockdir = pg_malloc(MAXPGPATH);
+			if (!getcwd(cluster->sockdir, MAXPGPATH))
+				pg_fatal("cannot find current directory\n");
+		}
+		else
+		{
+			/*
+			 * If we are doing a live check, we will use the old cluster's
+			 * Unix domain socket directory so we can connect to the live
+			 * server.
+			 */
+			unsigned short orig_port = cluster->port;
+			char		filename[MAXPGPATH],
+						line[MAXPGPATH];
+			FILE	   *fp;
+			int			lineno;
+
+			snprintf(filename, sizeof(filename), "%s/postmaster.pid",
+					 cluster->pgdata);
+			if ((fp = fopen(filename, "r")) == NULL)
+				pg_fatal("Cannot open file %s: %m\n", filename);
+
+			for (lineno = 1;
+			   lineno <= Max(LOCK_FILE_LINE_PORT, LOCK_FILE_LINE_SOCKET_DIR);
+				 lineno++)
+			{
+				if (fgets(line, sizeof(line), fp) == NULL)
+					pg_fatal("Cannot read line %d from %s: %m\n", lineno, filename);
+
+				/* potentially overwrite user-supplied value */
+				if (lineno == LOCK_FILE_LINE_PORT)
+					sscanf(line, "%hu", &old_cluster.port);
+				if (lineno == LOCK_FILE_LINE_SOCKET_DIR)
+				{
+					cluster->sockdir = pg_strdup(line);
+					/* strip off newline */
+					if (strchr(cluster->sockdir, '\n') != NULL)
+						*strchr(cluster->sockdir, '\n') = '\0';
+				}
+			}
+			fclose(fp);
+
+			/* warn of port number correction */
+			if (orig_port != DEF_PGUPORT && old_cluster.port != orig_port)
+				pg_log(PG_WARNING, "User-supplied old port number %hu corrected to %hu\n",
+					   orig_port, cluster->port);
+		}
+	}
+	else
+
+		/*
+		 * Can't get sockdir, and pg_ctl -w can't use a non-default one, so
+		 * use the default
+		 */
+		cluster->sockdir = NULL;
+#else							/* !HAVE_UNIX_SOCKETS */
+	cluster->sockdir = NULL;
+#endif
+}
diff --git a/src/bin/pg_upgrade/page.c b/src/bin/pg_upgrade/page.c
new file mode 100644
index 0000000..3f4c697
--- /dev/null
+++ b/src/bin/pg_upgrade/page.c
@@ -0,0 +1,164 @@
+/*
+ *	page.c
+ *
+ *	per-page conversion operations
+ *
+ *	Copyright (c) 2010-2015, PostgreSQL Global Development Group
+ *	src/bin/pg_upgrade/page.c
+ */
+
+#include "postgres_fe.h"
+
+#include "pg_upgrade.h"
+
+#include "storage/bufpage.h"
+
+
+#ifdef PAGE_CONVERSION
+
+
+static void getPageVersion(
+			   uint16 *version, const char *pathName);
+static pageCnvCtx *loadConverterPlugin(
+					uint16 newPageVersion, uint16 oldPageVersion);
+
+
+/*
+ * setupPageConverter()
+ *
+ *	This function determines the PageLayoutVersion of the old cluster and
+ *	the PageLayoutVersion of the new cluster.  If the versions differ, this
+ *	function loads a converter plugin and returns a pointer to a pageCnvCtx
+ *	object that knows how to convert pages from the old format
+ *	to the new format.  If the versions are identical, this function just
+ *	returns a NULL pageCnvCtx pointer to indicate that page-by-page conversion
+ *	is not required.
+ */
+pageCnvCtx *
+setupPageConverter(void)
+{
+	uint16		oldPageVersion;
+	uint16		newPageVersion;
+	pageCnvCtx *converter;
+	const char *msg;
+	char		dstName[MAXPGPATH];
+	char		srcName[MAXPGPATH];
+
+	snprintf(dstName, sizeof(dstName), "%s/global/%u", new_cluster.pgdata,
+			 new_cluster.pg_database_oid);
+	snprintf(srcName, sizeof(srcName), "%s/global/%u", old_cluster.pgdata,
+			 old_cluster.pg_database_oid);
+
+	getPageVersion(&oldPageVersion, srcName);
+	getPageVersion(&newPageVersion, dstName);
+
+	/*
+	 * If the old cluster and new cluster use the same page layouts, then we
+	 * don't need a page converter.
+	 */
+	if (newPageVersion != oldPageVersion)
+	{
+		/*
+		 * The clusters use differing page layouts, see if we can find a
+		 * plugin that knows how to convert from the old page layout to the
+		 * new page layout.
+		 */
+
+		if ((converter = loadConverterPlugin(newPageVersion, oldPageVersion)) == NULL)
+			pg_fatal("could not find plugin to convert from old page layout to new page layout\n");
+
+		return converter;
+	}
+	else
+		return NULL;
+}
+
+
+/*
+ * getPageVersion()
+ *
+ *	Retrieves the PageLayoutVersion for the given relation.
+ *
+ *	On success, stores the PageLayoutVersion at *version.  If the relation
+ *	file cannot be opened or its page header cannot be read, this function
+ *	reports a fatal error via pg_fatal() and does not return.
+ */
+static void
+getPageVersion(uint16 *version, const char *pathName)
+{
+	int			relfd;
+	PageHeaderData page;
+	ssize_t		bytesRead;
+
+	if ((relfd = open(pathName, O_RDONLY, 0)) < 0)
+		pg_fatal("could not open relation %s\n", pathName);
+
+	if ((bytesRead = read(relfd, &page, sizeof(page))) != sizeof(page))
+		pg_fatal("could not read page header of %s\n", pathName);
+
+	*version = PageGetPageLayoutVersion(&page);
+
+	close(relfd);
+
+	return;
+}
+
+
+/*
+ * loadConverterPlugin()
+ *
+ *	This function loads a page-converter plugin library and grabs a
+ *	pointer to each of the (interesting) functions provided by that
+ *	plugin.  The name of the plugin library is derived from the given
+ *	newPageVersion and oldPageVersion.  If a plugin is found, this
+ *	function returns a pointer to a pageCnvCtx object (which will contain
+ *	a collection of plugin function pointers). If the required plugin
+ *	is not found, this function returns NULL.
+ */
+static pageCnvCtx *
+loadConverterPlugin(uint16 newPageVersion, uint16 oldPageVersion)
+{
+	char		pluginName[MAXPGPATH];
+	void	   *plugin;
+
+	/*
+	 * Try to find a plugin that can convert pages of oldPageVersion into
+	 * pages of newPageVersion.  For example, if oldPageVersion is 3 and
+	 * newPageVersion is 4, we search for a plugin named:
+	 * plugins/convertLayout_3_to_4.dll
+	 */
+
+	/*
+	 * FIXME: we are searching for plugins relative to the current directory,
+	 * we should really search relative to our own executable instead.
+	 */
+	snprintf(pluginName, sizeof(pluginName), "./plugins/convertLayout_%d_to_%d%s",
+			 oldPageVersion, newPageVersion, DLSUFFIX);
+
+	if ((plugin = pg_dlopen(pluginName)) == NULL)
+		return NULL;
+	else
+	{
+		pageCnvCtx *result = (pageCnvCtx *) pg_malloc(sizeof(*result));
+
+		result->old.PageVersion = oldPageVersion;
+		result->new.PageVersion = newPageVersion;
+
+		result->startup = (pluginStartup) pg_dlsym(plugin, "init");
+		result->convertFile = (pluginConvertFile) pg_dlsym(plugin, "convertFile");
+		result->convertPage = (pluginConvertPage) pg_dlsym(plugin, "convertPage");
+		result->shutdown = (pluginShutdown) pg_dlsym(plugin, "fini");
+		result->pluginData = NULL;
+
+		/*
+		 * If the plugin has exported an initializer, go ahead and invoke it.
+		 */
+		if (result->startup)
+			result->startup(MIGRATOR_API_VERSION, &result->pluginVersion,
+						newPageVersion, oldPageVersion, &result->pluginData);
+
+		return result;
+	}
+}
+
+#endif
diff --git a/src/bin/pg_upgrade/parallel.c b/src/bin/pg_upgrade/parallel.c
new file mode 100644
index 0000000..c6978b5
--- /dev/null
+++ b/src/bin/pg_upgrade/parallel.c
@@ -0,0 +1,357 @@
+/*
+ *	parallel.c
+ *
+ *	multi-process support
+ *
+ *	Copyright (c) 2010-2015, PostgreSQL Global Development Group
+ *	src/bin/pg_upgrade/parallel.c
+ */
+
+#include "postgres_fe.h"
+
+#include "pg_upgrade.h"
+
+#include <stdlib.h>
+#include <string.h>
+#include <sys/types.h>
+#include <sys/wait.h>
+
+#ifdef WIN32
+#include <io.h>
+#endif
+
+static int	parallel_jobs;
+
+#ifdef WIN32
+/*
+ *	Array holding all active threads.  There can't be any gaps/zeros so
+ *	it can be passed to WaitForMultipleObjects().  We use two arrays
+ *	so the thread_handles array can be passed to WaitForMultipleObjects().
+ */
+HANDLE	   *thread_handles;
+
+typedef struct
+{
+	char	   *log_file;
+	char	   *opt_log_file;
+	char	   *cmd;
+} exec_thread_arg;
+
+typedef struct
+{
+	DbInfoArr  *old_db_arr;
+	DbInfoArr  *new_db_arr;
+	char	   *old_pgdata;
+	char	   *new_pgdata;
+	char	   *old_tablespace;
+} transfer_thread_arg;
+
+exec_thread_arg **exec_thread_args;
+transfer_thread_arg **transfer_thread_args;
+
+/* track current thread_args struct so reap_child() can be used for all cases */
+void	  **cur_thread_args;
+
+DWORD		win32_exec_prog(exec_thread_arg *args);
+DWORD		win32_transfer_all_new_dbs(transfer_thread_arg *args);
+#endif
+
+/*
+ *	parallel_exec_prog
+ *
+ *	This has the same API as exec_prog, except it does parallel execution,
+ *	and therefore must throw errors and doesn't return an error status.
+ */
+void
+parallel_exec_prog(const char *log_file, const char *opt_log_file,
+				   const char *fmt,...)
+{
+	va_list		args;
+	char		cmd[MAX_STRING];
+
+#ifndef WIN32
+	pid_t		child;
+#else
+	HANDLE		child;
+	exec_thread_arg *new_arg;
+#endif
+
+	va_start(args, fmt);
+	vsnprintf(cmd, sizeof(cmd), fmt, args);
+	va_end(args);
+
+	if (user_opts.jobs <= 1)
+		/* throw_error must be true to allow jobs */
+		exec_prog(log_file, opt_log_file, true, "%s", cmd);
+	else
+	{
+		/* parallel */
+#ifdef WIN32
+		if (thread_handles == NULL)
+			thread_handles = pg_malloc(user_opts.jobs * sizeof(HANDLE));
+
+		if (exec_thread_args == NULL)
+		{
+			int			i;
+
+			exec_thread_args = pg_malloc(user_opts.jobs * sizeof(exec_thread_arg *));
+
+			/*
+			 * For safety and performance, we keep the args allocated during
+			 * the entire life of the process, and we don't free the args in a
+			 * thread different from the one that allocated it.
+			 */
+			for (i = 0; i < user_opts.jobs; i++)
+				exec_thread_args[i] = pg_malloc0(sizeof(exec_thread_arg));
+		}
+
+		cur_thread_args = (void **) exec_thread_args;
+#endif
+		/* harvest any dead children */
+		while (reap_child(false) == true)
+			;
+
+		/* must we wait for a dead child? */
+		if (parallel_jobs >= user_opts.jobs)
+			reap_child(true);
+
+		/* set this before we start the job */
+		parallel_jobs++;
+
+		/* Ensure stdio state is quiesced before forking */
+		fflush(NULL);
+
+#ifndef WIN32
+		child = fork();
+		if (child == 0)
+			/* use _exit to skip atexit() functions */
+			_exit(!exec_prog(log_file, opt_log_file, true, "%s", cmd));
+		else if (child < 0)
+			/* fork failed */
+			pg_fatal("could not create worker process: %s\n", strerror(errno));
+#else
+		/* empty array elements are always at the end */
+		new_arg = exec_thread_args[parallel_jobs - 1];
+
+		/* Can only pass one pointer into the function, so use a struct */
+		if (new_arg->log_file)
+			pg_free(new_arg->log_file);
+		new_arg->log_file = pg_strdup(log_file);
+		if (new_arg->opt_log_file)
+			pg_free(new_arg->opt_log_file);
+		new_arg->opt_log_file = opt_log_file ? pg_strdup(opt_log_file) : NULL;
+		if (new_arg->cmd)
+			pg_free(new_arg->cmd);
+		new_arg->cmd = pg_strdup(cmd);
+
+		child = (HANDLE) _beginthreadex(NULL, 0, (void *) win32_exec_prog,
+										new_arg, 0, NULL);
+		if (child == 0)
+			pg_fatal("could not create worker thread: %s\n", strerror(errno));
+
+		thread_handles[parallel_jobs - 1] = child;
+#endif
+	}
+
+	return;
+}
+
+
+#ifdef WIN32
+DWORD
+win32_exec_prog(exec_thread_arg *args)
+{
+	int			ret;
+
+	ret = !exec_prog(args->log_file, args->opt_log_file, true, "%s", args->cmd);
+
+	/* terminates thread */
+	return ret;
+}
+#endif
+
+
+/*
+ *	parallel_transfer_all_new_dbs
+ *
+ *	This has the same API as transfer_all_new_dbs, except it does parallel execution
+ *	by transferring multiple tablespaces in parallel
+ */
+void
+parallel_transfer_all_new_dbs(DbInfoArr *old_db_arr, DbInfoArr *new_db_arr,
+							  char *old_pgdata, char *new_pgdata,
+							  char *old_tablespace)
+{
+#ifndef WIN32
+	pid_t		child;
+#else
+	HANDLE		child;
+	transfer_thread_arg *new_arg;
+#endif
+
+	if (user_opts.jobs <= 1)
+		/* no parallelism requested, so transfer in this process */
+		transfer_all_new_dbs(old_db_arr, new_db_arr, old_pgdata, new_pgdata, NULL);
+	else
+	{
+		/* parallel */
+#ifdef WIN32
+		if (thread_handles == NULL)
+			thread_handles = pg_malloc(user_opts.jobs * sizeof(HANDLE));
+
+		if (transfer_thread_args == NULL)
+		{
+			int			i;
+
+			transfer_thread_args = pg_malloc(user_opts.jobs * sizeof(transfer_thread_arg *));
+
+			/*
+			 * For safety and performance, we keep the args allocated during
+			 * the entire life of the process, and we don't free the args in a
+			 * thread different from the one that allocated it.
+			 */
+			for (i = 0; i < user_opts.jobs; i++)
+				transfer_thread_args[i] = pg_malloc0(sizeof(transfer_thread_arg));
+		}
+
+		cur_thread_args = (void **) transfer_thread_args;
+#endif
+		/* harvest any dead children */
+		while (reap_child(false) == true)
+			;
+
+		/* must we wait for a dead child? */
+		if (parallel_jobs >= user_opts.jobs)
+			reap_child(true);
+
+		/* set this before we start the job */
+		parallel_jobs++;
+
+		/* Ensure stdio state is quiesced before forking */
+		fflush(NULL);
+
+#ifndef WIN32
+		child = fork();
+		if (child == 0)
+		{
+			transfer_all_new_dbs(old_db_arr, new_db_arr, old_pgdata, new_pgdata,
+								 old_tablespace);
+			/* if we take another exit path, it will be non-zero */
+			/* use _exit to skip atexit() functions */
+			_exit(0);
+		}
+		else if (child < 0)
+			/* fork failed */
+			pg_fatal("could not create worker process: %s\n", strerror(errno));
+#else
+		/* empty array elements are always at the end */
+		new_arg = transfer_thread_args[parallel_jobs - 1];
+
+		/* Can only pass one pointer into the function, so use a struct */
+		new_arg->old_db_arr = old_db_arr;
+		new_arg->new_db_arr = new_db_arr;
+		if (new_arg->old_pgdata)
+			pg_free(new_arg->old_pgdata);
+		new_arg->old_pgdata = pg_strdup(old_pgdata);
+		if (new_arg->new_pgdata)
+			pg_free(new_arg->new_pgdata);
+		new_arg->new_pgdata = pg_strdup(new_pgdata);
+		if (new_arg->old_tablespace)
+			pg_free(new_arg->old_tablespace);
+		new_arg->old_tablespace = old_tablespace ? pg_strdup(old_tablespace) : NULL;
+
+		child = (HANDLE) _beginthreadex(NULL, 0, (void *) win32_transfer_all_new_dbs,
+										new_arg, 0, NULL);
+		if (child == 0)
+			pg_fatal("could not create worker thread: %s\n", strerror(errno));
+
+		thread_handles[parallel_jobs - 1] = child;
+#endif
+	}
+
+	return;
+}
+
+
+#ifdef WIN32
+DWORD
+win32_transfer_all_new_dbs(transfer_thread_arg *args)
+{
+	transfer_all_new_dbs(args->old_db_arr, args->new_db_arr, args->old_pgdata,
+						 args->new_pgdata, args->old_tablespace);
+
+	/* terminates thread */
+	return 0;
+}
+#endif
+
+
+/*
+ *	collect status from a completed worker child
+ */
+bool
+reap_child(bool wait_for_child)
+{
+#ifndef WIN32
+	int			work_status;
+	int			ret;
+#else
+	int			thread_num;
+	DWORD		res;
+#endif
+
+	if (user_opts.jobs <= 1 || parallel_jobs == 0)
+		return false;
+
+#ifndef WIN32
+	ret = waitpid(-1, &work_status, wait_for_child ? 0 : WNOHANG);
+
+	/* no children or, for WNOHANG, no dead children */
+	if (ret <= 0 || !WIFEXITED(work_status))
+		return false;
+
+	if (WEXITSTATUS(work_status) != 0)
+		pg_fatal("child worker exited abnormally: %s\n", strerror(errno));
+#else
+	/* wait for one to finish */
+	thread_num = WaitForMultipleObjects(parallel_jobs, thread_handles,
+										false, wait_for_child ? INFINITE : 0);
+
+	if (thread_num == WAIT_TIMEOUT || thread_num == WAIT_FAILED)
+		return false;
+
+	/* compute thread index in thread_handles */
+	thread_num -= WAIT_OBJECT_0;
+
+	/* get the result */
+	GetExitCodeThread(thread_handles[thread_num], &res);
+	if (res != 0)
+		pg_fatal("child worker exited abnormally: %s\n", strerror(errno));
+
+	/* dispose of handle to stop leaks */
+	CloseHandle(thread_handles[thread_num]);
+
+	/* Move last slot into dead child's position */
+	if (thread_num != parallel_jobs - 1)
+	{
+		void	   *tmp_args;
+
+		thread_handles[thread_num] = thread_handles[parallel_jobs - 1];
+
+		/*
+		 * Move last active thread arg struct into the now-dead slot, and the
+		 * now-dead slot to the end for reuse by the next thread. Though the
+		 * thread struct is in use by another thread, we can safely swap the
+		 * struct pointers within the array.
+		 */
+		tmp_args = cur_thread_args[thread_num];
+		cur_thread_args[thread_num] = cur_thread_args[parallel_jobs - 1];
+		cur_thread_args[parallel_jobs - 1] = tmp_args;
+	}
+#endif
+
+	/* do this after job has been removed */
+	parallel_jobs--;
+
+	return true;
+}
diff --git a/src/bin/pg_upgrade/pg_upgrade.c b/src/bin/pg_upgrade/pg_upgrade.c
new file mode 100644
index 0000000..fbccc2e
--- /dev/null
+++ b/src/bin/pg_upgrade/pg_upgrade.c
@@ -0,0 +1,616 @@
+/*
+ *	pg_upgrade.c
+ *
+ *	main source file
+ *
+ *	Copyright (c) 2010-2015, PostgreSQL Global Development Group
+ *	src/bin/pg_upgrade/pg_upgrade.c
+ */
+
+/*
+ *	To simplify the upgrade process, we force certain system values to be
+ *	identical between old and new clusters:
+ *
+ *	We control all assignments of pg_class.oid (and relfilenode) so toast
+ *	oids are the same between old and new clusters.  This is important
+ *	because toast oids are stored as toast pointers in user tables.
+ *
+ *	While pg_class.oid and pg_class.relfilenode are initially the same
+ *	in a cluster, they can diverge due to CLUSTER, REINDEX, or VACUUM
+ *	FULL.  In the new cluster, pg_class.oid and pg_class.relfilenode will
+ *	be the same and will match the old pg_class.oid value.  Because of
+ *	this, old/new pg_class.relfilenode values will not match if CLUSTER,
+ *	REINDEX, or VACUUM FULL have been performed in the old cluster.
+ *
+ *	We control all assignments of pg_type.oid because these oids are stored
+ *	in user composite type values.
+ *
+ *	We control all assignments of pg_enum.oid because these oids are stored
+ *	in user tables as enum values.
+ *
+ *	We control all assignments of pg_authid.oid because these oids are stored
+ *	in pg_largeobject_metadata.
+ */
+
+
+
+#include "postgres_fe.h"
+
+#include "pg_upgrade.h"
+#include "common/restricted_token.h"
+
+#ifdef HAVE_LANGINFO_H
+#include <langinfo.h>
+#endif
+
+static void prepare_new_cluster(void);
+static void prepare_new_databases(void);
+static void create_new_objects(void);
+static void copy_clog_xlog_xid(void);
+static void set_frozenxids(bool minmxid_only);
+static void setup(char *argv0, bool *live_check);
+static void cleanup(void);
+
+ClusterInfo old_cluster,
+			new_cluster;
+OSInfo		os_info;
+
+char	   *output_files[] = {
+	SERVER_LOG_FILE,
+#ifdef WIN32
+	/* unique file for pg_ctl start */
+	SERVER_START_LOG_FILE,
+#endif
+	UTILITY_LOG_FILE,
+	INTERNAL_LOG_FILE,
+	NULL
+};
+
+
+int
+main(int argc, char **argv)
+{
+	char	   *analyze_script_file_name = NULL;
+	char	   *deletion_script_file_name = NULL;
+	bool		live_check = false;
+
+	parseCommandLine(argc, argv);
+
+	get_restricted_token(os_info.progname);
+
+	adjust_data_dir(&old_cluster);
+	adjust_data_dir(&new_cluster);
+
+	setup(argv[0], &live_check);
+
+	output_check_banner(live_check);
+
+	check_cluster_versions();
+
+	get_sock_dir(&old_cluster, live_check);
+	get_sock_dir(&new_cluster, false);
+
+	check_cluster_compatibility(live_check);
+
+	check_and_dump_old_cluster(live_check);
+
+
+	/* -- NEW -- */
+	start_postmaster(&new_cluster, true);
+
+	check_new_cluster();
+	report_clusters_compatible();
+
+	pg_log(PG_REPORT, "\nPerforming Upgrade\n");
+	pg_log(PG_REPORT, "------------------\n");
+
+	prepare_new_cluster();
+
+	stop_postmaster(false);
+
+	/*
+	 * Destructive Changes to New Cluster
+	 */
+
+	copy_clog_xlog_xid();
+
+	/* New cluster is now using xids of the old system */
+
+	/* -- NEW -- */
+	start_postmaster(&new_cluster, true);
+
+	prepare_new_databases();
+
+	create_new_objects();
+
+	stop_postmaster(false);
+
+	/*
+	 * Most failures happen in create_new_objects(), which has completed at
+	 * this point.  We do this here because it is just before linking, which
+	 * will link the old and new cluster data files, preventing the old
+	 * cluster from being safely started once the new cluster is started.
+	 */
+	if (user_opts.transfer_mode == TRANSFER_MODE_LINK)
+		disable_old_cluster();
+
+	transfer_all_new_tablespaces(&old_cluster.dbarr, &new_cluster.dbarr,
+								 old_cluster.pgdata, new_cluster.pgdata);
+
+	/*
+	 * Assuming OIDs are only used in system tables, there is no need to
+	 * restore the OID counter because we have not transferred any OIDs from
+	 * the old system, but we do it anyway just in case.  We do it late here
+	 * because there is no need to have the schema load use new oids.
+	 */
+	prep_status("Setting next OID for new cluster");
+	exec_prog(UTILITY_LOG_FILE, NULL, true,
+			  "\"%s/pg_resetxlog\" -o %u \"%s\"",
+			  new_cluster.bindir, old_cluster.controldata.chkpnt_nxtoid,
+			  new_cluster.pgdata);
+	check_ok();
+
+	prep_status("Sync data directory to disk");
+	exec_prog(UTILITY_LOG_FILE, NULL, true,
+			  "\"%s/initdb\" --sync-only \"%s\"", new_cluster.bindir,
+			  new_cluster.pgdata);
+	check_ok();
+
+	create_script_for_cluster_analyze(&analyze_script_file_name);
+	create_script_for_old_cluster_deletion(&deletion_script_file_name);
+
+	issue_warnings();
+
+	pg_log(PG_REPORT, "\nUpgrade Complete\n");
+	pg_log(PG_REPORT, "----------------\n");
+
+	output_completion_banner(analyze_script_file_name,
+							 deletion_script_file_name);
+
+	pg_free(analyze_script_file_name);
+	pg_free(deletion_script_file_name);
+
+	cleanup();
+
+	return 0;
+}
+
+
+static void
+setup(char *argv0, bool *live_check)
+{
+	char		exec_path[MAXPGPATH];	/* full path to my executable */
+
+	/*
+	 * make sure the user has a clean environment, otherwise, we may confuse
+	 * libpq when we connect to one (or both) of the servers.
+	 */
+	check_pghost_envvar();
+
+	verify_directories();
+
+	/* no postmasters should be running, except for a live check */
+	if (pid_lock_file_exists(old_cluster.pgdata))
+	{
+		/*
+		 * If we have a postmaster.pid file, try to start the server.  If it
+		 * starts, the pid file was stale, so stop the server.  If it doesn't
+		 * start, assume the server is running.  If the pid file is left over
+		 * from a server crash, this also allows any committed transactions
+		 * stored in the WAL to be replayed so they are not lost, because WAL
+		 * files are not transferred from old to new servers.
+		 */
+		if (start_postmaster(&old_cluster, false))
+			stop_postmaster(false);
+		else
+		{
+			if (!user_opts.check)
+				pg_fatal("There seems to be a postmaster servicing the old cluster.\n"
+						 "Please shutdown that postmaster and try again.\n");
+			else
+				*live_check = true;
+		}
+	}
+
+	/* same goes for the new postmaster */
+	if (pid_lock_file_exists(new_cluster.pgdata))
+	{
+		if (start_postmaster(&new_cluster, false))
+			stop_postmaster(false);
+		else
+			pg_fatal("There seems to be a postmaster servicing the new cluster.\n"
+					 "Please shutdown that postmaster and try again.\n");
+	}
+
+	/* get path to pg_upgrade executable */
+	if (find_my_exec(argv0, exec_path) < 0)
+		pg_fatal("Could not get path name to pg_upgrade: %s\n", getErrorText(errno));
+
+	/* Trim off program name and keep just path */
+	*last_dir_separator(exec_path) = '\0';
+	canonicalize_path(exec_path);
+	os_info.exec_path = pg_strdup(exec_path);
+}
+
+
+static void
+prepare_new_cluster(void)
+{
+	/*
+	 * It would make more sense to freeze after loading the schema, but that
+	 * would cause us to lose the frozenxids restored by the load. We use
+	 * --analyze so autovacuum doesn't update statistics later.
+	 */
+	prep_status("Analyzing all rows in the new cluster");
+	exec_prog(UTILITY_LOG_FILE, NULL, true,
+			  "\"%s/vacuumdb\" %s --all --analyze %s",
+			  new_cluster.bindir, cluster_conn_opts(&new_cluster),
+			  log_opts.verbose ? "--verbose" : "");
+	check_ok();
+
+	/*
+	 * We do freeze after analyze so pg_statistic is also frozen. template0 is
+	 * not frozen here, but data rows were frozen by initdb, and we set its
+	 * datfrozenxid, relfrozenxids, and relminmxid later to match the new xid
+	 * counter.
+	 */
+	prep_status("Freezing all rows on the new cluster");
+	exec_prog(UTILITY_LOG_FILE, NULL, true,
+			  "\"%s/vacuumdb\" %s --all --freeze %s",
+			  new_cluster.bindir, cluster_conn_opts(&new_cluster),
+			  log_opts.verbose ? "--verbose" : "");
+	check_ok();
+
+	get_pg_database_relfilenode(&new_cluster);
+}
+
+
+static void
+prepare_new_databases(void)
+{
+	/*
+	 * We set autovacuum_freeze_max_age to its maximum value so autovacuum
+	 * does not launch here and delete clog files, before the frozen xids are
+	 * set.
+	 */
+
+	set_frozenxids(false);
+
+	prep_status("Restoring global objects in the new cluster");
+
+	/*
+	 * We have to create the databases first so we can install support
+	 * functions in all the other databases.  Ideally we could create the
+	 * support functions in template1 but pg_dumpall creates databases using
+	 * the template0 template.
+	 */
+	exec_prog(UTILITY_LOG_FILE, NULL, true,
+			  "\"%s/psql\" " EXEC_PSQL_ARGS " %s -f \"%s\"",
+			  new_cluster.bindir, cluster_conn_opts(&new_cluster),
+			  GLOBALS_DUMP_FILE);
+	check_ok();
+
+	/* we load this to get a current list of databases */
+	get_db_and_rel_infos(&new_cluster);
+}
+
+
+static void
+create_new_objects(void)
+{
+	int			dbnum;
+
+	prep_status("Restoring database schemas in the new cluster\n");
+
+	for (dbnum = 0; dbnum < old_cluster.dbarr.ndbs; dbnum++)
+	{
+		char		sql_file_name[MAXPGPATH],
+					log_file_name[MAXPGPATH];
+		DbInfo	   *old_db = &old_cluster.dbarr.dbs[dbnum];
+
+		pg_log(PG_STATUS, "%s", old_db->db_name);
+		snprintf(sql_file_name, sizeof(sql_file_name), DB_DUMP_FILE_MASK, old_db->db_oid);
+		snprintf(log_file_name, sizeof(log_file_name), DB_DUMP_LOG_FILE_MASK, old_db->db_oid);
+
+		/*
+		 * pg_dump only produces its output at the end, so there is little
+		 * parallelism if using the pipe.
+		 */
+		parallel_exec_prog(log_file_name,
+						   NULL,
+						   "\"%s/pg_restore\" %s --exit-on-error --verbose --dbname \"%s\" \"%s\"",
+						   new_cluster.bindir,
+						   cluster_conn_opts(&new_cluster),
+						   old_db->db_name,
+						   sql_file_name);
+	}
+
+	/* reap all children */
+	while (reap_child(true) == true)
+		;
+
+	end_progress_output();
+	check_ok();
+
+	/*
+	 * We don't have minmxids for databases or relations in pre-9.3
+	 * clusters, so set those after we have restored the schemas.
+	 */
+	if (GET_MAJOR_VERSION(old_cluster.major_version) < 903)
+		set_frozenxids(true);
+
+	optionally_create_toast_tables();
+
+	/* regenerate now that we have objects in the databases */
+	get_db_and_rel_infos(&new_cluster);
+}
+
+/*
+ * Delete the given subdirectory contents from the new cluster
+ */
+static void
+remove_new_subdir(char *subdir, bool rmtopdir)
+{
+	char		new_path[MAXPGPATH];
+
+	prep_status("Deleting files from new %s", subdir);
+
+	snprintf(new_path, sizeof(new_path), "%s/%s", new_cluster.pgdata, subdir);
+	if (!rmtree(new_path, rmtopdir))
+		pg_fatal("could not delete directory \"%s\"\n", new_path);
+
+	check_ok();
+}
+
+/*
+ * Copy the given subdirectory's files from the old cluster into the new cluster
+ */
+static void
+copy_subdir_files(char *subdir)
+{
+	char		old_path[MAXPGPATH];
+	char		new_path[MAXPGPATH];
+
+	remove_new_subdir(subdir, true);
+
+	snprintf(old_path, sizeof(old_path), "%s/%s", old_cluster.pgdata, subdir);
+	snprintf(new_path, sizeof(new_path), "%s/%s", new_cluster.pgdata, subdir);
+
+	prep_status("Copying old %s to new server", subdir);
+
+	exec_prog(UTILITY_LOG_FILE, NULL, true,
+#ifndef WIN32
+			  "cp -Rf \"%s\" \"%s\"",
+#else
+	/* flags: everything, no confirm, quiet, overwrite read-only */
+			  "xcopy /e /y /q /r \"%s\" \"%s\\\"",
+#endif
+			  old_path, new_path);
+
+	check_ok();
+}
+
+static void
+copy_clog_xlog_xid(void)
+{
+	/* copy old commit logs to new data dir */
+	copy_subdir_files("pg_clog");
+
+	/* set the next transaction id and epoch of the new cluster */
+	prep_status("Setting next transaction ID and epoch for new cluster");
+	exec_prog(UTILITY_LOG_FILE, NULL, true,
+			  "\"%s/pg_resetxlog\" -f -x %u \"%s\"",
+			  new_cluster.bindir, old_cluster.controldata.chkpnt_nxtxid,
+			  new_cluster.pgdata);
+	exec_prog(UTILITY_LOG_FILE, NULL, true,
+			  "\"%s/pg_resetxlog\" -f -e %u \"%s\"",
+			  new_cluster.bindir, old_cluster.controldata.chkpnt_nxtepoch,
+			  new_cluster.pgdata);
+	/* must reset commit timestamp limits also */
+	exec_prog(UTILITY_LOG_FILE, NULL, true,
+			  "\"%s/pg_resetxlog\" -f -c %u,%u \"%s\"",
+			  new_cluster.bindir,
+			  old_cluster.controldata.chkpnt_nxtxid,
+			  old_cluster.controldata.chkpnt_nxtxid,
+			  new_cluster.pgdata);
+	check_ok();
+
+	/*
+	 * If the old server is before the MULTIXACT_FORMATCHANGE_CAT_VER change
+	 * (see pg_upgrade.h) and the new server is after, then we don't copy
+	 * pg_multixact files, but we need to reset pg_control so that the new
+	 * server doesn't attempt to read multis older than the cutoff value.
+	 */
+	if (old_cluster.controldata.cat_ver >= MULTIXACT_FORMATCHANGE_CAT_VER &&
+		new_cluster.controldata.cat_ver >= MULTIXACT_FORMATCHANGE_CAT_VER)
+	{
+		copy_subdir_files("pg_multixact/offsets");
+		copy_subdir_files("pg_multixact/members");
+
+		prep_status("Setting next multixact ID and offset for new cluster");
+
+		/*
+		 * we preserve all files and contents, so we must preserve both "next"
+		 * counters here and the oldest multi present on system.
+		 */
+		exec_prog(UTILITY_LOG_FILE, NULL, true,
+				  "\"%s/pg_resetxlog\" -O %u -m %u,%u \"%s\"",
+				  new_cluster.bindir,
+				  old_cluster.controldata.chkpnt_nxtmxoff,
+				  old_cluster.controldata.chkpnt_nxtmulti,
+				  old_cluster.controldata.chkpnt_oldstMulti,
+				  new_cluster.pgdata);
+		check_ok();
+	}
+	else if (new_cluster.controldata.cat_ver >= MULTIXACT_FORMATCHANGE_CAT_VER)
+	{
+		/*
+		 * Remove offsets/0000 file created by initdb that no longer matches
+		 * the new multi-xid value.  "members" starts at zero so no need to
+		 * remove it.
+		 */
+		remove_new_subdir("pg_multixact/offsets", false);
+
+		prep_status("Setting oldest multixact ID on new cluster");
+
+		/*
+		 * We don't preserve files in this case, but it's important that the
+		 * oldest multi is set to the latest value used by the old system, so
+		 * that multixact.c returns the empty set for multis that might be
+		 * present on disk.  We set next multi to the value following that; it
+		 * might end up wrapped around (i.e. 0) if the old cluster had
+		 * next=MaxMultiXactId, but multixact.c can cope with that just fine.
+		 */
+		exec_prog(UTILITY_LOG_FILE, NULL, true,
+				  "\"%s/pg_resetxlog\" -m %u,%u \"%s\"",
+				  new_cluster.bindir,
+				  old_cluster.controldata.chkpnt_nxtmulti + 1,
+				  old_cluster.controldata.chkpnt_nxtmulti,
+				  new_cluster.pgdata);
+		check_ok();
+	}
+
+	/* now reset the wal archives in the new cluster */
+	prep_status("Resetting WAL archives");
+	exec_prog(UTILITY_LOG_FILE, NULL, true,
+			  "\"%s/pg_resetxlog\" -l %s \"%s\"", new_cluster.bindir,
+			  old_cluster.controldata.nextxlogfile,
+			  new_cluster.pgdata);
+	check_ok();
+}
+
+
+/*
+ *	set_frozenxids()
+ *
+ *	We have frozen all xids, so set datfrozenxid, relfrozenxid, and
+ *	relminmxid to be the old cluster's xid counter, which we just set
+ *	in the new cluster.  User-table frozenxid and minmxid values will
+ *	be set by pg_dump --binary-upgrade, but objects not set by the pg_dump
+ *	must have proper frozen counters.
+ */
+static
+void
+set_frozenxids(bool minmxid_only)
+{
+	int			dbnum;
+	PGconn	   *conn,
+			   *conn_template1;
+	PGresult   *dbres;
+	int			ntups;
+	int			i_datname;
+	int			i_datallowconn;
+
+	if (!minmxid_only)
+		prep_status("Setting frozenxid and minmxid counters in new cluster");
+	else
+		prep_status("Setting minmxid counter in new cluster");
+
+	conn_template1 = connectToServer(&new_cluster, "template1");
+
+	if (!minmxid_only)
+		/* set pg_database.datfrozenxid */
+		PQclear(executeQueryOrDie(conn_template1,
+								  "UPDATE pg_catalog.pg_database "
+								  "SET	datfrozenxid = '%u'",
+								  old_cluster.controldata.chkpnt_nxtxid));
+
+	/* set pg_database.datminmxid */
+	PQclear(executeQueryOrDie(conn_template1,
+							  "UPDATE pg_catalog.pg_database "
+							  "SET	datminmxid = '%u'",
+							  old_cluster.controldata.chkpnt_nxtmulti));
+
+	/* get database names */
+	dbres = executeQueryOrDie(conn_template1,
+							  "SELECT	datname, datallowconn "
+							  "FROM	pg_catalog.pg_database");
+
+	i_datname = PQfnumber(dbres, "datname");
+	i_datallowconn = PQfnumber(dbres, "datallowconn");
+
+	ntups = PQntuples(dbres);
+	for (dbnum = 0; dbnum < ntups; dbnum++)
+	{
+		char	   *datname = PQgetvalue(dbres, dbnum, i_datname);
+		char	   *datallowconn = PQgetvalue(dbres, dbnum, i_datallowconn);
+
+		/*
+		 * We must update databases where datallowconn = false, e.g.
+		 * template0, because autovacuum increments their datfrozenxids,
+		 * relfrozenxids, and relminmxids even if autovacuum is turned off,
+		 * and even though all the data rows are already frozen.  To enable
+		 * this, we temporarily change datallowconn.
+		 */
+		if (strcmp(datallowconn, "f") == 0)
+			PQclear(executeQueryOrDie(conn_template1,
+								"ALTER DATABASE %s ALLOW_CONNECTIONS = true",
+									  quote_identifier(datname)));
+
+		conn = connectToServer(&new_cluster, datname);
+
+		if (!minmxid_only)
+			/* set pg_class.relfrozenxid */
+			PQclear(executeQueryOrDie(conn,
+									  "UPDATE	pg_catalog.pg_class "
+									  "SET	relfrozenxid = '%u' "
+			/* only heap, materialized view, and TOAST are vacuumed */
+									  "WHERE	relkind IN ('r', 'm', 't')",
+									  old_cluster.controldata.chkpnt_nxtxid));
+
+		/* set pg_class.relminmxid */
+		PQclear(executeQueryOrDie(conn,
+								  "UPDATE	pg_catalog.pg_class "
+								  "SET	relminmxid = '%u' "
+		/* only heap, materialized view, and TOAST are vacuumed */
+								  "WHERE	relkind IN ('r', 'm', 't')",
+								  old_cluster.controldata.chkpnt_nxtmulti));
+		PQfinish(conn);
+
+		/* Reset datallowconn flag */
+		if (strcmp(datallowconn, "f") == 0)
+			PQclear(executeQueryOrDie(conn_template1,
+							   "ALTER DATABASE %s ALLOW_CONNECTIONS = false",
+									  quote_identifier(datname)));
+	}
+
+	PQclear(dbres);
+
+	PQfinish(conn_template1);
+
+	check_ok();
+}
+
+
+static void
+cleanup(void)
+{
+	fclose(log_opts.internal);
+
+	/* Remove dump and log files? */
+	if (!log_opts.retain)
+	{
+		int			dbnum;
+		char	  **filename;
+
+		for (filename = output_files; *filename != NULL; filename++)
+			unlink(*filename);
+
+		/* remove dump files */
+		unlink(GLOBALS_DUMP_FILE);
+
+		if (old_cluster.dbarr.dbs)
+			for (dbnum = 0; dbnum < old_cluster.dbarr.ndbs; dbnum++)
+			{
+				char		sql_file_name[MAXPGPATH],
+							log_file_name[MAXPGPATH];
+				DbInfo	   *old_db = &old_cluster.dbarr.dbs[dbnum];
+
+				snprintf(sql_file_name, sizeof(sql_file_name), DB_DUMP_FILE_MASK, old_db->db_oid);
+				unlink(sql_file_name);
+
+				snprintf(log_file_name, sizeof(log_file_name), DB_DUMP_LOG_FILE_MASK, old_db->db_oid);
+				unlink(log_file_name);
+			}
+	}
+}
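
For anyone reading cleanup() above: the per-database dump and log files it unlinks are named by expanding the two masks from pg_upgrade.h with the old database's OID. Below is a minimal standalone sketch of that expansion. format_dump_name() is a hypothetical helper written only for illustration (it is not part of this patch), and the mask values are copied from the header so the snippet compiles on its own.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Mask values copied from pg_upgrade.h in this patch. */
#define DB_DUMP_FILE_MASK       "pg_upgrade_dump_%u.custom"
#define DB_DUMP_LOG_FILE_MASK   "pg_upgrade_dump_%u.log"

/*
 * Hypothetical helper, not part of pg_upgrade: expand one of the masks
 * for a database OID.  Returns 0 on success, or -1 if the result did not
 * fit in the buffer (cleanup() itself ignores truncation).
 */
static int
format_dump_name(char *buf, size_t buflen, const char *mask,
                 unsigned int db_oid)
{
    int         n;

    assert(buf != NULL && mask != NULL);
    n = snprintf(buf, buflen, mask, db_oid);
    return (n < 0 || (size_t) n >= buflen) ? -1 : 0;
}
```

The real code sizes its buffers with MAXPGPATH, so truncation is not a practical concern there; the return code is only here to make the sketch easy to check.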
diff --git a/src/bin/pg_upgrade/pg_upgrade.h b/src/bin/pg_upgrade/pg_upgrade.h
new file mode 100644
index 0000000..4683c6f
--- /dev/null
+++ b/src/bin/pg_upgrade/pg_upgrade.h
@@ -0,0 +1,481 @@
+/*
+ *	pg_upgrade.h
+ *
+ *	Copyright (c) 2010-2015, PostgreSQL Global Development Group
+ *	src/bin/pg_upgrade/pg_upgrade.h
+ */
+
+#include <unistd.h>
+#include <assert.h>
+#include <sys/stat.h>
+#include <sys/time.h>
+
+#include "libpq-fe.h"
+
+/* Use port in the private/dynamic port number range */
+#define DEF_PGUPORT			50432
+
+/* Allocate for null byte */
+#define USER_NAME_SIZE		128
+
+#define MAX_STRING			1024
+#define LINE_ALLOC			4096
+#define QUERY_ALLOC			8192
+
+#define MIGRATOR_API_VERSION	1
+
+#define MESSAGE_WIDTH		60
+
+#define GET_MAJOR_VERSION(v)	((v) / 100)
+
+/* contains both global db information and CREATE DATABASE commands */
+#define GLOBALS_DUMP_FILE	"pg_upgrade_dump_globals.sql"
+#define DB_DUMP_FILE_MASK	"pg_upgrade_dump_%u.custom"
+
+#define DB_DUMP_LOG_FILE_MASK	"pg_upgrade_dump_%u.log"
+#define SERVER_LOG_FILE		"pg_upgrade_server.log"
+#define UTILITY_LOG_FILE	"pg_upgrade_utility.log"
+#define INTERNAL_LOG_FILE	"pg_upgrade_internal.log"
+
+extern char *output_files[];
+
+/*
+ * WIN32 files do not accept writes from multiple processes
+ *
+ * On Win32, we can't send both pg_upgrade output and command output to the
+ * same file because we get the error: "The process cannot access the file
+ * because it is being used by another process." so send the pg_ctl
+ * command-line output to a new file, rather than into the server log file.
+ * Ideally we could use UTILITY_LOG_FILE for this, but some Windows platforms
+ * keep the pg_ctl output file open by the running postmaster, even after
+ * pg_ctl exits.
+ *
+ * We could use the Windows pgwin32_open() flags to allow shared file
+ * writes but it is unclear how all other tools would use those flags, so
+ * we just avoid it and log a little differently on Windows;  we adjust
+ * the error message appropriately.
+ */
+#ifndef WIN32
+#define SERVER_START_LOG_FILE	SERVER_LOG_FILE
+#define SERVER_STOP_LOG_FILE	SERVER_LOG_FILE
+#else
+#define SERVER_START_LOG_FILE	"pg_upgrade_server_start.log"
+/*
+ *	"pg_ctl start" keeps SERVER_START_LOG_FILE and SERVER_LOG_FILE open
+ *	while the server is running, so we use UTILITY_LOG_FILE for "pg_ctl
+ *	stop".
+ */
+#define SERVER_STOP_LOG_FILE	UTILITY_LOG_FILE
+#endif
+
+
+#ifndef WIN32
+#define pg_copy_file		copy_file
+#define pg_mv_file			rename
+#define pg_link_file		link
+#define PATH_SEPARATOR		'/'
+#define RM_CMD				"rm -f"
+#define RMDIR_CMD			"rm -rf"
+#define SCRIPT_PREFIX		"./"
+#define SCRIPT_EXT			"sh"
+#define ECHO_QUOTE	"'"
+#define ECHO_BLANK	""
+#else
+#define pg_copy_file		CopyFile
+#define pg_mv_file			pgrename
+#define pg_link_file		win32_pghardlink
+#define PATH_SEPARATOR		'\\'
+#define RM_CMD				"DEL /q"
+#define RMDIR_CMD			"RMDIR /s/q"
+#define SCRIPT_PREFIX		""
+#define SCRIPT_EXT			"bat"
+#define EXE_EXT				".exe"
+#define ECHO_QUOTE	""
+#define ECHO_BLANK	"."
+#endif
+
+#define CLUSTER_NAME(cluster)	((cluster) == &old_cluster ? "old" : \
+								 (cluster) == &new_cluster ? "new" : "none")
+
+#define atooid(x)  ((Oid) strtoul((x), NULL, 10))
+
+/* OID system catalog preservation added during PG 9.0 development */
+#define TABLE_SPACE_SUBDIRS_CAT_VER 201001111
+/* postmaster/postgres -b (binary_upgrade) flag added during PG 9.1 development */
+#define BINARY_UPGRADE_SERVER_FLAG_CAT_VER 201104251
+/*
+ *	Visibility map changed with this 9.2 commit,
+ *	8f9fe6edce358f7904e0db119416b4d1080a83aa; pick later catalog version.
+ */
+#define VISIBILITY_MAP_CRASHSAFE_CAT_VER 201107031
+
+/*
+ * pg_multixact format changed in 9.3 commit 0ac5ad5134f2769ccbaefec73844f85,
+ * ("Improve concurrency of foreign key locking") which also updated catalog
+ * version to this value.  pg_upgrade behavior depends on whether old and new
+ * server versions are both newer than this, or only the new one is.
+ */
+#define MULTIXACT_FORMATCHANGE_CAT_VER 201301231
+
+/*
+ * large object chunk size added to pg_controldata,
+ * commit 5f93c37805e7485488480916b4585e098d3cc883
+ */
+#define LARGE_OBJECT_SIZE_PG_CONTROL_VER 942
+
+/*
+ * change in JSONB format during 9.4 beta
+ */
+#define JSONB_FORMAT_CHANGE_CAT_VER 201409291
+
+/*
+ * Each relation is represented by a relinfo structure.
+ */
+typedef struct
+{
+	/* Can't use NAMEDATALEN;  not guaranteed to fit on client */
+	char	   *nspname;		/* namespace name */
+	char	   *relname;		/* relation name */
+	Oid			reloid;			/* relation oid */
+	Oid			relfilenode;	/* relation relfile node */
+	/* relation tablespace path, or "" for the cluster default */
+	char	   *tablespace;
+	bool		nsp_alloc;
+	bool		tblsp_alloc;
+} RelInfo;
+
+typedef struct
+{
+	RelInfo    *rels;
+	int			nrels;
+} RelInfoArr;
+
+/*
+ * The following structure represents a relation mapping.
+ */
+typedef struct
+{
+	const char *old_tablespace;
+	const char *new_tablespace;
+	const char *old_tablespace_suffix;
+	const char *new_tablespace_suffix;
+	Oid			old_db_oid;
+	Oid			new_db_oid;
+
+	/*
+	 * old/new relfilenodes might differ for pg_largeobject(_metadata) indexes
+	 * due to VACUUM FULL or REINDEX.  Other relfilenodes are preserved.
+	 */
+	Oid			old_relfilenode;
+	Oid			new_relfilenode;
+	/* the rest are used only for logging and error reporting */
+	char	   *nspname;		/* namespaces */
+	char	   *relname;
+} FileNameMap;
+
+/*
+ * Structure to store database information
+ */
+typedef struct
+{
+	Oid			db_oid;			/* oid of the database */
+	char	   *db_name;		/* database name */
+	char		db_tablespace[MAXPGPATH];		/* database default tablespace
+												 * path */
+	char	   *db_collate;
+	char	   *db_ctype;
+	int			db_encoding;
+	RelInfoArr	rel_arr;		/* array of all user relinfos */
+} DbInfo;
+
+typedef struct
+{
+	DbInfo	   *dbs;			/* array of db infos */
+	int			ndbs;			/* number of db infos */
+} DbInfoArr;
+
+/*
+ * The following structure is used to hold pg_control information.
+ * Rather than using the backend's control structure we use our own
+ * structure to avoid pg_control version issues between releases.
+ */
+typedef struct
+{
+	uint32		ctrl_ver;
+	uint32		cat_ver;
+	char		nextxlogfile[25];
+	uint32		chkpnt_tli;
+	uint32		chkpnt_nxtxid;
+	uint32		chkpnt_nxtepoch;
+	uint32		chkpnt_nxtoid;
+	uint32		chkpnt_nxtmulti;
+	uint32		chkpnt_nxtmxoff;
+	uint32		chkpnt_oldstMulti;
+	uint32		align;
+	uint32		blocksz;
+	uint32		largesz;
+	uint32		walsz;
+	uint32		walseg;
+	uint32		ident;
+	uint32		index;
+	uint32		toast;
+	uint32		large_object;
+	bool		date_is_int;
+	bool		float8_pass_by_value;
+	bool		data_checksum_version;
+} ControlData;
+
+/*
+ * Enumeration to denote link modes
+ */
+typedef enum
+{
+	TRANSFER_MODE_COPY,
+	TRANSFER_MODE_LINK
+} transferMode;
+
+/*
+ * Enumeration to denote pg_log modes
+ */
+typedef enum
+{
+	PG_VERBOSE,
+	PG_STATUS,
+	PG_REPORT,
+	PG_WARNING,
+	PG_FATAL
+} eLogType;
+
+
+typedef long pgpid_t;
+
+
+/*
+ * cluster
+ *
+ *	information about each cluster
+ */
+typedef struct
+{
+	ControlData controldata;	/* pg_control information */
+	DbInfoArr	dbarr;			/* dbinfos array */
+	char	   *pgdata;			/* pathname for cluster's $PGDATA directory */
+	char	   *pgconfig;		/* pathname for cluster's config file
+								 * directory */
+	char	   *bindir;			/* pathname for cluster's executable directory */
+	char	   *pgopts;			/* options to pass to the server, like pg_ctl
+								 * -o */
+	char	   *sockdir;		/* directory for Unix Domain socket, if any */
+	unsigned short port;		/* port number where postmaster is waiting */
+	uint32		major_version;	/* PG_VERSION of cluster */
+	char		major_version_str[64];	/* string PG_VERSION of cluster */
+	uint32		bin_version;	/* version returned from pg_ctl */
+	Oid			pg_database_oid;	/* OID of pg_database relation */
+	const char *tablespace_suffix;		/* directory specification */
+} ClusterInfo;
+
+
+/*
+ *	LogOpts
+*/
+typedef struct
+{
+	FILE	   *internal;		/* internal log FILE */
+	bool		verbose;		/* TRUE -> be verbose in messages */
+	bool		retain;			/* retain log files on success */
+} LogOpts;
+
+
+/*
+ *	UserOpts
+*/
+typedef struct
+{
+	bool		check;			/* TRUE -> ask user for permission to make
+								 * changes */
+	transferMode transfer_mode; /* copy files or link them? */
+	int			jobs;
+} UserOpts;
+
+
+/*
+ * OSInfo
+ */
+typedef struct
+{
+	const char *progname;		/* complete pathname for this program */
+	char	   *exec_path;		/* full path to my executable */
+	char	   *user;			/* username for clusters */
+	bool		user_specified; /* user specified on command-line */
+	char	  **old_tablespaces;	/* tablespaces */
+	int			num_old_tablespaces;
+	char	  **libraries;		/* loadable libraries */
+	int			num_libraries;
+	ClusterInfo *running_cluster;
+} OSInfo;
+
+
+/*
+ * Global variables
+ */
+extern LogOpts log_opts;
+extern UserOpts user_opts;
+extern ClusterInfo old_cluster,
+			new_cluster;
+extern OSInfo os_info;
+
+
+/* check.c */
+
+void		output_check_banner(bool live_check);
+void check_and_dump_old_cluster(bool live_check);
+void		check_new_cluster(void);
+void		report_clusters_compatible(void);
+void		issue_warnings(void);
+void output_completion_banner(char *analyze_script_file_name,
+						 char *deletion_script_file_name);
+void		check_cluster_versions(void);
+void		check_cluster_compatibility(bool live_check);
+void		create_script_for_old_cluster_deletion(char **deletion_script_file_name);
+void		create_script_for_cluster_analyze(char **analyze_script_file_name);
+
+
+/* controldata.c */
+
+void		get_control_data(ClusterInfo *cluster, bool live_check);
+void		check_control_data(ControlData *oldctrl, ControlData *newctrl);
+void		disable_old_cluster(void);
+
+
+/* dump.c */
+
+void		generate_old_dump(void);
+void		optionally_create_toast_tables(void);
+
+
+/* exec.c */
+
+#define EXEC_PSQL_ARGS "--echo-queries --set ON_ERROR_STOP=on --no-psqlrc --dbname=template1"
+
+bool		exec_prog(const char *log_file, const char *opt_log_file,
+		  bool throw_error, const char *fmt,...) pg_attribute_printf(4, 5);
+void		verify_directories(void);
+bool		pid_lock_file_exists(const char *datadir);
+
+
+/* file.c */
+
+#ifdef PAGE_CONVERSION
+typedef const char *(*pluginStartup) (uint16 migratorVersion,
+								uint16 *pluginVersion, uint16 newPageVersion,
+								   uint16 oldPageVersion, void **pluginData);
+typedef const char *(*pluginConvertFile) (void *pluginData,
+								   const char *dstName, const char *srcName);
+typedef const char *(*pluginConvertPage) (void *pluginData,
+								   const char *dstPage, const char *srcPage);
+typedef const char *(*pluginShutdown) (void *pluginData);
+
+typedef struct
+{
+	uint16		oldPageVersion; /* Page layout version of the old cluster		*/
+	uint16		newPageVersion; /* Page layout version of the new cluster		*/
+	uint16		pluginVersion;	/* API version of converter plugin */
+	void	   *pluginData;		/* Plugin data (set by plugin) */
+	pluginStartup startup;		/* Pointer to plugin's startup function */
+	pluginConvertFile convertFile;		/* Pointer to plugin's file converter
+										 * function */
+	pluginConvertPage convertPage;		/* Pointer to plugin's page converter
+										 * function */
+	pluginShutdown shutdown;	/* Pointer to plugin's shutdown function */
+} pageCnvCtx;
+
+const pageCnvCtx *setupPageConverter(void);
+#else
+/* dummy */
+typedef void *pageCnvCtx;
+#endif
+
+const char *copyAndUpdateFile(pageCnvCtx *pageConverter, const char *src,
+				  const char *dst, bool force);
+const char *linkAndUpdateFile(pageCnvCtx *pageConverter, const char *src,
+				  const char *dst);
+
+void		check_hard_link(void);
+FILE	   *fopen_priv(const char *path, const char *mode);
+
+/* function.c */
+
+void		get_loadable_libraries(void);
+void		check_loadable_libraries(void);
+
+/* info.c */
+
+FileNameMap *gen_db_file_maps(DbInfo *old_db,
+				 DbInfo *new_db, int *nmaps, const char *old_pgdata,
+				 const char *new_pgdata);
+void		get_db_and_rel_infos(ClusterInfo *cluster);
+void print_maps(FileNameMap *maps, int n,
+		   const char *db_name);
+
+/* option.c */
+
+void		parseCommandLine(int argc, char *argv[]);
+void		adjust_data_dir(ClusterInfo *cluster);
+void		get_sock_dir(ClusterInfo *cluster, bool live_check);
+
+/* relfilenode.c */
+
+void		get_pg_database_relfilenode(ClusterInfo *cluster);
+void transfer_all_new_tablespaces(DbInfoArr *old_db_arr,
+				  DbInfoArr *new_db_arr, char *old_pgdata, char *new_pgdata);
+void transfer_all_new_dbs(DbInfoArr *old_db_arr,
+				   DbInfoArr *new_db_arr, char *old_pgdata, char *new_pgdata,
+					 char *old_tablespace);
+
+/* tablespace.c */
+
+void		init_tablespaces(void);
+
+
+/* server.c */
+
+PGconn	   *connectToServer(ClusterInfo *cluster, const char *db_name);
+PGresult   *executeQueryOrDie(PGconn *conn, const char *fmt,...) pg_attribute_printf(2, 3);
+
+char	   *cluster_conn_opts(ClusterInfo *cluster);
+
+bool		start_postmaster(ClusterInfo *cluster, bool throw_error);
+void		stop_postmaster(bool fast);
+uint32		get_major_server_version(ClusterInfo *cluster);
+void		check_pghost_envvar(void);
+
+
+/* util.c */
+
+char	   *quote_identifier(const char *s);
+int			get_user_info(char **user_name_p);
+void		check_ok(void);
+void		report_status(eLogType type, const char *fmt,...) pg_attribute_printf(2, 3);
+void		pg_log(eLogType type, const char *fmt,...) pg_attribute_printf(2, 3);
+void		pg_fatal(const char *fmt,...) pg_attribute_printf(1, 2) pg_attribute_noreturn();
+void		end_progress_output(void);
+void		prep_status(const char *fmt,...) pg_attribute_printf(1, 2);
+void		check_ok(void);
+const char *getErrorText(int errNum);
+unsigned int str2uint(const char *str);
+void		pg_putenv(const char *var, const char *val);
+
+
+/* version.c */
+
+void new_9_0_populate_pg_largeobject_metadata(ClusterInfo *cluster,
+										 bool check_mode);
+void old_9_3_check_for_line_data_type_usage(ClusterInfo *cluster);
+
+/* parallel.c */
+void parallel_exec_prog(const char *log_file, const char *opt_log_file,
+				   const char *fmt,...) pg_attribute_printf(3, 4);
+void parallel_transfer_all_new_dbs(DbInfoArr *old_db_arr, DbInfoArr *new_db_arr,
+							  char *old_pgdata, char *new_pgdata,
+							  char *old_tablespace);
+bool		reap_child(bool wait_for_child);
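
A note on the version numbers used with GET_MAJOR_VERSION() in this header: get_major_server_version() in server.c parses "major.minor" out of PG_VERSION and returns (100 * major + minor) * 100, so callers compare GET_MAJOR_VERSION(...) against values such as 804 or 903. The sketch below reproduces only that arithmetic. encode_major_version() is a hypothetical stand-in that takes the string directly instead of reading the file, and it returns 0 on a parse failure where the real function calls pg_fatal().

```c
#include <assert.h>
#include <stdio.h>

/* Same definition as in pg_upgrade.h. */
#define GET_MAJOR_VERSION(v)    ((v) / 100)

/*
 * Hypothetical helper mirroring the arithmetic in
 * get_major_server_version(): "9.4" becomes 90400.
 * Returns 0 if the string is not of the form "<int>.<int>".
 */
static unsigned int
encode_major_version(const char *version_str)
{
    int         integer_version = 0;
    int         fractional_version = 0;

    assert(version_str != NULL);
    if (sscanf(version_str, "%d.%d", &integer_version,
               &fractional_version) != 2)
        return 0;

    return (unsigned int) ((100 * integer_version + fractional_version) * 100);
}
```

With this encoding, a check like GET_MAJOR_VERSION(old_cluster.major_version) >= 804 in relfilenode.c reads as "old cluster is 8.4 or later".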
diff --git a/src/bin/pg_upgrade/relfilenode.c b/src/bin/pg_upgrade/relfilenode.c
new file mode 100644
index 0000000..fe05880
--- /dev/null
+++ b/src/bin/pg_upgrade/relfilenode.c
@@ -0,0 +1,294 @@
+/*
+ *	relfilenode.c
+ *
+ *	relfilenode functions
+ *
+ *	Copyright (c) 2010-2015, PostgreSQL Global Development Group
+ *	src/bin/pg_upgrade/relfilenode.c
+ */
+
+#include "postgres_fe.h"
+
+#include "pg_upgrade.h"
+
+#include "catalog/pg_class.h"
+#include "access/transam.h"
+
+
+static void transfer_single_new_db(pageCnvCtx *pageConverter,
+					   FileNameMap *maps, int size, char *old_tablespace);
+static void transfer_relfile(pageCnvCtx *pageConverter, FileNameMap *map,
+				 const char *type_suffix);
+
+
+/*
+ * transfer_all_new_tablespaces()
+ *
+ * Responsible for upgrading all databases.  Invokes routines to generate
+ * mappings and then physically copies or links the databases' files.
+ */
+void
+transfer_all_new_tablespaces(DbInfoArr *old_db_arr, DbInfoArr *new_db_arr,
+							 char *old_pgdata, char *new_pgdata)
+{
+	pg_log(PG_REPORT, "%s user relation files\n",
+	  user_opts.transfer_mode == TRANSFER_MODE_LINK ? "Linking" : "Copying");
+
+	/*
+	 * Transferring files by tablespace is tricky because a single database can
+	 * use multiple tablespaces.  For non-parallel mode, we just pass a NULL
+	 * tablespace path, which matches all tablespaces.  In parallel mode, we
+	 * pass the default tablespace and all user-created tablespaces and let
+	 * those operations happen in parallel.
+	 */
+	if (user_opts.jobs <= 1)
+		parallel_transfer_all_new_dbs(old_db_arr, new_db_arr, old_pgdata,
+									  new_pgdata, NULL);
+	else
+	{
+		int			tblnum;
+
+		/* transfer default tablespace */
+		parallel_transfer_all_new_dbs(old_db_arr, new_db_arr, old_pgdata,
+									  new_pgdata, old_pgdata);
+
+		for (tblnum = 0; tblnum < os_info.num_old_tablespaces; tblnum++)
+			parallel_transfer_all_new_dbs(old_db_arr,
+										  new_db_arr,
+										  old_pgdata,
+										  new_pgdata,
+										  os_info.old_tablespaces[tblnum]);
+		/* reap all children */
+		while (reap_child(true) == true)
+			;
+	}
+
+	end_progress_output();
+	check_ok();
+
+	return;
+}
+
+
+/*
+ * transfer_all_new_dbs()
+ *
+ * Responsible for upgrading all databases.  Invokes routines to generate
+ * mappings and then physically copies or links the databases' files.
+ */
+void
+transfer_all_new_dbs(DbInfoArr *old_db_arr, DbInfoArr *new_db_arr,
+					 char *old_pgdata, char *new_pgdata, char *old_tablespace)
+{
+	int			old_dbnum,
+				new_dbnum;
+
+	/* Scan the old cluster databases and transfer their files */
+	for (old_dbnum = new_dbnum = 0;
+		 old_dbnum < old_db_arr->ndbs;
+		 old_dbnum++, new_dbnum++)
+	{
+		DbInfo	   *old_db = &old_db_arr->dbs[old_dbnum],
+				   *new_db = NULL;
+		FileNameMap *mappings;
+		int			n_maps;
+		pageCnvCtx *pageConverter = NULL;
+
+		/*
+		 * Advance past any databases that exist in the new cluster but not in
+		 * the old, e.g. "postgres".  (The user might have removed the
+		 * 'postgres' database from the old cluster.)
+		 */
+		for (; new_dbnum < new_db_arr->ndbs; new_dbnum++)
+		{
+			new_db = &new_db_arr->dbs[new_dbnum];
+			if (strcmp(old_db->db_name, new_db->db_name) == 0)
+				break;
+		}
+
+		if (new_dbnum >= new_db_arr->ndbs)
+			pg_fatal("old database \"%s\" not found in the new cluster\n",
+					 old_db->db_name);
+
+		mappings = gen_db_file_maps(old_db, new_db, &n_maps, old_pgdata,
+									new_pgdata);
+		if (n_maps)
+		{
+			print_maps(mappings, n_maps, new_db->db_name);
+
+#ifdef PAGE_CONVERSION
+			pageConverter = setupPageConverter();
+#endif
+			transfer_single_new_db(pageConverter, mappings, n_maps,
+								   old_tablespace);
+		}
+		/* We allocate something even for n_maps == 0 */
+		pg_free(mappings);
+	}
+
+	return;
+}
+
+
+/*
+ * get_pg_database_relfilenode()
+ *
+ *	Retrieves the relfilenode for a few system-catalog tables.  We need these
+ *	relfilenodes later in the upgrade process.
+ */
+void
+get_pg_database_relfilenode(ClusterInfo *cluster)
+{
+	PGconn	   *conn = connectToServer(cluster, "template1");
+	PGresult   *res;
+	int			i_relfile;
+
+	res = executeQueryOrDie(conn,
+							"SELECT c.relname, c.relfilenode "
+							"FROM	pg_catalog.pg_class c, "
+							"		pg_catalog.pg_namespace n "
+							"WHERE	c.relnamespace = n.oid AND "
+							"		n.nspname = 'pg_catalog' AND "
+							"		c.relname = 'pg_database' "
+							"ORDER BY c.relname");
+
+	i_relfile = PQfnumber(res, "relfilenode");
+	cluster->pg_database_oid = atooid(PQgetvalue(res, 0, i_relfile));
+
+	PQclear(res);
+	PQfinish(conn);
+}
+
+
+/*
+ * transfer_single_new_db()
+ *
+ * create links for mappings stored in "maps" array.
+ */
+static void
+transfer_single_new_db(pageCnvCtx *pageConverter,
+					   FileNameMap *maps, int size, char *old_tablespace)
+{
+	int			mapnum;
+	bool		vm_crashsafe_match = true;
+
+	/*
+	 * Do the old and new clusters disagree on the crash safety of the vm
+	 * files?  If so, do not copy them.
+	 */
+	if (old_cluster.controldata.cat_ver < VISIBILITY_MAP_CRASHSAFE_CAT_VER &&
+		new_cluster.controldata.cat_ver >= VISIBILITY_MAP_CRASHSAFE_CAT_VER)
+		vm_crashsafe_match = false;
+
+	for (mapnum = 0; mapnum < size; mapnum++)
+	{
+		if (old_tablespace == NULL ||
+			strcmp(maps[mapnum].old_tablespace, old_tablespace) == 0)
+		{
+			/* transfer primary file */
+			transfer_relfile(pageConverter, &maps[mapnum], "");
+
+			/* fsm/vm files added in PG 8.4 */
+			if (GET_MAJOR_VERSION(old_cluster.major_version) >= 804)
+			{
+				/*
+				 * Copy/link any fsm and vm files, if they exist
+				 */
+				transfer_relfile(pageConverter, &maps[mapnum], "_fsm");
+				if (vm_crashsafe_match)
+					transfer_relfile(pageConverter, &maps[mapnum], "_vm");
+			}
+		}
+	}
+}
+
+
+/*
+ * transfer_relfile()
+ *
+ * Copy or link file from old cluster to new one.
+ */
+static void
+transfer_relfile(pageCnvCtx *pageConverter, FileNameMap *map,
+				 const char *type_suffix)
+{
+	const char *msg;
+	char		old_file[MAXPGPATH];
+	char		new_file[MAXPGPATH];
+	int			fd;
+	int			segno;
+	char		extent_suffix[65];
+
+	/*
+	 * Now copy/link any related segments as well.  Remember, PG breaks large
+	 * files into 1GB segments; the first segment has no extension, and
+	 * subsequent segments are named relfilenode.1, relfilenode.2, etc.
+	 */
+	for (segno = 0;; segno++)
+	{
+		if (segno == 0)
+			extent_suffix[0] = '\0';
+		else
+			snprintf(extent_suffix, sizeof(extent_suffix), ".%d", segno);
+
+		snprintf(old_file, sizeof(old_file), "%s%s/%u/%u%s%s",
+				 map->old_tablespace,
+				 map->old_tablespace_suffix,
+				 map->old_db_oid,
+				 map->old_relfilenode,
+				 type_suffix,
+				 extent_suffix);
+		snprintf(new_file, sizeof(new_file), "%s%s/%u/%u%s%s",
+				 map->new_tablespace,
+				 map->new_tablespace_suffix,
+				 map->new_db_oid,
+				 map->new_relfilenode,
+				 type_suffix,
+				 extent_suffix);
+
+		/* Is it an extent, fsm, or vm file? */
+		if (type_suffix[0] != '\0' || segno != 0)
+		{
+			/* Did file open fail? */
+			if ((fd = open(old_file, O_RDONLY, 0)) == -1)
+			{
+				/* File does not exist?  That's OK, just return */
+				if (errno == ENOENT)
+					return;
+				else
+					pg_fatal("error while checking for file existence \"%s.%s\" (\"%s\" to \"%s\"): %s\n",
+							 map->nspname, map->relname, old_file, new_file,
+							 getErrorText(errno));
+			}
+			close(fd);
+		}
+
+		unlink(new_file);
+
+		/* Copying files might take some time, so give feedback. */
+		pg_log(PG_STATUS, "%s", old_file);
+
+		if ((user_opts.transfer_mode == TRANSFER_MODE_LINK) && (pageConverter != NULL))
+			pg_fatal("This upgrade requires page-by-page conversion, "
+					 "you must use copy mode instead of link mode.\n");
+
+		if (user_opts.transfer_mode == TRANSFER_MODE_COPY)
+		{
+			pg_log(PG_VERBOSE, "copying \"%s\" to \"%s\"\n", old_file, new_file);
+
+			if ((msg = copyAndUpdateFile(pageConverter, old_file, new_file, true)) != NULL)
+				pg_fatal("error while copying relation \"%s.%s\" (\"%s\" to \"%s\"): %s\n",
+						 map->nspname, map->relname, old_file, new_file, msg);
+		}
+		else
+		{
+			pg_log(PG_VERBOSE, "linking \"%s\" to \"%s\"\n", old_file, new_file);
+
+			if ((msg = linkAndUpdateFile(pageConverter, old_file, new_file)) != NULL)
+				pg_fatal("error while creating link for relation \"%s.%s\" (\"%s\" to \"%s\"): %s\n",
+						 map->nspname, map->relname, old_file, new_file, msg);
+		}
+	}
+
+	return;
+}
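
To make the file naming in transfer_relfile() above easier to follow, here is a small standalone sketch of how one path is assembled from tablespace, tablespace suffix, database OID, relfilenode, fork suffix ("", "_fsm" or "_vm") and segment number. build_relfile_path() is a hypothetical helper, not part of this patch, and any paths and OIDs passed to it in an example are made up.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper, not part of pg_upgrade: build the name of one
 * relation file the same way transfer_relfile() does.  Segment 0 has no
 * numeric extension; later segments get ".1", ".2", and so on.
 * Returns 0 on success, or -1 if the path did not fit in the buffer.
 */
static int
build_relfile_path(char *buf, size_t buflen,
                   const char *tablespace, const char *tablespace_suffix,
                   unsigned int db_oid, unsigned int relfilenode,
                   const char *type_suffix, int segno)
{
    char        extent_suffix[65];
    int         n;

    assert(buf != NULL && segno >= 0);

    if (segno == 0)
        extent_suffix[0] = '\0';
    else
        snprintf(extent_suffix, sizeof(extent_suffix), ".%d", segno);

    n = snprintf(buf, buflen, "%s%s/%u/%u%s%s",
                 tablespace, tablespace_suffix,
                 db_oid, relfilenode,
                 type_suffix, extent_suffix);
    return (n < 0 || (size_t) n >= buflen) ? -1 : 0;
}
```

The loop in transfer_relfile() keeps incrementing segno and stops at the first extent, fsm, or vm file that does not exist (ENOENT). Only the primary file's first segment is assumed to be present.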
diff --git a/src/bin/pg_upgrade/server.c b/src/bin/pg_upgrade/server.c
new file mode 100644
index 0000000..8d8e7d7
--- /dev/null
+++ b/src/bin/pg_upgrade/server.c
@@ -0,0 +1,350 @@
+/*
+ *	server.c
+ *
+ *	database server functions
+ *
+ *	Copyright (c) 2010-2015, PostgreSQL Global Development Group
+ *	src/bin/pg_upgrade/server.c
+ */
+
+#include "postgres_fe.h"
+
+#include "pg_upgrade.h"
+
+
+static PGconn *get_db_conn(ClusterInfo *cluster, const char *db_name);
+
+
+/*
+ * connectToServer()
+ *
+ *	Connects to the desired database on the designated server.
+ *	If the connection attempt fails, this function logs an error
+ *	message and calls exit() to kill the program.
+ */
+PGconn *
+connectToServer(ClusterInfo *cluster, const char *db_name)
+{
+	PGconn	   *conn = get_db_conn(cluster, db_name);
+
+	if (conn == NULL || PQstatus(conn) != CONNECTION_OK)
+	{
+		pg_log(PG_REPORT, "connection to database failed: %s\n",
+			   PQerrorMessage(conn));
+
+		if (conn)
+			PQfinish(conn);
+
+		printf("Failure, exiting\n");
+		exit(1);
+	}
+
+	return conn;
+}
+
+
+/*
+ * get_db_conn()
+ *
+ * get database connection, using named database + standard params for cluster
+ */
+static PGconn *
+get_db_conn(ClusterInfo *cluster, const char *db_name)
+{
+	char		conn_opts[2 * NAMEDATALEN + MAXPGPATH + 100];
+
+	if (cluster->sockdir)
+		snprintf(conn_opts, sizeof(conn_opts),
+				 "dbname = '%s' user = '%s' host = '%s' port = %d",
+				 db_name, os_info.user, cluster->sockdir, cluster->port);
+	else
+		snprintf(conn_opts, sizeof(conn_opts),
+				 "dbname = '%s' user = '%s' port = %d",
+				 db_name, os_info.user, cluster->port);
+
+	return PQconnectdb(conn_opts);
+}
+
+
+/*
+ * cluster_conn_opts()
+ *
+ * Return standard command-line options for connecting to this cluster when
+ * using psql, pg_dump, etc.  Ideally this would match what get_db_conn()
+ * sets, but the utilities we need aren't very consistent about the treatment
+ * of database name options, so we leave that out.
+ *
+ * Note result is in static storage, so use it right away.
+ */
+char *
+cluster_conn_opts(ClusterInfo *cluster)
+{
+	static char conn_opts[MAXPGPATH + NAMEDATALEN + 100];
+
+	if (cluster->sockdir)
+		snprintf(conn_opts, sizeof(conn_opts),
+				 "--host \"%s\" --port %d --username \"%s\"",
+				 cluster->sockdir, cluster->port, os_info.user);
+	else
+		snprintf(conn_opts, sizeof(conn_opts),
+				 "--port %d --username \"%s\"",
+				 cluster->port, os_info.user);
+
+	return conn_opts;
+}
+
+
+/*
+ * executeQueryOrDie()
+ *
+ *	Formats a query string from the given arguments and executes the
+ *	resulting query.  If the query fails, this function logs an error
+ *	message and calls exit() to kill the program.
+ */
+PGresult *
+executeQueryOrDie(PGconn *conn, const char *fmt,...)
+{
+	static char query[QUERY_ALLOC];
+	va_list		args;
+	PGresult   *result;
+	ExecStatusType status;
+
+	va_start(args, fmt);
+	vsnprintf(query, sizeof(query), fmt, args);
+	va_end(args);
+
+	pg_log(PG_VERBOSE, "executing: %s\n", query);
+	result = PQexec(conn, query);
+	status = PQresultStatus(result);
+
+	if ((status != PGRES_TUPLES_OK) && (status != PGRES_COMMAND_OK))
+	{
+		pg_log(PG_REPORT, "SQL command failed\n%s\n%s\n", query,
+			   PQerrorMessage(conn));
+		PQclear(result);
+		PQfinish(conn);
+		printf("Failure, exiting\n");
+		exit(1);
+	}
+	else
+		return result;
+}
+
+
+/*
+ * get_major_server_version()
+ *
+ * gets the version (in unsigned int form) for the given datadir. Assumes
+ * that datadir is an absolute path to a valid pgdata directory. The version
+ * is retrieved by reading the PG_VERSION file.
+ */
+uint32
+get_major_server_version(ClusterInfo *cluster)
+{
+	FILE	   *version_fd;
+	char		ver_filename[MAXPGPATH];
+	int			integer_version = 0;
+	int			fractional_version = 0;
+
+	snprintf(ver_filename, sizeof(ver_filename), "%s/PG_VERSION",
+			 cluster->pgdata);
+	if ((version_fd = fopen(ver_filename, "r")) == NULL)
+		pg_fatal("could not open version file: %s\n", ver_filename);
+
+	if (fscanf(version_fd, "%63s", cluster->major_version_str) == 0 ||
+		sscanf(cluster->major_version_str, "%d.%d", &integer_version,
+			   &fractional_version) != 2)
+		pg_fatal("could not get version from %s\n", cluster->pgdata);
+
+	fclose(version_fd);
+
+	return (100 * integer_version + fractional_version) * 100;
+}
+
+
+static void
+stop_postmaster_atexit(void)
+{
+	stop_postmaster(true);
+}
+
+
+bool
+start_postmaster(ClusterInfo *cluster, bool throw_error)
+{
+	char		cmd[MAXPGPATH * 4 + 1000];
+	PGconn	   *conn;
+	static bool exit_hook_registered = false;
+	bool		pg_ctl_return = false;
+	char		socket_string[MAXPGPATH + 200];
+
+	if (!exit_hook_registered)
+	{
+		atexit(stop_postmaster_atexit);
+		exit_hook_registered = true;
+	}
+
+	socket_string[0] = '\0';
+
+#ifdef HAVE_UNIX_SOCKETS
+	/* prevent TCP/IP connections, restrict socket access */
+	strcat(socket_string,
+		   " -c listen_addresses='' -c unix_socket_permissions=0700");
+
+	/* Have a sockdir?	Tell the postmaster. */
+	if (cluster->sockdir)
+		snprintf(socket_string + strlen(socket_string),
+				 sizeof(socket_string) - strlen(socket_string),
+				 " -c %s='%s'",
+				 (GET_MAJOR_VERSION(cluster->major_version) < 903) ?
+				 "unix_socket_directory" : "unix_socket_directories",
+				 cluster->sockdir);
+#endif
+
+	/*
+	 * Since PG 9.1, we have used -b to disable autovacuum.  For earlier
+	 * releases, setting autovacuum=off disables cleanup vacuum and analyze,
+	 * but freeze vacuums can still happen, so we set autovacuum_freeze_max_age
+	 * to its maximum.  (autovacuum_multixact_freeze_max_age was introduced
+	 * after 9.1, so there is no need to set that.)  We assume all datfrozenxid
+	 * and relfrozenxid values are less than a gap of 2000000000 from the current
+	 * xid counter, so autovacuum will not touch them.
+	 *
+	 * Turn off durability requirements to improve object creation speed, and
+	 * we only modify the new cluster, so only use it there.  If there is a
+	 * crash, the new cluster has to be recreated anyway.  fsync=off is a big
+	 * win on ext4.
+	 */
+	snprintf(cmd, sizeof(cmd),
+		  "\"%s/pg_ctl\" -w -l \"%s\" -D \"%s\" -o \"-p %d%s%s %s%s\" start",
+		  cluster->bindir, SERVER_LOG_FILE, cluster->pgconfig, cluster->port,
+			 (cluster->controldata.cat_ver >=
+			  BINARY_UPGRADE_SERVER_FLAG_CAT_VER) ? " -b" :
+			 " -c autovacuum=off -c autovacuum_freeze_max_age=2000000000",
+			 (cluster == &new_cluster) ?
+	  " -c synchronous_commit=off -c fsync=off -c full_page_writes=off" : "",
+			 cluster->pgopts ? cluster->pgopts : "", socket_string);
+
+	/*
+	 * Don't throw an error right away, let connecting throw the error because
+	 * it might supply a reason for the failure.
+	 */
+	pg_ctl_return = exec_prog(SERVER_START_LOG_FILE,
+	/* pass both file names if they differ */
+							  (strcmp(SERVER_LOG_FILE,
+									  SERVER_START_LOG_FILE) != 0) ?
+							  SERVER_LOG_FILE : NULL,
+							  false,
+							  "%s", cmd);
+
+	/* Did it fail and we are just testing if the server could be started? */
+	if (!pg_ctl_return && !throw_error)
+		return false;
+
+	/*
+	 * We set this here to make sure atexit() shuts down the server, but only
+	 * if we started the server successfully.  We do it before checking for
+	 * connectivity in case the server started but there is a connectivity
+	 * failure.  If pg_ctl did not return success, we will exit below.
+	 *
+	 * Pre-9.1 servers do not have PQping(), so we could be leaving the server
+	 * running if authentication was misconfigured, so someday we might want
+	 * to be more aggressive about doing server shutdowns even if pg_ctl
+	 * fails, but now (2013-08-14) it seems prudent to be cautious.  We don't
+	 * want to shutdown a server that might have been accidentally started
+	 * during the upgrade.
+	 */
+	if (pg_ctl_return)
+		os_info.running_cluster = cluster;
+
+	/*
+	 * pg_ctl -w might have failed because the server couldn't be started, or
+	 * there might have been a connection problem in _checking_ if the server
+	 * has started.  Therefore, even if pg_ctl failed, we continue and test
+	 * for connectivity in case we get a connection reason for the failure.
+	 */
+	if ((conn = get_db_conn(cluster, "template1")) == NULL ||
+		PQstatus(conn) != CONNECTION_OK)
+	{
+		pg_log(PG_REPORT, "\nconnection to database failed: %s\n",
+			   PQerrorMessage(conn));
+		if (conn)
+			PQfinish(conn);
+		pg_fatal("could not connect to %s postmaster started with the command:\n"
+				 "%s\n",
+				 CLUSTER_NAME(cluster), cmd);
+	}
+	PQfinish(conn);
+
+	/*
+	 * If pg_ctl failed, and the connection didn't fail, and throw_error is
+	 * enabled, fail now.  This could happen if the server was already
+	 * running.
+	 */
+	if (!pg_ctl_return)
+		pg_fatal("pg_ctl failed to start the %s server, or connection failed\n",
+				 CLUSTER_NAME(cluster));
+
+	return true;
+}
+
+
+void
+stop_postmaster(bool fast)
+{
+	ClusterInfo *cluster;
+
+	if (os_info.running_cluster == &old_cluster)
+		cluster = &old_cluster;
+	else if (os_info.running_cluster == &new_cluster)
+		cluster = &new_cluster;
+	else
+		return;					/* no cluster running */
+
+	exec_prog(SERVER_STOP_LOG_FILE, NULL, !fast,
+			  "\"%s/pg_ctl\" -w -D \"%s\" -o \"%s\" %s stop",
+			  cluster->bindir, cluster->pgconfig,
+			  cluster->pgopts ? cluster->pgopts : "",
+			  fast ? "-m fast" : "");
+
+	os_info.running_cluster = NULL;
+}
+
+
+/*
+ * check_pghost_envvar()
+ *
+ * Tests that PGHOST does not point to a non-local server
+ */
+void
+check_pghost_envvar(void)
+{
+	PQconninfoOption *option;
+	PQconninfoOption *start;
+
+	/* Get valid libpq env vars from the PQconndefaults function */
+
+	start = PQconndefaults();
+
+	if (!start)
+		pg_fatal("out of memory\n");
+
+	for (option = start; option->keyword != NULL; option++)
+	{
+		if (option->envvar && (strcmp(option->envvar, "PGHOST") == 0 ||
+							   strcmp(option->envvar, "PGHOSTADDR") == 0))
+		{
+			const char *value = getenv(option->envvar);
+
+			if (value && strlen(value) > 0 &&
+			/* check for 'local' host values */
+				(strcmp(value, "localhost") != 0 && strcmp(value, "127.0.0.1") != 0 &&
+				 strcmp(value, "::1") != 0 && value[0] != '/'))
+				pg_fatal("libpq environment variable %s has a non-local server value: %s\n",
+						 option->envvar, value);
+		}
+	}
+
+	/* Free the memory that libpq allocated on our behalf */
+	PQconninfoFree(start);
+}
diff --git a/src/bin/pg_upgrade/tablespace.c b/src/bin/pg_upgrade/tablespace.c
new file mode 100644
index 0000000..ce7097e
--- /dev/null
+++ b/src/bin/pg_upgrade/tablespace.c
@@ -0,0 +1,124 @@
+/*
+ *	tablespace.c
+ *
+ *	tablespace functions
+ *
+ *	Copyright (c) 2010-2015, PostgreSQL Global Development Group
+ *	src/bin/pg_upgrade/tablespace.c
+ */
+
+#include "postgres_fe.h"
+
+#include "pg_upgrade.h"
+
+#include <sys/types.h>
+
+static void get_tablespace_paths(void);
+static void set_tablespace_directory_suffix(ClusterInfo *cluster);
+
+
+void
+init_tablespaces(void)
+{
+	get_tablespace_paths();
+
+	set_tablespace_directory_suffix(&old_cluster);
+	set_tablespace_directory_suffix(&new_cluster);
+
+	if (os_info.num_old_tablespaces > 0 &&
+	strcmp(old_cluster.tablespace_suffix, new_cluster.tablespace_suffix) == 0)
+		pg_fatal("Cannot upgrade to/from the same system catalog version when\n"
+				 "using tablespaces.\n");
+}
+
+
+/*
+ * get_tablespace_paths()
+ *
+ * Scans pg_tablespace and returns a malloc'ed array of all tablespace
+ * paths. It's the caller's responsibility to free the array.
+ */
+static void
+get_tablespace_paths(void)
+{
+	PGconn	   *conn = connectToServer(&old_cluster, "template1");
+	PGresult   *res;
+	int			tblnum;
+	int			i_spclocation;
+	char		query[QUERY_ALLOC];
+
+	snprintf(query, sizeof(query),
+			 "SELECT	%s "
+			 "FROM	pg_catalog.pg_tablespace "
+			 "WHERE	spcname != 'pg_default' AND "
+			 "		spcname != 'pg_global'",
+	/* 9.2 removed the spclocation column */
+			 (GET_MAJOR_VERSION(old_cluster.major_version) <= 901) ?
+	"spclocation" : "pg_catalog.pg_tablespace_location(oid) AS spclocation");
+
+	res = executeQueryOrDie(conn, "%s", query);
+
+	if ((os_info.num_old_tablespaces = PQntuples(res)) != 0)
+		os_info.old_tablespaces = (char **) pg_malloc(
+							   os_info.num_old_tablespaces * sizeof(char *));
+	else
+		os_info.old_tablespaces = NULL;
+
+	i_spclocation = PQfnumber(res, "spclocation");
+
+	for (tblnum = 0; tblnum < os_info.num_old_tablespaces; tblnum++)
+	{
+		struct stat statBuf;
+
+		os_info.old_tablespaces[tblnum] = pg_strdup(
+									 PQgetvalue(res, tblnum, i_spclocation));
+
+		/*
+		 * Check that the tablespace path exists and is a directory.
+		 * Effectively, this is checking only for tables/indexes in
+		 * non-existent tablespace directories.  Databases located in
+		 * non-existent tablespaces already throw a backend error.
+		 * Non-existent tablespace directories can occur when a data directory
+		 * that contains user tablespaces is moved as part of pg_upgrade
+		 * preparation and the symbolic links are not updated.
+		 */
+		if (stat(os_info.old_tablespaces[tblnum], &statBuf) != 0)
+		{
+			if (errno == ENOENT)
+				report_status(PG_FATAL,
+							  "tablespace directory \"%s\" does not exist\n",
+							  os_info.old_tablespaces[tblnum]);
+			else
+				report_status(PG_FATAL,
+						   "cannot stat() tablespace directory \"%s\": %s\n",
+					   os_info.old_tablespaces[tblnum], getErrorText(errno));
+		}
+		if (!S_ISDIR(statBuf.st_mode))
+			report_status(PG_FATAL,
+						  "tablespace path \"%s\" is not a directory\n",
+						  os_info.old_tablespaces[tblnum]);
+	}
+
+	PQclear(res);
+
+	PQfinish(conn);
+
+	return;
+}
+
+
+static void
+set_tablespace_directory_suffix(ClusterInfo *cluster)
+{
+	if (GET_MAJOR_VERSION(cluster->major_version) <= 804)
+		cluster->tablespace_suffix = pg_strdup("");
+	else
+	{
+		/* This cluster has a version-specific subdirectory */
+
+		/* The leading slash is needed to start a new directory. */
+		cluster->tablespace_suffix = psprintf("/PG_%s_%d",
+											  cluster->major_version_str,
+											  cluster->controldata.cat_ver);
+	}
+}
diff --git a/src/bin/pg_upgrade/test.sh b/src/bin/pg_upgrade/test.sh
new file mode 100644
index 0000000..0903f30
--- /dev/null
+++ b/src/bin/pg_upgrade/test.sh
@@ -0,0 +1,224 @@
+#!/bin/sh
+
+# src/bin/pg_upgrade/test.sh
+#
+# Test driver for pg_upgrade.  Initializes a new database cluster,
+# runs the regression tests (to put in some data), runs pg_dumpall,
+# runs pg_upgrade, runs pg_dumpall again, compares the dumps.
+#
+# Portions Copyright (c) 1996-2015, PostgreSQL Global Development Group
+# Portions Copyright (c) 1994, Regents of the University of California
+
+set -e
+
+: ${MAKE=make}
+
+# Guard against parallel make issues (see comments in pg_regress.c)
+unset MAKEFLAGS
+unset MAKELEVEL
+
+# Run a given "initdb" binary and overlay the regression testing
+# authentication configuration.
+standard_initdb() {
+	"$1" -N
+	../../test/regress/pg_regress --config-auth "$PGDATA"
+}
+
+# Establish how the server will listen for connections
+testhost=`uname -s`
+
+case $testhost in
+	MINGW*)
+		LISTEN_ADDRESSES="localhost"
+		PGHOST=localhost
+		;;
+	*)
+		LISTEN_ADDRESSES=""
+		# Select a socket directory.  The algorithm is from the "configure"
+		# script; the outcome mimics pg_regress.c:make_temp_sockdir().
+		PGHOST=$PG_REGRESS_SOCK_DIR
+		if [ "x$PGHOST" = x ]; then
+			{
+				dir=`(umask 077 &&
+					  mktemp -d /tmp/pg_upgrade_check-XXXXXX) 2>/dev/null` &&
+				[ -d "$dir" ]
+			} ||
+			{
+				dir=/tmp/pg_upgrade_check-$$-$RANDOM
+				(umask 077 && mkdir "$dir")
+			} ||
+			{
+				echo "could not create socket temporary directory in \"/tmp\""
+				exit 1
+			}
+
+			PGHOST=$dir
+			trap 'rm -rf "$PGHOST"' 0
+			trap 'exit 3' 1 2 13 15
+		fi
+		;;
+esac
+
+POSTMASTER_OPTS="-F -c listen_addresses=$LISTEN_ADDRESSES -k \"$PGHOST\""
+export PGHOST
+
+temp_root=$PWD/tmp_check
+
+if [ "$1" = '--install' ]; then
+	temp_install=$temp_root/install
+	bindir=$temp_install/$bindir
+	libdir=$temp_install/$libdir
+
+	"$MAKE" -s -C ../.. install DESTDIR="$temp_install"
+	"$MAKE" -s -C . install DESTDIR="$temp_install"
+
+	# platform-specific magic to find the shared libraries; see pg_regress.c
+	LD_LIBRARY_PATH=$libdir:$LD_LIBRARY_PATH
+	export LD_LIBRARY_PATH
+	DYLD_LIBRARY_PATH=$libdir:$DYLD_LIBRARY_PATH
+	export DYLD_LIBRARY_PATH
+	LIBPATH=$libdir:$LIBPATH
+	export LIBPATH
+	PATH=$libdir:$PATH
+
+	# We need to make it use psql from our temporary installation,
+	# because otherwise the installcheck run below would try to
+	# use psql from the proper installation directory, which might
+	# be outdated or missing. But don't override anything else that's
+	# already in EXTRA_REGRESS_OPTS.
+	EXTRA_REGRESS_OPTS="$EXTRA_REGRESS_OPTS --psqldir='$bindir'"
+	export EXTRA_REGRESS_OPTS
+fi
+
+: ${oldbindir=$bindir}
+
+: ${oldsrc=../../..}
+oldsrc=`cd "$oldsrc" && pwd`
+newsrc=`cd ../../.. && pwd`
+
+PATH=$bindir:$PATH
+export PATH
+
+BASE_PGDATA=$temp_root/data
+PGDATA="$BASE_PGDATA.old"
+export PGDATA
+rm -rf "$BASE_PGDATA" "$PGDATA"
+
+logdir=$PWD/log
+rm -rf "$logdir"
+mkdir "$logdir"
+
+# Clear out any environment vars that might cause libpq to connect to
+# the wrong postmaster (cf pg_regress.c)
+#
+# Some shells, such as NetBSD's, return non-zero from unset if the variable
+# is already unset. Since we are operating under 'set -e', this causes the
+# script to fail. To guard against this, set them all to an empty string first.
+PGDATABASE="";        unset PGDATABASE
+PGUSER="";            unset PGUSER
+PGSERVICE="";         unset PGSERVICE
+PGSSLMODE="";         unset PGSSLMODE
+PGREQUIRESSL="";      unset PGREQUIRESSL
+PGCONNECT_TIMEOUT=""; unset PGCONNECT_TIMEOUT
+PGHOSTADDR="";        unset PGHOSTADDR
+
+# Select a non-conflicting port number, similarly to pg_regress.c
+PG_VERSION_NUM=`grep '#define PG_VERSION_NUM' "$newsrc"/src/include/pg_config.h | awk '{print $3}'`
+PGPORT=`expr $PG_VERSION_NUM % 16384 + 49152`
+export PGPORT
+
+i=0
+while psql -X postgres </dev/null 2>/dev/null
+do
+	i=`expr $i + 1`
+	if [ $i -eq 16 ]
+	then
+		echo port $PGPORT apparently in use
+		exit 1
+	fi
+	PGPORT=`expr $PGPORT + 1`
+	export PGPORT
+done
+
+# buildfarm may try to override port via EXTRA_REGRESS_OPTS ...
+EXTRA_REGRESS_OPTS="$EXTRA_REGRESS_OPTS --port=$PGPORT"
+export EXTRA_REGRESS_OPTS
+
+# enable echo so the user can see what is being executed
+set -x
+
+standard_initdb "$oldbindir"/initdb
+"$oldbindir"/pg_ctl start -l "$logdir/postmaster1.log" -o "$POSTMASTER_OPTS" -w
+if "$MAKE" -C "$oldsrc" installcheck; then
+	pg_dumpall -f "$temp_root"/dump1.sql || pg_dumpall1_status=$?
+	if [ "$newsrc" != "$oldsrc" ]; then
+		oldpgversion=`psql -A -t -d regression -c "SHOW server_version_num"`
+		fix_sql=""
+		case $oldpgversion in
+			804??)
+				fix_sql="UPDATE pg_proc SET probin = replace(probin::text, '$oldsrc', '$newsrc')::bytea WHERE probin LIKE '$oldsrc%'; DROP FUNCTION public.myfunc(integer);"
+				;;
+			900??)
+				fix_sql="SET bytea_output TO escape; UPDATE pg_proc SET probin = replace(probin::text, '$oldsrc', '$newsrc')::bytea WHERE probin LIKE '$oldsrc%';"
+				;;
+			901??)
+				fix_sql="UPDATE pg_proc SET probin = replace(probin, '$oldsrc', '$newsrc') WHERE probin LIKE '$oldsrc%';"
+				;;
+		esac
+		psql -d regression -c "$fix_sql;" || psql_fix_sql_status=$?
+
+		mv "$temp_root"/dump1.sql "$temp_root"/dump1.sql.orig
+		sed "s;$oldsrc;$newsrc;g" "$temp_root"/dump1.sql.orig >"$temp_root"/dump1.sql
+	fi
+else
+	make_installcheck_status=$?
+fi
+"$oldbindir"/pg_ctl -m fast stop
+if [ -n "$make_installcheck_status" ]; then
+	exit 1
+fi
+if [ -n "$psql_fix_sql_status" ]; then
+	exit 1
+fi
+if [ -n "$pg_dumpall1_status" ]; then
+	echo "pg_dumpall of pre-upgrade database cluster failed"
+	exit 1
+fi
+
+PGDATA=$BASE_PGDATA
+
+standard_initdb 'initdb'
+
+pg_upgrade $PG_UPGRADE_OPTS -d "${PGDATA}.old" -D "${PGDATA}" -b "$oldbindir" -B "$bindir" -p "$PGPORT" -P "$PGPORT"
+
+pg_ctl start -l "$logdir/postmaster2.log" -o "$POSTMASTER_OPTS" -w
+
+case $testhost in
+	MINGW*)	cmd /c analyze_new_cluster.bat ;;
+	*)		sh ./analyze_new_cluster.sh ;;
+esac
+
+pg_dumpall -f "$temp_root"/dump2.sql || pg_dumpall2_status=$?
+pg_ctl -m fast stop
+
+# no need to echo commands anymore
+set +x
+echo
+
+if [ -n "$pg_dumpall2_status" ]; then
+	echo "pg_dumpall of post-upgrade database cluster failed"
+	exit 1
+fi
+
+case $testhost in
+	MINGW*)	cmd /c delete_old_cluster.bat ;;
+	*)	    sh ./delete_old_cluster.sh ;;
+esac
+
+if diff -q "$temp_root"/dump1.sql "$temp_root"/dump2.sql; then
+	echo PASSED
+	exit 0
+else
+	echo "dumps were not identical"
+	exit 1
+fi
diff --git a/src/bin/pg_upgrade/util.c b/src/bin/pg_upgrade/util.c
new file mode 100644
index 0000000..7f328f0
--- /dev/null
+++ b/src/bin/pg_upgrade/util.c
@@ -0,0 +1,298 @@
+/*
+ *	util.c
+ *
+ *	utility functions
+ *
+ *	Copyright (c) 2010-2015, PostgreSQL Global Development Group
+ *	src/bin/pg_upgrade/util.c
+ */
+
+#include "postgres_fe.h"
+
+#include "common/username.h"
+#include "pg_upgrade.h"
+
+#include <signal.h>
+
+
+LogOpts		log_opts;
+
+static void pg_log_v(eLogType type, const char *fmt, va_list ap) pg_attribute_printf(2, 0);
+
+
+/*
+ * report_status()
+ *
+ *	Displays the result of an operation (ok, failed, error message,...)
+ */
+void
+report_status(eLogType type, const char *fmt,...)
+{
+	va_list		args;
+	char		message[MAX_STRING];
+
+	va_start(args, fmt);
+	vsnprintf(message, sizeof(message), fmt, args);
+	va_end(args);
+
+	pg_log(type, "%s\n", message);
+}
+
+
+/* force blank output for progress display */
+void
+end_progress_output(void)
+{
+	/*
+	 * In case nothing printed; pass a space so gcc doesn't complain about
+	 * empty format string.
+	 */
+	prep_status(" ");
+}
+
+
+/*
+ * prep_status
+ *
+ *	Displays a message that describes an operation we are about to begin.
+ *	We pad the message out to MESSAGE_WIDTH characters so that all of the "ok" and
+ *	"failed" indicators line up nicely.
+ *
+ *	A typical sequence would look like this:
+ *		prep_status("about to flarb the next %d files", fileCount );
+ *
+ *		if(( message = flarbFiles(fileCount)) == NULL)
+ *		  report_status(PG_REPORT, "ok" );
+ *		else
+ *		  pg_log(PG_FATAL, "failed - %s\n", message );
+ */
+void
+prep_status(const char *fmt,...)
+{
+	va_list		args;
+	char		message[MAX_STRING];
+
+	va_start(args, fmt);
+	vsnprintf(message, sizeof(message), fmt, args);
+	va_end(args);
+
+	if (strlen(message) > 0 && message[strlen(message) - 1] == '\n')
+		pg_log(PG_REPORT, "%s", message);
+	else
+		/* pad strings that don't end in a newline */
+		pg_log(PG_REPORT, "%-*s", MESSAGE_WIDTH, message);
+}
+
+
+static void
+pg_log_v(eLogType type, const char *fmt, va_list ap)
+{
+	char		message[QUERY_ALLOC];
+
+	vsnprintf(message, sizeof(message), fmt, ap);
+
+	/* PG_VERBOSE and PG_STATUS are only output in verbose mode */
+	/* fopen() on log_opts.internal might have failed, so check it */
+	if (((type != PG_VERBOSE && type != PG_STATUS) || log_opts.verbose) &&
+		log_opts.internal != NULL)
+	{
+		if (type == PG_STATUS)
+			/* status messages need two leading spaces and a newline */
+			fprintf(log_opts.internal, "  %s\n", message);
+		else
+			fprintf(log_opts.internal, "%s", message);
+		fflush(log_opts.internal);
+	}
+
+	switch (type)
+	{
+		case PG_VERBOSE:
+			if (log_opts.verbose)
+				printf("%s", _(message));
+			break;
+
+		case PG_STATUS:
+			/* for output to a display, do leading truncation and append \r */
+			if (isatty(fileno(stdout)))
+				/* -2 because we use a 2-space indent */
+				printf("  %s%-*.*s\r",
+				/* prefix with "..." if we do leading truncation */
+					   strlen(message) <= MESSAGE_WIDTH - 2 ? "" : "...",
+					   MESSAGE_WIDTH - 2, MESSAGE_WIDTH - 2,
+				/* optional leading truncation */
+					   strlen(message) <= MESSAGE_WIDTH - 2 ? message :
+					   message + strlen(message) - MESSAGE_WIDTH + 3 + 2);
+			else
+				printf("  %s\n", _(message));
+			break;
+
+		case PG_REPORT:
+		case PG_WARNING:
+			printf("%s", _(message));
+			break;
+
+		case PG_FATAL:
+			printf("\n%s", _(message));
+			printf("Failure, exiting\n");
+			exit(1);
+			break;
+
+		default:
+			break;
+	}
+	fflush(stdout);
+}
+
+
+void
+pg_log(eLogType type, const char *fmt,...)
+{
+	va_list		args;
+
+	va_start(args, fmt);
+	pg_log_v(type, fmt, args);
+	va_end(args);
+}
+
+
+void
+pg_fatal(const char *fmt,...)
+{
+	va_list		args;
+
+	va_start(args, fmt);
+	pg_log_v(PG_FATAL, fmt, args);
+	va_end(args);
+	printf("Failure, exiting\n");
+	exit(1);
+}
+
+
+void
+check_ok(void)
+{
+	/* all seems well */
+	report_status(PG_REPORT, "ok");
+	fflush(stdout);
+}
+
+
+/*
+ * quote_identifier()
+ *		Properly double-quote a SQL identifier.
+ *
+ * The result should be pg_free'd, but most callers don't bother because
+ * memory leakage is not a big deal in this program.
+ */
+char *
+quote_identifier(const char *s)
+{
+	char	   *result = pg_malloc(strlen(s) * 2 + 3);
+	char	   *r = result;
+
+	*r++ = '"';
+	while (*s)
+	{
+		if (*s == '"')
+			*r++ = *s;
+		*r++ = *s;
+		s++;
+	}
+	*r++ = '"';
+	*r++ = '\0';
+
+	return result;
+}
+
+
+/*
+ * get_user_info()
+ */
+int
+get_user_info(char **user_name_p)
+{
+	int			user_id;
+	const char *user_name;
+	char	   *errstr;
+
+#ifndef WIN32
+	user_id = geteuid();
+#else
+	user_id = 1;
+#endif
+
+	user_name = get_user_name(&errstr);
+	if (!user_name)
+		pg_fatal("%s\n", errstr);
+
+	/* make a copy */
+	*user_name_p = pg_strdup(user_name);
+
+	return user_id;
+}
+
+
+/*
+ * getErrorText()
+ *
+ *	Returns the text of the error message for the given error number
+ *
+ *	This feature is factored into a separate function because it is
+ *	system-dependent.
+ */
+const char *
+getErrorText(int errNum)
+{
+#ifdef WIN32
+	_dosmaperr(GetLastError());
+#endif
+	return pg_strdup(strerror(errNum));
+}
+
+
+/*
+ *	str2uint()
+ *
+ *	convert string to oid
+ */
+unsigned int
+str2uint(const char *str)
+{
+	return strtoul(str, NULL, 10);
+}
+
+
+/*
+ *	pg_putenv()
+ *
+ *	This is like putenv(), but takes two arguments.
+ *	It also does unsetenv() if val is NULL.
+ */
+void
+pg_putenv(const char *var, const char *val)
+{
+	if (val)
+	{
+#ifndef WIN32
+		char	   *envstr;
+
+		envstr = psprintf("%s=%s", var, val);
+		putenv(envstr);
+
+		/*
+		 * Do not free envstr because it becomes part of the environment on
+		 * some operating systems.  See port/unsetenv.c::unsetenv.
+		 */
+#else
+		SetEnvironmentVariableA(var, val);
+#endif
+	}
+	else
+	{
+#ifndef WIN32
+		unsetenv(var);
+#else
+		SetEnvironmentVariableA(var, "");
+#endif
+	}
+}
diff --git a/src/bin/pg_upgrade/version.c b/src/bin/pg_upgrade/version.c
new file mode 100644
index 0000000..e3e7387
--- /dev/null
+++ b/src/bin/pg_upgrade/version.c
@@ -0,0 +1,178 @@
+/*
+ *	version.c
+ *
+ *	Postgres-version-specific routines
+ *
+ *	Copyright (c) 2010-2015, PostgreSQL Global Development Group
+ *	src/bin/pg_upgrade/version.c
+ */
+
+#include "postgres_fe.h"
+
+#include "pg_upgrade.h"
+
+
+
+/*
+ * new_9_0_populate_pg_largeobject_metadata()
+ *	new >= 9.0, old <= 8.4
+ *	9.0 has a new pg_largeobject permission table
+ */
+void
+new_9_0_populate_pg_largeobject_metadata(ClusterInfo *cluster, bool check_mode)
+{
+	int			dbnum;
+	FILE	   *script = NULL;
+	bool		found = false;
+	char		output_path[MAXPGPATH];
+
+	prep_status("Checking for large objects");
+
+	snprintf(output_path, sizeof(output_path), "pg_largeobject.sql");
+
+	for (dbnum = 0; dbnum < cluster->dbarr.ndbs; dbnum++)
+	{
+		PGresult   *res;
+		int			i_count;
+		DbInfo	   *active_db = &cluster->dbarr.dbs[dbnum];
+		PGconn	   *conn = connectToServer(cluster, active_db->db_name);
+
+		/* find if there are any large objects */
+		res = executeQueryOrDie(conn,
+								"SELECT count(*) "
+								"FROM	pg_catalog.pg_largeobject ");
+
+		i_count = PQfnumber(res, "count");
+		if (atoi(PQgetvalue(res, 0, i_count)) != 0)
+		{
+			found = true;
+			if (!check_mode)
+			{
+				if (script == NULL && (script = fopen_priv(output_path, "w")) == NULL)
+					pg_fatal("could not open file \"%s\": %s\n", output_path, getErrorText(errno));
+				fprintf(script, "\\connect %s\n",
+						quote_identifier(active_db->db_name));
+				fprintf(script,
+						"SELECT pg_catalog.lo_create(t.loid)\n"
+						"FROM (SELECT DISTINCT loid FROM pg_catalog.pg_largeobject) AS t;\n");
+			}
+		}
+
+		PQclear(res);
+		PQfinish(conn);
+	}
+
+	if (script)
+		fclose(script);
+
+	if (found)
+	{
+		report_status(PG_WARNING, "warning");
+		if (check_mode)
+			pg_log(PG_WARNING, "\n"
+				   "Your installation contains large objects.  The new database has an\n"
+				   "additional large object permission table.  After upgrading, you will be\n"
+				   "given a command to populate the pg_largeobject permission table with\n"
+				   "default permissions.\n\n");
+		else
+			pg_log(PG_WARNING, "\n"
+				   "Your installation contains large objects.  The new database has an\n"
+				   "additional large object permission table, so default permissions must be\n"
+				   "defined for all large objects.  The file\n"
+				   "    %s\n"
+				   "when executed by psql by the database superuser will set the default\n"
+				   "permissions.\n\n",
+				   output_path);
+	}
+	else
+		check_ok();
+}
+
+
+/*
+ * old_9_3_check_for_line_data_type_usage()
+ *	9.3 -> 9.4
+ *	Fully implement the 'line' data type in 9.4, which previously returned
+ *	"not enabled" by default and was only functionally enabled with a
+ *	compile-time switch;  9.4 "line" has different binary and text
+ *	representation formats;  checks tables and indexes.
+ */
+void
+old_9_3_check_for_line_data_type_usage(ClusterInfo *cluster)
+{
+	int			dbnum;
+	FILE	   *script = NULL;
+	bool		found = false;
+	char		output_path[MAXPGPATH];
+
+	prep_status("Checking for invalid \"line\" user columns");
+
+	snprintf(output_path, sizeof(output_path), "tables_using_line.txt");
+
+	for (dbnum = 0; dbnum < cluster->dbarr.ndbs; dbnum++)
+	{
+		PGresult   *res;
+		bool		db_used = false;
+		int			ntups;
+		int			rowno;
+		int			i_nspname,
+					i_relname,
+					i_attname;
+		DbInfo	   *active_db = &cluster->dbarr.dbs[dbnum];
+		PGconn	   *conn = connectToServer(cluster, active_db->db_name);
+
+		res = executeQueryOrDie(conn,
+								"SELECT n.nspname, c.relname, a.attname "
+								"FROM	pg_catalog.pg_class c, "
+								"		pg_catalog.pg_namespace n, "
+								"		pg_catalog.pg_attribute a "
+								"WHERE	c.oid = a.attrelid AND "
+								"		NOT a.attisdropped AND "
+								"		a.atttypid = 'pg_catalog.line'::pg_catalog.regtype AND "
+								"		c.relnamespace = n.oid AND "
+		/* exclude possible orphaned temp tables */
+								"		n.nspname !~ '^pg_temp_' AND "
+						 "		n.nspname !~ '^pg_toast_temp_' AND "
+								"		n.nspname NOT IN ('pg_catalog', 'information_schema')");
+
+		ntups = PQntuples(res);
+		i_nspname = PQfnumber(res, "nspname");
+		i_relname = PQfnumber(res, "relname");
+		i_attname = PQfnumber(res, "attname");
+		for (rowno = 0; rowno < ntups; rowno++)
+		{
+			found = true;
+			if (script == NULL && (script = fopen_priv(output_path, "w")) == NULL)
+				pg_fatal("could not open file \"%s\": %s\n", output_path, getErrorText(errno));
+			if (!db_used)
+			{
+				fprintf(script, "Database: %s\n", active_db->db_name);
+				db_used = true;
+			}
+			fprintf(script, "  %s.%s.%s\n",
+					PQgetvalue(res, rowno, i_nspname),
+					PQgetvalue(res, rowno, i_relname),
+					PQgetvalue(res, rowno, i_attname));
+		}
+
+		PQclear(res);
+
+		PQfinish(conn);
+	}
+
+	if (script)
+		fclose(script);
+
+	if (found)
+	{
+		pg_log(PG_REPORT, "fatal\n");
+		pg_fatal("Your installation contains the \"line\" data type in user tables.  This\n"
+		"data type changed its internal and input/output format between your old\n"
+				 "and new clusters so this cluster cannot currently be upgraded.  You can\n"
+		"remove the problem tables and restart the upgrade.  A list of the problem\n"
+				 "columns is in the file:\n"
+				 "    %s\n\n", output_path);
+	}
+	else
+		check_ok();
+}
diff --git a/src/common/Makefile b/src/common/Makefile
index c2e456d..c47445e 100644
--- a/src/common/Makefile
+++ b/src/common/Makefile
@@ -23,7 +23,7 @@ include $(top_builddir)/src/Makefile.global
 override CPPFLAGS := -DFRONTEND $(CPPFLAGS)
 LIBS += $(PTHREAD_LIBS)
 
-OBJS_COMMON = exec.o pg_crc.o pg_lzcompress.o pgfnames.o psprintf.o relpath.o \
+OBJS_COMMON = exec.o pg_lzcompress.o pgfnames.o psprintf.o relpath.o \
 	rmtree.o string.o username.o wait_error.o
 
 OBJS_FRONTEND = $(OBJS_COMMON) fe_memutils.o restricted_token.o
diff --git a/src/common/pg_crc.c b/src/common/pg_crc.c
deleted file mode 100644
index eba32d3..0000000
--- a/src/common/pg_crc.c
+++ /dev/null
@@ -1,1252 +0,0 @@
-/*-------------------------------------------------------------------------
- *
- * pg_crc.c
- *	  PostgreSQL CRC support
- *
- * See Ross Williams' excellent introduction
- * A PAINLESS GUIDE TO CRC ERROR DETECTION ALGORITHMS, available from
- * http://www.ross.net/crc/download/crc_v3.txt or several other net sites.
- *
- * Portions Copyright (c) 1996-2015, PostgreSQL Global Development Group
- * Portions Copyright (c) 1994, Regents of the University of California
- *
- *
- * IDENTIFICATION
- *	  src/common/pg_crc.c
- *
- *-------------------------------------------------------------------------
- */
-
-#include "c.h"
-
-#include "common/pg_crc.h"
-
-/* Accumulate one input byte */
-#ifdef WORDS_BIGENDIAN
-#define CRC8(x) pg_crc32c_table[0][((crc >> 24) ^ (x)) & 0xFF] ^ (crc << 8)
-#else
-#define CRC8(x) pg_crc32c_table[0][(crc ^ (x)) & 0xFF] ^ (crc >> 8)
-#endif
-
-/*
- * This function computes a CRC using the slicing-by-8 algorithm, which
- * uses an 8*256 lookup table to operate on eight bytes in parallel and
- * recombine the results.
- *
- * Michael E. Kounavis, Frank L. Berry,
- * "Novel Table Lookup-Based Algorithms for High-Performance CRC
- * Generation", IEEE Transactions on Computers, vol.57, no. 11,
- * pp. 1550-1560, November 2008, doi:10.1109/TC.2008.85
- */
-pg_crc32
-pg_comp_crc32c(pg_crc32 crc, const void *data, size_t len)
-{
-	const unsigned char *p = data;
-	const uint32 *p4;
-
-	/*
-	 * Handle 0-3 initial bytes one at a time, so that the loop below starts
-	 * with a pointer aligned to four bytes.
-	 */
-	while (len > 0 && ((uintptr_t) p & 3))
-	{
-		crc = CRC8(*p++);
-		len--;
-	}
-
-	/*
-	 * Process eight bytes of data at a time.
-	 */
-	p4 = (const uint32 *) p;
-	while (len >= 8)
-	{
-		uint32		a = *p4++ ^ crc;
-		uint32		b = *p4++;
-
-#ifdef WORDS_BIGENDIAN
-		const uint8 c0 = b;
-		const uint8 c1 = b >> 8;
-		const uint8 c2 = b >> 16;
-		const uint8 c3 = b >> 24;
-		const uint8 c4 = a;
-		const uint8 c5 = a >> 8;
-		const uint8 c6 = a >> 16;
-		const uint8 c7 = a >> 24;
-#else
-		const uint8 c0 = b >> 24;
-		const uint8 c1 = b >> 16;
-		const uint8 c2 = b >> 8;
-		const uint8 c3 = b;
-		const uint8 c4 = a >> 24;
-		const uint8 c5 = a >> 16;
-		const uint8 c6 = a >> 8;
-		const uint8 c7 = a;
-#endif
-
-		crc =
-			pg_crc32c_table[0][c0] ^ pg_crc32c_table[1][c1] ^
-			pg_crc32c_table[2][c2] ^ pg_crc32c_table[3][c3] ^
-			pg_crc32c_table[4][c4] ^ pg_crc32c_table[5][c5] ^
-			pg_crc32c_table[6][c6] ^ pg_crc32c_table[7][c7];
-
-		len -= 8;
-	}
-
-	/*
-	 * Handle any remaining bytes one at a time.
-	 */
-	p = (const unsigned char *) p4;
-	while (len > 0)
-	{
-		crc = CRC8(*p++);
-		len--;
-	}
-
-	return crc;
-}
-
-/*
- * Lookup tables for the slicing-by-8 algorithm, for the so-called Castagnoli
- * polynomial (the same that is used e.g. in iSCSI), 0x1EDC6F41. Using
- * Williams' terms, this is the "normal", not "reflected" version. However, on
- * big-endian systems the values in the tables are stored in byte-reversed
- * order (IOW, the tables are stored in little-endian order even on big-endian
- * systems).
- */
-const uint32 pg_crc32c_table[8][256] = {
-#ifndef WORDS_BIGENDIAN
-	{
-		0x00000000, 0xF26B8303, 0xE13B70F7, 0x1350F3F4,
-		0xC79A971F, 0x35F1141C, 0x26A1E7E8, 0xD4CA64EB,
-		0x8AD958CF, 0x78B2DBCC, 0x6BE22838, 0x9989AB3B,
-		0x4D43CFD0, 0xBF284CD3, 0xAC78BF27, 0x5E133C24,
-		0x105EC76F, 0xE235446C, 0xF165B798, 0x030E349B,
-		0xD7C45070, 0x25AFD373, 0x36FF2087, 0xC494A384,
-		0x9A879FA0, 0x68EC1CA3, 0x7BBCEF57, 0x89D76C54,
-		0x5D1D08BF, 0xAF768BBC, 0xBC267848, 0x4E4DFB4B,
-		0x20BD8EDE, 0xD2D60DDD, 0xC186FE29, 0x33ED7D2A,
-		0xE72719C1, 0x154C9AC2, 0x061C6936, 0xF477EA35,
-		0xAA64D611, 0x580F5512, 0x4B5FA6E6, 0xB93425E5,
-		0x6DFE410E, 0x9F95C20D, 0x8CC531F9, 0x7EAEB2FA,
-		0x30E349B1, 0xC288CAB2, 0xD1D83946, 0x23B3BA45,
-		0xF779DEAE, 0x05125DAD, 0x1642AE59, 0xE4292D5A,
-		0xBA3A117E, 0x4851927D, 0x5B016189, 0xA96AE28A,
-		0x7DA08661, 0x8FCB0562, 0x9C9BF696, 0x6EF07595,
-		0x417B1DBC, 0xB3109EBF, 0xA0406D4B, 0x522BEE48,
-		0x86E18AA3, 0x748A09A0, 0x67DAFA54, 0x95B17957,
-		0xCBA24573, 0x39C9C670, 0x2A993584, 0xD8F2B687,
-		0x0C38D26C, 0xFE53516F, 0xED03A29B, 0x1F682198,
-		0x5125DAD3, 0xA34E59D0, 0xB01EAA24, 0x42752927,
-		0x96BF4DCC, 0x64D4CECF, 0x77843D3B, 0x85EFBE38,
-		0xDBFC821C, 0x2997011F, 0x3AC7F2EB, 0xC8AC71E8,
-		0x1C661503, 0xEE0D9600, 0xFD5D65F4, 0x0F36E6F7,
-		0x61C69362, 0x93AD1061, 0x80FDE395, 0x72966096,
-		0xA65C047D, 0x5437877E, 0x4767748A, 0xB50CF789,
-		0xEB1FCBAD, 0x197448AE, 0x0A24BB5A, 0xF84F3859,
-		0x2C855CB2, 0xDEEEDFB1, 0xCDBE2C45, 0x3FD5AF46,
-		0x7198540D, 0x83F3D70E, 0x90A324FA, 0x62C8A7F9,
-		0xB602C312, 0x44694011, 0x5739B3E5, 0xA55230E6,
-		0xFB410CC2, 0x092A8FC1, 0x1A7A7C35, 0xE811FF36,
-		0x3CDB9BDD, 0xCEB018DE, 0xDDE0EB2A, 0x2F8B6829,
-		0x82F63B78, 0x709DB87B, 0x63CD4B8F, 0x91A6C88C,
-		0x456CAC67, 0xB7072F64, 0xA457DC90, 0x563C5F93,
-		0x082F63B7, 0xFA44E0B4, 0xE9141340, 0x1B7F9043,
-		0xCFB5F4A8, 0x3DDE77AB, 0x2E8E845F, 0xDCE5075C,
-		0x92A8FC17, 0x60C37F14, 0x73938CE0, 0x81F80FE3,
-		0x55326B08, 0xA759E80B, 0xB4091BFF, 0x466298FC,
-		0x1871A4D8, 0xEA1A27DB, 0xF94AD42F, 0x0B21572C,
-		0xDFEB33C7, 0x2D80B0C4, 0x3ED04330, 0xCCBBC033,
-		0xA24BB5A6, 0x502036A5, 0x4370C551, 0xB11B4652,
-		0x65D122B9, 0x97BAA1BA, 0x84EA524E, 0x7681D14D,
-		0x2892ED69, 0xDAF96E6A, 0xC9A99D9E, 0x3BC21E9D,
-		0xEF087A76, 0x1D63F975, 0x0E330A81, 0xFC588982,
-		0xB21572C9, 0x407EF1CA, 0x532E023E, 0xA145813D,
-		0x758FE5D6, 0x87E466D5, 0x94B49521, 0x66DF1622,
-		0x38CC2A06, 0xCAA7A905, 0xD9F75AF1, 0x2B9CD9F2,
-		0xFF56BD19, 0x0D3D3E1A, 0x1E6DCDEE, 0xEC064EED,
-		0xC38D26C4, 0x31E6A5C7, 0x22B65633, 0xD0DDD530,
-		0x0417B1DB, 0xF67C32D8, 0xE52CC12C, 0x1747422F,
-		0x49547E0B, 0xBB3FFD08, 0xA86F0EFC, 0x5A048DFF,
-		0x8ECEE914, 0x7CA56A17, 0x6FF599E3, 0x9D9E1AE0,
-		0xD3D3E1AB, 0x21B862A8, 0x32E8915C, 0xC083125F,
-		0x144976B4, 0xE622F5B7, 0xF5720643, 0x07198540,
-		0x590AB964, 0xAB613A67, 0xB831C993, 0x4A5A4A90,
-		0x9E902E7B, 0x6CFBAD78, 0x7FAB5E8C, 0x8DC0DD8F,
-		0xE330A81A, 0x115B2B19, 0x020BD8ED, 0xF0605BEE,
-		0x24AA3F05, 0xD6C1BC06, 0xC5914FF2, 0x37FACCF1,
-		0x69E9F0D5, 0x9B8273D6, 0x88D28022, 0x7AB90321,
-		0xAE7367CA, 0x5C18E4C9, 0x4F48173D, 0xBD23943E,
-		0xF36E6F75, 0x0105EC76, 0x12551F82, 0xE03E9C81,
-		0x34F4F86A, 0xC69F7B69, 0xD5CF889D, 0x27A40B9E,
-		0x79B737BA, 0x8BDCB4B9, 0x988C474D, 0x6AE7C44E,
-		0xBE2DA0A5, 0x4C4623A6, 0x5F16D052, 0xAD7D5351
-	},
-	{
-		0x00000000, 0x13A29877, 0x274530EE, 0x34E7A899,
-		0x4E8A61DC, 0x5D28F9AB, 0x69CF5132, 0x7A6DC945,
-		0x9D14C3B8, 0x8EB65BCF, 0xBA51F356, 0xA9F36B21,
-		0xD39EA264, 0xC03C3A13, 0xF4DB928A, 0xE7790AFD,
-		0x3FC5F181, 0x2C6769F6, 0x1880C16F, 0x0B225918,
-		0x714F905D, 0x62ED082A, 0x560AA0B3, 0x45A838C4,
-		0xA2D13239, 0xB173AA4E, 0x859402D7, 0x96369AA0,
-		0xEC5B53E5, 0xFFF9CB92, 0xCB1E630B, 0xD8BCFB7C,
-		0x7F8BE302, 0x6C297B75, 0x58CED3EC, 0x4B6C4B9B,
-		0x310182DE, 0x22A31AA9, 0x1644B230, 0x05E62A47,
-		0xE29F20BA, 0xF13DB8CD, 0xC5DA1054, 0xD6788823,
-		0xAC154166, 0xBFB7D911, 0x8B507188, 0x98F2E9FF,
-		0x404E1283, 0x53EC8AF4, 0x670B226D, 0x74A9BA1A,
-		0x0EC4735F, 0x1D66EB28, 0x298143B1, 0x3A23DBC6,
-		0xDD5AD13B, 0xCEF8494C, 0xFA1FE1D5, 0xE9BD79A2,
-		0x93D0B0E7, 0x80722890, 0xB4958009, 0xA737187E,
-		0xFF17C604, 0xECB55E73, 0xD852F6EA, 0xCBF06E9D,
-		0xB19DA7D8, 0xA23F3FAF, 0x96D89736, 0x857A0F41,
-		0x620305BC, 0x71A19DCB, 0x45463552, 0x56E4AD25,
-		0x2C896460, 0x3F2BFC17, 0x0BCC548E, 0x186ECCF9,
-		0xC0D23785, 0xD370AFF2, 0xE797076B, 0xF4359F1C,
-		0x8E585659, 0x9DFACE2E, 0xA91D66B7, 0xBABFFEC0,
-		0x5DC6F43D, 0x4E646C4A, 0x7A83C4D3, 0x69215CA4,
-		0x134C95E1, 0x00EE0D96, 0x3409A50F, 0x27AB3D78,
-		0x809C2506, 0x933EBD71, 0xA7D915E8, 0xB47B8D9F,
-		0xCE1644DA, 0xDDB4DCAD, 0xE9537434, 0xFAF1EC43,
-		0x1D88E6BE, 0x0E2A7EC9, 0x3ACDD650, 0x296F4E27,
-		0x53028762, 0x40A01F15, 0x7447B78C, 0x67E52FFB,
-		0xBF59D487, 0xACFB4CF0, 0x981CE469, 0x8BBE7C1E,
-		0xF1D3B55B, 0xE2712D2C, 0xD69685B5, 0xC5341DC2,
-		0x224D173F, 0x31EF8F48, 0x050827D1, 0x16AABFA6,
-		0x6CC776E3, 0x7F65EE94, 0x4B82460D, 0x5820DE7A,
-		0xFBC3FAF9, 0xE861628E, 0xDC86CA17, 0xCF245260,
-		0xB5499B25, 0xA6EB0352, 0x920CABCB, 0x81AE33BC,
-		0x66D73941, 0x7575A136, 0x419209AF, 0x523091D8,
-		0x285D589D, 0x3BFFC0EA, 0x0F186873, 0x1CBAF004,
-		0xC4060B78, 0xD7A4930F, 0xE3433B96, 0xF0E1A3E1,
-		0x8A8C6AA4, 0x992EF2D3, 0xADC95A4A, 0xBE6BC23D,
-		0x5912C8C0, 0x4AB050B7, 0x7E57F82E, 0x6DF56059,
-		0x1798A91C, 0x043A316B, 0x30DD99F2, 0x237F0185,
-		0x844819FB, 0x97EA818C, 0xA30D2915, 0xB0AFB162,
-		0xCAC27827, 0xD960E050, 0xED8748C9, 0xFE25D0BE,
-		0x195CDA43, 0x0AFE4234, 0x3E19EAAD, 0x2DBB72DA,
-		0x57D6BB9F, 0x447423E8, 0x70938B71, 0x63311306,
-		0xBB8DE87A, 0xA82F700D, 0x9CC8D894, 0x8F6A40E3,
-		0xF50789A6, 0xE6A511D1, 0xD242B948, 0xC1E0213F,
-		0x26992BC2, 0x353BB3B5, 0x01DC1B2C, 0x127E835B,
-		0x68134A1E, 0x7BB1D269, 0x4F567AF0, 0x5CF4E287,
-		0x04D43CFD, 0x1776A48A, 0x23910C13, 0x30339464,
-		0x4A5E5D21, 0x59FCC556, 0x6D1B6DCF, 0x7EB9F5B8,
-		0x99C0FF45, 0x8A626732, 0xBE85CFAB, 0xAD2757DC,
-		0xD74A9E99, 0xC4E806EE, 0xF00FAE77, 0xE3AD3600,
-		0x3B11CD7C, 0x28B3550B, 0x1C54FD92, 0x0FF665E5,
-		0x759BACA0, 0x663934D7, 0x52DE9C4E, 0x417C0439,
-		0xA6050EC4, 0xB5A796B3, 0x81403E2A, 0x92E2A65D,
-		0xE88F6F18, 0xFB2DF76F, 0xCFCA5FF6, 0xDC68C781,
-		0x7B5FDFFF, 0x68FD4788, 0x5C1AEF11, 0x4FB87766,
-		0x35D5BE23, 0x26772654, 0x12908ECD, 0x013216BA,
-		0xE64B1C47, 0xF5E98430, 0xC10E2CA9, 0xD2ACB4DE,
-		0xA8C17D9B, 0xBB63E5EC, 0x8F844D75, 0x9C26D502,
-		0x449A2E7E, 0x5738B609, 0x63DF1E90, 0x707D86E7,
-		0x0A104FA2, 0x19B2D7D5, 0x2D557F4C, 0x3EF7E73B,
-		0xD98EEDC6, 0xCA2C75B1, 0xFECBDD28, 0xED69455F,
-		0x97048C1A, 0x84A6146D, 0xB041BCF4, 0xA3E32483
-	},
-	{
-		0x00000000, 0xA541927E, 0x4F6F520D, 0xEA2EC073,
-		0x9EDEA41A, 0x3B9F3664, 0xD1B1F617, 0x74F06469,
-		0x38513EC5, 0x9D10ACBB, 0x773E6CC8, 0xD27FFEB6,
-		0xA68F9ADF, 0x03CE08A1, 0xE9E0C8D2, 0x4CA15AAC,
-		0x70A27D8A, 0xD5E3EFF4, 0x3FCD2F87, 0x9A8CBDF9,
-		0xEE7CD990, 0x4B3D4BEE, 0xA1138B9D, 0x045219E3,
-		0x48F3434F, 0xEDB2D131, 0x079C1142, 0xA2DD833C,
-		0xD62DE755, 0x736C752B, 0x9942B558, 0x3C032726,
-		0xE144FB14, 0x4405696A, 0xAE2BA919, 0x0B6A3B67,
-		0x7F9A5F0E, 0xDADBCD70, 0x30F50D03, 0x95B49F7D,
-		0xD915C5D1, 0x7C5457AF, 0x967A97DC, 0x333B05A2,
-		0x47CB61CB, 0xE28AF3B5, 0x08A433C6, 0xADE5A1B8,
-		0x91E6869E, 0x34A714E0, 0xDE89D493, 0x7BC846ED,
-		0x0F382284, 0xAA79B0FA, 0x40577089, 0xE516E2F7,
-		0xA9B7B85B, 0x0CF62A25, 0xE6D8EA56, 0x43997828,
-		0x37691C41, 0x92288E3F, 0x78064E4C, 0xDD47DC32,
-		0xC76580D9, 0x622412A7, 0x880AD2D4, 0x2D4B40AA,
-		0x59BB24C3, 0xFCFAB6BD, 0x16D476CE, 0xB395E4B0,
-		0xFF34BE1C, 0x5A752C62, 0xB05BEC11, 0x151A7E6F,
-		0x61EA1A06, 0xC4AB8878, 0x2E85480B, 0x8BC4DA75,
-		0xB7C7FD53, 0x12866F2D, 0xF8A8AF5E, 0x5DE93D20,
-		0x29195949, 0x8C58CB37, 0x66760B44, 0xC337993A,
-		0x8F96C396, 0x2AD751E8, 0xC0F9919B, 0x65B803E5,
-		0x1148678C, 0xB409F5F2, 0x5E273581, 0xFB66A7FF,
-		0x26217BCD, 0x8360E9B3, 0x694E29C0, 0xCC0FBBBE,
-		0xB8FFDFD7, 0x1DBE4DA9, 0xF7908DDA, 0x52D11FA4,
-		0x1E704508, 0xBB31D776, 0x511F1705, 0xF45E857B,
-		0x80AEE112, 0x25EF736C, 0xCFC1B31F, 0x6A802161,
-		0x56830647, 0xF3C29439, 0x19EC544A, 0xBCADC634,
-		0xC85DA25D, 0x6D1C3023, 0x8732F050, 0x2273622E,
-		0x6ED23882, 0xCB93AAFC, 0x21BD6A8F, 0x84FCF8F1,
-		0xF00C9C98, 0x554D0EE6, 0xBF63CE95, 0x1A225CEB,
-		0x8B277743, 0x2E66E53D, 0xC448254E, 0x6109B730,
-		0x15F9D359, 0xB0B84127, 0x5A968154, 0xFFD7132A,
-		0xB3764986, 0x1637DBF8, 0xFC191B8B, 0x595889F5,
-		0x2DA8ED9C, 0x88E97FE2, 0x62C7BF91, 0xC7862DEF,
-		0xFB850AC9, 0x5EC498B7, 0xB4EA58C4, 0x11ABCABA,
-		0x655BAED3, 0xC01A3CAD, 0x2A34FCDE, 0x8F756EA0,
-		0xC3D4340C, 0x6695A672, 0x8CBB6601, 0x29FAF47F,
-		0x5D0A9016, 0xF84B0268, 0x1265C21B, 0xB7245065,
-		0x6A638C57, 0xCF221E29, 0x250CDE5A, 0x804D4C24,
-		0xF4BD284D, 0x51FCBA33, 0xBBD27A40, 0x1E93E83E,
-		0x5232B292, 0xF77320EC, 0x1D5DE09F, 0xB81C72E1,
-		0xCCEC1688, 0x69AD84F6, 0x83834485, 0x26C2D6FB,
-		0x1AC1F1DD, 0xBF8063A3, 0x55AEA3D0, 0xF0EF31AE,
-		0x841F55C7, 0x215EC7B9, 0xCB7007CA, 0x6E3195B4,
-		0x2290CF18, 0x87D15D66, 0x6DFF9D15, 0xC8BE0F6B,
-		0xBC4E6B02, 0x190FF97C, 0xF321390F, 0x5660AB71,
-		0x4C42F79A, 0xE90365E4, 0x032DA597, 0xA66C37E9,
-		0xD29C5380, 0x77DDC1FE, 0x9DF3018D, 0x38B293F3,
-		0x7413C95F, 0xD1525B21, 0x3B7C9B52, 0x9E3D092C,
-		0xEACD6D45, 0x4F8CFF3B, 0xA5A23F48, 0x00E3AD36,
-		0x3CE08A10, 0x99A1186E, 0x738FD81D, 0xD6CE4A63,
-		0xA23E2E0A, 0x077FBC74, 0xED517C07, 0x4810EE79,
-		0x04B1B4D5, 0xA1F026AB, 0x4BDEE6D8, 0xEE9F74A6,
-		0x9A6F10CF, 0x3F2E82B1, 0xD50042C2, 0x7041D0BC,
-		0xAD060C8E, 0x08479EF0, 0xE2695E83, 0x4728CCFD,
-		0x33D8A894, 0x96993AEA, 0x7CB7FA99, 0xD9F668E7,
-		0x9557324B, 0x3016A035, 0xDA386046, 0x7F79F238,
-		0x0B899651, 0xAEC8042F, 0x44E6C45C, 0xE1A75622,
-		0xDDA47104, 0x78E5E37A, 0x92CB2309, 0x378AB177,
-		0x437AD51E, 0xE63B4760, 0x0C158713, 0xA954156D,
-		0xE5F54FC1, 0x40B4DDBF, 0xAA9A1DCC, 0x0FDB8FB2,
-		0x7B2BEBDB, 0xDE6A79A5, 0x3444B9D6, 0x91052BA8
-	},
-	{
-		0x00000000, 0xDD45AAB8, 0xBF672381, 0x62228939,
-		0x7B2231F3, 0xA6679B4B, 0xC4451272, 0x1900B8CA,
-		0xF64463E6, 0x2B01C95E, 0x49234067, 0x9466EADF,
-		0x8D665215, 0x5023F8AD, 0x32017194, 0xEF44DB2C,
-		0xE964B13D, 0x34211B85, 0x560392BC, 0x8B463804,
-		0x924680CE, 0x4F032A76, 0x2D21A34F, 0xF06409F7,
-		0x1F20D2DB, 0xC2657863, 0xA047F15A, 0x7D025BE2,
-		0x6402E328, 0xB9474990, 0xDB65C0A9, 0x06206A11,
-		0xD725148B, 0x0A60BE33, 0x6842370A, 0xB5079DB2,
-		0xAC072578, 0x71428FC0, 0x136006F9, 0xCE25AC41,
-		0x2161776D, 0xFC24DDD5, 0x9E0654EC, 0x4343FE54,
-		0x5A43469E, 0x8706EC26, 0xE524651F, 0x3861CFA7,
-		0x3E41A5B6, 0xE3040F0E, 0x81268637, 0x5C632C8F,
-		0x45639445, 0x98263EFD, 0xFA04B7C4, 0x27411D7C,
-		0xC805C650, 0x15406CE8, 0x7762E5D1, 0xAA274F69,
-		0xB327F7A3, 0x6E625D1B, 0x0C40D422, 0xD1057E9A,
-		0xABA65FE7, 0x76E3F55F, 0x14C17C66, 0xC984D6DE,
-		0xD0846E14, 0x0DC1C4AC, 0x6FE34D95, 0xB2A6E72D,
-		0x5DE23C01, 0x80A796B9, 0xE2851F80, 0x3FC0B538,
-		0x26C00DF2, 0xFB85A74A, 0x99A72E73, 0x44E284CB,
-		0x42C2EEDA, 0x9F874462, 0xFDA5CD5B, 0x20E067E3,
-		0x39E0DF29, 0xE4A57591, 0x8687FCA8, 0x5BC25610,
-		0xB4868D3C, 0x69C32784, 0x0BE1AEBD, 0xD6A40405,
-		0xCFA4BCCF, 0x12E11677, 0x70C39F4E, 0xAD8635F6,
-		0x7C834B6C, 0xA1C6E1D4, 0xC3E468ED, 0x1EA1C255,
-		0x07A17A9F, 0xDAE4D027, 0xB8C6591E, 0x6583F3A6,
-		0x8AC7288A, 0x57828232, 0x35A00B0B, 0xE8E5A1B3,
-		0xF1E51979, 0x2CA0B3C1, 0x4E823AF8, 0x93C79040,
-		0x95E7FA51, 0x48A250E9, 0x2A80D9D0, 0xF7C57368,
-		0xEEC5CBA2, 0x3380611A, 0x51A2E823, 0x8CE7429B,
-		0x63A399B7, 0xBEE6330F, 0xDCC4BA36, 0x0181108E,
-		0x1881A844, 0xC5C402FC, 0xA7E68BC5, 0x7AA3217D,
-		0x52A0C93F, 0x8FE56387, 0xEDC7EABE, 0x30824006,
-		0x2982F8CC, 0xF4C75274, 0x96E5DB4D, 0x4BA071F5,
-		0xA4E4AAD9, 0x79A10061, 0x1B838958, 0xC6C623E0,
-		0xDFC69B2A, 0x02833192, 0x60A1B8AB, 0xBDE41213,
-		0xBBC47802, 0x6681D2BA, 0x04A35B83, 0xD9E6F13B,
-		0xC0E649F1, 0x1DA3E349, 0x7F816A70, 0xA2C4C0C8,
-		0x4D801BE4, 0x90C5B15C, 0xF2E73865, 0x2FA292DD,
-		0x36A22A17, 0xEBE780AF, 0x89C50996, 0x5480A32E,
-		0x8585DDB4, 0x58C0770C, 0x3AE2FE35, 0xE7A7548D,
-		0xFEA7EC47, 0x23E246FF, 0x41C0CFC6, 0x9C85657E,
-		0x73C1BE52, 0xAE8414EA, 0xCCA69DD3, 0x11E3376B,
-		0x08E38FA1, 0xD5A62519, 0xB784AC20, 0x6AC10698,
-		0x6CE16C89, 0xB1A4C631, 0xD3864F08, 0x0EC3E5B0,
-		0x17C35D7A, 0xCA86F7C2, 0xA8A47EFB, 0x75E1D443,
-		0x9AA50F6F, 0x47E0A5D7, 0x25C22CEE, 0xF8878656,
-		0xE1873E9C, 0x3CC29424, 0x5EE01D1D, 0x83A5B7A5,
-		0xF90696D8, 0x24433C60, 0x4661B559, 0x9B241FE1,
-		0x8224A72B, 0x5F610D93, 0x3D4384AA, 0xE0062E12,
-		0x0F42F53E, 0xD2075F86, 0xB025D6BF, 0x6D607C07,
-		0x7460C4CD, 0xA9256E75, 0xCB07E74C, 0x16424DF4,
-		0x106227E5, 0xCD278D5D, 0xAF050464, 0x7240AEDC,
-		0x6B401616, 0xB605BCAE, 0xD4273597, 0x09629F2F,
-		0xE6264403, 0x3B63EEBB, 0x59416782, 0x8404CD3A,
-		0x9D0475F0, 0x4041DF48, 0x22635671, 0xFF26FCC9,
-		0x2E238253, 0xF36628EB, 0x9144A1D2, 0x4C010B6A,
-		0x5501B3A0, 0x88441918, 0xEA669021, 0x37233A99,
-		0xD867E1B5, 0x05224B0D, 0x6700C234, 0xBA45688C,
-		0xA345D046, 0x7E007AFE, 0x1C22F3C7, 0xC167597F,
-		0xC747336E, 0x1A0299D6, 0x782010EF, 0xA565BA57,
-		0xBC65029D, 0x6120A825, 0x0302211C, 0xDE478BA4,
-		0x31035088, 0xEC46FA30, 0x8E647309, 0x5321D9B1,
-		0x4A21617B, 0x9764CBC3, 0xF54642FA, 0x2803E842
-	},
-	{
-		0x00000000, 0x38116FAC, 0x7022DF58, 0x4833B0F4,
-		0xE045BEB0, 0xD854D11C, 0x906761E8, 0xA8760E44,
-		0xC5670B91, 0xFD76643D, 0xB545D4C9, 0x8D54BB65,
-		0x2522B521, 0x1D33DA8D, 0x55006A79, 0x6D1105D5,
-		0x8F2261D3, 0xB7330E7F, 0xFF00BE8B, 0xC711D127,
-		0x6F67DF63, 0x5776B0CF, 0x1F45003B, 0x27546F97,
-		0x4A456A42, 0x725405EE, 0x3A67B51A, 0x0276DAB6,
-		0xAA00D4F2, 0x9211BB5E, 0xDA220BAA, 0xE2336406,
-		0x1BA8B557, 0x23B9DAFB, 0x6B8A6A0F, 0x539B05A3,
-		0xFBED0BE7, 0xC3FC644B, 0x8BCFD4BF, 0xB3DEBB13,
-		0xDECFBEC6, 0xE6DED16A, 0xAEED619E, 0x96FC0E32,
-		0x3E8A0076, 0x069B6FDA, 0x4EA8DF2E, 0x76B9B082,
-		0x948AD484, 0xAC9BBB28, 0xE4A80BDC, 0xDCB96470,
-		0x74CF6A34, 0x4CDE0598, 0x04EDB56C, 0x3CFCDAC0,
-		0x51EDDF15, 0x69FCB0B9, 0x21CF004D, 0x19DE6FE1,
-		0xB1A861A5, 0x89B90E09, 0xC18ABEFD, 0xF99BD151,
-		0x37516AAE, 0x0F400502, 0x4773B5F6, 0x7F62DA5A,
-		0xD714D41E, 0xEF05BBB2, 0xA7360B46, 0x9F2764EA,
-		0xF236613F, 0xCA270E93, 0x8214BE67, 0xBA05D1CB,
-		0x1273DF8F, 0x2A62B023, 0x625100D7, 0x5A406F7B,
-		0xB8730B7D, 0x806264D1, 0xC851D425, 0xF040BB89,
-		0x5836B5CD, 0x6027DA61, 0x28146A95, 0x10050539,
-		0x7D1400EC, 0x45056F40, 0x0D36DFB4, 0x3527B018,
-		0x9D51BE5C, 0xA540D1F0, 0xED736104, 0xD5620EA8,
-		0x2CF9DFF9, 0x14E8B055, 0x5CDB00A1, 0x64CA6F0D,
-		0xCCBC6149, 0xF4AD0EE5, 0xBC9EBE11, 0x848FD1BD,
-		0xE99ED468, 0xD18FBBC4, 0x99BC0B30, 0xA1AD649C,
-		0x09DB6AD8, 0x31CA0574, 0x79F9B580, 0x41E8DA2C,
-		0xA3DBBE2A, 0x9BCAD186, 0xD3F96172, 0xEBE80EDE,
-		0x439E009A, 0x7B8F6F36, 0x33BCDFC2, 0x0BADB06E,
-		0x66BCB5BB, 0x5EADDA17, 0x169E6AE3, 0x2E8F054F,
-		0x86F90B0B, 0xBEE864A7, 0xF6DBD453, 0xCECABBFF,
-		0x6EA2D55C, 0x56B3BAF0, 0x1E800A04, 0x269165A8,
-		0x8EE76BEC, 0xB6F60440, 0xFEC5B4B4, 0xC6D4DB18,
-		0xABC5DECD, 0x93D4B161, 0xDBE70195, 0xE3F66E39,
-		0x4B80607D, 0x73910FD1, 0x3BA2BF25, 0x03B3D089,
-		0xE180B48F, 0xD991DB23, 0x91A26BD7, 0xA9B3047B,
-		0x01C50A3F, 0x39D46593, 0x71E7D567, 0x49F6BACB,
-		0x24E7BF1E, 0x1CF6D0B2, 0x54C56046, 0x6CD40FEA,
-		0xC4A201AE, 0xFCB36E02, 0xB480DEF6, 0x8C91B15A,
-		0x750A600B, 0x4D1B0FA7, 0x0528BF53, 0x3D39D0FF,
-		0x954FDEBB, 0xAD5EB117, 0xE56D01E3, 0xDD7C6E4F,
-		0xB06D6B9A, 0x887C0436, 0xC04FB4C2, 0xF85EDB6E,
-		0x5028D52A, 0x6839BA86, 0x200A0A72, 0x181B65DE,
-		0xFA2801D8, 0xC2396E74, 0x8A0ADE80, 0xB21BB12C,
-		0x1A6DBF68, 0x227CD0C4, 0x6A4F6030, 0x525E0F9C,
-		0x3F4F0A49, 0x075E65E5, 0x4F6DD511, 0x777CBABD,
-		0xDF0AB4F9, 0xE71BDB55, 0xAF286BA1, 0x9739040D,
-		0x59F3BFF2, 0x61E2D05E, 0x29D160AA, 0x11C00F06,
-		0xB9B60142, 0x81A76EEE, 0xC994DE1A, 0xF185B1B6,
-		0x9C94B463, 0xA485DBCF, 0xECB66B3B, 0xD4A70497,
-		0x7CD10AD3, 0x44C0657F, 0x0CF3D58B, 0x34E2BA27,
-		0xD6D1DE21, 0xEEC0B18D, 0xA6F30179, 0x9EE26ED5,
-		0x36946091, 0x0E850F3D, 0x46B6BFC9, 0x7EA7D065,
-		0x13B6D5B0, 0x2BA7BA1C, 0x63940AE8, 0x5B856544,
-		0xF3F36B00, 0xCBE204AC, 0x83D1B458, 0xBBC0DBF4,
-		0x425B0AA5, 0x7A4A6509, 0x3279D5FD, 0x0A68BA51,
-		0xA21EB415, 0x9A0FDBB9, 0xD23C6B4D, 0xEA2D04E1,
-		0x873C0134, 0xBF2D6E98, 0xF71EDE6C, 0xCF0FB1C0,
-		0x6779BF84, 0x5F68D028, 0x175B60DC, 0x2F4A0F70,
-		0xCD796B76, 0xF56804DA, 0xBD5BB42E, 0x854ADB82,
-		0x2D3CD5C6, 0x152DBA6A, 0x5D1E0A9E, 0x650F6532,
-		0x081E60E7, 0x300F0F4B, 0x783CBFBF, 0x402DD013,
-		0xE85BDE57, 0xD04AB1FB, 0x9879010F, 0xA0686EA3
-	},
-	{
-		0x00000000, 0xEF306B19, 0xDB8CA0C3, 0x34BCCBDA,
-		0xB2F53777, 0x5DC55C6E, 0x697997B4, 0x8649FCAD,
-		0x6006181F, 0x8F367306, 0xBB8AB8DC, 0x54BAD3C5,
-		0xD2F32F68, 0x3DC34471, 0x097F8FAB, 0xE64FE4B2,
-		0xC00C303E, 0x2F3C5B27, 0x1B8090FD, 0xF4B0FBE4,
-		0x72F90749, 0x9DC96C50, 0xA975A78A, 0x4645CC93,
-		0xA00A2821, 0x4F3A4338, 0x7B8688E2, 0x94B6E3FB,
-		0x12FF1F56, 0xFDCF744F, 0xC973BF95, 0x2643D48C,
-		0x85F4168D, 0x6AC47D94, 0x5E78B64E, 0xB148DD57,
-		0x370121FA, 0xD8314AE3, 0xEC8D8139, 0x03BDEA20,
-		0xE5F20E92, 0x0AC2658B, 0x3E7EAE51, 0xD14EC548,
-		0x570739E5, 0xB83752FC, 0x8C8B9926, 0x63BBF23F,
-		0x45F826B3, 0xAAC84DAA, 0x9E748670, 0x7144ED69,
-		0xF70D11C4, 0x183D7ADD, 0x2C81B107, 0xC3B1DA1E,
-		0x25FE3EAC, 0xCACE55B5, 0xFE729E6F, 0x1142F576,
-		0x970B09DB, 0x783B62C2, 0x4C87A918, 0xA3B7C201,
-		0x0E045BEB, 0xE13430F2, 0xD588FB28, 0x3AB89031,
-		0xBCF16C9C, 0x53C10785, 0x677DCC5F, 0x884DA746,
-		0x6E0243F4, 0x813228ED, 0xB58EE337, 0x5ABE882E,
-		0xDCF77483, 0x33C71F9A, 0x077BD440, 0xE84BBF59,
-		0xCE086BD5, 0x213800CC, 0x1584CB16, 0xFAB4A00F,
-		0x7CFD5CA2, 0x93CD37BB, 0xA771FC61, 0x48419778,
-		0xAE0E73CA, 0x413E18D3, 0x7582D309, 0x9AB2B810,
-		0x1CFB44BD, 0xF3CB2FA4, 0xC777E47E, 0x28478F67,
-		0x8BF04D66, 0x64C0267F, 0x507CEDA5, 0xBF4C86BC,
-		0x39057A11, 0xD6351108, 0xE289DAD2, 0x0DB9B1CB,
-		0xEBF65579, 0x04C63E60, 0x307AF5BA, 0xDF4A9EA3,
-		0x5903620E, 0xB6330917, 0x828FC2CD, 0x6DBFA9D4,
-		0x4BFC7D58, 0xA4CC1641, 0x9070DD9B, 0x7F40B682,
-		0xF9094A2F, 0x16392136, 0x2285EAEC, 0xCDB581F5,
-		0x2BFA6547, 0xC4CA0E5E, 0xF076C584, 0x1F46AE9D,
-		0x990F5230, 0x763F3929, 0x4283F2F3, 0xADB399EA,
-		0x1C08B7D6, 0xF338DCCF, 0xC7841715, 0x28B47C0C,
-		0xAEFD80A1, 0x41CDEBB8, 0x75712062, 0x9A414B7B,
-		0x7C0EAFC9, 0x933EC4D0, 0xA7820F0A, 0x48B26413,
-		0xCEFB98BE, 0x21CBF3A7, 0x1577387D, 0xFA475364,
-		0xDC0487E8, 0x3334ECF1, 0x0788272B, 0xE8B84C32,
-		0x6EF1B09F, 0x81C1DB86, 0xB57D105C, 0x5A4D7B45,
-		0xBC029FF7, 0x5332F4EE, 0x678E3F34, 0x88BE542D,
-		0x0EF7A880, 0xE1C7C399, 0xD57B0843, 0x3A4B635A,
-		0x99FCA15B, 0x76CCCA42, 0x42700198, 0xAD406A81,
-		0x2B09962C, 0xC439FD35, 0xF08536EF, 0x1FB55DF6,
-		0xF9FAB944, 0x16CAD25D, 0x22761987, 0xCD46729E,
-		0x4B0F8E33, 0xA43FE52A, 0x90832EF0, 0x7FB345E9,
-		0x59F09165, 0xB6C0FA7C, 0x827C31A6, 0x6D4C5ABF,
-		0xEB05A612, 0x0435CD0B, 0x308906D1, 0xDFB96DC8,
-		0x39F6897A, 0xD6C6E263, 0xE27A29B9, 0x0D4A42A0,
-		0x8B03BE0D, 0x6433D514, 0x508F1ECE, 0xBFBF75D7,
-		0x120CEC3D, 0xFD3C8724, 0xC9804CFE, 0x26B027E7,
-		0xA0F9DB4A, 0x4FC9B053, 0x7B757B89, 0x94451090,
-		0x720AF422, 0x9D3A9F3B, 0xA98654E1, 0x46B63FF8,
-		0xC0FFC355, 0x2FCFA84C, 0x1B736396, 0xF443088F,
-		0xD200DC03, 0x3D30B71A, 0x098C7CC0, 0xE6BC17D9,
-		0x60F5EB74, 0x8FC5806D, 0xBB794BB7, 0x544920AE,
-		0xB206C41C, 0x5D36AF05, 0x698A64DF, 0x86BA0FC6,
-		0x00F3F36B, 0xEFC39872, 0xDB7F53A8, 0x344F38B1,
-		0x97F8FAB0, 0x78C891A9, 0x4C745A73, 0xA344316A,
-		0x250DCDC7, 0xCA3DA6DE, 0xFE816D04, 0x11B1061D,
-		0xF7FEE2AF, 0x18CE89B6, 0x2C72426C, 0xC3422975,
-		0x450BD5D8, 0xAA3BBEC1, 0x9E87751B, 0x71B71E02,
-		0x57F4CA8E, 0xB8C4A197, 0x8C786A4D, 0x63480154,
-		0xE501FDF9, 0x0A3196E0, 0x3E8D5D3A, 0xD1BD3623,
-		0x37F2D291, 0xD8C2B988, 0xEC7E7252, 0x034E194B,
-		0x8507E5E6, 0x6A378EFF, 0x5E8B4525, 0xB1BB2E3C
-	},
-	{
-		0x00000000, 0x68032CC8, 0xD0065990, 0xB8057558,
-		0xA5E0C5D1, 0xCDE3E919, 0x75E69C41, 0x1DE5B089,
-		0x4E2DFD53, 0x262ED19B, 0x9E2BA4C3, 0xF628880B,
-		0xEBCD3882, 0x83CE144A, 0x3BCB6112, 0x53C84DDA,
-		0x9C5BFAA6, 0xF458D66E, 0x4C5DA336, 0x245E8FFE,
-		0x39BB3F77, 0x51B813BF, 0xE9BD66E7, 0x81BE4A2F,
-		0xD27607F5, 0xBA752B3D, 0x02705E65, 0x6A7372AD,
-		0x7796C224, 0x1F95EEEC, 0xA7909BB4, 0xCF93B77C,
-		0x3D5B83BD, 0x5558AF75, 0xED5DDA2D, 0x855EF6E5,
-		0x98BB466C, 0xF0B86AA4, 0x48BD1FFC, 0x20BE3334,
-		0x73767EEE, 0x1B755226, 0xA370277E, 0xCB730BB6,
-		0xD696BB3F, 0xBE9597F7, 0x0690E2AF, 0x6E93CE67,
-		0xA100791B, 0xC90355D3, 0x7106208B, 0x19050C43,
-		0x04E0BCCA, 0x6CE39002, 0xD4E6E55A, 0xBCE5C992,
-		0xEF2D8448, 0x872EA880, 0x3F2BDDD8, 0x5728F110,
-		0x4ACD4199, 0x22CE6D51, 0x9ACB1809, 0xF2C834C1,
-		0x7AB7077A, 0x12B42BB2, 0xAAB15EEA, 0xC2B27222,
-		0xDF57C2AB, 0xB754EE63, 0x0F519B3B, 0x6752B7F3,
-		0x349AFA29, 0x5C99D6E1, 0xE49CA3B9, 0x8C9F8F71,
-		0x917A3FF8, 0xF9791330, 0x417C6668, 0x297F4AA0,
-		0xE6ECFDDC, 0x8EEFD114, 0x36EAA44C, 0x5EE98884,
-		0x430C380D, 0x2B0F14C5, 0x930A619D, 0xFB094D55,
-		0xA8C1008F, 0xC0C22C47, 0x78C7591F, 0x10C475D7,
-		0x0D21C55E, 0x6522E996, 0xDD279CCE, 0xB524B006,
-		0x47EC84C7, 0x2FEFA80F, 0x97EADD57, 0xFFE9F19F,
-		0xE20C4116, 0x8A0F6DDE, 0x320A1886, 0x5A09344E,
-		0x09C17994, 0x61C2555C, 0xD9C72004, 0xB1C40CCC,
-		0xAC21BC45, 0xC422908D, 0x7C27E5D5, 0x1424C91D,
-		0xDBB77E61, 0xB3B452A9, 0x0BB127F1, 0x63B20B39,
-		0x7E57BBB0, 0x16549778, 0xAE51E220, 0xC652CEE8,
-		0x959A8332, 0xFD99AFFA, 0x459CDAA2, 0x2D9FF66A,
-		0x307A46E3, 0x58796A2B, 0xE07C1F73, 0x887F33BB,
-		0xF56E0EF4, 0x9D6D223C, 0x25685764, 0x4D6B7BAC,
-		0x508ECB25, 0x388DE7ED, 0x808892B5, 0xE88BBE7D,
-		0xBB43F3A7, 0xD340DF6F, 0x6B45AA37, 0x034686FF,
-		0x1EA33676, 0x76A01ABE, 0xCEA56FE6, 0xA6A6432E,
-		0x6935F452, 0x0136D89A, 0xB933ADC2, 0xD130810A,
-		0xCCD53183, 0xA4D61D4B, 0x1CD36813, 0x74D044DB,
-		0x27180901, 0x4F1B25C9, 0xF71E5091, 0x9F1D7C59,
-		0x82F8CCD0, 0xEAFBE018, 0x52FE9540, 0x3AFDB988,
-		0xC8358D49, 0xA036A181, 0x1833D4D9, 0x7030F811,
-		0x6DD54898, 0x05D66450, 0xBDD31108, 0xD5D03DC0,
-		0x8618701A, 0xEE1B5CD2, 0x561E298A, 0x3E1D0542,
-		0x23F8B5CB, 0x4BFB9903, 0xF3FEEC5B, 0x9BFDC093,
-		0x546E77EF, 0x3C6D5B27, 0x84682E7F, 0xEC6B02B7,
-		0xF18EB23E, 0x998D9EF6, 0x2188EBAE, 0x498BC766,
-		0x1A438ABC, 0x7240A674, 0xCA45D32C, 0xA246FFE4,
-		0xBFA34F6D, 0xD7A063A5, 0x6FA516FD, 0x07A63A35,
-		0x8FD9098E, 0xE7DA2546, 0x5FDF501E, 0x37DC7CD6,
-		0x2A39CC5F, 0x423AE097, 0xFA3F95CF, 0x923CB907,
-		0xC1F4F4DD, 0xA9F7D815, 0x11F2AD4D, 0x79F18185,
-		0x6414310C, 0x0C171DC4, 0xB412689C, 0xDC114454,
-		0x1382F328, 0x7B81DFE0, 0xC384AAB8, 0xAB878670,
-		0xB66236F9, 0xDE611A31, 0x66646F69, 0x0E6743A1,
-		0x5DAF0E7B, 0x35AC22B3, 0x8DA957EB, 0xE5AA7B23,
-		0xF84FCBAA, 0x904CE762, 0x2849923A, 0x404ABEF2,
-		0xB2828A33, 0xDA81A6FB, 0x6284D3A3, 0x0A87FF6B,
-		0x17624FE2, 0x7F61632A, 0xC7641672, 0xAF673ABA,
-		0xFCAF7760, 0x94AC5BA8, 0x2CA92EF0, 0x44AA0238,
-		0x594FB2B1, 0x314C9E79, 0x8949EB21, 0xE14AC7E9,
-		0x2ED97095, 0x46DA5C5D, 0xFEDF2905, 0x96DC05CD,
-		0x8B39B544, 0xE33A998C, 0x5B3FECD4, 0x333CC01C,
-		0x60F48DC6, 0x08F7A10E, 0xB0F2D456, 0xD8F1F89E,
-		0xC5144817, 0xAD1764DF, 0x15121187, 0x7D113D4F
-	},
-	{
-		0x00000000, 0x493C7D27, 0x9278FA4E, 0xDB448769,
-		0x211D826D, 0x6821FF4A, 0xB3657823, 0xFA590504,
-		0x423B04DA, 0x0B0779FD, 0xD043FE94, 0x997F83B3,
-		0x632686B7, 0x2A1AFB90, 0xF15E7CF9, 0xB86201DE,
-		0x847609B4, 0xCD4A7493, 0x160EF3FA, 0x5F328EDD,
-		0xA56B8BD9, 0xEC57F6FE, 0x37137197, 0x7E2F0CB0,
-		0xC64D0D6E, 0x8F717049, 0x5435F720, 0x1D098A07,
-		0xE7508F03, 0xAE6CF224, 0x7528754D, 0x3C14086A,
-		0x0D006599, 0x443C18BE, 0x9F789FD7, 0xD644E2F0,
-		0x2C1DE7F4, 0x65219AD3, 0xBE651DBA, 0xF759609D,
-		0x4F3B6143, 0x06071C64, 0xDD439B0D, 0x947FE62A,
-		0x6E26E32E, 0x271A9E09, 0xFC5E1960, 0xB5626447,
-		0x89766C2D, 0xC04A110A, 0x1B0E9663, 0x5232EB44,
-		0xA86BEE40, 0xE1579367, 0x3A13140E, 0x732F6929,
-		0xCB4D68F7, 0x827115D0, 0x593592B9, 0x1009EF9E,
-		0xEA50EA9A, 0xA36C97BD, 0x782810D4, 0x31146DF3,
-		0x1A00CB32, 0x533CB615, 0x8878317C, 0xC1444C5B,
-		0x3B1D495F, 0x72213478, 0xA965B311, 0xE059CE36,
-		0x583BCFE8, 0x1107B2CF, 0xCA4335A6, 0x837F4881,
-		0x79264D85, 0x301A30A2, 0xEB5EB7CB, 0xA262CAEC,
-		0x9E76C286, 0xD74ABFA1, 0x0C0E38C8, 0x453245EF,
-		0xBF6B40EB, 0xF6573DCC, 0x2D13BAA5, 0x642FC782,
-		0xDC4DC65C, 0x9571BB7B, 0x4E353C12, 0x07094135,
-		0xFD504431, 0xB46C3916, 0x6F28BE7F, 0x2614C358,
-		0x1700AEAB, 0x5E3CD38C, 0x857854E5, 0xCC4429C2,
-		0x361D2CC6, 0x7F2151E1, 0xA465D688, 0xED59ABAF,
-		0x553BAA71, 0x1C07D756, 0xC743503F, 0x8E7F2D18,
-		0x7426281C, 0x3D1A553B, 0xE65ED252, 0xAF62AF75,
-		0x9376A71F, 0xDA4ADA38, 0x010E5D51, 0x48322076,
-		0xB26B2572, 0xFB575855, 0x2013DF3C, 0x692FA21B,
-		0xD14DA3C5, 0x9871DEE2, 0x4335598B, 0x0A0924AC,
-		0xF05021A8, 0xB96C5C8F, 0x6228DBE6, 0x2B14A6C1,
-		0x34019664, 0x7D3DEB43, 0xA6796C2A, 0xEF45110D,
-		0x151C1409, 0x5C20692E, 0x8764EE47, 0xCE589360,
-		0x763A92BE, 0x3F06EF99, 0xE44268F0, 0xAD7E15D7,
-		0x572710D3, 0x1E1B6DF4, 0xC55FEA9D, 0x8C6397BA,
-		0xB0779FD0, 0xF94BE2F7, 0x220F659E, 0x6B3318B9,
-		0x916A1DBD, 0xD856609A, 0x0312E7F3, 0x4A2E9AD4,
-		0xF24C9B0A, 0xBB70E62D, 0x60346144, 0x29081C63,
-		0xD3511967, 0x9A6D6440, 0x4129E329, 0x08159E0E,
-		0x3901F3FD, 0x703D8EDA, 0xAB7909B3, 0xE2457494,
-		0x181C7190, 0x51200CB7, 0x8A648BDE, 0xC358F6F9,
-		0x7B3AF727, 0x32068A00, 0xE9420D69, 0xA07E704E,
-		0x5A27754A, 0x131B086D, 0xC85F8F04, 0x8163F223,
-		0xBD77FA49, 0xF44B876E, 0x2F0F0007, 0x66337D20,
-		0x9C6A7824, 0xD5560503, 0x0E12826A, 0x472EFF4D,
-		0xFF4CFE93, 0xB67083B4, 0x6D3404DD, 0x240879FA,
-		0xDE517CFE, 0x976D01D9, 0x4C2986B0, 0x0515FB97,
-		0x2E015D56, 0x673D2071, 0xBC79A718, 0xF545DA3F,
-		0x0F1CDF3B, 0x4620A21C, 0x9D642575, 0xD4585852,
-		0x6C3A598C, 0x250624AB, 0xFE42A3C2, 0xB77EDEE5,
-		0x4D27DBE1, 0x041BA6C6, 0xDF5F21AF, 0x96635C88,
-		0xAA7754E2, 0xE34B29C5, 0x380FAEAC, 0x7133D38B,
-		0x8B6AD68F, 0xC256ABA8, 0x19122CC1, 0x502E51E6,
-		0xE84C5038, 0xA1702D1F, 0x7A34AA76, 0x3308D751,
-		0xC951D255, 0x806DAF72, 0x5B29281B, 0x1215553C,
-		0x230138CF, 0x6A3D45E8, 0xB179C281, 0xF845BFA6,
-		0x021CBAA2, 0x4B20C785, 0x906440EC, 0xD9583DCB,
-		0x613A3C15, 0x28064132, 0xF342C65B, 0xBA7EBB7C,
-		0x4027BE78, 0x091BC35F, 0xD25F4436, 0x9B633911,
-		0xA777317B, 0xEE4B4C5C, 0x350FCB35, 0x7C33B612,
-		0x866AB316, 0xCF56CE31, 0x14124958, 0x5D2E347F,
-		0xE54C35A1, 0xAC704886, 0x7734CFEF, 0x3E08B2C8,
-		0xC451B7CC, 0x8D6DCAEB, 0x56294D82, 0x1F1530A5
-	}
-#else		/* !WORDS_BIGENDIAN */
-	{
-		0x00000000, 0x03836BF2, 0xF7703BE1, 0xF4F35013,
-		0x1F979AC7, 0x1C14F135, 0xE8E7A126, 0xEB64CAD4,
-		0xCF58D98A, 0xCCDBB278, 0x3828E26B, 0x3BAB8999,
-		0xD0CF434D, 0xD34C28BF, 0x27BF78AC, 0x243C135E,
-		0x6FC75E10, 0x6C4435E2, 0x98B765F1, 0x9B340E03,
-		0x7050C4D7, 0x73D3AF25, 0x8720FF36, 0x84A394C4,
-		0xA09F879A, 0xA31CEC68, 0x57EFBC7B, 0x546CD789,
-		0xBF081D5D, 0xBC8B76AF, 0x487826BC, 0x4BFB4D4E,
-		0xDE8EBD20, 0xDD0DD6D2, 0x29FE86C1, 0x2A7DED33,
-		0xC11927E7, 0xC29A4C15, 0x36691C06, 0x35EA77F4,
-		0x11D664AA, 0x12550F58, 0xE6A65F4B, 0xE52534B9,
-		0x0E41FE6D, 0x0DC2959F, 0xF931C58C, 0xFAB2AE7E,
-		0xB149E330, 0xB2CA88C2, 0x4639D8D1, 0x45BAB323,
-		0xAEDE79F7, 0xAD5D1205, 0x59AE4216, 0x5A2D29E4,
-		0x7E113ABA, 0x7D925148, 0x8961015B, 0x8AE26AA9,
-		0x6186A07D, 0x6205CB8F, 0x96F69B9C, 0x9575F06E,
-		0xBC1D7B41, 0xBF9E10B3, 0x4B6D40A0, 0x48EE2B52,
-		0xA38AE186, 0xA0098A74, 0x54FADA67, 0x5779B195,
-		0x7345A2CB, 0x70C6C939, 0x8435992A, 0x87B6F2D8,
-		0x6CD2380C, 0x6F5153FE, 0x9BA203ED, 0x9821681F,
-		0xD3DA2551, 0xD0594EA3, 0x24AA1EB0, 0x27297542,
-		0xCC4DBF96, 0xCFCED464, 0x3B3D8477, 0x38BEEF85,
-		0x1C82FCDB, 0x1F019729, 0xEBF2C73A, 0xE871ACC8,
-		0x0315661C, 0x00960DEE, 0xF4655DFD, 0xF7E6360F,
-		0x6293C661, 0x6110AD93, 0x95E3FD80, 0x96609672,
-		0x7D045CA6, 0x7E873754, 0x8A746747, 0x89F70CB5,
-		0xADCB1FEB, 0xAE487419, 0x5ABB240A, 0x59384FF8,
-		0xB25C852C, 0xB1DFEEDE, 0x452CBECD, 0x46AFD53F,
-		0x0D549871, 0x0ED7F383, 0xFA24A390, 0xF9A7C862,
-		0x12C302B6, 0x11406944, 0xE5B33957, 0xE63052A5,
-		0xC20C41FB, 0xC18F2A09, 0x357C7A1A, 0x36FF11E8,
-		0xDD9BDB3C, 0xDE18B0CE, 0x2AEBE0DD, 0x29688B2F,
-		0x783BF682, 0x7BB89D70, 0x8F4BCD63, 0x8CC8A691,
-		0x67AC6C45, 0x642F07B7, 0x90DC57A4, 0x935F3C56,
-		0xB7632F08, 0xB4E044FA, 0x401314E9, 0x43907F1B,
-		0xA8F4B5CF, 0xAB77DE3D, 0x5F848E2E, 0x5C07E5DC,
-		0x17FCA892, 0x147FC360, 0xE08C9373, 0xE30FF881,
-		0x086B3255, 0x0BE859A7, 0xFF1B09B4, 0xFC986246,
-		0xD8A47118, 0xDB271AEA, 0x2FD44AF9, 0x2C57210B,
-		0xC733EBDF, 0xC4B0802D, 0x3043D03E, 0x33C0BBCC,
-		0xA6B54BA2, 0xA5362050, 0x51C57043, 0x52461BB1,
-		0xB922D165, 0xBAA1BA97, 0x4E52EA84, 0x4DD18176,
-		0x69ED9228, 0x6A6EF9DA, 0x9E9DA9C9, 0x9D1EC23B,
-		0x767A08EF, 0x75F9631D, 0x810A330E, 0x828958FC,
-		0xC97215B2, 0xCAF17E40, 0x3E022E53, 0x3D8145A1,
-		0xD6E58F75, 0xD566E487, 0x2195B494, 0x2216DF66,
-		0x062ACC38, 0x05A9A7CA, 0xF15AF7D9, 0xF2D99C2B,
-		0x19BD56FF, 0x1A3E3D0D, 0xEECD6D1E, 0xED4E06EC,
-		0xC4268DC3, 0xC7A5E631, 0x3356B622, 0x30D5DDD0,
-		0xDBB11704, 0xD8327CF6, 0x2CC12CE5, 0x2F424717,
-		0x0B7E5449, 0x08FD3FBB, 0xFC0E6FA8, 0xFF8D045A,
-		0x14E9CE8E, 0x176AA57C, 0xE399F56F, 0xE01A9E9D,
-		0xABE1D3D3, 0xA862B821, 0x5C91E832, 0x5F1283C0,
-		0xB4764914, 0xB7F522E6, 0x430672F5, 0x40851907,
-		0x64B90A59, 0x673A61AB, 0x93C931B8, 0x904A5A4A,
-		0x7B2E909E, 0x78ADFB6C, 0x8C5EAB7F, 0x8FDDC08D,
-		0x1AA830E3, 0x192B5B11, 0xEDD80B02, 0xEE5B60F0,
-		0x053FAA24, 0x06BCC1D6, 0xF24F91C5, 0xF1CCFA37,
-		0xD5F0E969, 0xD673829B, 0x2280D288, 0x2103B97A,
-		0xCA6773AE, 0xC9E4185C, 0x3D17484F, 0x3E9423BD,
-		0x756F6EF3, 0x76EC0501, 0x821F5512, 0x819C3EE0,
-		0x6AF8F434, 0x697B9FC6, 0x9D88CFD5, 0x9E0BA427,
-		0xBA37B779, 0xB9B4DC8B, 0x4D478C98, 0x4EC4E76A,
-		0xA5A02DBE, 0xA623464C, 0x52D0165F, 0x51537DAD,
-	},
-	{
-		0x00000000, 0x7798A213, 0xEE304527, 0x99A8E734,
-		0xDC618A4E, 0xABF9285D, 0x3251CF69, 0x45C96D7A,
-		0xB8C3149D, 0xCF5BB68E, 0x56F351BA, 0x216BF3A9,
-		0x64A29ED3, 0x133A3CC0, 0x8A92DBF4, 0xFD0A79E7,
-		0x81F1C53F, 0xF669672C, 0x6FC18018, 0x1859220B,
-		0x5D904F71, 0x2A08ED62, 0xB3A00A56, 0xC438A845,
-		0x3932D1A2, 0x4EAA73B1, 0xD7029485, 0xA09A3696,
-		0xE5535BEC, 0x92CBF9FF, 0x0B631ECB, 0x7CFBBCD8,
-		0x02E38B7F, 0x757B296C, 0xECD3CE58, 0x9B4B6C4B,
-		0xDE820131, 0xA91AA322, 0x30B24416, 0x472AE605,
-		0xBA209FE2, 0xCDB83DF1, 0x5410DAC5, 0x238878D6,
-		0x664115AC, 0x11D9B7BF, 0x8871508B, 0xFFE9F298,
-		0x83124E40, 0xF48AEC53, 0x6D220B67, 0x1ABAA974,
-		0x5F73C40E, 0x28EB661D, 0xB1438129, 0xC6DB233A,
-		0x3BD15ADD, 0x4C49F8CE, 0xD5E11FFA, 0xA279BDE9,
-		0xE7B0D093, 0x90287280, 0x098095B4, 0x7E1837A7,
-		0x04C617FF, 0x735EB5EC, 0xEAF652D8, 0x9D6EF0CB,
-		0xD8A79DB1, 0xAF3F3FA2, 0x3697D896, 0x410F7A85,
-		0xBC050362, 0xCB9DA171, 0x52354645, 0x25ADE456,
-		0x6064892C, 0x17FC2B3F, 0x8E54CC0B, 0xF9CC6E18,
-		0x8537D2C0, 0xF2AF70D3, 0x6B0797E7, 0x1C9F35F4,
-		0x5956588E, 0x2ECEFA9D, 0xB7661DA9, 0xC0FEBFBA,
-		0x3DF4C65D, 0x4A6C644E, 0xD3C4837A, 0xA45C2169,
-		0xE1954C13, 0x960DEE00, 0x0FA50934, 0x783DAB27,
-		0x06259C80, 0x71BD3E93, 0xE815D9A7, 0x9F8D7BB4,
-		0xDA4416CE, 0xADDCB4DD, 0x347453E9, 0x43ECF1FA,
-		0xBEE6881D, 0xC97E2A0E, 0x50D6CD3A, 0x274E6F29,
-		0x62870253, 0x151FA040, 0x8CB74774, 0xFB2FE567,
-		0x87D459BF, 0xF04CFBAC, 0x69E41C98, 0x1E7CBE8B,
-		0x5BB5D3F1, 0x2C2D71E2, 0xB58596D6, 0xC21D34C5,
-		0x3F174D22, 0x488FEF31, 0xD1270805, 0xA6BFAA16,
-		0xE376C76C, 0x94EE657F, 0x0D46824B, 0x7ADE2058,
-		0xF9FAC3FB, 0x8E6261E8, 0x17CA86DC, 0x605224CF,
-		0x259B49B5, 0x5203EBA6, 0xCBAB0C92, 0xBC33AE81,
-		0x4139D766, 0x36A17575, 0xAF099241, 0xD8913052,
-		0x9D585D28, 0xEAC0FF3B, 0x7368180F, 0x04F0BA1C,
-		0x780B06C4, 0x0F93A4D7, 0x963B43E3, 0xE1A3E1F0,
-		0xA46A8C8A, 0xD3F22E99, 0x4A5AC9AD, 0x3DC26BBE,
-		0xC0C81259, 0xB750B04A, 0x2EF8577E, 0x5960F56D,
-		0x1CA99817, 0x6B313A04, 0xF299DD30, 0x85017F23,
-		0xFB194884, 0x8C81EA97, 0x15290DA3, 0x62B1AFB0,
-		0x2778C2CA, 0x50E060D9, 0xC94887ED, 0xBED025FE,
-		0x43DA5C19, 0x3442FE0A, 0xADEA193E, 0xDA72BB2D,
-		0x9FBBD657, 0xE8237444, 0x718B9370, 0x06133163,
-		0x7AE88DBB, 0x0D702FA8, 0x94D8C89C, 0xE3406A8F,
-		0xA68907F5, 0xD111A5E6, 0x48B942D2, 0x3F21E0C1,
-		0xC22B9926, 0xB5B33B35, 0x2C1BDC01, 0x5B837E12,
-		0x1E4A1368, 0x69D2B17B, 0xF07A564F, 0x87E2F45C,
-		0xFD3CD404, 0x8AA47617, 0x130C9123, 0x64943330,
-		0x215D5E4A, 0x56C5FC59, 0xCF6D1B6D, 0xB8F5B97E,
-		0x45FFC099, 0x3267628A, 0xABCF85BE, 0xDC5727AD,
-		0x999E4AD7, 0xEE06E8C4, 0x77AE0FF0, 0x0036ADE3,
-		0x7CCD113B, 0x0B55B328, 0x92FD541C, 0xE565F60F,
-		0xA0AC9B75, 0xD7343966, 0x4E9CDE52, 0x39047C41,
-		0xC40E05A6, 0xB396A7B5, 0x2A3E4081, 0x5DA6E292,
-		0x186F8FE8, 0x6FF72DFB, 0xF65FCACF, 0x81C768DC,
-		0xFFDF5F7B, 0x8847FD68, 0x11EF1A5C, 0x6677B84F,
-		0x23BED535, 0x54267726, 0xCD8E9012, 0xBA163201,
-		0x471C4BE6, 0x3084E9F5, 0xA92C0EC1, 0xDEB4ACD2,
-		0x9B7DC1A8, 0xECE563BB, 0x754D848F, 0x02D5269C,
-		0x7E2E9A44, 0x09B63857, 0x901EDF63, 0xE7867D70,
-		0xA24F100A, 0xD5D7B219, 0x4C7F552D, 0x3BE7F73E,
-		0xC6ED8ED9, 0xB1752CCA, 0x28DDCBFE, 0x5F4569ED,
-		0x1A8C0497, 0x6D14A684, 0xF4BC41B0, 0x8324E3A3,
-	},
-	{
-		0x00000000, 0x7E9241A5, 0x0D526F4F, 0x73C02EEA,
-		0x1AA4DE9E, 0x64369F3B, 0x17F6B1D1, 0x6964F074,
-		0xC53E5138, 0xBBAC109D, 0xC86C3E77, 0xB6FE7FD2,
-		0xDF9A8FA6, 0xA108CE03, 0xD2C8E0E9, 0xAC5AA14C,
-		0x8A7DA270, 0xF4EFE3D5, 0x872FCD3F, 0xF9BD8C9A,
-		0x90D97CEE, 0xEE4B3D4B, 0x9D8B13A1, 0xE3195204,
-		0x4F43F348, 0x31D1B2ED, 0x42119C07, 0x3C83DDA2,
-		0x55E72DD6, 0x2B756C73, 0x58B54299, 0x2627033C,
-		0x14FB44E1, 0x6A690544, 0x19A92BAE, 0x673B6A0B,
-		0x0E5F9A7F, 0x70CDDBDA, 0x030DF530, 0x7D9FB495,
-		0xD1C515D9, 0xAF57547C, 0xDC977A96, 0xA2053B33,
-		0xCB61CB47, 0xB5F38AE2, 0xC633A408, 0xB8A1E5AD,
-		0x9E86E691, 0xE014A734, 0x93D489DE, 0xED46C87B,
-		0x8422380F, 0xFAB079AA, 0x89705740, 0xF7E216E5,
-		0x5BB8B7A9, 0x252AF60C, 0x56EAD8E6, 0x28789943,
-		0x411C6937, 0x3F8E2892, 0x4C4E0678, 0x32DC47DD,
-		0xD98065C7, 0xA7122462, 0xD4D20A88, 0xAA404B2D,
-		0xC324BB59, 0xBDB6FAFC, 0xCE76D416, 0xB0E495B3,
-		0x1CBE34FF, 0x622C755A, 0x11EC5BB0, 0x6F7E1A15,
-		0x061AEA61, 0x7888ABC4, 0x0B48852E, 0x75DAC48B,
-		0x53FDC7B7, 0x2D6F8612, 0x5EAFA8F8, 0x203DE95D,
-		0x49591929, 0x37CB588C, 0x440B7666, 0x3A9937C3,
-		0x96C3968F, 0xE851D72A, 0x9B91F9C0, 0xE503B865,
-		0x8C674811, 0xF2F509B4, 0x8135275E, 0xFFA766FB,
-		0xCD7B2126, 0xB3E96083, 0xC0294E69, 0xBEBB0FCC,
-		0xD7DFFFB8, 0xA94DBE1D, 0xDA8D90F7, 0xA41FD152,
-		0x0845701E, 0x76D731BB, 0x05171F51, 0x7B855EF4,
-		0x12E1AE80, 0x6C73EF25, 0x1FB3C1CF, 0x6121806A,
-		0x47068356, 0x3994C2F3, 0x4A54EC19, 0x34C6ADBC,
-		0x5DA25DC8, 0x23301C6D, 0x50F03287, 0x2E627322,
-		0x8238D26E, 0xFCAA93CB, 0x8F6ABD21, 0xF1F8FC84,
-		0x989C0CF0, 0xE60E4D55, 0x95CE63BF, 0xEB5C221A,
-		0x4377278B, 0x3DE5662E, 0x4E2548C4, 0x30B70961,
-		0x59D3F915, 0x2741B8B0, 0x5481965A, 0x2A13D7FF,
-		0x864976B3, 0xF8DB3716, 0x8B1B19FC, 0xF5895859,
-		0x9CEDA82D, 0xE27FE988, 0x91BFC762, 0xEF2D86C7,
-		0xC90A85FB, 0xB798C45E, 0xC458EAB4, 0xBACAAB11,
-		0xD3AE5B65, 0xAD3C1AC0, 0xDEFC342A, 0xA06E758F,
-		0x0C34D4C3, 0x72A69566, 0x0166BB8C, 0x7FF4FA29,
-		0x16900A5D, 0x68024BF8, 0x1BC26512, 0x655024B7,
-		0x578C636A, 0x291E22CF, 0x5ADE0C25, 0x244C4D80,
-		0x4D28BDF4, 0x33BAFC51, 0x407AD2BB, 0x3EE8931E,
-		0x92B23252, 0xEC2073F7, 0x9FE05D1D, 0xE1721CB8,
-		0x8816ECCC, 0xF684AD69, 0x85448383, 0xFBD6C226,
-		0xDDF1C11A, 0xA36380BF, 0xD0A3AE55, 0xAE31EFF0,
-		0xC7551F84, 0xB9C75E21, 0xCA0770CB, 0xB495316E,
-		0x18CF9022, 0x665DD187, 0x159DFF6D, 0x6B0FBEC8,
-		0x026B4EBC, 0x7CF90F19, 0x0F3921F3, 0x71AB6056,
-		0x9AF7424C, 0xE46503E9, 0x97A52D03, 0xE9376CA6,
-		0x80539CD2, 0xFEC1DD77, 0x8D01F39D, 0xF393B238,
-		0x5FC91374, 0x215B52D1, 0x529B7C3B, 0x2C093D9E,
-		0x456DCDEA, 0x3BFF8C4F, 0x483FA2A5, 0x36ADE300,
-		0x108AE03C, 0x6E18A199, 0x1DD88F73, 0x634ACED6,
-		0x0A2E3EA2, 0x74BC7F07, 0x077C51ED, 0x79EE1048,
-		0xD5B4B104, 0xAB26F0A1, 0xD8E6DE4B, 0xA6749FEE,
-		0xCF106F9A, 0xB1822E3F, 0xC24200D5, 0xBCD04170,
-		0x8E0C06AD, 0xF09E4708, 0x835E69E2, 0xFDCC2847,
-		0x94A8D833, 0xEA3A9996, 0x99FAB77C, 0xE768F6D9,
-		0x4B325795, 0x35A01630, 0x466038DA, 0x38F2797F,
-		0x5196890B, 0x2F04C8AE, 0x5CC4E644, 0x2256A7E1,
-		0x0471A4DD, 0x7AE3E578, 0x0923CB92, 0x77B18A37,
-		0x1ED57A43, 0x60473BE6, 0x1387150C, 0x6D1554A9,
-		0xC14FF5E5, 0xBFDDB440, 0xCC1D9AAA, 0xB28FDB0F,
-		0xDBEB2B7B, 0xA5796ADE, 0xD6B94434, 0xA82B0591,
-	},
-	{
-		0x00000000, 0xB8AA45DD, 0x812367BF, 0x39892262,
-		0xF331227B, 0x4B9B67A6, 0x721245C4, 0xCAB80019,
-		0xE66344F6, 0x5EC9012B, 0x67402349, 0xDFEA6694,
-		0x1552668D, 0xADF82350, 0x94710132, 0x2CDB44EF,
-		0x3DB164E9, 0x851B2134, 0xBC920356, 0x0438468B,
-		0xCE804692, 0x762A034F, 0x4FA3212D, 0xF70964F0,
-		0xDBD2201F, 0x637865C2, 0x5AF147A0, 0xE25B027D,
-		0x28E30264, 0x904947B9, 0xA9C065DB, 0x116A2006,
-		0x8B1425D7, 0x33BE600A, 0x0A374268, 0xB29D07B5,
-		0x782507AC, 0xC08F4271, 0xF9066013, 0x41AC25CE,
-		0x6D776121, 0xD5DD24FC, 0xEC54069E, 0x54FE4343,
-		0x9E46435A, 0x26EC0687, 0x1F6524E5, 0xA7CF6138,
-		0xB6A5413E, 0x0E0F04E3, 0x37862681, 0x8F2C635C,
-		0x45946345, 0xFD3E2698, 0xC4B704FA, 0x7C1D4127,
-		0x50C605C8, 0xE86C4015, 0xD1E56277, 0x694F27AA,
-		0xA3F727B3, 0x1B5D626E, 0x22D4400C, 0x9A7E05D1,
-		0xE75FA6AB, 0x5FF5E376, 0x667CC114, 0xDED684C9,
-		0x146E84D0, 0xACC4C10D, 0x954DE36F, 0x2DE7A6B2,
-		0x013CE25D, 0xB996A780, 0x801F85E2, 0x38B5C03F,
-		0xF20DC026, 0x4AA785FB, 0x732EA799, 0xCB84E244,
-		0xDAEEC242, 0x6244879F, 0x5BCDA5FD, 0xE367E020,
-		0x29DFE039, 0x9175A5E4, 0xA8FC8786, 0x1056C25B,
-		0x3C8D86B4, 0x8427C369, 0xBDAEE10B, 0x0504A4D6,
-		0xCFBCA4CF, 0x7716E112, 0x4E9FC370, 0xF63586AD,
-		0x6C4B837C, 0xD4E1C6A1, 0xED68E4C3, 0x55C2A11E,
-		0x9F7AA107, 0x27D0E4DA, 0x1E59C6B8, 0xA6F38365,
-		0x8A28C78A, 0x32828257, 0x0B0BA035, 0xB3A1E5E8,
-		0x7919E5F1, 0xC1B3A02C, 0xF83A824E, 0x4090C793,
-		0x51FAE795, 0xE950A248, 0xD0D9802A, 0x6873C5F7,
-		0xA2CBC5EE, 0x1A618033, 0x23E8A251, 0x9B42E78C,
-		0xB799A363, 0x0F33E6BE, 0x36BAC4DC, 0x8E108101,
-		0x44A88118, 0xFC02C4C5, 0xC58BE6A7, 0x7D21A37A,
-		0x3FC9A052, 0x8763E58F, 0xBEEAC7ED, 0x06408230,
-		0xCCF88229, 0x7452C7F4, 0x4DDBE596, 0xF571A04B,
-		0xD9AAE4A4, 0x6100A179, 0x5889831B, 0xE023C6C6,
-		0x2A9BC6DF, 0x92318302, 0xABB8A160, 0x1312E4BD,
-		0x0278C4BB, 0xBAD28166, 0x835BA304, 0x3BF1E6D9,
-		0xF149E6C0, 0x49E3A31D, 0x706A817F, 0xC8C0C4A2,
-		0xE41B804D, 0x5CB1C590, 0x6538E7F2, 0xDD92A22F,
-		0x172AA236, 0xAF80E7EB, 0x9609C589, 0x2EA38054,
-		0xB4DD8585, 0x0C77C058, 0x35FEE23A, 0x8D54A7E7,
-		0x47ECA7FE, 0xFF46E223, 0xC6CFC041, 0x7E65859C,
-		0x52BEC173, 0xEA1484AE, 0xD39DA6CC, 0x6B37E311,
-		0xA18FE308, 0x1925A6D5, 0x20AC84B7, 0x9806C16A,
-		0x896CE16C, 0x31C6A4B1, 0x084F86D3, 0xB0E5C30E,
-		0x7A5DC317, 0xC2F786CA, 0xFB7EA4A8, 0x43D4E175,
-		0x6F0FA59A, 0xD7A5E047, 0xEE2CC225, 0x568687F8,
-		0x9C3E87E1, 0x2494C23C, 0x1D1DE05E, 0xA5B7A583,
-		0xD89606F9, 0x603C4324, 0x59B56146, 0xE11F249B,
-		0x2BA72482, 0x930D615F, 0xAA84433D, 0x122E06E0,
-		0x3EF5420F, 0x865F07D2, 0xBFD625B0, 0x077C606D,
-		0xCDC46074, 0x756E25A9, 0x4CE707CB, 0xF44D4216,
-		0xE5276210, 0x5D8D27CD, 0x640405AF, 0xDCAE4072,
-		0x1616406B, 0xAEBC05B6, 0x973527D4, 0x2F9F6209,
-		0x034426E6, 0xBBEE633B, 0x82674159, 0x3ACD0484,
-		0xF075049D, 0x48DF4140, 0x71566322, 0xC9FC26FF,
-		0x5382232E, 0xEB2866F3, 0xD2A14491, 0x6A0B014C,
-		0xA0B30155, 0x18194488, 0x219066EA, 0x993A2337,
-		0xB5E167D8, 0x0D4B2205, 0x34C20067, 0x8C6845BA,
-		0x46D045A3, 0xFE7A007E, 0xC7F3221C, 0x7F5967C1,
-		0x6E3347C7, 0xD699021A, 0xEF102078, 0x57BA65A5,
-		0x9D0265BC, 0x25A82061, 0x1C210203, 0xA48B47DE,
-		0x88500331, 0x30FA46EC, 0x0973648E, 0xB1D92153,
-		0x7B61214A, 0xC3CB6497, 0xFA4246F5, 0x42E80328,
-	},
-	{
-		0x00000000, 0xAC6F1138, 0x58DF2270, 0xF4B03348,
-		0xB0BE45E0, 0x1CD154D8, 0xE8616790, 0x440E76A8,
-		0x910B67C5, 0x3D6476FD, 0xC9D445B5, 0x65BB548D,
-		0x21B52225, 0x8DDA331D, 0x796A0055, 0xD505116D,
-		0xD361228F, 0x7F0E33B7, 0x8BBE00FF, 0x27D111C7,
-		0x63DF676F, 0xCFB07657, 0x3B00451F, 0x976F5427,
-		0x426A454A, 0xEE055472, 0x1AB5673A, 0xB6DA7602,
-		0xF2D400AA, 0x5EBB1192, 0xAA0B22DA, 0x066433E2,
-		0x57B5A81B, 0xFBDAB923, 0x0F6A8A6B, 0xA3059B53,
-		0xE70BEDFB, 0x4B64FCC3, 0xBFD4CF8B, 0x13BBDEB3,
-		0xC6BECFDE, 0x6AD1DEE6, 0x9E61EDAE, 0x320EFC96,
-		0x76008A3E, 0xDA6F9B06, 0x2EDFA84E, 0x82B0B976,
-		0x84D48A94, 0x28BB9BAC, 0xDC0BA8E4, 0x7064B9DC,
-		0x346ACF74, 0x9805DE4C, 0x6CB5ED04, 0xC0DAFC3C,
-		0x15DFED51, 0xB9B0FC69, 0x4D00CF21, 0xE16FDE19,
-		0xA561A8B1, 0x090EB989, 0xFDBE8AC1, 0x51D19BF9,
-		0xAE6A5137, 0x0205400F, 0xF6B57347, 0x5ADA627F,
-		0x1ED414D7, 0xB2BB05EF, 0x460B36A7, 0xEA64279F,
-		0x3F6136F2, 0x930E27CA, 0x67BE1482, 0xCBD105BA,
-		0x8FDF7312, 0x23B0622A, 0xD7005162, 0x7B6F405A,
-		0x7D0B73B8, 0xD1646280, 0x25D451C8, 0x89BB40F0,
-		0xCDB53658, 0x61DA2760, 0x956A1428, 0x39050510,
-		0xEC00147D, 0x406F0545, 0xB4DF360D, 0x18B02735,
-		0x5CBE519D, 0xF0D140A5, 0x046173ED, 0xA80E62D5,
-		0xF9DFF92C, 0x55B0E814, 0xA100DB5C, 0x0D6FCA64,
-		0x4961BCCC, 0xE50EADF4, 0x11BE9EBC, 0xBDD18F84,
-		0x68D49EE9, 0xC4BB8FD1, 0x300BBC99, 0x9C64ADA1,
-		0xD86ADB09, 0x7405CA31, 0x80B5F979, 0x2CDAE841,
-		0x2ABEDBA3, 0x86D1CA9B, 0x7261F9D3, 0xDE0EE8EB,
-		0x9A009E43, 0x366F8F7B, 0xC2DFBC33, 0x6EB0AD0B,
-		0xBBB5BC66, 0x17DAAD5E, 0xE36A9E16, 0x4F058F2E,
-		0x0B0BF986, 0xA764E8BE, 0x53D4DBF6, 0xFFBBCACE,
-		0x5CD5A26E, 0xF0BAB356, 0x040A801E, 0xA8659126,
-		0xEC6BE78E, 0x4004F6B6, 0xB4B4C5FE, 0x18DBD4C6,
-		0xCDDEC5AB, 0x61B1D493, 0x9501E7DB, 0x396EF6E3,
-		0x7D60804B, 0xD10F9173, 0x25BFA23B, 0x89D0B303,
-		0x8FB480E1, 0x23DB91D9, 0xD76BA291, 0x7B04B3A9,
-		0x3F0AC501, 0x9365D439, 0x67D5E771, 0xCBBAF649,
-		0x1EBFE724, 0xB2D0F61C, 0x4660C554, 0xEA0FD46C,
-		0xAE01A2C4, 0x026EB3FC, 0xF6DE80B4, 0x5AB1918C,
-		0x0B600A75, 0xA70F1B4D, 0x53BF2805, 0xFFD0393D,
-		0xBBDE4F95, 0x17B15EAD, 0xE3016DE5, 0x4F6E7CDD,
-		0x9A6B6DB0, 0x36047C88, 0xC2B44FC0, 0x6EDB5EF8,
-		0x2AD52850, 0x86BA3968, 0x720A0A20, 0xDE651B18,
-		0xD80128FA, 0x746E39C2, 0x80DE0A8A, 0x2CB11BB2,
-		0x68BF6D1A, 0xC4D07C22, 0x30604F6A, 0x9C0F5E52,
-		0x490A4F3F, 0xE5655E07, 0x11D56D4F, 0xBDBA7C77,
-		0xF9B40ADF, 0x55DB1BE7, 0xA16B28AF, 0x0D043997,
-		0xF2BFF359, 0x5ED0E261, 0xAA60D129, 0x060FC011,
-		0x4201B6B9, 0xEE6EA781, 0x1ADE94C9, 0xB6B185F1,
-		0x63B4949C, 0xCFDB85A4, 0x3B6BB6EC, 0x9704A7D4,
-		0xD30AD17C, 0x7F65C044, 0x8BD5F30C, 0x27BAE234,
-		0x21DED1D6, 0x8DB1C0EE, 0x7901F3A6, 0xD56EE29E,
-		0x91609436, 0x3D0F850E, 0xC9BFB646, 0x65D0A77E,
-		0xB0D5B613, 0x1CBAA72B, 0xE80A9463, 0x4465855B,
-		0x006BF3F3, 0xAC04E2CB, 0x58B4D183, 0xF4DBC0BB,
-		0xA50A5B42, 0x09654A7A, 0xFDD57932, 0x51BA680A,
-		0x15B41EA2, 0xB9DB0F9A, 0x4D6B3CD2, 0xE1042DEA,
-		0x34013C87, 0x986E2DBF, 0x6CDE1EF7, 0xC0B10FCF,
-		0x84BF7967, 0x28D0685F, 0xDC605B17, 0x700F4A2F,
-		0x766B79CD, 0xDA0468F5, 0x2EB45BBD, 0x82DB4A85,
-		0xC6D53C2D, 0x6ABA2D15, 0x9E0A1E5D, 0x32650F65,
-		0xE7601E08, 0x4B0F0F30, 0xBFBF3C78, 0x13D02D40,
-		0x57DE5BE8, 0xFBB14AD0, 0x0F017998, 0xA36E68A0,
-	},
-	{
-		0x00000000, 0x196B30EF, 0xC3A08CDB, 0xDACBBC34,
-		0x7737F5B2, 0x6E5CC55D, 0xB4977969, 0xADFC4986,
-		0x1F180660, 0x0673368F, 0xDCB88ABB, 0xC5D3BA54,
-		0x682FF3D2, 0x7144C33D, 0xAB8F7F09, 0xB2E44FE6,
-		0x3E300CC0, 0x275B3C2F, 0xFD90801B, 0xE4FBB0F4,
-		0x4907F972, 0x506CC99D, 0x8AA775A9, 0x93CC4546,
-		0x21280AA0, 0x38433A4F, 0xE288867B, 0xFBE3B694,
-		0x561FFF12, 0x4F74CFFD, 0x95BF73C9, 0x8CD44326,
-		0x8D16F485, 0x947DC46A, 0x4EB6785E, 0x57DD48B1,
-		0xFA210137, 0xE34A31D8, 0x39818DEC, 0x20EABD03,
-		0x920EF2E5, 0x8B65C20A, 0x51AE7E3E, 0x48C54ED1,
-		0xE5390757, 0xFC5237B8, 0x26998B8C, 0x3FF2BB63,
-		0xB326F845, 0xAA4DC8AA, 0x7086749E, 0x69ED4471,
-		0xC4110DF7, 0xDD7A3D18, 0x07B1812C, 0x1EDAB1C3,
-		0xAC3EFE25, 0xB555CECA, 0x6F9E72FE, 0x76F54211,
-		0xDB090B97, 0xC2623B78, 0x18A9874C, 0x01C2B7A3,
-		0xEB5B040E, 0xF23034E1, 0x28FB88D5, 0x3190B83A,
-		0x9C6CF1BC, 0x8507C153, 0x5FCC7D67, 0x46A74D88,
-		0xF443026E, 0xED283281, 0x37E38EB5, 0x2E88BE5A,
-		0x8374F7DC, 0x9A1FC733, 0x40D47B07, 0x59BF4BE8,
-		0xD56B08CE, 0xCC003821, 0x16CB8415, 0x0FA0B4FA,
-		0xA25CFD7C, 0xBB37CD93, 0x61FC71A7, 0x78974148,
-		0xCA730EAE, 0xD3183E41, 0x09D38275, 0x10B8B29A,
-		0xBD44FB1C, 0xA42FCBF3, 0x7EE477C7, 0x678F4728,
-		0x664DF08B, 0x7F26C064, 0xA5ED7C50, 0xBC864CBF,
-		0x117A0539, 0x081135D6, 0xD2DA89E2, 0xCBB1B90D,
-		0x7955F6EB, 0x603EC604, 0xBAF57A30, 0xA39E4ADF,
-		0x0E620359, 0x170933B6, 0xCDC28F82, 0xD4A9BF6D,
-		0x587DFC4B, 0x4116CCA4, 0x9BDD7090, 0x82B6407F,
-		0x2F4A09F9, 0x36213916, 0xECEA8522, 0xF581B5CD,
-		0x4765FA2B, 0x5E0ECAC4, 0x84C576F0, 0x9DAE461F,
-		0x30520F99, 0x29393F76, 0xF3F28342, 0xEA99B3AD,
-		0xD6B7081C, 0xCFDC38F3, 0x151784C7, 0x0C7CB428,
-		0xA180FDAE, 0xB8EBCD41, 0x62207175, 0x7B4B419A,
-		0xC9AF0E7C, 0xD0C43E93, 0x0A0F82A7, 0x1364B248,
-		0xBE98FBCE, 0xA7F3CB21, 0x7D387715, 0x645347FA,
-		0xE88704DC, 0xF1EC3433, 0x2B278807, 0x324CB8E8,
-		0x9FB0F16E, 0x86DBC181, 0x5C107DB5, 0x457B4D5A,
-		0xF79F02BC, 0xEEF43253, 0x343F8E67, 0x2D54BE88,
-		0x80A8F70E, 0x99C3C7E1, 0x43087BD5, 0x5A634B3A,
-		0x5BA1FC99, 0x42CACC76, 0x98017042, 0x816A40AD,
-		0x2C96092B, 0x35FD39C4, 0xEF3685F0, 0xF65DB51F,
-		0x44B9FAF9, 0x5DD2CA16, 0x87197622, 0x9E7246CD,
-		0x338E0F4B, 0x2AE53FA4, 0xF02E8390, 0xE945B37F,
-		0x6591F059, 0x7CFAC0B6, 0xA6317C82, 0xBF5A4C6D,
-		0x12A605EB, 0x0BCD3504, 0xD1068930, 0xC86DB9DF,
-		0x7A89F639, 0x63E2C6D6, 0xB9297AE2, 0xA0424A0D,
-		0x0DBE038B, 0x14D53364, 0xCE1E8F50, 0xD775BFBF,
-		0x3DEC0C12, 0x24873CFD, 0xFE4C80C9, 0xE727B026,
-		0x4ADBF9A0, 0x53B0C94F, 0x897B757B, 0x90104594,
-		0x22F40A72, 0x3B9F3A9D, 0xE15486A9, 0xF83FB646,
-		0x55C3FFC0, 0x4CA8CF2F, 0x9663731B, 0x8F0843F4,
-		0x03DC00D2, 0x1AB7303D, 0xC07C8C09, 0xD917BCE6,
-		0x74EBF560, 0x6D80C58F, 0xB74B79BB, 0xAE204954,
-		0x1CC406B2, 0x05AF365D, 0xDF648A69, 0xC60FBA86,
-		0x6BF3F300, 0x7298C3EF, 0xA8537FDB, 0xB1384F34,
-		0xB0FAF897, 0xA991C878, 0x735A744C, 0x6A3144A3,
-		0xC7CD0D25, 0xDEA63DCA, 0x046D81FE, 0x1D06B111,
-		0xAFE2FEF7, 0xB689CE18, 0x6C42722C, 0x752942C3,
-		0xD8D50B45, 0xC1BE3BAA, 0x1B75879E, 0x021EB771,
-		0x8ECAF457, 0x97A1C4B8, 0x4D6A788C, 0x54014863,
-		0xF9FD01E5, 0xE096310A, 0x3A5D8D3E, 0x2336BDD1,
-		0x91D2F237, 0x88B9C2D8, 0x52727EEC, 0x4B194E03,
-		0xE6E50785, 0xFF8E376A, 0x25458B5E, 0x3C2EBBB1,
-	},
-	{
-		0x00000000, 0xC82C0368, 0x905906D0, 0x587505B8,
-		0xD1C5E0A5, 0x19E9E3CD, 0x419CE675, 0x89B0E51D,
-		0x53FD2D4E, 0x9BD12E26, 0xC3A42B9E, 0x0B8828F6,
-		0x8238CDEB, 0x4A14CE83, 0x1261CB3B, 0xDA4DC853,
-		0xA6FA5B9C, 0x6ED658F4, 0x36A35D4C, 0xFE8F5E24,
-		0x773FBB39, 0xBF13B851, 0xE766BDE9, 0x2F4ABE81,
-		0xF50776D2, 0x3D2B75BA, 0x655E7002, 0xAD72736A,
-		0x24C29677, 0xECEE951F, 0xB49B90A7, 0x7CB793CF,
-		0xBD835B3D, 0x75AF5855, 0x2DDA5DED, 0xE5F65E85,
-		0x6C46BB98, 0xA46AB8F0, 0xFC1FBD48, 0x3433BE20,
-		0xEE7E7673, 0x2652751B, 0x7E2770A3, 0xB60B73CB,
-		0x3FBB96D6, 0xF79795BE, 0xAFE29006, 0x67CE936E,
-		0x1B7900A1, 0xD35503C9, 0x8B200671, 0x430C0519,
-		0xCABCE004, 0x0290E36C, 0x5AE5E6D4, 0x92C9E5BC,
-		0x48842DEF, 0x80A82E87, 0xD8DD2B3F, 0x10F12857,
-		0x9941CD4A, 0x516DCE22, 0x0918CB9A, 0xC134C8F2,
-		0x7A07B77A, 0xB22BB412, 0xEA5EB1AA, 0x2272B2C2,
-		0xABC257DF, 0x63EE54B7, 0x3B9B510F, 0xF3B75267,
-		0x29FA9A34, 0xE1D6995C, 0xB9A39CE4, 0x718F9F8C,
-		0xF83F7A91, 0x301379F9, 0x68667C41, 0xA04A7F29,
-		0xDCFDECE6, 0x14D1EF8E, 0x4CA4EA36, 0x8488E95E,
-		0x0D380C43, 0xC5140F2B, 0x9D610A93, 0x554D09FB,
-		0x8F00C1A8, 0x472CC2C0, 0x1F59C778, 0xD775C410,
-		0x5EC5210D, 0x96E92265, 0xCE9C27DD, 0x06B024B5,
-		0xC784EC47, 0x0FA8EF2F, 0x57DDEA97, 0x9FF1E9FF,
-		0x16410CE2, 0xDE6D0F8A, 0x86180A32, 0x4E34095A,
-		0x9479C109, 0x5C55C261, 0x0420C7D9, 0xCC0CC4B1,
-		0x45BC21AC, 0x8D9022C4, 0xD5E5277C, 0x1DC92414,
-		0x617EB7DB, 0xA952B4B3, 0xF127B10B, 0x390BB263,
-		0xB0BB577E, 0x78975416, 0x20E251AE, 0xE8CE52C6,
-		0x32839A95, 0xFAAF99FD, 0xA2DA9C45, 0x6AF69F2D,
-		0xE3467A30, 0x2B6A7958, 0x731F7CE0, 0xBB337F88,
-		0xF40E6EF5, 0x3C226D9D, 0x64576825, 0xAC7B6B4D,
-		0x25CB8E50, 0xEDE78D38, 0xB5928880, 0x7DBE8BE8,
-		0xA7F343BB, 0x6FDF40D3, 0x37AA456B, 0xFF864603,
-		0x7636A31E, 0xBE1AA076, 0xE66FA5CE, 0x2E43A6A6,
-		0x52F43569, 0x9AD83601, 0xC2AD33B9, 0x0A8130D1,
-		0x8331D5CC, 0x4B1DD6A4, 0x1368D31C, 0xDB44D074,
-		0x01091827, 0xC9251B4F, 0x91501EF7, 0x597C1D9F,
-		0xD0CCF882, 0x18E0FBEA, 0x4095FE52, 0x88B9FD3A,
-		0x498D35C8, 0x81A136A0, 0xD9D43318, 0x11F83070,
-		0x9848D56D, 0x5064D605, 0x0811D3BD, 0xC03DD0D5,
-		0x1A701886, 0xD25C1BEE, 0x8A291E56, 0x42051D3E,
-		0xCBB5F823, 0x0399FB4B, 0x5BECFEF3, 0x93C0FD9B,
-		0xEF776E54, 0x275B6D3C, 0x7F2E6884, 0xB7026BEC,
-		0x3EB28EF1, 0xF69E8D99, 0xAEEB8821, 0x66C78B49,
-		0xBC8A431A, 0x74A64072, 0x2CD345CA, 0xE4FF46A2,
-		0x6D4FA3BF, 0xA563A0D7, 0xFD16A56F, 0x353AA607,
-		0x8E09D98F, 0x4625DAE7, 0x1E50DF5F, 0xD67CDC37,
-		0x5FCC392A, 0x97E03A42, 0xCF953FFA, 0x07B93C92,
-		0xDDF4F4C1, 0x15D8F7A9, 0x4DADF211, 0x8581F179,
-		0x0C311464, 0xC41D170C, 0x9C6812B4, 0x544411DC,
-		0x28F38213, 0xE0DF817B, 0xB8AA84C3, 0x708687AB,
-		0xF93662B6, 0x311A61DE, 0x696F6466, 0xA143670E,
-		0x7B0EAF5D, 0xB322AC35, 0xEB57A98D, 0x237BAAE5,
-		0xAACB4FF8, 0x62E74C90, 0x3A924928, 0xF2BE4A40,
-		0x338A82B2, 0xFBA681DA, 0xA3D38462, 0x6BFF870A,
-		0xE24F6217, 0x2A63617F, 0x721664C7, 0xBA3A67AF,
-		0x6077AFFC, 0xA85BAC94, 0xF02EA92C, 0x3802AA44,
-		0xB1B24F59, 0x799E4C31, 0x21EB4989, 0xE9C74AE1,
-		0x9570D92E, 0x5D5CDA46, 0x0529DFFE, 0xCD05DC96,
-		0x44B5398B, 0x8C993AE3, 0xD4EC3F5B, 0x1CC03C33,
-		0xC68DF460, 0x0EA1F708, 0x56D4F2B0, 0x9EF8F1D8,
-		0x174814C5, 0xDF6417AD, 0x87111215, 0x4F3D117D,
-	},
-	{
-		0x00000000, 0x277D3C49, 0x4EFA7892, 0x698744DB,
-		0x6D821D21, 0x4AFF2168, 0x237865B3, 0x040559FA,
-		0xDA043B42, 0xFD79070B, 0x94FE43D0, 0xB3837F99,
-		0xB7862663, 0x90FB1A2A, 0xF97C5EF1, 0xDE0162B8,
-		0xB4097684, 0x93744ACD, 0xFAF30E16, 0xDD8E325F,
-		0xD98B6BA5, 0xFEF657EC, 0x97711337, 0xB00C2F7E,
-		0x6E0D4DC6, 0x4970718F, 0x20F73554, 0x078A091D,
-		0x038F50E7, 0x24F26CAE, 0x4D752875, 0x6A08143C,
-		0x9965000D, 0xBE183C44, 0xD79F789F, 0xF0E244D6,
-		0xF4E71D2C, 0xD39A2165, 0xBA1D65BE, 0x9D6059F7,
-		0x43613B4F, 0x641C0706, 0x0D9B43DD, 0x2AE67F94,
-		0x2EE3266E, 0x099E1A27, 0x60195EFC, 0x476462B5,
-		0x2D6C7689, 0x0A114AC0, 0x63960E1B, 0x44EB3252,
-		0x40EE6BA8, 0x679357E1, 0x0E14133A, 0x29692F73,
-		0xF7684DCB, 0xD0157182, 0xB9923559, 0x9EEF0910,
-		0x9AEA50EA, 0xBD976CA3, 0xD4102878, 0xF36D1431,
-		0x32CB001A, 0x15B63C53, 0x7C317888, 0x5B4C44C1,
-		0x5F491D3B, 0x78342172, 0x11B365A9, 0x36CE59E0,
-		0xE8CF3B58, 0xCFB20711, 0xA63543CA, 0x81487F83,
-		0x854D2679, 0xA2301A30, 0xCBB75EEB, 0xECCA62A2,
-		0x86C2769E, 0xA1BF4AD7, 0xC8380E0C, 0xEF453245,
-		0xEB406BBF, 0xCC3D57F6, 0xA5BA132D, 0x82C72F64,
-		0x5CC64DDC, 0x7BBB7195, 0x123C354E, 0x35410907,
-		0x314450FD, 0x16396CB4, 0x7FBE286F, 0x58C31426,
-		0xABAE0017, 0x8CD33C5E, 0xE5547885, 0xC22944CC,
-		0xC62C1D36, 0xE151217F, 0x88D665A4, 0xAFAB59ED,
-		0x71AA3B55, 0x56D7071C, 0x3F5043C7, 0x182D7F8E,
-		0x1C282674, 0x3B551A3D, 0x52D25EE6, 0x75AF62AF,
-		0x1FA77693, 0x38DA4ADA, 0x515D0E01, 0x76203248,
-		0x72256BB2, 0x555857FB, 0x3CDF1320, 0x1BA22F69,
-		0xC5A34DD1, 0xE2DE7198, 0x8B593543, 0xAC24090A,
-		0xA82150F0, 0x8F5C6CB9, 0xE6DB2862, 0xC1A6142B,
-		0x64960134, 0x43EB3D7D, 0x2A6C79A6, 0x0D1145EF,
-		0x09141C15, 0x2E69205C, 0x47EE6487, 0x609358CE,
-		0xBE923A76, 0x99EF063F, 0xF06842E4, 0xD7157EAD,
-		0xD3102757, 0xF46D1B1E, 0x9DEA5FC5, 0xBA97638C,
-		0xD09F77B0, 0xF7E24BF9, 0x9E650F22, 0xB918336B,
-		0xBD1D6A91, 0x9A6056D8, 0xF3E71203, 0xD49A2E4A,
-		0x0A9B4CF2, 0x2DE670BB, 0x44613460, 0x631C0829,
-		0x671951D3, 0x40646D9A, 0x29E32941, 0x0E9E1508,
-		0xFDF30139, 0xDA8E3D70, 0xB30979AB, 0x947445E2,
-		0x90711C18, 0xB70C2051, 0xDE8B648A, 0xF9F658C3,
-		0x27F73A7B, 0x008A0632, 0x690D42E9, 0x4E707EA0,
-		0x4A75275A, 0x6D081B13, 0x048F5FC8, 0x23F26381,
-		0x49FA77BD, 0x6E874BF4, 0x07000F2F, 0x207D3366,
-		0x24786A9C, 0x030556D5, 0x6A82120E, 0x4DFF2E47,
-		0x93FE4CFF, 0xB48370B6, 0xDD04346D, 0xFA790824,
-		0xFE7C51DE, 0xD9016D97, 0xB086294C, 0x97FB1505,
-		0x565D012E, 0x71203D67, 0x18A779BC, 0x3FDA45F5,
-		0x3BDF1C0F, 0x1CA22046, 0x7525649D, 0x525858D4,
-		0x8C593A6C, 0xAB240625, 0xC2A342FE, 0xE5DE7EB7,
-		0xE1DB274D, 0xC6A61B04, 0xAF215FDF, 0x885C6396,
-		0xE25477AA, 0xC5294BE3, 0xACAE0F38, 0x8BD33371,
-		0x8FD66A8B, 0xA8AB56C2, 0xC12C1219, 0xE6512E50,
-		0x38504CE8, 0x1F2D70A1, 0x76AA347A, 0x51D70833,
-		0x55D251C9, 0x72AF6D80, 0x1B28295B, 0x3C551512,
-		0xCF380123, 0xE8453D6A, 0x81C279B1, 0xA6BF45F8,
-		0xA2BA1C02, 0x85C7204B, 0xEC406490, 0xCB3D58D9,
-		0x153C3A61, 0x32410628, 0x5BC642F3, 0x7CBB7EBA,
-		0x78BE2740, 0x5FC31B09, 0x36445FD2, 0x1139639B,
-		0x7B3177A7, 0x5C4C4BEE, 0x35CB0F35, 0x12B6337C,
-		0x16B36A86, 0x31CE56CF, 0x58491214, 0x7F342E5D,
-		0xA1354CE5, 0x864870AC, 0xEFCF3477, 0xC8B2083E,
-		0xCCB751C4, 0xEBCA6D8D, 0x824D2956, 0xA530151F
-	}
-#endif /* WORDS_BIGENDIAN */
-};
-
-
-/*
- * Lookup table for calculating CRC-32 using Sarwate's algorithm.
- *
- * This table is based on the polynomial
- *	x^32+x^26+x^23+x^22+x^16+x^12+x^11+x^10+x^8+x^7+x^5+x^4+x^2+x+1.
- * (This is the same polynomial used in Ethernet checksums, for instance.)
- * Using Williams' terms, this is the "normal", not "reflected" version.
- */
-const uint32 pg_crc32_table[256] = {
-	0x00000000, 0x77073096, 0xEE0E612C, 0x990951BA,
-	0x076DC419, 0x706AF48F, 0xE963A535, 0x9E6495A3,
-	0x0EDB8832, 0x79DCB8A4, 0xE0D5E91E, 0x97D2D988,
-	0x09B64C2B, 0x7EB17CBD, 0xE7B82D07, 0x90BF1D91,
-	0x1DB71064, 0x6AB020F2, 0xF3B97148, 0x84BE41DE,
-	0x1ADAD47D, 0x6DDDE4EB, 0xF4D4B551, 0x83D385C7,
-	0x136C9856, 0x646BA8C0, 0xFD62F97A, 0x8A65C9EC,
-	0x14015C4F, 0x63066CD9, 0xFA0F3D63, 0x8D080DF5,
-	0x3B6E20C8, 0x4C69105E, 0xD56041E4, 0xA2677172,
-	0x3C03E4D1, 0x4B04D447, 0xD20D85FD, 0xA50AB56B,
-	0x35B5A8FA, 0x42B2986C, 0xDBBBC9D6, 0xACBCF940,
-	0x32D86CE3, 0x45DF5C75, 0xDCD60DCF, 0xABD13D59,
-	0x26D930AC, 0x51DE003A, 0xC8D75180, 0xBFD06116,
-	0x21B4F4B5, 0x56B3C423, 0xCFBA9599, 0xB8BDA50F,
-	0x2802B89E, 0x5F058808, 0xC60CD9B2, 0xB10BE924,
-	0x2F6F7C87, 0x58684C11, 0xC1611DAB, 0xB6662D3D,
-	0x76DC4190, 0x01DB7106, 0x98D220BC, 0xEFD5102A,
-	0x71B18589, 0x06B6B51F, 0x9FBFE4A5, 0xE8B8D433,
-	0x7807C9A2, 0x0F00F934, 0x9609A88E, 0xE10E9818,
-	0x7F6A0DBB, 0x086D3D2D, 0x91646C97, 0xE6635C01,
-	0x6B6B51F4, 0x1C6C6162, 0x856530D8, 0xF262004E,
-	0x6C0695ED, 0x1B01A57B, 0x8208F4C1, 0xF50FC457,
-	0x65B0D9C6, 0x12B7E950, 0x8BBEB8EA, 0xFCB9887C,
-	0x62DD1DDF, 0x15DA2D49, 0x8CD37CF3, 0xFBD44C65,
-	0x4DB26158, 0x3AB551CE, 0xA3BC0074, 0xD4BB30E2,
-	0x4ADFA541, 0x3DD895D7, 0xA4D1C46D, 0xD3D6F4FB,
-	0x4369E96A, 0x346ED9FC, 0xAD678846, 0xDA60B8D0,
-	0x44042D73, 0x33031DE5, 0xAA0A4C5F, 0xDD0D7CC9,
-	0x5005713C, 0x270241AA, 0xBE0B1010, 0xC90C2086,
-	0x5768B525, 0x206F85B3, 0xB966D409, 0xCE61E49F,
-	0x5EDEF90E, 0x29D9C998, 0xB0D09822, 0xC7D7A8B4,
-	0x59B33D17, 0x2EB40D81, 0xB7BD5C3B, 0xC0BA6CAD,
-	0xEDB88320, 0x9ABFB3B6, 0x03B6E20C, 0x74B1D29A,
-	0xEAD54739, 0x9DD277AF, 0x04DB2615, 0x73DC1683,
-	0xE3630B12, 0x94643B84, 0x0D6D6A3E, 0x7A6A5AA8,
-	0xE40ECF0B, 0x9309FF9D, 0x0A00AE27, 0x7D079EB1,
-	0xF00F9344, 0x8708A3D2, 0x1E01F268, 0x6906C2FE,
-	0xF762575D, 0x806567CB, 0x196C3671, 0x6E6B06E7,
-	0xFED41B76, 0x89D32BE0, 0x10DA7A5A, 0x67DD4ACC,
-	0xF9B9DF6F, 0x8EBEEFF9, 0x17B7BE43, 0x60B08ED5,
-	0xD6D6A3E8, 0xA1D1937E, 0x38D8C2C4, 0x4FDFF252,
-	0xD1BB67F1, 0xA6BC5767, 0x3FB506DD, 0x48B2364B,
-	0xD80D2BDA, 0xAF0A1B4C, 0x36034AF6, 0x41047A60,
-	0xDF60EFC3, 0xA867DF55, 0x316E8EEF, 0x4669BE79,
-	0xCB61B38C, 0xBC66831A, 0x256FD2A0, 0x5268E236,
-	0xCC0C7795, 0xBB0B4703, 0x220216B9, 0x5505262F,
-	0xC5BA3BBE, 0xB2BD0B28, 0x2BB45A92, 0x5CB36A04,
-	0xC2D7FFA7, 0xB5D0CF31, 0x2CD99E8B, 0x5BDEAE1D,
-	0x9B64C2B0, 0xEC63F226, 0x756AA39C, 0x026D930A,
-	0x9C0906A9, 0xEB0E363F, 0x72076785, 0x05005713,
-	0x95BF4A82, 0xE2B87A14, 0x7BB12BAE, 0x0CB61B38,
-	0x92D28E9B, 0xE5D5BE0D, 0x7CDCEFB7, 0x0BDBDF21,
-	0x86D3D2D4, 0xF1D4E242, 0x68DDB3F8, 0x1FDA836E,
-	0x81BE16CD, 0xF6B9265B, 0x6FB077E1, 0x18B74777,
-	0x88085AE6, 0xFF0F6A70, 0x66063BCA, 0x11010B5C,
-	0x8F659EFF, 0xF862AE69, 0x616BFFD3, 0x166CCF45,
-	0xA00AE278, 0xD70DD2EE, 0x4E048354, 0x3903B3C2,
-	0xA7672661, 0xD06016F7, 0x4969474D, 0x3E6E77DB,
-	0xAED16A4A, 0xD9D65ADC, 0x40DF0B66, 0x37D83BF0,
-	0xA9BCAE53, 0xDEBB9EC5, 0x47B2CF7F, 0x30B5FFE9,
-	0xBDBDF21C, 0xCABAC28A, 0x53B39330, 0x24B4A3A6,
-	0xBAD03605, 0xCDD70693, 0x54DE5729, 0x23D967BF,
-	0xB3667A2E, 0xC4614AB8, 0x5D681B02, 0x2A6F2B94,
-	0xB40BBE37, 0xC30C8EA1, 0x5A05DF1B, 0x2D02EF8D
-};
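Note for readers of the hunk above: the 256-entry pg_crc32_table being removed from this file does not need to be typed in by hand. A minimal sketch follows, assuming the table was produced by the usual bit-at-a-time method with the constant 0xEDB88320 (the bit-reversed encoding of the Ethernet polynomial named in the comment). The names sketch_table, sketch_build_table and sketch_crc32 are hypothetical and are not part of the patch. The byte loop mirrors the COMP_TRADITIONAL_CRC32 form (index with the low byte, shift right).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for pg_crc32_table; filled at run time. */
static uint32_t sketch_table[256];

/* Build the table one bit at a time (assumed constant: 0xEDB88320). */
static void
sketch_build_table(void)
{
	for (uint32_t i = 0; i < 256; i++)
	{
		uint32_t	c = i;

		for (int k = 0; k < 8; k++)
			c = (c & 1) ? (c >> 1) ^ 0xEDB88320u : (c >> 1);
		sketch_table[i] = c;
	}
}

/* Sarwate's byte-at-a-time loop: all-ones start value, final inversion. */
static uint32_t
sketch_crc32(const void *data, size_t len)
{
	const unsigned char *p = (const unsigned char *) data;
	uint32_t	crc = 0xFFFFFFFFu;

	while (len-- > 0)
		crc = sketch_table[(crc ^ *p++) & 0xFF] ^ (crc >> 8);
	return crc ^ 0xFFFFFFFFu;
}

/* Compare against entries visible in the removed table. */
static void
sketch_selfcheck(void)
{
	sketch_build_table();
	assert(sketch_table[1] == 0x77073096u);
	assert(sketch_table[2] == 0xEE0E612Cu);
	assert(sketch_table[255] == 0x2D02EF8Du);
}
```

Calling sketch_selfcheck() and then sketch_crc32("123456789", 9) should give the well-known CRC-32 check value 0xCBF43926. The legacy WAL variant described in pg_crc.h uses the same table with a different loop, so this sketch does not reproduce it.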
diff --git a/src/include/access/xlogrecord.h b/src/include/access/xlogrecord.h
index 09bbcb1..b487ae0 100644
--- a/src/include/access/xlogrecord.h
+++ b/src/include/access/xlogrecord.h
@@ -13,7 +13,7 @@
 
 #include "access/rmgr.h"
 #include "access/xlogdefs.h"
-#include "common/pg_crc.h"
+#include "port/pg_crc32c.h"
 #include "storage/block.h"
 #include "storage/relfilenode.h"
 
@@ -46,13 +46,13 @@ typedef struct XLogRecord
 	uint8		xl_info;		/* flag bits, see below */
 	RmgrId		xl_rmid;		/* resource manager for this record */
 	/* 2 bytes of padding here, initialize to zero */
-	pg_crc32	xl_crc;			/* CRC for this record */
+	pg_crc32c	xl_crc;			/* CRC for this record */
 
 	/* XLogRecordBlockHeaders and XLogRecordDataHeader follow, no padding */
 
 } XLogRecord;
 
-#define SizeOfXLogRecord	(offsetof(XLogRecord, xl_crc) + sizeof(pg_crc32))
+#define SizeOfXLogRecord	(offsetof(XLogRecord, xl_crc) + sizeof(pg_crc32c))
 
 /*
  * The high 4 bits in xl_info may be used freely by rmgr. The
diff --git a/src/include/catalog/pg_control.h b/src/include/catalog/pg_control.h
index 31232b1..2e4c381 100644
--- a/src/include/catalog/pg_control.h
+++ b/src/include/catalog/pg_control.h
@@ -16,8 +16,8 @@
 #define PG_CONTROL_H
 
 #include "access/xlogdefs.h"
-#include "common/pg_crc.h"
 #include "pgtime.h"				/* for pg_time_t */
+#include "port/pg_crc32c.h"
 
 
 /* Version identifier for this pg_control format */
@@ -224,7 +224,7 @@ typedef struct ControlFileData
 	uint32		data_checksum_version;
 
 	/* CRC of all above ... MUST BE LAST! */
-	pg_crc32	crc;
+	pg_crc32c	crc;
 } ControlFileData;
 
 /*
diff --git a/src/include/catalog/pg_proc.h b/src/include/catalog/pg_proc.h
index 9caa096..8469c82 100644
--- a/src/include/catalog/pg_proc.h
+++ b/src/include/catalog/pg_proc.h
@@ -5181,6 +5181,26 @@ DESCR("rank of hypothetical row without gaps");
 DATA(insert OID = 3993 ( dense_rank_final	PGNSP PGUID 12 1 0 2276 0 f f f f f f i 2 0 20 "2281 2276" "{2281,2276}" "{i,v}" _null_ _null_	hypothetical_dense_rank_final _null_ _null_ _null_ ));
 DESCR("aggregate final function");
 
+/* pg_upgrade support */
+DATA(insert OID = 3582 ( binary_upgrade_set_next_pg_type_oid PGNSP PGUID  12 1 0 0 0 f f f f t f v 1 0 2278 "26" _null_ _null_ _null_ _null_ binary_upgrade_set_next_pg_type_oid _null_ _null_ _null_ ));
+DESCR("for use by pg_upgrade");
+DATA(insert OID = 3584 ( binary_upgrade_set_next_array_pg_type_oid PGNSP PGUID  12 1 0 0 0 f f f f t f v 1 0 2278 "26" _null_ _null_ _null_ _null_ binary_upgrade_set_next_array_pg_type_oid _null_ _null_ _null_ ));
+DESCR("for use by pg_upgrade");
+DATA(insert OID = 3585 ( binary_upgrade_set_next_toast_pg_type_oid PGNSP PGUID  12 1 0 0 0 f f f f t f v 1 0 2278 "26" _null_ _null_ _null_ _null_ binary_upgrade_set_next_toast_pg_type_oid _null_ _null_ _null_ ));
+DESCR("for use by pg_upgrade");
+DATA(insert OID = 3586 ( binary_upgrade_set_next_heap_pg_class_oid PGNSP PGUID  12 1 0 0 0 f f f f t f v 1 0 2278 "26" _null_ _null_ _null_ _null_ binary_upgrade_set_next_heap_pg_class_oid _null_ _null_ _null_ ));
+DESCR("for use by pg_upgrade");
+DATA(insert OID = 3587 ( binary_upgrade_set_next_index_pg_class_oid PGNSP PGUID  12 1 0 0 0 f f f f t f v 1 0 2278 "26" _null_ _null_ _null_ _null_ binary_upgrade_set_next_index_pg_class_oid _null_ _null_ _null_ ));
+DESCR("for use by pg_upgrade");
+DATA(insert OID = 3588 ( binary_upgrade_set_next_toast_pg_class_oid PGNSP PGUID  12 1 0 0 0 f f f f t f v 1 0 2278 "26" _null_ _null_ _null_ _null_ binary_upgrade_set_next_toast_pg_class_oid _null_ _null_ _null_ ));
+DESCR("for use by pg_upgrade");
+DATA(insert OID = 3589 ( binary_upgrade_set_next_pg_enum_oid PGNSP PGUID  12 1 0 0 0 f f f f t f v 1 0 2278 "26" _null_ _null_ _null_ _null_ binary_upgrade_set_next_pg_enum_oid _null_ _null_ _null_ ));
+DESCR("for use by pg_upgrade");
+DATA(insert OID = 3590 ( binary_upgrade_set_next_pg_authid_oid PGNSP PGUID  12 1 0 0 0 f f f f t f v 1 0 2278 "26" _null_ _null_ _null_ _null_ binary_upgrade_set_next_pg_authid_oid _null_ _null_ _null_ ));
+DESCR("for use by pg_upgrade");
+DATA(insert OID = 3591 ( binary_upgrade_create_empty_extension PGNSP PGUID  12 1 0 0 0 f f f f t f v 7 0 2278 "25 25 16 25 1028 1009 1009" _null_ _null_ _null_ _null_ binary_upgrade_create_empty_extension _null_ _null_ _null_ ));
+DESCR("for use by pg_upgrade");
+
 
 /*
  * Symbolic values for provolatile column: these indicate whether the result
diff --git a/src/include/common/pg_crc.h b/src/include/common/pg_crc.h
deleted file mode 100644
index f496659..0000000
--- a/src/include/common/pg_crc.h
+++ /dev/null
@@ -1,142 +0,0 @@
-/*
- * pg_crc.h
- *
- * PostgreSQL CRC support
- *
- * See Ross Williams' excellent introduction
- * A PAINLESS GUIDE TO CRC ERROR DETECTION ALGORITHMS, available from
- * http://www.ross.net/crc/ or several other net sites.
- *
- * We have three slightly different variants of a 32-bit CRC calculation:
- * CRC-32C (Castagnoli polynomial), CRC-32 (Ethernet polynomial), and a legacy
- * CRC-32 version that uses the lookup table in a funny way. They all consist
- * of four macros:
- *
- * INIT_<variant>(crc)
- *		Initialize a CRC accumulator
- *
- * COMP_<variant>(crc, data, len)
- *		Accumulate some (more) bytes into a CRC
- *
- * FIN_<variant>(crc)
- *		Finish a CRC calculation
- *
- * EQ_<variant>(c1, c2)
- *		Check for equality of two CRCs.
- *
- * Portions Copyright (c) 1996-2015, PostgreSQL Global Development Group
- * Portions Copyright (c) 1994, Regents of the University of California
- *
- * src/include/common/pg_crc.h
- */
-#ifndef PG_CRC_H
-#define PG_CRC_H
-
-/* ugly hack to let this be used in frontend and backend code on Cygwin */
-#ifdef FRONTEND
-#define CRCDLLIMPORT
-#else
-#define CRCDLLIMPORT PGDLLIMPORT
-#endif
-
-typedef uint32 pg_crc32;
-
-#ifdef HAVE__BUILTIN_BSWAP32
-#define BSWAP32(x) __builtin_bswap32(x)
-#else
-#define BSWAP32(x) (((x << 24) & 0xff000000) | \
-					((x << 8) & 0x00ff0000) | \
-					((x >> 8) & 0x0000ff00) | \
-					((x >> 24) & 0x000000ff))
-#endif
-
-/*
- * CRC calculation using the CRC-32C (Castagnoli) polynomial.
- *
- * We use all-ones as the initial register contents and final bit inversion.
- * This is the same algorithm used e.g. in iSCSI. See RFC 3385 for more
- * details on the choice of polynomial.
- *
- * On big-endian systems, the intermediate value is kept in reverse byte
- * order, to avoid byte-swapping during the calculation. FIN_CRC32C reverses
- * the bytes to the final order.
- */
-#define INIT_CRC32C(crc) ((crc) = 0xFFFFFFFF)
-#ifdef WORDS_BIGENDIAN
-#define FIN_CRC32C(crc)	((crc) = BSWAP32(crc) ^ 0xFFFFFFFF)
-#else
-#define FIN_CRC32C(crc)	((crc) ^= 0xFFFFFFFF)
-#endif
-#define COMP_CRC32C(crc, data, len)	\
-	((crc) = pg_comp_crc32c((crc), (data), (len)))
-#define EQ_CRC32C(c1, c2) ((c1) == (c2))
-
-extern pg_crc32 pg_comp_crc32c(pg_crc32 crc, const void *data, size_t len);
-
-/*
- * CRC-32, the same used e.g. in Ethernet.
- *
- * This is currently only used in ltree and hstore contrib modules. It uses
- * the same lookup table as the legacy algorithm below. New code should
- * use the Castagnoli version instead.
- */
-#define INIT_TRADITIONAL_CRC32(crc) ((crc) = 0xFFFFFFFF)
-#define FIN_TRADITIONAL_CRC32(crc)	((crc) ^= 0xFFFFFFFF)
-#define COMP_TRADITIONAL_CRC32(crc, data, len)	\
-	COMP_CRC32_NORMAL_TABLE(crc, data, len, pg_crc32_table)
-#define EQ_TRADITIONAL_CRC32(c1, c2) ((c1) == (c2))
-
-/* Sarwate's algorithm, for use with a "normal" lookup table */
-#define COMP_CRC32_NORMAL_TABLE(crc, data, len, table)			  \
-do {															  \
-	const unsigned char *__data = (const unsigned char *) (data); \
-	uint32		__len = (len); \
-\
-	while (__len-- > 0) \
-	{ \
-		int		__tab_index = ((int) (crc) ^ *__data++) & 0xFF; \
-		(crc) = table[__tab_index] ^ ((crc) >> 8); \
-	} \
-} while (0)
-
-/*
- * The CRC algorithm used for WAL et al in pre-9.5 versions.
- *
- * This closely resembles the normal CRC-32 algorithm, but is subtly
- * different. Using Williams' terms, we use the "normal" table, but with
- * "reflected" code. That's bogus, but it was like that for years before
- * anyone noticed. It does not correspond to any polynomial in a normal CRC
- * algorithm, so it's not clear what the error-detection properties of this
- * algorithm actually are.
- *
- * We still need to carry this around because it is used in a few on-disk
- * structures that need to be pg_upgradeable. It should not be used in new
- * code.
- */
-#define INIT_LEGACY_CRC32(crc) ((crc) = 0xFFFFFFFF)
-#define FIN_LEGACY_CRC32(crc)	((crc) ^= 0xFFFFFFFF)
-#define COMP_LEGACY_CRC32(crc, data, len)	\
-	COMP_CRC32_REFLECTED_TABLE(crc, data, len, pg_crc32_table)
-#define EQ_LEGACY_CRC32(c1, c2) ((c1) == (c2))
-
-/*
- * Sarwate's algorithm, for use with a "reflected" lookup table (but in the
- * legacy algorithm, we actually use it on a "normal" table, see above)
- */
-#define COMP_CRC32_REFLECTED_TABLE(crc, data, len, table) \
-do {															  \
-	const unsigned char *__data = (const unsigned char *) (data); \
-	uint32		__len = (len); \
-\
-	while (__len-- > 0) \
-	{ \
-		int		__tab_index = ((int) ((crc) >> 24) ^ *__data++) & 0xFF;	\
-		(crc) = table[__tab_index] ^ ((crc) << 8); \
-	} \
-} while (0)
-
-/* Constant tables for CRC-32C and CRC-32 polynomials */
-extern CRCDLLIMPORT const uint32 pg_crc32c_table[8][256];
-extern CRCDLLIMPORT const uint32 pg_crc32_table[256];
-
-#endif   /* PG_CRC_H */
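To illustrate the four-macro pattern (INIT / COMP / FIN / EQ) documented in the header removed above, here is a minimal sketch of CRC-32C accumulated over two chunks. It is not the patch's implementation: the SKETCH_* macros and sketch_comp_crc32c are hypothetical names, and the compute step is a slow bit-at-a-time loop using 0x82F63B78 (the bit-reversed Castagnoli constant) instead of slicing-by-8 or SSE 4.2. It works on the value rather than on memory layout, so it sidesteps the big-endian byte-swapping the header describes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Bit-at-a-time CRC-32C update; a reference sketch, not a fast path. */
static uint32_t
sketch_comp_crc32c(uint32_t crc, const void *data, size_t len)
{
	const unsigned char *p = (const unsigned char *) data;

	while (len-- > 0)
	{
		crc ^= *p++;
		for (int k = 0; k < 8; k++)
			crc = (crc & 1) ? (crc >> 1) ^ 0x82F63B78u : (crc >> 1);
	}
	return crc;
}

/* Hypothetical macros shaped like the INIT/COMP/FIN/EQ interface. */
#define SKETCH_INIT_CRC32C(crc) ((crc) = 0xFFFFFFFFu)
#define SKETCH_COMP_CRC32C(crc, data, len) \
	((crc) = sketch_comp_crc32c((crc), (data), (len)))
#define SKETCH_FIN_CRC32C(crc) ((crc) ^= 0xFFFFFFFFu)
#define SKETCH_EQ_CRC32C(c1, c2) ((c1) == (c2))

/* Accumulate "123456789" in two pieces, as a caller with split data would. */
static uint32_t
sketch_crc32c_two_chunks(void)
{
	uint32_t	crc;

	SKETCH_INIT_CRC32C(crc);
	SKETCH_COMP_CRC32C(crc, "1234", 4);
	SKETCH_COMP_CRC32C(crc, "56789", 5);
	SKETCH_FIN_CRC32C(crc);
	assert(SKETCH_EQ_CRC32C(crc, 0xE3069283u));	/* standard CRC-32C check value */
	return crc;
}
```

The point of the accumulator style is that a record header and its payload can be fed in separately and still yield the same result as one contiguous buffer.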
diff --git a/src/include/pg_config.h.in b/src/include/pg_config.h.in
index 202c51a..5688f75 100644
--- a/src/include/pg_config.h.in
+++ b/src/include/pg_config.h.in
@@ -675,6 +675,12 @@
 /* Define to 1 if your compiler understands __builtin_unreachable. */
 #undef HAVE__BUILTIN_UNREACHABLE
 
+/* Define to 1 if you have __cpuid. */
+#undef HAVE__CPUID
+
+/* Define to 1 if you have __get_cpuid. */
+#undef HAVE__GET_CPUID
+
 /* Define to 1 if your compiler understands _Static_assert. */
 #undef HAVE__STATIC_ASSERT
 
@@ -818,6 +824,15 @@
 /* Use replacement snprintf() functions. */
 #undef USE_REPL_SNPRINTF
 
+/* Define to 1 to use the software CRC-32C implementation (slicing-by-8). */
+#undef USE_SLICING_BY_8_CRC32C
+
+/* Define to 1 to use Intel SSE 4.2 CRC instructions. */
+#undef USE_SSE42_CRC32C
+
+/* Define to 1 to use Intel SSE 4.2 CRC instructions with a runtime check. */
+#undef USE_SSE42_CRC32C_WITH_RUNTIME_CHECK
+
 /* Define to select SysV-style semaphores. */
 #undef USE_SYSV_SEMAPHORES
 
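For context on what "with a runtime check" means for the symbols added above: a build with USE_SSE42_CRC32C_WITH_RUNTIME_CHECK has to ask the CPU at run time whether SSE 4.2 is present before using the CRC instructions. A minimal sketch of such a probe follows, assuming GCC or clang with <cpuid.h> (the HAVE__GET_CPUID case). The function name sketch_sse42_available is hypothetical, and the MSVC __cpuid path is not shown.

```c
#include <assert.h>
#include <stdbool.h>

#if defined(__x86_64__) || defined(__i386__)
#include <cpuid.h>				/* GCC/clang: provides __get_cpuid() */
#endif

/*
 * Return true if the CPU reports SSE 4.2 support.
 * On non-x86 builds this sketch simply reports false, which would
 * correspond to falling back to the slicing-by-8 code.
 */
static bool
sketch_sse42_available(void)
{
#if defined(__x86_64__) || defined(__i386__)
	unsigned int eax, ebx, ecx, edx;

	/* Leaf 1 returns feature flags; __get_cpuid returns 0 if unsupported. */
	if (!__get_cpuid(1, &eax, &ebx, &ecx, &edx))
		return false;

	return (ecx & (1u << 20)) != 0;	/* CPUID.1:ECX bit 20 = SSE 4.2 */
#else
	return false;
#endif
}
```

A caller would typically evaluate this once and then pick between the hardware and the table-driven routine, for example through a function pointer, so the probe cost is not paid per CRC computation.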
diff --git a/src/include/pg_config.h.win32 b/src/include/pg_config.h.win32
index 1baf64f..9db1f6f 100644
--- a/src/include/pg_config.h.win32
+++ b/src/include/pg_config.h.win32
@@ -6,8 +6,8 @@
  *
  * HAVE_CBRT, HAVE_FUNCNAME_FUNC, HAVE_GETOPT, HAVE_GETOPT_H, HAVE_INTTYPES_H,
  * HAVE_GETOPT_LONG, HAVE_LOCALE_T, HAVE_RINT, HAVE_STRINGS_H, HAVE_STRTOLL,
- * HAVE_STRTOULL, HAVE_STRUCT_OPTION, ENABLE_THREAD_SAFETY,
- * PG_USE_INLINE, inline
+ * HAVE_STRTOULL, HAVE_STRUCT_OPTION, ENABLE_THREAD_SAFETY, PG_USE_INLINE,
+ * inline, USE_SSE42_CRC32C_WITH_RUNTIME_CHECK
  */
 
 /* Define to the type of arg 1 of 'accept' */
@@ -529,6 +529,12 @@
 /* Define to 1 if your compiler understands __builtin_unreachable. */
 /* #undef HAVE__BUILTIN_UNREACHABLE */
 
+/* Define to 1 if you have __cpuid. */
+#define HAVE__CPUID 1
+
+/* Define to 1 if you have __get_cpuid. */
+/* #undef HAVE__GET_CPUID */
+
 /* Define to 1 if your compiler understands _Static_assert. */
 /* #undef HAVE__STATIC_ASSERT */
 
@@ -639,6 +645,19 @@
 /* Use replacement snprintf() functions. */
 #define USE_REPL_SNPRINTF 1
 
+/* Define to 1 to use software CRC-32C implementation (slicing-by-8). */
+#if (_MSC_VER < 1500)
+#define USE_SLICING_BY_8_CRC32C 1
+#endif
+
+/* Define to 1 to use Intel SSE 4.2 CRC instructions. */
+/* #undef USE_SSE42_CRC32C */
+
+/* Define to 1 to use Intel SSE 4.2 CRC instructions with a runtime check. */
+#if (_MSC_VER >= 1500)
+#define USE_SSE42_CRC32C_WITH_RUNTIME_CHECK 1
+#endif
+
 /* Define to select SysV-style semaphores. */
 /* #undef USE_SYSV_SEMAPHORES */
 
diff --git a/src/include/port/pg_crc32c.h b/src/include/port/pg_crc32c.h
new file mode 100644
index 0000000..b14d194
--- /dev/null
+++ b/src/include/port/pg_crc32c.h
@@ -0,0 +1,93 @@
+/*-------------------------------------------------------------------------
+ *
+ * pg_crc32c.h
+ *	  Routines for computing CRC-32C checksums.
+ *
+ * The speed of CRC-32C calculation has a big impact on performance, so we
+ * jump through some hoops to get the best implementation for each
+ * platform. Some CPU architectures have special instructions for speeding
+ * up CRC calculations (e.g. Intel SSE 4.2); on other platforms we use the
+ * Slicing-by-8 algorithm, which uses lookup tables.
+ *
+ * The public interface consists of four macros:
+ *
+ * INIT_CRC32C(crc)
+ *		Initialize a CRC accumulator
+ *
+ * COMP_CRC32C(crc, data, len)
+ *		Accumulate some (more) bytes into a CRC
+ *
+ * FIN_CRC32C(crc)
+ *		Finish a CRC calculation
+ *
+ * EQ_CRC32C(c1, c2)
+ *		Check for equality of two CRCs.
+ *
+ * Portions Copyright (c) 1996-2015, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ * src/include/port/pg_crc32c.h
+ *
+ *-------------------------------------------------------------------------
+ */
+#ifndef PG_CRC32C_H
+#define PG_CRC32C_H
+
+typedef uint32 pg_crc32c;
+
+/* The INIT and EQ macros are the same for all implementations. */
+#define INIT_CRC32C(crc) ((crc) = 0xFFFFFFFF)
+#define EQ_CRC32C(c1, c2) ((c1) == (c2))
+
+#if defined(USE_SSE42_CRC32C)
+/* Use SSE4.2 instructions. */
+#define COMP_CRC32C(crc, data, len) \
+	((crc) = pg_comp_crc32c_sse42((crc), (data), (len)))
+#define FIN_CRC32C(crc) ((crc) ^= 0xFFFFFFFF)
+
+extern pg_crc32c pg_comp_crc32c_sse42(pg_crc32c crc, const void *data, size_t len);
+
+#elif defined(USE_SSE42_CRC32C_WITH_RUNTIME_CHECK)
+/*
+ * Use SSE4.2 instructions, but perform a runtime check first to verify that
+ * they are available.
+ */
+#define COMP_CRC32C(crc, data, len) \
+	((crc) = pg_comp_crc32c((crc), (data), (len)))
+#define FIN_CRC32C(crc) ((crc) ^= 0xFFFFFFFF)
+
+extern pg_crc32c pg_comp_crc32c_sse42(pg_crc32c crc, const void *data, size_t len);
+extern pg_crc32c pg_comp_crc32c_sb8(pg_crc32c crc, const void *data, size_t len);
+extern pg_crc32c (*pg_comp_crc32c) (pg_crc32c crc, const void *data, size_t len);
+
+#else
+/*
+ * Use slicing-by-8 algorithm.
+ *
+ * On big-endian systems, the intermediate value is kept in reverse byte
+ * order, to avoid byte-swapping during the calculation. FIN_CRC32C reverses
+ * the bytes to the final order.
+ */
+#define COMP_CRC32C(crc, data, len) \
+	((crc) = pg_comp_crc32c_sb8((crc), (data), (len)))
+#ifdef WORDS_BIGENDIAN
+
+#ifdef HAVE__BUILTIN_BSWAP32
+#define BSWAP32(x) __builtin_bswap32(x)
+#else
+#define BSWAP32(x) ((((x) << 24) & 0xff000000) | \
+					(((x) << 8) & 0x00ff0000) | \
+					(((x) >> 8) & 0x0000ff00) | \
+					(((x) >> 24) & 0x000000ff))
+#endif
+
+#define FIN_CRC32C(crc) ((crc) = BSWAP32(crc) ^ 0xFFFFFFFF)
+#else
+#define FIN_CRC32C(crc) ((crc) ^= 0xFFFFFFFF)
+#endif
+
+extern pg_crc32c pg_comp_crc32c_sb8(pg_crc32c crc, const void *data, size_t len);
+
+#endif
+
+#endif /* PG_CRC32C_H */
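
For readers following the header above, the four-macro interface (INIT, COMP, FIN, EQ) can be tried out in isolation. The sketch below is not the patch's code: it assumes a single 256-entry table built at run time for the reflected Castagnoli polynomial 0x82F63B78 and consumes one byte per step, standing in for the slicing-by-8 and SSE 4.2 paths. All names (sketch_crc32c, sketch_table, sketch_init_table, sketch_comp, sketch_crc32c_of and the SKETCH_* macros) are hypothetical and exist only for illustration.

```c
#include <stddef.h>
#include <stdint.h>

typedef uint32_t sketch_crc32c;		/* hypothetical stand-in for pg_crc32c */

static uint32_t sketch_table[256];
static int	sketch_table_ready = 0;

/* Build the byte-wise lookup table for the reflected polynomial 0x82F63B78. */
static void
sketch_init_table(void)
{
	uint32_t	i;

	for (i = 0; i < 256; i++)
	{
		uint32_t	c = i;
		int			k;

		for (k = 0; k < 8; k++)
			c = (c & 1) ? (c >> 1) ^ 0x82F63B78u : (c >> 1);
		sketch_table[i] = c;
	}
	sketch_table_ready = 1;
}

/* Accumulate len bytes into crc, one table lookup per input byte. */
static sketch_crc32c
sketch_comp(sketch_crc32c crc, const void *data, size_t len)
{
	const unsigned char *p = data;

	if (!sketch_table_ready)
		sketch_init_table();
	while (len-- > 0)
		crc = sketch_table[(crc ^ *p++) & 0xFF] ^ (crc >> 8);
	return crc;
}

/* Same shape as the four macros documented in the header comment. */
#define SKETCH_INIT_CRC32C(crc) ((crc) = 0xFFFFFFFF)
#define SKETCH_COMP_CRC32C(crc, data, len) \
	((crc) = sketch_comp((crc), (data), (len)))
#define SKETCH_FIN_CRC32C(crc) ((crc) ^= 0xFFFFFFFF)
#define SKETCH_EQ_CRC32C(c1, c2) ((c1) == (c2))

/* Convenience wrapper: INIT, one COMP call, then FIN. */
static sketch_crc32c
sketch_crc32c_of(const void *data, size_t len)
{
	sketch_crc32c crc;

	SKETCH_INIT_CRC32C(crc);
	SKETCH_COMP_CRC32C(crc, data, len);
	SKETCH_FIN_CRC32C(crc);
	return crc;
}
```

COMP may be called any number of times between INIT and FIN, so a caller can checksum a record in pieces. With this sketch, sketch_crc32c_of("123456789", 9) yields 0xE3069283, the usual CRC-32C check value, and sketch_table[1] comes out as 0xF26B8303, the same value as the second entry of the first little-endian table in pg_crc32c_sb8.c.
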
diff --git a/src/include/utils/pg_crc.h b/src/include/utils/pg_crc.h
new file mode 100644
index 0000000..b4efe15
--- /dev/null
+++ b/src/include/utils/pg_crc.h
@@ -0,0 +1,107 @@
+/*
+ * pg_crc.h
+ *
+ * PostgreSQL CRC support
+ *
+ * See Ross Williams' excellent introduction
+ * A PAINLESS GUIDE TO CRC ERROR DETECTION ALGORITHMS, available from
+ * http://www.ross.net/crc/ or several other net sites.
+ *
+ * We have three slightly different variants of a 32-bit CRC calculation:
+ * CRC-32C (Castagnoli polynomial), CRC-32 (Ethernet polynomial), and a legacy
+ * CRC-32 version that uses the lookup table in a funny way. They all consist
+ * of four macros:
+ *
+ * INIT_<variant>(crc)
+ *		Initialize a CRC accumulator
+ *
+ * COMP_<variant>(crc, data, len)
+ *		Accumulate some (more) bytes into a CRC
+ *
+ * FIN_<variant>(crc)
+ *		Finish a CRC calculation
+ *
+ * EQ_<variant>(c1, c2)
+ *		Check for equality of two CRCs.
+ *
+ * The CRC-32C variant is in port/pg_crc32c.h.
+ *
+ * Portions Copyright (c) 1996-2015, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ * src/include/utils/pg_crc.h
+ */
+#ifndef PG_CRC_H
+#define PG_CRC_H
+
+typedef uint32 pg_crc32;
+
+/*
+ * CRC-32, the same used e.g. in Ethernet.
+ *
+ * This is currently only used in ltree and hstore contrib modules. It uses
+ * the same lookup table as the legacy algorithm below. New code should
+ * use the Castagnoli version instead.
+ */
+#define INIT_TRADITIONAL_CRC32(crc) ((crc) = 0xFFFFFFFF)
+#define FIN_TRADITIONAL_CRC32(crc)	((crc) ^= 0xFFFFFFFF)
+#define COMP_TRADITIONAL_CRC32(crc, data, len)	\
+	COMP_CRC32_NORMAL_TABLE(crc, data, len, pg_crc32_table)
+#define EQ_TRADITIONAL_CRC32(c1, c2) ((c1) == (c2))
+
+/* Sarwate's algorithm, for use with a "normal" lookup table */
+#define COMP_CRC32_NORMAL_TABLE(crc, data, len, table)			  \
+do {															  \
+	const unsigned char *__data = (const unsigned char *) (data); \
+	uint32		__len = (len); \
+\
+	while (__len-- > 0) \
+	{ \
+		int		__tab_index = ((int) (crc) ^ *__data++) & 0xFF; \
+		(crc) = table[__tab_index] ^ ((crc) >> 8); \
+	} \
+} while (0)
+
+/*
+ * The CRC algorithm used for WAL et al in pre-9.5 versions.
+ *
+ * This closely resembles the normal CRC-32 algorithm, but is subtly
+ * different. Using Williams' terms, we use the "normal" table, but with
+ * "reflected" code. That's bogus, but it was like that for years before
+ * anyone noticed. It does not correspond to any polynomial in a normal CRC
+ * algorithm, so it's not clear what the error-detection properties of this
+ * algorithm actually are.
+ *
+ * We still need to carry this around because it is used in a few on-disk
+ * structures that need to be pg_upgradeable. It should not be used in new
+ * code.
+ */
+#define INIT_LEGACY_CRC32(crc) ((crc) = 0xFFFFFFFF)
+#define FIN_LEGACY_CRC32(crc)	((crc) ^= 0xFFFFFFFF)
+#define COMP_LEGACY_CRC32(crc, data, len)	\
+	COMP_CRC32_REFLECTED_TABLE(crc, data, len, pg_crc32_table)
+#define EQ_LEGACY_CRC32(c1, c2) ((c1) == (c2))
+
+/*
+ * Sarwate's algorithm, for use with a "reflected" lookup table (but in the
+ * legacy algorithm, we actually use it on a "normal" table, see above)
+ */
+#define COMP_CRC32_REFLECTED_TABLE(crc, data, len, table) \
+do {															  \
+	const unsigned char *__data = (const unsigned char *) (data); \
+	uint32		__len = (len); \
+\
+	while (__len-- > 0) \
+	{ \
+		int		__tab_index = ((int) ((crc) >> 24) ^ *__data++) & 0xFF;	\
+		(crc) = table[__tab_index] ^ ((crc) << 8); \
+	} \
+} while (0)
+
+/*
+ * Constant table for the CRC-32 polynomial. The same table is used by both
+ * the traditional and legacy variants.
+ */
+extern PGDLLIMPORT const uint32 pg_crc32_table[256];
+
+#endif   /* PG_CRC_H */
diff --git a/src/port/Makefile b/src/port/Makefile
index abc42a2..bc9b63a 100644
--- a/src/port/Makefile
+++ b/src/port/Makefile
@@ -30,7 +30,7 @@ include $(top_builddir)/src/Makefile.global
 override CPPFLAGS := -I$(top_builddir)/src/port -DFRONTEND $(CPPFLAGS)
 LIBS += $(PTHREAD_LIBS)
 
-OBJS = $(LIBOBJS) chklocale.o erand48.o inet_net_ntop.o \
+OBJS = $(LIBOBJS) $(PG_CRC32C_OBJS) chklocale.o erand48.o inet_net_ntop.o \
 	noblock.o path.o pgcheckdir.o pgmkdirp.o pgsleep.o \
 	pgstrcasecmp.o pqsignal.o \
 	qsort.o qsort_arg.o quotes.o sprompt.o tar.o thread.o
@@ -57,6 +57,10 @@ libpgport.a: $(OBJS)
 # thread.o needs PTHREAD_CFLAGS (but thread_srv.o does not)
 thread.o: CFLAGS+=$(PTHREAD_CFLAGS)
 
+# pg_crc32c_sse42.o and its _srv.o version need CFLAGS_SSE42
+pg_crc32c_sse42.o: CFLAGS+=$(CFLAGS_SSE42)
+pg_crc32c_sse42_srv.o: CFLAGS+=$(CFLAGS_SSE42)
+
 #
 # Server versions of object files
 #
diff --git a/src/port/pg_crc32c_choose.c b/src/port/pg_crc32c_choose.c
new file mode 100644
index 0000000..ba0d167
--- /dev/null
+++ b/src/port/pg_crc32c_choose.c
@@ -0,0 +1,63 @@
+/*-------------------------------------------------------------------------
+ *
+ * pg_crc32c_choose.c
+ *	  Choose which CRC-32C implementation to use, at runtime.
+ *
+ * Try to use the special CRC instructions introduced in Intel SSE 4.2,
+ * if available on the platform we're running on, but fall back to the
+ * slicing-by-8 implementation otherwise.
+ *
+ * Portions Copyright (c) 1996-2015, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ *
+ * IDENTIFICATION
+ *	  src/port/pg_crc32c_choose.c
+ *
+ *-------------------------------------------------------------------------
+ */
+
+#include "c.h"
+
+#ifdef HAVE__GET_CPUID
+#include <cpuid.h>
+#endif
+
+#ifdef HAVE__CPUID
+#include <intrin.h>
+#endif
+
+#include "port/pg_crc32c.h"
+
+static bool
+pg_crc32c_sse42_available(void)
+{
+	unsigned int exx[4] = {0, 0, 0, 0};
+
+#if defined(HAVE__GET_CPUID)
+	__get_cpuid(1, &exx[0], &exx[1], &exx[2], &exx[3]);
+#elif defined(HAVE__CPUID)
+	__cpuid(exx, 1);
+#else
+#error cpuid instruction not available
+#endif
+
+	return (exx[2] & (1 << 20)) != 0;    /* SSE 4.2 */
+}
+
+/*
+ * This gets called on the first call. It replaces the function pointer
+ * so that subsequent calls are routed directly to the chosen implementation.
+ */
+static pg_crc32c
+pg_comp_crc32c_choose(pg_crc32c crc, const void *data, size_t len)
+{
+	if (pg_crc32c_sse42_available())
+		pg_comp_crc32c = pg_comp_crc32c_sse42;
+	else
+		pg_comp_crc32c = pg_comp_crc32c_sb8;
+
+	return pg_comp_crc32c(crc, data, len);
+}
+
+pg_crc32c (*pg_comp_crc32c) (pg_crc32c crc, const void *data, size_t len) = pg_comp_crc32c_choose;
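
The self-replacing function pointer used in pg_crc32c_choose.c can be seen in miniature in the sketch below. It is only an illustration under stated assumptions: the two "implementations" just sum integers, the flag sketch_fast_available stands in for the CPUID probe, and every sketch_* name is hypothetical rather than part of the patch.

```c
#include <stddef.h>

typedef int (*sketch_sum_fn) (const int *v, size_t n);

static int	sketch_choose_calls = 0;	/* counts how often the chooser runs */
static int	sketch_fast_available = 0;	/* stand-in for the CPUID probe */

/* Portable fallback, playing the role of pg_comp_crc32c_sb8. */
static int
sketch_sum_portable(const int *v, size_t n)
{
	int			total = 0;
	size_t		i;

	for (i = 0; i < n; i++)
		total += v[i];
	return total;
}

/* "Accelerated" variant, playing the role of pg_comp_crc32c_sse42. */
static int
sketch_sum_fast(const int *v, size_t n)
{
	int			total = 0;

	while (n-- > 0)
		total += *v++;
	return total;
}

static int	sketch_sum_choose(const int *v, size_t n);

/* Callers always go through this pointer; it starts at the chooser. */
sketch_sum_fn sketch_sum = sketch_sum_choose;

/*
 * Runs on the first call only: pick an implementation, overwrite the
 * pointer, then forward the call so the caller still gets its result.
 */
static int
sketch_sum_choose(const int *v, size_t n)
{
	sketch_choose_calls++;
	sketch_sum = sketch_fast_available ? sketch_sum_fast : sketch_sum_portable;
	return sketch_sum(v, n);
}
```

After the first call through sketch_sum, later calls go directly to the chosen function, so the probe cost is paid once and the steady-state overhead is a single indirect call. In this sketch the assignment is idempotent, so running the chooser more than once would only repeat the probe.
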
diff --git a/src/port/pg_crc32c_sb8.c b/src/port/pg_crc32c_sb8.c
new file mode 100644
index 0000000..425c02c
--- /dev/null
+++ b/src/port/pg_crc32c_sb8.c
@@ -0,0 +1,1169 @@
+/*-------------------------------------------------------------------------
+ *
+ * pg_crc32c_sb8.c
+ *	  Compute CRC-32C checksum using slicing-by-8 algorithm.
+ *
+ * Michael E. Kounavis, Frank L. Berry,
+ * "Novel Table Lookup-Based Algorithms for High-Performance CRC
+ * Generation", IEEE Transactions on Computers, vol.57, no. 11,
+ * pp. 1550-1560, November 2008, doi:10.1109/TC.2008.85
+ *
+ * Portions Copyright (c) 1996-2015, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ *
+ * IDENTIFICATION
+ *	  src/port/pg_crc32c_sb8.c
+ *
+ *-------------------------------------------------------------------------
+ */
+
+#include "c.h"
+
+#include "port/pg_crc32c.h"
+
+static const uint32 pg_crc32c_table[8][256];
+
+/* Accumulate one input byte */
+#ifdef WORDS_BIGENDIAN
+#define CRC8(x) (pg_crc32c_table[0][((crc >> 24) ^ (x)) & 0xFF] ^ (crc << 8))
+#else
+#define CRC8(x) (pg_crc32c_table[0][(crc ^ (x)) & 0xFF] ^ (crc >> 8))
+#endif
+
+pg_crc32c
+pg_comp_crc32c_sb8(pg_crc32c crc, const void *data, size_t len)
+{
+	const unsigned char *p = data;
+	const uint32 *p4;
+
+	/*
+	 * Handle 0-3 initial bytes one at a time, so that the loop below starts
+	 * with a pointer aligned to four bytes.
+	 */
+	while (len > 0 && ((uintptr_t) p & 3))
+	{
+		crc = CRC8(*p++);
+		len--;
+	}
+
+	/*
+	 * Process eight bytes of data at a time.
+	 */
+	p4 = (const uint32 *) p;
+	while (len >= 8)
+	{
+		uint32		a = *p4++ ^ crc;
+		uint32		b = *p4++;
+
+#ifdef WORDS_BIGENDIAN
+		const uint8 c0 = b;
+		const uint8 c1 = b >> 8;
+		const uint8 c2 = b >> 16;
+		const uint8 c3 = b >> 24;
+		const uint8 c4 = a;
+		const uint8 c5 = a >> 8;
+		const uint8 c6 = a >> 16;
+		const uint8 c7 = a >> 24;
+#else
+		const uint8 c0 = b >> 24;
+		const uint8 c1 = b >> 16;
+		const uint8 c2 = b >> 8;
+		const uint8 c3 = b;
+		const uint8 c4 = a >> 24;
+		const uint8 c5 = a >> 16;
+		const uint8 c6 = a >> 8;
+		const uint8 c7 = a;
+#endif
+
+		crc =
+			pg_crc32c_table[0][c0] ^ pg_crc32c_table[1][c1] ^
+			pg_crc32c_table[2][c2] ^ pg_crc32c_table[3][c3] ^
+			pg_crc32c_table[4][c4] ^ pg_crc32c_table[5][c5] ^
+			pg_crc32c_table[6][c6] ^ pg_crc32c_table[7][c7];
+
+		len -= 8;
+	}
+
+	/*
+	 * Handle any remaining bytes one at a time.
+	 */
+	p = (const unsigned char *) p4;
+	while (len > 0)
+	{
+		crc = CRC8(*p++);
+		len--;
+	}
+
+	return crc;
+}
+
+/*
+ * Lookup tables for the slicing-by-8 algorithm, for the so-called Castagnoli
+ * polynomial (the same that is used e.g. in iSCSI), 0x1EDC6F41. Using
+ * Williams' terms, this is the "normal", not "reflected" version. However, on
+ * big-endian systems the values in the tables are stored in byte-reversed
+ * order (IOW, the tables are stored in little-endian order even on big-endian
+ * systems).
+ */
+static const uint32 pg_crc32c_table[8][256] = {
+#ifndef WORDS_BIGENDIAN
+	{
+		0x00000000, 0xF26B8303, 0xE13B70F7, 0x1350F3F4,
+		0xC79A971F, 0x35F1141C, 0x26A1E7E8, 0xD4CA64EB,
+		0x8AD958CF, 0x78B2DBCC, 0x6BE22838, 0x9989AB3B,
+		0x4D43CFD0, 0xBF284CD3, 0xAC78BF27, 0x5E133C24,
+		0x105EC76F, 0xE235446C, 0xF165B798, 0x030E349B,
+		0xD7C45070, 0x25AFD373, 0x36FF2087, 0xC494A384,
+		0x9A879FA0, 0x68EC1CA3, 0x7BBCEF57, 0x89D76C54,
+		0x5D1D08BF, 0xAF768BBC, 0xBC267848, 0x4E4DFB4B,
+		0x20BD8EDE, 0xD2D60DDD, 0xC186FE29, 0x33ED7D2A,
+		0xE72719C1, 0x154C9AC2, 0x061C6936, 0xF477EA35,
+		0xAA64D611, 0x580F5512, 0x4B5FA6E6, 0xB93425E5,
+		0x6DFE410E, 0x9F95C20D, 0x8CC531F9, 0x7EAEB2FA,
+		0x30E349B1, 0xC288CAB2, 0xD1D83946, 0x23B3BA45,
+		0xF779DEAE, 0x05125DAD, 0x1642AE59, 0xE4292D5A,
+		0xBA3A117E, 0x4851927D, 0x5B016189, 0xA96AE28A,
+		0x7DA08661, 0x8FCB0562, 0x9C9BF696, 0x6EF07595,
+		0x417B1DBC, 0xB3109EBF, 0xA0406D4B, 0x522BEE48,
+		0x86E18AA3, 0x748A09A0, 0x67DAFA54, 0x95B17957,
+		0xCBA24573, 0x39C9C670, 0x2A993584, 0xD8F2B687,
+		0x0C38D26C, 0xFE53516F, 0xED03A29B, 0x1F682198,
+		0x5125DAD3, 0xA34E59D0, 0xB01EAA24, 0x42752927,
+		0x96BF4DCC, 0x64D4CECF, 0x77843D3B, 0x85EFBE38,
+		0xDBFC821C, 0x2997011F, 0x3AC7F2EB, 0xC8AC71E8,
+		0x1C661503, 0xEE0D9600, 0xFD5D65F4, 0x0F36E6F7,
+		0x61C69362, 0x93AD1061, 0x80FDE395, 0x72966096,
+		0xA65C047D, 0x5437877E, 0x4767748A, 0xB50CF789,
+		0xEB1FCBAD, 0x197448AE, 0x0A24BB5A, 0xF84F3859,
+		0x2C855CB2, 0xDEEEDFB1, 0xCDBE2C45, 0x3FD5AF46,
+		0x7198540D, 0x83F3D70E, 0x90A324FA, 0x62C8A7F9,
+		0xB602C312, 0x44694011, 0x5739B3E5, 0xA55230E6,
+		0xFB410CC2, 0x092A8FC1, 0x1A7A7C35, 0xE811FF36,
+		0x3CDB9BDD, 0xCEB018DE, 0xDDE0EB2A, 0x2F8B6829,
+		0x82F63B78, 0x709DB87B, 0x63CD4B8F, 0x91A6C88C,
+		0x456CAC67, 0xB7072F64, 0xA457DC90, 0x563C5F93,
+		0x082F63B7, 0xFA44E0B4, 0xE9141340, 0x1B7F9043,
+		0xCFB5F4A8, 0x3DDE77AB, 0x2E8E845F, 0xDCE5075C,
+		0x92A8FC17, 0x60C37F14, 0x73938CE0, 0x81F80FE3,
+		0x55326B08, 0xA759E80B, 0xB4091BFF, 0x466298FC,
+		0x1871A4D8, 0xEA1A27DB, 0xF94AD42F, 0x0B21572C,
+		0xDFEB33C7, 0x2D80B0C4, 0x3ED04330, 0xCCBBC033,
+		0xA24BB5A6, 0x502036A5, 0x4370C551, 0xB11B4652,
+		0x65D122B9, 0x97BAA1BA, 0x84EA524E, 0x7681D14D,
+		0x2892ED69, 0xDAF96E6A, 0xC9A99D9E, 0x3BC21E9D,
+		0xEF087A76, 0x1D63F975, 0x0E330A81, 0xFC588982,
+		0xB21572C9, 0x407EF1CA, 0x532E023E, 0xA145813D,
+		0x758FE5D6, 0x87E466D5, 0x94B49521, 0x66DF1622,
+		0x38CC2A06, 0xCAA7A905, 0xD9F75AF1, 0x2B9CD9F2,
+		0xFF56BD19, 0x0D3D3E1A, 0x1E6DCDEE, 0xEC064EED,
+		0xC38D26C4, 0x31E6A5C7, 0x22B65633, 0xD0DDD530,
+		0x0417B1DB, 0xF67C32D8, 0xE52CC12C, 0x1747422F,
+		0x49547E0B, 0xBB3FFD08, 0xA86F0EFC, 0x5A048DFF,
+		0x8ECEE914, 0x7CA56A17, 0x6FF599E3, 0x9D9E1AE0,
+		0xD3D3E1AB, 0x21B862A8, 0x32E8915C, 0xC083125F,
+		0x144976B4, 0xE622F5B7, 0xF5720643, 0x07198540,
+		0x590AB964, 0xAB613A67, 0xB831C993, 0x4A5A4A90,
+		0x9E902E7B, 0x6CFBAD78, 0x7FAB5E8C, 0x8DC0DD8F,
+		0xE330A81A, 0x115B2B19, 0x020BD8ED, 0xF0605BEE,
+		0x24AA3F05, 0xD6C1BC06, 0xC5914FF2, 0x37FACCF1,
+		0x69E9F0D5, 0x9B8273D6, 0x88D28022, 0x7AB90321,
+		0xAE7367CA, 0x5C18E4C9, 0x4F48173D, 0xBD23943E,
+		0xF36E6F75, 0x0105EC76, 0x12551F82, 0xE03E9C81,
+		0x34F4F86A, 0xC69F7B69, 0xD5CF889D, 0x27A40B9E,
+		0x79B737BA, 0x8BDCB4B9, 0x988C474D, 0x6AE7C44E,
+		0xBE2DA0A5, 0x4C4623A6, 0x5F16D052, 0xAD7D5351
+	},
+	{
+		0x00000000, 0x13A29877, 0x274530EE, 0x34E7A899,
+		0x4E8A61DC, 0x5D28F9AB, 0x69CF5132, 0x7A6DC945,
+		0x9D14C3B8, 0x8EB65BCF, 0xBA51F356, 0xA9F36B21,
+		0xD39EA264, 0xC03C3A13, 0xF4DB928A, 0xE7790AFD,
+		0x3FC5F181, 0x2C6769F6, 0x1880C16F, 0x0B225918,
+		0x714F905D, 0x62ED082A, 0x560AA0B3, 0x45A838C4,
+		0xA2D13239, 0xB173AA4E, 0x859402D7, 0x96369AA0,
+		0xEC5B53E5, 0xFFF9CB92, 0xCB1E630B, 0xD8BCFB7C,
+		0x7F8BE302, 0x6C297B75, 0x58CED3EC, 0x4B6C4B9B,
+		0x310182DE, 0x22A31AA9, 0x1644B230, 0x05E62A47,
+		0xE29F20BA, 0xF13DB8CD, 0xC5DA1054, 0xD6788823,
+		0xAC154166, 0xBFB7D911, 0x8B507188, 0x98F2E9FF,
+		0x404E1283, 0x53EC8AF4, 0x670B226D, 0x74A9BA1A,
+		0x0EC4735F, 0x1D66EB28, 0x298143B1, 0x3A23DBC6,
+		0xDD5AD13B, 0xCEF8494C, 0xFA1FE1D5, 0xE9BD79A2,
+		0x93D0B0E7, 0x80722890, 0xB4958009, 0xA737187E,
+		0xFF17C604, 0xECB55E73, 0xD852F6EA, 0xCBF06E9D,
+		0xB19DA7D8, 0xA23F3FAF, 0x96D89736, 0x857A0F41,
+		0x620305BC, 0x71A19DCB, 0x45463552, 0x56E4AD25,
+		0x2C896460, 0x3F2BFC17, 0x0BCC548E, 0x186ECCF9,
+		0xC0D23785, 0xD370AFF2, 0xE797076B, 0xF4359F1C,
+		0x8E585659, 0x9DFACE2E, 0xA91D66B7, 0xBABFFEC0,
+		0x5DC6F43D, 0x4E646C4A, 0x7A83C4D3, 0x69215CA4,
+		0x134C95E1, 0x00EE0D96, 0x3409A50F, 0x27AB3D78,
+		0x809C2506, 0x933EBD71, 0xA7D915E8, 0xB47B8D9F,
+		0xCE1644DA, 0xDDB4DCAD, 0xE9537434, 0xFAF1EC43,
+		0x1D88E6BE, 0x0E2A7EC9, 0x3ACDD650, 0x296F4E27,
+		0x53028762, 0x40A01F15, 0x7447B78C, 0x67E52FFB,
+		0xBF59D487, 0xACFB4CF0, 0x981CE469, 0x8BBE7C1E,
+		0xF1D3B55B, 0xE2712D2C, 0xD69685B5, 0xC5341DC2,
+		0x224D173F, 0x31EF8F48, 0x050827D1, 0x16AABFA6,
+		0x6CC776E3, 0x7F65EE94, 0x4B82460D, 0x5820DE7A,
+		0xFBC3FAF9, 0xE861628E, 0xDC86CA17, 0xCF245260,
+		0xB5499B25, 0xA6EB0352, 0x920CABCB, 0x81AE33BC,
+		0x66D73941, 0x7575A136, 0x419209AF, 0x523091D8,
+		0x285D589D, 0x3BFFC0EA, 0x0F186873, 0x1CBAF004,
+		0xC4060B78, 0xD7A4930F, 0xE3433B96, 0xF0E1A3E1,
+		0x8A8C6AA4, 0x992EF2D3, 0xADC95A4A, 0xBE6BC23D,
+		0x5912C8C0, 0x4AB050B7, 0x7E57F82E, 0x6DF56059,
+		0x1798A91C, 0x043A316B, 0x30DD99F2, 0x237F0185,
+		0x844819FB, 0x97EA818C, 0xA30D2915, 0xB0AFB162,
+		0xCAC27827, 0xD960E050, 0xED8748C9, 0xFE25D0BE,
+		0x195CDA43, 0x0AFE4234, 0x3E19EAAD, 0x2DBB72DA,
+		0x57D6BB9F, 0x447423E8, 0x70938B71, 0x63311306,
+		0xBB8DE87A, 0xA82F700D, 0x9CC8D894, 0x8F6A40E3,
+		0xF50789A6, 0xE6A511D1, 0xD242B948, 0xC1E0213F,
+		0x26992BC2, 0x353BB3B5, 0x01DC1B2C, 0x127E835B,
+		0x68134A1E, 0x7BB1D269, 0x4F567AF0, 0x5CF4E287,
+		0x04D43CFD, 0x1776A48A, 0x23910C13, 0x30339464,
+		0x4A5E5D21, 0x59FCC556, 0x6D1B6DCF, 0x7EB9F5B8,
+		0x99C0FF45, 0x8A626732, 0xBE85CFAB, 0xAD2757DC,
+		0xD74A9E99, 0xC4E806EE, 0xF00FAE77, 0xE3AD3600,
+		0x3B11CD7C, 0x28B3550B, 0x1C54FD92, 0x0FF665E5,
+		0x759BACA0, 0x663934D7, 0x52DE9C4E, 0x417C0439,
+		0xA6050EC4, 0xB5A796B3, 0x81403E2A, 0x92E2A65D,
+		0xE88F6F18, 0xFB2DF76F, 0xCFCA5FF6, 0xDC68C781,
+		0x7B5FDFFF, 0x68FD4788, 0x5C1AEF11, 0x4FB87766,
+		0x35D5BE23, 0x26772654, 0x12908ECD, 0x013216BA,
+		0xE64B1C47, 0xF5E98430, 0xC10E2CA9, 0xD2ACB4DE,
+		0xA8C17D9B, 0xBB63E5EC, 0x8F844D75, 0x9C26D502,
+		0x449A2E7E, 0x5738B609, 0x63DF1E90, 0x707D86E7,
+		0x0A104FA2, 0x19B2D7D5, 0x2D557F4C, 0x3EF7E73B,
+		0xD98EEDC6, 0xCA2C75B1, 0xFECBDD28, 0xED69455F,
+		0x97048C1A, 0x84A6146D, 0xB041BCF4, 0xA3E32483
+	},
+	{
+		0x00000000, 0xA541927E, 0x4F6F520D, 0xEA2EC073,
+		0x9EDEA41A, 0x3B9F3664, 0xD1B1F617, 0x74F06469,
+		0x38513EC5, 0x9D10ACBB, 0x773E6CC8, 0xD27FFEB6,
+		0xA68F9ADF, 0x03CE08A1, 0xE9E0C8D2, 0x4CA15AAC,
+		0x70A27D8A, 0xD5E3EFF4, 0x3FCD2F87, 0x9A8CBDF9,
+		0xEE7CD990, 0x4B3D4BEE, 0xA1138B9D, 0x045219E3,
+		0x48F3434F, 0xEDB2D131, 0x079C1142, 0xA2DD833C,
+		0xD62DE755, 0x736C752B, 0x9942B558, 0x3C032726,
+		0xE144FB14, 0x4405696A, 0xAE2BA919, 0x0B6A3B67,
+		0x7F9A5F0E, 0xDADBCD70, 0x30F50D03, 0x95B49F7D,
+		0xD915C5D1, 0x7C5457AF, 0x967A97DC, 0x333B05A2,
+		0x47CB61CB, 0xE28AF3B5, 0x08A433C6, 0xADE5A1B8,
+		0x91E6869E, 0x34A714E0, 0xDE89D493, 0x7BC846ED,
+		0x0F382284, 0xAA79B0FA, 0x40577089, 0xE516E2F7,
+		0xA9B7B85B, 0x0CF62A25, 0xE6D8EA56, 0x43997828,
+		0x37691C41, 0x92288E3F, 0x78064E4C, 0xDD47DC32,
+		0xC76580D9, 0x622412A7, 0x880AD2D4, 0x2D4B40AA,
+		0x59BB24C3, 0xFCFAB6BD, 0x16D476CE, 0xB395E4B0,
+		0xFF34BE1C, 0x5A752C62, 0xB05BEC11, 0x151A7E6F,
+		0x61EA1A06, 0xC4AB8878, 0x2E85480B, 0x8BC4DA75,
+		0xB7C7FD53, 0x12866F2D, 0xF8A8AF5E, 0x5DE93D20,
+		0x29195949, 0x8C58CB37, 0x66760B44, 0xC337993A,
+		0x8F96C396, 0x2AD751E8, 0xC0F9919B, 0x65B803E5,
+		0x1148678C, 0xB409F5F2, 0x5E273581, 0xFB66A7FF,
+		0x26217BCD, 0x8360E9B3, 0x694E29C0, 0xCC0FBBBE,
+		0xB8FFDFD7, 0x1DBE4DA9, 0xF7908DDA, 0x52D11FA4,
+		0x1E704508, 0xBB31D776, 0x511F1705, 0xF45E857B,
+		0x80AEE112, 0x25EF736C, 0xCFC1B31F, 0x6A802161,
+		0x56830647, 0xF3C29439, 0x19EC544A, 0xBCADC634,
+		0xC85DA25D, 0x6D1C3023, 0x8732F050, 0x2273622E,
+		0x6ED23882, 0xCB93AAFC, 0x21BD6A8F, 0x84FCF8F1,
+		0xF00C9C98, 0x554D0EE6, 0xBF63CE95, 0x1A225CEB,
+		0x8B277743, 0x2E66E53D, 0xC448254E, 0x6109B730,
+		0x15F9D359, 0xB0B84127, 0x5A968154, 0xFFD7132A,
+		0xB3764986, 0x1637DBF8, 0xFC191B8B, 0x595889F5,
+		0x2DA8ED9C, 0x88E97FE2, 0x62C7BF91, 0xC7862DEF,
+		0xFB850AC9, 0x5EC498B7, 0xB4EA58C4, 0x11ABCABA,
+		0x655BAED3, 0xC01A3CAD, 0x2A34FCDE, 0x8F756EA0,
+		0xC3D4340C, 0x6695A672, 0x8CBB6601, 0x29FAF47F,
+		0x5D0A9016, 0xF84B0268, 0x1265C21B, 0xB7245065,
+		0x6A638C57, 0xCF221E29, 0x250CDE5A, 0x804D4C24,
+		0xF4BD284D, 0x51FCBA33, 0xBBD27A40, 0x1E93E83E,
+		0x5232B292, 0xF77320EC, 0x1D5DE09F, 0xB81C72E1,
+		0xCCEC1688, 0x69AD84F6, 0x83834485, 0x26C2D6FB,
+		0x1AC1F1DD, 0xBF8063A3, 0x55AEA3D0, 0xF0EF31AE,
+		0x841F55C7, 0x215EC7B9, 0xCB7007CA, 0x6E3195B4,
+		0x2290CF18, 0x87D15D66, 0x6DFF9D15, 0xC8BE0F6B,
+		0xBC4E6B02, 0x190FF97C, 0xF321390F, 0x5660AB71,
+		0x4C42F79A, 0xE90365E4, 0x032DA597, 0xA66C37E9,
+		0xD29C5380, 0x77DDC1FE, 0x9DF3018D, 0x38B293F3,
+		0x7413C95F, 0xD1525B21, 0x3B7C9B52, 0x9E3D092C,
+		0xEACD6D45, 0x4F8CFF3B, 0xA5A23F48, 0x00E3AD36,
+		0x3CE08A10, 0x99A1186E, 0x738FD81D, 0xD6CE4A63,
+		0xA23E2E0A, 0x077FBC74, 0xED517C07, 0x4810EE79,
+		0x04B1B4D5, 0xA1F026AB, 0x4BDEE6D8, 0xEE9F74A6,
+		0x9A6F10CF, 0x3F2E82B1, 0xD50042C2, 0x7041D0BC,
+		0xAD060C8E, 0x08479EF0, 0xE2695E83, 0x4728CCFD,
+		0x33D8A894, 0x96993AEA, 0x7CB7FA99, 0xD9F668E7,
+		0x9557324B, 0x3016A035, 0xDA386046, 0x7F79F238,
+		0x0B899651, 0xAEC8042F, 0x44E6C45C, 0xE1A75622,
+		0xDDA47104, 0x78E5E37A, 0x92CB2309, 0x378AB177,
+		0x437AD51E, 0xE63B4760, 0x0C158713, 0xA954156D,
+		0xE5F54FC1, 0x40B4DDBF, 0xAA9A1DCC, 0x0FDB8FB2,
+		0x7B2BEBDB, 0xDE6A79A5, 0x3444B9D6, 0x91052BA8
+	},
+	{
+		0x00000000, 0xDD45AAB8, 0xBF672381, 0x62228939,
+		0x7B2231F3, 0xA6679B4B, 0xC4451272, 0x1900B8CA,
+		0xF64463E6, 0x2B01C95E, 0x49234067, 0x9466EADF,
+		0x8D665215, 0x5023F8AD, 0x32017194, 0xEF44DB2C,
+		0xE964B13D, 0x34211B85, 0x560392BC, 0x8B463804,
+		0x924680CE, 0x4F032A76, 0x2D21A34F, 0xF06409F7,
+		0x1F20D2DB, 0xC2657863, 0xA047F15A, 0x7D025BE2,
+		0x6402E328, 0xB9474990, 0xDB65C0A9, 0x06206A11,
+		0xD725148B, 0x0A60BE33, 0x6842370A, 0xB5079DB2,
+		0xAC072578, 0x71428FC0, 0x136006F9, 0xCE25AC41,
+		0x2161776D, 0xFC24DDD5, 0x9E0654EC, 0x4343FE54,
+		0x5A43469E, 0x8706EC26, 0xE524651F, 0x3861CFA7,
+		0x3E41A5B6, 0xE3040F0E, 0x81268637, 0x5C632C8F,
+		0x45639445, 0x98263EFD, 0xFA04B7C4, 0x27411D7C,
+		0xC805C650, 0x15406CE8, 0x7762E5D1, 0xAA274F69,
+		0xB327F7A3, 0x6E625D1B, 0x0C40D422, 0xD1057E9A,
+		0xABA65FE7, 0x76E3F55F, 0x14C17C66, 0xC984D6DE,
+		0xD0846E14, 0x0DC1C4AC, 0x6FE34D95, 0xB2A6E72D,
+		0x5DE23C01, 0x80A796B9, 0xE2851F80, 0x3FC0B538,
+		0x26C00DF2, 0xFB85A74A, 0x99A72E73, 0x44E284CB,
+		0x42C2EEDA, 0x9F874462, 0xFDA5CD5B, 0x20E067E3,
+		0x39E0DF29, 0xE4A57591, 0x8687FCA8, 0x5BC25610,
+		0xB4868D3C, 0x69C32784, 0x0BE1AEBD, 0xD6A40405,
+		0xCFA4BCCF, 0x12E11677, 0x70C39F4E, 0xAD8635F6,
+		0x7C834B6C, 0xA1C6E1D4, 0xC3E468ED, 0x1EA1C255,
+		0x07A17A9F, 0xDAE4D027, 0xB8C6591E, 0x6583F3A6,
+		0x8AC7288A, 0x57828232, 0x35A00B0B, 0xE8E5A1B3,
+		0xF1E51979, 0x2CA0B3C1, 0x4E823AF8, 0x93C79040,
+		0x95E7FA51, 0x48A250E9, 0x2A80D9D0, 0xF7C57368,
+		0xEEC5CBA2, 0x3380611A, 0x51A2E823, 0x8CE7429B,
+		0x63A399B7, 0xBEE6330F, 0xDCC4BA36, 0x0181108E,
+		0x1881A844, 0xC5C402FC, 0xA7E68BC5, 0x7AA3217D,
+		0x52A0C93F, 0x8FE56387, 0xEDC7EABE, 0x30824006,
+		0x2982F8CC, 0xF4C75274, 0x96E5DB4D, 0x4BA071F5,
+		0xA4E4AAD9, 0x79A10061, 0x1B838958, 0xC6C623E0,
+		0xDFC69B2A, 0x02833192, 0x60A1B8AB, 0xBDE41213,
+		0xBBC47802, 0x6681D2BA, 0x04A35B83, 0xD9E6F13B,
+		0xC0E649F1, 0x1DA3E349, 0x7F816A70, 0xA2C4C0C8,
+		0x4D801BE4, 0x90C5B15C, 0xF2E73865, 0x2FA292DD,
+		0x36A22A17, 0xEBE780AF, 0x89C50996, 0x5480A32E,
+		0x8585DDB4, 0x58C0770C, 0x3AE2FE35, 0xE7A7548D,
+		0xFEA7EC47, 0x23E246FF, 0x41C0CFC6, 0x9C85657E,
+		0x73C1BE52, 0xAE8414EA, 0xCCA69DD3, 0x11E3376B,
+		0x08E38FA1, 0xD5A62519, 0xB784AC20, 0x6AC10698,
+		0x6CE16C89, 0xB1A4C631, 0xD3864F08, 0x0EC3E5B0,
+		0x17C35D7A, 0xCA86F7C2, 0xA8A47EFB, 0x75E1D443,
+		0x9AA50F6F, 0x47E0A5D7, 0x25C22CEE, 0xF8878656,
+		0xE1873E9C, 0x3CC29424, 0x5EE01D1D, 0x83A5B7A5,
+		0xF90696D8, 0x24433C60, 0x4661B559, 0x9B241FE1,
+		0x8224A72B, 0x5F610D93, 0x3D4384AA, 0xE0062E12,
+		0x0F42F53E, 0xD2075F86, 0xB025D6BF, 0x6D607C07,
+		0x7460C4CD, 0xA9256E75, 0xCB07E74C, 0x16424DF4,
+		0x106227E5, 0xCD278D5D, 0xAF050464, 0x7240AEDC,
+		0x6B401616, 0xB605BCAE, 0xD4273597, 0x09629F2F,
+		0xE6264403, 0x3B63EEBB, 0x59416782, 0x8404CD3A,
+		0x9D0475F0, 0x4041DF48, 0x22635671, 0xFF26FCC9,
+		0x2E238253, 0xF36628EB, 0x9144A1D2, 0x4C010B6A,
+		0x5501B3A0, 0x88441918, 0xEA669021, 0x37233A99,
+		0xD867E1B5, 0x05224B0D, 0x6700C234, 0xBA45688C,
+		0xA345D046, 0x7E007AFE, 0x1C22F3C7, 0xC167597F,
+		0xC747336E, 0x1A0299D6, 0x782010EF, 0xA565BA57,
+		0xBC65029D, 0x6120A825, 0x0302211C, 0xDE478BA4,
+		0x31035088, 0xEC46FA30, 0x8E647309, 0x5321D9B1,
+		0x4A21617B, 0x9764CBC3, 0xF54642FA, 0x2803E842
+	},
+	{
+		0x00000000, 0x38116FAC, 0x7022DF58, 0x4833B0F4,
+		0xE045BEB0, 0xD854D11C, 0x906761E8, 0xA8760E44,
+		0xC5670B91, 0xFD76643D, 0xB545D4C9, 0x8D54BB65,
+		0x2522B521, 0x1D33DA8D, 0x55006A79, 0x6D1105D5,
+		0x8F2261D3, 0xB7330E7F, 0xFF00BE8B, 0xC711D127,
+		0x6F67DF63, 0x5776B0CF, 0x1F45003B, 0x27546F97,
+		0x4A456A42, 0x725405EE, 0x3A67B51A, 0x0276DAB6,
+		0xAA00D4F2, 0x9211BB5E, 0xDA220BAA, 0xE2336406,
+		0x1BA8B557, 0x23B9DAFB, 0x6B8A6A0F, 0x539B05A3,
+		0xFBED0BE7, 0xC3FC644B, 0x8BCFD4BF, 0xB3DEBB13,
+		0xDECFBEC6, 0xE6DED16A, 0xAEED619E, 0x96FC0E32,
+		0x3E8A0076, 0x069B6FDA, 0x4EA8DF2E, 0x76B9B082,
+		0x948AD484, 0xAC9BBB28, 0xE4A80BDC, 0xDCB96470,
+		0x74CF6A34, 0x4CDE0598, 0x04EDB56C, 0x3CFCDAC0,
+		0x51EDDF15, 0x69FCB0B9, 0x21CF004D, 0x19DE6FE1,
+		0xB1A861A5, 0x89B90E09, 0xC18ABEFD, 0xF99BD151,
+		0x37516AAE, 0x0F400502, 0x4773B5F6, 0x7F62DA5A,
+		0xD714D41E, 0xEF05BBB2, 0xA7360B46, 0x9F2764EA,
+		0xF236613F, 0xCA270E93, 0x8214BE67, 0xBA05D1CB,
+		0x1273DF8F, 0x2A62B023, 0x625100D7, 0x5A406F7B,
+		0xB8730B7D, 0x806264D1, 0xC851D425, 0xF040BB89,
+		0x5836B5CD, 0x6027DA61, 0x28146A95, 0x10050539,
+		0x7D1400EC, 0x45056F40, 0x0D36DFB4, 0x3527B018,
+		0x9D51BE5C, 0xA540D1F0, 0xED736104, 0xD5620EA8,
+		0x2CF9DFF9, 0x14E8B055, 0x5CDB00A1, 0x64CA6F0D,
+		0xCCBC6149, 0xF4AD0EE5, 0xBC9EBE11, 0x848FD1BD,
+		0xE99ED468, 0xD18FBBC4, 0x99BC0B30, 0xA1AD649C,
+		0x09DB6AD8, 0x31CA0574, 0x79F9B580, 0x41E8DA2C,
+		0xA3DBBE2A, 0x9BCAD186, 0xD3F96172, 0xEBE80EDE,
+		0x439E009A, 0x7B8F6F36, 0x33BCDFC2, 0x0BADB06E,
+		0x66BCB5BB, 0x5EADDA17, 0x169E6AE3, 0x2E8F054F,
+		0x86F90B0B, 0xBEE864A7, 0xF6DBD453, 0xCECABBFF,
+		0x6EA2D55C, 0x56B3BAF0, 0x1E800A04, 0x269165A8,
+		0x8EE76BEC, 0xB6F60440, 0xFEC5B4B4, 0xC6D4DB18,
+		0xABC5DECD, 0x93D4B161, 0xDBE70195, 0xE3F66E39,
+		0x4B80607D, 0x73910FD1, 0x3BA2BF25, 0x03B3D089,
+		0xE180B48F, 0xD991DB23, 0x91A26BD7, 0xA9B3047B,
+		0x01C50A3F, 0x39D46593, 0x71E7D567, 0x49F6BACB,
+		0x24E7BF1E, 0x1CF6D0B2, 0x54C56046, 0x6CD40FEA,
+		0xC4A201AE, 0xFCB36E02, 0xB480DEF6, 0x8C91B15A,
+		0x750A600B, 0x4D1B0FA7, 0x0528BF53, 0x3D39D0FF,
+		0x954FDEBB, 0xAD5EB117, 0xE56D01E3, 0xDD7C6E4F,
+		0xB06D6B9A, 0x887C0436, 0xC04FB4C2, 0xF85EDB6E,
+		0x5028D52A, 0x6839BA86, 0x200A0A72, 0x181B65DE,
+		0xFA2801D8, 0xC2396E74, 0x8A0ADE80, 0xB21BB12C,
+		0x1A6DBF68, 0x227CD0C4, 0x6A4F6030, 0x525E0F9C,
+		0x3F4F0A49, 0x075E65E5, 0x4F6DD511, 0x777CBABD,
+		0xDF0AB4F9, 0xE71BDB55, 0xAF286BA1, 0x9739040D,
+		0x59F3BFF2, 0x61E2D05E, 0x29D160AA, 0x11C00F06,
+		0xB9B60142, 0x81A76EEE, 0xC994DE1A, 0xF185B1B6,
+		0x9C94B463, 0xA485DBCF, 0xECB66B3B, 0xD4A70497,
+		0x7CD10AD3, 0x44C0657F, 0x0CF3D58B, 0x34E2BA27,
+		0xD6D1DE21, 0xEEC0B18D, 0xA6F30179, 0x9EE26ED5,
+		0x36946091, 0x0E850F3D, 0x46B6BFC9, 0x7EA7D065,
+		0x13B6D5B0, 0x2BA7BA1C, 0x63940AE8, 0x5B856544,
+		0xF3F36B00, 0xCBE204AC, 0x83D1B458, 0xBBC0DBF4,
+		0x425B0AA5, 0x7A4A6509, 0x3279D5FD, 0x0A68BA51,
+		0xA21EB415, 0x9A0FDBB9, 0xD23C6B4D, 0xEA2D04E1,
+		0x873C0134, 0xBF2D6E98, 0xF71EDE6C, 0xCF0FB1C0,
+		0x6779BF84, 0x5F68D028, 0x175B60DC, 0x2F4A0F70,
+		0xCD796B76, 0xF56804DA, 0xBD5BB42E, 0x854ADB82,
+		0x2D3CD5C6, 0x152DBA6A, 0x5D1E0A9E, 0x650F6532,
+		0x081E60E7, 0x300F0F4B, 0x783CBFBF, 0x402DD013,
+		0xE85BDE57, 0xD04AB1FB, 0x9879010F, 0xA0686EA3
+	},
+	{
+		0x00000000, 0xEF306B19, 0xDB8CA0C3, 0x34BCCBDA,
+		0xB2F53777, 0x5DC55C6E, 0x697997B4, 0x8649FCAD,
+		0x6006181F, 0x8F367306, 0xBB8AB8DC, 0x54BAD3C5,
+		0xD2F32F68, 0x3DC34471, 0x097F8FAB, 0xE64FE4B2,
+		0xC00C303E, 0x2F3C5B27, 0x1B8090FD, 0xF4B0FBE4,
+		0x72F90749, 0x9DC96C50, 0xA975A78A, 0x4645CC93,
+		0xA00A2821, 0x4F3A4338, 0x7B8688E2, 0x94B6E3FB,
+		0x12FF1F56, 0xFDCF744F, 0xC973BF95, 0x2643D48C,
+		0x85F4168D, 0x6AC47D94, 0x5E78B64E, 0xB148DD57,
+		0x370121FA, 0xD8314AE3, 0xEC8D8139, 0x03BDEA20,
+		0xE5F20E92, 0x0AC2658B, 0x3E7EAE51, 0xD14EC548,
+		0x570739E5, 0xB83752FC, 0x8C8B9926, 0x63BBF23F,
+		0x45F826B3, 0xAAC84DAA, 0x9E748670, 0x7144ED69,
+		0xF70D11C4, 0x183D7ADD, 0x2C81B107, 0xC3B1DA1E,
+		0x25FE3EAC, 0xCACE55B5, 0xFE729E6F, 0x1142F576,
+		0x970B09DB, 0x783B62C2, 0x4C87A918, 0xA3B7C201,
+		0x0E045BEB, 0xE13430F2, 0xD588FB28, 0x3AB89031,
+		0xBCF16C9C, 0x53C10785, 0x677DCC5F, 0x884DA746,
+		0x6E0243F4, 0x813228ED, 0xB58EE337, 0x5ABE882E,
+		0xDCF77483, 0x33C71F9A, 0x077BD440, 0xE84BBF59,
+		0xCE086BD5, 0x213800CC, 0x1584CB16, 0xFAB4A00F,
+		0x7CFD5CA2, 0x93CD37BB, 0xA771FC61, 0x48419778,
+		0xAE0E73CA, 0x413E18D3, 0x7582D309, 0x9AB2B810,
+		0x1CFB44BD, 0xF3CB2FA4, 0xC777E47E, 0x28478F67,
+		0x8BF04D66, 0x64C0267F, 0x507CEDA5, 0xBF4C86BC,
+		0x39057A11, 0xD6351108, 0xE289DAD2, 0x0DB9B1CB,
+		0xEBF65579, 0x04C63E60, 0x307AF5BA, 0xDF4A9EA3,
+		0x5903620E, 0xB6330917, 0x828FC2CD, 0x6DBFA9D4,
+		0x4BFC7D58, 0xA4CC1641, 0x9070DD9B, 0x7F40B682,
+		0xF9094A2F, 0x16392136, 0x2285EAEC, 0xCDB581F5,
+		0x2BFA6547, 0xC4CA0E5E, 0xF076C584, 0x1F46AE9D,
+		0x990F5230, 0x763F3929, 0x4283F2F3, 0xADB399EA,
+		0x1C08B7D6, 0xF338DCCF, 0xC7841715, 0x28B47C0C,
+		0xAEFD80A1, 0x41CDEBB8, 0x75712062, 0x9A414B7B,
+		0x7C0EAFC9, 0x933EC4D0, 0xA7820F0A, 0x48B26413,
+		0xCEFB98BE, 0x21CBF3A7, 0x1577387D, 0xFA475364,
+		0xDC0487E8, 0x3334ECF1, 0x0788272B, 0xE8B84C32,
+		0x6EF1B09F, 0x81C1DB86, 0xB57D105C, 0x5A4D7B45,
+		0xBC029FF7, 0x5332F4EE, 0x678E3F34, 0x88BE542D,
+		0x0EF7A880, 0xE1C7C399, 0xD57B0843, 0x3A4B635A,
+		0x99FCA15B, 0x76CCCA42, 0x42700198, 0xAD406A81,
+		0x2B09962C, 0xC439FD35, 0xF08536EF, 0x1FB55DF6,
+		0xF9FAB944, 0x16CAD25D, 0x22761987, 0xCD46729E,
+		0x4B0F8E33, 0xA43FE52A, 0x90832EF0, 0x7FB345E9,
+		0x59F09165, 0xB6C0FA7C, 0x827C31A6, 0x6D4C5ABF,
+		0xEB05A612, 0x0435CD0B, 0x308906D1, 0xDFB96DC8,
+		0x39F6897A, 0xD6C6E263, 0xE27A29B9, 0x0D4A42A0,
+		0x8B03BE0D, 0x6433D514, 0x508F1ECE, 0xBFBF75D7,
+		0x120CEC3D, 0xFD3C8724, 0xC9804CFE, 0x26B027E7,
+		0xA0F9DB4A, 0x4FC9B053, 0x7B757B89, 0x94451090,
+		0x720AF422, 0x9D3A9F3B, 0xA98654E1, 0x46B63FF8,
+		0xC0FFC355, 0x2FCFA84C, 0x1B736396, 0xF443088F,
+		0xD200DC03, 0x3D30B71A, 0x098C7CC0, 0xE6BC17D9,
+		0x60F5EB74, 0x8FC5806D, 0xBB794BB7, 0x544920AE,
+		0xB206C41C, 0x5D36AF05, 0x698A64DF, 0x86BA0FC6,
+		0x00F3F36B, 0xEFC39872, 0xDB7F53A8, 0x344F38B1,
+		0x97F8FAB0, 0x78C891A9, 0x4C745A73, 0xA344316A,
+		0x250DCDC7, 0xCA3DA6DE, 0xFE816D04, 0x11B1061D,
+		0xF7FEE2AF, 0x18CE89B6, 0x2C72426C, 0xC3422975,
+		0x450BD5D8, 0xAA3BBEC1, 0x9E87751B, 0x71B71E02,
+		0x57F4CA8E, 0xB8C4A197, 0x8C786A4D, 0x63480154,
+		0xE501FDF9, 0x0A3196E0, 0x3E8D5D3A, 0xD1BD3623,
+		0x37F2D291, 0xD8C2B988, 0xEC7E7252, 0x034E194B,
+		0x8507E5E6, 0x6A378EFF, 0x5E8B4525, 0xB1BB2E3C
+	},
+	{
+		0x00000000, 0x68032CC8, 0xD0065990, 0xB8057558,
+		0xA5E0C5D1, 0xCDE3E919, 0x75E69C41, 0x1DE5B089,
+		0x4E2DFD53, 0x262ED19B, 0x9E2BA4C3, 0xF628880B,
+		0xEBCD3882, 0x83CE144A, 0x3BCB6112, 0x53C84DDA,
+		0x9C5BFAA6, 0xF458D66E, 0x4C5DA336, 0x245E8FFE,
+		0x39BB3F77, 0x51B813BF, 0xE9BD66E7, 0x81BE4A2F,
+		0xD27607F5, 0xBA752B3D, 0x02705E65, 0x6A7372AD,
+		0x7796C224, 0x1F95EEEC, 0xA7909BB4, 0xCF93B77C,
+		0x3D5B83BD, 0x5558AF75, 0xED5DDA2D, 0x855EF6E5,
+		0x98BB466C, 0xF0B86AA4, 0x48BD1FFC, 0x20BE3334,
+		0x73767EEE, 0x1B755226, 0xA370277E, 0xCB730BB6,
+		0xD696BB3F, 0xBE9597F7, 0x0690E2AF, 0x6E93CE67,
+		0xA100791B, 0xC90355D3, 0x7106208B, 0x19050C43,
+		0x04E0BCCA, 0x6CE39002, 0xD4E6E55A, 0xBCE5C992,
+		0xEF2D8448, 0x872EA880, 0x3F2BDDD8, 0x5728F110,
+		0x4ACD4199, 0x22CE6D51, 0x9ACB1809, 0xF2C834C1,
+		0x7AB7077A, 0x12B42BB2, 0xAAB15EEA, 0xC2B27222,
+		0xDF57C2AB, 0xB754EE63, 0x0F519B3B, 0x6752B7F3,
+		0x349AFA29, 0x5C99D6E1, 0xE49CA3B9, 0x8C9F8F71,
+		0x917A3FF8, 0xF9791330, 0x417C6668, 0x297F4AA0,
+		0xE6ECFDDC, 0x8EEFD114, 0x36EAA44C, 0x5EE98884,
+		0x430C380D, 0x2B0F14C5, 0x930A619D, 0xFB094D55,
+		0xA8C1008F, 0xC0C22C47, 0x78C7591F, 0x10C475D7,
+		0x0D21C55E, 0x6522E996, 0xDD279CCE, 0xB524B006,
+		0x47EC84C7, 0x2FEFA80F, 0x97EADD57, 0xFFE9F19F,
+		0xE20C4116, 0x8A0F6DDE, 0x320A1886, 0x5A09344E,
+		0x09C17994, 0x61C2555C, 0xD9C72004, 0xB1C40CCC,
+		0xAC21BC45, 0xC422908D, 0x7C27E5D5, 0x1424C91D,
+		0xDBB77E61, 0xB3B452A9, 0x0BB127F1, 0x63B20B39,
+		0x7E57BBB0, 0x16549778, 0xAE51E220, 0xC652CEE8,
+		0x959A8332, 0xFD99AFFA, 0x459CDAA2, 0x2D9FF66A,
+		0x307A46E3, 0x58796A2B, 0xE07C1F73, 0x887F33BB,
+		0xF56E0EF4, 0x9D6D223C, 0x25685764, 0x4D6B7BAC,
+		0x508ECB25, 0x388DE7ED, 0x808892B5, 0xE88BBE7D,
+		0xBB43F3A7, 0xD340DF6F, 0x6B45AA37, 0x034686FF,
+		0x1EA33676, 0x76A01ABE, 0xCEA56FE6, 0xA6A6432E,
+		0x6935F452, 0x0136D89A, 0xB933ADC2, 0xD130810A,
+		0xCCD53183, 0xA4D61D4B, 0x1CD36813, 0x74D044DB,
+		0x27180901, 0x4F1B25C9, 0xF71E5091, 0x9F1D7C59,
+		0x82F8CCD0, 0xEAFBE018, 0x52FE9540, 0x3AFDB988,
+		0xC8358D49, 0xA036A181, 0x1833D4D9, 0x7030F811,
+		0x6DD54898, 0x05D66450, 0xBDD31108, 0xD5D03DC0,
+		0x8618701A, 0xEE1B5CD2, 0x561E298A, 0x3E1D0542,
+		0x23F8B5CB, 0x4BFB9903, 0xF3FEEC5B, 0x9BFDC093,
+		0x546E77EF, 0x3C6D5B27, 0x84682E7F, 0xEC6B02B7,
+		0xF18EB23E, 0x998D9EF6, 0x2188EBAE, 0x498BC766,
+		0x1A438ABC, 0x7240A674, 0xCA45D32C, 0xA246FFE4,
+		0xBFA34F6D, 0xD7A063A5, 0x6FA516FD, 0x07A63A35,
+		0x8FD9098E, 0xE7DA2546, 0x5FDF501E, 0x37DC7CD6,
+		0x2A39CC5F, 0x423AE097, 0xFA3F95CF, 0x923CB907,
+		0xC1F4F4DD, 0xA9F7D815, 0x11F2AD4D, 0x79F18185,
+		0x6414310C, 0x0C171DC4, 0xB412689C, 0xDC114454,
+		0x1382F328, 0x7B81DFE0, 0xC384AAB8, 0xAB878670,
+		0xB66236F9, 0xDE611A31, 0x66646F69, 0x0E6743A1,
+		0x5DAF0E7B, 0x35AC22B3, 0x8DA957EB, 0xE5AA7B23,
+		0xF84FCBAA, 0x904CE762, 0x2849923A, 0x404ABEF2,
+		0xB2828A33, 0xDA81A6FB, 0x6284D3A3, 0x0A87FF6B,
+		0x17624FE2, 0x7F61632A, 0xC7641672, 0xAF673ABA,
+		0xFCAF7760, 0x94AC5BA8, 0x2CA92EF0, 0x44AA0238,
+		0x594FB2B1, 0x314C9E79, 0x8949EB21, 0xE14AC7E9,
+		0x2ED97095, 0x46DA5C5D, 0xFEDF2905, 0x96DC05CD,
+		0x8B39B544, 0xE33A998C, 0x5B3FECD4, 0x333CC01C,
+		0x60F48DC6, 0x08F7A10E, 0xB0F2D456, 0xD8F1F89E,
+		0xC5144817, 0xAD1764DF, 0x15121187, 0x7D113D4F
+	},
+	{
+		0x00000000, 0x493C7D27, 0x9278FA4E, 0xDB448769,
+		0x211D826D, 0x6821FF4A, 0xB3657823, 0xFA590504,
+		0x423B04DA, 0x0B0779FD, 0xD043FE94, 0x997F83B3,
+		0x632686B7, 0x2A1AFB90, 0xF15E7CF9, 0xB86201DE,
+		0x847609B4, 0xCD4A7493, 0x160EF3FA, 0x5F328EDD,
+		0xA56B8BD9, 0xEC57F6FE, 0x37137197, 0x7E2F0CB0,
+		0xC64D0D6E, 0x8F717049, 0x5435F720, 0x1D098A07,
+		0xE7508F03, 0xAE6CF224, 0x7528754D, 0x3C14086A,
+		0x0D006599, 0x443C18BE, 0x9F789FD7, 0xD644E2F0,
+		0x2C1DE7F4, 0x65219AD3, 0xBE651DBA, 0xF759609D,
+		0x4F3B6143, 0x06071C64, 0xDD439B0D, 0x947FE62A,
+		0x6E26E32E, 0x271A9E09, 0xFC5E1960, 0xB5626447,
+		0x89766C2D, 0xC04A110A, 0x1B0E9663, 0x5232EB44,
+		0xA86BEE40, 0xE1579367, 0x3A13140E, 0x732F6929,
+		0xCB4D68F7, 0x827115D0, 0x593592B9, 0x1009EF9E,
+		0xEA50EA9A, 0xA36C97BD, 0x782810D4, 0x31146DF3,
+		0x1A00CB32, 0x533CB615, 0x8878317C, 0xC1444C5B,
+		0x3B1D495F, 0x72213478, 0xA965B311, 0xE059CE36,
+		0x583BCFE8, 0x1107B2CF, 0xCA4335A6, 0x837F4881,
+		0x79264D85, 0x301A30A2, 0xEB5EB7CB, 0xA262CAEC,
+		0x9E76C286, 0xD74ABFA1, 0x0C0E38C8, 0x453245EF,
+		0xBF6B40EB, 0xF6573DCC, 0x2D13BAA5, 0x642FC782,
+		0xDC4DC65C, 0x9571BB7B, 0x4E353C12, 0x07094135,
+		0xFD504431, 0xB46C3916, 0x6F28BE7F, 0x2614C358,
+		0x1700AEAB, 0x5E3CD38C, 0x857854E5, 0xCC4429C2,
+		0x361D2CC6, 0x7F2151E1, 0xA465D688, 0xED59ABAF,
+		0x553BAA71, 0x1C07D756, 0xC743503F, 0x8E7F2D18,
+		0x7426281C, 0x3D1A553B, 0xE65ED252, 0xAF62AF75,
+		0x9376A71F, 0xDA4ADA38, 0x010E5D51, 0x48322076,
+		0xB26B2572, 0xFB575855, 0x2013DF3C, 0x692FA21B,
+		0xD14DA3C5, 0x9871DEE2, 0x4335598B, 0x0A0924AC,
+		0xF05021A8, 0xB96C5C8F, 0x6228DBE6, 0x2B14A6C1,
+		0x34019664, 0x7D3DEB43, 0xA6796C2A, 0xEF45110D,
+		0x151C1409, 0x5C20692E, 0x8764EE47, 0xCE589360,
+		0x763A92BE, 0x3F06EF99, 0xE44268F0, 0xAD7E15D7,
+		0x572710D3, 0x1E1B6DF4, 0xC55FEA9D, 0x8C6397BA,
+		0xB0779FD0, 0xF94BE2F7, 0x220F659E, 0x6B3318B9,
+		0x916A1DBD, 0xD856609A, 0x0312E7F3, 0x4A2E9AD4,
+		0xF24C9B0A, 0xBB70E62D, 0x60346144, 0x29081C63,
+		0xD3511967, 0x9A6D6440, 0x4129E329, 0x08159E0E,
+		0x3901F3FD, 0x703D8EDA, 0xAB7909B3, 0xE2457494,
+		0x181C7190, 0x51200CB7, 0x8A648BDE, 0xC358F6F9,
+		0x7B3AF727, 0x32068A00, 0xE9420D69, 0xA07E704E,
+		0x5A27754A, 0x131B086D, 0xC85F8F04, 0x8163F223,
+		0xBD77FA49, 0xF44B876E, 0x2F0F0007, 0x66337D20,
+		0x9C6A7824, 0xD5560503, 0x0E12826A, 0x472EFF4D,
+		0xFF4CFE93, 0xB67083B4, 0x6D3404DD, 0x240879FA,
+		0xDE517CFE, 0x976D01D9, 0x4C2986B0, 0x0515FB97,
+		0x2E015D56, 0x673D2071, 0xBC79A718, 0xF545DA3F,
+		0x0F1CDF3B, 0x4620A21C, 0x9D642575, 0xD4585852,
+		0x6C3A598C, 0x250624AB, 0xFE42A3C2, 0xB77EDEE5,
+		0x4D27DBE1, 0x041BA6C6, 0xDF5F21AF, 0x96635C88,
+		0xAA7754E2, 0xE34B29C5, 0x380FAEAC, 0x7133D38B,
+		0x8B6AD68F, 0xC256ABA8, 0x19122CC1, 0x502E51E6,
+		0xE84C5038, 0xA1702D1F, 0x7A34AA76, 0x3308D751,
+		0xC951D255, 0x806DAF72, 0x5B29281B, 0x1215553C,
+		0x230138CF, 0x6A3D45E8, 0xB179C281, 0xF845BFA6,
+		0x021CBAA2, 0x4B20C785, 0x906440EC, 0xD9583DCB,
+		0x613A3C15, 0x28064132, 0xF342C65B, 0xBA7EBB7C,
+		0x4027BE78, 0x091BC35F, 0xD25F4436, 0x9B633911,
+		0xA777317B, 0xEE4B4C5C, 0x350FCB35, 0x7C33B612,
+		0x866AB316, 0xCF56CE31, 0x14124958, 0x5D2E347F,
+		0xE54C35A1, 0xAC704886, 0x7734CFEF, 0x3E08B2C8,
+		0xC451B7CC, 0x8D6DCAEB, 0x56294D82, 0x1F1530A5
+	}
+#else							/* !WORDS_BIGENDIAN */
+	{
+		0x00000000, 0x03836BF2, 0xF7703BE1, 0xF4F35013,
+		0x1F979AC7, 0x1C14F135, 0xE8E7A126, 0xEB64CAD4,
+		0xCF58D98A, 0xCCDBB278, 0x3828E26B, 0x3BAB8999,
+		0xD0CF434D, 0xD34C28BF, 0x27BF78AC, 0x243C135E,
+		0x6FC75E10, 0x6C4435E2, 0x98B765F1, 0x9B340E03,
+		0x7050C4D7, 0x73D3AF25, 0x8720FF36, 0x84A394C4,
+		0xA09F879A, 0xA31CEC68, 0x57EFBC7B, 0x546CD789,
+		0xBF081D5D, 0xBC8B76AF, 0x487826BC, 0x4BFB4D4E,
+		0xDE8EBD20, 0xDD0DD6D2, 0x29FE86C1, 0x2A7DED33,
+		0xC11927E7, 0xC29A4C15, 0x36691C06, 0x35EA77F4,
+		0x11D664AA, 0x12550F58, 0xE6A65F4B, 0xE52534B9,
+		0x0E41FE6D, 0x0DC2959F, 0xF931C58C, 0xFAB2AE7E,
+		0xB149E330, 0xB2CA88C2, 0x4639D8D1, 0x45BAB323,
+		0xAEDE79F7, 0xAD5D1205, 0x59AE4216, 0x5A2D29E4,
+		0x7E113ABA, 0x7D925148, 0x8961015B, 0x8AE26AA9,
+		0x6186A07D, 0x6205CB8F, 0x96F69B9C, 0x9575F06E,
+		0xBC1D7B41, 0xBF9E10B3, 0x4B6D40A0, 0x48EE2B52,
+		0xA38AE186, 0xA0098A74, 0x54FADA67, 0x5779B195,
+		0x7345A2CB, 0x70C6C939, 0x8435992A, 0x87B6F2D8,
+		0x6CD2380C, 0x6F5153FE, 0x9BA203ED, 0x9821681F,
+		0xD3DA2551, 0xD0594EA3, 0x24AA1EB0, 0x27297542,
+		0xCC4DBF96, 0xCFCED464, 0x3B3D8477, 0x38BEEF85,
+		0x1C82FCDB, 0x1F019729, 0xEBF2C73A, 0xE871ACC8,
+		0x0315661C, 0x00960DEE, 0xF4655DFD, 0xF7E6360F,
+		0x6293C661, 0x6110AD93, 0x95E3FD80, 0x96609672,
+		0x7D045CA6, 0x7E873754, 0x8A746747, 0x89F70CB5,
+		0xADCB1FEB, 0xAE487419, 0x5ABB240A, 0x59384FF8,
+		0xB25C852C, 0xB1DFEEDE, 0x452CBECD, 0x46AFD53F,
+		0x0D549871, 0x0ED7F383, 0xFA24A390, 0xF9A7C862,
+		0x12C302B6, 0x11406944, 0xE5B33957, 0xE63052A5,
+		0xC20C41FB, 0xC18F2A09, 0x357C7A1A, 0x36FF11E8,
+		0xDD9BDB3C, 0xDE18B0CE, 0x2AEBE0DD, 0x29688B2F,
+		0x783BF682, 0x7BB89D70, 0x8F4BCD63, 0x8CC8A691,
+		0x67AC6C45, 0x642F07B7, 0x90DC57A4, 0x935F3C56,
+		0xB7632F08, 0xB4E044FA, 0x401314E9, 0x43907F1B,
+		0xA8F4B5CF, 0xAB77DE3D, 0x5F848E2E, 0x5C07E5DC,
+		0x17FCA892, 0x147FC360, 0xE08C9373, 0xE30FF881,
+		0x086B3255, 0x0BE859A7, 0xFF1B09B4, 0xFC986246,
+		0xD8A47118, 0xDB271AEA, 0x2FD44AF9, 0x2C57210B,
+		0xC733EBDF, 0xC4B0802D, 0x3043D03E, 0x33C0BBCC,
+		0xA6B54BA2, 0xA5362050, 0x51C57043, 0x52461BB1,
+		0xB922D165, 0xBAA1BA97, 0x4E52EA84, 0x4DD18176,
+		0x69ED9228, 0x6A6EF9DA, 0x9E9DA9C9, 0x9D1EC23B,
+		0x767A08EF, 0x75F9631D, 0x810A330E, 0x828958FC,
+		0xC97215B2, 0xCAF17E40, 0x3E022E53, 0x3D8145A1,
+		0xD6E58F75, 0xD566E487, 0x2195B494, 0x2216DF66,
+		0x062ACC38, 0x05A9A7CA, 0xF15AF7D9, 0xF2D99C2B,
+		0x19BD56FF, 0x1A3E3D0D, 0xEECD6D1E, 0xED4E06EC,
+		0xC4268DC3, 0xC7A5E631, 0x3356B622, 0x30D5DDD0,
+		0xDBB11704, 0xD8327CF6, 0x2CC12CE5, 0x2F424717,
+		0x0B7E5449, 0x08FD3FBB, 0xFC0E6FA8, 0xFF8D045A,
+		0x14E9CE8E, 0x176AA57C, 0xE399F56F, 0xE01A9E9D,
+		0xABE1D3D3, 0xA862B821, 0x5C91E832, 0x5F1283C0,
+		0xB4764914, 0xB7F522E6, 0x430672F5, 0x40851907,
+		0x64B90A59, 0x673A61AB, 0x93C931B8, 0x904A5A4A,
+		0x7B2E909E, 0x78ADFB6C, 0x8C5EAB7F, 0x8FDDC08D,
+		0x1AA830E3, 0x192B5B11, 0xEDD80B02, 0xEE5B60F0,
+		0x053FAA24, 0x06BCC1D6, 0xF24F91C5, 0xF1CCFA37,
+		0xD5F0E969, 0xD673829B, 0x2280D288, 0x2103B97A,
+		0xCA6773AE, 0xC9E4185C, 0x3D17484F, 0x3E9423BD,
+		0x756F6EF3, 0x76EC0501, 0x821F5512, 0x819C3EE0,
+		0x6AF8F434, 0x697B9FC6, 0x9D88CFD5, 0x9E0BA427,
+		0xBA37B779, 0xB9B4DC8B, 0x4D478C98, 0x4EC4E76A,
+		0xA5A02DBE, 0xA623464C, 0x52D0165F, 0x51537DAD,
+	},
+	{
+		0x00000000, 0x7798A213, 0xEE304527, 0x99A8E734,
+		0xDC618A4E, 0xABF9285D, 0x3251CF69, 0x45C96D7A,
+		0xB8C3149D, 0xCF5BB68E, 0x56F351BA, 0x216BF3A9,
+		0x64A29ED3, 0x133A3CC0, 0x8A92DBF4, 0xFD0A79E7,
+		0x81F1C53F, 0xF669672C, 0x6FC18018, 0x1859220B,
+		0x5D904F71, 0x2A08ED62, 0xB3A00A56, 0xC438A845,
+		0x3932D1A2, 0x4EAA73B1, 0xD7029485, 0xA09A3696,
+		0xE5535BEC, 0x92CBF9FF, 0x0B631ECB, 0x7CFBBCD8,
+		0x02E38B7F, 0x757B296C, 0xECD3CE58, 0x9B4B6C4B,
+		0xDE820131, 0xA91AA322, 0x30B24416, 0x472AE605,
+		0xBA209FE2, 0xCDB83DF1, 0x5410DAC5, 0x238878D6,
+		0x664115AC, 0x11D9B7BF, 0x8871508B, 0xFFE9F298,
+		0x83124E40, 0xF48AEC53, 0x6D220B67, 0x1ABAA974,
+		0x5F73C40E, 0x28EB661D, 0xB1438129, 0xC6DB233A,
+		0x3BD15ADD, 0x4C49F8CE, 0xD5E11FFA, 0xA279BDE9,
+		0xE7B0D093, 0x90287280, 0x098095B4, 0x7E1837A7,
+		0x04C617FF, 0x735EB5EC, 0xEAF652D8, 0x9D6EF0CB,
+		0xD8A79DB1, 0xAF3F3FA2, 0x3697D896, 0x410F7A85,
+		0xBC050362, 0xCB9DA171, 0x52354645, 0x25ADE456,
+		0x6064892C, 0x17FC2B3F, 0x8E54CC0B, 0xF9CC6E18,
+		0x8537D2C0, 0xF2AF70D3, 0x6B0797E7, 0x1C9F35F4,
+		0x5956588E, 0x2ECEFA9D, 0xB7661DA9, 0xC0FEBFBA,
+		0x3DF4C65D, 0x4A6C644E, 0xD3C4837A, 0xA45C2169,
+		0xE1954C13, 0x960DEE00, 0x0FA50934, 0x783DAB27,
+		0x06259C80, 0x71BD3E93, 0xE815D9A7, 0x9F8D7BB4,
+		0xDA4416CE, 0xADDCB4DD, 0x347453E9, 0x43ECF1FA,
+		0xBEE6881D, 0xC97E2A0E, 0x50D6CD3A, 0x274E6F29,
+		0x62870253, 0x151FA040, 0x8CB74774, 0xFB2FE567,
+		0x87D459BF, 0xF04CFBAC, 0x69E41C98, 0x1E7CBE8B,
+		0x5BB5D3F1, 0x2C2D71E2, 0xB58596D6, 0xC21D34C5,
+		0x3F174D22, 0x488FEF31, 0xD1270805, 0xA6BFAA16,
+		0xE376C76C, 0x94EE657F, 0x0D46824B, 0x7ADE2058,
+		0xF9FAC3FB, 0x8E6261E8, 0x17CA86DC, 0x605224CF,
+		0x259B49B5, 0x5203EBA6, 0xCBAB0C92, 0xBC33AE81,
+		0x4139D766, 0x36A17575, 0xAF099241, 0xD8913052,
+		0x9D585D28, 0xEAC0FF3B, 0x7368180F, 0x04F0BA1C,
+		0x780B06C4, 0x0F93A4D7, 0x963B43E3, 0xE1A3E1F0,
+		0xA46A8C8A, 0xD3F22E99, 0x4A5AC9AD, 0x3DC26BBE,
+		0xC0C81259, 0xB750B04A, 0x2EF8577E, 0x5960F56D,
+		0x1CA99817, 0x6B313A04, 0xF299DD30, 0x85017F23,
+		0xFB194884, 0x8C81EA97, 0x15290DA3, 0x62B1AFB0,
+		0x2778C2CA, 0x50E060D9, 0xC94887ED, 0xBED025FE,
+		0x43DA5C19, 0x3442FE0A, 0xADEA193E, 0xDA72BB2D,
+		0x9FBBD657, 0xE8237444, 0x718B9370, 0x06133163,
+		0x7AE88DBB, 0x0D702FA8, 0x94D8C89C, 0xE3406A8F,
+		0xA68907F5, 0xD111A5E6, 0x48B942D2, 0x3F21E0C1,
+		0xC22B9926, 0xB5B33B35, 0x2C1BDC01, 0x5B837E12,
+		0x1E4A1368, 0x69D2B17B, 0xF07A564F, 0x87E2F45C,
+		0xFD3CD404, 0x8AA47617, 0x130C9123, 0x64943330,
+		0x215D5E4A, 0x56C5FC59, 0xCF6D1B6D, 0xB8F5B97E,
+		0x45FFC099, 0x3267628A, 0xABCF85BE, 0xDC5727AD,
+		0x999E4AD7, 0xEE06E8C4, 0x77AE0FF0, 0x0036ADE3,
+		0x7CCD113B, 0x0B55B328, 0x92FD541C, 0xE565F60F,
+		0xA0AC9B75, 0xD7343966, 0x4E9CDE52, 0x39047C41,
+		0xC40E05A6, 0xB396A7B5, 0x2A3E4081, 0x5DA6E292,
+		0x186F8FE8, 0x6FF72DFB, 0xF65FCACF, 0x81C768DC,
+		0xFFDF5F7B, 0x8847FD68, 0x11EF1A5C, 0x6677B84F,
+		0x23BED535, 0x54267726, 0xCD8E9012, 0xBA163201,
+		0x471C4BE6, 0x3084E9F5, 0xA92C0EC1, 0xDEB4ACD2,
+		0x9B7DC1A8, 0xECE563BB, 0x754D848F, 0x02D5269C,
+		0x7E2E9A44, 0x09B63857, 0x901EDF63, 0xE7867D70,
+		0xA24F100A, 0xD5D7B219, 0x4C7F552D, 0x3BE7F73E,
+		0xC6ED8ED9, 0xB1752CCA, 0x28DDCBFE, 0x5F4569ED,
+		0x1A8C0497, 0x6D14A684, 0xF4BC41B0, 0x8324E3A3,
+	},
+	{
+		0x00000000, 0x7E9241A5, 0x0D526F4F, 0x73C02EEA,
+		0x1AA4DE9E, 0x64369F3B, 0x17F6B1D1, 0x6964F074,
+		0xC53E5138, 0xBBAC109D, 0xC86C3E77, 0xB6FE7FD2,
+		0xDF9A8FA6, 0xA108CE03, 0xD2C8E0E9, 0xAC5AA14C,
+		0x8A7DA270, 0xF4EFE3D5, 0x872FCD3F, 0xF9BD8C9A,
+		0x90D97CEE, 0xEE4B3D4B, 0x9D8B13A1, 0xE3195204,
+		0x4F43F348, 0x31D1B2ED, 0x42119C07, 0x3C83DDA2,
+		0x55E72DD6, 0x2B756C73, 0x58B54299, 0x2627033C,
+		0x14FB44E1, 0x6A690544, 0x19A92BAE, 0x673B6A0B,
+		0x0E5F9A7F, 0x70CDDBDA, 0x030DF530, 0x7D9FB495,
+		0xD1C515D9, 0xAF57547C, 0xDC977A96, 0xA2053B33,
+		0xCB61CB47, 0xB5F38AE2, 0xC633A408, 0xB8A1E5AD,
+		0x9E86E691, 0xE014A734, 0x93D489DE, 0xED46C87B,
+		0x8422380F, 0xFAB079AA, 0x89705740, 0xF7E216E5,
+		0x5BB8B7A9, 0x252AF60C, 0x56EAD8E6, 0x28789943,
+		0x411C6937, 0x3F8E2892, 0x4C4E0678, 0x32DC47DD,
+		0xD98065C7, 0xA7122462, 0xD4D20A88, 0xAA404B2D,
+		0xC324BB59, 0xBDB6FAFC, 0xCE76D416, 0xB0E495B3,
+		0x1CBE34FF, 0x622C755A, 0x11EC5BB0, 0x6F7E1A15,
+		0x061AEA61, 0x7888ABC4, 0x0B48852E, 0x75DAC48B,
+		0x53FDC7B7, 0x2D6F8612, 0x5EAFA8F8, 0x203DE95D,
+		0x49591929, 0x37CB588C, 0x440B7666, 0x3A9937C3,
+		0x96C3968F, 0xE851D72A, 0x9B91F9C0, 0xE503B865,
+		0x8C674811, 0xF2F509B4, 0x8135275E, 0xFFA766FB,
+		0xCD7B2126, 0xB3E96083, 0xC0294E69, 0xBEBB0FCC,
+		0xD7DFFFB8, 0xA94DBE1D, 0xDA8D90F7, 0xA41FD152,
+		0x0845701E, 0x76D731BB, 0x05171F51, 0x7B855EF4,
+		0x12E1AE80, 0x6C73EF25, 0x1FB3C1CF, 0x6121806A,
+		0x47068356, 0x3994C2F3, 0x4A54EC19, 0x34C6ADBC,
+		0x5DA25DC8, 0x23301C6D, 0x50F03287, 0x2E627322,
+		0x8238D26E, 0xFCAA93CB, 0x8F6ABD21, 0xF1F8FC84,
+		0x989C0CF0, 0xE60E4D55, 0x95CE63BF, 0xEB5C221A,
+		0x4377278B, 0x3DE5662E, 0x4E2548C4, 0x30B70961,
+		0x59D3F915, 0x2741B8B0, 0x5481965A, 0x2A13D7FF,
+		0x864976B3, 0xF8DB3716, 0x8B1B19FC, 0xF5895859,
+		0x9CEDA82D, 0xE27FE988, 0x91BFC762, 0xEF2D86C7,
+		0xC90A85FB, 0xB798C45E, 0xC458EAB4, 0xBACAAB11,
+		0xD3AE5B65, 0xAD3C1AC0, 0xDEFC342A, 0xA06E758F,
+		0x0C34D4C3, 0x72A69566, 0x0166BB8C, 0x7FF4FA29,
+		0x16900A5D, 0x68024BF8, 0x1BC26512, 0x655024B7,
+		0x578C636A, 0x291E22CF, 0x5ADE0C25, 0x244C4D80,
+		0x4D28BDF4, 0x33BAFC51, 0x407AD2BB, 0x3EE8931E,
+		0x92B23252, 0xEC2073F7, 0x9FE05D1D, 0xE1721CB8,
+		0x8816ECCC, 0xF684AD69, 0x85448383, 0xFBD6C226,
+		0xDDF1C11A, 0xA36380BF, 0xD0A3AE55, 0xAE31EFF0,
+		0xC7551F84, 0xB9C75E21, 0xCA0770CB, 0xB495316E,
+		0x18CF9022, 0x665DD187, 0x159DFF6D, 0x6B0FBEC8,
+		0x026B4EBC, 0x7CF90F19, 0x0F3921F3, 0x71AB6056,
+		0x9AF7424C, 0xE46503E9, 0x97A52D03, 0xE9376CA6,
+		0x80539CD2, 0xFEC1DD77, 0x8D01F39D, 0xF393B238,
+		0x5FC91374, 0x215B52D1, 0x529B7C3B, 0x2C093D9E,
+		0x456DCDEA, 0x3BFF8C4F, 0x483FA2A5, 0x36ADE300,
+		0x108AE03C, 0x6E18A199, 0x1DD88F73, 0x634ACED6,
+		0x0A2E3EA2, 0x74BC7F07, 0x077C51ED, 0x79EE1048,
+		0xD5B4B104, 0xAB26F0A1, 0xD8E6DE4B, 0xA6749FEE,
+		0xCF106F9A, 0xB1822E3F, 0xC24200D5, 0xBCD04170,
+		0x8E0C06AD, 0xF09E4708, 0x835E69E2, 0xFDCC2847,
+		0x94A8D833, 0xEA3A9996, 0x99FAB77C, 0xE768F6D9,
+		0x4B325795, 0x35A01630, 0x466038DA, 0x38F2797F,
+		0x5196890B, 0x2F04C8AE, 0x5CC4E644, 0x2256A7E1,
+		0x0471A4DD, 0x7AE3E578, 0x0923CB92, 0x77B18A37,
+		0x1ED57A43, 0x60473BE6, 0x1387150C, 0x6D1554A9,
+		0xC14FF5E5, 0xBFDDB440, 0xCC1D9AAA, 0xB28FDB0F,
+		0xDBEB2B7B, 0xA5796ADE, 0xD6B94434, 0xA82B0591,
+	},
+	{
+		0x00000000, 0xB8AA45DD, 0x812367BF, 0x39892262,
+		0xF331227B, 0x4B9B67A6, 0x721245C4, 0xCAB80019,
+		0xE66344F6, 0x5EC9012B, 0x67402349, 0xDFEA6694,
+		0x1552668D, 0xADF82350, 0x94710132, 0x2CDB44EF,
+		0x3DB164E9, 0x851B2134, 0xBC920356, 0x0438468B,
+		0xCE804692, 0x762A034F, 0x4FA3212D, 0xF70964F0,
+		0xDBD2201F, 0x637865C2, 0x5AF147A0, 0xE25B027D,
+		0x28E30264, 0x904947B9, 0xA9C065DB, 0x116A2006,
+		0x8B1425D7, 0x33BE600A, 0x0A374268, 0xB29D07B5,
+		0x782507AC, 0xC08F4271, 0xF9066013, 0x41AC25CE,
+		0x6D776121, 0xD5DD24FC, 0xEC54069E, 0x54FE4343,
+		0x9E46435A, 0x26EC0687, 0x1F6524E5, 0xA7CF6138,
+		0xB6A5413E, 0x0E0F04E3, 0x37862681, 0x8F2C635C,
+		0x45946345, 0xFD3E2698, 0xC4B704FA, 0x7C1D4127,
+		0x50C605C8, 0xE86C4015, 0xD1E56277, 0x694F27AA,
+		0xA3F727B3, 0x1B5D626E, 0x22D4400C, 0x9A7E05D1,
+		0xE75FA6AB, 0x5FF5E376, 0x667CC114, 0xDED684C9,
+		0x146E84D0, 0xACC4C10D, 0x954DE36F, 0x2DE7A6B2,
+		0x013CE25D, 0xB996A780, 0x801F85E2, 0x38B5C03F,
+		0xF20DC026, 0x4AA785FB, 0x732EA799, 0xCB84E244,
+		0xDAEEC242, 0x6244879F, 0x5BCDA5FD, 0xE367E020,
+		0x29DFE039, 0x9175A5E4, 0xA8FC8786, 0x1056C25B,
+		0x3C8D86B4, 0x8427C369, 0xBDAEE10B, 0x0504A4D6,
+		0xCFBCA4CF, 0x7716E112, 0x4E9FC370, 0xF63586AD,
+		0x6C4B837C, 0xD4E1C6A1, 0xED68E4C3, 0x55C2A11E,
+		0x9F7AA107, 0x27D0E4DA, 0x1E59C6B8, 0xA6F38365,
+		0x8A28C78A, 0x32828257, 0x0B0BA035, 0xB3A1E5E8,
+		0x7919E5F1, 0xC1B3A02C, 0xF83A824E, 0x4090C793,
+		0x51FAE795, 0xE950A248, 0xD0D9802A, 0x6873C5F7,
+		0xA2CBC5EE, 0x1A618033, 0x23E8A251, 0x9B42E78C,
+		0xB799A363, 0x0F33E6BE, 0x36BAC4DC, 0x8E108101,
+		0x44A88118, 0xFC02C4C5, 0xC58BE6A7, 0x7D21A37A,
+		0x3FC9A052, 0x8763E58F, 0xBEEAC7ED, 0x06408230,
+		0xCCF88229, 0x7452C7F4, 0x4DDBE596, 0xF571A04B,
+		0xD9AAE4A4, 0x6100A179, 0x5889831B, 0xE023C6C6,
+		0x2A9BC6DF, 0x92318302, 0xABB8A160, 0x1312E4BD,
+		0x0278C4BB, 0xBAD28166, 0x835BA304, 0x3BF1E6D9,
+		0xF149E6C0, 0x49E3A31D, 0x706A817F, 0xC8C0C4A2,
+		0xE41B804D, 0x5CB1C590, 0x6538E7F2, 0xDD92A22F,
+		0x172AA236, 0xAF80E7EB, 0x9609C589, 0x2EA38054,
+		0xB4DD8585, 0x0C77C058, 0x35FEE23A, 0x8D54A7E7,
+		0x47ECA7FE, 0xFF46E223, 0xC6CFC041, 0x7E65859C,
+		0x52BEC173, 0xEA1484AE, 0xD39DA6CC, 0x6B37E311,
+		0xA18FE308, 0x1925A6D5, 0x20AC84B7, 0x9806C16A,
+		0x896CE16C, 0x31C6A4B1, 0x084F86D3, 0xB0E5C30E,
+		0x7A5DC317, 0xC2F786CA, 0xFB7EA4A8, 0x43D4E175,
+		0x6F0FA59A, 0xD7A5E047, 0xEE2CC225, 0x568687F8,
+		0x9C3E87E1, 0x2494C23C, 0x1D1DE05E, 0xA5B7A583,
+		0xD89606F9, 0x603C4324, 0x59B56146, 0xE11F249B,
+		0x2BA72482, 0x930D615F, 0xAA84433D, 0x122E06E0,
+		0x3EF5420F, 0x865F07D2, 0xBFD625B0, 0x077C606D,
+		0xCDC46074, 0x756E25A9, 0x4CE707CB, 0xF44D4216,
+		0xE5276210, 0x5D8D27CD, 0x640405AF, 0xDCAE4072,
+		0x1616406B, 0xAEBC05B6, 0x973527D4, 0x2F9F6209,
+		0x034426E6, 0xBBEE633B, 0x82674159, 0x3ACD0484,
+		0xF075049D, 0x48DF4140, 0x71566322, 0xC9FC26FF,
+		0x5382232E, 0xEB2866F3, 0xD2A14491, 0x6A0B014C,
+		0xA0B30155, 0x18194488, 0x219066EA, 0x993A2337,
+		0xB5E167D8, 0x0D4B2205, 0x34C20067, 0x8C6845BA,
+		0x46D045A3, 0xFE7A007E, 0xC7F3221C, 0x7F5967C1,
+		0x6E3347C7, 0xD699021A, 0xEF102078, 0x57BA65A5,
+		0x9D0265BC, 0x25A82061, 0x1C210203, 0xA48B47DE,
+		0x88500331, 0x30FA46EC, 0x0973648E, 0xB1D92153,
+		0x7B61214A, 0xC3CB6497, 0xFA4246F5, 0x42E80328,
+	},
+	{
+		0x00000000, 0xAC6F1138, 0x58DF2270, 0xF4B03348,
+		0xB0BE45E0, 0x1CD154D8, 0xE8616790, 0x440E76A8,
+		0x910B67C5, 0x3D6476FD, 0xC9D445B5, 0x65BB548D,
+		0x21B52225, 0x8DDA331D, 0x796A0055, 0xD505116D,
+		0xD361228F, 0x7F0E33B7, 0x8BBE00FF, 0x27D111C7,
+		0x63DF676F, 0xCFB07657, 0x3B00451F, 0x976F5427,
+		0x426A454A, 0xEE055472, 0x1AB5673A, 0xB6DA7602,
+		0xF2D400AA, 0x5EBB1192, 0xAA0B22DA, 0x066433E2,
+		0x57B5A81B, 0xFBDAB923, 0x0F6A8A6B, 0xA3059B53,
+		0xE70BEDFB, 0x4B64FCC3, 0xBFD4CF8B, 0x13BBDEB3,
+		0xC6BECFDE, 0x6AD1DEE6, 0x9E61EDAE, 0x320EFC96,
+		0x76008A3E, 0xDA6F9B06, 0x2EDFA84E, 0x82B0B976,
+		0x84D48A94, 0x28BB9BAC, 0xDC0BA8E4, 0x7064B9DC,
+		0x346ACF74, 0x9805DE4C, 0x6CB5ED04, 0xC0DAFC3C,
+		0x15DFED51, 0xB9B0FC69, 0x4D00CF21, 0xE16FDE19,
+		0xA561A8B1, 0x090EB989, 0xFDBE8AC1, 0x51D19BF9,
+		0xAE6A5137, 0x0205400F, 0xF6B57347, 0x5ADA627F,
+		0x1ED414D7, 0xB2BB05EF, 0x460B36A7, 0xEA64279F,
+		0x3F6136F2, 0x930E27CA, 0x67BE1482, 0xCBD105BA,
+		0x8FDF7312, 0x23B0622A, 0xD7005162, 0x7B6F405A,
+		0x7D0B73B8, 0xD1646280, 0x25D451C8, 0x89BB40F0,
+		0xCDB53658, 0x61DA2760, 0x956A1428, 0x39050510,
+		0xEC00147D, 0x406F0545, 0xB4DF360D, 0x18B02735,
+		0x5CBE519D, 0xF0D140A5, 0x046173ED, 0xA80E62D5,
+		0xF9DFF92C, 0x55B0E814, 0xA100DB5C, 0x0D6FCA64,
+		0x4961BCCC, 0xE50EADF4, 0x11BE9EBC, 0xBDD18F84,
+		0x68D49EE9, 0xC4BB8FD1, 0x300BBC99, 0x9C64ADA1,
+		0xD86ADB09, 0x7405CA31, 0x80B5F979, 0x2CDAE841,
+		0x2ABEDBA3, 0x86D1CA9B, 0x7261F9D3, 0xDE0EE8EB,
+		0x9A009E43, 0x366F8F7B, 0xC2DFBC33, 0x6EB0AD0B,
+		0xBBB5BC66, 0x17DAAD5E, 0xE36A9E16, 0x4F058F2E,
+		0x0B0BF986, 0xA764E8BE, 0x53D4DBF6, 0xFFBBCACE,
+		0x5CD5A26E, 0xF0BAB356, 0x040A801E, 0xA8659126,
+		0xEC6BE78E, 0x4004F6B6, 0xB4B4C5FE, 0x18DBD4C6,
+		0xCDDEC5AB, 0x61B1D493, 0x9501E7DB, 0x396EF6E3,
+		0x7D60804B, 0xD10F9173, 0x25BFA23B, 0x89D0B303,
+		0x8FB480E1, 0x23DB91D9, 0xD76BA291, 0x7B04B3A9,
+		0x3F0AC501, 0x9365D439, 0x67D5E771, 0xCBBAF649,
+		0x1EBFE724, 0xB2D0F61C, 0x4660C554, 0xEA0FD46C,
+		0xAE01A2C4, 0x026EB3FC, 0xF6DE80B4, 0x5AB1918C,
+		0x0B600A75, 0xA70F1B4D, 0x53BF2805, 0xFFD0393D,
+		0xBBDE4F95, 0x17B15EAD, 0xE3016DE5, 0x4F6E7CDD,
+		0x9A6B6DB0, 0x36047C88, 0xC2B44FC0, 0x6EDB5EF8,
+		0x2AD52850, 0x86BA3968, 0x720A0A20, 0xDE651B18,
+		0xD80128FA, 0x746E39C2, 0x80DE0A8A, 0x2CB11BB2,
+		0x68BF6D1A, 0xC4D07C22, 0x30604F6A, 0x9C0F5E52,
+		0x490A4F3F, 0xE5655E07, 0x11D56D4F, 0xBDBA7C77,
+		0xF9B40ADF, 0x55DB1BE7, 0xA16B28AF, 0x0D043997,
+		0xF2BFF359, 0x5ED0E261, 0xAA60D129, 0x060FC011,
+		0x4201B6B9, 0xEE6EA781, 0x1ADE94C9, 0xB6B185F1,
+		0x63B4949C, 0xCFDB85A4, 0x3B6BB6EC, 0x9704A7D4,
+		0xD30AD17C, 0x7F65C044, 0x8BD5F30C, 0x27BAE234,
+		0x21DED1D6, 0x8DB1C0EE, 0x7901F3A6, 0xD56EE29E,
+		0x91609436, 0x3D0F850E, 0xC9BFB646, 0x65D0A77E,
+		0xB0D5B613, 0x1CBAA72B, 0xE80A9463, 0x4465855B,
+		0x006BF3F3, 0xAC04E2CB, 0x58B4D183, 0xF4DBC0BB,
+		0xA50A5B42, 0x09654A7A, 0xFDD57932, 0x51BA680A,
+		0x15B41EA2, 0xB9DB0F9A, 0x4D6B3CD2, 0xE1042DEA,
+		0x34013C87, 0x986E2DBF, 0x6CDE1EF7, 0xC0B10FCF,
+		0x84BF7967, 0x28D0685F, 0xDC605B17, 0x700F4A2F,
+		0x766B79CD, 0xDA0468F5, 0x2EB45BBD, 0x82DB4A85,
+		0xC6D53C2D, 0x6ABA2D15, 0x9E0A1E5D, 0x32650F65,
+		0xE7601E08, 0x4B0F0F30, 0xBFBF3C78, 0x13D02D40,
+		0x57DE5BE8, 0xFBB14AD0, 0x0F017998, 0xA36E68A0,
+	},
+	{
+		0x00000000, 0x196B30EF, 0xC3A08CDB, 0xDACBBC34,
+		0x7737F5B2, 0x6E5CC55D, 0xB4977969, 0xADFC4986,
+		0x1F180660, 0x0673368F, 0xDCB88ABB, 0xC5D3BA54,
+		0x682FF3D2, 0x7144C33D, 0xAB8F7F09, 0xB2E44FE6,
+		0x3E300CC0, 0x275B3C2F, 0xFD90801B, 0xE4FBB0F4,
+		0x4907F972, 0x506CC99D, 0x8AA775A9, 0x93CC4546,
+		0x21280AA0, 0x38433A4F, 0xE288867B, 0xFBE3B694,
+		0x561FFF12, 0x4F74CFFD, 0x95BF73C9, 0x8CD44326,
+		0x8D16F485, 0x947DC46A, 0x4EB6785E, 0x57DD48B1,
+		0xFA210137, 0xE34A31D8, 0x39818DEC, 0x20EABD03,
+		0x920EF2E5, 0x8B65C20A, 0x51AE7E3E, 0x48C54ED1,
+		0xE5390757, 0xFC5237B8, 0x26998B8C, 0x3FF2BB63,
+		0xB326F845, 0xAA4DC8AA, 0x7086749E, 0x69ED4471,
+		0xC4110DF7, 0xDD7A3D18, 0x07B1812C, 0x1EDAB1C3,
+		0xAC3EFE25, 0xB555CECA, 0x6F9E72FE, 0x76F54211,
+		0xDB090B97, 0xC2623B78, 0x18A9874C, 0x01C2B7A3,
+		0xEB5B040E, 0xF23034E1, 0x28FB88D5, 0x3190B83A,
+		0x9C6CF1BC, 0x8507C153, 0x5FCC7D67, 0x46A74D88,
+		0xF443026E, 0xED283281, 0x37E38EB5, 0x2E88BE5A,
+		0x8374F7DC, 0x9A1FC733, 0x40D47B07, 0x59BF4BE8,
+		0xD56B08CE, 0xCC003821, 0x16CB8415, 0x0FA0B4FA,
+		0xA25CFD7C, 0xBB37CD93, 0x61FC71A7, 0x78974148,
+		0xCA730EAE, 0xD3183E41, 0x09D38275, 0x10B8B29A,
+		0xBD44FB1C, 0xA42FCBF3, 0x7EE477C7, 0x678F4728,
+		0x664DF08B, 0x7F26C064, 0xA5ED7C50, 0xBC864CBF,
+		0x117A0539, 0x081135D6, 0xD2DA89E2, 0xCBB1B90D,
+		0x7955F6EB, 0x603EC604, 0xBAF57A30, 0xA39E4ADF,
+		0x0E620359, 0x170933B6, 0xCDC28F82, 0xD4A9BF6D,
+		0x587DFC4B, 0x4116CCA4, 0x9BDD7090, 0x82B6407F,
+		0x2F4A09F9, 0x36213916, 0xECEA8522, 0xF581B5CD,
+		0x4765FA2B, 0x5E0ECAC4, 0x84C576F0, 0x9DAE461F,
+		0x30520F99, 0x29393F76, 0xF3F28342, 0xEA99B3AD,
+		0xD6B7081C, 0xCFDC38F3, 0x151784C7, 0x0C7CB428,
+		0xA180FDAE, 0xB8EBCD41, 0x62207175, 0x7B4B419A,
+		0xC9AF0E7C, 0xD0C43E93, 0x0A0F82A7, 0x1364B248,
+		0xBE98FBCE, 0xA7F3CB21, 0x7D387715, 0x645347FA,
+		0xE88704DC, 0xF1EC3433, 0x2B278807, 0x324CB8E8,
+		0x9FB0F16E, 0x86DBC181, 0x5C107DB5, 0x457B4D5A,
+		0xF79F02BC, 0xEEF43253, 0x343F8E67, 0x2D54BE88,
+		0x80A8F70E, 0x99C3C7E1, 0x43087BD5, 0x5A634B3A,
+		0x5BA1FC99, 0x42CACC76, 0x98017042, 0x816A40AD,
+		0x2C96092B, 0x35FD39C4, 0xEF3685F0, 0xF65DB51F,
+		0x44B9FAF9, 0x5DD2CA16, 0x87197622, 0x9E7246CD,
+		0x338E0F4B, 0x2AE53FA4, 0xF02E8390, 0xE945B37F,
+		0x6591F059, 0x7CFAC0B6, 0xA6317C82, 0xBF5A4C6D,
+		0x12A605EB, 0x0BCD3504, 0xD1068930, 0xC86DB9DF,
+		0x7A89F639, 0x63E2C6D6, 0xB9297AE2, 0xA0424A0D,
+		0x0DBE038B, 0x14D53364, 0xCE1E8F50, 0xD775BFBF,
+		0x3DEC0C12, 0x24873CFD, 0xFE4C80C9, 0xE727B026,
+		0x4ADBF9A0, 0x53B0C94F, 0x897B757B, 0x90104594,
+		0x22F40A72, 0x3B9F3A9D, 0xE15486A9, 0xF83FB646,
+		0x55C3FFC0, 0x4CA8CF2F, 0x9663731B, 0x8F0843F4,
+		0x03DC00D2, 0x1AB7303D, 0xC07C8C09, 0xD917BCE6,
+		0x74EBF560, 0x6D80C58F, 0xB74B79BB, 0xAE204954,
+		0x1CC406B2, 0x05AF365D, 0xDF648A69, 0xC60FBA86,
+		0x6BF3F300, 0x7298C3EF, 0xA8537FDB, 0xB1384F34,
+		0xB0FAF897, 0xA991C878, 0x735A744C, 0x6A3144A3,
+		0xC7CD0D25, 0xDEA63DCA, 0x046D81FE, 0x1D06B111,
+		0xAFE2FEF7, 0xB689CE18, 0x6C42722C, 0x752942C3,
+		0xD8D50B45, 0xC1BE3BAA, 0x1B75879E, 0x021EB771,
+		0x8ECAF457, 0x97A1C4B8, 0x4D6A788C, 0x54014863,
+		0xF9FD01E5, 0xE096310A, 0x3A5D8D3E, 0x2336BDD1,
+		0x91D2F237, 0x88B9C2D8, 0x52727EEC, 0x4B194E03,
+		0xE6E50785, 0xFF8E376A, 0x25458B5E, 0x3C2EBBB1,
+	},
+	{
+		0x00000000, 0xC82C0368, 0x905906D0, 0x587505B8,
+		0xD1C5E0A5, 0x19E9E3CD, 0x419CE675, 0x89B0E51D,
+		0x53FD2D4E, 0x9BD12E26, 0xC3A42B9E, 0x0B8828F6,
+		0x8238CDEB, 0x4A14CE83, 0x1261CB3B, 0xDA4DC853,
+		0xA6FA5B9C, 0x6ED658F4, 0x36A35D4C, 0xFE8F5E24,
+		0x773FBB39, 0xBF13B851, 0xE766BDE9, 0x2F4ABE81,
+		0xF50776D2, 0x3D2B75BA, 0x655E7002, 0xAD72736A,
+		0x24C29677, 0xECEE951F, 0xB49B90A7, 0x7CB793CF,
+		0xBD835B3D, 0x75AF5855, 0x2DDA5DED, 0xE5F65E85,
+		0x6C46BB98, 0xA46AB8F0, 0xFC1FBD48, 0x3433BE20,
+		0xEE7E7673, 0x2652751B, 0x7E2770A3, 0xB60B73CB,
+		0x3FBB96D6, 0xF79795BE, 0xAFE29006, 0x67CE936E,
+		0x1B7900A1, 0xD35503C9, 0x8B200671, 0x430C0519,
+		0xCABCE004, 0x0290E36C, 0x5AE5E6D4, 0x92C9E5BC,
+		0x48842DEF, 0x80A82E87, 0xD8DD2B3F, 0x10F12857,
+		0x9941CD4A, 0x516DCE22, 0x0918CB9A, 0xC134C8F2,
+		0x7A07B77A, 0xB22BB412, 0xEA5EB1AA, 0x2272B2C2,
+		0xABC257DF, 0x63EE54B7, 0x3B9B510F, 0xF3B75267,
+		0x29FA9A34, 0xE1D6995C, 0xB9A39CE4, 0x718F9F8C,
+		0xF83F7A91, 0x301379F9, 0x68667C41, 0xA04A7F29,
+		0xDCFDECE6, 0x14D1EF8E, 0x4CA4EA36, 0x8488E95E,
+		0x0D380C43, 0xC5140F2B, 0x9D610A93, 0x554D09FB,
+		0x8F00C1A8, 0x472CC2C0, 0x1F59C778, 0xD775C410,
+		0x5EC5210D, 0x96E92265, 0xCE9C27DD, 0x06B024B5,
+		0xC784EC47, 0x0FA8EF2F, 0x57DDEA97, 0x9FF1E9FF,
+		0x16410CE2, 0xDE6D0F8A, 0x86180A32, 0x4E34095A,
+		0x9479C109, 0x5C55C261, 0x0420C7D9, 0xCC0CC4B1,
+		0x45BC21AC, 0x8D9022C4, 0xD5E5277C, 0x1DC92414,
+		0x617EB7DB, 0xA952B4B3, 0xF127B10B, 0x390BB263,
+		0xB0BB577E, 0x78975416, 0x20E251AE, 0xE8CE52C6,
+		0x32839A95, 0xFAAF99FD, 0xA2DA9C45, 0x6AF69F2D,
+		0xE3467A30, 0x2B6A7958, 0x731F7CE0, 0xBB337F88,
+		0xF40E6EF5, 0x3C226D9D, 0x64576825, 0xAC7B6B4D,
+		0x25CB8E50, 0xEDE78D38, 0xB5928880, 0x7DBE8BE8,
+		0xA7F343BB, 0x6FDF40D3, 0x37AA456B, 0xFF864603,
+		0x7636A31E, 0xBE1AA076, 0xE66FA5CE, 0x2E43A6A6,
+		0x52F43569, 0x9AD83601, 0xC2AD33B9, 0x0A8130D1,
+		0x8331D5CC, 0x4B1DD6A4, 0x1368D31C, 0xDB44D074,
+		0x01091827, 0xC9251B4F, 0x91501EF7, 0x597C1D9F,
+		0xD0CCF882, 0x18E0FBEA, 0x4095FE52, 0x88B9FD3A,
+		0x498D35C8, 0x81A136A0, 0xD9D43318, 0x11F83070,
+		0x9848D56D, 0x5064D605, 0x0811D3BD, 0xC03DD0D5,
+		0x1A701886, 0xD25C1BEE, 0x8A291E56, 0x42051D3E,
+		0xCBB5F823, 0x0399FB4B, 0x5BECFEF3, 0x93C0FD9B,
+		0xEF776E54, 0x275B6D3C, 0x7F2E6884, 0xB7026BEC,
+		0x3EB28EF1, 0xF69E8D99, 0xAEEB8821, 0x66C78B49,
+		0xBC8A431A, 0x74A64072, 0x2CD345CA, 0xE4FF46A2,
+		0x6D4FA3BF, 0xA563A0D7, 0xFD16A56F, 0x353AA607,
+		0x8E09D98F, 0x4625DAE7, 0x1E50DF5F, 0xD67CDC37,
+		0x5FCC392A, 0x97E03A42, 0xCF953FFA, 0x07B93C92,
+		0xDDF4F4C1, 0x15D8F7A9, 0x4DADF211, 0x8581F179,
+		0x0C311464, 0xC41D170C, 0x9C6812B4, 0x544411DC,
+		0x28F38213, 0xE0DF817B, 0xB8AA84C3, 0x708687AB,
+		0xF93662B6, 0x311A61DE, 0x696F6466, 0xA143670E,
+		0x7B0EAF5D, 0xB322AC35, 0xEB57A98D, 0x237BAAE5,
+		0xAACB4FF8, 0x62E74C90, 0x3A924928, 0xF2BE4A40,
+		0x338A82B2, 0xFBA681DA, 0xA3D38462, 0x6BFF870A,
+		0xE24F6217, 0x2A63617F, 0x721664C7, 0xBA3A67AF,
+		0x6077AFFC, 0xA85BAC94, 0xF02EA92C, 0x3802AA44,
+		0xB1B24F59, 0x799E4C31, 0x21EB4989, 0xE9C74AE1,
+		0x9570D92E, 0x5D5CDA46, 0x0529DFFE, 0xCD05DC96,
+		0x44B5398B, 0x8C993AE3, 0xD4EC3F5B, 0x1CC03C33,
+		0xC68DF460, 0x0EA1F708, 0x56D4F2B0, 0x9EF8F1D8,
+		0x174814C5, 0xDF6417AD, 0x87111215, 0x4F3D117D,
+	},
+	{
+		0x00000000, 0x277D3C49, 0x4EFA7892, 0x698744DB,
+		0x6D821D21, 0x4AFF2168, 0x237865B3, 0x040559FA,
+		0xDA043B42, 0xFD79070B, 0x94FE43D0, 0xB3837F99,
+		0xB7862663, 0x90FB1A2A, 0xF97C5EF1, 0xDE0162B8,
+		0xB4097684, 0x93744ACD, 0xFAF30E16, 0xDD8E325F,
+		0xD98B6BA5, 0xFEF657EC, 0x97711337, 0xB00C2F7E,
+		0x6E0D4DC6, 0x4970718F, 0x20F73554, 0x078A091D,
+		0x038F50E7, 0x24F26CAE, 0x4D752875, 0x6A08143C,
+		0x9965000D, 0xBE183C44, 0xD79F789F, 0xF0E244D6,
+		0xF4E71D2C, 0xD39A2165, 0xBA1D65BE, 0x9D6059F7,
+		0x43613B4F, 0x641C0706, 0x0D9B43DD, 0x2AE67F94,
+		0x2EE3266E, 0x099E1A27, 0x60195EFC, 0x476462B5,
+		0x2D6C7689, 0x0A114AC0, 0x63960E1B, 0x44EB3252,
+		0x40EE6BA8, 0x679357E1, 0x0E14133A, 0x29692F73,
+		0xF7684DCB, 0xD0157182, 0xB9923559, 0x9EEF0910,
+		0x9AEA50EA, 0xBD976CA3, 0xD4102878, 0xF36D1431,
+		0x32CB001A, 0x15B63C53, 0x7C317888, 0x5B4C44C1,
+		0x5F491D3B, 0x78342172, 0x11B365A9, 0x36CE59E0,
+		0xE8CF3B58, 0xCFB20711, 0xA63543CA, 0x81487F83,
+		0x854D2679, 0xA2301A30, 0xCBB75EEB, 0xECCA62A2,
+		0x86C2769E, 0xA1BF4AD7, 0xC8380E0C, 0xEF453245,
+		0xEB406BBF, 0xCC3D57F6, 0xA5BA132D, 0x82C72F64,
+		0x5CC64DDC, 0x7BBB7195, 0x123C354E, 0x35410907,
+		0x314450FD, 0x16396CB4, 0x7FBE286F, 0x58C31426,
+		0xABAE0017, 0x8CD33C5E, 0xE5547885, 0xC22944CC,
+		0xC62C1D36, 0xE151217F, 0x88D665A4, 0xAFAB59ED,
+		0x71AA3B55, 0x56D7071C, 0x3F5043C7, 0x182D7F8E,
+		0x1C282674, 0x3B551A3D, 0x52D25EE6, 0x75AF62AF,
+		0x1FA77693, 0x38DA4ADA, 0x515D0E01, 0x76203248,
+		0x72256BB2, 0x555857FB, 0x3CDF1320, 0x1BA22F69,
+		0xC5A34DD1, 0xE2DE7198, 0x8B593543, 0xAC24090A,
+		0xA82150F0, 0x8F5C6CB9, 0xE6DB2862, 0xC1A6142B,
+		0x64960134, 0x43EB3D7D, 0x2A6C79A6, 0x0D1145EF,
+		0x09141C15, 0x2E69205C, 0x47EE6487, 0x609358CE,
+		0xBE923A76, 0x99EF063F, 0xF06842E4, 0xD7157EAD,
+		0xD3102757, 0xF46D1B1E, 0x9DEA5FC5, 0xBA97638C,
+		0xD09F77B0, 0xF7E24BF9, 0x9E650F22, 0xB918336B,
+		0xBD1D6A91, 0x9A6056D8, 0xF3E71203, 0xD49A2E4A,
+		0x0A9B4CF2, 0x2DE670BB, 0x44613460, 0x631C0829,
+		0x671951D3, 0x40646D9A, 0x29E32941, 0x0E9E1508,
+		0xFDF30139, 0xDA8E3D70, 0xB30979AB, 0x947445E2,
+		0x90711C18, 0xB70C2051, 0xDE8B648A, 0xF9F658C3,
+		0x27F73A7B, 0x008A0632, 0x690D42E9, 0x4E707EA0,
+		0x4A75275A, 0x6D081B13, 0x048F5FC8, 0x23F26381,
+		0x49FA77BD, 0x6E874BF4, 0x07000F2F, 0x207D3366,
+		0x24786A9C, 0x030556D5, 0x6A82120E, 0x4DFF2E47,
+		0x93FE4CFF, 0xB48370B6, 0xDD04346D, 0xFA790824,
+		0xFE7C51DE, 0xD9016D97, 0xB086294C, 0x97FB1505,
+		0x565D012E, 0x71203D67, 0x18A779BC, 0x3FDA45F5,
+		0x3BDF1C0F, 0x1CA22046, 0x7525649D, 0x525858D4,
+		0x8C593A6C, 0xAB240625, 0xC2A342FE, 0xE5DE7EB7,
+		0xE1DB274D, 0xC6A61B04, 0xAF215FDF, 0x885C6396,
+		0xE25477AA, 0xC5294BE3, 0xACAE0F38, 0x8BD33371,
+		0x8FD66A8B, 0xA8AB56C2, 0xC12C1219, 0xE6512E50,
+		0x38504CE8, 0x1F2D70A1, 0x76AA347A, 0x51D70833,
+		0x55D251C9, 0x72AF6D80, 0x1B28295B, 0x3C551512,
+		0xCF380123, 0xE8453D6A, 0x81C279B1, 0xA6BF45F8,
+		0xA2BA1C02, 0x85C7204B, 0xEC406490, 0xCB3D58D9,
+		0x153C3A61, 0x32410628, 0x5BC642F3, 0x7CBB7EBA,
+		0x78BE2740, 0x5FC31B09, 0x36445FD2, 0x1139639B,
+		0x7B3177A7, 0x5C4C4BEE, 0x35CB0F35, 0x12B6337C,
+		0x16B36A86, 0x31CE56CF, 0x58491214, 0x7F342E5D,
+		0xA1354CE5, 0x864870AC, 0xEFCF3477, 0xC8B2083E,
+		0xCCB751C4, 0xEBCA6D8D, 0x824D2956, 0xA530151F
+	}
+#endif   /* WORDS_BIGENDIAN */
+};
diff --git a/src/port/pg_crc32c_sse42.c b/src/port/pg_crc32c_sse42.c
new file mode 100644
index 0000000..a22a9dd
--- /dev/null
+++ b/src/port/pg_crc32c_sse42.c
@@ -0,0 +1,67 @@
+/*-------------------------------------------------------------------------
+ *
+ * pg_crc32c_sse42.c
+ *	  Compute CRC-32C checksum using Intel SSE 4.2 instructions.
+ *
+ * Portions Copyright (c) 1996-2015, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ *
+ * IDENTIFICATION
+ *	  src/port/pg_crc32c_sse42.c
+ *
+ *-------------------------------------------------------------------------
+ */
+#include "c.h"
+
+#include "port/pg_crc32c.h"
+
+#include <nmmintrin.h>
+
+pg_crc32c
+pg_comp_crc32c_sse42(pg_crc32c crc, const void *data, size_t len)
+{
+	const unsigned char *p = data;
+	const unsigned char *pend = p + len;
+
+	/*
+	 * Process eight bytes of data at a time.
+	 *
+	 * NB: We do unaligned accesses here. The Intel architecture allows that,
+	 * and performance testing didn't show any performance gain from aligning
+	 * the begin address.
+	 */
+#ifdef __x86_64__
+	while (p + 8 <= pend)
+	{
+		crc = (uint32) _mm_crc32_u64(crc, *((const uint64 *) p));
+		p += 8;
+	}
+
+	/* Process remaining full four bytes if any */
+	if (p + 4 <= pend)
+	{
+		crc = _mm_crc32_u32(crc, *((const unsigned int *) p));
+		p += 4;
+	}
+#else
+	/*
+	 * Process four bytes at a time. (The eight byte instruction is not
+	 * available on the 32-bit x86 architecture).
+	 */
+	while (p + 4 <= pend)
+	{
+		crc = _mm_crc32_u32(crc, *((const unsigned int *) p));
+		p += 4;
+	}
+#endif /* __x86_64__ */
+
+	/* Process any remaining bytes one at a time. */
+	while (p < pend)
+	{
+		crc = _mm_crc32_u8(crc, *p);
+		p++;
+	}
+
+	return crc;
+}
diff --git a/src/tools/msvc/Mkvcbuild.pm b/src/tools/msvc/Mkvcbuild.pm
index 2e5126b..e4dbebf 100644
--- a/src/tools/msvc/Mkvcbuild.pm
+++ b/src/tools/msvc/Mkvcbuild.pm
@@ -31,18 +31,18 @@ my $libpq;
 # Set of variables for contrib modules
 my $contrib_defines = { 'refint' => 'REFINT_VERBOSE' };
 my @contrib_uselibpq =
-  ('dblink', 'oid2name', 'pg_upgrade', 'postgres_fdw', 'vacuumlo');
+  ('dblink', 'oid2name', 'postgres_fdw', 'vacuumlo');
 my @contrib_uselibpgport = (
 	'oid2name',
 	'pg_standby',
 	'pg_test_fsync', 'pg_test_timing',
-	'pg_upgrade',    'pg_xlogdump',
+	'pg_xlogdump',
 	'vacuumlo');
 my @contrib_uselibpgcommon = (
 	'oid2name',
 	'pg_standby',
 	'pg_test_fsync', 'pg_test_timing',
-	'pg_upgrade',    'pg_xlogdump',
+	'pg_xlogdump',
 	'vacuumlo');
 my $contrib_extralibs = undef;
 my $contrib_extraincludes =
@@ -54,9 +54,9 @@ my @contrib_excludes = ('pgcrypto', 'intagg', 'sepgsql');
 
 # Set of variables for frontend modules
 my $frontend_defines = { 'initdb' => 'FRONTEND' };
-my @frontend_uselibpq = ('pg_ctl', 'pgbench', 'psql');
-my @frontend_uselibpgport = ( 'pg_archivecleanup', 'pgbench' );
-my @frontend_uselibpgcommon = ( 'pg_archivecleanup', 'pgbench' );
+my @frontend_uselibpq = ('pg_ctl', 'pg_upgrade', 'pgbench', 'psql');
+my @frontend_uselibpgport = ( 'pg_archivecleanup', 'pg_upgrade', 'pgbench' );
+my @frontend_uselibpgcommon = ( 'pg_archivecleanup', 'pg_upgrade', 'pgbench' );
 my $frontend_extralibs = {
 	'initdb'     => ['ws2_32.lib'],
 	'pg_restore' => ['ws2_32.lib'],
@@ -96,8 +96,19 @@ sub mkvcbuild
 
 	push(@pgportfiles, 'rint.c') if ($vsVersion < '12.00');
 
+	if ($vsVersion >= '9.00')
+	{
+		push(@pgportfiles, 'pg_crc32c_choose.c');
+		push(@pgportfiles, 'pg_crc32c_sse42.c');
+		push(@pgportfiles, 'pg_crc32c_sb8.c');
+	}
+	else
+	{
+		push(@pgportfiles, 'pg_crc32c_sb8.c')
+	}
+
 	our @pgcommonallfiles = qw(
-	  exec.c pg_crc.c pg_lzcompress.c pgfnames.c psprintf.c relpath.c rmtree.c
+	  exec.c pg_lzcompress.c pgfnames.c psprintf.c relpath.c rmtree.c
 	  string.c username.c wait_error.c);
 
 	our @pgcommonfrontendfiles = (@pgcommonallfiles, qw(fe_memutils.c
diff --git a/src/tools/msvc/vcregress.pl b/src/tools/msvc/vcregress.pl
index bd3dd2c..4812a03 100644
--- a/src/tools/msvc/vcregress.pl
+++ b/src/tools/msvc/vcregress.pl
@@ -269,7 +269,7 @@ sub upgradecheck
 
 	$ENV{PGHOST} = 'localhost';
 	$ENV{PGPORT} ||= 50432;
-	my $tmp_root = "$topdir/contrib/pg_upgrade/tmp_check";
+	my $tmp_root = "$topdir/src/bin/pg_upgrade/tmp_check";
 	(mkdir $tmp_root || die $!) unless -d $tmp_root;
 	my $tmp_install = "$tmp_root/install";
 	print "Setting up temp install\n\n";
@@ -282,7 +282,7 @@ sub upgradecheck
 	$ENV{PATH} = "$bindir;$ENV{PATH}";
 	my $data = "$tmp_root/data";
 	$ENV{PGDATA} = "$data.old";
-	my $logdir = "$topdir/contrib/pg_upgrade/log";
+	my $logdir = "$topdir/src/bin/pg_upgrade/log";
 	(mkdir $logdir || die $!) unless -d $logdir;
 	print "\nRunning initdb on old cluster\n\n";
 	standard_initdb() or exit 1;
@@ -292,7 +292,7 @@ sub upgradecheck
 	installcheck();
 
 	# now we can chdir into the source dir
-	chdir "$topdir/contrib/pg_upgrade";
+	chdir "$topdir/src/bin/pg_upgrade";
 	print "\nDumping old cluster\n\n";
 	system("pg_dumpall -f $tmp_root/dump1.sql") == 0 or exit 1;
 	print "\nStopping old cluster\n\n";
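
For cross-checking the SSE 4.2 path in pg_crc32c_sse42.c above, a portable bitwise CRC-32C can serve as a reference. The sketch below is not part of the patch: crc32c_update_ref and crc32c_ref are hypothetical names. Since pg_comp_crc32c_sse42() only updates a running value, the sketch assumes the caller applies the conventional CRC-32C initial value and final XOR of 0xFFFFFFFF.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Bitwise CRC-32C (Castagnoli) update step.  Hypothetical reference helper,
 * not part of the patch.  0x82F63B78 is the bit-reflected polynomial.
 */
uint32_t
crc32c_update_ref(uint32_t crc, const void *data, size_t len)
{
	const unsigned char *p = data;

	assert(data != NULL || len == 0);

	while (len-- > 0)
	{
		crc ^= *p++;
		/* Shift out one bit at a time, folding in the polynomial on a 1 bit. */
		for (int bit = 0; bit < 8; bit++)
			crc = (crc >> 1) ^ (0x82F63B78u & (0u - (crc & 1u)));
	}
	return crc;
}

/* One-shot checksum, assuming the usual 0xFFFFFFFF init and final XOR. */
uint32_t
crc32c_ref(const void *data, size_t len)
{
	return crc32c_update_ref(0xFFFFFFFFu, data, len) ^ 0xFFFFFFFFu;
}
```

With that convention, the nine ASCII bytes "123456789" give 0xE3069283, the standard CRC-32C check value, and feeding the same bytes in two pieces gives the same result as feeding them at once. It is far slower than the hardware instruction and is meant only for comparison in tests.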

#28

Kouhei Kaigai

kaigai@ak.jp.nec.com

over 10 years ago

In reply to: Shigeru HANADA (#27)

Hanada-san,

I merged explain patch into foreign_join patch.

Now v12 is the latest patch.

It contains many garbage lines... Please ensure the
patch is correctly based on the latest master +
custom_join patch.

Thanks,
--
NEC Business Creation Division / PG-Strom Project
KaiGai Kohei <kaigai@ak.jp.nec.com>

-----Original Message-----
From: Shigeru HANADA [mailto:shigeru.hanada@gmail.com]
Sent: Thursday, April 16, 2015 5:06 PM
To: Kaigai Kouhei(海外浩平)
Cc: Ashutosh Bapat; Robert Haas; Tom Lane; Thom Brown;
pgsql-hackers@postgreSQL.org
Subject: Re: Custom/Foreign-Join-APIs (Re: [HACKERS] [v9.5] Custom
Plan API)

Kaigai-san,

On 2015/04/15 22:33, Kouhei Kaigai <kaigai@ak.jp.nec.com> wrote:

Oops, that’s right. Attached is the revised version. I chose the fully qualified
name, schema.relname [alias], for the output. It wastes some cycles during
planning when the query is not for EXPLAIN, but it seems difficult to get the list
of relation names in the ExplainForeignScan() phase, because the planning
information is gone by that time.

I understand. The private data structure of postgres_fdw is not designed
to keep tree-structured data (like a relations join tree), so this seems to
me a straightforward way to implement the feature.

I have a small suggestion. This patch makes deparseSelectSql initialize
the StringInfoData if supplied; however, that is usually the caller's
job, not the callee's.
In this case, I would put initStringInfo(&relations) next to the
initStringInfo(&sql) line in postgresGetForeignPlan. To my mind, it is
a bit strange to pass an uninitialized StringInfoData to get a text form.
@@ -803,7 +806,7 @@ postgresGetForeignPlan(PlannerInfo *root,
*/
initStringInfo(&sql);
deparseSelectSql(&sql, root, baserel, fpinfo->attrs_used, remote_conds,
-                    &params_list, &fdw_ps_tlist, &retrieved_attrs);
+                    &params_list, &fdw_ps_tlist, &retrieved_attrs, &relations);

/*
* Build the fdw_private list that will be available in the executor.

Agreed. If the caller passes a buffer, the caller should initialize it. In
addition to your idea, I added a check that the RelOptInfo is a JOINREL, because
a BASEREL doesn’t need relations for its EXPLAIN output.
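
The convention agreed on here (the caller initializes the buffer, the callee only appends, and NULL means "not wanted") can be sketched in isolation. This is a minimal illustration, not the patch's code: StrBuf, strbuf_init, strbuf_append and deparse_sketch are hypothetical stand-ins for StringInfoData, initStringInfo, appendStringInfoString and deparseSelectSql, and the table name is made up.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for StringInfoData; not the PostgreSQL type. */
typedef struct
{
	char	   *data;
	size_t		len;
	size_t		cap;
} StrBuf;

/* Caller-side initialization, analogous in spirit to initStringInfo(). */
void
strbuf_init(StrBuf *buf)
{
	buf->cap = 64;
	buf->len = 0;
	buf->data = malloc(buf->cap);
	assert(buf->data != NULL);
	buf->data[0] = '\0';
}

/* Append a NUL-terminated string, growing the buffer as needed. */
void
strbuf_append(StrBuf *buf, const char *s)
{
	size_t		n = strlen(s);

	/* A zero-filled, never-initialized buffer trips this assertion. */
	assert(buf->data != NULL);
	if (buf->len + n + 1 > buf->cap)
	{
		while (buf->len + n + 1 > buf->cap)
			buf->cap *= 2;
		buf->data = realloc(buf->data, buf->cap);
		assert(buf->data != NULL);
	}
	memcpy(buf->data + buf->len, s, n + 1);
	buf->len += n;
}

/*
 * Callee: always appends to sql; appends to relations only when the caller
 * passed a non-NULL buffer.  It never initializes either buffer itself.
 */
void
deparse_sketch(StrBuf *sql, StrBuf *relations)
{
	strbuf_append(sql, "SELECT c1 FROM ");
	strbuf_append(sql, "public.t1");
	if (relations != NULL)
		strbuf_append(relations, "public.t1");
}
```

A caller that wants the EXPLAIN text calls strbuf_init() on both buffers before deparse_sketch(&sql, &relations); a caller that does not want it passes NULL as the second argument and skips the second initialization.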

Also, could you merge the EXPLAIN output feature into the main patch?
I see no reason to split this feature out.

I merged explain patch into foreign_join patch.

Now v12 is the latest patch.

--
Shigeru HANADA
shigeru.hanada@gmail.com



#29

Shigeru HANADA

shigeru.hanada@gmail.com

over 10 years ago

In reply to: Kouhei Kaigai (#28)

1 attachment(s)

Kaigai-san,

On 2015/04/17 10:13, Kouhei Kaigai <kaigai@ak.jp.nec.com> wrote:

Hanada-san,

I merged explain patch into foreign_join patch.

Now v12 is the latest patch.

It contains many garbage lines... Please ensure the
patch is correctly based on the latest master +
custom_join patch.

Oops, sorry. I’ve re-created the patch as v13, based on the Custom/Foreign join v11 patch and the latest master.

It contains an EXPLAIN enhancement: a new subitem “Relations” shows the relations and joins, including their order and type, processed by the foreign scan.
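
As a rough illustration of how such a "Relations" string nests, the patch's deparseSelectSql() combines the outer and inner descriptions with the format "(%s) %s JOIN (%s)". The sketch below reproduces only that composition step in standalone C. SketchJoinType, sketch_jointype_name and sketch_join_relations are hypothetical names, and the relation names are made up.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical join kinds; the patch itself uses PostgreSQL's JoinType. */
typedef enum
{
	SK_INNER,
	SK_LEFT,
	SK_RIGHT,
	SK_FULL
} SketchJoinType;

/* Map a join kind to the keyword printed in front of JOIN. */
const char *
sketch_jointype_name(SketchJoinType jt)
{
	switch (jt)
	{
		case SK_INNER:
			return "INNER";
		case SK_LEFT:
			return "LEFT";
		case SK_RIGHT:
			return "RIGHT";
		case SK_FULL:
			return "FULL";
	}
	return "";
}

/*
 * Write "(outer) TYPE JOIN (inner)" into dst and return snprintf's result.
 * outer and inner may themselves be results of earlier calls, which is how
 * the join order ends up encoded in the parenthesization.
 */
int
sketch_join_relations(char *dst, size_t dstlen,
					  const char *outer, SketchJoinType jt,
					  const char *inner)
{
	assert(dst != NULL && dstlen > 0);
	return snprintf(dst, dstlen, "(%s) %s JOIN (%s)",
					outer, sketch_jointype_name(jt), inner);
}
```

Joining "public.t1" with "public.t2 x" as INNER, and then joining that result with "public.t3" as LEFT, yields "((public.t1) INNER JOIN (public.t2 x)) LEFT JOIN (public.t3)".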

--
Shigeru HANADA
shigeru.hanada@gmail.com

Attachments:

foreign_join_v13.patch (application/octet-stream)

diff --git a/contrib/postgres_fdw/deparse.c b/contrib/postgres_fdw/deparse.c
index 94fab18..5ec3d89 100644
--- a/contrib/postgres_fdw/deparse.c
+++ b/contrib/postgres_fdw/deparse.c
@@ -44,8 +44,11 @@
 #include "catalog/pg_proc.h"
 #include "catalog/pg_type.h"
 #include "commands/defrem.h"
+#include "nodes/makefuncs.h"
 #include "nodes/nodeFuncs.h"
+#include "nodes/plannodes.h"
 #include "optimizer/clauses.h"
+#include "optimizer/prep.h"
 #include "optimizer/var.h"
 #include "parser/parsetree.h"
 #include "utils/builtins.h"
@@ -89,6 +92,8 @@ typedef struct deparse_expr_cxt
 	RelOptInfo *foreignrel;		/* the foreign relation we are planning for */
 	StringInfo	buf;			/* output buffer to append to */
 	List	  **params_list;	/* exprs that will become remote Params */
+	List	   *outertlist;		/* outer child's target list */
+	List	   *innertlist;		/* inner child's target list */
 } deparse_expr_cxt;
 
 /*
@@ -136,6 +141,13 @@ static void printRemoteParam(int paramindex, Oid paramtype, int32 paramtypmod,
 				 deparse_expr_cxt *context);
 static void printRemotePlaceholder(Oid paramtype, int32 paramtypmod,
 					   deparse_expr_cxt *context);
+static const char *get_jointype_name(JoinType jointype);
+
+/*
+ * convert absolute attnum to relative one.  This would be handy for handling
+ * attnum for attrs_used and column aliases.
+ */
+#define TO_RELATIVE(x)	((x) - FirstLowInvalidHeapAttributeNumber)
 
 
 /*
@@ -143,6 +155,7 @@ static void printRemotePlaceholder(Oid paramtype, int32 paramtypmod,
  * which are returned as two lists:
  *	- remote_conds contains expressions that can be evaluated remotely
  *	- local_conds contains expressions that can't be evaluated remotely
+ * Note that each element is an Expr extracted from its RestrictInfo.
  */
 void
 classifyConditions(PlannerInfo *root,
@@ -161,9 +174,9 @@ classifyConditions(PlannerInfo *root,
 		RestrictInfo *ri = (RestrictInfo *) lfirst(lc);
 
 		if (is_foreign_expr(root, baserel, ri->clause))
-			*remote_conds = lappend(*remote_conds, ri);
+			*remote_conds = lappend(*remote_conds, ri->clause);
 		else
-			*local_conds = lappend(*local_conds, ri);
+			*local_conds = lappend(*local_conds, ri->clause);
 	}
 }
 
@@ -250,7 +263,7 @@ foreign_expr_walker(Node *node,
 				 * Param's collation, ie it's not safe for it to have a
 				 * non-default collation.
 				 */
-				if (var->varno == glob_cxt->foreignrel->relid &&
+				if (bms_is_member(var->varno, glob_cxt->foreignrel->relids) &&
 					var->varlevelsup == 0)
 				{
 					/* Var belongs to foreign table */
@@ -675,18 +688,83 @@ is_builtin(Oid oid)
  *
  * We also create an integer List of the columns being retrieved, which is
  * returned to *retrieved_attrs.
+ *
+ * The relations is a string buffer for "Relations" portion of EXPLAIN output,
+ * or NULL if caller doesn't need it.  Note that it should have been
+ * initialized by caller.
  */
 void
 deparseSelectSql(StringInfo buf,
 				 PlannerInfo *root,
 				 RelOptInfo *baserel,
 				 Bitmapset *attrs_used,
-				 List **retrieved_attrs)
+				 List *remote_conds,
+				 List **params_list,
+				 List **fdw_ps_tlist,
+				 List **retrieved_attrs,
+				 StringInfo relations)
 {
+	PgFdwRelationInfo  *fpinfo = (PgFdwRelationInfo *) baserel->fdw_private;
 	RangeTblEntry *rte = planner_rt_fetch(baserel->relid, root);
 	Relation	rel;
 
 	/*
+	 * If given relation was a join relation, recursively construct statement
+	 * by putting each outer and inner relations in FROM clause as a subquery
+	 * with aliasing.
+	 */
+	if (baserel->reloptkind == RELOPT_JOINREL)
+	{
+		RelOptInfo		   *rel_o = fpinfo->outerrel;
+		RelOptInfo		   *rel_i = fpinfo->innerrel;
+		PgFdwRelationInfo  *fpinfo_o = (PgFdwRelationInfo *) rel_o->fdw_private;
+		PgFdwRelationInfo  *fpinfo_i = (PgFdwRelationInfo *) rel_i->fdw_private;
+		StringInfoData		sql_o;
+		StringInfoData		sql_i;
+		List			   *ret_attrs_tmp;	/* not used */
+		StringInfoData		relations_o;
+		StringInfoData		relations_i;
+		const char		   *jointype_str;
+
+		/*
+		 * Deparse query for outer and inner relation, and combine them into
+		 * a query.
+		 *
+		 * Here we don't pass fdw_ps_tlist because targets of underlying
+		 * relations are already put in joinrel->reltargetlist, and
+		 * deparseJoinRel() takes all care about it.
+		 */
+		initStringInfo(&sql_o);
+		initStringInfo(&relations_o);
+		deparseSelectSql(&sql_o, root, rel_o, fpinfo_o->attrs_used,
+						 fpinfo_o->remote_conds, params_list,
+						 NULL, &ret_attrs_tmp, &relations_o);
+		initStringInfo(&sql_i);
+		initStringInfo(&relations_i);
+		deparseSelectSql(&sql_i, root, rel_i, fpinfo_i->attrs_used,
+						 fpinfo_i->remote_conds, params_list,
+						 NULL, &ret_attrs_tmp, &relations_i);
+
+		/* For EXPLAIN output */
+		jointype_str = get_jointype_name(fpinfo->jointype);
+		if (relations)
+			appendStringInfo(relations, "(%s) %s JOIN (%s)",
+							 relations_o.data, jointype_str, relations_i.data);
+
+		deparseJoinSql(buf, root, baserel,
+					   fpinfo->outerrel,
+					   fpinfo->innerrel,
+					   sql_o.data,
+					   sql_i.data,
+					   fpinfo->jointype,
+					   fpinfo->joinclauses,
+					   fpinfo->otherclauses,
+					   fdw_ps_tlist,
+					   retrieved_attrs);
+		return;
+	}
+
+	/*
 	 * Core code already has some lock on each rel being planned, so we can
 	 * use NoLock here.
 	 */
@@ -705,6 +783,87 @@ deparseSelectSql(StringInfo buf,
 	appendStringInfoString(buf, " FROM ");
 	deparseRelation(buf, rel);
 
+	/*
+	 * Return local relation name for EXPLAIN output.
+	 * We can't know whether the VERBOSE option is specified, so always add the
+	 * schema name.
+	 */
+	if (relations)
+	{
+		const char	   *namespace;
+		const char	   *relname;
+		const char	   *refname;
+
+		namespace = get_namespace_name(get_rel_namespace(rte->relid));
+		relname = get_rel_name(rte->relid);
+		refname = rte->eref->aliasname;
+		appendStringInfo(relations, "%s.%s",
+						 quote_identifier(namespace),
+						 quote_identifier(relname));
+		if (*refname && strcmp(refname, relname) != 0)
+			appendStringInfo(relations, " %s",
+							 quote_identifier(rte->eref->aliasname));
+	}
+
+	/*
+	 * Construct WHERE clause
+	 */
+	if (remote_conds)
+		appendConditions(buf, root, baserel, NULL, NULL, remote_conds,
+						 " WHERE ", params_list);
+
+	/*
+	 * Add FOR UPDATE/SHARE if appropriate.  We apply locking during the
+	 * initial row fetch, rather than later on as is done for local tables.
+	 * The extra roundtrips involved in trying to duplicate the local
+	 * semantics exactly don't seem worthwhile (see also comments for
+	 * RowMarkType).
+	 *
+	 * Note: because we actually run the query as a cursor, this assumes
+	 * that DECLARE CURSOR ... FOR UPDATE is supported, which it isn't
+	 * before 8.3.
+	 */
+	if (baserel->relid == root->parse->resultRelation &&
+		(root->parse->commandType == CMD_UPDATE ||
+		 root->parse->commandType == CMD_DELETE))
+	{
+		/* Relation is UPDATE/DELETE target, so use FOR UPDATE */
+		appendStringInfoString(buf, " FOR UPDATE");
+	}
+	else
+	{
+		PlanRowMark *rc = get_plan_rowmark(root->rowMarks, baserel->relid);
+
+		if (rc)
+		{
+			/*
+			 * Relation is specified as a FOR UPDATE/SHARE target, so handle
+			 * that.  (But we could also see LCS_NONE, meaning this isn't a
+			 * target relation after all.)
+			 *
+			 * For now, just ignore any [NO] KEY specification, since (a)
+			 * it's not clear what that means for a remote table that we
+			 * don't have complete information about, and (b) it wouldn't
+			 * work anyway on older remote servers.  Likewise, we don't
+			 * worry about NOWAIT.
+			 */
+			switch (rc->strength)
+			{
+				case LCS_NONE:
+					/* No locking needed */
+					break;
+				case LCS_FORKEYSHARE:
+				case LCS_FORSHARE:
+					appendStringInfoString(buf, " FOR SHARE");
+					break;
+				case LCS_FORNOKEYUPDATE:
+				case LCS_FORUPDATE:
+					appendStringInfoString(buf, " FOR UPDATE");
+					break;
+			}
+		}
+	}
+
 	heap_close(rel, NoLock);
 }
 
@@ -731,8 +890,7 @@ deparseTargetList(StringInfo buf,
 	*retrieved_attrs = NIL;
 
 	/* If there's a whole-row reference, we'll need all the columns. */
-	have_wholerow = bms_is_member(0 - FirstLowInvalidHeapAttributeNumber,
-								  attrs_used);
+	have_wholerow = bms_is_member(TO_RELATIVE(0), attrs_used);
 
 	first = true;
 	for (i = 1; i <= tupdesc->natts; i++)
@@ -743,15 +901,14 @@ deparseTargetList(StringInfo buf,
 		if (attr->attisdropped)
 			continue;
 
-		if (have_wholerow ||
-			bms_is_member(i - FirstLowInvalidHeapAttributeNumber,
-						  attrs_used))
+		if (have_wholerow || bms_is_member(TO_RELATIVE(i), attrs_used))
 		{
 			if (!first)
 				appendStringInfoString(buf, ", ");
 			first = false;
 
 			deparseColumnRef(buf, rtindex, i, root);
+			appendStringInfo(buf, " a%d", TO_RELATIVE(i));
 
 			*retrieved_attrs = lappend_int(*retrieved_attrs, i);
 		}
@@ -761,17 +918,17 @@ deparseTargetList(StringInfo buf,
 	 * Add ctid if needed.  We currently don't support retrieving any other
 	 * system columns.
 	 */
-	if (bms_is_member(SelfItemPointerAttributeNumber - FirstLowInvalidHeapAttributeNumber,
-					  attrs_used))
+	if (bms_is_member(TO_RELATIVE(SelfItemPointerAttributeNumber), attrs_used))
 	{
 		if (!first)
 			appendStringInfoString(buf, ", ");
 		first = false;
 
-		appendStringInfoString(buf, "ctid");
+		appendStringInfo(buf, "ctid a%d",
+						 TO_RELATIVE(SelfItemPointerAttributeNumber));
 
 		*retrieved_attrs = lappend_int(*retrieved_attrs,
-									   SelfItemPointerAttributeNumber);
+										   SelfItemPointerAttributeNumber);
 	}
 
 	/* Don't generate bad syntax if no undropped columns */
@@ -780,7 +937,8 @@ deparseTargetList(StringInfo buf,
 }
 
 /*
- * Deparse WHERE clauses in given list of RestrictInfos and append them to buf.
+ * Deparse conditions, such as WHERE clause and ON clause of JOIN, in given
+ * list of Expr and append them to buf.
  *
  * baserel is the foreign table we're planning for.
  *
@@ -794,12 +952,14 @@ deparseTargetList(StringInfo buf,
  * so Params and other-relation Vars should be replaced by dummy values.
  */
 void
-appendWhereClause(StringInfo buf,
-				  PlannerInfo *root,
-				  RelOptInfo *baserel,
-				  List *exprs,
-				  bool is_first,
-				  List **params)
+appendConditions(StringInfo buf,
+				 PlannerInfo *root,
+				 RelOptInfo *baserel,
+				 List *outertlist,
+				 List *innertlist,
+				 List *exprs,
+				 const char *prefix,
+				 List **params)
 {
 	deparse_expr_cxt context;
 	int			nestlevel;
@@ -813,31 +973,315 @@ appendWhereClause(StringInfo buf,
 	context.foreignrel = baserel;
 	context.buf = buf;
 	context.params_list = params;
+	context.outertlist = outertlist;
+	context.innertlist = innertlist;
 
 	/* Make sure any constants in the exprs are printed portably */
 	nestlevel = set_transmission_modes();
 
 	foreach(lc, exprs)
 	{
-		RestrictInfo *ri = (RestrictInfo *) lfirst(lc);
+		Expr	   *expr = (Expr *) lfirst(lc);
 
 		/* Connect expressions with "AND" and parenthesize each condition. */
-		if (is_first)
-			appendStringInfoString(buf, " WHERE ");
-		else
-			appendStringInfoString(buf, " AND ");
+		if (prefix)
+			appendStringInfo(buf, "%s", prefix);
 
 		appendStringInfoChar(buf, '(');
-		deparseExpr(ri->clause, &context);
+		deparseExpr(expr, &context);
 		appendStringInfoChar(buf, ')');
 
-		is_first = false;
+		prefix= " AND ";
 	}
 
 	reset_transmission_modes(nestlevel);
 }
 
 /*
+ * Returns position index (start with 1) of given var in given target list, or
+ * 0 when not found.
+ */
+static int
+find_var_pos(Var *node, List *tlist)
+{
+	int		pos = 1;
+	ListCell *lc;
+
+	foreach(lc, tlist)
+	{
+		Var *var = (Var *) lfirst(lc);
+
+		if (equal(var, node))
+		{
+			return pos;
+		}
+		pos++;
+	}
+
+	return 0;
+}
+
+/*
+ * Deparse given Var into buf.
+ */
+static void
+deparseJoinVar(Var *node, deparse_expr_cxt *context)
+{
+	char		side;
+	int			pos;
+
+	pos = find_var_pos(node, context->outertlist);
+	if (pos > 0)
+		side = 'l';
+	else
+	{
+		side = 'r';
+		pos = find_var_pos(node, context->innertlist);
+	}
+	Assert(pos > 0);
+	Assert(side == 'l' || side == 'r');
+
+	/*
+	 * We treat whole-row reference same as ordinary attribute references,
+	 * because such transformation should be done in lower level.
+	 */
+	appendStringInfo(context->buf, "%c.a%d", side, pos);
+}
+
+/*
+ * Deparse column alias list for a subquery in FROM clause.
+ */
+static void
+deparseColumnAliases(StringInfo buf, List *tlist)
+{
+	int			pos;
+	ListCell   *lc;
+
+	pos = 1;
+	foreach(lc, tlist)
+	{
+		/* Deparse column alias for the subquery */
+		if (pos > 1)
+			appendStringInfoString(buf, ", ");
+		appendStringInfo(buf, "a%d", pos);
+		pos++;
+	}
+}
+
+/*
+ * Deparse "wrapper" SQL for a query which projects target lists in proper
+ * order and contents.  Note that this treatment is necessary only for queries
+ * used in FROM clause of a join query.
+ *
+ * Even if the SQL is simple enough (no ctid, no whole-row reference), the order
+ * of output columns might differ from the underlying scan, so we always need to
+ * wrap the queries for join sources.
+ *
+ */
+static const char *
+deparseProjectionSql(PlannerInfo *root,
+					 RelOptInfo *baserel,
+					 const char *sql,
+					 char side)
+{
+	StringInfoData wholerow;
+	StringInfoData buf;
+	ListCell   *lc;
+	bool		first;
+	bool		have_wholerow = false;
+
+	/*
+	 * We have nothing to do if the targetlist contains no special reference,
+	 * such as whole-row and ctid.
+	 */
+	foreach(lc, baserel->reltargetlist)
+	{
+		Var		   *var = (Var *) lfirst(lc);
+		if (var->varattno == 0)
+		{
+			have_wholerow = true;
+			break;
+		}
+	}
+
+	/*
+	 * Construct whole-row reference with ROW() syntax
+	 */
+	if (have_wholerow)
+	{
+		RangeTblEntry *rte;
+		Relation		rel;
+		TupleDesc		tupdesc;
+		int				i;
+
+		/* Obtain TupleDesc for deparsing all valid columns */
+		rte = planner_rt_fetch(baserel->relid, root);
+		rel = heap_open(rte->relid, NoLock);
+		tupdesc = rel->rd_att;
+
+		/* Print all valid columns in ROW() to generate whole-row value */
+		initStringInfo(&wholerow);
+		appendStringInfoString(&wholerow, "ROW(");
+		first = true;
+		for (i = 1; i <= tupdesc->natts; i++)
+		{
+			Form_pg_attribute attr = tupdesc->attrs[i - 1];
+
+			/* Ignore dropped columns. */
+			if (attr->attisdropped)
+				continue;
+
+			if (!first)
+				appendStringInfoString(&wholerow, ", ");
+			first = false;
+
+			appendStringInfo(&wholerow, "%c.a%d", side, TO_RELATIVE(i));
+		}
+		appendStringInfoString(&wholerow, ")");
+
+		heap_close(rel, NoLock);
+	}
+
+	/*
+	 * Construct a SELECT statement which has the original query in its FROM
+	 * clause, and have target list entries in its SELECT clause.  The number
+	 * used in column aliases are attnum - FirstLowInvalidHeapAttributeNumber in
+	 * order to make all numbers positive even for system columns which have
+	 * minus value as attnum.
+	 */
+	initStringInfo(&buf);
+	appendStringInfoString(&buf, "SELECT ");
+	first = true;
+	foreach(lc, baserel->reltargetlist)
+	{
+		Var *var = (Var *) lfirst(lc);
+
+		if (!first)
+			appendStringInfoString(&buf, ", ");
+	
+		if (var->varattno == 0)
+			appendStringInfo(&buf, "%s", wholerow.data);
+		else
+			appendStringInfo(&buf, "%c.a%d", side, TO_RELATIVE(var->varattno));
+
+		first = false;
+	}
+	appendStringInfo(&buf, " FROM (%s) %c", sql, side);
+
+	return buf.data;
+}
+
+static const char *
+get_jointype_name(JoinType jointype)
+{
+	return jointype == JOIN_INNER ? "INNER" :
+		   jointype == JOIN_LEFT ? "LEFT" :
+		   jointype == JOIN_RIGHT ? "RIGHT" :
+		   jointype == JOIN_FULL ? "FULL" : "";
+}
+
+/*
+ * Construct a SELECT statement which contains join clause.
+ *
+ * We also create a TargetEntry List of the columns being retrieved, which is
+ * returned to *fdw_ps_tlist.
+ *
+ * outerrel and sql_o are respectively the RelOptInfo and the remote query
+ * statement of the outer child relation; innerrel and the suffix _i mean the
+ * same for the inner child relation.  jointype, joinclauses, and otherclauses
+ * describe the join.  fdw_ps_tlist is an output parameter that passes the
+ * target list of the pseudo scan back to the caller.
+ */
+void
+deparseJoinSql(StringInfo buf,
+			   PlannerInfo *root,
+			   RelOptInfo *baserel,
+			   RelOptInfo *outerrel,
+			   RelOptInfo *innerrel,
+			   const char *sql_o,
+			   const char *sql_i,
+			   JoinType jointype,
+			   List *joinclauses,
+			   List *otherclauses,
+			   List **fdw_ps_tlist,
+			   List **retrieved_attrs)
+{
+	StringInfoData selbuf;		/* buffer for SELECT clause */
+	StringInfoData abuf_o;		/* buffer for column alias list of outer */
+	StringInfoData abuf_i;		/* buffer for column alias list of inner */
+	int			i;
+	ListCell   *lc;
+	const char *jointype_str;
+	deparse_expr_cxt context;
+
+	context.root = root;
+	context.foreignrel = baserel;
+	context.buf = &selbuf;
+	context.params_list = NULL;
+	context.outertlist = outerrel->reltargetlist;
+	context.innertlist = innerrel->reltargetlist;
+
+	jointype_str = get_jointype_name(jointype);
+	*retrieved_attrs = NIL;
+
+	/* print SELECT clause of the join scan */
+	initStringInfo(&selbuf);
+	i = 0;
+	foreach(lc, baserel->reltargetlist)
+	{
+		Var		   *var = (Var *) lfirst(lc);
+		TargetEntry *tle;
+
+		if (i > 0)
+			appendStringInfoString(&selbuf, ", ");
+		deparseJoinVar(var, &context);
+
+		tle = makeTargetEntry((Expr *) var, i + 1, NULL, false);
+		if (fdw_ps_tlist)
+			*fdw_ps_tlist = lappend(*fdw_ps_tlist, tle);
+
+		*retrieved_attrs = lappend_int(*retrieved_attrs, i + 1);
+
+		i++;
+	}
+	if (i == 0)
+		appendStringInfoString(&selbuf, "NULL");
+
+	/*
+	 * Do pseudo-projection for an underlying scan on a foreign table, if a) the
+	 * relation is a base relation, and b) its targetlist contains whole-row
+	 * reference.
+	 */
+	if (outerrel->reloptkind == RELOPT_BASEREL)
+		sql_o = deparseProjectionSql(root, outerrel, sql_o, 'l');
+	if (innerrel->reloptkind == RELOPT_BASEREL)
+		sql_i = deparseProjectionSql(root, innerrel, sql_i, 'r');
+
+	/* Deparse column alias portion of subquery in FROM clause. */
+	initStringInfo(&abuf_o);
+	deparseColumnAliases(&abuf_o, outerrel->reltargetlist);
+	initStringInfo(&abuf_i);
+	deparseColumnAliases(&abuf_i, innerrel->reltargetlist);
+
+	/* Construct SELECT statement */
+	appendStringInfo(buf, "SELECT %s FROM", selbuf.data);
+	appendStringInfo(buf, " (%s) l (%s) %s JOIN (%s) r (%s)",
+					 sql_o, abuf_o.data, jointype_str, sql_i, abuf_i.data);
+	/* Append ON clause */
+	if (joinclauses)
+		appendConditions(buf, root, baserel,
+						 outerrel->reltargetlist, innerrel->reltargetlist,
+						 joinclauses,
+						 " ON ", NULL);
+	/* Append WHERE clause */
+	if (otherclauses)
+		appendConditions(buf, root, baserel,
+						 outerrel->reltargetlist, innerrel->reltargetlist,
+						 otherclauses,
+						 " WHERE ", NULL);
+}
+
+/*
  * deparse remote INSERT statement
  *
  * The statement text is appended to buf, and we also create an integer List
@@ -976,8 +1420,7 @@ deparseReturningList(StringInfo buf, PlannerInfo *root,
 	if (trig_after_row)
 	{
 		/* whole-row reference acquires all non-system columns */
-		attrs_used =
-			bms_make_singleton(0 - FirstLowInvalidHeapAttributeNumber);
+		attrs_used = bms_make_singleton(TO_RELATIVE(0));
 	}
 
 	if (returningList != NIL)
@@ -1261,6 +1704,8 @@ deparseExpr(Expr *node, deparse_expr_cxt *context)
 /*
  * Deparse given Var node into context->buf.
  *
+ * If the foreign relation is a join relation, deparseJoinVar() handles the Var.
+ *
  * If the Var belongs to the foreign relation, just print its remote name.
  * Otherwise, it's effectively a Param (and will in fact be a Param at
  * run time).  Handle it the same way we handle plain Params --- see
@@ -1271,39 +1716,46 @@ deparseVar(Var *node, deparse_expr_cxt *context)
 {
 	StringInfo	buf = context->buf;
 
-	if (node->varno == context->foreignrel->relid &&
-		node->varlevelsup == 0)
+	if (context->foreignrel->reloptkind == RELOPT_JOINREL)
 	{
-		/* Var belongs to foreign table */
-		deparseColumnRef(buf, node->varno, node->varattno, context->root);
+		deparseJoinVar(node, context);
 	}
 	else
 	{
-		/* Treat like a Param */
-		if (context->params_list)
+		if (node->varno == context->foreignrel->relid &&
+			node->varlevelsup == 0)
 		{
-			int			pindex = 0;
-			ListCell   *lc;
-
-			/* find its index in params_list */
-			foreach(lc, *context->params_list)
+			/* Var belongs to foreign table */
+			deparseColumnRef(buf, node->varno, node->varattno, context->root);
+		}
+		else
+		{
+			/* Treat like a Param */
+			if (context->params_list)
 			{
-				pindex++;
-				if (equal(node, (Node *) lfirst(lc)))
-					break;
+				int			pindex = 0;
+				ListCell   *lc;
+
+				/* find its index in params_list */
+				foreach(lc, *context->params_list)
+				{
+					pindex++;
+					if (equal(node, (Node *) lfirst(lc)))
+						break;
+				}
+				if (lc == NULL)
+				{
+					/* not in list, so add it */
+					pindex++;
+					*context->params_list = lappend(*context->params_list, node);
+				}
+
+				printRemoteParam(pindex, node->vartype, node->vartypmod, context);
 			}
-			if (lc == NULL)
+			else
 			{
-				/* not in list, so add it */
-				pindex++;
-				*context->params_list = lappend(*context->params_list, node);
+				printRemotePlaceholder(node->vartype, node->vartypmod, context);
 			}
-
-			printRemoteParam(pindex, node->vartype, node->vartypmod, context);
-		}
-		else
-		{
-			printRemotePlaceholder(node->vartype, node->vartypmod, context);
 		}
 	}
 }
diff --git a/contrib/postgres_fdw/expected/postgres_fdw.out b/contrib/postgres_fdw/expected/postgres_fdw.out
index 783cb41..58f24c0 100644
--- a/contrib/postgres_fdw/expected/postgres_fdw.out
+++ b/contrib/postgres_fdw/expected/postgres_fdw.out
@@ -9,11 +9,16 @@ DO $d$
             OPTIONS (dbname '$$||current_database()||$$',
                      port '$$||current_setting('port')||$$'
             )$$;
+        EXECUTE $$CREATE SERVER loopback2 FOREIGN DATA WRAPPER postgres_fdw
+            OPTIONS (dbname '$$||current_database()||$$',
+                     port '$$||current_setting('port')||$$'
+            )$$;
     END;
 $d$;
 CREATE USER MAPPING FOR public SERVER testserver1
 	OPTIONS (user 'value', password 'value');
 CREATE USER MAPPING FOR CURRENT_USER SERVER loopback;
+CREATE USER MAPPING FOR CURRENT_USER SERVER loopback2;
 -- ===================================================================
 -- create objects used through FDW loopback server
 -- ===================================================================
@@ -35,6 +40,18 @@ CREATE TABLE "S 1"."T 2" (
 	c2 text,
 	CONSTRAINT t2_pkey PRIMARY KEY (c1)
 );
+CREATE TABLE "S 1"."T 3" (
+	c1 int NOT NULL,
+	c2 int NOT NULL,
+	c3 text,
+	CONSTRAINT t3_pkey PRIMARY KEY (c1)
+);
+CREATE TABLE "S 1"."T 4" (
+	c1 int NOT NULL,
+	c2 int NOT NULL,
+	c4 text,
+	CONSTRAINT t4_pkey PRIMARY KEY (c1)
+);
 INSERT INTO "S 1"."T 1"
 	SELECT id,
 	       id % 10,
@@ -49,8 +66,22 @@ INSERT INTO "S 1"."T 2"
 	SELECT id,
 	       'AAA' || to_char(id, 'FM000')
 	FROM generate_series(1, 100) id;
+INSERT INTO "S 1"."T 3"
+	SELECT id,
+	       id + 1,
+	       'AAA' || to_char(id, 'FM000')
+	FROM generate_series(1, 100) id;
+DELETE FROM "S 1"."T 3" WHERE c1 % 2 != 0;	-- delete for outer join tests
+INSERT INTO "S 1"."T 4"
+	SELECT id,
+	       id + 1,
+	       'AAA' || to_char(id, 'FM000')
+	FROM generate_series(1, 100) id;
+DELETE FROM "S 1"."T 4" WHERE c1 % 3 != 0;	-- delete for outer join tests
 ANALYZE "S 1"."T 1";
 ANALYZE "S 1"."T 2";
+ANALYZE "S 1"."T 3";
+ANALYZE "S 1"."T 4";
 -- ===================================================================
 -- create foreign tables
 -- ===================================================================
@@ -78,6 +109,26 @@ CREATE FOREIGN TABLE ft2 (
 	c8 user_enum
 ) SERVER loopback;
 ALTER FOREIGN TABLE ft2 DROP COLUMN cx;
+CREATE FOREIGN TABLE ft4 (
+	c1 int NOT NULL,
+	c2 int NOT NULL,
+	c3 text
+) SERVER loopback OPTIONS (schema_name 'S 1', table_name 'T 3');
+CREATE FOREIGN TABLE ft5 (
+	c1 int NOT NULL,
+	c2 int NOT NULL,
+	c3 text
+) SERVER loopback OPTIONS (schema_name 'S 1', table_name 'T 4');
+CREATE FOREIGN TABLE ft6 (
+	c1 int NOT NULL,
+	c2 int NOT NULL,
+	c3 text
+) SERVER loopback2 OPTIONS (schema_name 'S 1', table_name 'T 4');
+CREATE USER view_owner;
+GRANT ALL ON ft5 TO view_owner;
+CREATE VIEW v_ft5 AS SELECT * FROM ft5;
+ALTER VIEW v_ft5 OWNER TO view_owner;
+CREATE USER MAPPING FOR view_owner SERVER loopback;
 -- ===================================================================
 -- tests for validator
 -- ===================================================================
@@ -119,12 +170,15 @@ ALTER FOREIGN TABLE ft2 OPTIONS (schema_name 'S 1', table_name 'T 1');
 ALTER FOREIGN TABLE ft1 ALTER COLUMN c1 OPTIONS (column_name 'C 1');
 ALTER FOREIGN TABLE ft2 ALTER COLUMN c1 OPTIONS (column_name 'C 1');
 \det+
-                             List of foreign tables
- Schema | Table |  Server  |              FDW Options              | Description 
---------+-------+----------+---------------------------------------+-------------
- public | ft1   | loopback | (schema_name 'S 1', table_name 'T 1') | 
- public | ft2   | loopback | (schema_name 'S 1', table_name 'T 1') | 
-(2 rows)
+                              List of foreign tables
+ Schema | Table |  Server   |              FDW Options              | Description 
+--------+-------+-----------+---------------------------------------+-------------
+ public | ft1   | loopback  | (schema_name 'S 1', table_name 'T 1') | 
+ public | ft2   | loopback  | (schema_name 'S 1', table_name 'T 1') | 
+ public | ft4   | loopback  | (schema_name 'S 1', table_name 'T 3') | 
+ public | ft5   | loopback  | (schema_name 'S 1', table_name 'T 4') | 
+ public | ft6   | loopback2 | (schema_name 'S 1', table_name 'T 4') | 
+(5 rows)
 
 -- Now we should be able to run ANALYZE.
 -- To exercise multiple code paths, we use local stats on ft1
@@ -160,8 +214,8 @@ SELECT * FROM ft1 ORDER BY c3, c1 OFFSET 100 LIMIT 10;
 (10 rows)
 
 EXPLAIN (VERBOSE, COSTS false) SELECT * FROM ft1 t1 ORDER BY t1.c3, t1.c1 OFFSET 100 LIMIT 10;
-                                     QUERY PLAN                                      
--------------------------------------------------------------------------------------
+                                                     QUERY PLAN                                                      
+---------------------------------------------------------------------------------------------------------------------
  Limit
    Output: c1, c2, c3, c4, c5, c6, c7, c8
    ->  Sort
@@ -169,7 +223,7 @@ EXPLAIN (VERBOSE, COSTS false) SELECT * FROM ft1 t1 ORDER BY t1.c3, t1.c1 OFFSET
          Sort Key: t1.c3, t1.c1
          ->  Foreign Scan on public.ft1 t1
                Output: c1, c2, c3, c4, c5, c6, c7, c8
-               Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1"
+               Remote SQL: SELECT "C 1" a10, c2 a11, c3 a12, c4 a13, c5 a14, c6 a15, c7 a16, c8 a17 FROM "S 1"."T 1"
 (8 rows)
 
 SELECT * FROM ft1 t1 ORDER BY t1.c3, t1.c1 OFFSET 100 LIMIT 10;
@@ -189,8 +243,8 @@ SELECT * FROM ft1 t1 ORDER BY t1.c3, t1.c1 OFFSET 100 LIMIT 10;
 
 -- whole-row reference
 EXPLAIN (VERBOSE, COSTS false) SELECT t1 FROM ft1 t1 ORDER BY t1.c3, t1.c1 OFFSET 100 LIMIT 10;
-                                     QUERY PLAN                                      
--------------------------------------------------------------------------------------
+                                                     QUERY PLAN                                                      
+---------------------------------------------------------------------------------------------------------------------
  Limit
    Output: t1.*, c3, c1
    ->  Sort
@@ -198,7 +252,7 @@ EXPLAIN (VERBOSE, COSTS false) SELECT t1 FROM ft1 t1 ORDER BY t1.c3, t1.c1 OFFSE
          Sort Key: t1.c3, t1.c1
          ->  Foreign Scan on public.ft1 t1
                Output: t1.*, c3, c1
-               Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1"
+               Remote SQL: SELECT "C 1" a10, c2 a11, c3 a12, c4 a13, c5 a14, c6 a15, c7 a16, c8 a17 FROM "S 1"."T 1"
 (8 rows)
 
 SELECT t1 FROM ft1 t1 ORDER BY t1.c3, t1.c1 OFFSET 100 LIMIT 10;
@@ -224,11 +278,11 @@ SELECT * FROM ft1 WHERE false;
 
 -- with WHERE clause
 EXPLAIN (VERBOSE, COSTS false) SELECT * FROM ft1 t1 WHERE t1.c1 = 101 AND t1.c6 = '1' AND t1.c7 >= '1';
-                                                                   QUERY PLAN                                                                   
-------------------------------------------------------------------------------------------------------------------------------------------------
+                                                                                   QUERY PLAN                                                                                   
+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
  Foreign Scan on public.ft1 t1
    Output: c1, c2, c3, c4, c5, c6, c7, c8
-   Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" WHERE ((c7 >= '1'::bpchar)) AND (("C 1" = 101)) AND ((c6 = '1'::text))
+   Remote SQL: SELECT "C 1" a10, c2 a11, c3 a12, c4 a13, c5 a14, c6 a15, c7 a16, c8 a17 FROM "S 1"."T 1" WHERE ((c7 >= '1'::bpchar)) AND (("C 1" = 101)) AND ((c6 = '1'::text))
 (3 rows)
 
 SELECT * FROM ft1 t1 WHERE t1.c1 = 101 AND t1.c6 = '1' AND t1.c7 >= '1';
@@ -239,13 +293,13 @@ SELECT * FROM ft1 t1 WHERE t1.c1 = 101 AND t1.c6 = '1' AND t1.c7 >= '1';
 
 -- with FOR UPDATE/SHARE
 EXPLAIN (VERBOSE, COSTS false) SELECT * FROM ft1 t1 WHERE c1 = 101 FOR UPDATE;
-                                                   QUERY PLAN                                                   
-----------------------------------------------------------------------------------------------------------------
+                                                                   QUERY PLAN                                                                   
+------------------------------------------------------------------------------------------------------------------------------------------------
  LockRows
    Output: c1, c2, c3, c4, c5, c6, c7, c8, t1.*
    ->  Foreign Scan on public.ft1 t1
          Output: c1, c2, c3, c4, c5, c6, c7, c8, t1.*
-         Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" WHERE (("C 1" = 101)) FOR UPDATE
+         Remote SQL: SELECT "C 1" a10, c2 a11, c3 a12, c4 a13, c5 a14, c6 a15, c7 a16, c8 a17 FROM "S 1"."T 1" WHERE (("C 1" = 101)) FOR UPDATE
 (5 rows)
 
 SELECT * FROM ft1 t1 WHERE c1 = 101 FOR UPDATE;
@@ -255,13 +309,13 @@ SELECT * FROM ft1 t1 WHERE c1 = 101 FOR UPDATE;
 (1 row)
 
 EXPLAIN (VERBOSE, COSTS false) SELECT * FROM ft1 t1 WHERE c1 = 102 FOR SHARE;
-                                                  QUERY PLAN                                                   
----------------------------------------------------------------------------------------------------------------
+                                                                  QUERY PLAN                                                                   
+-----------------------------------------------------------------------------------------------------------------------------------------------
  LockRows
    Output: c1, c2, c3, c4, c5, c6, c7, c8, t1.*
    ->  Foreign Scan on public.ft1 t1
          Output: c1, c2, c3, c4, c5, c6, c7, c8, t1.*
-         Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" WHERE (("C 1" = 102)) FOR SHARE
+         Remote SQL: SELECT "C 1" a10, c2 a11, c3 a12, c4 a13, c5 a14, c6 a15, c7 a16, c8 a17 FROM "S 1"."T 1" WHERE (("C 1" = 102)) FOR SHARE
 (5 rows)
 
 SELECT * FROM ft1 t1 WHERE c1 = 102 FOR SHARE;
@@ -277,22 +331,6 @@ SELECT COUNT(*) FROM ft1 t1;
   1000
 (1 row)
 
--- join two tables
-SELECT t1.c1 FROM ft1 t1 JOIN ft2 t2 ON (t1.c1 = t2.c1) ORDER BY t1.c3, t1.c1 OFFSET 100 LIMIT 10;
- c1  
------
- 101
- 102
- 103
- 104
- 105
- 106
- 107
- 108
- 109
- 110
-(10 rows)
-
 -- subquery
 SELECT * FROM ft1 t1 WHERE t1.c3 IN (SELECT c3 FROM ft2 t2 WHERE c1 <= 10) ORDER BY c1;
  c1 | c2 |  c3   |              c4              |            c5            | c6 |     c7     | c8  
@@ -353,153 +391,149 @@ CREATE OPERATOR === (
     NEGATOR = !==
 );
 EXPLAIN (VERBOSE, COSTS false) SELECT * FROM ft1 t1 WHERE t1.c1 = postgres_fdw_abs(t1.c2);
-                               QUERY PLAN                                
--------------------------------------------------------------------------
+                                               QUERY PLAN                                                
+---------------------------------------------------------------------------------------------------------
  Foreign Scan on public.ft1 t1
    Output: c1, c2, c3, c4, c5, c6, c7, c8
    Filter: (t1.c1 = postgres_fdw_abs(t1.c2))
-   Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1"
+   Remote SQL: SELECT "C 1" a10, c2 a11, c3 a12, c4 a13, c5 a14, c6 a15, c7 a16, c8 a17 FROM "S 1"."T 1"
 (4 rows)
 
 EXPLAIN (VERBOSE, COSTS false) SELECT * FROM ft1 t1 WHERE t1.c1 === t1.c2;
-                               QUERY PLAN                                
--------------------------------------------------------------------------
+                                               QUERY PLAN                                                
+---------------------------------------------------------------------------------------------------------
  Foreign Scan on public.ft1 t1
    Output: c1, c2, c3, c4, c5, c6, c7, c8
    Filter: (t1.c1 === t1.c2)
-   Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1"
+   Remote SQL: SELECT "C 1" a10, c2 a11, c3 a12, c4 a13, c5 a14, c6 a15, c7 a16, c8 a17 FROM "S 1"."T 1"
 (4 rows)
 
 EXPLAIN (VERBOSE, COSTS false) SELECT * FROM ft1 t1 WHERE t1.c1 = abs(t1.c2);
-                                            QUERY PLAN                                             
----------------------------------------------------------------------------------------------------
+                                                            QUERY PLAN                                                             
+-----------------------------------------------------------------------------------------------------------------------------------
  Foreign Scan on public.ft1 t1
    Output: c1, c2, c3, c4, c5, c6, c7, c8
-   Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" WHERE (("C 1" = abs(c2)))
+   Remote SQL: SELECT "C 1" a10, c2 a11, c3 a12, c4 a13, c5 a14, c6 a15, c7 a16, c8 a17 FROM "S 1"."T 1" WHERE (("C 1" = abs(c2)))
 (3 rows)
 
 EXPLAIN (VERBOSE, COSTS false) SELECT * FROM ft1 t1 WHERE t1.c1 = t1.c2;
-                                          QUERY PLAN                                          
-----------------------------------------------------------------------------------------------
+                                                          QUERY PLAN                                                          
+------------------------------------------------------------------------------------------------------------------------------
  Foreign Scan on public.ft1 t1
    Output: c1, c2, c3, c4, c5, c6, c7, c8
-   Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" WHERE (("C 1" = c2))
+   Remote SQL: SELECT "C 1" a10, c2 a11, c3 a12, c4 a13, c5 a14, c6 a15, c7 a16, c8 a17 FROM "S 1"."T 1" WHERE (("C 1" = c2))
 (3 rows)
 
 -- ===================================================================
 -- WHERE with remotely-executable conditions
 -- ===================================================================
 EXPLAIN (VERBOSE, COSTS false) SELECT * FROM ft1 t1 WHERE t1.c1 = 1;         -- Var, OpExpr(b), Const
-                                         QUERY PLAN                                          
----------------------------------------------------------------------------------------------
+                                                         QUERY PLAN                                                          
+-----------------------------------------------------------------------------------------------------------------------------
  Foreign Scan on public.ft1 t1
    Output: c1, c2, c3, c4, c5, c6, c7, c8
-   Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" WHERE (("C 1" = 1))
+   Remote SQL: SELECT "C 1" a10, c2 a11, c3 a12, c4 a13, c5 a14, c6 a15, c7 a16, c8 a17 FROM "S 1"."T 1" WHERE (("C 1" = 1))
 (3 rows)
 
 EXPLAIN (VERBOSE, COSTS false) SELECT * FROM ft1 t1 WHERE t1.c1 = 100 AND t1.c2 = 0; -- BoolExpr
-                                                  QUERY PLAN                                                  
---------------------------------------------------------------------------------------------------------------
+                                                                  QUERY PLAN                                                                  
+----------------------------------------------------------------------------------------------------------------------------------------------
  Foreign Scan on public.ft1 t1
    Output: c1, c2, c3, c4, c5, c6, c7, c8
-   Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" WHERE (("C 1" = 100)) AND ((c2 = 0))
+   Remote SQL: SELECT "C 1" a10, c2 a11, c3 a12, c4 a13, c5 a14, c6 a15, c7 a16, c8 a17 FROM "S 1"."T 1" WHERE (("C 1" = 100)) AND ((c2 = 0))
 (3 rows)
 
 EXPLAIN (VERBOSE, COSTS false) SELECT * FROM ft1 t1 WHERE c1 IS NULL;        -- NullTest
-                                           QUERY PLAN                                            
--------------------------------------------------------------------------------------------------
+                                                           QUERY PLAN                                                            
+---------------------------------------------------------------------------------------------------------------------------------
  Foreign Scan on public.ft1 t1
    Output: c1, c2, c3, c4, c5, c6, c7, c8
-   Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" WHERE (("C 1" IS NULL))
+   Remote SQL: SELECT "C 1" a10, c2 a11, c3 a12, c4 a13, c5 a14, c6 a15, c7 a16, c8 a17 FROM "S 1"."T 1" WHERE (("C 1" IS NULL))
 (3 rows)
 
 EXPLAIN (VERBOSE, COSTS false) SELECT * FROM ft1 t1 WHERE c1 IS NOT NULL;    -- NullTest
-                                             QUERY PLAN                                              
------------------------------------------------------------------------------------------------------
+                                                             QUERY PLAN                                                              
+-------------------------------------------------------------------------------------------------------------------------------------
  Foreign Scan on public.ft1 t1
    Output: c1, c2, c3, c4, c5, c6, c7, c8
-   Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" WHERE (("C 1" IS NOT NULL))
+   Remote SQL: SELECT "C 1" a10, c2 a11, c3 a12, c4 a13, c5 a14, c6 a15, c7 a16, c8 a17 FROM "S 1"."T 1" WHERE (("C 1" IS NOT NULL))
 (3 rows)
 
 EXPLAIN (VERBOSE, COSTS false) SELECT * FROM ft1 t1 WHERE round(abs(c1), 0) = 1; -- FuncExpr
-                                                     QUERY PLAN                                                      
----------------------------------------------------------------------------------------------------------------------
+                                                                     QUERY PLAN                                                                      
+-----------------------------------------------------------------------------------------------------------------------------------------------------
  Foreign Scan on public.ft1 t1
    Output: c1, c2, c3, c4, c5, c6, c7, c8
-   Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" WHERE ((round(abs("C 1"), 0) = 1::numeric))
+   Remote SQL: SELECT "C 1" a10, c2 a11, c3 a12, c4 a13, c5 a14, c6 a15, c7 a16, c8 a17 FROM "S 1"."T 1" WHERE ((round(abs("C 1"), 0) = 1::numeric))
 (3 rows)
 
 EXPLAIN (VERBOSE, COSTS false) SELECT * FROM ft1 t1 WHERE c1 = -c1;          -- OpExpr(l)
-                                             QUERY PLAN                                              
------------------------------------------------------------------------------------------------------
+                                                             QUERY PLAN                                                              
+-------------------------------------------------------------------------------------------------------------------------------------
  Foreign Scan on public.ft1 t1
    Output: c1, c2, c3, c4, c5, c6, c7, c8
-   Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" WHERE (("C 1" = (- "C 1")))
+   Remote SQL: SELECT "C 1" a10, c2 a11, c3 a12, c4 a13, c5 a14, c6 a15, c7 a16, c8 a17 FROM "S 1"."T 1" WHERE (("C 1" = (- "C 1")))
 (3 rows)
 
 EXPLAIN (VERBOSE, COSTS false) SELECT * FROM ft1 t1 WHERE 1 = c1!;           -- OpExpr(r)
-                                                QUERY PLAN                                                
-----------------------------------------------------------------------------------------------------------
+                                                                QUERY PLAN                                                                
+------------------------------------------------------------------------------------------------------------------------------------------
  Foreign Scan on public.ft1 t1
    Output: c1, c2, c3, c4, c5, c6, c7, c8
-   Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" WHERE ((1::numeric = ("C 1" !)))
+   Remote SQL: SELECT "C 1" a10, c2 a11, c3 a12, c4 a13, c5 a14, c6 a15, c7 a16, c8 a17 FROM "S 1"."T 1" WHERE ((1::numeric = ("C 1" !)))
 (3 rows)
 
 EXPLAIN (VERBOSE, COSTS false) SELECT * FROM ft1 t1 WHERE (c1 IS NOT NULL) IS DISTINCT FROM (c1 IS NOT NULL); -- DistinctExpr
-                                                                 QUERY PLAN                                                                 
---------------------------------------------------------------------------------------------------------------------------------------------
+                                                                                 QUERY PLAN                                                                                 
+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------
  Foreign Scan on public.ft1 t1
    Output: c1, c2, c3, c4, c5, c6, c7, c8
-   Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" WHERE ((("C 1" IS NOT NULL) IS DISTINCT FROM ("C 1" IS NOT NULL)))
+   Remote SQL: SELECT "C 1" a10, c2 a11, c3 a12, c4 a13, c5 a14, c6 a15, c7 a16, c8 a17 FROM "S 1"."T 1" WHERE ((("C 1" IS NOT NULL) IS DISTINCT FROM ("C 1" IS NOT NULL)))
 (3 rows)
 
 EXPLAIN (VERBOSE, COSTS false) SELECT * FROM ft1 t1 WHERE c1 = ANY(ARRAY[c2, 1, c1 + 0]); -- ScalarArrayOpExpr
-                                                        QUERY PLAN                                                         
----------------------------------------------------------------------------------------------------------------------------
+                                                                        QUERY PLAN                                                                         
+-----------------------------------------------------------------------------------------------------------------------------------------------------------
  Foreign Scan on public.ft1 t1
    Output: c1, c2, c3, c4, c5, c6, c7, c8
-   Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" WHERE (("C 1" = ANY (ARRAY[c2, 1, ("C 1" + 0)])))
+   Remote SQL: SELECT "C 1" a10, c2 a11, c3 a12, c4 a13, c5 a14, c6 a15, c7 a16, c8 a17 FROM "S 1"."T 1" WHERE (("C 1" = ANY (ARRAY[c2, 1, ("C 1" + 0)])))
 (3 rows)
 
 EXPLAIN (VERBOSE, COSTS false) SELECT * FROM ft1 t1 WHERE c1 = (ARRAY[c1,c2,3])[1]; -- ArrayRef
-                                                      QUERY PLAN                                                      
-----------------------------------------------------------------------------------------------------------------------
+                                                                      QUERY PLAN                                                                      
+------------------------------------------------------------------------------------------------------------------------------------------------------
  Foreign Scan on public.ft1 t1
    Output: c1, c2, c3, c4, c5, c6, c7, c8
-   Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" WHERE (("C 1" = ((ARRAY["C 1", c2, 3])[1])))
+   Remote SQL: SELECT "C 1" a10, c2 a11, c3 a12, c4 a13, c5 a14, c6 a15, c7 a16, c8 a17 FROM "S 1"."T 1" WHERE (("C 1" = ((ARRAY["C 1", c2, 3])[1])))
 (3 rows)
 
 EXPLAIN (VERBOSE, COSTS false) SELECT * FROM ft1 t1 WHERE c6 = E'foo''s\\bar';  -- check special chars
-                                                 QUERY PLAN                                                  
--------------------------------------------------------------------------------------------------------------
+                                                                 QUERY PLAN                                                                  
+---------------------------------------------------------------------------------------------------------------------------------------------
  Foreign Scan on public.ft1 t1
    Output: c1, c2, c3, c4, c5, c6, c7, c8
-   Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" WHERE ((c6 = E'foo''s\\bar'::text))
+   Remote SQL: SELECT "C 1" a10, c2 a11, c3 a12, c4 a13, c5 a14, c6 a15, c7 a16, c8 a17 FROM "S 1"."T 1" WHERE ((c6 = E'foo''s\\bar'::text))
 (3 rows)
 
 EXPLAIN (VERBOSE, COSTS false) SELECT * FROM ft1 t1 WHERE c8 = 'foo';  -- can't be sent to remote
-                               QUERY PLAN                                
--------------------------------------------------------------------------
+                                               QUERY PLAN                                                
+---------------------------------------------------------------------------------------------------------
  Foreign Scan on public.ft1 t1
    Output: c1, c2, c3, c4, c5, c6, c7, c8
    Filter: (t1.c8 = 'foo'::user_enum)
-   Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1"
+   Remote SQL: SELECT "C 1" a10, c2 a11, c3 a12, c4 a13, c5 a14, c6 a15, c7 a16, c8 a17 FROM "S 1"."T 1"
 (4 rows)
 
 -- parameterized remote path
 EXPLAIN (VERBOSE, COSTS false)
   SELECT * FROM ft2 a, ft2 b WHERE a.c1 = 47 AND b.c1 = a.c2;
-                                                 QUERY PLAN                                                  
--------------------------------------------------------------------------------------------------------------
- Nested Loop
+                                                                                                                                                                                                                                                                                     QUERY PLAN                                                                                                                                                                                                                                                                                      
+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+ Foreign Scan
    Output: a.c1, a.c2, a.c3, a.c4, a.c5, a.c6, a.c7, a.c8, b.c1, b.c2, b.c3, b.c4, b.c5, b.c6, b.c7, b.c8
-   ->  Foreign Scan on public.ft2 a
-         Output: a.c1, a.c2, a.c3, a.c4, a.c5, a.c6, a.c7, a.c8
-         Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" WHERE (("C 1" = 47))
-   ->  Foreign Scan on public.ft2 b
-         Output: b.c1, b.c2, b.c3, b.c4, b.c5, b.c6, b.c7, b.c8
-         Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" WHERE (($1::integer = "C 1"))
-(8 rows)
+   Relations: (public.ft2 a) INNER JOIN (public.ft2 b)
+   Remote SQL: SELECT l.a1, l.a2, l.a3, l.a4, l.a5, l.a6, l.a7, l.a8, r.a1, r.a2, r.a3, r.a4, r.a5, r.a6, r.a7, r.a8 FROM (SELECT l.a9, l.a10, l.a12, l.a13, l.a14, l.a15, l.a16, l.a17 FROM (SELECT "C 1" a9, c2 a10, c3 a12, c4 a13, c5 a14, c6 a15, c7 a16, c8 a17 FROM "S 1"."T 1" WHERE (("C 1" = 47))) l) l (a1, a2, a3, a4, a5, a6, a7, a8) INNER JOIN (SELECT r.a9, r.a10, r.a12, r.a13, r.a14, r.a15, r.a16, r.a17 FROM (SELECT "C 1" a9, c2 a10, c3 a12, c4 a13, c5 a14, c6 a15, c7 a16, c8 a17 FROM "S 1"."T 1") r) r (a1, a2, a3, a4, a5, a6, a7, a8) ON ((l.a2 = r.a1))
+(4 rows)
 
 SELECT * FROM ft2 a, ft2 b WHERE a.c1 = 47 AND b.c1 = a.c2;
  c1 | c2 |  c3   |              c4              |            c5            | c6 |     c7     | c8  | c1 | c2 |  c3   |              c4              |            c5            | c6 |     c7     | c8  
@@ -511,18 +545,18 @@ SELECT * FROM ft2 a, ft2 b WHERE a.c1 = 47 AND b.c1 = a.c2;
 EXPLAIN (VERBOSE, COSTS false)
   SELECT * FROM ft2 a, ft2 b
   WHERE a.c2 = 6 AND b.c1 = a.c1 AND a.c8 = 'foo' AND b.c7 = upper(a.c7);
-                                                 QUERY PLAN                                                  
--------------------------------------------------------------------------------------------------------------
+                                                                 QUERY PLAN                                                                 
+--------------------------------------------------------------------------------------------------------------------------------------------
  Nested Loop
    Output: a.c1, a.c2, a.c3, a.c4, a.c5, a.c6, a.c7, a.c8, b.c1, b.c2, b.c3, b.c4, b.c5, b.c6, b.c7, b.c8
    ->  Foreign Scan on public.ft2 a
          Output: a.c1, a.c2, a.c3, a.c4, a.c5, a.c6, a.c7, a.c8
          Filter: (a.c8 = 'foo'::user_enum)
-         Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" WHERE ((c2 = 6))
+         Remote SQL: SELECT "C 1" a9, c2 a10, c3 a12, c4 a13, c5 a14, c6 a15, c7 a16, c8 a17 FROM "S 1"."T 1" WHERE ((c2 = 6))
    ->  Foreign Scan on public.ft2 b
          Output: b.c1, b.c2, b.c3, b.c4, b.c5, b.c6, b.c7, b.c8
          Filter: (upper((a.c7)::text) = (b.c7)::text)
-         Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" WHERE (($1::integer = "C 1"))
+         Remote SQL: SELECT "C 1" a9, c2 a10, c3 a12, c4 a13, c5 a14, c6 a15, c7 a16, c8 a17 FROM "S 1"."T 1" WHERE (($1::integer = "C 1"))
 (10 rows)
 
 SELECT * FROM ft2 a, ft2 b
@@ -651,21 +685,597 @@ SELECT * FROM ft2 WHERE c1 = ANY (ARRAY(SELECT c1 FROM ft1 WHERE c1 < 5));
 (4 rows)
 
 -- ===================================================================
+-- JOIN queries
+-- ===================================================================
+-- join two tables
+EXPLAIN (COSTS false, VERBOSE)
+SELECT t1.c1, t2.c1 FROM ft1 t1 JOIN ft2 t2 ON (t1.c1 = t2.c1) ORDER BY t1.c3, t1.c1 OFFSET 100 LIMIT 10;
+                                                                                                               QUERY PLAN                                                                                                                
+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+ Limit
+   Output: t1.c1, t2.c1, t1.c3
+   ->  Sort
+         Output: t1.c1, t2.c1, t1.c3
+         Sort Key: t1.c3, t1.c1
+         ->  Foreign Scan
+               Output: t1.c1, t2.c1, t1.c3
+               Relations: (public.ft1 t1) INNER JOIN (public.ft2 t2)
+               Remote SQL: SELECT l.a1, l.a2, r.a1 FROM (SELECT l.a10, l.a12 FROM (SELECT "C 1" a10, c3 a12 FROM "S 1"."T 1") l) l (a1, a2) INNER JOIN (SELECT r.a9 FROM (SELECT "C 1" a9 FROM "S 1"."T 1") r) r (a1) ON ((l.a1 = r.a1))
+(9 rows)
+
+SELECT t1.c1, t2.c1 FROM ft1 t1 JOIN ft2 t2 ON (t1.c1 = t2.c1) ORDER BY t1.c3, t1.c1 OFFSET 100 LIMIT 10;
+ c1  | c1  
+-----+-----
+ 101 | 101
+ 102 | 102
+ 103 | 103
+ 104 | 104
+ 105 | 105
+ 106 | 106
+ 107 | 107
+ 108 | 108
+ 109 | 109
+ 110 | 110
+(10 rows)
+
+-- join three tables
+EXPLAIN (COSTS false, VERBOSE)
+SELECT t1.c1, t2.c2, t3.c3 FROM ft1 t1 JOIN ft2 t2 ON (t1.c1 = t2.c1) JOIN ft4 t3 ON (t3.c1 = t1.c1) ORDER BY t1.c3, t1.c1 OFFSET 10 LIMIT 10;
+                                                                                                                                                                                                              QUERY PLAN                                                                                                                                                                                                               
+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+ Limit
+   Output: t1.c1, t2.c2, t3.c3, t1.c3
+   ->  Sort
+         Output: t1.c1, t2.c2, t3.c3, t1.c3
+         Sort Key: t1.c3, t1.c1
+         ->  Foreign Scan
+               Output: t1.c1, t2.c2, t3.c3, t1.c3
+               Relations: ((public.ft1 t1) INNER JOIN (public.ft2 t2)) INNER JOIN (public.ft4 t3)
+               Remote SQL: SELECT l.a1, l.a2, l.a3, r.a1 FROM (SELECT l.a1, l.a2, r.a1, r.a2 FROM (SELECT l.a10, l.a12 FROM (SELECT "C 1" a10, c3 a12 FROM "S 1"."T 1") l) l (a1, a2) INNER JOIN (SELECT r.a10, r.a9 FROM (SELECT "C 1" a9, c2 a10 FROM "S 1"."T 1") r) r (a1, a2) ON ((l.a1 = r.a2))) l (a1, a2, a3, a4) INNER JOIN (SELECT r.a11, r.a9 FROM (SELECT c1 a9, c3 a11 FROM "S 1"."T 3") r) r (a1, a2) ON ((l.a1 = r.a2))
+(9 rows)
+
+SELECT t1.c1, t2.c2, t3.c3 FROM ft1 t1 JOIN ft2 t2 ON (t1.c1 = t2.c1) JOIN ft4 t3 ON (t3.c1 = t1.c1) ORDER BY t1.c3, t1.c1 OFFSET 10 LIMIT 10;
+ c1 | c2 |   c3   
+----+----+--------
+ 22 |  2 | AAA022
+ 24 |  4 | AAA024
+ 26 |  6 | AAA026
+ 28 |  8 | AAA028
+ 30 |  0 | AAA030
+ 32 |  2 | AAA032
+ 34 |  4 | AAA034
+ 36 |  6 | AAA036
+ 38 |  8 | AAA038
+ 40 |  0 | AAA040
+(10 rows)
+
+-- left outer join
+EXPLAIN (COSTS false, VERBOSE)
+SELECT t1.c1, t2.c1 FROM ft4 t1 LEFT JOIN ft5 t2 ON (t1.c1 = t2.c1) ORDER BY t1.c1, t2.c1 OFFSET 10 LIMIT 10;
+                                                                                              QUERY PLAN                                                                                               
+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+ Limit
+   Output: t1.c1, t2.c1
+   ->  Sort
+         Output: t1.c1, t2.c1
+         Sort Key: t1.c1, t2.c1
+         ->  Foreign Scan
+               Output: t1.c1, t2.c1
+               Relations: (public.ft4 t1) LEFT JOIN (public.ft5 t2)
+               Remote SQL: SELECT l.a1, r.a1 FROM (SELECT l.a9 FROM (SELECT c1 a9 FROM "S 1"."T 3") l) l (a1) LEFT JOIN (SELECT r.a9 FROM (SELECT c1 a9 FROM "S 1"."T 4") r) r (a1) ON ((l.a1 = r.a1))
+(9 rows)
+
+SELECT t1.c1, t2.c1 FROM ft4 t1 LEFT JOIN ft5 t2 ON (t1.c1 = t2.c1) ORDER BY t1.c1, t2.c1 OFFSET 10 LIMIT 10;
+ c1 | c1 
+----+----
+ 22 |   
+ 24 | 24
+ 26 |   
+ 28 |   
+ 30 | 30
+ 32 |   
+ 34 |   
+ 36 | 36
+ 38 |   
+ 40 |   
+(10 rows)
+
+-- right outer join
+SET enable_mergejoin = off; -- the planner chooses MergeJoin even though it has a higher cost, so disable it for testing.
+EXPLAIN (COSTS false, VERBOSE)
+SELECT t1.c1, t2.c1 FROM ft4 t1 RIGHT JOIN ft5 t2 ON (t1.c1 = t2.c1) ORDER BY t2.c1, t2.c1 OFFSET 10 LIMIT 10;
+                                                                                              QUERY PLAN                                                                                               
+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+ Limit
+   Output: t1.c1, t2.c1
+   ->  Sort
+         Output: t1.c1, t2.c1
+         Sort Key: t2.c1
+         ->  Foreign Scan
+               Output: t1.c1, t2.c1
+               Relations: (public.ft5 t2) LEFT JOIN (public.ft4 t1)
+               Remote SQL: SELECT l.a1, r.a1 FROM (SELECT l.a9 FROM (SELECT c1 a9 FROM "S 1"."T 4") l) l (a1) LEFT JOIN (SELECT r.a9 FROM (SELECT c1 a9 FROM "S 1"."T 3") r) r (a1) ON ((r.a1 = l.a1))
+(9 rows)
+
+SELECT t1.c1, t2.c1 FROM ft4 t1 RIGHT JOIN ft5 t2 ON (t1.c1 = t2.c1) ORDER BY t2.c1, t2.c1 OFFSET 10 LIMIT 10;
+ c1 | c1 
+----+----
+    | 33
+ 36 | 36
+    | 39
+ 42 | 42
+    | 45
+ 48 | 48
+    | 51
+ 54 | 54
+    | 57
+ 60 | 60
+(10 rows)
+
+SET enable_mergejoin = on;
+-- full outer join
+EXPLAIN (COSTS false, VERBOSE)
+SELECT t1.c1, t2.c1 FROM ft4 t1 FULL JOIN ft5 t2 ON (t1.c1 = t2.c1) ORDER BY t1.c1, t2.c1 OFFSET 45 LIMIT 10;
+                                                                                              QUERY PLAN                                                                                               
+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+ Limit
+   Output: t1.c1, t2.c1
+   ->  Sort
+         Output: t1.c1, t2.c1
+         Sort Key: t1.c1, t2.c1
+         ->  Foreign Scan
+               Output: t1.c1, t2.c1
+               Relations: (public.ft4 t1) FULL JOIN (public.ft5 t2)
+               Remote SQL: SELECT l.a1, r.a1 FROM (SELECT l.a9 FROM (SELECT c1 a9 FROM "S 1"."T 3") l) l (a1) FULL JOIN (SELECT r.a9 FROM (SELECT c1 a9 FROM "S 1"."T 4") r) r (a1) ON ((l.a1 = r.a1))
+(9 rows)
+
+SELECT t1.c1, t2.c1 FROM ft4 t1 FULL JOIN ft5 t2 ON (t1.c1 = t2.c1) ORDER BY t1.c1, t2.c1 OFFSET 45 LIMIT 10;
+ c1  | c1 
+-----+----
+  92 |   
+  94 |   
+  96 | 96
+  98 |   
+ 100 |   
+     |  3
+     |  9
+     | 15
+     | 21
+     | 27
+(10 rows)
+
+-- full outer join + WHERE clause, only matched rows or rows with NULL t1.c1
+EXPLAIN (COSTS false, VERBOSE)
+SELECT t1.c1, t2.c1 FROM ft4 t1 FULL JOIN ft5 t2 ON (t1.c1 = t2.c1) WHERE (t1.c1 = t2.c1 OR t1.c1 IS NULL) ORDER BY t1.c1, t2.c1 OFFSET 10 LIMIT 10;
+                                                                                                                   QUERY PLAN                                                                                                                    
+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+ Limit
+   Output: t1.c1, t2.c1
+   ->  Sort
+         Output: t1.c1, t2.c1
+         Sort Key: t1.c1, t2.c1
+         ->  Foreign Scan
+               Output: t1.c1, t2.c1
+               Relations: (public.ft4 t1) FULL JOIN (public.ft5 t2)
+               Remote SQL: SELECT l.a1, r.a1 FROM (SELECT l.a9 FROM (SELECT c1 a9 FROM "S 1"."T 3") l) l (a1) FULL JOIN (SELECT r.a9 FROM (SELECT c1 a9 FROM "S 1"."T 4") r) r (a1) ON ((l.a1 = r.a1)) WHERE (((l.a1 = r.a1) OR (l.a1 IS NULL)))
+(9 rows)
+
+SELECT t1.c1, t2.c1 FROM ft4 t1 FULL JOIN ft5 t2 ON (t1.c1 = t2.c1) WHERE (t1.c1 = t2.c1 OR t1.c1 IS NULL) ORDER BY t1.c1, t2.c1 OFFSET 10 LIMIT 10;
+ c1 | c1 
+----+----
+ 66 | 66
+ 72 | 72
+ 78 | 78
+ 84 | 84
+ 90 | 90
+ 96 | 96
+    |  3
+    |  9
+    | 15
+    | 21
+(10 rows)
+
+-- join condition in WHERE clause (LEFT JOIN is reduced to INNER JOIN)
+SET enable_mergejoin = off; -- the planner chooses MergeJoin even though it has a higher cost, so disable it for testing.
+EXPLAIN (COSTS false, VERBOSE)
+SELECT t1.c1, t2.c1 FROM ft4 t1 LEFT JOIN ft5 t2 ON true WHERE (t1.c1 = t2.c1) ORDER BY t1.c1, t2.c1 OFFSET 10 LIMIT 10;
+                                                                                               QUERY PLAN                                                                                               
+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+ Limit
+   Output: t1.c1, t2.c1
+   ->  Sort
+         Output: t1.c1, t2.c1
+         Sort Key: t1.c1
+         ->  Foreign Scan
+               Output: t1.c1, t2.c1
+               Relations: (public.ft4 t1) INNER JOIN (public.ft5 t2)
+               Remote SQL: SELECT l.a1, r.a1 FROM (SELECT l.a9 FROM (SELECT c1 a9 FROM "S 1"."T 3") l) l (a1) INNER JOIN (SELECT r.a9 FROM (SELECT c1 a9 FROM "S 1"."T 4") r) r (a1) ON ((l.a1 = r.a1))
+(9 rows)
+
+SELECT t1.c1, t2.c1 FROM ft4 t1 LEFT JOIN ft5 t2 ON true WHERE (t1.c1 = t2.c1) ORDER BY t1.c1, t2.c1 OFFSET 10 LIMIT 10;
+ c1 | c1 
+----+----
+ 66 | 66
+ 72 | 72
+ 78 | 78
+ 84 | 84
+ 90 | 90
+ 96 | 96
+(6 rows)
+
+SET enable_mergejoin = on;
+-- join in CTE
+EXPLAIN (COSTS false, VERBOSE)
+WITH t (c1_1, c1_3, c2_1) AS (SELECT t1.c1, t1.c3, t2.c1 FROM ft1 t1 JOIN ft2 t2 ON (t1.c1 = t2.c1)) SELECT c1_1, c2_1 FROM t ORDER BY c1_3, c1_1 OFFSET 100 LIMIT 10;
+                                                                                                             QUERY PLAN                                                                                                              
+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+ Limit
+   Output: t.c1_1, t.c2_1, t.c1_3
+   CTE t
+     ->  Foreign Scan
+           Output: t1.c1, t1.c3, t2.c1
+           Relations: (public.ft1 t1) INNER JOIN (public.ft2 t2)
+           Remote SQL: SELECT l.a1, l.a2, r.a1 FROM (SELECT l.a10, l.a12 FROM (SELECT "C 1" a10, c3 a12 FROM "S 1"."T 1") l) l (a1, a2) INNER JOIN (SELECT r.a9 FROM (SELECT "C 1" a9 FROM "S 1"."T 1") r) r (a1) ON ((l.a1 = r.a1))
+   ->  Sort
+         Output: t.c1_1, t.c2_1, t.c1_3
+         Sort Key: t.c1_3, t.c1_1
+         ->  CTE Scan on t
+               Output: t.c1_1, t.c2_1, t.c1_3
+(12 rows)
+
+WITH t (c1_1, c1_3, c2_1) AS (SELECT t1.c1, t1.c3, t2.c1 FROM ft1 t1 JOIN ft2 t2 ON (t1.c1 = t2.c1)) SELECT c1_1, c2_1 FROM t ORDER BY c1_3, c1_1 OFFSET 100 LIMIT 10;
+ c1_1 | c2_1 
+------+------
+  101 |  101
+  102 |  102
+  103 |  103
+  104 |  104
+  105 |  105
+  106 |  106
+  107 |  107
+  108 |  108
+  109 |  109
+  110 |  110
+(10 rows)
+
+-- ctid with whole-row reference
+EXPLAIN (COSTS false, VERBOSE)
+SELECT t1.ctid, t1, t2, t1.c1 FROM ft1 t1 JOIN ft2 t2 ON (t1.c1 = t2.c1) ORDER BY t1.c3, t1.c1 OFFSET 100 LIMIT 10;
+                                                                                                                                                                                                                                                   QUERY PLAN                                                                                                                                                                                                                                                    
+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+ Limit
+   Output: t1.ctid, t1.*, t2.*, t1.c1, t1.c3
+   ->  Sort
+         Output: t1.ctid, t1.*, t2.*, t1.c1, t1.c3
+         Sort Key: t1.c3, t1.c1
+         ->  Foreign Scan
+               Output: t1.ctid, t1.*, t2.*, t1.c1, t1.c3
+               Relations: (public.ft1 t1) INNER JOIN (public.ft2 t2)
+               Remote SQL: SELECT l.a1, l.a2, l.a3, l.a4, r.a1 FROM (SELECT l.a7, ROW(l.a10, l.a11, l.a12, l.a13, l.a14, l.a15, l.a16, l.a17), l.a10, l.a12 FROM (SELECT "C 1" a10, c2 a11, c3 a12, c4 a13, c5 a14, c6 a15, c7 a16, c8 a17, ctid a7 FROM "S 1"."T 1") l) l (a1, a2, a3, a4) INNER JOIN (SELECT ROW(r.a9, r.a10, r.a12, r.a13, r.a14, r.a15, r.a16, r.a17), r.a9 FROM (SELECT "C 1" a9, c2 a10, c3 a12, c4 a13, c5 a14, c6 a15, c7 a16, c8 a17 FROM "S 1"."T 1") r) r (a1, a2) ON ((l.a3 = r.a2))
+(9 rows)
+
+SELECT t1.ctid, t1, t2, t1.c1 FROM ft1 t1 JOIN ft2 t2 ON (t1.c1 = t2.c1) ORDER BY t1.c3, t1.c1 OFFSET 100 LIMIT 10;
+  ctid  |                                             t1                                             |                                             t2                                             | c1  
+--------+--------------------------------------------------------------------------------------------+--------------------------------------------------------------------------------------------+-----
+ (1,4)  | (101,1,00101,"Fri Jan 02 00:00:00 1970 PST","Fri Jan 02 00:00:00 1970",1,"1         ",foo) | (101,1,00101,"Fri Jan 02 00:00:00 1970 PST","Fri Jan 02 00:00:00 1970",1,"1         ",foo) | 101
+ (1,5)  | (102,2,00102,"Sat Jan 03 00:00:00 1970 PST","Sat Jan 03 00:00:00 1970",2,"2         ",foo) | (102,2,00102,"Sat Jan 03 00:00:00 1970 PST","Sat Jan 03 00:00:00 1970",2,"2         ",foo) | 102
+ (1,6)  | (103,3,00103,"Sun Jan 04 00:00:00 1970 PST","Sun Jan 04 00:00:00 1970",3,"3         ",foo) | (103,3,00103,"Sun Jan 04 00:00:00 1970 PST","Sun Jan 04 00:00:00 1970",3,"3         ",foo) | 103
+ (1,7)  | (104,4,00104,"Mon Jan 05 00:00:00 1970 PST","Mon Jan 05 00:00:00 1970",4,"4         ",foo) | (104,4,00104,"Mon Jan 05 00:00:00 1970 PST","Mon Jan 05 00:00:00 1970",4,"4         ",foo) | 104
+ (1,8)  | (105,5,00105,"Tue Jan 06 00:00:00 1970 PST","Tue Jan 06 00:00:00 1970",5,"5         ",foo) | (105,5,00105,"Tue Jan 06 00:00:00 1970 PST","Tue Jan 06 00:00:00 1970",5,"5         ",foo) | 105
+ (1,9)  | (106,6,00106,"Wed Jan 07 00:00:00 1970 PST","Wed Jan 07 00:00:00 1970",6,"6         ",foo) | (106,6,00106,"Wed Jan 07 00:00:00 1970 PST","Wed Jan 07 00:00:00 1970",6,"6         ",foo) | 106
+ (1,10) | (107,7,00107,"Thu Jan 08 00:00:00 1970 PST","Thu Jan 08 00:00:00 1970",7,"7         ",foo) | (107,7,00107,"Thu Jan 08 00:00:00 1970 PST","Thu Jan 08 00:00:00 1970",7,"7         ",foo) | 107
+ (1,11) | (108,8,00108,"Fri Jan 09 00:00:00 1970 PST","Fri Jan 09 00:00:00 1970",8,"8         ",foo) | (108,8,00108,"Fri Jan 09 00:00:00 1970 PST","Fri Jan 09 00:00:00 1970",8,"8         ",foo) | 108
+ (1,12) | (109,9,00109,"Sat Jan 10 00:00:00 1970 PST","Sat Jan 10 00:00:00 1970",9,"9         ",foo) | (109,9,00109,"Sat Jan 10 00:00:00 1970 PST","Sat Jan 10 00:00:00 1970",9,"9         ",foo) | 109
+ (1,13) | (110,0,00110,"Sun Jan 11 00:00:00 1970 PST","Sun Jan 11 00:00:00 1970",0,"0         ",foo) | (110,0,00110,"Sun Jan 11 00:00:00 1970 PST","Sun Jan 11 00:00:00 1970",0,"0         ",foo) | 110
+(10 rows)
+
+-- only partially safe to push down: ft2/ft4 join is pushed down, join with ft1 is done locally
+EXPLAIN (COSTS false, VERBOSE)
+SELECT t1.c1 FROM ft1 t1 JOIN ft2 t2 ON t2.c1 = t2.c1 JOIN ft4 t3 ON t2.c1 = t3.c1 ORDER BY t1.c1 OFFSET 10 LIMIT 10;
+                                                                                                               QUERY PLAN                                                                                                                
+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+ Limit
+   Output: t1.c1
+   ->  Sort
+         Output: t1.c1
+         Sort Key: t1.c1
+         ->  Nested Loop
+               Output: t1.c1
+               ->  Foreign Scan on public.ft1 t1
+                     Output: t1.c1
+                     Remote SQL: SELECT "C 1" a10 FROM "S 1"."T 1"
+               ->  Materialize
+                     ->  Foreign Scan
+                           Relations: (public.ft2 t2) INNER JOIN (public.ft4 t3)
+                           Remote SQL: SELECT NULL FROM (SELECT l.a9 FROM (SELECT "C 1" a9 FROM "S 1"."T 1" WHERE (("C 1" = "C 1"))) l) l (a1) INNER JOIN (SELECT r.a9 FROM (SELECT c1 a9 FROM "S 1"."T 3") r) r (a1) ON ((l.a1 = r.a1))
+(14 rows)
+
+SELECT t1.c1 FROM ft1 t1 JOIN ft2 t2 ON t2.c1 = t2.c1 JOIN ft4 t3 ON t2.c1 = t3.c1 ORDER BY t1.c1 OFFSET 10 LIMIT 10;
+ c1 
+----
+  1
+  1
+  1
+  1
+  1
+  1
+  1
+  1
+  1
+  1
+(10 rows)
+
+-- SEMI JOIN, not pushed down
+EXPLAIN (COSTS false, VERBOSE)
+SELECT t1.c1 FROM ft1 t1 WHERE EXISTS (SELECT 1 FROM ft2 t2 WHERE t1.c1 = t2.c1) ORDER BY t1.c1 OFFSET 100 LIMIT 10;
+                                  QUERY PLAN                                  
+------------------------------------------------------------------------------
+ Limit
+   Output: t1.c1
+   ->  Sort
+         Output: t1.c1
+         Sort Key: t1.c1
+         ->  Hash Join
+               Output: t1.c1
+               Hash Cond: (t1.c1 = t2.c1)
+               ->  Foreign Scan on public.ft1 t1
+                     Output: t1.c1
+                     Remote SQL: SELECT "C 1" a10 FROM "S 1"."T 1"
+               ->  Hash
+                     Output: t2.c1
+                     ->  HashAggregate
+                           Output: t2.c1
+                           Group Key: t2.c1
+                           ->  Foreign Scan on public.ft2 t2
+                                 Output: t2.c1
+                                 Remote SQL: SELECT "C 1" a9 FROM "S 1"."T 1"
+(19 rows)
+
+SELECT t1.c1 FROM ft1 t1 WHERE EXISTS (SELECT 1 FROM ft2 t2 WHERE t1.c1 = t2.c1) ORDER BY t1.c1 OFFSET 100 LIMIT 10;
+ c1  
+-----
+ 101
+ 102
+ 103
+ 104
+ 105
+ 106
+ 107
+ 108
+ 109
+ 110
+(10 rows)
+
+-- ANTI JOIN, not pushed down
+EXPLAIN (COSTS false, VERBOSE)
+SELECT t1.c1 FROM ft1 t1 WHERE NOT EXISTS (SELECT 1 FROM ft2 t2 WHERE t1.c1 = t2.c2) ORDER BY t1.c1 OFFSET 100 LIMIT 10;
+                              QUERY PLAN                              
+----------------------------------------------------------------------
+ Limit
+   Output: t1.c1
+   ->  Sort
+         Output: t1.c1
+         Sort Key: t1.c1
+         ->  Hash Anti Join
+               Output: t1.c1
+               Hash Cond: (t1.c1 = t2.c2)
+               ->  Foreign Scan on public.ft1 t1
+                     Output: t1.c1
+                     Remote SQL: SELECT "C 1" a10 FROM "S 1"."T 1"
+               ->  Hash
+                     Output: t2.c2
+                     ->  Foreign Scan on public.ft2 t2
+                           Output: t2.c2
+                           Remote SQL: SELECT c2 a10 FROM "S 1"."T 1"
+(16 rows)
+
+SELECT t1.c1 FROM ft1 t1 WHERE NOT EXISTS (SELECT 1 FROM ft2 t2 WHERE t1.c1 = t2.c2) ORDER BY t1.c1 OFFSET 100 LIMIT 10;
+ c1  
+-----
+ 110
+ 111
+ 112
+ 113
+ 114
+ 115
+ 116
+ 117
+ 118
+ 119
+(10 rows)
+
+-- CROSS JOIN, not pushed down
+EXPLAIN (COSTS false, VERBOSE)
+SELECT t1.c1, t2.c1 FROM ft1 t1 CROSS JOIN ft2 t2 ORDER BY t1.c1, t2.c1 OFFSET 100 LIMIT 10;
+                               QUERY PLAN                               
+------------------------------------------------------------------------
+ Limit
+   Output: t1.c1, t2.c1
+   ->  Sort
+         Output: t1.c1, t2.c1
+         Sort Key: t1.c1, t2.c1
+         ->  Nested Loop
+               Output: t1.c1, t2.c1
+               ->  Foreign Scan on public.ft1 t1
+                     Output: t1.c1
+                     Remote SQL: SELECT "C 1" a10 FROM "S 1"."T 1"
+               ->  Materialize
+                     Output: t2.c1
+                     ->  Foreign Scan on public.ft2 t2
+                           Output: t2.c1
+                           Remote SQL: SELECT "C 1" a9 FROM "S 1"."T 1"
+(15 rows)
+
+SELECT t1.c1, t2.c1 FROM ft1 t1 CROSS JOIN ft2 t2 ORDER BY t1.c1, t2.c1 OFFSET 100 LIMIT 10;
+ c1 | c1  
+----+-----
+  1 | 101
+  1 | 102
+  1 | 103
+  1 | 104
+  1 | 105
+  1 | 106
+  1 | 107
+  1 | 108
+  1 | 109
+  1 | 110
+(10 rows)
+
+-- different server
+EXPLAIN (COSTS false, VERBOSE)
+SELECT t1.c1, t2.c1 FROM ft5 t1 JOIN ft6 t2 ON (t1.c1 = t2.c1) ORDER BY t1.c1, t2.c1 OFFSET 100 LIMIT 10;
+                          QUERY PLAN                           
+---------------------------------------------------------------
+ Limit
+   Output: t1.c1, t2.c1
+   ->  Merge Join
+         Output: t1.c1, t2.c1
+         Merge Cond: (t1.c1 = t2.c1)
+         ->  Sort
+               Output: t1.c1
+               Sort Key: t1.c1
+               ->  Foreign Scan on public.ft5 t1
+                     Output: t1.c1
+                     Remote SQL: SELECT c1 a9 FROM "S 1"."T 4"
+         ->  Sort
+               Output: t2.c1
+               Sort Key: t2.c1
+               ->  Foreign Scan on public.ft6 t2
+                     Output: t2.c1
+                     Remote SQL: SELECT c1 a9 FROM "S 1"."T 4"
+(17 rows)
+
+SELECT t1.c1, t2.c1 FROM ft5 t1 JOIN ft6 t2 ON (t1.c1 = t2.c1) ORDER BY t1.c1, t2.c1 OFFSET 100 LIMIT 10;
+ c1 | c1 
+----+----
+(0 rows)
+
+-- different effective user for permission check
+EXPLAIN (COSTS false, VERBOSE)
+SELECT t1.c1, t2.c1 FROM ft5 t1 JOIN v_ft5 t2 ON (t1.c1 = t2.c1) ORDER BY t1.c1, t2.c1 OFFSET 100 LIMIT 10;
+                          QUERY PLAN                           
+---------------------------------------------------------------
+ Limit
+   Output: t1.c1, ft5.c1
+   ->  Merge Join
+         Output: t1.c1, ft5.c1
+         Merge Cond: (t1.c1 = ft5.c1)
+         ->  Sort
+               Output: t1.c1
+               Sort Key: t1.c1
+               ->  Foreign Scan on public.ft5 t1
+                     Output: t1.c1
+                     Remote SQL: SELECT c1 a9 FROM "S 1"."T 4"
+         ->  Sort
+               Output: ft5.c1
+               Sort Key: ft5.c1
+               ->  Foreign Scan on public.ft5
+                     Output: ft5.c1
+                     Remote SQL: SELECT c1 a9 FROM "S 1"."T 4"
+(17 rows)
+
+SELECT t1.c1, t2.c1 FROM ft5 t1 JOIN v_ft5 t2 ON (t1.c1 = t2.c1) ORDER BY t1.c1, t2.c1 OFFSET 100 LIMIT 10;
+ c1 | c1 
+----+----
+(0 rows)
+
+-- unsafe join conditions
+EXPLAIN (COSTS false, VERBOSE)
+SELECT t1.c1, t2.c1 FROM ft1 t1 JOIN ft2 t2 ON (t1.c8 = t2.c8) ORDER BY t1.c3, t1.c1 OFFSET 100 LIMIT 10;
+                                       QUERY PLAN                                        
+-----------------------------------------------------------------------------------------
+ Limit
+   Output: t1.c1, t2.c1, t1.c3
+   ->  Sort
+         Output: t1.c1, t2.c1, t1.c3
+         Sort Key: t1.c3, t1.c1
+         ->  Merge Join
+               Output: t1.c1, t2.c1, t1.c3
+               Merge Cond: (t1.c8 = t2.c8)
+               ->  Sort
+                     Output: t1.c1, t1.c3, t1.c8
+                     Sort Key: t1.c8
+                     ->  Foreign Scan on public.ft1 t1
+                           Output: t1.c1, t1.c3, t1.c8
+                           Remote SQL: SELECT "C 1" a10, c3 a12, c8 a17 FROM "S 1"."T 1"
+               ->  Sort
+                     Output: t2.c1, t2.c8
+                     Sort Key: t2.c8
+                     ->  Foreign Scan on public.ft2 t2
+                           Output: t2.c1, t2.c8
+                           Remote SQL: SELECT "C 1" a9, c8 a17 FROM "S 1"."T 1"
+(20 rows)
+
+SELECT t1.c1, t2.c1 FROM ft1 t1 JOIN ft2 t2 ON (t1.c8 = t2.c8) ORDER BY t1.c3, t1.c1 OFFSET 100 LIMIT 10;
+ c1 | c1  
+----+-----
+  1 | 102
+  1 | 103
+  1 | 104
+  1 | 105
+  1 | 106
+  1 | 107
+  1 | 108
+  1 | 109
+  1 | 110
+  1 |   1
+(10 rows)
+
+-- local filter (unsafe conditions on one side)
+EXPLAIN (COSTS false, VERBOSE)
+SELECT t1.c1, t2.c1 FROM ft1 t1 JOIN ft2 t2 ON (t1.c1 = t2.c1) WHERE t1.c8 = 'foo' ORDER BY t1.c3, t1.c1 OFFSET 100 LIMIT 10;
+                                       QUERY PLAN                                        
+-----------------------------------------------------------------------------------------
+ Limit
+   Output: t1.c1, t2.c1, t1.c3
+   ->  Sort
+         Output: t1.c1, t2.c1, t1.c3
+         Sort Key: t1.c3, t1.c1
+         ->  Hash Join
+               Output: t1.c1, t2.c1, t1.c3
+               Hash Cond: (t2.c1 = t1.c1)
+               ->  Foreign Scan on public.ft2 t2
+                     Output: t2.c1
+                     Remote SQL: SELECT "C 1" a9 FROM "S 1"."T 1"
+               ->  Hash
+                     Output: t1.c1, t1.c3
+                     ->  Foreign Scan on public.ft1 t1
+                           Output: t1.c1, t1.c3
+                           Filter: (t1.c8 = 'foo'::user_enum)
+                           Remote SQL: SELECT "C 1" a10, c3 a12, c8 a17 FROM "S 1"."T 1"
+(17 rows)
+
+SELECT t1.c1, t2.c1 FROM ft1 t1 JOIN ft2 t2 ON (t1.c1 = t2.c1) WHERE t1.c8 = 'foo' ORDER BY t1.c3, t1.c1 OFFSET 100 LIMIT 10;
+ c1  | c1  
+-----+-----
+ 101 | 101
+ 102 | 102
+ 103 | 103
+ 104 | 104
+ 105 | 105
+ 106 | 106
+ 107 | 107
+ 108 | 108
+ 109 | 109
+ 110 | 110
+(10 rows)
+
+-- ===================================================================
 -- parameterized queries
 -- ===================================================================
 -- simple join
 PREPARE st1(int, int) AS SELECT t1.c3, t2.c3 FROM ft1 t1, ft2 t2 WHERE t1.c1 = $1 AND t2.c1 = $2;
 EXPLAIN (VERBOSE, COSTS false) EXECUTE st1(1, 2);
-                             QUERY PLAN                             
---------------------------------------------------------------------
+                               QUERY PLAN                               
+------------------------------------------------------------------------
  Nested Loop
    Output: t1.c3, t2.c3
    ->  Foreign Scan on public.ft1 t1
          Output: t1.c3
-         Remote SQL: SELECT c3 FROM "S 1"."T 1" WHERE (("C 1" = 1))
+         Remote SQL: SELECT c3 a12 FROM "S 1"."T 1" WHERE (("C 1" = 1))
    ->  Foreign Scan on public.ft2 t2
          Output: t2.c3
-         Remote SQL: SELECT c3 FROM "S 1"."T 1" WHERE (("C 1" = 2))
+         Remote SQL: SELECT c3 a12 FROM "S 1"."T 1" WHERE (("C 1" = 2))
 (8 rows)
 
 EXECUTE st1(1, 1);
@@ -683,8 +1293,8 @@ EXECUTE st1(101, 101);
 -- subquery using stable function (can't be sent to remote)
 PREPARE st2(int) AS SELECT * FROM ft1 t1 WHERE t1.c1 < $2 AND t1.c3 IN (SELECT c3 FROM ft2 t2 WHERE c1 > $1 AND date(c4) = '1970-01-17'::date) ORDER BY c1;
 EXPLAIN (VERBOSE, COSTS false) EXECUTE st2(10, 20);
-                                                QUERY PLAN                                                
-----------------------------------------------------------------------------------------------------------
+                                                                QUERY PLAN                                                                
+------------------------------------------------------------------------------------------------------------------------------------------
  Sort
    Output: t1.c1, t1.c2, t1.c3, t1.c4, t1.c5, t1.c6, t1.c7, t1.c8
    Sort Key: t1.c1
@@ -693,13 +1303,13 @@ EXPLAIN (VERBOSE, COSTS false) EXECUTE st2(10, 20);
          Join Filter: (t1.c3 = t2.c3)
          ->  Foreign Scan on public.ft1 t1
                Output: t1.c1, t1.c2, t1.c3, t1.c4, t1.c5, t1.c6, t1.c7, t1.c8
-               Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" WHERE (("C 1" < 20))
+               Remote SQL: SELECT "C 1" a10, c2 a11, c3 a12, c4 a13, c5 a14, c6 a15, c7 a16, c8 a17 FROM "S 1"."T 1" WHERE (("C 1" < 20))
          ->  Materialize
                Output: t2.c3
                ->  Foreign Scan on public.ft2 t2
                      Output: t2.c3
                      Filter: (date(t2.c4) = '01-17-1970'::date)
-                     Remote SQL: SELECT c3, c4 FROM "S 1"."T 1" WHERE (("C 1" > 10))
+                     Remote SQL: SELECT c3 a12, c4 a13 FROM "S 1"."T 1" WHERE (("C 1" > 10))
 (15 rows)
 
 EXECUTE st2(10, 20);
@@ -717,8 +1327,8 @@ EXECUTE st2(101, 121);
 -- subquery using immutable function (can be sent to remote)
 PREPARE st3(int) AS SELECT * FROM ft1 t1 WHERE t1.c1 < $2 AND t1.c3 IN (SELECT c3 FROM ft2 t2 WHERE c1 > $1 AND date(c5) = '1970-01-17'::date) ORDER BY c1;
 EXPLAIN (VERBOSE, COSTS false) EXECUTE st3(10, 20);
-                                                      QUERY PLAN                                                       
------------------------------------------------------------------------------------------------------------------------
+                                                                QUERY PLAN                                                                
+------------------------------------------------------------------------------------------------------------------------------------------
  Sort
    Output: t1.c1, t1.c2, t1.c3, t1.c4, t1.c5, t1.c6, t1.c7, t1.c8
    Sort Key: t1.c1
@@ -727,12 +1337,12 @@ EXPLAIN (VERBOSE, COSTS false) EXECUTE st3(10, 20);
          Join Filter: (t1.c3 = t2.c3)
          ->  Foreign Scan on public.ft1 t1
                Output: t1.c1, t1.c2, t1.c3, t1.c4, t1.c5, t1.c6, t1.c7, t1.c8
-               Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" WHERE (("C 1" < 20))
+               Remote SQL: SELECT "C 1" a10, c2 a11, c3 a12, c4 a13, c5 a14, c6 a15, c7 a16, c8 a17 FROM "S 1"."T 1" WHERE (("C 1" < 20))
          ->  Materialize
                Output: t2.c3
                ->  Foreign Scan on public.ft2 t2
                      Output: t2.c3
-                     Remote SQL: SELECT c3 FROM "S 1"."T 1" WHERE (("C 1" > 10)) AND ((date(c5) = '1970-01-17'::date))
+                     Remote SQL: SELECT c3 a12 FROM "S 1"."T 1" WHERE (("C 1" > 10)) AND ((date(c5) = '1970-01-17'::date))
 (14 rows)
 
 EXECUTE st3(10, 20);
@@ -749,108 +1359,108 @@ EXECUTE st3(20, 30);
 -- custom plan should be chosen initially
 PREPARE st4(int) AS SELECT * FROM ft1 t1 WHERE t1.c1 = $1;
 EXPLAIN (VERBOSE, COSTS false) EXECUTE st4(1);
-                                         QUERY PLAN                                          
----------------------------------------------------------------------------------------------
+                                                         QUERY PLAN                                                          
+-----------------------------------------------------------------------------------------------------------------------------
  Foreign Scan on public.ft1 t1
    Output: c1, c2, c3, c4, c5, c6, c7, c8
-   Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" WHERE (("C 1" = 1))
+   Remote SQL: SELECT "C 1" a10, c2 a11, c3 a12, c4 a13, c5 a14, c6 a15, c7 a16, c8 a17 FROM "S 1"."T 1" WHERE (("C 1" = 1))
 (3 rows)
 
 EXPLAIN (VERBOSE, COSTS false) EXECUTE st4(1);
-                                         QUERY PLAN                                          
----------------------------------------------------------------------------------------------
+                                                         QUERY PLAN                                                          
+-----------------------------------------------------------------------------------------------------------------------------
  Foreign Scan on public.ft1 t1
    Output: c1, c2, c3, c4, c5, c6, c7, c8
-   Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" WHERE (("C 1" = 1))
+   Remote SQL: SELECT "C 1" a10, c2 a11, c3 a12, c4 a13, c5 a14, c6 a15, c7 a16, c8 a17 FROM "S 1"."T 1" WHERE (("C 1" = 1))
 (3 rows)
 
 EXPLAIN (VERBOSE, COSTS false) EXECUTE st4(1);
-                                         QUERY PLAN                                          
----------------------------------------------------------------------------------------------
+                                                         QUERY PLAN                                                          
+-----------------------------------------------------------------------------------------------------------------------------
  Foreign Scan on public.ft1 t1
    Output: c1, c2, c3, c4, c5, c6, c7, c8
-   Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" WHERE (("C 1" = 1))
+   Remote SQL: SELECT "C 1" a10, c2 a11, c3 a12, c4 a13, c5 a14, c6 a15, c7 a16, c8 a17 FROM "S 1"."T 1" WHERE (("C 1" = 1))
 (3 rows)
 
 EXPLAIN (VERBOSE, COSTS false) EXECUTE st4(1);
-                                         QUERY PLAN                                          
----------------------------------------------------------------------------------------------
+                                                         QUERY PLAN                                                          
+-----------------------------------------------------------------------------------------------------------------------------
  Foreign Scan on public.ft1 t1
    Output: c1, c2, c3, c4, c5, c6, c7, c8
-   Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" WHERE (("C 1" = 1))
+   Remote SQL: SELECT "C 1" a10, c2 a11, c3 a12, c4 a13, c5 a14, c6 a15, c7 a16, c8 a17 FROM "S 1"."T 1" WHERE (("C 1" = 1))
 (3 rows)
 
 EXPLAIN (VERBOSE, COSTS false) EXECUTE st4(1);
-                                         QUERY PLAN                                          
----------------------------------------------------------------------------------------------
+                                                         QUERY PLAN                                                          
+-----------------------------------------------------------------------------------------------------------------------------
  Foreign Scan on public.ft1 t1
    Output: c1, c2, c3, c4, c5, c6, c7, c8
-   Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" WHERE (("C 1" = 1))
+   Remote SQL: SELECT "C 1" a10, c2 a11, c3 a12, c4 a13, c5 a14, c6 a15, c7 a16, c8 a17 FROM "S 1"."T 1" WHERE (("C 1" = 1))
 (3 rows)
 
 -- once we try it enough times, should switch to generic plan
 EXPLAIN (VERBOSE, COSTS false) EXECUTE st4(1);
-                                              QUERY PLAN                                               
--------------------------------------------------------------------------------------------------------
+                                                              QUERY PLAN                                                               
+---------------------------------------------------------------------------------------------------------------------------------------
  Foreign Scan on public.ft1 t1
    Output: c1, c2, c3, c4, c5, c6, c7, c8
-   Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" WHERE (("C 1" = $1::integer))
+   Remote SQL: SELECT "C 1" a10, c2 a11, c3 a12, c4 a13, c5 a14, c6 a15, c7 a16, c8 a17 FROM "S 1"."T 1" WHERE (("C 1" = $1::integer))
 (3 rows)
 
 -- value of $1 should not be sent to remote
 PREPARE st5(user_enum,int) AS SELECT * FROM ft1 t1 WHERE c8 = $1 and c1 = $2;
 EXPLAIN (VERBOSE, COSTS false) EXECUTE st5('foo', 1);
-                                         QUERY PLAN                                          
----------------------------------------------------------------------------------------------
+                                                         QUERY PLAN                                                          
+-----------------------------------------------------------------------------------------------------------------------------
  Foreign Scan on public.ft1 t1
    Output: c1, c2, c3, c4, c5, c6, c7, c8
    Filter: (t1.c8 = 'foo'::user_enum)
-   Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" WHERE (("C 1" = 1))
+   Remote SQL: SELECT "C 1" a10, c2 a11, c3 a12, c4 a13, c5 a14, c6 a15, c7 a16, c8 a17 FROM "S 1"."T 1" WHERE (("C 1" = 1))
 (4 rows)
 
 EXPLAIN (VERBOSE, COSTS false) EXECUTE st5('foo', 1);
-                                         QUERY PLAN                                          
----------------------------------------------------------------------------------------------
+                                                         QUERY PLAN                                                          
+-----------------------------------------------------------------------------------------------------------------------------
  Foreign Scan on public.ft1 t1
    Output: c1, c2, c3, c4, c5, c6, c7, c8
    Filter: (t1.c8 = 'foo'::user_enum)
-   Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" WHERE (("C 1" = 1))
+   Remote SQL: SELECT "C 1" a10, c2 a11, c3 a12, c4 a13, c5 a14, c6 a15, c7 a16, c8 a17 FROM "S 1"."T 1" WHERE (("C 1" = 1))
 (4 rows)
 
 EXPLAIN (VERBOSE, COSTS false) EXECUTE st5('foo', 1);
-                                         QUERY PLAN                                          
----------------------------------------------------------------------------------------------
+                                                         QUERY PLAN                                                          
+-----------------------------------------------------------------------------------------------------------------------------
  Foreign Scan on public.ft1 t1
    Output: c1, c2, c3, c4, c5, c6, c7, c8
    Filter: (t1.c8 = 'foo'::user_enum)
-   Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" WHERE (("C 1" = 1))
+   Remote SQL: SELECT "C 1" a10, c2 a11, c3 a12, c4 a13, c5 a14, c6 a15, c7 a16, c8 a17 FROM "S 1"."T 1" WHERE (("C 1" = 1))
 (4 rows)
 
 EXPLAIN (VERBOSE, COSTS false) EXECUTE st5('foo', 1);
-                                         QUERY PLAN                                          
----------------------------------------------------------------------------------------------
+                                                         QUERY PLAN                                                          
+-----------------------------------------------------------------------------------------------------------------------------
  Foreign Scan on public.ft1 t1
    Output: c1, c2, c3, c4, c5, c6, c7, c8
    Filter: (t1.c8 = 'foo'::user_enum)
-   Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" WHERE (("C 1" = 1))
+   Remote SQL: SELECT "C 1" a10, c2 a11, c3 a12, c4 a13, c5 a14, c6 a15, c7 a16, c8 a17 FROM "S 1"."T 1" WHERE (("C 1" = 1))
 (4 rows)
 
 EXPLAIN (VERBOSE, COSTS false) EXECUTE st5('foo', 1);
-                                         QUERY PLAN                                          
----------------------------------------------------------------------------------------------
+                                                         QUERY PLAN                                                          
+-----------------------------------------------------------------------------------------------------------------------------
  Foreign Scan on public.ft1 t1
    Output: c1, c2, c3, c4, c5, c6, c7, c8
    Filter: (t1.c8 = 'foo'::user_enum)
-   Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" WHERE (("C 1" = 1))
+   Remote SQL: SELECT "C 1" a10, c2 a11, c3 a12, c4 a13, c5 a14, c6 a15, c7 a16, c8 a17 FROM "S 1"."T 1" WHERE (("C 1" = 1))
 (4 rows)
 
 EXPLAIN (VERBOSE, COSTS false) EXECUTE st5('foo', 1);
-                                              QUERY PLAN                                               
--------------------------------------------------------------------------------------------------------
+                                                              QUERY PLAN                                                               
+---------------------------------------------------------------------------------------------------------------------------------------
  Foreign Scan on public.ft1 t1
    Output: c1, c2, c3, c4, c5, c6, c7, c8
    Filter: (t1.c8 = $1)
-   Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" WHERE (("C 1" = $1::integer))
+   Remote SQL: SELECT "C 1" a10, c2 a11, c3 a12, c4 a13, c5 a14, c6 a15, c7 a16, c8 a17 FROM "S 1"."T 1" WHERE (("C 1" = $1::integer))
 (4 rows)
 
 EXECUTE st5('foo', 1);
@@ -868,14 +1478,14 @@ DEALLOCATE st5;
 -- System columns, except ctid, should not be sent to remote
 EXPLAIN (VERBOSE, COSTS false)
 SELECT * FROM ft1 t1 WHERE t1.tableoid = 'pg_class'::regclass LIMIT 1;
-                                  QUERY PLAN                                   
--------------------------------------------------------------------------------
+                                                  QUERY PLAN                                                   
+---------------------------------------------------------------------------------------------------------------
  Limit
    Output: c1, c2, c3, c4, c5, c6, c7, c8
    ->  Foreign Scan on public.ft1 t1
          Output: c1, c2, c3, c4, c5, c6, c7, c8
          Filter: (t1.tableoid = '1259'::oid)
-         Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1"
+         Remote SQL: SELECT "C 1" a10, c2 a11, c3 a12, c4 a13, c5 a14, c6 a15, c7 a16, c8 a17 FROM "S 1"."T 1"
 (6 rows)
 
 SELECT * FROM ft1 t1 WHERE t1.tableoid = 'ft1'::regclass LIMIT 1;
@@ -886,13 +1496,13 @@ SELECT * FROM ft1 t1 WHERE t1.tableoid = 'ft1'::regclass LIMIT 1;
 
 EXPLAIN (VERBOSE, COSTS false)
 SELECT tableoid::regclass, * FROM ft1 t1 LIMIT 1;
-                                  QUERY PLAN                                   
--------------------------------------------------------------------------------
+                                                  QUERY PLAN                                                   
+---------------------------------------------------------------------------------------------------------------
  Limit
    Output: ((tableoid)::regclass), c1, c2, c3, c4, c5, c6, c7, c8
    ->  Foreign Scan on public.ft1 t1
          Output: (tableoid)::regclass, c1, c2, c3, c4, c5, c6, c7, c8
-         Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1"
+         Remote SQL: SELECT "C 1" a10, c2 a11, c3 a12, c4 a13, c5 a14, c6 a15, c7 a16, c8 a17 FROM "S 1"."T 1"
 (5 rows)
 
 SELECT tableoid::regclass, * FROM ft1 t1 LIMIT 1;
@@ -903,11 +1513,11 @@ SELECT tableoid::regclass, * FROM ft1 t1 LIMIT 1;
 
 EXPLAIN (VERBOSE, COSTS false)
 SELECT * FROM ft1 t1 WHERE t1.ctid = '(0,2)';
-                                              QUERY PLAN                                               
--------------------------------------------------------------------------------------------------------
+                                                              QUERY PLAN                                                               
+---------------------------------------------------------------------------------------------------------------------------------------
  Foreign Scan on public.ft1 t1
    Output: c1, c2, c3, c4, c5, c6, c7, c8
-   Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" WHERE ((ctid = '(0,2)'::tid))
+   Remote SQL: SELECT "C 1" a10, c2 a11, c3 a12, c4 a13, c5 a14, c6 a15, c7 a16, c8 a17 FROM "S 1"."T 1" WHERE ((ctid = '(0,2)'::tid))
 (3 rows)
 
 SELECT * FROM ft1 t1 WHERE t1.ctid = '(0,2)';
@@ -918,13 +1528,13 @@ SELECT * FROM ft1 t1 WHERE t1.ctid = '(0,2)';
 
 EXPLAIN (VERBOSE, COSTS false)
 SELECT ctid, * FROM ft1 t1 LIMIT 1;
-                                     QUERY PLAN                                      
--------------------------------------------------------------------------------------
+                                                       QUERY PLAN                                                       
+------------------------------------------------------------------------------------------------------------------------
  Limit
    Output: ctid, c1, c2, c3, c4, c5, c6, c7, c8
    ->  Foreign Scan on public.ft1 t1
          Output: ctid, c1, c2, c3, c4, c5, c6, c7, c8
-         Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8, ctid FROM "S 1"."T 1"
+         Remote SQL: SELECT "C 1" a10, c2 a11, c3 a12, c4 a13, c5 a14, c6 a15, c7 a16, c8 a17, ctid a7 FROM "S 1"."T 1"
 (5 rows)
 
 SELECT ctid, * FROM ft1 t1 LIMIT 1;
@@ -987,7 +1597,7 @@ FETCH c;
 SAVEPOINT s;
 SELECT * FROM ft1 WHERE 1 / (c1 - 1) > 0;  -- ERROR
 ERROR:  division by zero
-CONTEXT:  Remote SQL command: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" WHERE (((1 / ("C 1" - 1)) > 0))
+CONTEXT:  Remote SQL command: SELECT "C 1" a10, c2 a11, c3 a12, c4 a13, c5 a14, c6 a15, c7 a16, c8 a17 FROM "S 1"."T 1" WHERE (((1 / ("C 1" - 1)) > 0))
 ROLLBACK TO s;
 FETCH c;
  c1 | c2 |  c3   |              c4              |            c5            | c6 |     c7     | c8  
@@ -1010,64 +1620,64 @@ create foreign table ft3 (f1 text collate "C", f2 text)
   server loopback options (table_name 'loct3');
 -- can be sent to remote
 explain (verbose, costs off) select * from ft3 where f1 = 'foo';
-                                QUERY PLAN                                
---------------------------------------------------------------------------
+                                   QUERY PLAN                                    
+---------------------------------------------------------------------------------
  Foreign Scan on public.ft3
    Output: f1, f2
-   Remote SQL: SELECT f1, f2 FROM public.loct3 WHERE ((f1 = 'foo'::text))
+   Remote SQL: SELECT f1 a9, f2 a10 FROM public.loct3 WHERE ((f1 = 'foo'::text))
 (3 rows)
 
 explain (verbose, costs off) select * from ft3 where f1 COLLATE "C" = 'foo';
-                                QUERY PLAN                                
---------------------------------------------------------------------------
+                                   QUERY PLAN                                    
+---------------------------------------------------------------------------------
  Foreign Scan on public.ft3
    Output: f1, f2
-   Remote SQL: SELECT f1, f2 FROM public.loct3 WHERE ((f1 = 'foo'::text))
+   Remote SQL: SELECT f1 a9, f2 a10 FROM public.loct3 WHERE ((f1 = 'foo'::text))
 (3 rows)
 
 explain (verbose, costs off) select * from ft3 where f2 = 'foo';
-                                QUERY PLAN                                
---------------------------------------------------------------------------
+                                   QUERY PLAN                                    
+---------------------------------------------------------------------------------
  Foreign Scan on public.ft3
    Output: f1, f2
-   Remote SQL: SELECT f1, f2 FROM public.loct3 WHERE ((f2 = 'foo'::text))
+   Remote SQL: SELECT f1 a9, f2 a10 FROM public.loct3 WHERE ((f2 = 'foo'::text))
 (3 rows)
 
 -- can't be sent to remote
 explain (verbose, costs off) select * from ft3 where f1 COLLATE "POSIX" = 'foo';
-                  QUERY PLAN                   
------------------------------------------------
+                      QUERY PLAN                      
+------------------------------------------------------
  Foreign Scan on public.ft3
    Output: f1, f2
    Filter: ((ft3.f1)::text = 'foo'::text)
-   Remote SQL: SELECT f1, f2 FROM public.loct3
+   Remote SQL: SELECT f1 a9, f2 a10 FROM public.loct3
 (4 rows)
 
 explain (verbose, costs off) select * from ft3 where f1 = 'foo' COLLATE "C";
-                  QUERY PLAN                   
------------------------------------------------
+                      QUERY PLAN                      
+------------------------------------------------------
  Foreign Scan on public.ft3
    Output: f1, f2
    Filter: (ft3.f1 = 'foo'::text COLLATE "C")
-   Remote SQL: SELECT f1, f2 FROM public.loct3
+   Remote SQL: SELECT f1 a9, f2 a10 FROM public.loct3
 (4 rows)
 
 explain (verbose, costs off) select * from ft3 where f2 COLLATE "C" = 'foo';
-                  QUERY PLAN                   
------------------------------------------------
+                      QUERY PLAN                      
+------------------------------------------------------
  Foreign Scan on public.ft3
    Output: f1, f2
    Filter: ((ft3.f2)::text = 'foo'::text)
-   Remote SQL: SELECT f1, f2 FROM public.loct3
+   Remote SQL: SELECT f1 a9, f2 a10 FROM public.loct3
 (4 rows)
 
 explain (verbose, costs off) select * from ft3 where f2 = 'foo' COLLATE "C";
-                  QUERY PLAN                   
------------------------------------------------
+                      QUERY PLAN                      
+------------------------------------------------------
  Foreign Scan on public.ft3
    Output: f1, f2
    Filter: (ft3.f2 = 'foo'::text COLLATE "C")
-   Remote SQL: SELECT f1, f2 FROM public.loct3
+   Remote SQL: SELECT f1 a9, f2 a10 FROM public.loct3
 (4 rows)
 
 -- ===================================================================
@@ -1085,7 +1695,7 @@ INSERT INTO ft2 (c1,c2,c3) SELECT c1+1000,c2+100, c3 || c3 FROM ft2 LIMIT 20;
                Output: ((ft2_1.c1 + 1000)), ((ft2_1.c2 + 100)), ((ft2_1.c3 || ft2_1.c3))
                ->  Foreign Scan on public.ft2 ft2_1
                      Output: (ft2_1.c1 + 1000), (ft2_1.c2 + 100), (ft2_1.c3 || ft2_1.c3)
-                     Remote SQL: SELECT "C 1", c2, c3 FROM "S 1"."T 1"
+                     Remote SQL: SELECT "C 1" a9, c2 a10, c3 a12 FROM "S 1"."T 1"
 (9 rows)
 
 INSERT INTO ft2 (c1,c2,c3) SELECT c1+1000,c2+100, c3 || c3 FROM ft2 LIMIT 20;
@@ -1210,35 +1820,28 @@ UPDATE ft2 SET c2 = c2 + 400, c3 = c3 || '_update7' WHERE c1 % 10 = 7 RETURNING
 EXPLAIN (verbose, costs off)
 UPDATE ft2 SET c2 = ft2.c2 + 500, c3 = ft2.c3 || '_update9', c7 = DEFAULT
   FROM ft1 WHERE ft1.c1 = ft2.c2 AND ft1.c1 % 10 = 9;
-                                                                            QUERY PLAN                                                                             
--------------------------------------------------------------------------------------------------------------------------------------------------------------------
+                                                                                                                                                                                                                                                                       QUERY PLAN                                                                                                                                                                                                                                                                       
+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
  Update on public.ft2
    Remote SQL: UPDATE "S 1"."T 1" SET c2 = $2, c3 = $3, c7 = $4 WHERE ctid = $1
-   ->  Hash Join
+   ->  Foreign Scan
          Output: ft2.c1, (ft2.c2 + 500), NULL::integer, (ft2.c3 || '_update9'::text), ft2.c4, ft2.c5, ft2.c6, 'ft2       '::character(10), ft2.c8, ft2.ctid, ft1.*
-         Hash Cond: (ft2.c2 = ft1.c1)
-         ->  Foreign Scan on public.ft2
-               Output: ft2.c1, ft2.c2, ft2.c3, ft2.c4, ft2.c5, ft2.c6, ft2.c8, ft2.ctid
-               Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c8, ctid FROM "S 1"."T 1" FOR UPDATE
-         ->  Hash
-               Output: ft1.*, ft1.c1
-               ->  Foreign Scan on public.ft1
-                     Output: ft1.*, ft1.c1
-                     Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" WHERE ((("C 1" % 10) = 9))
-(13 rows)
+         Relations: (public.ft2) INNER JOIN (public.ft1)
+         Remote SQL: SELECT l.a1, l.a2, l.a3, l.a4, l.a5, l.a6, l.a7, l.a8, r.a1 FROM (SELECT l.a9, l.a10, l.a12, l.a13, l.a14, l.a15, l.a17, l.a7 FROM (SELECT "C 1" a9, c2 a10, c3 a12, c4 a13, c5 a14, c6 a15, c8 a17, ctid a7 FROM "S 1"."T 1" FOR UPDATE) l) l (a1, a2, a3, a4, a5, a6, a7, a8) INNER JOIN (SELECT ROW(r.a10, r.a11, r.a12, r.a13, r.a14, r.a15, r.a16, r.a17), r.a10 FROM (SELECT "C 1" a10, c2 a11, c3 a12, c4 a13, c5 a14, c6 a15, c7 a16, c8 a17 FROM "S 1"."T 1" WHERE ((("C 1" % 10) = 9))) r) r (a1, a2) ON ((l.a2 = r.a2))
+(6 rows)
 
 UPDATE ft2 SET c2 = ft2.c2 + 500, c3 = ft2.c3 || '_update9', c7 = DEFAULT
   FROM ft1 WHERE ft1.c1 = ft2.c2 AND ft1.c1 % 10 = 9;
 EXPLAIN (verbose, costs off)
   DELETE FROM ft2 WHERE c1 % 10 = 5 RETURNING c1, c4;
-                                       QUERY PLAN                                       
-----------------------------------------------------------------------------------------
+                                        QUERY PLAN                                         
+-------------------------------------------------------------------------------------------
  Delete on public.ft2
    Output: c1, c4
-   Remote SQL: DELETE FROM "S 1"."T 1" WHERE ctid = $1 RETURNING "C 1", c4
+   Remote SQL: DELETE FROM "S 1"."T 1" WHERE ctid = $1 RETURNING "C 1" a9, c4 a13
    ->  Foreign Scan on public.ft2
          Output: ctid
-         Remote SQL: SELECT ctid FROM "S 1"."T 1" WHERE ((("C 1" % 10) = 5)) FOR UPDATE
+         Remote SQL: SELECT ctid a7 FROM "S 1"."T 1" WHERE ((("C 1" % 10) = 5)) FOR UPDATE
 (6 rows)
 
 DELETE FROM ft2 WHERE c1 % 10 = 5 RETURNING c1, c4;
@@ -1351,22 +1954,15 @@ DELETE FROM ft2 WHERE c1 % 10 = 5 RETURNING c1, c4;
 
 EXPLAIN (verbose, costs off)
 DELETE FROM ft2 USING ft1 WHERE ft1.c1 = ft2.c2 AND ft1.c1 % 10 = 2;
-                                                      QUERY PLAN                                                      
-----------------------------------------------------------------------------------------------------------------------
+                                                                                                                                                                                        QUERY PLAN                                                                                                                                                                                         
+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
  Delete on public.ft2
    Remote SQL: DELETE FROM "S 1"."T 1" WHERE ctid = $1
-   ->  Hash Join
+   ->  Foreign Scan
          Output: ft2.ctid, ft1.*
-         Hash Cond: (ft2.c2 = ft1.c1)
-         ->  Foreign Scan on public.ft2
-               Output: ft2.ctid, ft2.c2
-               Remote SQL: SELECT c2, ctid FROM "S 1"."T 1" FOR UPDATE
-         ->  Hash
-               Output: ft1.*, ft1.c1
-               ->  Foreign Scan on public.ft1
-                     Output: ft1.*, ft1.c1
-                     Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" WHERE ((("C 1" % 10) = 2))
-(13 rows)
+         Relations: (public.ft2) INNER JOIN (public.ft1)
+         Remote SQL: SELECT l.a1, r.a1 FROM (SELECT l.a7, l.a10 FROM (SELECT c2 a10, ctid a7 FROM "S 1"."T 1" FOR UPDATE) l) l (a1, a2) INNER JOIN (SELECT ROW(r.a10, r.a11, r.a12, r.a13, r.a14, r.a15, r.a16, r.a17), r.a10 FROM (SELECT "C 1" a10, c2 a11, c3 a12, c4 a13, c5 a14, c6 a15, c7 a16, c8 a17 FROM "S 1"."T 1" WHERE ((("C 1" % 10) = 2))) r) r (a1, a2) ON ((l.a2 = r.a2))
+(6 rows)
 
 DELETE FROM ft2 USING ft1 WHERE ft1.c1 = ft2.c2 AND ft1.c1 % 10 = 2;
 SELECT c1,c2,c3,c4 FROM ft2 ORDER BY c1;
@@ -3027,386 +3623,6 @@ NOTICE:  NEW: (13,"test triggered !")
 (1 row)
 
 -- ===================================================================
--- test inheritance features
--- ===================================================================
-CREATE TABLE a (aa TEXT);
-CREATE TABLE loct (aa TEXT, bb TEXT);
-CREATE FOREIGN TABLE b (bb TEXT) INHERITS (a)
-  SERVER loopback OPTIONS (table_name 'loct');
-INSERT INTO a(aa) VALUES('aaa');
-INSERT INTO a(aa) VALUES('aaaa');
-INSERT INTO a(aa) VALUES('aaaaa');
-INSERT INTO b(aa) VALUES('bbb');
-INSERT INTO b(aa) VALUES('bbbb');
-INSERT INTO b(aa) VALUES('bbbbb');
-SELECT tableoid::regclass, * FROM a;
- tableoid |  aa   
-----------+-------
- a        | aaa
- a        | aaaa
- a        | aaaaa
- b        | bbb
- b        | bbbb
- b        | bbbbb
-(6 rows)
-
-SELECT tableoid::regclass, * FROM b;
- tableoid |  aa   | bb 
-----------+-------+----
- b        | bbb   | 
- b        | bbbb  | 
- b        | bbbbb | 
-(3 rows)
-
-SELECT tableoid::regclass, * FROM ONLY a;
- tableoid |  aa   
-----------+-------
- a        | aaa
- a        | aaaa
- a        | aaaaa
-(3 rows)
-
-UPDATE a SET aa = 'zzzzzz' WHERE aa LIKE 'aaaa%';
-SELECT tableoid::regclass, * FROM a;
- tableoid |   aa   
-----------+--------
- a        | aaa
- a        | zzzzzz
- a        | zzzzzz
- b        | bbb
- b        | bbbb
- b        | bbbbb
-(6 rows)
-
-SELECT tableoid::regclass, * FROM b;
- tableoid |  aa   | bb 
-----------+-------+----
- b        | bbb   | 
- b        | bbbb  | 
- b        | bbbbb | 
-(3 rows)
-
-SELECT tableoid::regclass, * FROM ONLY a;
- tableoid |   aa   
-----------+--------
- a        | aaa
- a        | zzzzzz
- a        | zzzzzz
-(3 rows)
-
-UPDATE b SET aa = 'new';
-SELECT tableoid::regclass, * FROM a;
- tableoid |   aa   
-----------+--------
- a        | aaa
- a        | zzzzzz
- a        | zzzzzz
- b        | new
- b        | new
- b        | new
-(6 rows)
-
-SELECT tableoid::regclass, * FROM b;
- tableoid | aa  | bb 
-----------+-----+----
- b        | new | 
- b        | new | 
- b        | new | 
-(3 rows)
-
-SELECT tableoid::regclass, * FROM ONLY a;
- tableoid |   aa   
-----------+--------
- a        | aaa
- a        | zzzzzz
- a        | zzzzzz
-(3 rows)
-
-UPDATE a SET aa = 'newtoo';
-SELECT tableoid::regclass, * FROM a;
- tableoid |   aa   
-----------+--------
- a        | newtoo
- a        | newtoo
- a        | newtoo
- b        | newtoo
- b        | newtoo
- b        | newtoo
-(6 rows)
-
-SELECT tableoid::regclass, * FROM b;
- tableoid |   aa   | bb 
-----------+--------+----
- b        | newtoo | 
- b        | newtoo | 
- b        | newtoo | 
-(3 rows)
-
-SELECT tableoid::regclass, * FROM ONLY a;
- tableoid |   aa   
-----------+--------
- a        | newtoo
- a        | newtoo
- a        | newtoo
-(3 rows)
-
-DELETE FROM a;
-SELECT tableoid::regclass, * FROM a;
- tableoid | aa 
-----------+----
-(0 rows)
-
-SELECT tableoid::regclass, * FROM b;
- tableoid | aa | bb 
-----------+----+----
-(0 rows)
-
-SELECT tableoid::regclass, * FROM ONLY a;
- tableoid | aa 
-----------+----
-(0 rows)
-
-DROP TABLE a CASCADE;
-NOTICE:  drop cascades to foreign table b
-DROP TABLE loct;
--- Check SELECT FOR UPDATE/SHARE with an inherited source table
-create table loct1 (f1 int, f2 int, f3 int);
-create table loct2 (f1 int, f2 int, f3 int);
-create table foo (f1 int, f2 int);
-create foreign table foo2 (f3 int) inherits (foo)
-  server loopback options (table_name 'loct1');
-create table bar (f1 int, f2 int);
-create foreign table bar2 (f3 int) inherits (bar)
-  server loopback options (table_name 'loct2');
-insert into foo values(1,1);
-insert into foo values(3,3);
-insert into foo2 values(2,2,2);
-insert into foo2 values(4,4,4);
-insert into bar values(1,11);
-insert into bar values(2,22);
-insert into bar values(6,66);
-insert into bar2 values(3,33,33);
-insert into bar2 values(4,44,44);
-insert into bar2 values(7,77,77);
-explain (verbose, costs off)
-select * from bar where f1 in (select f1 from foo) for update;
-                                          QUERY PLAN                                          
-----------------------------------------------------------------------------------------------
- LockRows
-   Output: bar.f1, bar.f2, bar.ctid, bar.tableoid, bar.*, foo.ctid, foo.tableoid, foo.*
-   ->  Hash Join
-         Output: bar.f1, bar.f2, bar.ctid, bar.tableoid, bar.*, foo.ctid, foo.tableoid, foo.*
-         Hash Cond: (bar.f1 = foo.f1)
-         ->  Append
-               ->  Seq Scan on public.bar
-                     Output: bar.f1, bar.f2, bar.ctid, bar.tableoid, bar.*
-               ->  Foreign Scan on public.bar2
-                     Output: bar2.f1, bar2.f2, bar2.ctid, bar2.tableoid, bar2.*
-                     Remote SQL: SELECT f1, f2, f3, ctid FROM public.loct2 FOR UPDATE
-         ->  Hash
-               Output: foo.ctid, foo.tableoid, foo.*, foo.f1
-               ->  HashAggregate
-                     Output: foo.ctid, foo.tableoid, foo.*, foo.f1
-                     Group Key: foo.f1
-                     ->  Append
-                           ->  Seq Scan on public.foo
-                                 Output: foo.ctid, foo.tableoid, foo.*, foo.f1
-                           ->  Foreign Scan on public.foo2
-                                 Output: foo2.ctid, foo2.tableoid, foo2.*, foo2.f1
-                                 Remote SQL: SELECT f1, f2, f3, ctid FROM public.loct1
-(22 rows)
-
-select * from bar where f1 in (select f1 from foo) for update;
- f1 | f2 
-----+----
-  1 | 11
-  2 | 22
-  3 | 33
-  4 | 44
-(4 rows)
-
-explain (verbose, costs off)
-select * from bar where f1 in (select f1 from foo) for share;
-                                          QUERY PLAN                                          
-----------------------------------------------------------------------------------------------
- LockRows
-   Output: bar.f1, bar.f2, bar.ctid, bar.tableoid, bar.*, foo.ctid, foo.tableoid, foo.*
-   ->  Hash Join
-         Output: bar.f1, bar.f2, bar.ctid, bar.tableoid, bar.*, foo.ctid, foo.tableoid, foo.*
-         Hash Cond: (bar.f1 = foo.f1)
-         ->  Append
-               ->  Seq Scan on public.bar
-                     Output: bar.f1, bar.f2, bar.ctid, bar.tableoid, bar.*
-               ->  Foreign Scan on public.bar2
-                     Output: bar2.f1, bar2.f2, bar2.ctid, bar2.tableoid, bar2.*
-                     Remote SQL: SELECT f1, f2, f3, ctid FROM public.loct2 FOR SHARE
-         ->  Hash
-               Output: foo.ctid, foo.tableoid, foo.*, foo.f1
-               ->  HashAggregate
-                     Output: foo.ctid, foo.tableoid, foo.*, foo.f1
-                     Group Key: foo.f1
-                     ->  Append
-                           ->  Seq Scan on public.foo
-                                 Output: foo.ctid, foo.tableoid, foo.*, foo.f1
-                           ->  Foreign Scan on public.foo2
-                                 Output: foo2.ctid, foo2.tableoid, foo2.*, foo2.f1
-                                 Remote SQL: SELECT f1, f2, f3, ctid FROM public.loct1
-(22 rows)
-
-select * from bar where f1 in (select f1 from foo) for share;
- f1 | f2 
-----+----
-  1 | 11
-  2 | 22
-  3 | 33
-  4 | 44
-(4 rows)
-
--- Check UPDATE with inherited target and an inherited source table
-explain (verbose, costs off)
-update bar set f2 = f2 + 100 where f1 in (select f1 from foo);
-                                         QUERY PLAN                                          
----------------------------------------------------------------------------------------------
- Update on public.bar
-   Update on public.bar
-   Foreign Update on public.bar2
-     Remote SQL: UPDATE public.loct2 SET f2 = $2 WHERE ctid = $1
-   ->  Hash Join
-         Output: bar.f1, (bar.f2 + 100), bar.ctid, foo.ctid, foo.tableoid, foo.*
-         Hash Cond: (bar.f1 = foo.f1)
-         ->  Seq Scan on public.bar
-               Output: bar.f1, bar.f2, bar.ctid
-         ->  Hash
-               Output: foo.ctid, foo.tableoid, foo.*, foo.f1
-               ->  HashAggregate
-                     Output: foo.ctid, foo.tableoid, foo.*, foo.f1
-                     Group Key: foo.f1
-                     ->  Append
-                           ->  Seq Scan on public.foo
-                                 Output: foo.ctid, foo.tableoid, foo.*, foo.f1
-                           ->  Foreign Scan on public.foo2
-                                 Output: foo2.ctid, foo2.tableoid, foo2.*, foo2.f1
-                                 Remote SQL: SELECT f1, f2, f3, ctid FROM public.loct1
-   ->  Hash Join
-         Output: bar2.f1, (bar2.f2 + 100), bar2.f3, bar2.ctid, foo.ctid, foo.tableoid, foo.*
-         Hash Cond: (bar2.f1 = foo.f1)
-         ->  Foreign Scan on public.bar2
-               Output: bar2.f1, bar2.f2, bar2.f3, bar2.ctid
-               Remote SQL: SELECT f1, f2, f3, ctid FROM public.loct2 FOR UPDATE
-         ->  Hash
-               Output: foo.ctid, foo.tableoid, foo.*, foo.f1
-               ->  HashAggregate
-                     Output: foo.ctid, foo.tableoid, foo.*, foo.f1
-                     Group Key: foo.f1
-                     ->  Append
-                           ->  Seq Scan on public.foo
-                                 Output: foo.ctid, foo.tableoid, foo.*, foo.f1
-                           ->  Foreign Scan on public.foo2
-                                 Output: foo2.ctid, foo2.tableoid, foo2.*, foo2.f1
-                                 Remote SQL: SELECT f1, f2, f3, ctid FROM public.loct1
-(37 rows)
-
-update bar set f2 = f2 + 100 where f1 in (select f1 from foo);
-select tableoid::regclass, * from bar order by 1,2;
- tableoid | f1 | f2  
-----------+----+-----
- bar      |  1 | 111
- bar      |  2 | 122
- bar      |  6 |  66
- bar2     |  3 | 133
- bar2     |  4 | 144
- bar2     |  7 |  77
-(6 rows)
-
--- Check UPDATE with inherited target and an appendrel subquery
-explain (verbose, costs off)
-update bar set f2 = f2 + 100
-from
-  ( select f1 from foo union all select f1+3 from foo ) ss
-where bar.f1 = ss.f1;
-                                      QUERY PLAN                                      
---------------------------------------------------------------------------------------
- Update on public.bar
-   Update on public.bar
-   Foreign Update on public.bar2
-     Remote SQL: UPDATE public.loct2 SET f2 = $2 WHERE ctid = $1
-   ->  Hash Join
-         Output: bar.f1, (bar.f2 + 100), bar.ctid, (ROW(foo.f1))
-         Hash Cond: (foo.f1 = bar.f1)
-         ->  Append
-               ->  Seq Scan on public.foo
-                     Output: ROW(foo.f1), foo.f1
-               ->  Foreign Scan on public.foo2
-                     Output: ROW(foo2.f1), foo2.f1
-                     Remote SQL: SELECT f1 FROM public.loct1
-               ->  Seq Scan on public.foo foo_1
-                     Output: ROW((foo_1.f1 + 3)), (foo_1.f1 + 3)
-               ->  Foreign Scan on public.foo2 foo2_1
-                     Output: ROW((foo2_1.f1 + 3)), (foo2_1.f1 + 3)
-                     Remote SQL: SELECT f1 FROM public.loct1
-         ->  Hash
-               Output: bar.f1, bar.f2, bar.ctid
-               ->  Seq Scan on public.bar
-                     Output: bar.f1, bar.f2, bar.ctid
-   ->  Merge Join
-         Output: bar2.f1, (bar2.f2 + 100), bar2.f3, bar2.ctid, (ROW(foo.f1))
-         Merge Cond: (bar2.f1 = foo.f1)
-         ->  Sort
-               Output: bar2.f1, bar2.f2, bar2.f3, bar2.ctid
-               Sort Key: bar2.f1
-               ->  Foreign Scan on public.bar2
-                     Output: bar2.f1, bar2.f2, bar2.f3, bar2.ctid
-                     Remote SQL: SELECT f1, f2, f3, ctid FROM public.loct2 FOR UPDATE
-         ->  Sort
-               Output: (ROW(foo.f1)), foo.f1
-               Sort Key: foo.f1
-               ->  Append
-                     ->  Seq Scan on public.foo
-                           Output: ROW(foo.f1), foo.f1
-                     ->  Foreign Scan on public.foo2
-                           Output: ROW(foo2.f1), foo2.f1
-                           Remote SQL: SELECT f1 FROM public.loct1
-                     ->  Seq Scan on public.foo foo_1
-                           Output: ROW((foo_1.f1 + 3)), (foo_1.f1 + 3)
-                     ->  Foreign Scan on public.foo2 foo2_1
-                           Output: ROW((foo2_1.f1 + 3)), (foo2_1.f1 + 3)
-                           Remote SQL: SELECT f1 FROM public.loct1
-(45 rows)
-
-update bar set f2 = f2 + 100
-from
-  ( select f1 from foo union all select f1+3 from foo ) ss
-where bar.f1 = ss.f1;
-select tableoid::regclass, * from bar order by 1,2;
- tableoid | f1 | f2  
-----------+----+-----
- bar      |  1 | 211
- bar      |  2 | 222
- bar      |  6 | 166
- bar2     |  3 | 233
- bar2     |  4 | 244
- bar2     |  7 | 177
-(6 rows)
-
--- Test that WHERE CURRENT OF is not supported
-begin;
-declare c cursor for select * from bar where f1 = 7;
-fetch from c;
- f1 | f2  
-----+-----
-  7 | 177
-(1 row)
-
-update bar set f2 = null where current of c;
-ERROR:  WHERE CURRENT OF is not supported for this table type
-rollback;
-drop table foo cascade;
-NOTICE:  drop cascades to foreign table foo2
-drop table bar cascade;
-NOTICE:  drop cascades to foreign table bar2
-drop table loct1;
-drop table loct2;
--- ===================================================================
 -- test IMPORT FOREIGN SCHEMA
 -- ===================================================================
 CREATE SCHEMA import_source;
@@ -3636,3 +3852,6 @@ QUERY:  CREATE FOREIGN TABLE t5 (
 OPTIONS (schema_name 'import_source', table_name 't5');
 CONTEXT:  importing foreign table "t5"
 ROLLBACK;
+-- Cleanup
+DROP OWNED BY view_owner;
+DROP USER view_owner;
diff --git a/contrib/postgres_fdw/postgres_fdw.c b/contrib/postgres_fdw/postgres_fdw.c
index 478e124..de64627 100644
--- a/contrib/postgres_fdw/postgres_fdw.c
+++ b/contrib/postgres_fdw/postgres_fdw.c
@@ -28,7 +28,6 @@
 #include "optimizer/pathnode.h"
 #include "optimizer/paths.h"
 #include "optimizer/planmain.h"
-#include "optimizer/prep.h"
 #include "optimizer/restrictinfo.h"
 #include "optimizer/var.h"
 #include "parser/parsetree.h"
@@ -47,41 +46,8 @@ PG_MODULE_MAGIC;
 #define DEFAULT_FDW_TUPLE_COST		0.01
 
 /*
- * FDW-specific planner information kept in RelOptInfo.fdw_private for a
- * foreign table.  This information is collected by postgresGetForeignRelSize.
- */
-typedef struct PgFdwRelationInfo
-{
-	/* baserestrictinfo clauses, broken down into safe and unsafe subsets. */
-	List	   *remote_conds;
-	List	   *local_conds;
-
-	/* Bitmap of attr numbers we need to fetch from the remote server. */
-	Bitmapset  *attrs_used;
-
-	/* Cost and selectivity of local_conds. */
-	QualCost	local_conds_cost;
-	Selectivity local_conds_sel;
-
-	/* Estimated size and cost for a scan with baserestrictinfo quals. */
-	double		rows;
-	int			width;
-	Cost		startup_cost;
-	Cost		total_cost;
-
-	/* Options extracted from catalogs. */
-	bool		use_remote_estimate;
-	Cost		fdw_startup_cost;
-	Cost		fdw_tuple_cost;
-
-	/* Cached catalog information. */
-	ForeignTable *table;
-	ForeignServer *server;
-	UserMapping *user;			/* only set in use_remote_estimate mode */
-} PgFdwRelationInfo;
-
-/*
- * Indexes of FDW-private information stored in fdw_private lists.
+ * Indexes of FDW-private information stored in fdw_private of ForeignScan of
+ * a simple foreign table scan for a SELECT statement.
  *
  * We store various information in ForeignScan.fdw_private to pass it from
  * planner to executor.  Currently we store:
@@ -98,7 +64,13 @@ enum FdwScanPrivateIndex
 	/* SQL statement to execute remotely (as a String node) */
 	FdwScanPrivateSelectSql,
 	/* Integer list of attribute numbers retrieved by the SELECT */
-	FdwScanPrivateRetrievedAttrs
+	FdwScanPrivateRetrievedAttrs,
+	/* OID of the foreign server for the scan (as an Integer node) */
+	FdwScanPrivateServerOid,
+	/* OID of the effective user for the scan (as an Integer node) */
+	FdwScanPrivateUserOid,
+	/* Names of the relations scanned; added only when the scan is a join */
+	FdwScanPrivateRelations,
 };
 
 /*
@@ -128,7 +100,8 @@ enum FdwModifyPrivateIndex
  */
 typedef struct PgFdwScanState
 {
-	Relation	rel;			/* relcache entry for the foreign table */
+	const char *relname;		/* name of relation being scanned */
+	TupleDesc	tupdesc;		/* tuple descriptor of the scan */
 	AttInMetadata *attinmeta;	/* attribute datatype conversion metadata */
 
 	/* extracted fdw_private data */
@@ -194,6 +167,8 @@ typedef struct PgFdwAnalyzeState
 	AttInMetadata *attinmeta;	/* attribute datatype conversion metadata */
 	List	   *retrieved_attrs;	/* attr numbers retrieved by query */
 
+	char	   *query;			/* text of SELECT command */
+
 	/* collected sample rows */
 	HeapTuple  *rows;			/* array of size targrows */
 	int			targrows;		/* target # of sample rows */
@@ -214,7 +189,10 @@ typedef struct PgFdwAnalyzeState
  */
 typedef struct ConversionLocation
 {
-	Relation	rel;			/* foreign table's relcache entry */
+	const char *relname;		/* name of relation being processed, or NULL for
+								   a foreign join */
+	const char *query;			/* query being processed */
+	TupleDesc	tupdesc;		/* tuple descriptor for attribute names */
 	AttrNumber	cur_attno;		/* attribute number being processed, or 0 */
 } ConversionLocation;
 
@@ -288,6 +266,12 @@ static bool postgresAnalyzeForeignTable(Relation relation,
 							BlockNumber *totalpages);
 static List *postgresImportForeignSchema(ImportForeignSchemaStmt *stmt,
 							Oid serverOid);
+static void postgresGetForeignJoinPaths(PlannerInfo *root,
+						   RelOptInfo *joinrel,
+						   RelOptInfo *outerrel,
+						   RelOptInfo *innerrel,
+						   SpecialJoinInfo *sjinfo,
+						   List *restrictlist);
 
 /*
  * Helper functions
@@ -323,12 +307,40 @@ static void analyze_row_processor(PGresult *res, int row,
 					  PgFdwAnalyzeState *astate);
 static HeapTuple make_tuple_from_result_row(PGresult *res,
 						   int row,
-						   Relation rel,
+						   const char *relname,
+						   const char *query,
+						   TupleDesc tupdesc,
 						   AttInMetadata *attinmeta,
 						   List *retrieved_attrs,
 						   MemoryContext temp_context);
 static void conversion_error_callback(void *arg);
 
+/*
+ * Describe a Bitmapset as a comma-separated list of integers.
+ * For debugging purposes only.
+ * XXX Can this become a member of bitmapset.c?
+ */
+static char *
+bms_to_str(Bitmapset *bmp)
+{
+	StringInfoData buf;
+	bool		first = true;
+	int			x;
+
+	initStringInfo(&buf);
+
+	x = -1;
+	while ((x = bms_next_member(bmp, x)) >= 0)
+	{
+		if (!first)
+			appendStringInfoString(&buf, ", ");
+		appendStringInfo(&buf, "%d", x);
+
+		first = false;
+	}
+
+	return buf.data;
+}
 
 /*
  * Foreign-data wrapper handler function: return a struct with pointers
@@ -368,6 +380,9 @@ postgres_fdw_handler(PG_FUNCTION_ARGS)
 	/* Support functions for IMPORT FOREIGN SCHEMA */
 	routine->ImportForeignSchema = postgresImportForeignSchema;
 
+	/* Support functions for join push-down */
+	routine->GetForeignJoinPaths = postgresGetForeignJoinPaths;
+
 	PG_RETURN_POINTER(routine);
 }
 
@@ -383,7 +398,9 @@ postgresGetForeignRelSize(PlannerInfo *root,
 						  RelOptInfo *baserel,
 						  Oid foreigntableid)
 {
+	RangeTblEntry *rte;
 	PgFdwRelationInfo *fpinfo;
+	ForeignTable *table;
 	ListCell   *lc;
 
 	/*
@@ -394,8 +411,8 @@ postgresGetForeignRelSize(PlannerInfo *root,
 	baserel->fdw_private = (void *) fpinfo;
 
 	/* Look up foreign-table catalog info. */
-	fpinfo->table = GetForeignTable(foreigntableid);
-	fpinfo->server = GetForeignServer(fpinfo->table->serverid);
+	table = GetForeignTable(foreigntableid);
+	fpinfo->server = GetForeignServer(table->serverid);
 
 	/*
 	 * Extract user-settable option values.  Note that per-table setting of
@@ -416,7 +433,7 @@ postgresGetForeignRelSize(PlannerInfo *root,
 		else if (strcmp(def->defname, "fdw_tuple_cost") == 0)
 			fpinfo->fdw_tuple_cost = strtod(defGetString(def), NULL);
 	}
-	foreach(lc, fpinfo->table->options)
+	foreach(lc, table->options)
 	{
 		DefElem    *def = (DefElem *) lfirst(lc);
 
@@ -428,20 +445,12 @@ postgresGetForeignRelSize(PlannerInfo *root,
 	}
 
 	/*
-	 * If the table or the server is configured to use remote estimates,
-	 * identify which user to do remote access as during planning.  This
+	 * Identify which user to do remote access as during planning.  This
 	 * should match what ExecCheckRTEPerms() does.  If we fail due to lack of
 	 * permissions, the query would have failed at runtime anyway.
 	 */
-	if (fpinfo->use_remote_estimate)
-	{
-		RangeTblEntry *rte = planner_rt_fetch(baserel->relid, root);
-		Oid			userid = rte->checkAsUser ? rte->checkAsUser : GetUserId();
-
-		fpinfo->user = GetUserMapping(userid, fpinfo->server->serverid);
-	}
-	else
-		fpinfo->user = NULL;
+	rte = planner_rt_fetch(baserel->relid, root);
+	fpinfo->userid = rte->checkAsUser ? rte->checkAsUser : GetUserId();
 
 	/*
 	 * Identify which baserestrictinfo clauses can be sent to the remote
@@ -463,10 +472,9 @@ postgresGetForeignRelSize(PlannerInfo *root,
 				   &fpinfo->attrs_used);
 	foreach(lc, fpinfo->local_conds)
 	{
-		RestrictInfo *rinfo = (RestrictInfo *) lfirst(lc);
+		Expr *expr = (Expr *) lfirst(lc);
 
-		pull_varattnos((Node *) rinfo->clause, baserel->relid,
-					   &fpinfo->attrs_used);
+		pull_varattnos((Node *) expr, baserel->relid, &fpinfo->attrs_used);
 	}
 
 	/*
@@ -752,6 +760,9 @@ postgresGetForeignPlan(PlannerInfo *root,
 	List	   *retrieved_attrs;
 	StringInfoData sql;
 	ListCell   *lc;
+	List	   *fdw_ps_tlist = NIL;
+	ForeignScan *scan;
+	StringInfoData relations;
 
 	/*
 	 * Separate the scan_clauses into those that can be executed remotely and
@@ -768,9 +779,6 @@ postgresGetForeignPlan(PlannerInfo *root,
 	 *
 	 * This code must match "extract_actual_clauses(scan_clauses, false)"
 	 * except for the additional decision about remote versus local execution.
-	 * Note however that we only strip the RestrictInfo nodes from the
-	 * local_exprs list, since appendWhereClause expects a list of
-	 * RestrictInfos.
 	 */
 	foreach(lc, scan_clauses)
 	{
@@ -783,82 +791,37 @@ postgresGetForeignPlan(PlannerInfo *root,
 			continue;
 
 		if (list_member_ptr(fpinfo->remote_conds, rinfo))
-			remote_conds = lappend(remote_conds, rinfo);
+			remote_conds = lappend(remote_conds, rinfo->clause);
 		else if (list_member_ptr(fpinfo->local_conds, rinfo))
 			local_exprs = lappend(local_exprs, rinfo->clause);
 		else if (is_foreign_expr(root, baserel, rinfo->clause))
-			remote_conds = lappend(remote_conds, rinfo);
+			remote_conds = lappend(remote_conds, rinfo->clause);
 		else
 			local_exprs = lappend(local_exprs, rinfo->clause);
 	}
 
 	/*
 	 * Build the query string to be sent for execution, and identify
-	 * expressions to be sent as parameters.
+	 * expressions to be sent as parameters.  If the relation to scan is a join
+	 * relation, also receive the relations string built by deparseSelectSql.
 	 */
 	initStringInfo(&sql);
-	deparseSelectSql(&sql, root, baserel, fpinfo->attrs_used,
-					 &retrieved_attrs);
-	if (remote_conds)
-		appendWhereClause(&sql, root, baserel, remote_conds,
-						  true, &params_list);
-
-	/*
-	 * Add FOR UPDATE/SHARE if appropriate.  We apply locking during the
-	 * initial row fetch, rather than later on as is done for local tables.
-	 * The extra roundtrips involved in trying to duplicate the local
-	 * semantics exactly don't seem worthwhile (see also comments for
-	 * RowMarkType).
-	 *
-	 * Note: because we actually run the query as a cursor, this assumes that
-	 * DECLARE CURSOR ... FOR UPDATE is supported, which it isn't before 8.3.
-	 */
-	if (baserel->relid == root->parse->resultRelation &&
-		(root->parse->commandType == CMD_UPDATE ||
-		 root->parse->commandType == CMD_DELETE))
-	{
-		/* Relation is UPDATE/DELETE target, so use FOR UPDATE */
-		appendStringInfoString(&sql, " FOR UPDATE");
-	}
-	else
-	{
-		PlanRowMark *rc = get_plan_rowmark(root->rowMarks, baserel->relid);
-
-		if (rc)
-		{
-			/*
-			 * Relation is specified as a FOR UPDATE/SHARE target, so handle
-			 * that.  (But we could also see LCS_NONE, meaning this isn't a
-			 * target relation after all.)
-			 *
-			 * For now, just ignore any [NO] KEY specification, since (a) it's
-			 * not clear what that means for a remote table that we don't have
-			 * complete information about, and (b) it wouldn't work anyway on
-			 * older remote servers.  Likewise, we don't worry about NOWAIT.
-			 */
-			switch (rc->strength)
-			{
-				case LCS_NONE:
-					/* No locking needed */
-					break;
-				case LCS_FORKEYSHARE:
-				case LCS_FORSHARE:
-					appendStringInfoString(&sql, " FOR SHARE");
-					break;
-				case LCS_FORNOKEYUPDATE:
-				case LCS_FORUPDATE:
-					appendStringInfoString(&sql, " FOR UPDATE");
-					break;
-			}
-		}
-	}
+	if (baserel->reloptkind == RELOPT_JOINREL)
+		initStringInfo(&relations);
+	deparseSelectSql(&sql, root, baserel, fpinfo->attrs_used, remote_conds,
+					 &params_list, &fdw_ps_tlist, &retrieved_attrs,
+					 baserel->reloptkind == RELOPT_JOINREL ? &relations : NULL);
 
 	/*
-	 * Build the fdw_private list that will be available to the executor.
+	 * Build the fdw_private list that will be available in the executor.
 	 * Items in the list must match enum FdwScanPrivateIndex, above.
 	 */
-	fdw_private = list_make2(makeString(sql.data),
-							 retrieved_attrs);
+	fdw_private = list_make4(makeString(sql.data),
+							 retrieved_attrs,
+							 makeInteger(fpinfo->server->serverid),
+							 makeInteger(fpinfo->userid));
+	if (baserel->reloptkind == RELOPT_JOINREL)
+		fdw_private = lappend(fdw_private, makeString(relations.data));
 
 	/*
 	 * Create the ForeignScan node from target list, local filtering
@@ -868,11 +831,18 @@ postgresGetForeignPlan(PlannerInfo *root,
 	 * field of the finished plan node; we can't keep them in private state
 	 * because then they wouldn't be subject to later planner processing.
 	 */
-	return make_foreignscan(tlist,
+	scan = make_foreignscan(tlist,
 							local_exprs,
 							scan_relid,
 							params_list,
 							fdw_private);
+
+	/*
+	 * Set fdw_ps_tlist to handle tuples generated by this scan.
+	 */
+	scan->fdw_ps_tlist = fdw_ps_tlist;
+
+	return scan;
 }
 
 /*
@@ -885,9 +855,8 @@ postgresBeginForeignScan(ForeignScanState *node, int eflags)
 	ForeignScan *fsplan = (ForeignScan *) node->ss.ps.plan;
 	EState	   *estate = node->ss.ps.state;
 	PgFdwScanState *fsstate;
-	RangeTblEntry *rte;
+	Oid			serverid;
 	Oid			userid;
-	ForeignTable *table;
 	ForeignServer *server;
 	UserMapping *user;
 	int			numParams;
@@ -907,22 +876,13 @@ postgresBeginForeignScan(ForeignScanState *node, int eflags)
 	node->fdw_state = (void *) fsstate;
 
 	/*
-	 * Identify which user to do the remote access as.  This should match what
-	 * ExecCheckRTEPerms() does.
-	 */
-	rte = rt_fetch(fsplan->scan.scanrelid, estate->es_range_table);
-	userid = rte->checkAsUser ? rte->checkAsUser : GetUserId();
-
-	/* Get info about foreign table. */
-	fsstate->rel = node->ss.ss_currentRelation;
-	table = GetForeignTable(RelationGetRelid(fsstate->rel));
-	server = GetForeignServer(table->serverid);
-	user = GetUserMapping(userid, server->serverid);
-
-	/*
 	 * Get connection to the foreign server.  Connection manager will
 	 * establish new connection if necessary.
 	 */
+	serverid = intVal(list_nth(fsplan->fdw_private, FdwScanPrivateServerOid));
+	userid = intVal(list_nth(fsplan->fdw_private, FdwScanPrivateUserOid));
+	server = GetForeignServer(serverid);
+	user = GetUserMapping(userid, server->serverid);
 	fsstate->conn = GetConnection(server, user, false);
 
 	/* Assign a unique ID for my cursor */
@@ -932,8 +892,8 @@ postgresBeginForeignScan(ForeignScanState *node, int eflags)
 	/* Get private info created by planner functions. */
 	fsstate->query = strVal(list_nth(fsplan->fdw_private,
 									 FdwScanPrivateSelectSql));
-	fsstate->retrieved_attrs = (List *) list_nth(fsplan->fdw_private,
-											   FdwScanPrivateRetrievedAttrs);
+	fsstate->retrieved_attrs = list_nth(fsplan->fdw_private,
+										FdwScanPrivateRetrievedAttrs);
 
 	/* Create contexts for batches of tuples and per-tuple temp workspace. */
 	fsstate->batch_cxt = AllocSetContextCreate(estate->es_query_cxt,
@@ -947,8 +907,18 @@ postgresBeginForeignScan(ForeignScanState *node, int eflags)
 											  ALLOCSET_SMALL_INITSIZE,
 											  ALLOCSET_SMALL_MAXSIZE);
 
-	/* Get info we'll need for input data conversion. */
-	fsstate->attinmeta = TupleDescGetAttInMetadata(RelationGetDescr(fsstate->rel));
+	/* Get info we'll need for input data conversion and error reporting. */
+	if (fsplan->scan.scanrelid > 0)
+	{
+		fsstate->relname = RelationGetRelationName(node->ss.ss_currentRelation);
+		fsstate->tupdesc = RelationGetDescr(node->ss.ss_currentRelation);
+	}
+	else
+	{
+		fsstate->relname = NULL;
+		fsstate->tupdesc = node->ss.ss_ScanTupleSlot->tts_tupleDescriptor;
+	}
+	fsstate->attinmeta = TupleDescGetAttInMetadata(fsstate->tupdesc);
 
 	/* Prepare for output conversion of parameters used in remote query. */
 	numParams = list_length(fsplan->fdw_exprs);
@@ -1664,10 +1634,25 @@ postgresExplainForeignScan(ForeignScanState *node, ExplainState *es)
 {
 	List	   *fdw_private;
 	char	   *sql;
+	char	   *relations;
 
+	fdw_private = ((ForeignScan *) node->ss.ps.plan)->fdw_private;
+
+	/*
+	 * Add the names of the relations handled by the foreign scan when the
+	 * scan is a join.
+	 */
+	if (list_length(fdw_private) > FdwScanPrivateRelations)
+	{
+		relations = strVal(list_nth(fdw_private, FdwScanPrivateRelations));
+		ExplainPropertyText("Relations", relations, es);
+	}
+
+	/*
+	 * Add the remote query when the VERBOSE option is specified.
+	 */
 	if (es->verbose)
 	{
-		fdw_private = ((ForeignScan *) node->ss.ps.plan)->fdw_private;
 		sql = strVal(list_nth(fdw_private, FdwScanPrivateSelectSql));
 		ExplainPropertyText("Remote SQL", sql, es);
 	}
@@ -1726,10 +1711,12 @@ estimate_path_cost_size(PlannerInfo *root,
 	 */
 	if (fpinfo->use_remote_estimate)
 	{
+		List	   *remote_conds;
 		List	   *remote_join_conds;
 		List	   *local_join_conds;
-		StringInfoData sql;
 		List	   *retrieved_attrs;
+		StringInfoData sql;
+		UserMapping *user;
 		PGconn	   *conn;
 		Selectivity local_sel;
 		QualCost	local_cost;
@@ -1741,24 +1728,24 @@ estimate_path_cost_size(PlannerInfo *root,
 		classifyConditions(root, baserel, join_conds,
 						   &remote_join_conds, &local_join_conds);
 
+		remote_conds = copyObject(fpinfo->remote_conds);
+		remote_conds = list_concat(remote_conds, remote_join_conds);
+
 		/*
 		 * Construct EXPLAIN query including the desired SELECT, FROM, and
 		 * WHERE clauses.  Params and other-relation Vars are replaced by
 		 * dummy values.
+		 * We pass NULL for params_list and fdw_ps_tlist here because they are
+		 * unnecessary for EXPLAIN.
 		 */
 		initStringInfo(&sql);
 		appendStringInfoString(&sql, "EXPLAIN ");
-		deparseSelectSql(&sql, root, baserel, fpinfo->attrs_used,
-						 &retrieved_attrs);
-		if (fpinfo->remote_conds)
-			appendWhereClause(&sql, root, baserel, fpinfo->remote_conds,
-							  true, NULL);
-		if (remote_join_conds)
-			appendWhereClause(&sql, root, baserel, remote_join_conds,
-							  (fpinfo->remote_conds == NIL), NULL);
+		deparseSelectSql(&sql, root, baserel, fpinfo->attrs_used, remote_conds,
+						 NULL, NULL, &retrieved_attrs, NULL);
 
 		/* Get the remote estimate */
-		conn = GetConnection(fpinfo->server, fpinfo->user, false);
+		user = GetUserMapping(fpinfo->userid, fpinfo->server->serverid);
+		conn = GetConnection(fpinfo->server, user, false);
 		get_remote_estimate(sql.data, conn, &rows, &width,
 							&startup_cost, &total_cost);
 		ReleaseConnection(conn);
@@ -2055,7 +2042,9 @@ fetch_more_data(ForeignScanState *node)
 		{
 			fsstate->tuples[i] =
 				make_tuple_from_result_row(res, i,
-										   fsstate->rel,
+										   fsstate->relname,
+										   fsstate->query,
+										   fsstate->tupdesc,
 										   fsstate->attinmeta,
 										   fsstate->retrieved_attrs,
 										   fsstate->temp_cxt);
@@ -2273,7 +2262,9 @@ store_returning_result(PgFdwModifyState *fmstate,
 		HeapTuple	newtup;
 
 		newtup = make_tuple_from_result_row(res, 0,
-											fmstate->rel,
+										RelationGetRelationName(fmstate->rel),
+											fmstate->query,
+											RelationGetDescr(fmstate->rel),
 											fmstate->attinmeta,
 											fmstate->retrieved_attrs,
 											fmstate->temp_cxt);
@@ -2423,6 +2414,7 @@ postgresAcquireSampleRowsFunc(Relation relation, int elevel,
 	initStringInfo(&sql);
 	appendStringInfo(&sql, "DECLARE c%u CURSOR FOR ", cursor_number);
 	deparseAnalyzeSql(&sql, relation, &astate.retrieved_attrs);
+	astate.query = sql.data;
 
 	/* In what follows, do not risk leaking any PGresults. */
 	PG_TRY();
@@ -2565,7 +2557,9 @@ analyze_row_processor(PGresult *res, int row, PgFdwAnalyzeState *astate)
 		oldcontext = MemoryContextSwitchTo(astate->anl_cxt);
 
 		astate->rows[pos] = make_tuple_from_result_row(res, row,
-													   astate->rel,
+										   RelationGetRelationName(astate->rel),
+													   astate->query,
+											   RelationGetDescr(astate->rel),
 													   astate->attinmeta,
 													 astate->retrieved_attrs,
 													   astate->temp_cxt);
@@ -2839,6 +2833,269 @@ postgresImportForeignSchema(ImportForeignSchemaStmt *stmt, Oid serverOid)
 }
 
 /*
+ * Construct PgFdwRelationInfo from two join sources
+ */
+static PgFdwRelationInfo *
+merge_fpinfo(RelOptInfo *outerrel,
+			 RelOptInfo *innerrel,
+			 JoinType jointype,
+			 double rows,
+			 int width)
+{
+	PgFdwRelationInfo *fpinfo_o;
+	PgFdwRelationInfo *fpinfo_i;
+	PgFdwRelationInfo *fpinfo;
+
+	fpinfo_o = (PgFdwRelationInfo *) outerrel->fdw_private;
+	fpinfo_i = (PgFdwRelationInfo *) innerrel->fdw_private;
+
+	fpinfo = (PgFdwRelationInfo *) palloc0(sizeof(PgFdwRelationInfo));
+
+	/* A join relation carries over the conditions of its source relations. */
+	fpinfo->remote_conds = list_concat(copyObject(fpinfo_o->remote_conds),
+									   copyObject(fpinfo_i->remote_conds));
+	fpinfo->local_conds = list_concat(copyObject(fpinfo_o->local_conds),
+									  copyObject(fpinfo_i->local_conds));
+
+	/* attrs_used is only meaningful for a simple foreign table scan */
+	fpinfo->attrs_used = NULL;
+
+	/* rows and width are supplied by the caller */
+	fpinfo->rows = rows;
+	fpinfo->width = width;
+
+	/* Sum the local-condition costs of the outer and inner relations. */
+	fpinfo->local_conds_cost.startup = fpinfo_o->local_conds_cost.startup +
+									   fpinfo_i->local_conds_cost.startup;
+	fpinfo->local_conds_cost.per_tuple = fpinfo_o->local_conds_cost.per_tuple +
+										 fpinfo_i->local_conds_cost.per_tuple;
+
+	/* Don't consider correlation between local filters. */
+	fpinfo->local_conds_sel = fpinfo_o->local_conds_sel *
+							  fpinfo_i->local_conds_sel;
+
+	fpinfo->use_remote_estimate = false;
+
+	/*
+	 * These two come from the default or a per-server setting, so outer and
+	 * inner must have the same value.
+	 */
+	fpinfo->fdw_startup_cost = fpinfo_o->fdw_startup_cost;
+	fpinfo->fdw_tuple_cost = fpinfo_o->fdw_tuple_cost;
+
+	/*
+	 * TODO estimate more accurately
+	 */
+	fpinfo->startup_cost = fpinfo->fdw_startup_cost +
+						   fpinfo->local_conds_cost.startup;
+	fpinfo->total_cost = fpinfo->startup_cost +
+						 (fpinfo->fdw_tuple_cost +
+						  fpinfo->local_conds_cost.per_tuple +
+						  cpu_tuple_cost) * fpinfo->rows;
+
+	/* server and userid are identical for outer and inner (caller checked) */
+	fpinfo->server = fpinfo_o->server;
+	fpinfo->userid = fpinfo_o->userid;
+
+	fpinfo->outerrel = outerrel;
+	fpinfo->innerrel = innerrel;
+	fpinfo->jointype = jointype;
+
+	/* joinclauses and otherclauses will be set by the caller */
+
+	return fpinfo;
+}
+
+/*
+ * postgresGetForeignJoinPaths
+ *		Add possible ForeignPath to joinrel.
+ *
+ * Joins that satisfy all of the conditions below can be pushed down to the
+ * remote PostgreSQL server.
+ *
+ * 1) Join type is INNER or OUTER (one of LEFT/RIGHT/FULL)
+ * 2) Both outer and inner portions are safe to push down
+ * 3) All foreign tables in the join belong to the same foreign server
+ * 4) All foreign tables are accessed as the same user
+ * 5) All join conditions are safe to push down
+ * 6) No relation has a local filter (this could be relaxed for an INNER JOIN
+ * with no volatile function/operator, but for now we take the safer way)
+ */
+static void
+postgresGetForeignJoinPaths(PlannerInfo *root,
+							RelOptInfo *joinrel,
+							RelOptInfo *outerrel,
+							RelOptInfo *innerrel,
+							SpecialJoinInfo *sjinfo,
+							List *restrictlist)
+{
+	PgFdwRelationInfo *fpinfo;
+	PgFdwRelationInfo *fpinfo_o;
+	PgFdwRelationInfo *fpinfo_i;
+	JoinType		jointype = !sjinfo ? JOIN_INNER : sjinfo->jointype;
+	ForeignPath	   *joinpath;
+	double			rows;
+	Cost			startup_cost;
+	Cost			total_cost;
+
+	ListCell	   *lc;
+	List		   *joinclauses;
+	List		   *otherclauses;
+
+	/*
+	 * We support all outer joins in addition to inner joins.  CROSS JOIN is
+	 * internally an INNER JOIN with no conditions, so it is checked later.
+	 */
+	if (jointype != JOIN_INNER && jointype != JOIN_LEFT &&
+		jointype != JOIN_RIGHT && jointype != JOIN_FULL)
+	{
+		ereport(DEBUG3, (errmsg("unsupported join type (SEMI, ANTI)")));
+		return;
+	}
+
+	/*
+	 * A valid PgFdwRelationInfo in RelOptInfo.fdw_private indicates that a
+	 * scan of the relation can be pushed down.  If either relation lacks one,
+	 * give up pushing down this join.
+	 */
+	if (!outerrel->fdw_private)
+	{
+		ereport(DEBUG3, (errmsg("outer is not safe to push-down")));
+		return;
+	}
+	if (!innerrel->fdw_private)
+	{
+		ereport(DEBUG3, (errmsg("inner is not safe to push-down")));
+		return;
+	}
+	fpinfo_o = (PgFdwRelationInfo *) outerrel->fdw_private;
+	fpinfo_i = (PgFdwRelationInfo *) innerrel->fdw_private;
+
+	/*
+	 * All relations in the join must belong to the same server.  A valid
+	 * fdw_private means that every relation under that relation belongs to
+	 * the server recorded in it, so we only need to compare the serverid of
+	 * the outer and inner relations.
+	 */
+	if (fpinfo_o->server->serverid != fpinfo_i->server->serverid)
+	{
+		ereport(DEBUG3, (errmsg("server mismatch")));
+		return;
+	}
+
+	/*
+	 * The effective userid of all source relations must be identical.  A
+	 * valid fdw_private means that every relation under that relation is
+	 * accessed as the same user, so we only need to compare the userid of
+	 * the outer and inner relations.
+	 */
+	if (fpinfo_o->userid != fpinfo_i->userid)
+	{
+		ereport(DEBUG3, (errmsg("userid mismatch")));
+		return;
+	}
+
+	/*
+	 * No source relation can have local conditions.  This could be relaxed
+	 * if the join is an inner join and the local conditions contain no
+	 * volatile function/operator, but for now we leave that as a future
+	 * enhancement.
+	 */
+	if (fpinfo_o->local_conds != NULL || fpinfo_i->local_conds != NULL)
+	{
+		ereport(DEBUG3, (errmsg("join with local filter")));
+		return;
+	}
+
+	/*
+	 * Separate restrictlist into two lists, join conditions and remote filters.
+	 */
+	joinclauses = restrictlist;
+	if (IS_OUTER_JOIN(jointype))
+	{
+		extract_actual_join_clauses(joinclauses, &joinclauses, &otherclauses);
+	}
+	else
+	{
+		joinclauses = extract_actual_clauses(joinclauses, false);
+		otherclauses = NIL;
+	}
+
+	/*
+	 * Note that CROSS JOIN (cartesian product) is transformed to JOIN_INNER
+	 * with empty joinclauses.  Pushing down a CROSS JOIN usually produces a
+	 * larger result than retrieving each table separately, so we don't push
+	 * down such joins.
+	 */
+	if (jointype == JOIN_INNER && joinclauses == NIL)
+	{
+		ereport(DEBUG3, (errmsg("unsupported join type (CROSS)")));
+		return;
+	}
+
+	/*
+	 * Join conditions must be safe to push down.
+	 */
+	foreach(lc, joinclauses)
+	{
+		Expr *expr = (Expr *) lfirst(lc);
+
+		if (!is_foreign_expr(root, joinrel, expr))
+		{
+			ereport(DEBUG3, (errmsg("join quals contains unsafe conditions")));
+			return;
+		}
+	}
+
+	/*
+	 * Other condition for the join must be safe to push down.
+	 */
+	foreach(lc, otherclauses)
+	{
+		Expr *expr = (Expr *) lfirst(lc);
+
+		if (!is_foreign_expr(root, joinrel, expr))
+		{
+			ereport(DEBUG3, (errmsg("filter contains unsafe conditions")));
+			return;
+		}
+	}
+
+	/* Here we know that this join can be pushed-down to remote side. */
+
+	/* Construct fpinfo for the join relation */
+	fpinfo = merge_fpinfo(outerrel, innerrel, jointype, joinrel->rows, joinrel->width); 
+	fpinfo->joinclauses = joinclauses;
+	fpinfo->otherclauses = otherclauses;
+	joinrel->fdw_private = fpinfo;
+
+	/* TODO determine more accurate cost and rows of the join. */
+	rows = joinrel->rows;
+	startup_cost = fpinfo->startup_cost;
+	total_cost = fpinfo->total_cost;
+
+	/*
+	 * Create a new join path and add it to the joinrel which represents a join
+	 * between foreign tables.
+	 */
+	joinpath = create_foreignscan_path(root,
+									   joinrel,
+									   rows,
+									   startup_cost,
+									   total_cost,
+									   NIL,		/* no pathkeys */
+									   NULL,	/* no required_outer */
+									   NIL);	/* no fdw_private */
+
+	/* Add generated path into joinrel by add_path(). */
+	add_path(joinrel, (Path *) joinpath);
+	elog(DEBUG3, "join path added for (%s) join (%s)",
+		 bms_to_str(outerrel->relids), bms_to_str(innerrel->relids));
+
+	/* TODO consider parameterized paths */
+}
+
+/*
  * Create a tuple from the specified row of the PGresult.
  *
  * rel is the local representation of the foreign table, attinmeta is
@@ -2849,13 +3106,14 @@ postgresImportForeignSchema(ImportForeignSchemaStmt *stmt, Oid serverOid)
 static HeapTuple
 make_tuple_from_result_row(PGresult *res,
 						   int row,
-						   Relation rel,
+						   const char *relname,
+						   const char *query,
+						   TupleDesc tupdesc,
 						   AttInMetadata *attinmeta,
 						   List *retrieved_attrs,
 						   MemoryContext temp_context)
 {
 	HeapTuple	tuple;
-	TupleDesc	tupdesc = RelationGetDescr(rel);
 	Datum	   *values;
 	bool	   *nulls;
 	ItemPointer ctid = NULL;
@@ -2882,7 +3140,9 @@ make_tuple_from_result_row(PGresult *res,
 	/*
 	 * Set up and install callback to report where conversion error occurs.
 	 */
-	errpos.rel = rel;
+	errpos.relname = relname;
+	errpos.query = query;
+	errpos.tupdesc = tupdesc;
 	errpos.cur_attno = 0;
 	errcallback.callback = conversion_error_callback;
 	errcallback.arg = (void *) &errpos;
@@ -2966,11 +3226,39 @@ make_tuple_from_result_row(PGresult *res,
 static void
 conversion_error_callback(void *arg)
 {
+	const char *attname;
+	const char *relname;
 	ConversionLocation *errpos = (ConversionLocation *) arg;
-	TupleDesc	tupdesc = RelationGetDescr(errpos->rel);
+	TupleDesc	tupdesc = errpos->tupdesc;
+	StringInfoData buf;
+
+	if (errpos->relname)
+	{
+		/* error occurred in a scan against a foreign table */ 
+		initStringInfo(&buf);
+		if (errpos->cur_attno > 0)
+			appendStringInfo(&buf, "column \"%s\"",
+					 NameStr(tupdesc->attrs[errpos->cur_attno - 1]->attname));
+		else if (errpos->cur_attno == SelfItemPointerAttributeNumber)
+			appendStringInfoString(&buf, "column \"ctid\"");
+		attname = buf.data;
+
+		initStringInfo(&buf);
+		appendStringInfo(&buf, "foreign table \"%s\"", errpos->relname);
+		relname = buf.data;
+	}
+	else
+	{
+		/* error occurred in a scan against a foreign join */ 
+		initStringInfo(&buf);
+		appendStringInfo(&buf, "column %d", errpos->cur_attno - 1);
+		attname = buf.data;
+
+		initStringInfo(&buf);
+		appendStringInfo(&buf, "foreign join \"%s\"", errpos->query);
+		relname = buf.data;
+	}
 
 	if (errpos->cur_attno > 0 && errpos->cur_attno <= tupdesc->natts)
-		errcontext("column \"%s\" of foreign table \"%s\"",
-				   NameStr(tupdesc->attrs[errpos->cur_attno - 1]->attname),
-				   RelationGetRelationName(errpos->rel));
+		errcontext("%s of %s", attname, relname);
 }
diff --git a/contrib/postgres_fdw/postgres_fdw.h b/contrib/postgres_fdw/postgres_fdw.h
index 950c6f7..d6b16d8 100644
--- a/contrib/postgres_fdw/postgres_fdw.h
+++ b/contrib/postgres_fdw/postgres_fdw.h
@@ -16,10 +16,52 @@
 #include "foreign/foreign.h"
 #include "lib/stringinfo.h"
 #include "nodes/relation.h"
+#include "nodes/plannodes.h"
 #include "utils/relcache.h"
 
 #include "libpq-fe.h"
 
+/*
+ * FDW-specific planner information kept in RelOptInfo.fdw_private for a
+ * foreign table or a foreign join.  This information is collected by
+ * postgresGetForeignRelSize, or calculated from join source relations.
+ */
+typedef struct PgFdwRelationInfo
+{
+	/* baserestrictinfo clauses, broken down into safe and unsafe subsets. */
+	List	   *remote_conds;
+	List	   *local_conds;
+
+	/* Bitmap of attr numbers we need to fetch from the remote server. */
+	Bitmapset  *attrs_used;
+
+	/* Cost and selectivity of local_conds. */
+	QualCost	local_conds_cost;
+	Selectivity local_conds_sel;
+
+	/* Estimated size and cost for a scan with baserestrictinfo quals. */
+	double		rows;
+	int			width;
+	Cost		startup_cost;
+	Cost		total_cost;
+
+	/* Options extracted from catalogs. */
+	bool		use_remote_estimate;
+	Cost		fdw_startup_cost;
+	Cost		fdw_tuple_cost;
+
+	/* Cached catalog information. */
+	ForeignServer *server;
+	Oid			userid;
+
+	/* Join information */
+	RelOptInfo *outerrel;
+	RelOptInfo *innerrel;
+	JoinType	jointype;
+	List	   *joinclauses;
+	List	   *otherclauses;
+} PgFdwRelationInfo;
+
 /* in postgres_fdw.c */
 extern int	set_transmission_modes(void);
 extern void reset_transmission_modes(int nestlevel);
@@ -51,13 +93,31 @@ extern void deparseSelectSql(StringInfo buf,
 				 PlannerInfo *root,
 				 RelOptInfo *baserel,
 				 Bitmapset *attrs_used,
-				 List **retrieved_attrs);
-extern void appendWhereClause(StringInfo buf,
+				 List *remote_conds,
+				 List **params_list,
+				 List **fdw_ps_tlist,
+				 List **retrieved_attrs,
+				 StringInfo relations);
+extern void appendConditions(StringInfo buf,
 				  PlannerInfo *root,
 				  RelOptInfo *baserel,
+				  List *outertlist,
+				  List *innertlist,
 				  List *exprs,
-				  bool is_first,
+				  const char *prefix,
 				  List **params);
+extern void deparseJoinSql(StringInfo sql,
+			   PlannerInfo *root,
+			   RelOptInfo *baserel,
+			   RelOptInfo *outerrel,
+			   RelOptInfo *innerrel,
+			   const char *sql_o,
+			   const char *sql_i,
+			   JoinType jointype,
+			   List *joinclauses,
+			   List *otherclauses,
+			   List **fdw_ps_tlist,
+			   List **retrieved_attrs);
 extern void deparseInsertSql(StringInfo buf, PlannerInfo *root,
 				 Index rtindex, Relation rel,
 				 List *targetAttrs, List *returningList,
diff --git a/contrib/postgres_fdw/sql/postgres_fdw.sql b/contrib/postgres_fdw/sql/postgres_fdw.sql
index 4a23457..b0c9a8d 100644
--- a/contrib/postgres_fdw/sql/postgres_fdw.sql
+++ b/contrib/postgres_fdw/sql/postgres_fdw.sql
@@ -11,12 +11,17 @@ DO $d$
             OPTIONS (dbname '$$||current_database()||$$',
                      port '$$||current_setting('port')||$$'
             )$$;
+        EXECUTE $$CREATE SERVER loopback2 FOREIGN DATA WRAPPER postgres_fdw
+            OPTIONS (dbname '$$||current_database()||$$',
+                     port '$$||current_setting('port')||$$'
+            )$$;
     END;
 $d$;
 
 CREATE USER MAPPING FOR public SERVER testserver1
 	OPTIONS (user 'value', password 'value');
 CREATE USER MAPPING FOR CURRENT_USER SERVER loopback;
+CREATE USER MAPPING FOR CURRENT_USER SERVER loopback2;
 
 -- ===================================================================
 -- create objects used through FDW loopback server
@@ -39,6 +44,18 @@ CREATE TABLE "S 1"."T 2" (
 	c2 text,
 	CONSTRAINT t2_pkey PRIMARY KEY (c1)
 );
+CREATE TABLE "S 1"."T 3" (
+	c1 int NOT NULL,
+	c2 int NOT NULL,
+	c3 text,
+	CONSTRAINT t3_pkey PRIMARY KEY (c1)
+);
+CREATE TABLE "S 1"."T 4" (
+	c1 int NOT NULL,
+	c2 int NOT NULL,
+	c4 text,
+	CONSTRAINT t4_pkey PRIMARY KEY (c1)
+);
 
 INSERT INTO "S 1"."T 1"
 	SELECT id,
@@ -54,9 +71,23 @@ INSERT INTO "S 1"."T 2"
 	SELECT id,
 	       'AAA' || to_char(id, 'FM000')
 	FROM generate_series(1, 100) id;
+INSERT INTO "S 1"."T 3"
+	SELECT id,
+	       id + 1,
+	       'AAA' || to_char(id, 'FM000')
+	FROM generate_series(1, 100) id;
+DELETE FROM "S 1"."T 3" WHERE c1 % 2 != 0;	-- delete for outer join tests
+INSERT INTO "S 1"."T 4"
+	SELECT id,
+	       id + 1,
+	       'AAA' || to_char(id, 'FM000')
+	FROM generate_series(1, 100) id;
+DELETE FROM "S 1"."T 4" WHERE c1 % 3 != 0;	-- delete for outer join tests
 
 ANALYZE "S 1"."T 1";
 ANALYZE "S 1"."T 2";
+ANALYZE "S 1"."T 3";
+ANALYZE "S 1"."T 4";
 
 -- ===================================================================
 -- create foreign tables
@@ -87,6 +118,29 @@ CREATE FOREIGN TABLE ft2 (
 ) SERVER loopback;
 ALTER FOREIGN TABLE ft2 DROP COLUMN cx;
 
+CREATE FOREIGN TABLE ft4 (
+	c1 int NOT NULL,
+	c2 int NOT NULL,
+	c3 text
+) SERVER loopback OPTIONS (schema_name 'S 1', table_name 'T 3');
+
+CREATE FOREIGN TABLE ft5 (
+	c1 int NOT NULL,
+	c2 int NOT NULL,
+	c3 text
+) SERVER loopback OPTIONS (schema_name 'S 1', table_name 'T 4');
+
+CREATE FOREIGN TABLE ft6 (
+	c1 int NOT NULL,
+	c2 int NOT NULL,
+	c3 text
+) SERVER loopback2 OPTIONS (schema_name 'S 1', table_name 'T 4');
+CREATE USER view_owner;
+GRANT ALL ON ft5 TO view_owner;
+CREATE VIEW v_ft5 AS SELECT * FROM ft5;
+ALTER VIEW v_ft5 OWNER TO view_owner;
+CREATE USER MAPPING FOR view_owner SERVER loopback;
+
 -- ===================================================================
 -- tests for validator
 -- ===================================================================
@@ -158,8 +212,6 @@ EXPLAIN (VERBOSE, COSTS false) SELECT * FROM ft1 t1 WHERE c1 = 102 FOR SHARE;
 SELECT * FROM ft1 t1 WHERE c1 = 102 FOR SHARE;
 -- aggregate
 SELECT COUNT(*) FROM ft1 t1;
--- join two tables
-SELECT t1.c1 FROM ft1 t1 JOIN ft2 t2 ON (t1.c1 = t2.c1) ORDER BY t1.c3, t1.c1 OFFSET 100 LIMIT 10;
 -- subquery
 SELECT * FROM ft1 t1 WHERE t1.c3 IN (SELECT c3 FROM ft2 t2 WHERE c1 <= 10) ORDER BY c1;
 -- subquery+MAX
@@ -216,6 +268,82 @@ SELECT * FROM ft1 WHERE c1 = ANY (ARRAY(SELECT c1 FROM ft2 WHERE c1 < 5));
 SELECT * FROM ft2 WHERE c1 = ANY (ARRAY(SELECT c1 FROM ft1 WHERE c1 < 5));
 
 -- ===================================================================
+-- JOIN queries
+-- ===================================================================
+-- join two tables
+EXPLAIN (COSTS false, VERBOSE)
+SELECT t1.c1, t2.c1 FROM ft1 t1 JOIN ft2 t2 ON (t1.c1 = t2.c1) ORDER BY t1.c3, t1.c1 OFFSET 100 LIMIT 10;
+SELECT t1.c1, t2.c1 FROM ft1 t1 JOIN ft2 t2 ON (t1.c1 = t2.c1) ORDER BY t1.c3, t1.c1 OFFSET 100 LIMIT 10;
+-- join three tables
+EXPLAIN (COSTS false, VERBOSE)
+SELECT t1.c1, t2.c2, t3.c3 FROM ft1 t1 JOIN ft2 t2 ON (t1.c1 = t2.c1) JOIN ft4 t3 ON (t3.c1 = t1.c1) ORDER BY t1.c3, t1.c1 OFFSET 10 LIMIT 10;
+SELECT t1.c1, t2.c2, t3.c3 FROM ft1 t1 JOIN ft2 t2 ON (t1.c1 = t2.c1) JOIN ft4 t3 ON (t3.c1 = t1.c1) ORDER BY t1.c3, t1.c1 OFFSET 10 LIMIT 10;
+-- left outer join
+EXPLAIN (COSTS false, VERBOSE)
+SELECT t1.c1, t2.c1 FROM ft4 t1 LEFT JOIN ft5 t2 ON (t1.c1 = t2.c1) ORDER BY t1.c1, t2.c1 OFFSET 10 LIMIT 10;
+SELECT t1.c1, t2.c1 FROM ft4 t1 LEFT JOIN ft5 t2 ON (t1.c1 = t2.c1) ORDER BY t1.c1, t2.c1 OFFSET 10 LIMIT 10;
+-- right outer join
+SET enable_mergejoin = off; -- planner choose MergeJoin even it has higher costs, so disable it for testing.
+EXPLAIN (COSTS false, VERBOSE)
+SELECT t1.c1, t2.c1 FROM ft4 t1 RIGHT JOIN ft5 t2 ON (t1.c1 = t2.c1) ORDER BY t2.c1, t2.c1 OFFSET 10 LIMIT 10;
+SELECT t1.c1, t2.c1 FROM ft4 t1 RIGHT JOIN ft5 t2 ON (t1.c1 = t2.c1) ORDER BY t2.c1, t2.c1 OFFSET 10 LIMIT 10;
+SET enable_mergejoin = on;
+-- full outer join
+EXPLAIN (COSTS false, VERBOSE)
+SELECT t1.c1, t2.c1 FROM ft4 t1 FULL JOIN ft5 t2 ON (t1.c1 = t2.c1) ORDER BY t1.c1, t2.c1 OFFSET 45 LIMIT 10;
+SELECT t1.c1, t2.c1 FROM ft4 t1 FULL JOIN ft5 t2 ON (t1.c1 = t2.c1) ORDER BY t1.c1, t2.c1 OFFSET 45 LIMIT 10;
+-- full outer join + WHERE clause, only matched rows
+EXPLAIN (COSTS false, VERBOSE)
+SELECT t1.c1, t2.c1 FROM ft4 t1 FULL JOIN ft5 t2 ON (t1.c1 = t2.c1) WHERE (t1.c1 = t2.c1 OR t1.c1 IS NULL) ORDER BY t1.c1, t2.c1 OFFSET 10 LIMIT 10;
+SELECT t1.c1, t2.c1 FROM ft4 t1 FULL JOIN ft5 t2 ON (t1.c1 = t2.c1) WHERE (t1.c1 = t2.c1 OR t1.c1 IS NULL) ORDER BY t1.c1, t2.c1 OFFSET 10 LIMIT 10;
+-- join at WHERE clause 
+SET enable_mergejoin = off; -- planner choose MergeJoin even it has higher costs, so disable it for testing.
+EXPLAIN (COSTS false, VERBOSE)
+SELECT t1.c1, t2.c1 FROM ft4 t1 LEFT JOIN ft5 t2 ON true WHERE (t1.c1 = t2.c1) ORDER BY t1.c1, t2.c1 OFFSET 10 LIMIT 10;
+SELECT t1.c1, t2.c1 FROM ft4 t1 LEFT JOIN ft5 t2 ON true WHERE (t1.c1 = t2.c1) ORDER BY t1.c1, t2.c1 OFFSET 10 LIMIT 10;
+SET enable_mergejoin = on;
+-- join in CTE
+EXPLAIN (COSTS false, VERBOSE)
+WITH t (c1_1, c1_3, c2_1) AS (SELECT t1.c1, t1.c3, t2.c1 FROM ft1 t1 JOIN ft2 t2 ON (t1.c1 = t2.c1)) SELECT c1_1, c2_1 FROM t ORDER BY c1_3, c1_1 OFFSET 100 LIMIT 10;
+WITH t (c1_1, c1_3, c2_1) AS (SELECT t1.c1, t1.c3, t2.c1 FROM ft1 t1 JOIN ft2 t2 ON (t1.c1 = t2.c1)) SELECT c1_1, c2_1 FROM t ORDER BY c1_3, c1_1 OFFSET 100 LIMIT 10;
+-- ctid with whole-row reference
+EXPLAIN (COSTS false, VERBOSE)
+SELECT t1.ctid, t1, t2, t1.c1 FROM ft1 t1 JOIN ft2 t2 ON (t1.c1 = t2.c1) ORDER BY t1.c3, t1.c1 OFFSET 100 LIMIT 10;
+SELECT t1.ctid, t1, t2, t1.c1 FROM ft1 t1 JOIN ft2 t2 ON (t1.c1 = t2.c1) ORDER BY t1.c3, t1.c1 OFFSET 100 LIMIT 10;
+-- partially unsafe to push down, not pushed down
+EXPLAIN (COSTS false, VERBOSE)
+SELECT t1.c1 FROM ft1 t1 JOIN ft2 t2 ON t2.c1 = t2.c1 JOIN ft4 t3 ON t2.c1 = t3.c1 ORDER BY t1.c1 OFFSET 10 LIMIT 10;
+SELECT t1.c1 FROM ft1 t1 JOIN ft2 t2 ON t2.c1 = t2.c1 JOIN ft4 t3 ON t2.c1 = t3.c1 ORDER BY t1.c1 OFFSET 10 LIMIT 10;
+-- SEMI JOIN, not pushed down
+EXPLAIN (COSTS false, VERBOSE)
+SELECT t1.c1 FROM ft1 t1 WHERE EXISTS (SELECT 1 FROM ft2 t2 WHERE t1.c1 = t2.c1) ORDER BY t1.c1 OFFSET 100 LIMIT 10;
+SELECT t1.c1 FROM ft1 t1 WHERE EXISTS (SELECT 1 FROM ft2 t2 WHERE t1.c1 = t2.c1) ORDER BY t1.c1 OFFSET 100 LIMIT 10;
+-- ANTI JOIN, not pushed down
+EXPLAIN (COSTS false, VERBOSE)
+SELECT t1.c1 FROM ft1 t1 WHERE NOT EXISTS (SELECT 1 FROM ft2 t2 WHERE t1.c1 = t2.c2) ORDER BY t1.c1 OFFSET 100 LIMIT 10;
+SELECT t1.c1 FROM ft1 t1 WHERE NOT EXISTS (SELECT 1 FROM ft2 t2 WHERE t1.c1 = t2.c2) ORDER BY t1.c1 OFFSET 100 LIMIT 10;
+-- CROSS JOIN, not pushed down
+EXPLAIN (COSTS false, VERBOSE)
+SELECT t1.c1, t2.c1 FROM ft1 t1 CROSS JOIN ft2 t2 ORDER BY t1.c1, t2.c1 OFFSET 100 LIMIT 10;
+SELECT t1.c1, t2.c1 FROM ft1 t1 CROSS JOIN ft2 t2 ORDER BY t1.c1, t2.c1 OFFSET 100 LIMIT 10;
+-- different server
+EXPLAIN (COSTS false, VERBOSE)
+SELECT t1.c1, t2.c1 FROM ft5 t1 JOIN ft6 t2 ON (t1.c1 = t2.c1) ORDER BY t1.c1, t2.c1 OFFSET 100 LIMIT 10;
+SELECT t1.c1, t2.c1 FROM ft5 t1 JOIN ft6 t2 ON (t1.c1 = t2.c1) ORDER BY t1.c1, t2.c1 OFFSET 100 LIMIT 10;
+-- different effective user for permission check
+EXPLAIN (COSTS false, VERBOSE)
+SELECT t1.c1, t2.c1 FROM ft5 t1 JOIN v_ft5 t2 ON (t1.c1 = t2.c1) ORDER BY t1.c1, t2.c1 OFFSET 100 LIMIT 10;
+SELECT t1.c1, t2.c1 FROM ft5 t1 JOIN v_ft5 t2 ON (t1.c1 = t2.c1) ORDER BY t1.c1, t2.c1 OFFSET 100 LIMIT 10;
+-- unsafe join conditions
+EXPLAIN (COSTS false, VERBOSE)
+SELECT t1.c1, t2.c1 FROM ft1 t1 JOIN ft2 t2 ON (t1.c8 = t2.c8) ORDER BY t1.c3, t1.c1 OFFSET 100 LIMIT 10;
+SELECT t1.c1, t2.c1 FROM ft1 t1 JOIN ft2 t2 ON (t1.c8 = t2.c8) ORDER BY t1.c3, t1.c1 OFFSET 100 LIMIT 10;
+-- local filter (unsafe conditions on one side)
+EXPLAIN (COSTS false, VERBOSE)
+SELECT t1.c1, t2.c1 FROM ft1 t1 JOIN ft2 t2 ON (t1.c1 = t2.c1) WHERE t1.c8 = 'foo' ORDER BY t1.c3, t1.c1 OFFSET 100 LIMIT 10;
+SELECT t1.c1, t2.c1 FROM ft1 t1 JOIN ft2 t2 ON (t1.c1 = t2.c1) WHERE t1.c8 = 'foo' ORDER BY t1.c3, t1.c1 OFFSET 100 LIMIT 10;
+
+-- ===================================================================
 -- parameterized queries
 -- ===================================================================
 -- simple join
@@ -666,116 +794,6 @@ UPDATE rem1 SET f2 = 'testo';
 INSERT INTO rem1(f2) VALUES ('test') RETURNING ctid;
 
 -- ===================================================================
--- test inheritance features
--- ===================================================================
-
-CREATE TABLE a (aa TEXT);
-CREATE TABLE loct (aa TEXT, bb TEXT);
-CREATE FOREIGN TABLE b (bb TEXT) INHERITS (a)
-  SERVER loopback OPTIONS (table_name 'loct');
-
-INSERT INTO a(aa) VALUES('aaa');
-INSERT INTO a(aa) VALUES('aaaa');
-INSERT INTO a(aa) VALUES('aaaaa');
-
-INSERT INTO b(aa) VALUES('bbb');
-INSERT INTO b(aa) VALUES('bbbb');
-INSERT INTO b(aa) VALUES('bbbbb');
-
-SELECT tableoid::regclass, * FROM a;
-SELECT tableoid::regclass, * FROM b;
-SELECT tableoid::regclass, * FROM ONLY a;
-
-UPDATE a SET aa = 'zzzzzz' WHERE aa LIKE 'aaaa%';
-
-SELECT tableoid::regclass, * FROM a;
-SELECT tableoid::regclass, * FROM b;
-SELECT tableoid::regclass, * FROM ONLY a;
-
-UPDATE b SET aa = 'new';
-
-SELECT tableoid::regclass, * FROM a;
-SELECT tableoid::regclass, * FROM b;
-SELECT tableoid::regclass, * FROM ONLY a;
-
-UPDATE a SET aa = 'newtoo';
-
-SELECT tableoid::regclass, * FROM a;
-SELECT tableoid::regclass, * FROM b;
-SELECT tableoid::regclass, * FROM ONLY a;
-
-DELETE FROM a;
-
-SELECT tableoid::regclass, * FROM a;
-SELECT tableoid::regclass, * FROM b;
-SELECT tableoid::regclass, * FROM ONLY a;
-
-DROP TABLE a CASCADE;
-DROP TABLE loct;
-
--- Check SELECT FOR UPDATE/SHARE with an inherited source table
-create table loct1 (f1 int, f2 int, f3 int);
-create table loct2 (f1 int, f2 int, f3 int);
-
-create table foo (f1 int, f2 int);
-create foreign table foo2 (f3 int) inherits (foo)
-  server loopback options (table_name 'loct1');
-create table bar (f1 int, f2 int);
-create foreign table bar2 (f3 int) inherits (bar)
-  server loopback options (table_name 'loct2');
-
-insert into foo values(1,1);
-insert into foo values(3,3);
-insert into foo2 values(2,2,2);
-insert into foo2 values(4,4,4);
-insert into bar values(1,11);
-insert into bar values(2,22);
-insert into bar values(6,66);
-insert into bar2 values(3,33,33);
-insert into bar2 values(4,44,44);
-insert into bar2 values(7,77,77);
-
-explain (verbose, costs off)
-select * from bar where f1 in (select f1 from foo) for update;
-select * from bar where f1 in (select f1 from foo) for update;
-
-explain (verbose, costs off)
-select * from bar where f1 in (select f1 from foo) for share;
-select * from bar where f1 in (select f1 from foo) for share;
-
--- Check UPDATE with inherited target and an inherited source table
-explain (verbose, costs off)
-update bar set f2 = f2 + 100 where f1 in (select f1 from foo);
-update bar set f2 = f2 + 100 where f1 in (select f1 from foo);
-
-select tableoid::regclass, * from bar order by 1,2;
-
--- Check UPDATE with inherited target and an appendrel subquery
-explain (verbose, costs off)
-update bar set f2 = f2 + 100
-from
-  ( select f1 from foo union all select f1+3 from foo ) ss
-where bar.f1 = ss.f1;
-update bar set f2 = f2 + 100
-from
-  ( select f1 from foo union all select f1+3 from foo ) ss
-where bar.f1 = ss.f1;
-
-select tableoid::regclass, * from bar order by 1,2;
-
--- Test that WHERE CURRENT OF is not supported
-begin;
-declare c cursor for select * from bar where f1 = 7;
-fetch from c;
-update bar set f2 = null where current of c;
-rollback;
-
-drop table foo cascade;
-drop table bar cascade;
-drop table loct1;
-drop table loct2;
-
--- ===================================================================
 -- test IMPORT FOREIGN SCHEMA
 -- ===================================================================
 
@@ -831,3 +849,7 @@ DROP TYPE "Colors" CASCADE;
 IMPORT FOREIGN SCHEMA import_source LIMIT TO (t5)
   FROM SERVER loopback INTO import_dest5;  -- ERROR
 ROLLBACK;
+
+-- Cleanup
+DROP OWNED BY view_owner;
+DROP USER view_owner;
diff --git a/doc/src/sgml/postgres-fdw.sgml b/doc/src/sgml/postgres-fdw.sgml
index 43adb61..fb39c38 100644
--- a/doc/src/sgml/postgres-fdw.sgml
+++ b/doc/src/sgml/postgres-fdw.sgml
@@ -406,11 +406,27 @@
   <title>Remote Query Optimization</title>
 
   <para>
-   <filename>postgres_fdw</> attempts to optimize remote queries to reduce
-   the amount of data transferred from foreign servers.  This is done by
-   sending query <literal>WHERE</> clauses to the remote server for
-   execution, and by not retrieving table columns that are not needed for
-   the current query.  To reduce the risk of misexecution of queries,
+   <filename>postgres_fdw</filename> attempts to optimize remote queries to
+   reduce the amount of data transferred from foreign servers.
+   This is done by various ways.
+  </para>
+
+  <para>
+   For <literal>SELECT</> clause, <filename>postgres_fdw</filename> sends only
+   actually necessary columns in it.
+  </para>
+
+  <para>
+   If <literal>FROM</> clause contains multiple foreign tables managed
+   by the same server and accessed with identical user,
+   <filename>postgres_fdw</> tries to join foreign tables on the remote side as
+   much as it can.
+   To reduce risk of misexecution of queries, <filename>postgres_fdw</>
+   gives up sending joins to remote when join conditions might have different
+   semantics on the remote side.
+  </para>
+
+  <para>
    <literal>WHERE</> clauses are not sent to the remote server unless they use
    only built-in data types, operators, and functions.  Operators and
    functions in the clauses must be <literal>IMMUTABLE</> as well.
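
The eligibility checks that postgresGetForeignJoinPaths performs in the patch above can be condensed into a standalone sketch. This is only an illustration: the RelInfo and PushdownResult types and the check_join_pushdown name are hypothetical stand-ins, not postgres_fdw code. The clause_is_safe array stands in for the per-clause result of is_foreign_expr, and the separate loop over otherclauses is omitted for brevity.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the planner structures used by the patch. */
typedef enum { JOIN_INNER, JOIN_LEFT, JOIN_FULL, JOIN_RIGHT } JoinType;

typedef struct RelInfo
{
    unsigned serverid;      /* foreign server the relation lives on */
    unsigned userid;        /* effective user for the remote access */
    size_t   n_local_conds; /* conditions that must be evaluated locally */
} RelInfo;

typedef enum
{
    PUSHDOWN_OK,
    REJECT_SERVER,
    REJECT_USER,
    REJECT_LOCAL_FILTER,
    REJECT_CROSS_JOIN,
    REJECT_UNSAFE_CLAUSE
} PushdownResult;

/* Apply the checks in the same order as the patch; the first failure wins. */
static PushdownResult
check_join_pushdown(const RelInfo *outer, const RelInfo *inner,
                    JoinType jointype,
                    const bool *clause_is_safe, size_t nclauses)
{
    size_t i;

    assert(outer != NULL && inner != NULL);

    if (outer->serverid != inner->serverid)
        return REJECT_SERVER;
    if (outer->userid != inner->userid)
        return REJECT_USER;
    if (outer->n_local_conds > 0 || inner->n_local_conds > 0)
        return REJECT_LOCAL_FILTER;
    /* An inner join without join clauses is a cartesian product. */
    if (jointype == JOIN_INNER && nclauses == 0)
        return REJECT_CROSS_JOIN;
    for (i = 0; i < nclauses; i++)
        if (!clause_is_safe[i])
            return REJECT_UNSAFE_CLAUSE;
    return PUSHDOWN_OK;
}
```

Returning a reason code instead of void makes each rejection testable; the patch itself reports the reason through ereport(DEBUG3, ...) and simply returns.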

#30

Kouhei Kaigai

kaigai@ak.jp.nec.com

over 10 years ago

In reply to: Shigeru HANADA (#29)

Hanada-san,

Thanks for your work. I have nothing more to comment on at this moment.
I hope a committer will review and comment on these two features.

Best regards,
--
NEC Business Creation Division / PG-Strom Project
KaiGai Kohei <kaigai@ak.jp.nec.com>

-----Original Message-----
From: Shigeru HANADA [mailto:shigeru.hanada@gmail.com]
Sent: Friday, April 17, 2015 1:44 PM
To: Kaigai Kouhei(海外浩平)
Cc: Ashutosh Bapat; Robert Haas; Tom Lane; Thom Brown;
pgsql-hackers@postgreSQL.org
Subject: ##freemail## Re: Custom/Foreign-Join-APIs (Re: [HACKERS] [v9.5] Custom
Plan API)

Kaigai-san,

On 2015/04/17 10:13, Kouhei Kaigai <kaigai@ak.jp.nec.com> wrote:

Hanada-san,

I merged explain patch into foreign_join patch.

Now v12 is the latest patch.

It contains many garbage lines... Please ensure the
patch is correctly based on the latest master +
custom_join patch.

Oops, sorry. I’ve re-created the patch as v13, based on Custom/Foreign join v11
patch and latest master.

It includes an EXPLAIN enhancement: a new subitem “Relations” shows the relations
and joins, including their order and type, processed by the foreign scan.

--
Shigeru HANADA
shigeru.hanada@gmail.com

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#31

Shigeru HANADA

shigeru.hanada@gmail.com

over 10 years ago

In reply to: Kouhei Kaigai (#14)

Kaigai-san,

I reviewed the Custom/Foreign join API patch again after writing a patch of join push-down support for postgres_fdw.

On 2015/03/26 10:51, Kouhei Kaigai <kaigai@ak.jp.nec.com> wrote:

Or bottom of make_join_rel(). IMO build_join_rel() is responsible for just
building (or searching from a list) a RelOptInfo for given relids. After that
make_join_rel() calls add_paths_to_joinrel() with appropriate arguments per join
type to generate the actual Paths that implement the join. make_join_rel() is
called only once for a particular relid combination, and SpecialJoinInfo and
restrictlist (conditions specified in JOIN-ON and WHERE) are available there,
so it seems promising for FDW cases.

I like that idea, but I think we will have a complex hook signature; it won't
remain as simple as hook (root, joinrel).

Signature of the hook (or the FDW API handler) would be like this:

typedef void (*GetForeignJoinPaths_function) (PlannerInfo *root,
                                              RelOptInfo *joinrel,
                                              RelOptInfo *outerrel,
                                              RelOptInfo *innerrel,
                                              JoinType jointype,
                                              SpecialJoinInfo *sjinfo,
                                              List *restrictlist);

This is very similar to add_paths_to_joinrel(), but lacks semifactors and
extra_lateral_rels. semifactors can be obtained with
compute_semi_anti_join_factors(), and extra_lateral_rels can be constructed
from root->placeholder_list as add_paths_to_joinrel() does.

From the viewpoint of postgres_fdw, jointype and restrictlist are necessary to
generate the SELECT statement, so a hook(root, joinrel) signature would force the
FDW to redo most of the work already done in make_join_rel. sjinfo will be
necessary for supporting SEMI/ANTI joins, but that is currently out of scope for
postgres_fdw.

I guess that other FDWs require at least jointype and restrictlist.

The attached patch adds GetForeignJoinPaths call on make_join_rel() only when
'joinrel' is actually built and both of child relations are managed by same
FDW driver, prior to any other built-in join paths.
I adjusted the hook definition a little bit, because jointype can be reproduced
using SpecialJoinInfo. Right?

Yes, it can be derived from the expression below:
jointype = sjinfo ? sjinfo->jointype : JOIN_INNER;
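
The one-line derivation above can be written as a small standalone helper. This is a sketch with stand-in types: the JoinType and SpecialJoinInfo definitions below are reduced mock-ups carrying only the field the expression needs, and join_type_from_sjinfo is a hypothetical name, not a planner function.

```c
#include <assert.h>
#include <stddef.h>

/* Reduced stand-ins for the planner types; only what the sketch needs. */
typedef enum
{
    JOIN_INNER,
    JOIN_LEFT,
    JOIN_FULL,
    JOIN_RIGHT,
    JOIN_SEMI,
    JOIN_ANTI
} JoinType;

typedef struct SpecialJoinInfo
{
    JoinType jointype;
} SpecialJoinInfo;

/* A missing SpecialJoinInfo means a plain inner join. */
static JoinType
join_type_from_sjinfo(const SpecialJoinInfo *sjinfo)
{
    JoinType jointype = sjinfo ? sjinfo->jointype : JOIN_INNER;

    /* The sketch only knows the join types declared above. */
    assert(jointype <= JOIN_ANTI);
    return jointype;
}
```

With this helper a handler that receives only sjinfo can still branch on the join type, for example rejecting JOIN_SEMI and JOIN_ANTI, which the postgres_fdw patch does not push down.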

Probably, it will solve the original concern towards multiple calls of FDW
handler in case when it tries to replace an entire join subtree with a foreign-
scan on the result of remote join query.

How about your opinion?

AFAICS it strikes a good balance between the number of calls and the information available.

The new FDW API GetForeignJoinPaths is called only once for a particular join combination, such as (A join B join C). Before all joins at a given join level (the number of relations contained in the join tree) are considered, the possible join combinations at lower join levels are considered recursively. As Tom pointed out before, imagine a case like ((huge JOIN large) LEFT JOIN small): an expensive path at a lower join level might be

Here, please let me summarize the changes in the patch as the result of my review.

* Add set_join_pathlist_hook_type in add_paths_to_joinrel
This hook is intended to provide a chance to add one or more CustomPaths for an actual join combination. If the join is reversible, the hook is called for both A * B and B * A. This differs from the FDW API, but that seems fine because FDWs should get a chance to process the join at a more abstract level than CSPs.

Parameters are same as hash_inner_and_outer, so they would be enough for hash-like or nestloop-like methods. I’m not sure whether mergeclause_list is necessary as a parameter or not. It’s information for merge join which is generated when enable_mergejoin is on and the join is not FULL OUTER. Does some CSP need it for processing a join in its own way? Then it must be in parameter list because select_mergejoin_clauses is static so it’s not accessible from external modules.

The timing of the hooking, after considering all built-in path types, seems fine because some of CSPs might want to use built-in paths as a template or a source.

One concern is in the document of the hook function. "Implementing Custom Paths” says:

A custom scan provider will be also able to add paths by setting the following hook, to replace built-in join paths by custom-scan that performs as if a scan on preliminary joined relations, which is called after the core code has generated what it believes to be the complete and correct set of access paths for the join.

I think “replace” would mislead readers into thinking that a CSP can remove or edit existing built-in paths listed in RelOptInfo#pathlist or linked from cheapest_foo. IIUC a CSP can only add paths for the join relation, and the planner chooses one of them if it is the cheapest.
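
The "add, not replace" behaviour described here can be illustrated with a toy model. Everything below is a hypothetical stand-in (Path, JoinRel, join_pathlist_hook, add_path_sketch, plan_join_sketch and the cost numbers are not planner code), and unlike the real add_path() the sketch never discards dominated paths. It only shows that a hook appends candidates after the built-in ones and that the cheapest candidate wins.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_PATHS 8

/* Toy stand-ins for planner structures. */
typedef struct Path
{
    const char *label;
    double      total_cost;
} Path;

typedef struct JoinRel
{
    Path   pathlist[MAX_PATHS];
    size_t npaths;
} JoinRel;

/* Hook slot in the style of the join path hook discussed in this thread. */
typedef void (*join_pathlist_hook_type) (JoinRel *joinrel);
static join_pathlist_hook_type join_pathlist_hook = NULL;

/* Append a candidate path; existing entries are never touched. */
static void
add_path_sketch(JoinRel *joinrel, const char *label, double total_cost)
{
    assert(joinrel->npaths < MAX_PATHS);
    joinrel->pathlist[joinrel->npaths].label = label;
    joinrel->pathlist[joinrel->npaths].total_cost = total_cost;
    joinrel->npaths++;
}

/* Two example providers: one cheaper and one costlier than the built-ins. */
static void
cheap_custom_join(JoinRel *joinrel)
{
    add_path_sketch(joinrel, "custom", 80.0);
}

static void
costly_custom_join(JoinRel *joinrel)
{
    add_path_sketch(joinrel, "custom", 300.0);
}

/* Built-in paths first, then the hook, then pick the cheapest overall. */
static const Path *
plan_join_sketch(JoinRel *joinrel)
{
    const Path *cheapest;
    size_t      i;

    joinrel->npaths = 0;
    add_path_sketch(joinrel, "nestloop", 500.0);
    add_path_sketch(joinrel, "hashjoin", 120.0);

    if (join_pathlist_hook)
        join_pathlist_hook(joinrel);

    cheapest = &joinrel->pathlist[0];
    for (i = 1; i < joinrel->npaths; i++)
        if (joinrel->pathlist[i].total_cost < cheapest->total_cost)
            cheapest = &joinrel->pathlist[i];
    return cheapest;
}
```

In this model a costly custom path leaves the built-in hash join as the winner, which is the sense in which a provider cannot "replace" anything.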

* Add new FDW API GetForeignJoinPaths in make_join_rel
This FDW API is intended to provide a chance to add ForeignPaths for a join relation. It is called only once for a join relation, so an FDW should consider the reversed combination itself if that is meaningful for its own mechanism.

Note that this is called only when the join relation was *NOT* found in the PlannerInfo, to avoid redundant calls.

The parameters seem sufficient for postgres_fdw to process an N-way join on the remote side, pushing down join conditions and remote filters.
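
The "called only once per join relation" rule can be sketched as a find-or-build lookup. The structures below are hypothetical simplifications: relids are a plain bit mask rather than a Bitmapset, PlannerSketch and JoinRel are mock-ups, and the fdw_calls counter stands in for the actual GetForeignJoinPaths invocation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_JOINRELS 16

/* Toy join relation: member relations as a bit mask plus a call counter. */
typedef struct JoinRel
{
    uint32_t relids;
    int      fdw_calls;
} JoinRel;

typedef struct PlannerSketch
{
    JoinRel join_rel_list[MAX_JOINRELS];
    size_t  njoinrels;
} PlannerSketch;

/* Return the already-built join relation for relids, or NULL. */
static JoinRel *
find_join_rel_sketch(PlannerSketch *root, uint32_t relids)
{
    size_t i;

    for (i = 0; i < root->njoinrels; i++)
        if (root->join_rel_list[i].relids == relids)
            return &root->join_rel_list[i];
    return NULL;
}

/* Build the join relation on first sight and notify the FDW only then. */
static JoinRel *
make_join_rel_sketch(PlannerSketch *root,
                     uint32_t outer_relids, uint32_t inner_relids)
{
    uint32_t relids = outer_relids | inner_relids;
    JoinRel *joinrel = find_join_rel_sketch(root, relids);

    if (joinrel == NULL)
    {
        assert(root->njoinrels < MAX_JOINRELS);
        joinrel = &root->join_rel_list[root->njoinrels++];
        joinrel->relids = relids;
        joinrel->fdw_calls = 0;
        /* Stand-in for the single GetForeignJoinPaths call. */
        joinrel->fdw_calls++;
    }
    return joinrel;
}
```

Because A join B and B join A map to the same relids, the second request finds the existing entry and the FDW is not called again, which is why the FDW has to consider the reversed combination on its own.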

* Propagate FDW information through bottom-up planning
An FDW can, of course, handle a join that uses foreign tables it manages. We already obtain the FDW routine entry when planning a scan against a foreign table, so propagating that information up to the join phase helps the core planner check whether all sources are managed by one FDW. It also avoids repeated catalog accesses.

* Make create_plan_recurse non-static
This is for CSPs and FDWs which want underlying plan nodes of a join. For example, a CSP might want outer/inner plan nodes as input sources of a join.

* Treat scanrelid == 0 as pseudo scan
A foreign/custom join is represented by a scan against a pseudo relation, i.e. the result of a join. Usually a Scan has a valid scanrelid, the range table index of the relation being scanned, and many functions assume that it is always valid. The patch adds separate code paths that treat scanrelid == 0 as a custom/foreign join scan.

* Pseudo scan target list support
CustomScan and ForeignScan have csp_ps_tlist and fdw_ps_tlist respectively, for column reference tracking. A scan generated for a custom/foreign join would have columns from multiple relations in its target list, i.e. its output columns. Ordinary scans have all valid columns of the relation as output, so references to them can be resolved easily, but we need an additional mechanism to determine where a reference in the target list of a custom/foreign scan comes from. This is very similar to what IndexOnlyScan does, so we reuse INDEX_VAR as the mark of an indirect reference to another relation’s var.

For this mechanism, set_plan_refs is changed to fix Vars in ps_tlist of CustomScan and ForeignScan. For this change, new BitmapSet function bms_shift_members is added.

set_deparse_planstate is also changed to pass ps_tlist as namespace for deparsing.
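
The indirect reference mechanism can be sketched as follows. This is a simplified model, not the set_plan_refs code: Var is reduced to two fields, the pseudo target list is a plain array, fix_var_from_ps_tlist is a hypothetical name, and the INDEX_VAR value is only a stand-in marker chosen for the sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Stand-in marker for "look me up in the pseudo target list". */
#define INDEX_VAR 65002

/* Reduced Var: which relation, and which column of it. */
typedef struct Var
{
    int varno;
    int varattno;
} Var;

/*
 * If var matches an entry of the pseudo scan target list, rewrite it to
 * point at that entry's 1-based position and return true.  Otherwise
 * leave var untouched and return false.
 */
static bool
fix_var_from_ps_tlist(Var *var, const Var *ps_tlist, size_t ntlist)
{
    size_t i;

    assert(var != NULL);
    assert(ps_tlist != NULL || ntlist == 0);

    for (i = 0; i < ntlist; i++)
    {
        if (ps_tlist[i].varno == var->varno &&
            ps_tlist[i].varattno == var->varattno)
        {
            var->varno = INDEX_VAR;
            var->varattno = (int) i + 1;
            return true;
        }
    }
    return false;
}
```

After the rewrite, the executor and EXPLAIN only need the position within the pseudo target list to find the underlying column, which is the same indirection an index-only scan uses for its index columns.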

These changes seem reasonable, so I mark this patch as “ready for committers” to hear committers' thoughts.

Regards,
--
Shigeru HANADA
shigeru.hanada@gmail.com

#32

Kouhei Kaigai

kaigai@ak.jp.nec.com

over 10 years ago

In reply to: Shigeru HANADA (#31)

1 attachment(s)

Hanada-san,

I reviewed the Custom/Foreign join API patch again after writing a patch of join
push-down support for postgres_fdw.

Thanks for your dedicated work; my comments are inline below.

Here, please let me summarize the changes in the patch as the result of my review.

* Add set_join_pathlist_hook_type in add_paths_to_joinrel
This hook is intended to provide a chance to add one or more CustomPaths for an
actual join combination. If the join is reversible, the hook is called for both
A * B and B * A. This is different from FDW API but it seems fine because FDWs
should have chances to process the join in more abstract level than CSPs.

Parameters are same as hash_inner_and_outer, so they would be enough for hash-like
or nestloop-like methods. I’m not sure whether mergeclause_list is necessary
as a parameter or not. It’s information for merge join which is generated when
enable_mergejoin is on and the join is not FULL OUTER. Does some CSP need it
for processing a join in its own way? Then it must be in parameter list because
select_mergejoin_clauses is static so it’s not accessible from external modules.

I think it is preferable for the extension to reproduce the mergeclause_list
itself rather than receive it as a hook argument, because it is uncertain whether
a CSP should follow the "enable_mergejoin" parameter even if it implements
merge-join-like logic. Of course, that requires exposing select_mergejoin_clauses,
which seems to me a straightforward approach.

The timing of the hooking, after considering all built-in path types, seems fine
because some of CSPs might want to use built-in paths as a template or a source.

One concern is the documentation of the hook function. “Implementing Custom Paths”
says:

A custom scan provider will be also able to add paths by setting the following
hook, to replace built-in join paths by custom-scan that performs as if a scan
on preliminary joined relations, which is called after the core code has generated
what it believes to be the complete and correct set of access paths for the join.

I think “replace” would mislead readers into believing that a CSP can remove or edit
the existing built-in paths listed in RelOptInfo#pathlist or linked from cheapest_foo.
IIUC, a CSP can only add paths for the join relation, and the planner chooses one of
them if it is the cheapest.

I adjusted the documentation stuff as follows:

A custom scan provider will be also able to add paths by setting the
following hook, to add <literal>CustomPath</> nodes that perform as
if built-in join logic doing. It is typically expected to take two
input relations then generate a joined output stream, or just scans
preliminary joined relations like materialized-view. This hook is
called next to the consideration of core join logics, then planner
will choose the best path to run the relations join in the built-in
and custom ones.

This should describe how the hook works more accurately.
Only this portion was updated in the v12 patch.

* Add new FDW API GetForeignJoinPaths in make_join_rel
This FDW API is intended to provide a chance to add ForeignPaths for a join relation.
This is called only once for a join relation, so FDW should consider reversed
combination if it’s meaningful in their own mechanisms.

Note that this is called only when the join relation was *NOT* found in the
PlannerInfo, to avoid redundant calls.

Yep, it is designed according to the discussion upthread.
It can produce N-way remote join paths even if an intermediate join relation is
more expensive than a local join on top of two foreign scans.
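
As a rough illustration of the "only once per join relation" rule above, here
is a minimal, self-contained C sketch. It does not use the PostgreSQL headers:
JoinRelCache, lookup_or_build_joinrel, consider_join and fdw_join_hook_calls
are hypothetical stand-ins (relids are modeled as a plain bit mask), and only
the found-flag logic is meant to mirror the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_JOINRELS 16

/* Hypothetical stand-in for the planner's list of known join relations. */
typedef struct JoinRelCache
{
    int      count;
    uint32_t relids[MAX_JOINRELS];  /* one bit per base relation */
} JoinRelCache;

/* Counts how often the (imaginary) FDW join callback would be invoked. */
static int fdw_join_hook_calls = 0;

/*
 * Return the slot of the join relation covering "relids", creating it if
 * needed.  *found reports whether it already existed, in the spirit of the
 * extra output argument the patch adds to build_join_rel().
 */
static int
lookup_or_build_joinrel(JoinRelCache *cache, uint32_t relids, bool *found)
{
    assert(cache != NULL && found != NULL);

    for (int i = 0; i < cache->count; i++)
    {
        if (cache->relids[i] == relids)
        {
            *found = true;
            return i;
        }
    }
    *found = false;
    if (cache->count >= MAX_JOINRELS)
        return -1;              /* sketch only: no room left */
    cache->relids[cache->count] = relids;
    return cache->count++;
}

/* One join attempt: the FDW callback fires only for a brand-new joinrel. */
static void
consider_join(JoinRelCache *cache, uint32_t outer, uint32_t inner)
{
    bool found;
    int  slot = lookup_or_build_joinrel(cache, outer | inner, &found);

    if (slot >= 0 && !found)
        fdw_join_hook_calls++;  /* stands in for GetForeignJoinPaths */

    /* built-in join paths for this outer/inner order would be added here */
}
```

Calling consider_join for A * B and then for B * A bumps the counter once,
which is why the FDW itself has to consider the reversed combination.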

The parameters seem sufficient for postgres_fdw to process an N-way join on the
remote side, pushing down join conditions and remote filters.

You ensured it clearly.

* Treat scanrelid == 0 as pseudo scan
A foreign/custom join is represented by a scan against a pseudo relation, i.e. the
result of a join. Usually a Scan has a valid scanrelid, the range-table index of the
relation being scanned, and many functions assume that it is always valid. The patch
adds separate code paths that treat scanrelid == 0 as a custom/foreign join scan.

Right.
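
To make the scanrelid == 0 convention concrete, here is a small stand-alone
sketch of the decision the executor has to make when it picks the varno used
for a scan's target list. ScanKind, STANDIN_INDEX_VAR and scan_tlist_varno are
hypothetical names for this sketch only; the constant's value is arbitrary and
is not taken from the PostgreSQL headers.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the node tags involved. */
typedef enum ScanKind
{
    SCAN_INDEX_ONLY,
    SCAN_FOREIGN,
    SCAN_CUSTOM,
    SCAN_OTHER
} ScanKind;

/* Arbitrary marker for "indirect reference"; plays the role of INDEX_VAR. */
#define STANDIN_INDEX_VAR 65002u

/* A foreign/custom scan with scanrelid == 0 scans a pseudo relation. */
static bool
scan_is_pseudo(ScanKind kind, unsigned scanrelid)
{
    return scanrelid == 0 && (kind == SCAN_FOREIGN || kind == SCAN_CUSTOM);
}

/* Pick the varno that Vars in the scan's target list are expected to carry. */
static unsigned
scan_tlist_varno(ScanKind kind, unsigned scanrelid)
{
    /* real range-table indexes must stay below the special marker */
    assert(scanrelid < STANDIN_INDEX_VAR);

    if (kind == SCAN_INDEX_ONLY || scan_is_pseudo(kind, scanrelid))
        return STANDIN_INDEX_VAR;
    return scanrelid;
}
```

An ordinary foreign scan on range-table entry 3 keeps varno 3, while a
foreign or custom join scan gets the indirect marker, the same way an
index-only scan does.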

* Pseudo scan target list support
CustomScan and ForeignScan have csp_ps_tlist and fdw_ps_tlist respectively, for
column reference tracking. A scan generated for a custom/foreign join has
columns from multiple relations in its target list, i.e. its output columns. Ordinary
scans have all valid columns of the relation as output, so references to them
can be resolved easily, but we need an additional mechanism to determine where
a reference in the target list of a custom/foreign scan comes from. This is very
similar to what IndexOnlyScan does, so we reuse INDEX_VAR as the mark of an indirect
reference to another relation’s var.

Right, the FDW/CSP driver is responsible for setting *_ps_tlist to inform the core
planner which columns of which relations are referenced, and which attribute
represents which column/relation. It is an interface contract when a foreign/custom
scan is chosen instead of the built-in join logic.
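
One consequence of this contract, which the patch checks at plan creation, is
that the scan tuple descriptor is built from the non-junk entries only, so all
valid entries have to come before the junk ones. A minimal sketch of that
check follows; PsTlistEntry and both helper functions are hypothetical
stand-ins for this illustration, not the TargetEntry API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for one entry of a pseudo-scan target list. */
typedef struct PsTlistEntry
{
    const char *name;       /* e.g. "t1.a" */
    bool        resjunk;    /* true: not part of the scan tuple */
} PsTlistEntry;

/* True if no valid (non-junk) entry appears after a junk one. */
static bool
ps_tlist_is_well_ordered(const PsTlistEntry *entries, size_t n)
{
    bool found_resjunk = false;

    assert(entries != NULL || n == 0);

    for (size_t i = 0; i < n; i++)
    {
        if (entries[i].resjunk)
            found_resjunk = true;
        else if (found_resjunk)
            return false;
    }
    return true;
}

/* Number of columns the scan slot would have: the non-junk entries. */
static size_t
ps_tlist_scan_width(const PsTlistEntry *entries, size_t n)
{
    size_t width = 0;

    assert(entries != NULL || n == 0);

    for (size_t i = 0; i < n; i++)
    {
        if (!entries[i].resjunk)
            width++;
    }
    return width;
}
```

A list such as {t1.a, t2.b, junk join clause} passes and yields a two-column
scan tuple; moving the junk entry to the front would be rejected.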

For this mechanism, set_plan_refs is changed to fix up Vars in the ps_tlist of
CustomScan and ForeignScan. To support this change, a new Bitmapset function,
bms_shift_members, is added.
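
For readers unfamiliar with why a shift is needed: when a subquery's range
table is appended to the parent's, every relation index is offset by rtoffset,
so a set of relids has to move by the same amount. The sketch below shows the
idea on a tiny fixed-size bit set. RelidSet and its helpers are hypothetical,
and it deliberately moves members one at a time instead of doing word-level
shifts, which keeps it simple at the cost of speed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define WORD_BITS 64
#define MAX_WORDS 4
#define CAPACITY  (WORD_BITS * MAX_WORDS)

/* Hypothetical fixed-size stand-in for a Bitmapset of relation indexes. */
typedef struct RelidSet
{
    uint64_t words[MAX_WORDS];
} RelidSet;

static void
relidset_add(RelidSet *set, int member)
{
    assert(member >= 0 && member < CAPACITY);
    set->words[member / WORD_BITS] |= (uint64_t) 1 << (member % WORD_BITS);
}

static bool
relidset_has(const RelidSet *set, int member)
{
    if (member < 0 || member >= CAPACITY)
        return false;
    return (set->words[member / WORD_BITS] >> (member % WORD_BITS)) & 1;
}

/*
 * Store in *dst every member of *src moved by "offset" (may be negative).
 * Returns false if some member would fall outside the set's capacity.
 */
static bool
relidset_shift(const RelidSet *src, int offset, RelidSet *dst)
{
    assert(src != NULL && dst != NULL && src != dst);

    memset(dst, 0, sizeof(*dst));
    for (int member = 0; member < CAPACITY; member++)
    {
        int moved = member + offset;

        if (!relidset_has(src, member))
            continue;
        if (moved < 0 || moved >= CAPACITY)
            return false;
        relidset_add(dst, moved);
    }
    return true;
}
```

Shifting {1, 3, 63} by 2 gives {3, 5, 65}; the last member crosses a word
boundary, which is the case a word-level implementation has to handle with
care.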

set_deparse_planstate is also changed to pass ps_tlist as namespace for
deparsing.

Yep, it is same as IndexOnlyScan.

These changes seem reasonable, so I am marking this patch as “ready for committers”
to hear committers' thoughts.

Thanks!
--
NEC Business Creation Division / PG-Strom Project
KaiGai Kohei <kaigai@ak.jp.nec.com>

Attachments:

pgsql-v9.5-custom-join.v12.patch (application/octet-stream)

 doc/src/sgml/custom-scan.sgml           | 46 +++++++++++++++++++
 doc/src/sgml/fdwhandler.sgml            | 51 +++++++++++++++++++++
 src/backend/commands/explain.c          | 15 +++++--
 src/backend/executor/execScan.c         |  4 ++
 src/backend/executor/nodeCustom.c       | 38 ++++++++++++----
 src/backend/executor/nodeForeignscan.c  | 34 +++++++++-----
 src/backend/foreign/foreign.c           | 31 ++++++++++---
 src/backend/nodes/bitmapset.c           | 57 +++++++++++++++++++++++
 src/backend/nodes/copyfuncs.c           |  5 +++
 src/backend/nodes/outfuncs.c            |  5 +++
 src/backend/optimizer/path/allpaths.c   |  1 -
 src/backend/optimizer/path/joinpath.c   | 13 ++++++
 src/backend/optimizer/path/joinrels.c   | 21 ++++++++-
 src/backend/optimizer/plan/createplan.c | 80 ++++++++++++++++++++++++++-------
 src/backend/optimizer/plan/setrefs.c    | 64 ++++++++++++++++++++++++++
 src/backend/optimizer/util/plancat.c    |  7 ++-
 src/backend/optimizer/util/relnode.c    | 22 ++++++++-
 src/backend/utils/adt/ruleutils.c       |  4 ++
 src/include/foreign/fdwapi.h            | 12 +++++
 src/include/nodes/bitmapset.h           |  1 +
 src/include/nodes/plannodes.h           | 24 +++++++---
 src/include/nodes/relation.h            |  2 +
 src/include/optimizer/pathnode.h        |  3 +-
 src/include/optimizer/paths.h           | 13 ++++++
 src/include/optimizer/planmain.h        |  1 +
 25 files changed, 502 insertions(+), 52 deletions(-)

diff --git a/doc/src/sgml/custom-scan.sgml b/doc/src/sgml/custom-scan.sgml
index 8a4a3df..bfa61c3 100644
--- a/doc/src/sgml/custom-scan.sgml
+++ b/doc/src/sgml/custom-scan.sgml
@@ -48,6 +48,30 @@ extern PGDLLIMPORT set_rel_pathlist_hook_type set_rel_pathlist_hook;
   </para>
 
   <para>
+   A custom scan provider will be also able to add paths by setting the
+   following hook, to add <literal>CustomPath</> nodes that perform as
+   if built-in join logic doing. It is typically expected to take two
+   input relations then generate a joined output stream, or just scans
+   preliminary joined relations like materialized-view. This hook is
+   called next to the consideration of core join logics, then planner
+   will choose the best path to run the relations join in the built-in
+   and custom ones.
+<programlisting>
+typedef void (*set_join_pathlist_hook_type) (PlannerInfo *root,
+                                             RelOptInfo *joinrel,
+                                             RelOptInfo *outerrel,
+                                             RelOptInfo *innerrel,
+                                             List *restrictlist,
+                                             JoinType jointype,
+                                             SpecialJoinInfo *sjinfo,
+                                             SemiAntiJoinFactors *semifactors,
+                                             Relids param_source_rels,
+                                             Relids extra_lateral_rels);
+extern PGDLLIMPORT set_join_pathlist_hook_type set_join_pathlist_hook;
+</programlisting>
+  </para>
+
+  <para>
     Although this hook function can be used to examine, modify, or remove
     paths generated by the core system, a custom scan provider will typically
     confine itself to generating <structname>CustomPath</> objects and adding
@@ -124,7 +148,9 @@ typedef struct CustomScan
     Scan      scan;
     uint32    flags;
     List     *custom_exprs;
+    List     *custom_ps_tlist;
     List     *custom_private;
+    List     *custom_relids;
     const CustomScanMethods *methods;
 } CustomScan;
 </programlisting>
@@ -141,10 +167,30 @@ typedef struct CustomScan
     is only used by the custom scan provider itself.  Plan trees must be able
     to be duplicated using <function>copyObject</>, so all the data stored
     within these two fields must consist of nodes that function can handle.
+    <literal>custom_relids</> is set by the backend, thus custom-scan provider
+    does not need to touch, to track underlying relations represented by this
+    custom-scan node.
     <structfield>methods</> must point to a (usually statically allocated)
     object implementing the required custom scan methods, which are further
     detailed below.
   </para>
+  <para>
+   In case when <structname>CustomScan</> replaced built-in join paths,
+   custom-scan provider must have two characteristic setup.
+   The first one is zero on the <structfield>scan.scanrelid</>, which
+   should be usually an index of range-tables. It informs the backend
+   this <structname>CustomScan</> node is not associated with a particular
+   table. The second one is valid list of <structname>TargetEntry</> on
+   the <structfield>custom_ps_tlist</>. A <structname>CustomScan</> node
+   looks to the backend like a scan as literal, but on a relation which is
+   the result of relations join. It means we cannot construct a tuple
+   descriptor based on table definition, thus custom-scan provider must
+   introduce the expected record-type of the tuples.
+   Tuple-descriptor of scan-slot shall be constructed based on the
+   <structfield>custom_ps_tlist</>, and assigned on executor initialization.
+   Also, referenced by <command>EXPLAIN</> to solve name of the underlying
+   columns and relations.
+  </para>
 
   <sect2 id="custom-scan-plan-callbacks">
    <title>Custom Scan Callbacks</title>
diff --git a/doc/src/sgml/fdwhandler.sgml b/doc/src/sgml/fdwhandler.sgml
index c1daa4b..54ba45f 100644
--- a/doc/src/sgml/fdwhandler.sgml
+++ b/doc/src/sgml/fdwhandler.sgml
@@ -598,6 +598,57 @@ IsForeignRelUpdatable (Relation rel);
 
    </sect2>
 
+   <sect2>
+    <title>FDW Routines for remote join</title>
+    <para>
+<programlisting>
+void
+GetForeignJoinPaths(PlannerInfo *root,
+                    RelOptInfo *joinrel,
+                    RelOptInfo *outerrel,
+                    RelOptInfo *innerrel,
+                    SpecialJoinInfo *sjinfo,
+                    List *restrictlist);
+</programlisting>
+     Create possible access paths for a join of two foreign tables or
+     joined relations, but both of them needs to be managed with same
+     FDW driver.
+     This optional function is called during query planning.
+    </para>
+    <para>
+     This function allows FDW driver to add <literal>ForeignScan</> path
+     towards the supplied <literal>joinrel</>. From the standpoint of
+     query planner, it looks like scan-node is added for join-relation.
+     It means, <literal>ForeignScan</> path added instead of the built-in
+     local join logic has to generate tuples as if it scans on a joined
+     and materialized relations.
+    </para>
+    <para>
+     Usually, we expect FDW drivers issues a remote query that involves
+     tables join on remote side, then FDW driver fetches the joined result
+     on local side.
+     Unlike simple table scan, definition of slot descriptor of the joined
+     relations is determined on the fly, thus we cannot know its definition
+     from the system catalog.
+     So, FDW driver is responsible to introduce the query planner expected
+     form of the joined relations. In case when <literal>ForeignScan</>
+     replaced a relations join, <literal>scanrelid</> of the generated plan
+     node shall be zero, to mark this <literal>ForeignScan</> node is not
+     associated with a particular foreign tables.
+     Also, it need to construct pseudo scan tlist (<literal>fdw_ps_tlist</>)
+     to indicate expected tuple definition.
+    </para>
+    <para>
+     Once <literal>scanrelid</> equals zero, executor initializes the slot
+     for scan according to <literal>fdw_ps_tlist</>, but excludes junk
+     entries. This list is also used to solve the name of the original
+     relation and columns, so FDW can chains expression nodes which are
+     not run on local side actually, like a join clause to be executed on
+     the remote side, however, target-entries of them will have
+     <literal>resjunk=true</>.
+    </para>
+   </sect2>
+
    <sect2 id="fdw-callbacks-explain">
     <title>FDW Routines for <command>EXPLAIN</></title>
 
diff --git a/src/backend/commands/explain.c b/src/backend/commands/explain.c
index 315a528..f4cc901 100644
--- a/src/backend/commands/explain.c
+++ b/src/backend/commands/explain.c
@@ -730,11 +730,17 @@ ExplainPreScanNode(PlanState *planstate, Bitmapset **rels_used)
 		case T_ValuesScan:
 		case T_CteScan:
 		case T_WorkTableScan:
-		case T_ForeignScan:
-		case T_CustomScan:
 			*rels_used = bms_add_member(*rels_used,
 										((Scan *) plan)->scanrelid);
 			break;
+		case T_ForeignScan:
+			*rels_used = bms_add_members(*rels_used,
+										 ((ForeignScan *) plan)->fdw_relids);
+			break;
+		case T_CustomScan:
+			*rels_used = bms_add_members(*rels_used,
+										 ((CustomScan *) plan)->custom_relids);
+			break;
 		case T_ModifyTable:
 			*rels_used = bms_add_member(*rels_used,
 									((ModifyTable *) plan)->nominalRelation);
@@ -1072,9 +1078,12 @@ ExplainNode(PlanState *planstate, List *ancestors,
 		case T_ValuesScan:
 		case T_CteScan:
 		case T_WorkTableScan:
+			ExplainScanTarget((Scan *) plan, es);
+			break;
 		case T_ForeignScan:
 		case T_CustomScan:
-			ExplainScanTarget((Scan *) plan, es);
+			if (((Scan *) plan)->scanrelid > 0)
+				ExplainScanTarget((Scan *) plan, es);
 			break;
 		case T_IndexScan:
 			{
diff --git a/src/backend/executor/execScan.c b/src/backend/executor/execScan.c
index 3f0d809..2f18a8a 100644
--- a/src/backend/executor/execScan.c
+++ b/src/backend/executor/execScan.c
@@ -251,6 +251,10 @@ ExecAssignScanProjectionInfo(ScanState *node)
 	/* Vars in an index-only scan's tlist should be INDEX_VAR */
 	if (IsA(scan, IndexOnlyScan))
 		varno = INDEX_VAR;
+	/* Also foreign-/custom-scan on pseudo relation should be INDEX_VAR */
+	else if (scan->scanrelid == 0 &&
+			 (IsA(scan, ForeignScan) || IsA(scan, CustomScan)))
+		varno = INDEX_VAR;
 	else
 		varno = scan->scanrelid;
 
diff --git a/src/backend/executor/nodeCustom.c b/src/backend/executor/nodeCustom.c
index b07932b..2344129 100644
--- a/src/backend/executor/nodeCustom.c
+++ b/src/backend/executor/nodeCustom.c
@@ -23,6 +23,7 @@ CustomScanState *
 ExecInitCustomScan(CustomScan *cscan, EState *estate, int eflags)
 {
 	CustomScanState    *css;
+	Index				scan_relid = cscan->scan.scanrelid;
 	Relation			scan_rel;
 
 	/* populate a CustomScanState according to the CustomScan */
@@ -48,12 +49,31 @@ ExecInitCustomScan(CustomScan *cscan, EState *estate, int eflags)
 	ExecInitScanTupleSlot(estate, &css->ss);
 	ExecInitResultTupleSlot(estate, &css->ss.ps);
 
-	/* initialize scan relation */
-	scan_rel = ExecOpenScanRelation(estate, cscan->scan.scanrelid, eflags);
-	css->ss.ss_currentRelation = scan_rel;
-	css->ss.ss_currentScanDesc = NULL;	/* set by provider */
-	ExecAssignScanType(&css->ss, RelationGetDescr(scan_rel));
-
+	/*
+	 * open the base relation and acquire appropriate lock on it, then
+	 * get the scan type from the relation descriptor, if this custom
+	 * scan is on actual relations.
+	 *
+	 * on the other hands, custom-scan may scan on a pseudo relation;
+	 * that is usually a result-set of relations join by external
+	 * computing resource, or others. It has to get the scan type from
+	 * the pseudo-scan target-list that should be assigned by custom-scan
+	 * provider.
+	 */
+	if (scan_relid > 0)
+	{
+		scan_rel = ExecOpenScanRelation(estate, scan_relid, eflags);
+		css->ss.ss_currentRelation = scan_rel;
+		css->ss.ss_currentScanDesc = NULL;	/* set by provider */
+		ExecAssignScanType(&css->ss, RelationGetDescr(scan_rel));
+	}
+	else
+	{
+		TupleDesc	ps_tupdesc;
+
+		ps_tupdesc = ExecCleanTypeFromTL(cscan->custom_ps_tlist, false);
+		ExecAssignScanType(&css->ss, ps_tupdesc);
+	}
 	css->ss.ps.ps_TupFromTlist = false;
 
 	/*
@@ -89,11 +109,11 @@ ExecEndCustomScan(CustomScanState *node)
 
 	/* Clean out the tuple table */
 	ExecClearTuple(node->ss.ps.ps_ResultTupleSlot);
-	if (node->ss.ss_ScanTupleSlot)
-		ExecClearTuple(node->ss.ss_ScanTupleSlot);
+	ExecClearTuple(node->ss.ss_ScanTupleSlot);
 
 	/* Close the heap relation */
-	ExecCloseScanRelation(node->ss.ss_currentRelation);
+	if (node->ss.ss_currentRelation)
+		ExecCloseScanRelation(node->ss.ss_currentRelation);
 }
 
 void
diff --git a/src/backend/executor/nodeForeignscan.c b/src/backend/executor/nodeForeignscan.c
index 7399053..542d176 100644
--- a/src/backend/executor/nodeForeignscan.c
+++ b/src/backend/executor/nodeForeignscan.c
@@ -102,6 +102,7 @@ ForeignScanState *
 ExecInitForeignScan(ForeignScan *node, EState *estate, int eflags)
 {
 	ForeignScanState *scanstate;
+	Index		scanrelid = node->scan.scanrelid;
 	Relation	currentRelation;
 	FdwRoutine *fdwroutine;
 
@@ -141,16 +142,28 @@ ExecInitForeignScan(ForeignScan *node, EState *estate, int eflags)
 	ExecInitScanTupleSlot(estate, &scanstate->ss);
 
 	/*
-	 * open the base relation and acquire appropriate lock on it.
+	 * open the base relation and acquire appropriate lock on it, then
+	 * get the scan type from the relation descriptor, if this foreign
+	 * scan is on actual foreign-table.
+	 *
+	 * on the other hands, foreign-scan may scan on a pseudo relation;
+	 * that is usually a result-set of remote relations join. It has
+	 * to get the scan type from the pseudo-scan target-list that should
+	 * be assigned by FDW driver.
 	 */
-	currentRelation = ExecOpenScanRelation(estate, node->scan.scanrelid, eflags);
-	scanstate->ss.ss_currentRelation = currentRelation;
+	if (scanrelid > 0)
+	{
+		currentRelation = ExecOpenScanRelation(estate, scanrelid, eflags);
+		scanstate->ss.ss_currentRelation = currentRelation;
+		ExecAssignScanType(&scanstate->ss, RelationGetDescr(currentRelation));
+	}
+	else
+	{
+		TupleDesc	ps_tupdesc;
 
-	/*
-	 * get the scan type from the relation descriptor.  (XXX at some point we
-	 * might want to let the FDW editorialize on the scan tupdesc.)
-	 */
-	ExecAssignScanType(&scanstate->ss, RelationGetDescr(currentRelation));
+		ps_tupdesc = ExecCleanTypeFromTL(node->fdw_ps_tlist, false);
+		ExecAssignScanType(&scanstate->ss, ps_tupdesc);
+	}
 
 	/*
 	 * Initialize result tuple type and projection info.
@@ -161,7 +174,7 @@ ExecInitForeignScan(ForeignScan *node, EState *estate, int eflags)
 	/*
 	 * Acquire function pointers from the FDW's handler, and init fdw_state.
 	 */
-	fdwroutine = GetFdwRoutineForRelation(currentRelation, true);
+	fdwroutine = GetFdwRoutine(node->fdw_handler);
 	scanstate->fdwroutine = fdwroutine;
 	scanstate->fdw_state = NULL;
 
@@ -193,7 +206,8 @@ ExecEndForeignScan(ForeignScanState *node)
 	ExecClearTuple(node->ss.ss_ScanTupleSlot);
 
 	/* close the relation. */
-	ExecCloseScanRelation(node->ss.ss_currentRelation);
+	if (node->ss.ss_currentRelation)
+		ExecCloseScanRelation(node->ss.ss_currentRelation);
 }
 
 /* ----------------------------------------------------------------
diff --git a/src/backend/foreign/foreign.c b/src/backend/foreign/foreign.c
index cbe8b78..1901749 100644
--- a/src/backend/foreign/foreign.c
+++ b/src/backend/foreign/foreign.c
@@ -304,11 +304,11 @@ GetFdwRoutine(Oid fdwhandler)
 
 
 /*
- * GetFdwRoutineByRelId - look up the handler of the foreign-data wrapper
- * for the given foreign table, and retrieve its FdwRoutine struct.
+ * GetFdwHandlerByRelId - look up the handler of the foreign-data wrapper
+ * for the given foreign table
  */
-FdwRoutine *
-GetFdwRoutineByRelId(Oid relid)
+static Oid
+GetFdwHandlerByRelId(Oid relid)
 {
 	HeapTuple	tp;
 	Form_pg_foreign_data_wrapper fdwform;
@@ -350,7 +350,18 @@ GetFdwRoutineByRelId(Oid relid)
 
 	ReleaseSysCache(tp);
 
-	/* And finally, call the handler function. */
+	return fdwhandler;
+}
+
+/*
+ * GetFdwRoutineByRelId - look up the handler of the foreign-data wrapper
+ * for the given foreign table, and retrieve its FdwRoutine struct.
+ */
+FdwRoutine *
+GetFdwRoutineByRelId(Oid relid)
+{
+	Oid			fdwhandler = GetFdwHandlerByRelId(relid);
+
 	return GetFdwRoutine(fdwhandler);
 }
 
@@ -398,6 +409,16 @@ GetFdwRoutineForRelation(Relation relation, bool makecopy)
 	return relation->rd_fdwroutine;
 }
 
+/*
+ * GetFdwHandlerForRelation
+ *
+ * returns OID of FDW handler which is associated with the given relation.
+ */
+Oid
+GetFdwHandlerForRelation(Relation relation)
+{
+	return GetFdwHandlerByRelId(RelationGetRelid(relation));
+}
 
 /*
  * IsImportableForeignTable - filter table names for IMPORT FOREIGN SCHEMA
diff --git a/src/backend/nodes/bitmapset.c b/src/backend/nodes/bitmapset.c
index a9c3b4b..4dc3286 100644
--- a/src/backend/nodes/bitmapset.c
+++ b/src/backend/nodes/bitmapset.c
@@ -301,6 +301,63 @@ bms_difference(const Bitmapset *a, const Bitmapset *b)
 }
 
 /*
+ * bms_shift_members - move all the bits by shift
+ */
+Bitmapset *
+bms_shift_members(const Bitmapset *a, int shift)
+{
+	Bitmapset  *b;
+	bitmapword	h_word;
+	bitmapword	l_word;
+	int			nwords;
+	int			w_shift;
+	int			b_shift;
+	int			i, j;
+
+	/* fast path if result shall be NULL obviously */
+	if (a == NULL || a->nwords * BITS_PER_BITMAPWORD + shift <= 0)
+		return NULL;
+	/* actually, not shift members */
+	if (shift == 0)
+		return bms_copy(a);
+
+	nwords = (a->nwords * BITS_PER_BITMAPWORD + shift +
+			  BITS_PER_BITMAPWORD - 1) / BITS_PER_BITMAPWORD;
+	b = palloc(BITMAPSET_SIZE(nwords));
+	b->nwords = nwords;
+
+	if (shift > 0)
+	{
+		/* Left shift */
+		w_shift = WORDNUM(shift);
+		b_shift = BITNUM(shift);
+
+		for (i=0, j=-w_shift; i < b->nwords; i++, j++)
+		{
+			h_word = (j >= 0   && j   < a->nwords ? a->words[j] : 0);
+			l_word = (j-1 >= 0 && j-1 < a->nwords ? a->words[j-1] : 0);
+			b->words[i] = ((h_word << b_shift) |
+						   (l_word >> (BITS_PER_BITMAPWORD - b_shift)));
+		}
+	}
+	else
+	{
+		/* Right shift */
+		w_shift = WORDNUM(-shift);
+		b_shift = BITNUM(-shift);
+
+		for (i=0, j=-w_shift; i < b->nwords; i++, j++)
+		{
+			h_word = (j+1 >= 0 && j+1 < a->nwords ? a->words[j+1] : 0);
+			l_word = (j >= 0 && j < a->nwords ? a->words[j] : 0);
+			b->words[i] = ((h_word >> (BITS_PER_BITMAPWORD - b_shift)) |
+						   (l_word << b_shift));
+		}
+	}
+	return b;
+}
+
+/*
  * bms_is_subset - is A a subset of B?
  */
 bool
diff --git a/src/backend/nodes/copyfuncs.c b/src/backend/nodes/copyfuncs.c
index 029761e..61379a7 100644
--- a/src/backend/nodes/copyfuncs.c
+++ b/src/backend/nodes/copyfuncs.c
@@ -592,8 +592,11 @@ _copyForeignScan(const ForeignScan *from)
 	/*
 	 * copy remainder of node
 	 */
+	COPY_SCALAR_FIELD(fdw_handler);
 	COPY_NODE_FIELD(fdw_exprs);
+	COPY_NODE_FIELD(fdw_ps_tlist);
 	COPY_NODE_FIELD(fdw_private);
+	COPY_BITMAPSET_FIELD(fdw_relids);
 	COPY_SCALAR_FIELD(fsSystemCol);
 
 	return newnode;
@@ -617,7 +620,9 @@ _copyCustomScan(const CustomScan *from)
 	 */
 	COPY_SCALAR_FIELD(flags);
 	COPY_NODE_FIELD(custom_exprs);
+	COPY_NODE_FIELD(custom_ps_tlist);
 	COPY_NODE_FIELD(custom_private);
+	COPY_BITMAPSET_FIELD(custom_relids);
 
 	/*
 	 * NOTE: The method field of CustomScan is required to be a pointer to a
diff --git a/src/backend/nodes/outfuncs.c b/src/backend/nodes/outfuncs.c
index 385b289..a178132 100644
--- a/src/backend/nodes/outfuncs.c
+++ b/src/backend/nodes/outfuncs.c
@@ -558,8 +558,11 @@ _outForeignScan(StringInfo str, const ForeignScan *node)
 
 	_outScanInfo(str, (const Scan *) node);
 
+	WRITE_OID_FIELD(fdw_handler);
 	WRITE_NODE_FIELD(fdw_exprs);
+	WRITE_NODE_FIELD(fdw_ps_tlist);
 	WRITE_NODE_FIELD(fdw_private);
+	WRITE_BITMAPSET_FIELD(fdw_relids);
 	WRITE_BOOL_FIELD(fsSystemCol);
 }
 
@@ -572,7 +575,9 @@ _outCustomScan(StringInfo str, const CustomScan *node)
 
 	WRITE_UINT_FIELD(flags);
 	WRITE_NODE_FIELD(custom_exprs);
+	WRITE_NODE_FIELD(custom_ps_tlist);
 	WRITE_NODE_FIELD(custom_private);
+	WRITE_BITMAPSET_FIELD(custom_relids);
 	appendStringInfoString(str, " :methods ");
 	_outToken(str, node->methods->CustomName);
 	if (node->methods->TextOutCustomScan)
diff --git a/src/backend/optimizer/path/allpaths.c b/src/backend/optimizer/path/allpaths.c
index 58d78e6..14872ae 100644
--- a/src/backend/optimizer/path/allpaths.c
+++ b/src/backend/optimizer/path/allpaths.c
@@ -60,7 +60,6 @@ set_rel_pathlist_hook_type set_rel_pathlist_hook = NULL;
 /* Hook for plugins to replace standard_join_search() */
 join_search_hook_type join_search_hook = NULL;
 
-
 static void set_base_rel_sizes(PlannerInfo *root);
 static void set_base_rel_pathlists(PlannerInfo *root);
 static void set_rel_size(PlannerInfo *root, RelOptInfo *rel,
diff --git a/src/backend/optimizer/path/joinpath.c b/src/backend/optimizer/path/joinpath.c
index 1da953f..61f1a78 100644
--- a/src/backend/optimizer/path/joinpath.c
+++ b/src/backend/optimizer/path/joinpath.c
@@ -17,10 +17,13 @@
 #include <math.h>
 
 #include "executor/executor.h"
+#include "foreign/fdwapi.h"
 #include "optimizer/cost.h"
 #include "optimizer/pathnode.h"
 #include "optimizer/paths.h"
 
+/* Hook for plugins to get control in add_paths_to_joinrel() */
+set_join_pathlist_hook_type set_join_pathlist_hook = NULL;
 
 #define PATH_PARAM_BY_REL(path, rel)  \
 	((path)->param_info && bms_overlap(PATH_REQ_OUTER(path), (rel)->relids))
@@ -260,6 +263,16 @@ add_paths_to_joinrel(PlannerInfo *root,
 							 restrictlist, jointype,
 							 sjinfo, &semifactors,
 							 param_source_rels, extra_lateral_rels);
+
+	/*
+	 * 5. Consider paths added by custom-scan providers, or other extensions
+	 * in addition to the built-in paths.
+	 */
+	if (set_join_pathlist_hook)
+		set_join_pathlist_hook(root, joinrel, outerrel, innerrel,
+							   restrictlist, jointype,
+							   sjinfo, &semifactors,
+							   param_source_rels, extra_lateral_rels);
 }
 
 /*
diff --git a/src/backend/optimizer/path/joinrels.c b/src/backend/optimizer/path/joinrels.c
index fe9fd57..b1c7bcb 100644
--- a/src/backend/optimizer/path/joinrels.c
+++ b/src/backend/optimizer/path/joinrels.c
@@ -14,6 +14,7 @@
  */
 #include "postgres.h"
 
+#include "foreign/fdwapi.h"
 #include "optimizer/joininfo.h"
 #include "optimizer/pathnode.h"
 #include "optimizer/paths.h"
@@ -582,6 +583,7 @@ make_join_rel(PlannerInfo *root, RelOptInfo *rel1, RelOptInfo *rel2)
 	SpecialJoinInfo sjinfo_data;
 	RelOptInfo *joinrel;
 	List	   *restrictlist;
+	bool		found;
 
 	/* We should never try to join two overlapping sets of rels. */
 	Assert(!bms_overlap(rel1->relids, rel2->relids));
@@ -635,7 +637,7 @@ make_join_rel(PlannerInfo *root, RelOptInfo *rel1, RelOptInfo *rel2)
 	 * goes with this particular joining.
 	 */
 	joinrel = build_join_rel(root, joinrelids, rel1, rel2, sjinfo,
-							 &restrictlist);
+							 &restrictlist, &found);
 
 	/*
 	 * If we've already proven this join is empty, we needn't consider any
@@ -648,6 +650,23 @@ make_join_rel(PlannerInfo *root, RelOptInfo *rel1, RelOptInfo *rel2)
 	}
 
 	/*
+	 * Prior to all the built-in join logics, consider paths that replaces
+	 * an entire join sub-tree by foreign-scan path, both of inner/outer
+	 * relations are managed by same FDW driver.
+	 * We expect remote join path has usually cheaper cost than local join
+	 * on top of two foreign-scan, so we consult FDW driver to add remote-
+	 * join path first, to break off path consideration with local join
+	 * logics.
+	 */
+	if (!found &&
+		joinrel->fdwroutine &&
+		joinrel->fdwroutine->GetForeignJoinPaths)
+	{
+		joinrel->fdwroutine->GetForeignJoinPaths(root, joinrel, rel1, rel2,
+												 sjinfo, restrictlist);
+	}
+
+	/*
 	 * Consider paths using each rel as both outer and inner.  Depending on
 	 * the join type, a provably empty outer or inner rel might mean the join
 	 * is provably empty too; in which case throw away any previously computed
diff --git a/src/backend/optimizer/plan/createplan.c b/src/backend/optimizer/plan/createplan.c
index cb69c03..7f86fcb 100644
--- a/src/backend/optimizer/plan/createplan.c
+++ b/src/backend/optimizer/plan/createplan.c
@@ -44,7 +44,6 @@
 #include "utils/lsyscache.h"
 
 
-static Plan *create_plan_recurse(PlannerInfo *root, Path *best_path);
 static Plan *create_scan_plan(PlannerInfo *root, Path *best_path);
 static List *build_path_tlist(PlannerInfo *root, Path *path);
 static bool use_physical_tlist(PlannerInfo *root, RelOptInfo *rel);
@@ -220,7 +219,7 @@ create_plan(PlannerInfo *root, Path *best_path)
  * create_plan_recurse
  *	  Recursive guts of create_plan().
  */
-static Plan *
+Plan *
 create_plan_recurse(PlannerInfo *root, Path *best_path)
 {
 	Plan	   *plan;
@@ -1961,16 +1960,26 @@ create_foreignscan_plan(PlannerInfo *root, ForeignPath *best_path,
 	ForeignScan *scan_plan;
 	RelOptInfo *rel = best_path->path.parent;
 	Index		scan_relid = rel->relid;
-	RangeTblEntry *rte;
+	Oid			rel_oid = InvalidOid;
 	Bitmapset  *attrs_used = NULL;
 	ListCell   *lc;
 	int			i;
 
-	/* it should be a base rel... */
-	Assert(scan_relid > 0);
-	Assert(rel->rtekind == RTE_RELATION);
-	rte = planner_rt_fetch(scan_relid, root);
-	Assert(rte->rtekind == RTE_RELATION);
+	/*
+	 * Fetch relation-id, if this foreign-scan node actually scans on
+	 * a particular real relation. Elsewhere, InvalidOid shall be
+	 * informed to the FDW driver.
+	 */
+	if (scan_relid > 0)
+	{
+		RangeTblEntry *rte;
+
+		Assert(rel->rtekind == RTE_RELATION);
+		rte = planner_rt_fetch(scan_relid, root);
+		Assert(rte->rtekind == RTE_RELATION);
+		rel_oid = rte->relid;
+	}
+	Assert(rel->fdwroutine != NULL);
 
 	/*
 	 * Sort clauses into best execution order.  We do this first since the FDW
@@ -1985,13 +1994,37 @@ create_foreignscan_plan(PlannerInfo *root, ForeignPath *best_path,
 	 * has selected some join clauses for remote use but also wants them
 	 * rechecked locally).
 	 */
-	scan_plan = rel->fdwroutine->GetForeignPlan(root, rel, rte->relid,
+	scan_plan = rel->fdwroutine->GetForeignPlan(root, rel, rel_oid,
 												best_path,
 												tlist, scan_clauses);
+	/*
+	 * Sanity check. Pseudo scan tuple-descriptor shall be constructed
+	 * based on the fdw_ps_tlist, excluding resjunk=true, so we need to
+	 * ensure all valid TLEs have to locate prior to junk ones.
+	 */
+	if (scan_plan->scan.scanrelid == 0)
+	{
+		bool	found_resjunk = false;
+
+		foreach (lc, scan_plan->fdw_ps_tlist)
+		{
+			TargetEntry	   *tle = lfirst(lc);
+
+			if (tle->resjunk)
+				found_resjunk = true;
+			else if (found_resjunk)
+				elog(ERROR, "junk TLE should not appear prior to valid one");
+		}
+	}
+	/* Set the relids that are represented by this foreign scan for Explain */
+	scan_plan->fdw_relids = best_path->path.parent->relids;
 
 	/* Copy cost data from Path to Plan; no need to make FDW do this */
 	copy_path_costsize(&scan_plan->scan.plan, &best_path->path);
 
+	/* Track FDW handler OID; no need to make FDW do this */
+	scan_plan->fdw_handler = rel->fdw_handler;
+
 	/*
 	 * Replace any outer-relation variables with nestloop params in the qual
 	 * and fdw_exprs expressions.  We do this last so that the FDW doesn't
@@ -2053,12 +2086,7 @@ create_customscan_plan(PlannerInfo *root, CustomPath *best_path,
 {
 	CustomScan *cplan;
 	RelOptInfo *rel = best_path->path.parent;
-
-	/*
-	 * Right now, all we can support is CustomScan node which is associated
-	 * with a particular base relation to be scanned.
-	 */
-	Assert(rel && rel->reloptkind == RELOPT_BASEREL);
+	ListCell   *lc;
 
 	/*
 	 * Sort clauses into the best execution order, although custom-scan
@@ -2078,6 +2106,28 @@ create_customscan_plan(PlannerInfo *root, CustomPath *best_path,
 	Assert(IsA(cplan, CustomScan));
 
 	/*
+	 * Sanity check. Pseudo scan tuple-descriptor shall be constructed
+	 * based on the custom_ps_tlist, excluding resjunk=true, so we need
+	 * to ensure all valid TLEs have to locate prior to junk ones.
+	 */
+	if (cplan->scan.scanrelid == 0)
+	{
+		bool	found_resjunk = false;
+
+		foreach (lc, cplan->custom_ps_tlist)
+		{
+			TargetEntry	   *tle = lfirst(lc);
+
+			if (tle->resjunk)
+				found_resjunk = true;
+			else if (found_resjunk)
+				elog(ERROR, "junk TLE should not appear prior to valid one");
+		}
+	}
+	/* Set the relids that are represented by this custom scan for Explain */
+	cplan->custom_relids = best_path->path.parent->relids;
+
+	/*
 	 * Copy cost data from Path to Plan; no need to make custom-plan providers
 	 * do this
 	 */
diff --git a/src/backend/optimizer/plan/setrefs.c b/src/backend/optimizer/plan/setrefs.c
index 94b12ab..60fbb08 100644
--- a/src/backend/optimizer/plan/setrefs.c
+++ b/src/backend/optimizer/plan/setrefs.c
@@ -568,6 +568,38 @@ set_plan_refs(PlannerInfo *root, Plan *plan, int rtoffset)
 			{
 				ForeignScan *splan = (ForeignScan *) plan;
 
+				if (rtoffset > 0)
+					splan->fdw_relids =
+						bms_shift_members(splan->fdw_relids, rtoffset);
+
+				if (splan->scan.scanrelid == 0)
+				{
+					indexed_tlist *pscan_itlist =
+						build_tlist_index(splan->fdw_ps_tlist);
+
+					splan->scan.plan.targetlist = (List *)
+						fix_upper_expr(root,
+									   (Node *) splan->scan.plan.targetlist,
+									   pscan_itlist,
+									   INDEX_VAR,
+									   rtoffset);
+					splan->scan.plan.qual = (List *)
+						fix_upper_expr(root,
+									   (Node *) splan->scan.plan.qual,
+									   pscan_itlist,
+									   INDEX_VAR,
+									   rtoffset);
+					splan->fdw_exprs = (List *)
+						fix_upper_expr(root,
+									   (Node *) splan->fdw_exprs,
+									   pscan_itlist,
+									   INDEX_VAR,
+									   rtoffset);
+					splan->fdw_ps_tlist =
+						fix_scan_list(root, splan->fdw_ps_tlist, rtoffset);
+					pfree(pscan_itlist);
+					break;
+				}
 				splan->scan.scanrelid += rtoffset;
 				splan->scan.plan.targetlist =
 					fix_scan_list(root, splan->scan.plan.targetlist, rtoffset);
@@ -582,6 +614,38 @@ set_plan_refs(PlannerInfo *root, Plan *plan, int rtoffset)
 			{
 				CustomScan *splan = (CustomScan *) plan;
 
+				if (rtoffset > 0)
+					splan->custom_relids =
+						bms_shift_members(splan->custom_relids, rtoffset);
+
+				if (splan->scan.scanrelid == 0)
+				{
+					indexed_tlist *pscan_itlist =
+						build_tlist_index(splan->custom_ps_tlist);
+
+					splan->scan.plan.targetlist = (List *)
+						fix_upper_expr(root,
+									   (Node *) splan->scan.plan.targetlist,
+									   pscan_itlist,
+									   INDEX_VAR,
+									   rtoffset);
+					splan->scan.plan.qual = (List *)
+						fix_upper_expr(root,
+									   (Node *) splan->scan.plan.qual,
+									   pscan_itlist,
+									   INDEX_VAR,
+									   rtoffset);
+					splan->custom_exprs = (List *)
+						fix_upper_expr(root,
+									   (Node *) splan->custom_exprs,
+									   pscan_itlist,
+									   INDEX_VAR,
+									   rtoffset);
+					splan->custom_ps_tlist =
+						fix_scan_list(root, splan->custom_ps_tlist, rtoffset);
+					pfree(pscan_itlist);
+					break;
+				}
 				splan->scan.scanrelid += rtoffset;
 				splan->scan.plan.targetlist =
 					fix_scan_list(root, splan->scan.plan.targetlist, rtoffset);
diff --git a/src/backend/optimizer/util/plancat.c b/src/backend/optimizer/util/plancat.c
index 8abed2a..79e34b8 100644
--- a/src/backend/optimizer/util/plancat.c
+++ b/src/backend/optimizer/util/plancat.c
@@ -379,10 +379,15 @@ get_relation_info(PlannerInfo *root, Oid relationObjectId, bool inhparent,
 
 	/* Grab the fdwroutine info using the relcache, while we have it */
 	if (relation->rd_rel->relkind == RELKIND_FOREIGN_TABLE)
+	{
+		rel->fdw_handler = GetFdwHandlerForRelation(relation);
 		rel->fdwroutine = GetFdwRoutineForRelation(relation, true);
+	}
 	else
+	{
+		rel->fdw_handler = InvalidOid;
 		rel->fdwroutine = NULL;
-
+	}
 	heap_close(relation, NoLock);
 
 	/*
diff --git a/src/backend/optimizer/util/relnode.c b/src/backend/optimizer/util/relnode.c
index 8cfbea0..da2bd22 100644
--- a/src/backend/optimizer/util/relnode.c
+++ b/src/backend/optimizer/util/relnode.c
@@ -14,6 +14,7 @@
  */
 #include "postgres.h"
 
+#include "foreign/fdwapi.h"
 #include "optimizer/cost.h"
 #include "optimizer/pathnode.h"
 #include "optimizer/paths.h"
@@ -122,6 +123,7 @@ build_simple_rel(PlannerInfo *root, int relid, RelOptKind reloptkind)
 	rel->subroot = NULL;
 	rel->subplan_params = NIL;
 	rel->fdwroutine = NULL;
+	rel->fdw_handler = InvalidOid;
 	rel->fdw_private = NULL;
 	rel->baserestrictinfo = NIL;
 	rel->baserestrictcost.startup = 0;
@@ -316,6 +318,8 @@ find_join_rel(PlannerInfo *root, Relids relids)
  * 'restrictlist_ptr': result variable.  If not NULL, *restrictlist_ptr
  *		receives the list of RestrictInfo nodes that apply to this
  *		particular pair of joinable relations.
+ * 'found' : indicates whether RelOptInfo is actually constructed.
+ *		true, if it was already built and on the cache.
  *
  * restrictlist_ptr makes the routine's API a little grotty, but it saves
  * duplicated calculation of the restrictlist...
@@ -326,7 +330,8 @@ build_join_rel(PlannerInfo *root,
 			   RelOptInfo *outer_rel,
 			   RelOptInfo *inner_rel,
 			   SpecialJoinInfo *sjinfo,
-			   List **restrictlist_ptr)
+			   List **restrictlist_ptr,
+			   bool *found)
 {
 	RelOptInfo *joinrel;
 	List	   *restrictlist;
@@ -347,8 +352,11 @@ build_join_rel(PlannerInfo *root,
 														   joinrel,
 														   outer_rel,
 														   inner_rel);
+		*found = true;
 		return joinrel;
 	}
+	/* not found on the cache */
+	*found = false;
 
 	/*
 	 * Nope, so make one.
@@ -427,6 +435,18 @@ build_join_rel(PlannerInfo *root,
 							   sjinfo, restrictlist);
 
 	/*
+	 * Set FDW handler and routine if both outer and inner relation
+	 * are managed by same FDW driver.
+	 */
+	if (OidIsValid(outer_rel->fdw_handler) &&
+		OidIsValid(inner_rel->fdw_handler) &&
+		outer_rel->fdw_handler == inner_rel->fdw_handler)
+	{
+		joinrel->fdw_handler = outer_rel->fdw_handler;
+		joinrel->fdwroutine = GetFdwRoutine(joinrel->fdw_handler);
+	}
+
+	/*
 	 * Add the joinrel to the query's joinrel list, and store it into the
 	 * auxiliary hashtable if there is one.  NB: GEQO requires us to append
 	 * the new joinrel to the end of the list!
diff --git a/src/backend/utils/adt/ruleutils.c b/src/backend/utils/adt/ruleutils.c
index 29b5b1b..82bb438 100644
--- a/src/backend/utils/adt/ruleutils.c
+++ b/src/backend/utils/adt/ruleutils.c
@@ -3843,6 +3843,10 @@ set_deparse_planstate(deparse_namespace *dpns, PlanState *ps)
 	/* index_tlist is set only if it's an IndexOnlyScan */
 	if (IsA(ps->plan, IndexOnlyScan))
 		dpns->index_tlist = ((IndexOnlyScan *) ps->plan)->indextlist;
+	else if (IsA(ps->plan, ForeignScan))
+		dpns->index_tlist = ((ForeignScan *) ps->plan)->fdw_ps_tlist;
+	else if (IsA(ps->plan, CustomScan))
+		dpns->index_tlist = ((CustomScan *) ps->plan)->custom_ps_tlist;
 	else
 		dpns->index_tlist = NIL;
 }
diff --git a/src/include/foreign/fdwapi.h b/src/include/foreign/fdwapi.h
index 1d76841..d3a5261 100644
--- a/src/include/foreign/fdwapi.h
+++ b/src/include/foreign/fdwapi.h
@@ -82,6 +82,13 @@ typedef void (*EndForeignModify_function) (EState *estate,
 
 typedef int (*IsForeignRelUpdatable_function) (Relation rel);
 
+typedef void (*GetForeignJoinPaths_function ) (PlannerInfo *root,
+											   RelOptInfo *joinrel,
+											   RelOptInfo *outerrel,
+											   RelOptInfo *innerrel,
+											   SpecialJoinInfo *sjinfo,
+											   List *restrictlist);
+
 typedef void (*ExplainForeignScan_function) (ForeignScanState *node,
 													struct ExplainState *es);
 
@@ -150,6 +157,10 @@ typedef struct FdwRoutine
 
 	/* Support functions for IMPORT FOREIGN SCHEMA */
 	ImportForeignSchema_function ImportForeignSchema;
+
+	/* Support functions for join push-down */
+	GetForeignJoinPaths_function GetForeignJoinPaths;
+
 } FdwRoutine;
 
 
@@ -157,6 +168,7 @@ typedef struct FdwRoutine
 extern FdwRoutine *GetFdwRoutine(Oid fdwhandler);
 extern FdwRoutine *GetFdwRoutineByRelId(Oid relid);
 extern FdwRoutine *GetFdwRoutineForRelation(Relation relation, bool makecopy);
+extern Oid	GetFdwHandlerForRelation(Relation relation);
 extern bool IsImportableForeignTable(const char *tablename,
 						 ImportForeignSchemaStmt *stmt);
 
diff --git a/src/include/nodes/bitmapset.h b/src/include/nodes/bitmapset.h
index 3a556ee..3ca9791 100644
--- a/src/include/nodes/bitmapset.h
+++ b/src/include/nodes/bitmapset.h
@@ -66,6 +66,7 @@ extern void bms_free(Bitmapset *a);
 extern Bitmapset *bms_union(const Bitmapset *a, const Bitmapset *b);
 extern Bitmapset *bms_intersect(const Bitmapset *a, const Bitmapset *b);
 extern Bitmapset *bms_difference(const Bitmapset *a, const Bitmapset *b);
+extern Bitmapset *bms_shift_members(const Bitmapset *a, int shift);
 extern bool bms_is_subset(const Bitmapset *a, const Bitmapset *b);
 extern BMS_Comparison bms_subset_compare(const Bitmapset *a, const Bitmapset *b);
 extern bool bms_is_member(int x, const Bitmapset *a);
diff --git a/src/include/nodes/plannodes.h b/src/include/nodes/plannodes.h
index 21cbfa8..b25330e 100644
--- a/src/include/nodes/plannodes.h
+++ b/src/include/nodes/plannodes.h
@@ -471,7 +471,13 @@ typedef struct WorkTableScan
  * fdw_exprs and fdw_private are both under the control of the foreign-data
  * wrapper, but fdw_exprs is presumed to contain expression trees and will
  * be post-processed accordingly by the planner; fdw_private won't be.
- * Note that everything in both lists must be copiable by copyObject().
+ * An optional fdw_ps_tlist is used to map a reference to an attribute of
+ * underlying relation(s) on a pair of INDEX_VAR and alternative varattno.
+ * It looks like a scan on pseudo relation that is usually result of
+ * relations join on remote data source, and FDW driver is responsible to
+ * set expected target list for this. If FDW returns records as foreign-
+ * table definition, just put NIL here.
+ * Note that everything in above lists must be copiable by copyObject().
  * One way to store an arbitrary blob of bytes is to represent it as a bytea
  * Const.  Usually, though, you'll be better off choosing a representation
  * that can be dumped usefully by nodeToString().
@@ -480,18 +486,23 @@ typedef struct WorkTableScan
 typedef struct ForeignScan
 {
 	Scan		scan;
+	Oid			fdw_handler;	/* OID of FDW handler */
 	List	   *fdw_exprs;		/* expressions that FDW may evaluate */
+	List	   *fdw_ps_tlist;	/* optional pseudo-scan tlist for FDW */
 	List	   *fdw_private;	/* private data for FDW */
+	Bitmapset  *fdw_relids;		/* set of relid (index of range-tables)
+								 * represented by this node */
 	bool		fsSystemCol;	/* true if any "system column" is needed */
 } ForeignScan;
 
 /* ----------------
  *	   CustomScan node
  *
- * The comments for ForeignScan's fdw_exprs and fdw_private fields apply
- * equally to custom_exprs and custom_private.  Note that since Plan trees
- * can be copied, custom scan providers *must* fit all plan data they need
- * into those fields; embedding CustomScan in a larger struct will not work.
+ * The comments for ForeignScan's fdw_exprs, fdw_ps_tlist and fdw_private fields
+ * apply equally to custom_exprs, custom_ps_tlist and custom_private.
+ *  Note that since Plan trees can be copied, custom scan providers *must*
+ * fit all plan data they need into those fields; embedding CustomScan in
+ * a larger struct will not work.
  * ----------------
  */
 struct CustomScan;
@@ -512,7 +523,10 @@ typedef struct CustomScan
 	Scan		scan;
 	uint32		flags;			/* mask of CUSTOMPATH_* flags, see relation.h */
 	List	   *custom_exprs;	/* expressions that custom code may evaluate */
+	List	   *custom_ps_tlist;/* optional pseudo-scan target list */
 	List	   *custom_private; /* private data for custom code */
+	Bitmapset  *custom_relids;	/* set of relid (index of range-tables)
+								 * represented by this node */
 	const CustomScanMethods *methods;
 } CustomScan;
 
diff --git a/src/include/nodes/relation.h b/src/include/nodes/relation.h
index 401a686..1713d29 100644
--- a/src/include/nodes/relation.h
+++ b/src/include/nodes/relation.h
@@ -366,6 +366,7 @@ typedef struct PlannerInfo
  *		subroot - PlannerInfo for subquery (NULL if it's not a subquery)
  *		subplan_params - list of PlannerParamItems to be passed to subquery
  *		fdwroutine - function hooks for FDW, if foreign table (else NULL)
+ *		fdw_handler - OID of FDW handler, if foreign table (else InvalidOid)
  *		fdw_private - private state for FDW, if foreign table (else NULL)
  *
  *		Note: for a subquery, tuples, subplan, subroot are not set immediately
@@ -461,6 +462,7 @@ typedef struct RelOptInfo
 	List	   *subplan_params; /* if subquery */
 	/* use "struct FdwRoutine" to avoid including fdwapi.h here */
 	struct FdwRoutine *fdwroutine;		/* if foreign table */
+	Oid			fdw_handler;	/* if foreign table */
 	void	   *fdw_private;	/* if foreign table */
 
 	/* used by various scans and joins: */
diff --git a/src/include/optimizer/pathnode.h b/src/include/optimizer/pathnode.h
index 9923f0e..3053f0f 100644
--- a/src/include/optimizer/pathnode.h
+++ b/src/include/optimizer/pathnode.h
@@ -141,7 +141,8 @@ extern RelOptInfo *build_join_rel(PlannerInfo *root,
 			   RelOptInfo *outer_rel,
 			   RelOptInfo *inner_rel,
 			   SpecialJoinInfo *sjinfo,
-			   List **restrictlist_ptr);
+			   List **restrictlist_ptr,
+			   bool *found);
 extern RelOptInfo *build_empty_join_rel(PlannerInfo *root);
 extern AppendRelInfo *find_childrel_appendrelinfo(PlannerInfo *root,
 							RelOptInfo *rel);
diff --git a/src/include/optimizer/paths.h b/src/include/optimizer/paths.h
index 6cad92e..c42c69d 100644
--- a/src/include/optimizer/paths.h
+++ b/src/include/optimizer/paths.h
@@ -30,6 +30,19 @@ typedef void (*set_rel_pathlist_hook_type) (PlannerInfo *root,
 														RangeTblEntry *rte);
 extern PGDLLIMPORT set_rel_pathlist_hook_type set_rel_pathlist_hook;
 
+/* Hook for plugins to get control in add_paths_to_joinrel() */
+typedef void (*set_join_pathlist_hook_type) (PlannerInfo *root,
+											 RelOptInfo *joinrel,
+											 RelOptInfo *outerrel,
+											 RelOptInfo *innerrel,
+											 List *restrictlist,
+											 JoinType jointype,
+											 SpecialJoinInfo *sjinfo,
+											 SemiAntiJoinFactors *semifactors,
+											 Relids param_source_rels,
+											 Relids extra_lateral_rels);
+extern PGDLLIMPORT set_join_pathlist_hook_type set_join_pathlist_hook;
+
 /* Hook for plugins to replace standard_join_search() */
 typedef RelOptInfo *(*join_search_hook_type) (PlannerInfo *root,
 														  int levels_needed,
diff --git a/src/include/optimizer/planmain.h b/src/include/optimizer/planmain.h
index fa72918..0c8cbcd 100644
--- a/src/include/optimizer/planmain.h
+++ b/src/include/optimizer/planmain.h
@@ -41,6 +41,7 @@ extern Plan *optimize_minmax_aggregates(PlannerInfo *root, List *tlist,
  * prototypes for plan/createplan.c
  */
 extern Plan *create_plan(PlannerInfo *root, Path *best_path);
+extern Plan *create_plan_recurse(PlannerInfo *root, Path *best_path);
 extern SubqueryScan *make_subqueryscan(List *qptlist, List *qpqual,
 				  Index scanrelid, Plan *subplan);
 extern ForeignScan *make_foreignscan(List *qptlist, List *qpqual,
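
The resjunk ordering check that the patch above adds to the ForeignScan and CustomScan plan-creation code can be sketched standalone as follows. This is only an illustration under stated assumptions: MockTargetEntry and junk_entries_are_trailing() are hypothetical names, not PostgreSQL APIs, and a plain array stands in for the List of TargetEntry nodes.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for TargetEntry; only the resjunk flag matters here. */
typedef struct MockTargetEntry
{
	bool		resjunk;
} MockTargetEntry;

/*
 * Return true if every junk entry comes after all valid (non-junk) entries,
 * i.e. no valid entry appears once a junk entry has been seen.  The patch
 * raises an error in the failing case; this sketch just reports it.
 */
bool
junk_entries_are_trailing(const MockTargetEntry *tlist, size_t ntlist)
{
	bool		found_resjunk = false;
	size_t		i;

	assert(tlist != NULL || ntlist == 0);

	for (i = 0; i < ntlist; i++)
	{
		if (tlist[i].resjunk)
			found_resjunk = true;
		else if (found_resjunk)
			return false;		/* valid entry after a junk one */
	}
	return true;
}
```

A list such as {valid, valid, junk} passes, while {valid, junk, valid} does not, which seems to be the ordering the pseudo-scan tuple descriptor relies on.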

#33

Ashutosh Bapat

ashutosh.bapat@enterprisedb.com

over 10 years ago

In reply to: Shigeru HANADA (#29)

I reviewed the foreign_join_v13 patch. Here are my comments.

Thanks for this work. It's good to see that the foreign_join patch
includes extensive tests for postgres_fdw.

Sanity
---------
The foreign_join patch did not apply cleanly with "git apply" but did
apply using "patch". The patch also contains trailing whitespace.

The patch compiles cleanly with pgsql-v9.5-custom-join.v11.patch.

make check in regress and postgres_fdw folders passes without any failures.

Tests
-------
1. The postgres_fdw test sets and resets enable_mergejoin at various places.
The goal of these tests seems to be to check the sanity of the foreign plans
generated. So, it might be better to set enable_mergejoin (and maybe all
of enable_hashjoin, enable_nestloop etc.) to false at the beginning of
the testcase and reset them at the end. That way, we will also make
sure that foreign plans are chosen irrespective of future planner changes.
2. In the patch, I see that the inheritance testcases have been deleted
from postgres_fdw.sql. Is that intentional? I do not see them being
replaced anywhere else.
3. We need one test for each join type (or at least for INNER and LEFT
OUTER) where the ON clause contains unsafe-to-push conditions along with
safe-to-push conditions. For an INNER join, the join should get pushed down
with the safe conditions, and for an OUTER join it shouldn't be. The same
goes for the WHERE clause, in which case the join will be pushed down but
the unsafe-to-push conditions will be applied locally.
4. All the tests have ORDER BY, LIMIT in them, so the setrefs code is being
exercised. But something like aggregates would test the setrefs code
better. So, we should add at least one test like select avg(ft1.c1 +
ft2.c2) from ft1 join ft2 on (ft1.c1 = ft2.c1).
5. It would be good to add some tests which contain joins between a few
foreign and a few local tables to see whether we are able to push down the
largest possible foreign join tree to the foreign server.

Code
-------
In classifyConditions(), the code is now appending RestrictInfo::clause
rather than RestrictInfo itself. But the callers of classifyConditions()
have not changed. Is this change intentional? The functions which consume
the lists produced by this function handle expressions as well as
RestrictInfo, so you may not have noticed it. Because of this change, we
might be missing some optimizations, e.g. in function
postgresGetForeignPlan():

    if (list_member_ptr(fpinfo->remote_conds, rinfo))
        remote_conds = lappend(remote_conds, rinfo->clause);
    else if (list_member_ptr(fpinfo->local_conds, rinfo))
        local_exprs = lappend(local_exprs, rinfo->clause);
    else if (is_foreign_expr(root, baserel, rinfo->clause))
        remote_conds = lappend(remote_conds, rinfo->clause);
    else
        local_exprs = lappend(local_exprs, rinfo->clause);

Finding a RestrictInfo in remote_conds avoids another call to
is_foreign_expr(). So with this change, I think we are doing an extra call
to is_foreign_expr().
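
The pointer-identity issue described here can be illustrated with a minimal standalone sketch. Everything in it is hypothetical (MockExpr, MockRestrictInfo, ptr_array_member()); it only mimics the way a pointer-membership test such as list_member_ptr() compares addresses, and assumes a plain array in place of a List.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for Expr and RestrictInfo. */
typedef struct MockExpr
{
	int			id;
} MockExpr;

typedef struct MockRestrictInfo
{
	MockExpr   *clause;
} MockRestrictInfo;

/*
 * Pointer-identity membership test: true only if 'datum' is the very same
 * address as one of the stored items.  No structural comparison is done.
 */
bool
ptr_array_member(const void *const *items, size_t nitems, const void *datum)
{
	size_t		i;

	assert(items != NULL || nitems == 0);

	for (i = 0; i < nitems; i++)
	{
		if (items[i] == datum)
			return true;
	}
	return false;
}
```

If the list stores &rinfo, a lookup of &rinfo succeeds; if it stores rinfo.clause instead, the same lookup fails and the caller falls through to the more expensive safety check, which is the extra call mentioned above.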

The function get_jointype_name() returns an empty string for unsupported
join types. Instead of that it should throw an error, if some code path
accidentally calls the function with unsupported join type e.g. SEMI_JOIN.
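
A rough sketch of the suggested behaviour, written as standalone C so it can be compiled outside the backend. The enum, the function name mock_get_jointype_name() and the set of supported join types are assumptions for illustration only; in the backend the failure branch would presumably be an elog(ERROR, ...) rather than abort().

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical join-type enum; not the real JoinType definition. */
typedef enum MockJoinType
{
	MOCK_JOIN_INNER,
	MOCK_JOIN_LEFT,
	MOCK_JOIN_RIGHT,
	MOCK_JOIN_FULL,
	MOCK_JOIN_SEMI,
	MOCK_JOIN_ANTI
} MockJoinType;

/*
 * Return the SQL keyword for a supported join type.  Unsupported types are
 * treated as a programming error instead of silently yielding "".
 */
const char *
mock_get_jointype_name(MockJoinType jointype)
{
	switch (jointype)
	{
		case MOCK_JOIN_INNER:
			return "INNER";
		case MOCK_JOIN_LEFT:
			return "LEFT";
		case MOCK_JOIN_RIGHT:
			return "RIGHT";
		case MOCK_JOIN_FULL:
			return "FULL";
		default:
			break;
	}

	/* Stand-in for elog(ERROR, "unsupported join type %d", jointype). */
	fprintf(stderr, "unsupported join type %d\n", (int) jointype);
	abort();
}
```

Failing loudly means a caller that passes, say, a semi join is caught at once rather than producing malformed remote SQL.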

While deparsing the SQL with rowmarks, the placement of FOR UPDATE/SHARE
clause in the original query is not being honored, which means that we will
end up locking the rows which are not part of the join result even when the
join is pushed to the foreign server. E.g take the following query (it uses
the tables created in postgres_fdw.sql tests)
contrib_regression=# explain verbose select * from ft1 join ft2 on (ft1.c1 = ft2.c1) for update of ft1;

                                   QUERY PLAN
--------------------------------------------------------------------------------
 LockRows (cost=100.00..124.66 rows=822 width=426)
   Output: ft1.c1, ft1.c2, ft1.c3, ft1.c4, ft1.c5, ft1.c6, ft1.c7, ft1.c8, ft2.c1, ft2.c2, ft2.c3, ft2.c4, ft2.c5, ft2.c6, ft2.c7, ft2.c8, ft1.*, ft2.*
   -> Foreign Scan (cost=100.00..116.44 rows=822 width=426)
        Output: ft1.c1, ft1.c2, ft1.c3, ft1.c4, ft1.c5, ft1.c6, ft1.c7, ft1.c8, ft2.c1, ft2.c2, ft2.c3, ft2.c4, ft2.c5, ft2.c6, ft2.c7, ft2.c8, ft1.*, ft2.*
        Relations: (public.ft1) INNER JOIN (public.ft2)
        Remote SQL: SELECT l.a1, l.a2, l.a3, l.a4, l.a5, l.a6, l.a7, l.a8, l.a9, r.a1, r.a2, r.a3, r.a4, r.a5, r.a6, r.a7, r.a8, r.a9 FROM (SELECT l.a10, l.a11, l.a12, l.a13, l.a14, l.a15, l.a16, l.a17, ROW(l.a10, l.a11, l.a12, l.a13, l.a14, l.a15, l.a16, l.a17) FROM (SELECT "C 1" a10, c2 a11, c3 a12, c4 a13, c5 a14, c6 a15, c7 a16, c8 a17 FROM "S 1"."T 1" FOR UPDATE) l) l (a1, a2, a3, a4, a5, a6, a7, a8, a9) INNER JOIN (SELECT r.a9, r.a10, r.a12, r.a13, r.a14, r.a15, r.a16, r.a17, ROW(r.a9, r.a10, r.a12, r.a13, r.a14, r.a15, r.a16, r.a17) FROM (SELECT "C 1" a9, c2 a10, c3 a12, c4 a13, c5 a14, c6 a15, c7 a16, c8 a17 FROM "S 1"."T 1") r) r (a1, a2, a3, a4, a5, a6, a7, a8, a9) ON ((l.a1 = r.a1))
(6 rows)

It's expected that only the rows which are part of the join result will be
locked by the FOR UPDATE clause. The query sent to the foreign server has
attached the FOR UPDATE clause to the sub-query for table ft1 ("S 1"."T 1"
on the foreign server). As per the PostgreSQL documentation, "When a locking
clause appears in a sub-SELECT, the rows locked are those returned to the
outer query by the sub-query.". So it's going to lock all rows from "S
1"."T 1", rather than only the rows which are part of the join. This is going
to increase the probability of deadlocks if the join is between a big table
and a small table, where the big table is being used in many queries and the
join is going to have only a single row in the result.

Since there is no is_first argument to appendConditions(), we should remove
corresponding line from the function prologue.

The name TO_RELATIVE() doesn't convey the full meaning of the macro. Maybe
GET_RELATIVE_ATTNO() or something like that.

In postgresGetForeignJoinPaths(), while separating the conditions into join
quals and other quals,

    if (IS_OUTER_JOIN(jointype))
    {
        extract_actual_join_clauses(joinclauses, &joinclauses, &otherclauses);
    }
    else
    {
        joinclauses = extract_actual_clauses(joinclauses, false);
        otherclauses = NIL;
    }

we shouldn't differentiate between outer and inner join. For inner join the
join quals can be treated as other clauses and they will be returned as
other clauses, which is fine. Also, the following condition

    /*
     * Other condition for the join must be safe to push down.
     */
    foreach(lc, otherclauses)
    {
        Expr *expr = (Expr *) lfirst(lc);

        if (!is_foreign_expr(root, joinrel, expr))
        {
            ereport(DEBUG3, (errmsg("filter contains unsafe conditions")));
            return;
        }
    }

is unnecessary. If there are filter conditions which are unsafe to push
down, they can be applied locally after obtaining the join result from the
foreign server. The join quals must all be safe to push down,
since they decide which rows will contain NULL inner side in an OUTER join.
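
The rule argued for in this review (unsafe filter conditions can stay local, while an unsafe join qual blocks push-down of an OUTER join) can be sketched as below. This is not the patch's code: MockClause, PushdownDecision and decide_join_pushdown() are hypothetical, and the safe_to_push flag is an assumed stand-in for the result of is_foreign_expr().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical clause descriptor. */
typedef struct MockClause
{
	bool		is_join_qual;	/* true: ON-clause qual, false: filter */
	bool		safe_to_push;	/* stand-in for is_foreign_expr() */
} MockClause;

typedef struct PushdownDecision
{
	bool		push_join;		/* can the join be sent to the remote? */
	size_t		n_remote;		/* clauses evaluated remotely */
	size_t		n_local;		/* clauses evaluated locally afterwards */
} PushdownDecision;

/*
 * Classify clauses.  An unsafe join qual of an outer join prevents
 * push-down, because join quals decide which rows get a NULL-extended
 * inner side.  Any other unsafe clause is simply kept for local evaluation.
 */
PushdownDecision
decide_join_pushdown(bool is_outer_join,
					 const MockClause *clauses, size_t nclauses)
{
	PushdownDecision d = {true, 0, 0};
	size_t		i;

	assert(clauses != NULL || nclauses == 0);

	for (i = 0; i < nclauses; i++)
	{
		if (clauses[i].safe_to_push)
			d.n_remote++;
		else if (is_outer_join && clauses[i].is_join_qual)
		{
			d.push_join = false;
			d.n_remote = 0;
			d.n_local = 0;
			return d;
		}
		else
			d.n_local++;
	}
	return d;
}
```

With one safe join qual, one unsafe join qual and one unsafe filter, the sketch pushes an inner join (one remote clause, two local) but refuses an outer join.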

--
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company

On Fri, Apr 17, 2015 at 10:13 AM, Shigeru HANADA <shigeru.hanada@gmail.com>
wrote:

Kaigai-san,

On 2015/04/17 10:13, Kouhei Kaigai <kaigai@ak.jp.nec.com> wrote:

Hanada-san,

I merged explain patch into foreign_join patch.

Now v12 is the latest patch.

It contains many garbage lines... Please ensure the
patch is correctly based on the latest master +
custom_join patch.

Oops, sorry. I’ve re-created the patch as v13, based on Custom/Foreign
join v11 patch and latest master.

It contains an EXPLAIN enhancement: a new subitem “Relations” shows the
relations and joins, including order and type, processed by the foreign
scan.

--
Shigeru HANADA
shigeru.hanada@gmail.com


#34

Robert Haas

robertmhaas@gmail.com

over 10 years ago

In reply to: Kouhei Kaigai (#14)

On Wed, Mar 25, 2015 at 9:51 PM, Kouhei Kaigai <kaigai@ak.jp.nec.com> wrote:

The attached patch adds GetForeignJoinPaths call on make_join_rel() only when
'joinrel' is actually built and both of child relations are managed by same
FDW driver, prior to any other built-in join paths.
I adjusted the hook definition a little bit, because jointype can be reproduced
using SpecialJoinInfo. Right?

Probably, it will solve the original concern towards multiple calls of FDW
handler in case when it tries to replace an entire join subtree with a foreign-
scan on the result of remote join query.

How about your opinion?

A few random cosmetic problems:

- The hunk in allpaths.c is useless.
- The first hunk in fdwapi.h contains an extra space before the
closing parenthesis.

And then:

+       else if (scan->scanrelid == 0 &&
+                        (IsA(scan, ForeignScan) || IsA(scan, CustomScan)))
+               varno = INDEX_VAR;

Suppose scan->scanrelid == 0 but the scan type is something else? Is
that legal? Is varno == 0 the correct outcome in that case?

More later.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#35

Robert Haas

robertmhaas@gmail.com

over 10 years ago

In reply to: Kouhei Kaigai (#32)

On Tue, Apr 21, 2015 at 10:33 PM, Kouhei Kaigai <kaigai@ak.jp.nec.com> wrote:

[ new patch ]

A little more nitpicking:

ExecInitForeignScan() and ExecInitCustomScan() could declare
currentRelation inside the if (scanrelid > 0) block instead of in the
outer scope.

I'm not too excited about the addition of GetFdwHandlerForRelation,
which is a one-line function used in one place. It seems like we
don't really need that.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


#36

Shigeru HANADA

shigeru.hanada@gmail.com

over 10 years ago

In reply to: Ashutosh Bapat (#33)

Hi Ashutosh,

Thanks for the review.

On 2015/04/22 19:28, Ashutosh Bapat <ashutosh.bapat@enterprisedb.com> wrote:

Tests
-------
1. The postgres_fdw test sets and resets enable_mergejoin at various places. The goal of these tests seems to be to check the sanity of the foreign plans generated. So, it might be better to set enable_mergejoin (and maybe all of enable_hashjoin, enable_nestloop etc.) to false at the beginning of the testcase and reset them at the end. That way, we will also make sure that foreign plans are chosen irrespective of future planner changes.

I have a different, rather opposite opinion. I disabled other join types only as far as needed to make the tests pass, because I worry that we would otherwise overlook problems introduced by planner changes. I hope to eliminate enable_foo from the test script eventually by making the costing model smarter.

2. In the patch, I see that the inheritance testcases have been deleted from postgres_fdw.sql, is that intentional? I do not see those being replaced anywhere else.

It was an accidental removal; I have restored the tests for the inheritance feature.

3. We need one test for each join type (or at least for INNER and LEFT OUTER) where the ON clause contains unsafe-to-push conditions along with safe-to-push conditions. For an INNER join, the join should get pushed down with the safe conditions, and for an OUTER join it shouldn't be. The same goes for the WHERE clause, in which case the join will be pushed down but the unsafe-to-push conditions will be applied locally.

Currently INNER JOINs with unsafe join conditions are not pushed down, so such a test is not in the suite. As you say, in theory, INNER JOINs can be pushed down even if they have push-down-unsafe join conditions, because such conditions can be evaluated on the local side against rows retrieved without those conditions.

4. All the tests have ORDER BY, LIMIT in them, so the setrefs code is being exercised. But something like aggregates would test the setrefs code better. So, we should add at least one test like select avg(ft1.c1 + ft2.c2) from ft1 join ft2 on (ft1.c1 = ft2.c1).

Added an aggregate case, and also added a UNION case for Append.

5. It would be good to add some tests which contain joins between a few foreign and a few local tables to see whether we are able to push down the largest possible foreign join tree to the foreign server.

Code
-------
In classifyConditions(), the code is now appending RestrictInfo::clause rather than RestrictInfo itself. But the callers of classifyConditions() have not changed. Is this change intentional?

Yes, the purpose of the change is to make appendConditions() (formerly appendWhereClause()) able to handle a JOIN ON clause, which is a list of Expr.

The functions which consume the lists produced by this function handle expressions as well as RestrictInfo, so you may not have noticed it. Because of this change, we might be missing some optimizations, e.g. in function postgresGetForeignPlan():

    if (list_member_ptr(fpinfo->remote_conds, rinfo))
        remote_conds = lappend(remote_conds, rinfo->clause);
    else if (list_member_ptr(fpinfo->local_conds, rinfo))
        local_exprs = lappend(local_exprs, rinfo->clause);
    else if (is_foreign_expr(root, baserel, rinfo->clause))
        remote_conds = lappend(remote_conds, rinfo->clause);
    else
        local_exprs = lappend(local_exprs, rinfo->clause);

Finding a RestrictInfo in remote_conds avoids another call to is_foreign_expr(). So with this change, I think we are doing an extra call to is_foreign_expr().

Hm, it seems better to revert my change and make appendConditions() downcast each list element to RestrictInfo or Expr according to its node tag.
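
A minimal sketch of what such tag-based dispatch might look like, written as standalone C. The tag enum, the Mock* structs and condition_to_expr() are all hypothetical; the real code would presumably use IsA() on the node and the actual RestrictInfo and Expr definitions.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical node tags; every mock node starts with one. */
typedef enum MockNodeTag
{
	MOCK_T_Expr,
	MOCK_T_RestrictInfo
} MockNodeTag;

typedef struct MockExpr
{
	MockNodeTag type;
	int			id;
} MockExpr;

typedef struct MockRestrictInfo
{
	MockNodeTag type;
	MockExpr   *clause;
} MockRestrictInfo;

/*
 * Accept either a bare expression or a RestrictInfo wrapper and return the
 * expression to deparse.  Returns NULL for an unknown tag.
 */
MockExpr *
condition_to_expr(void *node)
{
	MockNodeTag tag;

	assert(node != NULL);

	/* The tag is the first member of every mock node. */
	tag = *(const MockNodeTag *) node;

	switch (tag)
	{
		case MOCK_T_RestrictInfo:
			return ((MockRestrictInfo *) node)->clause;
		case MOCK_T_Expr:
			return (MockExpr *) node;
	}
	return NULL;
}
```

With this shape the lists can keep holding RestrictInfo nodes, so pointer-based lookups on those lists keep working, while the deparse routine still accepts a plain list of Expr for JOIN ON clauses.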

The function get_jointype_name() returns an empty string for unsupported join types. Instead of that it should throw an error, if some code path accidentally calls the function with unsupported join type e.g. SEMI_JOIN.

Agreed, fixed.

While deparsing the SQL with rowmarks, the placement of FOR UPDATE/SHARE clause in the original query is not being honored, which means that we will end up locking the rows which are not part of the join result even when the join is pushed to the foreign server. E.g take the following query (it uses the tables created in postgres_fdw.sql tests)
contrib_regression=# explain verbose select * from ft1 join ft2 on (ft1.c1 = ft2.c1) for update of ft1;

                                   QUERY PLAN
--------------------------------------------------------------------------------
 LockRows (cost=100.00..124.66 rows=822 width=426)
   Output: ft1.c1, ft1.c2, ft1.c3, ft1.c4, ft1.c5, ft1.c6, ft1.c7, ft1.c8, ft2.c1, ft2.c2, ft2.c3, ft2.c4, ft2.c5, ft2.c6, ft2.c7, ft2.c8, ft1.*, ft2.*
   -> Foreign Scan (cost=100.00..116.44 rows=822 width=426)
        Output: ft1.c1, ft1.c2, ft1.c3, ft1.c4, ft1.c5, ft1.c6, ft1.c7, ft1.c8, ft2.c1, ft2.c2, ft2.c3, ft2.c4, ft2.c5, ft2.c6, ft2.c7, ft2.c8, ft1.*, ft2.*
        Relations: (public.ft1) INNER JOIN (public.ft2)
        Remote SQL: SELECT l.a1, l.a2, l.a3, l.a4, l.a5, l.a6, l.a7, l.a8, l.a9, r.a1, r.a2, r.a3, r.a4, r.a5, r.a6, r.a7, r.a8, r.a9 FROM (SELECT l.a10, l.a11, l.a12, l.a13, l.a14, l.a15, l.a16, l.a17, ROW(l.a10, l.a11, l.a12, l.a13, l.a14, l.a15, l.a16, l.a17) FROM (SELECT "C 1" a10, c2 a11, c3 a12, c4 a13, c5 a14, c6 a15, c7 a16, c8 a17 FROM "S 1"."T 1" FOR UPDATE) l) l (a1, a2, a3, a4, a5, a6, a7, a8, a9) INNER JOIN (SELECT r.a9, r.a10, r.a12, r.a13, r.a14, r.a15, r.a16, r.a17, ROW(r.a9, r.a10, r.a12, r.a13, r.a14, r.a15, r.a16, r.a17) FROM (SELECT "C 1" a9, c2 a10, c3 a12, c4 a13, c5 a14, c6 a15, c7 a16, c8 a17 FROM "S 1"."T 1") r) r (a1, a2, a3, a4, a5, a6, a7, a8, a9) ON ((l.a1 = r.a1))
(6 rows)

It's expected that only the rows which are part of join result will be locked by FOR UPDATE clause. The query sent to the foreign server has attached the FOR UPDATE clause to the sub-query for table ft1 ("S 1"."T 1" on foreign server). As per the postgresql documentation, "When a locking clause appears in a sub-SELECT, the rows locked are those returned to the outer query by the sub-query.". So it's going to lock all rows from "S 1"."T 1", rather than only the rows which are part of join. This is going to increase probability of deadlocks, if the join is between a big table and small table where big table is being used in many queries and the join is going to have only a single row in the result.

Since there is no is_first argument to appendConditions(), we should remove the corresponding line from the function prologue.

Oops, replaced with the description of prefix.

The name TO_RELATIVE() doesn't convey the full meaning of the macro. Maybe GET_RELATIVE_ATTNO() or something like that.

Fixed.

In postgresGetForeignJoinPaths(), while separating the conditions into join quals and other quals,

    if (IS_OUTER_JOIN(jointype))
    {
        extract_actual_join_clauses(joinclauses, &joinclauses, &otherclauses);
    }
    else
    {
        joinclauses = extract_actual_clauses(joinclauses, false);
        otherclauses = NIL;
    }

we shouldn't differentiate between outer and inner joins. For an inner join the join quals can be treated as other clauses and they will be returned as other clauses, which is fine. Also, the following check

    /*
     * Other condition for the join must be safe to push down.
     */
    foreach(lc, otherclauses)
    {
        Expr *expr = (Expr *) lfirst(lc);

        if (!is_foreign_expr(root, joinrel, expr))
        {
            ereport(DEBUG3, (errmsg("filter contains unsafe conditions")));
            return;
        }
    }

is unnecessary. If there are filter conditions which are unsafe to push down, they can be applied locally after obtaining the join result from the foreign server. The join quals, however, all need to be safe to push down, since they decide which rows will contain a NULL inner side in an OUTER join.

I’m not sure that we *shouldn’t* differentiate, but I agree that we *don’t need* to differentiate if we are talking about only the result of filtering.

IMO we *should* differentiate inner and outer joins (or differentiate join conditions and filter conditions), because all conditions of a typical INNER JOIN go into otherclauses since their is_pushed_down flag is on, so such joins look like CROSS JOIN + WHERE filter. In the latest patch EXPLAIN shows the join combinations of a foreign join scan node with the join type, but your suggestion makes it look like this:

fdw=# explain (verbose) select * from pgbench_branches b join pgbench_tellers t on t.bid = b.bid;

QUERY PLAN

Thoughts?

Regards,
--
Shigeru HANADA
shigeru.hanada@gmail.com

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#37

Shigeru HANADA

shigeru.hanada@gmail.com

over 10 years ago

In reply to: Ashutosh Bapat (#33)

Hi Ashutosh,

Thanks for the review.

On 2015/04/22 19:28, Ashutosh Bapat <ashutosh.bapat@enterprisedb.com> wrote:

Tests
-------
1. The postgres_fdw test sets and resets enable_mergejoin at various places. The goal of these tests seems to be to check the sanity of the generated foreign plans. So it might be better to set enable_mergejoin (and maybe all of enable_hashjoin, enable_nestloop, etc.) to false at the beginning of the test case and restore them at the end. That way, we also make sure that foreign plans are chosen irrespective of future planner changes.

2. In the patch, I see that the inheritance testcases have been deleted from postgres_fdw.sql, is that intentional? I do not see those being replaced anywhere else.

That was an accidental removal; I have restored the tests for the inheritance feature.

3. We need one test for each join type (or at least for INNER and LEFT OUTER) where the ON clause has unsafe-to-push conditions along with safe-to-push conditions. For an INNER join, the join should get pushed down with the safe conditions, and for an OUTER join it shouldn't be. The same goes for the WHERE clause, in which case the join will be pushed down but the unsafe-to-push conditions will be applied locally.

4. All the tests have ORDER BY and LIMIT in them, so the setref code is being exercised. But something like aggregates would test the setref code better. So we should add at least one test like select avg(ft1.c1 + ft2.c2) from ft1 join ft2 on (ft1.c1 = ft2.c1).

Added an aggregate case, and also added a UNION case for Append.

5. It would be good to add a test that joins a few foreign and a few local tables, to see whether we are able to push down the largest possible foreign join tree to the foreign server.

Code
-------
In classifyConditions(), the code is now appending RestrictInfo::clause rather than the RestrictInfo itself. But the callers of classifyConditions() have not changed. Is this change intentional?

Yes, the purpose of the change is to let appendConditions (formerly appendWhereClause) handle a JOIN ON clause, which is a list of Expr.

The functions which consume the lists produced by this function handle expressions as well as RestrictInfo, so you may not have noticed it. Because of this change, we might be missing some optimizations, e.g. in postgresGetForeignPlan():

    if (list_member_ptr(fpinfo->remote_conds, rinfo))
        remote_conds = lappend(remote_conds, rinfo->clause);
    else if (list_member_ptr(fpinfo->local_conds, rinfo))
        local_exprs = lappend(local_exprs, rinfo->clause);
    else if (is_foreign_expr(root, baserel, rinfo->clause))
        remote_conds = lappend(remote_conds, rinfo->clause);
    else
        local_exprs = lappend(local_exprs, rinfo->clause);

Finding a RestrictInfo in remote_conds avoids another call to is_foreign_expr(). So with this change, I think we are doing an extra call to is_foreign_expr().

Hm, it seems better to revert my change and make appendConditions downcast given information into RestrictInfo or Expr according to the node tag.
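
To illustrate the point about the lost shortcut, here is a minimal standalone sketch. It is a model, not the patch code: Clause, RInfo, check_shippable(), member_ptr() and is_remote() are hypothetical stand-ins for the expression node, RestrictInfo, is_foreign_expr(), list_member_ptr() and the branch in postgresGetForeignPlan(). It shows that a pointer-identity lookup only hits when the cached list stores the same pointers that are later looked up.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for an expression node and its RestrictInfo. */
typedef struct Clause { bool shippable; } Clause;
typedef struct RInfo { Clause *clause; } RInfo;

static int safety_checks = 0;   /* counts calls to the expensive check */

/* Stand-in for is_foreign_expr(); the real one walks the expression tree. */
static bool
check_shippable(const Clause *clause)
{
    safety_checks++;
    return clause->shippable;
}

/* Pointer-identity membership test, in the spirit of list_member_ptr(). */
static bool
member_ptr(const void *const *items, size_t n, const void *datum)
{
    for (size_t i = 0; i < n; i++)
        if (items[i] == datum)
            return true;
    return false;
}

/* Decide whether rinfo's clause is evaluated remotely, using cached lists first. */
static bool
is_remote(const RInfo *rinfo,
          const void *const *remote, size_t n_remote,
          const void *const *local, size_t n_local)
{
    assert(rinfo != NULL && rinfo->clause != NULL);
    if (member_ptr(remote, n_remote, rinfo))
        return true;            /* cache hit: no safety check needed */
    if (member_ptr(local, n_local, rinfo))
        return false;           /* cache hit: no safety check needed */
    return check_shippable(rinfo->clause);  /* cache miss: check again */
}
```

If the cached list holds the clause pointers while the lookup uses the RestrictInfo pointer, both membership tests miss and the safety check runs a second time, which is the extra is_foreign_expr() call described above.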

The function get_jointype_name() returns an empty string for unsupported join types. Instead, it should throw an error if some code path accidentally calls it with an unsupported join type, e.g. JOIN_SEMI.

Agreed, fixed.
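
A rough standalone sketch of the "fail loudly" behaviour agreed on here, not the actual patch code. JoinKind, jointype_name() and deparse_join_keyword() are hypothetical names; inside the backend the unsupported branch would presumably raise the error with elog(ERROR, ...) rather than return NULL and assert.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical subset of join types; the real enum is JoinType. */
typedef enum { JT_INNER, JT_LEFT, JT_RIGHT, JT_FULL, JT_SEMI } JoinKind;

/* Return the SQL keyword for a join type we can deparse, or NULL otherwise. */
static const char *
jointype_name(JoinKind kind)
{
    switch (kind)
    {
        case JT_INNER: return "INNER";
        case JT_LEFT:  return "LEFT";
        case JT_RIGHT: return "RIGHT";
        case JT_FULL:  return "FULL";
        default:       return NULL;     /* unsupported, e.g. a semi join */
    }
}

/* Write " <TYPE> JOIN " into buf; refuse to continue on an unsupported type. */
static int
deparse_join_keyword(char *buf, size_t buflen, JoinKind kind)
{
    const char *name = jointype_name(kind);

    assert(name != NULL);       /* fail loudly instead of emitting "" */
    return snprintf(buf, buflen, " %s JOIN ", name);
}
```

The point of the design is that an empty string would silently produce malformed remote SQL, while a hard failure exposes the wrong code path immediately.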

While deparsing the SQL with rowmarks, the placement of the FOR UPDATE/SHARE clause in the original query is not being honored, which means that we will end up locking rows which are not part of the join result even when the join is pushed to the foreign server. E.g. take the following query (it uses the tables created in the postgres_fdw.sql tests):
contrib_regression=# explain verbose select * from ft1 join ft2 on (ft1.c1 = ft2.c1) for update of ft1;

QUERY PLAN
--------------------------------------------------------------------------------
LockRows (cost=100.00..124.66 rows=822 width=426)
  Output: ft1.c1, ft1.c2, ft1.c3, ft1.c4, ft1.c5, ft1.c6, ft1.c7, ft1.c8, ft2.c1, ft2.c2, ft2.c3, ft2.c4, ft2.c5, ft2.c6, ft2.c7, ft2.c8, ft1.*, ft2.*
  -> Foreign Scan (cost=100.00..116.44 rows=822 width=426)
       Output: ft1.c1, ft1.c2, ft1.c3, ft1.c4, ft1.c5, ft1.c6, ft1.c7, ft1.c8, ft2.c1, ft2.c2, ft2.c3, ft2.c4, ft2.c5, ft2.c6, ft2.c7, ft2.c8, ft1.*, ft2.*
       Relations: (public.ft1) INNER JOIN (public.ft2)
       Remote SQL: SELECT l.a1, l.a2, l.a3, l.a4, l.a5, l.a6, l.a7, l.a8, l.a9, r.a1, r.a2, r.a3, r.a4, r.a5, r.a6, r.a7, r.a8, r.a9 FROM (SELECT l.a10, l.a11, l.a12, l.a13, l.a14, l.a15, l.a16, l.a17, ROW(l.a10, l.a11, l.a12, l.a13, l.a14, l.a15, l.a16, l.a17) FROM (SELECT "C 1" a10, c2 a11, c3 a12, c4 a13, c5 a14, c6 a15, c7 a16, c8 a17 FROM "S 1"."T 1" FOR UPDATE) l) l (a1, a2, a3, a4, a5, a6, a7, a8, a9) INNER JOIN (SELECT r.a9, r.a10, r.a12, r.a13, r.a14, r.a15, r.a16, r.a17, ROW(r.a9, r.a10, r.a12, r.a13, r.a14, r.a15, r.a16, r.a17) FROM (SELECT "C 1" a9, c2 a10, c3 a12, c4 a13, c5 a14, c6 a15, c7 a16, c8 a17 FROM "S 1"."T 1") r) r (a1, a2, a3, a4, a5, a6, a7, a8, a9) ON ((l.a1 = r.a1))
(6 rows)
The expectation is that FOR UPDATE locks only the rows which are part of the join result. The query sent to the foreign server, however, attaches the FOR UPDATE clause to the sub-query for table ft1 ("S 1"."T 1" on the foreign server). As per the PostgreSQL documentation, "When a locking clause appears in a sub-SELECT, the rows locked are those returned to the outer query by the sub-query." So it is going to lock all rows from "S 1"."T 1", rather than only the rows which are part of the join. This increases the probability of deadlocks if the join is between a big table and a small table, where the big table is used in many queries and the join result has only a single row.

Since there is no is_first argument to appendConditions(), we should remove the corresponding line from the function prologue.

Oops, replaced with the description of prefix.

The name TO_RELATIVE() doesn't convey the full meaning of the macro. Maybe GET_RELATIVE_ATTNO() or something like that.

Fixed.

In postgresGetForeignJoinPaths(), while separating the conditions into join quals and other quals,

    if (IS_OUTER_JOIN(jointype))
    {
        extract_actual_join_clauses(joinclauses, &joinclauses, &otherclauses);
    }
    else
    {
        joinclauses = extract_actual_clauses(joinclauses, false);
        otherclauses = NIL;
    }

we shouldn't differentiate between outer and inner joins. For an inner join the join quals can be treated as other clauses and they will be returned as other clauses, which is fine. Also, the following check

    /*
     * Other condition for the join must be safe to push down.
     */
    foreach(lc, otherclauses)
    {
        Expr *expr = (Expr *) lfirst(lc);

        if (!is_foreign_expr(root, joinrel, expr))
        {
            ereport(DEBUG3, (errmsg("filter contains unsafe conditions")));
            return;
        }
    }

is unnecessary. If there are filter conditions which are unsafe to push down, they can be applied locally after obtaining the join result from the foreign server. The join quals, however, all need to be safe to push down, since they decide which rows will contain a NULL inner side in an OUTER join.

I’m not sure that we *shouldn’t* differentiate, but I agree that we *don’t need* to differentiate if we are talking about only the result of filtering.

fdw=# explain (verbose) select * from pgbench_branches b join pgbench_tellers t on t.bid = b.bid;
WARNING: restrictlist: ({RESTRICTINFO :clause {OPEXPR :opno 96 :opfuncid 65 :opresulttype 16 :opretset false :opcollid 0 :inputcollid 0 :args ({VAR :varno 1 :varattno 1 :vartype 23 :vartypmod -1 :varcollid 0 :varlevelsup 0 :varnoold 1 :varoattno 1 :location 85} {VAR :varno 2 :varattno 2 :vartype 23 :vartypmod -1 :varcollid 0 :varlevelsup 0 :varnoold 2 :varoattno 2 :location 77}) :location -1} :is_pushed_down true :outerjoin_delayed false :can_join true :pseudoconstant false :clause_relids (b 1 2) :required_relids (b 1 2) :outer_relids (b) :nullable_relids (b) :left_relids (b 1) :right_relids (b 2) :orclause <> :norm_selec 0.2000 :outer_selec -1.0000 :mergeopfamilies (o 1976) :left_em {EQUIVALENCEMEMBER :em_expr {VAR :varno 1 :varattno 1 :vartype 23 :vartypmod -1 :varcollid 0 :varlevelsup 0 :varnoold 1 :varoattno 1 :location 85} :em_relids (b 1) :em_nullable_relids (b) :em_is_const false :em_is_child false :em_datatype 23} :right_em {EQUIVALENCEMEMBER :em_expr {VAR :varno 2 :varattno 2 :vartype 23 :vartypmod -1 :varcollid 0 :varlevelsup 0 :varnoold 2 :varoattno 2 :location 77} :em_relids (b 2) :em_nullable_relids (b) :em_is_const false :em_is_child false :em_datatype 23} :outer_is_left false :hashjoinoperator 96})
WARNING: joinclauses: <>
WARNING: otherclauses: ({OPEXPR :opno 96 :opfuncid 65 :opresulttype 16 :opretset false :opcollid 0 :inputcollid 0 :args ({VAR :varno 1 :varattno 1 :vartype 23 :vartypmod -1 :varcollid 0 :varlevelsup 0 :varnoold 1 :varoattno 1 :location 85} {VAR :varno 2 :varattno 2 :vartype 23 :vartypmod -1 :varcollid 0 :varlevelsup 0 :varnoold 2 :varoattno 2 :location 77}) :location -1})

QUERY PLAN

Thoughts?

Regards,
--
Shigeru HANADA
shigeru.hanada@gmail.com


#38

Ashutosh Bapat

ashutosh.bapat@enterprisedb.com

over 10 years ago

In reply to: Shigeru HANADA (#36)

On Fri, Apr 24, 2015 at 3:08 PM, Shigeru HANADA <shigeru.hanada@gmail.com>
wrote:

Hi Ashutosh,

Thanks for the review.

On 2015/04/22 19:28, Ashutosh Bapat <ashutosh.bapat@enterprisedb.com> wrote:

Tests
-------
1. The postgres_fdw test sets and resets enable_mergejoin at various places. The goal of these tests seems to be to check the sanity of the generated foreign plans. So it might be better to set enable_mergejoin (and maybe all of enable_hashjoin, enable_nestloop, etc.) to false at the beginning of the test case and restore them at the end. That way, we also make sure that foreign plans are chosen irrespective of future planner changes.

I have a different, rather opposite opinion about it. I disabled the other join types only as far as needed for the tests to pass, because I worry about oversights arising from planner changes. I hope to eliminate the enable_foo settings from the test script by making the costing model smarter.

Ok, if you can do that, that will be excellent.

2. In the patch, I see that the inheritance test cases have been deleted from postgres_fdw.sql. Is that intentional? I do not see them being replaced anywhere else.

That was an accidental removal; I have restored the tests for the inheritance feature.

Thanks.

3. We need one test for each join type (or at least for INNER and LEFT OUTER) where the ON clause has unsafe-to-push conditions along with safe-to-push conditions. For an INNER join, the join should get pushed down with the safe conditions, and for an OUTER join it shouldn't be. The same goes for the WHERE clause, in which case the join will be pushed down but the unsafe-to-push conditions will be applied locally.

Currently INNER JOINs with unsafe join conditions are not pushed down, so such a test is not in the suite. As you say, in theory INNER JOINs can be pushed down even if they have push-down-unsafe join conditions, because such conditions can be evaluated on the local side against rows retrieved without those conditions.

4. All the tests have ORDER BY and LIMIT in them, so the setref code is being exercised. But something like aggregates would test the setref code better. So we should add at least one test like select avg(ft1.c1 + ft2.c2) from ft1 join ft2 on (ft1.c1 = ft2.c1).

Added an aggregate case, and also added a UNION case for Append.

Thanks.

5. It would be good to add a test that joins a few foreign and a few local tables, to see whether we are able to push down the largest possible foreign join tree to the foreign server.

Are you planning to do anything on this point?

Code
-------
In classifyConditions(), the code is now appending RestrictInfo::clause rather than the RestrictInfo itself. But the callers of classifyConditions() have not changed. Is this change intentional?

Yes, the purpose of the change is to let appendConditions (formerly appendWhereClause) handle a JOIN ON clause, which is a list of Expr.

The functions which consume the lists produced by this function handle expressions as well as RestrictInfo, so you may not have noticed it. Because of this change, we might be missing some optimizations, e.g. in postgresGetForeignPlan():

    if (list_member_ptr(fpinfo->remote_conds, rinfo))
        remote_conds = lappend(remote_conds, rinfo->clause);
    else if (list_member_ptr(fpinfo->local_conds, rinfo))
        local_exprs = lappend(local_exprs, rinfo->clause);
    else if (is_foreign_expr(root, baserel, rinfo->clause))
        remote_conds = lappend(remote_conds, rinfo->clause);
    else
        local_exprs = lappend(local_exprs, rinfo->clause);

Finding a RestrictInfo in remote_conds avoids another call to is_foreign_expr(). So with this change, I think we are doing an extra call to is_foreign_expr().

Hm, it seems better to revert my change and make appendConditions downcast
given information into RestrictInfo or Expr according to the node tag.

Thanks.

The function get_jointype_name() returns an empty string for unsupported join types. Instead, it should throw an error if some code path accidentally calls it with an unsupported join type, e.g. JOIN_SEMI.

Agreed, fixed.

Thanks.

While deparsing the SQL with rowmarks, the placement of the FOR UPDATE/SHARE clause in the original query is not being honored, which means that we will end up locking rows which are not part of the join result even when the join is pushed to the foreign server. E.g. take the following query (it uses the tables created in the postgres_fdw.sql tests):

contrib_regression=# explain verbose select * from ft1 join ft2 on (ft1.c1 = ft2.c1) for update of ft1;

QUERY PLAN
--------------------------------------------------------------------------------
LockRows (cost=100.00..124.66 rows=822 width=426)
  Output: ft1.c1, ft1.c2, ft1.c3, ft1.c4, ft1.c5, ft1.c6, ft1.c7, ft1.c8, ft2.c1, ft2.c2, ft2.c3, ft2.c4, ft2.c5, ft2.c6, ft2.c7, ft2.c8, ft1.*, ft2.*
  -> Foreign Scan (cost=100.00..116.44 rows=822 width=426)
       Output: ft1.c1, ft1.c2, ft1.c3, ft1.c4, ft1.c5, ft1.c6, ft1.c7, ft1.c8, ft2.c1, ft2.c2, ft2.c3, ft2.c4, ft2.c5, ft2.c6, ft2.c7, ft2.c8, ft1.*, ft2.*
       Relations: (public.ft1) INNER JOIN (public.ft2)
       Remote SQL: SELECT l.a1, l.a2, l.a3, l.a4, l.a5, l.a6, l.a7, l.a8, l.a9, r.a1, r.a2, r.a3, r.a4, r.a5, r.a6, r.a7, r.a8, r.a9 FROM (SELECT l.a10, l.a11, l.a12, l.a13, l.a14, l.a15, l.a16, l.a17, ROW(l.a10, l.a11, l.a12, l.a13, l.a14, l.a15, l.a16, l.a17) FROM (SELECT "C 1" a10, c2 a11, c3 a12, c4 a13, c5 a14, c6 a15, c7 a16, c8 a17 FROM "S 1"."T 1" FOR UPDATE) l) l (a1, a2, a3, a4, a5, a6, a7, a8, a9) INNER JOIN (SELECT r.a9, r.a10, r.a12, r.a13, r.a14, r.a15, r.a16, r.a17, ROW(r.a9, r.a10, r.a12, r.a13, r.a14, r.a15, r.a16, r.a17) FROM (SELECT "C 1" a9, c2 a10, c3 a12, c4 a13, c5 a14, c6 a15, c7 a16, c8 a17 FROM "S 1"."T 1") r) r (a1, a2, a3, a4, a5, a6, a7, a8, a9) ON ((l.a1 = r.a1))
(6 rows)
The expectation is that FOR UPDATE locks only the rows which are part of the join result. The query sent to the foreign server, however, attaches the FOR UPDATE clause to the sub-query for table ft1 ("S 1"."T 1" on the foreign server). As per the PostgreSQL documentation, "When a locking clause appears in a sub-SELECT, the rows locked are those returned to the outer query by the sub-query." So it is going to lock all rows from "S 1"."T 1", rather than only the rows which are part of the join. This increases the probability of deadlocks if the join is between a big table and a small table, where the big table is used in many queries and the join result has only a single row.

Are you planning to do anything about this point?

Since there is no is_first argument to appendConditions(), we should remove the corresponding line from the function prologue.

Oops, replaced with the description of prefix.

The name TO_RELATIVE() doesn't convey the full meaning of the macro. Maybe GET_RELATIVE_ATTNO() or something like that.

Fixed.

Thanks.

In postgresGetForeignJoinPaths(), while separating the conditions into join quals and other quals,

    if (IS_OUTER_JOIN(jointype))
    {
        extract_actual_join_clauses(joinclauses, &joinclauses, &otherclauses);
    }
    else
    {
        joinclauses = extract_actual_clauses(joinclauses, false);
        otherclauses = NIL;
    }

we shouldn't differentiate between outer and inner joins. For an inner join the join quals can be treated as other clauses and they will be returned as other clauses, which is fine. Also, the following check

    /*
     * Other condition for the join must be safe to push down.
     */
    foreach(lc, otherclauses)
    {
        Expr *expr = (Expr *) lfirst(lc);

        if (!is_foreign_expr(root, joinrel, expr))
        {
            ereport(DEBUG3, (errmsg("filter contains unsafe conditions")));
            return;
        }
    }

is unnecessary. If there are filter conditions which are unsafe to push down, they can be applied locally after obtaining the join result from the foreign server. The join quals, however, all need to be safe to push down, since they decide which rows will contain a NULL inner side in an OUTER join.

I’m not sure that we *shouldn’t* differentiate, but I agree that we *don’t
need* to differentiate if we are talking about only the result of filtering.

IMO we *should* differentiate inner and outer joins (or differentiate join conditions and filter conditions), because all conditions of a typical INNER JOIN go into otherclauses since their is_pushed_down flag is on, so such joins look like CROSS JOIN + WHERE filter. In the latest patch EXPLAIN shows the join combinations of a foreign join scan node with the join type, but your suggestion makes it look like this:

fdw=# explain (verbose) select * from pgbench_branches b join pgbench_tellers t on t.bid = b.bid;

QUERY PLAN
--------------------------------------------------------------------------------
Foreign Scan (cost=100.00..101.00 rows=50 width=716)
  Output: b.bid, b.bbalance, b.filler, t.tid, t.bid, t.tbalance, t.filler
  Relations: (public.pgbench_branches b) CROSS JOIN (public.pgbench_tellers t)
  Remote SQL: SELECT l.a1, l.a2, l.a3, r.a1, r.a2, r.a3, r.a4 FROM (SELECT l.a9, l.a10, l.a11 FROM (SELECT bid a9, bbalance a10, filler a11 FROM public.pgbench_branches) l) l (a1, a2, a3) CROSS JOIN (SELECT r.a9, r.a10, r.a11, r.a12 FROM (SELECT tid a9, bid a10, tbalance a11, filler a12 FROM public.pgbench_tellers) r) r (a1, a2, a3, a4) WHERE ((l.a1 = r.a2))
(4 rows)

Thoughts?

It does hamper readability a bit. But it explicitly shows how we want to treat the join. We can leave this to the committers though.
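
For readers following the CROSS JOIN + WHERE effect, here is a small standalone model of the split discussed in this thread. It is only a sketch: Qual and split_quals() are hypothetical stand-ins for RestrictInfo and extract_actual_join_clauses(), and the only behaviour modelled is the one described above, namely that quals with is_pushed_down set end up among the other (filter) clauses.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, minimal stand-in for RestrictInfo. */
typedef struct Qual
{
    const char *text;
    bool        is_pushed_down; /* set for WHERE quals and inner-join ON quals */
} Qual;

/*
 * Split quals into join quals and other (filter) quals by is_pushed_down.
 * Each output array must have room for n entries.
 */
static void
split_quals(const Qual *quals, size_t n,
            const Qual **joinquals, size_t *n_join,
            const Qual **otherquals, size_t *n_other)
{
    assert(quals != NULL || n == 0);
    *n_join = 0;
    *n_other = 0;
    for (size_t i = 0; i < n; i++)
    {
        if (quals[i].is_pushed_down)
            otherquals[(*n_other)++] = &quals[i];   /* treated as a filter */
        else
            joinquals[(*n_join)++] = &quals[i];     /* true join condition */
    }
}
```

With the inner-join qual t.bid = b.bid flagged as pushed down, the join-qual list comes back empty, so a deparser that prints only join quals in the ON clause has nothing to print and falls back to CROSS JOIN with a WHERE filter.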

Regards,
--
Shigeru HANADA
shigeru.hanada@gmail.com

--
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company

#39

Robert Haas

robertmhaas@gmail.com

over 10 years ago

In reply to: Shigeru HANADA (#37)

On Mon, Apr 27, 2015 at 5:05 AM, Shigeru HANADA
<shigeru.hanada@gmail.com> wrote:

Currently INNER JOINs with unsafe join conditions are not pushed down, so such a test is not in the suite. As you say, in theory INNER JOINs can be pushed down even if they have push-down-unsafe join conditions, because such conditions can be evaluated on the local side against rows retrieved without those conditions.

I suspect it's worth trying to do the pushdown if there is at least
one safe joinclause. If there are none, fetching a Cartesian product
figures to be a loser.
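
The rule of thumb above could be sketched roughly as follows. This is a standalone illustration under assumptions, not proposed patch code: JoinQual, its safe flag (standing in for the result of is_foreign_expr()) and inner_join_worth_pushing() are hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical join clause with a precomputed "safe to ship" flag. */
typedef struct JoinQual
{
    const char *text;
    bool        safe;
} JoinQual;

/*
 * For an inner join: push the join down only if at least one join clause
 * can be evaluated remotely. *n_local receives the number of clauses that
 * would have to be re-checked locally on the fetched rows.
 */
static bool
inner_join_worth_pushing(const JoinQual *quals, size_t n, size_t *n_local)
{
    size_t n_remote = 0;

    assert(n_local != NULL);
    *n_local = 0;
    for (size_t i = 0; i < n; i++)
    {
        if (quals[i].safe)
            n_remote++;
        else
            (*n_local)++;
    }
    /* No safe clause means the remote side would return a Cartesian product. */
    return n_remote > 0;
}
```

This applies to inner joins only; as discussed upthread, an outer join needs every join qual to be safe, because those quals decide which rows are NULL-extended.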

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
