Pushing down sorted joins

Started by Ashutosh Bapatalmost 10 years ago6 messages
#1Ashutosh Bapat
ashutosh.bapat@enterprisedb.com
5 attachment(s)

Hi All,
Now that we have join pushdown support in postgres_fdw, we can leverage the
sort pushdown mechanism for base relations to work for pushed down joins as
well. PFA patch for the same.

The code to find useful set of pathkeys and then generate paths for each
list of pathkeys is moved into a function which is called for base
relations and join relations, while creating respective paths. The useful
pathkeys are same as the base relation i.e. root->query_pathkeys and
pathkeys useful for merge join as discussed in [1]/messages/by-id/CAFjFpRfeKHiCmwJ72p4=ZvuZRQsaU9tbfyW7vwr-5PPvrCbcQQ@mail.gmail.com -- Best Wishes, Ashutosh Bapat EnterpriseDB Corporation The Postgres Database Company.

I measured performance of pushing down sort for merge joins for query
SELECT lt1.val, ft1.val, ft2.val FROM lt1 join (ft1 join ft2 on (ft1.val =
ft2.val)) on (lt1.val = ft1.val) where ft1, ft2 are foreign tables, join
between which gets pushed down to the foreign server and lt is the local
table.

Without the patch servers prefers local merge join between foreign tables
followed by merge join with local table by getting the data sorted from the
foreign server. But with the patch, it pushes down the foreign join and
also gets the data sorted for local merge join. The times measured over 10
runs of query with and without patch are

With patch
avg_exe_time | std_dev_exe_time | min_exe_time | max_exe_time
--------------+------------------+--------------+--------------
60310.0369 | 251.075471210925 | 59895.064 | 60746.496

Without patch
avg_exe_time | std_dev_exe_time | min_exe_time | max_exe_time
--------------+------------------+--------------+--------------
86396.6001 | 254.30988131848 | 85906.606 | 86742.311

With the patch the execution time of the query reduces by 30%.

The scripts to setup and run query and outputs of running query with and
without patch are attached.

[1]: /messages/by-id/CAFjFpRfeKHiCmwJ72p4=ZvuZRQsaU9tbfyW7vwr-5PPvrCbcQQ@mail.gmail.com -- Best Wishes, Ashutosh Bapat EnterpriseDB Corporation The Postgres Database Company
/messages/by-id/CAFjFpRfeKHiCmwJ72p4=ZvuZRQsaU9tbfyW7vwr-5PPvrCbcQQ@mail.gmail.com
--
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company

Attachments:

pg_join_sort_pd.patchapplication/x-download; name=pg_join_sort_pd.patchDownload
diff --git a/contrib/postgres_fdw/expected/postgres_fdw.out b/contrib/postgres_fdw/expected/postgres_fdw.out
index 280c377..e26c587 100644
--- a/contrib/postgres_fdw/expected/postgres_fdw.out
+++ b/contrib/postgres_fdw/expected/postgres_fdw.out
@@ -437,20 +437,54 @@ SELECT t1.c1, t2."C 1" FROM ft2 t1 LEFT JOIN "S 1"."T 1" t2 ON (t1.c1 = t2."C 1"
  103 | 103
  104 | 104
  105 | 105
  106 | 106
  107 | 107
  108 | 108
  109 | 109
  110 | 110
 (10 rows)
 
+-- A join between local table and foreign join. ORDER BY clause is added to the
+-- foreign join so that the local table can be joined using merge join strategy.
+EXPLAIN (COSTS false, VERBOSE)
+	SELECT t1."C 1" FROM "S 1"."T 1" t1 left join ft1 t2 join ft2 t3 on (t2.c1 = t3.c1) on (t3.c1 = t1."C 1") OFFSET 100 LIMIT 10;
+                                                                       QUERY PLAN                                                                        
+---------------------------------------------------------------------------------------------------------------------------------------------------------
+ Limit
+   Output: t1."C 1"
+   ->  Merge Right Join
+         Output: t1."C 1"
+         Merge Cond: (t3.c1 = t1."C 1")
+         ->  Foreign Scan
+               Output: t3.c1
+               Relations: (public.ft1 t2) INNER JOIN (public.ft2 t3)
+               Remote SQL: SELECT r3."C 1" FROM ("S 1"."T 1" r2 INNER JOIN "S 1"."T 1" r3 ON (TRUE)) WHERE ((r2."C 1" = r3."C 1")) ORDER BY r2."C 1" ASC
+         ->  Index Only Scan using t1_pkey on "S 1"."T 1" t1
+               Output: t1."C 1"
+(11 rows)
+
+SELECT t1."C 1" FROM "S 1"."T 1" t1 left join ft1 t2 join ft2 t3 on (t2.c1 = t3.c1) on (t3.c1 = t1."C 1") OFFSET 100 LIMIT 10;
+ C 1 
+-----
+ 101
+ 102
+ 103
+ 104
+ 105
+ 106
+ 107
+ 108
+ 109
+ 110
+(10 rows)
+
 RESET enable_hashjoin;
 RESET enable_nestloop;
 -- ===================================================================
 -- WHERE with remotely-executable conditions
 -- ===================================================================
 EXPLAIN (VERBOSE, COSTS false) SELECT * FROM ft1 t1 WHERE t1.c1 = 1;         -- Var, OpExpr(b), Const
                                          QUERY PLAN                                          
 ---------------------------------------------------------------------------------------------
  Foreign Scan on public.ft1 t1
    Output: c1, c2, c3, c4, c5, c6, c7, c8
@@ -862,94 +896,85 @@ SELECT count(c3) FROM ft1 t1 WHERE t1.c1 === t1.c2;
 -- ===================================================================
 -- JOIN queries
 -- ===================================================================
 -- Analyze ft4 and ft5 so that we have better statistics. These tables do not
 -- have use_remote_estimate set.
 ANALYZE ft4;
 ANALYZE ft5;
 -- join two tables
 EXPLAIN (COSTS false, VERBOSE)
 SELECT t1.c1, t2.c1 FROM ft1 t1 JOIN ft2 t2 ON (t1.c1 = t2.c1) ORDER BY t1.c3, t1.c1 OFFSET 100 LIMIT 10;
-                                                                     QUERY PLAN                                                                     
-----------------------------------------------------------------------------------------------------------------------------------------------------
+                                                                                  QUERY PLAN                                                                                   
+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
  Limit
    Output: t1.c1, t2.c1, t1.c3
-   ->  Sort
+   ->  Foreign Scan
          Output: t1.c1, t2.c1, t1.c3
-         Sort Key: t1.c3, t1.c1
-         ->  Foreign Scan
-               Output: t1.c1, t2.c1, t1.c3
-               Relations: (public.ft1 t1) INNER JOIN (public.ft2 t2)
-               Remote SQL: SELECT r1."C 1", r1.c3, r2."C 1" FROM ("S 1"."T 1" r1 INNER JOIN "S 1"."T 1" r2 ON (TRUE)) WHERE ((r1."C 1" = r2."C 1"))
-(9 rows)
+         Relations: (public.ft1 t1) INNER JOIN (public.ft2 t2)
+         Remote SQL: SELECT r1."C 1", r1.c3, r2."C 1" FROM ("S 1"."T 1" r1 INNER JOIN "S 1"."T 1" r2 ON (TRUE)) WHERE ((r1."C 1" = r2."C 1")) ORDER BY r1.c3 ASC, r1."C 1" ASC
+(6 rows)
 
 SELECT t1.c1, t2.c1 FROM ft1 t1 JOIN ft2 t2 ON (t1.c1 = t2.c1) ORDER BY t1.c3, t1.c1 OFFSET 100 LIMIT 10;
  c1  | c1  
 -----+-----
  101 | 101
  102 | 102
  103 | 103
  104 | 104
  105 | 105
  106 | 106
  107 | 107
  108 | 108
  109 | 109
  110 | 110
 (10 rows)
 
 -- join three tables
 EXPLAIN (COSTS false, VERBOSE)
 SELECT t1.c1, t2.c2, t3.c3 FROM ft1 t1 JOIN ft2 t2 ON (t1.c1 = t2.c1) JOIN ft4 t3 ON (t3.c1 = t1.c1) ORDER BY t1.c3, t1.c1 OFFSET 10 LIMIT 10;
-                                                                                                      QUERY PLAN                                                                                                       
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+                                                                                                                    QUERY PLAN                                                                                                                    
+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
  Limit
    Output: t1.c1, t2.c2, t3.c3, t1.c3
-   ->  Sort
+   ->  Foreign Scan
          Output: t1.c1, t2.c2, t3.c3, t1.c3
-         Sort Key: t1.c3, t1.c1
-         ->  Foreign Scan
-               Output: t1.c1, t2.c2, t3.c3, t1.c3
-               Relations: ((public.ft1 t1) INNER JOIN (public.ft2 t2)) INNER JOIN (public.ft4 t3)
-               Remote SQL: SELECT r1."C 1", r1.c3, r2.c2, r4.c3 FROM (("S 1"."T 1" r1 INNER JOIN "S 1"."T 1" r2 ON (TRUE)) INNER JOIN "S 1"."T 3" r4 ON (TRUE)) WHERE ((r1."C 1" = r4.c1)) AND ((r1."C 1" = r2."C 1"))
-(9 rows)
+         Relations: ((public.ft1 t1) INNER JOIN (public.ft2 t2)) INNER JOIN (public.ft4 t3)
+         Remote SQL: SELECT r1."C 1", r1.c3, r2.c2, r4.c3 FROM (("S 1"."T 1" r1 INNER JOIN "S 1"."T 1" r2 ON (TRUE)) INNER JOIN "S 1"."T 3" r4 ON (TRUE)) WHERE ((r1."C 1" = r4.c1)) AND ((r1."C 1" = r2."C 1")) ORDER BY r1.c3 ASC, r1."C 1" ASC
+(6 rows)
 
 SELECT t1.c1, t2.c2, t3.c3 FROM ft1 t1 JOIN ft2 t2 ON (t1.c1 = t2.c1) JOIN ft4 t3 ON (t3.c1 = t1.c1) ORDER BY t1.c3, t1.c1 OFFSET 10 LIMIT 10;
  c1 | c2 |   c3   
 ----+----+--------
  22 |  2 | AAA022
  24 |  4 | AAA024
  26 |  6 | AAA026
  28 |  8 | AAA028
  30 |  0 | AAA030
  32 |  2 | AAA032
  34 |  4 | AAA034
  36 |  6 | AAA036
  38 |  8 | AAA038
  40 |  0 | AAA040
 (10 rows)
 
 -- left outer join
 EXPLAIN (COSTS false, VERBOSE)
 SELECT t1.c1, t2.c1 FROM ft4 t1 LEFT JOIN ft5 t2 ON (t1.c1 = t2.c1) ORDER BY t1.c1, t2.c1 OFFSET 10 LIMIT 10;
-                                                     QUERY PLAN                                                      
----------------------------------------------------------------------------------------------------------------------
+                                                                 QUERY PLAN                                                                  
+---------------------------------------------------------------------------------------------------------------------------------------------
  Limit
    Output: t1.c1, t2.c1
-   ->  Sort
+   ->  Foreign Scan
          Output: t1.c1, t2.c1
-         Sort Key: t1.c1, t2.c1
-         ->  Foreign Scan
-               Output: t1.c1, t2.c1
-               Relations: (public.ft4 t1) LEFT JOIN (public.ft5 t2)
-               Remote SQL: SELECT r1.c1, r2.c1 FROM ("S 1"."T 3" r1 LEFT JOIN "S 1"."T 4" r2 ON (((r1.c1 = r2.c1))))
-(9 rows)
+         Relations: (public.ft4 t1) LEFT JOIN (public.ft5 t2)
+         Remote SQL: SELECT r1.c1, r2.c1 FROM ("S 1"."T 3" r1 LEFT JOIN "S 1"."T 4" r2 ON (((r1.c1 = r2.c1)))) ORDER BY r1.c1 ASC, r2.c1 ASC
+(6 rows)
 
 SELECT t1.c1, t2.c1 FROM ft4 t1 LEFT JOIN ft5 t2 ON (t1.c1 = t2.c1) ORDER BY t1.c1, t2.c1 OFFSET 10 LIMIT 10;
  c1 | c1 
 ----+----
  22 |   
  24 | 24
  26 |   
  28 |   
  30 | 30
  32 |   
@@ -1000,63 +1025,57 @@ SELECT t1.c1, t1.c2, t2.c1, t2.c2 FROM ft4 t1 LEFT JOIN (SELECT * FROM ft5 WHERE
 ----+----+----+----
   2 |  3 |    |   
   4 |  5 |    |   
   6 |  7 |  6 |  7
   8 |  9 |    |   
 (4 rows)
 
 -- right outer join
 EXPLAIN (COSTS false, VERBOSE)
 SELECT t1.c1, t2.c1 FROM ft5 t1 RIGHT JOIN ft4 t2 ON (t1.c1 = t2.c1) ORDER BY t2.c1, t1.c1 OFFSET 10 LIMIT 10;
-                                                     QUERY PLAN                                                      
----------------------------------------------------------------------------------------------------------------------
+                                                                 QUERY PLAN                                                                  
+---------------------------------------------------------------------------------------------------------------------------------------------
  Limit
    Output: t1.c1, t2.c1
-   ->  Sort
+   ->  Foreign Scan
          Output: t1.c1, t2.c1
-         Sort Key: t2.c1, t1.c1
-         ->  Foreign Scan
-               Output: t1.c1, t2.c1
-               Relations: (public.ft4 t2) LEFT JOIN (public.ft5 t1)
-               Remote SQL: SELECT r2.c1, r1.c1 FROM ("S 1"."T 3" r2 LEFT JOIN "S 1"."T 4" r1 ON (((r1.c1 = r2.c1))))
-(9 rows)
+         Relations: (public.ft4 t2) LEFT JOIN (public.ft5 t1)
+         Remote SQL: SELECT r2.c1, r1.c1 FROM ("S 1"."T 3" r2 LEFT JOIN "S 1"."T 4" r1 ON (((r1.c1 = r2.c1)))) ORDER BY r2.c1 ASC, r1.c1 ASC
+(6 rows)
 
 SELECT t1.c1, t2.c1 FROM ft5 t1 RIGHT JOIN ft4 t2 ON (t1.c1 = t2.c1) ORDER BY t2.c1, t1.c1 OFFSET 10 LIMIT 10;
  c1 | c1 
 ----+----
     | 22
  24 | 24
     | 26
     | 28
  30 | 30
     | 32
     | 34
  36 | 36
     | 38
     | 40
 (10 rows)
 
 -- full outer join
 EXPLAIN (COSTS false, VERBOSE)
 SELECT t1.c1, t2.c1 FROM ft4 t1 FULL JOIN ft5 t2 ON (t1.c1 = t2.c1) ORDER BY t1.c1, t2.c1 OFFSET 45 LIMIT 10;
-                                                     QUERY PLAN                                                      
----------------------------------------------------------------------------------------------------------------------
+                                                                 QUERY PLAN                                                                  
+---------------------------------------------------------------------------------------------------------------------------------------------
  Limit
    Output: t1.c1, t2.c1
-   ->  Sort
+   ->  Foreign Scan
          Output: t1.c1, t2.c1
-         Sort Key: t1.c1, t2.c1
-         ->  Foreign Scan
-               Output: t1.c1, t2.c1
-               Relations: (public.ft4 t1) FULL JOIN (public.ft5 t2)
-               Remote SQL: SELECT r1.c1, r2.c1 FROM ("S 1"."T 3" r1 FULL JOIN "S 1"."T 4" r2 ON (((r1.c1 = r2.c1))))
-(9 rows)
+         Relations: (public.ft4 t1) FULL JOIN (public.ft5 t2)
+         Remote SQL: SELECT r1.c1, r2.c1 FROM ("S 1"."T 3" r1 FULL JOIN "S 1"."T 4" r2 ON (((r1.c1 = r2.c1)))) ORDER BY r1.c1 ASC, r2.c1 ASC
+(6 rows)
 
 SELECT t1.c1, t2.c1 FROM ft4 t1 FULL JOIN ft5 t2 ON (t1.c1 = t2.c1) ORDER BY t1.c1, t2.c1 OFFSET 45 LIMIT 10;
  c1  | c1 
 -----+----
   92 |   
   94 |   
   96 | 96
   98 |   
  100 |   
      |  3
@@ -1094,191 +1113,179 @@ SELECT t1.c1, t2.c1 FROM ft4 t1 FULL JOIN ft5 t2 ON (t1.c1 = t2.c1) WHERE (t1.c1
     |  3
     |  9
     | 15
     | 21
 (10 rows)
 
 -- join two tables with FOR UPDATE clause
 -- tests whole-row reference for row marks
 EXPLAIN (COSTS false, VERBOSE)
 SELECT t1.c1, t2.c1 FROM ft1 t1 JOIN ft2 t2 ON (t1.c1 = t2.c1) ORDER BY t1.c3, t1.c1 OFFSET 100 LIMIT 10 FOR UPDATE OF t1;
-                                                                                                                                                QUERY PLAN                                                                                                                                                 
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+                                                                                                                                                              QUERY PLAN                                                                                                                                                              
+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
  Limit
    Output: t1.c1, t2.c1, t1.c3, t1.*, t2.*
    ->  LockRows
          Output: t1.c1, t2.c1, t1.c3, t1.*, t2.*
-         ->  Sort
+         ->  Foreign Scan
                Output: t1.c1, t2.c1, t1.c3, t1.*, t2.*
-               Sort Key: t1.c3, t1.c1
-               ->  Foreign Scan
-                     Output: t1.c1, t2.c1, t1.c3, t1.*, t2.*
-                     Relations: (public.ft1 t1) INNER JOIN (public.ft2 t2)
-                     Remote SQL: SELECT r1."C 1", r1.c3, ROW(r1."C 1", r1.c2, r1.c3, r1.c4, r1.c5, r1.c6, r1.c7, r1.c8), r2."C 1", ROW(r2."C 1", r2.c2, r2.c3, r2.c4, r2.c5, r2.c6, r2.c7, r2.c8) FROM ("S 1"."T 1" r1 INNER JOIN "S 1"."T 1" r2 ON (TRUE)) WHERE ((r1."C 1" = r2."C 1")) FOR UPDATE OF r1
-                     ->  Merge Join
-                           Output: t1.c1, t1.c3, t1.*, t2.c1, t2.*
-                           Merge Cond: (t1.c1 = t2.c1)
-                           ->  Sort
+               Relations: (public.ft1 t1) INNER JOIN (public.ft2 t2)
+               Remote SQL: SELECT r1."C 1", r1.c3, ROW(r1."C 1", r1.c2, r1.c3, r1.c4, r1.c5, r1.c6, r1.c7, r1.c8), r2."C 1", ROW(r2."C 1", r2.c2, r2.c3, r2.c4, r2.c5, r2.c6, r2.c7, r2.c8) FROM ("S 1"."T 1" r1 INNER JOIN "S 1"."T 1" r2 ON (TRUE)) WHERE ((r1."C 1" = r2."C 1")) ORDER BY r1.c3 ASC, r1."C 1" ASC FOR UPDATE OF r1
+               ->  Merge Join
+                     Output: t1.c1, t1.c3, t1.*, t2.c1, t2.*
+                     Merge Cond: (t1.c1 = t2.c1)
+                     ->  Sort
+                           Output: t1.c1, t1.c3, t1.*
+                           Sort Key: t1.c1
+                           ->  Foreign Scan on public.ft1 t1
                                  Output: t1.c1, t1.c3, t1.*
-                                 Sort Key: t1.c1
-                                 ->  Foreign Scan on public.ft1 t1
-                                       Output: t1.c1, t1.c3, t1.*
-                                       Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" FOR UPDATE
-                           ->  Sort
+                                 Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" FOR UPDATE
+                     ->  Sort
+                           Output: t2.c1, t2.*
+                           Sort Key: t2.c1
+                           ->  Foreign Scan on public.ft2 t2
                                  Output: t2.c1, t2.*
-                                 Sort Key: t2.c1
-                                 ->  Foreign Scan on public.ft2 t2
-                                       Output: t2.c1, t2.*
-                                       Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1"
-(26 rows)
+                                 Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1"
+(23 rows)
 
 SELECT t1.c1, t2.c1 FROM ft1 t1 JOIN ft2 t2 ON (t1.c1 = t2.c1) ORDER BY t1.c3, t1.c1 OFFSET 100 LIMIT 10 FOR UPDATE OF t1;
  c1  | c1  
 -----+-----
  101 | 101
  102 | 102
  103 | 103
  104 | 104
  105 | 105
  106 | 106
  107 | 107
  108 | 108
  109 | 109
  110 | 110
 (10 rows)
 
 EXPLAIN (COSTS false, VERBOSE)
 SELECT t1.c1, t2.c1 FROM ft1 t1 JOIN ft2 t2 ON (t1.c1 = t2.c1) ORDER BY t1.c3, t1.c1 OFFSET 100 LIMIT 10 FOR UPDATE;
-                                                                                                                                                         QUERY PLAN                                                                                                                                                         
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+                                                                                                                                                                      QUERY PLAN                                                                                                                                                                       
+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
  Limit
    Output: t1.c1, t2.c1, t1.c3, t1.*, t2.*
    ->  LockRows
          Output: t1.c1, t2.c1, t1.c3, t1.*, t2.*
-         ->  Sort
+         ->  Foreign Scan
                Output: t1.c1, t2.c1, t1.c3, t1.*, t2.*
-               Sort Key: t1.c3, t1.c1
-               ->  Foreign Scan
-                     Output: t1.c1, t2.c1, t1.c3, t1.*, t2.*
-                     Relations: (public.ft1 t1) INNER JOIN (public.ft2 t2)
-                     Remote SQL: SELECT r1."C 1", r1.c3, ROW(r1."C 1", r1.c2, r1.c3, r1.c4, r1.c5, r1.c6, r1.c7, r1.c8), r2."C 1", ROW(r2."C 1", r2.c2, r2.c3, r2.c4, r2.c5, r2.c6, r2.c7, r2.c8) FROM ("S 1"."T 1" r1 INNER JOIN "S 1"."T 1" r2 ON (TRUE)) WHERE ((r1."C 1" = r2."C 1")) FOR UPDATE OF r1 FOR UPDATE OF r2
-                     ->  Merge Join
-                           Output: t1.c1, t1.c3, t1.*, t2.c1, t2.*
-                           Merge Cond: (t1.c1 = t2.c1)
-                           ->  Sort
+               Relations: (public.ft1 t1) INNER JOIN (public.ft2 t2)
+               Remote SQL: SELECT r1."C 1", r1.c3, ROW(r1."C 1", r1.c2, r1.c3, r1.c4, r1.c5, r1.c6, r1.c7, r1.c8), r2."C 1", ROW(r2."C 1", r2.c2, r2.c3, r2.c4, r2.c5, r2.c6, r2.c7, r2.c8) FROM ("S 1"."T 1" r1 INNER JOIN "S 1"."T 1" r2 ON (TRUE)) WHERE ((r1."C 1" = r2."C 1")) ORDER BY r1.c3 ASC, r1."C 1" ASC FOR UPDATE OF r1 FOR UPDATE OF r2
+               ->  Merge Join
+                     Output: t1.c1, t1.c3, t1.*, t2.c1, t2.*
+                     Merge Cond: (t1.c1 = t2.c1)
+                     ->  Sort
+                           Output: t1.c1, t1.c3, t1.*
+                           Sort Key: t1.c1
+                           ->  Foreign Scan on public.ft1 t1
                                  Output: t1.c1, t1.c3, t1.*
-                                 Sort Key: t1.c1
-                                 ->  Foreign Scan on public.ft1 t1
-                                       Output: t1.c1, t1.c3, t1.*
-                                       Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" FOR UPDATE
-                           ->  Sort
+                                 Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" FOR UPDATE
+                     ->  Sort
+                           Output: t2.c1, t2.*
+                           Sort Key: t2.c1
+                           ->  Foreign Scan on public.ft2 t2
                                  Output: t2.c1, t2.*
-                                 Sort Key: t2.c1
-                                 ->  Foreign Scan on public.ft2 t2
-                                       Output: t2.c1, t2.*
-                                       Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" FOR UPDATE
-(26 rows)
+                                 Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" FOR UPDATE
+(23 rows)
 
 SELECT t1.c1, t2.c1 FROM ft1 t1 JOIN ft2 t2 ON (t1.c1 = t2.c1) ORDER BY t1.c3, t1.c1 OFFSET 100 LIMIT 10 FOR UPDATE;
  c1  | c1  
 -----+-----
  101 | 101
  102 | 102
  103 | 103
  104 | 104
  105 | 105
  106 | 106
  107 | 107
  108 | 108
  109 | 109
  110 | 110
 (10 rows)
 
 -- join two tables with FOR SHARE clause
 EXPLAIN (COSTS false, VERBOSE)
 SELECT t1.c1, t2.c1 FROM ft1 t1 JOIN ft2 t2 ON (t1.c1 = t2.c1) ORDER BY t1.c3, t1.c1 OFFSET 100 LIMIT 10 FOR SHARE OF t1;
-                                                                                                                                                QUERY PLAN                                                                                                                                                
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+                                                                                                                                                             QUERY PLAN                                                                                                                                                              
+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
  Limit
    Output: t1.c1, t2.c1, t1.c3, t1.*, t2.*
    ->  LockRows
          Output: t1.c1, t2.c1, t1.c3, t1.*, t2.*
-         ->  Sort
+         ->  Foreign Scan
                Output: t1.c1, t2.c1, t1.c3, t1.*, t2.*
-               Sort Key: t1.c3, t1.c1
-               ->  Foreign Scan
-                     Output: t1.c1, t2.c1, t1.c3, t1.*, t2.*
-                     Relations: (public.ft1 t1) INNER JOIN (public.ft2 t2)
-                     Remote SQL: SELECT r1."C 1", r1.c3, ROW(r1."C 1", r1.c2, r1.c3, r1.c4, r1.c5, r1.c6, r1.c7, r1.c8), r2."C 1", ROW(r2."C 1", r2.c2, r2.c3, r2.c4, r2.c5, r2.c6, r2.c7, r2.c8) FROM ("S 1"."T 1" r1 INNER JOIN "S 1"."T 1" r2 ON (TRUE)) WHERE ((r1."C 1" = r2."C 1")) FOR SHARE OF r1
-                     ->  Merge Join
-                           Output: t1.c1, t1.c3, t1.*, t2.c1, t2.*
-                           Merge Cond: (t1.c1 = t2.c1)
-                           ->  Sort
+               Relations: (public.ft1 t1) INNER JOIN (public.ft2 t2)
+               Remote SQL: SELECT r1."C 1", r1.c3, ROW(r1."C 1", r1.c2, r1.c3, r1.c4, r1.c5, r1.c6, r1.c7, r1.c8), r2."C 1", ROW(r2."C 1", r2.c2, r2.c3, r2.c4, r2.c5, r2.c6, r2.c7, r2.c8) FROM ("S 1"."T 1" r1 INNER JOIN "S 1"."T 1" r2 ON (TRUE)) WHERE ((r1."C 1" = r2."C 1")) ORDER BY r1.c3 ASC, r1."C 1" ASC FOR SHARE OF r1
+               ->  Merge Join
+                     Output: t1.c1, t1.c3, t1.*, t2.c1, t2.*
+                     Merge Cond: (t1.c1 = t2.c1)
+                     ->  Sort
+                           Output: t1.c1, t1.c3, t1.*
+                           Sort Key: t1.c1
+                           ->  Foreign Scan on public.ft1 t1
                                  Output: t1.c1, t1.c3, t1.*
-                                 Sort Key: t1.c1
-                                 ->  Foreign Scan on public.ft1 t1
-                                       Output: t1.c1, t1.c3, t1.*
-                                       Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" FOR SHARE
-                           ->  Sort
+                                 Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" FOR SHARE
+                     ->  Sort
+                           Output: t2.c1, t2.*
+                           Sort Key: t2.c1
+                           ->  Foreign Scan on public.ft2 t2
                                  Output: t2.c1, t2.*
-                                 Sort Key: t2.c1
-                                 ->  Foreign Scan on public.ft2 t2
-                                       Output: t2.c1, t2.*
-                                       Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1"
-(26 rows)
+                                 Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1"
+(23 rows)
 
 SELECT t1.c1, t2.c1 FROM ft1 t1 JOIN ft2 t2 ON (t1.c1 = t2.c1) ORDER BY t1.c3, t1.c1 OFFSET 100 LIMIT 10 FOR SHARE OF t1;
  c1  | c1  
 -----+-----
  101 | 101
  102 | 102
  103 | 103
  104 | 104
  105 | 105
  106 | 106
  107 | 107
  108 | 108
  109 | 109
  110 | 110
 (10 rows)
 
 EXPLAIN (COSTS false, VERBOSE)
 SELECT t1.c1, t2.c1 FROM ft1 t1 JOIN ft2 t2 ON (t1.c1 = t2.c1) ORDER BY t1.c3, t1.c1 OFFSET 100 LIMIT 10 FOR SHARE;
-                                                                                                                                                        QUERY PLAN                                                                                                                                                        
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+                                                                                                                                                                     QUERY PLAN                                                                                                                                                                      
+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
  Limit
    Output: t1.c1, t2.c1, t1.c3, t1.*, t2.*
    ->  LockRows
          Output: t1.c1, t2.c1, t1.c3, t1.*, t2.*
-         ->  Sort
+         ->  Foreign Scan
                Output: t1.c1, t2.c1, t1.c3, t1.*, t2.*
-               Sort Key: t1.c3, t1.c1
-               ->  Foreign Scan
-                     Output: t1.c1, t2.c1, t1.c3, t1.*, t2.*
-                     Relations: (public.ft1 t1) INNER JOIN (public.ft2 t2)
-                     Remote SQL: SELECT r1."C 1", r1.c3, ROW(r1."C 1", r1.c2, r1.c3, r1.c4, r1.c5, r1.c6, r1.c7, r1.c8), r2."C 1", ROW(r2."C 1", r2.c2, r2.c3, r2.c4, r2.c5, r2.c6, r2.c7, r2.c8) FROM ("S 1"."T 1" r1 INNER JOIN "S 1"."T 1" r2 ON (TRUE)) WHERE ((r1."C 1" = r2."C 1")) FOR SHARE OF r1 FOR SHARE OF r2
-                     ->  Merge Join
-                           Output: t1.c1, t1.c3, t1.*, t2.c1, t2.*
-                           Merge Cond: (t1.c1 = t2.c1)
-                           ->  Sort
+               Relations: (public.ft1 t1) INNER JOIN (public.ft2 t2)
+               Remote SQL: SELECT r1."C 1", r1.c3, ROW(r1."C 1", r1.c2, r1.c3, r1.c4, r1.c5, r1.c6, r1.c7, r1.c8), r2."C 1", ROW(r2."C 1", r2.c2, r2.c3, r2.c4, r2.c5, r2.c6, r2.c7, r2.c8) FROM ("S 1"."T 1" r1 INNER JOIN "S 1"."T 1" r2 ON (TRUE)) WHERE ((r1."C 1" = r2."C 1")) ORDER BY r1.c3 ASC, r1."C 1" ASC FOR SHARE OF r1 FOR SHARE OF r2
+               ->  Merge Join
+                     Output: t1.c1, t1.c3, t1.*, t2.c1, t2.*
+                     Merge Cond: (t1.c1 = t2.c1)
+                     ->  Sort
+                           Output: t1.c1, t1.c3, t1.*
+                           Sort Key: t1.c1
+                           ->  Foreign Scan on public.ft1 t1
                                  Output: t1.c1, t1.c3, t1.*
-                                 Sort Key: t1.c1
-                                 ->  Foreign Scan on public.ft1 t1
-                                       Output: t1.c1, t1.c3, t1.*
-                                       Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" FOR SHARE
-                           ->  Sort
+                                 Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" FOR SHARE
+                     ->  Sort
+                           Output: t2.c1, t2.*
+                           Sort Key: t2.c1
+                           ->  Foreign Scan on public.ft2 t2
                                  Output: t2.c1, t2.*
-                                 Sort Key: t2.c1
-                                 ->  Foreign Scan on public.ft2 t2
-                                       Output: t2.c1, t2.*
-                                       Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" FOR SHARE
-(26 rows)
+                                 Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" FOR SHARE
+(23 rows)
 
 SELECT t1.c1, t2.c1 FROM ft1 t1 JOIN ft2 t2 ON (t1.c1 = t2.c1) ORDER BY t1.c3, t1.c1 OFFSET 100 LIMIT 10 FOR SHARE;
  c1  | c1  
 -----+-----
  101 | 101
  102 | 102
  103 | 103
  104 | 104
  105 | 105
  106 | 106
@@ -1318,32 +1325,29 @@ WITH t (c1_1, c1_3, c2_1) AS (SELECT t1.c1, t1.c3, t2.c1 FROM ft1 t1 JOIN ft2 t2
   106 |  106
   107 |  107
   108 |  108
   109 |  109
   110 |  110
 (10 rows)
 
 -- ctid with whole-row reference
 EXPLAIN (COSTS false, VERBOSE)
 SELECT t1.ctid, t1, t2, t1.c1 FROM ft1 t1 JOIN ft2 t2 ON (t1.c1 = t2.c1) ORDER BY t1.c3, t1.c1 OFFSET 100 LIMIT 10;
-                                                                                                                                    QUERY PLAN                                                                                                                                     
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+                                                                                                                                                  QUERY PLAN                                                                                                                                                  
+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
  Limit
    Output: t1.ctid, t1.*, t2.*, t1.c1, t1.c3
-   ->  Sort
+   ->  Foreign Scan
          Output: t1.ctid, t1.*, t2.*, t1.c1, t1.c3
-         Sort Key: t1.c3, t1.c1
-         ->  Foreign Scan
-               Output: t1.ctid, t1.*, t2.*, t1.c1, t1.c3
-               Relations: (public.ft1 t1) INNER JOIN (public.ft2 t2)
-               Remote SQL: SELECT r1.ctid, ROW(r1."C 1", r1.c2, r1.c3, r1.c4, r1.c5, r1.c6, r1.c7, r1.c8), r1."C 1", r1.c3, ROW(r2."C 1", r2.c2, r2.c3, r2.c4, r2.c5, r2.c6, r2.c7, r2.c8) FROM ("S 1"."T 1" r1 INNER JOIN "S 1"."T 1" r2 ON (TRUE)) WHERE ((r1."C 1" = r2."C 1"))
-(9 rows)
+         Relations: (public.ft1 t1) INNER JOIN (public.ft2 t2)
+         Remote SQL: SELECT r1.ctid, ROW(r1."C 1", r1.c2, r1.c3, r1.c4, r1.c5, r1.c6, r1.c7, r1.c8), r1."C 1", r1.c3, ROW(r2."C 1", r2.c2, r2.c3, r2.c4, r2.c5, r2.c6, r2.c7, r2.c8) FROM ("S 1"."T 1" r1 INNER JOIN "S 1"."T 1" r2 ON (TRUE)) WHERE ((r1."C 1" = r2."C 1")) ORDER BY r1.c3 ASC, r1."C 1" ASC
+(6 rows)
 
 -- SEMI JOIN, not pushed down
 EXPLAIN (COSTS false, VERBOSE)
 SELECT t1.c1 FROM ft1 t1 WHERE EXISTS (SELECT 1 FROM ft2 t2 WHERE t1.c1 = t2.c1) ORDER BY t1.c1 OFFSET 100 LIMIT 10;
                                     QUERY PLAN                                    
 ----------------------------------------------------------------------------------
  Limit
    Output: t1.c1
    ->  Merge Semi Join
          Output: t1.c1
@@ -1665,32 +1669,29 @@ SELECT t1."C 1" FROM "S 1"."T 1" t1, LATERAL (SELECT DISTINCT t2.c1, t3.c1 FROM
 (10 rows)
 
 -- create another user for permission, user mapping, effective user tests
 CREATE USER view_owner;
 -- grant privileges on ft4 and ft5 to view_owner
 GRANT ALL ON ft4 TO view_owner;
 GRANT ALL ON ft5 TO view_owner;
 -- prepare statement with current session user
 PREPARE join_stmt AS SELECT t1.c1, t2.c1 FROM ft4 t1 LEFT JOIN ft5 t2 ON (t1.c1 = t2.c1) ORDER BY t1.c1, t2.c1 OFFSET 10 LIMIT 10;
 EXPLAIN (COSTS OFF, VERBOSE) EXECUTE join_stmt;
-                                                     QUERY PLAN                                                      
----------------------------------------------------------------------------------------------------------------------
+                                                                 QUERY PLAN                                                                  
+---------------------------------------------------------------------------------------------------------------------------------------------
  Limit
    Output: t1.c1, t2.c1
-   ->  Sort
+   ->  Foreign Scan
          Output: t1.c1, t2.c1
-         Sort Key: t1.c1, t2.c1
-         ->  Foreign Scan
-               Output: t1.c1, t2.c1
-               Relations: (public.ft4 t1) LEFT JOIN (public.ft5 t2)
-               Remote SQL: SELECT r1.c1, r2.c1 FROM ("S 1"."T 3" r1 LEFT JOIN "S 1"."T 4" r2 ON (((r1.c1 = r2.c1))))
-(9 rows)
+         Relations: (public.ft4 t1) LEFT JOIN (public.ft5 t2)
+         Remote SQL: SELECT r1.c1, r2.c1 FROM ("S 1"."T 3" r1 LEFT JOIN "S 1"."T 4" r2 ON (((r1.c1 = r2.c1)))) ORDER BY r1.c1 ASC, r2.c1 ASC
+(6 rows)
 
 EXECUTE join_stmt;
  c1 | c1 
 ----+----
  22 |   
  24 | 24
  26 |   
  28 |   
  30 | 30
  32 |   
diff --git a/contrib/postgres_fdw/postgres_fdw.c b/contrib/postgres_fdw/postgres_fdw.c
index 14a3f98..24169c5 100644
--- a/contrib/postgres_fdw/postgres_fdw.c
+++ b/contrib/postgres_fdw/postgres_fdw.c
@@ -324,20 +324,22 @@ static HeapTuple make_tuple_from_result_row(PGresult *res,
 						   int row,
 						   Relation rel,
 						   AttInMetadata *attinmeta,
 						   List *retrieved_attrs,
 						   ForeignScanState *fsstate,
 						   MemoryContext temp_context);
 static void conversion_error_callback(void *arg);
 static bool foreign_join_ok(PlannerInfo *root, RelOptInfo *joinrel,
 				JoinType jointype, RelOptInfo *outerrel, RelOptInfo *innerrel,
 				JoinPathExtraData *extra);
+static void add_paths_with_pathkeys_for_rel(PlannerInfo *root, RelOptInfo *rel,
+								Path *epq_path);
 
 
 /*
  * Foreign-data wrapper handler function: return a struct with pointers
  * to my callback routines.
  */
 Datum
 postgres_fdw_handler(PG_FUNCTION_ARGS)
 {
 	FdwRoutine *routine = makeNode(FdwRoutine);
@@ -496,20 +498,28 @@ postgresGetForeignRelSize(PlannerInfo *root,
 	 */
 	fpinfo->local_conds_sel = clauselist_selectivity(root,
 													 fpinfo->local_conds,
 													 baserel->relid,
 													 JOIN_INNER,
 													 NULL);
 
 	cost_qual_eval(&fpinfo->local_conds_cost, fpinfo->local_conds, root);
 
 	/*
+	 * Set cached relation costs to some negative value, so that we can detect
+	 * when they are set to some sensible costs during one (usually the first)
+	 * of the calls to estimate_path_cost_size().
+	 */
+	fpinfo->rel_startup_cost = -1;
+	fpinfo->rel_total_cost = -1;
+
+	/*
 	 * If the table or the server is configured to use remote estimates,
 	 * connect to the foreign server and execute EXPLAIN to estimate the
 	 * number of rows selected by the restriction clauses, as well as the
 	 * average row width.  Otherwise, estimate using whatever statistics we
 	 * have locally, in a way similar to ordinary tables.
 	 */
 	if (fpinfo->use_remote_estimate)
 	{
 		/*
 		 * Get cost/size estimates with help of remote server.  Save the
@@ -627,26 +637,26 @@ get_useful_ecs_for_relation(PlannerInfo *root, RelOptInfo *rel)
 
 		/* Make sure we've got canonical ECs. */
 		update_mergeclause_eclasses(root, restrictinfo);
 
 		/*
 		 * restrictinfo->mergeopfamilies != NIL is sufficient to guarantee
 		 * that left_ec and right_ec will be initialized, per comments in
 		 * distribute_qual_to_rels, and rel->joininfo should only contain ECs
 		 * where this relation appears on one side or the other.
 		 */
-		if (bms_is_subset(restrictinfo->right_ec->ec_relids, relids))
+		if (bms_is_subset(relids, restrictinfo->right_ec->ec_relids))
 			useful_eclass_list = list_append_unique_ptr(useful_eclass_list,
 													 restrictinfo->right_ec);
 		else
 		{
-			Assert(bms_is_subset(restrictinfo->left_ec->ec_relids, relids));
+			Assert(bms_is_subset(relids, restrictinfo->left_ec->ec_relids));
 			useful_eclass_list = list_append_unique_ptr(useful_eclass_list,
 													  restrictinfo->left_ec);
 		}
 	}
 
 	return useful_eclass_list;
 }
 
 /*
  * get_useful_pathkeys_for_relation
@@ -766,63 +776,40 @@ get_useful_pathkeys_for_relation(PlannerInfo *root, RelOptInfo *rel)
  */
 static void
 postgresGetForeignPaths(PlannerInfo *root,
 						RelOptInfo *baserel,
 						Oid foreigntableid)
 {
 	PgFdwRelationInfo *fpinfo = (PgFdwRelationInfo *) baserel->fdw_private;
 	ForeignPath *path;
 	List	   *ppi_list;
 	ListCell   *lc;
-	List	   *useful_pathkeys_list = NIL;		/* List of all pathkeys */
 
 	/*
 	 * Create simplest ForeignScan path node and add it to baserel.  This path
 	 * corresponds to SeqScan path of regular tables (though depending on what
 	 * baserestrict conditions we were able to send to remote, there might
 	 * actually be an indexscan happening there).  We already did all the work
 	 * to estimate cost and size of this path.
 	 */
 	path = create_foreignscan_path(root, baserel,
 								   fpinfo->rows,
 								   fpinfo->startup_cost,
 								   fpinfo->total_cost,
 								   NIL, /* no pathkeys */
 								   NULL,		/* no outer rel either */
 								   NULL,		/* no extra plan */
 								   NIL);		/* no fdw_private list */
 	add_path(baserel, (Path *) path);
 
-	useful_pathkeys_list = get_useful_pathkeys_for_relation(root, baserel);
-
-	/* Create one path for each set of pathkeys we found above. */
-	foreach(lc, useful_pathkeys_list)
-	{
-		double		rows;
-		int			width;
-		Cost		startup_cost;
-		Cost		total_cost;
-		List	   *useful_pathkeys = lfirst(lc);
-
-		estimate_path_cost_size(root, baserel, NIL, useful_pathkeys,
-								&rows, &width, &startup_cost, &total_cost);
-
-		add_path(baserel, (Path *)
-				 create_foreignscan_path(root, baserel,
-										 rows,
-										 startup_cost,
-										 total_cost,
-										 useful_pathkeys,
-										 NULL,
-										 NULL,
-										 NIL));
-	}
+	/* Add paths with pathkeys */
+	add_paths_with_pathkeys_for_rel(root, baserel, NULL);
 
 	/*
 	 * If we're not using remote estimates, stop here.  We have no way to
 	 * estimate whether any join clauses would be worth sending across, so
 	 * don't bother building parameterized paths.
 	 */
 	if (!fpinfo->use_remote_estimate)
 		return;
 
 	/*
@@ -2174,21 +2161,32 @@ estimate_path_cost_size(PlannerInfo *root,
 		 * Use rows/width estimates made by set_baserel_size_estimates() for
 		 * base foreign relations and set_joinrel_size_estimates() for join
 		 * between foreign relations.
 		 */
 		rows = foreignrel->rows;
 		width = foreignrel->width;
 
 		/* Back into an estimate of the number of retrieved rows. */
 		retrieved_rows = clamp_row_est(rows / fpinfo->local_conds_sel);
 
-		if (foreignrel->reloptkind != RELOPT_JOINREL)
+		/*
+		 * We will come here again and again with different set of pathkeys
+		 * that caller wants to cost. We don't need to calculate the cost of
+		 * bare scan each time. Instead, use the costs if we have cached them
+		 * already.
+		 */
+		if (fpinfo->rel_startup_cost > 0 && fpinfo->rel_total_cost > 0)
+		{
+			startup_cost = fpinfo->rel_startup_cost;
+			run_cost = fpinfo->rel_total_cost - fpinfo->rel_startup_cost;
+		}
+		else if (foreignrel->reloptkind != RELOPT_JOINREL)
 		{
 			/* Clamp retrieved rows estimates to at most foreignrel->tuples. */
 			retrieved_rows = Min(retrieved_rows, foreignrel->tuples);
 
 			/*
 			 * Cost as though this were a seqscan, which is pessimistic.  We
 			 * effectively imagine the local_conds are being evaluated
 			 * remotely, too.
 			 */
 			startup_cost = 0;
@@ -2276,27 +2274,33 @@ estimate_path_cost_size(PlannerInfo *root,
 		if (pathkeys != NIL)
 		{
 			startup_cost *= DEFAULT_FDW_SORT_MULTIPLIER;
 			run_cost *= DEFAULT_FDW_SORT_MULTIPLIER;
 		}
 
 		total_cost = startup_cost + run_cost;
 	}
 
 	/*
-	 * Cache the costs prior to adding the costs for transferring data from
-	 * the foreign server. These costs are useful for costing the join between
-	 * this relation and another foreign relation, when the cost of join can
-	 * not be obtained from the foreign server.
+	 * Cache the costs for scans without any pathkeys or parameterization
+	 * prior to adding the costs for transferring data from the foreign
+	 * server. These costs are useful for costing the join between this
+	 * relation and another foreign relation or to calculate the costs of
+	 * paths with pathkeys for this relation, when the costs can not be
+	 * obtained from the foreign server. This function will be called at least
+	 * once for every foreign relation without pathkeys and parameterization.
 	 */
-	fpinfo->rel_startup_cost = startup_cost;
-	fpinfo->rel_total_cost = total_cost;
+	if (pathkeys == NIL && param_join_conds == NIL)
+	{
+		fpinfo->rel_startup_cost = startup_cost;
+		fpinfo->rel_total_cost = total_cost;
+	}
 
 	/*
 	 * Add some additional cost factors to account for connection overhead
 	 * (fdw_startup_cost), transferring data across the network
 	 * (fdw_tuple_cost per retrieved row), and local manipulation of the data
 	 * (cpu_tuple_cost per retrieved row).
 	 */
 	startup_cost += fpinfo->fdw_startup_cost;
 	total_cost += fpinfo->fdw_startup_cost;
 	total_cost += fpinfo->fdw_tuple_cost * retrieved_rows;
@@ -3451,20 +3455,28 @@ foreign_join_ok(PlannerInfo *root, RelOptInfo *joinrel, JoinType jointype,
 		fpinfo_i->use_remote_estimate;
 
 	/*
 	 * Since both the joining relations come from the same server, the server
 	 * level options should have same value for both the relations. Pick from
 	 * any side.
 	 */
 	fpinfo->fdw_startup_cost = fpinfo_o->fdw_startup_cost;
 	fpinfo->fdw_tuple_cost = fpinfo_o->fdw_tuple_cost;
 
+	/*
+	 * Set cached relation costs to some negative value, so that we can detect
+	 * when they are set to some sensible costs, during one (usually the
+	 * first) of the calls to estimate_path_cost_size().
+	 */
+	fpinfo->rel_startup_cost = -1;
+	fpinfo->rel_total_cost = -1;
+
 	/* Mark that this join can be pushed down safely */
 	fpinfo->pushdown_safe = true;
 
 	/*
 	 * Set fetch size to maximum of the joining sides, since we are expecting
 	 * the rows returned by the join to be proportional to the relation sizes.
 	 */
 	if (fpinfo_o->fetch_size > fpinfo_i->fetch_size)
 		fpinfo->fetch_size = fpinfo_o->fetch_size;
 	else
@@ -3525,20 +3537,59 @@ foreign_join_ok(PlannerInfo *root, RelOptInfo *joinrel, JoinType jointype,
 	 */
 	fpinfo->relation_name = makeStringInfo();
 	appendStringInfo(fpinfo->relation_name, "(%s) %s JOIN (%s)",
 					 fpinfo_o->relation_name->data,
 					 get_jointype_name(fpinfo->jointype),
 					 fpinfo_i->relation_name->data);
 
 	return true;
 }
 
+static void
+add_paths_with_pathkeys_for_rel(PlannerInfo *root, RelOptInfo *rel,
+								Path *epq_path)
+{
+	List	   *useful_pathkeys_list = NIL;		/* List of all pathkeys */
+	ListCell   *lc;
+
+	useful_pathkeys_list = get_useful_pathkeys_for_relation(root, rel);
+
+	/* Create one path for each set of pathkeys we found above. */
+	foreach(lc, useful_pathkeys_list)
+	{
+		double		rows;
+		int			width;
+		Cost		startup_cost;
+		Cost		total_cost;
+		List	   *useful_pathkeys = lfirst(lc);
+
+		estimate_path_cost_size(root, rel, NIL, useful_pathkeys,
+								&rows, &width, &startup_cost, &total_cost);
+
+		/*
+		 * TODO: we should worry about EPQ path but should that path have
+		 * pathkeys? I guess, that's not really important since it's just
+		 * going to evaluate the join from whole-row references stuffed in the
+		 * corresponding EPQ slots, for which the order doesn't matter.
+		 */
+		add_path(rel, (Path *)
+				 create_foreignscan_path(root, rel,
+										 rows,
+										 startup_cost,
+										 total_cost,
+										 useful_pathkeys,
+										 NULL,
+										 epq_path,
+										 NIL));
+	}
+}
+
 /*
  * postgresGetForeignJoinPaths
  *		Add possible ForeignPath to joinrel, if join is safe to push down.
  */
 static void
 postgresGetForeignJoinPaths(PlannerInfo *root,
 							RelOptInfo *joinrel,
 							RelOptInfo *outerrel,
 							RelOptInfo *innerrel,
 							JoinType jointype,
@@ -3663,21 +3714,22 @@ postgresGetForeignJoinPaths(PlannerInfo *root,
 									   startup_cost,
 									   total_cost,
 									   NIL,		/* no pathkeys */
 									   NULL,	/* no required_outer */
 									   epq_path,
 									   NULL);	/* no fdw_private */
 
 	/* Add generated path into joinrel by add_path(). */
 	add_path(joinrel, (Path *) joinpath);
 
-	/* XXX Consider pathkeys for the join relation */
+	/* Consider pathkeys for the join relation */
+	add_paths_with_pathkeys_for_rel(root, joinrel, epq_path);
 
 	/* XXX Consider parameterized paths for the join relation */
 }
 
 /*
  * Create a tuple from the specified row of the PGresult.
  *
  * rel is the local representation of the foreign table, attinmeta is
  * conversion data for the rel's tupdesc, and retrieved_attrs is an
  * integer list of the table column numbers present in the PGresult.
@@ -3870,21 +3922,21 @@ conversion_error_callback(void *arg)
  */
 extern Expr *
 find_em_expr_for_rel(EquivalenceClass *ec, RelOptInfo *rel)
 {
 	ListCell   *lc_em;
 
 	foreach(lc_em, ec->ec_members)
 	{
 		EquivalenceMember *em = lfirst(lc_em);
 
-		if (bms_equal(em->em_relids, rel->relids))
+		if (bms_is_subset(em->em_relids, rel->relids))
 		{
 			/*
 			 * If there is more than one equivalence member whose Vars are
 			 * taken entirely from this relation, we'll be content to choose
 			 * any one of those.
 			 */
 			return em->em_expr;
 		}
 	}
 
diff --git a/contrib/postgres_fdw/sql/postgres_fdw.sql b/contrib/postgres_fdw/sql/postgres_fdw.sql
index 885a5fb..ee54a4b 100644
--- a/contrib/postgres_fdw/sql/postgres_fdw.sql
+++ b/contrib/postgres_fdw/sql/postgres_fdw.sql
@@ -230,20 +230,25 @@ SET enable_hashjoin TO false;
 SET enable_nestloop TO false;
 -- inner join; expressions in the clauses appear in the equivalence class list
 EXPLAIN (VERBOSE, COSTS false)
 	SELECT t1.c1, t2."C 1" FROM ft2 t1 JOIN "S 1"."T 1" t2 ON (t1.c1 = t2."C 1") OFFSET 100 LIMIT 10;
 SELECT t1.c1, t2."C 1" FROM ft2 t1 JOIN "S 1"."T 1" t2 ON (t1.c1 = t2."C 1") OFFSET 100 LIMIT 10;
 -- outer join; expressions in the clauses do not appear in equivalence class
 -- list but no output change as compared to the previous query
 EXPLAIN (VERBOSE, COSTS false)
 	SELECT t1.c1, t2."C 1" FROM ft2 t1 LEFT JOIN "S 1"."T 1" t2 ON (t1.c1 = t2."C 1") OFFSET 100 LIMIT 10;
 SELECT t1.c1, t2."C 1" FROM ft2 t1 LEFT JOIN "S 1"."T 1" t2 ON (t1.c1 = t2."C 1") OFFSET 100 LIMIT 10;
+-- A join between local table and foreign join. ORDER BY clause is added to the
+-- foreign join so that the local table can be joined using merge join strategy.
+EXPLAIN (COSTS false, VERBOSE)
+	SELECT t1."C 1" FROM "S 1"."T 1" t1 left join ft1 t2 join ft2 t3 on (t2.c1 = t3.c1) on (t3.c1 = t1."C 1") OFFSET 100 LIMIT 10;
+SELECT t1."C 1" FROM "S 1"."T 1" t1 left join ft1 t2 join ft2 t3 on (t2.c1 = t3.c1) on (t3.c1 = t1."C 1") OFFSET 100 LIMIT 10;
 RESET enable_hashjoin;
 RESET enable_nestloop;
 
 -- ===================================================================
 -- WHERE with remotely-executable conditions
 -- ===================================================================
 EXPLAIN (VERBOSE, COSTS false) SELECT * FROM ft1 t1 WHERE t1.c1 = 1;         -- Var, OpExpr(b), Const
 EXPLAIN (VERBOSE, COSTS false) SELECT * FROM ft1 t1 WHERE t1.c1 = 100 AND t1.c2 = 0; -- BoolExpr
 EXPLAIN (VERBOSE, COSTS false) SELECT * FROM ft1 t1 WHERE c1 IS NULL;        -- NullTest
 EXPLAIN (VERBOSE, COSTS false) SELECT * FROM ft1 t1 WHERE c1 IS NOT NULL;    -- NullTest
sort_pd.sqltext/x-sql; charset=US-ASCII; name=sort_pd.sqlDownload
sort_pd_setup.sqltext/x-sql; charset=US-ASCII; name=sort_pd_setup.sqlDownload
sort_pd.out.without_patchapplication/octet-stream; name=sort_pd.out.without_patchDownload
sort_pd.out.with_patchapplication/octet-stream; name=sort_pd.out.with_patchDownload
#2Ashutosh Bapat
ashutosh.bapat@enterprisedb.com
In reply to: Ashutosh Bapat (#1)
1 attachment(s)
Re: Pushing down sorted joins

Rushabh pointed out that declarations of helper functions
get_useful_ecs_for_relation and get_useful_pathkeys_for_relation() are part
of FDW routines declarations rather than helper function declaration. Since
those functions are related to this patch, the attached patch moves those
declaration in their right place.

On Wed, Feb 17, 2016 at 5:37 PM, Ashutosh Bapat <
ashutosh.bapat@enterprisedb.com> wrote:

Hi All,
Now that we have join pushdown support in postgres_fdw, we can leverage
the sort pushdown mechanism for base relations to work for pushed down
joins as well. PFA patch for the same.

The code to find useful set of pathkeys and then generate paths for each
list of pathkeys is moved into a function which is called for base
relations and join relations, while creating respective paths. The useful
pathkeys are same as the base relation i.e. root->query_pathkeys and
pathkeys useful for merge join as discussed in [1].

I measured performance of pushing down sort for merge joins for query
SELECT lt1.val, ft1.val, ft2.val FROM lt1 join (ft1 join ft2 on (ft1.val =
ft2.val)) on (lt1.val = ft1.val) where ft1, ft2 are foreign tables, join
between which gets pushed down to the foreign server and lt is the local
table.

Without the patch servers prefers local merge join between foreign tables
followed by merge join with local table by getting the data sorted from the
foreign server. But with the patch, it pushes down the foreign join and
also gets the data sorted for local merge join. The times measured over 10
runs of query with and without patch are

With patch
avg_exe_time | std_dev_exe_time | min_exe_time | max_exe_time
--------------+------------------+--------------+--------------
60310.0369 | 251.075471210925 | 59895.064 | 60746.496

Without patch
avg_exe_time | std_dev_exe_time | min_exe_time | max_exe_time
--------------+------------------+--------------+--------------
86396.6001 | 254.30988131848 | 85906.606 | 86742.311

With the patch the execution time of the query reduces by 30%.

The scripts to setup and run query and outputs of running query with and
without patch are attached.

[1]
/messages/by-id/CAFjFpRfeKHiCmwJ72p4=ZvuZRQsaU9tbfyW7vwr-5PPvrCbcQQ@mail.gmail.com
--
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company

--
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company

Attachments:

pg_join_sort_pd_v2.patchapplication/x-download; name=pg_join_sort_pd_v2.patchDownload
diff --git a/contrib/postgres_fdw/expected/postgres_fdw.out b/contrib/postgres_fdw/expected/postgres_fdw.out
index 280c377..e26c587 100644
--- a/contrib/postgres_fdw/expected/postgres_fdw.out
+++ b/contrib/postgres_fdw/expected/postgres_fdw.out
@@ -437,20 +437,54 @@ SELECT t1.c1, t2."C 1" FROM ft2 t1 LEFT JOIN "S 1"."T 1" t2 ON (t1.c1 = t2."C 1"
  103 | 103
  104 | 104
  105 | 105
  106 | 106
  107 | 107
  108 | 108
  109 | 109
  110 | 110
 (10 rows)
 
+-- A join between local table and foreign join. ORDER BY clause is added to the
+-- foreign join so that the local table can be joined using merge join strategy.
+EXPLAIN (COSTS false, VERBOSE)
+	SELECT t1."C 1" FROM "S 1"."T 1" t1 left join ft1 t2 join ft2 t3 on (t2.c1 = t3.c1) on (t3.c1 = t1."C 1") OFFSET 100 LIMIT 10;
+                                                                       QUERY PLAN                                                                        
+---------------------------------------------------------------------------------------------------------------------------------------------------------
+ Limit
+   Output: t1."C 1"
+   ->  Merge Right Join
+         Output: t1."C 1"
+         Merge Cond: (t3.c1 = t1."C 1")
+         ->  Foreign Scan
+               Output: t3.c1
+               Relations: (public.ft1 t2) INNER JOIN (public.ft2 t3)
+               Remote SQL: SELECT r3."C 1" FROM ("S 1"."T 1" r2 INNER JOIN "S 1"."T 1" r3 ON (TRUE)) WHERE ((r2."C 1" = r3."C 1")) ORDER BY r2."C 1" ASC
+         ->  Index Only Scan using t1_pkey on "S 1"."T 1" t1
+               Output: t1."C 1"
+(11 rows)
+
+SELECT t1."C 1" FROM "S 1"."T 1" t1 left join ft1 t2 join ft2 t3 on (t2.c1 = t3.c1) on (t3.c1 = t1."C 1") OFFSET 100 LIMIT 10;
+ C 1 
+-----
+ 101
+ 102
+ 103
+ 104
+ 105
+ 106
+ 107
+ 108
+ 109
+ 110
+(10 rows)
+
 RESET enable_hashjoin;
 RESET enable_nestloop;
 -- ===================================================================
 -- WHERE with remotely-executable conditions
 -- ===================================================================
 EXPLAIN (VERBOSE, COSTS false) SELECT * FROM ft1 t1 WHERE t1.c1 = 1;         -- Var, OpExpr(b), Const
                                          QUERY PLAN                                          
 ---------------------------------------------------------------------------------------------
  Foreign Scan on public.ft1 t1
    Output: c1, c2, c3, c4, c5, c6, c7, c8
@@ -862,94 +896,85 @@ SELECT count(c3) FROM ft1 t1 WHERE t1.c1 === t1.c2;
 -- ===================================================================
 -- JOIN queries
 -- ===================================================================
 -- Analyze ft4 and ft5 so that we have better statistics. These tables do not
 -- have use_remote_estimate set.
 ANALYZE ft4;
 ANALYZE ft5;
 -- join two tables
 EXPLAIN (COSTS false, VERBOSE)
 SELECT t1.c1, t2.c1 FROM ft1 t1 JOIN ft2 t2 ON (t1.c1 = t2.c1) ORDER BY t1.c3, t1.c1 OFFSET 100 LIMIT 10;
-                                                                     QUERY PLAN                                                                     
-----------------------------------------------------------------------------------------------------------------------------------------------------
+                                                                                  QUERY PLAN                                                                                   
+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
  Limit
    Output: t1.c1, t2.c1, t1.c3
-   ->  Sort
+   ->  Foreign Scan
          Output: t1.c1, t2.c1, t1.c3
-         Sort Key: t1.c3, t1.c1
-         ->  Foreign Scan
-               Output: t1.c1, t2.c1, t1.c3
-               Relations: (public.ft1 t1) INNER JOIN (public.ft2 t2)
-               Remote SQL: SELECT r1."C 1", r1.c3, r2."C 1" FROM ("S 1"."T 1" r1 INNER JOIN "S 1"."T 1" r2 ON (TRUE)) WHERE ((r1."C 1" = r2."C 1"))
-(9 rows)
+         Relations: (public.ft1 t1) INNER JOIN (public.ft2 t2)
+         Remote SQL: SELECT r1."C 1", r1.c3, r2."C 1" FROM ("S 1"."T 1" r1 INNER JOIN "S 1"."T 1" r2 ON (TRUE)) WHERE ((r1."C 1" = r2."C 1")) ORDER BY r1.c3 ASC, r1."C 1" ASC
+(6 rows)
 
 SELECT t1.c1, t2.c1 FROM ft1 t1 JOIN ft2 t2 ON (t1.c1 = t2.c1) ORDER BY t1.c3, t1.c1 OFFSET 100 LIMIT 10;
  c1  | c1  
 -----+-----
  101 | 101
  102 | 102
  103 | 103
  104 | 104
  105 | 105
  106 | 106
  107 | 107
  108 | 108
  109 | 109
  110 | 110
 (10 rows)
 
 -- join three tables
 EXPLAIN (COSTS false, VERBOSE)
 SELECT t1.c1, t2.c2, t3.c3 FROM ft1 t1 JOIN ft2 t2 ON (t1.c1 = t2.c1) JOIN ft4 t3 ON (t3.c1 = t1.c1) ORDER BY t1.c3, t1.c1 OFFSET 10 LIMIT 10;
-                                                                                                      QUERY PLAN                                                                                                       
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+                                                                                                                    QUERY PLAN                                                                                                                    
+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
  Limit
    Output: t1.c1, t2.c2, t3.c3, t1.c3
-   ->  Sort
+   ->  Foreign Scan
          Output: t1.c1, t2.c2, t3.c3, t1.c3
-         Sort Key: t1.c3, t1.c1
-         ->  Foreign Scan
-               Output: t1.c1, t2.c2, t3.c3, t1.c3
-               Relations: ((public.ft1 t1) INNER JOIN (public.ft2 t2)) INNER JOIN (public.ft4 t3)
-               Remote SQL: SELECT r1."C 1", r1.c3, r2.c2, r4.c3 FROM (("S 1"."T 1" r1 INNER JOIN "S 1"."T 1" r2 ON (TRUE)) INNER JOIN "S 1"."T 3" r4 ON (TRUE)) WHERE ((r1."C 1" = r4.c1)) AND ((r1."C 1" = r2."C 1"))
-(9 rows)
+         Relations: ((public.ft1 t1) INNER JOIN (public.ft2 t2)) INNER JOIN (public.ft4 t3)
+         Remote SQL: SELECT r1."C 1", r1.c3, r2.c2, r4.c3 FROM (("S 1"."T 1" r1 INNER JOIN "S 1"."T 1" r2 ON (TRUE)) INNER JOIN "S 1"."T 3" r4 ON (TRUE)) WHERE ((r1."C 1" = r4.c1)) AND ((r1."C 1" = r2."C 1")) ORDER BY r1.c3 ASC, r1."C 1" ASC
+(6 rows)
 
 SELECT t1.c1, t2.c2, t3.c3 FROM ft1 t1 JOIN ft2 t2 ON (t1.c1 = t2.c1) JOIN ft4 t3 ON (t3.c1 = t1.c1) ORDER BY t1.c3, t1.c1 OFFSET 10 LIMIT 10;
  c1 | c2 |   c3   
 ----+----+--------
  22 |  2 | AAA022
  24 |  4 | AAA024
  26 |  6 | AAA026
  28 |  8 | AAA028
  30 |  0 | AAA030
  32 |  2 | AAA032
  34 |  4 | AAA034
  36 |  6 | AAA036
  38 |  8 | AAA038
  40 |  0 | AAA040
 (10 rows)
 
 -- left outer join
 EXPLAIN (COSTS false, VERBOSE)
 SELECT t1.c1, t2.c1 FROM ft4 t1 LEFT JOIN ft5 t2 ON (t1.c1 = t2.c1) ORDER BY t1.c1, t2.c1 OFFSET 10 LIMIT 10;
-                                                     QUERY PLAN                                                      
----------------------------------------------------------------------------------------------------------------------
+                                                                 QUERY PLAN                                                                  
+---------------------------------------------------------------------------------------------------------------------------------------------
  Limit
    Output: t1.c1, t2.c1
-   ->  Sort
+   ->  Foreign Scan
          Output: t1.c1, t2.c1
-         Sort Key: t1.c1, t2.c1
-         ->  Foreign Scan
-               Output: t1.c1, t2.c1
-               Relations: (public.ft4 t1) LEFT JOIN (public.ft5 t2)
-               Remote SQL: SELECT r1.c1, r2.c1 FROM ("S 1"."T 3" r1 LEFT JOIN "S 1"."T 4" r2 ON (((r1.c1 = r2.c1))))
-(9 rows)
+         Relations: (public.ft4 t1) LEFT JOIN (public.ft5 t2)
+         Remote SQL: SELECT r1.c1, r2.c1 FROM ("S 1"."T 3" r1 LEFT JOIN "S 1"."T 4" r2 ON (((r1.c1 = r2.c1)))) ORDER BY r1.c1 ASC, r2.c1 ASC
+(6 rows)
 
 SELECT t1.c1, t2.c1 FROM ft4 t1 LEFT JOIN ft5 t2 ON (t1.c1 = t2.c1) ORDER BY t1.c1, t2.c1 OFFSET 10 LIMIT 10;
  c1 | c1 
 ----+----
  22 |   
  24 | 24
  26 |   
  28 |   
  30 | 30
  32 |   
@@ -1000,63 +1025,57 @@ SELECT t1.c1, t1.c2, t2.c1, t2.c2 FROM ft4 t1 LEFT JOIN (SELECT * FROM ft5 WHERE
 ----+----+----+----
   2 |  3 |    |   
   4 |  5 |    |   
   6 |  7 |  6 |  7
   8 |  9 |    |   
 (4 rows)
 
 -- right outer join
 EXPLAIN (COSTS false, VERBOSE)
 SELECT t1.c1, t2.c1 FROM ft5 t1 RIGHT JOIN ft4 t2 ON (t1.c1 = t2.c1) ORDER BY t2.c1, t1.c1 OFFSET 10 LIMIT 10;
-                                                     QUERY PLAN                                                      
----------------------------------------------------------------------------------------------------------------------
+                                                                 QUERY PLAN                                                                  
+---------------------------------------------------------------------------------------------------------------------------------------------
  Limit
    Output: t1.c1, t2.c1
-   ->  Sort
+   ->  Foreign Scan
          Output: t1.c1, t2.c1
-         Sort Key: t2.c1, t1.c1
-         ->  Foreign Scan
-               Output: t1.c1, t2.c1
-               Relations: (public.ft4 t2) LEFT JOIN (public.ft5 t1)
-               Remote SQL: SELECT r2.c1, r1.c1 FROM ("S 1"."T 3" r2 LEFT JOIN "S 1"."T 4" r1 ON (((r1.c1 = r2.c1))))
-(9 rows)
+         Relations: (public.ft4 t2) LEFT JOIN (public.ft5 t1)
+         Remote SQL: SELECT r2.c1, r1.c1 FROM ("S 1"."T 3" r2 LEFT JOIN "S 1"."T 4" r1 ON (((r1.c1 = r2.c1)))) ORDER BY r2.c1 ASC, r1.c1 ASC
+(6 rows)
 
 SELECT t1.c1, t2.c1 FROM ft5 t1 RIGHT JOIN ft4 t2 ON (t1.c1 = t2.c1) ORDER BY t2.c1, t1.c1 OFFSET 10 LIMIT 10;
  c1 | c1 
 ----+----
     | 22
  24 | 24
     | 26
     | 28
  30 | 30
     | 32
     | 34
  36 | 36
     | 38
     | 40
 (10 rows)
 
 -- full outer join
 EXPLAIN (COSTS false, VERBOSE)
 SELECT t1.c1, t2.c1 FROM ft4 t1 FULL JOIN ft5 t2 ON (t1.c1 = t2.c1) ORDER BY t1.c1, t2.c1 OFFSET 45 LIMIT 10;
-                                                     QUERY PLAN                                                      
----------------------------------------------------------------------------------------------------------------------
+                                                                 QUERY PLAN                                                                  
+---------------------------------------------------------------------------------------------------------------------------------------------
  Limit
    Output: t1.c1, t2.c1
-   ->  Sort
+   ->  Foreign Scan
          Output: t1.c1, t2.c1
-         Sort Key: t1.c1, t2.c1
-         ->  Foreign Scan
-               Output: t1.c1, t2.c1
-               Relations: (public.ft4 t1) FULL JOIN (public.ft5 t2)
-               Remote SQL: SELECT r1.c1, r2.c1 FROM ("S 1"."T 3" r1 FULL JOIN "S 1"."T 4" r2 ON (((r1.c1 = r2.c1))))
-(9 rows)
+         Relations: (public.ft4 t1) FULL JOIN (public.ft5 t2)
+         Remote SQL: SELECT r1.c1, r2.c1 FROM ("S 1"."T 3" r1 FULL JOIN "S 1"."T 4" r2 ON (((r1.c1 = r2.c1)))) ORDER BY r1.c1 ASC, r2.c1 ASC
+(6 rows)
 
 SELECT t1.c1, t2.c1 FROM ft4 t1 FULL JOIN ft5 t2 ON (t1.c1 = t2.c1) ORDER BY t1.c1, t2.c1 OFFSET 45 LIMIT 10;
  c1  | c1 
 -----+----
   92 |   
   94 |   
   96 | 96
   98 |   
  100 |   
      |  3
@@ -1094,191 +1113,179 @@ SELECT t1.c1, t2.c1 FROM ft4 t1 FULL JOIN ft5 t2 ON (t1.c1 = t2.c1) WHERE (t1.c1
     |  3
     |  9
     | 15
     | 21
 (10 rows)
 
 -- join two tables with FOR UPDATE clause
 -- tests whole-row reference for row marks
 EXPLAIN (COSTS false, VERBOSE)
 SELECT t1.c1, t2.c1 FROM ft1 t1 JOIN ft2 t2 ON (t1.c1 = t2.c1) ORDER BY t1.c3, t1.c1 OFFSET 100 LIMIT 10 FOR UPDATE OF t1;
-                                                                                                                                                QUERY PLAN                                                                                                                                                 
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+                                                                                                                                                              QUERY PLAN                                                                                                                                                              
+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
  Limit
    Output: t1.c1, t2.c1, t1.c3, t1.*, t2.*
    ->  LockRows
          Output: t1.c1, t2.c1, t1.c3, t1.*, t2.*
-         ->  Sort
+         ->  Foreign Scan
                Output: t1.c1, t2.c1, t1.c3, t1.*, t2.*
-               Sort Key: t1.c3, t1.c1
-               ->  Foreign Scan
-                     Output: t1.c1, t2.c1, t1.c3, t1.*, t2.*
-                     Relations: (public.ft1 t1) INNER JOIN (public.ft2 t2)
-                     Remote SQL: SELECT r1."C 1", r1.c3, ROW(r1."C 1", r1.c2, r1.c3, r1.c4, r1.c5, r1.c6, r1.c7, r1.c8), r2."C 1", ROW(r2."C 1", r2.c2, r2.c3, r2.c4, r2.c5, r2.c6, r2.c7, r2.c8) FROM ("S 1"."T 1" r1 INNER JOIN "S 1"."T 1" r2 ON (TRUE)) WHERE ((r1."C 1" = r2."C 1")) FOR UPDATE OF r1
-                     ->  Merge Join
-                           Output: t1.c1, t1.c3, t1.*, t2.c1, t2.*
-                           Merge Cond: (t1.c1 = t2.c1)
-                           ->  Sort
+               Relations: (public.ft1 t1) INNER JOIN (public.ft2 t2)
+               Remote SQL: SELECT r1."C 1", r1.c3, ROW(r1."C 1", r1.c2, r1.c3, r1.c4, r1.c5, r1.c6, r1.c7, r1.c8), r2."C 1", ROW(r2."C 1", r2.c2, r2.c3, r2.c4, r2.c5, r2.c6, r2.c7, r2.c8) FROM ("S 1"."T 1" r1 INNER JOIN "S 1"."T 1" r2 ON (TRUE)) WHERE ((r1."C 1" = r2."C 1")) ORDER BY r1.c3 ASC, r1."C 1" ASC FOR UPDATE OF r1
+               ->  Merge Join
+                     Output: t1.c1, t1.c3, t1.*, t2.c1, t2.*
+                     Merge Cond: (t1.c1 = t2.c1)
+                     ->  Sort
+                           Output: t1.c1, t1.c3, t1.*
+                           Sort Key: t1.c1
+                           ->  Foreign Scan on public.ft1 t1
                                  Output: t1.c1, t1.c3, t1.*
-                                 Sort Key: t1.c1
-                                 ->  Foreign Scan on public.ft1 t1
-                                       Output: t1.c1, t1.c3, t1.*
-                                       Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" FOR UPDATE
-                           ->  Sort
+                                 Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" FOR UPDATE
+                     ->  Sort
+                           Output: t2.c1, t2.*
+                           Sort Key: t2.c1
+                           ->  Foreign Scan on public.ft2 t2
                                  Output: t2.c1, t2.*
-                                 Sort Key: t2.c1
-                                 ->  Foreign Scan on public.ft2 t2
-                                       Output: t2.c1, t2.*
-                                       Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1"
-(26 rows)
+                                 Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1"
+(23 rows)
 
 SELECT t1.c1, t2.c1 FROM ft1 t1 JOIN ft2 t2 ON (t1.c1 = t2.c1) ORDER BY t1.c3, t1.c1 OFFSET 100 LIMIT 10 FOR UPDATE OF t1;
  c1  | c1  
 -----+-----
  101 | 101
  102 | 102
  103 | 103
  104 | 104
  105 | 105
  106 | 106
  107 | 107
  108 | 108
  109 | 109
  110 | 110
 (10 rows)
 
 EXPLAIN (COSTS false, VERBOSE)
 SELECT t1.c1, t2.c1 FROM ft1 t1 JOIN ft2 t2 ON (t1.c1 = t2.c1) ORDER BY t1.c3, t1.c1 OFFSET 100 LIMIT 10 FOR UPDATE;
-                                                                                                                                                         QUERY PLAN                                                                                                                                                         
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+                                                                                                                                                                      QUERY PLAN                                                                                                                                                                       
+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
  Limit
    Output: t1.c1, t2.c1, t1.c3, t1.*, t2.*
    ->  LockRows
          Output: t1.c1, t2.c1, t1.c3, t1.*, t2.*
-         ->  Sort
+         ->  Foreign Scan
                Output: t1.c1, t2.c1, t1.c3, t1.*, t2.*
-               Sort Key: t1.c3, t1.c1
-               ->  Foreign Scan
-                     Output: t1.c1, t2.c1, t1.c3, t1.*, t2.*
-                     Relations: (public.ft1 t1) INNER JOIN (public.ft2 t2)
-                     Remote SQL: SELECT r1."C 1", r1.c3, ROW(r1."C 1", r1.c2, r1.c3, r1.c4, r1.c5, r1.c6, r1.c7, r1.c8), r2."C 1", ROW(r2."C 1", r2.c2, r2.c3, r2.c4, r2.c5, r2.c6, r2.c7, r2.c8) FROM ("S 1"."T 1" r1 INNER JOIN "S 1"."T 1" r2 ON (TRUE)) WHERE ((r1."C 1" = r2."C 1")) FOR UPDATE OF r1 FOR UPDATE OF r2
-                     ->  Merge Join
-                           Output: t1.c1, t1.c3, t1.*, t2.c1, t2.*
-                           Merge Cond: (t1.c1 = t2.c1)
-                           ->  Sort
+               Relations: (public.ft1 t1) INNER JOIN (public.ft2 t2)
+               Remote SQL: SELECT r1."C 1", r1.c3, ROW(r1."C 1", r1.c2, r1.c3, r1.c4, r1.c5, r1.c6, r1.c7, r1.c8), r2."C 1", ROW(r2."C 1", r2.c2, r2.c3, r2.c4, r2.c5, r2.c6, r2.c7, r2.c8) FROM ("S 1"."T 1" r1 INNER JOIN "S 1"."T 1" r2 ON (TRUE)) WHERE ((r1."C 1" = r2."C 1")) ORDER BY r1.c3 ASC, r1."C 1" ASC FOR UPDATE OF r1 FOR UPDATE OF r2
+               ->  Merge Join
+                     Output: t1.c1, t1.c3, t1.*, t2.c1, t2.*
+                     Merge Cond: (t1.c1 = t2.c1)
+                     ->  Sort
+                           Output: t1.c1, t1.c3, t1.*
+                           Sort Key: t1.c1
+                           ->  Foreign Scan on public.ft1 t1
                                  Output: t1.c1, t1.c3, t1.*
-                                 Sort Key: t1.c1
-                                 ->  Foreign Scan on public.ft1 t1
-                                       Output: t1.c1, t1.c3, t1.*
-                                       Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" FOR UPDATE
-                           ->  Sort
+                                 Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" FOR UPDATE
+                     ->  Sort
+                           Output: t2.c1, t2.*
+                           Sort Key: t2.c1
+                           ->  Foreign Scan on public.ft2 t2
                                  Output: t2.c1, t2.*
-                                 Sort Key: t2.c1
-                                 ->  Foreign Scan on public.ft2 t2
-                                       Output: t2.c1, t2.*
-                                       Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" FOR UPDATE
-(26 rows)
+                                 Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" FOR UPDATE
+(23 rows)
 
 SELECT t1.c1, t2.c1 FROM ft1 t1 JOIN ft2 t2 ON (t1.c1 = t2.c1) ORDER BY t1.c3, t1.c1 OFFSET 100 LIMIT 10 FOR UPDATE;
  c1  | c1  
 -----+-----
  101 | 101
  102 | 102
  103 | 103
  104 | 104
  105 | 105
  106 | 106
  107 | 107
  108 | 108
  109 | 109
  110 | 110
 (10 rows)
 
 -- join two tables with FOR SHARE clause
 EXPLAIN (COSTS false, VERBOSE)
 SELECT t1.c1, t2.c1 FROM ft1 t1 JOIN ft2 t2 ON (t1.c1 = t2.c1) ORDER BY t1.c3, t1.c1 OFFSET 100 LIMIT 10 FOR SHARE OF t1;
-                                                                                                                                                QUERY PLAN                                                                                                                                                
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+                                                                                                                                                             QUERY PLAN                                                                                                                                                              
+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
  Limit
    Output: t1.c1, t2.c1, t1.c3, t1.*, t2.*
    ->  LockRows
          Output: t1.c1, t2.c1, t1.c3, t1.*, t2.*
-         ->  Sort
+         ->  Foreign Scan
                Output: t1.c1, t2.c1, t1.c3, t1.*, t2.*
-               Sort Key: t1.c3, t1.c1
-               ->  Foreign Scan
-                     Output: t1.c1, t2.c1, t1.c3, t1.*, t2.*
-                     Relations: (public.ft1 t1) INNER JOIN (public.ft2 t2)
-                     Remote SQL: SELECT r1."C 1", r1.c3, ROW(r1."C 1", r1.c2, r1.c3, r1.c4, r1.c5, r1.c6, r1.c7, r1.c8), r2."C 1", ROW(r2."C 1", r2.c2, r2.c3, r2.c4, r2.c5, r2.c6, r2.c7, r2.c8) FROM ("S 1"."T 1" r1 INNER JOIN "S 1"."T 1" r2 ON (TRUE)) WHERE ((r1."C 1" = r2."C 1")) FOR SHARE OF r1
-                     ->  Merge Join
-                           Output: t1.c1, t1.c3, t1.*, t2.c1, t2.*
-                           Merge Cond: (t1.c1 = t2.c1)
-                           ->  Sort
+               Relations: (public.ft1 t1) INNER JOIN (public.ft2 t2)
+               Remote SQL: SELECT r1."C 1", r1.c3, ROW(r1."C 1", r1.c2, r1.c3, r1.c4, r1.c5, r1.c6, r1.c7, r1.c8), r2."C 1", ROW(r2."C 1", r2.c2, r2.c3, r2.c4, r2.c5, r2.c6, r2.c7, r2.c8) FROM ("S 1"."T 1" r1 INNER JOIN "S 1"."T 1" r2 ON (TRUE)) WHERE ((r1."C 1" = r2."C 1")) ORDER BY r1.c3 ASC, r1."C 1" ASC FOR SHARE OF r1
+               ->  Merge Join
+                     Output: t1.c1, t1.c3, t1.*, t2.c1, t2.*
+                     Merge Cond: (t1.c1 = t2.c1)
+                     ->  Sort
+                           Output: t1.c1, t1.c3, t1.*
+                           Sort Key: t1.c1
+                           ->  Foreign Scan on public.ft1 t1
                                  Output: t1.c1, t1.c3, t1.*
-                                 Sort Key: t1.c1
-                                 ->  Foreign Scan on public.ft1 t1
-                                       Output: t1.c1, t1.c3, t1.*
-                                       Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" FOR SHARE
-                           ->  Sort
+                                 Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" FOR SHARE
+                     ->  Sort
+                           Output: t2.c1, t2.*
+                           Sort Key: t2.c1
+                           ->  Foreign Scan on public.ft2 t2
                                  Output: t2.c1, t2.*
-                                 Sort Key: t2.c1
-                                 ->  Foreign Scan on public.ft2 t2
-                                       Output: t2.c1, t2.*
-                                       Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1"
-(26 rows)
+                                 Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1"
+(23 rows)
 
 SELECT t1.c1, t2.c1 FROM ft1 t1 JOIN ft2 t2 ON (t1.c1 = t2.c1) ORDER BY t1.c3, t1.c1 OFFSET 100 LIMIT 10 FOR SHARE OF t1;
  c1  | c1  
 -----+-----
  101 | 101
  102 | 102
  103 | 103
  104 | 104
  105 | 105
  106 | 106
  107 | 107
  108 | 108
  109 | 109
  110 | 110
 (10 rows)
 
 EXPLAIN (COSTS false, VERBOSE)
 SELECT t1.c1, t2.c1 FROM ft1 t1 JOIN ft2 t2 ON (t1.c1 = t2.c1) ORDER BY t1.c3, t1.c1 OFFSET 100 LIMIT 10 FOR SHARE;
-                                                                                                                                                        QUERY PLAN                                                                                                                                                        
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+                                                                                                                                                                     QUERY PLAN                                                                                                                                                                      
+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
  Limit
    Output: t1.c1, t2.c1, t1.c3, t1.*, t2.*
    ->  LockRows
          Output: t1.c1, t2.c1, t1.c3, t1.*, t2.*
-         ->  Sort
+         ->  Foreign Scan
                Output: t1.c1, t2.c1, t1.c3, t1.*, t2.*
-               Sort Key: t1.c3, t1.c1
-               ->  Foreign Scan
-                     Output: t1.c1, t2.c1, t1.c3, t1.*, t2.*
-                     Relations: (public.ft1 t1) INNER JOIN (public.ft2 t2)
-                     Remote SQL: SELECT r1."C 1", r1.c3, ROW(r1."C 1", r1.c2, r1.c3, r1.c4, r1.c5, r1.c6, r1.c7, r1.c8), r2."C 1", ROW(r2."C 1", r2.c2, r2.c3, r2.c4, r2.c5, r2.c6, r2.c7, r2.c8) FROM ("S 1"."T 1" r1 INNER JOIN "S 1"."T 1" r2 ON (TRUE)) WHERE ((r1."C 1" = r2."C 1")) FOR SHARE OF r1 FOR SHARE OF r2
-                     ->  Merge Join
-                           Output: t1.c1, t1.c3, t1.*, t2.c1, t2.*
-                           Merge Cond: (t1.c1 = t2.c1)
-                           ->  Sort
+               Relations: (public.ft1 t1) INNER JOIN (public.ft2 t2)
+               Remote SQL: SELECT r1."C 1", r1.c3, ROW(r1."C 1", r1.c2, r1.c3, r1.c4, r1.c5, r1.c6, r1.c7, r1.c8), r2."C 1", ROW(r2."C 1", r2.c2, r2.c3, r2.c4, r2.c5, r2.c6, r2.c7, r2.c8) FROM ("S 1"."T 1" r1 INNER JOIN "S 1"."T 1" r2 ON (TRUE)) WHERE ((r1."C 1" = r2."C 1")) ORDER BY r1.c3 ASC, r1."C 1" ASC FOR SHARE OF r1 FOR SHARE OF r2
+               ->  Merge Join
+                     Output: t1.c1, t1.c3, t1.*, t2.c1, t2.*
+                     Merge Cond: (t1.c1 = t2.c1)
+                     ->  Sort
+                           Output: t1.c1, t1.c3, t1.*
+                           Sort Key: t1.c1
+                           ->  Foreign Scan on public.ft1 t1
                                  Output: t1.c1, t1.c3, t1.*
-                                 Sort Key: t1.c1
-                                 ->  Foreign Scan on public.ft1 t1
-                                       Output: t1.c1, t1.c3, t1.*
-                                       Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" FOR SHARE
-                           ->  Sort
+                                 Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" FOR SHARE
+                     ->  Sort
+                           Output: t2.c1, t2.*
+                           Sort Key: t2.c1
+                           ->  Foreign Scan on public.ft2 t2
                                  Output: t2.c1, t2.*
-                                 Sort Key: t2.c1
-                                 ->  Foreign Scan on public.ft2 t2
-                                       Output: t2.c1, t2.*
-                                       Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" FOR SHARE
-(26 rows)
+                                 Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" FOR SHARE
+(23 rows)
 
 SELECT t1.c1, t2.c1 FROM ft1 t1 JOIN ft2 t2 ON (t1.c1 = t2.c1) ORDER BY t1.c3, t1.c1 OFFSET 100 LIMIT 10 FOR SHARE;
  c1  | c1  
 -----+-----
  101 | 101
  102 | 102
  103 | 103
  104 | 104
  105 | 105
  106 | 106
@@ -1318,32 +1325,29 @@ WITH t (c1_1, c1_3, c2_1) AS (SELECT t1.c1, t1.c3, t2.c1 FROM ft1 t1 JOIN ft2 t2
   106 |  106
   107 |  107
   108 |  108
   109 |  109
   110 |  110
 (10 rows)
 
 -- ctid with whole-row reference
 EXPLAIN (COSTS false, VERBOSE)
 SELECT t1.ctid, t1, t2, t1.c1 FROM ft1 t1 JOIN ft2 t2 ON (t1.c1 = t2.c1) ORDER BY t1.c3, t1.c1 OFFSET 100 LIMIT 10;
-                                                                                                                                    QUERY PLAN                                                                                                                                     
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+                                                                                                                                                  QUERY PLAN                                                                                                                                                  
+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
  Limit
    Output: t1.ctid, t1.*, t2.*, t1.c1, t1.c3
-   ->  Sort
+   ->  Foreign Scan
          Output: t1.ctid, t1.*, t2.*, t1.c1, t1.c3
-         Sort Key: t1.c3, t1.c1
-         ->  Foreign Scan
-               Output: t1.ctid, t1.*, t2.*, t1.c1, t1.c3
-               Relations: (public.ft1 t1) INNER JOIN (public.ft2 t2)
-               Remote SQL: SELECT r1.ctid, ROW(r1."C 1", r1.c2, r1.c3, r1.c4, r1.c5, r1.c6, r1.c7, r1.c8), r1."C 1", r1.c3, ROW(r2."C 1", r2.c2, r2.c3, r2.c4, r2.c5, r2.c6, r2.c7, r2.c8) FROM ("S 1"."T 1" r1 INNER JOIN "S 1"."T 1" r2 ON (TRUE)) WHERE ((r1."C 1" = r2."C 1"))
-(9 rows)
+         Relations: (public.ft1 t1) INNER JOIN (public.ft2 t2)
+         Remote SQL: SELECT r1.ctid, ROW(r1."C 1", r1.c2, r1.c3, r1.c4, r1.c5, r1.c6, r1.c7, r1.c8), r1."C 1", r1.c3, ROW(r2."C 1", r2.c2, r2.c3, r2.c4, r2.c5, r2.c6, r2.c7, r2.c8) FROM ("S 1"."T 1" r1 INNER JOIN "S 1"."T 1" r2 ON (TRUE)) WHERE ((r1."C 1" = r2."C 1")) ORDER BY r1.c3 ASC, r1."C 1" ASC
+(6 rows)
 
 -- SEMI JOIN, not pushed down
 EXPLAIN (COSTS false, VERBOSE)
 SELECT t1.c1 FROM ft1 t1 WHERE EXISTS (SELECT 1 FROM ft2 t2 WHERE t1.c1 = t2.c1) ORDER BY t1.c1 OFFSET 100 LIMIT 10;
                                     QUERY PLAN                                    
 ----------------------------------------------------------------------------------
  Limit
    Output: t1.c1
    ->  Merge Semi Join
          Output: t1.c1
@@ -1665,32 +1669,29 @@ SELECT t1."C 1" FROM "S 1"."T 1" t1, LATERAL (SELECT DISTINCT t2.c1, t3.c1 FROM
 (10 rows)
 
 -- create another user for permission, user mapping, effective user tests
 CREATE USER view_owner;
 -- grant privileges on ft4 and ft5 to view_owner
 GRANT ALL ON ft4 TO view_owner;
 GRANT ALL ON ft5 TO view_owner;
 -- prepare statement with current session user
 PREPARE join_stmt AS SELECT t1.c1, t2.c1 FROM ft4 t1 LEFT JOIN ft5 t2 ON (t1.c1 = t2.c1) ORDER BY t1.c1, t2.c1 OFFSET 10 LIMIT 10;
 EXPLAIN (COSTS OFF, VERBOSE) EXECUTE join_stmt;
-                                                     QUERY PLAN                                                      
----------------------------------------------------------------------------------------------------------------------
+                                                                 QUERY PLAN                                                                  
+---------------------------------------------------------------------------------------------------------------------------------------------
  Limit
    Output: t1.c1, t2.c1
-   ->  Sort
+   ->  Foreign Scan
          Output: t1.c1, t2.c1
-         Sort Key: t1.c1, t2.c1
-         ->  Foreign Scan
-               Output: t1.c1, t2.c1
-               Relations: (public.ft4 t1) LEFT JOIN (public.ft5 t2)
-               Remote SQL: SELECT r1.c1, r2.c1 FROM ("S 1"."T 3" r1 LEFT JOIN "S 1"."T 4" r2 ON (((r1.c1 = r2.c1))))
-(9 rows)
+         Relations: (public.ft4 t1) LEFT JOIN (public.ft5 t2)
+         Remote SQL: SELECT r1.c1, r2.c1 FROM ("S 1"."T 3" r1 LEFT JOIN "S 1"."T 4" r2 ON (((r1.c1 = r2.c1)))) ORDER BY r1.c1 ASC, r2.c1 ASC
+(6 rows)
 
 EXECUTE join_stmt;
  c1 | c1 
 ----+----
  22 |   
  24 | 24
  26 |   
  28 |   
  30 | 30
  32 |   
diff --git a/contrib/postgres_fdw/postgres_fdw.c b/contrib/postgres_fdw/postgres_fdw.c
index 14a3f98..35583f4 100644
--- a/contrib/postgres_fdw/postgres_fdw.c
+++ b/contrib/postgres_fdw/postgres_fdw.c
@@ -276,23 +276,20 @@ static bool postgresAnalyzeForeignTable(Relation relation,
 static List *postgresImportForeignSchema(ImportForeignSchemaStmt *stmt,
 							Oid serverOid);
 static void postgresGetForeignJoinPaths(PlannerInfo *root,
 							RelOptInfo *joinrel,
 							RelOptInfo *outerrel,
 							RelOptInfo *innerrel,
 							JoinType jointype,
 							JoinPathExtraData *extra);
 static bool postgresRecheckForeignScan(ForeignScanState *node,
 						   TupleTableSlot *slot);
-static List *get_useful_pathkeys_for_relation(PlannerInfo *root,
-								 RelOptInfo *rel);
-static List *get_useful_ecs_for_relation(PlannerInfo *root, RelOptInfo *rel);
 
 /*
  * Helper functions
  */
 static void estimate_path_cost_size(PlannerInfo *root,
 						RelOptInfo *baserel,
 						List *join_conds,
 						List *pathkeys,
 						double *p_rows, int *p_width,
 						Cost *p_startup_cost, Cost *p_total_cost);
@@ -324,20 +321,25 @@ static HeapTuple make_tuple_from_result_row(PGresult *res,
 						   int row,
 						   Relation rel,
 						   AttInMetadata *attinmeta,
 						   List *retrieved_attrs,
 						   ForeignScanState *fsstate,
 						   MemoryContext temp_context);
 static void conversion_error_callback(void *arg);
 static bool foreign_join_ok(PlannerInfo *root, RelOptInfo *joinrel,
 				JoinType jointype, RelOptInfo *outerrel, RelOptInfo *innerrel,
 				JoinPathExtraData *extra);
+static List *get_useful_pathkeys_for_relation(PlannerInfo *root,
+								 RelOptInfo *rel);
+static List *get_useful_ecs_for_relation(PlannerInfo *root, RelOptInfo *rel);
+static void add_paths_with_pathkeys_for_rel(PlannerInfo *root, RelOptInfo *rel,
+								Path *epq_path);
 
 
 /*
  * Foreign-data wrapper handler function: return a struct with pointers
  * to my callback routines.
  */
 Datum
 postgres_fdw_handler(PG_FUNCTION_ARGS)
 {
 	FdwRoutine *routine = makeNode(FdwRoutine);
@@ -496,20 +498,28 @@ postgresGetForeignRelSize(PlannerInfo *root,
 	 */
 	fpinfo->local_conds_sel = clauselist_selectivity(root,
 													 fpinfo->local_conds,
 													 baserel->relid,
 													 JOIN_INNER,
 													 NULL);
 
 	cost_qual_eval(&fpinfo->local_conds_cost, fpinfo->local_conds, root);
 
 	/*
+	 * Set cached relation costs to some negative value, so that we can detect
+	 * when they are set to some sensible costs during one (usually the first)
+	 * of the calls to estimate_path_cost_size().
+	 */
+	fpinfo->rel_startup_cost = -1;
+	fpinfo->rel_total_cost = -1;
+
+	/*
 	 * If the table or the server is configured to use remote estimates,
 	 * connect to the foreign server and execute EXPLAIN to estimate the
 	 * number of rows selected by the restriction clauses, as well as the
 	 * average row width.  Otherwise, estimate using whatever statistics we
 	 * have locally, in a way similar to ordinary tables.
 	 */
 	if (fpinfo->use_remote_estimate)
 	{
 		/*
 		 * Get cost/size estimates with help of remote server.  Save the
@@ -627,26 +637,26 @@ get_useful_ecs_for_relation(PlannerInfo *root, RelOptInfo *rel)
 
 		/* Make sure we've got canonical ECs. */
 		update_mergeclause_eclasses(root, restrictinfo);
 
 		/*
 		 * restrictinfo->mergeopfamilies != NIL is sufficient to guarantee
 		 * that left_ec and right_ec will be initialized, per comments in
 		 * distribute_qual_to_rels, and rel->joininfo should only contain ECs
 		 * where this relation appears on one side or the other.
 		 */
-		if (bms_is_subset(restrictinfo->right_ec->ec_relids, relids))
+		if (bms_is_subset(relids, restrictinfo->right_ec->ec_relids))
 			useful_eclass_list = list_append_unique_ptr(useful_eclass_list,
 													 restrictinfo->right_ec);
 		else
 		{
-			Assert(bms_is_subset(restrictinfo->left_ec->ec_relids, relids));
+			Assert(bms_is_subset(relids, restrictinfo->left_ec->ec_relids));
 			useful_eclass_list = list_append_unique_ptr(useful_eclass_list,
 													  restrictinfo->left_ec);
 		}
 	}
 
 	return useful_eclass_list;
 }
 
 /*
  * get_useful_pathkeys_for_relation
@@ -766,63 +776,40 @@ get_useful_pathkeys_for_relation(PlannerInfo *root, RelOptInfo *rel)
  */
 static void
 postgresGetForeignPaths(PlannerInfo *root,
 						RelOptInfo *baserel,
 						Oid foreigntableid)
 {
 	PgFdwRelationInfo *fpinfo = (PgFdwRelationInfo *) baserel->fdw_private;
 	ForeignPath *path;
 	List	   *ppi_list;
 	ListCell   *lc;
-	List	   *useful_pathkeys_list = NIL;		/* List of all pathkeys */
 
 	/*
 	 * Create simplest ForeignScan path node and add it to baserel.  This path
 	 * corresponds to SeqScan path of regular tables (though depending on what
 	 * baserestrict conditions we were able to send to remote, there might
 	 * actually be an indexscan happening there).  We already did all the work
 	 * to estimate cost and size of this path.
 	 */
 	path = create_foreignscan_path(root, baserel,
 								   fpinfo->rows,
 								   fpinfo->startup_cost,
 								   fpinfo->total_cost,
 								   NIL, /* no pathkeys */
 								   NULL,		/* no outer rel either */
 								   NULL,		/* no extra plan */
 								   NIL);		/* no fdw_private list */
 	add_path(baserel, (Path *) path);
 
-	useful_pathkeys_list = get_useful_pathkeys_for_relation(root, baserel);
-
-	/* Create one path for each set of pathkeys we found above. */
-	foreach(lc, useful_pathkeys_list)
-	{
-		double		rows;
-		int			width;
-		Cost		startup_cost;
-		Cost		total_cost;
-		List	   *useful_pathkeys = lfirst(lc);
-
-		estimate_path_cost_size(root, baserel, NIL, useful_pathkeys,
-								&rows, &width, &startup_cost, &total_cost);
-
-		add_path(baserel, (Path *)
-				 create_foreignscan_path(root, baserel,
-										 rows,
-										 startup_cost,
-										 total_cost,
-										 useful_pathkeys,
-										 NULL,
-										 NULL,
-										 NIL));
-	}
+	/* Add paths with pathkeys */
+	add_paths_with_pathkeys_for_rel(root, baserel, NULL);
 
 	/*
 	 * If we're not using remote estimates, stop here.  We have no way to
 	 * estimate whether any join clauses would be worth sending across, so
 	 * don't bother building parameterized paths.
 	 */
 	if (!fpinfo->use_remote_estimate)
 		return;
 
 	/*
@@ -2174,21 +2161,32 @@ estimate_path_cost_size(PlannerInfo *root,
 		 * Use rows/width estimates made by set_baserel_size_estimates() for
 		 * base foreign relations and set_joinrel_size_estimates() for join
 		 * between foreign relations.
 		 */
 		rows = foreignrel->rows;
 		width = foreignrel->width;
 
 		/* Back into an estimate of the number of retrieved rows. */
 		retrieved_rows = clamp_row_est(rows / fpinfo->local_conds_sel);
 
-		if (foreignrel->reloptkind != RELOPT_JOINREL)
+		/*
+		 * We will come here again and again with different set of pathkeys
+		 * that caller wants to cost. We don't need to calculate the cost of
+		 * bare scan each time. Instead, use the costs if we have cached them
+		 * already.
+		 */
+		if (fpinfo->rel_startup_cost > 0 && fpinfo->rel_total_cost > 0)
+		{
+			startup_cost = fpinfo->rel_startup_cost;
+			run_cost = fpinfo->rel_total_cost - fpinfo->rel_startup_cost;
+		}
+		else if (foreignrel->reloptkind != RELOPT_JOINREL)
 		{
 			/* Clamp retrieved rows estimates to at most foreignrel->tuples. */
 			retrieved_rows = Min(retrieved_rows, foreignrel->tuples);
 
 			/*
 			 * Cost as though this were a seqscan, which is pessimistic.  We
 			 * effectively imagine the local_conds are being evaluated
 			 * remotely, too.
 			 */
 			startup_cost = 0;
@@ -2276,27 +2274,33 @@ estimate_path_cost_size(PlannerInfo *root,
 		if (pathkeys != NIL)
 		{
 			startup_cost *= DEFAULT_FDW_SORT_MULTIPLIER;
 			run_cost *= DEFAULT_FDW_SORT_MULTIPLIER;
 		}
 
 		total_cost = startup_cost + run_cost;
 	}
 
 	/*
-	 * Cache the costs prior to adding the costs for transferring data from
-	 * the foreign server. These costs are useful for costing the join between
-	 * this relation and another foreign relation, when the cost of join can
-	 * not be obtained from the foreign server.
+	 * Cache the costs for scans without any pathkeys or parameterization
+	 * prior to adding the costs for transferring data from the foreign
+	 * server. These costs are useful for costing the join between this
+	 * relation and another foreign relation or to calculate the costs of
+	 * paths with pathkeys for this relation, when the costs can not be
+	 * obtained from the foreign server. This function will be called at least
+	 * once for every foreign relation without pathkeys and parameterization.
 	 */
-	fpinfo->rel_startup_cost = startup_cost;
-	fpinfo->rel_total_cost = total_cost;
+	if (pathkeys == NIL && param_join_conds == NIL)
+	{
+		fpinfo->rel_startup_cost = startup_cost;
+		fpinfo->rel_total_cost = total_cost;
+	}
 
 	/*
 	 * Add some additional cost factors to account for connection overhead
 	 * (fdw_startup_cost), transferring data across the network
 	 * (fdw_tuple_cost per retrieved row), and local manipulation of the data
 	 * (cpu_tuple_cost per retrieved row).
 	 */
 	startup_cost += fpinfo->fdw_startup_cost;
 	total_cost += fpinfo->fdw_startup_cost;
 	total_cost += fpinfo->fdw_tuple_cost * retrieved_rows;
@@ -3451,20 +3455,28 @@ foreign_join_ok(PlannerInfo *root, RelOptInfo *joinrel, JoinType jointype,
 		fpinfo_i->use_remote_estimate;
 
 	/*
 	 * Since both the joining relations come from the same server, the server
 	 * level options should have same value for both the relations. Pick from
 	 * any side.
 	 */
 	fpinfo->fdw_startup_cost = fpinfo_o->fdw_startup_cost;
 	fpinfo->fdw_tuple_cost = fpinfo_o->fdw_tuple_cost;
 
+	/*
+	 * Set cached relation costs to some negative value, so that we can detect
+	 * when they are set to some sensible costs, during one (usually the
+	 * first) of the calls to estimate_path_cost_size().
+	 */
+	fpinfo->rel_startup_cost = -1;
+	fpinfo->rel_total_cost = -1;
+
 	/* Mark that this join can be pushed down safely */
 	fpinfo->pushdown_safe = true;
 
 	/*
 	 * Set fetch size to maximum of the joining sides, since we are expecting
 	 * the rows returned by the join to be proportional to the relation sizes.
 	 */
 	if (fpinfo_o->fetch_size > fpinfo_i->fetch_size)
 		fpinfo->fetch_size = fpinfo_o->fetch_size;
 	else
@@ -3525,20 +3537,59 @@ foreign_join_ok(PlannerInfo *root, RelOptInfo *joinrel, JoinType jointype,
 	 */
 	fpinfo->relation_name = makeStringInfo();
 	appendStringInfo(fpinfo->relation_name, "(%s) %s JOIN (%s)",
 					 fpinfo_o->relation_name->data,
 					 get_jointype_name(fpinfo->jointype),
 					 fpinfo_i->relation_name->data);
 
 	return true;
 }
 
+static void
+add_paths_with_pathkeys_for_rel(PlannerInfo *root, RelOptInfo *rel,
+								Path *epq_path)
+{
+	List	   *useful_pathkeys_list = NIL;		/* List of all pathkeys */
+	ListCell   *lc;
+
+	useful_pathkeys_list = get_useful_pathkeys_for_relation(root, rel);
+
+	/* Create one path for each set of pathkeys we found above. */
+	foreach(lc, useful_pathkeys_list)
+	{
+		double		rows;
+		int			width;
+		Cost		startup_cost;
+		Cost		total_cost;
+		List	   *useful_pathkeys = lfirst(lc);
+
+		estimate_path_cost_size(root, rel, NIL, useful_pathkeys,
+								&rows, &width, &startup_cost, &total_cost);
+
+		/*
+		 * TODO: we should worry about EPQ path but should that path have
+		 * pathkeys? I guess, that's not really important since it's just
+		 * going to evaluate the join from whole-row references stuffed in the
+		 * corresponding EPQ slots, for which the order doesn't matter.
+		 */
+		add_path(rel, (Path *)
+				 create_foreignscan_path(root, rel,
+										 rows,
+										 startup_cost,
+										 total_cost,
+										 useful_pathkeys,
+										 NULL,
+										 epq_path,
+										 NIL));
+	}
+}
+
 /*
  * postgresGetForeignJoinPaths
  *		Add possible ForeignPath to joinrel, if join is safe to push down.
  */
 static void
 postgresGetForeignJoinPaths(PlannerInfo *root,
 							RelOptInfo *joinrel,
 							RelOptInfo *outerrel,
 							RelOptInfo *innerrel,
 							JoinType jointype,
@@ -3663,21 +3714,22 @@ postgresGetForeignJoinPaths(PlannerInfo *root,
 									   startup_cost,
 									   total_cost,
 									   NIL,		/* no pathkeys */
 									   NULL,	/* no required_outer */
 									   epq_path,
 									   NULL);	/* no fdw_private */
 
 	/* Add generated path into joinrel by add_path(). */
 	add_path(joinrel, (Path *) joinpath);
 
-	/* XXX Consider pathkeys for the join relation */
+	/* Consider pathkeys for the join relation */
+	add_paths_with_pathkeys_for_rel(root, joinrel, epq_path);
 
 	/* XXX Consider parameterized paths for the join relation */
 }
 
 /*
  * Create a tuple from the specified row of the PGresult.
  *
  * rel is the local representation of the foreign table, attinmeta is
  * conversion data for the rel's tupdesc, and retrieved_attrs is an
  * integer list of the table column numbers present in the PGresult.
@@ -3870,21 +3922,21 @@ conversion_error_callback(void *arg)
  */
 extern Expr *
 find_em_expr_for_rel(EquivalenceClass *ec, RelOptInfo *rel)
 {
 	ListCell   *lc_em;
 
 	foreach(lc_em, ec->ec_members)
 	{
 		EquivalenceMember *em = lfirst(lc_em);
 
-		if (bms_equal(em->em_relids, rel->relids))
+		if (bms_is_subset(em->em_relids, rel->relids))
 		{
 			/*
 			 * If there is more than one equivalence member whose Vars are
 			 * taken entirely from this relation, we'll be content to choose
 			 * any one of those.
 			 */
 			return em->em_expr;
 		}
 	}
 
diff --git a/contrib/postgres_fdw/sql/postgres_fdw.sql b/contrib/postgres_fdw/sql/postgres_fdw.sql
index 885a5fb..ee54a4b 100644
--- a/contrib/postgres_fdw/sql/postgres_fdw.sql
+++ b/contrib/postgres_fdw/sql/postgres_fdw.sql
@@ -230,20 +230,25 @@ SET enable_hashjoin TO false;
 SET enable_nestloop TO false;
 -- inner join; expressions in the clauses appear in the equivalence class list
 EXPLAIN (VERBOSE, COSTS false)
 	SELECT t1.c1, t2."C 1" FROM ft2 t1 JOIN "S 1"."T 1" t2 ON (t1.c1 = t2."C 1") OFFSET 100 LIMIT 10;
 SELECT t1.c1, t2."C 1" FROM ft2 t1 JOIN "S 1"."T 1" t2 ON (t1.c1 = t2."C 1") OFFSET 100 LIMIT 10;
 -- outer join; expressions in the clauses do not appear in equivalence class
 -- list but no output change as compared to the previous query
 EXPLAIN (VERBOSE, COSTS false)
 	SELECT t1.c1, t2."C 1" FROM ft2 t1 LEFT JOIN "S 1"."T 1" t2 ON (t1.c1 = t2."C 1") OFFSET 100 LIMIT 10;
 SELECT t1.c1, t2."C 1" FROM ft2 t1 LEFT JOIN "S 1"."T 1" t2 ON (t1.c1 = t2."C 1") OFFSET 100 LIMIT 10;
+-- A join between local table and foreign join. ORDER BY clause is added to the
+-- foreign join so that the local table can be joined using merge join strategy.
+EXPLAIN (COSTS false, VERBOSE)
+	SELECT t1."C 1" FROM "S 1"."T 1" t1 left join ft1 t2 join ft2 t3 on (t2.c1 = t3.c1) on (t3.c1 = t1."C 1") OFFSET 100 LIMIT 10;
+SELECT t1."C 1" FROM "S 1"."T 1" t1 left join ft1 t2 join ft2 t3 on (t2.c1 = t3.c1) on (t3.c1 = t1."C 1") OFFSET 100 LIMIT 10;
 RESET enable_hashjoin;
 RESET enable_nestloop;
 
 -- ===================================================================
 -- WHERE with remotely-executable conditions
 -- ===================================================================
 EXPLAIN (VERBOSE, COSTS false) SELECT * FROM ft1 t1 WHERE t1.c1 = 1;         -- Var, OpExpr(b), Const
 EXPLAIN (VERBOSE, COSTS false) SELECT * FROM ft1 t1 WHERE t1.c1 = 100 AND t1.c2 = 0; -- BoolExpr
 EXPLAIN (VERBOSE, COSTS false) SELECT * FROM ft1 t1 WHERE c1 IS NULL;        -- NullTest
 EXPLAIN (VERBOSE, COSTS false) SELECT * FROM ft1 t1 WHERE c1 IS NOT NULL;    -- NullTest
#3Robert Haas
robertmhaas@gmail.com
In reply to: Ashutosh Bapat (#2)
Re: Pushing down sorted joins

On Tue, Feb 23, 2016 at 7:48 AM, Ashutosh Bapat
<ashutosh.bapat@enterprisedb.com> wrote:

Rushabh pointed out that declarations of helper functions
get_useful_ecs_for_relation and get_useful_pathkeys_for_relation() are part
of FDW routines declarations rather than helper function declaration. Since
those functions are related to this patch, the attached patch moves those
declaration in their right place.

This patch needs to be rebased.

+               /*
+                * TODO: we should worry about EPQ path but should
that path have
+                * pathkeys? I guess, that's not really important
since it's just
+                * going to evaluate the join from whole-row
references stuffed in the
+                * corresponding EPQ slots, for which the order doesn't matter.
+                */

The pathkeys for the EPQ path don't matter. It'll only be called to
recheck one single row, and there's only one order in which you can
return one row.

-               if (bms_equal(em->em_relids, rel->relids))
+               if (bms_is_subset(em->em_relids, rel->relids))

Why do we need this?

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#4Ashutosh Bapat
ashutosh.bapat@enterprisedb.com
In reply to: Robert Haas (#3)
1 attachment(s)
Re: Pushing down sorted joins

This patch needs to be rebased.

Done.

+               /*
+                * TODO: we should worry about EPQ path but should
that path have
+                * pathkeys? I guess, that's not really important
since it's just
+                * going to evaluate the join from whole-row
references stuffed in the
+                * corresponding EPQ slots, for which the order doesn't
matter.
+                */

The pathkeys for the EPQ path don't matter. It'll only be called to
recheck one single row, and there's only one order in which you can
return one row.

Right. Removed the TODO

-               if (bms_equal(em->em_relids, rel->relids))
+               if (bms_is_subset(em->em_relids, rel->relids))

Why do we need this?

The function find_em_expr_for_rel() find an equivalence member expression
that has all its Vars come from the given relation. It's not necessary that
it will have Vars from relids that are covered by the given relations. E.g.
in query SELECT A.c1, B.c2 FROM A join B ON ... ORDER BY A.c3, there will
be a single equivalence member A.c3 in the pathkeys and em_relids will
indicate only A. Hence instead of equal, (which used to be OK for single
relation join push-down) we have to use subset operation. We want an
equivalence members whose relids are subset of relids contained by given
relation.

--
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company

Attachments:

pg_join_sort_pd_v3.patchtext/x-diff; charset=US-ASCII; name=pg_join_sort_pd_v3.patchDownload
diff --git a/contrib/postgres_fdw/expected/postgres_fdw.out b/contrib/postgres_fdw/expected/postgres_fdw.out
index 6479640..48bdbef 100644
--- a/contrib/postgres_fdw/expected/postgres_fdw.out
+++ b/contrib/postgres_fdw/expected/postgres_fdw.out
@@ -437,20 +437,54 @@ SELECT t1.c1, t2."C 1" FROM ft2 t1 LEFT JOIN "S 1"."T 1" t2 ON (t1.c1 = t2."C 1"
  103 | 103
  104 | 104
  105 | 105
  106 | 106
  107 | 107
  108 | 108
  109 | 109
  110 | 110
 (10 rows)
 
+-- A join between local table and foreign join. ORDER BY clause is added to the
+-- foreign join so that the local table can be joined using merge join strategy.
+EXPLAIN (COSTS false, VERBOSE)
+	SELECT t1."C 1" FROM "S 1"."T 1" t1 left join ft1 t2 join ft2 t3 on (t2.c1 = t3.c1) on (t3.c1 = t1."C 1") OFFSET 100 LIMIT 10;
+                                                                             QUERY PLAN                                                                             
+--------------------------------------------------------------------------------------------------------------------------------------------------------------------
+ Limit
+   Output: t1."C 1"
+   ->  Merge Right Join
+         Output: t1."C 1"
+         Merge Cond: (t3.c1 = t1."C 1")
+         ->  Foreign Scan
+               Output: t3.c1
+               Relations: (public.ft1 t2) INNER JOIN (public.ft2 t3)
+               Remote SQL: SELECT r3."C 1" FROM ("S 1"."T 1" r2 INNER JOIN "S 1"."T 1" r3 ON (TRUE)) WHERE ((r2."C 1" = r3."C 1")) ORDER BY r2."C 1" ASC NULLS LAST
+         ->  Index Only Scan using t1_pkey on "S 1"."T 1" t1
+               Output: t1."C 1"
+(11 rows)
+
+SELECT t1."C 1" FROM "S 1"."T 1" t1 left join ft1 t2 join ft2 t3 on (t2.c1 = t3.c1) on (t3.c1 = t1."C 1") OFFSET 100 LIMIT 10;
+ C 1 
+-----
+ 101
+ 102
+ 103
+ 104
+ 105
+ 106
+ 107
+ 108
+ 109
+ 110
+(10 rows)
+
 RESET enable_hashjoin;
 RESET enable_nestloop;
 -- ===================================================================
 -- WHERE with remotely-executable conditions
 -- ===================================================================
 EXPLAIN (VERBOSE, COSTS false) SELECT * FROM ft1 t1 WHERE t1.c1 = 1;         -- Var, OpExpr(b), Const
                                          QUERY PLAN                                          
 ---------------------------------------------------------------------------------------------
  Foreign Scan on public.ft1 t1
    Output: c1, c2, c3, c4, c5, c6, c7, c8
@@ -862,32 +896,29 @@ SELECT count(c3) FROM ft1 t1 WHERE t1.c1 === t1.c2;
 -- ===================================================================
 -- JOIN queries
 -- ===================================================================
 -- Analyze ft4 and ft5 so that we have better statistics. These tables do not
 -- have use_remote_estimate set.
 ANALYZE ft4;
 ANALYZE ft5;
 -- join two tables
 EXPLAIN (COSTS false, VERBOSE)
 SELECT t1.c1, t2.c1 FROM ft1 t1 JOIN ft2 t2 ON (t1.c1 = t2.c1) ORDER BY t1.c3, t1.c1 OFFSET 100 LIMIT 10;
-                                                                     QUERY PLAN                                                                     
-----------------------------------------------------------------------------------------------------------------------------------------------------
+                                                                                             QUERY PLAN                                                                                              
+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
  Limit
    Output: t1.c1, t2.c1, t1.c3
-   ->  Sort
+   ->  Foreign Scan
          Output: t1.c1, t2.c1, t1.c3
-         Sort Key: t1.c3, t1.c1
-         ->  Foreign Scan
-               Output: t1.c1, t2.c1, t1.c3
-               Relations: (public.ft1 t1) INNER JOIN (public.ft2 t2)
-               Remote SQL: SELECT r1."C 1", r1.c3, r2."C 1" FROM ("S 1"."T 1" r1 INNER JOIN "S 1"."T 1" r2 ON (TRUE)) WHERE ((r1."C 1" = r2."C 1"))
-(9 rows)
+         Relations: (public.ft1 t1) INNER JOIN (public.ft2 t2)
+         Remote SQL: SELECT r1."C 1", r1.c3, r2."C 1" FROM ("S 1"."T 1" r1 INNER JOIN "S 1"."T 1" r2 ON (TRUE)) WHERE ((r1."C 1" = r2."C 1")) ORDER BY r1.c3 ASC NULLS LAST, r1."C 1" ASC NULLS LAST
+(6 rows)
 
 SELECT t1.c1, t2.c1 FROM ft1 t1 JOIN ft2 t2 ON (t1.c1 = t2.c1) ORDER BY t1.c3, t1.c1 OFFSET 100 LIMIT 10;
  c1  | c1  
 -----+-----
  101 | 101
  102 | 102
  103 | 103
  104 | 104
  105 | 105
  106 | 106
@@ -924,32 +955,29 @@ SELECT t1.c1, t2.c2, t3.c3 FROM ft1 t1 JOIN ft2 t2 ON (t1.c1 = t2.c1) JOIN ft4 t
  32 |  2 | AAA032
  34 |  4 | AAA034
  36 |  6 | AAA036
  38 |  8 | AAA038
  40 |  0 | AAA040
 (10 rows)
 
 -- left outer join
 EXPLAIN (COSTS false, VERBOSE)
 SELECT t1.c1, t2.c1 FROM ft4 t1 LEFT JOIN ft5 t2 ON (t1.c1 = t2.c1) ORDER BY t1.c1, t2.c1 OFFSET 10 LIMIT 10;
-                                                     QUERY PLAN                                                      
----------------------------------------------------------------------------------------------------------------------
+                                                                            QUERY PLAN                                                                             
+-------------------------------------------------------------------------------------------------------------------------------------------------------------------
  Limit
    Output: t1.c1, t2.c1
-   ->  Sort
+   ->  Foreign Scan
          Output: t1.c1, t2.c1
-         Sort Key: t1.c1, t2.c1
-         ->  Foreign Scan
-               Output: t1.c1, t2.c1
-               Relations: (public.ft4 t1) LEFT JOIN (public.ft5 t2)
-               Remote SQL: SELECT r1.c1, r2.c1 FROM ("S 1"."T 3" r1 LEFT JOIN "S 1"."T 4" r2 ON (((r1.c1 = r2.c1))))
-(9 rows)
+         Relations: (public.ft4 t1) LEFT JOIN (public.ft5 t2)
+         Remote SQL: SELECT r1.c1, r2.c1 FROM ("S 1"."T 3" r1 LEFT JOIN "S 1"."T 4" r2 ON (((r1.c1 = r2.c1)))) ORDER BY r1.c1 ASC NULLS LAST, r2.c1 ASC NULLS LAST
+(6 rows)
 
 SELECT t1.c1, t2.c1 FROM ft4 t1 LEFT JOIN ft5 t2 ON (t1.c1 = t2.c1) ORDER BY t1.c1, t2.c1 OFFSET 10 LIMIT 10;
  c1 | c1 
 ----+----
  22 |   
  24 | 24
  26 |   
  28 |   
  30 | 30
  32 |   
@@ -1000,63 +1028,57 @@ SELECT t1.c1, t1.c2, t2.c1, t2.c2 FROM ft4 t1 LEFT JOIN (SELECT * FROM ft5 WHERE
 ----+----+----+----
   2 |  3 |    |   
   4 |  5 |    |   
   6 |  7 |  6 |  7
   8 |  9 |    |   
 (4 rows)
 
 -- right outer join
 EXPLAIN (COSTS false, VERBOSE)
 SELECT t1.c1, t2.c1 FROM ft5 t1 RIGHT JOIN ft4 t2 ON (t1.c1 = t2.c1) ORDER BY t2.c1, t1.c1 OFFSET 10 LIMIT 10;
-                                                     QUERY PLAN                                                      
----------------------------------------------------------------------------------------------------------------------
+                                                                            QUERY PLAN                                                                             
+-------------------------------------------------------------------------------------------------------------------------------------------------------------------
  Limit
    Output: t1.c1, t2.c1
-   ->  Sort
+   ->  Foreign Scan
          Output: t1.c1, t2.c1
-         Sort Key: t2.c1, t1.c1
-         ->  Foreign Scan
-               Output: t1.c1, t2.c1
-               Relations: (public.ft4 t2) LEFT JOIN (public.ft5 t1)
-               Remote SQL: SELECT r2.c1, r1.c1 FROM ("S 1"."T 3" r2 LEFT JOIN "S 1"."T 4" r1 ON (((r1.c1 = r2.c1))))
-(9 rows)
+         Relations: (public.ft4 t2) LEFT JOIN (public.ft5 t1)
+         Remote SQL: SELECT r2.c1, r1.c1 FROM ("S 1"."T 3" r2 LEFT JOIN "S 1"."T 4" r1 ON (((r1.c1 = r2.c1)))) ORDER BY r2.c1 ASC NULLS LAST, r1.c1 ASC NULLS LAST
+(6 rows)
 
 SELECT t1.c1, t2.c1 FROM ft5 t1 RIGHT JOIN ft4 t2 ON (t1.c1 = t2.c1) ORDER BY t2.c1, t1.c1 OFFSET 10 LIMIT 10;
  c1 | c1 
 ----+----
     | 22
  24 | 24
     | 26
     | 28
  30 | 30
     | 32
     | 34
  36 | 36
     | 38
     | 40
 (10 rows)
 
 -- full outer join
 EXPLAIN (COSTS false, VERBOSE)
 SELECT t1.c1, t2.c1 FROM ft4 t1 FULL JOIN ft5 t2 ON (t1.c1 = t2.c1) ORDER BY t1.c1, t2.c1 OFFSET 45 LIMIT 10;
-                                                     QUERY PLAN                                                      
----------------------------------------------------------------------------------------------------------------------
+                                                                            QUERY PLAN                                                                             
+-------------------------------------------------------------------------------------------------------------------------------------------------------------------
  Limit
    Output: t1.c1, t2.c1
-   ->  Sort
+   ->  Foreign Scan
          Output: t1.c1, t2.c1
-         Sort Key: t1.c1, t2.c1
-         ->  Foreign Scan
-               Output: t1.c1, t2.c1
-               Relations: (public.ft4 t1) FULL JOIN (public.ft5 t2)
-               Remote SQL: SELECT r1.c1, r2.c1 FROM ("S 1"."T 3" r1 FULL JOIN "S 1"."T 4" r2 ON (((r1.c1 = r2.c1))))
-(9 rows)
+         Relations: (public.ft4 t1) FULL JOIN (public.ft5 t2)
+         Remote SQL: SELECT r1.c1, r2.c1 FROM ("S 1"."T 3" r1 FULL JOIN "S 1"."T 4" r2 ON (((r1.c1 = r2.c1)))) ORDER BY r1.c1 ASC NULLS LAST, r2.c1 ASC NULLS LAST
+(6 rows)
 
 SELECT t1.c1, t2.c1 FROM ft4 t1 FULL JOIN ft5 t2 ON (t1.c1 = t2.c1) ORDER BY t1.c1, t2.c1 OFFSET 45 LIMIT 10;
  c1  | c1 
 -----+----
   92 |   
   94 |   
   96 | 96
   98 |   
  100 |   
      |  3
@@ -1094,191 +1116,179 @@ SELECT t1.c1, t2.c1 FROM ft4 t1 FULL JOIN ft5 t2 ON (t1.c1 = t2.c1) WHERE (t1.c1
     |  3
     |  9
     | 15
     | 21
 (10 rows)
 
 -- join two tables with FOR UPDATE clause
 -- tests whole-row reference for row marks
 EXPLAIN (COSTS false, VERBOSE)
 SELECT t1.c1, t2.c1 FROM ft1 t1 JOIN ft2 t2 ON (t1.c1 = t2.c1) ORDER BY t1.c3, t1.c1 OFFSET 100 LIMIT 10 FOR UPDATE OF t1;
-                                                                                                                                                QUERY PLAN                                                                                                                                                 
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+                                                                                                                                                                         QUERY PLAN                                                                                                                                                                         
+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
  Limit
    Output: t1.c1, t2.c1, t1.c3, t1.*, t2.*
    ->  LockRows
          Output: t1.c1, t2.c1, t1.c3, t1.*, t2.*
-         ->  Sort
+         ->  Foreign Scan
                Output: t1.c1, t2.c1, t1.c3, t1.*, t2.*
-               Sort Key: t1.c3, t1.c1
-               ->  Foreign Scan
-                     Output: t1.c1, t2.c1, t1.c3, t1.*, t2.*
-                     Relations: (public.ft1 t1) INNER JOIN (public.ft2 t2)
-                     Remote SQL: SELECT r1."C 1", r1.c3, ROW(r1."C 1", r1.c2, r1.c3, r1.c4, r1.c5, r1.c6, r1.c7, r1.c8), r2."C 1", ROW(r2."C 1", r2.c2, r2.c3, r2.c4, r2.c5, r2.c6, r2.c7, r2.c8) FROM ("S 1"."T 1" r1 INNER JOIN "S 1"."T 1" r2 ON (TRUE)) WHERE ((r1."C 1" = r2."C 1")) FOR UPDATE OF r1
-                     ->  Merge Join
-                           Output: t1.c1, t1.c3, t1.*, t2.c1, t2.*
-                           Merge Cond: (t1.c1 = t2.c1)
-                           ->  Sort
+               Relations: (public.ft1 t1) INNER JOIN (public.ft2 t2)
+               Remote SQL: SELECT r1."C 1", r1.c3, ROW(r1."C 1", r1.c2, r1.c3, r1.c4, r1.c5, r1.c6, r1.c7, r1.c8), r2."C 1", ROW(r2."C 1", r2.c2, r2.c3, r2.c4, r2.c5, r2.c6, r2.c7, r2.c8) FROM ("S 1"."T 1" r1 INNER JOIN "S 1"."T 1" r2 ON (TRUE)) WHERE ((r1."C 1" = r2."C 1")) ORDER BY r1.c3 ASC NULLS LAST, r1."C 1" ASC NULLS LAST FOR UPDATE OF r1
+               ->  Merge Join
+                     Output: t1.c1, t1.c3, t1.*, t2.c1, t2.*
+                     Merge Cond: (t1.c1 = t2.c1)
+                     ->  Sort
+                           Output: t1.c1, t1.c3, t1.*
+                           Sort Key: t1.c1
+                           ->  Foreign Scan on public.ft1 t1
                                  Output: t1.c1, t1.c3, t1.*
-                                 Sort Key: t1.c1
-                                 ->  Foreign Scan on public.ft1 t1
-                                       Output: t1.c1, t1.c3, t1.*
-                                       Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" FOR UPDATE
-                           ->  Sort
+                                 Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" FOR UPDATE
+                     ->  Sort
+                           Output: t2.c1, t2.*
+                           Sort Key: t2.c1
+                           ->  Foreign Scan on public.ft2 t2
                                  Output: t2.c1, t2.*
-                                 Sort Key: t2.c1
-                                 ->  Foreign Scan on public.ft2 t2
-                                       Output: t2.c1, t2.*
-                                       Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1"
-(26 rows)
+                                 Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1"
+(23 rows)
 
 SELECT t1.c1, t2.c1 FROM ft1 t1 JOIN ft2 t2 ON (t1.c1 = t2.c1) ORDER BY t1.c3, t1.c1 OFFSET 100 LIMIT 10 FOR UPDATE OF t1;
  c1  | c1  
 -----+-----
  101 | 101
  102 | 102
  103 | 103
  104 | 104
  105 | 105
  106 | 106
  107 | 107
  108 | 108
  109 | 109
  110 | 110
 (10 rows)
 
 EXPLAIN (COSTS false, VERBOSE)
 SELECT t1.c1, t2.c1 FROM ft1 t1 JOIN ft2 t2 ON (t1.c1 = t2.c1) ORDER BY t1.c3, t1.c1 OFFSET 100 LIMIT 10 FOR UPDATE;
-                                                                                                                                                         QUERY PLAN                                                                                                                                                         
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+                                                                                                                                                                                 QUERY PLAN                                                                                                                                                                                  
+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
  Limit
    Output: t1.c1, t2.c1, t1.c3, t1.*, t2.*
    ->  LockRows
          Output: t1.c1, t2.c1, t1.c3, t1.*, t2.*
-         ->  Sort
+         ->  Foreign Scan
                Output: t1.c1, t2.c1, t1.c3, t1.*, t2.*
-               Sort Key: t1.c3, t1.c1
-               ->  Foreign Scan
-                     Output: t1.c1, t2.c1, t1.c3, t1.*, t2.*
-                     Relations: (public.ft1 t1) INNER JOIN (public.ft2 t2)
-                     Remote SQL: SELECT r1."C 1", r1.c3, ROW(r1."C 1", r1.c2, r1.c3, r1.c4, r1.c5, r1.c6, r1.c7, r1.c8), r2."C 1", ROW(r2."C 1", r2.c2, r2.c3, r2.c4, r2.c5, r2.c6, r2.c7, r2.c8) FROM ("S 1"."T 1" r1 INNER JOIN "S 1"."T 1" r2 ON (TRUE)) WHERE ((r1."C 1" = r2."C 1")) FOR UPDATE OF r1 FOR UPDATE OF r2
-                     ->  Merge Join
-                           Output: t1.c1, t1.c3, t1.*, t2.c1, t2.*
-                           Merge Cond: (t1.c1 = t2.c1)
-                           ->  Sort
+               Relations: (public.ft1 t1) INNER JOIN (public.ft2 t2)
+               Remote SQL: SELECT r1."C 1", r1.c3, ROW(r1."C 1", r1.c2, r1.c3, r1.c4, r1.c5, r1.c6, r1.c7, r1.c8), r2."C 1", ROW(r2."C 1", r2.c2, r2.c3, r2.c4, r2.c5, r2.c6, r2.c7, r2.c8) FROM ("S 1"."T 1" r1 INNER JOIN "S 1"."T 1" r2 ON (TRUE)) WHERE ((r1."C 1" = r2."C 1")) ORDER BY r1.c3 ASC NULLS LAST, r1."C 1" ASC NULLS LAST FOR UPDATE OF r1 FOR UPDATE OF r2
+               ->  Merge Join
+                     Output: t1.c1, t1.c3, t1.*, t2.c1, t2.*
+                     Merge Cond: (t1.c1 = t2.c1)
+                     ->  Sort
+                           Output: t1.c1, t1.c3, t1.*
+                           Sort Key: t1.c1
+                           ->  Foreign Scan on public.ft1 t1
                                  Output: t1.c1, t1.c3, t1.*
-                                 Sort Key: t1.c1
-                                 ->  Foreign Scan on public.ft1 t1
-                                       Output: t1.c1, t1.c3, t1.*
-                                       Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" FOR UPDATE
-                           ->  Sort
+                                 Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" FOR UPDATE
+                     ->  Sort
+                           Output: t2.c1, t2.*
+                           Sort Key: t2.c1
+                           ->  Foreign Scan on public.ft2 t2
                                  Output: t2.c1, t2.*
-                                 Sort Key: t2.c1
-                                 ->  Foreign Scan on public.ft2 t2
-                                       Output: t2.c1, t2.*
-                                       Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" FOR UPDATE
-(26 rows)
+                                 Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" FOR UPDATE
+(23 rows)
 
 SELECT t1.c1, t2.c1 FROM ft1 t1 JOIN ft2 t2 ON (t1.c1 = t2.c1) ORDER BY t1.c3, t1.c1 OFFSET 100 LIMIT 10 FOR UPDATE;
  c1  | c1  
 -----+-----
  101 | 101
  102 | 102
  103 | 103
  104 | 104
  105 | 105
  106 | 106
  107 | 107
  108 | 108
  109 | 109
  110 | 110
 (10 rows)
 
 -- join two tables with FOR SHARE clause
 EXPLAIN (COSTS false, VERBOSE)
 SELECT t1.c1, t2.c1 FROM ft1 t1 JOIN ft2 t2 ON (t1.c1 = t2.c1) ORDER BY t1.c3, t1.c1 OFFSET 100 LIMIT 10 FOR SHARE OF t1;
-                                                                                                                                                QUERY PLAN                                                                                                                                                
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+                                                                                                                                                                        QUERY PLAN                                                                                                                                                                         
+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
  Limit
    Output: t1.c1, t2.c1, t1.c3, t1.*, t2.*
    ->  LockRows
          Output: t1.c1, t2.c1, t1.c3, t1.*, t2.*
-         ->  Sort
+         ->  Foreign Scan
                Output: t1.c1, t2.c1, t1.c3, t1.*, t2.*
-               Sort Key: t1.c3, t1.c1
-               ->  Foreign Scan
-                     Output: t1.c1, t2.c1, t1.c3, t1.*, t2.*
-                     Relations: (public.ft1 t1) INNER JOIN (public.ft2 t2)
-                     Remote SQL: SELECT r1."C 1", r1.c3, ROW(r1."C 1", r1.c2, r1.c3, r1.c4, r1.c5, r1.c6, r1.c7, r1.c8), r2."C 1", ROW(r2."C 1", r2.c2, r2.c3, r2.c4, r2.c5, r2.c6, r2.c7, r2.c8) FROM ("S 1"."T 1" r1 INNER JOIN "S 1"."T 1" r2 ON (TRUE)) WHERE ((r1."C 1" = r2."C 1")) FOR SHARE OF r1
-                     ->  Merge Join
-                           Output: t1.c1, t1.c3, t1.*, t2.c1, t2.*
-                           Merge Cond: (t1.c1 = t2.c1)
-                           ->  Sort
+               Relations: (public.ft1 t1) INNER JOIN (public.ft2 t2)
+               Remote SQL: SELECT r1."C 1", r1.c3, ROW(r1."C 1", r1.c2, r1.c3, r1.c4, r1.c5, r1.c6, r1.c7, r1.c8), r2."C 1", ROW(r2."C 1", r2.c2, r2.c3, r2.c4, r2.c5, r2.c6, r2.c7, r2.c8) FROM ("S 1"."T 1" r1 INNER JOIN "S 1"."T 1" r2 ON (TRUE)) WHERE ((r1."C 1" = r2."C 1")) ORDER BY r1.c3 ASC NULLS LAST, r1."C 1" ASC NULLS LAST FOR SHARE OF r1
+               ->  Merge Join
+                     Output: t1.c1, t1.c3, t1.*, t2.c1, t2.*
+                     Merge Cond: (t1.c1 = t2.c1)
+                     ->  Sort
+                           Output: t1.c1, t1.c3, t1.*
+                           Sort Key: t1.c1
+                           ->  Foreign Scan on public.ft1 t1
                                  Output: t1.c1, t1.c3, t1.*
-                                 Sort Key: t1.c1
-                                 ->  Foreign Scan on public.ft1 t1
-                                       Output: t1.c1, t1.c3, t1.*
-                                       Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" FOR SHARE
-                           ->  Sort
+                                 Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" FOR SHARE
+                     ->  Sort
+                           Output: t2.c1, t2.*
+                           Sort Key: t2.c1
+                           ->  Foreign Scan on public.ft2 t2
                                  Output: t2.c1, t2.*
-                                 Sort Key: t2.c1
-                                 ->  Foreign Scan on public.ft2 t2
-                                       Output: t2.c1, t2.*
-                                       Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1"
-(26 rows)
+                                 Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1"
+(23 rows)
 
 SELECT t1.c1, t2.c1 FROM ft1 t1 JOIN ft2 t2 ON (t1.c1 = t2.c1) ORDER BY t1.c3, t1.c1 OFFSET 100 LIMIT 10 FOR SHARE OF t1;
  c1  | c1  
 -----+-----
  101 | 101
  102 | 102
  103 | 103
  104 | 104
  105 | 105
  106 | 106
  107 | 107
  108 | 108
  109 | 109
  110 | 110
 (10 rows)
 
 EXPLAIN (COSTS false, VERBOSE)
 SELECT t1.c1, t2.c1 FROM ft1 t1 JOIN ft2 t2 ON (t1.c1 = t2.c1) ORDER BY t1.c3, t1.c1 OFFSET 100 LIMIT 10 FOR SHARE;
-                                                                                                                                                        QUERY PLAN                                                                                                                                                        
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+                                                                                                                                                                                QUERY PLAN                                                                                                                                                                                 
+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
  Limit
    Output: t1.c1, t2.c1, t1.c3, t1.*, t2.*
    ->  LockRows
          Output: t1.c1, t2.c1, t1.c3, t1.*, t2.*
-         ->  Sort
+         ->  Foreign Scan
                Output: t1.c1, t2.c1, t1.c3, t1.*, t2.*
-               Sort Key: t1.c3, t1.c1
-               ->  Foreign Scan
-                     Output: t1.c1, t2.c1, t1.c3, t1.*, t2.*
-                     Relations: (public.ft1 t1) INNER JOIN (public.ft2 t2)
-                     Remote SQL: SELECT r1."C 1", r1.c3, ROW(r1."C 1", r1.c2, r1.c3, r1.c4, r1.c5, r1.c6, r1.c7, r1.c8), r2."C 1", ROW(r2."C 1", r2.c2, r2.c3, r2.c4, r2.c5, r2.c6, r2.c7, r2.c8) FROM ("S 1"."T 1" r1 INNER JOIN "S 1"."T 1" r2 ON (TRUE)) WHERE ((r1."C 1" = r2."C 1")) FOR SHARE OF r1 FOR SHARE OF r2
-                     ->  Merge Join
-                           Output: t1.c1, t1.c3, t1.*, t2.c1, t2.*
-                           Merge Cond: (t1.c1 = t2.c1)
-                           ->  Sort
+               Relations: (public.ft1 t1) INNER JOIN (public.ft2 t2)
+               Remote SQL: SELECT r1."C 1", r1.c3, ROW(r1."C 1", r1.c2, r1.c3, r1.c4, r1.c5, r1.c6, r1.c7, r1.c8), r2."C 1", ROW(r2."C 1", r2.c2, r2.c3, r2.c4, r2.c5, r2.c6, r2.c7, r2.c8) FROM ("S 1"."T 1" r1 INNER JOIN "S 1"."T 1" r2 ON (TRUE)) WHERE ((r1."C 1" = r2."C 1")) ORDER BY r1.c3 ASC NULLS LAST, r1."C 1" ASC NULLS LAST FOR SHARE OF r1 FOR SHARE OF r2
+               ->  Merge Join
+                     Output: t1.c1, t1.c3, t1.*, t2.c1, t2.*
+                     Merge Cond: (t1.c1 = t2.c1)
+                     ->  Sort
+                           Output: t1.c1, t1.c3, t1.*
+                           Sort Key: t1.c1
+                           ->  Foreign Scan on public.ft1 t1
                                  Output: t1.c1, t1.c3, t1.*
-                                 Sort Key: t1.c1
-                                 ->  Foreign Scan on public.ft1 t1
-                                       Output: t1.c1, t1.c3, t1.*
-                                       Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" FOR SHARE
-                           ->  Sort
+                                 Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" FOR SHARE
+                     ->  Sort
+                           Output: t2.c1, t2.*
+                           Sort Key: t2.c1
+                           ->  Foreign Scan on public.ft2 t2
                                  Output: t2.c1, t2.*
-                                 Sort Key: t2.c1
-                                 ->  Foreign Scan on public.ft2 t2
-                                       Output: t2.c1, t2.*
-                                       Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" FOR SHARE
-(26 rows)
+                                 Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" FOR SHARE
+(23 rows)
 
 SELECT t1.c1, t2.c1 FROM ft1 t1 JOIN ft2 t2 ON (t1.c1 = t2.c1) ORDER BY t1.c3, t1.c1 OFFSET 100 LIMIT 10 FOR SHARE;
  c1  | c1  
 -----+-----
  101 | 101
  102 | 102
  103 | 103
  104 | 104
  105 | 105
  106 | 106
@@ -1318,32 +1328,29 @@ WITH t (c1_1, c1_3, c2_1) AS (SELECT t1.c1, t1.c3, t2.c1 FROM ft1 t1 JOIN ft2 t2
   106 |  106
   107 |  107
   108 |  108
   109 |  109
   110 |  110
 (10 rows)
 
 -- ctid with whole-row reference
 EXPLAIN (COSTS false, VERBOSE)
 SELECT t1.ctid, t1, t2, t1.c1 FROM ft1 t1 JOIN ft2 t2 ON (t1.c1 = t2.c1) ORDER BY t1.c3, t1.c1 OFFSET 100 LIMIT 10;
-                                                                                                                                    QUERY PLAN                                                                                                                                     
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+                                                                                                                                                             QUERY PLAN                                                                                                                                                             
+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
  Limit
    Output: t1.ctid, t1.*, t2.*, t1.c1, t1.c3
-   ->  Sort
+   ->  Foreign Scan
          Output: t1.ctid, t1.*, t2.*, t1.c1, t1.c3
-         Sort Key: t1.c3, t1.c1
-         ->  Foreign Scan
-               Output: t1.ctid, t1.*, t2.*, t1.c1, t1.c3
-               Relations: (public.ft1 t1) INNER JOIN (public.ft2 t2)
-               Remote SQL: SELECT r1.ctid, ROW(r1."C 1", r1.c2, r1.c3, r1.c4, r1.c5, r1.c6, r1.c7, r1.c8), r1."C 1", r1.c3, ROW(r2."C 1", r2.c2, r2.c3, r2.c4, r2.c5, r2.c6, r2.c7, r2.c8) FROM ("S 1"."T 1" r1 INNER JOIN "S 1"."T 1" r2 ON (TRUE)) WHERE ((r1."C 1" = r2."C 1"))
-(9 rows)
+         Relations: (public.ft1 t1) INNER JOIN (public.ft2 t2)
+         Remote SQL: SELECT r1.ctid, ROW(r1."C 1", r1.c2, r1.c3, r1.c4, r1.c5, r1.c6, r1.c7, r1.c8), r1."C 1", r1.c3, ROW(r2."C 1", r2.c2, r2.c3, r2.c4, r2.c5, r2.c6, r2.c7, r2.c8) FROM ("S 1"."T 1" r1 INNER JOIN "S 1"."T 1" r2 ON (TRUE)) WHERE ((r1."C 1" = r2."C 1")) ORDER BY r1.c3 ASC NULLS LAST, r1."C 1" ASC NULLS LAST
+(6 rows)
 
 -- SEMI JOIN, not pushed down
 EXPLAIN (COSTS false, VERBOSE)
 SELECT t1.c1 FROM ft1 t1 WHERE EXISTS (SELECT 1 FROM ft2 t2 WHERE t1.c1 = t2.c1) ORDER BY t1.c1 OFFSET 100 LIMIT 10;
                                          QUERY PLAN                                          
 ---------------------------------------------------------------------------------------------
  Limit
    Output: t1.c1
    ->  Merge Semi Join
          Output: t1.c1
@@ -1665,32 +1672,29 @@ SELECT t1."C 1" FROM "S 1"."T 1" t1, LATERAL (SELECT DISTINCT t2.c1, t3.c1 FROM
 (10 rows)
 
 -- create another user for permission, user mapping, effective user tests
 CREATE USER view_owner;
 -- grant privileges on ft4 and ft5 to view_owner
 GRANT ALL ON ft4 TO view_owner;
 GRANT ALL ON ft5 TO view_owner;
 -- prepare statement with current session user
 PREPARE join_stmt AS SELECT t1.c1, t2.c1 FROM ft4 t1 LEFT JOIN ft5 t2 ON (t1.c1 = t2.c1) ORDER BY t1.c1, t2.c1 OFFSET 10 LIMIT 10;
 EXPLAIN (COSTS OFF, VERBOSE) EXECUTE join_stmt;
-                                                     QUERY PLAN                                                      
----------------------------------------------------------------------------------------------------------------------
+                                                                            QUERY PLAN                                                                             
+-------------------------------------------------------------------------------------------------------------------------------------------------------------------
  Limit
    Output: t1.c1, t2.c1
-   ->  Sort
+   ->  Foreign Scan
          Output: t1.c1, t2.c1
-         Sort Key: t1.c1, t2.c1
-         ->  Foreign Scan
-               Output: t1.c1, t2.c1
-               Relations: (public.ft4 t1) LEFT JOIN (public.ft5 t2)
-               Remote SQL: SELECT r1.c1, r2.c1 FROM ("S 1"."T 3" r1 LEFT JOIN "S 1"."T 4" r2 ON (((r1.c1 = r2.c1))))
-(9 rows)
+         Relations: (public.ft4 t1) LEFT JOIN (public.ft5 t2)
+         Remote SQL: SELECT r1.c1, r2.c1 FROM ("S 1"."T 3" r1 LEFT JOIN "S 1"."T 4" r2 ON (((r1.c1 = r2.c1)))) ORDER BY r1.c1 ASC NULLS LAST, r2.c1 ASC NULLS LAST
+(6 rows)
 
 EXECUTE join_stmt;
  c1 | c1 
 ----+----
  22 |   
  24 | 24
  26 |   
  28 |   
  30 | 30
  32 |   
@@ -1716,32 +1720,29 @@ CREATE VIEW v_ft5 AS SELECT * FROM ft5;
 -- ft5 is view_owner and not the current user.
 ALTER VIEW v_ft5 OWNER TO view_owner;
 -- create a public user mapping for loopback server
 -- drop user mapping for current_user.
 DROP USER MAPPING FOR CURRENT_USER SERVER loopback;
 CREATE USER MAPPING FOR PUBLIC SERVER loopback;
 -- different effective user for permission check, but same user mapping for the
 -- joining sides, join pushed down, no result expected.
 PREPARE join_stmt AS SELECT t1.c1, t2.c1 FROM ft5 t1 JOIN v_ft5 t2 ON (t1.c1 = t2.c1) ORDER BY t1.c1, t2.c1 OFFSET 100 LIMIT 10;
 EXPLAIN (COSTS false, VERBOSE) EXECUTE join_stmt;
-                                                           QUERY PLAN                                                            
----------------------------------------------------------------------------------------------------------------------------------
+                                                                       QUERY PLAN                                                                        
+---------------------------------------------------------------------------------------------------------------------------------------------------------
  Limit
    Output: t1.c1, ft5.c1
-   ->  Sort
+   ->  Foreign Scan
          Output: t1.c1, ft5.c1
-         Sort Key: t1.c1
-         ->  Foreign Scan
-               Output: t1.c1, ft5.c1
-               Relations: (public.ft5 t1) INNER JOIN (public.ft5)
-               Remote SQL: SELECT r1.c1, r6.c1 FROM ("S 1"."T 4" r1 INNER JOIN "S 1"."T 4" r6 ON (TRUE)) WHERE ((r1.c1 = r6.c1))
-(9 rows)
+         Relations: (public.ft5 t1) INNER JOIN (public.ft5)
+         Remote SQL: SELECT r1.c1, r6.c1 FROM ("S 1"."T 4" r1 INNER JOIN "S 1"."T 4" r6 ON (TRUE)) WHERE ((r1.c1 = r6.c1)) ORDER BY r1.c1 ASC NULLS LAST
+(6 rows)
 
 EXECUTE join_stmt;
  c1 | c1 
 ----+----
 (0 rows)
 
 -- create user mapping for view_owner and execute the prepared statement
 -- the join should not be pushed down since joining relations now use two
 -- different user mappings
 CREATE USER MAPPING FOR view_owner SERVER loopback;
diff --git a/contrib/postgres_fdw/postgres_fdw.c b/contrib/postgres_fdw/postgres_fdw.c
index 76d0e15..40bffd6 100644
--- a/contrib/postgres_fdw/postgres_fdw.c
+++ b/contrib/postgres_fdw/postgres_fdw.c
@@ -276,23 +276,20 @@ static bool postgresAnalyzeForeignTable(Relation relation,
 static List *postgresImportForeignSchema(ImportForeignSchemaStmt *stmt,
 							Oid serverOid);
 static void postgresGetForeignJoinPaths(PlannerInfo *root,
 							RelOptInfo *joinrel,
 							RelOptInfo *outerrel,
 							RelOptInfo *innerrel,
 							JoinType jointype,
 							JoinPathExtraData *extra);
 static bool postgresRecheckForeignScan(ForeignScanState *node,
 						   TupleTableSlot *slot);
-static List *get_useful_pathkeys_for_relation(PlannerInfo *root,
-								 RelOptInfo *rel);
-static List *get_useful_ecs_for_relation(PlannerInfo *root, RelOptInfo *rel);
 
 /*
  * Helper functions
  */
 static void estimate_path_cost_size(PlannerInfo *root,
 						RelOptInfo *baserel,
 						List *join_conds,
 						List *pathkeys,
 						double *p_rows, int *p_width,
 						Cost *p_startup_cost, Cost *p_total_cost);
@@ -324,20 +321,25 @@ static HeapTuple make_tuple_from_result_row(PGresult *res,
 						   int row,
 						   Relation rel,
 						   AttInMetadata *attinmeta,
 						   List *retrieved_attrs,
 						   ForeignScanState *fsstate,
 						   MemoryContext temp_context);
 static void conversion_error_callback(void *arg);
 static bool foreign_join_ok(PlannerInfo *root, RelOptInfo *joinrel,
 				JoinType jointype, RelOptInfo *outerrel, RelOptInfo *innerrel,
 				JoinPathExtraData *extra);
+static List *get_useful_pathkeys_for_relation(PlannerInfo *root,
+								 RelOptInfo *rel);
+static List *get_useful_ecs_for_relation(PlannerInfo *root, RelOptInfo *rel);
+static void add_paths_with_pathkeys_for_rel(PlannerInfo *root, RelOptInfo *rel,
+								Path *epq_path);
 
 
 /*
  * Foreign-data wrapper handler function: return a struct with pointers
  * to my callback routines.
  */
 Datum
 postgres_fdw_handler(PG_FUNCTION_ARGS)
 {
 	FdwRoutine *routine = makeNode(FdwRoutine);
@@ -496,20 +498,28 @@ postgresGetForeignRelSize(PlannerInfo *root,
 	 */
 	fpinfo->local_conds_sel = clauselist_selectivity(root,
 													 fpinfo->local_conds,
 													 baserel->relid,
 													 JOIN_INNER,
 													 NULL);
 
 	cost_qual_eval(&fpinfo->local_conds_cost, fpinfo->local_conds, root);
 
 	/*
+	 * Set cached relation costs to some negative value, so that we can detect
+	 * when they are set to some sensible costs during one (usually the first)
+	 * of the calls to estimate_path_cost_size().
+	 */
+	fpinfo->rel_startup_cost = -1;
+	fpinfo->rel_total_cost = -1;
+
+	/*
 	 * If the table or the server is configured to use remote estimates,
 	 * connect to the foreign server and execute EXPLAIN to estimate the
 	 * number of rows selected by the restriction clauses, as well as the
 	 * average row width.  Otherwise, estimate using whatever statistics we
 	 * have locally, in a way similar to ordinary tables.
 	 */
 	if (fpinfo->use_remote_estimate)
 	{
 		/*
 		 * Get cost/size estimates with help of remote server.  Save the
@@ -767,63 +777,40 @@ get_useful_pathkeys_for_relation(PlannerInfo *root, RelOptInfo *rel)
  */
 static void
 postgresGetForeignPaths(PlannerInfo *root,
 						RelOptInfo *baserel,
 						Oid foreigntableid)
 {
 	PgFdwRelationInfo *fpinfo = (PgFdwRelationInfo *) baserel->fdw_private;
 	ForeignPath *path;
 	List	   *ppi_list;
 	ListCell   *lc;
-	List	   *useful_pathkeys_list = NIL;		/* List of all pathkeys */
 
 	/*
 	 * Create simplest ForeignScan path node and add it to baserel.  This path
 	 * corresponds to SeqScan path of regular tables (though depending on what
 	 * baserestrict conditions we were able to send to remote, there might
 	 * actually be an indexscan happening there).  We already did all the work
 	 * to estimate cost and size of this path.
 	 */
 	path = create_foreignscan_path(root, baserel,
 								   fpinfo->rows,
 								   fpinfo->startup_cost,
 								   fpinfo->total_cost,
 								   NIL, /* no pathkeys */
 								   NULL,		/* no outer rel either */
 								   NULL,		/* no extra plan */
 								   NIL);		/* no fdw_private list */
 	add_path(baserel, (Path *) path);
 
-	useful_pathkeys_list = get_useful_pathkeys_for_relation(root, baserel);
-
-	/* Create one path for each set of pathkeys we found above. */
-	foreach(lc, useful_pathkeys_list)
-	{
-		double		rows;
-		int			width;
-		Cost		startup_cost;
-		Cost		total_cost;
-		List	   *useful_pathkeys = lfirst(lc);
-
-		estimate_path_cost_size(root, baserel, NIL, useful_pathkeys,
-								&rows, &width, &startup_cost, &total_cost);
-
-		add_path(baserel, (Path *)
-				 create_foreignscan_path(root, baserel,
-										 rows,
-										 startup_cost,
-										 total_cost,
-										 useful_pathkeys,
-										 NULL,
-										 NULL,
-										 NIL));
-	}
+	/* Add paths with pathkeys */
+	add_paths_with_pathkeys_for_rel(root, baserel, NULL);
 
 	/*
 	 * If we're not using remote estimates, stop here.  We have no way to
 	 * estimate whether any join clauses would be worth sending across, so
 	 * don't bother building parameterized paths.
 	 */
 	if (!fpinfo->use_remote_estimate)
 		return;
 
 	/*
@@ -2175,21 +2162,32 @@ estimate_path_cost_size(PlannerInfo *root,
 		 * Use rows/width estimates made by set_baserel_size_estimates() for
 		 * base foreign relations and set_joinrel_size_estimates() for join
 		 * between foreign relations.
 		 */
 		rows = foreignrel->rows;
 		width = foreignrel->reltarget.width;
 
 		/* Back into an estimate of the number of retrieved rows. */
 		retrieved_rows = clamp_row_est(rows / fpinfo->local_conds_sel);
 
-		if (foreignrel->reloptkind != RELOPT_JOINREL)
+		/*
+		 * We will come here again and again with different set of pathkeys
+		 * that caller wants to cost. We don't need to calculate the cost of
+		 * bare scan each time. Instead, use the costs if we have cached them
+		 * already.
+		 */
+		if (fpinfo->rel_startup_cost > 0 && fpinfo->rel_total_cost > 0)
+		{
+			startup_cost = fpinfo->rel_startup_cost;
+			run_cost = fpinfo->rel_total_cost - fpinfo->rel_startup_cost;
+		}
+		else if (foreignrel->reloptkind != RELOPT_JOINREL)
 		{
 			/* Clamp retrieved rows estimates to at most foreignrel->tuples. */
 			retrieved_rows = Min(retrieved_rows, foreignrel->tuples);
 
 			/*
 			 * Cost as though this were a seqscan, which is pessimistic.  We
 			 * effectively imagine the local_conds are being evaluated
 			 * remotely, too.
 			 */
 			startup_cost = 0;
@@ -2277,27 +2275,33 @@ estimate_path_cost_size(PlannerInfo *root,
 		if (pathkeys != NIL)
 		{
 			startup_cost *= DEFAULT_FDW_SORT_MULTIPLIER;
 			run_cost *= DEFAULT_FDW_SORT_MULTIPLIER;
 		}
 
 		total_cost = startup_cost + run_cost;
 	}
 
 	/*
-	 * Cache the costs prior to adding the costs for transferring data from
-	 * the foreign server. These costs are useful for costing the join between
-	 * this relation and another foreign relation, when the cost of join can
-	 * not be obtained from the foreign server.
+	 * Cache the costs for scans without any pathkeys or parameterization
+	 * before adding the costs for transferring data from the foreign server.
+	 * These costs are useful for costing the join between this relation and
+	 * another foreign relation or to calculate the costs of paths with
+	 * pathkeys for this relation, when the costs can not be obtained from the
+	 * foreign server. This function will be called at least once for every
+	 * foreign relation without pathkeys and parameterization.
 	 */
-	fpinfo->rel_startup_cost = startup_cost;
-	fpinfo->rel_total_cost = total_cost;
+	if (pathkeys == NIL && param_join_conds == NIL)
+	{
+		fpinfo->rel_startup_cost = startup_cost;
+		fpinfo->rel_total_cost = total_cost;
+	}
 
 	/*
 	 * Add some additional cost factors to account for connection overhead
 	 * (fdw_startup_cost), transferring data across the network
 	 * (fdw_tuple_cost per retrieved row), and local manipulation of the data
 	 * (cpu_tuple_cost per retrieved row).
 	 */
 	startup_cost += fpinfo->fdw_startup_cost;
 	total_cost += fpinfo->fdw_startup_cost;
 	total_cost += fpinfo->fdw_tuple_cost * retrieved_rows;
@@ -3451,20 +3455,28 @@ foreign_join_ok(PlannerInfo *root, RelOptInfo *joinrel, JoinType jointype,
 		fpinfo_i->use_remote_estimate;
 
 	/*
 	 * Since both the joining relations come from the same server, the server
 	 * level options should have same value for both the relations. Pick from
 	 * any side.
 	 */
 	fpinfo->fdw_startup_cost = fpinfo_o->fdw_startup_cost;
 	fpinfo->fdw_tuple_cost = fpinfo_o->fdw_tuple_cost;
 
+	/*
+	 * Set cached relation costs to some negative value, so that we can detect
+	 * when they are set to some sensible costs, during one (usually the
+	 * first) of the calls to estimate_path_cost_size().
+	 */
+	fpinfo->rel_startup_cost = -1;
+	fpinfo->rel_total_cost = -1;
+
 	/* Mark that this join can be pushed down safely */
 	fpinfo->pushdown_safe = true;
 
 	/*
 	 * Set fetch size to maximum of the joining sides, since we are expecting
 	 * the rows returned by the join to be proportional to the relation sizes.
 	 */
 	if (fpinfo_o->fetch_size > fpinfo_i->fetch_size)
 		fpinfo->fetch_size = fpinfo_o->fetch_size;
 	else
@@ -3525,20 +3537,53 @@ foreign_join_ok(PlannerInfo *root, RelOptInfo *joinrel, JoinType jointype,
 	 */
 	fpinfo->relation_name = makeStringInfo();
 	appendStringInfo(fpinfo->relation_name, "(%s) %s JOIN (%s)",
 					 fpinfo_o->relation_name->data,
 					 get_jointype_name(fpinfo->jointype),
 					 fpinfo_i->relation_name->data);
 
 	return true;
 }
 
+static void
+add_paths_with_pathkeys_for_rel(PlannerInfo *root, RelOptInfo *rel,
+								Path *epq_path)
+{
+	List	   *useful_pathkeys_list = NIL;		/* List of all pathkeys */
+	ListCell   *lc;
+
+	useful_pathkeys_list = get_useful_pathkeys_for_relation(root, rel);
+
+	/* Create one path for each set of pathkeys we found above. */
+	foreach(lc, useful_pathkeys_list)
+	{
+		double		rows;
+		int			width;
+		Cost		startup_cost;
+		Cost		total_cost;
+		List	   *useful_pathkeys = lfirst(lc);
+
+		estimate_path_cost_size(root, rel, NIL, useful_pathkeys,
+								&rows, &width, &startup_cost, &total_cost);
+
+		add_path(rel, (Path *)
+				 create_foreignscan_path(root, rel,
+										 rows,
+										 startup_cost,
+										 total_cost,
+										 useful_pathkeys,
+										 NULL,
+										 epq_path,
+										 NIL));
+	}
+}
+
 /*
  * postgresGetForeignJoinPaths
  *		Add possible ForeignPath to joinrel, if join is safe to push down.
  */
 static void
 postgresGetForeignJoinPaths(PlannerInfo *root,
 							RelOptInfo *joinrel,
 							RelOptInfo *outerrel,
 							RelOptInfo *innerrel,
 							JoinType jointype,
@@ -3663,21 +3708,22 @@ postgresGetForeignJoinPaths(PlannerInfo *root,
 									   startup_cost,
 									   total_cost,
 									   NIL,		/* no pathkeys */
 									   NULL,	/* no required_outer */
 									   epq_path,
 									   NULL);	/* no fdw_private */
 
 	/* Add generated path into joinrel by add_path(). */
 	add_path(joinrel, (Path *) joinpath);
 
-	/* XXX Consider pathkeys for the join relation */
+	/* Consider pathkeys for the join relation */
+	add_paths_with_pathkeys_for_rel(root, joinrel, epq_path);
 
 	/* XXX Consider parameterized paths for the join relation */
 }
 
 /*
  * Create a tuple from the specified row of the PGresult.
  *
  * rel is the local representation of the foreign table, attinmeta is
  * conversion data for the rel's tupdesc, and retrieved_attrs is an
  * integer list of the table column numbers present in the PGresult.
@@ -3870,21 +3916,21 @@ conversion_error_callback(void *arg)
  */
 extern Expr *
 find_em_expr_for_rel(EquivalenceClass *ec, RelOptInfo *rel)
 {
 	ListCell   *lc_em;
 
 	foreach(lc_em, ec->ec_members)
 	{
 		EquivalenceMember *em = lfirst(lc_em);
 
-		if (bms_equal(em->em_relids, rel->relids))
+		if (bms_is_subset(em->em_relids, rel->relids))
 		{
 			/*
 			 * If there is more than one equivalence member whose Vars are
 			 * taken entirely from this relation, we'll be content to choose
 			 * any one of those.
 			 */
 			return em->em_expr;
 		}
 	}
 
diff --git a/contrib/postgres_fdw/sql/postgres_fdw.sql b/contrib/postgres_fdw/sql/postgres_fdw.sql
index 95e00eb..4b88a30 100644
--- a/contrib/postgres_fdw/sql/postgres_fdw.sql
+++ b/contrib/postgres_fdw/sql/postgres_fdw.sql
@@ -230,20 +230,25 @@ SET enable_hashjoin TO false;
 SET enable_nestloop TO false;
 -- inner join; expressions in the clauses appear in the equivalence class list
 EXPLAIN (VERBOSE, COSTS false)
 	SELECT t1.c1, t2."C 1" FROM ft2 t1 JOIN "S 1"."T 1" t2 ON (t1.c1 = t2."C 1") OFFSET 100 LIMIT 10;
 SELECT t1.c1, t2."C 1" FROM ft2 t1 JOIN "S 1"."T 1" t2 ON (t1.c1 = t2."C 1") OFFSET 100 LIMIT 10;
 -- outer join; expressions in the clauses do not appear in equivalence class
 -- list but no output change as compared to the previous query
 EXPLAIN (VERBOSE, COSTS false)
 	SELECT t1.c1, t2."C 1" FROM ft2 t1 LEFT JOIN "S 1"."T 1" t2 ON (t1.c1 = t2."C 1") OFFSET 100 LIMIT 10;
 SELECT t1.c1, t2."C 1" FROM ft2 t1 LEFT JOIN "S 1"."T 1" t2 ON (t1.c1 = t2."C 1") OFFSET 100 LIMIT 10;
+-- A join between local table and foreign join. ORDER BY clause is added to the
+-- foreign join so that the local table can be joined using merge join strategy.
+EXPLAIN (COSTS false, VERBOSE)
+	SELECT t1."C 1" FROM "S 1"."T 1" t1 left join ft1 t2 join ft2 t3 on (t2.c1 = t3.c1) on (t3.c1 = t1."C 1") OFFSET 100 LIMIT 10;
+SELECT t1."C 1" FROM "S 1"."T 1" t1 left join ft1 t2 join ft2 t3 on (t2.c1 = t3.c1) on (t3.c1 = t1."C 1") OFFSET 100 LIMIT 10;
 RESET enable_hashjoin;
 RESET enable_nestloop;
 
 -- ===================================================================
 -- WHERE with remotely-executable conditions
 -- ===================================================================
 EXPLAIN (VERBOSE, COSTS false) SELECT * FROM ft1 t1 WHERE t1.c1 = 1;         -- Var, OpExpr(b), Const
 EXPLAIN (VERBOSE, COSTS false) SELECT * FROM ft1 t1 WHERE t1.c1 = 100 AND t1.c2 = 0; -- BoolExpr
 EXPLAIN (VERBOSE, COSTS false) SELECT * FROM ft1 t1 WHERE c1 IS NULL;        -- NullTest
 EXPLAIN (VERBOSE, COSTS false) SELECT * FROM ft1 t1 WHERE c1 IS NOT NULL;    -- NullTest
#5Robert Haas
robertmhaas@gmail.com
In reply to: Ashutosh Bapat (#4)
Re: Pushing down sorted joins

On Wed, Mar 9, 2016 at 2:23 AM, Ashutosh Bapat
<ashutosh.bapat@enterprisedb.com> wrote:

[ new patch ]

This looks OK to me. Committed!

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#6Ashutosh Bapat
ashutosh.bapat@enterprisedb.com
In reply to: Robert Haas (#5)
Re: Pushing down sorted joins

On Wed, Mar 9, 2016 at 9:22 PM, Robert Haas <robertmhaas@gmail.com> wrote:

On Wed, Mar 9, 2016 at 2:23 AM, Ashutosh Bapat
<ashutosh.bapat@enterprisedb.com> wrote:

[ new patch ]

This looks OK to me. Committed!

Thanks.

--
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company