ORDER BY pushdowns seem broken in postgres_fdw

Started by David Rowleyover 4 years ago22 messages
#1David Rowley
dgrowleyml@gmail.com

I'm working on a patch [1]/messages/by-id/1882015.KPgzjnsp5C@aivenronan to get the planner to consider adding
PathKeys to satisfy ORDER BY / DISTINCT aggregates. I think this has
led me to discover some problems with postgres_fdw's handling of
pushing down ORDER BY clauses into the foreign server.

The following test exists in the postgres_fdw module:

create operator class my_op_class for type int using btree family
my_op_family as
operator 1 public.<^,
operator 3 public.=^,
operator 5 public.>^,
function 1 my_op_cmp(int, int);
-- This will not be pushed as user defined sort operator is not part of the
-- extension yet.
explain (verbose, costs off)
select array_agg(c1 order by c1 using operator(public.<^)) from ft2
where c2 = 6 and c1 < 100 group by c2;
QUERY PLAN
--------------------------------------------------------------------------------------------
GroupAggregate
Output: array_agg(c1 ORDER BY c1 USING <^ NULLS LAST), c2
Group Key: ft2.c2
-> Foreign Scan on public.ft2
Output: c1, c2
Remote SQL: SELECT "C 1", c2 FROM "S 1"."T 1" WHERE (("C 1" <
100)) AND ((c2 = 6))
(6 rows)

Here the test claims that it wants to ensure that the order by using
operator(public.<^) is not pushed down into the foreign scan.
However, unless I'm mistaken, it seems there's a completely wrong
assumption there that the planner would even attempt that. In current
master we don't add PathKeys for ORDER BY aggregates, why would that
sort get pushed down in the first place?

If I adjust that query to something that would have the planner set
pathkeys for, it does push the ORDER BY to the foreign server without
any consideration that the sort operator is not shippable to the
foreign server.

postgres=# explain verbose select * from ft2 order by c1 using
operator(public.<^);
QUERY PLAN
-------------------------------------------------------------------------------------------------------
Foreign Scan on public.ft2 (cost=100.28..169.27 rows=1000 width=88)
Output: c1, c2, c3, c4, c5, c6, c7, c8
Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T
1" ORDER BY "C 1" ASC NULLS LAST
(3 rows)

Am I missing something here, or is postgres_fdw.c's
get_useful_pathkeys_for_relation() just broken?

David

[1]: /messages/by-id/1882015.KPgzjnsp5C@aivenronan

#2Ronan Dunklau
ronan.dunklau@aiven.io
In reply to: David Rowley (#1)
Re: ORDER BY pushdowns seem broken in postgres_fdw

Le mercredi 21 juillet 2021, 04:25:00 CEST David Rowley a écrit :

Here the test claims that it wants to ensure that the order by using
operator(public.<^) is not pushed down into the foreign scan.
However, unless I'm mistaken, it seems there's a completely wrong
assumption there that the planner would even attempt that. In current
master we don't add PathKeys for ORDER BY aggregates, why would that
sort get pushed down in the first place?

The whole aggregate, including it's order by clause, can be pushed down so
there is nothing related to pathkeys here.

If I adjust that query to something that would have the planner set
pathkeys for, it does push the ORDER BY to the foreign server without
any consideration that the sort operator is not shippable to the
foreign server.

Am I missing something here, or is postgres_fdw.c's
get_useful_pathkeys_for_relation() just broken?

I think you're right, we need to add a check if the opfamily is shippable.
I'll submit a patch for that including regression tests.

Regards,

--
Ronan Dunklau

#3Ronan Dunklau
ronan.dunklau@aiven.io
In reply to: Ronan Dunklau (#2)
1 attachment(s)
Re: ORDER BY pushdowns seem broken in postgres_fdw

Le mercredi 21 juillet 2021, 11:05:14 CEST Ronan Dunklau a écrit :

Le mercredi 21 juillet 2021, 04:25:00 CEST David Rowley a écrit :

Here the test claims that it wants to ensure that the order by using
operator(public.<^) is not pushed down into the foreign scan.
However, unless I'm mistaken, it seems there's a completely wrong
assumption there that the planner would even attempt that. In current
master we don't add PathKeys for ORDER BY aggregates, why would that
sort get pushed down in the first place?

The whole aggregate, including it's order by clause, can be pushed down so
there is nothing related to pathkeys here.

If I adjust that query to something that would have the planner set
pathkeys for, it does push the ORDER BY to the foreign server without
any consideration that the sort operator is not shippable to the
foreign server.

Am I missing something here, or is postgres_fdw.c's
get_useful_pathkeys_for_relation() just broken?

I think you're right, we need to add a check if the opfamily is shippable.
I'll submit a patch for that including regression tests.

In fact the support for generating the correct USING clause was inexistent
too, so that needed a bit more work.

The attached patch does the following:
- verify the opfamily is shippable to keep pathkeys
- generate a correct order by clause using the actual operator.

The second part needed a bit of refactoring: the find_em_expr_for_input_target
and find_em_expr_for_rel need to return the whole EquivalenceMember, because we
can't know the type used by the opfamily from the expression (example: the
expression could be of type intarray, but the type used by the opfamily could
be anyarray).

I also moved the "USING <operator>"' string generation to a separate function
since it's now used by appendAggOrderBy and appendOrderByClause.

The find_em_expr_for_rel is exposed in optimizer/paths.h, so I kept the
existing function which returns the expr directly in case it is used out of
tree.

--
Ronan Dunklau

Attachments:

v1_fix_postgresfdw_orderby_handlingtext/plain; charset=UTF-8; name=v1_fix_postgresfdw_orderby_handlingDownload
commit 2376cc6656853987e8f08e9b8f444bf391a9c269
Author: Ronan Dunklau <ronan.dunklau@aiven.io>
Date:   Wed Jul 21 12:44:41 2021 +0200

    Fix postgres_fdw PathKey's handling.
    
    The operator family being used for the sort was completely
    ignored, and as such its existence on the foreign server was not
    checked.

diff --git a/contrib/postgres_fdw/deparse.c b/contrib/postgres_fdw/deparse.c
index 31919fda8c..2c986d3325 100644
--- a/contrib/postgres_fdw/deparse.c
+++ b/contrib/postgres_fdw/deparse.c
@@ -37,9 +37,11 @@
 #include "access/sysattr.h"
 #include "access/table.h"
 #include "catalog/pg_aggregate.h"
+#include "catalog/pg_amop.h"
 #include "catalog/pg_collation.h"
 #include "catalog/pg_namespace.h"
 #include "catalog/pg_operator.h"
+#include "catalog/pg_opfamily.h"
 #include "catalog/pg_proc.h"
 #include "catalog/pg_type.h"
 #include "commands/defrem.h"
@@ -180,6 +182,7 @@ static void deparseRangeTblRef(StringInfo buf, PlannerInfo *root,
 							   Index ignore_rel, List **ignore_conds, List **params_list);
 static void deparseAggref(Aggref *node, deparse_expr_cxt *context);
 static void appendGroupByClause(List *tlist, deparse_expr_cxt *context);
+static void appendOrderbyUsingClause(Oid sortop, Oid sortcoltype, deparse_expr_cxt *context);
 static void appendAggOrderBy(List *orderList, List *targetList,
 							 deparse_expr_cxt *context);
 static void appendFunctionName(Oid funcid, deparse_expr_cxt *context);
@@ -910,6 +913,26 @@ is_foreign_param(PlannerInfo *root,
 	return false;
 }
 
+/* Returns true if the given pathkey can be evaluated on the remote side
+ */
+bool
+is_foreign_pathkey(PlannerInfo *root,
+				   RelOptInfo *baserel,
+				   PathKey *pathkey)
+{
+	EquivalenceClass *pathkey_ec = pathkey->pk_eclass;
+	EquivalenceMember *em;
+	PgFdwRelationInfo *fpinfo = (PgFdwRelationInfo *) baserel->fdw_private;
+
+	if (pathkey_ec->ec_has_volatile)
+		return false;
+	return (!pathkey_ec->ec_has_volatile &&
+			(em = find_em_for_rel(pathkey_ec, baserel)) &&
+			is_foreign_expr(root, baserel, em->em_expr) &&
+			is_shippable(pathkey->pk_opfamily, OperatorFamilyRelationId, fpinfo));
+}
+
+
 /*
  * Convert type OID + typmod info into a type name we can ship to the remote
  * server.  Someplace else had better have verified that this type name is
@@ -3125,6 +3148,37 @@ deparseAggref(Aggref *node, deparse_expr_cxt *context)
 	appendStringInfoChar(buf, ')');
 }
 
+/*
+ * Append the USING <OPERATOR> part of an ORDER BY clause
+ */
+static void
+appendOrderbyUsingClause(Oid sortop, Oid sortcoltype, deparse_expr_cxt *context)
+{
+	StringInfo	buf = context->buf;
+	TypeCacheEntry *typentry;
+	typentry = lookup_type_cache(sortcoltype, TYPECACHE_LT_OPR | TYPECACHE_GT_OPR);
+
+	if (sortop == typentry->lt_opr)
+		appendStringInfoString(buf, " ASC");
+	else if (sortop == typentry->gt_opr)
+		appendStringInfoString(buf, " DESC");
+	else
+	{
+		HeapTuple	opertup;
+		Form_pg_operator operform;
+
+		appendStringInfoString(buf, " USING ");
+
+		/* Append operator name. */
+		opertup = SearchSysCache1(OPEROID, ObjectIdGetDatum(sortop));
+		if (!HeapTupleIsValid(opertup))
+			elog(ERROR, "cache lookup failed for operator %u", sortop);
+		operform = (Form_pg_operator) GETSTRUCT(opertup);
+		deparseOperatorName(buf, operform);
+		ReleaseSysCache(opertup);
+	}
+}
+
 /*
  * Append ORDER BY within aggregate function.
  */
@@ -3140,7 +3194,6 @@ appendAggOrderBy(List *orderList, List *targetList, deparse_expr_cxt *context)
 		SortGroupClause *srt = (SortGroupClause *) lfirst(lc);
 		Node	   *sortexpr;
 		Oid			sortcoltype;
-		TypeCacheEntry *typentry;
 
 		if (!first)
 			appendStringInfoString(buf, ", ");
@@ -3150,27 +3203,7 @@ appendAggOrderBy(List *orderList, List *targetList, deparse_expr_cxt *context)
 										  false, context);
 		sortcoltype = exprType(sortexpr);
 		/* See whether operator is default < or > for datatype */
-		typentry = lookup_type_cache(sortcoltype,
-									 TYPECACHE_LT_OPR | TYPECACHE_GT_OPR);
-		if (srt->sortop == typentry->lt_opr)
-			appendStringInfoString(buf, " ASC");
-		else if (srt->sortop == typentry->gt_opr)
-			appendStringInfoString(buf, " DESC");
-		else
-		{
-			HeapTuple	opertup;
-			Form_pg_operator operform;
-
-			appendStringInfoString(buf, " USING ");
-
-			/* Append operator name. */
-			opertup = SearchSysCache1(OPEROID, ObjectIdGetDatum(srt->sortop));
-			if (!HeapTupleIsValid(opertup))
-				elog(ERROR, "cache lookup failed for operator %u", srt->sortop);
-			operform = (Form_pg_operator) GETSTRUCT(opertup);
-			deparseOperatorName(buf, operform);
-			ReleaseSysCache(opertup);
-		}
+		appendOrderbyUsingClause(srt->sortop, sortcoltype, context);
 
 		if (srt->nulls_first)
 			appendStringInfoString(buf, " NULLS FIRST");
@@ -3280,7 +3313,11 @@ appendOrderByClause(List *pathkeys, bool has_final_sort,
 	foreach(lcell, pathkeys)
 	{
 		PathKey    *pathkey = lfirst(lcell);
+		EquivalenceMember *em;
 		Expr	   *em_expr;
+		HeapTuple	tuple;
+		Oid			oprid;
+		bool		isNull;
 
 		if (has_final_sort)
 		{
@@ -3288,21 +3325,44 @@ appendOrderByClause(List *pathkeys, bool has_final_sort,
 			 * By construction, context->foreignrel is the input relation to
 			 * the final sort.
 			 */
-			em_expr = find_em_expr_for_input_target(context->root,
-													pathkey->pk_eclass,
-													context->foreignrel->reltarget);
+			em = find_em_for_input_target(context->root,
+										  pathkey->pk_eclass,
+										  context->foreignrel->reltarget);
 		}
 		else
-			em_expr = find_em_expr_for_rel(pathkey->pk_eclass, baserel);
+			em = find_em_for_rel(pathkey->pk_eclass, baserel);
+		em_expr = em->em_expr;
+
+		/*
+		 * Lookup the operator corresponding to the strategy in the opclass.
+		 * The datatype used by the opfamily is not necessarily the same as
+		 * the expression type (for array types for example).
+		 */
+		tuple = SearchSysCache4(AMOPSTRATEGY,
+								ObjectIdGetDatum(pathkey->pk_opfamily),
+								em->em_datatype,
+								em->em_datatype,
+								pathkey->pk_strategy);
+		if (!HeapTupleIsValid(tuple))
+		{
+			elog(ERROR, "cache lookup failed for operator family %u", pathkey->pk_opfamily);
+		}
+
+		oprid = DatumGetObjectId(SysCacheGetAttr(AMOPSTRATEGY, tuple,
+												 Anum_pg_amop_amopopr, &isNull));
+		ReleaseSysCache(tuple);
+
 
 		Assert(em_expr != NULL);
 
 		appendStringInfoString(buf, delim);
 		deparseExpr(em_expr, context);
-		if (pathkey->pk_strategy == BTLessStrategyNumber)
-			appendStringInfoString(buf, " ASC");
-		else
-			appendStringInfoString(buf, " DESC");
+
+		/*
+		 * Here we need to use the expression type to compare against the
+		 * default btree sort operator
+		 */
+		appendOrderbyUsingClause(oprid, exprType((Node *) em_expr), context);
 
 		if (pathkey->pk_nulls_first)
 			appendStringInfoString(buf, " NULLS FIRST");
diff --git a/contrib/postgres_fdw/expected/postgres_fdw.out b/contrib/postgres_fdw/expected/postgres_fdw.out
index ed25e7a743..66bada908f 100644
--- a/contrib/postgres_fdw/expected/postgres_fdw.out
+++ b/contrib/postgres_fdw/expected/postgres_fdw.out
@@ -3168,6 +3168,18 @@ select array_agg(c1 order by c1 using operator(public.<^)) from ft2 where c2 = 6
          Remote SQL: SELECT "C 1", c2 FROM "S 1"."T 1" WHERE (("C 1" < 100)) AND ((c2 = 6))
 (6 rows)
 
+-- This will not be pushed either
+explain verbose select * from ft2 order by c1 using operator(public.<^);
+                                  QUERY PLAN                                   
+-------------------------------------------------------------------------------
+ Sort  (cost=190.83..193.33 rows=1000 width=142)
+   Output: c1, c2, c3, c4, c5, c6, c7, c8
+   Sort Key: ft2.c1 USING <^
+   ->  Foreign Scan on public.ft2  (cost=100.00..141.00 rows=1000 width=47)
+         Output: c1, c2, c3, c4, c5, c6, c7, c8
+         Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1"
+(6 rows)
+
 -- Update local stats on ft2
 ANALYZE ft2;
 -- Add into extension
@@ -3195,6 +3207,15 @@ select array_agg(c1 order by c1 using operator(public.<^)) from ft2 where c2 = 6
  {6,16,26,36,46,56,66,76,86,96}
 (1 row)
 
+-- And this will be pushed too
+explain verbose select * from ft2 order by c1 using operator(public.<^);
+                                                         QUERY PLAN                                                          
+-----------------------------------------------------------------------------------------------------------------------------
+ Foreign Scan on public.ft2  (cost=170.83..193.33 rows=1000 width=47)
+   Output: c1, c2, c3, c4, c5, c6, c7, c8
+   Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" ORDER BY "C 1" USING OPERATOR(public.<^) NULLS LAST
+(3 rows)
+
 -- Remove from extension
 alter extension postgres_fdw drop operator class my_op_class using btree;
 alter extension postgres_fdw drop function my_op_cmp(a int, b int);
diff --git a/contrib/postgres_fdw/postgres_fdw.c b/contrib/postgres_fdw/postgres_fdw.c
index f15c97ad7a..9922ee614c 100644
--- a/contrib/postgres_fdw/postgres_fdw.c
+++ b/contrib/postgres_fdw/postgres_fdw.c
@@ -916,8 +916,6 @@ get_useful_pathkeys_for_relation(PlannerInfo *root, RelOptInfo *rel)
 		foreach(lc, root->query_pathkeys)
 		{
 			PathKey    *pathkey = (PathKey *) lfirst(lc);
-			EquivalenceClass *pathkey_ec = pathkey->pk_eclass;
-			Expr	   *em_expr;
 
 			/*
 			 * The planner and executor don't have any clever strategy for
@@ -929,9 +927,7 @@ get_useful_pathkeys_for_relation(PlannerInfo *root, RelOptInfo *rel)
 			 * is_foreign_expr would detect volatile expressions as well, but
 			 * checking ec_has_volatile here saves some cycles.
 			 */
-			if (pathkey_ec->ec_has_volatile ||
-				!(em_expr = find_em_expr_for_rel(pathkey_ec, rel)) ||
-				!is_foreign_expr(root, rel, em_expr))
+			if (!is_foreign_pathkey(root, rel, pathkey))
 			{
 				query_pathkeys_ok = false;
 				break;
@@ -6510,9 +6506,9 @@ add_foreign_ordered_paths(PlannerInfo *root, RelOptInfo *input_rel,
 			return;
 
 		/* Get the sort expression for the pathkey_ec */
-		sort_expr = find_em_expr_for_input_target(root,
+		sort_expr = find_em_for_input_target(root,
 												  pathkey_ec,
-												  input_rel->reltarget);
+												  input_rel->reltarget)->em_expr;
 
 		/* If it's unsafe to remote, we cannot push down the final sort */
 		if (!is_foreign_expr(root, input_rel, sort_expr))
@@ -7250,11 +7246,11 @@ conversion_error_callback(void *arg)
 }
 
 /*
- * Find an equivalence class member expression to be computed as a sort column
+ * Find an equivalence class member to be computed as a sort column
  * in the given target.
  */
-Expr *
-find_em_expr_for_input_target(PlannerInfo *root,
+EquivalenceMember*
+find_em_for_input_target(PlannerInfo *root,
 							  EquivalenceClass *ec,
 							  PathTarget *target)
 {
@@ -7301,7 +7297,7 @@ find_em_expr_for_input_target(PlannerInfo *root,
 				em_expr = ((RelabelType *) em_expr)->arg;
 
 			if (equal(em_expr, expr))
-				return em->em_expr;
+				return em;
 		}
 
 		i++;
diff --git a/contrib/postgres_fdw/postgres_fdw.h b/contrib/postgres_fdw/postgres_fdw.h
index 9591c0f6c2..a0d5c859da 100644
--- a/contrib/postgres_fdw/postgres_fdw.h
+++ b/contrib/postgres_fdw/postgres_fdw.h
@@ -165,12 +165,17 @@ extern void classifyConditions(PlannerInfo *root,
 							   List *input_conds,
 							   List **remote_conds,
 							   List **local_conds);
+
 extern bool is_foreign_expr(PlannerInfo *root,
 							RelOptInfo *baserel,
 							Expr *expr);
 extern bool is_foreign_param(PlannerInfo *root,
 							 RelOptInfo *baserel,
 							 Expr *expr);
+extern bool is_foreign_pathkey(PlannerInfo *root,
+							   RelOptInfo *baserel,
+							   PathKey *pathkey);
+
 extern void deparseInsertSql(StringInfo buf, RangeTblEntry *rte,
 							 Index rtindex, Relation rel,
 							 List *targetAttrs, bool doNothing,
@@ -212,8 +217,8 @@ extern void deparseTruncateSql(StringInfo buf,
 							   DropBehavior behavior,
 							   bool restart_seqs);
 extern void deparseStringLiteral(StringInfo buf, const char *val);
-extern Expr *find_em_expr_for_rel(EquivalenceClass *ec, RelOptInfo *rel);
-extern Expr *find_em_expr_for_input_target(PlannerInfo *root,
+extern EquivalenceMember *find_em_for_rel(EquivalenceClass *ec, RelOptInfo *rel);
+extern EquivalenceMember *find_em_for_input_target(PlannerInfo *root,
 										   EquivalenceClass *ec,
 										   PathTarget *target);
 extern List *build_tlist_to_deparse(RelOptInfo *foreignrel);
diff --git a/contrib/postgres_fdw/sql/postgres_fdw.sql b/contrib/postgres_fdw/sql/postgres_fdw.sql
index 02a6b15a13..841fac84e4 100644
--- a/contrib/postgres_fdw/sql/postgres_fdw.sql
+++ b/contrib/postgres_fdw/sql/postgres_fdw.sql
@@ -873,6 +873,9 @@ create operator class my_op_class for type int using btree family my_op_family a
 explain (verbose, costs off)
 select array_agg(c1 order by c1 using operator(public.<^)) from ft2 where c2 = 6 and c1 < 100 group by c2;
 
+-- This will not be pushed either
+explain verbose select * from ft2 order by c1 using operator(public.<^);
+
 -- Update local stats on ft2
 ANALYZE ft2;
 
@@ -890,6 +893,9 @@ explain (verbose, costs off)
 select array_agg(c1 order by c1 using operator(public.<^)) from ft2 where c2 = 6 and c1 < 100 group by c2;
 select array_agg(c1 order by c1 using operator(public.<^)) from ft2 where c2 = 6 and c1 < 100 group by c2;
 
+-- And this will be pushed too
+explain verbose select * from ft2 order by c1 using operator(public.<^);
+
 -- Remove from extension
 alter extension postgres_fdw drop operator class my_op_class using btree;
 alter extension postgres_fdw drop function my_op_cmp(a int, b int);
diff --git a/src/backend/optimizer/path/equivclass.c b/src/backend/optimizer/path/equivclass.c
index 6f1abbe47d..d2eac89528 100644
--- a/src/backend/optimizer/path/equivclass.c
+++ b/src/backend/optimizer/path/equivclass.c
@@ -935,8 +935,8 @@ is_exprlist_member(Expr *node, List *exprs)
  * Find an equivalence class member expression, all of whose Vars, come from
  * the indicated relation.
  */
-Expr *
-find_em_expr_for_rel(EquivalenceClass *ec, RelOptInfo *rel)
+EquivalenceMember*
+find_em_for_rel(EquivalenceClass *ec, RelOptInfo *rel)
 {
 	ListCell   *lc_em;
 
@@ -952,7 +952,7 @@ find_em_expr_for_rel(EquivalenceClass *ec, RelOptInfo *rel)
 			 * taken entirely from this relation, we'll be content to choose
 			 * any one of those.
 			 */
-			return em->em_expr;
+			return em;
 		}
 	}
 
@@ -960,6 +960,14 @@ find_em_expr_for_rel(EquivalenceClass *ec, RelOptInfo *rel)
 	return NULL;
 }
 
+Expr *
+find_em_expr_for_rel(EquivalenceClass *ec, RelOptInfo *rel)
+{
+	EquivalenceMember *em = find_em_for_rel(ec, rel);
+	return em ? em->em_expr : NULL;
+}
+
+
 /*
  * relation_can_be_sorted_early
  *		Can this relation be sorted on this EC before the final output step?
diff --git a/src/include/optimizer/paths.h b/src/include/optimizer/paths.h
index f1d111063c..bc17114ba3 100644
--- a/src/include/optimizer/paths.h
+++ b/src/include/optimizer/paths.h
@@ -144,6 +144,7 @@ extern EquivalenceMember *find_computable_ec_member(PlannerInfo *root,
 													Relids relids,
 													bool require_parallel_safe);
 extern Expr *find_em_expr_for_rel(EquivalenceClass *ec, RelOptInfo *rel);
+extern EquivalenceMember *find_em_for_rel(EquivalenceClass *ec, RelOptInfo *rel);
 extern bool relation_can_be_sorted_early(PlannerInfo *root, RelOptInfo *rel,
 										 EquivalenceClass *ec,
 										 bool require_parallel_safe);
#4David Rowley
dgrowleyml@gmail.com
In reply to: Ronan Dunklau (#3)
Re: ORDER BY pushdowns seem broken in postgres_fdw

On Thu, 22 Jul 2021 at 00:28, Ronan Dunklau <ronan.dunklau@aiven.io> wrote:

The attached patch does the following:
- verify the opfamily is shippable to keep pathkeys
- generate a correct order by clause using the actual operator.

Thanks for writing the patch.

This is just a very superficial review. I've not spent a great deal
of time looking at postgres_fdw code, so would rather some eyes that
were more familiar with the code looked too.

1. This comment needs to be updated. It still mentions
is_foreign_expr, which you're no longer calling.

  * is_foreign_expr would detect volatile expressions as well, but
  * checking ec_has_volatile here saves some cycles.
  */
- if (pathkey_ec->ec_has_volatile ||
- !(em_expr = find_em_expr_for_rel(pathkey_ec, rel)) ||
- !is_foreign_expr(root, rel, em_expr))
+ if (!is_foreign_pathkey(root, rel, pathkey))

2. This is not a very easy return condition to read:

+ return (!pathkey_ec->ec_has_volatile &&
+ (em = find_em_for_rel(pathkey_ec, baserel)) &&
+ is_foreign_expr(root, baserel, em->em_expr) &&
+ is_shippable(pathkey->pk_opfamily, OperatorFamilyRelationId, fpinfo));

I think it would be nicer to break that down into something easier on
the eyes that could be commented a little more.

3. This comment is no longer true:

  * Find an equivalence class member expression, all of whose Vars, come from
  * the indicated relation.
  */
-Expr *
-find_em_expr_for_rel(EquivalenceClass *ec, RelOptInfo *rel)
+EquivalenceMember*
+find_em_for_rel(EquivalenceClass *ec, RelOptInfo *rel)

Also, missing space after EquivalenceMember.

The comment can just be moved down to:

+Expr *
+find_em_expr_for_rel(EquivalenceClass *ec, RelOptInfo *rel)
+{
+ EquivalenceMember *em = find_em_for_rel(ec, rel);
+ return em ? em->em_expr : NULL;
+}

and you can rewrite the one for find_em_for_rel.

David

#5Ronan Dunklau
ronan.dunklau@aiven.io
In reply to: David Rowley (#4)
1 attachment(s)
Re: ORDER BY pushdowns seem broken in postgres_fdw

Le mercredi 21 juillet 2021 15:45:15 CEST, vous avez écrit :

On Thu, 22 Jul 2021 at 00:28, Ronan Dunklau <ronan.dunklau@aiven.io> wrote:

The attached patch does the following:
- verify the opfamily is shippable to keep pathkeys
- generate a correct order by clause using the actual operator.

Thanks for writing the patch.

This is just a very superficial review. I've not spent a great deal
of time looking at postgres_fdw code, so would rather some eyes that
were more familiar with the code looked too.

Thank you for the review.

1. This comment needs to be updated. It still mentions
is_foreign_expr, which you're no longer calling.

* is_foreign_expr would detect volatile expressions as well, but
* checking ec_has_volatile here saves some cycles.
*/
- if (pathkey_ec->ec_has_volatile ||
- !(em_expr = find_em_expr_for_rel(pathkey_ec, rel)) ||
- !is_foreign_expr(root, rel, em_expr))
+ if (!is_foreign_pathkey(root, rel, pathkey))

Done. By the way, the comment just above mentions we don't have a way to use a
prefix pathkey, but I suppose we should revisit that now that we have
IncrementalSort. I'll mark it in my todo list for another patch.

2. This is not a very easy return condition to read:

+ return (!pathkey_ec->ec_has_volatile &&
+ (em = find_em_for_rel(pathkey_ec, baserel)) &&
+ is_foreign_expr(root, baserel, em->em_expr) &&
+ is_shippable(pathkey->pk_opfamily, OperatorFamilyRelationId, fpinfo));

I think it would be nicer to break that down into something easier on
the eyes that could be commented a little more.

Done, let me know what you think about it.

3. This comment is no longer true:

* Find an equivalence class member expression, all of whose Vars, come
from * the indicated relation.
*/
-Expr *
-find_em_expr_for_rel(EquivalenceClass *ec, RelOptInfo *rel)
+EquivalenceMember*
+find_em_for_rel(EquivalenceClass *ec, RelOptInfo *rel)

Also, missing space after EquivalenceMember.

The comment can just be moved down to:

+Expr *
+find_em_expr_for_rel(EquivalenceClass *ec, RelOptInfo *rel)
+{
+ EquivalenceMember *em = find_em_for_rel(ec, rel);
+ return em ? em->em_expr : NULL;
+}

and you can rewrite the one for find_em_for_rel.

I have done it the other way around. I'm not sure we really need to keep the
find_em_expr_for_rel function on HEAD. If we decide to backpatch, it would need
to be kept though.

--
Ronan Dunklau

Attachments:

v2_fix_postgresfdw_orderby_handling.patchtext/x-patch; charset=UTF-8; name=v2_fix_postgresfdw_orderby_handling.patchDownload
commit f7b3dc81878bf2ac503899f010eb18f390a64e37
Author: Ronan Dunklau <ronan.dunklau@aiven.io>
Date:   Wed Jul 21 12:44:41 2021 +0200

    Fix postgres_fdw PathKey's handling.
    
    The operator family being used for the sort was completely
    ignored, and as such its existence on the foreign server was not
    checked.

diff --git a/contrib/postgres_fdw/deparse.c b/contrib/postgres_fdw/deparse.c
index 31919fda8c..5efefff65e 100644
--- a/contrib/postgres_fdw/deparse.c
+++ b/contrib/postgres_fdw/deparse.c
@@ -37,9 +37,11 @@
 #include "access/sysattr.h"
 #include "access/table.h"
 #include "catalog/pg_aggregate.h"
+#include "catalog/pg_amop.h"
 #include "catalog/pg_collation.h"
 #include "catalog/pg_namespace.h"
 #include "catalog/pg_operator.h"
+#include "catalog/pg_opfamily.h"
 #include "catalog/pg_proc.h"
 #include "catalog/pg_type.h"
 #include "commands/defrem.h"
@@ -180,6 +182,7 @@ static void deparseRangeTblRef(StringInfo buf, PlannerInfo *root,
 							   Index ignore_rel, List **ignore_conds, List **params_list);
 static void deparseAggref(Aggref *node, deparse_expr_cxt *context);
 static void appendGroupByClause(List *tlist, deparse_expr_cxt *context);
+static void appendOrderbyUsingClause(Oid sortop, Oid sortcoltype, deparse_expr_cxt *context);
 static void appendAggOrderBy(List *orderList, List *targetList,
 							 deparse_expr_cxt *context);
 static void appendFunctionName(Oid funcid, deparse_expr_cxt *context);
@@ -910,6 +913,37 @@ is_foreign_param(PlannerInfo *root,
 	return false;
 }
 
+/* Returns true if the given pathkey can be evaluated on the remote side
+ */
+bool
+is_foreign_pathkey(PlannerInfo *root,
+				   RelOptInfo *baserel,
+				   PathKey *pathkey)
+{
+	EquivalenceClass *pathkey_ec = pathkey->pk_eclass;
+	EquivalenceMember *em;
+	PgFdwRelationInfo *fpinfo = (PgFdwRelationInfo *) baserel->fdw_private;
+
+	/*
+	 * is_foreign_expr would detect volatile expressions as well, but
+	 * checking ec_has_volatile here saves some cycles.
+	 */
+	if (pathkey_ec->ec_has_volatile)
+		return false;
+
+	em = find_em_for_rel(pathkey_ec, baserel);
+	if (em == NULL)
+		return false;
+
+	/*
+	 * Finally, verify the found member's expression is foreign and its operator
+	 * family is shippable.
+	 */
+	return (is_foreign_expr(root, baserel, em->em_expr) &&
+			is_shippable(pathkey->pk_opfamily, OperatorFamilyRelationId, fpinfo));
+}
+
+
 /*
  * Convert type OID + typmod info into a type name we can ship to the remote
  * server.  Someplace else had better have verified that this type name is
@@ -3125,6 +3159,37 @@ deparseAggref(Aggref *node, deparse_expr_cxt *context)
 	appendStringInfoChar(buf, ')');
 }
 
+/*
+ * Append the USING <OPERATOR> part of an ORDER BY clause
+ */
+static void
+appendOrderbyUsingClause(Oid sortop, Oid sortcoltype, deparse_expr_cxt *context)
+{
+	StringInfo	buf = context->buf;
+	TypeCacheEntry *typentry;
+	typentry = lookup_type_cache(sortcoltype, TYPECACHE_LT_OPR | TYPECACHE_GT_OPR);
+
+	if (sortop == typentry->lt_opr)
+		appendStringInfoString(buf, " ASC");
+	else if (sortop == typentry->gt_opr)
+		appendStringInfoString(buf, " DESC");
+	else
+	{
+		HeapTuple	opertup;
+		Form_pg_operator operform;
+
+		appendStringInfoString(buf, " USING ");
+
+		/* Append operator name. */
+		opertup = SearchSysCache1(OPEROID, ObjectIdGetDatum(sortop));
+		if (!HeapTupleIsValid(opertup))
+			elog(ERROR, "cache lookup failed for operator %u", sortop);
+		operform = (Form_pg_operator) GETSTRUCT(opertup);
+		deparseOperatorName(buf, operform);
+		ReleaseSysCache(opertup);
+	}
+}
+
 /*
  * Append ORDER BY within aggregate function.
  */
@@ -3140,7 +3205,6 @@ appendAggOrderBy(List *orderList, List *targetList, deparse_expr_cxt *context)
 		SortGroupClause *srt = (SortGroupClause *) lfirst(lc);
 		Node	   *sortexpr;
 		Oid			sortcoltype;
-		TypeCacheEntry *typentry;
 
 		if (!first)
 			appendStringInfoString(buf, ", ");
@@ -3150,27 +3214,7 @@ appendAggOrderBy(List *orderList, List *targetList, deparse_expr_cxt *context)
 										  false, context);
 		sortcoltype = exprType(sortexpr);
 		/* See whether operator is default < or > for datatype */
-		typentry = lookup_type_cache(sortcoltype,
-									 TYPECACHE_LT_OPR | TYPECACHE_GT_OPR);
-		if (srt->sortop == typentry->lt_opr)
-			appendStringInfoString(buf, " ASC");
-		else if (srt->sortop == typentry->gt_opr)
-			appendStringInfoString(buf, " DESC");
-		else
-		{
-			HeapTuple	opertup;
-			Form_pg_operator operform;
-
-			appendStringInfoString(buf, " USING ");
-
-			/* Append operator name. */
-			opertup = SearchSysCache1(OPEROID, ObjectIdGetDatum(srt->sortop));
-			if (!HeapTupleIsValid(opertup))
-				elog(ERROR, "cache lookup failed for operator %u", srt->sortop);
-			operform = (Form_pg_operator) GETSTRUCT(opertup);
-			deparseOperatorName(buf, operform);
-			ReleaseSysCache(opertup);
-		}
+		appendOrderbyUsingClause(srt->sortop, sortcoltype, context);
 
 		if (srt->nulls_first)
 			appendStringInfoString(buf, " NULLS FIRST");
@@ -3280,7 +3324,11 @@ appendOrderByClause(List *pathkeys, bool has_final_sort,
 	foreach(lcell, pathkeys)
 	{
 		PathKey    *pathkey = lfirst(lcell);
+		EquivalenceMember *em;
 		Expr	   *em_expr;
+		HeapTuple	tuple;
+		Oid			oprid;
+		bool		isNull;
 
 		if (has_final_sort)
 		{
@@ -3288,21 +3336,44 @@ appendOrderByClause(List *pathkeys, bool has_final_sort,
 			 * By construction, context->foreignrel is the input relation to
 			 * the final sort.
 			 */
-			em_expr = find_em_expr_for_input_target(context->root,
-													pathkey->pk_eclass,
-													context->foreignrel->reltarget);
+			em = find_em_for_input_target(context->root,
+										  pathkey->pk_eclass,
+										  context->foreignrel->reltarget);
 		}
 		else
-			em_expr = find_em_expr_for_rel(pathkey->pk_eclass, baserel);
+			em = find_em_for_rel(pathkey->pk_eclass, baserel);
+		em_expr = em->em_expr;
+
+		/*
+		 * Lookup the operator corresponding to the strategy in the opclass.
+		 * The datatype used by the opfamily is not necessarily the same as
+		 * the expression type (for array types for example).
+		 */
+		tuple = SearchSysCache4(AMOPSTRATEGY,
+								ObjectIdGetDatum(pathkey->pk_opfamily),
+								em->em_datatype,
+								em->em_datatype,
+								pathkey->pk_strategy);
+		if (!HeapTupleIsValid(tuple))
+		{
+			elog(ERROR, "cache lookup failed for operator family %u", pathkey->pk_opfamily);
+		}
+
+		oprid = DatumGetObjectId(SysCacheGetAttr(AMOPSTRATEGY, tuple,
+												 Anum_pg_amop_amopopr, &isNull));
+		ReleaseSysCache(tuple);
+
 
 		Assert(em_expr != NULL);
 
 		appendStringInfoString(buf, delim);
 		deparseExpr(em_expr, context);
-		if (pathkey->pk_strategy == BTLessStrategyNumber)
-			appendStringInfoString(buf, " ASC");
-		else
-			appendStringInfoString(buf, " DESC");
+
+		/*
+		 * Here we need to use the expression type to compare against the
+		 * default btree sort operator
+		 */
+		appendOrderbyUsingClause(oprid, exprType((Node *) em_expr), context);
 
 		if (pathkey->pk_nulls_first)
 			appendStringInfoString(buf, " NULLS FIRST");
diff --git a/contrib/postgres_fdw/expected/postgres_fdw.out b/contrib/postgres_fdw/expected/postgres_fdw.out
index ed25e7a743..66bada908f 100644
--- a/contrib/postgres_fdw/expected/postgres_fdw.out
+++ b/contrib/postgres_fdw/expected/postgres_fdw.out
@@ -3168,6 +3168,18 @@ select array_agg(c1 order by c1 using operator(public.<^)) from ft2 where c2 = 6
          Remote SQL: SELECT "C 1", c2 FROM "S 1"."T 1" WHERE (("C 1" < 100)) AND ((c2 = 6))
 (6 rows)
 
+-- This will not be pushed either
+explain verbose select * from ft2 order by c1 using operator(public.<^);
+                                  QUERY PLAN                                   
+-------------------------------------------------------------------------------
+ Sort  (cost=190.83..193.33 rows=1000 width=142)
+   Output: c1, c2, c3, c4, c5, c6, c7, c8
+   Sort Key: ft2.c1 USING <^
+   ->  Foreign Scan on public.ft2  (cost=100.00..141.00 rows=1000 width=47)
+         Output: c1, c2, c3, c4, c5, c6, c7, c8
+         Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1"
+(6 rows)
+
 -- Update local stats on ft2
 ANALYZE ft2;
 -- Add into extension
@@ -3195,6 +3207,15 @@ select array_agg(c1 order by c1 using operator(public.<^)) from ft2 where c2 = 6
  {6,16,26,36,46,56,66,76,86,96}
 (1 row)
 
+-- And this will be pushed too
+explain verbose select * from ft2 order by c1 using operator(public.<^);
+                                                         QUERY PLAN                                                          
+-----------------------------------------------------------------------------------------------------------------------------
+ Foreign Scan on public.ft2  (cost=170.83..193.33 rows=1000 width=47)
+   Output: c1, c2, c3, c4, c5, c6, c7, c8
+   Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" ORDER BY "C 1" USING OPERATOR(public.<^) NULLS LAST
+(3 rows)
+
 -- Remove from extension
 alter extension postgres_fdw drop operator class my_op_class using btree;
 alter extension postgres_fdw drop function my_op_cmp(a int, b int);
diff --git a/contrib/postgres_fdw/postgres_fdw.c b/contrib/postgres_fdw/postgres_fdw.c
index f15c97ad7a..26c54d6a11 100644
--- a/contrib/postgres_fdw/postgres_fdw.c
+++ b/contrib/postgres_fdw/postgres_fdw.c
@@ -916,8 +916,6 @@ get_useful_pathkeys_for_relation(PlannerInfo *root, RelOptInfo *rel)
 		foreach(lc, root->query_pathkeys)
 		{
 			PathKey    *pathkey = (PathKey *) lfirst(lc);
-			EquivalenceClass *pathkey_ec = pathkey->pk_eclass;
-			Expr	   *em_expr;
 
 			/*
 			 * The planner and executor don't have any clever strategy for
@@ -925,13 +923,8 @@ get_useful_pathkeys_for_relation(PlannerInfo *root, RelOptInfo *rel)
 			 * getting it to be sorted by all of those pathkeys. We'll just
 			 * end up resorting the entire data set.  So, unless we can push
 			 * down all of the query pathkeys, forget it.
-			 *
-			 * is_foreign_expr would detect volatile expressions as well, but
-			 * checking ec_has_volatile here saves some cycles.
 			 */
-			if (pathkey_ec->ec_has_volatile ||
-				!(em_expr = find_em_expr_for_rel(pathkey_ec, rel)) ||
-				!is_foreign_expr(root, rel, em_expr))
+			if (!is_foreign_pathkey(root, rel, pathkey))
 			{
 				query_pathkeys_ok = false;
 				break;
@@ -6510,9 +6503,9 @@ add_foreign_ordered_paths(PlannerInfo *root, RelOptInfo *input_rel,
 			return;
 
 		/* Get the sort expression for the pathkey_ec */
-		sort_expr = find_em_expr_for_input_target(root,
+		sort_expr = find_em_for_input_target(root,
 												  pathkey_ec,
-												  input_rel->reltarget);
+												  input_rel->reltarget)->em_expr;
 
 		/* If it's unsafe to remote, we cannot push down the final sort */
 		if (!is_foreign_expr(root, input_rel, sort_expr))
@@ -7250,11 +7243,11 @@ conversion_error_callback(void *arg)
 }
 
 /*
- * Find an equivalence class member expression to be computed as a sort column
+ * Find an equivalence class member to be computed as a sort column
  * in the given target.
  */
-Expr *
-find_em_expr_for_input_target(PlannerInfo *root,
+EquivalenceMember*
+find_em_for_input_target(PlannerInfo *root,
 							  EquivalenceClass *ec,
 							  PathTarget *target)
 {
@@ -7301,7 +7294,7 @@ find_em_expr_for_input_target(PlannerInfo *root,
 				em_expr = ((RelabelType *) em_expr)->arg;
 
 			if (equal(em_expr, expr))
-				return em->em_expr;
+				return em;
 		}
 
 		i++;
diff --git a/contrib/postgres_fdw/postgres_fdw.h b/contrib/postgres_fdw/postgres_fdw.h
index 9591c0f6c2..a0d5c859da 100644
--- a/contrib/postgres_fdw/postgres_fdw.h
+++ b/contrib/postgres_fdw/postgres_fdw.h
@@ -165,12 +165,17 @@ extern void classifyConditions(PlannerInfo *root,
 							   List *input_conds,
 							   List **remote_conds,
 							   List **local_conds);
+
 extern bool is_foreign_expr(PlannerInfo *root,
 							RelOptInfo *baserel,
 							Expr *expr);
 extern bool is_foreign_param(PlannerInfo *root,
 							 RelOptInfo *baserel,
 							 Expr *expr);
+extern bool is_foreign_pathkey(PlannerInfo *root,
+							   RelOptInfo *baserel,
+							   PathKey *pathkey);
+
 extern void deparseInsertSql(StringInfo buf, RangeTblEntry *rte,
 							 Index rtindex, Relation rel,
 							 List *targetAttrs, bool doNothing,
@@ -212,8 +217,8 @@ extern void deparseTruncateSql(StringInfo buf,
 							   DropBehavior behavior,
 							   bool restart_seqs);
 extern void deparseStringLiteral(StringInfo buf, const char *val);
-extern Expr *find_em_expr_for_rel(EquivalenceClass *ec, RelOptInfo *rel);
-extern Expr *find_em_expr_for_input_target(PlannerInfo *root,
+extern EquivalenceMember *find_em_for_rel(EquivalenceClass *ec, RelOptInfo *rel);
+extern EquivalenceMember *find_em_for_input_target(PlannerInfo *root,
 										   EquivalenceClass *ec,
 										   PathTarget *target);
 extern List *build_tlist_to_deparse(RelOptInfo *foreignrel);
diff --git a/contrib/postgres_fdw/sql/postgres_fdw.sql b/contrib/postgres_fdw/sql/postgres_fdw.sql
index 02a6b15a13..841fac84e4 100644
--- a/contrib/postgres_fdw/sql/postgres_fdw.sql
+++ b/contrib/postgres_fdw/sql/postgres_fdw.sql
@@ -873,6 +873,9 @@ create operator class my_op_class for type int using btree family my_op_family a
 explain (verbose, costs off)
 select array_agg(c1 order by c1 using operator(public.<^)) from ft2 where c2 = 6 and c1 < 100 group by c2;
 
+-- This will not be pushed either
+explain verbose select * from ft2 order by c1 using operator(public.<^);
+
 -- Update local stats on ft2
 ANALYZE ft2;
 
@@ -890,6 +893,9 @@ explain (verbose, costs off)
 select array_agg(c1 order by c1 using operator(public.<^)) from ft2 where c2 = 6 and c1 < 100 group by c2;
 select array_agg(c1 order by c1 using operator(public.<^)) from ft2 where c2 = 6 and c1 < 100 group by c2;
 
+-- And this will be pushed too
+explain verbose select * from ft2 order by c1 using operator(public.<^);
+
 -- Remove from extension
 alter extension postgres_fdw drop operator class my_op_class using btree;
 alter extension postgres_fdw drop function my_op_cmp(a int, b int);
diff --git a/src/backend/optimizer/path/equivclass.c b/src/backend/optimizer/path/equivclass.c
index 6f1abbe47d..001296597f 100644
--- a/src/backend/optimizer/path/equivclass.c
+++ b/src/backend/optimizer/path/equivclass.c
@@ -932,11 +932,11 @@ is_exprlist_member(Expr *node, List *exprs)
 }
 
 /*
- * Find an equivalence class member expression, all of whose Vars, come from
+ * Find an equivalence class member, all of whose Vars, come from
  * the indicated relation.
  */
-Expr *
-find_em_expr_for_rel(EquivalenceClass *ec, RelOptInfo *rel)
+EquivalenceMember *
+find_em_for_rel(EquivalenceClass *ec, RelOptInfo *rel)
 {
 	ListCell   *lc_em;
 
@@ -952,7 +952,7 @@ find_em_expr_for_rel(EquivalenceClass *ec, RelOptInfo *rel)
 			 * taken entirely from this relation, we'll be content to choose
 			 * any one of those.
 			 */
-			return em->em_expr;
+			return em;
 		}
 	}
 
@@ -960,6 +960,17 @@ find_em_expr_for_rel(EquivalenceClass *ec, RelOptInfo *rel)
 	return NULL;
 }
 
+/*
+ * Simple wrapper around find_em_for_rel returning it's expression.
+ */
+Expr *
+find_em_expr_for_rel(EquivalenceClass *ec, RelOptInfo *rel)
+{
+	EquivalenceMember *em = find_em_for_rel(ec, rel);
+	return em ? em->em_expr : NULL;
+}
+
+
 /*
  * relation_can_be_sorted_early
  *		Can this relation be sorted on this EC before the final output step?
diff --git a/src/include/optimizer/paths.h b/src/include/optimizer/paths.h
index f1d111063c..bc17114ba3 100644
--- a/src/include/optimizer/paths.h
+++ b/src/include/optimizer/paths.h
@@ -144,6 +144,7 @@ extern EquivalenceMember *find_computable_ec_member(PlannerInfo *root,
 													Relids relids,
 													bool require_parallel_safe);
 extern Expr *find_em_expr_for_rel(EquivalenceClass *ec, RelOptInfo *rel);
+extern EquivalenceMember *find_em_for_rel(EquivalenceClass *ec, RelOptInfo *rel);
 extern bool relation_can_be_sorted_early(PlannerInfo *root, RelOptInfo *rel,
 										 EquivalenceClass *ec,
 										 bool require_parallel_safe);
#6Ranier Vilela
ranier.vf@gmail.com
In reply to: Ronan Dunklau (#5)
1 attachment(s)
Re: ORDER BY pushdowns seem broken in postgres_fdw

Em qua., 21 de jul. de 2021 às 11:33, Ronan Dunklau <ronan.dunklau@aiven.io>
escreveu:

Le mercredi 21 juillet 2021 15:45:15 CEST, vous avez écrit :

On Thu, 22 Jul 2021 at 00:28, Ronan Dunklau <ronan.dunklau@aiven.io>

wrote:

The attached patch does the following:
- verify the opfamily is shippable to keep pathkeys
- generate a correct order by clause using the actual operator.

Thanks for writing the patch.

This is just a very superficial review. I've not spent a great deal
of time looking at postgres_fdw code, so would rather some eyes that
were more familiar with the code looked too.

Thank you for the review.

1. This comment needs to be updated. It still mentions
is_foreign_expr, which you're no longer calling.

* is_foreign_expr would detect volatile expressions as well, but
* checking ec_has_volatile here saves some cycles.
*/
- if (pathkey_ec->ec_has_volatile ||
- !(em_expr = find_em_expr_for_rel(pathkey_ec, rel)) ||
- !is_foreign_expr(root, rel, em_expr))
+ if (!is_foreign_pathkey(root, rel, pathkey))

Done. By the way, the comment just above mentions we don't have a way to
use a
prefix pathkey, but I suppose we should revisit that now that we have
IncrementalSort. I'll mark it in my todo list for another patch.

2. This is not a very easy return condition to read:

+ return (!pathkey_ec->ec_has_volatile &&
+ (em = find_em_for_rel(pathkey_ec, baserel)) &&
+ is_foreign_expr(root, baserel, em->em_expr) &&
+ is_shippable(pathkey->pk_opfamily, OperatorFamilyRelationId, fpinfo));

I think it would be nicer to break that down into something easier on
the eyes that could be commented a little more.

Done, let me know what you think about it.

3. This comment is no longer true:

* Find an equivalence class member expression, all of whose Vars, come
from * the indicated relation.
*/
-Expr *
-find_em_expr_for_rel(EquivalenceClass *ec, RelOptInfo *rel)
+EquivalenceMember*
+find_em_for_rel(EquivalenceClass *ec, RelOptInfo *rel)

Also, missing space after EquivalenceMember.

The comment can just be moved down to:

+Expr *
+find_em_expr_for_rel(EquivalenceClass *ec, RelOptInfo *rel)
+{
+ EquivalenceMember *em = find_em_for_rel(ec, rel);
+ return em ? em->em_expr : NULL;
+}

and you can rewrite the one for find_em_for_rel.

I have done it the other way around. I'm not sure we really need to keep
the
find_em_expr_for_rel function on HEAD. If we decide to backpatch, it would
need
to be kept though.

Unfortunately your patch does not apply clear into the head.
So I have a few suggestions on v2, attached with the .txt extension to
avoid cf bot.
Please, if ok, make the v3.

1. new version is_foreign_pathke?
+bool
+is_foreign_pathkey(PlannerInfo *root,
+   RelOptInfo *baserel,
+   PathKey *pathkey)
+{
+ EquivalenceClass *pathkey_ec = pathkey->pk_eclass;
+ EquivalenceMember *em;
+
+ /*
+ * is_foreign_expr would detect volatile expressions as well, but
+ * checking ec_has_volatile here saves some cycles.
+ */
+ if (pathkey_ec->ec_has_volatile)
+ return false;
+
+ /*
+ * Found member's expression is foreign?
+ */
+ em = find_em_for_rel(pathkey_ec, baserel);
+ if (em != NULL && is_foreign_expr(root, baserel, em->em_expr))
+   {
+ PgFdwRelationInfo *fpinfo = (PgFdwRelationInfo *) baserel->fdw_private;
+
+ /*
+ * Operator family is shippable?
+ */
+ return is_shippable(pathkey->pk_opfamily, OperatorFamilyRelationId,
fpinfo);
+ }
+
+ return false;
+}

2. appendOrderbyUsingClause function
Put the buffer actions together?

3. Apply style Postgres?
+ if (!HeapTupleIsValid(tuple))
+ {
+ elog(ERROR, "cache lookup failed for operator family %u",
pathkey->pk_opfamily);
+ }
4. Assertion not ok here?
+ em = find_em_for_rel(pathkey->pk_eclass, baserel);
+ em_expr = em->em_expr;
  Assert(em_expr != NULL);

find_em_for_rel function can returns NULL.
I think that is need deal with em_expr == NULL at runtime.

5. More readable version?
+find_em_expr_for_rel(EquivalenceClass *ec, RelOptInfo *rel)
+{
+ EquivalenceMember *em = find_em_for_rel(ec, rel);
+
+ if (em != NULL)
+ return em->em_expr;
+
+ return NULL;
+}

regards,
Ranier Vilela

Attachments:

v3_fix_postgresfdw_orderby_handling.txttext/plain; charset=US-ASCII; name=v3_fix_postgresfdw_orderby_handling.txtDownload
commit f7b3dc81878bf2ac503899f010eb18f390a64e37
Author: Ronan Dunklau <ronan.dunklau@aiven.io>
Date:   Wed Jul 21 12:44:41 2021 +0200

    Fix postgres_fdw PathKey's handling.
    
    The operator family being used for the sort was completely
    ignored, and as such its existence on the foreign server was not
    checked.

diff --git a/contrib/postgres_fdw/deparse.c b/contrib/postgres_fdw/deparse.c
index 31919fda8c..5efefff65e 100644
--- a/contrib/postgres_fdw/deparse.c
+++ b/contrib/postgres_fdw/deparse.c
@@ -37,9 +37,11 @@
 #include "access/sysattr.h"
 #include "access/table.h"
 #include "catalog/pg_aggregate.h"
+#include "catalog/pg_amop.h"
 #include "catalog/pg_collation.h"
 #include "catalog/pg_namespace.h"
 #include "catalog/pg_operator.h"
+#include "catalog/pg_opfamily.h"
 #include "catalog/pg_proc.h"
 #include "catalog/pg_type.h"
 #include "commands/defrem.h"
@@ -180,6 +182,7 @@ static void deparseRangeTblRef(StringInfo buf, PlannerInfo *root,
 							   Index ignore_rel, List **ignore_conds, List **params_list);
 static void deparseAggref(Aggref *node, deparse_expr_cxt *context);
 static void appendGroupByClause(List *tlist, deparse_expr_cxt *context);
+static void appendOrderbyUsingClause(Oid sortop, Oid sortcoltype, deparse_expr_cxt *context);
 static void appendAggOrderBy(List *orderList, List *targetList,
 							 deparse_expr_cxt *context);
 static void appendFunctionName(Oid funcid, deparse_expr_cxt *context);
@@ -910,6 +913,37 @@ is_foreign_param(PlannerInfo *root,
 	return false;
 }
 
+/* Returns true if the given pathkey can be evaluated on the remote side
+ */
+bool
+is_foreign_pathkey(PlannerInfo *root,
+				   RelOptInfo *baserel,
+				   PathKey *pathkey)
+{
+	EquivalenceClass *pathkey_ec = pathkey->pk_eclass;
+	EquivalenceMember *em;
+
+	/*
+	 * is_foreign_expr would detect volatile expressions as well, but
+	 * checking ec_has_volatile here saves some cycles.
+	 */
+	if (pathkey_ec->ec_has_volatile)
+		return false;
+
+	/*
+	 * Found member's expression is foreign?
+ 	 */
+	em = find_em_for_rel(pathkey_ec, baserel);
+	if (em != NULL && is_foreign_expr(root, baserel, em->em_expr))
+   {
+		PgFdwRelationInfo *fpinfo = (PgFdwRelationInfo *) baserel->fdw_private;
+
+		/*
+		 * Operator family is shippable?
+	 	 */
+		return is_shippable(pathkey->pk_opfamily, OperatorFamilyRelationId, fpinfo);
+	}
+
+	return false;
+}
+
 /*
  * Convert type OID + typmod info into a type name we can ship to the remote
  * server.  Someplace else had better have verified that this type name is
@@ -3125,6 +3159,37 @@ deparseAggref(Aggref *node, deparse_expr_cxt *context)
 	appendStringInfoChar(buf, ')');
 }
 
+/*
+ * Append the USING <OPERATOR> part of an ORDER BY clause
+ */
+static void
+appendOrderbyUsingClause(Oid sortop, Oid sortcoltype, deparse_expr_cxt *context)
+{
+	StringInfo	buf = context->buf;
+	TypeCacheEntry *typentry;
+
+	typentry = lookup_type_cache(sortcoltype, TYPECACHE_LT_OPR | TYPECACHE_GT_OPR);
+	if (sortop == typentry->lt_opr)
+		appendStringInfoString(buf, " ASC");
+	else if (sortop == typentry->gt_opr)
+		appendStringInfoString(buf, " DESC");
+	else
+	{
+		HeapTuple	opertup;
+		Form_pg_operator operform;
+
+		/* Append operator name. */
+		opertup = SearchSysCache1(OPEROID, ObjectIdGetDatum(sortop));
+		if (!HeapTupleIsValid(opertup))
+			elog(ERROR, "cache lookup failed for operator %u", sortop);
+		operform = (Form_pg_operator) GETSTRUCT(opertup);
+
+		appendStringInfoString(buf, " USING ");
+		deparseOperatorName(buf, operform);
+
+		ReleaseSysCache(opertup);
+	}
+}
+
 /*
  * Append ORDER BY within aggregate function.
  */
@@ -3140,7 +3205,6 @@ appendAggOrderBy(List *orderList, List *targetList, deparse_expr_cxt *context)
 		SortGroupClause *srt = (SortGroupClause *) lfirst(lc);
 		Node	   *sortexpr;
 		Oid			sortcoltype;
-		TypeCacheEntry *typentry;
 
 		if (!first)
 			appendStringInfoString(buf, ", ");
@@ -3150,27 +3214,7 @@ appendAggOrderBy(List *orderList, List *targetList, deparse_expr_cxt *context)
 										  false, context);
 		sortcoltype = exprType(sortexpr);
 		/* See whether operator is default < or > for datatype */
-		typentry = lookup_type_cache(sortcoltype,
-									 TYPECACHE_LT_OPR | TYPECACHE_GT_OPR);
-		if (srt->sortop == typentry->lt_opr)
-			appendStringInfoString(buf, " ASC");
-		else if (srt->sortop == typentry->gt_opr)
-			appendStringInfoString(buf, " DESC");
-		else
-		{
-			HeapTuple	opertup;
-			Form_pg_operator operform;
-
-			appendStringInfoString(buf, " USING ");
-
-			/* Append operator name. */
-			opertup = SearchSysCache1(OPEROID, ObjectIdGetDatum(srt->sortop));
-			if (!HeapTupleIsValid(opertup))
-				elog(ERROR, "cache lookup failed for operator %u", srt->sortop);
-			operform = (Form_pg_operator) GETSTRUCT(opertup);
-			deparseOperatorName(buf, operform);
-			ReleaseSysCache(opertup);
-		}
+		appendOrderbyUsingClause(srt->sortop, sortcoltype, context);
 
 		if (srt->nulls_first)
 			appendStringInfoString(buf, " NULLS FIRST");
@@ -3280,7 +3324,11 @@ appendOrderByClause(List *pathkeys, bool has_final_sort,
 	foreach(lcell, pathkeys)
 	{
 		PathKey    *pathkey = lfirst(lcell);
+		EquivalenceMember *em;
 		Expr	   *em_expr;
+		HeapTuple	tuple;
+		Oid			oprid;
+		bool		isNull;
 
 		if (has_final_sort)
 		{
@@ -3288,21 +3336,44 @@ appendOrderByClause(List *pathkeys, bool has_final_sort,
 			 * By construction, context->foreignrel is the input relation to
 			 * the final sort.
 			 */
-			em_expr = find_em_expr_for_input_target(context->root,
-													pathkey->pk_eclass,
-													context->foreignrel->reltarget);
+			em = find_em_for_input_target(context->root,
+										  pathkey->pk_eclass,
+										  context->foreignrel->reltarget);
 		}
 		else
-			em_expr = find_em_expr_for_rel(pathkey->pk_eclass, baserel);
+			em = find_em_for_rel(pathkey->pk_eclass, baserel);
+		em_expr = em->em_expr;
+
+		/*
+		 * Lookup the operator corresponding to the strategy in the opclass.
+		 * The datatype used by the opfamily is not necessarily the same as
+		 * the expression type (for array types for example).
+		 */
+		tuple = SearchSysCache4(AMOPSTRATEGY,
+								ObjectIdGetDatum(pathkey->pk_opfamily),
+								em->em_datatype,
+								em->em_datatype,
+								pathkey->pk_strategy);
+		if (!HeapTupleIsValid(tuple))
+			elog(ERROR, "cache lookup failed for operator family %u", pathkey->pk_opfamily);
+
+		oprid = DatumGetObjectId(SysCacheGetAttr(AMOPSTRATEGY, tuple,
+												 Anum_pg_amop_amopopr, &isNull));
+		ReleaseSysCache(tuple);
+
 
 		Assert(em_expr != NULL);
 
 		appendStringInfoString(buf, delim);
 		deparseExpr(em_expr, context);
-		if (pathkey->pk_strategy == BTLessStrategyNumber)
-			appendStringInfoString(buf, " ASC");
-		else
-			appendStringInfoString(buf, " DESC");
+
+		/*
+		 * Here we need to use the expression type to compare against the
+		 * default btree sort operator
+		 */
+		appendOrderbyUsingClause(oprid, exprType((Node *) em_expr), context);
 
 		if (pathkey->pk_nulls_first)
 			appendStringInfoString(buf, " NULLS FIRST");
diff --git a/contrib/postgres_fdw/expected/postgres_fdw.out b/contrib/postgres_fdw/expected/postgres_fdw.out
index ed25e7a743..66bada908f 100644
--- a/contrib/postgres_fdw/expected/postgres_fdw.out
+++ b/contrib/postgres_fdw/expected/postgres_fdw.out
@@ -3168,6 +3168,18 @@ select array_agg(c1 order by c1 using operator(public.<^)) from ft2 where c2 = 6
          Remote SQL: SELECT "C 1", c2 FROM "S 1"."T 1" WHERE (("C 1" < 100)) AND ((c2 = 6))
 (6 rows)
 
+-- This will not be pushed either
+explain verbose select * from ft2 order by c1 using operator(public.<^);
+                                  QUERY PLAN                                   
+-------------------------------------------------------------------------------
+ Sort  (cost=190.83..193.33 rows=1000 width=142)
+   Output: c1, c2, c3, c4, c5, c6, c7, c8
+   Sort Key: ft2.c1 USING <^
+   ->  Foreign Scan on public.ft2  (cost=100.00..141.00 rows=1000 width=47)
+         Output: c1, c2, c3, c4, c5, c6, c7, c8
+         Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1"
+(6 rows)
+
 -- Update local stats on ft2
 ANALYZE ft2;
 -- Add into extension
@@ -3195,6 +3207,15 @@ select array_agg(c1 order by c1 using operator(public.<^)) from ft2 where c2 = 6
  {6,16,26,36,46,56,66,76,86,96}
 (1 row)
 
+-- And this will be pushed too
+explain verbose select * from ft2 order by c1 using operator(public.<^);
+                                                         QUERY PLAN                                                          
+-----------------------------------------------------------------------------------------------------------------------------
+ Foreign Scan on public.ft2  (cost=170.83..193.33 rows=1000 width=47)
+   Output: c1, c2, c3, c4, c5, c6, c7, c8
+   Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" ORDER BY "C 1" USING OPERATOR(public.<^) NULLS LAST
+(3 rows)
+
 -- Remove from extension
 alter extension postgres_fdw drop operator class my_op_class using btree;
 alter extension postgres_fdw drop function my_op_cmp(a int, b int);
diff --git a/contrib/postgres_fdw/postgres_fdw.c b/contrib/postgres_fdw/postgres_fdw.c
index f15c97ad7a..26c54d6a11 100644
--- a/contrib/postgres_fdw/postgres_fdw.c
+++ b/contrib/postgres_fdw/postgres_fdw.c
@@ -916,8 +916,6 @@ get_useful_pathkeys_for_relation(PlannerInfo *root, RelOptInfo *rel)
 		foreach(lc, root->query_pathkeys)
 		{
 			PathKey    *pathkey = (PathKey *) lfirst(lc);
-			EquivalenceClass *pathkey_ec = pathkey->pk_eclass;
-			Expr	   *em_expr;
 
 			/*
 			 * The planner and executor don't have any clever strategy for
@@ -925,13 +923,8 @@ get_useful_pathkeys_for_relation(PlannerInfo *root, RelOptInfo *rel)
 			 * getting it to be sorted by all of those pathkeys. We'll just
 			 * end up resorting the entire data set.  So, unless we can push
 			 * down all of the query pathkeys, forget it.
-			 *
-			 * is_foreign_expr would detect volatile expressions as well, but
-			 * checking ec_has_volatile here saves some cycles.
 			 */
-			if (pathkey_ec->ec_has_volatile ||
-				!(em_expr = find_em_expr_for_rel(pathkey_ec, rel)) ||
-				!is_foreign_expr(root, rel, em_expr))
+			if (!is_foreign_pathkey(root, rel, pathkey))
 			{
 				query_pathkeys_ok = false;
 				break;
@@ -6510,9 +6503,9 @@ add_foreign_ordered_paths(PlannerInfo *root, RelOptInfo *input_rel,
 			return;
 
 		/* Get the sort expression for the pathkey_ec */
-		sort_expr = find_em_expr_for_input_target(root,
+		sort_expr = find_em_for_input_target(root,
 												  pathkey_ec,
-												  input_rel->reltarget);
+												  input_rel->reltarget)->em_expr;
 
 		/* If it's unsafe to remote, we cannot push down the final sort */
 		if (!is_foreign_expr(root, input_rel, sort_expr))
@@ -7250,11 +7243,11 @@ conversion_error_callback(void *arg)
 }
 
 /*
- * Find an equivalence class member expression to be computed as a sort column
+ * Find an equivalence class member to be computed as a sort column
  * in the given target.
  */
-Expr *
-find_em_expr_for_input_target(PlannerInfo *root,
+EquivalenceMember*
+find_em_for_input_target(PlannerInfo *root,
 							  EquivalenceClass *ec,
 							  PathTarget *target)
 {
@@ -7301,7 +7294,7 @@ find_em_expr_for_input_target(PlannerInfo *root,
 				em_expr = ((RelabelType *) em_expr)->arg;
 
 			if (equal(em_expr, expr))
-				return em->em_expr;
+				return em;
 		}
 
 		i++;
diff --git a/contrib/postgres_fdw/postgres_fdw.h b/contrib/postgres_fdw/postgres_fdw.h
index 9591c0f6c2..a0d5c859da 100644
--- a/contrib/postgres_fdw/postgres_fdw.h
+++ b/contrib/postgres_fdw/postgres_fdw.h
@@ -165,12 +165,17 @@ extern void classifyConditions(PlannerInfo *root,
 							   List *input_conds,
 							   List **remote_conds,
 							   List **local_conds);
+
 extern bool is_foreign_expr(PlannerInfo *root,
 							RelOptInfo *baserel,
 							Expr *expr);
 extern bool is_foreign_param(PlannerInfo *root,
 							 RelOptInfo *baserel,
 							 Expr *expr);
+extern bool is_foreign_pathkey(PlannerInfo *root,
+							   RelOptInfo *baserel,
+							   PathKey *pathkey);
+
 extern void deparseInsertSql(StringInfo buf, RangeTblEntry *rte,
 							 Index rtindex, Relation rel,
 							 List *targetAttrs, bool doNothing,
@@ -212,8 +217,8 @@ extern void deparseTruncateSql(StringInfo buf,
 							   DropBehavior behavior,
 							   bool restart_seqs);
 extern void deparseStringLiteral(StringInfo buf, const char *val);
-extern Expr *find_em_expr_for_rel(EquivalenceClass *ec, RelOptInfo *rel);
-extern Expr *find_em_expr_for_input_target(PlannerInfo *root,
+extern EquivalenceMember *find_em_for_rel(EquivalenceClass *ec, RelOptInfo *rel);
+extern EquivalenceMember *find_em_for_input_target(PlannerInfo *root,
 										   EquivalenceClass *ec,
 										   PathTarget *target);
 extern List *build_tlist_to_deparse(RelOptInfo *foreignrel);
diff --git a/contrib/postgres_fdw/sql/postgres_fdw.sql b/contrib/postgres_fdw/sql/postgres_fdw.sql
index 02a6b15a13..841fac84e4 100644
--- a/contrib/postgres_fdw/sql/postgres_fdw.sql
+++ b/contrib/postgres_fdw/sql/postgres_fdw.sql
@@ -873,6 +873,9 @@ create operator class my_op_class for type int using btree family my_op_family a
 explain (verbose, costs off)
 select array_agg(c1 order by c1 using operator(public.<^)) from ft2 where c2 = 6 and c1 < 100 group by c2;
 
+-- This will not be pushed either
+explain verbose select * from ft2 order by c1 using operator(public.<^);
+
 -- Update local stats on ft2
 ANALYZE ft2;
 
@@ -890,6 +893,9 @@ explain (verbose, costs off)
 select array_agg(c1 order by c1 using operator(public.<^)) from ft2 where c2 = 6 and c1 < 100 group by c2;
 select array_agg(c1 order by c1 using operator(public.<^)) from ft2 where c2 = 6 and c1 < 100 group by c2;
 
+-- And this will be pushed too
+explain verbose select * from ft2 order by c1 using operator(public.<^);
+
 -- Remove from extension
 alter extension postgres_fdw drop operator class my_op_class using btree;
 alter extension postgres_fdw drop function my_op_cmp(a int, b int);
diff --git a/src/backend/optimizer/path/equivclass.c b/src/backend/optimizer/path/equivclass.c
index 6f1abbe47d..001296597f 100644
--- a/src/backend/optimizer/path/equivclass.c
+++ b/src/backend/optimizer/path/equivclass.c
@@ -932,11 +932,11 @@ is_exprlist_member(Expr *node, List *exprs)
 }
 
 /*
- * Find an equivalence class member expression, all of whose Vars, come from
+ * Find an equivalence class member, all of whose Vars, come from
  * the indicated relation.
  */
-Expr *
-find_em_expr_for_rel(EquivalenceClass *ec, RelOptInfo *rel)
+EquivalenceMember *
+find_em_for_rel(EquivalenceClass *ec, RelOptInfo *rel)
 {
 	ListCell   *lc_em;
 
@@ -952,7 +952,7 @@ find_em_expr_for_rel(EquivalenceClass *ec, RelOptInfo *rel)
 			 * taken entirely from this relation, we'll be content to choose
 			 * any one of those.
 			 */
-			return em->em_expr;
+			return em;
 		}
 	}
 
@@ -960,6 +960,17 @@ find_em_expr_for_rel(EquivalenceClass *ec, RelOptInfo *rel)
 	return NULL;
 }
 
+/*
+ * Simple wrapper around find_em_for_rel returning it's expression.
+ */
+Expr *
+find_em_expr_for_rel(EquivalenceClass *ec, RelOptInfo *rel)
+{
+	EquivalenceMember *em = find_em_for_rel(ec, rel);
+
+	if (em != NULL)
+		return em->em_expr;
+
+	return NULL;
+}
+
+
 /*
  * relation_can_be_sorted_early
  *		Can this relation be sorted on this EC before the final output step?
diff --git a/src/include/optimizer/paths.h b/src/include/optimizer/paths.h
index f1d111063c..bc17114ba3 100644
--- a/src/include/optimizer/paths.h
+++ b/src/include/optimizer/paths.h
@@ -144,6 +144,7 @@ extern EquivalenceMember *find_computable_ec_member(PlannerInfo *root,
 													Relids relids,
 													bool require_parallel_safe);
 extern Expr *find_em_expr_for_rel(EquivalenceClass *ec, RelOptInfo *rel);
+extern EquivalenceMember *find_em_for_rel(EquivalenceClass *ec, RelOptInfo *rel);
 extern bool relation_can_be_sorted_early(PlannerInfo *root, RelOptInfo *rel,
 										 EquivalenceClass *ec,
 										 bool require_parallel_safe);
#7Ronan Dunklau
ronan.dunklau@aiven.io
In reply to: Ranier Vilela (#6)
1 attachment(s)
Re: ORDER BY pushdowns seem broken in postgres_fdw

Le jeudi 22 juillet 2021, 02:16:52 CEST Ranier Vilela a écrit :

Unfortunately your patch does not apply clear into the head.
So I have a few suggestions on v2, attached with the .txt extension to
avoid cf bot.
Please, if ok, make the v3.

Hum weird, it applied cleanly for me, and was formatted using git show which I
admit is not ideal. Please find it reattached.

2. appendOrderbyUsingClause function
Put the buffer actions together?

Not sure what you mean here ?

3. Apply style Postgres?
+ if (!HeapTupleIsValid(tuple))
+ {
+ elog(ERROR, "cache lookup failed for operator family %u",
pathkey->pk_opfamily);
+ }

Good catch !

4. Assertion not ok here?
+ em = find_em_for_rel(pathkey->pk_eclass, baserel);
+ em_expr = em->em_expr;
Assert(em_expr != NULL);

If we are here there should never be a case where the em can't be found. I
moved the assertion where it makes sense though.

Regards,

--
Ronan Dunklau

Attachments:

v4_fix_postgresfdw_orderby_handling.txttext/plain; charset=UTF-8; name=v4_fix_postgresfdw_orderby_handling.txtDownload
From 1e8fdee27ff69ad7c9ba5c77ce3c3664d70cd251 Mon Sep 17 00:00:00 2001
From: Ronan Dunklau <ronan.dunklau@aiven.io>
Date: Wed, 21 Jul 2021 12:44:41 +0200
Subject: [PATCH v4] Fix postgres_fdw PathKey's handling.

The operator family being used for the sort was completely
ignored, and as such its existence on the foreign server was not
checked.
---
 contrib/postgres_fdw/deparse.c                | 128 +++++++++++++-----
 .../postgres_fdw/expected/postgres_fdw.out    |  21 +++
 contrib/postgres_fdw/postgres_fdw.c           |  21 +--
 contrib/postgres_fdw/postgres_fdw.h           |   9 +-
 contrib/postgres_fdw/sql/postgres_fdw.sql     |   6 +
 src/backend/optimizer/path/equivclass.c       |  19 ++-
 src/include/optimizer/paths.h                 |   1 +
 7 files changed, 154 insertions(+), 51 deletions(-)

diff --git a/contrib/postgres_fdw/deparse.c b/contrib/postgres_fdw/deparse.c
index 31919fda8c..8e9fc512a1 100644
--- a/contrib/postgres_fdw/deparse.c
+++ b/contrib/postgres_fdw/deparse.c
@@ -37,9 +37,11 @@
 #include "access/sysattr.h"
 #include "access/table.h"
 #include "catalog/pg_aggregate.h"
+#include "catalog/pg_amop.h"
 #include "catalog/pg_collation.h"
 #include "catalog/pg_namespace.h"
 #include "catalog/pg_operator.h"
+#include "catalog/pg_opfamily.h"
 #include "catalog/pg_proc.h"
 #include "catalog/pg_type.h"
 #include "commands/defrem.h"
@@ -180,6 +182,7 @@ static void deparseRangeTblRef(StringInfo buf, PlannerInfo *root,
 							   Index ignore_rel, List **ignore_conds, List **params_list);
 static void deparseAggref(Aggref *node, deparse_expr_cxt *context);
 static void appendGroupByClause(List *tlist, deparse_expr_cxt *context);
+static void appendOrderbyUsingClause(Oid sortop, Oid sortcoltype, deparse_expr_cxt *context);
 static void appendAggOrderBy(List *orderList, List *targetList,
 							 deparse_expr_cxt *context);
 static void appendFunctionName(Oid funcid, deparse_expr_cxt *context);
@@ -910,6 +913,37 @@ is_foreign_param(PlannerInfo *root,
 	return false;
 }
 
+/* Returns true if the given pathkey can be evaluated on the remote side
+ */
+bool
+is_foreign_pathkey(PlannerInfo *root,
+				   RelOptInfo *baserel,
+				   PathKey *pathkey)
+{
+	EquivalenceClass *pathkey_ec = pathkey->pk_eclass;
+	EquivalenceMember *em;
+	PgFdwRelationInfo *fpinfo = (PgFdwRelationInfo *) baserel->fdw_private;
+
+	/*
+	 * is_foreign_expr would detect volatile expressions as well, but
+	 * checking ec_has_volatile here saves some cycles.
+	 */
+	if (pathkey_ec->ec_has_volatile)
+		return false;
+
+	em = find_em_for_rel(pathkey_ec, baserel);
+	if (em == NULL)
+		return false;
+
+	/*
+	 * Finally, verify the found member's expression is foreign and its operator
+	 * family is shippable.
+	 */
+	return (is_foreign_expr(root, baserel, em->em_expr) &&
+			is_shippable(pathkey->pk_opfamily, OperatorFamilyRelationId, fpinfo));
+}
+
+
 /*
  * Convert type OID + typmod info into a type name we can ship to the remote
  * server.  Someplace else had better have verified that this type name is
@@ -3125,6 +3159,37 @@ deparseAggref(Aggref *node, deparse_expr_cxt *context)
 	appendStringInfoChar(buf, ')');
 }
 
+/*
+ * Append the USING <OPERATOR> part of an ORDER BY clause
+ */
+static void
+appendOrderbyUsingClause(Oid sortop, Oid sortcoltype, deparse_expr_cxt *context)
+{
+	StringInfo	buf = context->buf;
+	TypeCacheEntry *typentry;
+	typentry = lookup_type_cache(sortcoltype, TYPECACHE_LT_OPR | TYPECACHE_GT_OPR);
+
+	if (sortop == typentry->lt_opr)
+		appendStringInfoString(buf, " ASC");
+	else if (sortop == typentry->gt_opr)
+		appendStringInfoString(buf, " DESC");
+	else
+	{
+		HeapTuple	opertup;
+		Form_pg_operator operform;
+
+		appendStringInfoString(buf, " USING ");
+
+		/* Append operator name. */
+		opertup = SearchSysCache1(OPEROID, ObjectIdGetDatum(sortop));
+		if (!HeapTupleIsValid(opertup))
+			elog(ERROR, "cache lookup failed for operator %u", sortop);
+		operform = (Form_pg_operator) GETSTRUCT(opertup);
+		deparseOperatorName(buf, operform);
+		ReleaseSysCache(opertup);
+	}
+}
+
 /*
  * Append ORDER BY within aggregate function.
  */
@@ -3140,7 +3205,6 @@ appendAggOrderBy(List *orderList, List *targetList, deparse_expr_cxt *context)
 		SortGroupClause *srt = (SortGroupClause *) lfirst(lc);
 		Node	   *sortexpr;
 		Oid			sortcoltype;
-		TypeCacheEntry *typentry;
 
 		if (!first)
 			appendStringInfoString(buf, ", ");
@@ -3150,27 +3214,7 @@ appendAggOrderBy(List *orderList, List *targetList, deparse_expr_cxt *context)
 										  false, context);
 		sortcoltype = exprType(sortexpr);
 		/* See whether operator is default < or > for datatype */
-		typentry = lookup_type_cache(sortcoltype,
-									 TYPECACHE_LT_OPR | TYPECACHE_GT_OPR);
-		if (srt->sortop == typentry->lt_opr)
-			appendStringInfoString(buf, " ASC");
-		else if (srt->sortop == typentry->gt_opr)
-			appendStringInfoString(buf, " DESC");
-		else
-		{
-			HeapTuple	opertup;
-			Form_pg_operator operform;
-
-			appendStringInfoString(buf, " USING ");
-
-			/* Append operator name. */
-			opertup = SearchSysCache1(OPEROID, ObjectIdGetDatum(srt->sortop));
-			if (!HeapTupleIsValid(opertup))
-				elog(ERROR, "cache lookup failed for operator %u", srt->sortop);
-			operform = (Form_pg_operator) GETSTRUCT(opertup);
-			deparseOperatorName(buf, operform);
-			ReleaseSysCache(opertup);
-		}
+		appendOrderbyUsingClause(srt->sortop, sortcoltype, context);
 
 		if (srt->nulls_first)
 			appendStringInfoString(buf, " NULLS FIRST");
@@ -3280,7 +3324,11 @@ appendOrderByClause(List *pathkeys, bool has_final_sort,
 	foreach(lcell, pathkeys)
 	{
 		PathKey    *pathkey = lfirst(lcell);
+		EquivalenceMember *em;
 		Expr	   *em_expr;
+		HeapTuple	tuple;
+		Oid			oprid;
+		bool		isNull;
 
 		if (has_final_sort)
 		{
@@ -3288,21 +3336,39 @@ appendOrderByClause(List *pathkeys, bool has_final_sort,
 			 * By construction, context->foreignrel is the input relation to
 			 * the final sort.
 			 */
-			em_expr = find_em_expr_for_input_target(context->root,
-													pathkey->pk_eclass,
-													context->foreignrel->reltarget);
+			em = find_em_for_input_target(context->root,
+										  pathkey->pk_eclass,
+										  context->foreignrel->reltarget);
 		}
 		else
-			em_expr = find_em_expr_for_rel(pathkey->pk_eclass, baserel);
+			em = find_em_for_rel(pathkey->pk_eclass, baserel);
+		em_expr = em->em_expr;
+		Assert(em != NULL);
 
-		Assert(em_expr != NULL);
+		/*
+		 * Lookup the operator corresponding to the strategy in the opclass.
+		 * The datatype used by the opfamily is not necessarily the same as
+		 * the expression type (for array types for example).
+		 */
+		tuple = SearchSysCache4(AMOPSTRATEGY,
+								ObjectIdGetDatum(pathkey->pk_opfamily),
+								em->em_datatype,
+								em->em_datatype,
+								pathkey->pk_strategy);
+		if (!HeapTupleIsValid(tuple))
+			elog(ERROR, "cache lookup failed for operator family %u", pathkey->pk_opfamily);
+		oprid = DatumGetObjectId(SysCacheGetAttr(AMOPSTRATEGY, tuple,
+												 Anum_pg_amop_amopopr, &isNull));
+		ReleaseSysCache(tuple);
 
 		appendStringInfoString(buf, delim);
 		deparseExpr(em_expr, context);
-		if (pathkey->pk_strategy == BTLessStrategyNumber)
-			appendStringInfoString(buf, " ASC");
-		else
-			appendStringInfoString(buf, " DESC");
+
+		/*
+		 * Here we need to use the expression type to compare against the
+		 * default btree sort operator
+		 */
+		appendOrderbyUsingClause(oprid, exprType((Node *) em_expr), context);
 
 		if (pathkey->pk_nulls_first)
 			appendStringInfoString(buf, " NULLS FIRST");
diff --git a/contrib/postgres_fdw/expected/postgres_fdw.out b/contrib/postgres_fdw/expected/postgres_fdw.out
index ed25e7a743..66bada908f 100644
--- a/contrib/postgres_fdw/expected/postgres_fdw.out
+++ b/contrib/postgres_fdw/expected/postgres_fdw.out
@@ -3168,6 +3168,18 @@ select array_agg(c1 order by c1 using operator(public.<^)) from ft2 where c2 = 6
          Remote SQL: SELECT "C 1", c2 FROM "S 1"."T 1" WHERE (("C 1" < 100)) AND ((c2 = 6))
 (6 rows)
 
+-- This will not be pushed either
+explain verbose select * from ft2 order by c1 using operator(public.<^);
+                                  QUERY PLAN                                   
+-------------------------------------------------------------------------------
+ Sort  (cost=190.83..193.33 rows=1000 width=142)
+   Output: c1, c2, c3, c4, c5, c6, c7, c8
+   Sort Key: ft2.c1 USING <^
+   ->  Foreign Scan on public.ft2  (cost=100.00..141.00 rows=1000 width=47)
+         Output: c1, c2, c3, c4, c5, c6, c7, c8
+         Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1"
+(6 rows)
+
 -- Update local stats on ft2
 ANALYZE ft2;
 -- Add into extension
@@ -3195,6 +3207,15 @@ select array_agg(c1 order by c1 using operator(public.<^)) from ft2 where c2 = 6
  {6,16,26,36,46,56,66,76,86,96}
 (1 row)
 
+-- And this will be pushed too
+explain verbose select * from ft2 order by c1 using operator(public.<^);
+                                                         QUERY PLAN                                                          
+-----------------------------------------------------------------------------------------------------------------------------
+ Foreign Scan on public.ft2  (cost=170.83..193.33 rows=1000 width=47)
+   Output: c1, c2, c3, c4, c5, c6, c7, c8
+   Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" ORDER BY "C 1" USING OPERATOR(public.<^) NULLS LAST
+(3 rows)
+
 -- Remove from extension
 alter extension postgres_fdw drop operator class my_op_class using btree;
 alter extension postgres_fdw drop function my_op_cmp(a int, b int);
diff --git a/contrib/postgres_fdw/postgres_fdw.c b/contrib/postgres_fdw/postgres_fdw.c
index f15c97ad7a..26c54d6a11 100644
--- a/contrib/postgres_fdw/postgres_fdw.c
+++ b/contrib/postgres_fdw/postgres_fdw.c
@@ -916,8 +916,6 @@ get_useful_pathkeys_for_relation(PlannerInfo *root, RelOptInfo *rel)
 		foreach(lc, root->query_pathkeys)
 		{
 			PathKey    *pathkey = (PathKey *) lfirst(lc);
-			EquivalenceClass *pathkey_ec = pathkey->pk_eclass;
-			Expr	   *em_expr;
 
 			/*
 			 * The planner and executor don't have any clever strategy for
@@ -925,13 +923,8 @@ get_useful_pathkeys_for_relation(PlannerInfo *root, RelOptInfo *rel)
 			 * getting it to be sorted by all of those pathkeys. We'll just
 			 * end up resorting the entire data set.  So, unless we can push
 			 * down all of the query pathkeys, forget it.
-			 *
-			 * is_foreign_expr would detect volatile expressions as well, but
-			 * checking ec_has_volatile here saves some cycles.
 			 */
-			if (pathkey_ec->ec_has_volatile ||
-				!(em_expr = find_em_expr_for_rel(pathkey_ec, rel)) ||
-				!is_foreign_expr(root, rel, em_expr))
+			if (!is_foreign_pathkey(root, rel, pathkey))
 			{
 				query_pathkeys_ok = false;
 				break;
@@ -6510,9 +6503,9 @@ add_foreign_ordered_paths(PlannerInfo *root, RelOptInfo *input_rel,
 			return;
 
 		/* Get the sort expression for the pathkey_ec */
-		sort_expr = find_em_expr_for_input_target(root,
+		sort_expr = find_em_for_input_target(root,
 												  pathkey_ec,
-												  input_rel->reltarget);
+												  input_rel->reltarget)->em_expr;
 
 		/* If it's unsafe to remote, we cannot push down the final sort */
 		if (!is_foreign_expr(root, input_rel, sort_expr))
@@ -7250,11 +7243,11 @@ conversion_error_callback(void *arg)
 }
 
 /*
- * Find an equivalence class member expression to be computed as a sort column
+ * Find an equivalence class member to be computed as a sort column
  * in the given target.
  */
-Expr *
-find_em_expr_for_input_target(PlannerInfo *root,
+EquivalenceMember*
+find_em_for_input_target(PlannerInfo *root,
 							  EquivalenceClass *ec,
 							  PathTarget *target)
 {
@@ -7301,7 +7294,7 @@ find_em_expr_for_input_target(PlannerInfo *root,
 				em_expr = ((RelabelType *) em_expr)->arg;
 
 			if (equal(em_expr, expr))
-				return em->em_expr;
+				return em;
 		}
 
 		i++;
diff --git a/contrib/postgres_fdw/postgres_fdw.h b/contrib/postgres_fdw/postgres_fdw.h
index 9591c0f6c2..a0d5c859da 100644
--- a/contrib/postgres_fdw/postgres_fdw.h
+++ b/contrib/postgres_fdw/postgres_fdw.h
@@ -165,12 +165,17 @@ extern void classifyConditions(PlannerInfo *root,
 							   List *input_conds,
 							   List **remote_conds,
 							   List **local_conds);
+
 extern bool is_foreign_expr(PlannerInfo *root,
 							RelOptInfo *baserel,
 							Expr *expr);
 extern bool is_foreign_param(PlannerInfo *root,
 							 RelOptInfo *baserel,
 							 Expr *expr);
+extern bool is_foreign_pathkey(PlannerInfo *root,
+							   RelOptInfo *baserel,
+							   PathKey *pathkey);
+
 extern void deparseInsertSql(StringInfo buf, RangeTblEntry *rte,
 							 Index rtindex, Relation rel,
 							 List *targetAttrs, bool doNothing,
@@ -212,8 +217,8 @@ extern void deparseTruncateSql(StringInfo buf,
 							   DropBehavior behavior,
 							   bool restart_seqs);
 extern void deparseStringLiteral(StringInfo buf, const char *val);
-extern Expr *find_em_expr_for_rel(EquivalenceClass *ec, RelOptInfo *rel);
-extern Expr *find_em_expr_for_input_target(PlannerInfo *root,
+extern EquivalenceMember *find_em_for_rel(EquivalenceClass *ec, RelOptInfo *rel);
+extern EquivalenceMember *find_em_for_input_target(PlannerInfo *root,
 										   EquivalenceClass *ec,
 										   PathTarget *target);
 extern List *build_tlist_to_deparse(RelOptInfo *foreignrel);
diff --git a/contrib/postgres_fdw/sql/postgres_fdw.sql b/contrib/postgres_fdw/sql/postgres_fdw.sql
index 02a6b15a13..841fac84e4 100644
--- a/contrib/postgres_fdw/sql/postgres_fdw.sql
+++ b/contrib/postgres_fdw/sql/postgres_fdw.sql
@@ -873,6 +873,9 @@ create operator class my_op_class for type int using btree family my_op_family a
 explain (verbose, costs off)
 select array_agg(c1 order by c1 using operator(public.<^)) from ft2 where c2 = 6 and c1 < 100 group by c2;
 
+-- This will not be pushed either
+explain verbose select * from ft2 order by c1 using operator(public.<^);
+
 -- Update local stats on ft2
 ANALYZE ft2;
 
@@ -890,6 +893,9 @@ explain (verbose, costs off)
 select array_agg(c1 order by c1 using operator(public.<^)) from ft2 where c2 = 6 and c1 < 100 group by c2;
 select array_agg(c1 order by c1 using operator(public.<^)) from ft2 where c2 = 6 and c1 < 100 group by c2;
 
+-- And this will be pushed too
+explain verbose select * from ft2 order by c1 using operator(public.<^);
+
 -- Remove from extension
 alter extension postgres_fdw drop operator class my_op_class using btree;
 alter extension postgres_fdw drop function my_op_cmp(a int, b int);
diff --git a/src/backend/optimizer/path/equivclass.c b/src/backend/optimizer/path/equivclass.c
index 6f1abbe47d..001296597f 100644
--- a/src/backend/optimizer/path/equivclass.c
+++ b/src/backend/optimizer/path/equivclass.c
@@ -932,11 +932,11 @@ is_exprlist_member(Expr *node, List *exprs)
 }
 
 /*
- * Find an equivalence class member expression, all of whose Vars, come from
+ * Find an equivalence class member, all of whose Vars, come from
  * the indicated relation.
  */
-Expr *
-find_em_expr_for_rel(EquivalenceClass *ec, RelOptInfo *rel)
+EquivalenceMember *
+find_em_for_rel(EquivalenceClass *ec, RelOptInfo *rel)
 {
 	ListCell   *lc_em;
 
@@ -952,7 +952,7 @@ find_em_expr_for_rel(EquivalenceClass *ec, RelOptInfo *rel)
 			 * taken entirely from this relation, we'll be content to choose
 			 * any one of those.
 			 */
-			return em->em_expr;
+			return em;
 		}
 	}
 
@@ -960,6 +960,17 @@ find_em_expr_for_rel(EquivalenceClass *ec, RelOptInfo *rel)
 	return NULL;
 }
 
+/*
+ * Simple wrapper around find_em_for_rel returning it's expression.
+ */
+Expr *
+find_em_expr_for_rel(EquivalenceClass *ec, RelOptInfo *rel)
+{
+	EquivalenceMember *em = find_em_for_rel(ec, rel);
+	return em ? em->em_expr : NULL;
+}
+
+
 /*
  * relation_can_be_sorted_early
  *		Can this relation be sorted on this EC before the final output step?
diff --git a/src/include/optimizer/paths.h b/src/include/optimizer/paths.h
index f1d111063c..bc17114ba3 100644
--- a/src/include/optimizer/paths.h
+++ b/src/include/optimizer/paths.h
@@ -144,6 +144,7 @@ extern EquivalenceMember *find_computable_ec_member(PlannerInfo *root,
 													Relids relids,
 													bool require_parallel_safe);
 extern Expr *find_em_expr_for_rel(EquivalenceClass *ec, RelOptInfo *rel);
+extern EquivalenceMember *find_em_for_rel(EquivalenceClass *ec, RelOptInfo *rel);
 extern bool relation_can_be_sorted_early(PlannerInfo *root, RelOptInfo *rel,
 										 EquivalenceClass *ec,
 										 bool require_parallel_safe);
-- 
2.32.0

#8David Rowley
dgrowleyml@gmail.com
In reply to: Ronan Dunklau (#7)
Re: ORDER BY pushdowns seem broken in postgres_fdw

On Thu, 22 Jul 2021 at 19:00, Ronan Dunklau <ronan.dunklau@aiven.io> wrote:

Please find it reattached.

+-- This will not be pushed either
+explain verbose select * from ft2 order by c1 using operator(public.<^);
+                                  QUERY PLAN
+-------------------------------------------------------------------------------
+ Sort  (cost=190.83..193.33 rows=1000 width=142)

Can you also use explain (verbose, costs off) the same as the other
tests in that area. Having the costs there would never survive a run
of the buildfarm. Different hardware will produce different costs, e.g
32-bit hardware might cost cheaper due to narrower widths.

History lesson: costs off was added so we could test plans. Before
that, I don't think that the regression tests had any coverage for
plans. Older test files still likely lack much testing with EXPLAIN.

David

#9Ronan Dunklau
ronan.dunklau@aiven.io
In reply to: David Rowley (#8)
1 attachment(s)
Re: ORDER BY pushdowns seem broken in postgres_fdw

Le jeudi 22 juillet 2021, 09:44:54 CEST David Rowley a écrit :

+-- This will not be pushed either
+explain verbose select * from ft2 order by c1 using operator(public.<^);
+                                  QUERY PLAN
+---------------------------------------------------------------------------
---- + Sort  (cost=190.83..193.33 rows=1000 width=142)

Can you also use explain (verbose, costs off) the same as the other
tests in that area. Having the costs there would never survive a run
of the buildfarm. Different hardware will produce different costs, e.g
32-bit hardware might cost cheaper due to narrower widths.

Sorry about that. Here it is.

Regards,

--
Ronan Dunklau

Attachments:

v5_fix_postgresfdw_orderby_handling.patchtext/x-patch; charset=UTF-8; name=v5_fix_postgresfdw_orderby_handling.patchDownload
From 89217c39631440198111be931634ed2f227e08e0 Mon Sep 17 00:00:00 2001
From: Ronan Dunklau <ronan.dunklau@aiven.io>
Date: Wed, 21 Jul 2021 12:44:41 +0200
Subject: [PATCH v5] Fix postgres_fdw PathKey's handling.

The operator family being used for the sort was completely
ignored, and as such its existence on the foreign server was not
checked.
---
 contrib/postgres_fdw/deparse.c                | 128 +++++++++++++-----
 .../postgres_fdw/expected/postgres_fdw.out    |  21 +++
 contrib/postgres_fdw/postgres_fdw.c           |  21 +--
 contrib/postgres_fdw/postgres_fdw.h           |   9 +-
 contrib/postgres_fdw/sql/postgres_fdw.sql     |   6 +
 src/backend/optimizer/path/equivclass.c       |  19 ++-
 src/include/optimizer/paths.h                 |   1 +
 7 files changed, 154 insertions(+), 51 deletions(-)

diff --git a/contrib/postgres_fdw/deparse.c b/contrib/postgres_fdw/deparse.c
index 31919fda8c..8e9fc512a1 100644
--- a/contrib/postgres_fdw/deparse.c
+++ b/contrib/postgres_fdw/deparse.c
@@ -37,9 +37,11 @@
 #include "access/sysattr.h"
 #include "access/table.h"
 #include "catalog/pg_aggregate.h"
+#include "catalog/pg_amop.h"
 #include "catalog/pg_collation.h"
 #include "catalog/pg_namespace.h"
 #include "catalog/pg_operator.h"
+#include "catalog/pg_opfamily.h"
 #include "catalog/pg_proc.h"
 #include "catalog/pg_type.h"
 #include "commands/defrem.h"
@@ -180,6 +182,7 @@ static void deparseRangeTblRef(StringInfo buf, PlannerInfo *root,
 							   Index ignore_rel, List **ignore_conds, List **params_list);
 static void deparseAggref(Aggref *node, deparse_expr_cxt *context);
 static void appendGroupByClause(List *tlist, deparse_expr_cxt *context);
+static void appendOrderbyUsingClause(Oid sortop, Oid sortcoltype, deparse_expr_cxt *context);
 static void appendAggOrderBy(List *orderList, List *targetList,
 							 deparse_expr_cxt *context);
 static void appendFunctionName(Oid funcid, deparse_expr_cxt *context);
@@ -910,6 +913,37 @@ is_foreign_param(PlannerInfo *root,
 	return false;
 }
 
+/* Returns true if the given pathkey can be evaluated on the remote side
+ */
+bool
+is_foreign_pathkey(PlannerInfo *root,
+				   RelOptInfo *baserel,
+				   PathKey *pathkey)
+{
+	EquivalenceClass *pathkey_ec = pathkey->pk_eclass;
+	EquivalenceMember *em;
+	PgFdwRelationInfo *fpinfo = (PgFdwRelationInfo *) baserel->fdw_private;
+
+	/*
+	 * is_foreign_expr would detect volatile expressions as well, but
+	 * checking ec_has_volatile here saves some cycles.
+	 */
+	if (pathkey_ec->ec_has_volatile)
+		return false;
+
+	em = find_em_for_rel(pathkey_ec, baserel);
+	if (em == NULL)
+		return false;
+
+	/*
+	 * Finally, verify the found member's expression is foreign and its operator
+	 * family is shippable.
+	 */
+	return (is_foreign_expr(root, baserel, em->em_expr) &&
+			is_shippable(pathkey->pk_opfamily, OperatorFamilyRelationId, fpinfo));
+}
+
+
 /*
  * Convert type OID + typmod info into a type name we can ship to the remote
  * server.  Someplace else had better have verified that this type name is
@@ -3125,6 +3159,37 @@ deparseAggref(Aggref *node, deparse_expr_cxt *context)
 	appendStringInfoChar(buf, ')');
 }
 
+/*
+ * Append the USING <OPERATOR> part of an ORDER BY clause
+ */
+static void
+appendOrderbyUsingClause(Oid sortop, Oid sortcoltype, deparse_expr_cxt *context)
+{
+	StringInfo	buf = context->buf;
+	TypeCacheEntry *typentry;
+	typentry = lookup_type_cache(sortcoltype, TYPECACHE_LT_OPR | TYPECACHE_GT_OPR);
+
+	if (sortop == typentry->lt_opr)
+		appendStringInfoString(buf, " ASC");
+	else if (sortop == typentry->gt_opr)
+		appendStringInfoString(buf, " DESC");
+	else
+	{
+		HeapTuple	opertup;
+		Form_pg_operator operform;
+
+		appendStringInfoString(buf, " USING ");
+
+		/* Append operator name. */
+		opertup = SearchSysCache1(OPEROID, ObjectIdGetDatum(sortop));
+		if (!HeapTupleIsValid(opertup))
+			elog(ERROR, "cache lookup failed for operator %u", sortop);
+		operform = (Form_pg_operator) GETSTRUCT(opertup);
+		deparseOperatorName(buf, operform);
+		ReleaseSysCache(opertup);
+	}
+}
+
 /*
  * Append ORDER BY within aggregate function.
  */
@@ -3140,7 +3205,6 @@ appendAggOrderBy(List *orderList, List *targetList, deparse_expr_cxt *context)
 		SortGroupClause *srt = (SortGroupClause *) lfirst(lc);
 		Node	   *sortexpr;
 		Oid			sortcoltype;
-		TypeCacheEntry *typentry;
 
 		if (!first)
 			appendStringInfoString(buf, ", ");
@@ -3150,27 +3214,7 @@ appendAggOrderBy(List *orderList, List *targetList, deparse_expr_cxt *context)
 										  false, context);
 		sortcoltype = exprType(sortexpr);
 		/* See whether operator is default < or > for datatype */
-		typentry = lookup_type_cache(sortcoltype,
-									 TYPECACHE_LT_OPR | TYPECACHE_GT_OPR);
-		if (srt->sortop == typentry->lt_opr)
-			appendStringInfoString(buf, " ASC");
-		else if (srt->sortop == typentry->gt_opr)
-			appendStringInfoString(buf, " DESC");
-		else
-		{
-			HeapTuple	opertup;
-			Form_pg_operator operform;
-
-			appendStringInfoString(buf, " USING ");
-
-			/* Append operator name. */
-			opertup = SearchSysCache1(OPEROID, ObjectIdGetDatum(srt->sortop));
-			if (!HeapTupleIsValid(opertup))
-				elog(ERROR, "cache lookup failed for operator %u", srt->sortop);
-			operform = (Form_pg_operator) GETSTRUCT(opertup);
-			deparseOperatorName(buf, operform);
-			ReleaseSysCache(opertup);
-		}
+		appendOrderbyUsingClause(srt->sortop, sortcoltype, context);
 
 		if (srt->nulls_first)
 			appendStringInfoString(buf, " NULLS FIRST");
@@ -3280,7 +3324,11 @@ appendOrderByClause(List *pathkeys, bool has_final_sort,
 	foreach(lcell, pathkeys)
 	{
 		PathKey    *pathkey = lfirst(lcell);
+		EquivalenceMember *em;
 		Expr	   *em_expr;
+		HeapTuple	tuple;
+		Oid			oprid;
+		bool		isNull;
 
 		if (has_final_sort)
 		{
@@ -3288,21 +3336,39 @@ appendOrderByClause(List *pathkeys, bool has_final_sort,
 			 * By construction, context->foreignrel is the input relation to
 			 * the final sort.
 			 */
-			em_expr = find_em_expr_for_input_target(context->root,
-													pathkey->pk_eclass,
-													context->foreignrel->reltarget);
+			em = find_em_for_input_target(context->root,
+										  pathkey->pk_eclass,
+										  context->foreignrel->reltarget);
 		}
 		else
-			em_expr = find_em_expr_for_rel(pathkey->pk_eclass, baserel);
+			em = find_em_for_rel(pathkey->pk_eclass, baserel);
+		em_expr = em->em_expr;
+		Assert(em != NULL);
 
-		Assert(em_expr != NULL);
+		/*
+		 * Lookup the operator corresponding to the strategy in the opclass.
+		 * The datatype used by the opfamily is not necessarily the same as
+		 * the expression type (for array types for example).
+		 */
+		tuple = SearchSysCache4(AMOPSTRATEGY,
+								ObjectIdGetDatum(pathkey->pk_opfamily),
+								em->em_datatype,
+								em->em_datatype,
+								pathkey->pk_strategy);
+		if (!HeapTupleIsValid(tuple))
+			elog(ERROR, "cache lookup failed for operator family %u", pathkey->pk_opfamily);
+		oprid = DatumGetObjectId(SysCacheGetAttr(AMOPSTRATEGY, tuple,
+												 Anum_pg_amop_amopopr, &isNull));
+		ReleaseSysCache(tuple);
 
 		appendStringInfoString(buf, delim);
 		deparseExpr(em_expr, context);
-		if (pathkey->pk_strategy == BTLessStrategyNumber)
-			appendStringInfoString(buf, " ASC");
-		else
-			appendStringInfoString(buf, " DESC");
+
+		/*
+		 * Here we need to use the expression type to compare against the
+		 * default btree sort operator
+		 */
+		appendOrderbyUsingClause(oprid, exprType((Node *) em_expr), context);
 
 		if (pathkey->pk_nulls_first)
 			appendStringInfoString(buf, " NULLS FIRST");
diff --git a/contrib/postgres_fdw/expected/postgres_fdw.out b/contrib/postgres_fdw/expected/postgres_fdw.out
index ed25e7a743..a5f09da589 100644
--- a/contrib/postgres_fdw/expected/postgres_fdw.out
+++ b/contrib/postgres_fdw/expected/postgres_fdw.out
@@ -3168,6 +3168,18 @@ select array_agg(c1 order by c1 using operator(public.<^)) from ft2 where c2 = 6
          Remote SQL: SELECT "C 1", c2 FROM "S 1"."T 1" WHERE (("C 1" < 100)) AND ((c2 = 6))
 (6 rows)
 
+-- This will not be pushed either
+explain (verbose, costs off) select * from ft2 order by c1 using operator(public.<^);
+                                  QUERY PLAN                                   
+-------------------------------------------------------------------------------
+ Sort
+   Output: c1, c2, c3, c4, c5, c6, c7, c8
+   Sort Key: ft2.c1 USING <^
+   ->  Foreign Scan on public.ft2
+         Output: c1, c2, c3, c4, c5, c6, c7, c8
+         Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1"
+(6 rows)
+
 -- Update local stats on ft2
 ANALYZE ft2;
 -- Add into extension
@@ -3195,6 +3207,15 @@ select array_agg(c1 order by c1 using operator(public.<^)) from ft2 where c2 = 6
  {6,16,26,36,46,56,66,76,86,96}
 (1 row)
 
+-- And this will be pushed too
+explain (verbose, costs off) select * from ft2 order by c1 using operator(public.<^);
+                                                         QUERY PLAN                                                          
+-----------------------------------------------------------------------------------------------------------------------------
+ Foreign Scan on public.ft2
+   Output: c1, c2, c3, c4, c5, c6, c7, c8
+   Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" ORDER BY "C 1" USING OPERATOR(public.<^) NULLS LAST
+(3 rows)
+
 -- Remove from extension
 alter extension postgres_fdw drop operator class my_op_class using btree;
 alter extension postgres_fdw drop function my_op_cmp(a int, b int);
diff --git a/contrib/postgres_fdw/postgres_fdw.c b/contrib/postgres_fdw/postgres_fdw.c
index f15c97ad7a..26c54d6a11 100644
--- a/contrib/postgres_fdw/postgres_fdw.c
+++ b/contrib/postgres_fdw/postgres_fdw.c
@@ -916,8 +916,6 @@ get_useful_pathkeys_for_relation(PlannerInfo *root, RelOptInfo *rel)
 		foreach(lc, root->query_pathkeys)
 		{
 			PathKey    *pathkey = (PathKey *) lfirst(lc);
-			EquivalenceClass *pathkey_ec = pathkey->pk_eclass;
-			Expr	   *em_expr;
 
 			/*
 			 * The planner and executor don't have any clever strategy for
@@ -925,13 +923,8 @@ get_useful_pathkeys_for_relation(PlannerInfo *root, RelOptInfo *rel)
 			 * getting it to be sorted by all of those pathkeys. We'll just
 			 * end up resorting the entire data set.  So, unless we can push
 			 * down all of the query pathkeys, forget it.
-			 *
-			 * is_foreign_expr would detect volatile expressions as well, but
-			 * checking ec_has_volatile here saves some cycles.
 			 */
-			if (pathkey_ec->ec_has_volatile ||
-				!(em_expr = find_em_expr_for_rel(pathkey_ec, rel)) ||
-				!is_foreign_expr(root, rel, em_expr))
+			if (!is_foreign_pathkey(root, rel, pathkey))
 			{
 				query_pathkeys_ok = false;
 				break;
@@ -6510,9 +6503,9 @@ add_foreign_ordered_paths(PlannerInfo *root, RelOptInfo *input_rel,
 			return;
 
 		/* Get the sort expression for the pathkey_ec */
-		sort_expr = find_em_expr_for_input_target(root,
+		sort_expr = find_em_for_input_target(root,
 												  pathkey_ec,
-												  input_rel->reltarget);
+												  input_rel->reltarget)->em_expr;
 
 		/* If it's unsafe to remote, we cannot push down the final sort */
 		if (!is_foreign_expr(root, input_rel, sort_expr))
@@ -7250,11 +7243,11 @@ conversion_error_callback(void *arg)
 }
 
 /*
- * Find an equivalence class member expression to be computed as a sort column
+ * Find an equivalence class member to be computed as a sort column
  * in the given target.
  */
-Expr *
-find_em_expr_for_input_target(PlannerInfo *root,
+EquivalenceMember*
+find_em_for_input_target(PlannerInfo *root,
 							  EquivalenceClass *ec,
 							  PathTarget *target)
 {
@@ -7301,7 +7294,7 @@ find_em_expr_for_input_target(PlannerInfo *root,
 				em_expr = ((RelabelType *) em_expr)->arg;
 
 			if (equal(em_expr, expr))
-				return em->em_expr;
+				return em;
 		}
 
 		i++;
diff --git a/contrib/postgres_fdw/postgres_fdw.h b/contrib/postgres_fdw/postgres_fdw.h
index 9591c0f6c2..a0d5c859da 100644
--- a/contrib/postgres_fdw/postgres_fdw.h
+++ b/contrib/postgres_fdw/postgres_fdw.h
@@ -165,12 +165,17 @@ extern void classifyConditions(PlannerInfo *root,
 							   List *input_conds,
 							   List **remote_conds,
 							   List **local_conds);
+
 extern bool is_foreign_expr(PlannerInfo *root,
 							RelOptInfo *baserel,
 							Expr *expr);
 extern bool is_foreign_param(PlannerInfo *root,
 							 RelOptInfo *baserel,
 							 Expr *expr);
+extern bool is_foreign_pathkey(PlannerInfo *root,
+							   RelOptInfo *baserel,
+							   PathKey *pathkey);
+
 extern void deparseInsertSql(StringInfo buf, RangeTblEntry *rte,
 							 Index rtindex, Relation rel,
 							 List *targetAttrs, bool doNothing,
@@ -212,8 +217,8 @@ extern void deparseTruncateSql(StringInfo buf,
 							   DropBehavior behavior,
 							   bool restart_seqs);
 extern void deparseStringLiteral(StringInfo buf, const char *val);
-extern Expr *find_em_expr_for_rel(EquivalenceClass *ec, RelOptInfo *rel);
-extern Expr *find_em_expr_for_input_target(PlannerInfo *root,
+extern EquivalenceMember *find_em_for_rel(EquivalenceClass *ec, RelOptInfo *rel);
+extern EquivalenceMember *find_em_for_input_target(PlannerInfo *root,
 										   EquivalenceClass *ec,
 										   PathTarget *target);
 extern List *build_tlist_to_deparse(RelOptInfo *foreignrel);
diff --git a/contrib/postgres_fdw/sql/postgres_fdw.sql b/contrib/postgres_fdw/sql/postgres_fdw.sql
index 02a6b15a13..2bcbb98e3c 100644
--- a/contrib/postgres_fdw/sql/postgres_fdw.sql
+++ b/contrib/postgres_fdw/sql/postgres_fdw.sql
@@ -873,6 +873,9 @@ create operator class my_op_class for type int using btree family my_op_family a
 explain (verbose, costs off)
 select array_agg(c1 order by c1 using operator(public.<^)) from ft2 where c2 = 6 and c1 < 100 group by c2;
 
+-- This will not be pushed either
+explain (verbose, costs off) select * from ft2 order by c1 using operator(public.<^);
+
 -- Update local stats on ft2
 ANALYZE ft2;
 
@@ -890,6 +893,9 @@ explain (verbose, costs off)
 select array_agg(c1 order by c1 using operator(public.<^)) from ft2 where c2 = 6 and c1 < 100 group by c2;
 select array_agg(c1 order by c1 using operator(public.<^)) from ft2 where c2 = 6 and c1 < 100 group by c2;
 
+-- And this will be pushed too
+explain (verbose, costs off) select * from ft2 order by c1 using operator(public.<^);
+
 -- Remove from extension
 alter extension postgres_fdw drop operator class my_op_class using btree;
 alter extension postgres_fdw drop function my_op_cmp(a int, b int);
diff --git a/src/backend/optimizer/path/equivclass.c b/src/backend/optimizer/path/equivclass.c
index 6f1abbe47d..001296597f 100644
--- a/src/backend/optimizer/path/equivclass.c
+++ b/src/backend/optimizer/path/equivclass.c
@@ -932,11 +932,11 @@ is_exprlist_member(Expr *node, List *exprs)
 }
 
 /*
- * Find an equivalence class member expression, all of whose Vars, come from
+ * Find an equivalence class member, all of whose Vars, come from
  * the indicated relation.
  */
-Expr *
-find_em_expr_for_rel(EquivalenceClass *ec, RelOptInfo *rel)
+EquivalenceMember *
+find_em_for_rel(EquivalenceClass *ec, RelOptInfo *rel)
 {
 	ListCell   *lc_em;
 
@@ -952,7 +952,7 @@ find_em_expr_for_rel(EquivalenceClass *ec, RelOptInfo *rel)
 			 * taken entirely from this relation, we'll be content to choose
 			 * any one of those.
 			 */
-			return em->em_expr;
+			return em;
 		}
 	}
 
@@ -960,6 +960,17 @@ find_em_expr_for_rel(EquivalenceClass *ec, RelOptInfo *rel)
 	return NULL;
 }
 
+/*
+ * Simple wrapper around find_em_for_rel returning it's expression.
+ */
+Expr *
+find_em_expr_for_rel(EquivalenceClass *ec, RelOptInfo *rel)
+{
+	EquivalenceMember *em = find_em_for_rel(ec, rel);
+	return em ? em->em_expr : NULL;
+}
+
+
 /*
  * relation_can_be_sorted_early
  *		Can this relation be sorted on this EC before the final output step?
diff --git a/src/include/optimizer/paths.h b/src/include/optimizer/paths.h
index f1d111063c..bc17114ba3 100644
--- a/src/include/optimizer/paths.h
+++ b/src/include/optimizer/paths.h
@@ -144,6 +144,7 @@ extern EquivalenceMember *find_computable_ec_member(PlannerInfo *root,
 													Relids relids,
 													bool require_parallel_safe);
 extern Expr *find_em_expr_for_rel(EquivalenceClass *ec, RelOptInfo *rel);
+extern EquivalenceMember *find_em_for_rel(EquivalenceClass *ec, RelOptInfo *rel);
 extern bool relation_can_be_sorted_early(PlannerInfo *root, RelOptInfo *rel,
 										 EquivalenceClass *ec,
 										 bool require_parallel_safe);
-- 
2.32.0

#10Ranier Vilela
ranier.vf@gmail.com
In reply to: Ronan Dunklau (#7)
Re: ORDER BY pushdowns seem broken in postgres_fdw

Em qui., 22 de jul. de 2021 às 04:00, Ronan Dunklau <ronan.dunklau@aiven.io>
escreveu:

Le jeudi 22 juillet 2021, 02:16:52 CEST Ranier Vilela a écrit :

Unfortunately your patch does not apply clear into the head.
So I have a few suggestions on v2, attached with the .txt extension to
avoid cf bot.
Please, if ok, make the v3.

Hum weird, it applied cleanly for me, and was formatted using git show
which I
admit is not ideal. Please find it reattached.

ranier@notebook2:/usr/src/postgres$ git apply <
v2_fix_postgresfdw_orderby_handling.patch
error: falha no patch: contrib/postgres_fdw/deparse.c:37
error: contrib/postgres_fdw/deparse.c: patch does not apply
error: falha no patch: contrib/postgres_fdw/expected/postgres_fdw.out:3168
error: contrib/postgres_fdw/expected/postgres_fdw.out: patch does not apply
error: falha no patch: contrib/postgres_fdw/postgres_fdw.c:916
error: contrib/postgres_fdw/postgres_fdw.c: patch does not apply
error: falha no patch: contrib/postgres_fdw/postgres_fdw.h:165
error: contrib/postgres_fdw/postgres_fdw.h: patch does not apply
error: falha no patch: contrib/postgres_fdw/sql/postgres_fdw.sql:873
error: contrib/postgres_fdw/sql/postgres_fdw.sql: patch does not apply
error: falha no patch: src/backend/optimizer/path/equivclass.c:932
error: src/backend/optimizer/path/equivclass.c: patch does not apply
error: falha no patch: src/include/optimizer/paths.h:144
error: src/include/optimizer/paths.h: patch does not apply

2. appendOrderbyUsingClause function
Put the buffer actions together?

Not sure what you mean here ?

+ appendStringInfoString(buf, " USING ");
+ deparseOperatorName(buf, operform);
3. Apply style Postgres?
+ if (!HeapTupleIsValid(tuple))
+ {
+ elog(ERROR, "cache lookup failed for operator family %u",
pathkey->pk_opfamily);
+ }

Good catch !

4. Assertion not ok here?
+ em = find_em_for_rel(pathkey->pk_eclass, baserel);
+ em_expr = em->em_expr;
Assert(em_expr != NULL);

If we are here there should never be a case where the em can't be found. I
moved the assertion where it makes sense though.

Your version of function is_foreign_pathkey (v4),

not reduce scope the variable PgFdwRelationInfo *fpinfo.
I still prefer the v3 version.

The C ternary operator ? : ;
It's nothing more than a disguised if else

regards,
Ranier Vilela

#11David Rowley
dgrowleyml@gmail.com
In reply to: Ronan Dunklau (#9)
1 attachment(s)
Re: ORDER BY pushdowns seem broken in postgres_fdw

On Thu, 22 Jul 2021 at 20:49, Ronan Dunklau <ronan.dunklau@aiven.io> wrote:

Le jeudi 22 juillet 2021, 09:44:54 CEST David Rowley a écrit :

Can you also use explain (verbose, costs off) the same as the other
tests in that area. Having the costs there would never survive a run
of the buildfarm. Different hardware will produce different costs, e.g
32-bit hardware might cost cheaper due to narrower widths.

Sorry about that. Here it is.

I had a look over the v5 patch and noticed a few issues and a few
things that could be improved.

This is not ok:

+        tuple = SearchSysCache4(AMOPSTRATEGY,
+                                ObjectIdGetDatum(pathkey->pk_opfamily),
+                                em->em_datatype,
+                                em->em_datatype,
+                                pathkey->pk_strategy);

SearchSysCache* expects Datum inputs, so you must use the *GetDatum()
macro for each input that isn't already a Datum.

I also:
1. Changed the error message for when that lookup fails so that it's
the same as the others that perform a lookup with AMOPSTRATEGY.
2. Put back the comment in equivclass.c for find_em_expr_for_rel. I
saw no reason that comment should be changed when the function does
exactly what it did before.
3. Renamed appendOrderbyUsingClause to appendOrderBySuffix. I wasn't
happy that the name indicated it was only handling USING clauses when
it also handled ASC/DESC. I also moved in the NULLS FIRST/LAST stuff
in there
4. Adjusted is_foreign_pathkey() to make it easier to read and do
is_shippable() before calling find_em_expr_for_rel(). I didn't see
the need to call find_em_expr_for_rel() when is_shippable() returned
false.
5. Adjusted find_em_expr_for_rel() to remove the ternary operator.

I've attached what I ended up with.

It seems that it was the following commit that introduced the ability
for sorts to be pushed down to the foreign server, so it would be good
if the authors of that patch could look over this.

commit f18c944b6137329ac4a6b2dce5745c5dc21a8578
Author: Robert Haas <rhaas@postgresql.org>
Date: Tue Nov 3 12:46:06 2015 -0500

postgres_fdw: Add ORDER BY to some remote SQL queries.

David

Attachments:

v6_fix_postgresfdw_orderby_handling.patchapplication/octet-stream; name=v6_fix_postgresfdw_orderby_handling.patchDownload
diff --git a/contrib/postgres_fdw/deparse.c b/contrib/postgres_fdw/deparse.c
index 31919fda8c..5ff453fa42 100644
--- a/contrib/postgres_fdw/deparse.c
+++ b/contrib/postgres_fdw/deparse.c
@@ -37,9 +37,11 @@
 #include "access/sysattr.h"
 #include "access/table.h"
 #include "catalog/pg_aggregate.h"
+#include "catalog/pg_amop.h"
 #include "catalog/pg_collation.h"
 #include "catalog/pg_namespace.h"
 #include "catalog/pg_operator.h"
+#include "catalog/pg_opfamily.h"
 #include "catalog/pg_proc.h"
 #include "catalog/pg_type.h"
 #include "commands/defrem.h"
@@ -47,6 +49,7 @@
 #include "nodes/nodeFuncs.h"
 #include "nodes/plannodes.h"
 #include "optimizer/optimizer.h"
+#include "optimizer/paths.h"
 #include "optimizer/prep.h"
 #include "optimizer/tlist.h"
 #include "parser/parsetree.h"
@@ -180,6 +183,8 @@ static void deparseRangeTblRef(StringInfo buf, PlannerInfo *root,
 							   Index ignore_rel, List **ignore_conds, List **params_list);
 static void deparseAggref(Aggref *node, deparse_expr_cxt *context);
 static void appendGroupByClause(List *tlist, deparse_expr_cxt *context);
+static void appendOrderBySuffix(Oid sortop, Oid sortcoltype, bool nulls_first,
+								deparse_expr_cxt *context);
 static void appendAggOrderBy(List *orderList, List *targetList,
 							 deparse_expr_cxt *context);
 static void appendFunctionName(Oid funcid, deparse_expr_cxt *context);
@@ -910,6 +915,41 @@ is_foreign_param(PlannerInfo *root,
 	return false;
 }
 
+/*
+ * Returns true if it's safe to push down a sort as described by 'pathkey' to
+ * the foreign server
+ */
+bool
+is_foreign_pathkey(PlannerInfo *root,
+				   RelOptInfo *baserel,
+				   PathKey *pathkey)
+{
+	EquivalenceClass *pathkey_ec = pathkey->pk_eclass;
+	PgFdwRelationInfo *fpinfo = (PgFdwRelationInfo *) baserel->fdw_private;
+	Expr	   *em_expr;
+
+	/*
+	 * is_foreign_expr would detect volatile expressions as well, but checking
+	 * ec_has_volatile here saves some cycles.
+	 */
+	if (pathkey_ec->ec_has_volatile)
+		return false;
+
+	/* can't push down the sort if the pathkey's opfamily is not shippable */
+	if (!is_shippable(pathkey->pk_opfamily, OperatorFamilyRelationId, fpinfo))
+		return false;
+
+	em_expr = find_em_expr_for_rel(pathkey_ec, baserel);
+	if (em_expr == NULL)
+		return false;
+
+	/*
+	 * Finally, determine if it's safe to evaluate the found expr on the
+	 * foreign server.
+	 */
+	return is_foreign_expr(root, baserel, em_expr);
+}
+
 /*
  * Convert type OID + typmod info into a type name we can ship to the remote
  * server.  Someplace else had better have verified that this type name is
@@ -3125,6 +3165,45 @@ deparseAggref(Aggref *node, deparse_expr_cxt *context)
 	appendStringInfoChar(buf, ')');
 }
 
+/*
+ * Append the ASC, DESC, USING <OPERATOR> and NULLS FIRST / NULLS LAST part
+ * of the ORDER BY clause
+ */
+static void
+appendOrderBySuffix(Oid sortop, Oid sortcoltype, bool nulls_first,
+					deparse_expr_cxt *context)
+{
+	StringInfo	buf = context->buf;
+	TypeCacheEntry *typentry;
+
+	typentry = lookup_type_cache(sortcoltype, TYPECACHE_LT_OPR | TYPECACHE_GT_OPR);
+
+	if (sortop == typentry->lt_opr)
+		appendStringInfoString(buf, " ASC");
+	else if (sortop == typentry->gt_opr)
+		appendStringInfoString(buf, " DESC");
+	else
+	{
+		HeapTuple	opertup;
+		Form_pg_operator operform;
+
+		appendStringInfoString(buf, " USING ");
+
+		/* Append operator name. */
+		opertup = SearchSysCache1(OPEROID, ObjectIdGetDatum(sortop));
+		if (!HeapTupleIsValid(opertup))
+			elog(ERROR, "cache lookup failed for operator %u", sortop);
+		operform = (Form_pg_operator) GETSTRUCT(opertup);
+		deparseOperatorName(buf, operform);
+		ReleaseSysCache(opertup);
+	}
+
+	if (nulls_first)
+		appendStringInfoString(buf, " NULLS FIRST");
+	else
+		appendStringInfoString(buf, " NULLS LAST");
+}
+
 /*
  * Append ORDER BY within aggregate function.
  */
@@ -3140,7 +3219,6 @@ appendAggOrderBy(List *orderList, List *targetList, deparse_expr_cxt *context)
 		SortGroupClause *srt = (SortGroupClause *) lfirst(lc);
 		Node	   *sortexpr;
 		Oid			sortcoltype;
-		TypeCacheEntry *typentry;
 
 		if (!first)
 			appendStringInfoString(buf, ", ");
@@ -3150,32 +3228,8 @@ appendAggOrderBy(List *orderList, List *targetList, deparse_expr_cxt *context)
 										  false, context);
 		sortcoltype = exprType(sortexpr);
 		/* See whether operator is default < or > for datatype */
-		typentry = lookup_type_cache(sortcoltype,
-									 TYPECACHE_LT_OPR | TYPECACHE_GT_OPR);
-		if (srt->sortop == typentry->lt_opr)
-			appendStringInfoString(buf, " ASC");
-		else if (srt->sortop == typentry->gt_opr)
-			appendStringInfoString(buf, " DESC");
-		else
-		{
-			HeapTuple	opertup;
-			Form_pg_operator operform;
-
-			appendStringInfoString(buf, " USING ");
-
-			/* Append operator name. */
-			opertup = SearchSysCache1(OPEROID, ObjectIdGetDatum(srt->sortop));
-			if (!HeapTupleIsValid(opertup))
-				elog(ERROR, "cache lookup failed for operator %u", srt->sortop);
-			operform = (Form_pg_operator) GETSTRUCT(opertup);
-			deparseOperatorName(buf, operform);
-			ReleaseSysCache(opertup);
-		}
-
-		if (srt->nulls_first)
-			appendStringInfoString(buf, " NULLS FIRST");
-		else
-			appendStringInfoString(buf, " NULLS LAST");
+		appendOrderBySuffix(srt->sortop, sortcoltype, srt->nulls_first,
+							context);
 	}
 }
 
@@ -3280,7 +3334,11 @@ appendOrderByClause(List *pathkeys, bool has_final_sort,
 	foreach(lcell, pathkeys)
 	{
 		PathKey    *pathkey = lfirst(lcell);
+		EquivalenceMember *em;
 		Expr	   *em_expr;
+		HeapTuple	tuple;
+		Oid			oprid;
+		bool		isNull;
 
 		if (has_final_sort)
 		{
@@ -3288,26 +3346,44 @@ appendOrderByClause(List *pathkeys, bool has_final_sort,
 			 * By construction, context->foreignrel is the input relation to
 			 * the final sort.
 			 */
-			em_expr = find_em_expr_for_input_target(context->root,
-													pathkey->pk_eclass,
-													context->foreignrel->reltarget);
+			em = find_em_for_input_target(context->root,
+										  pathkey->pk_eclass,
+										  context->foreignrel->reltarget);
 		}
 		else
-			em_expr = find_em_expr_for_rel(pathkey->pk_eclass, baserel);
+			em = find_em_for_rel(pathkey->pk_eclass, baserel);
+
+		em_expr = em->em_expr;
+
+		/*
+		 * Lookup the operator corresponding to the strategy in the opclass.
+		 * The datatype used by the opfamily is not necessarily the same as
+		 * the expression type (for array types for example).
+		 */
+		tuple = SearchSysCache4(AMOPSTRATEGY,
+								ObjectIdGetDatum(pathkey->pk_opfamily),
+								ObjectIdGetDatum(em->em_datatype),
+								ObjectIdGetDatum(em->em_datatype),
+								Int16GetDatum(pathkey->pk_strategy));
+
+		if (!HeapTupleIsValid(tuple))
+			elog(ERROR, "missing operator %d(%u,%u) in opfamily %u",
+				 pathkey->pk_strategy, em->em_datatype, em->em_datatype,
+				 pathkey->pk_opfamily);
 
-		Assert(em_expr != NULL);
+		oprid = DatumGetObjectId(SysCacheGetAttr(AMOPSTRATEGY, tuple,
+												 Anum_pg_amop_amopopr, &isNull));
+		ReleaseSysCache(tuple);
 
 		appendStringInfoString(buf, delim);
 		deparseExpr(em_expr, context);
-		if (pathkey->pk_strategy == BTLessStrategyNumber)
-			appendStringInfoString(buf, " ASC");
-		else
-			appendStringInfoString(buf, " DESC");
 
-		if (pathkey->pk_nulls_first)
-			appendStringInfoString(buf, " NULLS FIRST");
-		else
-			appendStringInfoString(buf, " NULLS LAST");
+		/*
+		 * Here we need to use the expression type to compare against the
+		 * default btree sort operator.
+		 */
+		appendOrderBySuffix(oprid, exprType((Node *) em_expr),
+							pathkey->pk_nulls_first, context);
 
 		delim = ", ";
 	}
diff --git a/contrib/postgres_fdw/expected/postgres_fdw.out b/contrib/postgres_fdw/expected/postgres_fdw.out
index ed25e7a743..485bbec0e9 100644
--- a/contrib/postgres_fdw/expected/postgres_fdw.out
+++ b/contrib/postgres_fdw/expected/postgres_fdw.out
@@ -3168,6 +3168,19 @@ select array_agg(c1 order by c1 using operator(public.<^)) from ft2 where c2 = 6
          Remote SQL: SELECT "C 1", c2 FROM "S 1"."T 1" WHERE (("C 1" < 100)) AND ((c2 = 6))
 (6 rows)
 
+-- Ensure that we don't push down an ORDER BY with a non-shippable operator
+explain (verbose, costs off)
+select * from ft2 order by c1 using operator(public.<^);
+                                  QUERY PLAN                                   
+-------------------------------------------------------------------------------
+ Sort
+   Output: c1, c2, c3, c4, c5, c6, c7, c8
+   Sort Key: ft2.c1 USING <^
+   ->  Foreign Scan on public.ft2
+         Output: c1, c2, c3, c4, c5, c6, c7, c8
+         Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1"
+(6 rows)
+
 -- Update local stats on ft2
 ANALYZE ft2;
 -- Add into extension
@@ -3195,6 +3208,16 @@ select array_agg(c1 order by c1 using operator(public.<^)) from ft2 where c2 = 6
  {6,16,26,36,46,56,66,76,86,96}
 (1 row)
 
+-- Ensure that the ORDER BY is pushed to the foreign server
+explain (verbose, costs off)
+select * from ft2 order by c1 using operator(public.<^);
+                                                         QUERY PLAN                                                          
+-----------------------------------------------------------------------------------------------------------------------------
+ Foreign Scan on public.ft2
+   Output: c1, c2, c3, c4, c5, c6, c7, c8
+   Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" ORDER BY "C 1" USING OPERATOR(public.<^) NULLS LAST
+(3 rows)
+
 -- Remove from extension
 alter extension postgres_fdw drop operator class my_op_class using btree;
 alter extension postgres_fdw drop function my_op_cmp(a int, b int);
diff --git a/contrib/postgres_fdw/postgres_fdw.c b/contrib/postgres_fdw/postgres_fdw.c
index f15c97ad7a..3364771bcb 100644
--- a/contrib/postgres_fdw/postgres_fdw.c
+++ b/contrib/postgres_fdw/postgres_fdw.c
@@ -916,8 +916,6 @@ get_useful_pathkeys_for_relation(PlannerInfo *root, RelOptInfo *rel)
 		foreach(lc, root->query_pathkeys)
 		{
 			PathKey    *pathkey = (PathKey *) lfirst(lc);
-			EquivalenceClass *pathkey_ec = pathkey->pk_eclass;
-			Expr	   *em_expr;
 
 			/*
 			 * The planner and executor don't have any clever strategy for
@@ -925,13 +923,8 @@ get_useful_pathkeys_for_relation(PlannerInfo *root, RelOptInfo *rel)
 			 * getting it to be sorted by all of those pathkeys. We'll just
 			 * end up resorting the entire data set.  So, unless we can push
 			 * down all of the query pathkeys, forget it.
-			 *
-			 * is_foreign_expr would detect volatile expressions as well, but
-			 * checking ec_has_volatile here saves some cycles.
 			 */
-			if (pathkey_ec->ec_has_volatile ||
-				!(em_expr = find_em_expr_for_rel(pathkey_ec, rel)) ||
-				!is_foreign_expr(root, rel, em_expr))
+			if (!is_foreign_pathkey(root, rel, pathkey))
 			{
 				query_pathkeys_ok = false;
 				break;
@@ -6510,9 +6503,9 @@ add_foreign_ordered_paths(PlannerInfo *root, RelOptInfo *input_rel,
 			return;
 
 		/* Get the sort expression for the pathkey_ec */
-		sort_expr = find_em_expr_for_input_target(root,
-												  pathkey_ec,
-												  input_rel->reltarget);
+		sort_expr = find_em_for_input_target(root,
+											 pathkey_ec,
+											 input_rel->reltarget)->em_expr;
 
 		/* If it's unsafe to remote, we cannot push down the final sort */
 		if (!is_foreign_expr(root, input_rel, sort_expr))
@@ -7250,13 +7243,12 @@ conversion_error_callback(void *arg)
 }
 
 /*
- * Find an equivalence class member expression to be computed as a sort column
- * in the given target.
+ * Find an equivalence class member to be computed as a sort column in the
+ * given target.
  */
-Expr *
-find_em_expr_for_input_target(PlannerInfo *root,
-							  EquivalenceClass *ec,
-							  PathTarget *target)
+EquivalenceMember *
+find_em_for_input_target(PlannerInfo *root, EquivalenceClass *ec,
+						 PathTarget *target)
 {
 	ListCell   *lc1;
 	int			i;
@@ -7301,7 +7293,7 @@ find_em_expr_for_input_target(PlannerInfo *root,
 				em_expr = ((RelabelType *) em_expr)->arg;
 
 			if (equal(em_expr, expr))
-				return em->em_expr;
+				return em;
 		}
 
 		i++;
diff --git a/contrib/postgres_fdw/postgres_fdw.h b/contrib/postgres_fdw/postgres_fdw.h
index 9591c0f6c2..b59098dcdd 100644
--- a/contrib/postgres_fdw/postgres_fdw.h
+++ b/contrib/postgres_fdw/postgres_fdw.h
@@ -171,6 +171,9 @@ extern bool is_foreign_expr(PlannerInfo *root,
 extern bool is_foreign_param(PlannerInfo *root,
 							 RelOptInfo *baserel,
 							 Expr *expr);
+extern bool is_foreign_pathkey(PlannerInfo *root,
+							   RelOptInfo *baserel,
+							   PathKey *pathkey);
 extern void deparseInsertSql(StringInfo buf, RangeTblEntry *rte,
 							 Index rtindex, Relation rel,
 							 List *targetAttrs, bool doNothing,
@@ -212,10 +215,10 @@ extern void deparseTruncateSql(StringInfo buf,
 							   DropBehavior behavior,
 							   bool restart_seqs);
 extern void deparseStringLiteral(StringInfo buf, const char *val);
-extern Expr *find_em_expr_for_rel(EquivalenceClass *ec, RelOptInfo *rel);
-extern Expr *find_em_expr_for_input_target(PlannerInfo *root,
-										   EquivalenceClass *ec,
-										   PathTarget *target);
+extern EquivalenceMember *find_em_for_rel(EquivalenceClass *ec, RelOptInfo *rel);
+extern EquivalenceMember *find_em_for_input_target(PlannerInfo *root,
+												   EquivalenceClass *ec,
+												   PathTarget *target);
 extern List *build_tlist_to_deparse(RelOptInfo *foreignrel);
 extern void deparseSelectStmtForRel(StringInfo buf, PlannerInfo *root,
 									RelOptInfo *foreignrel, List *tlist,
diff --git a/contrib/postgres_fdw/sql/postgres_fdw.sql b/contrib/postgres_fdw/sql/postgres_fdw.sql
index 02a6b15a13..a11fea42b0 100644
--- a/contrib/postgres_fdw/sql/postgres_fdw.sql
+++ b/contrib/postgres_fdw/sql/postgres_fdw.sql
@@ -873,6 +873,10 @@ create operator class my_op_class for type int using btree family my_op_family a
 explain (verbose, costs off)
 select array_agg(c1 order by c1 using operator(public.<^)) from ft2 where c2 = 6 and c1 < 100 group by c2;
 
+-- Ensure that we don't push down an ORDER BY with a non-shippable operator
+explain (verbose, costs off)
+select * from ft2 order by c1 using operator(public.<^);
+
 -- Update local stats on ft2
 ANALYZE ft2;
 
@@ -890,6 +894,10 @@ explain (verbose, costs off)
 select array_agg(c1 order by c1 using operator(public.<^)) from ft2 where c2 = 6 and c1 < 100 group by c2;
 select array_agg(c1 order by c1 using operator(public.<^)) from ft2 where c2 = 6 and c1 < 100 group by c2;
 
+-- Ensure that the ORDER BY is pushed to the foreign server
+explain (verbose, costs off)
+select * from ft2 order by c1 using operator(public.<^);
+
 -- Remove from extension
 alter extension postgres_fdw drop operator class my_op_class using btree;
 alter extension postgres_fdw drop function my_op_cmp(a int, b int);
diff --git a/src/backend/optimizer/path/equivclass.c b/src/backend/optimizer/path/equivclass.c
index 6f1abbe47d..b367f18eab 100644
--- a/src/backend/optimizer/path/equivclass.c
+++ b/src/backend/optimizer/path/equivclass.c
@@ -932,11 +932,11 @@ is_exprlist_member(Expr *node, List *exprs)
 }
 
 /*
- * Find an equivalence class member expression, all of whose Vars, come from
- * the indicated relation.
+ * Find an equivalence class member, all of whose Vars, come from the
+ * indicated relation.
  */
-Expr *
-find_em_expr_for_rel(EquivalenceClass *ec, RelOptInfo *rel)
+EquivalenceMember *
+find_em_for_rel(EquivalenceClass *ec, RelOptInfo *rel)
 {
 	ListCell   *lc_em;
 
@@ -952,7 +952,7 @@ find_em_expr_for_rel(EquivalenceClass *ec, RelOptInfo *rel)
 			 * taken entirely from this relation, we'll be content to choose
 			 * any one of those.
 			 */
-			return em->em_expr;
+			return em;
 		}
 	}
 
@@ -960,6 +960,22 @@ find_em_expr_for_rel(EquivalenceClass *ec, RelOptInfo *rel)
 	return NULL;
 }
 
+/*
+ * Find an equivalence class member expression, all of whose Vars, come from
+ * the indicated relation.
+ */
+Expr *
+find_em_expr_for_rel(EquivalenceClass *ec, RelOptInfo *rel)
+{
+	EquivalenceMember *em = find_em_for_rel(ec, rel);
+
+	if (em != NULL)
+		return em->em_expr;
+
+	return NULL;
+}
+
+
 /*
  * relation_can_be_sorted_early
  *		Can this relation be sorted on this EC before the final output step?
diff --git a/src/include/optimizer/paths.h b/src/include/optimizer/paths.h
index f1d111063c..5a2bbc87e1 100644
--- a/src/include/optimizer/paths.h
+++ b/src/include/optimizer/paths.h
@@ -144,6 +144,8 @@ extern EquivalenceMember *find_computable_ec_member(PlannerInfo *root,
 													Relids relids,
 													bool require_parallel_safe);
 extern Expr *find_em_expr_for_rel(EquivalenceClass *ec, RelOptInfo *rel);
+extern EquivalenceMember *find_em_for_rel(EquivalenceClass *ec,
+										  RelOptInfo *rel);
 extern bool relation_can_be_sorted_early(PlannerInfo *root, RelOptInfo *rel,
 										 EquivalenceClass *ec,
 										 bool require_parallel_safe);
#12Ronan Dunklau
ronan.dunklau@aiven.io
In reply to: David Rowley (#11)
Re: ORDER BY pushdowns seem broken in postgres_fdw

Le mardi 27 juillet 2021, 03:19:18 CEST David Rowley a écrit :

On Thu, 22 Jul 2021 at 20:49, Ronan Dunklau <ronan.dunklau@aiven.io> wrote:

Le jeudi 22 juillet 2021, 09:44:54 CEST David Rowley a écrit :

Can you also use explain (verbose, costs off) the same as the other
tests in that area. Having the costs there would never survive a run
of the buildfarm. Different hardware will produce different costs, e.g
32-bit hardware might cost cheaper due to narrower widths.

Sorry about that. Here it is.

I had a look over the v5 patch and noticed a few issues and a few
things that could be improved.

Thank you.

This is not ok:

+        tuple = SearchSysCache4(AMOPSTRATEGY,
+                                ObjectIdGetDatum(pathkey->pk_opfamily),
+                                em->em_datatype,
+                                em->em_datatype,
+                                pathkey->pk_strategy);

SearchSysCache* expects Datum inputs, so you must use the *GetDatum()
macro for each input that isn't already a Datum.

Noted.

I also:
1. Changed the error message for when that lookup fails so that it's
the same as the others that perform a lookup with AMOPSTRATEGY.
2. Put back the comment in equivclass.c for find_em_expr_for_rel. I
saw no reason that comment should be changed when the function does
exactly what it did before.
3. Renamed appendOrderbyUsingClause to appendOrderBySuffix. I wasn't
happy that the name indicated it was only handling USING clauses when
it also handled ASC/DESC. I also moved in the NULLS FIRST/LAST stuff
in there

I agree that name is better.

4. Adjusted is_foreign_pathkey() to make it easier to read and do
is_shippable() before calling find_em_expr_for_rel(). I didn't see
the need to call find_em_expr_for_rel() when is_shippable() returned
false.
5. Adjusted find_em_expr_for_rel() to remove the ternary operator.

I've attached what I ended up with.

Looks good to me.

It seems that it was the following commit that introduced the ability
for sorts to be pushed down to the foreign server, so it would be good
if the authors of that patch could look over this.

One thing in particular I was not sure about was how to fetch the operator
associated with the path key ordering. I chose to go through the opfamily
recorded on the member, but maybe locating the original SortGroupClause by its
ref and getting the operator number here woud have worked. It seems more
straightforward like this though.

--
Ronan Dunklau

#13David Rowley
dgrowleyml@gmail.com
In reply to: Ronan Dunklau (#12)
Re: ORDER BY pushdowns seem broken in postgres_fdw

On Tue, 27 Jul 2021 at 17:20, Ronan Dunklau <ronan.dunklau@aiven.io> wrote:

One thing in particular I was not sure about was how to fetch the operator
associated with the path key ordering. I chose to go through the opfamily
recorded on the member, but maybe locating the original SortGroupClause by its
ref and getting the operator number here woud have worked. It seems more
straightforward like this though.

I spent a bit of time trying to find a less invasive way of doing that
and didn't manage to come up with anything. I'm interested to hear if
anyone else has any better ideas.

David

#14David Zhang
david.zhang@highgo.ca
In reply to: David Rowley (#13)
Re: ORDER BY pushdowns seem broken in postgres_fdw

The following review has been posted through the commitfest application:
make installcheck-world: tested, failed
Implements feature: tested, passed
Spec compliant: not tested
Documentation: not tested

Applied the v6 patch to master branch and ran regression test for contrib, the result was "All tests successful."

#15Ronan Dunklau
ronan@dunklau.fr
In reply to: David Zhang (#14)
1 attachment(s)
Re: ORDER BY pushdowns seem broken in postgres_fdw

Le vendredi 3 septembre 2021, 22:54:25 CEST David Zhang a écrit :

The following review has been posted through the commitfest application:
make installcheck-world: tested, failed
Implements feature: tested, passed
Spec compliant: not tested
Documentation: not tested

Applied the v6 patch to master branch and ran regression test for contrib,
the result was "All tests successful."

What kind of error did you get running make installcheck-world ? If it passed
the make check for contrib, I can't see why it would fail running make
installcheck-world.

In any case, I just checked and running make installcheck-world doesn't
produce any error.

Since HEAD had moved a bit since the last version, I rebased the patch,
resulting in the attached v7.

Best regards,

--
Ronan Dunklau

Attachments:

v7-0001-Fix-orderby-handling-in-postgres_fdw.patchtext/x-patch; charset=UTF-8; name=v7-0001-Fix-orderby-handling-in-postgres_fdw.patchDownload
From ccfb0a3d7dae42f56c699aac6f699c3c2f08c812 Mon Sep 17 00:00:00 2001
From: Ronan Dunklau <ronan.dunklau@aiven.io>
Date: Mon, 6 Sep 2021 09:54:43 +0200
Subject: [PATCH v7] Fix orderby handling in postgres_fdw

The logic for pushing down order bys in postgres fdw didn't take into
account the specific operator used, and as such a pushed-down order by
could return wrong results.
---
 contrib/postgres_fdw/deparse.c                | 156 +++++++++++++-----
 .../postgres_fdw/expected/postgres_fdw.out    |  23 +++
 contrib/postgres_fdw/postgres_fdw.c           |  28 ++--
 contrib/postgres_fdw/postgres_fdw.h           |  11 +-
 contrib/postgres_fdw/sql/postgres_fdw.sql     |   8 +
 src/backend/optimizer/path/equivclass.c       |  26 ++-
 src/include/optimizer/paths.h                 |   2 +
 7 files changed, 187 insertions(+), 67 deletions(-)

diff --git a/contrib/postgres_fdw/deparse.c b/contrib/postgres_fdw/deparse.c
index d98bd66681..fb1b5f9d9b 100644
--- a/contrib/postgres_fdw/deparse.c
+++ b/contrib/postgres_fdw/deparse.c
@@ -37,9 +37,11 @@
 #include "access/sysattr.h"
 #include "access/table.h"
 #include "catalog/pg_aggregate.h"
+#include "catalog/pg_amop.h"
 #include "catalog/pg_collation.h"
 #include "catalog/pg_namespace.h"
 #include "catalog/pg_operator.h"
+#include "catalog/pg_opfamily.h"
 #include "catalog/pg_proc.h"
 #include "catalog/pg_type.h"
 #include "commands/defrem.h"
@@ -47,6 +49,7 @@
 #include "nodes/nodeFuncs.h"
 #include "nodes/plannodes.h"
 #include "optimizer/optimizer.h"
+#include "optimizer/paths.h"
 #include "optimizer/prep.h"
 #include "optimizer/tlist.h"
 #include "parser/parsetree.h"
@@ -182,6 +185,8 @@ static void deparseRangeTblRef(StringInfo buf, PlannerInfo *root,
 							   Index ignore_rel, List **ignore_conds, List **params_list);
 static void deparseAggref(Aggref *node, deparse_expr_cxt *context);
 static void appendGroupByClause(List *tlist, deparse_expr_cxt *context);
+static void appendOrderBySuffix(Oid sortop, Oid sortcoltype, bool nulls_first,
+								deparse_expr_cxt *context);
 static void appendAggOrderBy(List *orderList, List *targetList,
 							 deparse_expr_cxt *context);
 static void appendFunctionName(Oid funcid, deparse_expr_cxt *context);
@@ -1037,6 +1042,41 @@ is_foreign_param(PlannerInfo *root,
 	return false;
 }
 
+/*
+ * Returns true if it's safe to push down a sort as described by 'pathkey' to
+ * the foreign server
+ */
+bool
+is_foreign_pathkey(PlannerInfo *root,
+				   RelOptInfo *baserel,
+				   PathKey *pathkey)
+{
+	EquivalenceClass *pathkey_ec = pathkey->pk_eclass;
+	PgFdwRelationInfo *fpinfo = (PgFdwRelationInfo *) baserel->fdw_private;
+	Expr	   *em_expr;
+
+	/*
+	 * is_foreign_expr would detect volatile expressions as well, but checking
+	 * ec_has_volatile here saves some cycles.
+	 */
+	if (pathkey_ec->ec_has_volatile)
+		return false;
+
+	/* can't push down the sort if the pathkey's opfamily is not shippable */
+	if (!is_shippable(pathkey->pk_opfamily, OperatorFamilyRelationId, fpinfo))
+		return false;
+
+	em_expr = find_em_expr_for_rel(pathkey_ec, baserel);
+	if (em_expr == NULL)
+		return false;
+
+	/*
+	 * Finally, determine if it's safe to evaluate the found expr on the
+	 * foreign server.
+	 */
+	return is_foreign_expr(root, baserel, em_expr);
+}
+
 /*
  * Convert type OID + typmod info into a type name we can ship to the remote
  * server.  Someplace else had better have verified that this type name is
@@ -3331,6 +3371,45 @@ deparseAggref(Aggref *node, deparse_expr_cxt *context)
 	appendStringInfoChar(buf, ')');
 }
 
+/*
+ * Append the ASC, DESC, USING <OPERATOR> and NULLS FIRST / NULLS LAST part
+ * of the ORDER BY clause
+ */
+static void
+appendOrderBySuffix(Oid sortop, Oid sortcoltype, bool nulls_first,
+					deparse_expr_cxt *context)
+{
+	StringInfo	buf = context->buf;
+	TypeCacheEntry *typentry;
+
+	typentry = lookup_type_cache(sortcoltype, TYPECACHE_LT_OPR | TYPECACHE_GT_OPR);
+
+	if (sortop == typentry->lt_opr)
+		appendStringInfoString(buf, " ASC");
+	else if (sortop == typentry->gt_opr)
+		appendStringInfoString(buf, " DESC");
+	else
+	{
+		HeapTuple	opertup;
+		Form_pg_operator operform;
+
+		appendStringInfoString(buf, " USING ");
+
+		/* Append operator name. */
+		opertup = SearchSysCache1(OPEROID, ObjectIdGetDatum(sortop));
+		if (!HeapTupleIsValid(opertup))
+			elog(ERROR, "cache lookup failed for operator %u", sortop);
+		operform = (Form_pg_operator) GETSTRUCT(opertup);
+		deparseOperatorName(buf, operform);
+		ReleaseSysCache(opertup);
+	}
+
+	if (nulls_first)
+		appendStringInfoString(buf, " NULLS FIRST");
+	else
+		appendStringInfoString(buf, " NULLS LAST");
+}
+
 /*
  * Append ORDER BY within aggregate function.
  */
@@ -3346,7 +3425,6 @@ appendAggOrderBy(List *orderList, List *targetList, deparse_expr_cxt *context)
 		SortGroupClause *srt = (SortGroupClause *) lfirst(lc);
 		Node	   *sortexpr;
 		Oid			sortcoltype;
-		TypeCacheEntry *typentry;
 
 		if (!first)
 			appendStringInfoString(buf, ", ");
@@ -3356,32 +3434,8 @@ appendAggOrderBy(List *orderList, List *targetList, deparse_expr_cxt *context)
 										  false, context);
 		sortcoltype = exprType(sortexpr);
 		/* See whether operator is default < or > for datatype */
-		typentry = lookup_type_cache(sortcoltype,
-									 TYPECACHE_LT_OPR | TYPECACHE_GT_OPR);
-		if (srt->sortop == typentry->lt_opr)
-			appendStringInfoString(buf, " ASC");
-		else if (srt->sortop == typentry->gt_opr)
-			appendStringInfoString(buf, " DESC");
-		else
-		{
-			HeapTuple	opertup;
-			Form_pg_operator operform;
-
-			appendStringInfoString(buf, " USING ");
-
-			/* Append operator name. */
-			opertup = SearchSysCache1(OPEROID, ObjectIdGetDatum(srt->sortop));
-			if (!HeapTupleIsValid(opertup))
-				elog(ERROR, "cache lookup failed for operator %u", srt->sortop);
-			operform = (Form_pg_operator) GETSTRUCT(opertup);
-			deparseOperatorName(buf, operform);
-			ReleaseSysCache(opertup);
-		}
-
-		if (srt->nulls_first)
-			appendStringInfoString(buf, " NULLS FIRST");
-		else
-			appendStringInfoString(buf, " NULLS LAST");
+		appendOrderBySuffix(srt->sortop, sortcoltype, srt->nulls_first,
+							context);
 	}
 }
 
@@ -3486,7 +3540,11 @@ appendOrderByClause(List *pathkeys, bool has_final_sort,
 	foreach(lcell, pathkeys)
 	{
 		PathKey    *pathkey = lfirst(lcell);
+		EquivalenceMember *em;
 		Expr	   *em_expr;
+		HeapTuple	tuple;
+		Oid			oprid;
+		bool		isNull;
 
 		if (has_final_sort)
 		{
@@ -3494,26 +3552,44 @@ appendOrderByClause(List *pathkeys, bool has_final_sort,
 			 * By construction, context->foreignrel is the input relation to
 			 * the final sort.
 			 */
-			em_expr = find_em_expr_for_input_target(context->root,
-													pathkey->pk_eclass,
-													context->foreignrel->reltarget);
+			em = find_em_for_input_target(context->root,
+										  pathkey->pk_eclass,
+										  context->foreignrel->reltarget);
 		}
 		else
-			em_expr = find_em_expr_for_rel(pathkey->pk_eclass, baserel);
+			em = find_em_for_rel(pathkey->pk_eclass, baserel);
+
+		em_expr = em->em_expr;
+
+		/*
+		 * Lookup the operator corresponding to the strategy in the opclass.
+		 * The datatype used by the opfamily is not necessarily the same as
+		 * the expression type (for array types for example).
+		 */
+		tuple = SearchSysCache4(AMOPSTRATEGY,
+								ObjectIdGetDatum(pathkey->pk_opfamily),
+								ObjectIdGetDatum(em->em_datatype),
+								ObjectIdGetDatum(em->em_datatype),
+								Int16GetDatum(pathkey->pk_strategy));
+
+		if (!HeapTupleIsValid(tuple))
+			elog(ERROR, "missing operator %d(%u,%u) in opfamily %u",
+				 pathkey->pk_strategy, em->em_datatype, em->em_datatype,
+				 pathkey->pk_opfamily);
 
-		Assert(em_expr != NULL);
+		oprid = DatumGetObjectId(SysCacheGetAttr(AMOPSTRATEGY, tuple,
+												 Anum_pg_amop_amopopr, &isNull));
+		ReleaseSysCache(tuple);
 
 		appendStringInfoString(buf, delim);
 		deparseExpr(em_expr, context);
-		if (pathkey->pk_strategy == BTLessStrategyNumber)
-			appendStringInfoString(buf, " ASC");
-		else
-			appendStringInfoString(buf, " DESC");
 
-		if (pathkey->pk_nulls_first)
-			appendStringInfoString(buf, " NULLS FIRST");
-		else
-			appendStringInfoString(buf, " NULLS LAST");
+		/*
+		 * Here we need to use the expression type to compare against the
+		 * default btree sort operator.
+		 */
+		appendOrderBySuffix(oprid, exprType((Node *) em_expr),
+							pathkey->pk_nulls_first, context);
 
 		delim = ", ";
 	}
diff --git a/contrib/postgres_fdw/expected/postgres_fdw.out b/contrib/postgres_fdw/expected/postgres_fdw.out
index e3ee30f1aa..1f44c365ef 100644
--- a/contrib/postgres_fdw/expected/postgres_fdw.out
+++ b/contrib/postgres_fdw/expected/postgres_fdw.out
@@ -3258,6 +3258,19 @@ select array_agg(c1 order by c1 using operator(public.<^)) from ft2 where c2 = 6
          Remote SQL: SELECT "C 1", c2 FROM "S 1"."T 1" WHERE (("C 1" < 100)) AND ((c2 = 6))
 (6 rows)
 
+-- Ensure that we don't push down an ORDER BY with a non-shippable operator
+explain (verbose, costs off)
+select * from ft2 order by c1 using operator(public.<^);
+                                  QUERY PLAN                                   
+-------------------------------------------------------------------------------
+ Sort
+   Output: c1, c2, c3, c4, c5, c6, c7, c8
+   Sort Key: ft2.c1 USING <^
+   ->  Foreign Scan on public.ft2
+         Output: c1, c2, c3, c4, c5, c6, c7, c8
+         Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1"
+(6 rows)
+
 -- Update local stats on ft2
 ANALYZE ft2;
 -- Add into extension
@@ -3285,6 +3298,16 @@ select array_agg(c1 order by c1 using operator(public.<^)) from ft2 where c2 = 6
  {6,16,26,36,46,56,66,76,86,96}
 (1 row)
 
+-- Ensure that the ORDER BY is pushed to the foreign server
+explain (verbose, costs off)
+select * from ft2 order by c1 using operator(public.<^);
+                                                         QUERY PLAN                                                          
+-----------------------------------------------------------------------------------------------------------------------------
+ Foreign Scan on public.ft2
+   Output: c1, c2, c3, c4, c5, c6, c7, c8
+   Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" ORDER BY "C 1" USING OPERATOR(public.<^) NULLS LAST
+(3 rows)
+
 -- Remove from extension
 alter extension postgres_fdw drop operator class my_op_class using btree;
 alter extension postgres_fdw drop function my_op_cmp(a int, b int);
diff --git a/contrib/postgres_fdw/postgres_fdw.c b/contrib/postgres_fdw/postgres_fdw.c
index 4bdab30a73..1dd4779185 100644
--- a/contrib/postgres_fdw/postgres_fdw.c
+++ b/contrib/postgres_fdw/postgres_fdw.c
@@ -917,8 +917,6 @@ get_useful_pathkeys_for_relation(PlannerInfo *root, RelOptInfo *rel)
 		foreach(lc, root->query_pathkeys)
 		{
 			PathKey    *pathkey = (PathKey *) lfirst(lc);
-			EquivalenceClass *pathkey_ec = pathkey->pk_eclass;
-			Expr	   *em_expr;
 
 			/*
 			 * The planner and executor don't have any clever strategy for
@@ -926,13 +924,8 @@ get_useful_pathkeys_for_relation(PlannerInfo *root, RelOptInfo *rel)
 			 * getting it to be sorted by all of those pathkeys. We'll just
 			 * end up resorting the entire data set.  So, unless we can push
 			 * down all of the query pathkeys, forget it.
-			 *
-			 * is_foreign_expr would detect volatile expressions as well, but
-			 * checking ec_has_volatile here saves some cycles.
 			 */
-			if (pathkey_ec->ec_has_volatile ||
-				!(em_expr = find_em_expr_for_rel(pathkey_ec, rel)) ||
-				!is_foreign_expr(root, rel, em_expr))
+			if (!is_foreign_pathkey(root, rel, pathkey))
 			{
 				query_pathkeys_ok = false;
 				break;
@@ -6540,9 +6533,9 @@ add_foreign_ordered_paths(PlannerInfo *root, RelOptInfo *input_rel,
 			return;
 
 		/* Get the sort expression for the pathkey_ec */
-		sort_expr = find_em_expr_for_input_target(root,
-												  pathkey_ec,
-												  input_rel->reltarget);
+		sort_expr = find_em_for_input_target(root,
+											 pathkey_ec,
+											 input_rel->reltarget)->em_expr;
 
 		/* If it's unsafe to remote, we cannot push down the final sort */
 		if (!is_foreign_expr(root, input_rel, sort_expr))
@@ -7332,13 +7325,12 @@ conversion_error_callback(void *arg)
 }
 
 /*
- * Find an equivalence class member expression to be computed as a sort column
- * in the given target.
+ * Find an equivalence class member to be computed as a sort column in the
+ * given target.
  */
-Expr *
-find_em_expr_for_input_target(PlannerInfo *root,
-							  EquivalenceClass *ec,
-							  PathTarget *target)
+EquivalenceMember *
+find_em_for_input_target(PlannerInfo *root, EquivalenceClass *ec,
+						 PathTarget *target)
 {
 	ListCell   *lc1;
 	int			i;
@@ -7383,7 +7375,7 @@ find_em_expr_for_input_target(PlannerInfo *root,
 				em_expr = ((RelabelType *) em_expr)->arg;
 
 			if (equal(em_expr, expr))
-				return em->em_expr;
+				return em;
 		}
 
 		i++;
diff --git a/contrib/postgres_fdw/postgres_fdw.h b/contrib/postgres_fdw/postgres_fdw.h
index ca83306af9..9930226b6f 100644
--- a/contrib/postgres_fdw/postgres_fdw.h
+++ b/contrib/postgres_fdw/postgres_fdw.h
@@ -171,6 +171,9 @@ extern bool is_foreign_expr(PlannerInfo *root,
 extern bool is_foreign_param(PlannerInfo *root,
 							 RelOptInfo *baserel,
 							 Expr *expr);
+extern bool is_foreign_pathkey(PlannerInfo *root,
+							   RelOptInfo *baserel,
+							   PathKey *pathkey);
 extern void deparseInsertSql(StringInfo buf, RangeTblEntry *rte,
 							 Index rtindex, Relation rel,
 							 List *targetAttrs, bool doNothing,
@@ -213,10 +216,10 @@ extern void deparseTruncateSql(StringInfo buf,
 							   DropBehavior behavior,
 							   bool restart_seqs);
 extern void deparseStringLiteral(StringInfo buf, const char *val);
-extern Expr *find_em_expr_for_rel(EquivalenceClass *ec, RelOptInfo *rel);
-extern Expr *find_em_expr_for_input_target(PlannerInfo *root,
-										   EquivalenceClass *ec,
-										   PathTarget *target);
+extern EquivalenceMember *find_em_for_rel(EquivalenceClass *ec, RelOptInfo *rel);
+extern EquivalenceMember *find_em_for_input_target(PlannerInfo *root,
+												   EquivalenceClass *ec,
+												   PathTarget *target);
 extern List *build_tlist_to_deparse(RelOptInfo *foreignrel);
 extern void deparseSelectStmtForRel(StringInfo buf, PlannerInfo *root,
 									RelOptInfo *foreignrel, List *tlist,
diff --git a/contrib/postgres_fdw/sql/postgres_fdw.sql b/contrib/postgres_fdw/sql/postgres_fdw.sql
index 30b5175da5..1bbe1212c4 100644
--- a/contrib/postgres_fdw/sql/postgres_fdw.sql
+++ b/contrib/postgres_fdw/sql/postgres_fdw.sql
@@ -902,6 +902,10 @@ create operator class my_op_class for type int using btree family my_op_family a
 explain (verbose, costs off)
 select array_agg(c1 order by c1 using operator(public.<^)) from ft2 where c2 = 6 and c1 < 100 group by c2;
 
+-- Ensure that we don't push down an ORDER BY with a non-shippable operator
+explain (verbose, costs off)
+select * from ft2 order by c1 using operator(public.<^);
+
 -- Update local stats on ft2
 ANALYZE ft2;
 
@@ -919,6 +923,10 @@ explain (verbose, costs off)
 select array_agg(c1 order by c1 using operator(public.<^)) from ft2 where c2 = 6 and c1 < 100 group by c2;
 select array_agg(c1 order by c1 using operator(public.<^)) from ft2 where c2 = 6 and c1 < 100 group by c2;
 
+-- Ensure that the ORDER BY is pushed to the foreign server
+explain (verbose, costs off)
+select * from ft2 order by c1 using operator(public.<^);
+
 -- Remove from extension
 alter extension postgres_fdw drop operator class my_op_class using btree;
 alter extension postgres_fdw drop function my_op_cmp(a int, b int);
diff --git a/src/backend/optimizer/path/equivclass.c b/src/backend/optimizer/path/equivclass.c
index 6f1abbe47d..b367f18eab 100644
--- a/src/backend/optimizer/path/equivclass.c
+++ b/src/backend/optimizer/path/equivclass.c
@@ -932,11 +932,11 @@ is_exprlist_member(Expr *node, List *exprs)
 }
 
 /*
- * Find an equivalence class member expression, all of whose Vars, come from
- * the indicated relation.
+ * Find an equivalence class member, all of whose Vars, come from the
+ * indicated relation.
  */
-Expr *
-find_em_expr_for_rel(EquivalenceClass *ec, RelOptInfo *rel)
+EquivalenceMember *
+find_em_for_rel(EquivalenceClass *ec, RelOptInfo *rel)
 {
 	ListCell   *lc_em;
 
@@ -952,7 +952,7 @@ find_em_expr_for_rel(EquivalenceClass *ec, RelOptInfo *rel)
 			 * taken entirely from this relation, we'll be content to choose
 			 * any one of those.
 			 */
-			return em->em_expr;
+			return em;
 		}
 	}
 
@@ -960,6 +960,22 @@ find_em_expr_for_rel(EquivalenceClass *ec, RelOptInfo *rel)
 	return NULL;
 }
 
+/*
+ * Find an equivalence class member expression, all of whose Vars, come from
+ * the indicated relation.
+ */
+Expr *
+find_em_expr_for_rel(EquivalenceClass *ec, RelOptInfo *rel)
+{
+	EquivalenceMember *em = find_em_for_rel(ec, rel);
+
+	if (em != NULL)
+		return em->em_expr;
+
+	return NULL;
+}
+
+
 /*
  * relation_can_be_sorted_early
  *		Can this relation be sorted on this EC before the final output step?
diff --git a/src/include/optimizer/paths.h b/src/include/optimizer/paths.h
index f1d111063c..5a2bbc87e1 100644
--- a/src/include/optimizer/paths.h
+++ b/src/include/optimizer/paths.h
@@ -144,6 +144,8 @@ extern EquivalenceMember *find_computable_ec_member(PlannerInfo *root,
 													Relids relids,
 													bool require_parallel_safe);
 extern Expr *find_em_expr_for_rel(EquivalenceClass *ec, RelOptInfo *rel);
+extern EquivalenceMember *find_em_for_rel(EquivalenceClass *ec,
+										  RelOptInfo *rel);
 extern bool relation_can_be_sorted_early(PlannerInfo *root, RelOptInfo *rel,
 										 EquivalenceClass *ec,
 										 bool require_parallel_safe);
-- 
2.33.0

#16Zhihong Yu
zyu@yugabyte.com
In reply to: Ronan Dunklau (#15)
Re: ORDER BY pushdowns seem broken in postgres_fdw

On Mon, Sep 6, 2021 at 1:17 AM Ronan Dunklau <ronan@dunklau.fr> wrote:

Le vendredi 3 septembre 2021, 22:54:25 CEST David Zhang a écrit :

The following review has been posted through the commitfest application:
make installcheck-world: tested, failed
Implements feature: tested, passed
Spec compliant: not tested
Documentation: not tested

Applied the v6 patch to master branch and ran regression test for

contrib,

the result was "All tests successful."

What kind of error did you get running make installcheck-world ? If it
passed
the make check for contrib, I can't see why it would fail running make
installcheck-world.

In any case, I just checked and running make installcheck-world doesn't
produce any error.

Since HEAD had moved a bit since the last version, I rebased the patch,
resulting in the attached v7.

Best regards,

--
Ronan Dunklau

Hi,
bq. a pushed-down order by could return wrong results.

Can you briefly summarize the approach for fixing the bug in the
description ?

+ * Returns true if it's safe to push down a sort as described by 'pathkey'
to
+ * the foreign server
+ */
+bool
+is_foreign_pathkey(PlannerInfo *root,

It would be better to name the method which reflects whether pushdown is
safe. e.g. is_pathkey_safe_for_pushdown.

Cheers

#17Ronan Dunklau
ronan.dunklau@aiven.io
In reply to: Zhihong Yu (#16)
1 attachment(s)
Re: ORDER BY pushdowns seem broken in postgres_fdw

Le lundi 6 septembre 2021, 11:25:39 CEST Zhihong Yu a écrit :

On Mon, Sep 6, 2021 at 1:17 AM Ronan Dunklau <ronan@dunklau.fr> wrote:

Le vendredi 3 septembre 2021, 22:54:25 CEST David Zhang a écrit :

The following review has been posted through the commitfest application:
make installcheck-world: tested, failed
Implements feature: tested, passed
Spec compliant: not tested
Documentation: not tested

Applied the v6 patch to master branch and ran regression test for

contrib,

the result was "All tests successful."

What kind of error did you get running make installcheck-world ? If it
passed
the make check for contrib, I can't see why it would fail running make
installcheck-world.

In any case, I just checked and running make installcheck-world doesn't
produce any error.

Since HEAD had moved a bit since the last version, I rebased the patch,
resulting in the attached v7.

Best regards,

--
Ronan Dunklau

Hi,
bq. a pushed-down order by could return wrong results.

Can you briefly summarize the approach for fixing the bug in the
description ?

Done, let me know what you think about it.

+ * Returns true if it's safe to push down a sort as described by 'pathkey'
to
+ * the foreign server
+ */
+bool
+is_foreign_pathkey(PlannerInfo *root,

It would be better to name the method which reflects whether pushdown is
safe. e.g. is_pathkey_safe_for_pushdown.

The convention used here is the same one as in is_foreign_expr and
is_foreign_param, which are also related to pushdown-safety.

--
Ronan Dunklau

Attachments:

v8-0001-Fix-orderby-handling-in-postgres_fdw.patchtext/x-patch; charset=UTF-8; name=v8-0001-Fix-orderby-handling-in-postgres_fdw.patchDownload
From 8d0ebd4df2e5e72a8d490a65001c9971516a4337 Mon Sep 17 00:00:00 2001
From: Ronan Dunklau <ronan.dunklau@aiven.io>
Date: Mon, 6 Sep 2021 09:54:43 +0200
Subject: [PATCH v8] Fix orderby handling in postgres_fdw

The logic for pushing down order bys in postgres fdw didn't take into
account the specific operator used, and as such a pushed-down order by
could return wrong results.

This patch looks up the original operator associated to the pathkey
opfamily, and checks that it actually exists on the foreign side.

If it does, the operator is then used to rebuild an equivalent USING
<operator> clause.
---
 contrib/postgres_fdw/deparse.c                | 156 +++++++++++++-----
 .../postgres_fdw/expected/postgres_fdw.out    |  23 +++
 contrib/postgres_fdw/postgres_fdw.c           |  28 ++--
 contrib/postgres_fdw/postgres_fdw.h           |  11 +-
 contrib/postgres_fdw/sql/postgres_fdw.sql     |   8 +
 src/backend/optimizer/path/equivclass.c       |  26 ++-
 src/include/optimizer/paths.h                 |   2 +
 7 files changed, 187 insertions(+), 67 deletions(-)

diff --git a/contrib/postgres_fdw/deparse.c b/contrib/postgres_fdw/deparse.c
index d98bd66681..fb1b5f9d9b 100644
--- a/contrib/postgres_fdw/deparse.c
+++ b/contrib/postgres_fdw/deparse.c
@@ -37,9 +37,11 @@
 #include "access/sysattr.h"
 #include "access/table.h"
 #include "catalog/pg_aggregate.h"
+#include "catalog/pg_amop.h"
 #include "catalog/pg_collation.h"
 #include "catalog/pg_namespace.h"
 #include "catalog/pg_operator.h"
+#include "catalog/pg_opfamily.h"
 #include "catalog/pg_proc.h"
 #include "catalog/pg_type.h"
 #include "commands/defrem.h"
@@ -47,6 +49,7 @@
 #include "nodes/nodeFuncs.h"
 #include "nodes/plannodes.h"
 #include "optimizer/optimizer.h"
+#include "optimizer/paths.h"
 #include "optimizer/prep.h"
 #include "optimizer/tlist.h"
 #include "parser/parsetree.h"
@@ -182,6 +185,8 @@ static void deparseRangeTblRef(StringInfo buf, PlannerInfo *root,
 							   Index ignore_rel, List **ignore_conds, List **params_list);
 static void deparseAggref(Aggref *node, deparse_expr_cxt *context);
 static void appendGroupByClause(List *tlist, deparse_expr_cxt *context);
+static void appendOrderBySuffix(Oid sortop, Oid sortcoltype, bool nulls_first,
+								deparse_expr_cxt *context);
 static void appendAggOrderBy(List *orderList, List *targetList,
 							 deparse_expr_cxt *context);
 static void appendFunctionName(Oid funcid, deparse_expr_cxt *context);
@@ -1037,6 +1042,41 @@ is_foreign_param(PlannerInfo *root,
 	return false;
 }
 
+/*
+ * Returns true if it's safe to push down a sort as described by 'pathkey' to
+ * the foreign server
+ */
+bool
+is_foreign_pathkey(PlannerInfo *root,
+				   RelOptInfo *baserel,
+				   PathKey *pathkey)
+{
+	EquivalenceClass *pathkey_ec = pathkey->pk_eclass;
+	PgFdwRelationInfo *fpinfo = (PgFdwRelationInfo *) baserel->fdw_private;
+	Expr	   *em_expr;
+
+	/*
+	 * is_foreign_expr would detect volatile expressions as well, but checking
+	 * ec_has_volatile here saves some cycles.
+	 */
+	if (pathkey_ec->ec_has_volatile)
+		return false;
+
+	/* can't push down the sort if the pathkey's opfamily is not shippable */
+	if (!is_shippable(pathkey->pk_opfamily, OperatorFamilyRelationId, fpinfo))
+		return false;
+
+	em_expr = find_em_expr_for_rel(pathkey_ec, baserel);
+	if (em_expr == NULL)
+		return false;
+
+	/*
+	 * Finally, determine if it's safe to evaluate the found expr on the
+	 * foreign server.
+	 */
+	return is_foreign_expr(root, baserel, em_expr);
+}
+
 /*
  * Convert type OID + typmod info into a type name we can ship to the remote
  * server.  Someplace else had better have verified that this type name is
@@ -3331,6 +3371,45 @@ deparseAggref(Aggref *node, deparse_expr_cxt *context)
 	appendStringInfoChar(buf, ')');
 }
 
+/*
+ * Append the ASC, DESC, USING <OPERATOR> and NULLS FIRST / NULLS LAST part
+ * of the ORDER BY clause
+ */
+static void
+appendOrderBySuffix(Oid sortop, Oid sortcoltype, bool nulls_first,
+					deparse_expr_cxt *context)
+{
+	StringInfo	buf = context->buf;
+	TypeCacheEntry *typentry;
+
+	typentry = lookup_type_cache(sortcoltype, TYPECACHE_LT_OPR | TYPECACHE_GT_OPR);
+
+	if (sortop == typentry->lt_opr)
+		appendStringInfoString(buf, " ASC");
+	else if (sortop == typentry->gt_opr)
+		appendStringInfoString(buf, " DESC");
+	else
+	{
+		HeapTuple	opertup;
+		Form_pg_operator operform;
+
+		appendStringInfoString(buf, " USING ");
+
+		/* Append operator name. */
+		opertup = SearchSysCache1(OPEROID, ObjectIdGetDatum(sortop));
+		if (!HeapTupleIsValid(opertup))
+			elog(ERROR, "cache lookup failed for operator %u", sortop);
+		operform = (Form_pg_operator) GETSTRUCT(opertup);
+		deparseOperatorName(buf, operform);
+		ReleaseSysCache(opertup);
+	}
+
+	if (nulls_first)
+		appendStringInfoString(buf, " NULLS FIRST");
+	else
+		appendStringInfoString(buf, " NULLS LAST");
+}
+
 /*
  * Append ORDER BY within aggregate function.
  */
@@ -3346,7 +3425,6 @@ appendAggOrderBy(List *orderList, List *targetList, deparse_expr_cxt *context)
 		SortGroupClause *srt = (SortGroupClause *) lfirst(lc);
 		Node	   *sortexpr;
 		Oid			sortcoltype;
-		TypeCacheEntry *typentry;
 
 		if (!first)
 			appendStringInfoString(buf, ", ");
@@ -3356,32 +3434,8 @@ appendAggOrderBy(List *orderList, List *targetList, deparse_expr_cxt *context)
 										  false, context);
 		sortcoltype = exprType(sortexpr);
 		/* See whether operator is default < or > for datatype */
-		typentry = lookup_type_cache(sortcoltype,
-									 TYPECACHE_LT_OPR | TYPECACHE_GT_OPR);
-		if (srt->sortop == typentry->lt_opr)
-			appendStringInfoString(buf, " ASC");
-		else if (srt->sortop == typentry->gt_opr)
-			appendStringInfoString(buf, " DESC");
-		else
-		{
-			HeapTuple	opertup;
-			Form_pg_operator operform;
-
-			appendStringInfoString(buf, " USING ");
-
-			/* Append operator name. */
-			opertup = SearchSysCache1(OPEROID, ObjectIdGetDatum(srt->sortop));
-			if (!HeapTupleIsValid(opertup))
-				elog(ERROR, "cache lookup failed for operator %u", srt->sortop);
-			operform = (Form_pg_operator) GETSTRUCT(opertup);
-			deparseOperatorName(buf, operform);
-			ReleaseSysCache(opertup);
-		}
-
-		if (srt->nulls_first)
-			appendStringInfoString(buf, " NULLS FIRST");
-		else
-			appendStringInfoString(buf, " NULLS LAST");
+		appendOrderBySuffix(srt->sortop, sortcoltype, srt->nulls_first,
+							context);
 	}
 }
 
@@ -3486,7 +3540,11 @@ appendOrderByClause(List *pathkeys, bool has_final_sort,
 	foreach(lcell, pathkeys)
 	{
 		PathKey    *pathkey = lfirst(lcell);
+		EquivalenceMember *em;
 		Expr	   *em_expr;
+		HeapTuple	tuple;
+		Oid			oprid;
+		bool		isNull;
 
 		if (has_final_sort)
 		{
@@ -3494,26 +3552,44 @@ appendOrderByClause(List *pathkeys, bool has_final_sort,
 			 * By construction, context->foreignrel is the input relation to
 			 * the final sort.
 			 */
-			em_expr = find_em_expr_for_input_target(context->root,
-													pathkey->pk_eclass,
-													context->foreignrel->reltarget);
+			em = find_em_for_input_target(context->root,
+										  pathkey->pk_eclass,
+										  context->foreignrel->reltarget);
 		}
 		else
-			em_expr = find_em_expr_for_rel(pathkey->pk_eclass, baserel);
+			em = find_em_for_rel(pathkey->pk_eclass, baserel);
+
+		em_expr = em->em_expr;
+
+		/*
+		 * Lookup the operator corresponding to the strategy in the opclass.
+		 * The datatype used by the opfamily is not necessarily the same as
+		 * the expression type (for array types for example).
+		 */
+		tuple = SearchSysCache4(AMOPSTRATEGY,
+								ObjectIdGetDatum(pathkey->pk_opfamily),
+								ObjectIdGetDatum(em->em_datatype),
+								ObjectIdGetDatum(em->em_datatype),
+								Int16GetDatum(pathkey->pk_strategy));
+
+		if (!HeapTupleIsValid(tuple))
+			elog(ERROR, "missing operator %d(%u,%u) in opfamily %u",
+				 pathkey->pk_strategy, em->em_datatype, em->em_datatype,
+				 pathkey->pk_opfamily);
 
-		Assert(em_expr != NULL);
+		oprid = DatumGetObjectId(SysCacheGetAttr(AMOPSTRATEGY, tuple,
+												 Anum_pg_amop_amopopr, &isNull));
+		ReleaseSysCache(tuple);
 
 		appendStringInfoString(buf, delim);
 		deparseExpr(em_expr, context);
-		if (pathkey->pk_strategy == BTLessStrategyNumber)
-			appendStringInfoString(buf, " ASC");
-		else
-			appendStringInfoString(buf, " DESC");
 
-		if (pathkey->pk_nulls_first)
-			appendStringInfoString(buf, " NULLS FIRST");
-		else
-			appendStringInfoString(buf, " NULLS LAST");
+		/*
+		 * Here we need to use the expression type to compare against the
+		 * default btree sort operator.
+		 */
+		appendOrderBySuffix(oprid, exprType((Node *) em_expr),
+							pathkey->pk_nulls_first, context);
 
 		delim = ", ";
 	}
diff --git a/contrib/postgres_fdw/expected/postgres_fdw.out b/contrib/postgres_fdw/expected/postgres_fdw.out
index e3ee30f1aa..1f44c365ef 100644
--- a/contrib/postgres_fdw/expected/postgres_fdw.out
+++ b/contrib/postgres_fdw/expected/postgres_fdw.out
@@ -3258,6 +3258,19 @@ select array_agg(c1 order by c1 using operator(public.<^)) from ft2 where c2 = 6
          Remote SQL: SELECT "C 1", c2 FROM "S 1"."T 1" WHERE (("C 1" < 100)) AND ((c2 = 6))
 (6 rows)
 
+-- Ensure that we don't push down an ORDER BY with a non-shippable operator
+explain (verbose, costs off)
+select * from ft2 order by c1 using operator(public.<^);
+                                  QUERY PLAN                                   
+-------------------------------------------------------------------------------
+ Sort
+   Output: c1, c2, c3, c4, c5, c6, c7, c8
+   Sort Key: ft2.c1 USING <^
+   ->  Foreign Scan on public.ft2
+         Output: c1, c2, c3, c4, c5, c6, c7, c8
+         Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1"
+(6 rows)
+
 -- Update local stats on ft2
 ANALYZE ft2;
 -- Add into extension
@@ -3285,6 +3298,16 @@ select array_agg(c1 order by c1 using operator(public.<^)) from ft2 where c2 = 6
  {6,16,26,36,46,56,66,76,86,96}
 (1 row)
 
+-- Ensure that the ORDER BY is pushed to the foreign server
+explain (verbose, costs off)
+select * from ft2 order by c1 using operator(public.<^);
+                                                         QUERY PLAN                                                          
+-----------------------------------------------------------------------------------------------------------------------------
+ Foreign Scan on public.ft2
+   Output: c1, c2, c3, c4, c5, c6, c7, c8
+   Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" ORDER BY "C 1" USING OPERATOR(public.<^) NULLS LAST
+(3 rows)
+
 -- Remove from extension
 alter extension postgres_fdw drop operator class my_op_class using btree;
 alter extension postgres_fdw drop function my_op_cmp(a int, b int);
diff --git a/contrib/postgres_fdw/postgres_fdw.c b/contrib/postgres_fdw/postgres_fdw.c
index 4bdab30a73..1dd4779185 100644
--- a/contrib/postgres_fdw/postgres_fdw.c
+++ b/contrib/postgres_fdw/postgres_fdw.c
@@ -917,8 +917,6 @@ get_useful_pathkeys_for_relation(PlannerInfo *root, RelOptInfo *rel)
 		foreach(lc, root->query_pathkeys)
 		{
 			PathKey    *pathkey = (PathKey *) lfirst(lc);
-			EquivalenceClass *pathkey_ec = pathkey->pk_eclass;
-			Expr	   *em_expr;
 
 			/*
 			 * The planner and executor don't have any clever strategy for
@@ -926,13 +924,8 @@ get_useful_pathkeys_for_relation(PlannerInfo *root, RelOptInfo *rel)
 			 * getting it to be sorted by all of those pathkeys. We'll just
 			 * end up resorting the entire data set.  So, unless we can push
 			 * down all of the query pathkeys, forget it.
-			 *
-			 * is_foreign_expr would detect volatile expressions as well, but
-			 * checking ec_has_volatile here saves some cycles.
 			 */
-			if (pathkey_ec->ec_has_volatile ||
-				!(em_expr = find_em_expr_for_rel(pathkey_ec, rel)) ||
-				!is_foreign_expr(root, rel, em_expr))
+			if (!is_foreign_pathkey(root, rel, pathkey))
 			{
 				query_pathkeys_ok = false;
 				break;
@@ -6540,9 +6533,9 @@ add_foreign_ordered_paths(PlannerInfo *root, RelOptInfo *input_rel,
 			return;
 
 		/* Get the sort expression for the pathkey_ec */
-		sort_expr = find_em_expr_for_input_target(root,
-												  pathkey_ec,
-												  input_rel->reltarget);
+		sort_expr = find_em_for_input_target(root,
+											 pathkey_ec,
+											 input_rel->reltarget)->em_expr;
 
 		/* If it's unsafe to remote, we cannot push down the final sort */
 		if (!is_foreign_expr(root, input_rel, sort_expr))
@@ -7332,13 +7325,12 @@ conversion_error_callback(void *arg)
 }
 
 /*
- * Find an equivalence class member expression to be computed as a sort column
- * in the given target.
+ * Find an equivalence class member to be computed as a sort column in the
+ * given target.
  */
-Expr *
-find_em_expr_for_input_target(PlannerInfo *root,
-							  EquivalenceClass *ec,
-							  PathTarget *target)
+EquivalenceMember *
+find_em_for_input_target(PlannerInfo *root, EquivalenceClass *ec,
+						 PathTarget *target)
 {
 	ListCell   *lc1;
 	int			i;
@@ -7383,7 +7375,7 @@ find_em_expr_for_input_target(PlannerInfo *root,
 				em_expr = ((RelabelType *) em_expr)->arg;
 
 			if (equal(em_expr, expr))
-				return em->em_expr;
+				return em;
 		}
 
 		i++;
diff --git a/contrib/postgres_fdw/postgres_fdw.h b/contrib/postgres_fdw/postgres_fdw.h
index ca83306af9..9930226b6f 100644
--- a/contrib/postgres_fdw/postgres_fdw.h
+++ b/contrib/postgres_fdw/postgres_fdw.h
@@ -171,6 +171,9 @@ extern bool is_foreign_expr(PlannerInfo *root,
 extern bool is_foreign_param(PlannerInfo *root,
 							 RelOptInfo *baserel,
 							 Expr *expr);
+extern bool is_foreign_pathkey(PlannerInfo *root,
+							   RelOptInfo *baserel,
+							   PathKey *pathkey);
 extern void deparseInsertSql(StringInfo buf, RangeTblEntry *rte,
 							 Index rtindex, Relation rel,
 							 List *targetAttrs, bool doNothing,
@@ -213,10 +216,10 @@ extern void deparseTruncateSql(StringInfo buf,
 							   DropBehavior behavior,
 							   bool restart_seqs);
 extern void deparseStringLiteral(StringInfo buf, const char *val);
-extern Expr *find_em_expr_for_rel(EquivalenceClass *ec, RelOptInfo *rel);
-extern Expr *find_em_expr_for_input_target(PlannerInfo *root,
-										   EquivalenceClass *ec,
-										   PathTarget *target);
+extern EquivalenceMember *find_em_for_rel(EquivalenceClass *ec, RelOptInfo *rel);
+extern EquivalenceMember *find_em_for_input_target(PlannerInfo *root,
+												   EquivalenceClass *ec,
+												   PathTarget *target);
 extern List *build_tlist_to_deparse(RelOptInfo *foreignrel);
 extern void deparseSelectStmtForRel(StringInfo buf, PlannerInfo *root,
 									RelOptInfo *foreignrel, List *tlist,
diff --git a/contrib/postgres_fdw/sql/postgres_fdw.sql b/contrib/postgres_fdw/sql/postgres_fdw.sql
index 30b5175da5..1bbe1212c4 100644
--- a/contrib/postgres_fdw/sql/postgres_fdw.sql
+++ b/contrib/postgres_fdw/sql/postgres_fdw.sql
@@ -902,6 +902,10 @@ create operator class my_op_class for type int using btree family my_op_family a
 explain (verbose, costs off)
 select array_agg(c1 order by c1 using operator(public.<^)) from ft2 where c2 = 6 and c1 < 100 group by c2;
 
+-- Ensure that we don't push down an ORDER BY with a non-shippable operator
+explain (verbose, costs off)
+select * from ft2 order by c1 using operator(public.<^);
+
 -- Update local stats on ft2
 ANALYZE ft2;
 
@@ -919,6 +923,10 @@ explain (verbose, costs off)
 select array_agg(c1 order by c1 using operator(public.<^)) from ft2 where c2 = 6 and c1 < 100 group by c2;
 select array_agg(c1 order by c1 using operator(public.<^)) from ft2 where c2 = 6 and c1 < 100 group by c2;
 
+-- Ensure that the ORDER BY is pushed to the foreign server
+explain (verbose, costs off)
+select * from ft2 order by c1 using operator(public.<^);
+
 -- Remove from extension
 alter extension postgres_fdw drop operator class my_op_class using btree;
 alter extension postgres_fdw drop function my_op_cmp(a int, b int);
diff --git a/src/backend/optimizer/path/equivclass.c b/src/backend/optimizer/path/equivclass.c
index 6f1abbe47d..b367f18eab 100644
--- a/src/backend/optimizer/path/equivclass.c
+++ b/src/backend/optimizer/path/equivclass.c
@@ -932,11 +932,11 @@ is_exprlist_member(Expr *node, List *exprs)
 }
 
 /*
- * Find an equivalence class member expression, all of whose Vars, come from
- * the indicated relation.
+ * Find an equivalence class member, all of whose Vars, come from the
+ * indicated relation.
  */
-Expr *
-find_em_expr_for_rel(EquivalenceClass *ec, RelOptInfo *rel)
+EquivalenceMember *
+find_em_for_rel(EquivalenceClass *ec, RelOptInfo *rel)
 {
 	ListCell   *lc_em;
 
@@ -952,7 +952,7 @@ find_em_expr_for_rel(EquivalenceClass *ec, RelOptInfo *rel)
 			 * taken entirely from this relation, we'll be content to choose
 			 * any one of those.
 			 */
-			return em->em_expr;
+			return em;
 		}
 	}
 
@@ -960,6 +960,22 @@ find_em_expr_for_rel(EquivalenceClass *ec, RelOptInfo *rel)
 	return NULL;
 }
 
+/*
+ * Find an equivalence class member expression, all of whose Vars, come from
+ * the indicated relation.
+ */
+Expr *
+find_em_expr_for_rel(EquivalenceClass *ec, RelOptInfo *rel)
+{
+	EquivalenceMember *em = find_em_for_rel(ec, rel);
+
+	if (em != NULL)
+		return em->em_expr;
+
+	return NULL;
+}
+
+
 /*
  * relation_can_be_sorted_early
  *		Can this relation be sorted on this EC before the final output step?
diff --git a/src/include/optimizer/paths.h b/src/include/optimizer/paths.h
index f1d111063c..5a2bbc87e1 100644
--- a/src/include/optimizer/paths.h
+++ b/src/include/optimizer/paths.h
@@ -144,6 +144,8 @@ extern EquivalenceMember *find_computable_ec_member(PlannerInfo *root,
 													Relids relids,
 													bool require_parallel_safe);
 extern Expr *find_em_expr_for_rel(EquivalenceClass *ec, RelOptInfo *rel);
+extern EquivalenceMember *find_em_for_rel(EquivalenceClass *ec,
+										  RelOptInfo *rel);
 extern bool relation_can_be_sorted_early(PlannerInfo *root, RelOptInfo *rel,
 										 EquivalenceClass *ec,
 										 bool require_parallel_safe);
-- 
2.33.0

#18Zhihong Yu
zyu@yugabyte.com
In reply to: Ronan Dunklau (#17)
Re: ORDER BY pushdowns seem broken in postgres_fdw

On Mon, Sep 6, 2021 at 2:41 AM Ronan Dunklau <ronan.dunklau@aiven.io> wrote:

Le lundi 6 septembre 2021, 11:25:39 CEST Zhihong Yu a écrit :

On Mon, Sep 6, 2021 at 1:17 AM Ronan Dunklau <ronan@dunklau.fr> wrote:

Le vendredi 3 septembre 2021, 22:54:25 CEST David Zhang a écrit :

The following review has been posted through the commitfest

application:

make installcheck-world: tested, failed
Implements feature: tested, passed
Spec compliant: not tested
Documentation: not tested

Applied the v6 patch to master branch and ran regression test for

contrib,

the result was "All tests successful."

What kind of error did you get running make installcheck-world ? If it
passed
the make check for contrib, I can't see why it would fail running make
installcheck-world.

In any case, I just checked and running make installcheck-world doesn't
produce any error.

Since HEAD had moved a bit since the last version, I rebased the patch,
resulting in the attached v7.

Best regards,

--
Ronan Dunklau

Hi,
bq. a pushed-down order by could return wrong results.

Can you briefly summarize the approach for fixing the bug in the
description ?

Done, let me know what you think about it.

+ * Returns true if it's safe to push down a sort as described by

'pathkey'

to
+ * the foreign server
+ */
+bool
+is_foreign_pathkey(PlannerInfo *root,

It would be better to name the method which reflects whether pushdown is
safe. e.g. is_pathkey_safe_for_pushdown.

The convention used here is the same one as in is_foreign_expr and
is_foreign_param, which are also related to pushdown-safety.

--
Ronan Dunklau

Hi,
w.r.t. description:
bq. original operator associated to the pathkey

associated to -> associated with

w.r.t. method name, it is fine to use the current name, considering the
functions it calls don't have pushdown in their names.

Cheers

#19David Zhang
david.zhang@highgo.ca
In reply to: Ronan Dunklau (#15)
Re: ORDER BY pushdowns seem broken in postgres_fdw

On 2021-09-06 1:16 a.m., Ronan Dunklau wrote:

Le vendredi 3 septembre 2021, 22:54:25 CEST David Zhang a �crit :

The following review has been posted through the commitfest application:
make installcheck-world: tested, failed
Implements feature: tested, passed
Spec compliant: not tested
Documentation: not tested

Applied the v6 patch to master branch and ran regression test for contrib,
the result was "All tests successful."

What kind of error did you get running make installcheck-world ? If it passed
the make check for contrib, I can't see why it would fail running make
installcheck-world.

Just to clarify, the error I encountered was not related with patch v6,
it was related with other extensions.

In any case, I just checked and running make installcheck-world doesn't
produce any error.

Since HEAD had moved a bit since the last version, I rebased the patch,
resulting in the attached v7.

Best regards,

--
Ronan Dunklau

--
David

Software Engineer
Highgo Software Inc. (Canada)
www.highgo.ca

#20Tom Lane
tgl@sss.pgh.pa.us
In reply to: Ronan Dunklau (#17)
1 attachment(s)
Re: ORDER BY pushdowns seem broken in postgres_fdw

Ronan Dunklau <ronan.dunklau@aiven.io> writes:

[ v8-0001-Fix-orderby-handling-in-postgres_fdw.patch ]

I looked through this patch. It's going in the right direction,
but I have a couple of nitpicks:

1. There are still some more places that aren't checking shippability
of the relevant opfamily.

2. The existing usage of find_em_expr_for_rel is fundamentally broken,
because that function will seize on the first EC member that is from the
given rel, whether it's shippable or not. There might be another one
later that is shippable, so this is just the wrong API. It's not like
this function gives us any useful isolation from the details of ECs,
because postgres_fdw is already looking into those elsewhere, notably
in find_em_expr_for_input_target --- which has the same order-sensitivity
bug.

I think that instead of doubling down on a wrong API, we should just
take that out and move the logic into postgres_fdw.c. This also has
the advantage of producing a patch that's much safer to backpatch,
because it doesn't rely on the core backend getting updated before
postgres_fdw.so is.

So hacking on those two points, and doing some additional cleanup,
led me to the attached v9. (In this patch, the removal of code
from equivclass.c is only meant to be applied to HEAD; we have to
leave the function in place in the back branches for API stability.)

If no objections, I think this is committable.

regards, tom lane

Attachments:

v9-0001-Fix-orderby-handling-in-postgres_fdw.patchtext/x-diff; charset=us-ascii; name=v9-0001-Fix-orderby-handling-in-postgres_fdw.patchDownload
diff --git a/contrib/postgres_fdw/deparse.c b/contrib/postgres_fdw/deparse.c
index bf12eac028..8f4d8a5022 100644
--- a/contrib/postgres_fdw/deparse.c
+++ b/contrib/postgres_fdw/deparse.c
@@ -40,6 +40,7 @@
 #include "catalog/pg_collation.h"
 #include "catalog/pg_namespace.h"
 #include "catalog/pg_operator.h"
+#include "catalog/pg_opfamily.h"
 #include "catalog/pg_proc.h"
 #include "catalog/pg_type.h"
 #include "commands/defrem.h"
@@ -183,6 +184,8 @@ static void deparseRangeTblRef(StringInfo buf, PlannerInfo *root,
 							   Index ignore_rel, List **ignore_conds, List **params_list);
 static void deparseAggref(Aggref *node, deparse_expr_cxt *context);
 static void appendGroupByClause(List *tlist, deparse_expr_cxt *context);
+static void appendOrderBySuffix(Oid sortop, Oid sortcoltype, bool nulls_first,
+								deparse_expr_cxt *context);
 static void appendAggOrderBy(List *orderList, List *targetList,
 							 deparse_expr_cxt *context);
 static void appendFunctionName(Oid funcid, deparse_expr_cxt *context);
@@ -1038,6 +1041,33 @@ is_foreign_param(PlannerInfo *root,
 	return false;
 }
 
+/*
+ * Returns true if it's safe to push down the sort expression described by
+ * 'pathkey' to the foreign server.
+ */
+bool
+is_foreign_pathkey(PlannerInfo *root,
+				   RelOptInfo *baserel,
+				   PathKey *pathkey)
+{
+	EquivalenceClass *pathkey_ec = pathkey->pk_eclass;
+	PgFdwRelationInfo *fpinfo = (PgFdwRelationInfo *) baserel->fdw_private;
+
+	/*
+	 * is_foreign_expr would detect volatile expressions as well, but checking
+	 * ec_has_volatile here saves some cycles.
+	 */
+	if (pathkey_ec->ec_has_volatile)
+		return false;
+
+	/* can't push down the sort if the pathkey's opfamily is not shippable */
+	if (!is_shippable(pathkey->pk_opfamily, OperatorFamilyRelationId, fpinfo))
+		return false;
+
+	/* can push if a suitable EC member exists */
+	return (find_em_for_rel(root, pathkey_ec, baserel) != NULL);
+}
+
 /*
  * Convert type OID + typmod info into a type name we can ship to the remote
  * server.  Someplace else had better have verified that this type name is
@@ -3445,44 +3475,59 @@ appendAggOrderBy(List *orderList, List *targetList, deparse_expr_cxt *context)
 	{
 		SortGroupClause *srt = (SortGroupClause *) lfirst(lc);
 		Node	   *sortexpr;
-		Oid			sortcoltype;
-		TypeCacheEntry *typentry;
 
 		if (!first)
 			appendStringInfoString(buf, ", ");
 		first = false;
 
+		/* Deparse the sort expression proper. */
 		sortexpr = deparseSortGroupClause(srt->tleSortGroupRef, targetList,
 										  false, context);
-		sortcoltype = exprType(sortexpr);
-		/* See whether operator is default < or > for datatype */
-		typentry = lookup_type_cache(sortcoltype,
-									 TYPECACHE_LT_OPR | TYPECACHE_GT_OPR);
-		if (srt->sortop == typentry->lt_opr)
-			appendStringInfoString(buf, " ASC");
-		else if (srt->sortop == typentry->gt_opr)
-			appendStringInfoString(buf, " DESC");
-		else
-		{
-			HeapTuple	opertup;
-			Form_pg_operator operform;
-
-			appendStringInfoString(buf, " USING ");
-
-			/* Append operator name. */
-			opertup = SearchSysCache1(OPEROID, ObjectIdGetDatum(srt->sortop));
-			if (!HeapTupleIsValid(opertup))
-				elog(ERROR, "cache lookup failed for operator %u", srt->sortop);
-			operform = (Form_pg_operator) GETSTRUCT(opertup);
-			deparseOperatorName(buf, operform);
-			ReleaseSysCache(opertup);
-		}
+		/* Add decoration as needed. */
+		appendOrderBySuffix(srt->sortop, exprType(sortexpr), srt->nulls_first,
+							context);
+	}
+}
 
-		if (srt->nulls_first)
-			appendStringInfoString(buf, " NULLS FIRST");
-		else
-			appendStringInfoString(buf, " NULLS LAST");
+/*
+ * Append the ASC, DESC, USING <OPERATOR> and NULLS FIRST / NULLS LAST parts
+ * of an ORDER BY clause.
+ */
+static void
+appendOrderBySuffix(Oid sortop, Oid sortcoltype, bool nulls_first,
+					deparse_expr_cxt *context)
+{
+	StringInfo	buf = context->buf;
+	TypeCacheEntry *typentry;
+
+	/* See whether operator is default < or > for sort expr's datatype. */
+	typentry = lookup_type_cache(sortcoltype,
+								 TYPECACHE_LT_OPR | TYPECACHE_GT_OPR);
+
+	if (sortop == typentry->lt_opr)
+		appendStringInfoString(buf, " ASC");
+	else if (sortop == typentry->gt_opr)
+		appendStringInfoString(buf, " DESC");
+	else
+	{
+		HeapTuple	opertup;
+		Form_pg_operator operform;
+
+		appendStringInfoString(buf, " USING ");
+
+		/* Append operator name. */
+		opertup = SearchSysCache1(OPEROID, ObjectIdGetDatum(sortop));
+		if (!HeapTupleIsValid(opertup))
+			elog(ERROR, "cache lookup failed for operator %u", sortop);
+		operform = (Form_pg_operator) GETSTRUCT(opertup);
+		deparseOperatorName(buf, operform);
+		ReleaseSysCache(opertup);
 	}
+
+	if (nulls_first)
+		appendStringInfoString(buf, " NULLS FIRST");
+	else
+		appendStringInfoString(buf, " NULLS LAST");
 }
 
 /*
@@ -3565,9 +3610,13 @@ appendGroupByClause(List *tlist, deparse_expr_cxt *context)
 }
 
 /*
- * Deparse ORDER BY clause according to the given pathkeys for given base
- * relation. From given pathkeys expressions belonging entirely to the given
- * base relation are obtained and deparsed.
+ * Deparse ORDER BY clause defined by the given pathkeys.
+ *
+ * The clause should use Vars from context->scanrel if !has_final_sort,
+ * or from context->foreignrel's targetlist if has_final_sort.
+ *
+ * We find a suitable pathkey expression (some earlier step
+ * should have verified that there is one) and deparse it.
  */
 static void
 appendOrderByClause(List *pathkeys, bool has_final_sort,
@@ -3575,8 +3624,7 @@ appendOrderByClause(List *pathkeys, bool has_final_sort,
 {
 	ListCell   *lcell;
 	int			nestlevel;
-	char	   *delim = " ";
-	RelOptInfo *baserel = context->scanrel;
+	const char *delim = " ";
 	StringInfo	buf = context->buf;
 
 	/* Make sure any constants in the exprs are printed portably */
@@ -3586,7 +3634,9 @@ appendOrderByClause(List *pathkeys, bool has_final_sort,
 	foreach(lcell, pathkeys)
 	{
 		PathKey    *pathkey = lfirst(lcell);
+		EquivalenceMember *em;
 		Expr	   *em_expr;
+		Oid			oprid;
 
 		if (has_final_sort)
 		{
@@ -3594,26 +3644,48 @@ appendOrderByClause(List *pathkeys, bool has_final_sort,
 			 * By construction, context->foreignrel is the input relation to
 			 * the final sort.
 			 */
-			em_expr = find_em_expr_for_input_target(context->root,
-													pathkey->pk_eclass,
-													context->foreignrel->reltarget);
+			em = find_em_for_rel_target(context->root,
+										pathkey->pk_eclass,
+										context->foreignrel);
 		}
 		else
-			em_expr = find_em_expr_for_rel(pathkey->pk_eclass, baserel);
+			em = find_em_for_rel(context->root,
+								 pathkey->pk_eclass,
+								 context->scanrel);
+
+		/*
+		 * We don't expect any error here; it would mean that shippability
+		 * wasn't verified earlier.  For the same reason, we don't recheck
+		 * shippability of the sort operator.
+		 */
+		if (em == NULL)
+			elog(ERROR, "could not find pathkey item to sort");
+
+		em_expr = em->em_expr;
 
-		Assert(em_expr != NULL);
+		/*
+		 * Lookup the operator corresponding to the strategy in the opclass.
+		 * The datatype used by the opfamily is not necessarily the same as
+		 * the expression type (for array types for example).
+		 */
+		oprid = get_opfamily_member(pathkey->pk_opfamily,
+									em->em_datatype,
+									em->em_datatype,
+									pathkey->pk_strategy);
+		if (!OidIsValid(oprid))
+			elog(ERROR, "missing operator %d(%u,%u) in opfamily %u",
+				 pathkey->pk_strategy, em->em_datatype, em->em_datatype,
+				 pathkey->pk_opfamily);
 
 		appendStringInfoString(buf, delim);
 		deparseExpr(em_expr, context);
-		if (pathkey->pk_strategy == BTLessStrategyNumber)
-			appendStringInfoString(buf, " ASC");
-		else
-			appendStringInfoString(buf, " DESC");
 
-		if (pathkey->pk_nulls_first)
-			appendStringInfoString(buf, " NULLS FIRST");
-		else
-			appendStringInfoString(buf, " NULLS LAST");
+		/*
+		 * Here we need to use the expression's actual type to discover
+		 * whether the desired operator will be the default or not.
+		 */
+		appendOrderBySuffix(oprid, exprType((Node *) em_expr),
+							pathkey->pk_nulls_first, context);
 
 		delim = ", ";
 	}
diff --git a/contrib/postgres_fdw/expected/postgres_fdw.out b/contrib/postgres_fdw/expected/postgres_fdw.out
index f210f91188..2725b2b211 100644
--- a/contrib/postgres_fdw/expected/postgres_fdw.out
+++ b/contrib/postgres_fdw/expected/postgres_fdw.out
@@ -3265,6 +3265,19 @@ select array_agg(c1 order by c1 using operator(public.<^)) from ft2 where c2 = 6
          Remote SQL: SELECT "C 1", c2 FROM "S 1"."T 1" WHERE (("C 1" < 100)) AND ((c2 = 6))
 (6 rows)
 
+-- This should not be pushed either.
+explain (verbose, costs off)
+select * from ft2 order by c1 using operator(public.<^);
+                                  QUERY PLAN                                   
+-------------------------------------------------------------------------------
+ Sort
+   Output: c1, c2, c3, c4, c5, c6, c7, c8
+   Sort Key: ft2.c1 USING <^
+   ->  Foreign Scan on public.ft2
+         Output: c1, c2, c3, c4, c5, c6, c7, c8
+         Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1"
+(6 rows)
+
 -- Update local stats on ft2
 ANALYZE ft2;
 -- Add into extension
@@ -3292,6 +3305,16 @@ select array_agg(c1 order by c1 using operator(public.<^)) from ft2 where c2 = 6
  {6,16,26,36,46,56,66,76,86,96}
 (1 row)
 
+-- This should be pushed too.
+explain (verbose, costs off)
+select * from ft2 order by c1 using operator(public.<^);
+                                                         QUERY PLAN                                                          
+-----------------------------------------------------------------------------------------------------------------------------
+ Foreign Scan on public.ft2
+   Output: c1, c2, c3, c4, c5, c6, c7, c8
+   Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" ORDER BY "C 1" USING OPERATOR(public.<^) NULLS LAST
+(3 rows)
+
 -- Remove from extension
 alter extension postgres_fdw drop operator class my_op_class using btree;
 alter extension postgres_fdw drop function my_op_cmp(a int, b int);
diff --git a/contrib/postgres_fdw/postgres_fdw.c b/contrib/postgres_fdw/postgres_fdw.c
index 56654844e8..c51dd68722 100644
--- a/contrib/postgres_fdw/postgres_fdw.c
+++ b/contrib/postgres_fdw/postgres_fdw.c
@@ -18,6 +18,7 @@
 #include "access/sysattr.h"
 #include "access/table.h"
 #include "catalog/pg_class.h"
+#include "catalog/pg_opfamily.h"
 #include "commands/defrem.h"
 #include "commands/explain.h"
 #include "commands/vacuum.h"
@@ -918,8 +919,6 @@ get_useful_pathkeys_for_relation(PlannerInfo *root, RelOptInfo *rel)
 		foreach(lc, root->query_pathkeys)
 		{
 			PathKey    *pathkey = (PathKey *) lfirst(lc);
-			EquivalenceClass *pathkey_ec = pathkey->pk_eclass;
-			Expr	   *em_expr;
 
 			/*
 			 * The planner and executor don't have any clever strategy for
@@ -927,13 +926,8 @@ get_useful_pathkeys_for_relation(PlannerInfo *root, RelOptInfo *rel)
 			 * getting it to be sorted by all of those pathkeys. We'll just
 			 * end up resorting the entire data set.  So, unless we can push
 			 * down all of the query pathkeys, forget it.
-			 *
-			 * is_foreign_expr would detect volatile expressions as well, but
-			 * checking ec_has_volatile here saves some cycles.
 			 */
-			if (pathkey_ec->ec_has_volatile ||
-				!(em_expr = find_em_expr_for_rel(pathkey_ec, rel)) ||
-				!is_foreign_expr(root, rel, em_expr))
+			if (!is_foreign_pathkey(root, rel, pathkey))
 			{
 				query_pathkeys_ok = false;
 				break;
@@ -980,16 +974,19 @@ get_useful_pathkeys_for_relation(PlannerInfo *root, RelOptInfo *rel)
 	foreach(lc, useful_eclass_list)
 	{
 		EquivalenceClass *cur_ec = lfirst(lc);
-		Expr	   *em_expr;
 		PathKey    *pathkey;
 
 		/* If redundant with what we did above, skip it. */
 		if (cur_ec == query_ec)
 			continue;
 
+		/* Can't push down the sort if the EC's opfamily is not shippable. */
+		if (!is_shippable(linitial_oid(cur_ec->ec_opfamilies),
+						  OperatorFamilyRelationId, fpinfo))
+			continue;
+
 		/* If no pushable expression for this rel, skip it. */
-		em_expr = find_em_expr_for_rel(cur_ec, rel);
-		if (em_expr == NULL || !is_foreign_expr(root, rel, em_expr))
+		if (find_em_for_rel(root, cur_ec, rel) == NULL)
 			continue;
 
 		/* Looks like we can generate a pathkey, so let's do it. */
@@ -6543,7 +6540,6 @@ add_foreign_ordered_paths(PlannerInfo *root, RelOptInfo *input_rel,
 	{
 		PathKey    *pathkey = (PathKey *) lfirst(lc);
 		EquivalenceClass *pathkey_ec = pathkey->pk_eclass;
-		Expr	   *sort_expr;
 
 		/*
 		 * is_foreign_expr would detect volatile expressions as well, but
@@ -6552,13 +6548,20 @@ add_foreign_ordered_paths(PlannerInfo *root, RelOptInfo *input_rel,
 		if (pathkey_ec->ec_has_volatile)
 			return;
 
-		/* Get the sort expression for the pathkey_ec */
-		sort_expr = find_em_expr_for_input_target(root,
-												  pathkey_ec,
-												  input_rel->reltarget);
+		/*
+		 * Can't push down the sort if pathkey's opfamily is not shippable.
+		 */
+		if (!is_shippable(pathkey->pk_opfamily, OperatorFamilyRelationId,
+						  fpinfo))
+			return;
 
-		/* If it's unsafe to remote, we cannot push down the final sort */
-		if (!is_foreign_expr(root, input_rel, sort_expr))
+		/*
+		 * The EC must contain a shippable EM that is computed in input_rel's
+		 * reltarget, else we can't push down the sort.
+		 */
+		if (find_em_for_rel_target(root,
+								   pathkey_ec,
+								   input_rel) == NULL)
 			return;
 	}
 
@@ -7384,14 +7387,55 @@ conversion_error_callback(void *arg)
 }
 
 /*
- * Find an equivalence class member expression to be computed as a sort column
- * in the given target.
+ * Given an EquivalenceClass and a foreign relation, find an EC member
+ * that can be used to sort the relation remotely according to a pathkey
+ * using this EC.
+ *
+ * If there is more than one suitable candidate, return an arbitrary
+ * one of them.  If there is none, return NULL.
+ *
+ * This checks that the EC member expression uses only Vars from the given
+ * rel and is shippable.  Caller must separately verify that the pathkey's
+ * ordering operator is shippable.
+ */
+EquivalenceMember *
+find_em_for_rel(PlannerInfo *root, EquivalenceClass *ec, RelOptInfo *rel)
+{
+	ListCell   *lc;
+
+	foreach(lc, ec->ec_members)
+	{
+		EquivalenceMember *em = (EquivalenceMember *) lfirst(lc);
+
+		/*
+		 * Note we require !bms_is_empty, else we'd accept constant
+		 * expressions which are not suitable for the purpose.
+		 */
+		if (bms_is_subset(em->em_relids, rel->relids) &&
+			!bms_is_empty(em->em_relids) &&
+			is_foreign_expr(root, rel, em->em_expr))
+			return em;
+	}
+
+	return NULL;
+}
+
+/*
+ * Find an EquivalenceClass member that is to be computed as a sort column
+ * in the given rel's reltarget, and is shippable.
+ *
+ * If there is more than one suitable candidate, return an arbitrary
+ * one of them.  If there is none, return NULL.
+ *
+ * This checks that the EC member expression uses only Vars from the given
+ * rel and is shippable.  Caller must separately verify that the pathkey's
+ * ordering operator is shippable.
  */
-Expr *
-find_em_expr_for_input_target(PlannerInfo *root,
-							  EquivalenceClass *ec,
-							  PathTarget *target)
+EquivalenceMember *
+find_em_for_rel_target(PlannerInfo *root, EquivalenceClass *ec,
+					   RelOptInfo *rel)
 {
+	PathTarget *target = rel->reltarget;
 	ListCell   *lc1;
 	int			i;
 
@@ -7434,15 +7478,18 @@ find_em_expr_for_input_target(PlannerInfo *root,
 			while (em_expr && IsA(em_expr, RelabelType))
 				em_expr = ((RelabelType *) em_expr)->arg;
 
-			if (equal(em_expr, expr))
-				return em->em_expr;
+			if (!equal(em_expr, expr))
+				continue;
+
+			/* Check that expression (including relabels!) is shippable */
+			if (is_foreign_expr(root, rel, em->em_expr))
+				return em;
 		}
 
 		i++;
 	}
 
-	elog(ERROR, "could not find pathkey item to sort");
-	return NULL;				/* keep compiler quiet */
+	return NULL;
 }
 
 /*
diff --git a/contrib/postgres_fdw/postgres_fdw.h b/contrib/postgres_fdw/postgres_fdw.h
index 8ae79e97e4..21f2b20ce8 100644
--- a/contrib/postgres_fdw/postgres_fdw.h
+++ b/contrib/postgres_fdw/postgres_fdw.h
@@ -173,6 +173,9 @@ extern bool is_foreign_expr(PlannerInfo *root,
 extern bool is_foreign_param(PlannerInfo *root,
 							 RelOptInfo *baserel,
 							 Expr *expr);
+extern bool is_foreign_pathkey(PlannerInfo *root,
+							   RelOptInfo *baserel,
+							   PathKey *pathkey);
 extern void deparseInsertSql(StringInfo buf, RangeTblEntry *rte,
 							 Index rtindex, Relation rel,
 							 List *targetAttrs, bool doNothing,
@@ -215,10 +218,12 @@ extern void deparseTruncateSql(StringInfo buf,
 							   DropBehavior behavior,
 							   bool restart_seqs);
 extern void deparseStringLiteral(StringInfo buf, const char *val);
-extern Expr *find_em_expr_for_rel(EquivalenceClass *ec, RelOptInfo *rel);
-extern Expr *find_em_expr_for_input_target(PlannerInfo *root,
-										   EquivalenceClass *ec,
-										   PathTarget *target);
+extern EquivalenceMember *find_em_for_rel(PlannerInfo *root,
+										  EquivalenceClass *ec,
+										  RelOptInfo *rel);
+extern EquivalenceMember *find_em_for_rel_target(PlannerInfo *root,
+												 EquivalenceClass *ec,
+												 RelOptInfo *rel);
 extern List *build_tlist_to_deparse(RelOptInfo *foreignrel);
 extern void deparseSelectStmtForRel(StringInfo buf, PlannerInfo *root,
 									RelOptInfo *foreignrel, List *tlist,
diff --git a/contrib/postgres_fdw/sql/postgres_fdw.sql b/contrib/postgres_fdw/sql/postgres_fdw.sql
index 95b6b7192e..6b5de89e14 100644
--- a/contrib/postgres_fdw/sql/postgres_fdw.sql
+++ b/contrib/postgres_fdw/sql/postgres_fdw.sql
@@ -907,6 +907,10 @@ create operator class my_op_class for type int using btree family my_op_family a
 explain (verbose, costs off)
 select array_agg(c1 order by c1 using operator(public.<^)) from ft2 where c2 = 6 and c1 < 100 group by c2;
 
+-- This should not be pushed either.
+explain (verbose, costs off)
+select * from ft2 order by c1 using operator(public.<^);
+
 -- Update local stats on ft2
 ANALYZE ft2;
 
@@ -924,6 +928,10 @@ explain (verbose, costs off)
 select array_agg(c1 order by c1 using operator(public.<^)) from ft2 where c2 = 6 and c1 < 100 group by c2;
 select array_agg(c1 order by c1 using operator(public.<^)) from ft2 where c2 = 6 and c1 < 100 group by c2;
 
+-- This should be pushed too.
+explain (verbose, costs off)
+select * from ft2 order by c1 using operator(public.<^);
+
 -- Remove from extension
 alter extension postgres_fdw drop operator class my_op_class using btree;
 alter extension postgres_fdw drop function my_op_cmp(a int, b int);
diff --git a/src/backend/optimizer/path/equivclass.c b/src/backend/optimizer/path/equivclass.c
index 8c6770de97..ba1931c312 100644
--- a/src/backend/optimizer/path/equivclass.c
+++ b/src/backend/optimizer/path/equivclass.c
@@ -931,35 +931,6 @@ is_exprlist_member(Expr *node, List *exprs)
 	return false;
 }
 
-/*
- * Find an equivalence class member expression, all of whose Vars, come from
- * the indicated relation.
- */
-Expr *
-find_em_expr_for_rel(EquivalenceClass *ec, RelOptInfo *rel)
-{
-	ListCell   *lc_em;
-
-	foreach(lc_em, ec->ec_members)
-	{
-		EquivalenceMember *em = lfirst(lc_em);
-
-		if (bms_is_subset(em->em_relids, rel->relids) &&
-			!bms_is_empty(em->em_relids))
-		{
-			/*
-			 * If there is more than one equivalence member whose Vars are
-			 * taken entirely from this relation, we'll be content to choose
-			 * any one of those.
-			 */
-			return em->em_expr;
-		}
-	}
-
-	/* We didn't find any suitable equivalence class expression */
-	return NULL;
-}
-
 /*
  * relation_can_be_sorted_early
  *		Can this relation be sorted on this EC before the final output step?
diff --git a/src/include/optimizer/paths.h b/src/include/optimizer/paths.h
index 0c3a0b90c8..e313eb2138 100644
--- a/src/include/optimizer/paths.h
+++ b/src/include/optimizer/paths.h
@@ -143,7 +143,6 @@ extern EquivalenceMember *find_computable_ec_member(PlannerInfo *root,
 													List *exprs,
 													Relids relids,
 													bool require_parallel_safe);
-extern Expr *find_em_expr_for_rel(EquivalenceClass *ec, RelOptInfo *rel);
 extern bool relation_can_be_sorted_early(PlannerInfo *root, RelOptInfo *rel,
 										 EquivalenceClass *ec,
 										 bool require_parallel_safe);
#21Ronan Dunklau
ronan.dunklau@aiven.io
In reply to: Tom Lane (#20)
Re: ORDER BY pushdowns seem broken in postgres_fdw

Le jeudi 31 mars 2022, 01:41:37 CEST Tom Lane a écrit :

I looked through this patch. It's going in the right direction,
but I have a couple of nitpicks:

Thank you Tom for taking a look at this.

I think that instead of doubling down on a wrong API, we should just
take that out and move the logic into postgres_fdw.c. This also has
the advantage of producing a patch that's much safer to backpatch,
because it doesn't rely on the core backend getting updated before
postgres_fdw.so is.

It makes total sense.

If no objections, I think this is committable.

No objections on my end.

--
Ronan Dunklau

#22Tom Lane
tgl@sss.pgh.pa.us
In reply to: Ronan Dunklau (#21)
Re: ORDER BY pushdowns seem broken in postgres_fdw

Ronan Dunklau <ronan.dunklau@aiven.io> writes:

Le jeudi 31 mars 2022, 01:41:37 CEST Tom Lane a écrit :

If no objections, I think this is committable.

No objections on my end.

Pushed.

regards, tom lane