Avoid extra Sort nodes between WindowAggs when sorting can be reused

Started by Daniel Gustafssonover 7 years ago22 messages
#1Daniel Gustafsson
daniel@yesql.se
1 attachment(s)

Currently, we can only reuse Sort nodes between WindowAgg nodes iff the
partitioning and ordering clauses are identical. If a window Sort node
sortorder is a prefix of another window, we could however reuse the Sort node
to hopefully produce a cheaper plan. In src/backend/optimizer/plan/planner.c
there is a comment alluding to this:

* ...
*
* There is room to be much smarter here, for example detecting whether
* one window's sort keys are a prefix of another's (so that sorting for
* the latter would do for the former), or putting windows first that
* match a sort order available for the underlying query. For the moment
* we are content with meeting the spec.
*/

The attached patch takes a stab at implementing the sorting on partitioning/
ordering prefix, inspired by a similar optimization in the Greenplum planner.
In testing the impact on planning time seems quite minimal, or within the error
margin.

cheers ./daniel

Attachments:

window_prefix_sort.patchapplication/octet-stream; name=window_prefix_sort.patch; x-unix-mode=0644Download
From eba463f191d59485225f38b56d0597d750417957 Mon Sep 17 00:00:00 2001
From: Daniel Gustafsson <daniel@yesql.se>
Date: Wed, 30 May 2018 10:16:23 -0400
Subject: [PATCH] Order windows on partition/ordering prefix to reuse Sort
 nodes

By ordering the Windows on partitioning/ordering prefix, the result
from Sort nodes can be reused by multiple WindowAgg nodes and thus
we can avoid Sort nodes.
---
 src/backend/optimizer/plan/planner.c | 81 +++++++++++++++++++++---------------
 1 file changed, 47 insertions(+), 34 deletions(-)

diff --git a/src/backend/optimizer/plan/planner.c b/src/backend/optimizer/plan/planner.c
index 67a2c7a581..4431d0c2a9 100644
--- a/src/backend/optimizer/plan/planner.c
+++ b/src/backend/optimizer/plan/planner.c
@@ -237,6 +237,8 @@ static void create_partitionwise_grouping_paths(PlannerInfo *root,
 static bool group_by_has_partkey(RelOptInfo *input_rel,
 					 List *targetList,
 					 List *groupClause);
+static int common_prefix_cmp(const void *a, const void *b);
+
 
 
 /*****************************************************************************
@@ -5251,7 +5253,6 @@ postprocess_setop_tlist(List *new_tlist, List *orig_tlist)
 static List *
 select_active_windows(PlannerInfo *root, WindowFuncLists *wflists)
 {
-	List	   *result;
 	List	   *actives;
 	ListCell   *lc;
 
@@ -5273,43 +5274,55 @@ select_active_windows(PlannerInfo *root, WindowFuncLists *wflists)
 	 * says that only one sort is to be used for such windows, even if they
 	 * are otherwise distinct (eg, different names or framing clauses).
 	 *
-	 * There is room to be much smarter here, for example detecting whether
-	 * one window's sort keys are a prefix of another's (so that sorting for
-	 * the latter would do for the former), or putting windows first that
-	 * match a sort order available for the underlying query.  For the moment
-	 * we are content with meeting the spec.
-	 */
-	result = NIL;
-	while (actives != NIL)
-	{
-		WindowClause *wc = linitial_node(WindowClause, actives);
-		ListCell   *prev;
-		ListCell   *next;
-
-		/* Move wc from actives to result */
-		actives = list_delete_first(actives);
-		result = lappend(result, wc);
+	 * Also sort windows by partitioning/ordering prefixes where the sorting
+	 * required for a window will cover the sorting required for another
+	 * window, thus reducing the number of Sort nodes.
+	 */
+	if (list_length(actives) > 1)
+		actives = list_qsort(actives, common_prefix_cmp);
 
-		/* Now move any matching windows from actives to result */
-		prev = NULL;
-		for (lc = list_head(actives); lc; lc = next)
-		{
-			WindowClause *wc2 = lfirst_node(WindowClause, lc);
+	return actives;
+}
 
-			next = lnext(lc);
-			/* framing options are NOT to be compared here! */
-			if (equal(wc->partitionClause, wc2->partitionClause) &&
-				equal(wc->orderClause, wc2->orderClause))
-			{
-				actives = list_delete_cell(actives, lc, prev);
-				result = lappend(result, wc2);
-			}
-			else
-				prev = lc;
-		}
+static int
+common_prefix_cmp(const void *a, const void *b)
+{
+	WindowClause *wca = (WindowClause *) lfirst(*(ListCell **) a);
+	WindowClause *wcb = (WindowClause *) lfirst(*(ListCell **) b);
+	List *aa = NIL;
+	List *bb = NIL;
+	ListCell *item_a;
+	ListCell *item_b;
+	int lla;
+	int llb;
+
+	aa = list_concat_unique(list_copy(wca->partitionClause), wca->orderClause);
+	bb = list_concat_unique(list_copy(wcb->partitionClause), wcb->orderClause);
+
+	forboth(item_a, aa, item_b, bb)
+	{
+		SortGroupClause *sca = lfirst(item_a);
+		SortGroupClause *scb = lfirst(item_b);
+
+		if (sca->tleSortGroupRef < scb->tleSortGroupRef)
+			return -1;
+		else if (sca->tleSortGroupRef > scb->tleSortGroupRef)
+			return 1;
+		else if (sca->sortop < scb->sortop)
+			return -1;
+		else if (sca->sortop > scb->sortop)
+			return 1;
 	}
 
-	return result;
+	lla = list_length(aa);
+	llb = list_length(bb);
+
+	if (lla < llb)
+		return -1;
+	else if (lla > llb)
+		return 1;
+
+	return 0;
 }
 
 /*
-- 
2.14.1.145.gb3622a4ee

#2Daniel Gustafsson
daniel@yesql.se
In reply to: Daniel Gustafsson (#1)
1 attachment(s)
Re: Avoid extra Sort nodes between WindowAggs when sorting can be reused

On 30 May 2018, at 18:19, Daniel Gustafsson <daniel@yesql.se> wrote:

Currently, we can only reuse Sort nodes between WindowAgg nodes iff the
partitioning and ordering clauses are identical. If a window Sort node
sortorder is a prefix of another window, we could however reuse the Sort node
to hopefully produce a cheaper plan. In src/backend/optimizer/plan/planner.c
there is a comment alluding to this:

* ...
*
* There is room to be much smarter here, for example detecting whether
* one window's sort keys are a prefix of another's (so that sorting for
* the latter would do for the former), or putting windows first that
* match a sort order available for the underlying query. For the moment
* we are content with meeting the spec.
*/

The attached patch takes a stab at implementing the sorting on partitioning/
ordering prefix, inspired by a similar optimization in the Greenplum planner.
In testing the impact on planning time seems quite minimal, or within the error
margin.

Attached is a rebased v2 addressing off-list review comments and including a
test. Parking this in the commitfest.

cheers ./daniel

Attachments:

window_prefix_sort-v2.patchapplication/octet-stream; name=window_prefix_sort-v2.patch; x-unix-mode=0644Download
From d8280d29646843070ed9e1ea0a28df09b8b55065 Mon Sep 17 00:00:00 2001
From: Daniel Gustafsson <daniel@yesql.se>
Date: Tue, 12 Jun 2018 14:13:48 +0200
Subject: [PATCH] Order windows on partition/ordering prefix to reuse Sort
 nodes

By ordering the Windows on partitioning/ordering prefix, the result
from Sort nodes can be reused by multiple WindowAgg nodes and thus
we can avoid Sort nodes.
---
 src/backend/optimizer/plan/planner.c | 106 +++++++++++++++++++++++++----------
 src/test/regress/expected/window.out |  45 ++++++++++-----
 src/test/regress/sql/window.sql      |   9 +++
 3 files changed, 118 insertions(+), 42 deletions(-)

diff --git a/src/backend/optimizer/plan/planner.c b/src/backend/optimizer/plan/planner.c
index 67a2c7a581..9c44186177 100644
--- a/src/backend/optimizer/plan/planner.c
+++ b/src/backend/optimizer/plan/planner.c
@@ -110,6 +110,13 @@ typedef struct
 	int		   *tleref_to_colnum_map;
 } grouping_sets_data;
 
+typedef struct
+{
+	WindowClause *wc;
+	List	   *uniqueOrder;
+	int			uniqueLen;
+} WindowClauseSortNode;
+
 /* Local functions */
 static Node *preprocess_expression(PlannerInfo *root, Node *expr, int kind);
 static void preprocess_qual_conditions(PlannerInfo *root, Node *jtnode);
@@ -237,6 +244,8 @@ static void create_partitionwise_grouping_paths(PlannerInfo *root,
 static bool group_by_has_partkey(RelOptInfo *input_rel,
 					 List *targetList,
 					 List *groupClause);
+static int common_prefix_cmp(const void *a, const void *b);
+
 
 
 /*****************************************************************************
@@ -5254,9 +5263,11 @@ select_active_windows(PlannerInfo *root, WindowFuncLists *wflists)
 	List	   *result;
 	List	   *actives;
 	ListCell   *lc;
+	int			nelem;
 
 	/* First, make a list of the active windows */
 	actives = NIL;
+	nelem = 0;
 	foreach(lc, root->parse->windowClause)
 	{
 		WindowClause *wc = lfirst_node(WindowClause, lc);
@@ -5264,7 +5275,21 @@ select_active_windows(PlannerInfo *root, WindowFuncLists *wflists)
 		/* It's only active if wflists shows some related WindowFuncs */
 		Assert(wc->winref <= wflists->maxWinRef);
 		if (wflists->windowFuncs[wc->winref] != NIL)
-			actives = lappend(actives, wc);
+		{
+			WindowClauseSortNode *wcs = palloc(sizeof(WindowClauseSortNode));
+
+			wcs->wc = wc;
+			/*
+			 * Generating the uniqueOrder can be offloaded to the comparison
+			 * function to optimize for the case where we only have a single
+			 * window.  For now, optimize for readibility.
+			 */
+			wcs->uniqueOrder = list_concat_unique(
+							list_copy(wc->partitionClause), wc->orderClause);
+			wcs->uniqueLen = list_length(wcs->uniqueOrder);
+			actives = lappend(actives, wcs);
+			nelem++;
+		}
 	}
 
 	/*
@@ -5273,45 +5298,68 @@ select_active_windows(PlannerInfo *root, WindowFuncLists *wflists)
 	 * says that only one sort is to be used for such windows, even if they
 	 * are otherwise distinct (eg, different names or framing clauses).
 	 *
-	 * There is room to be much smarter here, for example detecting whether
-	 * one window's sort keys are a prefix of another's (so that sorting for
-	 * the latter would do for the former), or putting windows first that
-	 * match a sort order available for the underlying query.  For the moment
-	 * we are content with meeting the spec.
+	 * Also sort windows by partitioning/ordering prefixes where the sorting
+	 * required for a window will cover the sorting required for another
+	 * window, thus reducing the number of Sort nodes.  If we only have a
+	 * single window then skip the sorting.
 	 */
+	if (nelem > 1)
+		actives = list_qsort(actives, common_prefix_cmp);
+
 	result = NIL;
 	while (actives != NIL)
 	{
-		WindowClause *wc = linitial_node(WindowClause, actives);
-		ListCell   *prev;
-		ListCell   *next;
+		WindowClauseSortNode *wcs;
 
-		/* Move wc from actives to result */
-		actives = list_delete_first(actives);
-		result = lappend(result, wc);
-
-		/* Now move any matching windows from actives to result */
-		prev = NULL;
-		for (lc = list_head(actives); lc; lc = next)
-		{
-			WindowClause *wc2 = lfirst_node(WindowClause, lc);
+		/*
+		 * The standard a/b sortorder for a comparison function is backwards
+		 * to what we want here, but rather than confusing readers with a
+		 * comparison function that sorts a/b backwards we'll move the cells
+		 * from actives to result from the tail here.
+		 */
+		Assert(nelem > 0);
+		wcs = (WindowClauseSortNode *) lfirst(list_nth_cell(actives, --nelem));
+		actives = list_truncate(actives, nelem);
+		result = lappend(result, wcs->wc);
 
-			next = lnext(lc);
-			/* framing options are NOT to be compared here! */
-			if (equal(wc->partitionClause, wc2->partitionClause) &&
-				equal(wc->orderClause, wc2->orderClause))
-			{
-				actives = list_delete_cell(actives, lc, prev);
-				result = lappend(result, wc2);
-			}
-			else
-				prev = lc;
-		}
+		list_free(wcs->uniqueOrder);
+		pfree(wcs);
 	}
 
 	return result;
 }
 
+static int
+common_prefix_cmp(const void *a, const void *b)
+{
+	WindowClauseSortNode *wcsa = (WindowClauseSortNode *) lfirst(*(ListCell **) a);
+	WindowClauseSortNode *wcsb = (WindowClauseSortNode *) lfirst(*(ListCell **) b);
+	ListCell *item_a;
+	ListCell *item_b;
+
+	forboth(item_a, wcsa->uniqueOrder, item_b, wcsb->uniqueOrder)
+	{
+		SortGroupClause *sca = lfirst(item_a);
+		SortGroupClause *scb = lfirst(item_b);
+
+		if (sca->tleSortGroupRef < scb->tleSortGroupRef)
+			return -1;
+		else if (sca->tleSortGroupRef > scb->tleSortGroupRef)
+			return 1;
+		else if (sca->sortop < scb->sortop)
+			return -1;
+		else if (sca->sortop > scb->sortop)
+			return 1;
+	}
+
+	if (wcsa->uniqueLen < wcsb->uniqueLen)
+		return -1;
+	else if (wcsa->uniqueLen > wcsb->uniqueLen)
+		return 1;
+
+	return 0;
+}
+
 /*
  * make_window_input_target
  *	  Generate appropriate PathTarget for initial input to WindowAgg nodes.
diff --git a/src/test/regress/expected/window.out b/src/test/regress/expected/window.out
index 85d81e7c9f..e2c504b2ed 100644
--- a/src/test/regress/expected/window.out
+++ b/src/test/regress/expected/window.out
@@ -504,9 +504,9 @@ SELECT sum(salary),
 FROM empsalary GROUP BY depname;
   sum  | row_number |  sum  
 -------+------------+-------
- 14600 |          3 | 14600
-  7400 |          2 | 22000
  25100 |          1 | 47100
+  7400 |          2 | 22000
+ 14600 |          3 | 14600
 (3 rows)
 
 -- identical windows with different names
@@ -2899,9 +2899,9 @@ SELECT sum(salary), row_number() OVER (ORDER BY depname), sum(
 FROM empsalary GROUP BY depname;
   sum  | row_number | filtered_sum |  depname  
 -------+------------+--------------+-----------
- 14600 |          3 |              | sales
-  7400 |          2 |         3500 | personnel
  25100 |          1 |        22600 | develop
+  7400 |          2 |         3500 | personnel
+ 14600 |          3 |              | sales
 (3 rows)
 
 -- Test pushdown of quals into a subquery containing window functions
@@ -2913,13 +2913,13 @@ SELECT * FROM
           min(salary) OVER (PARTITION BY depname || 'A', depname) depminsalary
    FROM empsalary) emp
 WHERE depname = 'sales';
-                             QUERY PLAN                              
----------------------------------------------------------------------
+                                QUERY PLAN                                
+--------------------------------------------------------------------------
  Subquery Scan on emp
    ->  WindowAgg
-         ->  Sort
-               Sort Key: (((empsalary.depname)::text || 'A'::text))
-               ->  WindowAgg
+         ->  WindowAgg
+               ->  Sort
+                     Sort Key: (((empsalary.depname)::text || 'A'::text))
                      ->  Seq Scan on empsalary
                            Filter: ((depname)::text = 'sales'::text)
 (7 rows)
@@ -2932,19 +2932,38 @@ SELECT * FROM
           min(salary) OVER (PARTITION BY depname) depminsalary
    FROM empsalary) emp
 WHERE depname = 'sales';
-                        QUERY PLAN                         
------------------------------------------------------------
+                      QUERY PLAN                       
+-------------------------------------------------------
  Subquery Scan on emp
    Filter: ((emp.depname)::text = 'sales'::text)
    ->  WindowAgg
          ->  Sort
-               Sort Key: empsalary.depname
+               Sort Key: empsalary.enroll_date
                ->  WindowAgg
                      ->  Sort
-                           Sort Key: empsalary.enroll_date
+                           Sort Key: empsalary.depname
                            ->  Seq Scan on empsalary
 (9 rows)
 
+-- Test Sort node collapsing
+EXPLAIN (COSTS OFF)
+SELECT * FROM
+  (SELECT depname,
+          sum(salary) OVER (PARTITION BY depname order by empno) depsalary,
+          min(salary) OVER (PARTITION BY depname, empno order by enroll_date) depminsalary
+   FROM empsalary) emp
+WHERE depname = 'sales';
+                              QUERY PLAN                              
+----------------------------------------------------------------------
+ Subquery Scan on emp
+   ->  WindowAgg
+         ->  WindowAgg
+               ->  Sort
+                     Sort Key: empsalary.empno, empsalary.enroll_date
+                     ->  Seq Scan on empsalary
+                           Filter: ((depname)::text = 'sales'::text)
+(7 rows)
+
 -- cleanup
 DROP TABLE empsalary;
 -- test user-defined window function with named args and default args
diff --git a/src/test/regress/sql/window.sql b/src/test/regress/sql/window.sql
index 051b50b2d3..dc757ba8b4 100644
--- a/src/test/regress/sql/window.sql
+++ b/src/test/regress/sql/window.sql
@@ -854,6 +854,15 @@ SELECT * FROM
    FROM empsalary) emp
 WHERE depname = 'sales';
 
+-- Test Sort node collapsing
+EXPLAIN (COSTS OFF)
+SELECT * FROM
+  (SELECT depname,
+          sum(salary) OVER (PARTITION BY depname order by empno) depsalary,
+          min(salary) OVER (PARTITION BY depname, empno order by enroll_date) depminsalary
+   FROM empsalary) emp
+WHERE depname = 'sales';
+
 -- cleanup
 DROP TABLE empsalary;
 
-- 
2.14.1.145.gb3622a4ee

#3Alexander Kuzmenkov
a.kuzmenkov@postgrespro.ru
In reply to: Daniel Gustafsson (#2)
Re: Avoid extra Sort nodes between WindowAggs when sorting can be reused

Daniel,

I took a look at the patch. It applies and compiles, the tests pass.

Some thoughts about the code:

* Postgres lists cache their lengths, so you don't need uniqueLen.

* Use an array of WindowClauseSortNode's instead of a list. It's more
suitable here because you are going to sort (qsort_list creates a
temporary array).

* Reversing the sorted list looks more confusing to me than just sorting
it in the proper order right away. A qsort() comparator returning
negative means "left goes before right", but here it is effectively the
other way around.

* This isn't relevant given the previous points, but to reverse a list,
you can walk it with foreach() and construct a reversed version with
lcons().

* There is a function named make_pathkeys_for_window that makes a list
of canonical pathkeys given a window clause. Using this function to give
you the pathkeys, and then comparing them, would be more future-proof in
case we ever start using hashing for windowing. Moreover, the canonical
pathkeys can be compared with pointer comparison which is much faster
than equal(). Still, I'm not sure whether it's going to be convenient to
use in this case, or even whether it is a right thing to do. What do you
think?

--
Alexander Kuzmenkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

#4Daniel Gustafsson
daniel@yesql.se
In reply to: Alexander Kuzmenkov (#3)
1 attachment(s)
Re: Avoid extra Sort nodes between WindowAggs when sorting can be reused

On 26 Jun 2018, at 17:11, Alexander Kuzmenkov <a.kuzmenkov@postgrespro.ru> wrote:

I took a look at the patch. It applies and compiles, the tests pass.

Thanks for reviewing, and apologies for the slow response.

Some thoughts about the code:

* Postgres lists cache their lengths, so you don't need uniqueLen.

Good point, fixed.

* Use an array of WindowClauseSortNode's instead of a list. It's more suitable here because you are going to sort (qsort_list creates a temporary array).

I was thinking about that, but opted for code simplicity with a List approach.
The required size of the array isn’t known ahead of time, so it must either
potentially overallocate to the upper bound of root->parse->windowClause or use
heuristics and risk reallocating when growing, neither of which is terribly
appealing. Do you have any suggestions or preferences?

* Reversing the sorted list looks more confusing to me than just sorting it in the proper order right away. A qsort() comparator returning negative means "left goes before right", but here it is effectively the other way around.

Changed.

* There is a function named make_pathkeys_for_window that makes a list of canonical pathkeys given a window clause. Using this function to give you the pathkeys, and then comparing them, would be more future-proof in case we ever start using hashing for windowing. Moreover, the canonical pathkeys can be compared with pointer comparison which is much faster than equal(). Still, I'm not sure whether it's going to be convenient to use in this case, or even whether it is a right thing to do. What do you think?

That’s an interesting thought, one that didn’t occur to me while hacking. I’m
not sure whether is would be wise/clean to overload with this functionality
though.

Attached updated version also adds a testcase that was clearly missing from the
previous version and an updated window.out.

cheers ./daniel

Attachments:

window_prefix_sort-v3.patchapplication/octet-stream; name=window_prefix_sort-v3.patch; x-unix-mode=0644Download
From 90c4adb2db083718e5f2ac89b691120463241369 Mon Sep 17 00:00:00 2001
From: Daniel Gustafsson <daniel@yesql.se>
Date: Tue, 12 Jun 2018 14:13:48 +0200
Subject: [PATCH] Order windows on partition/ordering prefix to reuse Sort
 nodes

By ordering the Windows on partitioning/ordering prefix, the result
from Sort nodes can be reused by multiple WindowAgg nodes and thus
we can avoid Sort nodes.
---
 src/backend/optimizer/plan/planner.c | 91 +++++++++++++++++++++++++-----------
 src/test/regress/expected/window.out | 60 ++++++++++++++++++------
 src/test/regress/sql/window.sql      | 16 +++++++
 3 files changed, 126 insertions(+), 41 deletions(-)

diff --git a/src/backend/optimizer/plan/planner.c b/src/backend/optimizer/plan/planner.c
index fd45c9767d..11f7c81099 100644
--- a/src/backend/optimizer/plan/planner.c
+++ b/src/backend/optimizer/plan/planner.c
@@ -110,6 +110,12 @@ typedef struct
 	int		   *tleref_to_colnum_map;
 } grouping_sets_data;
 
+typedef struct
+{
+	WindowClause *wc;
+	List	   *uniqueOrder;
+} WindowClauseSortNode;
+
 /* Local functions */
 static Node *preprocess_expression(PlannerInfo *root, Node *expr, int kind);
 static void preprocess_qual_conditions(PlannerInfo *root, Node *jtnode);
@@ -237,6 +243,8 @@ static void create_partitionwise_grouping_paths(PlannerInfo *root,
 static bool group_by_has_partkey(RelOptInfo *input_rel,
 					 List *targetList,
 					 List *groupClause);
+static int common_prefix_cmp(const void *a, const void *b);
+
 
 
 /*****************************************************************************
@@ -5268,7 +5276,19 @@ select_active_windows(PlannerInfo *root, WindowFuncLists *wflists)
 		/* It's only active if wflists shows some related WindowFuncs */
 		Assert(wc->winref <= wflists->maxWinRef);
 		if (wflists->windowFuncs[wc->winref] != NIL)
-			actives = lappend(actives, wc);
+		{
+			WindowClauseSortNode *wcs = palloc(sizeof(WindowClauseSortNode));
+
+			wcs->wc = wc;
+			/*
+			 * Generating the uniqueOrder can be offloaded to the comparison
+			 * function to optimize for the case where we only have a single
+			 * window.  For now, optimize for readibility.
+			 */
+			wcs->uniqueOrder = list_concat_unique(
+							list_copy(wc->partitionClause), wc->orderClause);
+			actives = lappend(actives, wcs);
+		}
 	}
 
 	/*
@@ -5277,45 +5297,60 @@ select_active_windows(PlannerInfo *root, WindowFuncLists *wflists)
 	 * says that only one sort is to be used for such windows, even if they
 	 * are otherwise distinct (eg, different names or framing clauses).
 	 *
-	 * There is room to be much smarter here, for example detecting whether
-	 * one window's sort keys are a prefix of another's (so that sorting for
-	 * the latter would do for the former), or putting windows first that
-	 * match a sort order available for the underlying query.  For the moment
-	 * we are content with meeting the spec.
+	 * Also sort windows by partitioning/ordering prefixes where the sorting
+	 * required for a window will cover the sorting required for another
+	 * window, thus reducing the number of Sort nodes.  If we only have a
+	 * single window then skip the sorting.
 	 */
+	if (list_length(actives) > 1)
+		actives = list_qsort(actives, common_prefix_cmp);
+
 	result = NIL;
 	while (actives != NIL)
 	{
-		WindowClause *wc = linitial_node(WindowClause, actives);
-		ListCell   *prev;
-		ListCell   *next;
+		WindowClauseSortNode *wcs = (WindowClauseSortNode *) linitial(actives);
 
-		/* Move wc from actives to result */
 		actives = list_delete_first(actives);
-		result = lappend(result, wc);
-
-		/* Now move any matching windows from actives to result */
-		prev = NULL;
-		for (lc = list_head(actives); lc; lc = next)
-		{
-			WindowClause *wc2 = lfirst_node(WindowClause, lc);
+		result = lappend(result, wcs->wc);
 
-			next = lnext(lc);
-			/* framing options are NOT to be compared here! */
-			if (equal(wc->partitionClause, wc2->partitionClause) &&
-				equal(wc->orderClause, wc2->orderClause))
-			{
-				actives = list_delete_cell(actives, lc, prev);
-				result = lappend(result, wc2);
-			}
-			else
-				prev = lc;
-		}
+		list_free(wcs->uniqueOrder);
+		pfree(wcs);
 	}
 
 	return result;
 }
 
+static int
+common_prefix_cmp(const void *a, const void *b)
+{
+	WindowClauseSortNode *wcsa = (WindowClauseSortNode *) lfirst(*(ListCell **) a);
+	WindowClauseSortNode *wcsb = (WindowClauseSortNode *) lfirst(*(ListCell **) b);
+	ListCell *item_a;
+	ListCell *item_b;
+
+	forboth(item_a, wcsa->uniqueOrder, item_b, wcsb->uniqueOrder)
+	{
+		SortGroupClause *sca = lfirst(item_a);
+		SortGroupClause *scb = lfirst(item_b);
+
+		if (sca->tleSortGroupRef > scb->tleSortGroupRef)
+			return -1;
+		else if (sca->tleSortGroupRef < scb->tleSortGroupRef)
+			return 1;
+		else if (sca->sortop > scb->sortop)
+			return -1;
+		else if (sca->sortop < scb->sortop)
+			return 1;
+	}
+
+	if (list_length(wcsa->uniqueOrder) > list_length(wcsb->uniqueOrder))
+		return -1;
+	else if (list_length(wcsa->uniqueOrder) < list_length(wcsb->uniqueOrder))
+		return 1;
+
+	return 0;
+}
+
 /*
  * make_window_input_target
  *	  Generate appropriate PathTarget for initial input to WindowAgg nodes.
diff --git a/src/test/regress/expected/window.out b/src/test/regress/expected/window.out
index 85d81e7c9f..eb67a77c85 100644
--- a/src/test/regress/expected/window.out
+++ b/src/test/regress/expected/window.out
@@ -504,9 +504,9 @@ SELECT sum(salary),
 FROM empsalary GROUP BY depname;
   sum  | row_number |  sum  
 -------+------------+-------
- 14600 |          3 | 14600
-  7400 |          2 | 22000
  25100 |          1 | 47100
+  7400 |          2 | 22000
+ 14600 |          3 | 14600
 (3 rows)
 
 -- identical windows with different names
@@ -2899,9 +2899,9 @@ SELECT sum(salary), row_number() OVER (ORDER BY depname), sum(
 FROM empsalary GROUP BY depname;
   sum  | row_number | filtered_sum |  depname  
 -------+------------+--------------+-----------
- 14600 |          3 |              | sales
-  7400 |          2 |         3500 | personnel
  25100 |          1 |        22600 | develop
+  7400 |          2 |         3500 | personnel
+ 14600 |          3 |              | sales
 (3 rows)
 
 -- Test pushdown of quals into a subquery containing window functions
@@ -2913,13 +2913,13 @@ SELECT * FROM
           min(salary) OVER (PARTITION BY depname || 'A', depname) depminsalary
    FROM empsalary) emp
 WHERE depname = 'sales';
-                             QUERY PLAN                              
----------------------------------------------------------------------
+                                QUERY PLAN                                
+--------------------------------------------------------------------------
  Subquery Scan on emp
    ->  WindowAgg
-         ->  Sort
-               Sort Key: (((empsalary.depname)::text || 'A'::text))
-               ->  WindowAgg
+         ->  WindowAgg
+               ->  Sort
+                     Sort Key: (((empsalary.depname)::text || 'A'::text))
                      ->  Seq Scan on empsalary
                            Filter: ((depname)::text = 'sales'::text)
 (7 rows)
@@ -2932,19 +2932,53 @@ SELECT * FROM
           min(salary) OVER (PARTITION BY depname) depminsalary
    FROM empsalary) emp
 WHERE depname = 'sales';
-                        QUERY PLAN                         
------------------------------------------------------------
+                      QUERY PLAN                       
+-------------------------------------------------------
  Subquery Scan on emp
    Filter: ((emp.depname)::text = 'sales'::text)
    ->  WindowAgg
          ->  Sort
-               Sort Key: empsalary.depname
+               Sort Key: empsalary.enroll_date
                ->  WindowAgg
                      ->  Sort
-                           Sort Key: empsalary.enroll_date
+                           Sort Key: empsalary.depname
                            ->  Seq Scan on empsalary
 (9 rows)
 
+-- Test Sort node collapsing
+EXPLAIN (COSTS OFF)
+SELECT * FROM
+  (SELECT depname,
+          sum(salary) OVER (PARTITION BY depname order by empno) depsalary,
+          min(salary) OVER (PARTITION BY depname, empno order by enroll_date) depminsalary
+   FROM empsalary) emp
+WHERE depname = 'sales';
+                              QUERY PLAN                              
+----------------------------------------------------------------------
+ Subquery Scan on emp
+   ->  WindowAgg
+         ->  WindowAgg
+               ->  Sort
+                     Sort Key: empsalary.empno, empsalary.enroll_date
+                     ->  Seq Scan on empsalary
+                           Filter: ((depname)::text = 'sales'::text)
+(7 rows)
+
+-- Test Sort node reordering
+EXPLAIN (COSTS OFF)
+SELECT
+  lead(1) OVER (PARTITION BY depname ORDER BY salary, enroll_date),
+  lag(1) OVER (PARTITION BY depname ORDER BY salary,enroll_date,empno)
+  from empsalary;
+                         QUERY PLAN                          
+-------------------------------------------------------------
+ WindowAgg
+   ->  WindowAgg
+         ->  Sort
+               Sort Key: depname, salary, enroll_date, empno
+               ->  Seq Scan on empsalary
+(5 rows)
+
 -- cleanup
 DROP TABLE empsalary;
 -- test user-defined window function with named args and default args
diff --git a/src/test/regress/sql/window.sql b/src/test/regress/sql/window.sql
index 051b50b2d3..2809aa991b 100644
--- a/src/test/regress/sql/window.sql
+++ b/src/test/regress/sql/window.sql
@@ -854,6 +854,22 @@ SELECT * FROM
    FROM empsalary) emp
 WHERE depname = 'sales';
 
+-- Test Sort node collapsing
+EXPLAIN (COSTS OFF)
+SELECT * FROM
+  (SELECT depname,
+          sum(salary) OVER (PARTITION BY depname order by empno) depsalary,
+          min(salary) OVER (PARTITION BY depname, empno order by enroll_date) depminsalary
+   FROM empsalary) emp
+WHERE depname = 'sales';
+
+-- Test Sort node reordering
+EXPLAIN (COSTS OFF)
+SELECT
+  lead(1) OVER (PARTITION BY depname ORDER BY salary, enroll_date),
+  lag(1) OVER (PARTITION BY depname ORDER BY salary,enroll_date,empno)
+  from empsalary;
+
 -- cleanup
 DROP TABLE empsalary;
 
-- 
2.14.1.145.gb3622a4ee

#5Masahiko Sawada
sawada.mshk@gmail.com
In reply to: Daniel Gustafsson (#4)
Re: Avoid extra Sort nodes between WindowAggs when sorting can be reused

On Mon, Jul 2, 2018 at 5:25 PM, Daniel Gustafsson <daniel@yesql.se> wrote:

On 26 Jun 2018, at 17:11, Alexander Kuzmenkov <a.kuzmenkov@postgrespro.ru> wrote:

I took a look at the patch. It applies and compiles, the tests pass.

Thanks for reviewing, and apologies for the slow response.

Some thoughts about the code:

* Postgres lists cache their lengths, so you don't need uniqueLen.

Good point, fixed.

* Use an array of WindowClauseSortNode's instead of a list. It's more suitable here because you are going to sort (qsort_list creates a temporary array).

I was thinking about that, but opted for code simplicity with a List approach.
The required size of the array isn’t known ahead of time, so it must either
potentially overallocate to the upper bound of root->parse->windowClause or use
heuristics and risk reallocating when growing, neither of which is terribly
appealing. Do you have any suggestions or preferences?

* Reversing the sorted list looks more confusing to me than just sorting it in the proper order right away. A qsort() comparator returning negative means "left goes before right", but here it is effectively the other way around.

Changed.

* There is a function named make_pathkeys_for_window that makes a list of canonical pathkeys given a window clause. Using this function to give you the pathkeys, and then comparing them, would be more future-proof in case we ever start using hashing for windowing. Moreover, the canonical pathkeys can be compared with pointer comparison which is much faster than equal(). Still, I'm not sure whether it's going to be convenient to use in this case, or even whether it is a right thing to do. What do you think?

That’s an interesting thought, one that didn’t occur to me while hacking. I’m
not sure whether is would be wise/clean to overload with this functionality
though.

Attached updated version also adds a testcase that was clearly missing from the
previous version and an updated window.out.

cheers ./daniel

Thank you for updating the patch! There are two review comments.

The current select_active_windows() function compares the all fields
of WindowClause for the sorting but with this patch we compare only
tleSortGroupRef, sortop and the number of uniqueOrder. I think this
leads a degradation as follows.

=# explain select *, sum(b) over w1, sum(a) over w2, sum(b) over w3
from w window w1 as (partition by a order by a nulls first), w2 as
(partition by a order by a), w3 as (partition by a order by a nulls
first);

* Current HEAD
QUERY PLAN
-----------------------------------------------------------------------------------
WindowAgg (cost=369.16..414.36 rows=2260 width=32)
-> Sort (cost=369.16..374.81 rows=2260 width=24)
Sort Key: a
-> WindowAgg (cost=158.51..243.26 rows=2260 width=24)
-> WindowAgg (cost=158.51..203.71 rows=2260 width=16)
-> Sort (cost=158.51..164.16 rows=2260 width=8)
Sort Key: a NULLS FIRST
-> Seq Scan on w (cost=0.00..32.60
rows=2260 width=8)
(8 rows)

* With patch
QUERY PLAN
-----------------------------------------------------------------------------------------
WindowAgg 3 (cost=500.72..545.92 rows=2260 width=32)
-> Sort (cost=500.72..506.37 rows=2260 width=24)
Sort Key: a NULLS FIRST
-> WindowAgg 2 (cost=329.61..374.81 rows=2260 width=24)
-> Sort (cost=329.61..335.26 rows=2260 width=16)
Sort Key: a
-> WindowAgg 1 (cost=158.51..203.71 rows=2260 width=16)
-> Sort (cost=158.51..164.16 rows=2260 width=8)
Sort Key: a NULLS FIRST
-> Seq Scan on w (cost=0.00..32.60
rows=2260 width=8)
(10 rows)

---
+                        * Generating the uniqueOrder can be offloaded
to the comparison
+                        * function to optimize for the case where we
only have a single
+                        * window.  For now, optimize for readibility.

s/readibility/readability/

Regards,

--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

#6Daniel Gustafsson
daniel@yesql.se
In reply to: Masahiko Sawada (#5)
1 attachment(s)
Re: Avoid extra Sort nodes between WindowAggs when sorting can be reused

On 2 Jul 2018, at 14:01, Masahiko Sawada <sawada.mshk@gmail.com> wrote:

Thank you for updating the patch! There are two review comments.

Thanks for reviewing!

The current select_active_windows() function compares the all fields
of WindowClause for the sorting but with this patch we compare only
tleSortGroupRef, sortop and the number of uniqueOrder. I think this
leads a degradation as follows.

You are right, that was an oversight. The attached patch takes a stab at
fixing this.

s/readibility/readability/

Fixed.

cheers ./daniel

Attachments:

window_prefix_sort-v4.patchapplication/octet-stream; name=window_prefix_sort-v4.patch; x-unix-mode=0644Download
From 287072879a5a949cb7dc88e622a0a242410aea1f Mon Sep 17 00:00:00 2001
From: Daniel Gustafsson <daniel@yesql.se>
Date: Tue, 12 Jun 2018 14:13:48 +0200
Subject: [PATCH] Order windows on partition/ordering prefix to reuse Sort
 nodes

By ordering the Windows on partitioning/ordering prefix, the result
from Sort nodes can be reused by multiple WindowAgg nodes and thus
we can avoid Sort nodes.
---
 src/backend/optimizer/plan/planner.c | 95 +++++++++++++++++++++++++-----------
 src/test/regress/expected/window.out | 60 ++++++++++++++++++-----
 src/test/regress/sql/window.sql      | 16 ++++++
 3 files changed, 130 insertions(+), 41 deletions(-)

diff --git a/src/backend/optimizer/plan/planner.c b/src/backend/optimizer/plan/planner.c
index fd45c9767d..884c46a598 100644
--- a/src/backend/optimizer/plan/planner.c
+++ b/src/backend/optimizer/plan/planner.c
@@ -110,6 +110,12 @@ typedef struct
 	int		   *tleref_to_colnum_map;
 } grouping_sets_data;
 
+typedef struct
+{
+	WindowClause *wc;
+	List	   *uniqueOrder;
+} WindowClauseSortNode;
+
 /* Local functions */
 static Node *preprocess_expression(PlannerInfo *root, Node *expr, int kind);
 static void preprocess_qual_conditions(PlannerInfo *root, Node *jtnode);
@@ -237,6 +243,8 @@ static void create_partitionwise_grouping_paths(PlannerInfo *root,
 static bool group_by_has_partkey(RelOptInfo *input_rel,
 					 List *targetList,
 					 List *groupClause);
+static int common_prefix_cmp(const void *a, const void *b);
+
 
 
 /*****************************************************************************
@@ -5268,7 +5276,19 @@ select_active_windows(PlannerInfo *root, WindowFuncLists *wflists)
 		/* It's only active if wflists shows some related WindowFuncs */
 		Assert(wc->winref <= wflists->maxWinRef);
 		if (wflists->windowFuncs[wc->winref] != NIL)
-			actives = lappend(actives, wc);
+		{
+			WindowClauseSortNode *wcs = palloc(sizeof(WindowClauseSortNode));
+
+			wcs->wc = wc;
+			/*
+			 * Generating the uniqueOrder can be offloaded to the comparison
+			 * function to optimize for the case where we only have a single
+			 * window.  For now, optimize for readability.
+			 */
+			wcs->uniqueOrder = list_concat_unique(
+							list_copy(wc->partitionClause), wc->orderClause);
+			actives = lappend(actives, wcs);
+		}
 	}
 
 	/*
@@ -5277,45 +5297,64 @@ select_active_windows(PlannerInfo *root, WindowFuncLists *wflists)
 	 * says that only one sort is to be used for such windows, even if they
 	 * are otherwise distinct (eg, different names or framing clauses).
 	 *
-	 * There is room to be much smarter here, for example detecting whether
-	 * one window's sort keys are a prefix of another's (so that sorting for
-	 * the latter would do for the former), or putting windows first that
-	 * match a sort order available for the underlying query.  For the moment
-	 * we are content with meeting the spec.
+	 * Also sort windows by partitioning/ordering prefixes where the sorting
+	 * required for a window will cover the sorting required for another
+	 * window, thus reducing the number of Sort nodes.  If we only have a
+	 * single window then skip the sorting.
 	 */
+	if (list_length(actives) > 1)
+		actives = list_qsort(actives, common_prefix_cmp);
+
 	result = NIL;
 	while (actives != NIL)
 	{
-		WindowClause *wc = linitial_node(WindowClause, actives);
-		ListCell   *prev;
-		ListCell   *next;
+		WindowClauseSortNode *wcs = (WindowClauseSortNode *) linitial(actives);
 
-		/* Move wc from actives to result */
 		actives = list_delete_first(actives);
-		result = lappend(result, wc);
-
-		/* Now move any matching windows from actives to result */
-		prev = NULL;
-		for (lc = list_head(actives); lc; lc = next)
-		{
-			WindowClause *wc2 = lfirst_node(WindowClause, lc);
+		result = lappend(result, wcs->wc);
 
-			next = lnext(lc);
-			/* framing options are NOT to be compared here! */
-			if (equal(wc->partitionClause, wc2->partitionClause) &&
-				equal(wc->orderClause, wc2->orderClause))
-			{
-				actives = list_delete_cell(actives, lc, prev);
-				result = lappend(result, wc2);
-			}
-			else
-				prev = lc;
-		}
+		list_free(wcs->uniqueOrder);
+		pfree(wcs);
 	}
 
 	return result;
 }
 
+static int
+common_prefix_cmp(const void *a, const void *b)
+{
+	WindowClauseSortNode *wcsa = (WindowClauseSortNode *) lfirst(*(ListCell **) a);
+	WindowClauseSortNode *wcsb = (WindowClauseSortNode *) lfirst(*(ListCell **) b);
+	ListCell *item_a;
+	ListCell *item_b;
+
+	forboth(item_a, wcsa->uniqueOrder, item_b, wcsb->uniqueOrder)
+	{
+		SortGroupClause *sca = lfirst(item_a);
+		SortGroupClause *scb = lfirst(item_b);
+
+		if (sca->tleSortGroupRef > scb->tleSortGroupRef)
+			return -1;
+		else if (sca->tleSortGroupRef < scb->tleSortGroupRef)
+			return 1;
+		else if (sca->sortop > scb->sortop)
+			return -1;
+		else if (sca->sortop < scb->sortop)
+			return 1;
+		else if (sca->nulls_first && !scb->nulls_first)
+			return -1;
+		else if (!sca->nulls_first && scb->nulls_first)
+			return 1;
+	}
+
+	if (list_length(wcsa->uniqueOrder) > list_length(wcsb->uniqueOrder))
+		return -1;
+	else if (list_length(wcsa->uniqueOrder) < list_length(wcsb->uniqueOrder))
+		return 1;
+
+	return 0;
+}
+
 /*
  * make_window_input_target
  *	  Generate appropriate PathTarget for initial input to WindowAgg nodes.
diff --git a/src/test/regress/expected/window.out b/src/test/regress/expected/window.out
index 85d81e7c9f..eb67a77c85 100644
--- a/src/test/regress/expected/window.out
+++ b/src/test/regress/expected/window.out
@@ -504,9 +504,9 @@ SELECT sum(salary),
 FROM empsalary GROUP BY depname;
   sum  | row_number |  sum  
 -------+------------+-------
- 14600 |          3 | 14600
-  7400 |          2 | 22000
  25100 |          1 | 47100
+  7400 |          2 | 22000
+ 14600 |          3 | 14600
 (3 rows)
 
 -- identical windows with different names
@@ -2899,9 +2899,9 @@ SELECT sum(salary), row_number() OVER (ORDER BY depname), sum(
 FROM empsalary GROUP BY depname;
   sum  | row_number | filtered_sum |  depname  
 -------+------------+--------------+-----------
- 14600 |          3 |              | sales
-  7400 |          2 |         3500 | personnel
  25100 |          1 |        22600 | develop
+  7400 |          2 |         3500 | personnel
+ 14600 |          3 |              | sales
 (3 rows)
 
 -- Test pushdown of quals into a subquery containing window functions
@@ -2913,13 +2913,13 @@ SELECT * FROM
           min(salary) OVER (PARTITION BY depname || 'A', depname) depminsalary
    FROM empsalary) emp
 WHERE depname = 'sales';
-                             QUERY PLAN                              
----------------------------------------------------------------------
+                                QUERY PLAN                                
+--------------------------------------------------------------------------
  Subquery Scan on emp
    ->  WindowAgg
-         ->  Sort
-               Sort Key: (((empsalary.depname)::text || 'A'::text))
-               ->  WindowAgg
+         ->  WindowAgg
+               ->  Sort
+                     Sort Key: (((empsalary.depname)::text || 'A'::text))
                      ->  Seq Scan on empsalary
                            Filter: ((depname)::text = 'sales'::text)
 (7 rows)
@@ -2932,19 +2932,53 @@ SELECT * FROM
           min(salary) OVER (PARTITION BY depname) depminsalary
    FROM empsalary) emp
 WHERE depname = 'sales';
-                        QUERY PLAN                         
------------------------------------------------------------
+                      QUERY PLAN                       
+-------------------------------------------------------
  Subquery Scan on emp
    Filter: ((emp.depname)::text = 'sales'::text)
    ->  WindowAgg
          ->  Sort
-               Sort Key: empsalary.depname
+               Sort Key: empsalary.enroll_date
                ->  WindowAgg
                      ->  Sort
-                           Sort Key: empsalary.enroll_date
+                           Sort Key: empsalary.depname
                            ->  Seq Scan on empsalary
 (9 rows)
 
+-- Test Sort node collapsing
+EXPLAIN (COSTS OFF)
+SELECT * FROM
+  (SELECT depname,
+          sum(salary) OVER (PARTITION BY depname order by empno) depsalary,
+          min(salary) OVER (PARTITION BY depname, empno order by enroll_date) depminsalary
+   FROM empsalary) emp
+WHERE depname = 'sales';
+                              QUERY PLAN                              
+----------------------------------------------------------------------
+ Subquery Scan on emp
+   ->  WindowAgg
+         ->  WindowAgg
+               ->  Sort
+                     Sort Key: empsalary.empno, empsalary.enroll_date
+                     ->  Seq Scan on empsalary
+                           Filter: ((depname)::text = 'sales'::text)
+(7 rows)
+
+-- Test Sort node reordering
+EXPLAIN (COSTS OFF)
+SELECT
+  lead(1) OVER (PARTITION BY depname ORDER BY salary, enroll_date),
+  lag(1) OVER (PARTITION BY depname ORDER BY salary,enroll_date,empno)
+  from empsalary;
+                         QUERY PLAN                          
+-------------------------------------------------------------
+ WindowAgg
+   ->  WindowAgg
+         ->  Sort
+               Sort Key: depname, salary, enroll_date, empno
+               ->  Seq Scan on empsalary
+(5 rows)
+
 -- cleanup
 DROP TABLE empsalary;
 -- test user-defined window function with named args and default args
diff --git a/src/test/regress/sql/window.sql b/src/test/regress/sql/window.sql
index 051b50b2d3..2809aa991b 100644
--- a/src/test/regress/sql/window.sql
+++ b/src/test/regress/sql/window.sql
@@ -854,6 +854,22 @@ SELECT * FROM
    FROM empsalary) emp
 WHERE depname = 'sales';
 
+-- Test Sort node collapsing
+EXPLAIN (COSTS OFF)
+SELECT * FROM
+  (SELECT depname,
+          sum(salary) OVER (PARTITION BY depname order by empno) depsalary,
+          min(salary) OVER (PARTITION BY depname, empno order by enroll_date) depminsalary
+   FROM empsalary) emp
+WHERE depname = 'sales';
+
+-- Test Sort node reordering
+EXPLAIN (COSTS OFF)
+SELECT
+  lead(1) OVER (PARTITION BY depname ORDER BY salary, enroll_date),
+  lag(1) OVER (PARTITION BY depname ORDER BY salary,enroll_date,empno)
+  from empsalary;
+
 -- cleanup
 DROP TABLE empsalary;
 
-- 
2.14.1.145.gb3622a4ee

#7Masahiko Sawada
sawada.mshk@gmail.com
In reply to: Daniel Gustafsson (#6)
Re: Avoid extra Sort nodes between WindowAggs when sorting can be reused

On Tue, Jul 3, 2018 at 6:19 AM, Daniel Gustafsson <daniel@yesql.se> wrote:

On 2 Jul 2018, at 14:01, Masahiko Sawada <sawada.mshk@gmail.com> wrote:

Thank you for updating the patch! There are two review comments.

Thanks for reviewing!

The current select_active_windows() function compares the all fields
of WindowClause for the sorting but with this patch we compare only
tleSortGroupRef, sortop and the number of uniqueOrder. I think this
leads a degradation as follows.

You are right, that was an oversight. The attached patch takes a stab at
fixing this.

s/readibility/readability/

Fixed.

Thank you for updating the patch.

+               if (sca->tleSortGroupRef > scb->tleSortGroupRef)
+                       return -1;
+               else if (sca->tleSortGroupRef < scb->tleSortGroupRef)
+                       return 1;
+               else if (sca->sortop > scb->sortop)
+                       return -1;
+               else if (sca->sortop < scb->sortop)
+                       return 1;
+               else if (sca->nulls_first && !scb->nulls_first)
+                       return -1;
+               else if (!sca->nulls_first && scb->nulls_first)
+                       return 1;

Hmm, this is missing the eqop fields of SortGroupClause. I haven't
tested yet but does the similar degradation happen if two
SortGroupCaluses have different eqop and the same other values?

--
The source code comments for common_prefix_cmp() function and
WindowClauseSortNode struct is needed.

--
+-- Test Sort node reordering
+EXPLAIN (COSTS OFF)
+SELECT
+  lead(1) OVER (PARTITION BY depname ORDER BY salary, enroll_date),
+  lag(1) OVER (PARTITION BY depname ORDER BY salary,enroll_date,empno)
+  from empsalary;

I think it's better to change "from empsalary" to "FROM empsalary" for
consistency with other code.

Regards,

--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

#8Daniel Gustafsson
daniel@yesql.se
In reply to: Masahiko Sawada (#7)
1 attachment(s)
Re: Avoid extra Sort nodes between WindowAggs when sorting can be reused

On 3 Jul 2018, at 12:24, Masahiko Sawada <sawada.mshk@gmail.com> wrote:

Thank you for updating the patch.

Thanks for reviewing!

Hmm, this is missing the eqop fields of SortGroupClause. I haven't
tested yet but does the similar degradation happen if two
SortGroupCaluses have different eqop and the same other values?

I wasn’t able to construct a case showing this, but I also think you’re right.
Do you have an idea of a query that can trigger a regression? The attached
patch adds a stab at this, but I’m not sure if it’s the right approach.

The source code comments for common_prefix_cmp() function and
WindowClauseSortNode struct is needed.

Fixed.

+ from empsalary;

I think it's better to change "from empsalary" to "FROM empsalary" for
consistency with other code.

Yes, that was a silly oversight. Fixed.

cheers ./daniel

Attachments:

window_prefix_sort-v5.patchapplication/octet-stream; name=window_prefix_sort-v5.patch; x-unix-mode=0644Download
From 03ded4cb2adc4c262d3e12f87bc75bffa79fed5e Mon Sep 17 00:00:00 2001
From: Daniel Gustafsson <daniel@yesql.se>
Date: Tue, 12 Jun 2018 14:13:48 +0200
Subject: [PATCH] Order windows on partition/ordering prefix to reuse Sort
 nodes

By ordering the Windows on partitioning/ordering prefix, the result
from Sort nodes can be reused by multiple WindowAgg nodes and thus
we can avoid Sort nodes.
---
 src/backend/optimizer/plan/planner.c | 113 ++++++++++++++++++++++++++---------
 src/test/regress/expected/window.out |  60 +++++++++++++++----
 src/test/regress/sql/window.sql      |  16 +++++
 3 files changed, 149 insertions(+), 40 deletions(-)

diff --git a/src/backend/optimizer/plan/planner.c b/src/backend/optimizer/plan/planner.c
index df4ec448cb..0d2f8c410b 100644
--- a/src/backend/optimizer/plan/planner.c
+++ b/src/backend/optimizer/plan/planner.c
@@ -110,6 +110,17 @@ typedef struct
 	int		   *tleref_to_colnum_map;
 } grouping_sets_data;
 
+/*
+ * Temporary structure for use during WindowClause reordering in order to be
+ * be able to sort WindowClauses on partitioning/ordering prefix.
+ */
+typedef struct
+{
+	WindowClause *wc;
+	List	   *uniqueOrder;	/* A List of unique ordering/partitioning
+								   clauses per Window */
+} WindowClauseSortNode;
+
 /* Local functions */
 static Node *preprocess_expression(PlannerInfo *root, Node *expr, int kind);
 static void preprocess_qual_conditions(PlannerInfo *root, Node *jtnode);
@@ -237,6 +248,8 @@ static void create_partitionwise_grouping_paths(PlannerInfo *root,
 static bool group_by_has_partkey(RelOptInfo *input_rel,
 					 List *targetList,
 					 List *groupClause);
+static int common_prefix_cmp(const void *a, const void *b);
+
 
 
 /*****************************************************************************
@@ -5266,7 +5279,19 @@ select_active_windows(PlannerInfo *root, WindowFuncLists *wflists)
 		/* It's only active if wflists shows some related WindowFuncs */
 		Assert(wc->winref <= wflists->maxWinRef);
 		if (wflists->windowFuncs[wc->winref] != NIL)
-			actives = lappend(actives, wc);
+		{
+			WindowClauseSortNode *wcs = palloc(sizeof(WindowClauseSortNode));
+
+			wcs->wc = wc;
+			/*
+			 * Generating the uniqueOrder can be offloaded to the comparison
+			 * function to optimize for the case where we only have a single
+			 * window.  For now, optimize for readability.
+			 */
+			wcs->uniqueOrder = list_concat_unique(
+							list_copy(wc->partitionClause), wc->orderClause);
+			actives = lappend(actives, wcs);
+		}
 	}
 
 	/*
@@ -5275,43 +5300,77 @@ select_active_windows(PlannerInfo *root, WindowFuncLists *wflists)
 	 * says that only one sort is to be used for such windows, even if they
 	 * are otherwise distinct (eg, different names or framing clauses).
 	 *
-	 * There is room to be much smarter here, for example detecting whether
-	 * one window's sort keys are a prefix of another's (so that sorting for
-	 * the latter would do for the former), or putting windows first that
-	 * match a sort order available for the underlying query.  For the moment
-	 * we are content with meeting the spec.
+	 * Also sort windows by partitioning/ordering prefixes where the sorting
+	 * required for a window will cover the sorting required for another
+	 * window, thus reducing the number of Sort nodes.  If we only have a
+	 * single window then skip the sorting.
 	 */
+	if (list_length(actives) > 1)
+		actives = list_qsort(actives, common_prefix_cmp);
+
 	result = NIL;
 	while (actives != NIL)
 	{
-		WindowClause *wc = linitial_node(WindowClause, actives);
-		ListCell   *prev;
-		ListCell   *next;
+		WindowClauseSortNode *wcs = (WindowClauseSortNode *) linitial(actives);
 
-		/* Move wc from actives to result */
 		actives = list_delete_first(actives);
-		result = lappend(result, wc);
+		result = lappend(result, wcs->wc);
 
-		/* Now move any matching windows from actives to result */
-		prev = NULL;
-		for (lc = list_head(actives); lc; lc = next)
-		{
-			WindowClause *wc2 = lfirst_node(WindowClause, lc);
+		list_free(wcs->uniqueOrder);
+		pfree(wcs);
+	}
 
-			next = lnext(lc);
-			/* framing options are NOT to be compared here! */
-			if (equal(wc->partitionClause, wc2->partitionClause) &&
-				equal(wc->orderClause, wc2->orderClause))
-			{
-				actives = list_delete_cell(actives, lc, prev);
-				result = lappend(result, wc2);
-			}
-			else
-				prev = lc;
+	return result;
+}
+
+/*
+ *
+ * common_prefix_cmp
+ *	  QSort comparison function for WindowClauseSortNodes
+ *
+ * Ensure that windows are sorted by identical partitioning/ordering clauses,
+ * but that by partitioning/ordering prefix in order to reuse Sort nodes.
+ */
+static int
+common_prefix_cmp(const void *a, const void *b)
+{
+	WindowClauseSortNode *wcsa = (WindowClauseSortNode *) lfirst(*(ListCell **) a);
+	WindowClauseSortNode *wcsb = (WindowClauseSortNode *) lfirst(*(ListCell **) b);
+	ListCell *item_a;
+	ListCell *item_b;
+
+	forboth(item_a, wcsa->uniqueOrder, item_b, wcsb->uniqueOrder)
+	{
+		SortGroupClause *sca = lfirst(item_a);
+		SortGroupClause *scb = lfirst(item_b);
+
+		if (sca->tleSortGroupRef > scb->tleSortGroupRef)
+			return -1;
+		else if (sca->tleSortGroupRef < scb->tleSortGroupRef)
+			return 1;
+		else if (sca->sortop > scb->sortop)
+			return -1;
+		else if (sca->sortop < scb->sortop)
+			return 1;
+		else if (equality_ops_are_compatible(sca->eqop, scb->eqop))
+		{
+			if (sca->eqop > scb->eqop)
+				return -1;
+			else if (sca->eqop < scb->eqop)
+				return 1;
 		}
+		else if (sca->nulls_first && !scb->nulls_first)
+			return -1;
+		else if (!sca->nulls_first && scb->nulls_first)
+			return 1;
 	}
 
-	return result;
+	if (list_length(wcsa->uniqueOrder) > list_length(wcsb->uniqueOrder))
+		return -1;
+	else if (list_length(wcsa->uniqueOrder) < list_length(wcsb->uniqueOrder))
+		return 1;
+
+	return 0;
 }
 
 /*
diff --git a/src/test/regress/expected/window.out b/src/test/regress/expected/window.out
index 562006a2b8..662d348653 100644
--- a/src/test/regress/expected/window.out
+++ b/src/test/regress/expected/window.out
@@ -504,9 +504,9 @@ SELECT sum(salary),
 FROM empsalary GROUP BY depname;
   sum  | row_number |  sum  
 -------+------------+-------
- 14600 |          3 | 14600
-  7400 |          2 | 22000
  25100 |          1 | 47100
+  7400 |          2 | 22000
+ 14600 |          3 | 14600
 (3 rows)
 
 -- identical windows with different names
@@ -2994,9 +2994,9 @@ SELECT sum(salary), row_number() OVER (ORDER BY depname), sum(
 FROM empsalary GROUP BY depname;
   sum  | row_number | filtered_sum |  depname  
 -------+------------+--------------+-----------
- 14600 |          3 |              | sales
-  7400 |          2 |         3500 | personnel
  25100 |          1 |        22600 | develop
+  7400 |          2 |         3500 | personnel
+ 14600 |          3 |              | sales
 (3 rows)
 
 -- Test pushdown of quals into a subquery containing window functions
@@ -3008,13 +3008,13 @@ SELECT * FROM
           min(salary) OVER (PARTITION BY depname || 'A', depname) depminsalary
    FROM empsalary) emp
 WHERE depname = 'sales';
-                             QUERY PLAN                              
----------------------------------------------------------------------
+                                QUERY PLAN                                
+--------------------------------------------------------------------------
  Subquery Scan on emp
    ->  WindowAgg
-         ->  Sort
-               Sort Key: (((empsalary.depname)::text || 'A'::text))
-               ->  WindowAgg
+         ->  WindowAgg
+               ->  Sort
+                     Sort Key: (((empsalary.depname)::text || 'A'::text))
                      ->  Seq Scan on empsalary
                            Filter: ((depname)::text = 'sales'::text)
 (7 rows)
@@ -3027,19 +3027,53 @@ SELECT * FROM
           min(salary) OVER (PARTITION BY depname) depminsalary
    FROM empsalary) emp
 WHERE depname = 'sales';
-                        QUERY PLAN                         
------------------------------------------------------------
+                      QUERY PLAN                       
+-------------------------------------------------------
  Subquery Scan on emp
    Filter: ((emp.depname)::text = 'sales'::text)
    ->  WindowAgg
          ->  Sort
-               Sort Key: empsalary.depname
+               Sort Key: empsalary.enroll_date
                ->  WindowAgg
                      ->  Sort
-                           Sort Key: empsalary.enroll_date
+                           Sort Key: empsalary.depname
                            ->  Seq Scan on empsalary
 (9 rows)
 
+-- Test Sort node collapsing
+EXPLAIN (COSTS OFF)
+SELECT * FROM
+  (SELECT depname,
+          sum(salary) OVER (PARTITION BY depname order by empno) depsalary,
+          min(salary) OVER (PARTITION BY depname, empno order by enroll_date) depminsalary
+   FROM empsalary) emp
+WHERE depname = 'sales';
+                              QUERY PLAN                              
+----------------------------------------------------------------------
+ Subquery Scan on emp
+   ->  WindowAgg
+         ->  WindowAgg
+               ->  Sort
+                     Sort Key: empsalary.empno, empsalary.enroll_date
+                     ->  Seq Scan on empsalary
+                           Filter: ((depname)::text = 'sales'::text)
+(7 rows)
+
+-- Test Sort node reordering
+EXPLAIN (COSTS OFF)
+SELECT
+  lead(1) OVER (PARTITION BY depname ORDER BY salary, enroll_date),
+  lag(1) OVER (PARTITION BY depname ORDER BY salary,enroll_date,empno)
+FROM empsalary;
+                         QUERY PLAN                          
+-------------------------------------------------------------
+ WindowAgg
+   ->  WindowAgg
+         ->  Sort
+               Sort Key: depname, salary, enroll_date, empno
+               ->  Seq Scan on empsalary
+(5 rows)
+
 -- cleanup
 DROP TABLE empsalary;
 -- test user-defined window function with named args and default args
diff --git a/src/test/regress/sql/window.sql b/src/test/regress/sql/window.sql
index e2943a38f1..fc6d4cc903 100644
--- a/src/test/regress/sql/window.sql
+++ b/src/test/regress/sql/window.sql
@@ -892,6 +892,22 @@ SELECT * FROM
    FROM empsalary) emp
 WHERE depname = 'sales';
 
+-- Test Sort node collapsing
+EXPLAIN (COSTS OFF)
+SELECT * FROM
+  (SELECT depname,
+          sum(salary) OVER (PARTITION BY depname order by empno) depsalary,
+          min(salary) OVER (PARTITION BY depname, empno order by enroll_date) depminsalary
+   FROM empsalary) emp
+WHERE depname = 'sales';
+
+-- Test Sort node reordering
+EXPLAIN (COSTS OFF)
+SELECT
+  lead(1) OVER (PARTITION BY depname ORDER BY salary, enroll_date),
+  lag(1) OVER (PARTITION BY depname ORDER BY salary,enroll_date,empno)
+FROM empsalary;
+
 -- cleanup
 DROP TABLE empsalary;
 
-- 
2.14.1.145.gb3622a4ee

#9Alexander Kuzmenkov
a.kuzmenkov@postgrespro.ru
In reply to: Daniel Gustafsson (#8)
1 attachment(s)
Re: Avoid extra Sort nodes between WindowAggs when sorting can be reused

Daniel,

Thanks for the update.

On 07/25/2018 01:37 AM, Daniel Gustafsson wrote:

Hmm, this is missing the eqop fields of SortGroupClause. I haven't
tested yet but does the similar degradation happen if two
SortGroupCaluses have different eqop and the same other values?

I wasn’t able to construct a case showing this, but I also think you’re right.
Do you have an idea of a query that can trigger a regression? The attached
patch adds a stab at this, but I’m not sure if it’s the right approach.

To trigger that, in your test example you could order by empno::int8 for
one window, and by empno::int2 for another. But don't I think you have
to compare eqop here, because if eqop differs, sortop will differ too. I
removed the comparison from the patch. I also clarified (I hope) the
comments, and did the optimization I mentioned earlier: using array
instead of list for active clauses. Please see the attached v6.
Otherwise I think the patch is good.

--
Alexander Kuzmenkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

Attachments:

0001-Order-windows-on-partition-ordering-prefix-to-reuse-v6.patchtext/x-patch; name=0001-Order-windows-on-partition-ordering-prefix-to-reuse-v6.patchDownload
From e6006da62d7d4c0c54750260e864e52e93e2dba9 Mon Sep 17 00:00:00 2001
From: Daniel Gustafsson <daniel@yesql.se>
Date: Tue, 12 Jun 2018 14:13:48 +0200
Subject: [PATCH] Order windows on partition/ordering prefix to reuse Sort
 nodes

By ordering the Windows on partitioning/ordering prefix, the result
from Sort nodes can be reused by multiple WindowAgg nodes and thus
we can avoid Sort nodes.
---
 src/backend/optimizer/plan/planner.c | 126 ++++++++++++++++++++++-------------
 src/test/regress/expected/window.out |  60 +++++++++++++----
 src/test/regress/sql/window.sql      |  16 +++++
 3 files changed, 144 insertions(+), 58 deletions(-)

diff --git a/src/backend/optimizer/plan/planner.c b/src/backend/optimizer/plan/planner.c
index df4ec44..299149f 100644
--- a/src/backend/optimizer/plan/planner.c
+++ b/src/backend/optimizer/plan/planner.c
@@ -110,6 +110,17 @@ typedef struct
 	int		   *tleref_to_colnum_map;
 } grouping_sets_data;
 
+/*
+ * Temporary structure for use during WindowClause reordering in order to be
+ * be able to sort WindowClauses on partitioning/ordering prefix.
+ */
+typedef struct
+{
+	WindowClause *wc;
+	List	   *uniqueOrder;	/* A List of unique ordering/partitioning
+								   clauses per Window */
+} WindowClauseSortNode;
+
 /* Local functions */
 static Node *preprocess_expression(PlannerInfo *root, Node *expr, int kind);
 static void preprocess_qual_conditions(PlannerInfo *root, Node *jtnode);
@@ -237,6 +248,7 @@ static void create_partitionwise_grouping_paths(PlannerInfo *root,
 static bool group_by_has_partkey(RelOptInfo *input_rel,
 					 List *targetList,
 					 List *groupClause);
+static int common_prefix_cmp(const void *a, const void *b);
 
 
 /*****************************************************************************
@@ -5253,68 +5265,92 @@ postprocess_setop_tlist(List *new_tlist, List *orig_tlist)
 static List *
 select_active_windows(PlannerInfo *root, WindowFuncLists *wflists)
 {
-	List	   *result;
-	List	   *actives;
+	List	   *result = NIL;
 	ListCell   *lc;
+	WindowClauseSortNode *actives = palloc(sizeof(WindowClauseSortNode)
+									* list_length(root->parse->windowClause));
+	int nActive = 0;
 
 	/* First, make a list of the active windows */
-	actives = NIL;
-	foreach(lc, root->parse->windowClause)
+	foreach (lc, root->parse->windowClause)
 	{
 		WindowClause *wc = lfirst_node(WindowClause, lc);
 
 		/* It's only active if wflists shows some related WindowFuncs */
 		Assert(wc->winref <= wflists->maxWinRef);
-		if (wflists->windowFuncs[wc->winref] != NIL)
-			actives = lappend(actives, wc);
+		if (wflists->windowFuncs[wc->winref] == NIL)
+			continue;
+
+		actives[nActive].wc = wc;
+		actives[nActive].uniqueOrder = list_concat_unique(
+							list_copy(wc->partitionClause), wc->orderClause);
+		nActive++;
 	}
 
 	/*
-	 * Now, ensure that windows with identical partitioning/ordering clauses
-	 * are adjacent in the list.  This is required by the SQL standard, which
-	 * says that only one sort is to be used for such windows, even if they
-	 * are otherwise distinct (eg, different names or framing clauses).
-	 *
-	 * There is room to be much smarter here, for example detecting whether
-	 * one window's sort keys are a prefix of another's (so that sorting for
-	 * the latter would do for the former), or putting windows first that
-	 * match a sort order available for the underlying query.  For the moment
-	 * we are content with meeting the spec.
-	 */
-	result = NIL;
-	while (actives != NIL)
-	{
-		WindowClause *wc = linitial_node(WindowClause, actives);
-		ListCell   *prev;
-		ListCell   *next;
-
-		/* Move wc from actives to result */
-		actives = list_delete_first(actives);
-		result = lappend(result, wc);
-
-		/* Now move any matching windows from actives to result */
-		prev = NULL;
-		for (lc = list_head(actives); lc; lc = next)
-		{
-			WindowClause *wc2 = lfirst_node(WindowClause, lc);
+	 * Sort windows by their partitioning/ordering clauses, so that the windows
+	 * that need the same sorting are adjacent in the list. This is required by
+	 * the SQL standard, which says that only one sort is to be used for such
+	 * windows, even if they are otherwise distinct (eg, different names or
+	 * framing clauses). Additionally, if the entire list of clauses of one
+	 * window is a prefix of another, put first the window with stronger sorting
+	 * requirements. This way we will first sort for stronger window, and won't
+	 * have to sort again for the weaker one.
+	 */
+	qsort(actives, nActive, sizeof(WindowClauseSortNode), common_prefix_cmp);
 
-			next = lnext(lc);
-			/* framing options are NOT to be compared here! */
-			if (equal(wc->partitionClause, wc2->partitionClause) &&
-				equal(wc->orderClause, wc2->orderClause))
-			{
-				actives = list_delete_cell(actives, lc, prev);
-				result = lappend(result, wc2);
-			}
-			else
-				prev = lc;
-		}
-	}
+	for (; nActive > 0; nActive--)
+		result = lcons(actives[nActive - 1].wc, result);
+
+	pfree(actives);
 
 	return result;
 }
 
 /*
+ * common_prefix_cmp
+ *	  QSort comparison function for WindowClauseSortNodes
+ *
+ * Sort the windows by the required sorting clauses. First, compare the sort
+ * clauses themselves. Second, if one window's clauses are a prefix of another
+ * one's clauses, put the window with more sort clauses first.
+ */
+static int
+common_prefix_cmp(const void *a, const void *b)
+{
+	WindowClauseSortNode *wcsa = (WindowClauseSortNode *) a;
+	WindowClauseSortNode *wcsb = (WindowClauseSortNode *) b;
+	ListCell *item_a;
+	ListCell *item_b;
+
+	forboth(item_a, wcsa->uniqueOrder, item_b, wcsb->uniqueOrder)
+	{
+		SortGroupClause *sca = lfirst(item_a);
+		SortGroupClause *scb = lfirst(item_b);
+
+		if (sca->tleSortGroupRef > scb->tleSortGroupRef)
+			return -1;
+		else if (sca->tleSortGroupRef < scb->tleSortGroupRef)
+			return 1;
+		else if (sca->sortop > scb->sortop)
+			return -1;
+		else if (sca->sortop < scb->sortop)
+			return 1;
+		else if (sca->nulls_first && !scb->nulls_first)
+			return -1;
+		else if (!sca->nulls_first && scb->nulls_first)
+			return 1;
+	}
+
+	if (list_length(wcsa->uniqueOrder) > list_length(wcsb->uniqueOrder))
+		return -1;
+	else if (list_length(wcsa->uniqueOrder) < list_length(wcsb->uniqueOrder))
+		return 1;
+
+	return 0;
+}
+
+/*
  * make_window_input_target
  *	  Generate appropriate PathTarget for initial input to WindowAgg nodes.
  *
diff --git a/src/test/regress/expected/window.out b/src/test/regress/expected/window.out
index 562006a..662d348 100644
--- a/src/test/regress/expected/window.out
+++ b/src/test/regress/expected/window.out
@@ -504,9 +504,9 @@ SELECT sum(salary),
 FROM empsalary GROUP BY depname;
   sum  | row_number |  sum  
 -------+------------+-------
- 14600 |          3 | 14600
-  7400 |          2 | 22000
  25100 |          1 | 47100
+  7400 |          2 | 22000
+ 14600 |          3 | 14600
 (3 rows)
 
 -- identical windows with different names
@@ -2994,9 +2994,9 @@ SELECT sum(salary), row_number() OVER (ORDER BY depname), sum(
 FROM empsalary GROUP BY depname;
   sum  | row_number | filtered_sum |  depname  
 -------+------------+--------------+-----------
- 14600 |          3 |              | sales
-  7400 |          2 |         3500 | personnel
  25100 |          1 |        22600 | develop
+  7400 |          2 |         3500 | personnel
+ 14600 |          3 |              | sales
 (3 rows)
 
 -- Test pushdown of quals into a subquery containing window functions
@@ -3008,13 +3008,13 @@ SELECT * FROM
           min(salary) OVER (PARTITION BY depname || 'A', depname) depminsalary
    FROM empsalary) emp
 WHERE depname = 'sales';
-                             QUERY PLAN                              
----------------------------------------------------------------------
+                                QUERY PLAN                                
+--------------------------------------------------------------------------
  Subquery Scan on emp
    ->  WindowAgg
-         ->  Sort
-               Sort Key: (((empsalary.depname)::text || 'A'::text))
-               ->  WindowAgg
+         ->  WindowAgg
+               ->  Sort
+                     Sort Key: (((empsalary.depname)::text || 'A'::text))
                      ->  Seq Scan on empsalary
                            Filter: ((depname)::text = 'sales'::text)
 (7 rows)
@@ -3027,19 +3027,53 @@ SELECT * FROM
           min(salary) OVER (PARTITION BY depname) depminsalary
    FROM empsalary) emp
 WHERE depname = 'sales';
-                        QUERY PLAN                         
------------------------------------------------------------
+                      QUERY PLAN                       
+-------------------------------------------------------
  Subquery Scan on emp
    Filter: ((emp.depname)::text = 'sales'::text)
    ->  WindowAgg
          ->  Sort
-               Sort Key: empsalary.depname
+               Sort Key: empsalary.enroll_date
                ->  WindowAgg
                      ->  Sort
-                           Sort Key: empsalary.enroll_date
+                           Sort Key: empsalary.depname
                            ->  Seq Scan on empsalary
 (9 rows)
 
+-- Test Sort node collapsing
+EXPLAIN (COSTS OFF)
+SELECT * FROM
+  (SELECT depname,
+          sum(salary) OVER (PARTITION BY depname order by empno) depsalary,
+          min(salary) OVER (PARTITION BY depname, empno order by enroll_date) depminsalary
+   FROM empsalary) emp
+WHERE depname = 'sales';
+                              QUERY PLAN                              
+----------------------------------------------------------------------
+ Subquery Scan on emp
+   ->  WindowAgg
+         ->  WindowAgg
+               ->  Sort
+                     Sort Key: empsalary.empno, empsalary.enroll_date
+                     ->  Seq Scan on empsalary
+                           Filter: ((depname)::text = 'sales'::text)
+(7 rows)
+
+-- Test Sort node reordering
+EXPLAIN (COSTS OFF)
+SELECT
+  lead(1) OVER (PARTITION BY depname ORDER BY salary, enroll_date),
+  lag(1) OVER (PARTITION BY depname ORDER BY salary,enroll_date,empno)
+FROM empsalary;
+                         QUERY PLAN                          
+-------------------------------------------------------------
+ WindowAgg
+   ->  WindowAgg
+         ->  Sort
+               Sort Key: depname, salary, enroll_date, empno
+               ->  Seq Scan on empsalary
+(5 rows)
+
 -- cleanup
 DROP TABLE empsalary;
 -- test user-defined window function with named args and default args
diff --git a/src/test/regress/sql/window.sql b/src/test/regress/sql/window.sql
index e2943a3..fc6d4cc 100644
--- a/src/test/regress/sql/window.sql
+++ b/src/test/regress/sql/window.sql
@@ -892,6 +892,22 @@ SELECT * FROM
    FROM empsalary) emp
 WHERE depname = 'sales';
 
+-- Test Sort node collapsing
+EXPLAIN (COSTS OFF)
+SELECT * FROM
+  (SELECT depname,
+          sum(salary) OVER (PARTITION BY depname order by empno) depsalary,
+          min(salary) OVER (PARTITION BY depname, empno order by enroll_date) depminsalary
+   FROM empsalary) emp
+WHERE depname = 'sales';
+
+-- Test Sort node reordering
+EXPLAIN (COSTS OFF)
+SELECT
+  lead(1) OVER (PARTITION BY depname ORDER BY salary, enroll_date),
+  lag(1) OVER (PARTITION BY depname ORDER BY salary,enroll_date,empno)
+FROM empsalary;
+
 -- cleanup
 DROP TABLE empsalary;
 
-- 
2.7.4

#10Daniel Gustafsson
daniel@yesql.se
In reply to: Alexander Kuzmenkov (#9)
Re: Avoid extra Sort nodes between WindowAggs when sorting can be reused

On 27 Jul 2018, at 21:12, Alexander Kuzmenkov <a.kuzmenkov@postgrespro.ru> wrote:

Thanks for the update.

Thank you for reviewing and hacking!

On 07/25/2018 01:37 AM, Daniel Gustafsson wrote:

Hmm, this is missing the eqop fields of SortGroupClause. I haven't
tested yet but does the similar degradation happen if two
SortGroupCaluses have different eqop and the same other values?

I wasn’t able to construct a case showing this, but I also think you’re right.
Do you have an idea of a query that can trigger a regression? The attached
patch adds a stab at this, but I’m not sure if it’s the right approach.

To trigger that, in your test example you could order by empno::int8 for one window, and by empno::int2 for another. But don't I think you have to compare eqop here, because if eqop differs, sortop will differ too. I removed the comparison from the patch.

Right, that makes sense.

I also clarified (I hope) the comments, and did the optimization I mentioned earlier: using array instead of list for active clauses. Please see the attached v6.

Thanks, looks good.

Otherwise I think the patch is good.

Cool, thanks for reviewing!

cheers ./daniel

#11Alexander Kuzmenkov
a.kuzmenkov@postgrespro.ru
In reply to: Daniel Gustafsson (#10)
Re: Avoid extra Sort nodes between WindowAggs when sorting can be reused

The last version looked OK, so I'm marking this patch as ready for
committer in the commitfest app.

--
Alexander Kuzmenkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

#12Masahiko Sawada
sawada.mshk@gmail.com
In reply to: Alexander Kuzmenkov (#9)
Re: Avoid extra Sort nodes between WindowAggs when sorting can be reused

On Sat, Jul 28, 2018 at 4:12 AM, Alexander Kuzmenkov
<a.kuzmenkov@postgrespro.ru> wrote:

Daniel,

Thanks for the update.

On 07/25/2018 01:37 AM, Daniel Gustafsson wrote:

Hmm, this is missing the eqop fields of SortGroupClause. I haven't
tested yet but does the similar degradation happen if two
SortGroupCaluses have different eqop and the same other values?

I wasn’t able to construct a case showing this, but I also think you’re
right.
Do you have an idea of a query that can trigger a regression? The
attached
patch adds a stab at this, but I’m not sure if it’s the right approach.

To trigger that, in your test example you could order by empno::int8 for one
window, and by empno::int2 for another. But don't I think you have to
compare eqop here, because if eqop differs, sortop will differ too. I
removed the comparison from the patch. I also clarified (I hope) the
comments, and did the optimization I mentioned earlier: using array instead
of list for active clauses. Please see the attached v6. Otherwise I think
the patch is good.

Thank you! That makes sense and the patch looks good to me. FWIW maybe
it's good idea to add the comment describing why we didn't that.

Regards,

--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

#13Andrew Gierth
andrew@tao11.riddles.org.uk
In reply to: Masahiko Sawada (#12)
Re: Avoid extra Sort nodes between WindowAggs when sorting can be reused

So I'm looking to commit this, and here's my comments so far:

WindowClauseSortNode - I don't like this name, because it's not actually
a Node of any kind. How about WindowSortData?

list_concat_unique(list_copy(x),y) is exactly list_union(x,y), which
looks a bit nicer to me.

re. this:

for (; nActive > 0; nActive--)
result = lcons(actives[nActive - 1].wc, result);

Now that we're allowed to use C99, I think it looks better like this:

for (int i = 0; i < nActive; i++)
result = lappend(result, actives[i].wc);

(Building lists in forward order by using a reversed construction and
iterating backwards seems like an unnecessary double-negative.)

I can add a comment about not needing to compare eqop (which is derived
directly from sortop, so it can't differ unless sortop also does -
provided sortop is actually present; if window partitions could be
hashed, this would be a problem, but that doesn't strike me as very
likely to happen).

Any comments? (no need to post further patches unless there's some major
change needed)

--
Andrew (irc:RhodiumToad)

#14Daniel Gustafsson
daniel@yesql.se
In reply to: Andrew Gierth (#13)
Re: Avoid extra Sort nodes between WindowAggs when sorting can be reused

On 12 Sep 2018, at 22:15, Andrew Gierth <andrew@tao11.riddles.org.uk> wrote:

WindowClauseSortNode - I don't like this name, because it's not actually
a Node of any kind. How about WindowSortData?

That’s a good point. I probably would’ve named it WindowClauseSortData since
it acts on WindowClauses, but that might just be overly verbose.

Any comments? (no need to post further patches unless there's some major
change needed)

I have no objections to the comments made in this review, only the above
nitpick.

Thanks for picking this up!

cheers ./daniel

#15Tom Lane
tgl@sss.pgh.pa.us
In reply to: Andrew Gierth (#13)
Re: Avoid extra Sort nodes between WindowAggs when sorting can be reused

Andrew Gierth <andrew@tao11.riddles.org.uk> writes:

So I'm looking to commit this, and here's my comments so far:

I took a quick look over this. I agree with your nitpicks, and have
a couple more:

* Please run it through pgindent. That will, at a minimum, remove some
gratuitous whitespace changes in this patch. I think it'll also expose
some places where you need to change the code layout to prevent pgindent
from making it look ugly. Notably, this:

actives[nActive].uniqueOrder = list_concat_unique(
list_copy(wc->partitionClause), wc->orderClause);

is not per project style for function call layout. Given the other
comment about using list_union, I'd probably lay it out like this:

actives[nActive].uniqueOrder = list_union(wc->partitionClause,
wc->orderClause);

* The initial comment in select_active_windows,

/* First, make a list of the active windows */

is now seriously inadequate as a description of what the subsequent
loop does; it needs to be expanded. I'd also say that it's not
building a list anymore, but an array. Further, there needs to be
an explanation of why what it's doing is correct at all ---
list_union doesn't make many promises about the order of the resulting
list (nor did the phraseology with list_concat_unique), but what we're
doing below certainly requires that order to have something to do with
the window semantics.

* I'm almost thinking that changing to list_union is a bad idea, because
that obscures the fact that we're relying on the relative order of
elements in partitionClause and orderClause to not change; any future
reimplementation of list_union would utterly break this code. I'm also a
bit suspicious as to whether the code is even correct; does it *really*
match what will happen later when we create sort plan nodes? (Maybe it's
fine; I haven't looked.)

* The original comments also made explicit that we were not considering
framing options, and I'm not too happy that that disappeared.

* It'd be better if common_prefix_cmp didn't cast away const.

regards, tom lane

#16Andrew Gierth
andrew@tao11.riddles.org.uk
In reply to: Tom Lane (#15)
Re: Avoid extra Sort nodes between WindowAggs when sorting can be reused

"Tom" == Tom Lane <tgl@sss.pgh.pa.us> writes:

Tom> * I'm almost thinking that changing to list_union is a bad idea,

A fair point. Though it looks like list_union is used in only about 3
distinct places, and two of those are list_union(NIL, blah) to simply
remove dups from a single list. The third place is the cartesian-product
expansion of grouping sets, which uses list_union_int to remove
duplicates - changing the order there will give slightly user-surprising
but not actually incorrect results.

Presumably list_concat_unique should be considered to guarantee that it
preserves the relative order of the two lists and of the non-duplicate
items in the second list?

--
Andrew (irc:RhodiumToad)

#17Tom Lane
tgl@sss.pgh.pa.us
In reply to: Andrew Gierth (#16)
Re: Avoid extra Sort nodes between WindowAggs when sorting can be reused

Andrew Gierth <andrew@tao11.riddles.org.uk> writes:

"Tom" == Tom Lane <tgl@sss.pgh.pa.us> writes:
Tom> * I'm almost thinking that changing to list_union is a bad idea,

A fair point. Though it looks like list_union is used in only about 3
distinct places, and two of those are list_union(NIL, blah) to simply
remove dups from a single list. The third place is the cartesian-product
expansion of grouping sets, which uses list_union_int to remove
duplicates - changing the order there will give slightly user-surprising
but not actually incorrect results.

Presumably list_concat_unique should be considered to guarantee that it
preserves the relative order of the two lists and of the non-duplicate
items in the second list?

I'm thinking that whichever coding we use, the patch should include
comment additions in list.c documenting that some callers have assumptions
thus-and-so about list order preservation. Then at least anybody who
got the idea to try to improve performance of those functions would be on
notice about the risks.

I see that list_union is currently documented like this:

* Generate the union of two lists. This is calculated by copying
* list1 via list_copy(), then adding to it all the members of list2
* that aren't already in list1.

so as long as it stays like that, it's not unreasonable to use it in
this patch. I just want the potential landmine to be obvious at that
end.

regards, tom lane

#18Andrew Gierth
andrew@tao11.riddles.org.uk
In reply to: Tom Lane (#15)
Re: Avoid extra Sort nodes between WindowAggs when sorting can be reused

"Tom" == Tom Lane <tgl@sss.pgh.pa.us> writes:

Tom> I'm also a bit suspicious as to whether the code is even correct;
Tom> does it *really* match what will happen later when we create sort
Tom> plan nodes? (Maybe it's fine; I haven't looked.)

As things stand before the patch, the code that actually generates the
path of sort+window nodes requires only this assumption: that
order-equivalent windows (as defined by the spec) must end up together
in the list, or otherwise separated only by entries that don't add a new
sort node.

(aside: I itch to rewrite the comment that says "the spec requires that
there be only one sort" - number of sorts is an implementation detail
about which the spec is silent, what it _actually_ requires is that peer
rows must be presented in the same order in all order-equivalent
windows, which we choose to implement by ensuring there is only one sort
for such windows, rather than, for example, adding extra sort keys to
provide stability.)

The path-generation code simply concatenates the partition and order
lists and creates pathkeys. The pathkeys creation removes redundant
entries. So if we're guaranteed that two entries considered equal by the
patch code are also considered equal by the pathkey mechanism, which I
believe is the case, then the logic is still correct (enough to satisfy
the spec and produce correct query results).

There are optimizations that can be done once we have the pathkeys that
can't be anticipated by select_active_windows because that function is
run before we set up equivalence classes. This might lead path creation
to produce fewer sorts than anticipated, but not more sorts.

So I'm satisfied, as far as I can tell, that the logic is both correct
and an improvement over what we currently have.

(Perhaps worth noting for future work is that this code and the grouping
sets code have a common issue: currently we allow only one sort order to
be requested as query_pathkeys, but this means that both window paths
and grouping sets paths have to make an essentially arbitrary choice of
query_pathkeys, rather than having a set of possible "useful" orderings
and taking whichever can be produced most cheaply.)

--
Andrew (irc:RhodiumToad)

#19Tom Lane
tgl@sss.pgh.pa.us
In reply to: Andrew Gierth (#18)
Re: Avoid extra Sort nodes between WindowAggs when sorting can be reused

Andrew Gierth <andrew@tao11.riddles.org.uk> writes:

(aside: I itch to rewrite the comment that says "the spec requires that
there be only one sort" - number of sorts is an implementation detail
about which the spec is silent, what it _actually_ requires is that peer
rows must be presented in the same order in all order-equivalent
windows, which we choose to implement by ensuring there is only one sort
for such windows, rather than, for example, adding extra sort keys to
provide stability.)

Sure, rewrite away.

(Perhaps worth noting for future work is that this code and the grouping
sets code have a common issue: currently we allow only one sort order to
be requested as query_pathkeys, but this means that both window paths
and grouping sets paths have to make an essentially arbitrary choice of
query_pathkeys, rather than having a set of possible "useful" orderings
and taking whichever can be produced most cheaply.)

Yeah, I've had a bee in my bonnet for awhile about replacing
query_pathkeys with a list of potentially-desirable result orderings.
So far there hasn't been a truly compelling reason to do it, but if
somebody felt like generalizing the window function ordering stuff
in that direction, it'd be a nice project.

regards, tom lane

#20Andrew Gierth
andrew@tao11.riddles.org.uk
In reply to: Tom Lane (#19)
1 attachment(s)
Re: Avoid extra Sort nodes between WindowAggs when sorting can be reused

Here's what I have queued up to push.

My changes are:

- added to the header comment of list_concat_unique that callers have
ordering expectations. Didn't touch list_union, since I ended up
sticking with list_concat_unique for this patch.

- WindowClauseSortNode renamed WindowClauseSortData

- added and rewrote some comments

- tidied up some casting in common_prefix_cmp

- pgindent and some layout tweaks

--
Andrew (irc:RhodiumToad)

Attachments:

0001-Order-active-window-clauses-for-greater-reuse-of-Sor.patchtext/x-patchDownload
From 9c89883ffe2153cc9d047f71a2b0e611f2c60452 Mon Sep 17 00:00:00 2001
From: Andrew Gierth <rhodiumtoad@postgresql.org>
Date: Thu, 13 Sep 2018 18:12:37 +0100
Subject: [PATCH] Order active window clauses for greater reuse of Sort nodes.

By sorting the active window list lexicographically by the sort clause
list but putting longer clauses before shorter prefixes, we generate
more chances to elide Sort nodes when building the path.

Author: Daniel Gustafsson (with some editorialization by me)
Reviewed-by: Alexander Kuzmenkov, Masahiko Sawada, Tom Lane
Discussion: https://postgr.es/m/124A7F69-84CD-435B-BA0E-2695BE21E5C2%40yesql.se
---
 src/backend/nodes/list.c             |   7 +-
 src/backend/optimizer/plan/planner.c | 154 +++++++++++++++++++++++++----------
 src/test/regress/expected/window.out |  60 +++++++++++---
 src/test/regress/sql/window.sql      |  16 ++++
 4 files changed, 177 insertions(+), 60 deletions(-)

diff --git a/src/backend/nodes/list.c b/src/backend/nodes/list.c
index f3e1800708..55fd4c359b 100644
--- a/src/backend/nodes/list.c
+++ b/src/backend/nodes/list.c
@@ -1011,8 +1011,11 @@ list_append_unique_oid(List *list, Oid datum)
  * via equal().
  *
  * This is almost the same functionality as list_union(), but list1 is
- * modified in-place rather than being copied.  Note also that list2's cells
- * are not inserted in list1, so the analogy to list_concat() isn't perfect.
+ * modified in-place rather than being copied. However, callers of this
+ * function may have strict ordering expectations -- i.e. that the relative
+ * order of those list2 elements that are not duplicates is preserved. Note
+ * also that list2's cells are not inserted in list1, so the analogy to
+ * list_concat() isn't perfect.
  */
 List *
 list_concat_unique(List *list1, List *list2)
diff --git a/src/backend/optimizer/plan/planner.c b/src/backend/optimizer/plan/planner.c
index 96bf0601a8..94b85721fa 100644
--- a/src/backend/optimizer/plan/planner.c
+++ b/src/backend/optimizer/plan/planner.c
@@ -110,6 +110,17 @@ typedef struct
 	int		   *tleref_to_colnum_map;
 } grouping_sets_data;
 
+/*
+ * Temporary structure for use during WindowClause reordering in order to be
+ * be able to sort WindowClauses on partitioning/ordering prefix.
+ */
+typedef struct
+{
+	WindowClause *wc;
+	List	   *uniqueOrder;	/* A List of unique ordering/partitioning
+								 * clauses per Window */
+} WindowClauseSortData;
+
 /* Local functions */
 static Node *preprocess_expression(PlannerInfo *root, Node *expr, int kind);
 static void preprocess_qual_conditions(PlannerInfo *root, Node *jtnode);
@@ -237,6 +248,7 @@ static void create_partitionwise_grouping_paths(PlannerInfo *root,
 static bool group_by_has_partkey(RelOptInfo *input_rel,
 					 List *targetList,
 					 List *groupClause);
+static int	common_prefix_cmp(const void *a, const void *b);
 
 
 /*****************************************************************************
@@ -5260,68 +5272,120 @@ postprocess_setop_tlist(List *new_tlist, List *orig_tlist)
 static List *
 select_active_windows(PlannerInfo *root, WindowFuncLists *wflists)
 {
-	List	   *result;
-	List	   *actives;
+	List	   *windowClause = root->parse->windowClause;
+	List	   *result = NIL;
 	ListCell   *lc;
+	int			nActive = 0;
+	WindowClauseSortData *actives = palloc(sizeof(WindowClauseSortData)
+										   * list_length(windowClause));
 
-	/* First, make a list of the active windows */
-	actives = NIL;
-	foreach(lc, root->parse->windowClause)
+	/* First, construct an array of the active windows */
+	foreach(lc, windowClause)
 	{
 		WindowClause *wc = lfirst_node(WindowClause, lc);
 
 		/* It's only active if wflists shows some related WindowFuncs */
 		Assert(wc->winref <= wflists->maxWinRef);
-		if (wflists->windowFuncs[wc->winref] != NIL)
-			actives = lappend(actives, wc);
+		if (wflists->windowFuncs[wc->winref] == NIL)
+			continue;
+
+		actives[nActive].wc = wc;	/* original clause */
+
+		/*
+		 * For sorting, we want the list of partition keys followed by the
+		 * list of sort keys. But pathkeys construction will remove duplicates
+		 * between the two, so we can as well (even though we can't detect all
+		 * of the duplicates, since some may come from ECs - that might mean
+		 * we miss optimization chances here). We must, however, ensure that
+		 * the order of entries is preserved with respect to the ones we do
+		 * keep.
+		 *
+		 * partitionClause and orderClause had their own duplicates removed in
+		 * parse analysis, so we're only concerned here with removing
+		 * orderClause entries that also appear in partitionClause.
+		 */
+		actives[nActive].uniqueOrder =
+			list_concat_unique(list_copy(wc->partitionClause),
+							   wc->orderClause);
+		nActive++;
 	}
 
 	/*
-	 * Now, ensure that windows with identical partitioning/ordering clauses
-	 * are adjacent in the list.  This is required by the SQL standard, which
-	 * says that only one sort is to be used for such windows, even if they
-	 * are otherwise distinct (eg, different names or framing clauses).
+	 * Sort active windows by their partitioning/ordering clauses, ignoring
+	 * any framing clauses, so that the windows that need the same sorting are
+	 * adjacent in the list. When we come to generate paths, this will avoid
+	 * inserting additional Sort nodes.
 	 *
-	 * There is room to be much smarter here, for example detecting whether
-	 * one window's sort keys are a prefix of another's (so that sorting for
-	 * the latter would do for the former), or putting windows first that
-	 * match a sort order available for the underlying query.  For the moment
-	 * we are content with meeting the spec.
-	 */
-	result = NIL;
-	while (actives != NIL)
-	{
-		WindowClause *wc = linitial_node(WindowClause, actives);
-		ListCell   *prev;
-		ListCell   *next;
-
-		/* Move wc from actives to result */
-		actives = list_delete_first(actives);
-		result = lappend(result, wc);
-
-		/* Now move any matching windows from actives to result */
-		prev = NULL;
-		for (lc = list_head(actives); lc; lc = next)
-		{
-			WindowClause *wc2 = lfirst_node(WindowClause, lc);
+	 * This is how we implement a specific requirement from the SQL standard,
+	 * which says that when two or more windows are order-equivalent (i.e.
+	 * have matching partition and order clauses, even if their names or
+	 * framing clauses differ), then all peer rows must be presented in the
+	 * same order in all of them. If we allowed multiple sort nodes for such
+	 * cases, we'd risk having the peer rows end up in different orders in
+	 * equivalent windows due to sort instability. (See General Rule 4 of
+	 * <window clause> in SQL2008 - SQL2016.)
+	 *
+	 * Additionally, if the entire list of clauses of one window is a prefix
+	 * of another, put first the window with stronger sorting requirements.
+	 * This way we will first sort for stronger window, and won't have to sort
+	 * again for the weaker one.
+	 */
+	qsort(actives, nActive, sizeof(WindowClauseSortData), common_prefix_cmp);
 
-			next = lnext(lc);
-			/* framing options are NOT to be compared here! */
-			if (equal(wc->partitionClause, wc2->partitionClause) &&
-				equal(wc->orderClause, wc2->orderClause))
-			{
-				actives = list_delete_cell(actives, lc, prev);
-				result = lappend(result, wc2);
-			}
-			else
-				prev = lc;
-		}
-	}
+	/* build ordered list of the original WindowClause nodes */
+	for (int i = 0; i < nActive; i++)
+		result = lappend(result, actives[i].wc);
+
+	pfree(actives);
 
 	return result;
 }
 
 /*
+ * common_prefix_cmp
+ *	  QSort comparison function for WindowClauseSortData
+ *
+ * Sort the windows by the required sorting clauses. First, compare the sort
+ * clauses themselves. Second, if one window's clauses are a prefix of another
+ * one's clauses, put the window with more sort clauses first.
+ */
+static int
+common_prefix_cmp(const void *a, const void *b)
+{
+	const WindowClauseSortData *wcsa = a;
+	const WindowClauseSortData *wcsb = b;
+	ListCell   *item_a;
+	ListCell   *item_b;
+
+	forboth(item_a, wcsa->uniqueOrder, item_b, wcsb->uniqueOrder)
+	{
+		SortGroupClause *sca = lfirst_node(SortGroupClause, item_a);
+		SortGroupClause *scb = lfirst_node(SortGroupClause, item_b);
+
+		if (sca->tleSortGroupRef > scb->tleSortGroupRef)
+			return -1;
+		else if (sca->tleSortGroupRef < scb->tleSortGroupRef)
+			return 1;
+		else if (sca->sortop > scb->sortop)
+			return -1;
+		else if (sca->sortop < scb->sortop)
+			return 1;
+		else if (sca->nulls_first && !scb->nulls_first)
+			return -1;
+		else if (!sca->nulls_first && scb->nulls_first)
+			return 1;
+		/* no need to compare eqop, since it is fully determined by sortop */
+	}
+
+	if (list_length(wcsa->uniqueOrder) > list_length(wcsb->uniqueOrder))
+		return -1;
+	else if (list_length(wcsa->uniqueOrder) < list_length(wcsb->uniqueOrder))
+		return 1;
+
+	return 0;
+}
+
+/*
  * make_window_input_target
  *	  Generate appropriate PathTarget for initial input to WindowAgg nodes.
  *
diff --git a/src/test/regress/expected/window.out b/src/test/regress/expected/window.out
index 562006a2b8..662d348653 100644
--- a/src/test/regress/expected/window.out
+++ b/src/test/regress/expected/window.out
@@ -504,9 +504,9 @@ SELECT sum(salary),
 FROM empsalary GROUP BY depname;
   sum  | row_number |  sum  
 -------+------------+-------
- 14600 |          3 | 14600
-  7400 |          2 | 22000
  25100 |          1 | 47100
+  7400 |          2 | 22000
+ 14600 |          3 | 14600
 (3 rows)
 
 -- identical windows with different names
@@ -2994,9 +2994,9 @@ SELECT sum(salary), row_number() OVER (ORDER BY depname), sum(
 FROM empsalary GROUP BY depname;
   sum  | row_number | filtered_sum |  depname  
 -------+------------+--------------+-----------
- 14600 |          3 |              | sales
-  7400 |          2 |         3500 | personnel
  25100 |          1 |        22600 | develop
+  7400 |          2 |         3500 | personnel
+ 14600 |          3 |              | sales
 (3 rows)
 
 -- Test pushdown of quals into a subquery containing window functions
@@ -3008,13 +3008,13 @@ SELECT * FROM
           min(salary) OVER (PARTITION BY depname || 'A', depname) depminsalary
    FROM empsalary) emp
 WHERE depname = 'sales';
-                             QUERY PLAN                              
----------------------------------------------------------------------
+                                QUERY PLAN                                
+--------------------------------------------------------------------------
  Subquery Scan on emp
    ->  WindowAgg
-         ->  Sort
-               Sort Key: (((empsalary.depname)::text || 'A'::text))
-               ->  WindowAgg
+         ->  WindowAgg
+               ->  Sort
+                     Sort Key: (((empsalary.depname)::text || 'A'::text))
                      ->  Seq Scan on empsalary
                            Filter: ((depname)::text = 'sales'::text)
 (7 rows)
@@ -3027,19 +3027,53 @@ SELECT * FROM
           min(salary) OVER (PARTITION BY depname) depminsalary
    FROM empsalary) emp
 WHERE depname = 'sales';
-                        QUERY PLAN                         
------------------------------------------------------------
+                      QUERY PLAN                       
+-------------------------------------------------------
  Subquery Scan on emp
    Filter: ((emp.depname)::text = 'sales'::text)
    ->  WindowAgg
          ->  Sort
-               Sort Key: empsalary.depname
+               Sort Key: empsalary.enroll_date
                ->  WindowAgg
                      ->  Sort
-                           Sort Key: empsalary.enroll_date
+                           Sort Key: empsalary.depname
                            ->  Seq Scan on empsalary
 (9 rows)
 
+-- Test Sort node collapsing
+EXPLAIN (COSTS OFF)
+SELECT * FROM
+  (SELECT depname,
+          sum(salary) OVER (PARTITION BY depname order by empno) depsalary,
+          min(salary) OVER (PARTITION BY depname, empno order by enroll_date) depminsalary
+   FROM empsalary) emp
+WHERE depname = 'sales';
+                              QUERY PLAN                              
+----------------------------------------------------------------------
+ Subquery Scan on emp
+   ->  WindowAgg
+         ->  WindowAgg
+               ->  Sort
+                     Sort Key: empsalary.empno, empsalary.enroll_date
+                     ->  Seq Scan on empsalary
+                           Filter: ((depname)::text = 'sales'::text)
+(7 rows)
+
+-- Test Sort node reordering
+EXPLAIN (COSTS OFF)
+SELECT
+  lead(1) OVER (PARTITION BY depname ORDER BY salary, enroll_date),
+  lag(1) OVER (PARTITION BY depname ORDER BY salary,enroll_date,empno)
+FROM empsalary;
+                         QUERY PLAN                          
+-------------------------------------------------------------
+ WindowAgg
+   ->  WindowAgg
+         ->  Sort
+               Sort Key: depname, salary, enroll_date, empno
+               ->  Seq Scan on empsalary
+(5 rows)
+
 -- cleanup
 DROP TABLE empsalary;
 -- test user-defined window function with named args and default args
diff --git a/src/test/regress/sql/window.sql b/src/test/regress/sql/window.sql
index e2943a38f1..fc6d4cc903 100644
--- a/src/test/regress/sql/window.sql
+++ b/src/test/regress/sql/window.sql
@@ -892,6 +892,22 @@ SELECT * FROM
    FROM empsalary) emp
 WHERE depname = 'sales';
 
+-- Test Sort node collapsing
+EXPLAIN (COSTS OFF)
+SELECT * FROM
+  (SELECT depname,
+          sum(salary) OVER (PARTITION BY depname order by empno) depsalary,
+          min(salary) OVER (PARTITION BY depname, empno order by enroll_date) depminsalary
+   FROM empsalary) emp
+WHERE depname = 'sales';
+
+-- Test Sort node reordering
+EXPLAIN (COSTS OFF)
+SELECT
+  lead(1) OVER (PARTITION BY depname ORDER BY salary, enroll_date),
+  lag(1) OVER (PARTITION BY depname ORDER BY salary,enroll_date,empno)
+FROM empsalary;
+
 -- cleanup
 DROP TABLE empsalary;
 
-- 
2.11.1

#21Daniel Gustafsson
daniel@yesql.se
In reply to: Andrew Gierth (#20)
Re: Avoid extra Sort nodes between WindowAggs when sorting can be reused

On 13 Sep 2018, at 19:50, Andrew Gierth <andrew@tao11.riddles.org.uk> wrote:

Here's what I have queued up to push.

LGTM, thanks!

+	 * framing clauses differ), then all peer rows must be presented in the
+	 * same order in all of them. If we allowed multiple sort nodes for such

Should probably be capitalized as "Sort nodes” to match the rest of the comment.

cheers ./daniel

#22Andrew Gierth
andrew@tao11.riddles.org.uk
In reply to: Daniel Gustafsson (#21)
Re: Avoid extra Sort nodes between WindowAggs when sorting can be reused

Here's what I have queued up to push.

Daniel> LGTM, thanks!

Committed.

Many thanks for the contribution, and thanks to the reviewers for their
work.

--
Andrew (irc:RhodiumToad)