Fix up grouping sets reorder

Started by Richard Guoover 6 years ago4 messages
#1Richard Guo
riguo@pivotal.io
1 attachment(s)

Hi all,

During the reorder of grouping sets into correct prefix order, if only
one aggregation pass is needed, we follow the order of the ORDER BY
clause to the extent possible, to minimize the chance that we add
unnecessary sorts. This is implemented in preprocess_grouping_sets -->
reorder_grouping_sets.

However, current codes fail to do that. For instance:

# set enable_hashagg to off;
SET

# explain verbose select * from t group by grouping sets((a,b,c),(c)) order
by c,b,a;
QUERY PLAN
-------------------------------------------------------------------------------
Sort (cost=184.47..185.48 rows=404 width=12)
Output: a, b, c
Sort Key: t.c, t.b, t.a
-> GroupAggregate (cost=142.54..166.98 rows=404 width=12)
Output: a, b, c
Group Key: t.c, t.a, t.b
Group Key: t.c
-> Sort (cost=142.54..147.64 rows=2040 width=12)
Output: a, b, c
Sort Key: t.c, t.a, t.b
-> Seq Scan on public.t (cost=0.00..30.40 rows=2040
width=12)
Output: a, b, c
(12 rows)

This sort node in the above plan can be avoided if we reorder the
grouping sets more properly.

Attached is a patch for the fixup. With the patch, the above plan would
become:

# explain verbose select * from t group by grouping sets((a,b,c),(c)) order
by c,b,a;
QUERY PLAN
-------------------------------------------------------------------------
GroupAggregate (cost=142.54..166.98 rows=404 width=12)
Output: a, b, c
Group Key: t.c, t.b, t.a
Group Key: t.c
-> Sort (cost=142.54..147.64 rows=2040 width=12)
Output: a, b, c
Sort Key: t.c, t.b, t.a
-> Seq Scan on public.t (cost=0.00..30.40 rows=2040 width=12)
Output: a, b, c
(9 rows)

The fix happens in reorder_grouping_sets and is very simple. In each
iteration to reorder one grouping set, if the next item in sortclause
matches one element in new_elems, we add that item to the grouing set
list and meanwhile remove it from the new_elems list. When all the
elements in new_elems have been removed, we can know we are done with
current grouping set. We should break out to continue with next grouping
set.

Any thoughts?

Thanks
Richard

Attachments:

v1-0001-Fix-up-grouping-sets-reorder.patchapplication/octet-stream; name=v1-0001-Fix-up-grouping-sets-reorder.patchDownload
From 3e2b849b2f32dc3e03b5fa15d259a3000ccb6f92 Mon Sep 17 00:00:00 2001
From: Richard Guo <riguo@pivotal.io>
Date: Mon, 17 Jun 2019 08:34:47 +0000
Subject: [PATCH] Fix up grouping sets reorder.

---
 src/backend/optimizer/plan/planner.c       |  4 +++
 src/test/regress/expected/groupingsets.out | 41 ++++++++++++++++++++++++++++++
 src/test/regress/sql/groupingsets.sql      |  6 +++++
 3 files changed, 51 insertions(+)

diff --git a/src/backend/optimizer/plan/planner.c b/src/backend/optimizer/plan/planner.c
index cb897cc..36d78fe 100644
--- a/src/backend/optimizer/plan/planner.c
+++ b/src/backend/optimizer/plan/planner.c
@@ -3551,6 +3551,10 @@ reorder_grouping_sets(List *groupingsets, List *sortclause)
 				{
 					previous = lappend_int(previous, ref);
 					new_elems = list_delete_int(new_elems, ref);
+
+					/* finished with this grouping set */
+					if (list_length(new_elems) == 0)
+						break;
 				}
 				else
 				{
diff --git a/src/test/regress/expected/groupingsets.out b/src/test/regress/expected/groupingsets.out
index 381ebce..4e3040e 100644
--- a/src/test/regress/expected/groupingsets.out
+++ b/src/test/regress/expected/groupingsets.out
@@ -929,6 +929,47 @@ select sum(ten) from onek group by rollup(four::text), two order by 1;
  2500
 (6 rows)
 
+-- Grouping sets reorder
+explain (costs off, verbose)
+select * from gstest1 group by grouping sets((a,b,v),(v)) order by v,b,a;
+                                    QUERY PLAN                                    
+----------------------------------------------------------------------------------
+ GroupAggregate
+   Output: "*VALUES*".column1, "*VALUES*".column2, "*VALUES*".column3
+   Group Key: "*VALUES*".column3, "*VALUES*".column2, "*VALUES*".column1
+   Group Key: "*VALUES*".column3
+   ->  Sort
+         Output: "*VALUES*".column1, "*VALUES*".column2, "*VALUES*".column3
+         Sort Key: "*VALUES*".column3, "*VALUES*".column2, "*VALUES*".column1
+         ->  Values Scan on "*VALUES*"
+               Output: "*VALUES*".column1, "*VALUES*".column2, "*VALUES*".column3
+(9 rows)
+
+select * from gstest1 group by grouping sets((a,b,v),(v)) order by v,b,a;
+ a | b | v  
+---+---+----
+ 1 | 1 | 10
+   |   | 10
+ 1 | 1 | 11
+   |   | 11
+ 1 | 2 | 12
+   |   | 12
+ 1 | 2 | 13
+   |   | 13
+ 1 | 3 | 14
+   |   | 14
+ 2 | 3 | 15
+   |   | 15
+ 3 | 3 | 16
+   |   | 16
+ 3 | 4 | 17
+   |   | 17
+ 4 | 1 | 18
+   |   | 18
+ 4 | 1 | 19
+   |   | 19
+(20 rows)
+
 -- hashing support
 set enable_hashagg = true;
 -- failure cases
diff --git a/src/test/regress/sql/groupingsets.sql b/src/test/regress/sql/groupingsets.sql
index 5d64859..88ff042 100644
--- a/src/test/regress/sql/groupingsets.sql
+++ b/src/test/regress/sql/groupingsets.sql
@@ -266,6 +266,12 @@ select array(select row(v.a,s1.*) from (select two,four, count(*) from onek grou
 select sum(ten) from onek group by two, rollup(four::text) order by 1;
 select sum(ten) from onek group by rollup(four::text), two order by 1;
 
+-- Grouping sets reorder
+explain (costs off, verbose)
+select * from gstest1 group by grouping sets((a,b,v),(v)) order by v,b,a;
+
+select * from gstest1 group by grouping sets((a,b,v),(v)) order by v,b,a;
+
 -- hashing support
 
 set enable_hashagg = true;
-- 
2.7.4

#2Andres Freund
andres@anarazel.de
In reply to: Richard Guo (#1)
Re: Fix up grouping sets reorder

Hi,

On 2019-06-17 17:23:11 +0800, Richard Guo wrote:

During the reorder of grouping sets into correct prefix order, if only
one aggregation pass is needed, we follow the order of the ORDER BY
clause to the extent possible, to minimize the chance that we add
unnecessary sorts. This is implemented in preprocess_grouping_sets -->
reorder_grouping_sets.

Thanks for finding!

Andrew, could you take a look at that?

- Andres

#3Andrew Gierth
andrew@tao11.riddles.org.uk
In reply to: Andres Freund (#2)
Re: Fix up grouping sets reorder

"Andres" == Andres Freund <andres@anarazel.de> writes:

During the reorder of grouping sets into correct prefix order, if
only one aggregation pass is needed, we follow the order of the
ORDER BY clause to the extent possible, to minimize the chance that
we add unnecessary sorts. This is implemented in
preprocess_grouping_sets --> reorder_grouping_sets.

Andres> Thanks for finding!

Andres> Andrew, could you take a look at that?

Yes.

--
Andrew (irc:RhodiumToad)

#4Andrew Gierth
andrew@tao11.riddles.org.uk
In reply to: Richard Guo (#1)
Re: Fix up grouping sets reorder

"Richard" == Richard Guo <riguo@pivotal.io> writes:

Richard> Hi all,

Richard> During the reorder of grouping sets into correct prefix order,
Richard> if only one aggregation pass is needed, we follow the order of
Richard> the ORDER BY clause to the extent possible, to minimize the
Richard> chance that we add unnecessary sorts. This is implemented in
Richard> preprocess_grouping_sets --> reorder_grouping_sets.

Richard> However, current codes fail to do that.

You're correct, thanks for the report.

Your fix works, but I prefer to refactor the conditional logic slightly
instead, removing the outer if{}. So I didn't use your exact patch in
the fix I just committed.

--
Andrew (irc:RhodiumToad)