GSets: Fix bug involving GROUPING and HAVING together

Started by Jeevan Chalkeover 10 years ago8 messages
#1Jeevan Chalke
jeevan.chalke@enterprisedb.com
1 attachment(s)

Hi,

I have observed some fishy behavior related to GROUPING in HAVING clause
and when we have only one element in GROUPING SETS.

Basically, when we have only one element in GROUING SETS, we are assuming
it as a simple GROUP BY with one column. Due to which we are ending up with
this error.

If we have ROLLUP/CUBE or GROUPING SETS with multiple elements, then we are
not getting this error.

Here are some examples:

postgres=# select ten, grouping(ten) from onek
postgres-# group by grouping sets(ten) having grouping(ten) >= 0
postgres-# order by 2,1;
ERROR: parent of GROUPING is not Agg node
postgres=# select ten, grouping(ten) from onek
postgres-# group by grouping sets(ten, four) having grouping(ten) > 0
postgres-# order by 2,1;
ten | grouping
-----+----------
(0 rows)fix_bug_related_to_grouping_v1.patch

postgres=# select ten, grouping(ten) from onek
postgres-# group by rollup(ten) having grouping(ten) > 0
postgres-# order by 2,1;
ten | grouping
-----+----------
| 1
(1 row)

postgres=# select ten, grouping(ten) from onek
postgres-# group by cube(ten) having grouping(ten) > 0
postgres-# order by 2,1;
ten | grouping
-----+----------
| 1
(1 row)

I had a look over relevant code and found that contain_agg_clause_walker()
is not considering GroupingFunc as an aggregate node, due to which it is
failing to consider it in a having clause in subquery_planner().

Fix this by adding GroupingFunc node in this walker. We do it correctly in
contain_aggs_of_level_walker() in which we have handling for GroupingFunc
there.

Attached patch to fix this.

The side effect is that, if we have plain group by clause, then too we can
use GROUPING in HAVING clause. But I guess it is fine.

Let me know if I missed anything to consider here.

Thanks

--
Jeevan B Chalke
Principal Software Engineer, Product Development
EnterpriseDB Corporation
The Enterprise PostgreSQL Company

Attachments:

fix_bug_related_to_grouping_v1.patchtext/x-diff; charset=US-ASCII; name=fix_bug_related_to_grouping_v1.patchDownload
diff --git a/src/backend/optimizer/util/clauses.c b/src/backend/optimizer/util/clauses.c
index d40083d..cdb6955 100644
--- a/src/backend/optimizer/util/clauses.c
+++ b/src/backend/optimizer/util/clauses.c
@@ -390,7 +390,7 @@ make_ands_implicit(Expr *clause)
 
 /*
  * contain_agg_clause
- *	  Recursively search for Aggref nodes within a clause.
+ *	  Recursively search for Aggref/GroupingFunc nodes within a clause.
  *
  *	  Returns true if any aggregate found.
  *
@@ -417,6 +417,11 @@ contain_agg_clause_walker(Node *node, void *context)
 		Assert(((Aggref *) node)->agglevelsup == 0);
 		return true;			/* abort the tree traversal and return true */
 	}
+	if (IsA(node, GroupingFunc))
+	{
+		Assert(((GroupingFunc *) node)->agglevelsup == 0);
+		return true;			/* abort the tree traversal and return true */
+	}
 	Assert(!IsA(node, SubLink));
 	return expression_tree_walker(node, contain_agg_clause_walker, context);
 }
diff --git a/src/test/regress/expected/groupingsets.out b/src/test/regress/expected/groupingsets.out
index 842c2ae..adb39b3 100644
--- a/src/test/regress/expected/groupingsets.out
+++ b/src/test/regress/expected/groupingsets.out
@@ -486,6 +486,68 @@ having exists (select 1 from onek b where sum(distinct a.four) = b.four);
    9 |   3
 (25 rows)
 
+-- HAVING with GROUPING queries
+select ten, grouping(ten) from onek
+group by grouping sets(ten) having grouping(ten) >= 0
+order by 2,1;
+ ten | grouping 
+-----+----------
+   0 |        0
+   1 |        0
+   2 |        0
+   3 |        0
+   4 |        0
+   5 |        0
+   6 |        0
+   7 |        0
+   8 |        0
+   9 |        0
+(10 rows)
+
+select ten, grouping(ten) from onek
+group by grouping sets(ten, four) having grouping(ten) > 0
+order by 2,1;
+ ten | grouping 
+-----+----------
+     |        1
+     |        1
+     |        1
+     |        1
+(4 rows)
+
+select ten, grouping(ten) from onek
+group by rollup(ten) having grouping(ten) > 0
+order by 2,1;
+ ten | grouping 
+-----+----------
+     |        1
+(1 row)
+
+select ten, grouping(ten) from onek
+group by cube(ten) having grouping(ten) > 0
+order by 2,1;
+ ten | grouping 
+-----+----------
+     |        1
+(1 row)
+
+select ten, grouping(ten) from onek
+group by (ten) having grouping(ten) >= 0
+order by 2,1;
+ ten | grouping 
+-----+----------
+   0 |        0
+   1 |        0
+   2 |        0
+   3 |        0
+   4 |        0
+   5 |        0
+   6 |        0
+   7 |        0
+   8 |        0
+   9 |        0
+(10 rows)
+
 -- FILTER queries
 select ten, sum(distinct four) filter (where four::text ~ '123') from onek a
 group by rollup(ten);
diff --git a/src/test/regress/sql/groupingsets.sql b/src/test/regress/sql/groupingsets.sql
index 0bffb85..0883afd 100644
--- a/src/test/regress/sql/groupingsets.sql
+++ b/src/test/regress/sql/groupingsets.sql
@@ -154,6 +154,23 @@ select ten, sum(distinct four) from onek a
 group by grouping sets((ten,four),(ten))
 having exists (select 1 from onek b where sum(distinct a.four) = b.four);
 
+-- HAVING with GROUPING queries
+select ten, grouping(ten) from onek
+group by grouping sets(ten) having grouping(ten) >= 0
+order by 2,1;
+select ten, grouping(ten) from onek
+group by grouping sets(ten, four) having grouping(ten) > 0
+order by 2,1;
+select ten, grouping(ten) from onek
+group by rollup(ten) having grouping(ten) > 0
+order by 2,1;
+select ten, grouping(ten) from onek
+group by cube(ten) having grouping(ten) > 0
+order by 2,1;
+select ten, grouping(ten) from onek
+group by (ten) having grouping(ten) >= 0
+order by 2,1;
+
 -- FILTER queries
 select ten, sum(distinct four) filter (where four::text ~ '123') from onek a
 group by rollup(ten);
#2Andrew Gierth
andrew@tao11.riddles.org.uk
In reply to: Jeevan Chalke (#1)
Re: [HACKERS] GSets: Fix bug involving GROUPING and HAVING together

"Jeevan" == Jeevan Chalke <jeevan.chalke@enterprisedb.com> writes:

Jeevan> Basically, when we have only one element in GROUING SETS, we
Jeevan> are assuming it as a simple GROUP BY with one column. Due to
Jeevan> which we are ending up with this error.

Jeevan> If we have ROLLUP/CUBE or GROUPING SETS with multiple elements,
Jeevan> then we are not getting this error.

There's two issues here. One you correctly identified, which is that
contain_agg_clause() should be true for GroupingFuncs too.

The other is that in subquery_planner, the optimization of converting
HAVING clauses to WHERE clauses is suppressed if parse->groupingSets
isn't empty. (It is empty if there's either no group by clause at all,
or if there's exactly one grouping set, which must not be empty, however
derived.) This is costing us some optimizations, especially in the case
of an explicit GROUP BY () clause; I'll look into this.

In the meantime your patch looks OK (and necessary) to me.

Jeevan> The side effect is that, if we have plain group by clause, then
Jeevan> too we can use GROUPING in HAVING clause. But I guess it is
Jeevan> fine.

GROUPING is, per spec, explicitly allowed anywhere you could have an
aggregate call, whether the group by clause is plain or not.

--
Andrew (irc:RhodiumToad)

--
Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-bugs

#3Jeevan Chalke
jeevan.chalke@enterprisedb.com
In reply to: Andrew Gierth (#2)
Re: GSets: Fix bug involving GROUPING and HAVING together

On Tue, Jul 14, 2015 at 4:23 PM, Andrew Gierth <andrew@tao11.riddles.org.uk>
wrote:

"Jeevan" == Jeevan Chalke <jeevan.chalke@enterprisedb.com> writes:

Jeevan> Basically, when we have only one element in GROUING SETS, we
Jeevan> are assuming it as a simple GROUP BY with one column. Due to
Jeevan> which we are ending up with this error.

Jeevan> If we have ROLLUP/CUBE or GROUPING SETS with multiple elements,
Jeevan> then we are not getting this error.

There's two issues here. One you correctly identified, which is that
contain_agg_clause() should be true for GroupingFuncs too.

The other is that in subquery_planner, the optimization of converting
HAVING clauses to WHERE clauses is suppressed if parse->groupingSets
isn't empty. (It is empty if there's either no group by clause at all,
or if there's exactly one grouping set, which must not be empty, however
derived.) This is costing us some optimizations, especially in the case
of an explicit GROUP BY () clause;

I have observed that. But I thought that it is indeed intentional.
As we don't want to choose code path for GSets when we have
only one column which is converted to plain group by and
thus setting parse->groupingSets to FALSE.

I'll look into this.

OK. Thanks

In the meantime your patch looks OK (and necessary) to me.

Jeevan> The side effect is that, if we have plain group by clause, then
Jeevan> too we can use GROUPING in HAVING clause. But I guess it is
Jeevan> fine.

GROUPING is, per spec, explicitly allowed anywhere you could have an
aggregate call, whether the group by clause is plain or not.

--
Andrew (irc:RhodiumToad)

--
Jeevan B Chalke
Principal Software Engineer, Product Development
EnterpriseDB Corporation
The Enterprise PostgreSQL Company

#4Andrew Gierth
andrew@tao11.riddles.org.uk
In reply to: Andrew Gierth (#2)
1 attachment(s)
Re: [HACKERS] GSets: Fix bug involving GROUPING and HAVING together

"Andrew" == Andrew Gierth <andrew@tao11.riddles.org.uk> writes:

Andrew> The other is that in subquery_planner, the optimization of
Andrew> converting HAVING clauses to WHERE clauses is suppressed if
Andrew> parse->groupingSets isn't empty. (It is empty if there's either
Andrew> no group by clause at all, or if there's exactly one grouping
Andrew> set, which must not be empty, however derived.) This is costing
Andrew> us some optimizations, especially in the case of an explicit
Andrew> GROUP BY () clause; I'll look into this.

I'm inclined to go with the simplest approach here, which copies the
quals if there are grouping sets. The only way we could safely move a
qual without copying is if we can show that none of the grouping sets is
empty, that is (), and at this point the grouping set list has not been
expanded so it's not trivial to determine that.

Patch attached.

--
Andrew (irc:RhodiumToad)

Attachments:

gsets_having_pushdown.patchtext/x-patchDownload
diff --git a/src/backend/optimizer/plan/planner.c b/src/backend/optimizer/plan/planner.c
index a6ce96e..2866176 100644
--- a/src/backend/optimizer/plan/planner.c
+++ b/src/backend/optimizer/plan/planner.c
@@ -574,13 +574,12 @@ subquery_planner(PlannerGlobal *glob, Query *parse,
 
 		if (contain_agg_clause(havingclause) ||
 			contain_volatile_functions(havingclause) ||
-			contain_subplans(havingclause) ||
-			parse->groupingSets)
+			contain_subplans(havingclause))
 		{
 			/* keep it in HAVING */
 			newHaving = lappend(newHaving, havingclause);
 		}
-		else if (parse->groupClause)
+		else if (parse->groupClause && !parse->groupingSets)
 		{
 			/* move it to WHERE */
 			parse->jointree->quals = (Node *)
#5Jeevan Chalke
jeevan.chalke@enterprisedb.com
In reply to: Andrew Gierth (#4)
Re: [HACKERS] GSets: Fix bug involving GROUPING and HAVING together

Hi,

This will fail too.

Note that, when we have only one element in GROUPING SETS,
we add that in group by list and set parse->groupingSets to NULL.

And hence it will have same issue.

However tests added in my patch failing too.

Thanks
--
Jeevan B Chalke
Principal Software Engineer, Product Development
EnterpriseDB Corporation
The Enterprise PostgreSQL Company

#6Andrew Gierth
andrew@tao11.riddles.org.uk
In reply to: Jeevan Chalke (#5)
Re: [HACKERS] GSets: Fix bug involving GROUPING and HAVING together

"Jeevan" == Jeevan Chalke <jeevan.chalke@enterprisedb.com> writes:

Jeevan> Hi,
Jeevan> This will fail too.

This is in addition to your patch, not instead of it.

--
Andrew (irc:RhodiumToad)

--
Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-bugs

#7Andres Freund
andres@anarazel.de
In reply to: Jeevan Chalke (#1)
Re: [HACKERS] GSets: Fix bug involving GROUPING and HAVING together

On 2015-07-14 14:51:09 +0530, Jeevan Chalke wrote:

Fix this by adding GroupingFunc node in this walker. We do it correctly in
contain_aggs_of_level_walker() in which we have handling for GroupingFunc
there.

Attached patch to fix this.

Pushed, thanks for fix!

--
Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-bugs

#8Andres Freund
andres@anarazel.de
In reply to: Andrew Gierth (#4)
Re: GSets: Fix bug involving GROUPING and HAVING together

On 2015-07-24 11:34:22 +0100, Andrew Gierth wrote:

"Andrew" == Andrew Gierth <andrew@tao11.riddles.org.uk> writes:

Andrew> The other is that in subquery_planner, the optimization of
Andrew> converting HAVING clauses to WHERE clauses is suppressed if
Andrew> parse->groupingSets isn't empty. (It is empty if there's either
Andrew> no group by clause at all, or if there's exactly one grouping
Andrew> set, which must not be empty, however derived.) This is costing
Andrew> us some optimizations, especially in the case of an explicit
Andrew> GROUP BY () clause; I'll look into this.

I'm inclined to go with the simplest approach here, which copies the
quals if there are grouping sets. The only way we could safely move a
qual without copying is if we can show that none of the grouping sets is
empty, that is (), and at this point the grouping set list has not been
expanded so it's not trivial to determine that.

I pushed this to both master and 9.5. While this isn't strictly a
bugfix, it seemed better to keep the branches the same at this point.

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers