Inconsistent Behavior of GROUP BY ROLLUP in v17 vs master

Started by 邱宇航4 months ago29 messages
#1邱宇航
iamqyh@gmail.com

I've noticed that two GROUP BY ROLLUP queries behave differently in v17
compared to master and REL_18_STABLE. The issue can be reproduced by
following SQL:

``` SQL
CREATE TABLE t(id int);

INSERT INTO t SELECT generate_series(1, 3);

-- Query 1
SELECT DISTINCT 'XXX'
FROM t
GROUP BY ROLLUP (id, 1);

-- Query 2
SELECT 'XXX'
FROM t
GROUP BY ROLLUP(id)
HAVING NOT (NULL IS NULL);
```

After some git bisect work, I traced the root cause:
- The first issue was introduced by commit f5050f79 (Mark expressions
nullable by grouping sets).
- The second issue stems from commit 67a54b9e (Allow pushdown of HAVING
clauses with grouping sets).

Best regards,
Yuhang Qiu

#2David Rowley
dgrowleyml@gmail.com
In reply to: 邱宇航 (#1)
Re: Inconsistent Behavior of GROUP BY ROLLUP in v17 vs master

On Tue, 23 Sept 2025 at 15:49, 邱宇航 <iamqyh@gmail.com> wrote:

I've noticed that two GROUP BY ROLLUP queries behave differently in v17
compared to master and REL_18_STABLE. The issue can be reproduced by
following SQL:

After some git bisect work, I traced the root cause:
- The first issue was introduced by commit f5050f79 (Mark expressions
nullable by grouping sets).
- The second issue stems from commit 67a54b9e (Allow pushdown of HAVING
clauses with grouping sets).

If you check the release notes and the commit message for f5050f795
you'll see that it does mention that wrong results could be returned.

What wasn't mentioned was that this wasn't fixed in prior versions.
The reason being is that the fix required changing the query tree
representation, which we can't change in the back branches due to
incompatibility with stored rules in existing databases. So, a change
in query results for certain queries here is expected.

David

#3Bruce Momjian
bruce@momjian.us
In reply to: David Rowley (#2)
Re: Inconsistent Behavior of GROUP BY ROLLUP in v17 vs master

On Tue, Sep 23, 2025 at 07:38:14PM +1200, David Rowley wrote:

On Tue, 23 Sept 2025 at 15:49, 邱宇航 <iamqyh@gmail.com> wrote:

I've noticed that two GROUP BY ROLLUP queries behave differently in v17
compared to master and REL_18_STABLE. The issue can be reproduced by
following SQL:

After some git bisect work, I traced the root cause:
- The first issue was introduced by commit f5050f79 (Mark expressions
nullable by grouping sets).
- The second issue stems from commit 67a54b9e (Allow pushdown of HAVING
clauses with grouping sets).

If you check the release notes and the commit message for f5050f795
you'll see that it does mention that wrong results could be returned.

What wasn't mentioned was that this wasn't fixed in prior versions.
The reason being is that the fix required changing the query tree
representation, which we can't change in the back branches due to
incompatibility with stored rules in existing databases. So, a change
in query results for certain queries here is expected.

Uh, by design, items mentioned in the major release notes have _not_
been fixed in previous minor versions. Not sure if we can make that
clearer to users. I did write a blog about this:

https://momjian.us/main/blogs/pgblog/2022.html#June_13_2022

--
Bruce Momjian <bruce@momjian.us> https://momjian.us
EDB https://enterprisedb.com

Do not let urgent matters crowd out time for investment in the future.

#4邱宇航
iamqyh@gmail.com
In reply to: David Rowley (#2)
Re: Inconsistent Behavior of GROUP BY ROLLUP in v17 vs master

2025年9月23日 15:38,David Rowley <dgrowleyml@gmail.com> 写道:

If you check the release notes and the commit message for f5050f795
you'll see that it does mention that wrong results could be returned.

What wasn't mentioned was that this wasn't fixed in prior versions.
The reason being is that the fix required changing the query tree
representation, which we can't change in the back branches due to
incompatibility with stored rules in existing databases. So, a change
in query results for certain queries here is expected.

Yeah, I got it. Thanks David and Bruce.

And what about the query 2. This is caused by another commit, and
it's not mentioned in the commit message or the mailing discussion.

Best regards,
Yuhang Qiu

#5Tom Lane
tgl@sss.pgh.pa.us
In reply to: 邱宇航 (#4)
Re: Inconsistent Behavior of GROUP BY ROLLUP in v17 vs master

=?utf-8?B?6YKx5a6H6Iiq?= <iamqyh@gmail.com> writes:

And what about the query 2. This is caused by another commit, and
it's not mentioned in the commit message or the mailing discussion.

That one indeed seems quite broken. EXPLAIN confirms that it's
pushing the HAVING below the aggregation, which is simply wrong
because it fails to filter the all-null row(s) that the aggregation
node will create out of thin air.

Is there anything we can salvage from 67a54b9e, or should
we just revert it?

regards, tom lane

#6David Rowley
dgrowleyml@gmail.com
In reply to: Tom Lane (#5)
Re: Inconsistent Behavior of GROUP BY ROLLUP in v17 vs master

On Wed, 24 Sept 2025 at 02:39, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Is there anything we can salvage from 67a54b9e, or should
we just revert it?

It doesn't seem great that we need to reconsider the safety of that
optimisation post-release. It's not as if 67a54b9e added several cases
to test for and got one of them wrong. It's a case of the 1 case it
did add wasn't fully thought through.

As for fixing it; testing for a Const-false havingClause can't be done
as that only works for this case which const-folding happens to figure
out during planning. You could equally have something Var-less like:

create or replace function one() returns int as $$ begin return 1;
end; $$ stable language plpgsql;
SELECT 'XXX',count(*) FROM t GROUP BY ROLLUP(id) HAVING NOT one()=1;

which isn't known at plan-time. For me, I can't see how to make this
safe and I suspect due to your above question that you're in a similar
situation. Reverting and reconsidering for v19 seems like the safest
option.

Let's see what Richard thinks.

David

#7Tom Lane
tgl@sss.pgh.pa.us
In reply to: David Rowley (#6)
Re: Inconsistent Behavior of GROUP BY ROLLUP in v17 vs master

David Rowley <dgrowleyml@gmail.com> writes:

As for fixing it; testing for a Const-false havingClause can't be done
as that only works for this case which const-folding happens to figure
out during planning. You could equally have something Var-less like:

Yeah, I don't think the havingClause being constant-false is the
key point here. I've not looked closely at 67a54b9e, but I suspect
that anything Var-free is potentially an issue. Or it might even
fail for something that has Vars but is non-strict.

The core of the problem though is that the aggregation node will
issue an all-nulls row that did not come from its input, and we
have to be sure that the HAVING clause is properly evaluated
for that row.

regards, tom lane

#8Richard Guo
guofenglinux@gmail.com
In reply to: Tom Lane (#7)
Re: Inconsistent Behavior of GROUP BY ROLLUP in v17 vs master

On Wed, Sep 24, 2025 at 10:23 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:

Yeah, I don't think the havingClause being constant-false is the
key point here. I've not looked closely at 67a54b9e, but I suspect
that anything Var-free is potentially an issue. Or it might even
fail for something that has Vars but is non-strict.

The core of the problem though is that the aggregation node will
issue an all-nulls row that did not come from its input, and we
have to be sure that the HAVING clause is properly evaluated
for that row.

I think the issue is that an empty grouping set can produce a row out
of nowhere, and pushing down HAVING clauses in this case may cause us
to fail to filter out that row.

Currently, non-variable-free HAVING clauses aren't pushed down when
there is an empty grouping set, because the empty grouping set would
nullify the vars referenced in those clauses, and we have logic in
place to prevent their pushdown.

However, we (I) overlooked variable-free HAVING clauses. If there are
only empty grouping sets, this is not a problem since a copy of the
HAVING clauses is retained in HAVING. The issue arises when there are
both nonempty and empty grouping sets.

In summary, the problem occurs when both of the following conditions
are met: 1) there are both nonempty and empty grouping sets, 2) there
are variable-free HAVING clauses.

I think the following change fixes this problem.

foreach(l, (List *) parse->havingQual)
{
Node *havingclause = (Node *) lfirst(l);
+ Relids having_relids;

        if (contain_agg_clause(havingclause) ||
            contain_volatile_functions(havingclause) ||
            contain_subplans(havingclause) ||
            (parse->groupClause && parse->groupingSets &&
-            bms_is_member(root->group_rtindex, pull_varnos(root,
havingclause))))
+            ((having_relids = pull_varnos(root, havingclause)) == NULL ||
+            bms_is_member(root->group_rtindex, having_relids))))
        {
            /* keep it in HAVING */
            newHaving = lappend(newHaving, havingclause);

This change essentially prevents variable-free HAVING clauses from
being pushed down when there are any nonempty grouping sets. Of
course, this could be done more precisely -- for example, we could
restrict the prevention only to cases where empty grouping sets are
also present, but I'm not sure it's worth the effort.

- Richard

#9Richard Guo
guofenglinux@gmail.com
In reply to: Richard Guo (#8)
1 attachment(s)
Re: Inconsistent Behavior of GROUP BY ROLLUP in v17 vs master

On Wed, Sep 24, 2025 at 12:10 PM Richard Guo <guofenglinux@gmail.com> wrote:

In summary, the problem occurs when both of the following conditions
are met: 1) there are both nonempty and empty grouping sets, 2) there
are variable-free HAVING clauses.

I think the following change fixes this problem.

foreach(l, (List *) parse->havingQual)
{
Node *havingclause = (Node *) lfirst(l);
+ Relids having_relids;

if (contain_agg_clause(havingclause) ||
contain_volatile_functions(havingclause) ||
contain_subplans(havingclause) ||
(parse->groupClause && parse->groupingSets &&
-            bms_is_member(root->group_rtindex, pull_varnos(root,
havingclause))))
+            ((having_relids = pull_varnos(root, havingclause)) == NULL ||
+            bms_is_member(root->group_rtindex, having_relids))))
{
/* keep it in HAVING */
newHaving = lappend(newHaving, havingclause);

This change essentially prevents variable-free HAVING clauses from
being pushed down when there are any nonempty grouping sets. Of
course, this could be done more precisely -- for example, we could
restrict the prevention only to cases where empty grouping sets are
also present, but I'm not sure it's worth the effort.

Here is the patch.

- Richard

Attachments:

v1-0001-Fix-pushdown-of-degenerate-HAVING-clauses.patchapplication/octet-stream; name=v1-0001-Fix-pushdown-of-degenerate-HAVING-clauses.patchDownload
From 6dcaed11d7521f22ec924b342c5668a2a7fb855a Mon Sep 17 00:00:00 2001
From: Richard Guo <guofenglinux@gmail.com>
Date: Wed, 24 Sep 2025 16:49:24 +0900
Subject: [PATCH v1] Fix pushdown of degenerate HAVING clauses

Commit 67a54b9e8 taught the planner to push down HAVING clauses even
when nonempty grouping sets are present, as long as the clause does
not reference any columns that are nullable by the grouping sets.
However, there was an oversight: if any empty grouping sets are also
present, the aggregation node can produce a row that did not come from
the input, and pushing down a HAVING clause in this case may cause us
to fail to filter out that row.

Currently, non-degenerate HAVING clauses are not pushed down when
empty grouping sets are present, since the grouping set would nullify
the vars they reference.  However, degenerate (variable-free) HAVING
clauses are not subject to this restriction and may be incorrectly
pushed down.

To fix, the patch adds a check that prevents degenerate HAVING clauses
from being pushed down when any nonempty grouping sets are present.
This could be done more precisely, for example by restricting the
prevention only to cases where empty grouping sets are also present,
but the added complexity does not seem justified.

Back-patch to v18 where this issue was introduced.
---
 src/backend/optimizer/plan/planner.c       | 13 +++++---
 src/test/regress/expected/groupingsets.out | 39 +++++++++++++++++++++-
 src/test/regress/sql/groupingsets.sql      | 12 ++++++-
 3 files changed, 57 insertions(+), 7 deletions(-)

diff --git a/src/backend/optimizer/plan/planner.c b/src/backend/optimizer/plan/planner.c
index 41bd8353430..8d331aa34bd 100644
--- a/src/backend/optimizer/plan/planner.c
+++ b/src/backend/optimizer/plan/planner.c
@@ -1097,10 +1097,11 @@ subquery_planner(PlannerGlobal *glob, Query *parse, PlannerInfo *parent_root,
 	 * cannot do so if the HAVING clause contains aggregates (obviously) or
 	 * volatile functions (since a HAVING clause is supposed to be executed
 	 * only once per group).  We also can't do this if there are any nonempty
-	 * grouping sets and the clause references any columns that are nullable
-	 * by the grouping sets; moving such a clause into WHERE would potentially
-	 * change the results.  (If there are only empty grouping sets, then the
-	 * HAVING clause must be degenerate as discussed below.)
+	 * grouping sets and the clause either references any columns that are
+	 * nullable by the grouping sets or is degenerate; moving such a clause
+	 * into WHERE would potentially change the results.  (If there are only
+	 * empty grouping sets, then the HAVING clause must be degenerate as
+	 * discussed below.)
 	 *
 	 * Also, it may be that the clause is so expensive to execute that we're
 	 * better off doing it only once per group, despite the loss of
@@ -1137,12 +1138,14 @@ subquery_planner(PlannerGlobal *glob, Query *parse, PlannerInfo *parent_root,
 	foreach(l, (List *) parse->havingQual)
 	{
 		Node	   *havingclause = (Node *) lfirst(l);
+		Relids		having_relids;
 
 		if (contain_agg_clause(havingclause) ||
 			contain_volatile_functions(havingclause) ||
 			contain_subplans(havingclause) ||
 			(parse->groupClause && parse->groupingSets &&
-			 bms_is_member(root->group_rtindex, pull_varnos(root, havingclause))))
+			 ((having_relids = pull_varnos(root, havingclause)) == NULL ||
+			  bms_is_member(root->group_rtindex, having_relids))))
 		{
 			/* keep it in HAVING */
 			newHaving = lappend(newHaving, havingclause);
diff --git a/src/test/regress/expected/groupingsets.out b/src/test/regress/expected/groupingsets.out
index 210bbe307a7..7a8533228d6 100644
--- a/src/test/regress/expected/groupingsets.out
+++ b/src/test/regress/expected/groupingsets.out
@@ -890,7 +890,8 @@ explain (costs off)
                        ->  Seq Scan on gstest2
 (10 rows)
 
--- test pushdown of HAVING clause that does not reference any columns that are nullable by grouping sets
+-- test pushdown of non-degenerate HAVING clause that does not reference any
+-- columns that are nullable by grouping sets
 explain (costs off)
 select a, b, count(*) from gstest2 group by grouping sets ((a, b), (a)) having a > 1 and b > 1;
            QUERY PLAN            
@@ -911,6 +912,42 @@ select a, b, count(*) from gstest2 group by grouping sets ((a, b), (a)) having a
  2 | 2 |     1
 (1 row)
 
+-- test pushdown of degenerate HAVING clause
+explain (costs off)
+select count(*) from gstest2 group by grouping sets (()) having false;
+            QUERY PLAN             
+-----------------------------------
+ Aggregate
+   Group Key: ()
+   Filter: false
+   ->  Result
+         Replaces: Scan on gstest2
+         One-Time Filter: false
+(6 rows)
+
+select count(*) from gstest2 group by grouping sets (()) having false;
+ count 
+-------
+(0 rows)
+
+explain (costs off)
+select a, count(*) from gstest2 group by grouping sets ((a), ()) having false;
+           QUERY PLAN            
+---------------------------------
+ GroupAggregate
+   Group Key: a
+   Group Key: ()
+   Filter: false
+   ->  Sort
+         Sort Key: a
+         ->  Seq Scan on gstest2
+(7 rows)
+
+select a, count(*) from gstest2 group by grouping sets ((a), ()) having false;
+ a | count 
+---+-------
+(0 rows)
+
 -- HAVING with GROUPING queries
 select ten, grouping(ten) from onek
 group by grouping sets(ten) having grouping(ten) >= 0
diff --git a/src/test/regress/sql/groupingsets.sql b/src/test/regress/sql/groupingsets.sql
index 38d3cdd0fd8..dbacc2ffdce 100644
--- a/src/test/regress/sql/groupingsets.sql
+++ b/src/test/regress/sql/groupingsets.sql
@@ -290,11 +290,21 @@ explain (costs off)
   select v.c, (select count(*) from gstest2 group by () having v.c)
     from (values (false),(true)) v(c) order by v.c;
 
--- test pushdown of HAVING clause that does not reference any columns that are nullable by grouping sets
+-- test pushdown of non-degenerate HAVING clause that does not reference any
+-- columns that are nullable by grouping sets
 explain (costs off)
 select a, b, count(*) from gstest2 group by grouping sets ((a, b), (a)) having a > 1 and b > 1;
 select a, b, count(*) from gstest2 group by grouping sets ((a, b), (a)) having a > 1 and b > 1;
 
+-- test pushdown of degenerate HAVING clause
+explain (costs off)
+select count(*) from gstest2 group by grouping sets (()) having false;
+select count(*) from gstest2 group by grouping sets (()) having false;
+
+explain (costs off)
+select a, count(*) from gstest2 group by grouping sets ((a), ()) having false;
+select a, count(*) from gstest2 group by grouping sets ((a), ()) having false;
+
 -- HAVING with GROUPING queries
 select ten, grouping(ten) from onek
 group by grouping sets(ten) having grouping(ten) >= 0
-- 
2.39.5 (Apple Git-154)

#10wenhui qiu
qiuwenhuifx@gmail.com
In reply to: Richard Guo (#9)
Re: Inconsistent Behavior of GROUP BY ROLLUP in v17 vs master

Hi Richard

I think the following change fixes this problem.

foreach(l, (List *) parse->havingQual)
{
Node *havingclause = (Node *) lfirst(l);
+ Relids having_relids;

if (contain_agg_clause(havingclause) ||
contain_volatile_functions(havingclause) ||
contain_subplans(havingclause) ||
(parse->groupClause && parse->groupingSets &&
-            bms_is_member(root->group_rtindex, pull_varnos(root,
havingclause))))
+            ((having_relids = pull_varnos(root, havingclause)) == NULL ||
+            bms_is_member(root->group_rtindex, having_relids))))
{
/* keep it in HAVING */
newHaving = lappend(newHaving, havingclause);

This change essentially prevents variable-free HAVING clauses from
being pushed down when there are any nonempty grouping sets. Of
course, this could be done more precisely -- for example, we could
restrict the prevention only to cases where empty grouping sets are
also present, but I'm not sure it's worth the effort.

I think it makes sense. However, it is now too close to the release date of
v18.Awaiting Tom's feedback?

Thanks

On Wed, Sep 24, 2025 at 4:18 PM Richard Guo <guofenglinux@gmail.com> wrote:

Show quoted text

On Wed, Sep 24, 2025 at 12:10 PM Richard Guo <guofenglinux@gmail.com>
wrote:

In summary, the problem occurs when both of the following conditions
are met: 1) there are both nonempty and empty grouping sets, 2) there
are variable-free HAVING clauses.

I think the following change fixes this problem.

foreach(l, (List *) parse->havingQual)
{
Node *havingclause = (Node *) lfirst(l);
+ Relids having_relids;

if (contain_agg_clause(havingclause) ||
contain_volatile_functions(havingclause) ||
contain_subplans(havingclause) ||
(parse->groupClause && parse->groupingSets &&
-            bms_is_member(root->group_rtindex, pull_varnos(root,
havingclause))))
+            ((having_relids = pull_varnos(root, havingclause)) == NULL

||

+ bms_is_member(root->group_rtindex, having_relids))))
{
/* keep it in HAVING */
newHaving = lappend(newHaving, havingclause);

This change essentially prevents variable-free HAVING clauses from
being pushed down when there are any nonempty grouping sets. Of
course, this could be done more precisely -- for example, we could
restrict the prevention only to cases where empty grouping sets are
also present, but I'm not sure it's worth the effort.

Here is the patch.

- Richard

#11Richard Guo
guofenglinux@gmail.com
In reply to: Richard Guo (#9)
Re: Inconsistent Behavior of GROUP BY ROLLUP in v17 vs master

On Wed, Sep 24, 2025 at 5:18 PM Richard Guo <guofenglinux@gmail.com> wrote:

Here is the patch.

I plan to push this patch soon, unless there are any objections.

- Richard

#12David Rowley
dgrowleyml@gmail.com
In reply to: Richard Guo (#11)
Re: Inconsistent Behavior of GROUP BY ROLLUP in v17 vs master

On Thu, 25 Sept 2025 at 13:01, Richard Guo <guofenglinux@gmail.com> wrote:

I plan to push this patch soon, unless there are any objections.

What's your confidence levels on the logic now being correct? 100%?
90%? Hopeful?

I've not personally had the time to process the patch and figure out
that this is now safe. It sounds like you have, but (with respect) I
expect you also thought that before the initial commit too. I realise
that you now have more of the picture, but how do we know the picture
is now complete? I think we all know this stuff is hard and we also
know that even the most seasoned of planner hackers don't always get
it right first time.

What I'm now wondering is the risk to reward ratio of fixing this in
18.1. If it happens that it's still not right for 18.1, then we need
to wait until 18.2, which is currently due Feb-26. I don't quite have
the same confidence levels as you do, but maybe I would if I took the
time to more carefully think about this. For now, my thoughts are
that it's safer to revert and try again for v19. Probably it would
make more sense to try harder for an 18.1 fix if this optimisation was
a more critical and people had been waiting on it and there was no
other way to obtain the benefits of it. But that's not quite the case
here as, in theory, someone could rewrite their query if it's safe for
their use case and end up with the same performance benefits.

Just so it's clear, I'm not objecting to fixing for 18.1. I just want
to ensure your judgment is for the project and not for
self-preservation. I think most people will respect the decision if
you decide that you need more time to consider this and opt to revert
in v18 and fix only in master. Based on my current understanding of
the optimisation, I'd certainly be more at ease with that.

On the other hand, if you're 100% confident, I'll step aside.

David

#13Tom Lane
tgl@sss.pgh.pa.us
In reply to: David Rowley (#12)
Re: Inconsistent Behavior of GROUP BY ROLLUP in v17 vs master

David Rowley <dgrowleyml@gmail.com> writes:

On Thu, 25 Sept 2025 at 13:01, Richard Guo <guofenglinux@gmail.com> wrote:

I plan to push this patch soon, unless there are any objections.

What's your confidence levels on the logic now being correct? 100%?
90%? Hopeful?

FWIW, my confidence in it is rather low. I've not had time to think
this through carefully, but it seems to me that the test ought to
involve whether there is an empty grouping set, yet the proposed
patch does no such thing --- or at least, if it manages to achieve
that effect, it's not obvious how.

18.1 will not be coming out till November, so I feel no need to
rush to judgment on what to do here.

regards, tom lane

#14Richard Guo
guofenglinux@gmail.com
In reply to: David Rowley (#12)
Re: Inconsistent Behavior of GROUP BY ROLLUP in v17 vs master

On Thu, Sep 25, 2025 at 11:05 AM David Rowley <dgrowleyml@gmail.com> wrote:

What's your confidence levels on the logic now being correct? 100%?
90%? Hopeful?

I've not personally had the time to process the patch and figure out
that this is now safe. It sounds like you have, but (with respect) I
expect you also thought that before the initial commit too. I realise
that you now have more of the picture, but how do we know the picture
is now complete? I think we all know this stuff is hard and we also
know that even the most seasoned of planner hackers don't always get
it right first time.

What I'm now wondering is the risk to reward ratio of fixing this in
18.1. If it happens that it's still not right for 18.1, then we need
to wait until 18.2, which is currently due Feb-26. I don't quite have
the same confidence levels as you do, but maybe I would if I took the
time to more carefully think about this. For now, my thoughts are
that it's safer to revert and try again for v19. Probably it would
make more sense to try harder for an 18.1 fix if this optimisation was
a more critical and people had been waiting on it and there was no
other way to obtain the benefits of it. But that's not quite the case
here as, in theory, someone could rewrite their query if it's safe for
their use case and end up with the same performance benefits.

Just so it's clear, I'm not objecting to fixing for 18.1. I just want
to ensure your judgment is for the project and not for
self-preservation. I think most people will respect the decision if
you decide that you need more time to consider this and opt to revert
in v18 and fix only in master. Based on my current understanding of
the optimisation, I'd certainly be more at ease with that.

On the other hand, if you're 100% confident, I'll step aside.

Thanks for the input; that's a reasonable concern. Although I've
reviewed my analysis again and didn't find any new issues, I don't
think I would claim to be 100% confident that the current logic is
bug-free. Realistically, I doubt anyone would make such a claim.

I'm completely open to reverting this and revisiting it for v19.
However, I'd really appreciate any reviews that point out specific
issues in the current logic. I've shared my analysis in my initial
email as well as in the commit message of the proposed patch. If
anyone can highlight parts that don't make sense, we can discuss them
and update the patch accordingly. Without that kind of feedback, I'm
concerned that we may just end up repeating the same code in v19.

- Richard

#15Richard Guo
guofenglinux@gmail.com
In reply to: Tom Lane (#13)
Re: Inconsistent Behavior of GROUP BY ROLLUP in v17 vs master

On Thu, Sep 25, 2025 at 12:24 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:

FWIW, my confidence in it is rather low. I've not had time to think
this through carefully, but it seems to me that the test ought to
involve whether there is an empty grouping set, yet the proposed
patch does no such thing --- or at least, if it manages to achieve
that effect, it's not obvious how.

I explained the test for empty grouping sets in my initial email as
well as in the commit message of the proposed patch. Apparently, I
failed to make myself clear, so here's another attempt.

To be clear, we are only discussing HAVING clauses that do not contain
aggregates, since clauses with aggregates wouldn't be pushed down
anyway.

First, let's consider the presence of empty grouping sets. There are
two cases: either the query contains only empty grouping sets, or it
contains both non-empty and empty grouping sets.

In the former case (only empty grouping sets), the HAVING clauses must
be degenerate (variable-free). These are placed in the WHERE clause
and also retained in HAVING. This follows the same logic as prior to
commit 67a54b9e8.

In the latter case (both non-empty and empty grouping sets), for
non-degenerate HAVING clauses, the empty grouping sets would nullify
the vars referenced in the clauses, so they would not be pushed
down. This is ensured by the check bms_is_member(root->group_rtindex,
having_relids). For degenerate HAVING clauses, they are also
not pushed down, guaranteed by the new check added in the proposed
patch: having_relids == NULL.

Next, suppose there are no empty grouping sets in the query. For
non-degenerate HAVING clauses, only those clauses that do not
reference columns nullable by the grouping sets are pushed down, which
is guaranteed by the same check: bms_is_member(root->group_rtindex,
having_relids). For degenerate HAVING clauses, the check
having_relids == NULL prevents pushing them down, though it could be
argued that in this case, they could be pushed down. This is what I
meant by (quoting from the commit message):

"
This could be done more precisely, for example by restricting the
prevention only to cases where empty grouping sets are also present,
but the added complexity does not seem justified.
"

In summary, under the current logic, HAVING clauses in queries with
grouping sets are pushed down only in the following two cases:

1) There are only empty grouping sets.

In this case, the HAVING clauses are pushed down, but they are also
retained in HAVING.

2) There are non-empty grouping sets and the HAVING clauses are
non-degenerate and they do not reference any columns nullable by the
grouping sets.

I believe both of the above cases are safe for pushing down HAVING
clauses. It could be argued that there may be more cases where we can
push down HAVING clauses with grouping sets, but for now, I'd prefer
to focusing on ensuring the current logic is safe.

Does this analysis make sense to you? Am I missing anything?

18.1 will not be coming out till November, so I feel no need to
rush to judgment on what to do here.

Thanks. I'll wait for any feedback on the patch itself before
deciding how to proceed. I'd appreciate it if a review could be done
before November.

- Richard

#16Richard Guo
guofenglinux@gmail.com
In reply to: Richard Guo (#15)
Re: Inconsistent Behavior of GROUP BY ROLLUP in v17 vs master

On Thu, Sep 25, 2025 at 5:27 PM Richard Guo <guofenglinux@gmail.com> wrote:

On Thu, Sep 25, 2025 at 12:24 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:

18.1 will not be coming out till November, so I feel no need to
rush to judgment on what to do here.

Thanks. I'll wait for any feedback on the patch itself before
deciding how to proceed. I'd appreciate it if a review could be done
before November.

Having heard nothing but crickets and not wanting to leave this until
the 11th hour before November, I'll plan to push the v1 patch next
week, unless there are any objections.

- Richard

#17Tom Lane
tgl@sss.pgh.pa.us
In reply to: Richard Guo (#16)
Re: Inconsistent Behavior of GROUP BY ROLLUP in v17 vs master

Richard Guo <guofenglinux@gmail.com> writes:

Having heard nothing but crickets and not wanting to leave this until
the 11th hour before November, I'll plan to push the v1 patch next
week, unless there are any objections.

I started to look at this again, and now I'm thinking that there is
indeed an issue related to "Query 1". Recall that the test setup is

CREATE TABLE t(id int);
INSERT INTO t SELECT generate_series(1, 3);

If we do

regression=# SELECT id, 'XXX' FROM t GROUP BY ROLLUP (id, 1);

we get this, which seems correct:

id | ?column?
----+----------
| XXX
3 | XXX
2 | XXX
1 | XXX
3 | XXX
2 | XXX
1 | XXX
(7 rows)

But leave out the "id" output, and look what happens:

regression=# SELECT 'XXX' FROM t GROUP BY ROLLUP (id, 1);
?column?
----------

XXX
XXX
XXX

(7 rows)

How can that be correct?? Simplifying further to

regression=# SELECT 'XXX' FROM t GROUP BY ROLLUP (id);
?column?
----------
XXX
XXX
XXX
XXX
(4 rows)

restores sanity. I've not dug into the code, but these two examples
make it look like we think that 'XXX' is dependent on '1', which
surely it is not, most especially since it shouldn't vary depending
on whether "id" is included as an output column.

This behavior is the same as in v17, but that doesn't make it not
broken.

regards, tom lane

#18Tom Lane
tgl@sss.pgh.pa.us
In reply to: Tom Lane (#17)
Re: Inconsistent Behavior of GROUP BY ROLLUP in v17 vs master

I wrote:

I started to look at this again, and now I'm thinking that there is
indeed an issue related to "Query 1".

Oh, scratch that, I see my mistake: I was thinking of "1" as a
constant, but actually we must be interpreting it per SQL92 rules
as a reference to output column 1. So that's why adding or dropping
the "id" output column changes the results of the grouping.

-ENOCAFFEINE ... sorry for the noise. I'll go back to studying
the HAVING issue.

regards, tom lane

#19Tom Lane
tgl@sss.pgh.pa.us
In reply to: Richard Guo (#16)
Re: Inconsistent Behavior of GROUP BY ROLLUP in v17 vs master

Richard Guo <guofenglinux@gmail.com> writes:

Having heard nothing but crickets and not wanting to leave this until
the 11th hour before November, I'll plan to push the v1 patch next
week, unless there are any objections.

OK, I finally made some time to think this through, and here's
my thoughts:

The definition of HAVING is that it excludes grouped rows that don't
satisfy the HAVING condition. The idea behind the push-down-HAVING
optimization is that if we exclude all source rows that don't satisfy
the condition, then no groups that don't satisfy it will be formed, so
we don't have to re-check the HAVING condition after forming groups.

Grouping sets break this concept, which is why the pre-67a54b9e
planner code just punted on the optimization if there were grouping
sets. There are two reasons why they cause trouble:

1. If a variable doesn't appear in every grouping set, then it will
sometimes go to NULL during grouping, meaning that the HAVING clause
needs to test a different value than what is available at the scan
level. We did not have a good way to deal with that pre-v18, but now
we can check to see whether the clause contains any variables marked
as nullable by group_rtindex; if not, it should be safe to push down.

2. If there is an empty grouping set, it will give rise to a grouped
row whether there is any input or not. In this case we *must* keep
the HAVING condition at the top level, else it fails to filter that
row. (We might still choose to copy it to WHERE, however.) Note
that whether the HAVING condition is degenerate (variable-free) or
not does not enter into this directly; although if it has any Vars
they are certainly all nullable by group_rtindex, meaning that
checking for #1 accidentally keeps us out of trouble unless the
condition is degenerate.

67a54b9e accounted for #1 but not #2, which is why we have a bug.

The proposed patch tries to close the hole by checking whether
the condition is degenerate, but that feels subtly wrong to me:
what we ought to check is whether there is any empty grouping set.
As proposed, I think we miss optimization opportunities for
degenerate HAVING because we will not try the trick of copying
it to WHERE.

Also, I suspect that this change in 67a54b9e was wrong in detail:

            newHaving = lappend(newHaving, havingclause);
        }
-       else if (parse->groupClause && !parse->groupingSets)
+       else if (parse->groupClause)
        {
            Node       *whereclause;

because it allows us to apply "move to WHERE" rather than "copy to
WHERE" in cases with grouping sets, and that can't always be right.
I think that if that change hadn't been made, we would not have
had a bug. So the fact that the proposed fix doesn't touch this
bit does not seem right.

regards, tom lane

#20Tom Lane
tgl@sss.pgh.pa.us
In reply to: Tom Lane (#19)
1 attachment(s)
Re: Inconsistent Behavior of GROUP BY ROLLUP in v17 vs master

I wrote:

The proposed patch tries to close the hole by checking whether
the condition is degenerate, but that feels subtly wrong to me:
what we ought to check is whether there is any empty grouping set.
As proposed, I think we miss optimization opportunities for
degenerate HAVING because we will not try the trick of copying
it to WHERE.

Concretely, I think we could do the attached. This has the same
test query as in v1, but the generated plan is better because it
realizes it can copy the constant-false HAVING clause into WHERE,
resulting in a dummy scan of the table.

I'm not sure if planner.c is the best place to put
has_empty_grouping_set(). I couldn't find any existing code doing the
same thing, but maybe someday we'd want the functionality elsewhere.

regards, tom lane

Attachments:

v2-0001-Fix-pushdown-of-degenerate-HAVING-clauses.patchtext/x-diff; charset=us-ascii; name=v2-0001-Fix-pushdown-of-degenerate-HAVING-clauses.patchDownload
diff --git a/src/backend/optimizer/plan/planner.c b/src/backend/optimizer/plan/planner.c
index 342d782c74b..2e2e7acf195 100644
--- a/src/backend/optimizer/plan/planner.c
+++ b/src/backend/optimizer/plan/planner.c
@@ -147,6 +147,7 @@ static double preprocess_limit(PlannerInfo *root,
 							   double tuple_fraction,
 							   int64 *offset_est, int64 *count_est);
 static List *preprocess_groupclause(PlannerInfo *root, List *force);
+static bool has_empty_grouping_set(List *groupingSets);
 static List *extract_rollup_sets(List *groupingSets);
 static List *reorder_grouping_sets(List *groupingSets, List *sortclause);
 static void standard_qp_callback(PlannerInfo *root, void *extra);
@@ -1131,11 +1132,12 @@ subquery_planner(PlannerGlobal *glob, Query *parse, char *plan_name,
 	 * In some cases we may want to transfer a HAVING clause into WHERE. We
 	 * cannot do so if the HAVING clause contains aggregates (obviously) or
 	 * volatile functions (since a HAVING clause is supposed to be executed
-	 * only once per group).  We also can't do this if there are any nonempty
-	 * grouping sets and the clause references any columns that are nullable
-	 * by the grouping sets; moving such a clause into WHERE would potentially
-	 * change the results.  (If there are only empty grouping sets, then the
-	 * HAVING clause must be degenerate as discussed below.)
+	 * only once per group).  We also can't do this if there are any grouping
+	 * sets and the clause references any columns that are nullable by the
+	 * grouping sets; the nulled values of those columns are not available
+	 * before the grouping step.  (The test on groupClause might seem wrong,
+	 * but it's okay: it's just an optimization to avoid running pull_varnos
+	 * when there cannot be any Vars in the HAVING clause.)
 	 *
 	 * Also, it may be that the clause is so expensive to execute that we're
 	 * better off doing it only once per group, despite the loss of
@@ -1145,19 +1147,19 @@ subquery_planner(PlannerGlobal *glob, Query *parse, char *plan_name,
 	 * clause into WHERE, in hopes of eliminating tuples before aggregation
 	 * instead of after.
 	 *
-	 * If the query has explicit grouping then we can simply move such a
+	 * If the query has no empty grouping set then we can simply move such a
 	 * clause into WHERE; any group that fails the clause will not be in the
 	 * output because none of its tuples will reach the grouping or
-	 * aggregation stage.  Otherwise we must have a degenerate (variable-free)
-	 * HAVING clause, which we put in WHERE so that query_planner() can use it
-	 * in a gating Result node, but also keep in HAVING to ensure that we
-	 * don't emit a bogus aggregated row. (This could be done better, but it
-	 * seems not worth optimizing.)
+	 * aggregation stage.  Otherwise we have to keep the clause in HAVING to
+	 * ensure that we don't emit a bogus aggregated row.  But then the HAVING
+	 * clause must be degenerate (variable-free), so we can copy it into WHERE
+	 * so that query_planner() can use it in a gating Result node. (This could
+	 * be done better, but it seems not worth optimizing.)
 	 *
 	 * Note that a HAVING clause may contain expressions that are not fully
 	 * preprocessed.  This can happen if these expressions are part of
 	 * grouping items.  In such cases, they are replaced with GROUP Vars in
-	 * the parser and then replaced back after we've done with expression
+	 * the parser and then replaced back after we're done with expression
 	 * preprocessing on havingQual.  This is not an issue if the clause
 	 * remains in HAVING, because these expressions will be matched to lower
 	 * target items in setrefs.c.  However, if the clause is moved or copied
@@ -1182,7 +1184,8 @@ subquery_planner(PlannerGlobal *glob, Query *parse, char *plan_name,
 			/* keep it in HAVING */
 			newHaving = lappend(newHaving, havingclause);
 		}
-		else if (parse->groupClause)
+		else if (parse->groupClause &&
+				 !has_empty_grouping_set(parse->groupingSets))
 		{
 			Node	   *whereclause;
 
@@ -2910,6 +2913,39 @@ preprocess_groupclause(PlannerInfo *root, List *force)
 	return new_groupclause;
 }
 
+/*
+ * Check for empty grouping sets within a list of GroupingSets.
+ */
+static bool
+has_empty_grouping_set(List *groupingSets)
+{
+	ListCell   *lc;
+
+	foreach(lc, groupingSets)
+	{
+		GroupingSet *gset = castNode(GroupingSet, lfirst(lc));
+
+		switch (gset->kind)
+		{
+			case GROUPING_SET_EMPTY:
+				return true;
+			case GROUPING_SET_SIMPLE:
+				/* keep looking */
+				break;
+			case GROUPING_SET_ROLLUP:
+			case GROUPING_SET_CUBE:
+				/* these necessarily include an empty set */
+				return true;
+			case GROUPING_SET_SETS:
+				/* recurse */
+				if (has_empty_grouping_set(gset->content))
+					return true;
+				break;
+		}
+	}
+	return false;
+}
+
 /*
  * Extract lists of grouping sets that can be implemented using a single
  * rollup-type aggregate pass each. Returns a list of lists of grouping sets.
diff --git a/src/test/regress/expected/groupingsets.out b/src/test/regress/expected/groupingsets.out
index 991121545c5..a480b4749a8 100644
--- a/src/test/regress/expected/groupingsets.out
+++ b/src/test/regress/expected/groupingsets.out
@@ -890,7 +890,8 @@ explain (costs off)
                        ->  Seq Scan on gstest2
 (10 rows)
 
--- test pushdown of HAVING clause that does not reference any columns that are nullable by grouping sets
+-- test pushdown of non-degenerate HAVING clause that does not reference any
+-- columns that are nullable by grouping sets
 explain (costs off)
 select a, b, count(*) from gstest2 group by grouping sets ((a, b), (a)) having a > 1 and b > 1;
            QUERY PLAN            
@@ -911,6 +912,44 @@ select a, b, count(*) from gstest2 group by grouping sets ((a, b), (a)) having a
  2 | 2 |     1
 (1 row)
 
+-- test pushdown of degenerate HAVING clause
+explain (costs off)
+select count(*) from gstest2 group by grouping sets (()) having false;
+            QUERY PLAN             
+-----------------------------------
+ Aggregate
+   Group Key: ()
+   Filter: false
+   ->  Result
+         Replaces: Scan on gstest2
+         One-Time Filter: false
+(6 rows)
+
+select count(*) from gstest2 group by grouping sets (()) having false;
+ count 
+-------
+(0 rows)
+
+explain (costs off)
+select a, count(*) from gstest2 group by grouping sets ((a), ()) having false;
+               QUERY PLAN                
+-----------------------------------------
+ GroupAggregate
+   Group Key: a
+   Group Key: ()
+   Filter: false
+   ->  Sort
+         Sort Key: a
+         ->  Result
+               Replaces: Scan on gstest2
+               One-Time Filter: false
+(9 rows)
+
+select a, count(*) from gstest2 group by grouping sets ((a), ()) having false;
+ a | count 
+---+-------
+(0 rows)
+
 -- HAVING with GROUPING queries
 select ten, grouping(ten) from onek
 group by grouping sets(ten) having grouping(ten) >= 0
diff --git a/src/test/regress/sql/groupingsets.sql b/src/test/regress/sql/groupingsets.sql
index 38d3cdd0fd8..dbacc2ffdce 100644
--- a/src/test/regress/sql/groupingsets.sql
+++ b/src/test/regress/sql/groupingsets.sql
@@ -290,11 +290,21 @@ explain (costs off)
   select v.c, (select count(*) from gstest2 group by () having v.c)
     from (values (false),(true)) v(c) order by v.c;
 
--- test pushdown of HAVING clause that does not reference any columns that are nullable by grouping sets
+-- test pushdown of non-degenerate HAVING clause that does not reference any
+-- columns that are nullable by grouping sets
 explain (costs off)
 select a, b, count(*) from gstest2 group by grouping sets ((a, b), (a)) having a > 1 and b > 1;
 select a, b, count(*) from gstest2 group by grouping sets ((a, b), (a)) having a > 1 and b > 1;
 
+-- test pushdown of degenerate HAVING clause
+explain (costs off)
+select count(*) from gstest2 group by grouping sets (()) having false;
+select count(*) from gstest2 group by grouping sets (()) having false;
+
+explain (costs off)
+select a, count(*) from gstest2 group by grouping sets ((a), ()) having false;
+select a, count(*) from gstest2 group by grouping sets ((a), ()) having false;
+
 -- HAVING with GROUPING queries
 select ten, grouping(ten) from onek
 group by grouping sets(ten) having grouping(ten) >= 0
#21Richard Guo
guofenglinux@gmail.com
In reply to: Tom Lane (#20)
1 attachment(s)
Re: Inconsistent Behavior of GROUP BY ROLLUP in v17 vs master

On Sat, Oct 18, 2025 at 6:14 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:

I wrote:

The proposed patch tries to close the hole by checking whether
the condition is degenerate, but that feels subtly wrong to me:
what we ought to check is whether there is any empty grouping set.
As proposed, I think we miss optimization opportunities for
degenerate HAVING because we will not try the trick of copying
it to WHERE.

Concretely, I think we could do the attached. This has the same
test query as in v1, but the generated plan is better because it
realizes it can copy the constant-false HAVING clause into WHERE,
resulting in a dummy scan of the table.

Thanks for the patch. Yeah, v2 generates better plans for queries
with degenerate HAVING clauses, as it can move or copy such clauses
into WHERE.

The idea of 67a54b9e8 is to leverage the presence of group_rtindex in
the HAVING clause to determine whether it references any columns that
are not present in all grouping sets. If it doesn't, then it's safe
to push the clause down. This logic was also intended to cope with
empty grouping sets, since an empty grouping set would nullify any
Vars referenced in the clause. However, the loose end is that
degenerate clauses are not subject to this check, as they lack any
Vars and therefore cannot be marked with group_rtindex.

v1 patch tries to close this gap by detecting degenerate HAVING
clauses and preventing them from being pushed down. While this may
cause us to miss the optimization for degenerate clauses, I felt it
was acceptable, because degenerate HAVING clauses don't seem to be
common in practice, and detecting the presence of empty grouping sets
doesn't seem a trivial task.

I took a look at has_empty_grouping_set() in v2 patch, but I'm afraid
it's not thorough enough. I don't think it's correct to return true
once we find an EMPTY, ROLLUP, or CUBE GroupingSet. I think we need
to compute the Cartesian product across different GroupingSets, just
like what expand_grouping_sets() does. With the v2 patch, I can see
some bogus plans; for example:

create table t (a int, b int);

explain (costs off) select count(*) from t group by rollup(a), b having b > 1;
QUERY PLAN
-------------------------
HashAggregate
Hash Key: b, a
Hash Key: b
Filter: (b > 1)
-> Seq Scan on t
Filter: (b > 1)
(6 rows)

has_empty_grouping_set() thinks there is an empty grouping set in this
query, so a copy of the HAVING clause is retained in HAVING, which
does not seem correct.

Rather than extending has_empty_grouping_set() to compute the
Cartesian product across different GroupingSets, I wonder if we could
move the call to expand_grouping_sets() from
preprocess_grouping_sets() to just before the push-down-HAVING
optimization. This way, we could leverage the expanded flat list of
grouping sets to accurately determine whether any empty grouping sets
exist.

I did a quick search and didn't find any code between the
push-down-HAVING optimization and preprocess_grouping_sets() that
relies on parse->groupingSets being a list of GroupingSet's, so this
should be safe.

Hence, attached v3 patch.

- Richard

Attachments:

v3-0001-Fix-pushdown-of-degenerate-HAVING-clauses.patchapplication/octet-stream; name=v3-0001-Fix-pushdown-of-degenerate-HAVING-clauses.patchDownload
From 3572acdca465d68fa78e694c054c6cb0426260d2 Mon Sep 17 00:00:00 2001
From: Richard Guo <guofenglinux@gmail.com>
Date: Mon, 20 Oct 2025 22:06:56 +0900
Subject: [PATCH v3] Fix pushdown of degenerate HAVING clauses

---
 src/backend/optimizer/plan/planner.c       | 41 ++++++++++++++--------
 src/test/regress/expected/groupingsets.out | 41 +++++++++++++++++++++-
 src/test/regress/sql/groupingsets.sql      | 12 ++++++-
 3 files changed, 77 insertions(+), 17 deletions(-)

diff --git a/src/backend/optimizer/plan/planner.c b/src/backend/optimizer/plan/planner.c
index 342d782c74b..c33c8b152dc 100644
--- a/src/backend/optimizer/plan/planner.c
+++ b/src/backend/optimizer/plan/planner.c
@@ -1127,15 +1127,26 @@ subquery_planner(PlannerGlobal *glob, Query *parse, char *plan_name,
 	if (parse->hasTargetSRFs)
 		parse->hasTargetSRFs = expression_returns_set((Node *) parse->targetList);
 
+	/*
+	 * Expand the groupingSets tree of this Query to a flat list of grouping
+	 * sets.
+	 */
+	if (parse->groupingSets)
+	{
+		parse->groupingSets =
+			expand_grouping_sets(parse->groupingSets, parse->groupDistinct, -1);
+	}
+
 	/*
 	 * In some cases we may want to transfer a HAVING clause into WHERE. We
 	 * cannot do so if the HAVING clause contains aggregates (obviously) or
 	 * volatile functions (since a HAVING clause is supposed to be executed
-	 * only once per group).  We also can't do this if there are any nonempty
-	 * grouping sets and the clause references any columns that are nullable
-	 * by the grouping sets; moving such a clause into WHERE would potentially
-	 * change the results.  (If there are only empty grouping sets, then the
-	 * HAVING clause must be degenerate as discussed below.)
+	 * only once per group).  We also can't do this if there are any grouping
+	 * sets and the clause references any columns that are nullable by the
+	 * grouping sets; the nulled values of those columns are not available
+	 * before the grouping step.  (The test on groupClause might seem wrong,
+	 * but it's okay: it's just an optimization to avoid running pull_varnos
+	 * when there cannot be any Vars in the HAVING clause.)
 	 *
 	 * Also, it may be that the clause is so expensive to execute that we're
 	 * better off doing it only once per group, despite the loss of
@@ -1145,19 +1156,19 @@ subquery_planner(PlannerGlobal *glob, Query *parse, char *plan_name,
 	 * clause into WHERE, in hopes of eliminating tuples before aggregation
 	 * instead of after.
 	 *
-	 * If the query has explicit grouping then we can simply move such a
+	 * If the query has no empty grouping set then we can simply move such a
 	 * clause into WHERE; any group that fails the clause will not be in the
 	 * output because none of its tuples will reach the grouping or
-	 * aggregation stage.  Otherwise we must have a degenerate (variable-free)
-	 * HAVING clause, which we put in WHERE so that query_planner() can use it
-	 * in a gating Result node, but also keep in HAVING to ensure that we
-	 * don't emit a bogus aggregated row. (This could be done better, but it
-	 * seems not worth optimizing.)
+	 * aggregation stage.  Otherwise we have to keep the clause in HAVING to
+	 * ensure that we don't emit a bogus aggregated row.  But then the HAVING
+	 * clause must be degenerate (variable-free), so we can copy it into WHERE
+	 * so that query_planner() can use it in a gating Result node. (This could
+	 * be done better, but it seems not worth optimizing.)
 	 *
 	 * Note that a HAVING clause may contain expressions that are not fully
 	 * preprocessed.  This can happen if these expressions are part of
 	 * grouping items.  In such cases, they are replaced with GROUP Vars in
-	 * the parser and then replaced back after we've done with expression
+	 * the parser and then replaced back after we're done with expression
 	 * preprocessing on havingQual.  This is not an issue if the clause
 	 * remains in HAVING, because these expressions will be matched to lower
 	 * target items in setrefs.c.  However, if the clause is moved or copied
@@ -1182,7 +1193,9 @@ subquery_planner(PlannerGlobal *glob, Query *parse, char *plan_name,
 			/* keep it in HAVING */
 			newHaving = lappend(newHaving, havingclause);
 		}
-		else if (parse->groupClause)
+		else if (parse->groupClause &&
+				 (parse->groupingSets == NIL ||
+				  linitial(parse->groupingSets) != NIL))
 		{
 			Node	   *whereclause;
 
@@ -2195,8 +2208,6 @@ preprocess_grouping_sets(PlannerInfo *root)
 	ListCell   *lc_set;
 	grouping_sets_data *gd = palloc0(sizeof(grouping_sets_data));
 
-	parse->groupingSets = expand_grouping_sets(parse->groupingSets, parse->groupDistinct, -1);
-
 	gd->any_hashable = false;
 	gd->unhashable_refs = NULL;
 	gd->unsortable_refs = NULL;
diff --git a/src/test/regress/expected/groupingsets.out b/src/test/regress/expected/groupingsets.out
index 991121545c5..a480b4749a8 100644
--- a/src/test/regress/expected/groupingsets.out
+++ b/src/test/regress/expected/groupingsets.out
@@ -890,7 +890,8 @@ explain (costs off)
                        ->  Seq Scan on gstest2
 (10 rows)
 
--- test pushdown of HAVING clause that does not reference any columns that are nullable by grouping sets
+-- test pushdown of non-degenerate HAVING clause that does not reference any
+-- columns that are nullable by grouping sets
 explain (costs off)
 select a, b, count(*) from gstest2 group by grouping sets ((a, b), (a)) having a > 1 and b > 1;
            QUERY PLAN            
@@ -911,6 +912,44 @@ select a, b, count(*) from gstest2 group by grouping sets ((a, b), (a)) having a
  2 | 2 |     1
 (1 row)
 
+-- test pushdown of degenerate HAVING clause
+explain (costs off)
+select count(*) from gstest2 group by grouping sets (()) having false;
+            QUERY PLAN             
+-----------------------------------
+ Aggregate
+   Group Key: ()
+   Filter: false
+   ->  Result
+         Replaces: Scan on gstest2
+         One-Time Filter: false
+(6 rows)
+
+select count(*) from gstest2 group by grouping sets (()) having false;
+ count 
+-------
+(0 rows)
+
+explain (costs off)
+select a, count(*) from gstest2 group by grouping sets ((a), ()) having false;
+               QUERY PLAN                
+-----------------------------------------
+ GroupAggregate
+   Group Key: a
+   Group Key: ()
+   Filter: false
+   ->  Sort
+         Sort Key: a
+         ->  Result
+               Replaces: Scan on gstest2
+               One-Time Filter: false
+(9 rows)
+
+select a, count(*) from gstest2 group by grouping sets ((a), ()) having false;
+ a | count 
+---+-------
+(0 rows)
+
 -- HAVING with GROUPING queries
 select ten, grouping(ten) from onek
 group by grouping sets(ten) having grouping(ten) >= 0
diff --git a/src/test/regress/sql/groupingsets.sql b/src/test/regress/sql/groupingsets.sql
index 38d3cdd0fd8..dbacc2ffdce 100644
--- a/src/test/regress/sql/groupingsets.sql
+++ b/src/test/regress/sql/groupingsets.sql
@@ -290,11 +290,21 @@ explain (costs off)
   select v.c, (select count(*) from gstest2 group by () having v.c)
     from (values (false),(true)) v(c) order by v.c;
 
--- test pushdown of HAVING clause that does not reference any columns that are nullable by grouping sets
+-- test pushdown of non-degenerate HAVING clause that does not reference any
+-- columns that are nullable by grouping sets
 explain (costs off)
 select a, b, count(*) from gstest2 group by grouping sets ((a, b), (a)) having a > 1 and b > 1;
 select a, b, count(*) from gstest2 group by grouping sets ((a, b), (a)) having a > 1 and b > 1;
 
+-- test pushdown of degenerate HAVING clause
+explain (costs off)
+select count(*) from gstest2 group by grouping sets (()) having false;
+select count(*) from gstest2 group by grouping sets (()) having false;
+
+explain (costs off)
+select a, count(*) from gstest2 group by grouping sets ((a), ()) having false;
+select a, count(*) from gstest2 group by grouping sets ((a), ()) having false;
+
 -- HAVING with GROUPING queries
 select ten, grouping(ten) from onek
 group by grouping sets(ten) having grouping(ten) >= 0
-- 
2.39.5 (Apple Git-154)

#22Tom Lane
tgl@sss.pgh.pa.us
In reply to: Richard Guo (#21)
Re: Inconsistent Behavior of GROUP BY ROLLUP in v17 vs master

Richard Guo <guofenglinux@gmail.com> writes:

On Sat, Oct 18, 2025 at 6:14 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:

The proposed patch tries to close the hole by checking whether
the condition is degenerate, but that feels subtly wrong to me:
what we ought to check is whether there is any empty grouping set.

v1 patch tries to close this gap by detecting degenerate HAVING
clauses and preventing them from being pushed down. While this may
cause us to miss the optimization for degenerate clauses, I felt it
was acceptable, because degenerate HAVING clauses don't seem to be
common in practice, and detecting the presence of empty grouping sets
doesn't seem a trivial task.

I actually agree with you that it's not that critical whether we fully
optimize degenerate HAVING clauses in every case. (Although it might
be bad if we miss any cases that previous versions handled.) What
bothered me about v1 was that it seemed to be operating from a
conceptually wrong model, which perhaps could lead to bugs in future.

I took a look at has_empty_grouping_set() in v2 patch, but I'm afraid
it's not thorough enough. I don't think it's correct to return true
once we find an EMPTY, ROLLUP, or CUBE GroupingSet. I think we need
to compute the Cartesian product across different GroupingSets, just
like what expand_grouping_sets() does.

Oh, good point. We could probably make it handle this with a little
more complication, or decide that it's okay if we fail to optimize
some cases --- but if we're going to do expand_grouping_sets() anyway,
we might as well rely on that.

Rather than extending has_empty_grouping_set() to compute the
Cartesian product across different GroupingSets, I wonder if we could
move the call to expand_grouping_sets() from
preprocess_grouping_sets() to just before the push-down-HAVING
optimization. This way, we could leverage the expanded flat list of
grouping sets to accurately determine whether any empty grouping sets
exist.

I like the concept here, but not so much the details. Pulling
expand_grouping_sets out of preprocess_grouping_sets feels weird.
I guess it's all right given that preprocess_grouping_sets doesn't
manipulate the parse tree otherwise, but you didn't update the comment
for preprocess_grouping_sets. Also it might be a good idea to have a
test case demonstrating why v2 wasn't good enough.

I did a quick search and didn't find any code between the
push-down-HAVING optimization and preprocess_grouping_sets() that
relies on parse->groupingSets being a list of GroupingSet's, so this
should be safe.

I think it's actually quite awful that we replace parse->groupingSets
with data of a totally different representation. It would be better
to store the result of expand_grouping_sets into some new PlannerInfo
field, say root->preprocessed_groupingSets. But we can't make a
change like that in v18 now, so this is just something for future
cleanup.

regards, tom lane

#23Tom Lane
tgl@sss.pgh.pa.us
In reply to: Tom Lane (#22)
1 attachment(s)
Re: Inconsistent Behavior of GROUP BY ROLLUP in v17 vs master

I wrote:

I like the concept here, but not so much the details. Pulling
expand_grouping_sets out of preprocess_grouping_sets feels weird.
I guess it's all right given that preprocess_grouping_sets doesn't
manipulate the parse tree otherwise, but you didn't update the comment
for preprocess_grouping_sets. Also it might be a good idea to have a
test case demonstrating why v2 wasn't good enough.

Here's a v4 with some concrete suggestions for comment changes,
plus the extra test case. (I did some tiny cosmetic fooling with
preprocess_grouping_sets too, because the order of its initial
operations seemed quite weird to me.)

regards, tom lane

Attachments:

v4-0001-Fix-pushdown-of-degenerate-HAVING-clauses.patchtext/x-diff; charset=us-ascii; name=v4-0001-Fix-pushdown-of-degenerate-HAVING-clauses.patchDownload
diff --git a/src/backend/optimizer/plan/planner.c b/src/backend/optimizer/plan/planner.c
index 342d782c74b..c4fd646b999 100644
--- a/src/backend/optimizer/plan/planner.c
+++ b/src/backend/optimizer/plan/planner.c
@@ -1127,15 +1127,28 @@ subquery_planner(PlannerGlobal *glob, Query *parse, char *plan_name,
 	if (parse->hasTargetSRFs)
 		parse->hasTargetSRFs = expression_returns_set((Node *) parse->targetList);
 
+	/*
+	 * If we have grouping sets, expand the groupingSets tree of this query to
+	 * a flat list of grouping sets.  We need to do this before optimizing
+	 * HAVING, since we can't easily tell if there's an empty grouping set
+	 * until we have this representation.
+	 */
+	if (parse->groupingSets)
+	{
+		parse->groupingSets =
+			expand_grouping_sets(parse->groupingSets, parse->groupDistinct, -1);
+	}
+
 	/*
 	 * In some cases we may want to transfer a HAVING clause into WHERE. We
 	 * cannot do so if the HAVING clause contains aggregates (obviously) or
 	 * volatile functions (since a HAVING clause is supposed to be executed
-	 * only once per group).  We also can't do this if there are any nonempty
-	 * grouping sets and the clause references any columns that are nullable
-	 * by the grouping sets; moving such a clause into WHERE would potentially
-	 * change the results.  (If there are only empty grouping sets, then the
-	 * HAVING clause must be degenerate as discussed below.)
+	 * only once per group).  We also can't do this if there are any grouping
+	 * sets and the clause references any columns that are nullable by the
+	 * grouping sets; the nulled values of those columns are not available
+	 * before the grouping step.  (The test on groupClause might seem wrong,
+	 * but it's okay: it's just an optimization to avoid running pull_varnos
+	 * when there cannot be any Vars in the HAVING clause.)
 	 *
 	 * Also, it may be that the clause is so expensive to execute that we're
 	 * better off doing it only once per group, despite the loss of
@@ -1145,19 +1158,19 @@ subquery_planner(PlannerGlobal *glob, Query *parse, char *plan_name,
 	 * clause into WHERE, in hopes of eliminating tuples before aggregation
 	 * instead of after.
 	 *
-	 * If the query has explicit grouping then we can simply move such a
+	 * If the query has no empty grouping set then we can simply move such a
 	 * clause into WHERE; any group that fails the clause will not be in the
 	 * output because none of its tuples will reach the grouping or
-	 * aggregation stage.  Otherwise we must have a degenerate (variable-free)
-	 * HAVING clause, which we put in WHERE so that query_planner() can use it
-	 * in a gating Result node, but also keep in HAVING to ensure that we
-	 * don't emit a bogus aggregated row. (This could be done better, but it
-	 * seems not worth optimizing.)
+	 * aggregation stage.  Otherwise we have to keep the clause in HAVING to
+	 * ensure that we don't emit a bogus aggregated row.  But then the HAVING
+	 * clause must be degenerate (variable-free), so we can copy it into WHERE
+	 * so that query_planner() can use it in a gating Result node. (This could
+	 * be done better, but it seems not worth optimizing.)
 	 *
 	 * Note that a HAVING clause may contain expressions that are not fully
 	 * preprocessed.  This can happen if these expressions are part of
 	 * grouping items.  In such cases, they are replaced with GROUP Vars in
-	 * the parser and then replaced back after we've done with expression
+	 * the parser and then replaced back after we're done with expression
 	 * preprocessing on havingQual.  This is not an issue if the clause
 	 * remains in HAVING, because these expressions will be matched to lower
 	 * target items in setrefs.c.  However, if the clause is moved or copied
@@ -1182,8 +1195,11 @@ subquery_planner(PlannerGlobal *glob, Query *parse, char *plan_name,
 			/* keep it in HAVING */
 			newHaving = lappend(newHaving, havingclause);
 		}
-		else if (parse->groupClause)
+		else if (parse->groupClause &&
+				 (parse->groupingSets == NIL ||
+				  (List *) linitial(parse->groupingSets) != NIL))
 		{
+			/* There is GROUP BY, but no empty grouping set */
 			Node	   *whereclause;
 
 			/* Preprocess the HAVING clause fully */
@@ -1196,6 +1212,7 @@ subquery_planner(PlannerGlobal *glob, Query *parse, char *plan_name,
 		}
 		else
 		{
+			/* There is an empty grouping set (perhaps implicitly) */
 			Node	   *whereclause;
 
 			/* Preprocess the HAVING clause fully */
@@ -2181,10 +2198,13 @@ grouping_planner(PlannerInfo *root, double tuple_fraction,
 }
 
 /*
- * Do preprocessing for groupingSets clause and related data.  This handles the
- * preliminary steps of expanding the grouping sets, organizing them into lists
- * of rollups, and preparing annotations which will later be filled in with
- * size estimates.
+ * Do preprocessing for groupingSets clause and related data.
+ *
+ * We expect that parse->groupingSets has already been expanded into a flat
+ * list of grouping sets (that is, just integer Lists of ressortgroupref
+ * numbers) by expand_grouping_sets().  This function handles the preliminary
+ * steps of organizing the grouping sets into lists of rollups, and preparing
+ * annotations which will later be filled in with size estimates.
  */
 static grouping_sets_data *
 preprocess_grouping_sets(PlannerInfo *root)
@@ -2195,19 +2215,18 @@ preprocess_grouping_sets(PlannerInfo *root)
 	ListCell   *lc_set;
 	grouping_sets_data *gd = palloc0(sizeof(grouping_sets_data));
 
-	parse->groupingSets = expand_grouping_sets(parse->groupingSets, parse->groupDistinct, -1);
-
-	gd->any_hashable = false;
-	gd->unhashable_refs = NULL;
-	gd->unsortable_refs = NULL;
-	gd->unsortable_sets = NIL;
-
 	/*
 	 * We don't currently make any attempt to optimize the groupClause when
 	 * there are grouping sets, so just duplicate it in processed_groupClause.
 	 */
 	root->processed_groupClause = parse->groupClause;
 
+	/* Detect unhashable and unsortable grouping expressions */
+	gd->any_hashable = false;
+	gd->unhashable_refs = NULL;
+	gd->unsortable_refs = NULL;
+	gd->unsortable_sets = NIL;
+
 	if (parse->groupClause)
 	{
 		ListCell   *lc;
diff --git a/src/test/regress/expected/groupingsets.out b/src/test/regress/expected/groupingsets.out
index 991121545c5..8e354759da6 100644
--- a/src/test/regress/expected/groupingsets.out
+++ b/src/test/regress/expected/groupingsets.out
@@ -890,7 +890,8 @@ explain (costs off)
                        ->  Seq Scan on gstest2
 (10 rows)
 
--- test pushdown of HAVING clause that does not reference any columns that are nullable by grouping sets
+-- test pushdown of non-degenerate HAVING clause that does not reference any
+-- columns that are nullable by grouping sets
 explain (costs off)
 select a, b, count(*) from gstest2 group by grouping sets ((a, b), (a)) having a > 1 and b > 1;
            QUERY PLAN            
@@ -911,6 +912,65 @@ select a, b, count(*) from gstest2 group by grouping sets ((a, b), (a)) having a
  2 | 2 |     1
 (1 row)
 
+explain (costs off)
+select a, b, count(*) from gstest2 group by rollup(a), b having b > 1;
+           QUERY PLAN            
+---------------------------------
+ GroupAggregate
+   Group Key: b, a
+   Group Key: b
+   ->  Sort
+         Sort Key: b, a
+         ->  Seq Scan on gstest2
+               Filter: (b > 1)
+(7 rows)
+
+select a, b, count(*) from gstest2 group by rollup(a), b having b > 1;
+ a | b | count 
+---+---+-------
+ 1 | 2 |     1
+ 2 | 2 |     1
+   | 2 |     2
+(3 rows)
+
+-- test pushdown of degenerate HAVING clause
+explain (costs off)
+select count(*) from gstest2 group by grouping sets (()) having false;
+            QUERY PLAN             
+-----------------------------------
+ Aggregate
+   Group Key: ()
+   Filter: false
+   ->  Result
+         Replaces: Scan on gstest2
+         One-Time Filter: false
+(6 rows)
+
+select count(*) from gstest2 group by grouping sets (()) having false;
+ count 
+-------
+(0 rows)
+
+explain (costs off)
+select a, count(*) from gstest2 group by grouping sets ((a), ()) having false;
+               QUERY PLAN                
+-----------------------------------------
+ GroupAggregate
+   Group Key: a
+   Group Key: ()
+   Filter: false
+   ->  Sort
+         Sort Key: a
+         ->  Result
+               Replaces: Scan on gstest2
+               One-Time Filter: false
+(9 rows)
+
+select a, count(*) from gstest2 group by grouping sets ((a), ()) having false;
+ a | count 
+---+-------
+(0 rows)
+
 -- HAVING with GROUPING queries
 select ten, grouping(ten) from onek
 group by grouping sets(ten) having grouping(ten) >= 0
diff --git a/src/test/regress/sql/groupingsets.sql b/src/test/regress/sql/groupingsets.sql
index 38d3cdd0fd8..74e0bddebf1 100644
--- a/src/test/regress/sql/groupingsets.sql
+++ b/src/test/regress/sql/groupingsets.sql
@@ -290,11 +290,25 @@ explain (costs off)
   select v.c, (select count(*) from gstest2 group by () having v.c)
     from (values (false),(true)) v(c) order by v.c;
 
--- test pushdown of HAVING clause that does not reference any columns that are nullable by grouping sets
+-- test pushdown of non-degenerate HAVING clause that does not reference any
+-- columns that are nullable by grouping sets
 explain (costs off)
 select a, b, count(*) from gstest2 group by grouping sets ((a, b), (a)) having a > 1 and b > 1;
 select a, b, count(*) from gstest2 group by grouping sets ((a, b), (a)) having a > 1 and b > 1;
 
+explain (costs off)
+select a, b, count(*) from gstest2 group by rollup(a), b having b > 1;
+select a, b, count(*) from gstest2 group by rollup(a), b having b > 1;
+
+-- test pushdown of degenerate HAVING clause
+explain (costs off)
+select count(*) from gstest2 group by grouping sets (()) having false;
+select count(*) from gstest2 group by grouping sets (()) having false;
+
+explain (costs off)
+select a, count(*) from gstest2 group by grouping sets ((a), ()) having false;
+select a, count(*) from gstest2 group by grouping sets ((a), ()) having false;
+
 -- HAVING with GROUPING queries
 select ten, grouping(ten) from onek
 group by grouping sets(ten) having grouping(ten) >= 0
#24Richard Guo
guofenglinux@gmail.com
In reply to: Tom Lane (#23)
1 attachment(s)
Re: Inconsistent Behavior of GROUP BY ROLLUP in v17 vs master

On Tue, Oct 21, 2025 at 1:27 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:

I wrote:

I like the concept here, but not so much the details. Pulling
expand_grouping_sets out of preprocess_grouping_sets feels weird.
I guess it's all right given that preprocess_grouping_sets doesn't
manipulate the parse tree otherwise, but you didn't update the comment
for preprocess_grouping_sets. Also it might be a good idea to have a
test case demonstrating why v2 wasn't good enough.

Here's a v4 with some concrete suggestions for comment changes,
plus the extra test case. (I did some tiny cosmetic fooling with
preprocess_grouping_sets too, because the order of its initial
operations seemed quite weird to me.)

Yeah, I noticed the out-of-date comment for preprocess_grouping_sets
shortly after sending out the v3 patch, but I was too tired last night
to update and resend it. Thanks for the suggestions in v4 -- all the
changes look good to me.

Regarding the tests, I think we could add another test query to cover
the case with no empty grouping sets and degenerate HAVING clauses.
This way, all cases for the HAVING pushdown optimization with grouping
sets should be covered.

I've added such a test in v5, along with a commit message. Nothing
else has changed. I'll push this patch soon, barring any objections.

- Richard

Attachments:

v5-0001-Fix-pushdown-of-degenerate-HAVING-clauses.patchapplication/octet-stream; name=v5-0001-Fix-pushdown-of-degenerate-HAVING-clauses.patchDownload
From 353b812fce47e4fd19252daba070455a3e533e5d Mon Sep 17 00:00:00 2001
From: Richard Guo <guofenglinux@gmail.com>
Date: Tue, 21 Oct 2025 11:29:22 +0900
Subject: [PATCH v5] Fix pushdown of degenerate HAVING clauses

67a54b9e8 taught the planner to push down HAVING clauses even when
grouping sets are present, as long as the clause does not reference
any columns that are nullable by the grouping sets.  However, there
was an oversight: if any empty grouping sets are present, the
aggregation node can produce a row that did not come from the input,
and pushing down a HAVING clause in this case may cause us to fail to
filter out that row.

Currently, non-degenerate HAVING clauses are not pushed down when
empty grouping sets are present, since the empty grouping sets would
nullify the vars they reference.  However, degenerate (variable-free)
HAVING clauses are not subject to this restriction and may be
incorrectly pushed down.

To fix, explicitly check for the presence of empty grouping sets and
retain degenerate clauses in HAVING when they are present.  This
ensures that we don't emit a bogus aggregated row.  A copy of each
such clause is also put in WHERE so that query_planner() can use it in
a gating Result node.

To facilitate this check, this patch expands the groupingSets tree of
the query to a flat list of grouping sets before applying the HAVING
pushdown optimization.  This does not add any additional planning
overhead, since we need to do this expansion anyway.

In passing, make a small tweak to preprocess_grouping_sets() by
reordering its initial operations a bit.

Backpatch to v18, where this issue was introduced.

Reported-by: Yuhang Qiu <iamqyh@gmail.com>
Author: Richard Guo <guofenglinux@gmail.com>
Author: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/0879D9C9-7FE2-4A20-9593-B23F7A0B5290@gmail.com
Backpatch-through: 18
---
 src/backend/optimizer/plan/planner.c       | 67 +++++++++++-------
 src/test/regress/expected/groupingsets.out | 82 +++++++++++++++++++++-
 src/test/regress/sql/groupingsets.sql      | 20 +++++-
 3 files changed, 143 insertions(+), 26 deletions(-)

diff --git a/src/backend/optimizer/plan/planner.c b/src/backend/optimizer/plan/planner.c
index 342d782c74b..c4fd646b999 100644
--- a/src/backend/optimizer/plan/planner.c
+++ b/src/backend/optimizer/plan/planner.c
@@ -1127,15 +1127,28 @@ subquery_planner(PlannerGlobal *glob, Query *parse, char *plan_name,
 	if (parse->hasTargetSRFs)
 		parse->hasTargetSRFs = expression_returns_set((Node *) parse->targetList);
 
+	/*
+	 * If we have grouping sets, expand the groupingSets tree of this query to
+	 * a flat list of grouping sets.  We need to do this before optimizing
+	 * HAVING, since we can't easily tell if there's an empty grouping set
+	 * until we have this representation.
+	 */
+	if (parse->groupingSets)
+	{
+		parse->groupingSets =
+			expand_grouping_sets(parse->groupingSets, parse->groupDistinct, -1);
+	}
+
 	/*
 	 * In some cases we may want to transfer a HAVING clause into WHERE. We
 	 * cannot do so if the HAVING clause contains aggregates (obviously) or
 	 * volatile functions (since a HAVING clause is supposed to be executed
-	 * only once per group).  We also can't do this if there are any nonempty
-	 * grouping sets and the clause references any columns that are nullable
-	 * by the grouping sets; moving such a clause into WHERE would potentially
-	 * change the results.  (If there are only empty grouping sets, then the
-	 * HAVING clause must be degenerate as discussed below.)
+	 * only once per group).  We also can't do this if there are any grouping
+	 * sets and the clause references any columns that are nullable by the
+	 * grouping sets; the nulled values of those columns are not available
+	 * before the grouping step.  (The test on groupClause might seem wrong,
+	 * but it's okay: it's just an optimization to avoid running pull_varnos
+	 * when there cannot be any Vars in the HAVING clause.)
 	 *
 	 * Also, it may be that the clause is so expensive to execute that we're
 	 * better off doing it only once per group, despite the loss of
@@ -1145,19 +1158,19 @@ subquery_planner(PlannerGlobal *glob, Query *parse, char *plan_name,
 	 * clause into WHERE, in hopes of eliminating tuples before aggregation
 	 * instead of after.
 	 *
-	 * If the query has explicit grouping then we can simply move such a
+	 * If the query has no empty grouping set then we can simply move such a
 	 * clause into WHERE; any group that fails the clause will not be in the
 	 * output because none of its tuples will reach the grouping or
-	 * aggregation stage.  Otherwise we must have a degenerate (variable-free)
-	 * HAVING clause, which we put in WHERE so that query_planner() can use it
-	 * in a gating Result node, but also keep in HAVING to ensure that we
-	 * don't emit a bogus aggregated row. (This could be done better, but it
-	 * seems not worth optimizing.)
+	 * aggregation stage.  Otherwise we have to keep the clause in HAVING to
+	 * ensure that we don't emit a bogus aggregated row.  But then the HAVING
+	 * clause must be degenerate (variable-free), so we can copy it into WHERE
+	 * so that query_planner() can use it in a gating Result node. (This could
+	 * be done better, but it seems not worth optimizing.)
 	 *
 	 * Note that a HAVING clause may contain expressions that are not fully
 	 * preprocessed.  This can happen if these expressions are part of
 	 * grouping items.  In such cases, they are replaced with GROUP Vars in
-	 * the parser and then replaced back after we've done with expression
+	 * the parser and then replaced back after we're done with expression
 	 * preprocessing on havingQual.  This is not an issue if the clause
 	 * remains in HAVING, because these expressions will be matched to lower
 	 * target items in setrefs.c.  However, if the clause is moved or copied
@@ -1182,8 +1195,11 @@ subquery_planner(PlannerGlobal *glob, Query *parse, char *plan_name,
 			/* keep it in HAVING */
 			newHaving = lappend(newHaving, havingclause);
 		}
-		else if (parse->groupClause)
+		else if (parse->groupClause &&
+				 (parse->groupingSets == NIL ||
+				  (List *) linitial(parse->groupingSets) != NIL))
 		{
+			/* There is GROUP BY, but no empty grouping set */
 			Node	   *whereclause;
 
 			/* Preprocess the HAVING clause fully */
@@ -1196,6 +1212,7 @@ subquery_planner(PlannerGlobal *glob, Query *parse, char *plan_name,
 		}
 		else
 		{
+			/* There is an empty grouping set (perhaps implicitly) */
 			Node	   *whereclause;
 
 			/* Preprocess the HAVING clause fully */
@@ -2181,10 +2198,13 @@ grouping_planner(PlannerInfo *root, double tuple_fraction,
 }
 
 /*
- * Do preprocessing for groupingSets clause and related data.  This handles the
- * preliminary steps of expanding the grouping sets, organizing them into lists
- * of rollups, and preparing annotations which will later be filled in with
- * size estimates.
+ * Do preprocessing for groupingSets clause and related data.
+ *
+ * We expect that parse->groupingSets has already been expanded into a flat
+ * list of grouping sets (that is, just integer Lists of ressortgroupref
+ * numbers) by expand_grouping_sets().  This function handles the preliminary
+ * steps of organizing the grouping sets into lists of rollups, and preparing
+ * annotations which will later be filled in with size estimates.
  */
 static grouping_sets_data *
 preprocess_grouping_sets(PlannerInfo *root)
@@ -2195,19 +2215,18 @@ preprocess_grouping_sets(PlannerInfo *root)
 	ListCell   *lc_set;
 	grouping_sets_data *gd = palloc0(sizeof(grouping_sets_data));
 
-	parse->groupingSets = expand_grouping_sets(parse->groupingSets, parse->groupDistinct, -1);
-
-	gd->any_hashable = false;
-	gd->unhashable_refs = NULL;
-	gd->unsortable_refs = NULL;
-	gd->unsortable_sets = NIL;
-
 	/*
 	 * We don't currently make any attempt to optimize the groupClause when
 	 * there are grouping sets, so just duplicate it in processed_groupClause.
 	 */
 	root->processed_groupClause = parse->groupClause;
 
+	/* Detect unhashable and unsortable grouping expressions */
+	gd->any_hashable = false;
+	gd->unhashable_refs = NULL;
+	gd->unsortable_refs = NULL;
+	gd->unsortable_sets = NIL;
+
 	if (parse->groupClause)
 	{
 		ListCell   *lc;
diff --git a/src/test/regress/expected/groupingsets.out b/src/test/regress/expected/groupingsets.out
index 991121545c5..398cf6965e0 100644
--- a/src/test/regress/expected/groupingsets.out
+++ b/src/test/regress/expected/groupingsets.out
@@ -890,7 +890,8 @@ explain (costs off)
                        ->  Seq Scan on gstest2
 (10 rows)
 
--- test pushdown of HAVING clause that does not reference any columns that are nullable by grouping sets
+-- test pushdown of non-degenerate HAVING clause that does not reference any
+-- columns that are nullable by grouping sets
 explain (costs off)
 select a, b, count(*) from gstest2 group by grouping sets ((a, b), (a)) having a > 1 and b > 1;
            QUERY PLAN            
@@ -911,6 +912,85 @@ select a, b, count(*) from gstest2 group by grouping sets ((a, b), (a)) having a
  2 | 2 |     1
 (1 row)
 
+explain (costs off)
+select a, b, count(*) from gstest2 group by rollup(a), b having b > 1;
+           QUERY PLAN            
+---------------------------------
+ GroupAggregate
+   Group Key: b, a
+   Group Key: b
+   ->  Sort
+         Sort Key: b, a
+         ->  Seq Scan on gstest2
+               Filter: (b > 1)
+(7 rows)
+
+select a, b, count(*) from gstest2 group by rollup(a), b having b > 1;
+ a | b | count 
+---+---+-------
+ 1 | 2 |     1
+ 2 | 2 |     1
+   | 2 |     2
+(3 rows)
+
+-- test pushdown of degenerate HAVING clause
+explain (costs off)
+select count(*) from gstest2 group by grouping sets (()) having false;
+            QUERY PLAN             
+-----------------------------------
+ Aggregate
+   Group Key: ()
+   Filter: false
+   ->  Result
+         Replaces: Scan on gstest2
+         One-Time Filter: false
+(6 rows)
+
+select count(*) from gstest2 group by grouping sets (()) having false;
+ count 
+-------
+(0 rows)
+
+explain (costs off)
+select a, count(*) from gstest2 group by grouping sets ((a), ()) having false;
+               QUERY PLAN                
+-----------------------------------------
+ GroupAggregate
+   Group Key: a
+   Group Key: ()
+   Filter: false
+   ->  Sort
+         Sort Key: a
+         ->  Result
+               Replaces: Scan on gstest2
+               One-Time Filter: false
+(9 rows)
+
+select a, count(*) from gstest2 group by grouping sets ((a), ()) having false;
+ a | count 
+---+-------
+(0 rows)
+
+explain (costs off)
+select a, b, count(*) from gstest2 group by grouping sets ((a), (b)) having false;
+               QUERY PLAN                
+-----------------------------------------
+ GroupAggregate
+   Group Key: a
+   Sort Key: b
+     Group Key: b
+   ->  Sort
+         Sort Key: a
+         ->  Result
+               Replaces: Scan on gstest2
+               One-Time Filter: false
+(9 rows)
+
+select a, b, count(*) from gstest2 group by grouping sets ((a), (b)) having false;
+ a | b | count 
+---+---+-------
+(0 rows)
+
 -- HAVING with GROUPING queries
 select ten, grouping(ten) from onek
 group by grouping sets(ten) having grouping(ten) >= 0
diff --git a/src/test/regress/sql/groupingsets.sql b/src/test/regress/sql/groupingsets.sql
index 38d3cdd0fd8..6d875475fae 100644
--- a/src/test/regress/sql/groupingsets.sql
+++ b/src/test/regress/sql/groupingsets.sql
@@ -290,11 +290,29 @@ explain (costs off)
   select v.c, (select count(*) from gstest2 group by () having v.c)
     from (values (false),(true)) v(c) order by v.c;
 
--- test pushdown of HAVING clause that does not reference any columns that are nullable by grouping sets
+-- test pushdown of non-degenerate HAVING clause that does not reference any
+-- columns that are nullable by grouping sets
 explain (costs off)
 select a, b, count(*) from gstest2 group by grouping sets ((a, b), (a)) having a > 1 and b > 1;
 select a, b, count(*) from gstest2 group by grouping sets ((a, b), (a)) having a > 1 and b > 1;
 
+explain (costs off)
+select a, b, count(*) from gstest2 group by rollup(a), b having b > 1;
+select a, b, count(*) from gstest2 group by rollup(a), b having b > 1;
+
+-- test pushdown of degenerate HAVING clause
+explain (costs off)
+select count(*) from gstest2 group by grouping sets (()) having false;
+select count(*) from gstest2 group by grouping sets (()) having false;
+
+explain (costs off)
+select a, count(*) from gstest2 group by grouping sets ((a), ()) having false;
+select a, count(*) from gstest2 group by grouping sets ((a), ()) having false;
+
+explain (costs off)
+select a, b, count(*) from gstest2 group by grouping sets ((a), (b)) having false;
+select a, b, count(*) from gstest2 group by grouping sets ((a), (b)) having false;
+
 -- HAVING with GROUPING queries
 select ten, grouping(ten) from onek
 group by grouping sets(ten) having grouping(ten) >= 0
-- 
2.39.5 (Apple Git-154)

#25Tom Lane
tgl@sss.pgh.pa.us
In reply to: Richard Guo (#24)
Re: Inconsistent Behavior of GROUP BY ROLLUP in v17 vs master

Richard Guo <guofenglinux@gmail.com> writes:

Regarding the tests, I think we could add another test query to cover
the case with no empty grouping sets and degenerate HAVING clauses.
This way, all cases for the HAVING pushdown optimization with grouping
sets should be covered.

I've added such a test in v5, along with a commit message. Nothing
else has changed. I'll push this patch soon, barring any objections.

v5 LGTM.

regards, tom lane

#26Richard Guo
guofenglinux@gmail.com
In reply to: Tom Lane (#25)
Re: Inconsistent Behavior of GROUP BY ROLLUP in v17 vs master

On Tue, Oct 21, 2025 at 12:26 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:

Richard Guo <guofenglinux@gmail.com> writes:

Regarding the tests, I think we could add another test query to cover
the case with no empty grouping sets and degenerate HAVING clauses.
This way, all cases for the HAVING pushdown optimization with grouping
sets should be covered.

I've added such a test in v5, along with a commit message. Nothing
else has changed. I'll push this patch soon, barring any objections.

v5 LGTM.

Cool! I've pushed and back-patched v5. Thanks for working on this
patch.

邱宇航, thanks for the report -- it's a good one. This issue should
be fixed in 18.1, which is scheduled for release in November (the
13th, I believe).

- Richard

#27Richard Guo
guofenglinux@gmail.com
In reply to: Richard Guo (#26)
Re: Inconsistent Behavior of GROUP BY ROLLUP in v17 vs master

On Tue, Oct 21, 2025 at 12:58 PM Richard Guo <guofenglinux@gmail.com> wrote:

Cool! I've pushed and back-patched v5. Thanks for working on this
patch.

Oops, I made a mistake in the test case for v18. Fixing it now…

- Richard

#28Tom Lane
tgl@sss.pgh.pa.us
In reply to: Richard Guo (#27)
Re: Inconsistent Behavior of GROUP BY ROLLUP in v17 vs master

Richard Guo <guofenglinux@gmail.com> writes:

Oops, I made a mistake in the test case for v18. Fixing it now…

Bleah. I pretty much don't ever commit things into back branches
without running regression tests there as well as in master.
Postgres is a moving target.

regards, tom lane

#29Richard Guo
guofenglinux@gmail.com
In reply to: Tom Lane (#28)
Re: Inconsistent Behavior of GROUP BY ROLLUP in v17 vs master

On Tue, Oct 21, 2025 at 1:18 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:

Richard Guo <guofenglinux@gmail.com> writes:

Oops, I made a mistake in the test case for v18. Fixing it now…

Bleah. I pretty much don't ever commit things into back branches
without running regression tests there as well as in master.
Postgres is a moving target.

Indeed. I should have always stayed alert to the possibility that
test outputs might differ across branches, even for simple queries.
But somehow, it slipped through this time.

I just pushed a fix to v18 and am now staring at the buildfarm
nervously. Fingers crossed this time.

- Richard