Allow pushdown of HAVING clauses with grouping sets

Started by Richard Guo, almost 2 years ago, 2 messages, hackers

guofenglinux@gmail.com

almost 2 years ago

In some cases, we may want to transfer a HAVING clause into WHERE in
hopes of eliminating tuples before aggregation instead of after.

Previously, we couldn't do this if there were any nonempty grouping
sets, because we didn't have a way to tell if the HAVING clause
referenced any columns that were nullable by the grouping sets, and
moving such a clause into WHERE could potentially change the results.
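To make the hazard concrete, here is a small Python emulation of grouping sets. It is only an illustrative sketch, not planner code: the table contents, the helper `grouping_sets_count`, and the predicate `a = 1` are all made up for this example. It compares filtering after grouping (HAVING) with filtering before grouping (WHERE).

```python
from collections import defaultdict

def grouping_sets_count(rows, grouping_sets, columns):
    """Emulate SELECT <columns>, count(*) ... GROUP BY GROUPING SETS (...)."""
    out = []
    for gset in grouping_sets:
        groups = defaultdict(int)
        for row in rows:
            # Columns that are not part of the current grouping set are
            # nulled out (None), as grouping sets do in SQL.
            key = tuple(row[c] if c in gset else None for c in columns)
            groups[key] += 1
        for key, cnt in groups.items():
            out.append({**dict(zip(columns, key)), "count": cnt})
    return out

# Hypothetical table t(a, b).
rows = [{"a": 1, "b": 10}, {"a": 1, "b": 20}, {"a": 2, "b": 10}]
columns = ("a", "b")

# Case 1: GROUPING SETS ((a), (b)). Column a is nulled by the (b) set.
sets_unsafe = [("a",), ("b",)]
having_result = [r for r in grouping_sets_count(rows, sets_unsafe, columns)
                 if r["a"] == 1]                      # HAVING a = 1
where_result = grouping_sets_count(
    [r for r in rows if r["a"] == 1], sets_unsafe, columns)  # WHERE a = 1

# Case 2: GROUPING SETS ((a), (a, b)). Column a is in every grouping set.
sets_safe = [("a",), ("a", "b")]
safe_having = [r for r in grouping_sets_count(rows, sets_safe, columns)
               if r["a"] == 1]
safe_where = grouping_sets_count(
    [r for r in rows if r["a"] == 1], sets_safe, columns)

print(having_result)
print(where_result)
print(safe_having == safe_where)
```

In the first case the HAVING form discards the rows produced by the (b) grouping set, because a is NULL there, while the WHERE form lets them through, so the two results differ. In the second case a is never nulled and both forms agree.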

Now, with expressions marked nullable by grouping sets with the RT
index of the RTE_GROUP RTE, it is much easier to identify those
clauses that reference any nullable-by-grouping-sets columns: we just
need to check if the RT index of the RTE_GROUP RTE is present in the
clause. Other HAVING clauses can be safely pushed down.
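The membership test described above can be sketched as a toy model in Python. This is not the planner's actual code: `ColumnRef`, `clause_relids`, `split_having`, and the value of `GROUP_RTINDEX` are hypothetical names chosen for illustration, a clause is reduced to the tuple of columns it references, and other conditions that can prevent moving a HAVING clause (for example, the clause containing an aggregate) are ignored.

```python
from dataclasses import dataclass

GROUP_RTINDEX = 2  # hypothetical RT index of the RTE_GROUP entry

@dataclass(frozen=True)
class ColumnRef:
    """A column reference plus the RT indexes that can null it (toy model)."""
    name: str
    nulling_rels: frozenset = frozenset()

def clause_relids(clause):
    """Union of the nulling RT indexes over every column the clause references."""
    relids = set()
    for ref in clause:
        relids |= ref.nulling_rels
    return relids

def split_having(clauses, group_rtindex):
    """Split HAVING clauses into those movable to WHERE and those that must stay."""
    to_where, keep_in_having = [], []
    for clause in clauses:
        if group_rtindex in clause_relids(clause):
            # References a column nullable by the grouping sets: keep in HAVING.
            keep_in_having.append(clause)
        else:
            to_where.append(clause)
    return to_where, keep_in_having

# Assumption for this example: a appears in every grouping set, so it is not
# marked; b is missing from some grouping set, so it carries the group RT index.
a = ColumnRef("a")
b = ColumnRef("b", frozenset({GROUP_RTINDEX}))

clauses = [(a,), (b,), (a, b)]
to_where, keep_in_having = split_having(clauses, GROUP_RTINDEX)
print([[r.name for r in c] for c in to_where])
print([[r.name for r in c] for c in keep_in_having])
```

Only the clause that references a alone is moved; any clause touching b stays in HAVING, including the one that also references a.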

I'm not sure how common it is in practice to have a HAVING clause
where all referenced columns are present in all the grouping sets.
But it seems to me that this optimization doesn't cost too much. Not
implementing it seems like leaving money on the table.

Any thoughts?

Thanks
Richard

Richard Guo

guofenglinux@gmail.com

almost 2 years ago

In reply to: Richard Guo (#1)

Re: Allow pushdown of HAVING clauses with grouping sets

On Wed, Sep 11, 2024 at 11:43 AM Richard Guo <guofenglinux@gmail.com> wrote:

> In some cases, we may want to transfer a HAVING clause into WHERE in
> hopes of eliminating tuples before aggregation instead of after.
>
> Previously, we couldn't do this if there were any nonempty grouping
> sets, because we didn't have a way to tell if the HAVING clause
> referenced any columns that were nullable by the grouping sets, and
> moving such a clause into WHERE could potentially change the results.
>
> Now, with expressions marked nullable by grouping sets with the RT
> index of the RTE_GROUP RTE, it is much easier to identify those
> clauses that reference any nullable-by-grouping-sets columns: we just
> need to check if the RT index of the RTE_GROUP RTE is present in the
> clause. Other HAVING clauses can be safely pushed down.
>
> I'm not sure how common it is in practice to have a HAVING clause
> where all referenced columns are present in all the grouping sets.
> But it seems to me that this optimization doesn't cost too much. Not
> implementing it seems like leaving money on the table.

I've gone ahead and pushed this.

Thanks
Richard
