Doc: Minor update for enable_partitionwise_aggregate

Started by Andy Atkinsonover 2 years ago5 messages
#1Andy Atkinson
andyatkinson@gmail.com
1 attachment(s)

Hello. While reading the docs for the enable_partitionwise_aggregate
parameter on the Query Planning page, I thought the description had a small
mistake that could be improved.

The current wording is: "which allows grouping or aggregation on a
partitioned tables performed separately "

Page: https://www.postgresql.org/docs/current/runtime-config-query.html

I think possible better alternatives could be:

- (Option 1) a "partitioned table's partitions" (the possessive form of
"it's"). The "enable_partition_pruning" parameter uses "the partitioned
table's partitions" in this form. I think this option is good, but I had a
slight preference for option 2.
- (Option 2) Or to just cut out the first part and say "to be performed
separately for each partition", which seemed simpler. So the sentence
reads: "which allows grouping or aggregation to be performed separately for
each partition"
- (Option 3) dropping the "a" so it says "which allows grouping or
aggregation on partitioned tables performed separately". I don't think this
is as good though because the aggregation happens on the partitions, so it
feels slightly off to me to say the "partitioned tables" instead of the
partitions.

I tested toggling this parameter on and off with a test partitioned table,
and looked at the query execution plan, and saw how the aggregation
happened on the partitions first when it was enabled.

This is my first ever submission to pgsql-hackers. :) I used this guide
from Lætitia to prepare the patch file for Option 2 above, which is
attached. I am having a problem with the "make STYLE=website html" step, so
I hadn't seen the preview (still fixing this up).
https://mydbanotebook.org/post/patching-doc/

Let me know what you think!

Thanks.

Attachments:

query-planning-enable-partitionwise-aggregate-improved-wording_v1.patchapplication/x-patch; name=query-planning-enable-partitionwise-aggregate-improved-wording_v1.patchDownload
diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index 924309af26..167248a5da 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -5271,9 +5271,8 @@ ANY <replaceable class="parameter">num_sync</replaceable> ( <replaceable class="
       <listitem>
        <para>
         Enables or disables the query planner's use of partitionwise grouping
-        or aggregation, which allows grouping or aggregation on a partitioned
-        tables performed separately for each partition.  If the <literal>GROUP
-        BY</literal> clause does not include the partition keys, only partial
+        or aggregation, which allows grouping or aggregation to be performed separately for each 
+        partition.  If the <literal>GROUP BY</literal> clause does not include the partition keys, only partial
         aggregation can be performed on a per-partition basis, and
         finalization must be performed later.  Because partitionwise grouping
         or aggregation can use significantly more CPU time and memory during
#2Ashutosh Bapat
ashutosh.bapat.oss@gmail.com
In reply to: Andy Atkinson (#1)
Re: Doc: Minor update for enable_partitionwise_aggregate

On Sun, Oct 1, 2023 at 7:38 AM Andy Atkinson <andyatkinson@gmail.com> wrote:

Hello. While reading the docs for the enable_partitionwise_aggregate parameter on the Query Planning page, I thought the description had a small mistake that could be improved.

The current wording is: "which allows grouping or aggregation on a partitioned tables performed separately "

Page: https://www.postgresql.org/docs/current/runtime-config-query.html

I think possible better alternatives could be:

(Option 1) a "partitioned table's partitions" (the possessive form of "it's"). The "enable_partition_pruning" parameter uses "the partitioned table's partitions" in this form. I think this option is good, but I had a slight preference for option 2.
(Option 2) Or to just cut out the first part and say "to be performed separately for each partition", which seemed simpler. So the sentence reads: "which allows grouping or aggregation to be performed separately for each partition"

I would leave "on a partitioned table". Notice that I have removed "s"
from tables.

(Option 3) dropping the "a" so it says "which allows grouping or aggregation on partitioned tables performed separately". I don't think this is as good though because the aggregation happens on the partitions, so it feels slightly off to me to say the "partitioned tables" instead of the partitions.

It's technically incorrect as well. Aggregation is performed on a
single relation always - a join or subquery or simple relation. A join
may have multiple tables in it but the aggregation is performed on its
result and not individual tables and hence not on partitions of
individual tables.

--
Best Wishes,
Ashutosh Bapat

#3Andrew Atkinson
andyatkinson@gmail.com
In reply to: Ashutosh Bapat (#2)
1 attachment(s)
Re: Doc: Minor update for enable_partitionwise_aggregate

Thank you for the feedback and clarifications Ashutosh. How about this?

"which allows grouping or aggregation on partitioned tables to be performed
separately for each partition."

Attached a v2 patch file with this change.

Here is a gist w/ a partitioned table and 2 partitions, comparing execution
plans after enabling the parameter, and reading the plan nodes up from the
bottom. With enable_partitionwise_aggregate = on, I can see the Partial
HashAggregate/"Group Key" on each of the two partitions (of the partitioned
table) where I don't see that when the setting is off (by default).
https://gist.github.com/andyatkinson/7af81fb8a5b9e677af6049e29ab2cb73

For the terms partitioned table vs. partitions, I used how they're
described here:
https://www.postgresql.org/docs/current/ddl-partitioning.html
- partitioned table (virtual table)
- partitions (of a partitioned table)

Thanks!
Andrew

On Tue, Oct 3, 2023 at 4:33 AM Ashutosh Bapat <ashutosh.bapat.oss@gmail.com>
wrote:

Show quoted text

On Sun, Oct 1, 2023 at 7:38 AM Andy Atkinson <andyatkinson@gmail.com>
wrote:

Hello. While reading the docs for the enable_partitionwise_aggregate

parameter on the Query Planning page, I thought the description had a small
mistake that could be improved.

The current wording is: "which allows grouping or aggregation on a

partitioned tables performed separately "

Page: https://www.postgresql.org/docs/current/runtime-config-query.html

I think possible better alternatives could be:

(Option 1) a "partitioned table's partitions" (the possessive form of

"it's"). The "enable_partition_pruning" parameter uses "the partitioned
table's partitions" in this form. I think this option is good, but I had a
slight preference for option 2.

(Option 2) Or to just cut out the first part and say "to be performed

separately for each partition", which seemed simpler. So the sentence
reads: "which allows grouping or aggregation to be performed separately for
each partition"

I would leave "on a partitioned table". Notice that I have removed "s"
from tables.

(Option 3) dropping the "a" so it says "which allows grouping or

aggregation on partitioned tables performed separately". I don't think this
is as good though because the aggregation happens on the partitions, so it
feels slightly off to me to say the "partitioned tables" instead of the
partitions.

It's technically incorrect as well. Aggregation is performed on a
single relation always - a join or subquery or simple relation. A join
may have multiple tables in it but the aggregation is performed on its
result and not individual tables and hence not on partitions of
individual tables.

--
Best Wishes,
Ashutosh Bapat

Attachments:

query-planning-enable-partitionwise-aggregate-improved-wording_v2.patchapplication/octet-stream; name=query-planning-enable-partitionwise-aggregate-improved-wording_v2.patchDownload
diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index 924309af26..24acc4c5b5 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -5271,8 +5271,8 @@ ANY <replaceable class="parameter">num_sync</replaceable> ( <replaceable class="
       <listitem>
        <para>
         Enables or disables the query planner's use of partitionwise grouping
-        or aggregation, which allows grouping or aggregation on a partitioned
-        tables performed separately for each partition.  If the <literal>GROUP
+        or aggregation, which allows grouping or aggregation on partitioned
+        tables to be performed separately for each partition.  If the <literal>GROUP
         BY</literal> clause does not include the partition keys, only partial
         aggregation can be performed on a per-partition basis, and
         finalization must be performed later.  Because partitionwise grouping
#4David Rowley
dgrowleyml@gmail.com
In reply to: Andrew Atkinson (#3)
Re: Doc: Minor update for enable_partitionwise_aggregate

On Wed, 11 Oct 2023 at 16:26, Andrew Atkinson <andyatkinson@gmail.com> wrote:

Thank you for the feedback and clarifications Ashutosh. How about this?

"which allows grouping or aggregation on partitioned tables to be performed separately for each partition."

This looks good to me. I can take care of this.

David

#5David Rowley
dgrowleyml@gmail.com
In reply to: David Rowley (#4)
Re: Doc: Minor update for enable_partitionwise_aggregate

On Wed, 11 Oct 2023 at 19:38, David Rowley <dgrowleyml@gmail.com> wrote:

On Wed, 11 Oct 2023 at 16:26, Andrew Atkinson <andyatkinson@gmail.com> wrote:

"which allows grouping or aggregation on partitioned tables to be performed separately for each partition."

This looks good to me. I can take care of this.

Pushed and backpatched to v11.

David