Window Functions with identical PARTITION BY and ORDER BY clauses evaluated separately
Relevant documentation:
https://www.postgresql.org/docs/9.4/queries-table-expressions.html#QUERIES-WINDOW
"When multiple window functions are used, all the window functions having
syntactically equivalent PARTITION BY and ORDER BY clauses in their window
definitions are guaranteed to be evaluated in a single pass over the data."
PostgreSQL version:
"PostgreSQL 17.4 on x86_64-windows, compiled by msvc-19.42.34436, 64-bit"
Machine information:
Windows server 2016
kernel version 10.0.14393.7783
12.00 GiB memory
4 cores
Reproduction (my_table_contents.csv attached to email as zip file):
- CREATE TABLE my_table (champid SMALLINT, champmastery INT);
- COPY my_table FROM 'path\to\my_table_contents.csv' WITH (FORMAT CSV);
- CREATE INDEX my_idx ON my_table (champid, champmastery);
- SELECT
SUM(CAST(champmastery AS BIGINT)) OVER (
PARTITION BY champid
ORDER BY champmastery ASC
ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING
) AS sumx,
COUNT(1) OVER (
PARTITION BY champid
ORDER BY champmastery ASC
RANGE BETWEEN 1000 PRECEDING AND 1000 FOLLOWING
) AS sampledensity
FROM my_table;
I apologize for the email spacing. It may cause issues with copy paste.
Expected result: Given both window functions in the above SELECT query have
identical PARTITION BY and ORDER BY clauses, the execution plan should have
a single "Window Aggregation" operation.
Actual result: The execution plan generated for the above query has two
"Window Aggregation" operations
[image: image.png]
Those are different windows. See:
https://www.postgresql.org/docs/current/sql-expressions.html#SYNTAX-WINDOW-FUNCTIONS
Because the (optional) frame_clause is part of the window_definition, it
does seem like a minor documentation bug as we ought to mention that the
frame (if it exists) needs to be equivalent too. Here's a better link to
where we state that:
https://www.postgresql.org/docs/current/queries-table-expressions.html#QUERIES-WINDOW
Here's a simplified example:
greg=# explain select count(*) over (partition by oid rows between 1000
preceding and 1000 following),
count(*) over (partition by oid rows between 1000 preceding and 1000
following) from pg_class;
QUERY PLAN
-------------------------------------------------------------------------------------------------
WindowAgg (cost=0.28..60.87 rows=791 width=20)
-> Index Only Scan using pg_class_oid_index on pg_class
(cost=0.28..49.00 rows=791 width=4)
(2 rows)
greg=# explain select count(*) over (partition by oid rows between 1000
preceding and 1000 following),
count(*) over (partition by oid rows between 1000 preceding and 9999
following) from pg_class;
QUERY PLAN
-------------------------------------------------------------------------------------------------------
WindowAgg (cost=0.28..72.73 rows=791 width=20)
-> WindowAgg (cost=0.28..60.87 rows=791 width=12)
-> Index Only Scan using pg_class_oid_index on pg_class
(cost=0.28..49.00 rows=791 width=4)
Cheers,
Greg
--
Crunchy Data - https://www.crunchydata.com
Enterprise Postgres Software Products & Tech Support
Was it really not intentional that the docs explicitly name PARTITION BY
and ORDER BY rather than the entire window_definition? If I understand
correctly, only those two clauses control which records are hit and in what
order.
Sorry for the impudence, but I was rather excited for the potential
performance gain when I saw the doc excerpt. I appreciate the time taken to
respond to my query.
Thank you,
Christopher Inokuchi
On Fri, Mar 7, 2025 at 6:24 AM Greg Sabino Mullane <htamfids@gmail.com>
wrote:
Show quoted text
Those are different windows. See:
https://www.postgresql.org/docs/current/sql-expressions.html#SYNTAX-WINDOW-FUNCTIONS
Because the (optional) frame_clause is part of the window_definition, it
does seem like a minor documentation bug as we ought to mention that the
frame (if it exists) needs to be equivalent too. Here's a better link to
where we state that:https://www.postgresql.org/docs/current/queries-table-expressions.html#QUERIES-WINDOW
Here's a simplified example:
greg=# explain select count(*) over (partition by oid rows between 1000
preceding and 1000 following),
count(*) over (partition by oid rows between 1000 preceding and 1000
following) from pg_class;
QUERY PLAN-------------------------------------------------------------------------------------------------
WindowAgg (cost=0.28..60.87 rows=791 width=20)
-> Index Only Scan using pg_class_oid_index on pg_class
(cost=0.28..49.00 rows=791 width=4)
(2 rows)greg=# explain select count(*) over (partition by oid rows between 1000
preceding and 1000 following),
count(*) over (partition by oid rows between 1000 preceding and 9999
following) from pg_class;
QUERY PLAN-------------------------------------------------------------------------------------------------------
WindowAgg (cost=0.28..72.73 rows=791 width=20)
-> WindowAgg (cost=0.28..60.87 rows=791 width=12)
-> Index Only Scan using pg_class_oid_index on pg_class
(cost=0.28..49.00 rows=791 width=4)Cheers,
Greg--
Crunchy Data - https://www.crunchydata.com
Enterprise Postgres Software Products & Tech Support
Christopher Inokuchi <cinokuchi@gmail.com> writes:
Was it really not intentional that the docs explicitly name PARTITION BY
and ORDER BY rather than the entire window_definition? If I understand
correctly, only those two clauses control which records are hit and in what
order.
Yeah, it's intentional, and in fact required by the SQL standard.
However, you're misinterpreting what the guarantee is. The spec
requirement is that window functions sharing PARTITION BY and
ORDER BY all be evaluated on the same concrete ordering of the
data, ie there can't be any re-sorting between them. And that's
what we implement. We do use a separate WindowAgg node for
each distinguishable window specification, but you'll notice
there is not a Sort step between them unless the query involves
entirely-incompatible PARTITION/ORDER BY specs.
Perhaps the wording in section 7.2.5 could be improved; I agree
that "evaluated in one pass" is capable of being read in more
than one way, and it's not clear that it's referring to sorts.
Do you have any suggestions for clearer wording?
It's possible that we could restructure things so that window
functions having distinct frame clauses were nonetheless done
in one WindowAgg node. But it would complicate the code and
it's far from obvious to me that it'd buy much in speed.
Optimizing the sort steps is where most of the potential win lies.
regards, tom lane
I see. Thank you for the response, I appreciate the postgresql team's
patience in humoring my request twice.
Do you have any suggestions for clearer wording?
I think specifically the word "pass" in "evaluated in one pass" suggests
iteration. Given the intention of the passage is to indicate the number of
times the data is sorted, then I believe it should refer to "sorting" or
"ordering" explicitly. Perhaps "without additional interleaving sort
operations" (though the parenthetical at the end of the paragraph would
become repetitive so maybe not).
Sincerely,
Christopher Inokuchi
On Fri, Mar 7, 2025 at 4:05 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
Show quoted text
Christopher Inokuchi <cinokuchi@gmail.com> writes:
Was it really not intentional that the docs explicitly name PARTITION BY
and ORDER BY rather than the entire window_definition? If I understand
correctly, only those two clauses control which records are hit and inwhat
order.
Yeah, it's intentional, and in fact required by the SQL standard.
However, you're misinterpreting what the guarantee is. The spec
requirement is that window functions sharing PARTITION BY and
ORDER BY all be evaluated on the same concrete ordering of the
data, ie there can't be any re-sorting between them. And that's
what we implement. We do use a separate WindowAgg node for
each distinguishable window specification, but you'll notice
there is not a Sort step between them unless the query involves
entirely-incompatible PARTITION/ORDER BY specs.Perhaps the wording in section 7.2.5 could be improved; I agree
that "evaluated in one pass" is capable of being read in more
than one way, and it's not clear that it's referring to sorts.
Do you have any suggestions for clearer wording?It's possible that we could restructure things so that window
functions having distinct frame clauses were nonetheless done
in one WindowAgg node. But it would complicate the code and
it's far from obvious to me that it'd buy much in speed.
Optimizing the sort steps is where most of the potential win lies.regards, tom lane
On Fri, Mar 7, 2025 at 5:05 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
Christopher Inokuchi <cinokuchi@gmail.com> writes:
Was it really not intentional that the docs explicitly name PARTITION BY
and ORDER BY rather than the entire window_definition? If I understand
correctly, only those two clauses control which records are hit and inwhat
order.
Yeah, it's intentional, and in fact required by the SQL standard.
However, you're misinterpreting what the guarantee is. The spec
requirement is that window functions sharing PARTITION BY and
ORDER BY all be evaluated on the same concrete ordering of the
data, ie there can't be any re-sorting between them. And that's
what we implement. We do use a separate WindowAgg node for
each distinguishable window specification, but you'll notice
there is not a Sort step between them unless the query involves
entirely-incompatible PARTITION/ORDER BY specs.Perhaps the wording in section 7.2.5 could be improved; I agree
that "evaluated in one pass" is capable of being read in more
than one way, and it's not clear that it's referring to sorts.
Do you have any suggestions for clearer wording?
We seem to do quite a few things that we don't tell the user about. The
attached patch describes those things and adds an example demonstrating
their effects via an explain; which is the only way you can construct an
example for this material.
Considered a draft pending feedback to either throw it out in favor of a
one-word/one-line fix or support for going into this amount of detail.
David J.
Attachments:
v0-draft-0001-doc-more-thoroughly-explain-window-function-processing.patchtext/x-patch; charset=US-ASCII; name=v0-draft-0001-doc-more-thoroughly-explain-window-function-processing.patchDownload+88-11
"David G. Johnston" <david.g.johnston@gmail.com> writes:
On Fri, Mar 7, 2025 at 5:05 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
Perhaps the wording in section 7.2.5 could be improved; I agree
that "evaluated in one pass" is capable of being read in more
than one way, and it's not clear that it's referring to sorts.
Do you have any suggestions for clearer wording?
We seem to do quite a few things that we don't tell the user about. The
attached patch describes those things and adds an example demonstrating
their effects via an explain; which is the only way you can construct an
example for this material.
Considered a draft pending feedback to either throw it out in favor of a
one-word/one-line fix or support for going into this amount of detail.
Meh. This is detail that wasn't asked for and doesn't belong in this
section anyway. (If we did want to write something like this, chapter
14 Performance Tips would be a more plausible home I think. That'd
at least remove the problem of forward-referencing EXPLAIN output.)
For a shorter fix, maybe replace the first two sentences
When multiple window functions are used, all the window functions
having syntactically equivalent PARTITION BY and ORDER BY clauses
in their window definitions are guaranteed to be evaluated in a
single pass over the data. Therefore they will see the same sort
ordering, even if the ORDER BY does not uniquely determine an
ordering.
with
When multiple window functions are used, all the window functions
having syntactically equivalent PARTITION BY and ORDER BY clauses
in their window definitions are guaranteed to see the same
ordering of the input rows, even if the ORDER BY does not uniquely
determine the ordering.
regards, tom lane
On Sat, Mar 8, 2025 at 12:50 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
"David G. Johnston" <david.g.johnston@gmail.com> writes:
On Fri, Mar 7, 2025 at 5:05 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
Perhaps the wording in section 7.2.5 could be improved; I agree
that "evaluated in one pass" is capable of being read in more
than one way, and it's not clear that it's referring to sorts.
Do you have any suggestions for clearer wording?We seem to do quite a few things that we don't tell the user about. The
attached patch describes those things and adds an example demonstrating
their effects via an explain; which is the only way you can construct an
example for this material.
Considered a draft pending feedback to either throw it out in favor of a
one-word/one-line fix or support for going into this amount of detail.Meh. This is detail that wasn't asked for and doesn't belong in this
section anyway. (If we did want to write something like this, chapter
14 Performance Tips would be a more plausible home I think. That'd
at least remove the problem of forward-referencing EXPLAIN output.)
Thank you for the input. I figured that would be the response and am fine
with it. I took a look at Chapter 14 and agree it could fit there but that
just brought up more thoughts. I'll start a separate thread for all that.
For a shorter fix
When multiple window functions are used, all the window functions
having syntactically equivalent PARTITION BY and ORDER BY clauses
in their window definitions are guaranteed to see the same
ordering of the input rows, even if the ORDER BY does not uniquely
determine the ordering.
WFM, the key point is removing the problematic wording and I do find this
reads better in the end.
However, I'm having trouble understanding the purpose of the word
"syntactically" here. Or even using the "clauses" at all. Why not:
"When multiple window functions are used, all the window functions having
the same partitioning and ordering expressions in their window definitions
are guaranteed to see the same ordering of the input rows, even if
the ordering is not deterministic."
David J.
"David G. Johnston" <david.g.johnston@gmail.com> writes:
On Sat, Mar 8, 2025 at 12:50 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
For a shorter fix
When multiple window functions are used, all the window functions
having syntactically equivalent PARTITION BY and ORDER BY clauses
in their window definitions are guaranteed to see the same
ordering of the input rows, even if the ORDER BY does not uniquely
determine the ordering.
WFM, the key point is removing the problematic wording and I do find this
reads better in the end.
However, I'm having trouble understanding the purpose of the word
"syntactically" here. Or even using the "clauses" at all. Why not:
"When multiple window functions are used, all the window functions having
the same partitioning and ordering expressions in their window definitions
are guaranteed to see the same ordering of the input rows, even if
the ordering is not deterministic."
Sure, we can lose "syntactically" --- it's probably not even strictly
correct anyway, given that what we really look for is equal() parsed
expression trees. By the same token though, I don't love "same"
here, because what is "same"? In particular, in your phrasing
it's not clear whether "PARTITION BY foo" and "ORDER BY foo"
are considered equivalent for this purpose. So let's go with
my wording less the "syntactically".
regards, tom lane
On Sun, Mar 9, 2025 at 10:27 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
"David G. Johnston" <david.g.johnston@gmail.com> writes:
On Sat, Mar 8, 2025 at 12:50 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
For a shorter fix
When multiple window functions are used, all the window functions
having syntactically equivalent PARTITION BY and ORDER BY clauses
in their window definitions are guaranteed to see the same
ordering of the input rows, even if the ORDER BY does not uniquely
determine the ordering.
So let's go with
my wording less the "syntactically".
+1
David J.
"David G. Johnston" <david.g.johnston@gmail.com> writes:
On Sun, Mar 9, 2025 at 10:27 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
So let's go with my wording less the "syntactically".
+1
Done that way.
regards, tom lane