Window Function "Run Conditions"
It seems like a few too many years of an SQL standard without any
standardised way to LIMIT the number of records in a result set caused
various applications to adopt some strange ways to get this behaviour.
Over here in the PostgreSQL world, we just type LIMIT n; at the end of
our queries. I believe Oracle people did a few tricks with a special
column named "rownum". Another set of people needed SQL that would
work over multiple DBMSes and used something like:
SELECT * FROM (SELECT ... row_number() over (order by ...) rn) a WHERE rn <= 10;
I believe it's fairly common to do paging this way on commerce sites.
The problem with PostgreSQL here is that neither the planner nor the
executor knows that once we get to row_number 11 we may as well
stop. The number will never go back down in this partition.
I'd like to make this better for PostgreSQL 15. I've attached a WIP
patch to do so.
How this works is that I've added prosupport functions for each of
row_number(), rank() and dense_rank(). When doing qual pushdown, if
we happen to hit a windowing function, instead of rejecting the
pushdown, we see if there's a prosupport function and, if there is, ask
it whether this qual can be used to allow us to stop emitting tuples
from the Window node early. I've called these "run
conditions". Basically, keep running while this remains true. Stop
when it's not.
We can't always use the qual directly. For example, if someone does:
SELECT * FROM (SELECT ... row_number() over (order by ...) rn) a WHERE rn = 10;
then if we use the rn = 10 qual, we'd think we could stop right away.
Instead, I've made the prosupport function handle this by generating a
rn <= 10 qual so that we can stop once we get to 11. In this case we
cannot completely pushdown the qual. It needs to remain in place to
filter out rn values 1-9.
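To illustrate the idea outside the database, here is a hypothetical pure-Python sketch (none of these names come from the patch): a generator stands in for the Window node, the run condition stops the scan, and the original qual still filters above it.

```python
# Hypothetical sketch of the "run condition" idea (not PostgreSQL source).
# `rows` stands in for the sorted input feeding the Window node.
rows = [f"row{i}" for i in range(1, 1000)]

def window_with_run_condition(rows, run_condition):
    """Emit (row, row_number) pairs, stopping once the run condition fails.

    row_number() is monotonically increasing within a partition, so once
    the condition is false it can never become true again.
    """
    for rn, row in enumerate(rows, start=1):
        if not run_condition(rn):
            break  # stop scanning: rn only ever grows
        yield row, rn

# For WHERE rn = 10 the derived run condition is rn <= 10, but the
# original qual must stay in place to filter out rn values 1-9.
emitted = list(window_with_run_condition(rows, lambda rn: rn <= 10))
filtered = [(row, rn) for row, rn in emitted if rn == 10]

print(len(emitted))  # 10 -- only 10 rows were ever produced, not 999
print(filtered)      # [('row10', 10)]
```

The point of keeping both conditions is visible here: the run condition bounds how much work is done, while the retained qual preserves the query's semantics.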
Row_number(), rank() and dense_rank() are all monotonically increasing
functions. But we're not limited to just those. COUNT(*) works too,
provided the frame bounds guarantee that the function is either
monotonically increasing or decreasing.
COUNT(*) OVER (ORDER BY .. ROWS BETWEEN CURRENT ROW AND UNBOUNDED
FOLLOWING) is monotonically decreasing, whereas the standard bound
options would make it monotonically increasing.
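A toy calculation (plain Python, not PostgreSQL code; it ignores PARTITION BY and peer rows for simplicity) shows the two directions:

```python
# Toy illustration: COUNT(*) over one sorted 5-row partition under two
# frame specifications, ignoring peer-row (RANGE) subtleties.
def count_star(n_rows, frame):
    """COUNT(*) for each current-row position under a simplified frame."""
    counts = []
    for cur in range(n_rows):
        if frame == "UNBOUNDED PRECEDING AND CURRENT ROW":
            counts.append(cur + 1)       # frame grows with each row
        elif frame == "CURRENT ROW AND UNBOUNDED FOLLOWING":
            counts.append(n_rows - cur)  # frame shrinks with each row
    return counts

inc = count_star(5, "UNBOUNDED PRECEDING AND CURRENT ROW")
dec = count_star(5, "CURRENT ROW AND UNBOUNDED FOLLOWING")
print(inc)  # [1, 2, 3, 4, 5] -- monotonically increasing
print(dec)  # [5, 4, 3, 2, 1] -- monotonically decreasing
```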
The same could be done for MIN() and MAX(). I just don't think that's
worth doing. It seems unlikely that would get enough use.
Anyway. I'd like to work on this more during the PG15 cycle. I
believe the attached patch makes this work ok. There are just a few
things to iron out.
1) Unsure of the API to the prosupport function. I wonder if the
prosupport function should just be able to say whether the function is
monotonically increasing, decreasing, or neither, and then have
core code build a qual. That would make the job of building new
functions easier, but massively reduce the flexibility of the feature.
I'm just not sure it needs to do more in the future.
2) Unsure if what I've got to make EXPLAIN show the run condition is
the right way to do it. Because I don't want nodeWindow.c to have to
re-evaluate the window function to determine if the run condition is
no longer met, I've coded the qual to reference the varno in the
window node's targetlist. That qual is no good for EXPLAIN, so I had to
include another set of quals that include the WindowFunc reference. I
saw that Index Only Scans have a similar means to make EXPLAIN work,
so I just followed that.
David
Attachments:
v1-0001-Allow-some-window-functions-to-finish-execution-e.patch
On Thu, 1 Jul 2021 at 11:11, David Rowley <dgrowleyml@gmail.com> wrote:
+1
This can be a very nice feature.
Pavel
On Thu, 1 Jul 2021 at 21:11, David Rowley <dgrowleyml@gmail.com> wrote:
1) Unsure of the API to the prosupport function. I wonder if the
prosupport function should just be able to say whether the function is
monotonically increasing, decreasing, or neither, and then have
core code build a qual. That would make the job of building new
functions easier, but massively reduce the flexibility of the feature.
I'm just not sure it needs to do more in the future.
I looked at this patch again today and ended up changing the API that
I'd done for the prosupport functions. These just now set a new
"monotonic" field in the (also newly renamed)
SupportRequestWFuncMonotonic struct. This can be set to one of the
values from the newly added MonotonicFunction enum, namely:
MONOTONICFUNC_NONE, MONOTONICFUNC_INCREASING, MONOTONICFUNC_DECREASING
or MONOTONICFUNC_BOTH.
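For illustration only, here is a Python rendering of those values (the patch defines a C enum; I'm assuming the values are bit flags, which the bitwise test quoted later in the thread suggests, with BOTH being the OR of the two directions):

```python
from enum import IntFlag

# Hypothetical Python mirror of the MonotonicFunction enum, assuming
# bit-flag values so tests like (res->monotonic & MONOTONICFUNC_INCREASING)
# work as they appear to in the patch.
class MonotonicFunction(IntFlag):
    NONE = 0
    INCREASING = 1 << 0
    DECREASING = 1 << 1
    BOTH = INCREASING | DECREASING

# COUNT(*) OVER () neither increases nor decreases within a partition,
# so both directions hold at once.
m = MonotonicFunction.BOTH
print(bool(m & MonotonicFunction.INCREASING))  # True
print(bool(m & MonotonicFunction.DECREASING))  # True
```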
I also added handling for a few more cases that are perhaps rare but
could be done with just a few lines of code. For example; COUNT(*)
OVER() is MONOTONICFUNC_BOTH as it can neither increase nor decrease
for a given window partition. I think technically all of the standard
set of aggregate functions could have a prosupport function to handle
that case. Min() and Max() could go a little further, but I'm not sure
if adding handling for that would be worth it, and if someone does
think that it is worth it, then I'd rather do that as a separate
patch.
I put the MonotonicFunction enum in plannodes.h. There's nothing
specific about window functions or support functions. It could, for
example, be reused again for something else such as monotonic
set-returning functions.
One thing which I'm still not sure about is where
find_window_run_conditions() should be located. Currently, it's in
allpaths.c but that does not really feel like the right place to me.
We do have planagg.c in src/backend/optimizer/plan, maybe we need
planwindow.c?
David
Attachments:
v2_teach_planner_and_executor_about_monotonic_wfuncs.patch
On Mon, Aug 16, 2021 at 3:28 AM David Rowley <dgrowleyml@gmail.com> wrote:
Hi,
+ if ((res->monotonic & MONOTONICFUNC_INCREASING) == MONOTONICFUNC_INCREASING)
The above can be simplified as 'if (res->monotonic & MONOTONICFUNC_INCREASING)'.
Cheers
On Tue, 17 Aug 2021 at 03:51, Zhihong Yu <zyu@yugabyte.com> wrote:
+ if ((res->monotonic & MONOTONICFUNC_INCREASING) == MONOTONICFUNC_INCREASING)
The above can be simplified as 'if (res->monotonic & MONOTONICFUNC_INCREASING) '
True. I've attached an updated patch.
David
Attachments:
v3_teach_planner_and_executor_about_monotonic_wfuncs.patch
Hi David:
Thanks for the patch.
On Wed, Aug 18, 2021 at 6:40 PM David Rowley <dgrowleyml@gmail.com> wrote:
On Tue, 17 Aug 2021 at 03:51, Zhihong Yu <zyu@yugabyte.com> wrote:
+ if ((res->monotonic & MONOTONICFUNC_INCREASING) == MONOTONICFUNC_INCREASING)
The above can be simplified as 'if (res->monotonic & MONOTONICFUNC_INCREASING) '
True. I've attached an updated patch.
David
Looks like we need to narrow down the situation where we can apply
this optimization.
SELECT * FROM
(SELECT empno,
salary,
count(*) over (order by empno desc) as c ,
dense_rank() OVER (ORDER BY salary DESC) dr
FROM empsalary) emp
WHERE dr = 1;
In the current master, the result is:
empno | salary | c | dr
-------+--------+---+----
8 | 6000 | 4 | 1
(1 row)
In the patched version, the result is:
empno | salary | c | dr
-------+--------+---+----
8 | 6000 | 1 | 1
(1 row)
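To see the hazard outside the database, here is a pure-Python re-enactment with made-up rows (the full empsalary table isn't shown in this thread, so the data below is chosen only to reproduce the same c=4 vs c=1 discrepancy):

```python
# Made-up rows (empno, salary) chosen so empno 8 has the top salary but
# the lowest empno, as in the reported output.
emps = [(8, 6000), (9, 4500), (10, 5200), (11, 5200)]

def dense_rank_desc(rows, key):
    """dense_rank() over ORDER BY key DESC: peers share a rank, no gaps."""
    ranks, rank, prev = {}, 0, None
    for r in sorted(rows, key=key, reverse=True):
        if key(r) != prev:
            rank += 1
            prev = key(r)
        ranks[r] = rank
    return ranks

def count_desc(rows, key):
    """count(*) over ORDER BY key DESC (key values are unique here)."""
    return {r: i + 1 for i, r in enumerate(sorted(rows, key=key, reverse=True))}

dr = dense_rank_desc(emps, key=lambda e: e[1])  # by salary
c = count_desc(emps, key=lambda e: e[0])        # by empno

# Correct: the count node saw every row before the dr = 1 filter.
correct = [(e[0], e[1], c[e], dr[e]) for e in emps if dr[e] == 1]

# Buggy early stop: the dense_rank node quits emitting once dr passes 1,
# so the count node above it only ever sees the surviving row.
survivors = [e for e in emps if dr[e] == 1]
c_buggy = count_desc(survivors, key=lambda e: e[0])
buggy = [(e[0], e[1], c_buggy[e], dr[e]) for e in survivors]

print(correct)  # [(8, 6000, 4, 1)] -- c counts all four rows
print(buggy)    # [(8, 6000, 1, 1)] -- c is wrong, as in the report
```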
--
Best Regards
Andy Fan (https://www.aliyun.com/)
On Thu, 19 Aug 2021 at 00:20, Andy Fan <zhihui.fan1213@gmail.com> wrote:
In the current master, the result is:
empno | salary | c | dr
-------+--------+---+----
8 | 6000 | 4 | 1
In the patched version, the result is:
empno | salary | c | dr
-------+--------+---+----
8 | 6000 | 1 | 1
Thanks for taking it for a spin.
That's a bit unfortunate. I don't immediately see how to fix it other
than to restrict the optimisation to only apply when there's a single
WindowClause. It might be possible to relax it further and only apply
if it's the final window clause to be evaluated, but in those cases,
the savings are likely to be much less anyway as some previous
WindowAgg will have exhausted all rows from its subplan. Likely
restricting it to only working if there's 1 WindowClause would be fine
as for the people using row_number() for a top-N type query, there's
most likely only going to be 1 WindowClause.
Anyway, I'll take a few more days to think about it before posting a fix.
David
On Thu, Aug 19, 2021 at 2:35 PM David Rowley <dgrowleyml@gmail.com> wrote:
I am trying to hack the select_active_windows function to make
sure the WindowClause with a run condition is the last one to be
evaluated (we would also need to consider the case where more than one
window function has a run condition); at that point, the run condition
clause is ready to apply.
However there are two troubles in this direction: a). This may conflict
with "the windows that need the same sorting are adjacent in the list."
b). "when two or more windows are order-equivalent then all peer rows
must be presented in the same order in all of them. .. (See General Rule 4 of
<window clause> in SQL2008 - SQL2016.)"
In summary, I am not sure if it is correct to freely change the
execution order of WindowAggs.
Likely
restricting it to only working if there's 1 WindowClause would be fine
as for the people using row_number() for a top-N type query, there's
most likely only going to be 1 WindowClause.
This sounds practical. And I suggest the following small changes.
(just check the partitionClause before the prosupport)
@@ -2133,20 +2133,22 @@ find_window_run_conditions(Query *subquery,
RangeTblEntry *rte, Index rti,
*keep_original = true;
- prosupport = get_func_support(wfunc->winfnoid);
-
- /* Check if there's a support function for 'wfunc' */
- if (!OidIsValid(prosupport))
- return false;
-
/*
* Currently the WindowAgg node just stop when the run condition is no
* longer true. If there is a PARTITION BY clause then we cannot just
* stop as other partitions still need to be processed.
*/
+
+ /* Check this first since a window function with a partition clause is common */
if (wclause->partitionClause != NIL)
return false;
+ prosupport = get_func_support(wfunc->winfnoid);
+
+ /* Check if there's a support function for 'wfunc' */
+ if (!OidIsValid(prosupport))
+ return false;
+
/* get the Expr from the other side of the OpExpr */
if (wfunc_left)
otherexpr = lsecond(opexpr->args);
--
Best Regards
Andy Fan (https://www.aliyun.com/)
This looks like an awesome addition.
I have one technical question...
Is it possible to actually transform the row_number case into a LIMIT
clause or make the planner support for this case equivalent to it (in
which case we can replace the LIMIT clause planning to transform into
a window function)?
The reason I ask is because the Limit plan node is actually quite a
bit more optimized than the general window function plan node. It
calculates cost estimates based on the limit and can support Top-N
sort nodes.
But the bigger question is whether this patch is ready for a committer
to look at? Were you able to resolve Andy Fan's bug report? Did you
resolve the two questions in the original email?
On Tue, Mar 15, 2022 at 5:24 PM Greg Stark <stark@mit.edu> wrote:
+1 to all this
It seems like this effort would aid in implementing what some other
databases implement via the QUALIFY clause, which is to window functions
what HAVING is to aggregate functions.
example:
https://cloud.google.com/bigquery/docs/reference/standard-sql/query-syntax#qualify_clause
Hi,
On 2021-08-19 18:35:27 +1200, David Rowley wrote:
Anyway, I'll take a few more days to think about it before posting a fix.
The patch in the CF entry doesn't apply: http://cfbot.cputube.org/patch_37_3234.log
The quoted message was ~6 months ago. I think this CF entry should be marked
as returned-with-feedback?
- Andres
On Thu, 26 Aug 2021 at 14:54, Andy Fan <zhihui.fan1213@gmail.com> wrote:
Thanks for looking at that.
My current thoughts are that it just feels a little too risky to
adjust the comparison function that sorts the window clauses to pay
attention to the run-condition.
We would need to ensure that there's just a single window function
with a run condition as it wouldn't be valid for there to be multiple.
It would be easy enough to ensure we only push quals into just 1
window clause, but that and meddling with the evaluation order has
trade-offs. To do that properly, we'd likely want to consider the
costs when deciding which window clause would benefit from having
quals pushed the most. Plus, if we start messing with the evaluation
order then we'd likely really want some sort of costing to check if
pushing a qual and adjusting the evaluation order is actually cheaper
than not pushing the qual and keeping the clauses in the order that
requires the minimum number of sorts. The planner is not really
geared up for costing things like that properly, that's why we just
assume the order with the least sorts is best. In reality that's often
not going to be true, as an index may exist and we might want to
evaluate a clause first if doing so would let us replace a sort with an
index scan. Sorting the window clauses based on their SortGroupClause is
just the best we can do for now at that stage in planning.
I think it's safer to just disable the optimisation when there are
multiple window clauses. Multiple matching clauses are merged
already, so it's perfectly valid to have multiple window functions,
it's just they must share the same window clause. I don't think
that's terrible as with the major use case that I have in mind for
this, the window function is only added to limit the number of rows.
In most cases I can imagine, there'd be no reason to have an
additional window function with different frame options.
I've attached an updated patch.
Attachments:
v2-0001-Teach-planner-and-executor-about-monotonic-window.patch
On Wed, 16 Mar 2022 at 10:24, Greg Stark <stark@mit.edu> wrote:
This looks like an awesome addition.
Thanks
I have one technical questions...
Is it possible to actually transform the row_number case into a LIMIT
clause or make the planner support for this case equivalent to it (in
which case we can replace the LIMIT clause planning to transform into
a window function)?
Currently, I have only coded it to support monotonically increasing
and decreasing functions. Putting a <= <const> type condition on a
row_number() function with no PARTITION BY clause I think is logically
the same as a LIMIT clause, but that's not the case for rank() and
dense_rank(). There may be multiple peer rows with the same rank in
those cases. We'd have no way to know what the LIMIT should be set to.
I don't really want to just do this for row_number().
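A quick plain-Python illustration of why ties break the LIMIT analogy (the salary figures are made up):

```python
# rank() ties mean "rank <= N" is not the same thing as LIMIT N.
salaries = [6000, 5200, 5200, 5200, 4500]  # made-up data, sorted DESC

def rank_desc(vals):
    """SQL rank(): peers share a rank and the next rank skips ahead."""
    ranks, prev, current = [], None, 0
    for i, v in enumerate(vals, start=1):
        if v != prev:
            current = i
            prev = v
        ranks.append(current)
    return ranks

ranks = rank_desc(salaries)
print(ranks)                            # [1, 2, 2, 2, 5]
print(sum(1 for r in ranks if r <= 2))  # 4 -- "rank <= 2" returns 4 rows
```

With row_number() the same predicate would return exactly 2 rows, which is why only that function maps cleanly onto a LIMIT.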
The reason I ask is because the Limit plan node is actually quite a
bit more optimized than the general window function plan node. It
calculates cost estimates based on the limit and can support Top-N
sort nodes.
This is true. There's perhaps no reason why an additional property
could not be added to allow the prosupport function to optionally set
*exactly* the maximum number of rows that could match the condition.
e.g. for select * from (select *,row_number() over (order by c) rn
from ..) w where rn <= 10; that could be set to 10, and if we used
rank() instead of row_number(), it could just be left unset.
I think this is probably worth thinking about at some future date. I
don't really want to make it part of this effort. I also don't think
I'm doing anything here that would need to be undone to make that
work.
David
On Thu, 17 Mar 2022 at 17:04, Corey Huinker <corey.huinker@gmail.com> wrote:
It seems like this effort would aid in implementing what some other databases implement via the QUALIFY clause, which is to window functions what HAVING is to aggregate functions.
example: https://cloud.google.com/bigquery/docs/reference/standard-sql/query-syntax#qualify_clause
Isn't that just syntactic sugar? You could get the same by adding a
subquery with a WHERE clause that filters rows after the window
functions have been evaluated.
David
On Tue, Mar 22, 2022 at 3:24 PM David Rowley <dgrowleyml@gmail.com> wrote:
Hi,
The following code seems to be common between if / else blocks (w.r.t.
wfunc_left) of find_window_run_conditions().
+ op = get_opfamily_member(opinfo->opfamily_id,
+ opinfo->oplefttype,
+ opinfo->oprighttype,
+ newstrategy);
+
+ newopexpr = (OpExpr *) make_opclause(op,
+ opexpr->opresulttype,
+ opexpr->opretset,
+ otherexpr,
+ (Expr *) wfunc,
+ opexpr->opcollid,
+ opexpr->inputcollid);
+ newopexpr->opfuncid = get_opcode(op);
+
+ *keep_original = true;
+ runopexpr = newopexpr;
It would be nice if this code could be shared.
+ WindowClause *wclause = (WindowClause *)
+ list_nth(subquery->windowClause,
+ wfunc->winref - 1);
The code would be more readable if list_nth() is indented.
+ /* Check the left side of the OpExpr */
It seems the code for checking left / right is the same. It would be better
to extract and reuse the code.
Cheers
On Wed, 23 Mar 2022 at 12:50, Zhihong Yu <zyu@yugabyte.com> wrote:
The following code seems to be common between if / else blocks (w.r.t. wfunc_left) of find_window_run_conditions().
It would be nice if this code can be shared.
I remember thinking about that and thinking that I didn't want to
overcomplicate the if conditions for the strategy tests. I'd thought
these would have become:
if ((wfunc_left && (strategy == BTLessStrategyNumber ||
strategy == BTLessEqualStrategyNumber)) ||
(!wfunc_left && (strategy == BTGreaterStrategyNumber ||
strategy == BTGreaterEqualStrategyNumber)))
which I didn't think was very readable. That caused me to keep it separate.
On reflection, we can just leave the strategy checks as they are, then
add the additional code for checking wfunc_left when checking the
res->monotonic, i.e:
if ((wfunc_left && (res->monotonic & MONOTONICFUNC_INCREASING)) ||
(!wfunc_left && (res->monotonic & MONOTONICFUNC_DECREASING)))
I think that's more readable than doubling up the strategy checks, so
I've done it that way in the attached.
+ WindowClause *wclause = (WindowClause *)
+ list_nth(subquery->windowClause,
+ wfunc->winref - 1);
The code would be more readable if list_nth() is indented.
That's just the way pgindent put it.
+ /* Check the left side of the OpExpr */
It seems the code for checking left / right is the same. It would be better to extract and reuse the code.
I've moved some of that code into find_window_run_conditions() which
removes about 10 lines of code.
Updated patch attached. Thanks for looking.
David
Attachments:
v3-0001-Teach-planner-and-executor-about-monotonic-window.patch
On Wed, 23 Mar 2022 at 11:24, David Rowley <dgrowleyml@gmail.com> wrote:
I think it's safer to just disable the optimisation when there are
multiple window clauses. Multiple matching clauses are merged
already, so it's perfectly valid to have multiple window functions,
it's just they must share the same window clause. I don't think
that's terrible as with the major use case that I have in mind for
this, the window function is only added to limit the number of rows.
In most cases I can imagine, there'd be no reason to have an
additional window function with different frame options.
I've not looked into the feasibility of it, but I had a thought that
we could just accumulate all the run-conditions in a new field in the
PlannerInfo then just tag them onto the top-level WindowAgg when
building the plan.
I'm just not sure it would be any more useful than what the v3 patch
is currently doing as intermediate WindowAggs would still need to
process all rows. I think it would only save the window function
evaluation of the top-level WindowAgg for rows that don't match the
run-condition. All the supported window functions are quite cheap, so
it's not a huge saving. I'd bet there would be example cases where it
would be measurable though.
David
On Tue, Mar 22, 2022 at 3:39 PM David Rowley <dgrowleyml@gmail.com> wrote:
I'd like some of that syntactic sugar please. It goes nicely with my
HAVING syntactic coffee.
David J.
On Wed, 23 Mar 2022 at 16:30, David Rowley <dgrowleyml@gmail.com> wrote:
> On Wed, 23 Mar 2022 at 11:24, David Rowley <dgrowleyml@gmail.com> wrote:
> > I think it's safer to just disable the optimisation when there are
> > multiple window clauses. Multiple matching clauses are merged
> > already, so it's perfectly valid to have multiple window functions,
> > it's just they must share the same window clause. I don't think
> > that's terrible as with the major use case that I have in mind for
> > this, the window function is only added to limit the number of rows.
> > In most cases I can imagine, there'd be no reason to have an
> > additional window function with different frame options.
> I've not looked into the feasibility of it, but I had a thought that
> we could just accumulate all the run-conditions in a new field in the
> PlannerInfo then just tag them onto the top-level WindowAgg when
> building the plan.
> I'm just not sure it would be any more useful than what the v3 patch
> is currently doing as intermediate WindowAggs would still need to
> process all rows. I think it would only save the window function
> evaluation of the top-level WindowAgg for rows that don't match the
> run-condition. All the supported window functions are quite cheap, so
> it's not a huge saving. I'd bet there would be example cases where it
> would be measurable though.
Another way of doing this that seems better is to make it so only the
top-level WindowAgg stops processing when the run condition becomes
false. Any intermediate WindowAggs must continue processing tuples,
but may skip evaluation of their WindowFuncs.
Doing things this way also allows us to handle cases where there is a
PARTITION BY clause. In that case, however, the top-level WindowAgg
must not stop processing and return NULL; instead, it can just act as
if it were an intermediate WindowAgg and stop evaluating WindowFuncs.
The top-level WindowAgg must continue processing the tuples so that
the other partitions are also processed.
I made the v4 patch do things this way and tested its performance
against current master. Tests 1 and 2 have PARTITION BY clauses.
There's a small performance increase from not evaluating the
row_number() function once rn <= 2 is no longer true.
Test 3 shows the same speedup as the original patch where we just stop
processing any further tuples when the run condition is no longer true
and there is no PARTITION BY clause.
Setup:
create table xy (x int, y int);
insert into xy select x,y from generate_series(1,1000) x,
generate_series(1,1000) y;
create index on xy(x,y);
vacuum analyze xy;
Test 1:
explain analyze select * from (select x,y,row_number() over (partition
by x order by y) rn from xy) as xy where rn <= 2;
Master:
Execution Time: 359.553 ms
Execution Time: 354.235 ms
Execution Time: 357.646 ms
v4 patch:
Execution Time: 346.641 ms
Execution Time: 337.131 ms
Execution Time: 336.531 ms
(5% faster)
Test 2:
explain analyze select * from (select x,y,row_number() over (partition
by x order by y) rn from xy) as xy where rn = 1;
Master:
Execution Time: 359.046 ms
Execution Time: 357.601 ms
Execution Time: 357.977 ms
v4 patch:
Execution Time: 336.540 ms
Execution Time: 337.024 ms
Execution Time: 342.706 ms
(5.7% faster)
Test 3:
explain analyze select * from (select x,y,row_number() over (order by
x,y) rn from xy) as xy where rn <= 2;
Master:
Execution Time: 362.322 ms
Execution Time: 348.812 ms
Execution Time: 349.471 ms
v4 patch:
Execution Time: 0.060 ms
Execution Time: 0.037 ms
Execution Time: 0.037 ms
(~8000x faster)
One thing which I'm not sure about with the patch is how I'm handling
the evaluation of the runcondition in nodeWindowAgg.c. Instead of
having ExecQual() evaluate an OpExpr such as "row_number() over (...)
<= 10", I'm replacing the WindowFunc with the Var in the targetlist
that corresponds to the given WindowFunc. This saves having to double
evaluate the WindowFunc. Instead, the value of the Var can be taken
directly from the slot. I don't know of anywhere else we do things
quite like that. The runcondition is slightly similar to HAVING
clauses, but HAVING clauses don't work this way. Maybe they would
have if slots had existed back then. Or maybe it's a bad idea to set a
precedent that the targetlist Vars must be evaluated already. Does
anyone have any thoughts on this part?
v4 patch attached.
David
Attachments:
v4-0001-Teach-planner-and-executor-about-monotonic-window.patch (text/plain, +1312 -44)
Hi,
On 2022-03-29 15:11:52 +1300, David Rowley wrote:
> One thing which I'm not sure about with the patch is how I'm handling
> the evaluation of the runcondition in nodeWindowAgg.c. Instead of
> having ExecQual() evaluate an OpExpr such as "row_number() over (...)
> <= 10", I'm replacing the WindowFunc with the Var in the targetlist
> that corresponds to the given WindowFunc. This saves having to double
> evaluate the WindowFunc. Instead, the value of the Var can be taken
> directly from the slot. I don't know of anywhere else we do things
> quite like that. The runcondition is slightly similar to HAVING
> clauses, but HAVING clauses don't work this way.
Don't HAVING clauses actually work pretty similar? Yes, they don't have a Var,
but for expression evaluation purposes an Aggref is nearly the same as a plain
Var:
EEO_CASE(EEOP_INNER_VAR)
{
    int         attnum = op->d.var.attnum;

    /*
     * Since we already extracted all referenced columns from the
     * tuple with a FETCHSOME step, we can just grab the value
     * directly out of the slot's decomposed-data arrays. But let's
     * have an Assert to check that that did happen.
     */
    Assert(attnum >= 0 && attnum < innerslot->tts_nvalid);
    *op->resvalue = innerslot->tts_values[attnum];
    *op->resnull = innerslot->tts_isnull[attnum];

    EEO_NEXT();
}

vs

EEO_CASE(EEOP_AGGREF)
{
    /*
     * Returns a Datum whose value is the precomputed aggregate value
     * found in the given expression context.
     */
    int         aggno = op->d.aggref.aggno;

    Assert(econtext->ecxt_aggvalues != NULL);
    *op->resvalue = econtext->ecxt_aggvalues[aggno];
    *op->resnull = econtext->ecxt_aggnulls[aggno];

    EEO_NEXT();
}
specifically we don't re-evaluate expressions?
This is afaics slightly cheaper than referencing a variable in a slot.
Greetings,
Andres Freund