Possible bug: SQL function parameter in window frame definition
Hi all,
I noticed this strange behaviour whilst trying to write a function for Postgres 11.5 (PostgreSQL 11.5 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 4.8.5 20150623 (Red Hat 4.8.5-36), 64-bit) and reduced it to this minimal example. Using a function parameter in the window frame definition seems to be the cause of the error.
create or replace function f(group_size bigint) returns setof int[] as
$$
select array_agg(s) over w
from generate_series(1,10) s
window w as (order by s rows between current row and group_size following)
$$ language sql immutable;
Calling the function without a column list succeeds:
postgres=# select f(3);
f
------------
{1,2,3,4}
{2,3,4,5}
{3,4,5,6}
{4,5,6,7}
{5,6,7,8}
{6,7,8,9}
{7,8,9,10}
{8,9,10}
{9,10}
{10}
(10 rows)
Calling the function with select * fails:
postgres=# select * from f(3);
ERROR: 42704: no value found for parameter 1
LOCATION: ExecEvalParamExtern, execExprInterp.c:2296
Using a plpgsql function with a stringified query works, which is my current workaround:
create or replace function f1(group_size bigint) returns setof int[] as
$$
begin
return query execute format($q$
select array_agg(s) over w as t
from generate_series(1,10) s
window w as (order by s rows between current row and %1$s following)
$q$,group_size);
end;
$$ language plpgsql immutable;
This appears to be a bug to me. If confirmed that this is not some expected behaviour unknown to me I will report this.
Alastair
"Alastair" == Alastair McKinley <a.mckinley@analyticsengines.com> writes:
Alastair> Hi all,
Alastair> I noticed this strange behaviour whilst trying to write a
Alastair> function for Postgres 11.5 (PostgreSQL 11.5 on
Alastair> x86_64-pc-linux-gnu, compiled by gcc (GCC) 4.8.5 20150623
Alastair> (Red Hat 4.8.5-36), 64-bit) and reduced it to this minimal
Alastair> example. Using a function parameter in the window frame
Alastair> definition seems to be the cause of the error.
[...]
Alastair> This appears to be a bug to me.
Yes, it's a bug, related to function inlining (the select f(3); is not
inlined and therefore works, but the select * from f(3); is being
inlined, but the original Param is somehow making it into the final plan
rather than being substituted with its value). Looking into why.
--
Andrew (irc:RhodiumToad)
Andrew Gierth <andrew@tao11.riddles.org.uk> writes:
"Alastair" == Alastair McKinley <a.mckinley@analyticsengines.com> writes:
Alastair> This appears to be a bug to me.
Yes, it's a bug, related to function inlining (the select f(3); is not
inlined and therefore works, but the select * from f(3); is being
inlined, but the original Param is somehow making it into the final plan
rather than being substituted with its value). Looking into why.
It looks to me that the reason is that query_tree_mutator (likewise
query_tree_walker) fails to visit query->windowClause, which is a
bug of the first magnitude if we allow those to contain expressions.
Not sure how we've missed that up to now.
Looking at struct Query, it seems like that's not the only questionable
omission. We're also not descending into
Node *utilityStmt; /* non-null if commandType == CMD_UTILITY */
List *groupClause; /* a list of SortGroupClause's */
List *groupingSets; /* a list of GroupingSet's if present */
List *distinctClause; /* a list of SortGroupClause's */
List *sortClause; /* a list of SortGroupClause's */
List *rowMarks; /* a list of RowMarkClause's */
Now probably this is never called on utility statements, and maybe
there is never a reason for anyone to examine or mutate SortGroupClauses,
GroupingSets, or RowMarkClauses, but I'm not sure it's any business of
this module to assume that.
regards, tom lane
"Tom" == Tom Lane <tgl@sss.pgh.pa.us> writes:
Tom> It looks to me that the reason is that query_tree_mutator
Tom> (likewise query_tree_walker) fails to visit query->windowClause,
I noticed this too. I spent some time looking at what might break if
that was changed (found two places so far, see attached draft patch).
Tom> which is a bug of the first magnitude if we allow those to contain
Tom> expressions. Not sure how we've missed that up to now.
I suspect because the partition/order by expressions are actually in the
targetlist instead (with only SortGroupClause nodes in the
windowClause), so only window framing expressions are being missed.
Tom> Looking at struct Query, it seems like that's not the only
Tom> questionable omission. We're also not descending into
Tom> Node *utilityStmt; /* non-null if commandType == CMD_UTILITY */
I assume that utility statements are doing any necessary expression
processing themselves...
Tom> List *groupClause; /* a list of SortGroupClause's */
There's at least one place that walks this (and the distinct and sort
clauses) explicitly (find_expr_references_walker) but most places just
aren't interested in SortGroupClause nodes given that the actual
expressions are elsewhere.
Tom> List *groupingSets; /* a list of GroupingSet's if present */
Likewise, GroupingSet nodes are not any form of expression, they only
reference the groupClause entries.
Tom> List *distinctClause; /* a list of SortGroupClause's */
Tom> List *sortClause; /* a list of SortGroupClause's */
Same goes as for groupClause.
Tom> List *rowMarks; /* a list of RowMarkClause's */
Tom> Now probably this is never called on utility statements, and maybe
Tom> there is never a reason for anyone to examine or mutate
Tom> SortGroupClauses, GroupingSets, or RowMarkClauses, but I'm not
Tom> sure it's any business of this module to assume that.
I think the logic that query_tree_walker is specifically there to walk
places that might contain _expressions_ is reasonably valid. That said,
the fact that we do have one caller that finds it necessary to
explicitly walk some of the places that query_tree_walker omits suggests
that this decision may have been a mistake.
--
Andrew (irc:RhodiumToad)
Attachments:
wc.patchtext/x-patchDownload
diff --git a/src/backend/catalog/dependency.c b/src/backend/catalog/dependency.c
index dd0a7d8dac..2862c47314 100644
--- a/src/backend/catalog/dependency.c
+++ b/src/backend/catalog/dependency.c
@@ -2217,7 +2217,6 @@ find_expr_references_walker(Node *node,
/* query_tree_walker ignores ORDER BY etc, but we need those opers */
find_expr_references_walker((Node *) query->sortClause, context);
find_expr_references_walker((Node *) query->groupClause, context);
- find_expr_references_walker((Node *) query->windowClause, context);
find_expr_references_walker((Node *) query->distinctClause, context);
/* Examine substructure of query */
diff --git a/src/backend/nodes/nodeFuncs.c b/src/backend/nodes/nodeFuncs.c
index 18bd5ac903..7f485ae29a 100644
--- a/src/backend/nodes/nodeFuncs.c
+++ b/src/backend/nodes/nodeFuncs.c
@@ -2292,6 +2292,8 @@ query_tree_walker(Query *query,
return true;
if (walker(query->havingQual, context))
return true;
+ if (walker(query->windowClause, context))
+ return true;
if (walker(query->limitOffset, context))
return true;
if (walker(query->limitCount, context))
@@ -3151,6 +3153,7 @@ query_tree_mutator(Query *query,
MUTATE(query->jointree, query->jointree, FromExpr *);
MUTATE(query->setOperations, query->setOperations, Node *);
MUTATE(query->havingQual, query->havingQual, Node *);
+ MUTATE(query->windowClause, query->windowClause, List *);
MUTATE(query->limitOffset, query->limitOffset, Node *);
MUTATE(query->limitCount, query->limitCount, Node *);
if (!(flags & QTW_IGNORE_CTE_SUBQUERIES))
diff --git a/src/backend/parser/parse_collate.c b/src/backend/parser/parse_collate.c
index 31a5f702a9..dabd904999 100644
--- a/src/backend/parser/parse_collate.c
+++ b/src/backend/parser/parse_collate.c
@@ -485,6 +485,7 @@ assign_collations_walker(Node *node, assign_collations_context *context)
case T_FromExpr:
case T_OnConflictExpr:
case T_SortGroupClause:
+ case T_WindowClause:
(void) expression_tree_walker(node,
assign_collations_walker,
(void *) &loccontext);
Andrew Gierth <andrew@tao11.riddles.org.uk> writes:
"Tom" == Tom Lane <tgl@sss.pgh.pa.us> writes:
Tom> Now probably this is never called on utility statements, and maybe
Tom> there is never a reason for anyone to examine or mutate
Tom> SortGroupClauses, GroupingSets, or RowMarkClauses, but I'm not
Tom> sure it's any business of this module to assume that.
I think the logic that query_tree_walker is specifically there to walk
places that might contain _expressions_ is reasonably valid. That said,
the fact that we do have one caller that finds it necessary to
explicitly walk some of the places that query_tree_walker omits suggests
that this decision may have been a mistake.
I'm okay with assuming that these functions aren't used on utility
statements (but maybe we should add Assert(query->utilityStmt == NULL)?).
I'm a bit uncomfortable with skipping the other lists. Admittedly,
there's probably not huge value in examining SortGroupClauses in a
vacuum (that is, without knowing which list they appear in). The only
application I can think of offhand is extracting dependencies, which
is already covered by that one caller you mention.
However, we need to fix this in all active branches, and I definitely
agree with minimizing the amount of change to back branches.
The fact that the minimal change breaks (or exposes an oversight in)
assign_collations_walker makes it very plausible that it will also
break somebody's third-party code. If we push the API change further
we increase the risk of breaking stuff. That seems OK in HEAD but
not in back branches.
regards, tom lane
"Tom" == Tom Lane <tgl@sss.pgh.pa.us> writes:
Tom> However, we need to fix this in all active branches, and I
Tom> definitely agree with minimizing the amount of change to back
Tom> branches. The fact that the minimal change breaks (or exposes an
Tom> oversight in) assign_collations_walker makes it very plausible
Tom> that it will also break somebody's third-party code. If we push
Tom> the API change further we increase the risk of breaking stuff.
Tom> That seems OK in HEAD but not in back branches.
We could minimize the chance of breakage in a back-patched fix by having
query_tree_walker/mutator iterate the windowClause list itself and
invoke the walker only on offset expressions; is it worth it?
Walkers that follow the recommended code structure should be unaffected;
it only shows up in the collations walker because that treats
expressions as the "default" case and tries to explicitly handle all
non-expression nodes.
--
Andrew (irc:RhodiumToad)
"Andrew" == Andrew Gierth <andrew@tao11.riddles.org.uk> writes:
Andrew> We could minimize the chance of breakage in a back-patched fix
Andrew> by having query_tree_walker/mutator iterate the windowClause
Andrew> list itself
Here is a draft patch along those lines; the intent of this one is that
no existing walker or mutator should need to change (the change to the
dependency code is basically cosmetic I believe, just avoids walking
some things twice).
Also added some tests.
--
Andrew (irc:RhodiumToad)
Attachments:
wc2.patchtext/x-patchDownload
diff --git a/src/backend/catalog/dependency.c b/src/backend/catalog/dependency.c
index dd0a7d8dac..03582781f6 100644
--- a/src/backend/catalog/dependency.c
+++ b/src/backend/catalog/dependency.c
@@ -2214,18 +2214,13 @@ find_expr_references_walker(Node *node,
context->addrs);
}
- /* query_tree_walker ignores ORDER BY etc, but we need those opers */
- find_expr_references_walker((Node *) query->sortClause, context);
- find_expr_references_walker((Node *) query->groupClause, context);
- find_expr_references_walker((Node *) query->windowClause, context);
- find_expr_references_walker((Node *) query->distinctClause, context);
-
/* Examine substructure of query */
context->rtables = lcons(query->rtable, context->rtables);
result = query_tree_walker(query,
find_expr_references_walker,
(void *) context,
- QTW_IGNORE_JOINALIASES);
+ QTW_IGNORE_JOINALIASES |
+ QTW_EXAMINE_SORTGROUP);
context->rtables = list_delete_first(context->rtables);
return result;
}
diff --git a/src/backend/nodes/nodeFuncs.c b/src/backend/nodes/nodeFuncs.c
index 18bd5ac903..d063bee271 100644
--- a/src/backend/nodes/nodeFuncs.c
+++ b/src/backend/nodes/nodeFuncs.c
@@ -2296,6 +2296,33 @@ query_tree_walker(Query *query,
return true;
if (walker(query->limitCount, context))
return true;
+ if ((flags & QTW_EXAMINE_SORTGROUP))
+ {
+ if (walker((Node *) query->groupClause, context))
+ return true;
+ if (walker((Node *) query->windowClause, context))
+ return true;
+ if (walker((Node *) query->sortClause, context))
+ return true;
+ if (walker((Node *) query->distinctClause, context))
+ return true;
+ }
+ else
+ {
+ /*
+ * We need to walk the expressions in WindowClause nodes even if we're
+ * not interested in SortGroupClause nodes.
+ */
+ ListCell *lc;
+ foreach(lc, query->windowClause)
+ {
+ WindowClause *wc = lfirst_node(WindowClause, lc);
+ if (walker(wc->startOffset, context))
+ return true;
+ if (walker(wc->endOffset, context))
+ return true;
+ }
+ }
if (!(flags & QTW_IGNORE_CTE_SUBQUERIES))
{
if (walker((Node *) query->cteList, context))
@@ -3153,6 +3180,38 @@ query_tree_mutator(Query *query,
MUTATE(query->havingQual, query->havingQual, Node *);
MUTATE(query->limitOffset, query->limitOffset, Node *);
MUTATE(query->limitCount, query->limitCount, Node *);
+
+ if ((flags & QTW_EXAMINE_SORTGROUP))
+ {
+ MUTATE(query->groupClause, query->groupClause, List *);
+ MUTATE(query->windowClause, query->windowClause, List *);
+ MUTATE(query->sortClause, query->sortClause, List *);
+ MUTATE(query->distinctClause, query->distinctClause, List *);
+ }
+ else
+ {
+ /*
+ * We need to mutate the expressions in WindowClause nodes even if
+ * we're not interested in SortGroupClause nodes.
+ */
+ List *resultlist;
+ ListCell *temp;
+
+ resultlist = NIL;
+ foreach(temp, query->windowClause)
+ {
+ WindowClause *wc = lfirst_node(WindowClause, temp);
+ WindowClause *newnode;
+
+ FLATCOPY(newnode, wc, WindowClause);
+ MUTATE(newnode->startOffset, wc->startOffset, Node *);
+ MUTATE(newnode->endOffset, wc->endOffset, Node *);
+
+ resultlist = lappend(resultlist, (Node *) newnode);
+ }
+ query->windowClause = resultlist;
+ }
+
if (!(flags & QTW_IGNORE_CTE_SUBQUERIES))
MUTATE(query->cteList, query->cteList, List *);
else /* else copy CTE list as-is */
diff --git a/src/include/nodes/nodeFuncs.h b/src/include/nodes/nodeFuncs.h
index 0cb931c82c..4b5408fa9b 100644
--- a/src/include/nodes/nodeFuncs.h
+++ b/src/include/nodes/nodeFuncs.h
@@ -27,6 +27,7 @@
#define QTW_EXAMINE_RTES_AFTER 0x20 /* examine RTE nodes after their
* contents */
#define QTW_DONT_COPY_QUERY 0x40 /* do not copy top Query */
+#define QTW_EXAMINE_SORTGROUP 0x80 /* include SortGroupNode lists */
/* callback function for check_functions_in_node */
typedef bool (*check_function_callback) (Oid func_id, void *context);
diff --git a/src/test/regress/expected/window.out b/src/test/regress/expected/window.out
index edc93d5729..d5fd4045f9 100644
--- a/src/test/regress/expected/window.out
+++ b/src/test/regress/expected/window.out
@@ -3821,3 +3821,45 @@ SELECT i, b, bool_and(b) OVER w, bool_or(b) OVER w
5 | t | t | t
(5 rows)
+-- Tests for problems with failure to walk or mutate expressions
+-- within window frame clauses.
+-- test walker (fails with collation error if expressions are not walked)
+SELECT array_agg(i) OVER w
+ FROM generate_series(1,5) i
+WINDOW w AS (ORDER BY i ROWS BETWEEN (('foo' < 'foobar')::integer) PRECEDING AND CURRENT ROW);
+ array_agg
+-----------
+ {1}
+ {1,2}
+ {2,3}
+ {3,4}
+ {4,5}
+(5 rows)
+
+-- test mutator (fails when inlined if expressions are not mutated)
+CREATE FUNCTION pg_temp.f(group_size BIGINT) RETURNS SETOF integer[]
+AS $$
+ SELECT array_agg(s) OVER w
+ FROM generate_series(1,5) s
+ WINDOW w AS (ORDER BY s ROWS BETWEEN CURRENT ROW AND GROUP_SIZE FOLLOWING)
+$$ LANGUAGE SQL STABLE;
+EXPLAIN (costs off) SELECT * FROM pg_temp.f(2);
+ QUERY PLAN
+------------------------------------------------------
+ Subquery Scan on f
+ -> WindowAgg
+ -> Sort
+ Sort Key: s.s
+ -> Function Scan on generate_series s
+(5 rows)
+
+SELECT * FROM pg_temp.f(2);
+ f
+---------
+ {1,2,3}
+ {2,3,4}
+ {3,4,5}
+ {4,5}
+ {5}
+(5 rows)
+
diff --git a/src/test/regress/sql/window.sql b/src/test/regress/sql/window.sql
index fc6d4cc903..fe273aa31e 100644
--- a/src/test/regress/sql/window.sql
+++ b/src/test/regress/sql/window.sql
@@ -1257,3 +1257,22 @@ SELECT to_char(SUM(n::float8) OVER (ORDER BY i ROWS BETWEEN CURRENT ROW AND 1 FO
SELECT i, b, bool_and(b) OVER w, bool_or(b) OVER w
FROM (VALUES (1,true), (2,true), (3,false), (4,false), (5,true)) v(i,b)
WINDOW w AS (ORDER BY i ROWS BETWEEN CURRENT ROW AND 1 FOLLOWING);
+
+-- Tests for problems with failure to walk or mutate expressions
+-- within window frame clauses.
+
+-- test walker (fails with collation error if expressions are not walked)
+SELECT array_agg(i) OVER w
+ FROM generate_series(1,5) i
+WINDOW w AS (ORDER BY i ROWS BETWEEN (('foo' < 'foobar')::integer) PRECEDING AND CURRENT ROW);
+
+-- test mutator (fails when inlined if expressions are not mutated)
+CREATE FUNCTION pg_temp.f(group_size BIGINT) RETURNS SETOF integer[]
+AS $$
+ SELECT array_agg(s) OVER w
+ FROM generate_series(1,5) s
+ WINDOW w AS (ORDER BY s ROWS BETWEEN CURRENT ROW AND GROUP_SIZE FOLLOWING)
+$$ LANGUAGE SQL STABLE;
+
+EXPLAIN (costs off) SELECT * FROM pg_temp.f(2);
+SELECT * FROM pg_temp.f(2);
Andrew Gierth <andrew@tao11.riddles.org.uk> writes:
"Andrew" == Andrew Gierth <andrew@tao11.riddles.org.uk> writes:
Andrew> We could minimize the chance of breakage in a back-patched fix
Andrew> by having query_tree_walker/mutator iterate the windowClause
Andrew> list itself
Here is a draft patch along those lines; the intent of this one is that
no existing walker or mutator should need to change (the change to the
dependency code is basically cosmetic I believe, just avoids walking
some things twice).
Hmm. I think this is a reasonable direction to go in, but
what about groupingSets and rowMarks?
Also, in HEAD I'd be inclined to add assertions about utilityStmt
being NULL.
regards, tom lane
"Tom" == Tom Lane <tgl@sss.pgh.pa.us> writes:
Here is a draft patch along those lines; the intent of this one is
that no existing walker or mutator should need to change (the change
to the dependency code is basically cosmetic I believe, just avoids
walking some things twice).
Tom> Hmm. I think this is a reasonable direction to go in, but
Tom> what about groupingSets and rowMarks?
groupingSets ultimately contains nothing but numbers which are
meaningless without reference to the matching groupClause list. So
anything that cares about those is really going to have to process them
in its Query case in the walker function in order to get at both
clauses.
Similarly, rowMarks contains indexes into the rangetable (and no
recursive substructure at all), so it's likewise better processed at the
Query level.
Tom> Also, in HEAD I'd be inclined to add assertions about utilityStmt
Tom> being NULL.
Yup.
--
Andrew (irc:RhodiumToad)
Andrew Gierth <andrew@tao11.riddles.org.uk> writes:
"Tom" == Tom Lane <tgl@sss.pgh.pa.us> writes:
Tom> Hmm. I think this is a reasonable direction to go in, but
Tom> what about groupingSets and rowMarks?
groupingSets ultimately contains nothing but numbers which are
meaningless without reference to the matching groupClause list. So
anything that cares about those is really going to have to process them
in its Query case in the walker function in order to get at both
clauses.
Ah. I was thinking there were SortGroupClauses under them, but that
was based on an overly hasty reading of the parsenodes.h comments.
No further complaints.
regards, tom lane
[moving to -hackers, removing OP and -general]
"Tom" == Tom Lane <tgl@sss.pgh.pa.us> writes:
Tom> Also, in HEAD I'd be inclined to add assertions about utilityStmt
Tom> being NULL.
Tried this. The assertion is hit:
#3 0x0000000000bb9144 in ExceptionalCondition (conditionName=0xd3c7a9 "query->utilityStmt == NULL",
errorType=0xc3da24 "FailedAssertion", fileName=0xd641f8 "nodeFuncs.c", lineNumber=2280) at assert.c:54
#4 0x000000000081268e in query_tree_walker (query=0x80bb34220, walker=0x98d150 <rangeTableEntry_used_walker>,
context=0x7fffffffc768, flags=0) at nodeFuncs.c:2280
#5 0x0000000000815a29 in query_or_expression_tree_walker (node=0x80bb34220, walker=0x98d150 <rangeTableEntry_used_walker>,
context=0x7fffffffc768, flags=0) at nodeFuncs.c:3344
#6 0x000000000098d13d in rangeTableEntry_used (node=0x80bb34220, rt_index=1, sublevels_up=0) at rewriteManip.c:900
#7 0x0000000000698ce6 in transformRuleStmt (stmt=0x80241bd20,
queryString=0x80241b120 "create rule r3 as on delete to rules_src do notify rules_src_deletion;", actions=0x7fffffffc968,
whereClause=0x7fffffffc960) at parse_utilcmd.c:2883
#8 0x00000000009819c5 in DefineRule (stmt=0x80241bd20,
queryString=0x80241b120 "create rule r3 as on delete to rules_src do notify rules_src_deletion;") at rewriteDefine.c:206
Any suggestions where best to fix this? transformRuleStmt could be
taught to skip a lot of the per-Query stuff it does in the event that
the Query is actually a NOTIFY, or a check for NOTIFY could be added
further down the stack, e.g. in rangeTableEntry_used. Any preferences?
--
Andrew (irc:RhodiumToad)
Andrew Gierth <andrew@tao11.riddles.org.uk> writes:
"Tom" == Tom Lane <tgl@sss.pgh.pa.us> writes:
Tom> Also, in HEAD I'd be inclined to add assertions about utilityStmt
Tom> being NULL.
Tried this. The assertion is hit:
...
Any suggestions where best to fix this? transformRuleStmt could be
taught to skip a lot of the per-Query stuff it does in the event that
the Query is actually a NOTIFY, or a check for NOTIFY could be added
further down the stack, e.g. in rangeTableEntry_used. Any preferences?
Hm. transformRuleStmt already does special-case utility statements to
some extent, so my inclination would be to make it do more of that.
However, it looks like that might end up with rather spaghetti-ish
code, as that function is kind of messy already.
Or we could abandon the notion of adding the assertion. I don't
know how much work it's worth.
regards, tom lane
"Tom" == Tom Lane <tgl@sss.pgh.pa.us> writes:
Tom> Hm. transformRuleStmt already does special-case utility statements
Tom> to some extent, so my inclination would be to make it do more of
Tom> that. However, it looks like that might end up with rather
Tom> spaghetti-ish code, as that function is kind of messy already.
Tom> Or we could abandon the notion of adding the assertion. I don't
Tom> know how much work it's worth.
Fixing transformRuleStmt just pushes the issue along another step:
InsertRule wants to do recordDependencyOnExpr on the rule actions,
which just does find_expr_references_walker.
I'm going to leave the assertion out for now and put in a comment for
future reference.
--
Andrew (irc:RhodiumToad)
Andrew Gierth <andrew@tao11.riddles.org.uk> writes:
I'm going to leave the assertion out for now and put in a comment for
future reference.
WFM. At this point it's clear it would be a separate piece of work
not something to slide into the bug-fix patch, anyway.
regards, tom lane
"Tom" == Tom Lane <tgl@sss.pgh.pa.us> writes:
I'm going to leave the assertion out for now and put in a comment
for future reference.
Tom> WFM. At this point it's clear it would be a separate piece of work
Tom> not something to slide into the bug-fix patch, anyway.
OK. So here's the final patch.
(For the benefit of anyone in -hackers not following the original thread
in -general, the problem here is that expressions in window framing
clauses were not being walked or mutated by query_tree_walker /
query_tree_mutator. This has been wrong ever since 9.0, but somehow
nobody seems to have noticed until now.)
--
Andrew (irc:RhodiumToad)
Attachments:
wc3.patchtext/x-patchDownload
diff --git a/src/backend/catalog/dependency.c b/src/backend/catalog/dependency.c
index dd0a7d8dac..03582781f6 100644
--- a/src/backend/catalog/dependency.c
+++ b/src/backend/catalog/dependency.c
@@ -2214,18 +2214,13 @@ find_expr_references_walker(Node *node,
context->addrs);
}
- /* query_tree_walker ignores ORDER BY etc, but we need those opers */
- find_expr_references_walker((Node *) query->sortClause, context);
- find_expr_references_walker((Node *) query->groupClause, context);
- find_expr_references_walker((Node *) query->windowClause, context);
- find_expr_references_walker((Node *) query->distinctClause, context);
-
/* Examine substructure of query */
context->rtables = lcons(query->rtable, context->rtables);
result = query_tree_walker(query,
find_expr_references_walker,
(void *) context,
- QTW_IGNORE_JOINALIASES);
+ QTW_IGNORE_JOINALIASES |
+ QTW_EXAMINE_SORTGROUP);
context->rtables = list_delete_first(context->rtables);
return result;
}
diff --git a/src/backend/nodes/nodeFuncs.c b/src/backend/nodes/nodeFuncs.c
index 18bd5ac903..95051629e2 100644
--- a/src/backend/nodes/nodeFuncs.c
+++ b/src/backend/nodes/nodeFuncs.c
@@ -2278,6 +2278,13 @@ query_tree_walker(Query *query,
{
Assert(query != NULL && IsA(query, Query));
+ /*
+ * We don't walk any utilityStmt here. However, we can't easily assert
+ * that it is absent, since there are at least two code paths by which
+ * action statements from CREATE RULE end up here, and NOTIFY is allowed
+ * in a rule action.
+ */
+
if (walker((Node *) query->targetList, context))
return true;
if (walker((Node *) query->withCheckOptions, context))
@@ -2296,6 +2303,49 @@ query_tree_walker(Query *query,
return true;
if (walker(query->limitCount, context))
return true;
+ /*
+ * Most callers aren't interested in SortGroupClause nodes since those
+ * don't contain actual expressions. However they do contain OIDs which
+ * may be needed by dependency walkers etc.
+ */
+ if ((flags & QTW_EXAMINE_SORTGROUP))
+ {
+ if (walker((Node *) query->groupClause, context))
+ return true;
+ if (walker((Node *) query->windowClause, context))
+ return true;
+ if (walker((Node *) query->sortClause, context))
+ return true;
+ if (walker((Node *) query->distinctClause, context))
+ return true;
+ }
+ else
+ {
+ /*
+ * But we need to walk the expressions under WindowClause nodes even
+ * if we're not interested in SortGroupClause nodes.
+ */
+ ListCell *lc;
+ foreach(lc, query->windowClause)
+ {
+ WindowClause *wc = lfirst_node(WindowClause, lc);
+ if (walker(wc->startOffset, context))
+ return true;
+ if (walker(wc->endOffset, context))
+ return true;
+ }
+ }
+ /*
+ * groupingSets and rowMarks are not walked:
+ *
+ * groupingSets contain only ressortgrouprefs (integers) which are
+ * meaningless without the corresponding groupClause or tlist.
+ * Accordingly, any walker that needs to care about them needs to handle
+ * them itself in its Query processing.
+ *
+ * rowMarks is not walked because it contains only rangetable indexes (and
+ * flags etc.) and therefore should be handled at Query level similarly.
+ */
if (!(flags & QTW_IGNORE_CTE_SUBQUERIES))
{
if (walker((Node *) query->cteList, context))
@@ -3153,6 +3203,56 @@ query_tree_mutator(Query *query,
MUTATE(query->havingQual, query->havingQual, Node *);
MUTATE(query->limitOffset, query->limitOffset, Node *);
MUTATE(query->limitCount, query->limitCount, Node *);
+
+ /*
+ * Most callers aren't interested in SortGroupClause nodes since those
+ * don't contain actual expressions. However they do contain OIDs, which
+ * may be of interest to some mutators.
+ */
+
+ if ((flags & QTW_EXAMINE_SORTGROUP))
+ {
+ MUTATE(query->groupClause, query->groupClause, List *);
+ MUTATE(query->windowClause, query->windowClause, List *);
+ MUTATE(query->sortClause, query->sortClause, List *);
+ MUTATE(query->distinctClause, query->distinctClause, List *);
+ }
+ else
+ {
+ /*
+ * But we need to mutate the expressions under WindowClause nodes even
+ * if we're not interested in SortGroupClause nodes.
+ */
+ List *resultlist;
+ ListCell *temp;
+
+ resultlist = NIL;
+ foreach(temp, query->windowClause)
+ {
+ WindowClause *wc = lfirst_node(WindowClause, temp);
+ WindowClause *newnode;
+
+ FLATCOPY(newnode, wc, WindowClause);
+ MUTATE(newnode->startOffset, wc->startOffset, Node *);
+ MUTATE(newnode->endOffset, wc->endOffset, Node *);
+
+ resultlist = lappend(resultlist, (Node *) newnode);
+ }
+ query->windowClause = resultlist;
+ }
+
+ /*
+ * groupingSets and rowMarks are not mutated:
+ *
+ * groupingSets contain only ressortgroup refs (integers) which are
+ * meaningless without the groupClause or tlist. Accordingly, any mutator
+ * that needs to care about them needs to handle them itself in its Query
+ * processing.
+ *
+ * rowMarks contains only rangetable indexes (and flags etc.) and
+ * therefore should be handled at Query level similarly.
+ */
+
if (!(flags & QTW_IGNORE_CTE_SUBQUERIES))
MUTATE(query->cteList, query->cteList, List *);
else /* else copy CTE list as-is */
diff --git a/src/include/nodes/nodeFuncs.h b/src/include/nodes/nodeFuncs.h
index 0cb931c82c..4b5408fa9b 100644
--- a/src/include/nodes/nodeFuncs.h
+++ b/src/include/nodes/nodeFuncs.h
@@ -27,6 +27,7 @@
#define QTW_EXAMINE_RTES_AFTER 0x20 /* examine RTE nodes after their
* contents */
#define QTW_DONT_COPY_QUERY 0x40 /* do not copy top Query */
+#define QTW_EXAMINE_SORTGROUP 0x80 /* include SortGroupNode lists */
/* callback function for check_functions_in_node */
typedef bool (*check_function_callback) (Oid func_id, void *context);
diff --git a/src/test/regress/expected/window.out b/src/test/regress/expected/window.out
index edc93d5729..d5fd4045f9 100644
--- a/src/test/regress/expected/window.out
+++ b/src/test/regress/expected/window.out
@@ -3821,3 +3821,45 @@ SELECT i, b, bool_and(b) OVER w, bool_or(b) OVER w
5 | t | t | t
(5 rows)
+-- Tests for problems with failure to walk or mutate expressions
+-- within window frame clauses.
+-- test walker (fails with collation error if expressions are not walked)
+SELECT array_agg(i) OVER w
+ FROM generate_series(1,5) i
+WINDOW w AS (ORDER BY i ROWS BETWEEN (('foo' < 'foobar')::integer) PRECEDING AND CURRENT ROW);
+ array_agg
+-----------
+ {1}
+ {1,2}
+ {2,3}
+ {3,4}
+ {4,5}
+(5 rows)
+
+-- test mutator (fails when inlined if expressions are not mutated)
+CREATE FUNCTION pg_temp.f(group_size BIGINT) RETURNS SETOF integer[]
+AS $$
+ SELECT array_agg(s) OVER w
+ FROM generate_series(1,5) s
+ WINDOW w AS (ORDER BY s ROWS BETWEEN CURRENT ROW AND GROUP_SIZE FOLLOWING)
+$$ LANGUAGE SQL STABLE;
+EXPLAIN (costs off) SELECT * FROM pg_temp.f(2);
+ QUERY PLAN
+------------------------------------------------------
+ Subquery Scan on f
+ -> WindowAgg
+ -> Sort
+ Sort Key: s.s
+ -> Function Scan on generate_series s
+(5 rows)
+
+SELECT * FROM pg_temp.f(2);
+ f
+---------
+ {1,2,3}
+ {2,3,4}
+ {3,4,5}
+ {4,5}
+ {5}
+(5 rows)
+
diff --git a/src/test/regress/sql/window.sql b/src/test/regress/sql/window.sql
index fc6d4cc903..fe273aa31e 100644
--- a/src/test/regress/sql/window.sql
+++ b/src/test/regress/sql/window.sql
@@ -1257,3 +1257,22 @@ SELECT to_char(SUM(n::float8) OVER (ORDER BY i ROWS BETWEEN CURRENT ROW AND 1 FO
SELECT i, b, bool_and(b) OVER w, bool_or(b) OVER w
FROM (VALUES (1,true), (2,true), (3,false), (4,false), (5,true)) v(i,b)
WINDOW w AS (ORDER BY i ROWS BETWEEN CURRENT ROW AND 1 FOLLOWING);
+
+-- Tests for problems with failure to walk or mutate expressions
+-- within window frame clauses.
+
+-- test walker (fails with collation error if expressions are not walked)
+SELECT array_agg(i) OVER w
+ FROM generate_series(1,5) i
+WINDOW w AS (ORDER BY i ROWS BETWEEN (('foo' < 'foobar')::integer) PRECEDING AND CURRENT ROW);
+
+-- test mutator (fails when inlined if expressions are not mutated)
+CREATE FUNCTION pg_temp.f(group_size BIGINT) RETURNS SETOF integer[]
+AS $$
+ SELECT array_agg(s) OVER w
+ FROM generate_series(1,5) s
+ WINDOW w AS (ORDER BY s ROWS BETWEEN CURRENT ROW AND GROUP_SIZE FOLLOWING)
+$$ LANGUAGE SQL STABLE;
+
+EXPLAIN (costs off) SELECT * FROM pg_temp.f(2);
+SELECT * FROM pg_temp.f(2);
Andrew Gierth <andrew@tao11.riddles.org.uk> writes:
OK. So here's the final patch.
(For the benefit of anyone in -hackers not following the original thread
in -general, the problem here is that expressions in window framing
clauses were not being walked or mutated by query_tree_walker /
query_tree_mutator. This has been wrong ever since 9.0, but somehow
nobody seems to have noticed until now.)
Two nitpicky suggestions:
* Please run it through pgindent. Otherwise v13+ are going to be randomly
different from older branches in this area, once we next pgindent HEAD.
* I think you missed s/walk/mutate/ in some of the comments you copied
into query_tree_mutator.
Looks good otherwise.
regards, tom lane
"Tom" == Tom Lane <tgl@sss.pgh.pa.us> writes:
Tom> * Please run it through pgindent. Otherwise v13+ are going to be
Tom> randomly different from older branches in this area, once we next
Tom> pgindent HEAD.
gotcha.
Tom> * I think you missed s/walk/mutate/ in some of the comments you
Tom> copied into query_tree_mutator.
... where? The only mention of "walk" near query_tree_mutator is in its
header comment, which I didn't touch.
--
Andrew (irc:RhodiumToad)
Andrew Gierth <andrew@tao11.riddles.org.uk> writes:
"Tom" == Tom Lane <tgl@sss.pgh.pa.us> writes:
Tom> * I think you missed s/walk/mutate/ in some of the comments you
Tom> copied into query_tree_mutator.
... where? The only mention of "walk" near query_tree_mutator is in its
header comment, which I didn't touch.
Wup, sorry, I misparsed the patch. On second read there's no issue there.
regards, tom lane