Common Table Expressions (WITH RECURSIVE) patch
These are my initial comments about the Common Table Expressions (CTE)
patch, also known as WITH [RECURSIVE]. These comments are based on the
patch here:
http://archives.postgresql.org/pgsql-patches/2008-08/msg00021.php
This is a semantically complex feature, and the standard is fairly
complex as well. So I'm approaching this by making my own
interpretations from the standard first (I included my interpretations
and section references at the end of this email) and comparing to the
behavior of the patch.
The following examples may be inconsistent with the standard. Some
have already been mentioned, and I don't think they all need to be
fixed for 8.4, but I mention them here for completeness.
* Mutual Recursion:
with recursive
foo(i) as (values(1) union all select i+1 from bar where i < 10),
bar(i) as (values(1) union all select i+1 from foo where i < 10)
select * from foo;
ERROR: mutual recursive call is not supported
The standard allows mutual recursion.
* Single Evaluation:
with
foo(i) as (select random() as i)
select * from foo union all select * from foo;
i
-------------------
0.233165248762816
0.62126633618027
(2 rows)
The standard specifies that non-recursive WITH should be evaluated
once.
* RHS only:
with recursive
foo(i) as (select i+1 from foo where i < 10 union all values(1))
select * from foo;
ERROR: Left hand side of UNION ALL must be a non-recursive term in a
recursive query
The standard does not require that the recursive term be on the RHS.
* UNION ALL only:
with recursive
foo(i) as (values(1) union select i+1 from foo where i < 10)
select * from foo;
ERROR: non-recursive term and recursive term must be combined with
UNION ALL
The standard seems to allow UNION ALL, UNION, INTERSECT, and EXCEPT
(when the recursive term is not on the RHS of the EXCEPT).
* Binary recursion and subselect strangeness:
with recursive foo(i) as
(values (1)
union all
select * from
(select i+1 from foo where i < 10
union all
select i+1 from foo where i < X) t)
select * from foo;
Produces 10 rows of output regardless of what "X" is. This should be
fixed for 8.4.
Also, this is non-linear recursion, which the standard seems to
disallow.
* Multiple recursive references:
with recursive foo(i) as
(values (1)
union all
select i+1 from foo where i < 10
union all
select i+1 from foo where i < 20)
select * from foo;
ERROR: Left hand side of UNION ALL must be a non-recursive term in a
recursive query
If we're going to allow non-linear recursion (which the standard
does not), this seems like it should be a valid case.
* Strange result with except:
with recursive foo(i) as
(values (1)
union all
select * from
(select i+1 from foo where i < 10
except
select i+1 from foo where i < 5) t)
select * from foo;
ERROR: table "foo" has 0 columns available but 1 columns specified
This query works if you replace "except" with "union". This should be
fixed for 8.4.
* Aggregates allowed:
with recursive foo(i) as
(values(1)
union all
select max(i)+1 from foo where i < 10)
select * from foo;
Aggregates should be blocked according to the standard.
Also, causes an infinite loop. This should be fixed for 8.4.
* DISTINCT should supress duplicates:
with recursive foo(i) as
(select distinct * from (values(1),(2)) t
union all
select distinct i+1 from foo where i < 10)
select * from foo;
This outputs a lot of duplicates, but they should be supressed
according to the standard. This query is essentially the same as
supporting UNION for recursive queries, so we should either fix both for
8.4 or block both for consistency.
* outer joins on a recursive reference should be blocked:
with recursive foo(i) as
(values(1)
union all
select i+1 from foo left join (values(1)) t on (i=column1))
select * from foo;
Causes an infinite loop, but the standard says using an outer join
in this situation should be prohibited. This should be fixed for 8.4.
* ORDER BY, LIMIT, and OFFSET are rejected for recursive queries. The
standard does not seem to say that these should be rejected.
The following are my interpretations of relevant parts of the SQL
standard (200n), and the associated sections. These are only my
interpretation, so let me know if you interpret the standard
differently.
Non-linear recursion forbidden:
7.13: Syntax Rules: 2.g.ii
My interpretation of 2.g.ii.2 is that WQN[k] and WQN[l] may be the
same <query name>.
7.13: Syntax Rules: 2.g.iv
EXCEPT can't be used for recursive queries if a recursive reference
appears on the RHS:
7.13: Syntax Rules: 2.g.iii.1
INTERSECT ALL/EXCEPT ALL can't be used for recursive queries:
7.13: Syntax Rules: 2.g.iii.5
recursive references must appear in FROM clause:
7.13: Syntax Rules: 2.g.iii.3
My interpretation of this rule is that it does not allow a
recursive reference in a subquery in the targetlist or a subquery
in the where clause.
stratum defined:
7.13: Syntax Rules: 2.f
7.13: Syntax Rules: 2.g.i.1
recursive query must have anchor for every stratum:
7.13: Syntax Rules: 2.g.i.3
outer joins not allowed to join recursive references:
7.13: Syntax Rules: 2.g.iii.6
7.13: Syntax Rules: 2.g.iii.7
Aggregates/HAVING disallowed:
7.13: Syntax Rules: 2.g.iii.4.B
Mutual recursion defined:
7.13: Syntax Rules: 2.g.i.1
Evaluate each WITH entry once, even if it's referenced multiple times:
7.13: General Rules: 1
7.13: General Rules: 2.b
See also:
http://archives.postgresql.org/pgsql-hackers/2008-07/msg01292.php
Evaluation order with mutual recursion:
7.13: General Rules: 2.a
7.13: General Rules: 2.b
Evaluation semantics:
7.13: General Rules: 2.c
DISTINCT:
7.13: General Rules: 2.c.iv
7.13: General Rules: 2.c.ix.3.A
I will provide comments about the code and documentation soon. This is a
very useful feature.
Regards,
Jeff Davis
"Jeff" == Jeff Davis <pgsql@j-davis.com> writes:
Jeff> * Mutual Recursion:
This limitation isn't at all uncommon in other implementations; DB2
docs for example say:
"If more than one common table expression is defined in the same
statement, cyclic references between the common table expressions are
not permitted. A cyclic reference occurs when two common table
expressions dt1 and dt2 are created such that dt1 refers to dt2 and
dt2 refers to dt1."
MSSQL's documentation is less clear, but it also appears not to allow
mutual recursion (not allowing forward references to CTEs).
"A CTE can reference itself and previously defined CTEs in the same
WITH clause. Forward referencing is not allowed."
http://msdn.microsoft.com/en-us/library/ms175972.aspx
Jeff> * RHS only:
Jeff> with recursive
Jeff> foo(i) as (select i+1 from foo where i < 10 union all values(1))
Jeff> select * from foo;
Jeff> ERROR: Left hand side of UNION ALL must be a non-recursive term in a
Jeff> recursive query
Jeff> The standard does not require that the recursive term be on
Jeff> the RHS.
Again, the standard may not, but existing implementations do:
MSSQL:
"The recursive CTE definition must contain at least two CTE query
definitions, an anchor member and a recursive member. Multiple anchor
members and recursive members can be defined; however, all anchor
member query definitions must be put before the first recursive member
definition. All CTE query definitions are anchor members unless they
reference the CTE itself. "
DB2:
"The following restrictions apply to a recursive common-table-expression:
* A list of column-names must be specified following the
table-identifier.
* The UNION ALL set operator must be specified.
* The first fullselect of the first union (the initialization
fullselect) must not include a reference to the
common-table-expression itself in any FROM clause.
"
Jeff> * UNION ALL only:
Jeff> with recursive
Jeff> foo(i) as (values(1) union select i+1 from foo where i < 10)
Jeff> select * from foo;
Jeff> ERROR: non-recursive term and recursive term must be combined with
Jeff> UNION ALL
Jeff> The standard seems to allow UNION ALL, UNION, INTERSECT, and
Jeff> EXCEPT (when the recursive term is not on the RHS of the
Jeff> EXCEPT).
Again, existing implementations disagree. See above for DB2, and for
MSSQL:
"Anchor members must be combined by one of these set operators: UNION
ALL, UNION, INTERSECT, or EXCEPT. UNION ALL is the only set operator
allowed between the last anchor member and first recursive member, and
when combining multiple recursive members."
Jeff> * Binary recursion and subselect strangeness:
Jeff> with recursive foo(i) as
Jeff> (values (1)
Jeff> union all
Jeff> select * from
Jeff> (select i+1 from foo where i < 10
Jeff> union all
Jeff> select i+1 from foo where i < X) t)
Jeff> select * from foo;
Jeff> Produces 10 rows of output regardless of what "X" is. This
Jeff> should be fixed for 8.4. Also, this is non-linear recursion,
Jeff> which the standard seems to disallow.
That looks like it should be disallowed somehow.
Jeff> * Multiple recursive references:
[snip]
Jeff> If we're going to allow non-linear recursion (which the
Jeff> standard does not), this seems like it should be a valid
Jeff> case.
We probably shouldn't allow it (as above).
[snip * Strange result with except: which looks like a bug]
Jeff> * Aggregates allowed: which
Jeff> with recursive foo(i) as
Jeff> (values(1)
Jeff> union all
Jeff> select max(i)+1 from foo where i < 10)
Jeff> select * from foo;
Jeff> Aggregates should be blocked according to the standard.
Jeff> Also, causes an infinite loop. This should be fixed for 8.4.
Does the standard require anywhere that non-conforming statements must
be diagnosed? (seems impractical, since it would forbid extensions)
Jeff> * DISTINCT should supress duplicates:
Jeff> with recursive foo(i) as
Jeff> (select distinct * from (values(1),(2)) t
Jeff> union all
Jeff> select distinct i+1 from foo where i < 10)
Jeff> select * from foo;
Jeff> This outputs a lot of duplicates, but they should be
Jeff> supressed according to the standard. This query is essentially
Jeff> the same as supporting UNION for recursive queries, so we
Jeff> should either fix both for 8.4 or block both for consistency.
Yeah, though the standard's use of DISTINCT in this way is something
of a violation of the POLA.
Jeff> * outer joins on a recursive reference should be blocked:
Jeff> with recursive foo(i) as
Jeff> (values(1)
Jeff> union all
Jeff> select i+1 from foo left join (values(1)) t on (i=column1))
Jeff> select * from foo;
Jeff> Causes an infinite loop, but the standard says using an outer
Jeff> join in this situation should be prohibited. This should be
Jeff> fixed for 8.4.
No. This has already been discussed; it's neither possible nor desirable
to diagnose all cases which can result in infinite loops, and there are
important types of queries which would be unnecessarily forbidden.
Besides, you've misread the spec here: it prohibits the recursive
reference ONLY on the nullable side of the join. You cite:
Jeff> outer joins not allowed to join recursive references:
Jeff> 7.13: Syntax Rules: 2.g.iii.6
Jeff> 7.13: Syntax Rules: 2.g.iii.7
6) WQEi shall not contain a <qualified join> QJ in which:
A) QJ immediately contains a <join type> that specifies FULL and a
<table reference> or <table factor> that contains a <query name>
referencing WQNj.
[no recursive FULL JOIN at all]
B) QJ immediately contains a <join type> that specifies LEFT and a
<table factor> following the <join type> that contains a <query
name> referencing WQNj.
[note "following the <join type>", so WQNj LEFT JOIN foo is allowed,
but foo LEFT JOIN WQNj is not]
C) QJ immediately contains a <join type> that specifies RIGHT and a
<table reference> preceding the <join type> that contains a
<query name> referencing WQNj.
[note "preceding the <join type>", so foo RIGHT JOIN WQNj is allowed,
but WQNj RIGHT JOIN foo is not]
para (7) is identical but for natural rather than qualified joins.
Jeff> * ORDER BY, LIMIT, and OFFSET are rejected for recursive
Jeff> queries. The standard does not seem to say that these should be
Jeff> rejected.
Note that supporting those in subqueries (including CTEs) is a separate
optional feature of the standard.
--
Andrew (irc:RhodiumToad)
On Mon, 2008-09-08 at 18:08 +0100, Andrew Gierth wrote:
Jeff> * Mutual Recursion:
This limitation isn't at all uncommon in other implementations; DB2
docs for example say:
As with some other things in my list, this doesn't need to be supported
in 8.4. I just wanted to lay out my interpretation of the standard, and
places that we might (currently) fall short of it.
The fact that other DBMSs don't support mutual recursion is a good
indication that it's not important immediately.
Jeff> The standard does not require that the recursive term be on
Jeff> the RHS.Again, the standard may not, but existing implementations do:
Again, I don't think we need this for 8.4.
However, I think it's probably more important than mutual recursion.
Jeff> * UNION ALL only:
Jeff> with recursive
Jeff> foo(i) as (values(1) union select i+1 from foo where i < 10)
Jeff> select * from foo;
Jeff> ERROR: non-recursive term and recursive term must be combined with
Jeff> UNION ALLJeff> The standard seems to allow UNION ALL, UNION, INTERSECT, and
Jeff> EXCEPT (when the recursive term is not on the RHS of the
Jeff> EXCEPT).Again, existing implementations disagree. See above for DB2, and for
MSSQL:
And again, I agree that it's not important for 8.4.
At some point we need to determine what the goalposts are though. Are we
copying existing implementations, or are we implementing the standard?
Jeff> Produces 10 rows of output regardless of what "X" is. This
Jeff> should be fixed for 8.4. Also, this is non-linear recursion,
Jeff> which the standard seems to disallow.That looks like it should be disallowed somehow.
Agreed. I think it should just throw an error, probably.
[snip * Strange result with except: which looks like a bug]
Jeff> * Aggregates allowed: which
Jeff> with recursive foo(i) as
Jeff> (values(1)
Jeff> union all
Jeff> select max(i)+1 from foo where i < 10)
Jeff> select * from foo;Jeff> Aggregates should be blocked according to the standard.
Jeff> Also, causes an infinite loop. This should be fixed for 8.4.Does the standard require anywhere that non-conforming statements must
be diagnosed? (seems impractical, since it would forbid extensions)
2.g.iii.4.B explicitly says aggregates should be rejected, unless I have
misinterpreted.
Yeah, though the standard's use of DISTINCT in this way is something
of a violation of the POLA.
I agree that's kind of a funny requirement. But that's pretty typical of
the SQL standard. If DB2 or SQL Server follow the standard here, we
should, too. If not, it's open for discussion.
No. This has already been discussed; it's neither possible nor desirable
to diagnose all cases which can result in infinite loops, and there are
important types of queries which would be unnecessarily forbidden.
I didn't say we should forbid all infinite loops. But we should forbid
ones that the standard tells us to forbid.
Besides, you've misread the spec here: it prohibits the recursive
reference ONLY on the nullable side of the join. You cite:
Thank you for the correction. It does properly reject the outer joins
that the standard says should be rejected.
Jeff> * ORDER BY, LIMIT, and OFFSET are rejected for recursive
Jeff> queries. The standard does not seem to say that these should be
Jeff> rejected.Note that supporting those in subqueries (including CTEs) is a separate
optional feature of the standard.
I don't feel strongly about this either way, but I prefer that we are
consistent when possible. We do support these things in a subquery, so
shouldn't we support them in all subqueries?
Regards,
Jeff Davis
Jeff Davis <pgsql@j-davis.com> writes:
* Mutual Recursion:
with recursive
foo(i) as (values(1) union all select i+1 from bar where i < 10),
bar(i) as (values(1) union all select i+1 from foo where i < 10)
select * from foo;
ERROR: mutual recursive call is not supportedThe standard allows mutual recursion.
This seems to be a point of confusion. I originally read the standard and
concluded that mutual recursion was an optional feature. Itagaki-san showed me
a copy of the spec where it seemed there was a clear blanket prohibition on
mutually recursive queries and in fact anything but simple linearly expandable
queries. I wonder if there are different versions of the spec floating around
on this point.
Take a second look at your spec and read on to where it defines "linear" and
"expandable". If it doesn't define those terms then it's definitely different
from what I read. If it does, read on to see what it does with them. The main
reason to define them appeared to be to use them to say that supporting mutual
recursion is not required.
--
Gregory Stark
EnterpriseDB http://www.enterprisedb.com
Ask me about EnterpriseDB's PostGIS support!
On Mon, 2008-09-08 at 21:13 +0100, Gregory Stark wrote:
Jeff Davis <pgsql@j-davis.com> writes:
* Mutual Recursion:
with recursive
foo(i) as (values(1) union all select i+1 from bar where i < 10),
bar(i) as (values(1) union all select i+1 from foo where i < 10)
select * from foo;
ERROR: mutual recursive call is not supportedThe standard allows mutual recursion.
This seems to be a point of confusion. I originally read the standard and
concluded that mutual recursion was an optional feature. Itagaki-san showed me
a copy of the spec where it seemed there was a clear blanket prohibition on
mutually recursive queries and in fact anything but simple linearly expandable
queries. I wonder if there are different versions of the spec floating around
on this point.Take a second look at your spec and read on to where it defines "linear" and
"expandable". If it doesn't define those terms then it's definitely different
from what I read. If it does, read on to see what it does with them. The main
reason to define them appeared to be to use them to say that supporting mutual
recursion is not required.
I think we're reading the same version of the spec. I'm reading 200n.
My interpretation (Syntax Rule 2.h) is that expandable is only used to
determine whether it can contain a <search or cycle>.
That being said, I don't think it should be a requirement for 8.4. The
CTE patch is an important feature, and we shouldn't hold it up over
something comparatively obscure like mutual recursion. As Andrew Gierth
cited, other systems don't support it anyway. It should be kept in mind
for future work though.
Regards,
Jeff Davis
"Jeff" == Jeff Davis <jdavis@truviso.com> writes:
Jeff> Aggregates should be blocked according to the standard.
Jeff> Also, causes an infinite loop. This should be fixed for 8.4.
Does the standard require anywhere that non-conforming statements
must be diagnosed? (seems impractical, since it would forbid
extensions)
Jeff> 2.g.iii.4.B explicitly says aggregates should be rejected,
Jeff> unless I have misinterpreted.
Yes, you've misinterpreted. When the spec says that a query "shall
not" do such-and-such, it means that a conforming client isn't allowed
to do that, it does NOT mean that the server is required to produce an
error when it sees it.
Chapter and verse on this is given in the Framework doc, at 6.3.3.2:
In the Syntax Rules, the term "shall" defines conditions that are
required to be true of syntactically conforming SQL language. When such
conditions depend on the contents of one or more schemas, they are
required to be true just before the actions specified by the General
Rules are performed. The treatment of language that does not conform to
the SQL Formats and Syntax Rules is implementation-dependent. If any
condition required by Syntax Rules is not satisfied when the evaluation
of Access or General Rules is attempted and the implementation is
neither processing non-conforming SQL language nor processing
conforming SQL language in a non-conforming manner, then an exception
condition is raised: "syntax error or access rule violation".
Including an aggregate violates a "shall" in a syntax rule, therefore the
query is non-conforming, therefore the server can either process it in an
implementation-dependent manner or reject it with an exception.
Yeah, though the standard's use of DISTINCT in this way is something
of a violation of the POLA.
Jeff> I agree that's kind of a funny requirement. But that's pretty
Jeff> typical of the SQL standard. If DB2 or SQL Server follow the
Jeff> standard here, we should, too. If not, it's open for discussion.
MSSQL does not:
"The following items are not allowed in the CTE_query_definition of a
recursive member:
* SELECT DISTINCT
* GROUP BY
* HAVING
* Scalar aggregation
* TOP
* LEFT, RIGHT, OUTER JOIN (INNER JOIN is allowed)
* Subqueries
* A hint applied to a recursive reference to a CTE inside a
CTE_query_definition.
"
For DB2 the docs do not appear to specify either way; they don't seem to
forbid the use of SELECT DISTINCT inside a recursive CTE, but neither do
they seem to mention any unusual effect of including it.
Jeff> * ORDER BY, LIMIT, and OFFSET are rejected for recursive
Jeff> queries. The standard does not seem to say that these should be
Jeff> rejected.
Note that supporting those in subqueries (including CTEs) is a
separate optional feature of the standard.
Jeff> I don't feel strongly about this either way, but I prefer that we
Jeff> are consistent when possible. We do support these things in a
Jeff> subquery, so shouldn't we support them in all subqueries?
Ideally we should. DB2 appears to (other than OFFSET which it doesn't
seem to have at all). But it's not at all clear that it would be either
useful or easy to do so.
--
Andrew.
On Mon, 2008-09-08 at 22:53 +0100, Andrew Gierth wrote:
Yes, you've misinterpreted. When the spec says that a query "shall
not" do such-and-such, it means that a conforming client isn't allowed
to do that, it does NOT mean that the server is required to produce an
error when it sees it.
Interesting. Thanks for the clear reference.
So, we either need to reject it or define some implementation-dependent
behavior, right?
The patch currently rejects HAVING, which means that it's a little
difficult to use an aggregate effectively. I lean toward rejecting
aggregates if we reject HAVING, for consistency. If we allow it, we
should allow HAVING as well.
Jeff> I agree that's kind of a funny requirement. But that's pretty
Jeff> typical of the SQL standard. If DB2 or SQL Server follow the
Jeff> standard here, we should, too. If not, it's open for discussion.MSSQL does not:
Thanks again for a reference.
If MSSQL rejects SELECT DISTINCT for a recursive <query expression>,
then I think we should reject it. Right now the patch allows it but
provides a result that is inconsistent with the standard.
If we reject SELECT DISTINCT for recursive queries now, we can always
meet the standard later, or decide that the standard behavior is too
likely to cause confusion and just continue to block it.
Ideally we should. DB2 appears to (other than OFFSET which it doesn't
seem to have at all). But it's not at all clear that it would be either
useful or easy to do so.
Ok. If it's easy to support it should probably be done.
As an aside, you seem to be an expert with the standard, have you had a
chance to look at the question I asked earlier?:
http://archives.postgresql.org/pgsql-hackers/2008-09/msg00487.php
Regards,
Jeff Davis
Thanks for the review.
[I dropped y-asaba@sraoss.co.jp from the Cc list since he has left our
company and the email address is being deleted.]
I'm going to look into issues which are seem to be bug (of course if
you know what to fix, patches are always welcome:-).
These are my initial comments about the Common Table Expressions (CTE)
patch, also known as WITH [RECURSIVE]. These comments are based on the
patch here:http://archives.postgresql.org/pgsql-patches/2008-08/msg00021.php
This is a semantically complex feature, and the standard is fairly
complex as well. So I'm approaching this by making my own
interpretations from the standard first (I included my interpretations
and section references at the end of this email) and comparing to the
behavior of the patch.The following examples may be inconsistent with the standard. Some
have already been mentioned, and I don't think they all need to be
fixed for 8.4, but I mention them here for completeness.* Mutual Recursion:
with recursive
foo(i) as (values(1) union all select i+1 from bar where i < 10),
bar(i) as (values(1) union all select i+1 from foo where i < 10)
select * from foo;
ERROR: mutual recursive call is not supportedThe standard allows mutual recursion.
The discussion seems to agree that let it leave for post 8.4.
* Single Evaluation:
with
foo(i) as (select random() as i)
select * from foo union all select * from foo;
i
-------------------
0.233165248762816
0.62126633618027
(2 rows)The standard specifies that non-recursive WITH should be evaluated
once.
What shall we do? I don't think there's a easy way to fix this. Maybe
we should not allow WITH clause without RECURISVE?
* RHS only:
with recursive
foo(i) as (select i+1 from foo where i < 10 union all values(1))
select * from foo;
ERROR: Left hand side of UNION ALL must be a non-recursive term in a
recursive queryThe standard does not require that the recursive term be on the RHS.
The discussion seems to agree that let it leave for post 8.4.
* UNION ALL only:
with recursive
foo(i) as (values(1) union select i+1 from foo where i < 10)
select * from foo;
ERROR: non-recursive term and recursive term must be combined with
UNION ALLThe standard seems to allow UNION ALL, UNION, INTERSECT, and EXCEPT
(when the recursive term is not on the RHS of the EXCEPT).
The discussion seems to agree that let it leave for post 8.4.
* Binary recursion and subselect strangeness:
with recursive foo(i) as
(values (1)
union all
select * from
(select i+1 from foo where i < 10
union all
select i+1 from foo where i < X) t)
select * from foo;Produces 10 rows of output regardless of what "X" is. This should be
fixed for 8.4.
Also, this is non-linear recursion, which the standard seems to
disallow.
I will try to fix this. However detecting the query being not a
non-linear one is not so easy.
* Multiple recursive references:
with recursive foo(i) as
(values (1)
union all
select i+1 from foo where i < 10
union all
select i+1 from foo where i < 20)
select * from foo;
ERROR: Left hand side of UNION ALL must be a non-recursive term in a
recursive queryIf we're going to allow non-linear recursion (which the standard
does not), this seems like it should be a valid case.
I will try to disallow this.
* Strange result with except:
with recursive foo(i) as
(values (1)
union all
select * from
(select i+1 from foo where i < 10
except
select i+1 from foo where i < 5) t)
select * from foo;
ERROR: table "foo" has 0 columns available but 1 columns specifiedThis query works if you replace "except" with "union". This should be
fixed for 8.4.
I will try to fix this.
* Aggregates allowed:
with recursive foo(i) as
(values(1)
union all
select max(i)+1 from foo where i < 10)
select * from foo;Aggregates should be blocked according to the standard.
Also, causes an infinite loop. This should be fixed for 8.4.
I will try to fix this.
* DISTINCT should supress duplicates:
with recursive foo(i) as
(select distinct * from (values(1),(2)) t
union all
select distinct i+1 from foo where i < 10)
select * from foo;This outputs a lot of duplicates, but they should be supressed
according to the standard. This query is essentially the same as
supporting UNION for recursive queries, so we should either fix both for
8.4 or block both for consistency.
I'm not sure if it's possible to fix this. Will look into.
* outer joins on a recursive reference should be blocked:
with recursive foo(i) as
(values(1)
union all
select i+1 from foo left join (values(1)) t on (i=column1))
select * from foo;Causes an infinite loop, but the standard says using an outer join
in this situation should be prohibited. This should be fixed for 8.4.
Not an issue, I think.
* ORDER BY, LIMIT, and OFFSET are rejected for recursive queries. The
standard does not seem to say that these should be rejected.The following are my interpretations of relevant parts of the SQL
standard (200n), and the associated sections. These are only my
interpretation, so let me know if you interpret the standard
differently.Non-linear recursion forbidden:
7.13: Syntax Rules: 2.g.ii
My interpretation of 2.g.ii.2 is that WQN[k] and WQN[l] may be the
same <query name>.
7.13: Syntax Rules: 2.g.ivEXCEPT can't be used for recursive queries if a recursive reference
appears on the RHS:
7.13: Syntax Rules: 2.g.iii.1INTERSECT ALL/EXCEPT ALL can't be used for recursive queries:
7.13: Syntax Rules: 2.g.iii.5recursive references must appear in FROM clause:
7.13: Syntax Rules: 2.g.iii.3
My interpretation of this rule is that it does not allow a
recursive reference in a subquery in the targetlist or a subquery
in the where clause.stratum defined:
7.13: Syntax Rules: 2.f
7.13: Syntax Rules: 2.g.i.1recursive query must have anchor for every stratum:
7.13: Syntax Rules: 2.g.i.3outer joins not allowed to join recursive references:
7.13: Syntax Rules: 2.g.iii.6
7.13: Syntax Rules: 2.g.iii.7Aggregates/HAVING disallowed:
7.13: Syntax Rules: 2.g.iii.4.BMutual recursion defined:
7.13: Syntax Rules: 2.g.i.1Evaluate each WITH entry once, even if it's referenced multiple times:
7.13: General Rules: 1
7.13: General Rules: 2.b
See also:
http://archives.postgresql.org/pgsql-hackers/2008-07/msg01292.phpEvaluation order with mutual recursion:
7.13: General Rules: 2.a
7.13: General Rules: 2.bEvaluation semantics:
7.13: General Rules: 2.cDISTINCT:
7.13: General Rules: 2.c.iv
7.13: General Rules: 2.c.ix.3.AI will provide comments about the code and documentation soon. This is a
very useful feature.
Thanks. Enclosed is the latest patch to adopt CVS HEAD.
--
Tatsuo Ishii
SRA OSS, Inc. Japan
Attachments:
Thanks for the review.
[I dropped y-asaba@sraoss.co.jp from the Cc list since he has left our
company and the email address is being deleted.]
I'm going to look into issues which are seem to be bug (of course if
you know what to fix, patches are always welcome:-).
These are my initial comments about the Common Table Expressions (CTE)
patch, also known as WITH [RECURSIVE]. These comments are based on the
patch here:http://archives.postgresql.org/pgsql-patches/2008-08/msg00021.php
This is a semantically complex feature, and the standard is fairly
complex as well. So I'm approaching this by making my own
interpretations from the standard first (I included my interpretations
and section references at the end of this email) and comparing to the
behavior of the patch.The following examples may be inconsistent with the standard. Some
have already been mentioned, and I don't think they all need to be
fixed for 8.4, but I mention them here for completeness.* Mutual Recursion:
with recursive
foo(i) as (values(1) union all select i+1 from bar where i < 10),
bar(i) as (values(1) union all select i+1 from foo where i < 10)
select * from foo;
ERROR: mutual recursive call is not supportedThe standard allows mutual recursion.
The discussion seems to agree that let it leave for post 8.4.
* Single Evaluation:
with
foo(i) as (select random() as i)
select * from foo union all select * from foo;
i
-------------------
0.233165248762816
0.62126633618027
(2 rows)The standard specifies that non-recursive WITH should be evaluated
once.
What shall we do? I don't think there's an easy way to fix this as Tom
suggested. Maybe we should not allow WITH clause without RECURISVE for
8.4?
* RHS only:
with recursive
foo(i) as (select i+1 from foo where i < 10 union all values(1))
select * from foo;
ERROR: Left hand side of UNION ALL must be a non-recursive term in a
recursive queryThe standard does not require that the recursive term be on the RHS.
The discussion seems to agree that let it leave for post 8.4.
* UNION ALL only:
with recursive
foo(i) as (values(1) union select i+1 from foo where i < 10)
select * from foo;
ERROR: non-recursive term and recursive term must be combined with
UNION ALLThe standard seems to allow UNION ALL, UNION, INTERSECT, and EXCEPT
(when the recursive term is not on the RHS of the EXCEPT).
The discussion seems to agree that let it leave for post 8.4.
* Binary recursion and subselect strangeness:
with recursive foo(i) as
(values (1)
union all
select * from
(select i+1 from foo where i < 10
union all
select i+1 from foo where i < X) t)
select * from foo;Produces 10 rows of output regardless of what "X" is. This should be
fixed for 8.4.
Also, this is non-linear recursion, which the standard seems to
disallow.
I will try to fix this. However detecting the query being not a
non-linear one is not so easy.
* Multiple recursive references:
with recursive foo(i) as
(values (1)
union all
select i+1 from foo where i < 10
union all
select i+1 from foo where i < 20)
select * from foo;
ERROR: Left hand side of UNION ALL must be a non-recursive term in a
recursive queryIf we're going to allow non-linear recursion (which the standard
does not), this seems like it should be a valid case.
I will try to disallow this.
* Strange result with except:
with recursive foo(i) as
(values (1)
union all
select * from
(select i+1 from foo where i < 10
except
select i+1 from foo where i < 5) t)
select * from foo;
ERROR: table "foo" has 0 columns available but 1 columns specifiedThis query works if you replace "except" with "union". This should be
fixed for 8.4.
I will try to fix this.
* Aggregates allowed:
with recursive foo(i) as
(values(1)
union all
select max(i)+1 from foo where i < 10)
select * from foo;Aggregates should be blocked according to the standard.
Also, causes an infinite loop. This should be fixed for 8.4.
I will try to fix this.
* DISTINCT should supress duplicates:
with recursive foo(i) as
(select distinct * from (values(1),(2)) t
union all
select distinct i+1 from foo where i < 10)
select * from foo;This outputs a lot of duplicates, but they should be supressed
according to the standard. This query is essentially the same as
supporting UNION for recursive queries, so we should either fix both for
8.4 or block both for consistency.
I'm not sure if it's possible to fix this. Will look into.
* outer joins on a recursive reference should be blocked:
with recursive foo(i) as
(values(1)
union all
select i+1 from foo left join (values(1)) t on (i=column1))
select * from foo;Causes an infinite loop, but the standard says using an outer join
in this situation should be prohibited. This should be fixed for 8.4.
Not an issue, I think.
* ORDER BY, LIMIT, and OFFSET are rejected for recursive queries. The
standard does not seem to say that these should be rejected.The following are my interpretations of relevant parts of the SQL
standard (200n), and the associated sections. These are only my
interpretation, so let me know if you interpret the standard
differently.Non-linear recursion forbidden:
7.13: Syntax Rules: 2.g.ii
My interpretation of 2.g.ii.2 is that WQN[k] and WQN[l] may be the
same <query name>.
7.13: Syntax Rules: 2.g.ivEXCEPT can't be used for recursive queries if a recursive reference
appears on the RHS:
7.13: Syntax Rules: 2.g.iii.1INTERSECT ALL/EXCEPT ALL can't be used for recursive queries:
7.13: Syntax Rules: 2.g.iii.5recursive references must appear in FROM clause:
7.13: Syntax Rules: 2.g.iii.3
My interpretation of this rule is that it does not allow a
recursive reference in a subquery in the targetlist or a subquery
in the where clause.stratum defined:
7.13: Syntax Rules: 2.f
7.13: Syntax Rules: 2.g.i.1recursive query must have anchor for every stratum:
7.13: Syntax Rules: 2.g.i.3outer joins not allowed to join recursive references:
7.13: Syntax Rules: 2.g.iii.6
7.13: Syntax Rules: 2.g.iii.7Aggregates/HAVING disallowed:
7.13: Syntax Rules: 2.g.iii.4.BMutual recursion defined:
7.13: Syntax Rules: 2.g.i.1Evaluate each WITH entry once, even if it's referenced multiple times:
7.13: General Rules: 1
7.13: General Rules: 2.b
See also:
http://archives.postgresql.org/pgsql-hackers/2008-07/msg01292.phpEvaluation order with mutual recursion:
7.13: General Rules: 2.a
7.13: General Rules: 2.bEvaluation semantics:
7.13: General Rules: 2.cDISTINCT:
7.13: General Rules: 2.c.iv
7.13: General Rules: 2.c.ix.3.AI will provide comments about the code and documentation soon. This is a
very useful feature.
Thanks. Enclosed is the latest patch to adopt CVS HEAD.
--
Tatsuo Ishii
SRA OSS, Inc. Japan
Attachments:
* Single Evaluation:
with
foo(i) as (select random() as i)
select * from foo union all select * from foo;
i
-------------------
0.233165248762816
0.62126633618027
(2 rows)The standard specifies that non-recursive WITH should be evaluated
once.What shall we do? I don't think there's a easy way to fix this. Maybe
we should not allow WITH clause without RECURISVE?
ISTM that kind of misses the point. Even if it's WITH RECURSIVE
rather than simply WITH, one wouldn't expect multiple evaluations of
any non-recursive portion of the query.
...Robert
On Tue, 2008-09-09 at 13:45 +0900, Tatsuo Ishii wrote:
Thanks for the review.
The standard specifies that non-recursive WITH should be evaluated
once.What shall we do? I don't think there's a easy way to fix this. Maybe
we should not allow WITH clause without RECURISVE?
My interpretation of 7.13: General Rules: 2.b is that it should be
single evaluation, even if RECURSIVE is present.
The previous discussion was here:
http://archives.postgresql.org/pgsql-hackers/2008-07/msg01292.php
The important arguments in the thread seemed to be:
1. People will generally expect single evaluation, so might be
disappointed if they can't use this feature for that purpose.
2. It's a spec violation in the case of volatile functions.
3. "I think this is a "must fix" because of the point about volatile
functions --- changing it later will result in user-visible semantics
changes, so we have to get it right the first time."
I don't entirely agree with #3. It is user-visible, but only in the
sense that someone is depending on undocumented multiple-evaluation
behavior.
Tom Lane said that multiple evaluation is grounds for rejection:
http://archives.postgresql.org/pgsql-hackers/2008-07/msg01318.php
Is there hope of correcting this before November?
I will try to fix this. However detecting the query being not a
non-linear one is not so easy.
If we don't allow mutual recursion, the only kind of non-linear
recursion that might exist would be multiple references to the same
recursive query name in a recursive query, is that correct?
* DISTINCT should supress duplicates:
with recursive foo(i) as
(select distinct * from (values(1),(2)) t
union all
select distinct i+1 from foo where i < 10)
select * from foo;This outputs a lot of duplicates, but they should be supressed
according to the standard. This query is essentially the same as
supporting UNION for recursive queries, so we should either fix both for
8.4 or block both for consistency.I'm not sure if it's possible to fix this. Will look into.
Can't we just reject queries with top-level DISTINCT, similar to how
UNION is rejected?
* outer joins on a recursive reference should be blocked:
with recursive foo(i) as
(values(1)
union all
select i+1 from foo left join (values(1)) t on (i=column1))
select * from foo;Causes an infinite loop, but the standard says using an outer join
in this situation should be prohibited. This should be fixed for 8.4.Not an issue, I think.
Agreed, Andrew Gierth corrected me here.
Regards,
Jeff Davis
3. "I think this is a "must fix" because of the point about volatile
functions --- changing it later will result in user-visible semantics
changes, so we have to get it right the first time."I don't entirely agree with #3. It is user-visible, but only in the
sense that someone is depending on undocumented multiple-evaluation
behavior.
What makes you think it's going to be undocumented? Single versus
multiple evaluation is a keep aspect of this feature and certainly
needs to be documented one way or the other. I can't understand why
we would introduce a standard syntax with non-standard behavior, but
if we do, it certainly had better be mentioned in the documentation.
I think that the most likely result of a CTE implementation that
doesn't guarantee single evaluation is that people simply won't use
it. But anyone who does will expect that their queries will return
the same results in release N and release N+1, for all values of N.
The only way that an incompatible change of this type won't break
people's applications is if they're not using the feature in the first
place, in which case there is no point in committing it anyway.
I wonder if the whole approach to this patch is backward. Instead of
worrying about how to implement WITH RECURSIVE, maybe it would be
better to implement a really solid, spec-compliant version of WITH,
and add the RECURSIVE functionality in a later patch/release.
...Robert
On Tue, 2008-09-09 at 13:45 +0900, Tatsuo Ishii wrote:
Thanks for the review.
The standard specifies that non-recursive WITH should be evaluated
once.What shall we do? I don't think there's a easy way to fix this. Maybe
we should not allow WITH clause without RECURISVE?My interpretation of 7.13: General Rules: 2.b is that it should be
single evaluation, even if RECURSIVE is present.The previous discussion was here:
http://archives.postgresql.org/pgsql-hackers/2008-07/msg01292.php
The important arguments in the thread seemed to be:
1. People will generally expect single evaluation, so might be
disappointed if they can't use this feature for that purpose.2. It's a spec violation in the case of volatile functions.
3. "I think this is a "must fix" because of the point about volatile
functions --- changing it later will result in user-visible semantics
changes, so we have to get it right the first time."I don't entirely agree with #3. It is user-visible, but only in the
sense that someone is depending on undocumented multiple-evaluation
behavior.Tom Lane said that multiple evaluation is grounds for rejection:
http://archives.postgresql.org/pgsql-hackers/2008-07/msg01318.phpIs there hope of correcting this before November?
According to Tom, to implement "single evaluation" we need to make big
infrastructure enhancement which is likely slip the schedule for 8.4
release which Tom does not want.
So as long as Tom and other people think that is a "must fix", there
seems no hope probably.
Anyway I will continue to work on existing patches...
--
Tatsuo Ishii
SRA OSS, Inc. Japan
Show quoted text
I will try to fix this. However detecting the query being not a
non-linear one is not so easy.If we don't allow mutual recursion, the only kind of non-linear
recursion that might exist would be multiple references to the same
recursive query name in a recursive query, is that correct?* DISTINCT should supress duplicates:
with recursive foo(i) as
(select distinct * from (values(1),(2)) t
union all
select distinct i+1 from foo where i < 10)
select * from foo;This outputs a lot of duplicates, but they should be supressed
according to the standard. This query is essentially the same as
supporting UNION for recursive queries, so we should either fix both for
8.4 or block both for consistency.I'm not sure if it's possible to fix this. Will look into.
Can't we just reject queries with top-level DISTINCT, similar to how
UNION is rejected?* outer joins on a recursive reference should be blocked:
with recursive foo(i) as
(values(1)
union all
select i+1 from foo left join (values(1)) t on (i=column1))
select * from foo;Causes an infinite loop, but the standard says using an outer join
in this situation should be prohibited. This should be fixed for 8.4.Not an issue, I think.
Agreed, Andrew Gierth corrected me here.
Regards,
Jeff Davis--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Hello
2008/9/9 Tatsuo Ishii <ishii@postgresql.org>:
On Tue, 2008-09-09 at 13:45 +0900, Tatsuo Ishii wrote:
Thanks for the review.
The standard specifies that non-recursive WITH should be evaluated
once.What shall we do? I don't think there's a easy way to fix this. Maybe
we should not allow WITH clause without RECURISVE?My interpretation of 7.13: General Rules: 2.b is that it should be
single evaluation, even if RECURSIVE is present.The previous discussion was here:
http://archives.postgresql.org/pgsql-hackers/2008-07/msg01292.php
The important arguments in the thread seemed to be:
1. People will generally expect single evaluation, so might be
disappointed if they can't use this feature for that purpose.2. It's a spec violation in the case of volatile functions.
3. "I think this is a "must fix" because of the point about volatile
functions --- changing it later will result in user-visible semantics
changes, so we have to get it right the first time."I don't entirely agree with #3. It is user-visible, but only in the
sense that someone is depending on undocumented multiple-evaluation
behavior.Tom Lane said that multiple evaluation is grounds for rejection:
http://archives.postgresql.org/pgsql-hackers/2008-07/msg01318.phpIs there hope of correcting this before November?
According to Tom, to implement "single evaluation" we need to make big
infrastructure enhancement which is likely slip the schedule for 8.4
release which Tom does not want.
why? why don't use a materialisation?
So as long as Tom and other people think that is a "must fix", there
seems no hope probably.Anyway I will continue to work on existing patches...
--
I would to see your patch in core early. I am working on grouping sets
and I cannot finish my patch before your patch will be commited.
Regards
Pavel Stehule
Show quoted text
Tatsuo Ishii
SRA OSS, Inc. JapanI will try to fix this. However detecting the query being not a
non-linear one is not so easy.If we don't allow mutual recursion, the only kind of non-linear
recursion that might exist would be multiple references to the same
recursive query name in a recursive query, is that correct?* DISTINCT should supress duplicates:
with recursive foo(i) as
(select distinct * from (values(1),(2)) t
union all
select distinct i+1 from foo where i < 10)
select * from foo;This outputs a lot of duplicates, but they should be supressed
according to the standard. This query is essentially the same as
supporting UNION for recursive queries, so we should either fix both for
8.4 or block both for consistency.I'm not sure if it's possible to fix this. Will look into.
Can't we just reject queries with top-level DISTINCT, similar to how
UNION is rejected?* outer joins on a recursive reference should be blocked:
with recursive foo(i) as
(values(1)
union all
select i+1 from foo left join (values(1)) t on (i=column1))
select * from foo;Causes an infinite loop, but the standard says using an outer join
in this situation should be prohibited. This should be fixed for 8.4.Not an issue, I think.
Agreed, Andrew Gierth corrected me here.
Regards,
Jeff Davis--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Hello
2008/9/9 Tatsuo Ishii <ishii@postgresql.org>:
On Tue, 2008-09-09 at 13:45 +0900, Tatsuo Ishii wrote:
Thanks for the review.
The standard specifies that non-recursive WITH should be evaluated
once.What shall we do? I don't think there's a easy way to fix this. Maybe
we should not allow WITH clause without RECURISVE?My interpretation of 7.13: General Rules: 2.b is that it should be
single evaluation, even if RECURSIVE is present.The previous discussion was here:
http://archives.postgresql.org/pgsql-hackers/2008-07/msg01292.php
The important arguments in the thread seemed to be:
1. People will generally expect single evaluation, so might be
disappointed if they can't use this feature for that purpose.2. It's a spec violation in the case of volatile functions.
3. "I think this is a "must fix" because of the point about volatile
functions --- changing it later will result in user-visible semantics
changes, so we have to get it right the first time."I don't entirely agree with #3. It is user-visible, but only in the
sense that someone is depending on undocumented multiple-evaluation
behavior.Tom Lane said that multiple evaluation is grounds for rejection:
http://archives.postgresql.org/pgsql-hackers/2008-07/msg01318.phpIs there hope of correcting this before November?
According to Tom, to implement "single evaluation" we need to make big
infrastructure enhancement which is likely slip the schedule for 8.4
release which Tom does not want.why? why don't use a materialisation?
See:
http://archives.postgresql.org/pgsql-hackers/2008-07/msg01292.php
Show quoted text
So as long as Tom and other people think that is a "must fix", there
seems no hope probably.Anyway I will continue to work on existing patches...
--I would to see your patch in core early. I am working on grouping sets
and I cannot finish my patch before your patch will be commited.Regards
Pavel StehuleTatsuo Ishii
SRA OSS, Inc. JapanI will try to fix this. However detecting the query being not a
non-linear one is not so easy.If we don't allow mutual recursion, the only kind of non-linear
recursion that might exist would be multiple references to the same
recursive query name in a recursive query, is that correct?* DISTINCT should supress duplicates:
with recursive foo(i) as
(select distinct * from (values(1),(2)) t
union all
select distinct i+1 from foo where i < 10)
select * from foo;This outputs a lot of duplicates, but they should be supressed
according to the standard. This query is essentially the same as
supporting UNION for recursive queries, so we should either fix both for
8.4 or block both for consistency.I'm not sure if it's possible to fix this. Will look into.
Can't we just reject queries with top-level DISTINCT, similar to how
UNION is rejected?* outer joins on a recursive reference should be blocked:
with recursive foo(i) as
(values(1)
union all
select i+1 from foo left join (values(1)) t on (i=column1))
select * from foo;Causes an infinite loop, but the standard says using an outer join
in this situation should be prohibited. This should be fixed for 8.4.Not an issue, I think.
Agreed, Andrew Gierth corrected me here.
Regards,
Jeff Davis--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Hello
2008/9/9 Tatsuo Ishii <ishii@postgresql.org>:
On Tue, 2008-09-09 at 13:45 +0900, Tatsuo Ishii wrote:
Thanks for the review.
The standard specifies that non-recursive WITH should be evaluated
once.What shall we do? I don't think there's a easy way to fix this. Maybe
we should not allow WITH clause without RECURISVE?My interpretation of 7.13: General Rules: 2.b is that it should be
single evaluation, even if RECURSIVE is present.The previous discussion was here:
http://archives.postgresql.org/pgsql-hackers/2008-07/msg01292.php
The important arguments in the thread seemed to be:
1. People will generally expect single evaluation, so might be
disappointed if they can't use this feature for that purpose.2. It's a spec violation in the case of volatile functions.
3. "I think this is a "must fix" because of the point about volatile
functions --- changing it later will result in user-visible semantics
changes, so we have to get it right the first time."I don't entirely agree with #3. It is user-visible, but only in the
sense that someone is depending on undocumented multiple-evaluation
behavior.Tom Lane said that multiple evaluation is grounds for rejection:
http://archives.postgresql.org/pgsql-hackers/2008-07/msg01318.phpIs there hope of correcting this before November?
According to Tom, to implement "single evaluation" we need to make big
infrastructure enhancement which is likely slip the schedule for 8.4
release which Tom does not want.why? why don't use a materialisation?
See:
http://archives.postgresql.org/pgsql-hackers/2008-07/msg01292.php
Show quoted text
So as long as Tom and other people think that is a "must fix", there
seems no hope probably.Anyway I will continue to work on existing patches...
--I would to see your patch in core early. I am working on grouping sets
and I cannot finish my patch before your patch will be commited.Regards
Pavel StehuleTatsuo Ishii
SRA OSS, Inc. JapanI will try to fix this. However detecting the query being not a
non-linear one is not so easy.If we don't allow mutual recursion, the only kind of non-linear
recursion that might exist would be multiple references to the same
recursive query name in a recursive query, is that correct?* DISTINCT should supress duplicates:
with recursive foo(i) as
(select distinct * from (values(1),(2)) t
union all
select distinct i+1 from foo where i < 10)
select * from foo;This outputs a lot of duplicates, but they should be supressed
according to the standard. This query is essentially the same as
supporting UNION for recursive queries, so we should either fix both for
8.4 or block both for consistency.I'm not sure if it's possible to fix this. Will look into.
Can't we just reject queries with top-level DISTINCT, similar to how
UNION is rejected?* outer joins on a recursive reference should be blocked:
with recursive foo(i) as
(values(1)
union all
select i+1 from foo left join (values(1)) t on (i=column1))
select * from foo;Causes an infinite loop, but the standard says using an outer join
in this situation should be prohibited. This should be fixed for 8.4.Not an issue, I think.
Agreed, Andrew Gierth corrected me here.
Regards,
Jeff Davis--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
* Aggregates allowed:
with recursive foo(i) as
(values(1)
union all
select max(i)+1 from foo where i < 10)
select * from foo;Aggregates should be blocked according to the standard.
Also, causes an infinite loop. This should be fixed for 8.4.I will try to fix this.
We already reject:
select max(i) from foo where i < 10)
But max(i)+1 seems to slip the check. I looked into this I found the
patch tried to detect the case before analyzing(see
parser/parse_cte.c) which is not a right thing I think.
I think we could detect the case by adding more checking in
parseCheckAggregates():
/*
* Check if there's aggregate function in a recursive term.
*/
foreach(l, qry->rtable)
{
RangeTblEntry *rte = (RangeTblEntry *) lfirst(l);
if (qry->hasAggs && rte->rtekind == RTE_RECURSIVE &&
rte->self_reference)
{
ereport(ERROR,
(errcode(ERRCODE_SYNTAX_ERROR),
errmsg("aggregate functions in a recursive term not allowed")));
}
}
What do you think?
--
Tatsuo Ishii
SRA OSS, Inc. Japan
On Tue, 2008-09-09 at 09:47 -0400, Robert Haas wrote:
3. "I think this is a "must fix" because of the point about volatile
functions --- changing it later will result in user-visible semantics
changes, so we have to get it right the first time."I don't entirely agree with #3. It is user-visible, but only in the
sense that someone is depending on undocumented multiple-evaluation
behavior.What makes you think it's going to be undocumented? Single versus
multiple evaluation is a keep aspect of this feature and certainly
needs to be documented one way or the other. I can't understand why
we would introduce a standard syntax with non-standard behavior, but
if we do, it certainly had better be mentioned in the documentation.
I meant that -- hypothetically if we did accept the feature as-is -- the
number of evaluations would be documented to be undefined, not N. That
would avoid the backwards-compatibility problem.
This one point is probably not worth discussing now, because argument
#1 and #2 stand on their own.
Regards,
Jeff Davis
2008/9/9 Tatsuo Ishii <ishii@sraoss.co.jp>:
Hello
2008/9/9 Tatsuo Ishii <ishii@postgresql.org>:
On Tue, 2008-09-09 at 13:45 +0900, Tatsuo Ishii wrote:
Thanks for the review.
The standard specifies that non-recursive WITH should be evaluated
once.What shall we do? I don't think there's a easy way to fix this. Maybe
we should not allow WITH clause without RECURISVE?My interpretation of 7.13: General Rules: 2.b is that it should be
single evaluation, even if RECURSIVE is present.The previous discussion was here:
http://archives.postgresql.org/pgsql-hackers/2008-07/msg01292.php
I am blind, I didn't find any reason, why materialisation isn't useable.
Regards
Pavel
Show quoted text
The important arguments in the thread seemed to be:
1. People will generally expect single evaluation, so might be
disappointed if they can't use this feature for that purpose.2. It's a spec violation in the case of volatile functions.
3. "I think this is a "must fix" because of the point about volatile
functions --- changing it later will result in user-visible semantics
changes, so we have to get it right the first time."I don't entirely agree with #3. It is user-visible, but only in the
sense that someone is depending on undocumented multiple-evaluation
behavior.Tom Lane said that multiple evaluation is grounds for rejection:
http://archives.postgresql.org/pgsql-hackers/2008-07/msg01318.phpIs there hope of correcting this before November?
According to Tom, to implement "single evaluation" we need to make big
infrastructure enhancement which is likely slip the schedule for 8.4
release which Tom does not want.why? why don't use a materialisation?
See:
http://archives.postgresql.org/pgsql-hackers/2008-07/msg01292.phpSo as long as Tom and other people think that is a "must fix", there
seems no hope probably.Anyway I will continue to work on existing patches...
--I would to see your patch in core early. I am working on grouping sets
and I cannot finish my patch before your patch will be commited.Regards
Pavel StehuleTatsuo Ishii
SRA OSS, Inc. JapanI will try to fix this. However detecting the query being not a
non-linear one is not so easy.If we don't allow mutual recursion, the only kind of non-linear
recursion that might exist would be multiple references to the same
recursive query name in a recursive query, is that correct?* DISTINCT should supress duplicates:
with recursive foo(i) as
(select distinct * from (values(1),(2)) t
union all
select distinct i+1 from foo where i < 10)
select * from foo;This outputs a lot of duplicates, but they should be supressed
according to the standard. This query is essentially the same as
supporting UNION for recursive queries, so we should either fix both for
8.4 or block both for consistency.I'm not sure if it's possible to fix this. Will look into.
Can't we just reject queries with top-level DISTINCT, similar to how
UNION is rejected?* outer joins on a recursive reference should be blocked:
with recursive foo(i) as
(values(1)
union all
select i+1 from foo left join (values(1)) t on (i=column1))
select * from foo;Causes an infinite loop, but the standard says using an outer join
in this situation should be prohibited. This should be fixed for 8.4.Not an issue, I think.
Agreed, Andrew Gierth corrected me here.
Regards,
Jeff Davis--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
My interpretation of 7.13: General Rules: 2.b is that it should be
single evaluation, even if RECURSIVE is present.The previous discussion was here:
http://archives.postgresql.org/pgsql-hackers/2008-07/msg01292.php
I am blind, I didn't find any reason, why materialisation isn't useable.
I believe it's because of these two (closely related) problems:
# The basic
# implementation clearly ought to be to dump the result of the subquery
# into a tuplestore and then have the upper level read out from that.
# However, we don't have any infrastructure for having multiple
# upper-level RTEs reference the same tuplestore. (Perhaps the InitPlan
# infrastructure could be enhanced in that direction, but it's not ready
# for non-scalar outputs today.) Also, I think we'd have to teach
# tuplestore how to support multiple readout cursors. For example,
# consider
# WITH foo AS (SELECT ...) SELECT ... FROM foo a, foo b WHERE ...
# If the planner chooses to do the join as a nested loop then each
# Scan node needs to keep track of its own place in the tuplestore,
# concurrently with the other node having a different place.
...Robert