WITH RECUSIVE patches 0723
Hi,
Here is the lastest WITH RECURSIVE patches against 2007/07/17 CVS (CVS
HEAD won't compile for me).This version includes regression tests and is almost ready for commit
IMO.
I pulled fresh CVS HEAD and it seems the problem is gone. Here is the
lastest WITH RECURSIVE patches against CVS HEAD.
Reviewers, please let me know if you find problems with the
patches. If none, I would like to commit this weekend.
--
Tatsuo Ishii
SRA OSS, Inc. Japan
Attachments:
recursive_query.patch.gzapplication/octet-streamDownload
Tatsuo Ishii <ishii@postgresql.org> writes:
Reviewers, please let me know if you find problems with the
patches. If none, I would like to commit this weekend.
Has this patch actually been reviewed yet? The only reports I've
seen are from testing; nothing from anyone actually reading the
code. I know I've not looked at it yet.
regards, tom lane
Tatsuo Ishii <ishii@postgresql.org> writes:
Reviewers, please let me know if you find problems with the
patches. If none, I would like to commit this weekend.Has this patch actually been reviewed yet? The only reports I've
seen are from testing; nothing from anyone actually reading the
code. I know I've not looked at it yet.
The reviewer registered at the Wiki is David Fetter and I believe he
is reading the patches. Michael Makes has contributed the ecpg
part. So apparently he is knowing the ecpg part at least.
I know the patch is huge. Reviewers, please let me know if you have
any question about the code. I would like to do anything for helping
the review.
--
Tatsuo Ishii
SRA OSS, Inc. Japan
On Wed, Jul 23, 2008 at 10:59:20AM -0400, Tom Lane wrote:
Tatsuo Ishii <ishii@postgresql.org> writes:
Reviewers, please let me know if you find problems with the
patches. If none, I would like to commit this weekend.Has this patch actually been reviewed yet? The only reports I've
seen are from testing; nothing from anyone actually reading the
code. I know I've not looked at it yet.
I've read the code, for what that's worth, which isn't much. I just
tried out this patch on a fresh checkout of CVS TIP and found:
EXPLAIN WITH RECURSIVE t(i) AS (VALUES(1::int) UNION ALL SELECT i+1 FROM t WHERE i < 5) SELECT * FROM t AS t1 JOIN t AS t2 USING(i);
QUERY PLAN
-------------------------------------------------------------------------------------
Hash Join (cost=0.08..0.16 rows=2 width=4)
Hash Cond: (t1.i = t2.i)
-> Recursion on t1 (cost=0.00..0.06 rows=2 width=4)
-> Append (cost=0.00..0.04 rows=2 width=4)
-> Values Scan on "*VALUES*" (cost=0.00..0.01 rows=1 width=4)
-> Recursive Scan on t (cost=0.00..0.00 rows=1 width=4)
Filter: (i < 5)
-> Hash (cost=0.06..0.06 rows=2 width=4)
-> Recursion on t2 (cost=0.00..0.06 rows=2 width=4)
-> Append (cost=0.00..0.04 rows=2 width=4)
-> Values Scan on "*VALUES*" (cost=0.00..0.01 rows=1 width=4)
-> Recursive Scan on t (cost=0.00..0.00 rows=1 width=4)
Filter: (i < 5)
(13 rows)
When I try to execute the query without the EXPLAIN, having attached a debugger
to the back-end, I get.
(gdb) continue
Continuing.
Program received signal SIGSEGV, Segmentation fault.
0x08192dcd in ExecQual (qual=0xa183824, econtext=0xa183230, resultForNull=0 '\0') at execQual.c:4513
4513 expr_value = ExecEvalExpr(clause, econtext, &isNull, NULL);
(gdb) i s
#0 0x08192dcd in ExecQual (qual=0xa183824, econtext=0xa183230, resultForNull=0 '\0') at execQual.c:4513
#1 0x08199b23 in ExecScan (node=0xa1831a4, accessMtd=0x81a6bb0 <RecursivescanNext>) at execScan.c:131
#2 0x081a6ba9 in ExecRecursiveScan (node=0xa1831a4) at nodeRecursivescan.c:48
#3 0x08192320 in ExecProcNode (node=0xa1831a4) at execProcnode.c:380
#4 0x081a6923 in RecursionNext (node=0xa17fe18) at nodeRecursion.c:68
#5 0x08199a83 in ExecScan (node=0xa17fe18, accessMtd=0x81a6840 <RecursionNext>) at execScan.c:68
#6 0x081a6839 in ExecRecursion (node=0xa17fe18) at nodeRecursion.c:116
#7 0x081923e0 in ExecProcNode (node=0xa17fe18) at execProcnode.c:339
#8 0x081a1c13 in MultiExecHash (node=0xa17fcc4) at nodeHash.c:94
#9 0x081a28b9 in ExecHashJoin (node=0xa17b654) at nodeHashjoin.c:159
#10 0x081922d8 in ExecProcNode (node=0xa17b654) at execProcnode.c:395
#11 0x081901db in standard_ExecutorRun (queryDesc=0xa15c334, direction=ForwardScanDirection, count=0) at execMain.c:1271
#12 0x08240dc4 in PortalRunSelect (portal=0xa15631c, forward=1 '\001', count=0, dest=0xa1733d8) at pquery.c:937
#13 0x082420e6 in PortalRun (portal=0xa15631c, count=2147483647, isTopLevel=1 '\001', dest=0xa1733d8, altdest=0xa1733d8,
completionTag=0xbfcacaea "") at pquery.c:793
#14 0x0823d0a7 in exec_simple_query (
query_string=0xa12fc9c "WITH RECURSIVE t(i) AS (VALUES(1::int) UNION ALL SELECT i+1 FROM t WHERE i < 5) SELECT * FROM t AS t1 JOIN t AS t2 USING(i);") at postgres.c:977
#15 0x0823e84c in PostgresMain (argc=4, argv=0xa0d0dac, username=0xa0d0d7c "shackle") at postgres.c:3559
#16 0x0820957f in ServerLoop () at postmaster.c:3238
#17 0x0820a4e0 in PostmasterMain (argc=3, argv=0xa0ced50) at postmaster.c:1023
#18 0x081b69d6 in main (argc=3, argv=0xa0ced50) at main.c:188
What other information could help track down this problem?
Cheers,
David.
--
David Fetter <david@fetter.org> http://fetter.org/
Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter
Skype: davidfetter XMPP: david.fetter@gmail.com
Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate
On Wed, Jul 23, 2008 at 10:59:20AM -0400, Tom Lane wrote:
Tatsuo Ishii <ishii@postgresql.org> writes:
Reviewers, please let me know if you find problems with the
patches. If none, I would like to commit this weekend.Has this patch actually been reviewed yet? The only reports I've
seen are from testing; nothing from anyone actually reading the
code. I know I've not looked at it yet.I've read the code, for what that's worth, which isn't much. I just
tried out this patch on a fresh checkout of CVS TIP and found:EXPLAIN WITH RECURSIVE t(i) AS (VALUES(1::int) UNION ALL SELECT i+1 FROM t WHERE i < 5) SELECT * FROM t AS t1 JOIN t AS t2 USING(i);
QUERY PLAN
-------------------------------------------------------------------------------------
Hash Join (cost=0.08..0.16 rows=2 width=4)
Hash Cond: (t1.i = t2.i)
-> Recursion on t1 (cost=0.00..0.06 rows=2 width=4)
-> Append (cost=0.00..0.04 rows=2 width=4)
-> Values Scan on "*VALUES*" (cost=0.00..0.01 rows=1 width=4)
-> Recursive Scan on t (cost=0.00..0.00 rows=1 width=4)
Filter: (i < 5)
-> Hash (cost=0.06..0.06 rows=2 width=4)
-> Recursion on t2 (cost=0.00..0.06 rows=2 width=4)
-> Append (cost=0.00..0.04 rows=2 width=4)
-> Values Scan on "*VALUES*" (cost=0.00..0.01 rows=1 width=4)
-> Recursive Scan on t (cost=0.00..0.00 rows=1 width=4)
Filter: (i < 5)
(13 rows)When I try to execute the query without the EXPLAIN, having attached a debugger
to the back-end, I get.
Thanks for the report. We will look into this.
--
Tatsuo Ishii
SRA OSS, Inc. Japan
Show quoted text
(gdb) continue
Continuing.Program received signal SIGSEGV, Segmentation fault.
0x08192dcd in ExecQual (qual=0xa183824, econtext=0xa183230, resultForNull=0 '\0') at execQual.c:4513
4513 expr_value = ExecEvalExpr(clause, econtext, &isNull, NULL);
(gdb) i s
#0 0x08192dcd in ExecQual (qual=0xa183824, econtext=0xa183230, resultForNull=0 '\0') at execQual.c:4513
#1 0x08199b23 in ExecScan (node=0xa1831a4, accessMtd=0x81a6bb0 <RecursivescanNext>) at execScan.c:131
#2 0x081a6ba9 in ExecRecursiveScan (node=0xa1831a4) at nodeRecursivescan.c:48
#3 0x08192320 in ExecProcNode (node=0xa1831a4) at execProcnode.c:380
#4 0x081a6923 in RecursionNext (node=0xa17fe18) at nodeRecursion.c:68
#5 0x08199a83 in ExecScan (node=0xa17fe18, accessMtd=0x81a6840 <RecursionNext>) at execScan.c:68
#6 0x081a6839 in ExecRecursion (node=0xa17fe18) at nodeRecursion.c:116
#7 0x081923e0 in ExecProcNode (node=0xa17fe18) at execProcnode.c:339
#8 0x081a1c13 in MultiExecHash (node=0xa17fcc4) at nodeHash.c:94
#9 0x081a28b9 in ExecHashJoin (node=0xa17b654) at nodeHashjoin.c:159
#10 0x081922d8 in ExecProcNode (node=0xa17b654) at execProcnode.c:395
#11 0x081901db in standard_ExecutorRun (queryDesc=0xa15c334, direction=ForwardScanDirection, count=0) at execMain.c:1271
#12 0x08240dc4 in PortalRunSelect (portal=0xa15631c, forward=1 '\001', count=0, dest=0xa1733d8) at pquery.c:937
#13 0x082420e6 in PortalRun (portal=0xa15631c, count=2147483647, isTopLevel=1 '\001', dest=0xa1733d8, altdest=0xa1733d8,
completionTag=0xbfcacaea "") at pquery.c:793
#14 0x0823d0a7 in exec_simple_query (
query_string=0xa12fc9c "WITH RECURSIVE t(i) AS (VALUES(1::int) UNION ALL SELECT i+1 FROM t WHERE i < 5) SELECT * FROM t AS t1 JOIN t AS t2 USING(i);") at postgres.c:977
#15 0x0823e84c in PostgresMain (argc=4, argv=0xa0d0dac, username=0xa0d0d7c "shackle") at postgres.c:3559
#16 0x0820957f in ServerLoop () at postmaster.c:3238
#17 0x0820a4e0 in PostmasterMain (argc=3, argv=0xa0ced50) at postmaster.c:1023
#18 0x081b69d6 in main (argc=3, argv=0xa0ced50) at main.c:188What other information could help track down this problem?
Cheers,
David.
--
David Fetter <david@fetter.org> http://fetter.org/
Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter
Skype: davidfetter XMPP: david.fetter@gmail.comRemember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate
On Wed, Jul 23, 2008 at 10:59:20AM -0400, Tom Lane wrote:
Tatsuo Ishii <ishii@postgresql.org> writes:
Reviewers, please let me know if you find problems with the
patches. If none, I would like to commit this weekend.Has this patch actually been reviewed yet? The only reports I've
seen are from testing; nothing from anyone actually reading the
code. I know I've not looked at it yet.I've read the code, for what that's worth, which isn't much. I just
tried out this patch on a fresh checkout of CVS TIP and found:EXPLAIN WITH RECURSIVE t(i) AS (VALUES(1::int) UNION ALL SELECT i+1 FROM t WHERE i < 5) SELECT * FROM t AS t1 JOIN t AS t2 USING(i);
QUERY PLAN
-------------------------------------------------------------------------------------
Hash Join (cost=0.08..0.16 rows=2 width=4)
Hash Cond: (t1.i = t2.i)
-> Recursion on t1 (cost=0.00..0.06 rows=2 width=4)
-> Append (cost=0.00..0.04 rows=2 width=4)
-> Values Scan on "*VALUES*" (cost=0.00..0.01 rows=1 width=4)
-> Recursive Scan on t (cost=0.00..0.00 rows=1 width=4)
Filter: (i < 5)
-> Hash (cost=0.06..0.06 rows=2 width=4)
-> Recursion on t2 (cost=0.00..0.06 rows=2 width=4)
-> Append (cost=0.00..0.04 rows=2 width=4)
-> Values Scan on "*VALUES*" (cost=0.00..0.01 rows=1 width=4)
-> Recursive Scan on t (cost=0.00..0.00 rows=1 width=4)
Filter: (i < 5)
(13 rows)When I try to execute the query without the EXPLAIN, having attached a debugger
to the back-end, I get.(gdb) continue
Continuing.Program received signal SIGSEGV, Segmentation fault.
Thanks for the report. Here is the new patches from Yoshiyuki. It
appeared that addRangeTableEntryForRecursive() needs to do deep copy
for the subquery and ref in the RangeTblEntry to avoid double free
bug (remember that your example is a self join case).
Also I added your query to the regression test case with minor
modifications.
--
Tatsuo Ishii
SRA OSS, Inc. Japan
Attachments:
Tatsuo Ishii <ishii@postgresql.org> writes:
Reviewers, please let me know if you find problems with the
patches. If none, I would like to commit this weekend.
Given that everyone who has tested this has found a different way to
crash it, and that the frequency of crash reports shows no signs of
slowing down, I have to think that committing it is premature.
I tried to look through the patch just now and failed to make any
sense of it, because of the complete absence of documentation.
Two unexplained examples added to the SELECT reference page don't
do it for me. I want to see an explanation of exactly what behaviors
are intended to be provided (and, in view of the long TODO list that
was posted awhile back, what isn't provided). And there needs to be
more than zero internal documentation. A README file, or perhaps
a very long file header comment, needs to be provided to explain
what's supposed to happen, when, and where when processing a recursive
query. (For comparison look at the README.HOT file that was created
to explain the HOT patch --- something at about that level of detail
would help this patch a lot. Or consider adding a section to
chapter 43 in the SGML docs.)
We really can't accept a patch that is so poorly documented as to
be unreviewable. Unreviewable also means it'll be unmaintainable
going forward.
regards, tom lane
On Thu, Jul 24, 2008 at 01:55:37PM +0900, Tatsuo Ishii wrote:
Program received signal SIGSEGV, Segmentation fault.
Thanks for the report. Here is the new patches from Yoshiyuki.
Thanks for the patch :)
Now, I get a different problem, this time with the following code
intended to materialize paths on the fly and summarize down to a
certain depth in a tree:
CREATE TABLE tree(
id INTEGER PRIMARY KEY,
parent_id INTEGER REFERENCES tree(id)
);
INSERT INTO tree
VALUES (1, NULL), (2, 1), (3,1), (4,2), (5,2), (6,2), (7,3), (8,3),
(9,4), (10,4), (11,7), (12,7), (13,7), (14, 9), (15,11), (16,11);
WITH RECURSIVE t(id, path) AS (
VALUES(1,ARRAY[NULL::integer])
UNION ALL
SELECT tree.id, t.path || tree.id
FROM tree JOIN t ON (tree.parent_id = t.id)
)
SELECT
t1.id, count(t2.*)
FROM
t t1
JOIN
t t2
ON (
t1.path[1:2] = t2.path[1:2]
AND
array_upper(t1.path,1) = 2
AND
array_upper(t2.path,1) > 2
)
GROUP BY t1.id;
ERROR: unrecognized node type: 203
Please apply the attached patch to help out with tab
completion in psql.
Cheers,
David.
--
David Fetter <david@fetter.org> http://fetter.org/
Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter
Skype: davidfetter XMPP: david.fetter@gmail.com
Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate
Attachments:
wr_tab_complete.patchtext/plain; charset=us-asciiDownload+13-4
Now, I get a different problem, this time with the following code
intended to materialize paths on the fly and summarize down to a
certain depth in a tree:CREATE TABLE tree(
id INTEGER PRIMARY KEY,
parent_id INTEGER REFERENCES tree(id)
);INSERT INTO tree
VALUES (1, NULL), (2, 1), (3,1), (4,2), (5,2), (6,2), (7,3), (8,3),
(9,4), (10,4), (11,7), (12,7), (13,7), (14, 9), (15,11), (16,11);WITH RECURSIVE t(id, path) AS (
VALUES(1,ARRAY[NULL::integer])
UNION ALL
SELECT tree.id, t.path || tree.id
FROM tree JOIN t ON (tree.parent_id = t.id)
)
SELECT
t1.id, count(t2.*)
FROM
t t1
JOIN
t t2
ON (
t1.path[1:2] = t2.path[1:2]
AND
array_upper(t1.path,1) = 2
AND
array_upper(t2.path,1) > 2
)
GROUP BY t1.id;
ERROR: unrecognized node type: 203
Thanks for the report. We will look into this.
Please apply the attached patch to help out with tab
completion in psql.
Ok, it will appear in the next patches.
--
Tatsuo Ishii
SRA OSS, Inc. Japan
Thanks for the patch :)
Now, I get a different problem, this time with the following code
intended to materialize paths on the fly and summarize down to a
certain depth in a tree:CREATE TABLE tree(
id INTEGER PRIMARY KEY,
parent_id INTEGER REFERENCES tree(id)
);INSERT INTO tree
VALUES (1, NULL), (2, 1), (3,1), (4,2), (5,2), (6,2), (7,3), (8,3),
(9,4), (10,4), (11,7), (12,7), (13,7), (14, 9), (15,11), (16,11);WITH RECURSIVE t(id, path) AS (
VALUES(1,ARRAY[NULL::integer])
UNION ALL
SELECT tree.id, t.path || tree.id
FROM tree JOIN t ON (tree.parent_id = t.id)
)
SELECT
t1.id, count(t2.*)
FROM
t t1
JOIN
t t2
ON (
t1.path[1:2] = t2.path[1:2]
AND
array_upper(t1.path,1) = 2
AND
array_upper(t2.path,1) > 2
)
GROUP BY t1.id;
ERROR: unrecognized node type: 203
Thanks for the report. Here is the new patches from Yoshiyuki against
CVS HEAD. Also I have added your test case to the regression test.
Please apply the attached patch to help out with tab
completion in psql.
Thanks. Your patches has been included.
--
Tatsuo Ishii
SRA OSS, Inc. Japan
Attachments:
recursive_query.patch.gzapplication/octet-streamDownload+1-0
At David's request I've been looking through this patch.
Regarding documentation: if it would help, I can write some; I have
already made a start on writing down what is going on internally in
order to understand it myself.
I've found three more bugs so far:
1)
create view v2(id) as values (1);
with recursive t(id) as (select id from v2
union all select id+1 from t where id < 5)
select * from t;
ERROR: could not open relation 1663/16384/24588: No such file or directory
Here it seems that rewriting is simply not being applied to CTEs where
a recursive clause is present; the reference to "v2" remains in the
query up until execution time, at which point it errors out (in
ExecInitSeqScan called from InitPlan).
2)
with recursive t(id) as (values (1)
union all select id+1 from t where id < 5
union all values (2))
select * from t;
ERROR: table "t" has 0 columns available but 1 columns specified
This seems to be caused by incorrect assumptions in checkWellFormedCte
and checkCteSelectStmt (which should have been rejecting the query).
The query tree as seen by checkWellFormedCte here is (values(1) union
all select ...) union all (values (2)), and when the left subtree is
passed to checkCteSelectStmt, it believes it to be non-recursive due
to the lack of any From clause. The unexpected error is produced
later.
3)
with recursive t(id)
as (values (1)
union all select t.id+1
from t left join (values (1)) as s(x) on (false)
where t.id < 5)
select * from t;
id
----
1
2
(2 rows)
This behaviour is clearly intentional, since the entire mechanism of
estate->es_disallow_tuplestore exists for no other reason, but it
seems to me to be clearly wrong. What is the justification for it?
--
Andrew (irc:RhodiumToad)
At David's request I've been looking through this patch.
Regarding documentation: if it would help, I can write some; I have
already made a start on writing down what is going on internally in
order to understand it myself.
Thanks. There was some docs written in Japanese by Yoshiyuki. Recently
he updagted it. I will translate into English and post here.
I've found three more bugs so far:
1)
create view v2(id) as values (1);
with recursive t(id) as (select id from v2
union all select id+1 from t where id < 5)
select * from t;
ERROR: could not open relation 1663/16384/24588: No such file or directoryHere it seems that rewriting is simply not being applied to CTEs where
a recursive clause is present; the reference to "v2" remains in the
query up until execution time, at which point it errors out (in
ExecInitSeqScan called from InitPlan).
Yes, we need to make the rewrite system to understand CTEs. Probably
fireRIRrules() needs to have lines something like:
if (rte->rtekind == RTE_RECURSIVE)
{
rte->non_recursive_query = fireRIRrules(rte->non_recursive_query, activeRIRs);
continue;
}
But I still see the error message. Will look into more.
For below, I will ask Yoshiyuki.
--
Tatsuo Ishii
SRA OSS, Inc. Japan
Show quoted text
2)
with recursive t(id) as (values (1)
union all select id+1 from t where id < 5
union all values (2))
select * from t;
ERROR: table "t" has 0 columns available but 1 columns specifiedThis seems to be caused by incorrect assumptions in checkWellFormedCte
and checkCteSelectStmt (which should have been rejecting the query).
The query tree as seen by checkWellFormedCte here is (values(1) union
all select ...) union all (values (2)), and when the left subtree is
passed to checkCteSelectStmt, it believes it to be non-recursive due
to the lack of any From clause. The unexpected error is produced
later.3)
with recursive t(id)
as (values (1)
union all select t.id+1
from t left join (values (1)) as s(x) on (false)
where t.id < 5)
select * from t;
id
----
1
2
(2 rows)This behaviour is clearly intentional, since the entire mechanism of
estate->es_disallow_tuplestore exists for no other reason, but it
seems to me to be clearly wrong. What is the justification for it?--
Andrew (irc:RhodiumToad)--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Hello
I played with CTE and I have to say, it's great feature - great work.
One questions - can I enforce materialisation of query?
It would be usefull for some analytical queries like:
with tmp as (select a, sum(b) as b from test) select * from tmp union
all select 'all', sum(b) from tmp;
regards
Pavel Stehule
2008/7/27 Tatsuo Ishii <ishii@postgresql.org>:
Show quoted text
At David's request I've been looking through this patch.
Regarding documentation: if it would help, I can write some; I have
already made a start on writing down what is going on internally in
order to understand it myself.Thanks. There was some docs written in Japanese by Yoshiyuki. Recently
he updagted it. I will translate into English and post here.I've found three more bugs so far:
1)
create view v2(id) as values (1);
with recursive t(id) as (select id from v2
union all select id+1 from t where id < 5)
select * from t;
ERROR: could not open relation 1663/16384/24588: No such file or directoryHere it seems that rewriting is simply not being applied to CTEs where
a recursive clause is present; the reference to "v2" remains in the
query up until execution time, at which point it errors out (in
ExecInitSeqScan called from InitPlan).Yes, we need to make the rewrite system to understand CTEs. Probably
fireRIRrules() needs to have lines something like:if (rte->rtekind == RTE_RECURSIVE)
{
rte->non_recursive_query = fireRIRrules(rte->non_recursive_query, activeRIRs);
continue;
}But I still see the error message. Will look into more.
For below, I will ask Yoshiyuki.
--
Tatsuo Ishii
SRA OSS, Inc. Japan2)
with recursive t(id) as (values (1)
union all select id+1 from t where id < 5
union all values (2))
select * from t;
ERROR: table "t" has 0 columns available but 1 columns specifiedThis seems to be caused by incorrect assumptions in checkWellFormedCte
and checkCteSelectStmt (which should have been rejecting the query).
The query tree as seen by checkWellFormedCte here is (values(1) union
all select ...) union all (values (2)), and when the left subtree is
passed to checkCteSelectStmt, it believes it to be non-recursive due
to the lack of any From clause. The unexpected error is produced
later.3)
with recursive t(id)
as (values (1)
union all select t.id+1
from t left join (values (1)) as s(x) on (false)
where t.id < 5)
select * from t;
id
----
1
2
(2 rows)This behaviour is clearly intentional, since the entire mechanism of
estate->es_disallow_tuplestore exists for no other reason, but it
seems to me to be clearly wrong. What is the justification for it?--
Andrew (irc:RhodiumToad)--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
At David's request I've been looking through this patch.
Regarding documentation: if it would help, I can write some; I have
already made a start on writing down what is going on internally in
order to understand it myself.I've found three more bugs so far:
1)
create view v2(id) as values (1);
with recursive t(id) as (select id from v2
union all select id+1 from t where id < 5)
select * from t;
ERROR: could not open relation 1663/16384/24588: No such file or directoryHere it seems that rewriting is simply not being applied to CTEs where
a recursive clause is present; the reference to "v2" remains in the
query up until execution time, at which point it errors out (in
ExecInitSeqScan called from InitPlan).2)
with recursive t(id) as (values (1)
union all select id+1 from t where id < 5
union all values (2))
select * from t;
ERROR: table "t" has 0 columns available but 1 columns specifiedThis seems to be caused by incorrect assumptions in checkWellFormedCte
and checkCteSelectStmt (which should have been rejecting the query).
The query tree as seen by checkWellFormedCte here is (values(1) union
all select ...) union all (values (2)), and when the left subtree is
passed to checkCteSelectStmt, it believes it to be non-recursive due
to the lack of any From clause. The unexpected error is produced
later.
Included patches from Yoshiyuki should fix 1) and 2). I also add your
SQLs to the regression test. Thanks.
3)
with recursive t(id)
as (values (1)
union all select t.id+1
from t left join (values (1)) as s(x) on (false)
where t.id < 5)
select * from t;
id
----
1
2
(2 rows)This behaviour is clearly intentional, since the entire mechanism of
estate->es_disallow_tuplestore exists for no other reason, but it
seems to me to be clearly wrong. What is the justification for it?
Yes, this is due to prevent infinit recursion caused by following
case for example.
CREATE TABLE test (a TEXT, b TEXT);
INSERT INTO test VALUES ('aaa', 'bbb');
INSERT INTO test VALUES ('bbb', 'ccc');
INSERT INTO test VALUES ('ddd', 'eee');
INSERT INTO test VALUES ('ccc', 'qqq');
WITH RECURSIVE x AS (
SELECT * FROM test WHERE a = 'aaa'
UNION ALL
SELECT test.* FROM x LEFT JOIN test on test.a = x.b
) SELECT * FROM x;
Now we think that we were wrong. This type of query should run into
infinit recursion and it's user's responsibility that he does not make
such a query.
Another idea would be prohibiting *any* outer joins in the recursive
term (DB2 style), but this may be overkill.
--
Tatsuo Ishii
SRA OSS, Inc. Japan
Attachments:
Now we think that we were wrong. This type of query should run into
infinit recursion and it's user's responsibility that he does not make
such a query.Another idea would be prohibiting *any* outer joins in the recursive
term (DB2 style), but this may be overkill.
Even if you were to do that, I'm pretty sure that you'd find that it
is insufficient to prevent infinite recursion. I suspect it's not
difficult to show that SQL with WITH RECURSIVE is Turing-complete, and
therefore detecting infinite recursion is equivalent to the Halting
problem.
http://en.wikipedia.org/wiki/Halting_problem
Even if it isn't, someone can always call a function written in any of
the procedural languages, which is definitely sufficient to make it
Turing-complete. Then you don't even need WITH RECURSIVE:
rhaas=# create or replace function keep_going() returns setof integer as $$
begin
loop
return next 1;
end loop;
end
$$ language plpgsql;
CREATE FUNCTION
rhaas=# select * from keep_going();
<...>
The way to attack this is by putting in some kind of logic that cuts
the query off when the result-set gets too large or consumes too much
memory or CPU time or something along those lines. Actually detecting
or preventing infinite loops is impossible, and not the real goal
anyway, since a query that generates 10^100 rows is for all practical
purposes just as bad.
...Robert
"Tatsuo" == Tatsuo Ishii <ishii@postgresql.org> writes:
This behaviour is clearly intentional, since the entire mechanism of
estate-> es_disallow_tuplestore exists for no other reason, but it
seems to me to be clearly wrong. What is the justification for it?
Tatsuo> Yes, this is due to prevent infinit recursion caused by
Tatsuo> following case for example.
[...]
Tatsuo> WITH RECURSIVE x AS (
Tatsuo> SELECT * FROM test WHERE a = 'aaa'
Tatsuo> UNION ALL
Tatsuo> SELECT test.* FROM x LEFT JOIN test on test.a = x.b
Tatsuo> ) SELECT * FROM x;
Tatsuo> Now we think that we were wrong. This type of query should
Tatsuo> run into infinit recursion and it's user's responsibility
Tatsuo> that he does not make such a query.
I agree.
Tatsuo> Another idea would be prohibiting *any* outer joins in the
Tatsuo> recursive term (DB2 style), but this may be overkill.
There are legitimate cases for wanting to do a left join in the
recursion - for example, to use the content of another table to
prune the tree where matching records exist (consider the standard
bill-of-materials example with the addition of another table listing
components already in stock).
--
Andrew (irc:RhodiumToad)
I spent some time reading the SQL spec over the weekend, and I believe
I've identified a fairly serious problem in the WITH patch. SQL99
7.12 <query expression> General Rule 1 is
1) If a non-recursive <with clause> is specified, then:
a) For every <with list element> WLE, let WQN be the <query
name> immediately contained in WLE. Let WQE be the <query
expression> immediately contained in WLE. Let WLT be the
table resulting from evaluation of WQE, with each column name
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
replaced by the corresponding element of the <with column
list>, if any, immediately contained in WLE.
b) Every <table reference> contained in <query expression> that
specifies WQN identifies WLT.
I think what this is saying is that the subquery defined by a WITH
clause is to be evaluated only once, even if it is referenced in
multiple places in the upper query. This is sensible because if there
is no such rule, then there really is no semantic difference between
non-recursive WITH and ordinary subqueries; and the SQL committee is not
known for inventing duplicate syntax. It is a useful property for users
because (1) it lets them prevent duplicate evaluations of an expensive
subquery, and (2) it lets them prevent multiple evaluations of volatile
functions in a subquery. (Right now we tell people to use OFFSET 0 as
an optimization fence, but that's an unportable hack, and it doesn't
cover all cases anyway.) Another thing in the back of my head is that
having these semantics could enable using INSERT ... RETURNING etc
as WITH subexpressions, whereas we can't really allow them as arbitrary
subqueries because of the lack of guarantees about one-time execution.
That's something for later, though.
I think this is a "must fix" because of the point about volatile
functions --- changing it later will result in user-visible semantics
changes, so we have to get it right the first time.
This isn't going to be a particularly simple fix :-(. The basic
implementation clearly ought to be to dump the result of the subquery
into a tuplestore and then have the upper level read out from that.
However, we don't have any infrastructure for having multiple
upper-level RTEs reference the same tuplestore. (Perhaps the InitPlan
infrastructure could be enhanced in that direction, but it's not ready
for non-scalar outputs today.) Also, I think we'd have to teach
tuplestore how to support multiple readout cursors. For example,
consider
WITH foo AS (SELECT ...) SELECT ... FROM foo a, foo b WHERE ...
If the planner chooses to do the join as a nested loop then each
Scan node needs to keep track of its own place in the tuplestore,
concurrently with the other node having a different place.
regards, tom lane
"Tatsuo" == Tatsuo Ishii <ishii@postgresql.org> writes:
Tatsuo> Included patches from Yoshiyuki should fix 1) and 2). I also
Tatsuo> add your SQLs to the regression test. Thanks.
I think it needs this change in addition; without it, incorrect
results are returned when you reference a recursive view from within
the recursive query, due to the RecursionScan nodes becoming linked to
the wrong tuplestores.
The testcase for this would be something like
create view v0 as
with recursive t1(id) as (values (1)
union all select id+1 from t1 where id < 5)
select * from t1;
with recursive t(id) as (select * from v0
union all select id+1 from t where t.id < 8)
select count(*) from t;
-- expected output is 30, not 5
*** src/backend/executor/nodeRecursion.c~ Mon Jul 28 14:41:38 2008
--- src/backend/executor/nodeRecursion.c Mon Jul 28 15:22:55 2008
***************
*** 124,129 ****
--- 124,130 ----
ExecInitRecursion(Recursion *node, EState *estate, int eflags)
{
RecursionState *recursionstate;
+ Tuplestorestate **save_tuplestorestate;
/* check for unsupported flags */
Assert(!(eflags & EXEC_FLAG_MARK));
***************
*** 149,154 ****
--- 150,156 ----
/*
* Save the reference for the working table to share
*/
+ save_tuplestorestate = estate->es_tuplestorestate;
estate->es_tuplestorestate = &recursionstate->working_table;
/*
***************
*** 196,201 ****
--- 198,205 ----
ExecAssignResultTypeFromTL(&recursionstate->ss.ps);
ExecAssignScanProjectionInfo(&recursionstate->ss);
+ estate->es_tuplestorestate = save_tuplestorestate;
+
return recursionstate;
}
--
Andrew (irc:RhodiumToad)
"Tom Lane" <tgl@sss.pgh.pa.us> writes:
"Tom Lane" <tgl@sss.pgh.pa.us> writes:
I think what this is saying is that the subquery defined by a WITH
clause is to be evaluated only once, even if it is referenced in
multiple places in the upper query. This is sensible because if there
is no such rule, then there really is no semantic difference between
non-recursive WITH and ordinary subqueries; and the SQL committee is not
known for inventing duplicate syntax.
Well I guess that answers the question of whether to apply the partial patch
before full recursive queries come along. In any case since that work seems to
be making a lot of progress (no thanks to me:( ) I don't think it's a problem
to wait.
This isn't going to be a particularly simple fix :-(. The basic
implementation clearly ought to be to dump the result of the subquery
into a tuplestore and then have the upper level read out from that.
However, we don't have any infrastructure for having multiple
upper-level RTEs reference the same tuplestore.
Uhm, indeed, isn't that the whole point of the work needed to make recursive
queries work? Once we have that we'll just use those executor nodes even for
non-recursive queries.
--
Gregory Stark
EnterpriseDB http://www.enterprisedb.com
Ask me about EnterpriseDB's RemoteDBA services!
Gregory Stark <stark@enterprisedb.com> writes:
"Tom Lane" <tgl@sss.pgh.pa.us> writes:
This isn't going to be a particularly simple fix :-(. The basic
implementation clearly ought to be to dump the result of the subquery
into a tuplestore and then have the upper level read out from that.
However, we don't have any infrastructure for having multiple
upper-level RTEs reference the same tuplestore.
Uhm, indeed, isn't that the whole point of the work needed to make recursive
queries work?
No, I don't think so. The present patch (so far as I've gathered what
it can do, in the absence of any documentation) knows how to feed back
tuples collected at an upper level to a scan node at a lower level,
which is needed to implement recursion. However, if you do
WITH RECURSIVE foo AS (...) SELECT ... FROM foo a, foo b WHERE
then you will still get two independent evaluations of duplicate
querytrees. Which is at least a performance bug, and if there are
any volatile functions inside foo I claim it's a spec violation too.
regards, tom lane