WITH RECURSIVE patch V0.1
WITH RECURSIVE patch V0.1
Here are patches to implement WITH RECURSIVE clause. There are some
limitiations and TODO items(see the "Current limitations" section
below). Comments are welcome.
1. Credit
These patches were developed by Yoshiyuki Asaba (y-asab@sraoss.co.jp)
with some discussions with Tatsuo Ishii (ishii@sraoss.co.jp).
- prior patches:
http://archives.postgresql.org/pgsql-patches/2008-03/msg00327.php
The patches include some extentions to the patches above.
2. Design proposales
See: http://archives.postgresql.org/pgsql-hackers/2008-02/msg00642.php
3. Implementation details
- Parsing recursive queries
We convert WITH clause to following structure in analyzer.
/*
* WithClause -
* reprensentation of WITH clause
*/
typedef struct WithClause
{
NodeTag type;
List *subquery; /* query list */
bool recursive; /* true = WITH RECURSIVE */
} WithClause;
If recursive is false, WITH clause is converted to a subquery.
Next we call transformWithClause() to check if the query defined in
WITH RECURSIVE clause is a liner recursive query.
transformWithClause() does followings:
1) Detect query which is not allowed in SQL standard
2) Create a name dependency graphs
transoform* functions checks if target tables actually exist and throw
an error if tables do not exist.
For example in following query:
WITH RECURSIVE x AS (SELECT * from y),
y AS (SELECT * FROM pg_class)
SELECT * FROM x;
x's definition referes to y and y is defined after x's definition. If
we evaluate x and y in the order of above, an error will occur while
evaluating x since y is not defined yet.
To avoid the problem we create new function:
void checkWellFormedCte(ParseState *pstate, WithClause *withClause);
which performs topological sort on graphs.
Non recursive term will be the start of the graph, checks type
checking one by one.
names appear in WITH RECURSIVE clause form a name dependency
graph. The nodes in the graph is represented by CteNode type:
typedef struct CteNode;
struct CteNode
{
Alias *alias;
CteList *dependency; /* if NULL the node does not call other node */
bool self_reference;
bool non_recursive;
} CteNode;
If there is loop in the dependency graph, a matual recursion exists
and an error occurs (error code 0A00).
* Note that this part is not implemented in the V1.0 patches yet.
Tables defined in WITH RECURSIVE clause is identified as RTE_RECURSIVE.
A name space is added to p_recursive_namespace in ParseState
structure. The added item's structure is following:
typedef struct RangeRecursive
{
NodeTag type;
Node *subquery; /* whole subquery */
Alias *alias; /* table alias & optional column aliases */
List *coldeflist; /* list of ColumnDef nodes to describe result
* of function returning RECORD */
Node *non_recursive_term; /* anaylze result of non recursive term */
bool recursive; /* reserved */
} RangeRecursive;
non_recursive_term keeps the result of analyze on non recursive term.
Using this, the type of recursive query is inferenced.
- Generating a plan
New plan node "Recursion" and "Recursive scan" is added.
Recursion scan is a plan which returns WT. Recursion is a plan which
execute recursive query. The subplan of Recursion is always "Append".
Below is an example EXPLAIN output.
EXPLAIN WITH RECURSIVE subdepartment AS
(
-- non recursive term
SELECT * FROM department WHERE name = 'A'
UNION ALL
-- recursive term
SELECT d.* FROM department AS d, subdepartment
WHERE d.parent_department = subdepartment.id
)
SELECT * FROM subdepartment;
QUERY PLAN
-------------------------------------------------------------------------------------------
Recursion on subdepartment (cost=0.00..50.76 rows=12 width=40)
-> Append (cost=0.00..50.64 rows=12 width=40)
-> Seq Scan on department (cost=0.00..24.50 rows=6 width=40) <-- non recusrive term
Filter: (name = 'A'::text)
-> Hash Join (cost=0.01..26.02 rows=6 width=40) <-- recursive term
Hash Cond: (d.parent_department = subdepartment.id)
-> Seq Scan on department d (cost=0.00..21.60 rows=1160 width=40)
-> Hash (cost=0.00..0.00 rows=1 width=4)
-> Recursive Scan on subdepartment (cost=0.00..0.00 rows=1 width=4)
- Executor
new files nodeRecursion.c and nodeRecursivescan.c are added.
The structure which represents the Recursion plan is as follows:
typedef struct Recursion
{
Scan scan;
Plan *subplan;
List *subrtable; /* temporary workspace for planner */
} Recursion;
typedef struct RecursionState
{
ScanState ss; /* its first field is NodeTag */
PlanState *subplan;
bool recursive_empty; /* if recursive term does exist, true */
Tuplestorestate *intermediate_tuplestorestate; /* intermediate table */
bool init_done; /* initialization flag */
} RecursiveState;
Following function are defined.
TupleTableSlot *
ExecRecursion(RecursionState *)
{
...
if (non recursive term)
{
slot = ExecProcNode(non recurisve term)
tuplestore_puttuple(WT, slot);
return slot;
}
/* calculate recursive term */
slot = ExecProcNode(recursive term)
if (!TupIsNull(slot)
{
slot = ExecProcNode(Nested Loop);
if (slot is empty)
swap(intermediate_tuplestorestate, WT)
else
tuplestore_puttuple(intermediate_tuplestorestate, slot)
}
...
}
- Executing RecursiveScan
Returns tuples store in a work table. Work table is stored in Estate structure.
typedef struct EState
{
...
Tuplestorestate *es_tuplestorestate; /* Stuff used for recursive query */
...
} EState;
RecursiveScan and RecursiveScanState structures.
typedef struct RecursiveScan
{
Plan *plan;
Plan *subplan;
} RecursiveScan;
typedef struct RecursiveScanState
{
ScanState ss;
TupleDesc tupdesc;
bool done_init;
} RecursiveScanState;
RecursiveScanState initialized RecursionState the initialize the type
information. Since Recursive plan is an upper plan of RecursiveScan,
initialization should be delayed. For this purpose we use done_init
flag.
Scan functions:
static TupleTableSlot *
RecursivescanNext(RecursiveScanState *node)
{
...
if (tuplestore_gettupleslot(node->ss.ps->state->es_tuplestorestate, true, slot))
return slot;
...
}
TupleTableSlot *
ExecRecursiveScan(RecursiveScanState *)
{
/*
* Delay initializing type info because type inference did not do
* in ExecInitRecursiveScan.
*/
if (!node->init_done)
{
ExecAssignScanType(&node->ss,
node->ss.ps.state->es_rscan_tupledesc);
/*
* Initialize result tuple type and projection info.
*/
ExecAssignResultTypeFromTL(&node->ss.ps);
ExecAssignScanProjectionInfo(&node->ss);
node->init_done = true;
}
return ExecScan(&node->ss, (ExecScanAccessMtd) RecursivescanNext);
}
4. New source codes:
- src/backend/parser/parse_cte.c
parse CTE (not completed)
- src/backend/executor/nodeRecursion.c
- src/include/nodeRecursion.h
Execute Recursion plan
- src/backend/executor/nodeRecursivescan.c
- src/include/nodeRecursivescan.h
Execute Recursive scan plan
5. Limitations and TODO
1) Only the last SELECT of UNION ALL can include self recursion name.
2) Cost of Recursion and Recursivescan plan are always 0.
3) Some recursion type checking is not complete.
4) Outer join for recursive name and tables does not work.
5) Need regression test cases.
6) Documentation needed.
6. Executing WITH RECURSIVE cluase examples
DROP TABLE IF EXISTS department;
DROP TABLE
CREATE TABLE department (
id INT PRIMARY KEY, -- department ID
parent_department INT REFERENCES department, -- upper department ID
name TEXT -- department name
);
psql:recursive-e.sql:6: NOTICE: CREATE TABLE / PRIMARY KEY will create implicit index "department_pkey" for table "department"
CREATE TABLE
INSERT INTO department VALUES (0, NULL, 'ROOT');
INSERT 0 1
INSERT INTO department VALUES (1, 0, 'A');
INSERT 0 1
INSERT INTO department VALUES (2, 1, 'B');
INSERT 0 1
INSERT INTO department VALUES (3, 2, 'C');
INSERT 0 1
INSERT INTO department VALUES (4, 2, 'D');
INSERT 0 1
INSERT INTO department VALUES (5, 0, 'E');
INSERT 0 1
INSERT INTO department VALUES (6, 4, 'F');
INSERT 0 1
INSERT INTO department VALUES (7, 5, 'G');
INSERT 0 1
-- department structure represented here is as follows:
--
-- ROOT-+->A-+->B-+->C
-- | |
-- | +->D-+->F
-- +->E-+->G
EXPLAIN WITH RECURSIVE subdepartment AS
(
-- non recursive term
SELECT * FROM department WHERE name = 'A'
UNION ALL
-- recursive term
SELECT d.* FROM department AS d, subdepartment
WHERE d.parent_department = subdepartment.id
)
SELECT * FROM subdepartment;
QUERY PLAN
-------------------------------------------------------------------------------------------
Recursion on subdepartment (cost=0.00..50.76 rows=12 width=40)
-> Append (cost=0.00..50.64 rows=12 width=40)
-> Seq Scan on department (cost=0.00..24.50 rows=6 width=40)
Filter: (name = 'A'::text)
-> Hash Join (cost=0.01..26.02 rows=6 width=40)
Hash Cond: (d.parent_department = subdepartment.id)
-> Seq Scan on department d (cost=0.00..21.60 rows=1160 width=40)
-> Hash (cost=0.00..0.00 rows=1 width=4)
-> Recursive Scan on subdepartment (cost=0.00..0.00 rows=1 width=4)
(9 rows)
-- extract all departments under 'A'. Result should be A, B, C, D and F
WITH RECURSIVE subdepartment AS
(
-- non recursive term
SELECT * FROM department WHERE name = 'A'
UNION ALL
-- recursive term
SELECT d.* FROM department AS d, subdepartment
WHERE d.parent_department = subdepartment.id
)
SELECT * FROM subdepartment ORDER BY name;
id | parent_department | name
----+-------------------+------
1 | 0 | A
2 | 1 | B
3 | 2 | C
4 | 2 | D
6 | 4 | F
(5 rows)
-- extract all departments under 'E'. Result should be E and G.
WITH RECURSIVE subdepartment AS
(
-- non recursive term
SELECT * FROM department WHERE name = 'E'
UNION ALL
-- recursive term
SELECT d.* FROM department AS d, subdepartment
WHERE d.parent_department = subdepartment.id
)
SELECT * FROM subdepartment ORDER BY name;
id | parent_department | name
----+-------------------+------
5 | 0 | E
7 | 5 | G
(2 rows)
--
Tatsuo Ishii
SRA OSS, Inc. Japan
Attachments:
On Sun, May 18, 2008 at 08:51:29PM +0900, Tatsuo Ishii wrote:
WITH RECURSIVE patch V0.1
Here are patches to implement WITH RECURSIVE clause. There are some
limitiations and TODO items(see the "Current limitations" section
below). Comments are welcome.1. Credit
These patches were developed by Yoshiyuki Asaba (y-asab@sraoss.co.jp)
with some discussions with Tatsuo Ishii (ishii@sraoss.co.jp).
This is really great! Kudos to all who made this happen :)
I tried a bunch of different queries, and so far, only these two
haven't worked. Any ideas what I'm doing wrong here?
WITH RECURSIVE t(n) AS (
SELECT 1
UNION ALL
SELECT n+1
FROM t
WHERE n < 100
)
SELECT * FROM t;
ERROR: cannot extract attribute from empty tuple slot
WITH RECURSIVE t(n) AS (
VALUES (1)
UNION ALL
SELECT n+1
FROM t
WHERE n < 100
)
SELECT * FROM t;
ERROR: cannot extract attribute from empty tuple slot
Cheers,
David.
--
David Fetter <david@fetter.org> http://fetter.org/
Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter
Skype: davidfetter XMPP: david.fetter@gmail.com
Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate
David Fetter írta:
On Sun, May 18, 2008 at 08:51:29PM +0900, Tatsuo Ishii wrote:
WITH RECURSIVE patch V0.1
Here are patches to implement WITH RECURSIVE clause. There are some
limitiations and TODO items(see the "Current limitations" section
below). Comments are welcome.1. Credit
These patches were developed by Yoshiyuki Asaba (y-asab@sraoss.co.jp)
with some discussions with Tatsuo Ishii (ishii@sraoss.co.jp).This is really great! Kudos to all who made this happen :)
I tried a bunch of different queries, and so far, only these two
haven't worked. Any ideas what I'm doing wrong here?WITH RECURSIVE t(n) AS (
SELECT 1
UNION ALL
SELECT n+1
FROM t
WHERE n < 100
)
SELECT * FROM t;
ERROR: cannot extract attribute from empty tuple slotWITH RECURSIVE t(n) AS (
VALUES (1)
UNION ALL
SELECT n+1
FROM t
WHERE n < 100
)
SELECT * FROM t;
ERROR: cannot extract attribute from empty tuple slotCheers,
David.
Here's a test case attached shamelessly stolen from
http://www.adp-gmbh.ch/ora/sql/connect_by.html
This query (without naming toplevel columns) works:
# with recursive x as (select * from test_connect_by where parent is
null union all select base.* from test_connect_by as base, x where
base.parent = x.child) select * from x;
parent | child
--------+-------
| 38
| 26
| 18
18 | 11
18 | 7
26 | 13
26 | 1
26 | 12
38 | 15
38 | 17
38 | 6
17 | 9
17 | 8
15 | 10
15 | 5
5 | 2
5 | 3
(17 rows)
It even works when I add my "level" column:
# with recursive x(level, parent, child) as (select 1::bigint, * from
test_connect_by where parent is null union all select x.level + 1,
base.* from test_connect_by as base, x where base.parent = x.child)
select * from x;
level | parent | child
-------+--------+-------
1 | | 38
1 | | 26
1 | | 18
2 | 18 | 11
2 | 18 | 7
2 | 26 | 13
2 | 26 | 1
2 | 26 | 12
2 | 38 | 15
2 | 38 | 17
2 | 38 | 6
3 | 17 | 9
3 | 17 | 8
3 | 15 | 10
3 | 15 | 5
4 | 5 | 2
4 | 5 | 3
(17 rows)
But I have a little problem with the output.
If it's not obvious, here is the query tweaked a little below.
# with recursive x(level, parent, child) as (select 1::integer, * from
test_connect_by where parent is null union all select x.level + 1,
base.* from test_connect_by as base, x where base.parent = x.child)
select lpad(' ', 4*level - 1) || child from x;
?column?
------------------
38
26
18
11
7
13
1
12
15
17
6
9
8
10
5
2
3
(17 rows)
Can we get the rows in tree order, please? I.e. something like this:
?column?
------------------
38
15
10
5
2
3
17
9
8
6
26
13
1
12
18
11
7
(17 rows)
After all, I didn't specify any ORDER BY clauses in the base, recursive
or the final queries.
Also, it seems there are no infinite recursion detection:
# with recursive x(level, parent, child) as (
select 1::integer, * from test_connect_by where parent is null
union all
select x.level + 1, base.* from test_connect_by as base, x where
base.child = x.child
) select * from x;
... it waits and waits and waits ...
Also, there's another rough edge:
# with recursive x as (select * from test_connect_by where parent is
null) select * from x;
server closed the connection unexpectedly
This probably means the server terminated abnormally
before or while processing the request.
The connection to the server was lost. Attempting reset: Succeeded.
Best regards,
Zoltán Böszörményi
--
----------------------------------
Zoltán Böszörményi
Cybertec Schönig & Schönig GmbH
http://www.postgresql.at/
Attachments:
testconn.sqltext/x-sql; charset=iso-8859-1; name=testconn.sqlDownload
On Sun, May 18, 2008 at 5:22 PM, Zoltan Boszormenyi <zb@cybertec.at> wrote:
Can we get the rows in tree order, please? I.e. something like this:
Is ordering by tree order defined in the standard when no explicit
order is given? If not, it probably returns them in the order they
are pulled up, which might be the fastest way.
merlin
Merlin Moncure wrote:
On Sun, May 18, 2008 at 5:22 PM, Zoltan Boszormenyi <zb@cybertec.at> wrote:
Can we get the rows in tree order, please? I.e. something like this:
Is ordering by tree order defined in the standard when no explicit
order is given? If not, it probably returns them in the order they
are pulled up, which might be the fastest way
+1 for the fastest way, which I expect to often be "find all level 1
matches", "find all level 2 matches", ... If ORDER BY is important, it
should be specified (although it may be difficult or impossible to
properly represent ORDER BY for a tree? not sure?) I think most uses of
recursive require extra client side code to deal with anyways, so only
relative order is important (order within a particular branch).
There are things I'd like to use this for right now. Currently I use
plpgsql procedures to implement my own recursion. :-)
Cheers,
mark
--
Mark Mielke <mark@mielke.cc>
This is indeed really cool. I'm sorry I haven't gotten to doing what I
promised in this area but I'm glad it's happening anyways.
"Zoltan Boszormenyi" <zb@cybertec.at> writes:
Can we get the rows in tree order, please?
...
After all, I didn't specify any ORDER BY clauses in the base, recursive or the
final queries.
The standard has a clause to specify depth-first order. However doing a
depth-first traversal would necessitate quite a different looking plan and
it's far less obvious (to me anyways) how to do it.
Also, it seems there are no infinite recursion detection:
# with recursive x(level, parent, child) as (
select 1::integer, * from test_connect_by where parent is null
union all
select x.level + 1, base.* from test_connect_by as base, x where base.child
= x.child
) select * from x;
... it waits and waits and waits ...
Well, psql might wait and wait but it's actually receiving rows. A cleverer
client should be able to deal with infinite streams of records.
I think DB2 does produce a warning if there is no clause it can determine will
bound the results. But that's not actually reliable. It's quite possible to
have clauses which will limit the output but not in a way the database can
determine. Consider for example a tree-traversal for a binary tree stored in a
recursive table reference. The DBA might know that the data contains no loops
but the database doesn't.
--
Gregory Stark
EnterpriseDB http://www.enterprisedb.com
Ask me about EnterpriseDB's PostGIS support!
On Mon, May 19, 2008 at 12:21:20AM -0400, Gregory Stark wrote:
"Zoltan Boszormenyi" <zb@cybertec.at> writes:
Also, it seems there are no infinite recursion detection:
# with recursive x(level, parent, child) as (
select 1::integer, * from test_connect_by where parent is null
union all
select x.level + 1, base.* from test_connect_by as base, x where base.child
= x.child
) select * from x;
... it waits and waits and waits ...Well, psql might wait and wait but it's actually receiving rows. A
cleverer client should be able to deal with infinite streams of
records.
That would be a very good thing for libpq (and its descendants) to
have :)
I think DB2 does produce a warning if there is no clause it can
determine will bound the results. But that's not actually reliable.
I'd think not, as it's (in some sense) a Halting Problem.
It's quite possible to have clauses which will limit the output but
not in a way the database can determine. Consider for example a
tree-traversal for a binary tree stored in a recursive table
reference. The DBA might know that the data contains no loops but
the database doesn't.
I seem to recall Oracle's implementation can do this traversal on
write operations, but maybe that's just their marketing.
Cheers,
David.
--
David Fetter <david@fetter.org> http://fetter.org/
Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter
Skype: davidfetter XMPP: david.fetter@gmail.com
Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate
"David Fetter" <david@fetter.org> writes:
On Mon, May 19, 2008 at 12:21:20AM -0400, Gregory Stark wrote:
"Zoltan Boszormenyi" <zb@cybertec.at> writes:
Also, it seems there are no infinite recursion detection:
# with recursive x(level, parent, child) as (
select 1::integer, * from test_connect_by where parent is null
union all
select x.level + 1, base.* from test_connect_by as base, x where base.child
= x.child
) select * from x;
... it waits and waits and waits ...Well, psql might wait and wait but it's actually receiving rows. A
cleverer client should be able to deal with infinite streams of
records.That would be a very good thing for libpq (and its descendants) to
have :)
I think you can do it in libpq.
In psql you can use \set FETCH_COUNT or somrthing like this (I can't look it
up just now) to use a cursor to do this too.
--
Gregory Stark
EnterpriseDB http://www.enterprisedb.com
Ask me about EnterpriseDB's 24x7 Postgres support!
Gregory Stark írta:
This is indeed really cool. I'm sorry I haven't gotten to doing what I
promised in this area but I'm glad it's happening anyways."Zoltan Boszormenyi" <zb@cybertec.at> writes:
Can we get the rows in tree order, please?
...
After all, I didn't specify any ORDER BY clauses in the base, recursive or the
final queries.The standard has a clause to specify depth-first order. However doing a
depth-first traversal would necessitate quite a different looking plan and
it's far less obvious (to me anyways) how to do it.
That would be even cooler to have it implemented as well.
Also, it seems there are no infinite recursion detection:
# with recursive x(level, parent, child) as (
select 1::integer, * from test_connect_by where parent is null
union all
select x.level + 1, base.* from test_connect_by as base, x where base.child
= x.child
) select * from x;
... it waits and waits and waits ...Well, psql might wait and wait but it's actually receiving rows. A cleverer
client should be able to deal with infinite streams of records.
I think it's the other way around. The server should not emit infinite
number of records.
I think DB2 does produce a warning if there is no clause it can determine will
bound the results. But that's not actually reliable. It's quite possible to
have clauses which will limit the output but not in a way the database can
determine. Consider for example a tree-traversal for a binary tree stored in a
recursive table reference. The DBA might know that the data contains no loops
but the database doesn't.
Well, a maintenance resjunk could be used like the branch column in
tablefunc::connectby().
--
----------------------------------
Zoltán Böszörményi
Cybertec Schönig & Schönig GmbH
http://www.postgresql.at/
On Mon, May 19, 2008 at 08:19:17AM +0200, Zoltan Boszormenyi wrote:
The standard has a clause to specify depth-first order. However doing a
depth-first traversal would necessitate quite a different looking plan and
it's far less obvious (to me anyways) how to do it.That would be even cooler to have it implemented as well.
From an implementation point of view, the only difference between
breadth-first and depth-first is that your tuplestore needs to be LIFO
instead of FIFO. However, just looking at the plan I don't know whether
it could support that kind of usage. At the very least I don't think
the standard tuplestore code can handle it.
Well, psql might wait and wait but it's actually receiving rows. A cleverer
client should be able to deal with infinite streams of records.I think it's the other way around. The server should not emit infinite
number of records.
The server won't, the universe will end first. This is a nice example
of the halting problem:
http://en.wikipedia.org/wiki/Halting_problem
Which was proved unsolvable a long time ago.
Have a nice day,
--
Martijn van Oosterhout <kleptog@svana.org> http://svana.org/kleptog/
Show quoted text
Please line up in a tree and maintain the heap invariant while
boarding. Thank you for flying nlogn airlines.
Hi,
From: Zoltan Boszormenyi <zb@cybertec.at>
Subject: Re: [PATCHES] WITH RECURSIVE patch V0.1
Date: Mon, 19 May 2008 08:19:17 +0200
Also, it seems there are no infinite recursion detection:
# with recursive x(level, parent, child) as (
select 1::integer, * from test_connect_by where parent is null
union all
select x.level + 1, base.* from test_connect_by as base, x where base.child
= x.child
) select * from x;
... it waits and waits and waits ...Well, psql might wait and wait but it's actually receiving rows. A cleverer
client should be able to deal with infinite streams of records.I think it's the other way around. The server should not emit infinite
number of records.
How about adding new GUC parameter "max_recursive_call"?
Regards,
--
Yoshiyuki Asaba
y-asaba@sraoss.co.jp
Yoshiyuki Asaba írta:
Hi,
From: Zoltan Boszormenyi <zb@cybertec.at>
Subject: Re: [PATCHES] WITH RECURSIVE patch V0.1
Date: Mon, 19 May 2008 08:19:17 +0200Also, it seems there are no infinite recursion detection:
# with recursive x(level, parent, child) as (
select 1::integer, * from test_connect_by where parent is null
union all
select x.level + 1, base.* from test_connect_by as base, x where base.child
= x.child
) select * from x;
... it waits and waits and waits ...Well, psql might wait and wait but it's actually receiving rows. A cleverer
client should be able to deal with infinite streams of records.I think it's the other way around. The server should not emit infinite
number of records.How about adding new GUC parameter "max_recursive_call"?
Yes, why not?
MSSQL has a similar MAXRECURSION hint for WITH RECURSIVE queries
according to their docs.
http://msdn.microsoft.com/en-us/library/ms186243.aspx
Regards,
--
Yoshiyuki Asaba
y-asaba@sraoss.co.jp
--
----------------------------------
Zoltán Böszörményi
Cybertec Schönig & Schönig GmbH
http://www.postgresql.at/
Martijn van Oosterhout írta:
On Mon, May 19, 2008 at 08:19:17AM +0200, Zoltan Boszormenyi wrote:
The standard has a clause to specify depth-first order. However doing a
depth-first traversal would necessitate quite a different looking plan and
it's far less obvious (to me anyways) how to do it.That would be even cooler to have it implemented as well.
From an implementation point of view, the only difference between
breadth-first and depth-first is that your tuplestore needs to be LIFO
instead of FIFO. However, just looking at the plan I don't know whether
it could support that kind of usage. At the very least I don't think
the standard tuplestore code can handle it.Well, psql might wait and wait but it's actually receiving rows. A cleverer
client should be able to deal with infinite streams of records.I think it's the other way around. The server should not emit infinite
number of records.The server won't, the universe will end first.
The universe is alive and well, thank you. :-)
But the server won't emit infinite number of records, you are right.
Given the implementation uses a tuplestore and not producing the
tupleslots on the fly, it will go OOM first not the psql client,
I watched them in 'top'. It just takes a bit of time.
This is a nice example
of the halting problem:http://en.wikipedia.org/wiki/Halting_problem
Which was proved unsolvable a long time ago.
Hmpf, yes, I forgot too much about Turing-machines since university. :-(
Have a nice day,
--
----------------------------------
Zoltán Böszörményi
Cybertec Schönig & Schönig GmbH
http://www.postgresql.at/
Martijn van Oosterhout írta:
On Mon, May 19, 2008 at 08:19:17AM +0200, Zoltan Boszormenyi wrote:
The standard has a clause to specify depth-first order. However doing a
depth-first traversal would necessitate quite a different looking plan and
it's far less obvious (to me anyways) how to do it.That would be even cooler to have it implemented as well.
From an implementation point of view, the only difference between
breadth-first and depth-first is that your tuplestore needs to be LIFO
instead of FIFO.
Are you sure? I think a LIFO tuplestore would simply return reversed
breadth-first order. Depth-first means for every new record descend into
another recursion first then continue with the next record on the right.
However, just looking at the plan I don't know whether
it could support that kind of usage. At the very least I don't think
the standard tuplestore code can handle it.
--
----------------------------------
Zoltán Böszörményi
Cybertec Schönig & Schönig GmbH
http://www.postgresql.at/
On Mon, May 19, 2008 at 11:56:17AM +0200, Zoltan Boszormenyi wrote:
From an implementation point of view, the only difference between
breadth-first and depth-first is that your tuplestore needs to be LIFO
instead of FIFO.Are you sure? I think a LIFO tuplestore would simply return reversed
breadth-first order. Depth-first means for every new record descend into
another recursion first then continue with the next record on the right.
Say your tree looks like:
Root->A, D
A->B,C
D->E,F
LIFO pushes A and D. It then pops A and pushes B and C. B and C have no
children and are returned. Then D is popped and E and F pushed. So the
returned order is: A,B,C,D,E,F. You could also do B,C,A,E,F,D if you
wanted.
FIFO pushes A and D. It then pops A and puts B and C at *the end*. It
then pops D and pushes E and F at the end. So you get the order
A,D,B,C,E,F
Hope this helps,
--
Martijn van Oosterhout <kleptog@svana.org> http://svana.org/kleptog/
Show quoted text
Please line up in a tree and maintain the heap invariant while
boarding. Thank you for flying nlogn airlines.
Martijn van Oosterhout írta:
On Mon, May 19, 2008 at 11:56:17AM +0200, Zoltan Boszormenyi wrote:
From an implementation point of view, the only difference between
breadth-first and depth-first is that your tuplestore needs to be LIFO
instead of FIFO.Are you sure? I think a LIFO tuplestore would simply return reversed
breadth-first order. Depth-first means for every new record descend into
another recursion first then continue with the next record on the right.Say your tree looks like:
Root->A, D
A->B,C
D->E,FLIFO pushes A and D. It then pops A and pushes B and C. B and C have no
children and are returned. Then D is popped and E and F pushed. So the
returned order is: A,B,C,D,E,F. You could also do B,C,A,E,F,D if you
wanted.FIFO pushes A and D. It then pops A and puts B and C at *the end*. It
then pops D and pushes E and F at the end. So you get the order
A,D,B,C,E,FHope this helps,
Thanks, I didn't consider popping elements off while processing.
However, if the toplevel query returns tuples in A, D order, you need
a positioned insert into the tuplestore, because the LIFO would pop D first.
Say, a "treestore" would work this way:
1. setup: treestore is empty, storage_position := 0
2. treestore_puttupleslot() adds slot at current position,
storage_position++
3. treestore_gettupleslot() removes slot from the beginning,
storage_position := 0
This works easily in memory lists but it's not obvious for me how it may
work
with disk backed temporary storage inside PG.
--
----------------------------------
Zoltán Böszörményi
Cybertec Schönig & Schönig GmbH
http://www.postgresql.at/
On Mon, May 19, 2008 at 05:57:17PM +0900, Yoshiyuki Asaba wrote:
Hi,
I think it's the other way around. The server should not emit
infinite number of records.How about adding new GUC parameter "max_recursive_call"?
Couldn't we just have it pay attention to the existing
max_stack_depth?
Cheers,
David.
--
David Fetter <david@fetter.org> http://fetter.org/
Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter
Skype: davidfetter XMPP: david.fetter@gmail.com
Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate
On Sun, 2008-05-18 at 22:17 -0700, David Fetter wrote:
On Mon, May 19, 2008 at 12:21:20AM -0400, Gregory Stark wrote:
"Zoltan Boszormenyi" <zb@cybertec.at> writes:
Also, it seems there are no infinite recursion detection:
# with recursive x(level, parent, child) as (
select 1::integer, * from test_connect_by where parent is null
union all
select x.level + 1, base.* from test_connect_by as base, x where base.child
= x.child
) select * from x;
... it waits and waits and waits ...Well, psql might wait and wait but it's actually receiving rows. A
cleverer client should be able to deal with infinite streams of
records.That would be a very good thing for libpq (and its descendants) to
have :)I think DB2 does produce a warning if there is no clause it can
determine will bound the results. But that's not actually reliable.I'd think not, as it's (in some sense) a Halting Problem.
It's quite possible to have clauses which will limit the output but
not in a way the database can determine. Consider for example a
tree-traversal for a binary tree stored in a recursive table
reference. The DBA might know that the data contains no loops but
the database doesn't.I seem to recall Oracle's implementation can do this traversal on
write operations, but maybe that's just their marketing.
It may be possible to solve at least some of it by doing something
similar to hash version of DISTINCT by having an hashtable of tuples
already returned and not descending branches where you have already
been.
Show quoted text
Cheers,
David.
--
David Fetter <david@fetter.org> http://fetter.org/
Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter
Skype: davidfetter XMPP: david.fetter@gmail.comRemember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate
On Sun, 2008-05-18 at 22:17 -0700, David Fetter wrote:
On Mon, May 19, 2008 at 12:21:20AM -0400, Gregory Stark wrote:
"Zoltan Boszormenyi" <zb@cybertec.at> writes:
Also, it seems there are no infinite recursion detection:
# with recursive x(level, parent, child) as (
select 1::integer, * from test_connect_by where parent is null
union all
select x.level + 1, base.* from test_connect_by as base, x where base.child
= x.child
) select * from x;
... it waits and waits and waits ...Well, psql might wait and wait but it's actually receiving rows. A
cleverer client should be able to deal with infinite streams of
records.That would be a very good thing for libpq (and its descendants) to
have :)I think DB2 does produce a warning if there is no clause it can
determine will bound the results. But that's not actually reliable.I'd think not, as it's (in some sense) a Halting Problem.
It's quite possible to have clauses which will limit the output but
not in a way the database can determine. Consider for example a
tree-traversal for a binary tree stored in a recursive table
reference. The DBA might know that the data contains no loops but
the database doesn't.I seem to recall Oracle's implementation can do this traversal on
write operations, but maybe that's just their marketing.
It may be possible to solve at least some of it by doing something
similar to hash version of DISTINCT by having an hashtable of tuples
already returned and not descending branches where you have already
been.
Show quoted text
Cheers,
David.
--
David Fetter <david@fetter.org> http://fetter.org/
Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter
Skype: davidfetter XMPP: david.fetter@gmail.comRemember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate
"Martijn van Oosterhout" <kleptog@svana.org> writes:
From an implementation point of view, the only difference between
breadth-first and depth-first is that your tuplestore needs to be LIFO
instead of FIFO.
I think it's not so simple. How do you reconcile that concept with the join
plans like merge join or hash join which expect you to be able to be able to
process the records in a specific order?
It sounds like you might have to keep around a stack of started executor nodes
or something but hopefully we can avoid anything like that because, well, ick.
--
Gregory Stark
EnterpriseDB http://www.enterprisedb.com
Ask me about EnterpriseDB's PostGIS support!