[PROPOSAL] Max recursion depth in WITH Queries (Common Table Expressions)

Started by Valery Popovabout 10 years ago7 messages
#1Valery Popov
v.popov@postgrespro.ru
1 attachment(s)

Hi, Hackers

Recursive queries are typically used to deal with hierarchical or
tree-structured data.
In some conditions when data contain relationships with cycles recursive query will loop
unlimited and significantly slows the client's session.
To prevent "infinite" loop I suggest the max_recursion_depth parameter,
which defines the maximum recursion level during the execution of recursive
query.
When max_recursion_depth > 0 and the recursion level of query exceeds
specified value then the execution of query interrupts with error message.
In the MS SQL Server there is MAXRECURSION hint for the same purpose.

Thanks!
--

Regards,
Valery Popov
Postgres Professional http://www.postgrespro.com
The Russian Postgres Company

Attachments:

recurs_depth_v0.patchtext/plain; charset=UTF-8; name=recurs_depth_v0.patchDownload
diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index 1da7dfb..33a6009 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -6048,6 +6048,23 @@ SET XML OPTION { DOCUMENT | CONTENT };
       </listitem>
      </varlistentry>
 
+     <varlistentry id="guc-max-recursion-depth" xreflabel="max_recursion_depth">
+      <term><varname>max_recursion_depth</varname> (<type>integer</type>)
+      <indexterm>
+       <primary><varname>max_recursion_depth</> configuration parameter</primary>
+      </indexterm>
+      </term>
+      <listitem>
+       <para>
+        Sets the maximum recursion depth in  <literal>WITH Queries</> (Common Table Expressions).
+        The default value is 0 and it means no limit for recursion depth (infinite loop is possible). 
+        When <literal>max_recursion_depth</> > 0 and the recursion level of query exceeds specified value 
+   then execution of query interrupts with error message.
+         See <xref linkend="queries-with"> for more information.
+       </para>
+      </listitem>
+     </varlistentry>
+
      </variablelist>
     </sect2>
      <sect2 id="runtime-config-client-format">
diff --git a/doc/src/sgml/queries.sgml b/doc/src/sgml/queries.sgml
index ab49bd7..80a63c8 100644
--- a/doc/src/sgml/queries.sgml
+++ b/doc/src/sgml/queries.sgml
@@ -2216,6 +2216,46 @@ SELECT n FROM t LIMIT 100;
    In each case it effectively provides temporary table(s) that can
    be referred to in the main command.
   </para>
+  
+  <tip>
+   <para>
+    Also it is possible to limit the number of returning rows by setting parameter
+    <literal>max_recursion_depth</> in the <literal>postgresql.conf</> file. 
+    See <xref linkend="guc-max-recursion-depth"> for more information.
+   </para>
+  </tip>
+  
+  <para>
+  Another way to set <literal>max_recursion_depth</> is <literal>SET max_recursion_depth</> command in <literal>psql</>.
+  
+<programlisting>
+postgres=# show max_recursion_depth;
+LOG:  statement: show max_recursion_depth;
+ max_recursion_depth 
+---------------------
+ 0
+(1 row)
+
+postgres=# set max_recursion_depth = 5;
+LOG:  statement: set max_recursion_depth = 5;
+SET
+postgres=# show max_recursion_depth;
+LOG:  statement: show max_recursion_depth;
+ max_recursion_depth 
+---------------------
+ 5
+(1 row)
+</programlisting>  
+   When <literal>max_recursion_depth</> > 0 and the recursion level of query exceeds specified value 
+   then execution of query interrupts with error message like this:  
+
+<programlisting>
+   ERROR:  The statement terminated. The maximum recursion depth 5 has been exhausted before statement completion.
+</programlisting>
+   
+     
+   </para>
+  
  </sect2>
 
  <sect2 id="queries-with-modifying">
diff --git a/src/backend/executor/nodeRecursiveunion.c b/src/backend/executor/nodeRecursiveunion.c
index 8df1639..c616250 100644
--- a/src/backend/executor/nodeRecursiveunion.c
+++ b/src/backend/executor/nodeRecursiveunion.c
@@ -110,6 +110,13 @@ ExecRecursiveUnion(RecursiveUnionState *node)
 	/* 2. Execute recursive term */
 	for (;;)
 	{
+		if ((with_recursive_limit > 0) && (node->ps.recursion_cnt >= with_recursive_limit))
+		{
+			ereport(ERROR,
+				(errcode(ERRCODE_CONFIGURATION_LIMIT_EXCEEDED),
+				errmsg("The statement terminated. The maximum recursion depth %d has been exhausted before statement completion.", with_recursive_limit)));
+			break;
+		}
 		slot = ExecProcNode(innerPlan);
 		if (TupIsNull(slot))
 		{
@@ -132,6 +139,9 @@ ExecRecursiveUnion(RecursiveUnionState *node)
 			innerPlan->chgParam = bms_add_member(innerPlan->chgParam,
 												 plan->wtParam);
 
+			/* go to the next recursion level */
+			node->ps.recursion_cnt++;
+
 			/* and continue fetching from recursive term */
 			continue;
 		}
@@ -261,6 +271,11 @@ ExecInitRecursiveUnion(RecursiveUnion *node, EState *estate, int eflags)
 		build_hash_table(rustate);
 	}
 
+	/*
+	 * Init recursion depth counter.
+	 */
+	rustate->ps.recursion_cnt=0;
+
 	return rustate;
 }
 
diff --git a/src/backend/utils/misc/guc.c b/src/backend/utils/misc/guc.c
index 230c5cc..c39e8ca 100644
--- a/src/backend/utils/misc/guc.c
+++ b/src/backend/utils/misc/guc.c
@@ -112,6 +112,9 @@ extern char *temp_tablespaces;
 extern bool ignore_checksum_failure;
 extern bool synchronize_seqscans;
 
+/* Parameters for controlling recursive depth */
+int	with_recursive_limit;
+
 #ifdef TRACE_SYNCSCAN
 extern bool trace_syncscan;
 #endif
@@ -2669,6 +2672,18 @@ static struct config_int ConfigureNamesInt[] =
 		NULL, NULL, NULL
 	},
 
+	{
+		{"max_recursion_depth", PGC_USERSET, CUSTOM_OPTIONS,
+			gettext_noop("Sets the maximum recursion depth in WITH RECURSIVE "
+						 "and other CTE operations."),
+			gettext_noop("Sets the maximum recursion depth in WITH RECURSIVE "
+										 "and other CTE operations.")
+		},
+		&with_recursive_limit,
+		0, 0, INT_MAX,
+		NULL, NULL, NULL
+	},
+
 	/* End-of-list marker */
 	{
 		{NULL, 0, 0, NULL, NULL}, NULL, 0, 0, 0, NULL, NULL, NULL
diff --git a/src/backend/utils/misc/postgresql.conf.sample b/src/backend/utils/misc/postgresql.conf.sample
index 06dfc06..e5e8e53 100644
--- a/src/backend/utils/misc/postgresql.conf.sample
+++ b/src/backend/utils/misc/postgresql.conf.sample
@@ -527,6 +527,7 @@
 #xmlbinary = 'base64'
 #xmloption = 'content'
 #gin_pending_list_limit = 4MB
+#max_recursion_depth = 0         # max depth for CTE range 0-INT_MAX, 0 means infinite
 
 # - Locale and Formatting -
 
diff --git a/src/include/executor/nodeRecursiveunion.h b/src/include/executor/nodeRecursiveunion.h
index 52cacd8..036857f 100644
--- a/src/include/executor/nodeRecursiveunion.h
+++ b/src/include/executor/nodeRecursiveunion.h
@@ -20,5 +20,7 @@ extern RecursiveUnionState *ExecInitRecursiveUnion(RecursiveUnion *node, EState
 extern TupleTableSlot *ExecRecursiveUnion(RecursiveUnionState *node);
 extern void ExecEndRecursiveUnion(RecursiveUnionState *node);
 extern void ExecReScanRecursiveUnion(RecursiveUnionState *node);
+/* Parameters for controlling recursive depth */
+extern int	with_recursive_limit;
 
 #endif   /* NODERECURSIVEUNION_H */
diff --git a/src/include/nodes/execnodes.h b/src/include/nodes/execnodes.h
index db5bd7f..eb82c5d 100644
--- a/src/include/nodes/execnodes.h
+++ b/src/include/nodes/execnodes.h
@@ -1059,6 +1059,7 @@ typedef struct PlanState
 	ProjectionInfo *ps_ProjInfo;	/* info for doing tuple projection */
 	bool		ps_TupFromTlist;/* state flag for processing set-valued
 								 * functions in targetlist */
+	int			recursion_cnt;  /* counter for recursion depth */
 } PlanState;
 
 /* ----------------
diff --git a/src/test/regress/expected/with.out b/src/test/regress/expected/with.out
index 2c9226c..d42ce09 100644
--- a/src/test/regress/expected/with.out
+++ b/src/test/regress/expected/with.out
@@ -2256,3 +2256,66 @@ with ordinality as (select 1 as x) select * from ordinality;
  1
 (1 row)
 
+--
+-- RECURSION DEPTH
+-- Check the max_recursion_depth parameter
+--
+CREATE TABLE test_max_recurs ( parent integer, 	child integer);
+INSERT INTO test_max_recurs (parent, child) VALUES
+-- infinite recursion
+(1,2), (2,1),
+--just too deep path
+(100,101), (101,102), (102,103), (103,104), (104,105), (105,0),
+--good
+(200,201), (201,202), (202,203), (203,204), (204,0),
+-- tree-like too much number of nodes
+(300,311), (311,312), (312,0), (300,321), (321,322), (322,0),
+-- tree-like good
+(400,411), (411,412), (412,0), (400,421), (421,0);
+SET max_recursion_depth to 5;
+SHOW max_recursion_depth;
+ max_recursion_depth 
+---------------------
+ 5
+(1 row)
+
+-- Stress test for max_recursion_depth parameter
+WITH RECURSIVE g(id)
+AS (SELECT 1::int UNION ALL SELECT child AS id from test_max_recurs tt, g where tt.parent=g.id)
+SELECT * FROM g;
+ERROR:  The statement terminated. The maximum recursion depth 5 has been exhausted before statement completion.
+WITH RECURSIVE g(id)
+AS (SELECT 100::int UNION ALL SELECT child AS id from test_max_recurs tt, g where tt.parent=g.id)
+SELECT * FROM g;
+ERROR:  The statement terminated. The maximum recursion depth 5 has been exhausted before statement completion.
+WITH RECURSIVE g(id)
+AS (SELECT 200::int UNION ALL SELECT child AS id from test_max_recurs tt, g where tt.parent=g.id)
+SELECT * FROM g;
+ERROR:  The statement terminated. The maximum recursion depth 5 has been exhausted before statement completion.
+WITH RECURSIVE g(id)
+AS (SELECT 300::int UNION ALL SELECT child AS id from test_max_recurs tt, g where tt.parent=g.id)
+SELECT * FROM g;
+ id  
+-----
+ 300
+ 311
+ 321
+ 312
+ 322
+   0
+   0
+(7 rows)
+
+WITH RECURSIVE g(id)
+AS (SELECT 400::int UNION ALL SELECT child AS id from test_max_recurs tt, g where tt.parent=g.id)
+SELECT * FROM g;
+ id  
+-----
+ 400
+ 411
+ 421
+ 412
+   0
+   0
+(6 rows)
+
diff --git a/src/test/regress/sql/with.sql b/src/test/regress/sql/with.sql
index 3fd55f9..cede610 100644
--- a/src/test/regress/sql/with.sql
+++ b/src/test/regress/sql/with.sql
@@ -1018,3 +1018,45 @@ DROP RULE y_rule ON y;
 create table foo (with baz);  -- fail, WITH is a reserved word
 create table foo (with ordinality);  -- fail, WITH is a reserved word
 with ordinality as (select 1 as x) select * from ordinality;
+
+--
+-- RECURSION DEPTH
+-- Check the max_recursion_depth parameter
+--
+CREATE TABLE test_max_recurs ( parent integer, 	child integer);
+
+INSERT INTO test_max_recurs (parent, child) VALUES
+-- infinite recursion
+(1,2), (2,1),
+--just too deep path
+(100,101), (101,102), (102,103), (103,104), (104,105), (105,0),
+--good
+(200,201), (201,202), (202,203), (203,204), (204,0),
+-- tree-like too much number of nodes
+(300,311), (311,312), (312,0), (300,321), (321,322), (322,0),
+-- tree-like good
+(400,411), (411,412), (412,0), (400,421), (421,0);
+
+SET max_recursion_depth to 5;
+SHOW max_recursion_depth;
+
+-- Stress test for max_recursion_depth parameter
+WITH RECURSIVE g(id)
+AS (SELECT 1::int UNION ALL SELECT child AS id from test_max_recurs tt, g where tt.parent=g.id)
+SELECT * FROM g;
+
+WITH RECURSIVE g(id)
+AS (SELECT 100::int UNION ALL SELECT child AS id from test_max_recurs tt, g where tt.parent=g.id)
+SELECT * FROM g;
+
+WITH RECURSIVE g(id)
+AS (SELECT 200::int UNION ALL SELECT child AS id from test_max_recurs tt, g where tt.parent=g.id)
+SELECT * FROM g;
+
+WITH RECURSIVE g(id)
+AS (SELECT 300::int UNION ALL SELECT child AS id from test_max_recurs tt, g where tt.parent=g.id)
+SELECT * FROM g;
+
+WITH RECURSIVE g(id)
+AS (SELECT 400::int UNION ALL SELECT child AS id from test_max_recurs tt, g where tt.parent=g.id)
+SELECT * FROM g;
#2Pavel Stehule
pavel.stehule@gmail.com
In reply to: Valery Popov (#1)
Re: [PROPOSAL] Max recursion depth in WITH Queries (Common Table Expressions)

2015-10-28 8:33 GMT+01:00 Valery Popov <v.popov@postgrespro.ru>:

Hi, Hackers

Recursive queries are typically used to deal with hierarchical or
tree-structured data.
In some conditions when data contain relationships with cycles recursive
query will loop
unlimited and significantly slows the client's session.
To prevent "infinite" loop I suggest the max_recursion_depth parameter,
which defines the maximum recursion level during the execution of recursive
query.
When max_recursion_depth > 0 and the recursion level of query exceeds
specified value then the execution of query interrupts with error message.
In the MS SQL Server there is MAXRECURSION hint for the same purpose.

+1

good idea

Regards

Pavel

Show quoted text

Thanks!
--

Regards,
Valery Popov
Postgres Professional http://www.postgrespro.com
The Russian Postgres Company

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#3Tom Lane
tgl@sss.pgh.pa.us
In reply to: Valery Popov (#1)
Re: [PROPOSAL] Max recursion depth in WITH Queries (Common Table Expressions)

Valery Popov <v.popov@postgrespro.ru> writes:

Recursive queries are typically used to deal with hierarchical or
tree-structured data.
In some conditions when data contain relationships with cycles recursive query will loop
unlimited and significantly slows the client's session.

The standard way of dealing with that is to include logic in the query to
limit the recursion depth, for example

WITH RECURSIVE t(n) AS (
SELECT 1
UNION ALL
SELECT n+1 FROM t WHERE n < 10
)
SELECT n FROM t;

I don't see an example of this technique in the documentation, which maybe
is a documentation improvement opportunity.

To prevent "infinite" loop I suggest the max_recursion_depth parameter,
which defines the maximum recursion level during the execution of recursive
query.

Controlling this via a GUC is a seriously awful idea. We learned a long
time ago to avoid GUCs that have a direct impact on query semantics; the
scope of their effects is just about never what you want.

Also, there are already ways to constrain queries-gone-crazy; particularly
statement_timeout, which has the advantage that it works for other types
of badly-written queries not only this one.

regards, tom lane

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#4Pavel Stehule
pavel.stehule@gmail.com
In reply to: Tom Lane (#3)
Re: [PROPOSAL] Max recursion depth in WITH Queries (Common Table Expressions)

2015-10-28 14:33 GMT+01:00 Tom Lane <tgl@sss.pgh.pa.us>:

Valery Popov <v.popov@postgrespro.ru> writes:

Recursive queries are typically used to deal with hierarchical or
tree-structured data.
In some conditions when data contain relationships with cycles

recursive query will loop

unlimited and significantly slows the client's session.

The standard way of dealing with that is to include logic in the query to
limit the recursion depth, for example

WITH RECURSIVE t(n) AS (
SELECT 1
UNION ALL
SELECT n+1 FROM t WHERE n < 10
)
SELECT n FROM t;

I don't see an example of this technique in the documentation, which maybe
is a documentation improvement opportunity.

To prevent "infinite" loop I suggest the max_recursion_depth parameter,
which defines the maximum recursion level during the execution of

recursive

query.

Controlling this via a GUC is a seriously awful idea. We learned a long
time ago to avoid GUCs that have a direct impact on query semantics; the
scope of their effects is just about never what you want.

Also, there are already ways to constrain queries-gone-crazy; particularly
statement_timeout, which has the advantage that it works for other types
of badly-written queries not only this one.

isn't the recursive limits much more a resource limit like work_mem etc?

Regards

Pavel

Show quoted text

regards, tom lane

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#5Tom Lane
tgl@sss.pgh.pa.us
In reply to: Pavel Stehule (#4)
Re: [PROPOSAL] Max recursion depth in WITH Queries (Common Table Expressions)

Pavel Stehule <pavel.stehule@gmail.com> writes:

2015-10-28 14:33 GMT+01:00 Tom Lane <tgl@sss.pgh.pa.us>:

Also, there are already ways to constrain queries-gone-crazy; particularly
statement_timeout, which has the advantage that it works for other types
of badly-written queries not only this one.

isn't the recursive limits much more a resource limit like work_mem etc?

Exceeding work_mem isn't generally supposed to result in an error --- it
causes, or should cause, the system to shift execution strategy so that
you get the same answer with less memory and more time consumption.

In any case, the question is what purpose this would serve that isn't
already covered perfectly well by existing features like
statement_timeout.

regards, tom lane

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#6Valery Popov
v.popov@postgrespro.ru
In reply to: Tom Lane (#3)
Re: [PROPOSAL] Max recursion depth in WITH Queries (Common Table Expressions)

28.10.2015 16:33, Tom Lane пїЅпїЅпїЅпїЅпїЅ:

Valery Popov <v.popov@postgrespro.ru> writes:

Recursive queries are typically used to deal with hierarchical or
tree-structured data.
In some conditions when data contain relationships with cycles recursive query will loop
unlimited and significantly slows the client's session.

The standard way of dealing with that is to include logic in the query to
limit the recursion depth, for example

WITH RECURSIVE t(n) AS (
SELECT 1
UNION ALL
SELECT n+1 FROM t WHERE n < 10
)
SELECT n FROM t;

Yes, I agree with this thesis. But I think in some cases would be
better to receive error message and stop execution than results will
incomplete.

--
Regards,
Valery Popov
Postgres Professional http://www.postgrespro.com
The Russian Postgres Company

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#7Tom Lane
tgl@sss.pgh.pa.us
In reply to: Valery Popov (#6)
Re: [PROPOSAL] Max recursion depth in WITH Queries (Common Table Expressions)

Valery Popov <v.popov@postgrespro.ru> writes:

28.10.2015 16:33, Tom Lane �����:

The standard way of dealing with that is to include logic in the query to
limit the recursion depth, for example ...

Yes, I agree with this thesis. But I think in some cases would be
better to receive error message and stop execution than results will
incomplete.

Sure, but you can do that at the SQL level if you have a mind to, as well.

In practice, I think people tend to use recursive queries mainly for data
layouts where the maximum recursion depth isn't terribly clear, so that
setting this GUC to a useful value would be a difficult task anyway.
If you end up setting it to 100X or 1000X more than you think your queries
could possibly recurse, you might as well use some other approach like
statement_timeout, which has got a closer relationship to what you care
about, ie how long you want to wait.

regards, tom lane

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers