Proposing WITH ITERATIVE

Started by Jonah H. Harrisover 5 years ago24 messages

jonah.harris@gmail.com

over 5 years ago

Hey, everyone.

It's been quite a while since I last contributed a patch but, as this is a
new feature, I wanted to gather feedback before doing so. I've found this
functionality, already in use at Universität Tübingen, to be both highly
beneficial in many of my queries as well as a small change to Postgres - a
good trade-off IMO.

As you know, Postgres currently supports SQL:1999 recursive common table
expressions, constructed using WITH RECURSIVE, which permit the computation
of growing relations (e.g., transitive closures.) Although it is possible
to use this recursion for general-purpose iterations, doing so is a
deviation from its intended use case. Accordingly, an iterative-only use of
WITH RECURSIVE often results in sizable overhead and poor optimizer
decisions. In this email, I'd like to propose a similar but
iterative-optimized form of CTE - WITH ITERATIVE.

In that it can reference a relation within its definition, this iterative
variant has similar capabilities as recursive CTEs. In contrast to
recursive CTEs, however, rather than appending new tuples, this variant
performs a replacement of the intermediate relation. As such, the final
result consists of a relation containing tuples computed during the last
iteration only. Why would we want this?

This type of computation pattern is often found in data and graph mining
algorithms. In PageRank, for example, the initial ranks are updated in each
iteration. In clustering algorithms, assignments of data tuples to clusters
are updated in every iteration. Something these types of algorithms have in
common is that they operate on fixed-size datasets, where only specific
values (ranks, assigned clusters, etc.) are updated.

Rather than stopping when no new tuples are generated, WITH ITERATIVE stops
when a user-defined predicate evaluates to true. By providing a
non-appending iteration concept with a while-loop-style stop criterion, we
can massively speed up query execution due to better optimizer decisions.
Although it is possible to implement the types of algorithms mentioned
above using recursive CTEs, this feature has two significant advantages:

First, iterative CTEs consume significantly less memory. Consider a CTE of
N tuples and I iterations. With recursive CTEs, such a relation would grow
to (N * I) tuples. With iterative CTEs, however, all prior iterations are
discarded early. As such, the size of the relation would remain N,
requiring only a maximum of (2 * N) tuples for comparisons of the current
and the prior iteration. Furthermore, in queries where the number of
executed iterations represents the stop criterion, recursive CTEs require
even more additional per-tuple overhead - to carry along the iteration
counter.

Second, iterative CTEs have lower query response times. Because of its
smaller size, the time to scan and process the intermediate relation, to
re-compute ranks, clusters, etc., is significantly reduced.

In short, iterative CTEs retain the flexibility of recursive CTEs, but
offer a significant advantage for algorithms which don't require results
computed throughout all iterations. As this feature deviates only slightly
from WITH RECURSIVE, there is very little code required to support it (~250
loc.)

If there's any interest, I'll clean-up their patch and submit. Thoughts?

--
Jonah H. Harris

Jeff Davis

pgsql@j-davis.com

over 5 years ago

In reply to: Jonah H. Harris (#1)

Re: Proposing WITH ITERATIVE

Hi,

You might get better feedback in a month or so; right now we just got
into feature freeze.

On Mon, 2020-04-27 at 12:52 -0400, Jonah H. Harris wrote:

In that it can reference a relation within its definition, this
iterative variant has similar capabilities as recursive CTEs. In
contrast to recursive CTEs, however, rather than appending new
tuples, this variant performs a replacement of the intermediate
relation. As such, the final result consists of a relation containing
tuples computed during the last iteration only. Why would we want
this?

Can you illustrate with some examples? I get that one is appending and
the other is modifying in-place, but how does this end up looking in
the query language?

This type of computation pattern is often found in data and graph
mining algorithms.

It certainly sounds useful.

Rather than stopping when no new tuples are generated, WITH ITERATIVE
stops when a user-defined predicate evaluates to true.

Why stop when it evaluates to true, and not false?

First, iterative CTEs consume significantly less memory. Consider a
CTE of N tuples and I iterations. With recursive CTEs, such a
relation would grow to (N * I) tuples. With iterative CTEs, however,
all prior iterations are discarded early. As such, the size of the
relation would remain N, requiring only a maximum of (2 * N) tuples
for comparisons of the current and the prior iteration. Furthermore,
in queries where the number of executed iterations represents the
stop criterion, recursive CTEs require even more additional per-tuple
overhead - to carry along the iteration counter.

It seems like the benefit comes from carrying information along within
tuples (by adding to scores or counters) rather than appending tuples.
Is it possible to achieve this in other ways? The recursive CTE
implementation is a very direct implementation of the standard, perhaps
there are smarter approaches?

Regards,
Jeff Davis

David Fetter

david@fetter.org

over 5 years ago

In reply to: Jonah H. Harris (#1)

Re: Proposing WITH ITERATIVE

On Mon, Apr 27, 2020 at 12:52:48PM -0400, Jonah H. Harris wrote:

Hey, everyone.

If there's any interest, I'll clean-up their patch and submit. Thoughts?

Where's the current patch?

Best,
David.
--
David Fetter <david(at)fetter(dot)org> http://fetter.org/
Phone: +1 415 235 3778

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate

Jonah H. Harris

jonah.harris@gmail.com

over 5 years ago

In reply to: David Fetter (#3)

Re: Proposing WITH ITERATIVE

On Mon, Apr 27, 2020 at 10:32 PM David Fetter <david@fetter.org> wrote:

On Mon, Apr 27, 2020 at 12:52:48PM -0400, Jonah H. Harris wrote:

Hey, everyone.

If there's any interest, I'll clean-up their patch and submit. Thoughts?

Where's the current patch?

It’s private. Hence, why I’m inquiring as to interest.

--
Jonah H. Harris

David Fetter

david@fetter.org

over 5 years ago

In reply to: Jonah H. Harris (#4)

Re: Proposing WITH ITERATIVE

On Mon, Apr 27, 2020 at 10:44:04PM -0400, Jonah H. Harris wrote:

On Mon, Apr 27, 2020 at 10:32 PM David Fetter <david@fetter.org> wrote:

On Mon, Apr 27, 2020 at 12:52:48PM -0400, Jonah H. Harris wrote:

Hey, everyone.

If there's any interest, I'll clean-up their patch and submit. Thoughts?

Where's the current patch?

It’s private. Hence, why I’m inquiring as to interest.

Have the authors agreed to make it available to the project under a
compatible license?

Best,
David.
--
David Fetter <david(at)fetter(dot)org> http://fetter.org/
Phone: +1 415 235 3778

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate

Jonah H. Harris

jonah.harris@gmail.com

over 5 years ago

In reply to: David Fetter (#5)

Re: Proposing WITH ITERATIVE

On Mon, Apr 27, 2020 at 11:33 PM David Fetter <david@fetter.org> wrote:

Have the authors agreed to make it available to the project under a
compatible license?

If there’s interest, obviously. Otherwise I wouldn’t be asking.

I said from the start why I wasn’t attaching a patch and that I was seeing
feedback. Honestly, David, stop wasting my, and list time, asking pointless
off-topic questions.

--
Jonah H. Harris

Oleksandr Shulgin

oleksandr.shulgin@zalando.de

over 5 years ago

In reply to: Jonah H. Harris (#6)

Re: Proposing WITH ITERATIVE

On Tue, Apr 28, 2020 at 5:49 AM Jonah H. Harris <jonah.harris@gmail.com>
wrote:

On Mon, Apr 27, 2020 at 11:33 PM David Fetter <david@fetter.org> wrote:

Have the authors agreed to make it available to the project under a
compatible license?

If there’s interest, obviously. Otherwise I wouldn’t be asking.

I said from the start why I wasn’t attaching a patch and that I was seeing
feedback. Honestly, David, stop wasting my, and list time, asking
pointless off-topic questions.

Jonah,

I see it the other way round—it could end up as a waste of everyone's time
discussing the details, if the authors don't agree to publish their code in
the first place. Of course, it could also be written from scratch, in
which case I guess someone else from the community (who haven't seen that
private code) would have to take a stab at it, but I believe it helps to
know this in advance.

I also don't see how it "obviously" follows from your two claims: "I've
found this functionality" and "I'll clean-up their patch and submit", that
you have even asked (or, for that matter—found) the authors of that code.

Finally, I'd like to suggest you adopt a more constructive tone and become
familiar with the project's Code of Conduct[1]https://www.postgresql.org/about/policies/coc/, if you haven't yet. I am
certain that constructive, respectful communication from your side will
help the community to focus on the actual details of your proposal.

Kind regards,
--
Alex

[1]: https://www.postgresql.org/about/policies/coc/

Andreas Karlsson

andreas@proxel.se

over 5 years ago

In reply to: Jonah H. Harris (#1)

Re: Proposing WITH ITERATIVE

On 4/27/20 6:52 PM, Jonah H. Harris wrote:> If there's any interest,
I'll clean-up their patch and submit. Thoughts?

Hi,

Do you have any examples of queries where it would help? It is pretty
hard to say how much value some new syntax adds without seeing how it
improves an intended use case.

Andreas

Jonah H. Harris

jonah.harris@gmail.com

over 5 years ago

In reply to: Oleksandr Shulgin (#7)

Re: Proposing WITH ITERATIVE

On Tue, Apr 28, 2020 at 6:16 AM Oleksandr Shulgin <
oleksandr.shulgin@zalando.de> wrote:

will help the community to focus on the actual details of your proposal.

I'd like it if the community would focus on the details of the proposal as
well :)

--
Jonah H. Harris

#10

Jonah H. Harris

jonah.harris@gmail.com

over 5 years ago

In reply to: Andreas Karlsson (#8)

Re: Proposing WITH ITERATIVE

On Tue, Apr 28, 2020 at 6:19 AM Andreas Karlsson <andreas@proxel.se> wrote:

Do you have any examples of queries where it would help? It is pretty
hard to say how much value some new syntax adds without seeing how it
improves an intended use case.

Hey, Andreas.

Thanks for the response. I'm currently working on a few examples per Jeff's
response along with some benchmark information including improvements in
response time and memory usage of the current implementation. In the
meantime, as this functionality has been added to a couple of other
databases and there's academic research on it, if you're interested, here's
a few papers with examples:

http://faculty.neu.edu.cn/cc/zhangyf/papers/2018-ICDCS2018-sqloop.pdf
http://db.in.tum.de/~passing/publications/dm_in_hyper.pdf

--
Jonah H. Harris

#11

Stephen Frost

sfrost@snowman.net

over 5 years ago

In reply to: Jonah H. Harris (#10)

Re: Proposing WITH ITERATIVE

Greetings Jonah!

* Jonah H. Harris (jonah.harris@gmail.com) wrote:

On Tue, Apr 28, 2020 at 6:19 AM Andreas Karlsson <andreas@proxel.se> wrote:

Do you have any examples of queries where it would help? It is pretty
hard to say how much value some new syntax adds without seeing how it
improves an intended use case.

Thanks for the response. I'm currently working on a few examples per Jeff's
response along with some benchmark information including improvements in
response time and memory usage of the current implementation. In the
meantime, as this functionality has been added to a couple of other
databases and there's academic research on it, if you're interested, here's
a few papers with examples:

http://faculty.neu.edu.cn/cc/zhangyf/papers/2018-ICDCS2018-sqloop.pdf
http://db.in.tum.de/~passing/publications/dm_in_hyper.pdf

Nice!

One of the first question that we need to ask though, imv anyway, is- do
the other databases use the WITH ITERATIVE syntax? How many of them?
Are there other approaches? Has this been brought up to the SQL
committee?

In general, we really prefer to avoid extending SQL beyond the standard,
especially in ways that the standard is likely to be expanded. In
other words, we'd really prefer to avoid the risk that the SQL committee
declares WITH ITERATIVE to mean something else in the future, causing us
to have to have a breaking change to become compliant. Now, if all the
other major DB vendors have WITH ITERATIVE and they all work the same
way, then we can have at least some confidence that the SQL committee
will end up defining it in that way and there won't be any issue.

We do have someone on the committee who watches these lists, hopefully
they'll have a chance to comment on this. Perhaps it's already in
discussion in the committee.

Thanks!

Stephen

#12

Jonah H. Harris

jonah.harris@gmail.com

over 5 years ago

In reply to: Jeff Davis (#2)

Re: Proposing WITH ITERATIVE

On Mon, Apr 27, 2020 at 8:50 PM Jeff Davis <pgsql@j-davis.com> wrote:

Hi,

Hey, Jeff. Long time no talk. Good to see you're still on here.

You might get better feedback in a month or so; right now we just got

into feature freeze.

Yep. No hurry. I've just been playing with this and wanted to start getting
feedback. It's a side-project for me anyway, so time is limited.

Can you illustrate with some examples? I get that one is appending and
the other is modifying in-place, but how does this end up looking in
the query language?

I'm putting together a few concrete real-world examples.

Rather than stopping when no new tuples are generated, WITH ITERATIVE

stops when a user-defined predicate evaluates to true.

Why stop when it evaluates to true, and not false?

It's how they implemented it. A few other databases have implemented
similar functionality but, as it's not standard, it's kinda just up to each
implementor. I'm not married to that idea, but it has worked well for me so
far.

It seems like the benefit comes from carrying information along within

tuples (by adding to scores or counters) rather than appending tuples.
Is it possible to achieve this in other ways? The recursive CTE
implementation is a very direct implementation of the standard, perhaps
there are smarter approaches?

Yeah, in that specific case, one of the other implementations seems to
carry the counters along in the executor itself. But, as not all uses of
this functionality are iteration-count-based, I think that's a little
limiting. Using a terminator expression (of some kind) seems most
adaptable, I think. I'll give some examples of both types of cases.

--
Jonah H. Harris

#13

Jonah H. Harris

jonah.harris@gmail.com

over 5 years ago

In reply to: Stephen Frost (#11)

Re: Proposing WITH ITERATIVE

On Tue, Apr 28, 2020 at 11:51 AM Stephen Frost <sfrost@snowman.net> wrote:

Greetings Jonah!

Hey, Stephen. Hope you're doing well, man!

One of the first question that we need to ask though, imv anyway, is- do
the other databases use the WITH ITERATIVE syntax? How many of them?
Are there other approaches? Has this been brought up to the SQL
committee?

There are four that I'm aware of, but I'll put together a full list. I
don't believe it's been proposed to the standards committee, but I'll see
if I can find transcripts/proposals. I imagine those are all still public.

In general, we really prefer to avoid extending SQL beyond the standard,

especially in ways that the standard is likely to be expanded. In
other words, we'd really prefer to avoid the risk that the SQL committee
declares WITH ITERATIVE to mean something else in the future, causing us
to have to have a breaking change to become compliant.

Agreed. That's my main concern as well.

Now, if all the other major DB vendors have WITH ITERATIVE and they all
work the same
way, then we can have at least some confidence that the SQL committee
will end up defining it in that way and there won't be any issue.

This is where it sucks, as each vendor has done it differently. One uses
WITH ITERATE, one uses WITH ITERATIVE, others use, what I consider to be, a
nasty operator-esque style... I think ITERATIVE makes the most sense, but
it does create a future contention as that definitely seems like the syntax
the committee would use as well.

Oracle ran into this issue as well, which is why they added an additional
clause to WITH that permits selection of depth/breadth-first search et al.
While it's kinda nasty, we could always similarly amend WITH RECURSIVE to
add an additional ITERATE or something to the tail of the with_clause rule.
But, that's why I'm looking for feedback :)

We do have someone on the committee who watches these lists, hopefully

they'll have a chance to comment on this. Perhaps it's already in
discussion in the committee.

That would be awesome.

--
Jonah H. Harris

#14

Jonah H. Harris

jonah.harris@gmail.com

over 5 years ago

In reply to: Jeff Davis (#2)

1 attachment(s)

Re: Proposing WITH ITERATIVE

On Mon, Apr 27, 2020 at 8:50 PM Jeff Davis <pgsql@j-davis.com> wrote:

Can you illustrate with some examples? I get that one is appending and
the other is modifying in-place, but how does this end up looking in
the query language?

To ensure credit is given to the original authors, the initial patch I'm
working with (against Postgres 11 and 12) came from Denis Hirn, Torsten
Grust, and Christian Duta. Attached is a quick-and-dirty merge of that
patch that applies cleanly against 13-devel. It is not solid, at all, but
demonstrates the functionality. At present, my updates can be found at
https://github.com/jonahharris/postgres/tree/with-iterative

As the patch makes use of additional booleans for iteration, if there's
interest in incorporating this functionality, I'd like to discuss with Tom,
Robert, et al regarding the current use of booleans for CTE recursion
differentiation in parsing and planning. In terms of making it
production-ready, I think the cleanest method would be to use a bitfield
(rather than booleans) to differentiate recursive and iterative CTEs.
Though, as that would touch more code, it's obviously up for discussion.

I'm working on a few good standalone examples of PageRank, shortest path,
etc. But, the simplest Fibonacci example can be found here:

EXPLAIN ANALYZE
WITH RECURSIVE fib_sum (iteration, previous_number, new_number)
AS (SELECT 1, 0::numeric, 1::numeric
UNION ALL
SELECT (iteration + 1), new_number, (previous_number + new_number)
FROM fib_sum
WHERE iteration <= 10000)
SELECT r.iteration, r.new_number
FROM fib_sum r
ORDER BY 1 DESC
LIMIT 1;

QUERY PLAN

--------------------------------------------------------------------------------------------------------------------------
Limit (cost=3.81..3.81 rows=1 width=36) (actual time=44.418..44.418
rows=1 loops=1)
CTE fib_sum
-> Recursive Union (cost=0.00..3.03 rows=31 width=68) (actual
time=0.005..14.002 rows=10001 loops=1)
-> Result (cost=0.00..0.01 rows=1 width=68) (actual
time=0.004..0.004 rows=1 loops=1)
-> WorkTable Scan on fib_sum (cost=0.00..0.24 rows=3 width=68)
(actual time=0.001..0.001 rows=1 loops=10001)
Filter: (iteration <= 10000)
Rows Removed by Filter: 0
-> Sort (cost=0.78..0.85 rows=31 width=36) (actual time=44.417..44.417
rows=1 loops=1)
Sort Key: r.iteration DESC
Sort Method: top-N heapsort Memory: 27kB
-> CTE Scan on fib_sum r (cost=0.00..0.62 rows=31 width=36)
(actual time=0.009..42.660 rows=10001 loops=1)
Planning Time: 0.331 ms
Execution Time: 45.949 ms
(13 rows)

-- No ORDER/LIMIT is required with ITERATIVE as only a single tuple is
present
EXPLAIN ANALYZE
WITH ITERATIVE fib_sum (iteration, previous_number, new_number)
AS (SELECT 1, 0::numeric, 1::numeric
UNION ALL
SELECT (iteration + 1), new_number, (previous_number + new_number)
FROM fib_sum
WHERE iteration <= 10000)
SELECT r.iteration, r.new_number
FROM fib_sum r;

QUERY PLAN

--------------------------------------------------------------------------------------------------------------------------
CTE Scan on fib_sum r (cost=3.03..3.65 rows=31 width=36) (actual
time=11.626..11.627 rows=1 loops=1)
CTE fib_sum
-> Recursive Union (cost=0.00..3.03 rows=31 width=68) (actual
time=11.621..11.621 rows=1 loops=1)
-> Result (cost=0.00..0.01 rows=1 width=68) (actual
time=0.001..0.001 rows=1 loops=1)
-> WorkTable Scan on fib_sum (cost=0.00..0.24 rows=3 width=68)
(actual time=0.001..0.001 rows=1 loops=10001)
Filter: (iteration <= 10000)
Rows Removed by Filter: 0
Planning Time: 0.068 ms
Execution Time: 11.651 ms
(9 rows)

--
Jonah H. Harris

Attachments:

with_iterative_v1.patchapplication/octet-stream; name=with_iterative_v1.patchDownload

diff --git a/src/backend/executor/nodeRecursiveunion.c b/src/backend/executor/nodeRecursiveunion.c
index 620414a1ed..50f59ff0b9 100644
--- a/src/backend/executor/nodeRecursiveunion.c
+++ b/src/backend/executor/nodeRecursiveunion.c
@@ -104,7 +104,9 @@ ExecRecursiveUnion(PlanState *pstate)
 			/* Each non-duplicate tuple goes to the working table ... */
 			tuplestore_puttupleslot(node->working_table, slot);
 			/* ... and to the caller */
-			return slot;
+			if (!node->iterating) {
+				return slot;
+			}
 		}
 		node->recursing = true;
 	}
@@ -116,8 +118,33 @@ ExecRecursiveUnion(PlanState *pstate)
 		if (TupIsNull(slot))
 		{
 			/* Done if there's nothing in the intermediate table */
-			if (node->intermediate_empty)
+			if (node->intermediate_empty) {
+				if (!node->iterating) {
+					return NULL;
+				}
+
+				if (!TupIsNull(pstate->ps_ResultTupleSlot)) {
+					tuplestore_select_read_pointer(node->working_table, 1);
+					(void) tuplestore_gettupleslot(node->working_table, true, true, pstate->ps_ResultTupleSlot);
+					tuplestore_select_read_pointer(node->working_table, 0);
+					return pstate->ps_ResultTupleSlot;
+				} else {
+					tuplestore_alloc_read_pointer(node->working_table, 0);
+
+					tuplestore_rescan(node->working_table);
+
+					tuplestore_copy_read_pointer(node->working_table, 0, 1);
+					tuplestore_select_read_pointer(node->working_table, 1);
+
+					tuplestore_gettupleslot(node->working_table, true, true, pstate->ps_ResultTupleSlot);
+					tuplestore_select_read_pointer(node->working_table, 0);
+					return pstate->ps_ResultTupleSlot;
+				}
+
+				 return slot;
+
 				break;
+			}
 
 			/* done with old working table ... */
 			tuplestore_end(node->working_table);
@@ -153,10 +180,16 @@ ExecRecursiveUnion(PlanState *pstate)
 		node->intermediate_empty = false;
 		tuplestore_puttupleslot(node->intermediate_table, slot);
 		/* ... and return it */
-		return slot;
+		if (!node->iterating) {
+			return slot;
+		}
 	}
 
-	return NULL;
+	if (node->iterating) {
+		return pstate->ps_ResultTupleSlot;
+	} else {
+		return NULL;
+	}
 }
 
 /* ----------------------------------------------------------------
@@ -188,6 +221,7 @@ ExecInitRecursiveUnion(RecursiveUnion *node, EState *estate, int eflags)
 
 	/* initialize processing state */
 	rustate->recursing = false;
+	rustate->iterating = node->iterative;
 	rustate->intermediate_empty = true;
 	rustate->working_table = tuplestore_begin_heap(false, false, work_mem);
 	rustate->intermediate_table = tuplestore_begin_heap(false, false, work_mem);
@@ -233,6 +267,11 @@ ExecInitRecursiveUnion(RecursiveUnion *node, EState *estate, int eflags)
 	 */
 	ExecInitResultTypeTL(&rustate->ps);
 
+	if (rustate->iterating) {
+		/* XXX FIX */
+		rustate->ps.ps_ResultTupleSlot = ExecAllocTableSlot(&estate->es_tupleTable, NULL, &TTSOpsMinimalTuple);
+	}
+
 	/*
 	 * Initialize result tuple type.  (Note: we have to set up the result type
 	 * before initializing child nodes, because nodeWorktablescan.c expects it
diff --git a/src/backend/nodes/copyfuncs.c b/src/backend/nodes/copyfuncs.c
index 491452ae2d..89ccc82fd6 100644
--- a/src/backend/nodes/copyfuncs.c
+++ b/src/backend/nodes/copyfuncs.c
@@ -2556,6 +2556,7 @@ _copyWithClause(const WithClause *from)
 
 	COPY_NODE_FIELD(ctes);
 	COPY_SCALAR_FIELD(recursive);
+	COPY_SCALAR_FIELD(iterative);
 	COPY_LOCATION_FIELD(location);
 
 	return newnode;
@@ -2599,6 +2600,7 @@ _copyCommonTableExpr(const CommonTableExpr *from)
 	COPY_NODE_FIELD(ctequery);
 	COPY_LOCATION_FIELD(location);
 	COPY_SCALAR_FIELD(cterecursive);
+	COPY_SCALAR_FIELD(cteiterative);
 	COPY_SCALAR_FIELD(cterefcount);
 	COPY_NODE_FIELD(ctecolnames);
 	COPY_NODE_FIELD(ctecoltypes);
@@ -3066,6 +3068,7 @@ _copyQuery(const Query *from)
 	COPY_SCALAR_FIELD(hasSubLinks);
 	COPY_SCALAR_FIELD(hasDistinctOn);
 	COPY_SCALAR_FIELD(hasRecursive);
+	COPY_SCALAR_FIELD(hasIterative);
 	COPY_SCALAR_FIELD(hasModifyingCTE);
 	COPY_SCALAR_FIELD(hasForUpdate);
 	COPY_SCALAR_FIELD(hasRowSecurity);
diff --git a/src/backend/nodes/outfuncs.c b/src/backend/nodes/outfuncs.c
index bb1565467d..e6da740d6d 100644
--- a/src/backend/nodes/outfuncs.c
+++ b/src/backend/nodes/outfuncs.c
@@ -468,6 +468,7 @@ _outRecursiveUnion(StringInfo str, const RecursiveUnion *node)
 	WRITE_OID_ARRAY(dupOperators, node->numCols);
 	WRITE_OID_ARRAY(dupCollations, node->numCols);
 	WRITE_LONG_FIELD(numGroups);
+	WRITE_BOOL_FIELD(iterative);
 }
 
 static void
@@ -2254,6 +2255,7 @@ _outPlannerInfo(StringInfo str, const PlannerInfo *node)
 	WRITE_BOOL_FIELD(hasHavingQual);
 	WRITE_BOOL_FIELD(hasPseudoConstantQuals);
 	WRITE_BOOL_FIELD(hasRecursion);
+	WRITE_BOOL_FIELD(hasIteration);
 	WRITE_INT_FIELD(wt_param_id);
 	WRITE_BITMAPSET_FIELD(curOuterRels);
 	WRITE_NODE_FIELD(curOuterParams);
@@ -2942,6 +2944,7 @@ _outQuery(StringInfo str, const Query *node)
 	WRITE_BOOL_FIELD(hasSubLinks);
 	WRITE_BOOL_FIELD(hasDistinctOn);
 	WRITE_BOOL_FIELD(hasRecursive);
+	WRITE_BOOL_FIELD(hasIterative);
 	WRITE_BOOL_FIELD(hasModifyingCTE);
 	WRITE_BOOL_FIELD(hasForUpdate);
 	WRITE_BOOL_FIELD(hasRowSecurity);
@@ -3042,6 +3045,7 @@ _outWithClause(StringInfo str, const WithClause *node)
 
 	WRITE_NODE_FIELD(ctes);
 	WRITE_BOOL_FIELD(recursive);
+	WRITE_BOOL_FIELD(iterative);
 	WRITE_LOCATION_FIELD(location);
 }
 
@@ -3056,6 +3060,7 @@ _outCommonTableExpr(StringInfo str, const CommonTableExpr *node)
 	WRITE_NODE_FIELD(ctequery);
 	WRITE_LOCATION_FIELD(location);
 	WRITE_BOOL_FIELD(cterecursive);
+	WRITE_BOOL_FIELD(cteiterative);
 	WRITE_INT_FIELD(cterefcount);
 	WRITE_NODE_FIELD(ctecolnames);
 	WRITE_NODE_FIELD(ctecoltypes);
diff --git a/src/backend/nodes/readfuncs.c b/src/backend/nodes/readfuncs.c
index 644e0f5fc7..2333b7f2bf 100644
--- a/src/backend/nodes/readfuncs.c
+++ b/src/backend/nodes/readfuncs.c
@@ -260,6 +260,7 @@ _readQuery(void)
 	READ_BOOL_FIELD(hasSubLinks);
 	READ_BOOL_FIELD(hasDistinctOn);
 	READ_BOOL_FIELD(hasRecursive);
+	READ_BOOL_FIELD(hasIterative);
 	READ_BOOL_FIELD(hasModifyingCTE);
 	READ_BOOL_FIELD(hasForUpdate);
 	READ_BOOL_FIELD(hasRowSecurity);
diff --git a/src/backend/optimizer/path/allpaths.c b/src/backend/optimizer/path/allpaths.c
index 255f56b827..26d215b6ec 100644
--- a/src/backend/optimizer/path/allpaths.c
+++ b/src/backend/optimizer/path/allpaths.c
@@ -2302,7 +2302,7 @@ set_subquery_pathlist(PlannerInfo *root, RelOptInfo *rel,
 	/* Generate a subroot and Paths for the subquery */
 	rel->subroot = subquery_planner(root->glob, subquery,
 									root,
-									false, tuple_fraction);
+									false, tuple_fraction, false);
 
 	/* Isolate the params needed by this specific subplan */
 	rel->subplan_params = root->plan_params;
diff --git a/src/backend/optimizer/plan/createplan.c b/src/backend/optimizer/plan/createplan.c
index 9941dfe65e..7586edc6c7 100644
--- a/src/backend/optimizer/plan/createplan.c
+++ b/src/backend/optimizer/plan/createplan.c
@@ -214,7 +214,8 @@ static RecursiveUnion *make_recursive_union(List *tlist,
 											Plan *righttree,
 											int wtParam,
 											List *distinctList,
-											long numGroups);
+											long numGroups,
+											bool iterate);
 static BitmapAnd *make_bitmap_and(List *bitmapplans);
 static BitmapOr *make_bitmap_or(List *bitmapplans);
 static NestLoop *make_nestloop(List *tlist,
@@ -2588,7 +2589,8 @@ create_recursiveunion_plan(PlannerInfo *root, RecursiveUnionPath *best_path)
 								rightplan,
 								best_path->wtParam,
 								best_path->distinctList,
-								numGroups);
+								numGroups,
+								best_path->iterative);
 
 	copy_generic_path_info(&plan->plan, (Path *) best_path);
 
@@ -5549,7 +5551,8 @@ make_recursive_union(List *tlist,
 					 Plan *righttree,
 					 int wtParam,
 					 List *distinctList,
-					 long numGroups)
+					 long numGroups,
+					 bool iterate)
 {
 	RecursiveUnion *node = makeNode(RecursiveUnion);
 	Plan	   *plan = &node->plan;
@@ -5595,6 +5598,7 @@ make_recursive_union(List *tlist,
 		node->dupCollations = dupCollations;
 	}
 	node->numGroups = numGroups;
+	node->iterative = iterate;
 
 	return node;
 }
diff --git a/src/backend/optimizer/plan/planner.c b/src/backend/optimizer/plan/planner.c
index e664eb18c0..c191325bf2 100644
--- a/src/backend/optimizer/plan/planner.c
+++ b/src/backend/optimizer/plan/planner.c
@@ -315,6 +315,7 @@ standard_planner(Query *parse, const char *query_string, int cursorOptions,
 	glob->lastPlanNodeId = 0;
 	glob->transientPlan = false;
 	glob->dependsOnRole = false;
+	glob->iterativePlan = parse->hasIterative;
 
 	/*
 	 * Assess whether it's feasible to use parallel mode for this query. We
@@ -403,7 +404,7 @@ standard_planner(Query *parse, const char *query_string, int cursorOptions,
 
 	/* primary planning entry point (may recurse for subqueries) */
 	root = subquery_planner(glob, parse, NULL,
-							false, tuple_fraction);
+							false, tuple_fraction, parse->hasIterative);
 
 	/* Select best Path and turn it into a Plan */
 	final_rel = fetch_upper_rel(root, UPPERREL_FINAL, NULL);
@@ -596,7 +597,7 @@ standard_planner(Query *parse, const char *query_string, int cursorOptions,
 PlannerInfo *
 subquery_planner(PlannerGlobal *glob, Query *parse,
 				 PlannerInfo *parent_root,
-				 bool hasRecursion, double tuple_fraction)
+				 bool hasRecursion, double tuple_fraction, bool hasIteration)
 {
 	PlannerInfo *root;
 	List	   *newWithCheckOptions;
@@ -630,6 +631,7 @@ subquery_planner(PlannerGlobal *glob, Query *parse,
 	root->qual_security_level = 0;
 	root->inhTargetKind = INHKIND_NONE;
 	root->hasRecursion = hasRecursion;
+	root->hasIteration = hasIteration;
 	if (hasRecursion)
 		root->wt_param_id = assign_special_exec_param(root);
 	else
diff --git a/src/backend/optimizer/plan/subselect.c b/src/backend/optimizer/plan/subselect.c
index b02fcb9bfe..272334ea5b 100644
--- a/src/backend/optimizer/plan/subselect.c
+++ b/src/backend/optimizer/plan/subselect.c
@@ -219,7 +219,7 @@ make_subplan(PlannerInfo *root, Query *orig_subquery,
 	/* Generate Paths for the subquery */
 	subroot = subquery_planner(root->glob, subquery,
 							   root,
-							   false, tuple_fraction);
+							   false, tuple_fraction, subquery->hasIterative);
 
 	/* Isolate the params needed by this specific subplan */
 	plan_params = root->plan_params;
@@ -266,7 +266,7 @@ make_subplan(PlannerInfo *root, Query *orig_subquery,
 			/* Generate Paths for the ANY subquery; we'll need all rows */
 			subroot = subquery_planner(root->glob, subquery,
 									   root,
-									   false, 0.0);
+									   false, 0.0, false);
 
 			/* Isolate the params needed by this specific subplan */
 			plan_params = root->plan_params;
@@ -917,7 +917,7 @@ SS_process_ctes(PlannerInfo *root)
 		 */
 		subroot = subquery_planner(root->glob, subquery,
 								   root,
-								   cte->cterecursive, 0.0);
+								   cte->cterecursive, 0.0, (cte->cteiterative || root->hasIteration));
 
 		/*
 		 * Since the current query level doesn't yet contain any RTEs, it
diff --git a/src/backend/optimizer/prep/prepunion.c b/src/backend/optimizer/prep/prepunion.c
index 951aed80e7..b8eedbcc47 100644
--- a/src/backend/optimizer/prep/prepunion.c
+++ b/src/backend/optimizer/prep/prepunion.c
@@ -239,7 +239,8 @@ recurse_set_operations(Node *setOp, PlannerInfo *root,
 		subroot = rel->subroot = subquery_planner(root->glob, subquery,
 												  root,
 												  false,
-												  root->tuple_fraction);
+												  root->tuple_fraction,
+												  root->hasIteration);
 
 		/*
 		 * It should not be possible for the primitive query to contain any
diff --git a/src/backend/optimizer/util/pathnode.c b/src/backend/optimizer/util/pathnode.c
index e991385059..f5db21f344 100644
--- a/src/backend/optimizer/util/pathnode.c
+++ b/src/backend/optimizer/util/pathnode.c
@@ -3422,6 +3422,7 @@ create_recursiveunion_path(PlannerInfo *root,
 	pathnode->distinctList = distinctList;
 	pathnode->wtParam = wtParam;
 	pathnode->numGroups = numGroups;
+	pathnode->iterative = (root->hasIteration || root->parse->hasIterative);
 
 	cost_recursive_union(&pathnode->path, leftpath, rightpath);
 
diff --git a/src/backend/parser/analyze.c b/src/backend/parser/analyze.c
index eee9c33637..6d21f7dc39 100644
--- a/src/backend/parser/analyze.c
+++ b/src/backend/parser/analyze.c
@@ -406,6 +406,7 @@ transformDeleteStmt(ParseState *pstate, DeleteStmt *stmt)
 	if (stmt->withClause)
 	{
 		qry->hasRecursive = stmt->withClause->recursive;
+		qry->hasIterative = stmt->withClause->iterative;
 		qry->cteList = transformWithClause(pstate, stmt->withClause);
 		qry->hasModifyingCTE = pstate->p_hasModifyingCTE;
 	}
@@ -492,6 +493,7 @@ transformInsertStmt(ParseState *pstate, InsertStmt *stmt)
 	if (stmt->withClause)
 	{
 		qry->hasRecursive = stmt->withClause->recursive;
+		qry->hasIterative = stmt->withClause->iterative;
 		qry->cteList = transformWithClause(pstate, stmt->withClause);
 		qry->hasModifyingCTE = pstate->p_hasModifyingCTE;
 	}
@@ -1199,6 +1201,7 @@ transformSelectStmt(ParseState *pstate, SelectStmt *stmt)
 	if (stmt->withClause)
 	{
 		qry->hasRecursive = stmt->withClause->recursive;
+		qry->hasIterative = stmt->withClause->iterative;
 		qry->cteList = transformWithClause(pstate, stmt->withClause);
 		qry->hasModifyingCTE = pstate->p_hasModifyingCTE;
 	}
@@ -1360,6 +1363,7 @@ transformValuesClause(ParseState *pstate, SelectStmt *stmt)
 	if (stmt->withClause)
 	{
 		qry->hasRecursive = stmt->withClause->recursive;
+		qry->hasIterative = stmt->withClause->iterative;
 		qry->cteList = transformWithClause(pstate, stmt->withClause);
 		qry->hasModifyingCTE = pstate->p_hasModifyingCTE;
 	}
@@ -1644,6 +1648,7 @@ transformSetOperationStmt(ParseState *pstate, SelectStmt *stmt)
 	if (withClause)
 	{
 		qry->hasRecursive = withClause->recursive;
+		qry->hasIterative = withClause->iterative;
 		qry->cteList = transformWithClause(pstate, withClause);
 		qry->hasModifyingCTE = pstate->p_hasModifyingCTE;
 	}
@@ -2239,6 +2244,7 @@ transformUpdateStmt(ParseState *pstate, UpdateStmt *stmt)
 	if (stmt->withClause)
 	{
 		qry->hasRecursive = stmt->withClause->recursive;
+		qry->hasIterative = stmt->withClause->iterative;
 		qry->cteList = transformWithClause(pstate, stmt->withClause);
 		qry->hasModifyingCTE = pstate->p_hasModifyingCTE;
 	}
diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y
index 3c78f2d1b5..fc14c06fdf 100644
--- a/src/backend/parser/gram.y
+++ b/src/backend/parser/gram.y
@@ -662,7 +662,7 @@ static Node *makeRecursiveViewSelect(char *relname, List *aliases, Node *query);
 	IDENTITY_P IF_P ILIKE IMMEDIATE IMMUTABLE IMPLICIT_P IMPORT_P IN_P INCLUDE
 	INCLUDING INCREMENT INDEX INDEXES INHERIT INHERITS INITIALLY INLINE_P
 	INNER_P INOUT INPUT_P INSENSITIVE INSERT INSTEAD INT_P INTEGER
-	INTERSECT INTERVAL INTO INVOKER IS ISNULL ISOLATION
+	INTERSECT INTERVAL INTO INVOKER IS ISNULL ISOLATION ITERATIVE
 
 	JOIN
 
@@ -11559,6 +11559,7 @@ with_clause:
 				$$ = makeNode(WithClause);
 				$$->ctes = $2;
 				$$->recursive = false;
+				$$->iterative = false;
 				$$->location = @1;
 			}
 		| WITH_LA cte_list
@@ -11566,6 +11567,7 @@ with_clause:
 				$$ = makeNode(WithClause);
 				$$->ctes = $2;
 				$$->recursive = false;
+				$$->iterative = false;
 				$$->location = @1;
 			}
 		| WITH RECURSIVE cte_list
@@ -11573,6 +11575,15 @@ with_clause:
 				$$ = makeNode(WithClause);
 				$$->ctes = $3;
 				$$->recursive = true;
+				$$->iterative = false;
+				$$->location = @1;
+			}
+		| WITH ITERATIVE cte_list
+			{
+				$$ = makeNode(WithClause);
+				$$->ctes = $3;
+				$$->recursive = true;
+				$$->iterative = true;
 				$$->location = @1;
 			}
 		;
@@ -15387,6 +15398,7 @@ unreserved_keyword:
 			| INSTEAD
 			| INVOKER
 			| ISOLATION
+			| ITERATIVE
 			| KEY
 			| LABEL
 			| LANGUAGE
diff --git a/src/backend/parser/parse_cte.c b/src/backend/parser/parse_cte.c
index 1fca7485ca..cb696220e6 100644
--- a/src/backend/parser/parse_cte.c
+++ b/src/backend/parser/parse_cte.c
@@ -135,6 +135,7 @@ transformWithClause(ParseState *pstate, WithClause *withClause)
 		}
 
 		cte->cterecursive = false;
+		cte->cteiterative = withClause->iterative;
 		cte->cterefcount = 0;
 
 		if (!IsA(cte->ctequery, SelectStmt))
diff --git a/src/include/nodes/execnodes.h b/src/include/nodes/execnodes.h
index 4fee043bb2..088265ba02 100644
--- a/src/include/nodes/execnodes.h
+++ b/src/include/nodes/execnodes.h
@@ -1266,6 +1266,7 @@ typedef struct MergeAppendState
  *		RecursiveUnionState is used for performing a recursive union.
  *
  *		recursing			T when we're done scanning the non-recursive term
+ *		iterating			T when we're done scanning the non-recursive term
  *		intermediate_empty	T if intermediate_table is currently empty
  *		working_table		working table (to be scanned by recursive term)
  *		intermediate_table	current recursive output (next generation of WT)
@@ -1275,6 +1276,7 @@ typedef struct RecursiveUnionState
 {
 	PlanState	ps;				/* its first field is NodeTag */
 	bool		recursing;
+	bool		iterating;
 	bool		intermediate_empty;
 	Tuplestorestate *working_table;
 	Tuplestorestate *intermediate_table;
diff --git a/src/include/nodes/parsenodes.h b/src/include/nodes/parsenodes.h
index 5e1ffafb91..551e62c1c1 100644
--- a/src/include/nodes/parsenodes.h
+++ b/src/include/nodes/parsenodes.h
@@ -128,6 +128,7 @@ typedef struct Query
 	bool		hasSubLinks;	/* has subquery SubLink */
 	bool		hasDistinctOn;	/* distinctClause is from DISTINCT ON */
 	bool		hasRecursive;	/* WITH RECURSIVE was specified */
+	bool		hasIterative;	/* WITH ITERATIVE was specified */
 	bool		hasModifyingCTE;	/* has INSERT/UPDATE/DELETE in WITH */
 	bool		hasForUpdate;	/* FOR [KEY] UPDATE/SHARE was specified */
 	bool		hasRowSecurity; /* rewriter has applied some RLS policy */
@@ -1396,7 +1397,8 @@ typedef struct WithClause
 {
 	NodeTag		type;
 	List	   *ctes;			/* list of CommonTableExprs */
-	bool		recursive;		/* true = WITH RECURSIVE */
+	bool		recursive;		/* true = WITH RECURSIVE (or ITERATIVE) */
+	bool		iterative;		/* true = WITH ITERATIVE */
 	int			location;		/* token location, or -1 if unknown */
 } WithClause;
 
@@ -1455,6 +1457,7 @@ typedef struct CommonTableExpr
 	int			location;		/* token location, or -1 if unknown */
 	/* These fields are set during parse analysis: */
 	bool		cterecursive;	/* is this CTE actually recursive? */
+	bool		cteiterative;	/* is this CTE iterative? */
 	int			cterefcount;	/* number of RTEs referencing this CTE
 								 * (excluding internal self-references) */
 	List	   *ctecolnames;	/* list of output column names */
diff --git a/src/include/nodes/pathnodes.h b/src/include/nodes/pathnodes.h
index 1c83772d62..e8df896ffc 100644
--- a/src/include/nodes/pathnodes.h
+++ b/src/include/nodes/pathnodes.h
@@ -147,6 +147,7 @@ typedef struct PlannerGlobal
 	char		maxParallelHazard;	/* worst PROPARALLEL hazard level */
 
 	PartitionDirectory partition_directory; /* partition descriptors */
+	bool iterativePlan;
 } PlannerGlobal;
 
 /* macro for fetching the Plan associated with a SubPlan node */
@@ -348,6 +349,7 @@ struct PlannerInfo
 	bool		hasPseudoConstantQuals; /* true if any RestrictInfo has
 										 * pseudoconstant = true */
 	bool		hasRecursion;	/* true if planning a recursive WITH item */
+	bool		hasIteration;	/* true if planning an iteration-optimized recursive WITH item */
 
 	/* These fields are used only when hasRecursion is true: */
 	int			wt_param_id;	/* PARAM_EXEC ID for the work table */
@@ -1787,6 +1789,7 @@ typedef struct RecursiveUnionPath
 	List	   *distinctList;	/* SortGroupClauses identifying target cols */
 	int			wtParam;		/* ID of Param representing work table */
 	double		numGroups;		/* estimated number of groups in input */
+	bool		iterative;		/* WITH ITERATIVE */
 } RecursiveUnionPath;
 
 /*
diff --git a/src/include/nodes/plannodes.h b/src/include/nodes/plannodes.h
index 55f363f70c..b58f82f82b 100644
--- a/src/include/nodes/plannodes.h
+++ b/src/include/nodes/plannodes.h
@@ -303,6 +303,7 @@ typedef struct RecursiveUnion
 	Oid		   *dupOperators;	/* equality operators to compare with */
 	Oid		   *dupCollations;
 	long		numGroups;		/* estimated number of groups in input */
+	bool		iterative;		/* WITH ITERATIVE */
 } RecursiveUnion;
 
 /* ----------------
diff --git a/src/include/optimizer/planner.h b/src/include/optimizer/planner.h
index beb7dbbcbe..2521bc79a2 100644
--- a/src/include/optimizer/planner.h
+++ b/src/include/optimizer/planner.h
@@ -44,7 +44,9 @@ extern PlannedStmt *standard_planner(Query *parse, const char *query_string,
 
 extern PlannerInfo *subquery_planner(PlannerGlobal *glob, Query *parse,
 									 PlannerInfo *parent_root,
-									 bool hasRecursion, double tuple_fraction);
+									 bool hasRecursion, double tuple_fraction, bool hasIterate);
+
+extern bool is_dummy_plan(Plan *plan);
 
 extern RowMarkType select_rowmark_type(RangeTblEntry *rte,
 									   LockClauseStrength strength);
diff --git a/src/include/parser/kwlist.h b/src/include/parser/kwlist.h
index 08f22ce211..8a76c43b9b 100644
--- a/src/include/parser/kwlist.h
+++ b/src/include/parser/kwlist.h
@@ -220,6 +220,7 @@ PG_KEYWORD("invoker", INVOKER, UNRESERVED_KEYWORD)
 PG_KEYWORD("is", IS, TYPE_FUNC_NAME_KEYWORD)
 PG_KEYWORD("isnull", ISNULL, TYPE_FUNC_NAME_KEYWORD)
 PG_KEYWORD("isolation", ISOLATION, UNRESERVED_KEYWORD)
+PG_KEYWORD("iterative", ITERATIVE, UNRESERVED_KEYWORD)
 PG_KEYWORD("join", JOIN, TYPE_FUNC_NAME_KEYWORD)
 PG_KEYWORD("key", KEY, UNRESERVED_KEYWORD)
 PG_KEYWORD("label", LABEL, UNRESERVED_KEYWORD)

#15

Fabien COELHO

coelho@cri.ensmp.fr

over 5 years ago

In reply to: Jonah H. Harris (#14)

Re: Proposing WITH ITERATIVE

Hello Jonah,

Nice.

-- No ORDER/LIMIT is required with ITERATIVE as only a single tuple is
present
EXPLAIN ANALYZE
WITH ITERATIVE fib_sum (iteration, previous_number, new_number)
AS (SELECT 1, 0::numeric, 1::numeric
UNION ALL
SELECT (iteration + 1), new_number, (previous_number + new_number)
FROM fib_sum
WHERE iteration <= 10000)
SELECT r.iteration, r.new_number
FROM fib_sum r;

Nice.

My 0,02€ about the feature design:

I'm wondering about how to use such a feature in the context of WITH query
with several queries having different behaviors. Currently "WITH"
introduces a named-query (like a view), "WITH RECURSIVE" introduces a mix
of recursive and named queries, pg really sees whether each one is
recursive or not, and "RECURSIVE" is required but could just be guessed.

Now that there could be 3 variants in the mix, and for the feature to be
orthogonal I think that it should be allowed. However, there is no obvious
way to distinguish a RECURSIVE from an ITERATIVE, as it is more a
behavioral thing than a structural one. This suggests allowing to tag each
query somehow, eg before, which would be closer to the current approach:

WITH
foo(i) AS (simple select),
RECURSIVE bla(i) AS (recursive select),
ITERATIVE blup(i) AS (iterative select),

or maybe after AS, which may be clearer because closer to the actual
query, which looks better to me:

WITH
foo(i) AS (simple select),
bla(i) AS RECURSIVE (recursive select),
boo(i) AS ITERATIVE (iterative select),
…

Also, with 3 cases I would prefer that the default has a name so someone
can talk about it otherwise than saying "default". Maybe SIMPLE would
suffice, or something else. ISTM that as nothing is expected between AS
and the open parenthesis, there is no need to have a reserved keyword,
which is a good thing for the parser.

--
Fabien.

#16

Peter Eisentraut

peter.eisentraut@2ndquadrant.com

over 5 years ago

In reply to: Fabien COELHO (#15)

Re: Proposing WITH ITERATIVE

On 2020-04-29 07:09, Fabien COELHO wrote:

I'm wondering about how to use such a feature in the context of WITH query
with several queries having different behaviors. Currently "WITH"
introduces a named-query (like a view), "WITH RECURSIVE" introduces a mix
of recursive and named queries, pg really sees whether each one is
recursive or not, and "RECURSIVE" is required but could just be guessed.

Yeah the RECURSIVE vs ITERATIVE is a bit of a red herring here. As you
say, the RECURSIVE keyword doesn't specify the processing but marks the
fact that the specification of the query is recursive.

I think a syntax that would fit better within the existing framework
would be something like

WITH RECURSIVE t AS (
SELECT base case
REPLACE ALL -- instead of UNION ALL
SELECT recursive case
)

--
Peter Eisentraut http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

#17

Jonah H. Harris

jonah.harris@gmail.com

over 5 years ago

In reply to: Peter Eisentraut (#16)

Re: Proposing WITH ITERATIVE

On Wed, Apr 29, 2020 at 7:22 AM Peter Eisentraut <
peter.eisentraut@2ndquadrant.com> wrote:

Yeah the RECURSIVE vs ITERATIVE is a bit of a red herring here. As you
say, the RECURSIVE keyword doesn't specify the processing but marks the
fact that the specification of the query is recursive.

Agreed. I started thinking through Fabien's response last night.

I think a syntax that would fit better within the existing framework

would be something like

WITH RECURSIVE t AS (
SELECT base case
REPLACE ALL -- instead of UNION ALL
SELECT recursive case
)

I was originally thinking more along the lines of Fabien's approach, but
this is similarly interesting.

--
Jonah H. Harris

#18

Corey Huinker

corey.huinker@gmail.com

over 5 years ago

In reply to: Jonah H. Harris (#17)

Re: Proposing WITH ITERATIVE

On Wed, Apr 29, 2020 at 10:34 AM Jonah H. Harris <jonah.harris@gmail.com>
wrote:

On Wed, Apr 29, 2020 at 7:22 AM Peter Eisentraut <
peter.eisentraut@2ndquadrant.com> wrote:

Yeah the RECURSIVE vs ITERATIVE is a bit of a red herring here. As you
say, the RECURSIVE keyword doesn't specify the processing but marks the
fact that the specification of the query is recursive.

Agreed. I started thinking through Fabien's response last night.

I think a syntax that would fit better within the existing framework

would be something like

WITH RECURSIVE t AS (
SELECT base case
REPLACE ALL -- instead of UNION ALL
SELECT recursive case
)

I was originally thinking more along the lines of Fabien's approach, but
this is similarly interesting.

Obviously I'm very concerned about doing something that the SQL Standard
will clobber somewhere down the road. Having said that, the recursive
syntax always struck me as awkward even by SQL standards.

Perhaps something like this would be more readable

WITH t AS (
UPDATE ( SELECT 1 AS ctr, 'x' as val )
SET ctr = ctr + 1, val = val || 'x'
WHILE ctr <= 100
RETURNING ctr, val
)

The notion of an UPDATE on an ephemeral subquery isn't that special, see
"subquery2" in
https://docs.oracle.com/cd/B19306_01/appdev.102/b14261/update_statement.htm ,
so the only syntax here without precedence is dropping a WHILE into an
UPDATE statement.

#19

Fabien COELHO

coelho@cri.ensmp.fr

over 5 years ago

In reply to: Corey Huinker (#18)

Re: Proposing WITH ITERATIVE

Hello Corey, Hello Peter,

My 0.02 ï¿œ about the alternative syntaxes:

Peter:

I think a syntax that would fit better within the existing framework
would be something like

WITH RECURSIVE t AS (
SELECT base case
REPLACE ALL -- instead of UNION ALL
SELECT recursive case
)

A good point about this approach is that the replacement semantics is
clear, whereas using ITERATIVE with UNION is very misleading, as it is
*not* a union at all.

This said I'm wondering how the parser would react.

Moreover, having a different syntax for normal queries and inside WITH
query looks very undesirable from a language design point of view. This
suggests that the user should be able to write it anywhere:

SELECT 1 REPLACE SELECT 2;

Well, maybe.

I'm unclear whether "REPLACE ALL" vs "REPLACE" makes sense, ISTM that
there could be only one replacement semantics (delete the content and
insert a new one)?

REPLACE should have an associativity defined wrt other operators:

SELECT 1 UNION SELECT 2 REPLACE SELECT 3; -- how many rows?

I do not see anything obvious. Probably 2 rows.

Corey:

Obviously I'm very concerned about doing something that the SQL Standard
will clobber somewhere down the road. Having said that, the recursive
syntax always struck me as awkward even by SQL standards.

Indeed!

Perhaps something like this would be more readable

WITH t AS (
UPDATE ( SELECT 1 AS ctr, 'x' as val )
SET ctr = ctr + 1, val = val || 'x'
WHILE ctr <= 100
RETURNING ctr, val
)

The notion of an UPDATE on an ephemeral subquery isn't that special, see
"subquery2" in
https://docs.oracle.com/cd/B19306_01/appdev.102/b14261/update_statement.htm ,

I must admit that I do not like much needing another level of subquery,
but maybe it could just be another named query in the WITH statement.

ISTM that UPDATE is quite restrictive as the number of rows cannot
change, which does not seem desirable at all? How could I add or remove
rows from one iteration to the next?

ISTM that the WHILE would be checked before updating, so that WHILE FALSE
does nothing, in which case its position after SET is odd.

Having both WHERE and WHILE might look awkward.

Also it looks much more procedural this way, which is the point, but also
depart from the declarative SELECT approach of WITH RECURSIVE.

--
Fabien.

#20

Corey Huinker

corey.huinker@gmail.com

over 5 years ago

In reply to: Fabien COELHO (#19)

Re: Proposing WITH ITERATIVE

Perhaps something like this would be more readable

WITH t AS (
UPDATE ( SELECT 1 AS ctr, 'x' as val )
SET ctr = ctr + 1, val = val || 'x'
WHILE ctr <= 100
RETURNING ctr, val
)

The notion of an UPDATE on an ephemeral subquery isn't that special, see
"subquery2" in

https://docs.oracle.com/cd/B19306_01/appdev.102/b14261/update_statement.htm
,

I must admit that I do not like much needing another level of subquery,
but maybe it could just be another named query in the WITH statement.

So like this:
WITH initial_conditions as (SELECT 1 as ctr, 'x' as val)
UPDATE initial_conditions
SET ctr = ctr + 1, val = val || 'x'
WHILE ctr <= 100
RETURNING ctr, val

ISTM that UPDATE is quite restrictive as the number of rows cannot
change, which does not seem desirable at all? How could I add or remove
rows from one iteration to the next?

My understanding was that maintaining a fixed number of rows was a desired
feature.

ISTM that the WHILE would be checked before updating, so that WHILE FALSE
does nothing, in which case its position after SET is odd.

True, but having the SELECT before the FROM is equally odd.

Having both WHERE and WHILE might look awkward.

Maybe an UNTIL instead of WHILE?

Also it looks much more procedural this way, which is the point, but also
depart from the declarative SELECT approach of WITH RECURSIVE.

Yeah, just throwing it out as a possibility. Looking again at what I
suggested, it looks a bit like the Oracle "CONNECT BY level <= x" idiom.

I suspect that the SQL standards body already has some preliminary work
done, and we should ultimately follow that.

#21

Jonah H. Harris

jonah.harris@gmail.com

over 5 years ago

In reply to: Corey Huinker (#20)

Re: Proposing WITH ITERATIVE

On Wed, Apr 29, 2020 at 4:50 PM Corey Huinker <corey.huinker@gmail.com>
wrote:

Having both WHERE and WHILE might look awkward.

Maybe an UNTIL instead of WHILE?

While I'm not a huge fan of it, one of the other databases implementing
this functionality does so using the syntax:

WITH ITERATIVE R AS '(' R0 ITERATE Ri UNTIL N (ITERATIONS | UPDATES) ')' Qf

Where N in ITERATIONS represents termination at an explicit count and, in
UPDATES, represents termination after Ri updates more than n rows on table
R.

--
Jonah H. Harris

#22

Jonah H. Harris

jonah.harris@gmail.com

over 5 years ago

In reply to: Jonah H. Harris (#21)

Re: Proposing WITH ITERATIVE

On Wed, Apr 29, 2020 at 5:54 PM Jonah H. Harris <jonah.harris@gmail.com>
wrote:

On Wed, Apr 29, 2020 at 4:50 PM Corey Huinker <corey.huinker@gmail.com>
wrote:

Having both WHERE and WHILE might look awkward.

Maybe an UNTIL instead of WHILE?

While I'm not a huge fan of it, one of the other databases implementing
this functionality does so using the syntax:

WITH ITERATIVE R AS '(' R0 ITERATE Ri UNTIL N (ITERATIONS | UPDATES) ')' Qf

Where N in ITERATIONS represents termination at an explicit count and, in
UPDATES, represents termination after Ri updates more than n rows on table
R.

Sent too soon :) One of the main reasons I dislike the above is that it
assumes N is known. In some cases, however, you really need termination
upon a condition.

--
Jonah H. Harris

#23

Fabien COELHO

coelho@cri.ensmp.fr

over 5 years ago

In reply to: Jonah H. Harris (#22)

Re: Proposing WITH ITERATIVE

Hello,

more random thoughts about syntax, semantics, and keeping it relational.

While I'm not a huge fan of it, one of the other databases implementing
this functionality does so using the syntax:

WITH ITERATIVE R AS '(' R0 ITERATE Ri UNTIL N (ITERATIONS | UPDATES) ')' Qf

Where N in ITERATIONS represents termination at an explicit count and, in
UPDATES, represents termination after Ri updates more than n rows on table
R.

One of the main reasons I dislike the above is that it assumes N is
known. In some cases, however, you really need termination upon a
condition.

Yes, definitely, a (boolean?) condition is really needed, but possibly
above N could be an expression, maybe with some separator before the
query.

ISTM that using SELECT iterations is relational and close to the currently
existing RECURSIVE. Separating the initialization and iterations with
ITERATE is kind of the same approach than Peter's REPLACE, somehow, i.e. a
new marker.

The above approach bothers me because it changes the query syntax a lot.
The inside-WITH syntax should be the same as the normal query syntax.

First try. If we go to new markers, maybe the following, which kind of
reuse Corey explicit condition, but replacing UPDATE with SELECT which
makes it more generic:

WITH R AS (
ITERATE [STARTING] FROM R0
WHILE/UNTIL condition REPEAT Ri
);

Ok, it is quite procedural. It is really just a reordering of the syntax
shown above, with a boolean condition thrown in and a heavy on (key)words
SQL-like look and feel. It seems to make sense on a simple example:

-- 1 by 1 count
WITH counter(n) (
ITERATE STARTING FROM
SELECT 1
WHILE n < 10 REPEAT
SELECT n+1 FROM counter
);

However I'm very unclear about the WHILE stuff, it makes some sense here
because there is just one row, but what if there are severals?

-- 2 by 2 count
WITH counter(n) (
ITERATE [STARTING FROM? OVER? nothing?]
SELECT 1 UNION SELECT 2 -- cannot be empty? why not?
WHILE n < 10 REPEAT
-- which n it is just above?
-- shoult it add a ANY/ALL semantics?
-- should it really be a sub-query returning a boolean?
-- eg: WHILE TRUE = ANY/ALL (SELECT n < 10 FROM counter)
-- which I find pretty ugly.
-- what else could it be?
SELECT n+2 FROM counter
);

Also, the overall syntax does not make much sense outside a WITH because
one cannot reference the initial query which has no name.

Hmmm. Not very convincing:-) Let us try again.

Basically iterating is a 3 select construct: one for initializing, one for
iterating, one for the stopping condition, with naming issues, the last
point being exactly what WITH should solve.

by restricting the syntax to normal existing selects and moving things
out:

WITH stuff(n) AS
ITERATE OVER/FROM/STARTING FROM '(' initial-sub-query ')' -- or a table?
WHILE/UNTIL '(' condition-sub-query ')'
-- what is TRUE/FALSE? non empty? other?
-- WHILE/UNTIL [NOT] EXISTS '(' query ')' ??
REPEAT/DO/LOOP/... '(' sub-query-over-stuff ')'
);

At least the 3 sub-queries are just standard queries, only the wrapping
around (ITERATE ... WHILE/UNTIL ... REPEAT ...) is WITH specific, which is
somehow better than having new separators in the query syntax itself. It
is pretty relational inside, and procedural on the outside, the two levels
are not mixed, which is the real win from my point of view.

ISTM that the key take away from the above discussion is to keep the
overhead syntax in WITH, it should not be moved inside the query in any
way, like adding REPLACE or WHILE or whatever there. This way there is
minimal interference with future query syntax extensions, there is only a
specific WITH-level 3-query construct with pretty explicit markers.

--
Fabien.

#24

Jeff Davis

pgsql@j-davis.com

over 5 years ago

In reply to: Jonah H. Harris (#12)

Re: Proposing WITH ITERATIVE

On Tue, 2020-04-28 at 11:57 -0400, Jonah H. Harris wrote:

Yeah, in that specific case, one of the other implementations seems
to carry the counters along in the executor itself. But, as not all
uses of this functionality are iteration-count-based, I think that's
a little limiting. Using a terminator expression (of some kind) seems
most adaptable, I think. I'll give some examples of both types of
cases.

In my experience, graph algorithms or other queries doing more
specialized analysis tend to get pretty complex with lots of special
cases. Users may want to express these algorithms in a more familiar
language (R, Python, etc.), and to version the code (probably in an
extension).

Have you considered taking this to the extreme and implementing
something like User-Defined Table Operators[1]http://www.vldb.org/conf/1999/P47.pdf? Or is there a
motivation for writing such algorithms inline in SQL?

Regards,
Jeff Davis

[1]: http://www.vldb.org/conf/1999/P47.pdf