top-level DML under CTEs
The patch attached is based on the one rejected at the last CF for 9.0
last year.
http://archives.postgresql.org/message-id/16303.1266023203@sss.pgh.pa.us
This patch implements the feature that allows top-level DMLs under CTE
WITH clause. For example:
WITH t AS (SELECT * FROM x)
UPDATE y SET val = t.val FROM t
WHERE y.key = t.key;
This feature is part of writeable CTEs proposed by David Fetter originally.
There were two issues at the CF.
1. WITH clause atop INSERT
Although the previous discussion got the consensus that we forbid WITH
atop INSERT, it seems to me that it can be allowed. I managed to do it
by treating the top WITH clause (of INSERT) as if the one of SELECT
(or VALUES). It is possible to disallow the CTE over INSERT statement,
but the lack for INSERT, though there are for UPDATE and DELETE,
sounds inconsistent enough.
2. OLD/NEW in rules
Following the subsequent discussion after the post linked above, I add
code to throw an appropriate error when OLD/NEW is used in WITH
clauses. It is true that OLD/NEW references look sane to general
users, but actually (at least in our implementation) they are located
in the top-level query's Range Table List. Consequently, they are
invisible inside the WITH clause. To allow them, we should rewrite the
rule systems overall. Thus, we forbid them in WITH though we should
throw an error indicating appropriate message.
I'll add the entry to CF app later. Any feedback is welcome.
Regards,
--
Hitoshi Harada
Attachments:
toplevel-dml-cte.20100913.patchapplication/octet-stream; name=toplevel-dml-cte.20100913.patchDownload+284-40
On Mon, Sep 13, 2010 at 9:15 AM, Hitoshi Harada <umi.tanuki@gmail.com> wrote:
The patch attached is based on the one rejected at the last CF for 9.0
last year.http://archives.postgresql.org/message-id/16303.1266023203@sss.pgh.pa.us
This patch implements the feature that allows top-level DMLs under CTE
WITH clause. For example:WITH t AS (SELECT * FROM x)
UPDATE y SET val = t.val FROM t
WHERE y.key = t.key;This feature is part of writeable CTEs proposed by David Fetter originally.
Thanks for pursuing this. I think this will be a useful feature if we
can get it committed, and plus David Fetter will be very, very happy.
:-)
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company
On Mon, Sep 13, 2010 at 9:20 AM, Robert Haas <robertmhaas@gmail.com> wrote:
On Mon, Sep 13, 2010 at 9:15 AM, Hitoshi Harada <umi.tanuki@gmail.com> wrote:
The patch attached is based on the one rejected at the last CF for 9.0
last year.http://archives.postgresql.org/message-id/16303.1266023203@sss.pgh.pa.us
This patch implements the feature that allows top-level DMLs under CTE
WITH clause. For example:WITH t AS (SELECT * FROM x)
UPDATE y SET val = t.val FROM t
WHERE y.key = t.key;This feature is part of writeable CTEs proposed by David Fetter originally.
Thanks for pursuing this. I think this will be a useful feature if we
can get it committed, and plus David Fetter will be very, very happy.
:-)
Just to be clear, the attached patch is missing the part of the wCTE
that allows queries of the form:
WITH foo AS (DELETE * FROM bar RETURNING *) <any query using foo>
IOW, your CTE query has to be a select. This is still highly useful
however. The patch itself looks very clean after a quick glance
(which is all I can offer ATM unfortunately).
merlin
On Mon, Sep 13, 2010 at 2:43 PM, Merlin Moncure <mmoncure@gmail.com> wrote:
Just to be clear, the attached patch is missing the part of the wCTE
that allows queries of the form:
WITH foo AS (DELETE * FROM bar RETURNING *) <any query using foo>
Understood.
IOW, your CTE query has to be a select. This is still highly useful
however.
Agreed.
The patch itself looks very clean after a quick glance
(which is all I can offer ATM unfortunately).
Yeah, I haven't had a chance to look at it either, just wanted to send props.
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company
2010/9/14 Merlin Moncure <mmoncure@gmail.com>:
On Mon, Sep 13, 2010 at 9:20 AM, Robert Haas <robertmhaas@gmail.com> wrote:
On Mon, Sep 13, 2010 at 9:15 AM, Hitoshi Harada <umi.tanuki@gmail.com> wrote:
The patch attached is based on the one rejected at the last CF for 9.0
last year.http://archives.postgresql.org/message-id/16303.1266023203@sss.pgh.pa.us
This patch implements the feature that allows top-level DMLs under CTE
WITH clause. For example:WITH t AS (SELECT * FROM x)
UPDATE y SET val = t.val FROM t
WHERE y.key = t.key;This feature is part of writeable CTEs proposed by David Fetter originally.
Thanks for pursuing this. I think this will be a useful feature if we
can get it committed, and plus David Fetter will be very, very happy.
:-)Just to be clear, the attached patch is missing the part of the wCTE
Yes, the main part of wCTE that allows DMLs in WITH is still under
Marko Tikkaja. To work parallel, we split the tasks into pieces.
Regards,
--
Hitoshi Harada
On 2010-09-13 4:15 PM +0300, Hitoshi Harada wrote:
1. WITH clause atop INSERT
Although the previous discussion got the consensus that we forbid WITH
atop INSERT, it seems to me that it can be allowed. I managed to do it
by treating the top WITH clause (of INSERT) as if the one of SELECT
(or VALUES).
In the email you referred to, Tom was concerned about the case where
these WITH lists have different RECURSIVE declarations. This patch
makes both RECURSIVE if either of them is. I can think of cases where
that might lead to surprising behaviour, but the chances of any of those
happening in real life seem pretty slim.
It is possible to disallow the CTE over INSERT statement,
but the lack for INSERT, though there are for UPDATE and DELETE,
sounds inconsistent enough.
Also because wCTEs are not allowed below the top level, not being able
to use INSERT as the top level statement would force people to wrap that
INSERT in another CTE.
Regards,
Marko Tiikkaja
2010/9/15 Marko Tiikkaja <marko.tiikkaja@cs.helsinki.fi>:
On 2010-09-13 4:15 PM +0300, Hitoshi Harada wrote:
1. WITH clause atop INSERT
Although the previous discussion got the consensus that we forbid WITH
atop INSERT, it seems to me that it can be allowed. I managed to do it
by treating the top WITH clause (of INSERT) as if the one of SELECT
(or VALUES).In the email you referred to, Tom was concerned about the case where these
WITH lists have different RECURSIVE declarations. This patch makes both
RECURSIVE if either of them is. I can think of cases where that might lead
to surprising behaviour, but the chances of any of those happening in real
life seem pretty slim.
I might not understand the RECURSIVE issue correctly. I put my effort
to make such query
WITH RECURSIVE r AS (SELECT 1 i UNION ALL SELECT i + 1 FROM r WHERE i
< 10) INSERT INTO WITH t AS (SELECT 0) VALUES((SELECT * FROM r LIMIT
1)),((SELECT * FROM t));
look like
INSERT INTO WITH RECURSIVE r AS (SELECT 1 i UNION ALL SELECT i + 1
FROM r WHERE i < 10), t AS (SELECT 0) VALUES((SELECT * FROM r LIMIT
1)),((SELECT * FROM t));
Does that cause surprising behavior?
but the chances of any of those happening in real
life seem pretty slim.
The OLD/NEW issue is also near impossible to be problem in the real
life, except for the misleading error message. But once users see the
non-understandable behavior, they make lines to claim as it's a "bug".
So we need to put effort to avoid it as possible, I believe.
Regards,
--
Hitoshi Harada
Hitoshi Harada <umi.tanuki@gmail.com> writes:
2010/9/15 Marko Tiikkaja <marko.tiikkaja@cs.helsinki.fi>:
In the email you referred to, Tom was concerned about the case where these
WITH lists have different RECURSIVE declarations. �This patch makes both
RECURSIVE if either of them is. �I can think of cases where that might lead
to surprising behaviour, but the chances of any of those happening in real
life seem pretty slim.
Does that cause surprising behavior?
My recollection is that whether a CTE is marked RECURSIVE or not affects
its scope of visibility, so that confusing the two cases can result in
flat-out incorrect parser behavior.
It would probably be all right to combine the cases internally, at the
rewriter or planner stage. It's not okay to do it in the parser, not
even after doing parse analysis of the individual CTEs, because then it
would be impossible for ruleutils.c to reverse-list the query correctly.
regards, tom lane
On 2010-09-14 10:51 PM, Tom Lane wrote:
Hitoshi Harada<umi.tanuki@gmail.com> writes:
2010/9/15 Marko Tiikkaja<marko.tiikkaja@cs.helsinki.fi>:
In the email you referred to, Tom was concerned about the case where these
WITH lists have different RECURSIVE declarations. This patch makes both
RECURSIVE if either of them is. I can think of cases where that might lead
to surprising behaviour, but the chances of any of those happening in real
life seem pretty slim.Does that cause surprising behavior?
My recollection is that whether a CTE is marked RECURSIVE or not affects
its scope of visibility, so that confusing the two cases can result in
flat-out incorrect parser behavior.
The worst I can think of is:
CREATE TABLE foo(a int);
WITH t AS (SELECT * FROM foo)
INSERT INTO bar
WITH RECURSIVE foo (SELECT 1 AS a)
SELECT * FROM t;
t will actually be populated with the results of the CTE, not the table foo.
I don't think this is a huge problem in real life, but if someone thinks
otherwise, I think we could just error out if the lists have a different
RECURSIVE definition.
Anyone have a worse example? Thoughts?
Regards,
Marko Tiikkaja
On Tue, Sep 14, 2010 at 4:28 PM, Marko Tiikkaja
<marko.tiikkaja@cs.helsinki.fi> wrote:
On 2010-09-14 10:51 PM, Tom Lane wrote:
Hitoshi Harada<umi.tanuki@gmail.com> writes:
2010/9/15 Marko Tiikkaja<marko.tiikkaja@cs.helsinki.fi>:
In the email you referred to, Tom was concerned about the case where
these
WITH lists have different RECURSIVE declarations. This patch makes both
RECURSIVE if either of them is. I can think of cases where that might
lead
to surprising behaviour, but the chances of any of those happening in
real
life seem pretty slim.Does that cause surprising behavior?
My recollection is that whether a CTE is marked RECURSIVE or not affects
its scope of visibility, so that confusing the two cases can result in
flat-out incorrect parser behavior.The worst I can think of is:
CREATE TABLE foo(a int);
WITH t AS (SELECT * FROM foo)
INSERT INTO bar
WITH RECURSIVE foo (SELECT 1 AS a)
SELECT * FROM t;t will actually be populated with the results of the CTE, not the table foo.
Unless I'm confused, that seems pretty clearly wrong.
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company
Marko Tiikkaja <marko.tiikkaja@cs.helsinki.fi> writes:
On 2010-09-14 10:51 PM, Tom Lane wrote:
My recollection is that whether a CTE is marked RECURSIVE or not affects
its scope of visibility, so that confusing the two cases can result in
flat-out incorrect parser behavior.
The worst I can think of is:
CREATE TABLE foo(a int);
WITH t AS (SELECT * FROM foo)
INSERT INTO bar
WITH RECURSIVE foo (SELECT 1 AS a)
SELECT * FROM t;
t will actually be populated with the results of the CTE, not the table foo.
I don't think this is a huge problem in real life, but if someone thinks
otherwise, I think we could just error out if the lists have a different
RECURSIVE definition.
Wrong is wrong. Doesn't matter whether it's "a huge problem in real life".
Why is it so difficult to do this correctly?
regards, tom lane
2010/9/15 Marko Tiikkaja <marko.tiikkaja@cs.helsinki.fi>:
On 2010-09-14 10:51 PM, Tom Lane wrote:
My recollection is that whether a CTE is marked RECURSIVE or not affects
its scope of visibility, so that confusing the two cases can result in
flat-out incorrect parser behavior.The worst I can think of is:
CREATE TABLE foo(a int);
WITH t AS (SELECT * FROM foo)
INSERT INTO bar
WITH RECURSIVE foo (SELECT 1 AS a)
SELECT * FROM t;t will actually be populated with the results of the CTE, not the table foo.
Hmmm, that's true. But it seems unrelated to RECURSIVE option, right?
Regards,
--
Hitoshi Harada
2010/9/15 Tom Lane <tgl@sss.pgh.pa.us>:
Marko Tiikkaja <marko.tiikkaja@cs.helsinki.fi> writes:
On 2010-09-14 10:51 PM, Tom Lane wrote:
My recollection is that whether a CTE is marked RECURSIVE or not affects
its scope of visibility, so that confusing the two cases can result in
flat-out incorrect parser behavior.The worst I can think of is:
CREATE TABLE foo(a int);
WITH t AS (SELECT * FROM foo)
INSERT INTO bar
WITH RECURSIVE foo (SELECT 1 AS a)
SELECT * FROM t;t will actually be populated with the results of the CTE, not the table foo.
I don't think this is a huge problem in real life, but if someone thinks
otherwise, I think we could just error out if the lists have a different
RECURSIVE definition.Wrong is wrong. Doesn't matter whether it's "a huge problem in real life".
Why is it so difficult to do this correctly?
Because INSERT INTO ... (SELECT|VALUES) is already a collection of
kludge (as comments say). It was possible to parse the two WITHs
separately, but it results in ambiguous naming issue;
parseWithClause() asserts there's only one WITH clause in the Stmt and
detects duplicated CTE name in it. It seems possible to call
parseWithClause() twice by cheating ParseState and to try to find name
duplication outside it, though it is another kludge :-(
Now that we find the worst situation, I start to think I have to take
the kludy way anyway.
Regards,
--
Hitoshi Harada
Hitoshi Harada <umi.tanuki@gmail.com> writes:
2010/9/15 Tom Lane <tgl@sss.pgh.pa.us>:
Why is it so difficult to do this correctly?
Because INSERT INTO ... (SELECT|VALUES) is already a collection of
kludge (as comments say). It was possible to parse the two WITHs
separately, but it results in ambiguous naming issue;
parseWithClause() asserts there's only one WITH clause in the Stmt and
detects duplicated CTE name in it.
Well, I would think that the no-duplication rule applies to each WITH
list separately, not both together. If you do something like
with t1 as (select * from foo)
select * from
(with t2 as (select * from foo)
select * from t1, t2) ss;
there's no expectation that the WITH clauses can't both define the same
name.
regards, tom lane
2010/9/15 Tom Lane <tgl@sss.pgh.pa.us>:
Hitoshi Harada <umi.tanuki@gmail.com> writes:
2010/9/15 Tom Lane <tgl@sss.pgh.pa.us>:
Why is it so difficult to do this correctly?
Because INSERT INTO ... (SELECT|VALUES) is already a collection of
kludge (as comments say). It was possible to parse the two WITHs
separately, but it results in ambiguous naming issue;
parseWithClause() asserts there's only one WITH clause in the Stmt and
detects duplicated CTE name in it.Well, I would think that the no-duplication rule applies to each WITH
list separately, not both together. If you do something likewith t1 as (select * from foo)
select * from
(with t2 as (select * from foo)
select * from t1, t2) ss;
Well, I didn't know it is allowed. That would look like the way to go.
Regards,
--
Hitoshi Harada
2010/9/15 Hitoshi Harada <umi.tanuki@gmail.com>:
2010/9/15 Tom Lane <tgl@sss.pgh.pa.us>:
Well, I would think that the no-duplication rule applies to each WITH
list separately, not both together. If you do something likewith t1 as (select * from foo)
select * from
(with t2 as (select * from foo)
select * from t1, t2) ss;Well, I didn't know it is allowed. That would look like the way to go.
I made changes to the previous version, so that it avoids to resolve
CTE name duplication.
regression=# with t as (select 1 as i) insert into z with t as(select
2 as i )values ((select * from t));
INSERT 0 1
Time: 1.656 ms
regression=# table z;
f3
----
2
(1 row)
Also, the sample Marko gave is OK.
CREATE TABLE foo(a int);
WITH t AS (SELECT * FROM foo)
INSERT INTO bar
WITH RECURSIVE foo (SELECT 1 AS a)
SELECT * FROM t;
Hope this covers all the cases.
Regards,
--
Hitoshi Harada
Attachments:
toplevel-dml-cte.20100917.patchapplication/octet-stream; name=toplevel-dml-cte.20100917.patchDownload+300-56
On 2010-09-17 4:48 AM, Hitoshi Harada wrote:
2010/9/15 Hitoshi Harada<umi.tanuki@gmail.com>:
Well, I didn't know it is allowed. That would look like the way to go.
I made changes to the previous version, so that it avoids to resolve
CTE name duplication.
This patch still doesn't address the issue Tom raised here:
http://archives.postgresql.org/pgsql-hackers/2010-09/msg00753.php
For WITH .. INSERT .. WITH .. SELECT ..; this patch works OK, but not so
much for VALUES:
=# CREATE RULE barrule AS ON UPDATE TO bar DO INSTEAD
-# WITH RECURSIVE t AS (SELECT -1)
-# INSERT INTO bar
-# WITH t AS (SELECT 1)
-# VALUES((SELECT * FROM t));
CREATE RULE
=# \d bar
Table "public.bar"
Column | Type | Modifiers
--------+---------+-----------
a | integer |
Rules:
barrule AS
ON UPDATE TO bar DO INSTEAD WITH RECURSIVE t AS (
SELECT 1
), t AS (
SELECT (-1)
)
INSERT INTO bar (a) WITH RECURSIVE t AS (
SELECT 1
), t AS (
SELECT (-1)
)
VALUES (( SELECT t."?column?"
FROM t))
Regards,
Marko Tiikkaja
2010/9/23 Marko Tiikkaja <marko.tiikkaja@cs.helsinki.fi>:
On 2010-09-17 4:48 AM, Hitoshi Harada wrote:
2010/9/15 Hitoshi Harada<umi.tanuki@gmail.com>:
Well, I didn't know it is allowed. That would look like the way to go.
I made changes to the previous version, so that it avoids to resolve
CTE name duplication.This patch still doesn't address the issue Tom raised here:
http://archives.postgresql.org/pgsql-hackers/2010-09/msg00753.phpFor WITH .. INSERT .. WITH .. SELECT ..; this patch works OK, but not so
much for VALUES:=# CREATE RULE barrule AS ON UPDATE TO bar DO INSTEAD
-# WITH RECURSIVE t AS (SELECT -1)
-# INSERT INTO bar
-# WITH t AS (SELECT 1)
-# VALUES((SELECT * FROM t));
CREATE RULE=# \d bar
Table "public.bar"
Column | Type | Modifiers
--------+---------+-----------
a | integer |
Rules:
barrule AS
ON UPDATE TO bar DO INSTEAD WITH RECURSIVE t AS (
SELECT 1
), t AS (
SELECT (-1)
)
INSERT INTO bar (a) WITH RECURSIVE t AS (
SELECT 1
), t AS (
SELECT (-1)
)VALUES (( SELECT t."?column?"
FROM t))
I ran the sql and recognized what is wrong. In VALUES case, the WITH
table "t" is copied in one list and shown up in the both of
INSERT-level WITH and SELECT-level WITH. Since the transformation of
WITH clause to cheat postgres is in the parser stage currently, I
wonder if this should be done in the rewriter or the planner stage.
Thanks for the report. Next time, please point the clear problem in
English aside the sample.
Regards,
--
Hitoshi Harada
On 2010-09-23 9:12 AM +0300, Hitoshi Harada wrote:
Thanks for the report. Next time, please point the clear problem in
English aside the sample.
I apologize. The problem was exactly the one pointed out in the email I
referred to, so I assumed that further explanation was not necessary.
I will try to be more clear in the future.
Regards,
Marko Tiikkaja
On 2010-09-23 9:12 AM +0300, Hitoshi Harada wrote:
Since the transformation of
WITH clause to cheat postgres is in the parser stage currently, I
wonder if this should be done in the rewriter or the planner stage.
It's been about a week now. Should we expect a new patch soon?
Regards,
Marko Tiikkaja