DELETE syntax on JOINS

Started by Jean-Michel Pouréover 16 years ago21 messages

Dear Friends,

First, thank you very much for considering a fix on the GROUP BY issue.
I am starting a new thread about another issue:

It seems that DELETE cannot understand INNER JOINS and needs HAVING.

Read:
http://drupal.org/node/555562 (main message)
http://drupal.org/node/555648

I don't see why PostgreSQL would not be able to run queries like:

DELETE h
FROM history AS h
INNER JOIN term_node AS tn ON (h.nid = tn.nid)
INNER JOIN term_data AS td ON (td.tid = tn.tid)
WHERE h.uid = 2067 AND td.vid = 2

Ultimately, why not allow:

DELETE h, tn
FROM history AS h
INNER JOIN term_node AS tn ON (h.nid = tn.nid)
INNER JOIN term_data AS td ON (td.tid = tn.tid)
WHERE h.uid = 2067 AND td.vid = 2

IMHO this would improve compliance towards other database systems. To me
this seems to be in the reasonable scope of compatibility.

Kind regards,
Jean-Michel

#2Bruce Momjian
bruce@momjian.us
In reply to: Jean-Michel Pouré (#1)
Re: DELETE syntax on JOINS

Jean-Michel Pour��� wrote:
-- Start of PGP signed section.

Dear Friends,

First, thank you very much for considering a fix on the GROUP BY issue.
I am starting a new thread about another issue:

It seems that DELETE cannot understand INNER JOINS and needs HAVING.

Read:
http://drupal.org/node/555562 (main message)
http://drupal.org/node/555648

I don't see why PostgreSQL would not be able to run queries like:

DELETE h
FROM history AS h
INNER JOIN term_node AS tn ON (h.nid = tn.nid)
INNER JOIN term_data AS td ON (td.tid = tn.tid)
WHERE h.uid = 2067 AND td.vid = 2

Ultimately, why not allow:

DELETE h, tn
FROM history AS h
INNER JOIN term_node AS tn ON (h.nid = tn.nid)
INNER JOIN term_data AS td ON (td.tid = tn.tid)
WHERE h.uid = 2067 AND td.vid = 2

IMHO this would improve compliance towards other database systems. To me
this seems to be in the reasonable scope of compatibility.

Which "other database systems"? Only MySQL? If it is MySQL-only, we
are unlikely to add it.

--
Bruce Momjian <bruce@momjian.us> http://momjian.us
EnterpriseDB http://enterprisedb.com

+ If your life is a hard drive, Christ can be your backup. +

#3Alvaro Herrera
alvherre@commandprompt.com
In reply to: Bruce Momjian (#2)
Re: DELETE syntax on JOINS

Bruce Momjian wrote:

Ultimately, why not allow:

DELETE h, tn
FROM history AS h
INNER JOIN term_node AS tn ON (h.nid = tn.nid)
INNER JOIN term_data AS td ON (td.tid = tn.tid)
WHERE h.uid = 2067 AND td.vid = 2

IMHO this would improve compliance towards other database systems. To me
this seems to be in the reasonable scope of compatibility.

Which "other database systems"? Only MySQL? If it is MySQL-only, we
are unlikely to add it.

The SQL standard does not support this syntax. They would have you put
the joins in a subselect (which is often not enough because then you
can't use outer joins).

--
Alvaro Herrera http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.

#4Bruce Momjian
bruce@momjian.us
In reply to: Alvaro Herrera (#3)
Re: DELETE syntax on JOINS

Alvaro Herrera wrote:

Bruce Momjian wrote:

Ultimately, why not allow:

DELETE h, tn
FROM history AS h
INNER JOIN term_node AS tn ON (h.nid = tn.nid)
INNER JOIN term_data AS td ON (td.tid = tn.tid)
WHERE h.uid = 2067 AND td.vid = 2

IMHO this would improve compliance towards other database systems. To me
this seems to be in the reasonable scope of compatibility.

Which "other database systems"? Only MySQL? If it is MySQL-only, we
are unlikely to add it.

The SQL standard does not support this syntax. They would have you put
the joins in a subselect (which is often not enough because then you
can't use outer joins).

So the problem is that our DELETE ... USING does not allow ANSI join
syntax? Can that be added?

--
Bruce Momjian <bruce@momjian.us> http://momjian.us
EnterpriseDB http://enterprisedb.com

+ If your life is a hard drive, Christ can be your backup. +

#5Alvaro Herrera
alvherre@commandprompt.com
In reply to: Bruce Momjian (#4)
Re: DELETE syntax on JOINS

Bruce Momjian wrote:

Alvaro Herrera wrote:

Bruce Momjian wrote:

Ultimately, why not allow:

DELETE h, tn
FROM history AS h
INNER JOIN term_node AS tn ON (h.nid = tn.nid)
INNER JOIN term_data AS td ON (td.tid = tn.tid)
WHERE h.uid = 2067 AND td.vid = 2

IMHO this would improve compliance towards other database systems. To me
this seems to be in the reasonable scope of compatibility.

Which "other database systems"? Only MySQL? If it is MySQL-only, we
are unlikely to add it.

The SQL standard does not support this syntax. They would have you put
the joins in a subselect (which is often not enough because then you
can't use outer joins).

So the problem is that our DELETE ... USING does not allow ANSI join
syntax? Can that be added?

Not sure about that. USING is already an extension to the standard, so
if we extend it a bit more, it can't be a problem, can it? But this
doesn't solve Jean Michel's problem, because MySQL does not support
DELETE USING (or does it?).

--
Alvaro Herrera http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

#6Bruce Momjian
bruce@momjian.us
In reply to: Alvaro Herrera (#5)
Re: DELETE syntax on JOINS

Alvaro Herrera wrote:

Bruce Momjian wrote:

Alvaro Herrera wrote:

Bruce Momjian wrote:

Ultimately, why not allow:

DELETE h, tn
FROM history AS h
INNER JOIN term_node AS tn ON (h.nid = tn.nid)
INNER JOIN term_data AS td ON (td.tid = tn.tid)
WHERE h.uid = 2067 AND td.vid = 2

IMHO this would improve compliance towards other database systems. To me
this seems to be in the reasonable scope of compatibility.

Which "other database systems"? Only MySQL? If it is MySQL-only, we
are unlikely to add it.

The SQL standard does not support this syntax. They would have you put
the joins in a subselect (which is often not enough because then you
can't use outer joins).

So the problem is that our DELETE ... USING does not allow ANSI join
syntax? Can that be added?

Not sure about that. USING is already an extension to the standard, so
if we extend it a bit more, it can't be a problem, can it? But this
doesn't solve Jean Michel's problem, because MySQL does not support
DELETE USING (or does it?).

Right, but if we support ANSI joins in the USING clause, at least we
would have a _functional_ equivalent, which we don't know because of
missing outer join support.

--
Bruce Momjian <bruce@momjian.us> http://momjian.us
EnterpriseDB http://enterprisedb.com

+ If your life is a hard drive, Christ can be your backup. +

#7Bill Moran
wmoran@potentialtech.com
In reply to: Bruce Momjian (#4)
Re: DELETE syntax on JOINS

In response to Bruce Momjian <bruce@momjian.us>:

Alvaro Herrera wrote:

Bruce Momjian wrote:

Ultimately, why not allow:

DELETE h, tn
FROM history AS h
INNER JOIN term_node AS tn ON (h.nid = tn.nid)
INNER JOIN term_data AS td ON (td.tid = tn.tid)
WHERE h.uid = 2067 AND td.vid = 2

IMHO this would improve compliance towards other database systems. To me
this seems to be in the reasonable scope of compatibility.

Which "other database systems"? Only MySQL? If it is MySQL-only, we
are unlikely to add it.

The SQL standard does not support this syntax. They would have you put
the joins in a subselect (which is often not enough because then you
can't use outer joins).

So the problem is that our DELETE ... USING does not allow ANSI join
syntax? Can that be added?

I suspect that the reason MySQL has this syntax is because for a long time
they didn't have proper foreign keys and referential integrity.

With proper foreign keys and ON DELETE CASCADE, why would supporting
such syntax even be necessary?

--
Bill Moran
http://www.potentialtech.com
http://people.collaborativefusion.com/~wmoran/

#8Tom Lane
tgl@sss.pgh.pa.us
In reply to: Alvaro Herrera (#5)
Re: DELETE syntax on JOINS

Alvaro Herrera <alvherre@commandprompt.com> writes:

Bruce Momjian wrote:

So the problem is that our DELETE ... USING does not allow ANSI join
syntax? Can that be added?

Not sure about that. USING is already an extension to the standard, so
if we extend it a bit more, it can't be a problem, can it?

I don't see any very good way to extend the USING syntax to allow the
target table to be outer-joined to something else. Some other systems
allow it by letting you re-specify the target in the other clause,
equivalently to

DELETE FROM target t USING t LEFT JOIN other_table ot ON ...

but we have always considered that the target is *not* to be identified
with any member of the FROM/USING clause, so it would be a serious
compatibility break to change that now.

regards, tom lane

#9Robert Haas
robertmhaas@gmail.com
In reply to: Tom Lane (#8)
Re: DELETE syntax on JOINS

2009/8/24 Tom Lane <tgl@sss.pgh.pa.us>:

Alvaro Herrera <alvherre@commandprompt.com> writes:

Bruce Momjian wrote:

So the problem is that our DELETE ... USING does not allow ANSI join
syntax?  Can that be added?

Not sure about that.  USING is already an extension to the standard, so
if we extend it a bit more, it can't be a problem, can it?

I don't see any very good way to extend the USING syntax to allow the
target table to be outer-joined to something else.  Some other systems
allow it by letting you re-specify the target in the other clause,
equivalently to

DELETE FROM target t USING t LEFT JOIN other_table ot ON ...

but we have always considered that the target is *not* to be identified
with any member of the FROM/USING clause, so it would be a serious
compatibility break to change that now.

I'm all in favor of compatibility, but if there is any way to make
this work without massive collateral damage, I am also all in favor of
that. I am forever writing queries that contain a needless self-join
to work around the impossibility of directly outer-joining against the
target.

...Robert

#10Tom Lane
tgl@sss.pgh.pa.us
In reply to: Robert Haas (#9)
Re: DELETE syntax on JOINS

Robert Haas <robertmhaas@gmail.com> writes:

2009/8/24 Tom Lane <tgl@sss.pgh.pa.us>:

... Some other systems
allow it by letting you re-specify the target in the other clause,
equivalently to

DELETE FROM target t USING t LEFT JOIN other_table ot ON ...

but we have always considered that the target is *not* to be identified
with any member of the FROM/USING clause, so it would be a serious
compatibility break to change that now.

I'm all in favor of compatibility, but if there is any way to make
this work without massive collateral damage, I am also all in favor of
that. I am forever writing queries that contain a needless self-join
to work around the impossibility of directly outer-joining against the
target.

It'd be pretty easy to do if we were willing to introduce a new reserved
word; for example

DELETE FROM target t USING SELF LEFT JOIN other_table ot ON ...

(or maybe TARGET instead of SELF, or some other word). Wouldn't do
anything for exact compatibility with MySQL or anybody else using the
respecify-the-target-table-name approach. But it would be unambiguous
and backwards-compatible. The real problem with this is that all the
good candidates for the reserved word are things people are probably
already using as aliases, so we'd have a large risk of breaking existing
queries. We could avoid that with a sufficiently ugly choice like

DELETE FROM target t USING DELETE_TARGET LEFT JOIN other_table ot ON ...

but yech ...

regards, tom lane

#11Sam Mason
sam@samason.me.uk
In reply to: Tom Lane (#10)
Re: DELETE syntax on JOINS

On Mon, Aug 24, 2009 at 01:41:28PM -0400, Tom Lane wrote:

The real problem with this is that all the
good candidates for the reserved word are things people are probably
already using as aliases, so we'd have a large risk of breaking existing
queries. We could avoid that with a sufficiently ugly choice like

DELETE FROM target t USING DELETE_TARGET LEFT JOIN other_table ot ON ...

but yech ...

PRIMARY or TABLE?

Both are pretty grim, but I think they're reserved at the moment.

--
Sam http://samason.me.uk/

#12Bruce Momjian
bruce@momjian.us
In reply to: Tom Lane (#8)
Re: DELETE syntax on JOINS

Tom Lane wrote:

Alvaro Herrera <alvherre@commandprompt.com> writes:

Bruce Momjian wrote:

So the problem is that our DELETE ... USING does not allow ANSI join
syntax? Can that be added?

Not sure about that. USING is already an extension to the standard, so
if we extend it a bit more, it can't be a problem, can it?

I don't see any very good way to extend the USING syntax to allow the
target table to be outer-joined to something else. Some other systems
allow it by letting you re-specify the target in the other clause,
equivalently to

DELETE FROM target t USING t LEFT JOIN other_table ot ON ...

but we have always considered that the target is *not* to be identified
with any member of the FROM/USING clause, so it would be a serious
compatibility break to change that now.

Let's look at this a little closer. We can use an alias in the DELETE
FROM clause:

test=> DELETE FROM test t;

test=> DELETE FROM test t USING test;

What we cannot currently do is reference test twice:

test=> DELETE FROM test USING test;
ERROR: table name "test" specified more than once

test=> DELETE FROM test t USING test t;
ERROR: table name "t" specified more than once

As far as I understand it, allowing ANSI joins in USING would simple
mean removing that error message and linking the two table aliases.

--
Bruce Momjian <bruce@momjian.us> http://momjian.us
EnterpriseDB http://enterprisedb.com

+ If your life is a hard drive, Christ can be your backup. +

#13Tom Lane
tgl@sss.pgh.pa.us
In reply to: Bruce Momjian (#12)
Re: DELETE syntax on JOINS

Bruce Momjian <bruce@momjian.us> writes:

What we cannot currently do is reference test twice:

test=> DELETE FROM test USING test;
ERROR: table name "test" specified more than once

test=> DELETE FROM test t USING test t;
ERROR: table name "t" specified more than once

Hmm, I had forgotten that we throw errors in these cases now.
Maybe that *would* give us an escape-hatch for the other interpretation.

As far as I understand it, allowing ANSI joins in USING would simple
mean removing that error message and linking the two table aliases.

Well, you'd still need to complain about

DELETE FROM test USING test JOIN test ON ...

Also, it's not nearly as easy as just removing the error check.
There's stuff in the planner (and perhaps executor) that's dependent on
the assumption that the target table isn't on the inside of an outer
join, for example. Still, getting agreement on a syntax would in itself
be a huge step forward.

regards, tom lane

#14Josh Berkus
josh@agliodbs.com
In reply to: Bruce Momjian (#12)
Re: DELETE syntax on JOINS

All,

DELETE FROM target t USING t LEFT JOIN other_table ot ON ...

but we have always considered that the target is *not* to be identified
with any member of the FROM/USING clause, so it would be a serious
compatibility break to change that now.

What I don't get is why this is such a usability issue. Subqueries in
DELETE FROM work perfectly well, and provide more flexibility than most
users know what to do with.

Personally, I'd be happy just to stop with the SQL extension we have. I
think extending USING any further is going to cause more problems than
it solves.

--
Josh Berkus
PostgreSQL Experts Inc.
www.pgexperts.com

In reply to: Bill Moran (#7)
Re: DELETE syntax on JOINS

With proper foreign keys and ON DELETE CASCADE, why would supporting
such syntax even be necessary?

Porting existing abstraction layers from ANSI JOINs to ON DELETE CASCADE
is complicated.

What I don't get is why this is such a usability issue. Subqueries in
DELETE FROM work perfectly well, and provide more flexibility than
most
users know what to do with.

The ANSI syntax allows deleting one or several tables at once.
Subqueries are not supported by MySQL on DELETE.

Again, this is a usability issue to gain market shares and happy users
againts MySQL.

Kind regards,
Jean-Michel

#16Bruce Momjian
bruce@momjian.us
In reply to: Josh Berkus (#14)
Re: DELETE syntax on JOINS

Josh Berkus wrote:

All,

DELETE FROM target t USING t LEFT JOIN other_table ot ON ...

but we have always considered that the target is *not* to be identified
with any member of the FROM/USING clause, so it would be a serious
compatibility break to change that now.

What I don't get is why this is such a usability issue. Subqueries in
DELETE FROM work perfectly well, and provide more flexibility than most
users know what to do with.

OK, so you are saying that every OUTER join can be efficiently
reprsented as a subquery? If that is true we don't need to add ANSI
join support to USING.

--
Bruce Momjian <bruce@momjian.us> http://momjian.us
EnterpriseDB http://enterprisedb.com

+ If your life is a hard drive, Christ can be your backup. +

#17Robert Haas
robertmhaas@gmail.com
In reply to: Josh Berkus (#14)
Re: DELETE syntax on JOINS

On Mon, Aug 24, 2009 at 9:31 PM, Josh Berkus<josh@agliodbs.com> wrote:

All,

DELETE FROM target t USING t LEFT JOIN other_table ot ON ...

but we have always considered that the target is *not* to be identified
with any member of the FROM/USING clause, so it would be a serious
compatibility break to change that now.

What I don't get is why this is such a usability issue.  Subqueries in
DELETE FROM work perfectly well, and provide more flexibility than most
users know what to do with.

Personally, I'd be happy just to stop with the SQL extension we have.  I
think extending USING any further is going to cause more problems than
it solves.

It's both a usability issue and a performance issue. Suppose you want
to select all the rows in foo whose id field does not appear in
bar.foo_id. The most efficient way to do this in PostgreSQL is
typically:

SELECT foo.* FROM foo LEFT JOIN bar ON foo.id = bar.foo_id WHERE
bar.foo_id IS NULL;

Now, if you want to delete those rows, you can't do it without an
extra join somewhere. You can do it like this:

DELETE FROM foo AS foo1
USING foo AS foo2 LEFT JOIN bar ON foo2.id = bar.foo_id
WHERE foo1.id = foo2.id AND foo2;

Or like this:

DELETE FROM foo WHERE id IN (SELECT foo.id FROM foo LEFT JOIN bar ON
foo.id = bar.foo_id WHERE bar.foo_id IS NULL);

...but either way you now have foo in there twice when it really
shouldn't need to be, and you're doing a useless self-join to work
around a syntax limitation.

[ thinks ]

Actually, I guess in this case you can get around it like this:

DELETE FROM foo WHERE NOT EXISTS (SELECT 1 FROM bar WHERE bar.foo_id = foo.id);

...but I'm not sure it can be rewritten that way in every case - in
particular, that won't work if you have a RETURNING clause that
includes a value taken from bar.

...Robert

#18Tom Lane
tgl@sss.pgh.pa.us
In reply to: Robert Haas (#17)
Re: DELETE syntax on JOINS

Robert Haas <robertmhaas@gmail.com> writes:

On Mon, Aug 24, 2009 at 9:31 PM, Josh Berkus<josh@agliodbs.com> wrote:

What I don't get is why this is such a usability issue. �Subqueries in
DELETE FROM work perfectly well, and provide more flexibility than most
users know what to do with.

It's both a usability issue and a performance issue.

On the usability front: if we were to take the position Josh advocates,
we should never have added FROM/USING to UPDATE/DELETE at all ... but
since we did, I think we should try to make it as flexible as the
corresponding feature in other DBMSes.

On the performance front: yeah, you can recast most joins as subqueries,
but you tend to end up with the equivalent of a nestloop plan. Works
okay for small numbers of rows, scales horribly.

regards, tom lane

#19Andrew Dunstan
andrew@dunslane.net
In reply to: Tom Lane (#18)
Re: DELETE syntax on JOINS

Tom Lane wrote:

Robert Haas <robertmhaas@gmail.com> writes:

On Mon, Aug 24, 2009 at 9:31 PM, Josh Berkus<josh@agliodbs.com> wrote:

What I don't get is why this is such a usability issue. Subqueries in
DELETE FROM work perfectly well, and provide more flexibility than most
users know what to do with.

It's both a usability issue and a performance issue.

On the usability front: if we were to take the position Josh advocates,
we should never have added FROM/USING to UPDATE/DELETE at all

FWIW, I use update ... from a lot - it make many update queries easier
and less error prone to write - but I rarely find I need delete ... using.

... but
since we did, I think we should try to make it as flexible as the
corresponding feature in other DBMSes.

+1

cheers

andrew

#20Greg Stark
gsstark@mit.edu
In reply to: Tom Lane (#18)
Re: DELETE syntax on JOINS

On Tue, Aug 25, 2009 at 2:50 PM, Tom Lane<tgl@sss.pgh.pa.us> wrote:

On the performance front: yeah, you can recast most joins as subqueries,
but you tend to end up with the equivalent of a nestloop plan.  Works
okay for small numbers of rows, scales horribly.

Well that's our problem isn't it? I thought we were capable of
genearting semijoins for subqueries these days though?

The problem I thought was if you wanted to pull multiple values out of
the subquery.

So something like

UPDATE foo set a=aa, b=bb FROM bar WHERE ...

If you wanted to do an outer join from foo to bar then how would you
write it as an subquery even if our optimizer could notice the
semijoin and optimize it properly?

You would have to write something like

UPDATE foo set a = (select aa from bar where...)
b = (select bb from bar where...)

and then the optimizer would have to notice the duplicates and
consolidate them? That seems inconvenient (and fragile).

--
greg
http://mit.edu/~gsstark/resume.pdf

#21Tom Lane
tgl@sss.pgh.pa.us
In reply to: Greg Stark (#20)
Re: DELETE syntax on JOINS

Greg Stark <gsstark@mit.edu> writes:

You would have to write something like

UPDATE foo set a = (select aa from bar where...)
b = (select bb from bar where...)

and then the optimizer would have to notice the duplicates and
consolidate them? That seems inconvenient (and fragile).

Well, that's why the spec nowadays allows you to write

UPDATE foo SET (a,b) = (select aa,bb from bar where ...)

But we haven't got that, and if we did it would generate a nestloop
plan. Getting to the point of absolute performance equivalence between
subqueries and joins would take a *lot* of work; I'm not even sure it's
possible at all. And once we'd done all that work there would still
remain the fact that people are accustomed to using join syntax instead.
There's a lot of existing code out there that would be a lot easier
to port to PG if we supported that style (which was exactly the point
made by the OP).

regards, tom lane