UPDATE-FROM and INNER-JOIN

Started by Dominique Devienneover 1 year ago7 messagesgeneral
Jump to latest
#1Dominique Devienne
ddevienne@gmail.com

In https://sqlite.org/forum/forumpost/df23d80682
Richard Hipp (Mr SQLite) shows an example of something
that used to be supported by SQLite, but then wasn't, to be
compatible with PostgreSQL.

Thus I'm curious as to why PostgreSQL refuses the first formulation.
Could anyone provide any insights? Thanks, --DD

#2David G. Johnston
david.g.johnston@gmail.com
In reply to: Dominique Devienne (#1)
Re: UPDATE-FROM and INNER-JOIN

On Monday, August 5, 2024, Dominique Devienne <ddevienne@gmail.com> wrote:

In https://sqlite.org/forum/forumpost/df23d80682
Richard Hipp (Mr SQLite) shows an example of something
that used to be supported by SQLite, but then wasn't, to be
compatible with PostgreSQL.

Thus I'm curious as to why PostgreSQL refuses the first formulation.
Could anyone provide any insights? Thanks, --DD

Interesting…but not too surprising. The joining condition between the
update relation and the from relation needs to be done in the where
clause. You cannot reference columns of the update relation in the from
clause because the update relation is not named in the from clause.

There is still an underlying “why” here that I don’t know…

David J.

#3Tom Lane
tgl@sss.pgh.pa.us
In reply to: Dominique Devienne (#1)
Re: UPDATE-FROM and INNER-JOIN

Dominique Devienne <ddevienne@gmail.com> writes:

In https://sqlite.org/forum/forumpost/df23d80682
Richard Hipp (Mr SQLite) shows an example of something
that used to be supported by SQLite, but then wasn't, to be
compatible with PostgreSQL.

For the archives' sake:

CREATE TABLE t1(aa INT, bb INT);
CREATE TABLE t2(mm INT, nn INT);
CREATE TABLE t3(xx INT, yy INT);
UPDATE t1 SET bb = mm+xx FROM t2 INNER JOIN t3 ON nn=xx AND mm=aa;

yields

ERROR: column "aa" does not exist
LINE 1: ... t1 SET bb = mm+xx FROM t2 INNER JOIN t3 ON nn=xx AND mm=aa;
^
DETAIL: There is a column named "aa" in table "t1", but it cannot be referenced from this part of the query.

Thus I'm curious as to why PostgreSQL refuses the first formulation.
Could anyone provide any insights? Thanks, --DD

This seems correct to me. The scope of the ON clause is just
the relations within the INNER JOIN, which does not include t1.
You would get the same from

SELECT * FROM t1, t2 INNER JOIN t3 ON nn=xx AND mm=aa;
ERROR: column "aa" does not exist
LINE 1: SELECT * FROM t1, t2 INNER JOIN t3 ON nn=xx AND mm=aa;
^

because again t1 is not part of the JOIN sub-clause. (MySQL used
to get this wrong, many years ago, and it seems that has taught
a lot of people some strange ideas about syntactic precedence
within FROM clauses. Postgres' behavior agrees with the SQL
spec here.)

regards, tom lane

#4Dominique Devienne
ddevienne@gmail.com
In reply to: Tom Lane (#3)
Re: UPDATE-FROM and INNER-JOIN

On Mon, Aug 5, 2024 at 3:56 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:

Dominique Devienne <ddevienne@gmail.com> writes:

In https://sqlite.org/forum/forumpost/df23d80682
Richard Hipp (Mr SQLite) shows an example of something
that used to be supported by SQLite, but then wasn't, to be
compatible with PostgreSQL.

This seems correct to me. The scope of the ON clause is just
the relations within the INNER JOIN, which does not include t1.

The [SQLite doc][1]https://www.sqlite.org/lang_update.html#update_from does mention:

1) "With UPDATE-FROM you can join the target table against other tables"
2) "The target table is not included in the FROM clause, unless the
intent is to do a self-join"

which one can easily read as the update-target-table being implicitly
part of the join,
and thus OK to JOIN-ON against. Yes it is the SQLite doc, and not PostgreSQL's,
but naively I tend to agree with the OP (on the SQLite Forum) that it
"ought" to work.
In both SQLite (as it used to), and in PostgreSQL.

I'd rather SQLite and PostgreSQL continue to agree on this,
but not in a restrictive way. Which would imply PostgreSQL accepting
it (a tall order...).
Or perhaps SQLite should allow it back. And PostgreSQL catch up eventually?

The reason I find the restriction damaging is that `FROM t1, t2 WHERE
t1.c1 = t2.c2`
is the "old" way to write joins, versus the "newer" `FROM t1 JOIN t2
ON t1.c1 = t2.c2`
which IMHO better separates "filtering" from "joining" columns. FWIW.

Thanks, --DD

[1]: https://www.sqlite.org/lang_update.html#update_from

#5David G. Johnston
david.g.johnston@gmail.com
In reply to: Dominique Devienne (#4)
Re: UPDATE-FROM and INNER-JOIN

On Mon, Aug 5, 2024 at 7:36 AM Dominique Devienne <ddevienne@gmail.com>
wrote:

I'd rather SQLite and PostgreSQL continue to agree on this,
but not in a restrictive way.

I.e., you want to support the SQL Server syntax; allow the table named in
UPDATE to be repeated, without an alias, in which case it is taken to
represent the table being updated. And then allow the usual FROM clause to
take form.

Personally I get around this by simply doing:

UPDATE tbl
FROM (...) AS to_update
WHERE tbl.id = to_update.id

A bit more verbose in the typical case but the subquery in FROM can be
separately executed during development then just plugged in. There is no
noise in the outer where clause since its only purpose is to join the
subquery to the table to be updated. The subquery has the full separation
of filters from joins that one would like to have.

David J.

#6Tom Lane
tgl@sss.pgh.pa.us
In reply to: Dominique Devienne (#4)
Re: UPDATE-FROM and INNER-JOIN

Dominique Devienne <ddevienne@gmail.com> writes:

The reason I find the restriction damaging is that `FROM t1, t2 WHERE
t1.c1 = t2.c2`
is the "old" way to write joins, versus the "newer" `FROM t1 JOIN t2
ON t1.c1 = t2.c2`
which IMHO better separates "filtering" from "joining" columns. FWIW.

But the whole point of that syntax is to be explicit about which
tables the ON clause(s) can draw from. If we had a more complex
FROM clause, with say three or four JOINs involved, which part of
that would you argue the UPDATE target table should be implicitly
inserted into? The only thing that would be non-ambiguous would
be to require the target table to be explicitly named in FROM
(and not treat that as a self-join, but as the sole scan of the
target table). Some other RDBMSes do it like that, but it seems
like too much of a compatibility break for us.

Sadly, the SQL spec doesn't provide for a FROM clause in UPDATE,
so they offer no guidance. But I doubt we are going to change
this unless the standard defines it and does so in a way that
doesn't match what we're doing.

regards, tom lane

#7Dominique Devienne
ddevienne@gmail.com
In reply to: Tom Lane (#6)
Re: UPDATE-FROM and INNER-JOIN

On Mon, Aug 5, 2024 at 5:01 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:

Dominique Devienne <ddevienne@gmail.com> writes:

The reason I find the restriction damaging is that `FROM t1, t2 WHERE
t1.c1 = t2.c2`
is the "old" way to write joins, versus the "newer" `FROM t1 JOIN t2
ON t1.c1 = t2.c2`
which IMHO better separates "filtering" from "joining" columns. FWIW.

But the whole point of that syntax is to be explicit about which
tables the ON clause(s) can draw from. If we had a more complex
FROM clause, with say three or four JOINs involved, which part of
that would you argue the UPDATE target table should be implicitly
inserted into?

Wherever an update-target-column was referenced in an ON clause.
Like SQLite used to support. I.e. possibly multiple times even, I guess.
Yes that does imply the update-target table in not explicitly named
in the FROM clause, specifically in the UPDATE-FROM case.

Personally I don't find that "offensive", it's explicitly part of an UPDATE.

The only thing that would be non-ambiguous would
be to require the target table to be explicitly named in FROM
(and not treat that as a self-join, but as the sole scan of the
target table). Some other RDBMSes do it like that, but it seems
like too much of a compatibility break for us.

The (old for now) SQLite way would be lifting a restriction,
so that wouldn't be a backward incompatible change IMHO

Sadly, the SQL spec doesn't provide for a FROM clause in UPDATE,
so they offer no guidance.

And that's precisely why SQLite and PostgreSQL agreeing on a precedent
would be nice.

But I doubt we are going to change
this unless the standard defines it and does so in a way that
doesn't match what we're doing.

OK. Fair enough. I'm just expressing a personal opinion above.
Which the SQLite Forum OP also supports I'd guess. FWIW.

Thanks for your input Tom. --DD