doc: Clarify what "excluded" represents for INSERT ON CONFLICT

Started by David G. Johnstonover 3 years ago16 messages
#1David G. Johnston
david.g.johnston@gmail.com
1 attachment(s)

Hi,

Reposting this on its own thread.

/messages/by-id/CAKFQuwby1aMsJDMeibaBaohgoaZhivAo4WcqHC1=9-GDZ3TSng@mail.gmail.com

As one cannot place excluded in a FROM clause (subquery) in the
ON CONFLICT clause referring to it as a table, with plural rows
nonetheless, leads the reader to infer more about what the
behavior here is than is correct. We already just say use the
table's name for the existing row so just match that pattern
of using the name excluded for the proposed row.

The alias description doesn't have the same issue regarding the
use of the word table and rows, as the use there is more conceptual,
but the wording about "otherwise taken as" is wrong: rather two
labels of excluded end up in scope and you get an ambiguous name error.

The error messages still consider excluded to be a table reference
and this patch does not try to change that. That implementation
detail need not force the user-facing documentation for the feature
to use the term table when it doesn't really apply.

David J.

Attachments:

0001-doc-Clarify-that-excluded-is-really-just-a-special-n.patchapplication/octet-stream; name=0001-doc-Clarify-that-excluded-is-really-just-a-special-n.patchDownload
From 1f4f4c87550af4065b85830b35079ec930779877 Mon Sep 17 00:00:00 2001
From: "David G. Johnston" <david.g.johnston@gmail.com>
Date: Thu, 9 Jun 2022 15:15:52 +0000
Subject: [PATCH] doc: Clarify that excluded is really just a special name

---
 doc/src/sgml/ref/insert.sgml | 13 ++++++-------
 1 file changed, 6 insertions(+), 7 deletions(-)

diff --git a/doc/src/sgml/ref/insert.sgml b/doc/src/sgml/ref/insert.sgml
index a9af9959c0..0fad1d3640 100644
--- a/doc/src/sgml/ref/insert.sgml
+++ b/doc/src/sgml/ref/insert.sgml
@@ -175,9 +175,8 @@ INSERT INTO <replaceable class="parameter">table_name</replaceable> [ AS <replac
         class="parameter">table_name</replaceable>.  When an alias is
         provided, it completely hides the actual name of the table.
         This is particularly useful when <literal>ON CONFLICT DO UPDATE</literal>
-        targets a table named <varname>excluded</varname>, since that will otherwise
-        be taken as the name of the special table representing rows proposed
-        for insertion.
+        targets a table named <varname>excluded</varname>, since that will
+        conflict with the special name representing the row proposed for insertion.
        </para>
       </listitem>
      </varlistentry>
@@ -396,8 +395,8 @@ INSERT INTO <replaceable class="parameter">table_name</replaceable> [ AS <replac
         conflict.  The <literal>SET</literal> and
         <literal>WHERE</literal> clauses in <literal>ON CONFLICT DO
         UPDATE</literal> have access to the existing row using the
-        table's name (or an alias), and to rows proposed for insertion
-        using the special <varname>excluded</varname> table.
+        table's name (or an alias), and to the proposed row
+        using the name <varname>excluded</varname>.
         <literal>SELECT</literal> privilege is required on any column in the
         target table where corresponding <varname>excluded</varname>
         columns are read.
@@ -699,8 +698,8 @@ INSERT INTO employees_log SELECT *, current_timestamp FROM upd;
   <para>
    Insert or update new distributors as appropriate.  Assumes a unique
    index has been defined that constrains values appearing in the
-   <literal>did</literal> column.  Note that the special
-   <varname>excluded</varname> table is used to reference values originally
+   <literal>did</literal> column.  Note that the name
+   <varname>excluded</varname> is used to reference values originally
    proposed for insertion:
 <programlisting>
 INSERT INTO distributors (did, dname)
-- 
2.25.1

#2Robert Haas
robertmhaas@gmail.com
In reply to: David G. Johnston (#1)
Re: doc: Clarify what "excluded" represents for INSERT ON CONFLICT

On Thu, Jun 9, 2022 at 11:40 AM David G. Johnston
<david.g.johnston@gmail.com> wrote:

As one cannot place excluded in a FROM clause (subquery) in the
ON CONFLICT clause referring to it as a table, ...

Well, it would be nice if you had included a test case rather than
leaving it to the reviewer or committer to construct one. In general,
dropping subtle patches with minimal commentary isn't really very
helpful.

But I decided to dig in and see what I could figure out. I constructed
this test case first, which does work:

rhaas=# create table foo (a int primary key, b text);
CREATE TABLE
rhaas=# insert into foo values (1, 'blarg');
INSERT 0 1
rhaas=# insert into foo values (1, 'frob') on conflict (a) do update
set b = (select excluded.b || 'nitz');
INSERT 0 1
rhaas=# select * from foo;
a | b
---+----------
1 | frobnitz
(1 row)

Initially I thought that was the case you were talking about, but
after staring at your email for another 20 minutes, I figured out that
you're probably talking about something more like this, which doesn't
work:

rhaas=# insert into foo values (1, 'frob') on conflict (a) do update
set b = (select b || 'nitz' from excluded);
ERROR: relation "excluded" does not exist
LINE 1: ...ct (a) do update set b = (select b || 'nitz' from excluded);

I do find that a bit of a curious error message, because that relation
clearly DOES exist in the range table. I know that because, if I use a
wrong column name, I get a complaint about the column not existing,
not the relation not existing:

rhaas=# insert into foo values (1, 'frob') on conflict (a) do update
set b = (select excluded.bbbbbbbbb || 'nitz');
ERROR: column excluded.bbbbbbbbb does not exist
LINE 1: ...'frob') on conflict (a) do update set b = (select excluded.b...

That said, I am not convinced that changing the documentation in this
way is a good idea. It is clear that, at the level of the code,
"excluded" behaves like a pseudo-table, and the fact that it isn't
equivalent to a real table in all ways, or that it can't be referenced
at every point in the query equally, doesn't change that. I don't
think that the language you're proposing is horrible or anything --
the distinction between a special table and a special name that
behaves somewhat like a single-row table is subtle at best -- but I
think that the threshold to commit a patch like this is that the
change has to be a clear improvement, and I don't think it is.

I think it might be fruitful to consider whether some of the error
messages here could be improved or even whether some of the
non-working cases could be made to work, but I'm just not really
seeing the value of tinkering with documentation which is, in my view,
not wrong.

--
Robert Haas
EDB: http://www.enterprisedb.com

In reply to: Robert Haas (#2)
Re: doc: Clarify what "excluded" represents for INSERT ON CONFLICT

On Thu, Jun 30, 2022 at 1:43 PM Robert Haas <robertmhaas@gmail.com> wrote:

rhaas=# insert into foo values (1, 'frob') on conflict (a) do update
set b = (select b || 'nitz' from excluded);
ERROR: relation "excluded" does not exist
LINE 1: ...ct (a) do update set b = (select b || 'nitz' from excluded);

I do find that a bit of a curious error message, because that relation
clearly DOES exist in the range table.

Let's say that we supported this syntax. I see some problems with that
(you may have thought of these already). Thinking of "excluded" as a
separate table creates some very thorny questions, such as:

How would the user be expected to handle the case where there was more
than a single "row" in "excluded"? How could the implementation know
what the contents of the "excluded table" were ahead of time in the
multi-row-insert case? We'd have to know about *all* of the conflicts
for all rows proposed for insertion to do this, which seems impossible
in the general case -- because some of those conflicts won't have
happened yet, and cannot be predicted.

The "excluded" pseudo-table is conceptually similar to an from_item
alias used within an UPDATE .... FROM ... where the target table is
also the from_item table (i.e. there is a self-join). There is only
one table involved.

--
Peter Geoghegan

#4David G. Johnston
david.g.johnston@gmail.com
In reply to: Robert Haas (#2)
Re: doc: Clarify what "excluded" represents for INSERT ON CONFLICT

On Thu, Jun 30, 2022 at 1:43 PM Robert Haas <robertmhaas@gmail.com> wrote:

On Thu, Jun 9, 2022 at 11:40 AM David G. Johnston
<david.g.johnston@gmail.com> wrote:

As one cannot place excluded in a FROM clause (subquery) in the
ON CONFLICT clause referring to it as a table, ...

Well, it would be nice if you had included a test case rather than
leaving it to the reviewer or committer to construct one. In general,
dropping subtle patches with minimal commentary isn't really very
helpful.

Fair point.

But I decided to dig in and see what I could figure out. I constructed
this test case first, which does work:

rhaas=# create table foo (a int primary key, b text);
CREATE TABLE
rhaas=# insert into foo values (1, 'blarg');
INSERT 0 1
rhaas=# insert into foo values (1, 'frob') on conflict (a) do update
set b = (select excluded.b || 'nitz');
INSERT 0 1
rhaas=# select * from foo;
a | b
---+----------
1 | frobnitz
(1 row)

Initially I thought that was the case you were talking about, but
after staring at your email for another 20 minutes, I figured out that
you're probably talking about something more like this, which doesn't
work:

rhaas=# insert into foo values (1, 'frob') on conflict (a) do update
set b = (select b || 'nitz' from excluded);
ERROR: relation "excluded" does not exist
LINE 1: ...ct (a) do update set b = (select b || 'nitz' from excluded);

Right, the word "excluded" appearing immediately after the word FROM is
what I meant by:

"As one cannot place excluded in a FROM clause (subquery) in the
ON CONFLICT clause"

It is clear that, at the level of the code,

"excluded" behaves like a pseudo-table,

And people in the code are capable of understanding this without difficulty
no matter how we write it. They are not the target audience.

but I
think that the threshold to commit a patch like this is that the
change has to be a clear improvement, and I don't think it is.

I'm hoping for "more clear and accurate without making things worse"...

The fact that it does not and cannot use FROM and that it never refers to
more than a single row (which is what motivated the change in the first
place) for me make using the word table here more trouble than it is worth.

I think it might be fruitful to consider whether some of the error
messages here could be improved

Possibly...

or even whether some of the
non-working cases could be made to work,

That would, IMO, make things worse. "excluded" isn't a table in that
sense, anymore than "NEW" and "OLD" in the context of triggers.

but I'm just not really

seeing the value of tinkering with documentation which is, in my view,
not wrong.

Current:
"The SET and WHERE clauses in ON CONFLICT DO UPDATE have access to the
existing row using the table's name (or an alias), and to [rows] proposed
for insertion using the special excluded table."

The word table in that sentence is wrong and not a useful way to think of
the thing which we've named excluded. It is a single value of a composite
type having the structure of the named table.

I'll agree that most people will mentally paper over the difference and go
merrily on their way. At least one person recently did not do that, which
prompted an email to the community, which prompted a response and this
suggestion to avoid that in the future while, IMO, not making understanding
of the concept any less clear.

David J.

In reply to: David G. Johnston (#4)
Re: doc: Clarify what "excluded" represents for INSERT ON CONFLICT

On Thu, Jun 30, 2022 at 2:07 PM David G. Johnston
<david.g.johnston@gmail.com> wrote:

Current:
"The SET and WHERE clauses in ON CONFLICT DO UPDATE have access to the
existing row using the table's name (or an alias), and to [rows] proposed
for insertion using the special excluded table."

The word table in that sentence is wrong and not a useful way to think of the thing which we've named excluded. It is a single value of a composite type having the structure of the named table.

I think that your reasoning is correct, but I don't agree with your
conclusion. The term "special excluded table" is a fudge, but that
isn't necessarily a bad thing. Sure, we could add something about the
UPDATE being similar to an UPDATE with a self-join, as I said
upthread. But I think that that would make the concept harder to
grasp.

I'll agree that most people will mentally paper over the difference and go merrily on their way. At least one person recently did not do that, which prompted an email to the community

Can you provide a reference for this? Didn't see anything like that in
the reference you gave upthread.

I have a hard time imagining a user that reads the INSERT docs and
imagines that "excluded" is a relation that is visible to the query in
ways that are not limited to expression evaluation for the UPDATE's
WHERE/SET. The way that it works (and doesn't work) follows naturally
from what a user would want to do in order to upsert. MySQL's INSERT
... ON DUPLICATE KEY UPDATE feature has a magical UPSERT-only
expression instead of "excluded".

--
Peter Geoghegan

#6David G. Johnston
david.g.johnston@gmail.com
In reply to: Peter Geoghegan (#5)
Re: doc: Clarify what "excluded" represents for INSERT ON CONFLICT

On Thu, Jun 30, 2022 at 2:31 PM Peter Geoghegan <pg@bowt.ie> wrote:

On Thu, Jun 30, 2022 at 2:07 PM David G. Johnston
<david.g.johnston@gmail.com> wrote:

Current:
"The SET and WHERE clauses in ON CONFLICT DO UPDATE have access to the
existing row using the table's name (or an alias), and to [rows] proposed
for insertion using the special excluded table."

The word table in that sentence is wrong and not a useful way to think

of the thing which we've named excluded. It is a single value of a
composite type having the structure of the named table.

I think that your reasoning is correct, but I don't agree with your
conclusion. The term "special excluded table" is a fudge, but that
isn't necessarily a bad thing. Sure, we could add something about the
UPDATE being similar to an UPDATE with a self-join, as I said
upthread. But I think that that would make the concept harder to
grasp.

I don't think incorporating self-joining to be helpful; the status quo is
better than that. I believe people mostly think of "composite variable"
from the current description even if we don't use those words - or such a
concept can be explained by analogy with NEW and OLD (I think of it like a
trigger, only that SQL doesn't have variables so we cannot use that term,
hence just using "name").

I'll agree that most people will mentally paper over the difference and

go merrily on their way. At least one person recently did not do that,
which prompted an email to the community

Can you provide a reference for this? Didn't see anything like that in
the reference you gave upthread.

OK, the discussion I am recalling happened on Discord hence the lack of a
link.

On roughly 3/8 the following conversation occurred (I've trimmed out some
non-relevant comments):

OP

Hello, I have a simple question.
My table has a column 'transaction_date'
I have an insert statement with an ON CONFLICT statement
I update using the 'excluded' values, but I only want to update if the
transaction date is the same or newer.
Do I just use: "WHERE EXCLUDED.transaction_date >= transaction_date"?
so the full query is something like: INSERT INTO table VALUES (pk, yadda1,
yadda2) ON CONFLICT (pk) DO UPDATE SET (yadda1 = EXCLUDED.yadda1, yadda2 =
EXCLUDED.yadda2) WHERE EXCLUDED.transaction_date >= transaction_date;

Other Person

I mean, the ... like 3 examples imply what it contains, and it vaguely says
"and to rows proposed for insertion using the special excluded table."
but...
Still, based on the BNF, that should work as you stated it.

OP

would perhaps it try to overwrite more than one row because many rows would
meet the criteria?
It seems like it limits it to the conflict row but..

Other Person

Well, you're only conflicting on the PK, which is guaranteed to be unique.

OP

Ah, so then it is limited to that row if it is specified within the ON
CONFLICT action if I am reading correct.
[...]
If it matters to you, the only thing I got wrong apparently (in my limited
non-sufficient testing) is that to access the current value within the
table row you must use the table name. So: WHERE EXCLUDED.transaction_date

= tableName.transaction_date

ME

"have access [...] to rows proposed for insertion using the special
excluded table.". You have an update situation where two tables (the
target and "excluded") are in scope with the exact same column names (by
definition) so any column references in the value expressions need to be
prefixed with which of the two tables you want to examine. As with a
normal UPDATE, the left side of the SET clause entry must reference the
target table and so its column cannot, and must not, be table qualified.
While it speaks of "rows" this is basically a per-row thing. As each row
is tested and finds a conflict the update is executed.

Other Person

Mentioning something as critical as that offhand is a mistake IMO. It
should have its own section.
It's also not mentioned in the BNF, though it shows up in the examples. You
have to basically infer everything.

ME

The exact wording of the conflict_action description in head is: "The SET
and WHERE clauses in ON CONFLICT DO UPDATE have access to the existing row
using the table's name (or an alias), and to rows proposed for insertion
using the special excluded table." I haven't read anything here that gives
me a hint as to how that ended up misinterpreted so that I could possibly
formulate an alternative wording. And I cannot think of a more appropriate
place to locate that sentence either. The examples do cover this and the
specifics here are not something that we try to represent in BNF.
I'd probably change "and to rows proposed for insertion" to "and to the
corresponding row proposed for insertion".

OP

This does not change the original conclusion we arrived at correct? If I am
reading what you are saying right, since it only discovered the conflict
after examining the row, then by the same token it will only affect the
same row where the conflict was detected.

I have a hard time imagining a user that reads the INSERT docs and
imagines that "excluded" is a relation that is visible to the query in
ways that are not limited to expression evaluation for the UPDATE's
WHERE/SET.

Yes, and based on a single encounter I agree this doesn't seem like a
broadly encountered issue. My takeaway from that eventually led to this
proposal. The "Other Person" who is complaining about the docs is one of
the mentors on the Discord server and works for one of the corporate
contributors to the community. (I suppose Discord is considered public so
maybe this redaction is unnecessary...)

David J.

In reply to: David G. Johnston (#6)
Re: doc: Clarify what "excluded" represents for INSERT ON CONFLICT

On Thu, Jun 30, 2022 at 3:07 PM David G. Johnston
<david.g.johnston@gmail.com> wrote:

Yes, and based on a single encounter I agree this doesn't seem like a broadly encountered issue. My takeaway from that eventually led to this proposal. The "Other Person" who is complaining about the docs is one of the mentors on the Discord server and works for one of the corporate contributors to the community. (I suppose Discord is considered public so maybe this redaction is unnecessary...)

My impression from reading this transcript is that the user was
confused as to why they needed to qualify the target table name in the
ON CONFLICT DO UPDATE's WHERE clause -- they didn't have to qualify it
in the targetlist that appears in "SET ... ", so why the need to do it
in the WHERE clause? This isn't something that upsert statements need
to do all that often, just because adding additional conditions to the
WHERE clause isn't usually necessary. That much makes sense to me -- I
*can* imagine how that could cause confusion.

If that interpretation is correct, then it's not clear what it should
mean for how the INSERT documentation describes EXCLUDED. EXCLUDED is
involved here, since EXCLUDED is the thing that creates the ambiguity,
but that seems almost incidental to me. This user would probably not
have been confused if they didn't need to use a WHERE clause (very
much the common case), even when expression evaluation involving
EXCLUDED in the SET was still used (also common).

--
Peter Geoghegan

#8Robert Haas
robertmhaas@gmail.com
In reply to: Peter Geoghegan (#3)
Re: doc: Clarify what "excluded" represents for INSERT ON CONFLICT

On Thu, Jun 30, 2022 at 5:05 PM Peter Geoghegan <pg@bowt.ie> wrote:

On Thu, Jun 30, 2022 at 1:43 PM Robert Haas <robertmhaas@gmail.com> wrote:

rhaas=# insert into foo values (1, 'frob') on conflict (a) do update
set b = (select b || 'nitz' from excluded);
ERROR: relation "excluded" does not exist
LINE 1: ...ct (a) do update set b = (select b || 'nitz' from excluded);

I do find that a bit of a curious error message, because that relation
clearly DOES exist in the range table.

Let's say that we supported this syntax. I see some problems with that
(you may have thought of these already). Thinking of "excluded" as a
separate table creates some very thorny questions, such as:

How would the user be expected to handle the case where there was more
than a single "row" in "excluded"? How could the implementation know
what the contents of the "excluded table" were ahead of time in the
multi-row-insert case? We'd have to know about *all* of the conflicts
for all rows proposed for insertion to do this, which seems impossible
in the general case -- because some of those conflicts won't have
happened yet, and cannot be predicted.

I was assuming it would just behave like a 1-row table i.e. these
would do the same thing:

insert into foo values (1, 'frob') on conflict (a) do update set b =
(select excluded.b || 'nitz');
insert into foo values (1, 'frob') on conflict (a) do update set b =
(select b || 'nitz' from excluded);

I'm actually kinda unsure why that doesn't already work. There may
well be a very good reason, but my naive thought would be that if
excluded doesn't have a range table entry, the first one would fail
because excluded can't be looked up in the range table, and if it does
have a range-table entry, then the second one would work because the
from-clause reference would find it just like the qualified column
reference did.

--
Robert Haas
EDB: http://www.enterprisedb.com

#9Robert Haas
robertmhaas@gmail.com
In reply to: Peter Geoghegan (#7)
Re: doc: Clarify what "excluded" represents for INSERT ON CONFLICT

On Thu, Jun 30, 2022 at 6:40 PM Peter Geoghegan <pg@bowt.ie> wrote:

My impression from reading this transcript is that the user was
confused as to why they needed to qualify the target table name in the
ON CONFLICT DO UPDATE's WHERE clause -- they didn't have to qualify it
in the targetlist that appears in "SET ... ", so why the need to do it
in the WHERE clause? This isn't something that upsert statements need
to do all that often, just because adding additional conditions to the
WHERE clause isn't usually necessary. That much makes sense to me -- I
*can* imagine how that could cause confusion.

+1.

I think that the issue here is simply that because both the updated
table and the "excluded" pseudo-table are visible here, and have the
same columns, an unqualified name is ambiguous. I don't really think
that it's worth documenting. The error message you get if you fail to
do it is actually pretty good:

rhaas=# insert into foo values (1, 'frob') on conflict (a) do update
set b = (select b || 'nitz');
ERROR: column reference "b" is ambiguous
LINE 1: ...'frob') on conflict (a) do update set b = (select b || 'nitz...
^

Now you could read that and not understand that the ambiguity is
between the target table and the "excluded" pseudo-table, for sure.
But, would you think to check the documentation at that point? I'm not
sure that's what people would really do. And if they did, I think that
David's proposed patch would be unlikely to make them less confused.
What would probably help more is adding something like this to the
error message:

HINT: column "b" could refer to any of these relations: "foo", "excluded"

That could also help people who encounter this error in other
situations. I'm not 100% sure this is a good idea, but I feel like it
would have a much better chance of helping someone in this situation
than the proposed doc patch.

--
Robert Haas
EDB: http://www.enterprisedb.com

#10Tom Lane
tgl@sss.pgh.pa.us
In reply to: Robert Haas (#9)
Re: doc: Clarify what "excluded" represents for INSERT ON CONFLICT

Robert Haas <robertmhaas@gmail.com> writes:

I think that the issue here is simply that because both the updated
table and the "excluded" pseudo-table are visible here, and have the
same columns, an unqualified name is ambiguous. I don't really think
that it's worth documenting. The error message you get if you fail to
do it is actually pretty good:

ERROR: column reference "b" is ambiguous

Now you could read that and not understand that the ambiguity is
between the target table and the "excluded" pseudo-table, for sure.

Agreed. It doesn't help that there's no explicit use of "excluded"
anywhere, as there is in more usual ambiguous-column cases.

What would probably help more is adding something like this to the
error message:
HINT: column "b" could refer to any of these relations: "foo", "excluded"

+1, that seems like it could be handy across the board.

regards, tom lane

In reply to: Tom Lane (#10)
Re: doc: Clarify what "excluded" represents for INSERT ON CONFLICT

On Fri, Jul 1, 2022 at 6:40 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:

Robert Haas <robertmhaas@gmail.com> writes:

What would probably help more is adding something like this to the
error message:
HINT: column "b" could refer to any of these relations: "foo", "excluded"

+1, that seems like it could be handy across the board.

The user *will* get a similar HINT if they happen to *also* spell the
would-be ambiguous column name slightly incorrectly:

ERROR: column "barr" does not exist
LINE 1: ...lict (bar) do update set bar = excluded.bar where barr != 5;
^
HINT: Perhaps you meant to reference the column "foo.bar" or the
column "excluded.bar".

--
Peter Geoghegan

In reply to: Robert Haas (#9)
Re: doc: Clarify what "excluded" represents for INSERT ON CONFLICT

On Fri, Jul 1, 2022 at 6:01 AM Robert Haas <robertmhaas@gmail.com> wrote:

What would probably help more is adding something like this to the
error message:

HINT: column "b" could refer to any of these relations: "foo", "excluded"

That could also help people who encounter this error in other
situations. I'm not 100% sure this is a good idea, but I feel like it
would have a much better chance of helping someone in this situation
than the proposed doc patch.

I agree with everything you've said here, though I am 100% sure that
something like your proposed HINT would be a real usability win.

The "Perhaps you meant to reference the column" HINT displayed when
the user misspells a column is a surprisingly popular feature. I'm
aware of quite a few people singing its praises on Twitter, for
example. That hardly ever happens, even with features that we think of
as high impact major features. So evidently users really value these
kinds of quality of life improvements.

--
Peter Geoghegan

#13David G. Johnston
david.g.johnston@gmail.com
In reply to: Peter Geoghegan (#12)
Re: doc: Clarify what "excluded" represents for INSERT ON CONFLICT

On Fri, Jul 1, 2022 at 7:58 AM Peter Geoghegan <pg@bowt.ie> wrote:

On Fri, Jul 1, 2022 at 6:01 AM Robert Haas <robertmhaas@gmail.com> wrote:

What would probably help more is adding something like this to the
error message:

HINT: column "b" could refer to any of these relations: "foo", "excluded"

That could also help people who encounter this error in other
situations. I'm not 100% sure this is a good idea, but I feel like it
would have a much better chance of helping someone in this situation
than the proposed doc patch.

I agree with everything you've said here, though I am 100% sure that
something like your proposed HINT would be a real usability win.

The "Perhaps you meant to reference the column" HINT displayed when
the user misspells a column is a surprisingly popular feature. I'm
aware of quite a few people singing its praises on Twitter, for
example. That hardly ever happens, even with features that we think of
as high impact major features. So evidently users really value these
kinds of quality of life improvements.

+1 to this better approach to address the specific confusion regarding
ambiguity.

Without any other changes being made I'm content with the status quo
calling excluded a table a reasonable approximation that hasn't been seen
to be confusing to our readers.

That said, I still think that the current wording should be tweak with
respect to row vs. rows (especially if we continue to call it a table):

Current:
"The SET and WHERE clauses in ON CONFLICT DO UPDATE have access to the
existing row using the table's name (or an alias), and to [rows] proposed
for insertion using the special excluded table."

Change [rows] to:

"the row"

I'm undecided whether "FROM excluded" should be something that works - but
I also don't think it would actually be used in any case.

David J.

#14Bruce Momjian
bruce@momjian.us
In reply to: David G. Johnston (#13)
1 attachment(s)
Re: doc: Clarify what "excluded" represents for INSERT ON CONFLICT

On Fri, Jul 1, 2022 at 08:11:36AM -0700, David G. Johnston wrote:

That said, I still think that the current wording should be tweak with respect
to row vs. rows (especially if we continue to call it a table):

Current:
"The SET and WHERE clauses in ON CONFLICT DO UPDATE have access to the
existing row using the table's name (or an alias), and to [rows] proposed
for insertion using the special excluded table."

Change [rows] to:

"the row"

I'm undecided whether "FROM excluded" should be something that works - but I
also don't think it would actually be used in any case.

I found two places where a singular "row" would be better, doc patch
attached.

--
Bruce Momjian <bruce@momjian.us> https://momjian.us
EDB https://enterprisedb.com

Indecision is a decision. Inaction is an action. Mark Batterson

Attachments:

exclude.difftext/x-diff; charset=us-asciiDownload
diff --git a/doc/src/sgml/ref/insert.sgml b/doc/src/sgml/ref/insert.sgml
index a9af9959c0..022ac99b75 100644
--- a/doc/src/sgml/ref/insert.sgml
+++ b/doc/src/sgml/ref/insert.sgml
@@ -176,7 +176,7 @@ INSERT INTO <replaceable class="parameter">table_name</replaceable> [ AS <replac
         provided, it completely hides the actual name of the table.
         This is particularly useful when <literal>ON CONFLICT DO UPDATE</literal>
         targets a table named <varname>excluded</varname>, since that will otherwise
-        be taken as the name of the special table representing rows proposed
+        be taken as the name of the special table representing the row proposed
         for insertion.
        </para>
       </listitem>
@@ -396,7 +396,7 @@ INSERT INTO <replaceable class="parameter">table_name</replaceable> [ AS <replac
         conflict.  The <literal>SET</literal> and
         <literal>WHERE</literal> clauses in <literal>ON CONFLICT DO
         UPDATE</literal> have access to the existing row using the
-        table's name (or an alias), and to rows proposed for insertion
+        table's name (or an alias), and to the row proposed for insertion
         using the special <varname>excluded</varname> table.
         <literal>SELECT</literal> privilege is required on any column in the
         target table where corresponding <varname>excluded</varname>
#15Robert Haas
robertmhaas@gmail.com
In reply to: Bruce Momjian (#14)
Re: doc: Clarify what "excluded" represents for INSERT ON CONFLICT

On Fri, Jul 8, 2022 at 11:18 PM Bruce Momjian <bruce@momjian.us> wrote:

I found two places where a singular "row" would be better, doc patch
attached.

+1.

--
Robert Haas
EDB: http://www.enterprisedb.com

#16Bruce Momjian
bruce@momjian.us
In reply to: Bruce Momjian (#14)
Re: doc: Clarify what "excluded" represents for INSERT ON CONFLICT

On Fri, Jul 8, 2022 at 11:18:43PM -0400, Bruce Momjian wrote:

On Fri, Jul 1, 2022 at 08:11:36AM -0700, David G. Johnston wrote:

That said, I still think that the current wording should be tweak with respect
to row vs. rows (especially if we continue to call it a table):

Current:
"The SET and WHERE clauses in ON CONFLICT DO UPDATE have access to the
existing row using the table's name (or an alias), and to [rows] proposed
for insertion using the special excluded table."

Change [rows] to:

"the row"

I'm undecided whether "FROM excluded" should be something that works - but I
also don't think it would actually be used in any case.

I found two places where a singular "row" would be better, doc patch
attached.

Patch applied to all supported versions.

--
Bruce Momjian <bruce@momjian.us> https://momjian.us
EDB https://enterprisedb.com

Indecision is a decision. Inaction is an action. Mark Batterson