INSERT ... ON CONFLICT syntax issues
I'm separating this discussion out of the thread because I think this
needs wider input.
On 2015-04-24 19:21:37 -0700, Peter Geoghegan wrote:
I've *provisionally* pushed code that goes back to the old way,
Andres: https://github.com/petergeoghegan/postgres/commit/2a5d80b27d2c5832ad26dde4651c64dd2004f401Perhaps this is the least worst way, after all.
I still think it's a bad idea. To recap, the old and current way is:
INSERT ... ON CONFLICT (cola, colb [WHERE predicate_for_partial]) UPDATE|IGNORE
My problem with the WHERE being inside the parens in the above is that
it's
a) different from CREATE INDEX
b) unclear whether the WHERE belongs to colb or the whole index
expression. The equivalent for aggregates, which I bet is going to be
used less often, caused a fair amount of confusing.
That's why I wanted the WHERE outside the (), which requires either
adding DO between the index inference clause, and the action, to avoid
ambiguities in the grammar.
But I'm generally having some doubts about the syntax.
Right now it's
INSERT ... ON CONFLICT opt_on_conf_clause UPDATE|IGNORE.
A couple things:
a) Why is is 'CONFLICT"? We're talking about a uniquness violation. What
if we, at some later point, also want to handle other kind of
violations? Shouldn't it be ON UNIQUE CONFLICT/ERROR/VIOLATION ...
b) For me there's a WITH before the index inference clause missing, to
have it read in 'SQL' style.
c) Right now the UPDATE can refer to pseudo relations 'TARGET' and
'EXCLUDED'. I think especially the latter doesn't fit anymore at
all. How about 'CONFLICTING' and 'EXISTING'? Or even NEW and OLD?
So I guess it boils down to that I think we should switch the syntax to
be:
INSERT ... ON UNIQUE VIOLATION [WITH (cola, colb) WHERE ...] DO {NOTHING|UPDATE}
Greetings,
Andres Freund
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On Sat, Apr 25, 2015 at 2:01 AM, Andres Freund <andres@anarazel.de> wrote:
My problem with the WHERE being inside the parens in the above is that
it's
a) different from CREATE INDEX
I don't think that that's an important goal.
b) unclear whether the WHERE belongs to colb or the whole index
expression. The equivalent for aggregates, which I bet is going to be
used less often, caused a fair amount of confusing.
I don't see those two situations as being comparable. The inference
specification does not accept aggregates. It seems obvious to me that
the predicate only ever applies to the entire table. And it's obvious
that it's part of the inference specification because it appears in
parentheses with everything else - otherwise, *users* might find this
phantom WHERE clause ambiguous/confusing.
But I'm generally having some doubts about the syntax.
Right now it's
INSERT ... ON CONFLICT opt_on_conf_clause UPDATE|IGNORE.A couple things:
a) Why is is 'CONFLICT"? We're talking about a uniquness violation. What
if we, at some later point, also want to handle other kind of
violations? Shouldn't it be ON UNIQUE CONFLICT/ERROR/VIOLATION ...
I think that naming unique violations alone would be wrong (not to
mention ludicrously verbose). Heikki and I both feel that the CONFLICT
keyword captures the fact that this could be a dup violation, or an
exclusion violation. The syntax has been like this for some time, and
hasn't been a point of contention for a long time, so I thought this
was settled. Note that the syntax is quite similar to the SQLite
syntax of the same feature, that has ON CONFLICT IGNORE (it also has
ON CONFLICT REPLACE, but not ON CONFLICT UPDATE).
b) For me there's a WITH before the index inference clause missing, to
have it read in 'SQL' style.
I'm not seeing it. BTW, Robert was the one who initially proposed that
the unique index inference clause follow this exact style (albeit
before it accepted a WHERE clause to infer partial indexes, which was
only added a couple of months ago).
c) Right now the UPDATE can refer to pseudo relations 'TARGET' and
'EXCLUDED'. I think especially the latter doesn't fit anymore at
all. How about 'CONFLICTING' and 'EXISTING'? Or even NEW and OLD?
NEW and OLD are terribly misleading, since surely the NEW tuple is the
one actually appended to the relation by the UPDATE, and the OLD one
is the existing one (not the excluded one). Plus they have all that
intellectual baggage from rules.
CONFLICTING, as Greg Stark pointed out many months ago, is something
that applies to both tuples that can be referenced, which is why I
*stopped* using it months ago. They conflict with *each other*. Any
conflict must pertain to both.
Dictionary.com defines "exclude" as:
"""
verb (used with object), excluded, excluding.
1.
to shut or keep out; prevent the entrance of.
2.
to shut out from consideration, privilege, etc.:
Employees and their relatives were excluded from participation in the contest.
3.
to expel and keep out; thrust out; eject:
He was excluded from the club for infractions of the rules.
"""
Seems pretty descriptive of the situation to me - I actually put a lot
of thought into this. Additionally, the word is widely understood by
non-native speakers. TARGET is also very descriptive, because it
situationally describes either the existing tuple actually present in
the table, or (from a RETURNING clause) the final tuple present in the
table post-UPDATE. We use the term "target" for that pervasively (in
the docs and in the code).
So I guess it boils down to that I think we should switch the syntax to
be:INSERT ... ON UNIQUE VIOLATION [WITH (cola, colb) WHERE ...] DO {NOTHING|UPDATE}
Beauty is in the eye of the beholder and all, but that seems pretty
ugly to me. Honestly, I think we should just accept that the predicate
appears in the parentheses on the odd occasion that it appears at all
- partial unique indexes are not all that common.
--
Peter Geoghegan
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On 2015-04-25 11:05:49 -0700, Peter Geoghegan wrote:
On Sat, Apr 25, 2015 at 2:01 AM, Andres Freund <andres@anarazel.de> wrote:
My problem with the WHERE being inside the parens in the above is that
it's
a) different from CREATE INDEXI don't think that that's an important goal.
Given that it's used to 'match' to indexes, I can't agree.
b) unclear whether the WHERE belongs to colb or the whole index
expression. The equivalent for aggregates, which I bet is going to be
used less often, caused a fair amount of confusing.I don't see those two situations as being comparable. The inference
specification does not accept aggregates.
Huh? It's pretty much entirely besides the point that inference doesn't
accept aggregates. The point is that ORDER BY for aggregates has
confused users because it's inside the parens.
a) Why is is 'CONFLICT"? We're talking about a uniquness violation. What
if we, at some later point, also want to handle other kind of
violations? Shouldn't it be ON UNIQUE CONFLICT/ERROR/VIOLATION ...I think that naming unique violations alone would be wrong (not to
mention ludicrously verbose).
Why?
The syntax has been like this for some time, and
hasn't been a point of contention for a long time, so I thought this
was settled.
I really don't care if it's been that for a long while. This is a not
yet commited feature.
b) For me there's a WITH before the index inference clause missing, to
have it read in 'SQL' style.I'm not seeing it. BTW, Robert was the one who initially proposed that
the unique index inference clause follow this exact style (albeit
before it accepted a WHERE clause to infer partial indexes, which was
only added a couple of months ago).
So?
I guess I can live with that uglyness; but I'd like somebody else to
chime in.
c) Right now the UPDATE can refer to pseudo relations 'TARGET' and
'EXCLUDED'. I think especially the latter doesn't fit anymore at
all. How about 'CONFLICTING' and 'EXISTING'? Or even NEW and OLD?NEW and OLD are terribly misleading, since surely the NEW tuple is the
one actually appended to the relation by the UPDATE, and the OLD one
is the existing one (not the excluded one). Plus they have all that
intellectual baggage from rules.
What 'intellectual baggage' would that be? That they're already known to
have been used in another place? I don't see the problem.
How about EXISTING and NEW?
"""
verb (used with object), excluded, excluding.
1.
to shut or keep out; prevent the entrance of.
2.
to shut out from consideration, privilege, etc.:
Employees and their relatives were excluded from participation in the contest.
3.
to expel and keep out; thrust out; eject:
He was excluded from the club for infractions of the rules.
"""Seems pretty descriptive of the situation to me - I actually put a lot
of thought into this. Additionally, the word is widely understood by
non-native speakers. TARGET is also very descriptive, because it
situationally describes either the existing tuple actually present in
the table, or (from a RETURNING clause) the final tuple present in the
table post-UPDATE. We use the term "target" for that pervasively (in
the docs and in the code).
Sorry, I don't buy either argument. EXISTING and NEW would surely at
least as widely understood than EXCLUDE and TARGET. The latter does just
about no sense to me; especially from a user POV. I don't think the
existing usage of the term has much to do what it's used for here. That
it has 'morphing' characteristics imo just makes it worse, rather than
better. Besides being confusing that it has different meanings, it's far
from inconceivable that somebody wants to return values from the
preexisting, new, and merged rows.
Greetings,
Andres Freund
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On Sat, Apr 25, 2015 at 11:24 AM, Andres Freund <andres@anarazel.de> wrote:
b) unclear whether the WHERE belongs to colb or the whole index
expression. The equivalent for aggregates, which I bet is going to be
used less often, caused a fair amount of confusing.I don't see those two situations as being comparable. The inference
specification does not accept aggregates.Huh? It's pretty much entirely besides the point that inference doesn't
accept aggregates. The point is that ORDER BY for aggregates has
confused users because it's inside the parens.
Would any alternative cause less confusion? That's the real issue. And
I'm unconvinced that your alternative would.
a) Why is is 'CONFLICT"? We're talking about a uniquness violation. What
if we, at some later point, also want to handle other kind of
violations? Shouldn't it be ON UNIQUE CONFLICT/ERROR/VIOLATION ...I think that naming unique violations alone would be wrong (not to
mention ludicrously verbose).Why?
Because, as I said, it might not be a unique violation at all. It
could be an exclusion violation.
b) For me there's a WITH before the index inference clause missing, to
have it read in 'SQL' style.I'm not seeing it. BTW, Robert was the one who initially proposed that
the unique index inference clause follow this exact style (albeit
before it accepted a WHERE clause to infer partial indexes, which was
only added a couple of months ago).So?
So, his opinion matters if it comes down to a vote. The inference
specification syntax as implemented is exactly what he suggested (plus
I've added a predicate).
I guess I can live with that uglyness; but I'd like somebody else to
chime in.
Agreed.
c) Right now the UPDATE can refer to pseudo relations 'TARGET' and
'EXCLUDED'. I think especially the latter doesn't fit anymore at
all. How about 'CONFLICTING' and 'EXISTING'? Or even NEW and OLD?NEW and OLD are terribly misleading, since surely the NEW tuple is the
one actually appended to the relation by the UPDATE, and the OLD one
is the existing one (not the excluded one). Plus they have all that
intellectual baggage from rules.What 'intellectual baggage' would that be? That they're already known to
have been used in another place? I don't see the problem.
The problem is that they make you think of rules, and they don't
describe what's going on at all.
Seems pretty descriptive of the situation to me - I actually put a lot
of thought into this. Additionally, the word is widely understood by
non-native speakers. TARGET is also very descriptive, because it
situationally describes either the existing tuple actually present in
the table, or (from a RETURNING clause) the final tuple present in the
table post-UPDATE. We use the term "target" for that pervasively (in
the docs and in the code).Sorry, I don't buy either argument. EXISTING and NEW would surely at
least as widely understood than EXCLUDE and TARGET. The latter does just
about no sense to me; especially from a user POV. I don't think the
existing usage of the term has much to do what it's used for here.
Yes it does. The UPDATE docs refer to the target table in a way
intended to distinguish it from any joined-to table (FROM table). It's
clear as day. Maybe EXISTING is equally well understood as a word in
general, but it's way more ambiguous than EXCLUDED is here.
That
it has 'morphing' characteristics imo just makes it worse, rather than
better. Besides being confusing that it has different meanings, it's far
from inconceivable that somebody wants to return values from the
preexisting, new, and merged rows.
This is how RETURNING works from UPDATEs in general. IOW, if you do an
UPDATE FROM (which is pretty similar to ON CONFLICT UPDATE,
syntax-wise), then you can only refer to the joined table's tuple and
the final post-update tuple from within RETURNING. You cannot refer to
the pre-UPDATE target tuple there either -- it's *exactly* the same
situation. Why should it be any different here? The
situational/morphing characteristic of the alias name TARGET is
therefore absolutely appropriate, in that it follows UPDATE.
To be fair, there is one unrelated slight difference with RETURNING
and conventional UPDATEs: You cannot return the EXCLUDED tuple (in the
same way that you can reference the joined-FROM tuple within
conventional UPDATEs). This is because the pertinent information is
likely to be in the target tuple (after all, the DML statement names
the proposed-for-insertion tuples itself, directly), but more
importantly because projecting both would necessitate *always*
qualifying the RETURNING column names to resolve which tuple is
intended (UPDATE FROM will seldom be a self-join, but this will always
be like a self-join).
--
Peter Geoghegan
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On Sat, Apr 25, 2015 at 11:50 AM, Peter Geoghegan <pg@heroku.com> wrote:
To be fair, there is one unrelated slight difference with RETURNING
and conventional UPDATEs: You cannot return the EXCLUDED tuple (in the
same way that you can reference the joined-FROM tuple within
conventional UPDATEs). This is because the pertinent information is
likely to be in the target tuple (after all, the DML statement names
the proposed-for-insertion tuples itself, directly), but more
importantly because projecting both would necessitate *always*
qualifying the RETURNING column names to resolve which tuple is
intended (UPDATE FROM will seldom be a self-join, but this will always
be like a self-join).
It also makes sense because this is the RETURNING clause of an INSERT,
not an UPDATE. So the general INSERT behavior is what is expected. It
ought to be irrelevant if tuples were projected by actually inserting
or updating. Otherwise, your UPSERT is probably misconceived.
--
Peter Geoghegan
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On 2015-04-25 11:50:59 -0700, Peter Geoghegan wrote:
On Sat, Apr 25, 2015 at 11:24 AM, Andres Freund <andres@anarazel.de> wrote:
c) Right now the UPDATE can refer to pseudo relations 'TARGET' and
'EXCLUDED'. I think especially the latter doesn't fit anymore at
all. How about 'CONFLICTING' and 'EXISTING'? Or even NEW and OLD?NEW and OLD are terribly misleading, since surely the NEW tuple is the
one actually appended to the relation by the UPDATE, and the OLD one
is the existing one (not the excluded one). Plus they have all that
intellectual baggage from rules.What 'intellectual baggage' would that be? That they're already known to
have been used in another place? I don't see the problem.The problem is that they make you think of rules, and they don't
describe what's going on at all.
95% of all users will know NEW/OLD from triggers, not rules. Where NEW
is used in a quite comparable way.
Seems pretty descriptive of the situation to me - I actually put a lot
of thought into this. Additionally, the word is widely understood by
non-native speakers. TARGET is also very descriptive, because it
situationally describes either the existing tuple actually present in
the table, or (from a RETURNING clause) the final tuple present in the
table post-UPDATE. We use the term "target" for that pervasively (in
the docs and in the code).Sorry, I don't buy either argument. EXISTING and NEW would surely at
least as widely understood than EXCLUDE and TARGET. The latter does just
about no sense to me; especially from a user POV. I don't think the
existing usage of the term has much to do what it's used for here.Yes it does. The UPDATE docs refer to the target table in a way
intended to distinguish it from any joined-to table (FROM table). It's
clear as day.
Which means the term is used in a different way for INSERTs and UPDATEs
already. To me it sounds like it's a remnant of your earlier syntax
proposal for UPSERT.
Maybe EXISTING is equally well understood as a word in general, but
it's way more ambiguous than EXCLUDED is here.
What? I'm not suggesting to replace EXCLUDED by EXISTING - that'd make
absolutely no sense. My suggesting is to have NEW refer to the tuple
specified in the INSERT and EXISTING to the, well, pre existing tuple
that the conflict is with.
That
it has 'morphing' characteristics imo just makes it worse, rather than
better. Besides being confusing that it has different meanings, it's far
from inconceivable that somebody wants to return values from the
preexisting, new, and merged rows.This is how RETURNING works from UPDATEs in general.
And there's been a patch (which unfortunately died because it's
implementation wasn't good), to allow referring to the other versions of
the tuple. It has been wished for numerous times.
IOW, if you do an UPDATE FROM (which is pretty similar to ON CONFLICT
UPDATE, syntax-wise), then you can only refer to the joined table's
tuple and the final post-update tuple from within RETURNING.
You cannot refer to the pre-UPDATE target tuple there either -- it's
*exactly* the same situation. Why should it be any different here? The
situational/morphing characteristic of the alias name TARGET is
therefore absolutely appropriate, in that it follows UPDATE.
Contrasting
TARGET is also very descriptive, because it
situationally describes either the existing tuple actually present in
the table, or (from a RETURNING clause) the final tuple present in the
table post-UPDATE. We use the term "target" for that pervasively (in
the docs and in the code).
the docs say:
Since
<literal>RETURNING</> is not part of the <command>UPDATE</>
auxiliary query, the special <literal>ON CONFLICT UPDATE</> aliases
(<varname>TARGET</> and <varname>EXCLUDED</>) may not be
referenced; only the row as it exists after updating (or
inserting) is returned.
So I don't understand that whole chain of argument. There's no such
morphing behaviour, unless I miss something?
2a5d80b27d2c5832ad26dde4651c64dd2004f401:
The problem with this seems to be that it more or less
necessitates making both IGNORE and UPDATE fully reserved keywords in
order to avoid an ambiguity, which we prefer not to do
It does not. As mentioned in the thread DO UPDATE/NOTHING work without
anything like that.
Greetings,
Andres Freund
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On Sat, Apr 25, 2015 at 12:23 PM, Andres Freund <andres@anarazel.de> wrote:
95% of all users will know NEW/OLD from triggers, not rules. Where NEW
is used in a quite comparable way.
I don't think it's comparable.
Seems pretty descriptive of the situation to me - I actually put a lot
of thought into this. Additionally, the word is widely understood by
non-native speakers. TARGET is also very descriptive, because it
situationally describes either the existing tuple actually present in
the table, or (from a RETURNING clause) the final tuple present in the
table post-UPDATE. We use the term "target" for that pervasively (in
the docs and in the code).Sorry, I don't buy either argument. EXISTING and NEW would surely at
least as widely understood than EXCLUDE and TARGET. The latter does just
about no sense to me; especially from a user POV. I don't think the
existing usage of the term has much to do what it's used for here.Yes it does. The UPDATE docs refer to the target table in a way
intended to distinguish it from any joined-to table (FROM table). It's
clear as day.Which means the term is used in a different way for INSERTs and UPDATEs
already.
No, it isn't. It both cases the table is the one involved in the parse
analysis setTargetTable() call.
Maybe EXISTING is equally well understood as a word in general, but
it's way more ambiguous than EXCLUDED is here.What? I'm not suggesting to replace EXCLUDED by EXISTING - that'd make
absolutely no sense. My suggesting is to have NEW refer to the tuple
specified in the INSERT and EXISTING to the, well, pre existing tuple
that the conflict is with.
Okay, but that doesn't change my opinion.
That
it has 'morphing' characteristics imo just makes it worse, rather than
better. Besides being confusing that it has different meanings, it's far
from inconceivable that somebody wants to return values from the
preexisting, new, and merged rows.This is how RETURNING works from UPDATEs in general.
And there's been a patch (which unfortunately died because it's
implementation wasn't good), to allow referring to the other versions of
the tuple. It has been wished for numerous times.
Well, if that patch is ever committed, then it won't be hard to get
the behavior here too, since it is literally exactly the same code. I
don't change anything about it, and that seems to be your problem.
the docs say:
Since
<literal>RETURNING</> is not part of the <command>UPDATE</>
auxiliary query, the special <literal>ON CONFLICT UPDATE</> aliases
(<varname>TARGET</> and <varname>EXCLUDED</>) may not be
referenced; only the row as it exists after updating (or
inserting) is returned.So I don't understand that whole chain of argument. There's no such
morphing behaviour, unless I miss something?
That's a documentation bug (a remnant of an earlier version). Sorry
about that. You can reference TARGET from returning. It's directly
contradicted by this much earlier statement on the INSERT doc page:
"Both aliases can be used in the auxiliary query targetlist and WHERE
clause, while the TARGET alias can be used anywhere within the entire
statement (e.g., within the RETURNING clause)"
I'll go fix that.
2a5d80b27d2c5832ad26dde4651c64dd2004f401:
The problem with this seems to be that it more or less
necessitates making both IGNORE and UPDATE fully reserved keywords in
order to avoid an ambiguity, which we prefer not to doIt does not. As mentioned in the thread DO UPDATE/NOTHING work without
anything like that.
I just mean that it couldn't work as-was in the repo at that time.
This commit message was written before your proposal of 8 hours ago.
We're going to have to agree to disagree here. I've given you my
opinion. I'm burnt out on this patch, and whatever the path of least
resistance is is the path I'll take. Frankly, the only reason that I'm
putting up any kind of argument is because I don't think that your
proposal is the path of least resistance.
--
Peter Geoghegan
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On Sat, Apr 25, 2015 at 12:35 PM, Peter Geoghegan <pg@heroku.com> wrote:
That
it has 'morphing' characteristics imo just makes it worse, rather than
better. Besides being confusing that it has different meanings, it's far
from inconceivable that somebody wants to return values from the
preexisting, new, and merged rows.This is how RETURNING works from UPDATEs in general.
And there's been a patch (which unfortunately died because it's
implementation wasn't good), to allow referring to the other versions of
the tuple. It has been wished for numerous times.Well, if that patch is ever committed, then it won't be hard to get
the behavior here too, since it is literally exactly the same code. I
don't change anything about it, and that seems to be your problem.
I withdraw this remark. Even in a world where this patch is committed,
it still makes sense for the INSERT returning behavior to not be
altered (and to project only TARGET tuples even if they come from the
auxiliary UPDATE). The "join" is within the auxiliary UPDATE, not the
INSERT, and it should be no more possible to project intermediate
tuples (like EXCLUDED.*) from the INSERT's RETURNING than it is to
project CTE scan tuples from an INSERT ... RETURNING with a CTE.
--
Peter Geoghegan
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On 04/25/2015 12:01 PM, Andres Freund wrote:
INSERT ... ON CONFLICT (cola, colb [WHERE predicate_for_partial]) UPDATE|IGNORE
My problem with the WHERE being inside the parens in the above is that
it's
a) different from CREATE INDEX
b) unclear whether the WHERE belongs to colb or the whole index
expression. The equivalent for aggregates, which I bet is going to be
used less often, caused a fair amount of confusing.That's why I wanted the WHERE outside the (), which requires either
adding DO between the index inference clause, and the action, to avoid
ambiguities in the grammar.
Yeah, having the WHERE outside the parens seems much nicer. What is the
ambiguity?
But I'm generally having some doubts about the syntax.
Right now it's
INSERT ... ON CONFLICT opt_on_conf_clause UPDATE|IGNORE.A couple things:
a) Why is is 'CONFLICT"? We're talking about a uniquness violation. What
if we, at some later point, also want to handle other kind of
violations? Shouldn't it be ON UNIQUE CONFLICT/ERROR/VIOLATION ...
As Peter said, it's also for exclusion constraints. Perhaps "ON
CONSTRAINT VIOLATION"? It doesn't apply to foreign key constraints,
though. I think "ON CONFLICT" is fine.
b) For me there's a WITH before the index inference clause missing, to
have it read in 'SQL' style.
Agreed. ON would sound more natural than WITH though:
INSERT INTO mytable ON CONFLICT ON (keycol) UPDATE ...
The ability to specify a constraint by name hasn't been implemented, but
that would read quite naturally as:
INSERT INTO mytable ON CONFLICT ON CONSTRAINT my_constraint UPDATE ...
- Heikki
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Import Notes
Reply to msg id not found: 20008_1429952483_553B57E3_20008_187_1_20150425090114.GB12723@awork2.anarazel.deReference msg id not found: 20008_1429952483_553B57E3_20008_187_1_20150425090114.GB12723@awork2.anarazel.de | Resolved by subject fallback
On April 26, 2015 11:22:01 AM GMT+02:00, Heikki Linnakangas <hlinnaka@iki.fi> wrote:
On 04/25/2015 12:01 PM, Andres Freund wrote:
INSERT ... ON CONFLICT (cola, colb [WHERE predicate_for_partial])
UPDATE|IGNORE
My problem with the WHERE being inside the parens in the above is
that
it's
a) different from CREATE INDEX
b) unclear whether the WHERE belongs to colb or the whole index
expression. The equivalent for aggregates, which I bet is goingto be
used less often, caused a fair amount of confusing.
That's why I wanted the WHERE outside the (), which requires either
adding DO between the index inference clause, and the action, toavoid
ambiguities in the grammar.
Yeah, having the WHERE outside the parens seems much nicer. What is the
ambiguity?
With a full keyword in between (like DO), there's none. But without it its ambiguous where a trailing UPDATE belongs to. At least from the point of a LALR grammar. WHERE UPDATE; is legal. I don't see the DO as much of a problem though.
But I'm generally having some doubts about the syntax.
Right now it's
INSERT ... ON CONFLICT opt_on_conf_clause UPDATE|IGNORE.A couple things:
a) Why is is 'CONFLICT"? We're talking about a uniquness violation.
What
if we, at some later point, also want to handle other kind of
violations? Shouldn't it be ON UNIQUE CONFLICT/ERROR/VIOLATION...
As Peter said, it's also for exclusion constraints. Perhaps "ON
CONSTRAINT VIOLATION"? It doesn't apply to foreign key constraints,
though. I think "ON CONFLICT" is fine.
What if we, as at least I have previously wished for, want to allow handling other types of constraints? It'd be quite cool to be able to insert the referenced key on a fkey violation for some use cases.
b) For me there's a WITH before the index inference clause missing,
to
have it read in 'SQL' style.
Agreed. ON would sound more natural than WITH though:
INSERT INTO mytable ON CONFLICT ON (keycol) UPDATE ...
I chose WITh because of the repeated DO; that's all ;)
---
Please excuse brevity and formatting - I am writing this on my mobile phone.
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On 26/04/15 12:08, Andres Freund wrote:
On April 26, 2015 11:22:01 AM GMT+02:00, Heikki Linnakangas <hlinnaka@iki.fi> wrote:
On 04/25/2015 12:01 PM, Andres Freund wrote:
That's why I wanted the WHERE outside the (), which requires either
adding DO between the index inference clause, and the action, toavoid
ambiguities in the grammar.
Yeah, having the WHERE outside the parens seems much nicer. What is the
ambiguity?
With a full keyword in between (like DO), there's none. But without it its ambiguous where a trailing UPDATE belongs to. At least from the point of a LALR grammar. WHERE UPDATE; is legal. I don't see the DO as much of a problem though.
The DO variant with WHERE outside of parenthesis sounds fine to me. Or
at least better than the alternatives I've seen or can come up with.
A couple things:
a) Why is is 'CONFLICT"? We're talking about a uniquness violation.
What
if we, at some later point, also want to handle other kind of
violations? Shouldn't it be ON UNIQUE CONFLICT/ERROR/VIOLATION...
As Peter said, it's also for exclusion constraints. Perhaps "ON
CONSTRAINT VIOLATION"? It doesn't apply to foreign key constraints,
though. I think "ON CONFLICT" is fine.What if we, as at least I have previously wished for, want to allow handling other types of constraints? It'd be quite cool to be able to insert the referenced key on a fkey violation for some use cases.
b) For me there's a WITH before the index inference clause missing,
to
have it read in 'SQL' style.
Agreed. ON would sound more natural than WITH though:
INSERT INTO mytable ON CONFLICT ON (keycol) UPDATE ...
I chose WITh because of the repeated DO; that's all ;)
The ON CONFLICT ON sounds really weird to me. Either ON CONSTRAINT
VIOLATION (foo) or ON CONFLICT [WITH] (foo) both seem acceptable.
--
Petr Jelinek http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
* Heikki Linnakangas (hlinnaka@iki.fi) wrote:
On 04/25/2015 12:01 PM, Andres Freund wrote:
INSERT ... ON CONFLICT (cola, colb [WHERE predicate_for_partial]) UPDATE|IGNORE
My problem with the WHERE being inside the parens in the above is that
it's
a) different from CREATE INDEX
b) unclear whether the WHERE belongs to colb or the whole index
expression. The equivalent for aggregates, which I bet is going to be
used less often, caused a fair amount of confusing.That's why I wanted the WHERE outside the (), which requires either
adding DO between the index inference clause, and the action, to avoid
ambiguities in the grammar.Yeah, having the WHERE outside the parens seems much nicer. What is
the ambiguity?
I like having it outside the parens also.
But I'm generally having some doubts about the syntax.
Right now it's
INSERT ... ON CONFLICT opt_on_conf_clause UPDATE|IGNORE.A couple things:
a) Why is is 'CONFLICT"? We're talking about a uniquness violation. What
if we, at some later point, also want to handle other kind of
violations? Shouldn't it be ON UNIQUE CONFLICT/ERROR/VIOLATION ...As Peter said, it's also for exclusion constraints. Perhaps "ON
CONSTRAINT VIOLATION"? It doesn't apply to foreign key constraints,
though. I think "ON CONFLICT" is fine.
I don't mind using "CONFLICT" here, seems to make sense to me.
b) For me there's a WITH before the index inference clause missing, to
have it read in 'SQL' style.Agreed. ON would sound more natural than WITH though:
INSERT INTO mytable ON CONFLICT ON (keycol) UPDATE ...
The ability to specify a constraint by name hasn't been implemented,
but that would read quite naturally as:INSERT INTO mytable ON CONFLICT ON CONSTRAINT my_constraint UPDATE ...
I don't particularly like the double-ON in this..
I've not tried, but is the first ON required to be a full keyword?
Seems like it probably is, but just to finish the thought I had, what
about:
INSERT INTO mytable .. IF CONFLICT ON (a,b) WHERE .. THEN UPDATE
IF is currently just an unreserved keyword though.
We could use FOR though:
INSERT INTO mytable .. FOR CONFLICT ON (a,b) WHERE .. THEN UPDATE
Though that'd probably sound better as:
INSERT INTO mytable .. FOR CONFLICT ON (a,b) WHERE .. DO UPDATE
Another option is:
INSERT INTO mytable .. WHEN CONFLICT ON (a,b) WHERE .. DO UPDATE
Which could also be:
INSERT INTO mytable .. WHEN CONFLICT ON (a,b) WHERE .. THEN UPDATE
of course..
What's important, in my view, is to keep the simple case simple and so
I'm not particularly wedded to any of these approaches, just trying to
help with other suggestions.
INSERT INTO mytable VALUES ('key1','key2','val1','val2')
ON CONFLICT UPDATE SET val1 = 'val1', val2 = 'val2';
strikes me as a the 99% use-case here that we need to keep sane, and
it'd be really nice if we didn't have to include the SET clause and
duplicate those values at all.. That could be something we add later
though, I don't think it needs to be done now.
Thanks!
Stephen
On Sun, Apr 26, 2015 at 6:34 AM, Stephen Frost <sfrost@snowman.net> wrote:
What's important, in my view, is to keep the simple case simple and so
I'm not particularly wedded to any of these approaches, just trying to
help with other suggestions.INSERT INTO mytable VALUES ('key1','key2','val1','val2')
ON CONFLICT UPDATE SET val1 = 'val1', val2 = 'val2';strikes me as a the 99% use-case here that we need to keep sane, and
it'd be really nice if we didn't have to include the SET clause and
duplicate those values at all.. That could be something we add later
though, I don't think it needs to be done now.
You can do that already. That's what the EXCLUDED.* alias that is
automatically added is for (the thing that Andres disliked the
spelling of - or the other thing). This is legal, for example:
INSERT INTO mytable (foo, bar, baz, bat) VALUES ('key1','key2','val1','val2')
ON CONFLICT (foo) UPDATE SET (foo, bar, baz, bat) = (EXCLUDED.foo,
EXCLUDED.bar, EXCLUDED.baz, EXCLUDED.bat)';
I don't want to accept something that automatically merges the
excluded tuple (e.g., "SET (*) = EXLCUDED.*"), for reasons outlined
here: https://wiki.postgresql.org/wiki/UPSERT#VoltDB.27s_UPSERT
--
Peter Geoghegan
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Peter,
* Peter Geoghegan (pg@heroku.com) wrote:
On Sun, Apr 26, 2015 at 6:34 AM, Stephen Frost <sfrost@snowman.net> wrote:
What's important, in my view, is to keep the simple case simple and so
I'm not particularly wedded to any of these approaches, just trying to
help with other suggestions.INSERT INTO mytable VALUES ('key1','key2','val1','val2')
ON CONFLICT UPDATE SET val1 = 'val1', val2 = 'val2';strikes me as a the 99% use-case here that we need to keep sane, and
it'd be really nice if we didn't have to include the SET clause and
duplicate those values at all.. That could be something we add later
though, I don't think it needs to be done now.You can do that already. That's what the EXCLUDED.* alias that is
automatically added is for (the thing that Andres disliked the
spelling of - or the other thing). This is legal, for example:INSERT INTO mytable (foo, bar, baz, bat) VALUES ('key1','key2','val1','val2')
ON CONFLICT (foo) UPDATE SET (foo, bar, baz, bat) = (EXCLUDED.foo,
EXCLUDED.bar, EXCLUDED.baz, EXCLUDED.bat)';
Yeah, that's not exactly simpler and I don't expect to see it used very
often (as in, less than 1%) because of that.
I don't want to accept something that automatically merges the
excluded tuple (e.g., "SET (*) = EXLCUDED.*"), for reasons outlined
here: https://wiki.postgresql.org/wiki/UPSERT#VoltDB.27s_UPSERT
Perhaps I'm missing it, but the reasons that I see there appear to be:
"It'd be like SELECT *" and "we'd have to decide what to do about the
value for unspecified columns". As for the latter- we have to do that
*anyway*, no? What happens if you do:
INSERT INTO mytable (foo, bar, baz, bat) VALUES ('key1','key2','val1','val2')
ON CONFLICT (foo) UPDATE SET (baz) = (EXCLUDED.baz);
?
As for the "SELECT *" concern, I fail to see how it's any different from
the exact same currently-encouraged usage of INSERT + UPDATE:
INSERT INTO mytable (foo, bar, baz, bat) VALUES ('key1','key2','val1','val2');
... catch the exception
UPDATE mytable SET baz = 'val1', bat = 'val2' WHERE foo = 'key1' and bar = 'key2';
Clearly there are issues with the above if someone is running around
adding columns to tables and PG has to figure out if we should be
setting the non-mentioned columns to NULL or to the default for the
column, but we're all quite happy to do so and trust that whomever is
adding the column has set a sane default and that PG will use it when
the column isn't included in either the INSERT or the UPDATE.
Note that I wasn't suggesting your "SET (*) = EXLCUDED.*" syntax and if
that would expand to something different than what I've outlined above
then it would make sense to not include it (... or fix it to act the
same, and then it's just a more verbose approach).
Further, this is *very* different from how the "SELECT *" concern can
cause things to break unexpectedly- new columns end up getting returned
which the application is unlikely to be prepared for. That doesn't
happen here and so I don't believe it makes any sense to try and compare
the two.
Happy to discuss, of course, and apologies if I missed some other issue-
I was just reading what I found at the link provided.
Thanks!
Stephen
On Sun, Apr 26, 2015 at 11:08 AM, Stephen Frost <sfrost@snowman.net> wrote:
I don't want to accept something that automatically merges the
excluded tuple (e.g., "SET (*) = EXLCUDED.*"), for reasons outlined
here: https://wiki.postgresql.org/wiki/UPSERT#VoltDB.27s_UPSERTPerhaps I'm missing it, but the reasons that I see there appear to be:
"It'd be like SELECT *" and "we'd have to decide what to do about the
value for unspecified columns". As for the latter- we have to do that
*anyway*, no? What happens if you do:INSERT INTO mytable (foo, bar, baz, bat) VALUES ('key1','key2','val1','val2')
ON CONFLICT (foo) UPDATE SET (baz) = (EXCLUDED.baz);?
It's like any other UPDATE - the values of columns not appearing in
the targetlist are unchanged from the original row version now
superseded. It doesn't matter that you had some other values in the
INSERT. You only get what you ask for.
--
Peter Geoghegan
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
* Peter Geoghegan (pg@heroku.com) wrote:
On Sun, Apr 26, 2015 at 11:08 AM, Stephen Frost <sfrost@snowman.net> wrote:
I don't want to accept something that automatically merges the
excluded tuple (e.g., "SET (*) = EXLCUDED.*"), for reasons outlined
here: https://wiki.postgresql.org/wiki/UPSERT#VoltDB.27s_UPSERTPerhaps I'm missing it, but the reasons that I see there appear to be:
"It'd be like SELECT *" and "we'd have to decide what to do about the
value for unspecified columns". As for the latter- we have to do that
*anyway*, no? What happens if you do:INSERT INTO mytable (foo, bar, baz, bat) VALUES ('key1','key2','val1','val2')
ON CONFLICT (foo) UPDATE SET (baz) = (EXCLUDED.baz);?
It's like any other UPDATE - the values of columns not appearing in
the targetlist are unchanged from the original row version now
superseded. It doesn't matter that you had some other values in the
INSERT. You only get what you ask for.
Ok, that makes sense.. So is the concern that an INSERT would end up
getting default values while an UPDATE would preserve whatever's there?
I don't see that as an issue.
Are you still against having a way to say "go forth and update whatever
non-conflicting columns I've specified in the INSERT, if there is a
conflict"..?
Again, not saying it has to be done now, but it'd certainly be nice if
we had it initially because otherwise the ORMs and "frameworks" of the
world will be stuck supporting the more verbose approach for as long as
we support it (~5 years..).
Thanks!
Stephen
On Sun, Apr 26, 2015 at 11:35 AM, Stephen Frost <sfrost@snowman.net> wrote:
Ok, that makes sense.. So is the concern that an INSERT would end up
getting default values while an UPDATE would preserve whatever's there?I don't see that as an issue.
I think it easily could be.
Are you still against having a way to say "go forth and update whatever
non-conflicting columns I've specified in the INSERT, if there is a
conflict"..?Again, not saying it has to be done now, but it'd certainly be nice if
we had it initially because otherwise the ORMs and "frameworks" of the
world will be stuck supporting the more verbose approach for as long as
we support it (~5 years..).
The more verbose approach is entirely necessary much of the time. For example:
insert into upsert_race_test (index, count)
values ('541','-1') on conflict update set count=TARGET.count + EXCLUDED.count;
Merging like this will be a very common requirement.
--
Peter Geoghegan
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
* Peter Geoghegan (pg@heroku.com) wrote:
On Sun, Apr 26, 2015 at 11:35 AM, Stephen Frost <sfrost@snowman.net> wrote:
Ok, that makes sense.. So is the concern that an INSERT would end up
getting default values while an UPDATE would preserve whatever's there?I don't see that as an issue.
I think it easily could be.
Ok.. Can you elaborate on that? Would it be an issue that's different
from the same thing done as independent commands?
Perhaps it'd be an issue for individuals who attempt to combine some
more complicated INSERT/UPDATE logic and don't realize that they'd get
whatever the existing value is for the non-specified columns rather than
the default value, but I'm sure they'd realize it on testing it and,
well, there's lots of ways users can misuse SQL and PG and get what they
expect 99% of the time (JOIN would be a great example..) only to have
things break one day.
Are you still against having a way to say "go forth and update whatever
non-conflicting columns I've specified in the INSERT, if there is a
conflict"..?Again, not saying it has to be done now, but it'd certainly be nice if
we had it initially because otherwise the ORMs and "frameworks" of the
world will be stuck supporting the more verbose approach for as long as
we support it (~5 years..).The more verbose approach is entirely necessary much of the time. For example:
insert into upsert_race_test (index, count)
values ('541','-1') on conflict update set count=TARGET.count + EXCLUDED.count;Merging like this will be a very common requirement.
I was just about to reply to myself that I didn't intend to say that we
would remove the more verbose syntax but rather that they'd have to
use the more verbose syntax as long as we supported a release which
*didn't* have the simpler syntax, which would be ~5 years.
Thanks!
Stephen
On Sun, Apr 26, 2015 at 11:43 AM, Stephen Frost <sfrost@snowman.net> wrote:
I think it easily could be.
Ok.. Can you elaborate on that? Would it be an issue that's different
from the same thing done as independent commands?
I think that the stuff I linked to describes my concerns exhaustively.
In any case, it's a discussion for another day.
--
Peter Geoghegan
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On Sun, Apr 26, 2015 at 09:34:12AM -0400, Stephen Frost wrote:
* Heikki Linnakangas (hlinnaka@iki.fi) wrote:
On 04/25/2015 12:01 PM, Andres Freund wrote:
INSERT ... ON CONFLICT (cola, colb [WHERE predicate_for_partial]) UPDATE|IGNORE
My problem with the WHERE being inside the parens in the above is that
it's
a) different from CREATE INDEX
b) unclear whether the WHERE belongs to colb or the whole index
expression. The equivalent for aggregates, which I bet is going to be
used less often, caused a fair amount of confusing.That's why I wanted the WHERE outside the (), which requires either
adding DO between the index inference clause, and the action, to avoid
ambiguities in the grammar.Yeah, having the WHERE outside the parens seems much nicer. What is
the ambiguity?I like having it outside the parens also.
Agreed, and I like the DO [ UPDATE | NOTHING ] too.
--
Bruce Momjian <bruce@momjian.us> http://momjian.us
EnterpriseDB http://enterprisedb.com
+ Everyone has their own god. +
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers