INSERT ... ON CONFLICT DO UPDATE with _any_ constraint

pgsqladmin@geoff.dj

almost 11 years ago

In reply to: Geoff Winkless (#1)

Re: INSERT ... ON CONFLICT DO UPDATE with _any_ constraint

On 19 May 2015 at 16:32, I wrote:

In the event that the INSERT triggers a constraint that the UPDATE fails
to resolve, it will still fail in exactly the same way that running the ON
CONFLICT on a specific constraint would fail, so it's not like you gain any
extra value from specifying the constraint, is it?

I don't know why I wrote this paragraph, it's just the product of me
thinking of something else at the same time:
UPDATE obviously doesn't resolve a conflict as such.

Thinking about it more, I suppose if multiple constraints end up triggering
for the same INSERT, it would require UPDATEs of multiple rows. Is that the
issue?

Geoff

simon@2ndQuadrant.com

almost 11 years ago

In reply to: Geoff Winkless (#2)

Re: INSERT ... ON CONFLICT DO UPDATE with _any_ constraint

On 19 May 2015 at 11:49, Geoff Winkless <pgsqladmin@geoff.dj> wrote:

On 19 May 2015 at 16:32, I wrote:

In the event that the INSERT triggers a constraint that the UPDATE fails
to resolve, it will still fail in exactly the same way that running the ON
CONFLICT on a specific constraint would fail, so it's not like you gain any
extra value from specifying the constraint, is it?

I don't know why I wrote this paragraph, it's just the product of me
thinking of something else at the same time:
UPDATE obviously doesn't resolve a conflict as such.

Thinking about it more, I suppose if multiple constraints end up
triggering for the same INSERT, it would require UPDATEs of multiple rows.
Is that the issue?

I'm sure we'll be asked these questions many times.

Can you comment on whether the docs are sufficiently detailed to explain
this answer?

--
Simon Riggs http://www.2ndQuadrant.com/
<http://www.2ndquadrant.com/>
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

pgsqladmin@geoff.dj

almost 11 years ago

In reply to: Simon Riggs (#3)

Re: INSERT ... ON CONFLICT DO UPDATE with _any_ constraint

On 19 May 2015 at 20:11, Simon Riggs <simon@2ndquadrant.com> wrote:

I'm sure we'll be asked these questions many times.

Can you comment on whether the docs are sufficiently detailed to explain
this answer?

Well http://www.postgresql.org/docs/devel/static/sql-insert.html explains
that a conflict_target clause is required but doesn't explain why.

It _does_ make clear that multiple UPDATEs to the same row are not allowed,
but that in itself doesn't automatically restrict the use of multiple
constraint targets; I could easily INSERT a set of values that would
trigger that failure with just one constraint target.

http://www.postgresql.org/docs/devel/static/sql-insert.html talks about how
MySQL's ON DUPLICATE can only act against the first matching row where
multiple constraints match against multiple rows. I suppose if that were
the case here (ie the first excluding row would stop other rows firing
against the UPDATE) would break the deterministic feature, but it's not
clear if that's true or not. I don't see why multiple target rows couldn't
be updated based on multiple constraints, that would not in-and-of-itself
break determinism.

If I'm missing the obvious, accept my apologies.

Geoff

pg@bowt.ie

almost 11 years ago

In reply to: Geoff Winkless (#4)

Re: INSERT ... ON CONFLICT DO UPDATE with _any_ constraint

On Tue, May 19, 2015 at 12:57 PM, Geoff Winkless <pgsqladmin@geoff.dj> wrote:

Well http://www.postgresql.org/docs/devel/static/sql-insert.html explains
that a conflict_target clause is required but doesn't explain why.

Yes, for ON CONFLICT DO UPDATE, it is mandatory.

It _does_ make clear that multiple UPDATEs to the same row are not allowed,
but that in itself doesn't automatically restrict the use of multiple
constraint targets; I could easily INSERT a set of values that would trigger
that failure with just one constraint target.

True.

http://www.postgresql.org/docs/devel/static/sql-insert.html talks about how
MySQL's ON DUPLICATE can only act against the first matching row where
multiple constraints match against multiple rows. I suppose if that were the
case here (ie the first excluding row would stop other rows firing against
the UPDATE) would break the deterministic feature, but it's not clear if
that's true or not. I don't see why multiple target rows couldn't be updated
based on multiple constraints, that would not in-and-of-itself break
determinism.

If I'm missing the obvious, accept my apologies.

It's trivial to modify Postgres to not require that a specific unique
index be inferred, so that you can omit the inference specification
for DO UPDATE just as you can for DO NOTHING. That would make it work
in a similar way to MySQL; whatever actually conflict was detected
would be assumed to be cause to take the alternative update path.

The only reason I can see for wanting to do this is where you're
running a migration or something, and two unique indexes are
equivalent anyway. Like maybe you have a partial index and a
non-partial index, and you're just about to drop one of them. But the
inference specification will do the right thing here anyway --
multiple unique indexes can be inferred for edge cases like this.

I have a hard time imagining why you'd ever not want to be explicit
about what to take the alternative path on for the DO UPDATE variant.
Unless perhaps you have a different UPDATE targetlist and so on
corresponding to that case, which is currently not possible -- but
then what if multiple constraints have would-be violations at the same
time? It gets awfully messy very quickly.

What do you have in mind?

--
Peter Geoghegan

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

pgsqladmin@geoff.dj

almost 11 years ago

In reply to: Peter Geoghegan (#5)

Re: INSERT ... ON CONFLICT DO UPDATE with _any_ constraint

On 19 May 2015 at 21:12, Peter Geoghegan <pg@heroku.com> wrote:

It's trivial to modify Postgres to not require that a specific unique
index be inferred, so that you can omit the inference specification
for DO UPDATE just as you can for DO NOTHING. That would make it work
in a similar way to MySQL; whatever actually conflict was detected
would be assumed to be cause to take the alternative update path.

Except that would break the deterministic behaviour, surely? Because if
you only updated one row based on which constraint matched first, the row
that was updated would depend on the order in which the constraints were
evaluated, yes? I was expecting that matching two constraints would end up
UPDATEing two separate rows.

I have a hard time imagining why you'd ever not want to be explicit

about what to take the alternative path on for the DO UPDATE variant.

What do you have in mind?

If I'm being honest, my main driver is laziness :) I don't mind specifying
the constraint if I can understand why it's required, but otherwise it just
seems like I need to do more typing for no reason. Especially when there's
only one unique constraint on a table.

Geoff

pg@bowt.ie

almost 11 years ago

In reply to: Geoff Winkless (#6)

Re: INSERT ... ON CONFLICT DO UPDATE with _any_ constraint

On Tue, May 19, 2015 at 1:36 PM, Geoff Winkless <pgsqladmin@geoff.dj> wrote:

On 19 May 2015 at 21:12, Peter Geoghegan <pg@heroku.com> wrote:

It's trivial to modify Postgres to not require that a specific unique
index be inferred, so that you can omit the inference specification
for DO UPDATE just as you can for DO NOTHING. That would make it work
in a similar way to MySQL; whatever actually conflict was detected
would be assumed to be cause to take the alternative update path.

Except that would break the deterministic behaviour, surely? Because if you
only updated one row based on which constraint matched first, the row that
was updated would depend on the order in which the constraints were
evaluated, yes? I was expecting that matching two constraints would end up
UPDATEing two separate rows.

Well, it would be deterministic to the extent that the indexes would
be evaluated in OID order. But yes, the first would-be duplicate
violation would make the update path be taken once and only once for
the row proposed for insertion -- at that point, you've given up on
insertion (unless there is a row locking conflict). Just like MySQL, I
believe.

How can you find a would-be violation without inserting? How can you
insert without also violating the other thing? It's far messier than
it first appears.

I have a hard time imagining why you'd ever not want to be explicit
about what to take the alternative path on for the DO UPDATE variant.

What do you have in mind?

If I'm being honest, my main driver is laziness :) I don't mind specifying
the constraint if I can understand why it's required, but otherwise it just
seems like I need to do more typing for no reason. Especially when there's
only one unique constraint on a table.

Well, I don't have zero sympathy for that, but I'm pretty sure that
that's what other people wanted. If I'm being honest, I don't actually
remember how true that was.

--
Peter Geoghegan

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

simon@2ndQuadrant.com

almost 11 years ago

In reply to: Geoff Winkless (#6)

Re: INSERT ... ON CONFLICT DO UPDATE with _any_ constraint

On 19 May 2015 at 16:36, Geoff Winkless <pgsqladmin@geoff.dj> wrote:

On 19 May 2015 at 21:12, Peter Geoghegan <pg@heroku.com> wrote:

It's trivial to modify Postgres to not require that a specific unique
index be inferred, so that you can omit the inference specification
for DO UPDATE just as you can for DO NOTHING. That would make it work
in a similar way to MySQL; whatever actually conflict was detected
would be assumed to be cause to take the alternative update path.

Except that would break the deterministic behaviour, surely? Because if
you only updated one row based on which constraint matched first, the row
that was updated would depend on the order in which the constraints were
evaluated, yes?

It would depend upon the evaluation order, but that would not break
determinism unless you allowed a random evaluation order.

Omitting the clause for DO NOTHING yet requiring it for DO UPDATE doesn't
make sense.

We should allow DO UPDATE to exclude a constraint and apply a deterministic
order to the constraints. 1. PK if it exists. 2. Replica Identity, when not
PK, 3. UNIQUE constraints in name order, like triggers, so users can define
a default evaluation order, just like they do with triggers.

I was expecting that matching two constraints would end up UPDATEing two
separate rows.

It's not clear to me how a single INSERT could cause two or more UPDATEs.

I have a hard time imagining why you'd ever not want to be explicit

about what to take the alternative path on for the DO UPDATE variant.

What do you have in mind?

If I'm being honest, my main driver is laziness :) I don't mind specifying
the constraint if I can understand why it's required, but otherwise it just
seems like I need to do more typing for no reason. Especially when there's
only one unique constraint on a table.

1) Ease of use - Unique constraints don't change very often. This saves
time for the common case where they stay the same. It also saves time if
they do change, because you avoid having to completely recode your app AND
make that happen at exactly the same time you apply the change of unique
constraint.

2) Compatibility with MySQL

--
Simon Riggs http://www.2ndQuadrant.com/
<http://www.2ndquadrant.com/>
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

pg@bowt.ie

almost 11 years ago

In reply to: Simon Riggs (#8)

Re: INSERT ... ON CONFLICT DO UPDATE with _any_ constraint

On Tue, May 19, 2015 at 1:57 PM, Simon Riggs <simon@2ndquadrant.com> wrote:

We should allow DO UPDATE to exclude a constraint and apply a deterministic
order to the constraints. 1. PK if it exists. 2. Replica Identity, when not
PK, 3. UNIQUE constraints in name order, like triggers, so users can define
a default evaluation order, just like they do with triggers.

That seems like something way worse than just allowing it for all constraints.

I have a hard time imagining why you'd ever not want to be explicit
about what to take the alternative path on for the DO UPDATE variant.

What do you have in mind?

If I'm being honest, my main driver is laziness :) I don't mind specifying
the constraint if I can understand why it's required, but otherwise it just
seems like I need to do more typing for no reason. Especially when there's
only one unique constraint on a table.

1) Ease of use - Unique constraints don't change very often. This saves time
for the common case where they stay the same. It also saves time if they do
change, because you avoid having to completely recode your app AND make that
happen at exactly the same time you apply the change of unique constraint.

I don't see how it's possible to change unique constraints in a way
that breaks the inference specification without that actually being
desirable -- naming the constraint by name is an escape hatch that is
generally discouraged. That's the whole point of inference. I put an
awful lot of work into making unique index inference as forgiving as
possible. For example, it doesn't care what order attributes appear
in, or if they appear redundantly, or if an ON CONFLICT unique index
predicate is more selective than any available index that is otherwise
satisfied (there is a call to predicate_implied_by()).

2) Compatibility with MySQL

But what you describe isn't compatible with MySQL. It's totally novel.

--
Peter Geoghegan

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#10

simon@2ndQuadrant.com

almost 11 years ago

In reply to: Peter Geoghegan (#9)

Re: INSERT ... ON CONFLICT DO UPDATE with _any_ constraint

On 19 May 2015 at 17:10, Peter Geoghegan <pg@heroku.com> wrote:

On Tue, May 19, 2015 at 1:57 PM, Simon Riggs <simon@2ndquadrant.com>
wrote:

We should allow DO UPDATE to exclude a constraint and apply a

deterministic

order to the constraints. 1. PK if it exists. 2. Replica Identity, when

not

PK, 3. UNIQUE constraints in name order, like triggers, so users can

define

a default evaluation order, just like they do with triggers.

That seems like something way worse than just allowing it for all
constraints.

I'm talking about the evaluation order; it would still match all
constraints, otherwise they wouldn't be constraints.

2) Compatibility with MySQL

But what you describe isn't compatible with MySQL. It's totally novel.

Upthread you said

"It's trivial to modify Postgres to not require that a specific unique
index be inferred, so that you can omit the inference specification
for DO UPDATE just as you can for DO NOTHING. That would make it work
in a similar way to MySQL"

Similar is good and useful. Full compatibility is even better.

--
Simon Riggs http://www.2ndQuadrant.com/
<http://www.2ndquadrant.com/>
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

#11

pg@bowt.ie

almost 11 years ago

In reply to: Simon Riggs (#10)

Re: INSERT ... ON CONFLICT DO UPDATE with _any_ constraint

On Tue, May 19, 2015 at 2:28 PM, Simon Riggs <simon@2ndquadrant.com> wrote:

On 19 May 2015 at 17:10, Peter Geoghegan <pg@heroku.com> wrote:

On Tue, May 19, 2015 at 1:57 PM, Simon Riggs <simon@2ndquadrant.com>
wrote:

We should allow DO UPDATE to exclude a constraint and apply a
deterministic
order to the constraints. 1. PK if it exists. 2. Replica Identity, when
not
PK, 3. UNIQUE constraints in name order, like triggers, so users can
define
a default evaluation order, just like they do with triggers.

That seems like something way worse than just allowing it for all
constraints.

I'm talking about the evaluation order; it would still match all
constraints, otherwise they wouldn't be constraints.

But it doesn't match all constraints when a would-be conflict is
detected. IOW, we lock the row and go to UPDATE, and then the user is
on their own insofar as avoiding duplicate violations goes. What might
have happened in other unique indexes (had that original would-be dup
violation not occurred) is irrelevant (with the MySQL thing, say) --
you better just get it right, and know that if a dup violation occurs
it was the one you anticipated (e.g. because there is only one unique
index anyway). With Postgres, we want to make sure that the user has
put thought into the condition they take that update path on, and so
it is mandatory (it can infer multiple unique indexes, but only when
they're basically equivalent for this purpose).

I think I agree with you, though: We should change things so that the
relcache gives indexes in something like the ordering that you
outline, rather than in the current arbitrary (though consistent) OID
order. However, I think that this should be done to avoid unnecessary
index bloat (fail early), and I don't think it makes much sense to do
it on the grounds you outline. This is because you can still easily
take the alternative path for the wrong reason, causing subtle
"logical corruption". You can still not match all indexes because one
index had a would-be dup violation (and so, as I said, it doesn't
matter what would have happened with the other ones). Maybe you still
get a dup violation from the update, "saving" you, but who wants to
rely on that?

2) Compatibility with MySQL

But what you describe isn't compatible with MySQL. It's totally novel.

Upthread you said

"It's trivial to modify Postgres to not require that a specific unique
index be inferred, so that you can omit the inference specification
for DO UPDATE just as you can for DO NOTHING. That would make it work
in a similar way to MySQL"

Similar is good and useful. Full compatibility is even better.

I actually do not feel strongly that it would be terrible to allow the
user to omit an inference clause for the DO UPDATE variant (on the
grounds of that being closer to MySQL). After all, we don't mandate
that the user specifies an explicit targetlist for INSERT, and that
seems like a footgun to me. If you want to make the case for doing
things that way, I probably will not oppose it. FWIW, I don't think
it's unreasonable to have a little discussion on fine points of
semantics like that post feature-freeze.

--
Peter Geoghegan

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#12

Tom Lane

tgl@sss.pgh.pa.us

almost 11 years ago

In reply to: Peter Geoghegan (#11)

Re: INSERT ... ON CONFLICT DO UPDATE with _any_ constraint

Peter Geoghegan <pg@heroku.com> writes:

I think I agree with you, though: We should change things so that the
relcache gives indexes in something like the ordering that you
outline, rather than in the current arbitrary (though consistent) OID
order.

I'm fairly sure that there are aspects of the code that rely on indexes
being returned by RelationGetIndexList() in a stable order. While I doubt
that has to be exactly increasing-OID-order, I'm quite concerned about
allowing the order to depend on mutable aspects of the indexes, like
names.

regards, tom lane

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#13

pg@bowt.ie

almost 11 years ago

In reply to: Tom Lane (#12)

Re: INSERT ... ON CONFLICT DO UPDATE with _any_ constraint

On Tue, May 19, 2015 at 5:03 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Peter Geoghegan <pg@heroku.com> writes:

I think I agree with you, though: We should change things so that the
relcache gives indexes in something like the ordering that you
outline, rather than in the current arbitrary (though consistent) OID
order.

I'm fairly sure that there are aspects of the code that rely on indexes
being returned by RelationGetIndexList() in a stable order. While I doubt
that has to be exactly increasing-OID-order, I'm quite concerned about
allowing the order to depend on mutable aspects of the indexes, like
names.

I thought the importance of the ordering was just down to some AMs
(like hash) using heavyweight locks. This could cause unprincipled
deadlocks in the face of an inconsistent ordering. nbtree used to use
page-level heavyweight locks many years ago, too, so this used to be a
big, obvious requirement. Maybe there is another reason, but AFAICR
there are no hints of that from the relevant code, and I've looked
carefully.

If it was ever changed, I think it could be done in a way that didn't
add any problems, assuming I've accounted for all the ways in which
changing the ordering could be problematic.
--
Peter Geoghegan

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#14

pgsqladmin@geoff.dj

almost 11 years ago

In reply to: Simon Riggs (#8)

Re: INSERT ... ON CONFLICT DO UPDATE with _any_ constraint

On 19 May 2015 at 21:57, Simon Riggs <simon@2ndquadrant.com> wrote:

It's not clear to me how a single INSERT could cause two or more UPDATEs.

CREATE TABLE mytable (
c1 int NOT NULL,
c2 int NOT NULL,
PRIMARY KEY (c1),
UNIQUE (c2)

);

INSERT INTO mytable (c1, c2) (10, 20);
INSERT INTO mytable (c1, c2) (11, 21);
INSERT INTO mytable (c1, c2) (10, 21) ON CONFLICT DO UPDATE .....

Or did you mean from a coding point of view how it would be possible to
implement?

Geoff

#15

simon@2ndQuadrant.com

almost 11 years ago

In reply to: Peter Geoghegan (#11)

Re: INSERT ... ON CONFLICT DO UPDATE with _any_ constraint

On 19 May 2015 at 19:59, Peter Geoghegan <pg@heroku.com> wrote:

On Tue, May 19, 2015 at 2:28 PM, Simon Riggs <simon@2ndquadrant.com>
wrote:

On 19 May 2015 at 17:10, Peter Geoghegan <pg@heroku.com> wrote:

On Tue, May 19, 2015 at 1:57 PM, Simon Riggs <simon@2ndquadrant.com>
wrote:

We should allow DO UPDATE to exclude a constraint and apply a
deterministic
order to the constraints. 1. PK if it exists. 2. Replica Identity,

when

not
PK, 3. UNIQUE constraints in name order, like triggers, so users can
define
a default evaluation order, just like they do with triggers.

That seems like something way worse than just allowing it for all
constraints.

I'm talking about the evaluation order; it would still match all
constraints, otherwise they wouldn't be constraints.

But it doesn't match all constraints when a would-be conflict is
detected.

No not all, but we can evaluate the constraints one at a time in a
consistent order.

My point is this: We do not need to explicitly specify the constraint we
wish to test to ensure that we get deterministic behaviour. So it is
possible to avoid specifying a constraint/conflict target and still get
deterministic behaviour (which is essential).

With Postgres, we want to make sure that the user has

put thought into the condition they take that update path on, and so
it is mandatory (it can infer multiple unique indexes, but only when
they're basically equivalent for this purpose).

If I have two constraints and I think about it, I would want to be able to
specify this...

INSERT
ON CONFLICT (col1) DO UPDATE... (handle it one way)
ON CONFLICT (col2) DO UPDATE... (handle it 2nd way)

but I cannot with the current syntax.

It seems strange to force the user to think about constraint handling and
then not offer them any choices once they have done the thinking.

If the update is the same no matter which constraint is violated, why would
I need to specify the constraint? We're forcing the developer to make an
arbitrary choice between two constraints.

I actually do not feel strongly that it would be terrible to allow the

user to omit an inference clause for the DO UPDATE variant (on the
grounds of that being closer to MySQL). After all, we don't mandate
that the user specifies an explicit targetlist for INSERT, and that
seems like a footgun to me. If you want to make the case for doing
things that way, I probably will not oppose it. FWIW, I don't think
it's unreasonable to have a little discussion on fine points of
semantics like that post feature-freeze.

We will see many people ask why they have to specify constraints explicitly.

As I've pointed out, if the underlying model changes then you now have to
explicitly recode all the SQL as well AND time that exactly so you roll out
the new code at the same time you add/change constraints. That makes it
much harder to use this feature than I would like.

--
Simon Riggs http://www.2ndQuadrant.com/
<http://www.2ndquadrant.com/>
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

#16

simon@2ndQuadrant.com

almost 11 years ago

In reply to: Geoff Winkless (#14)

Re: INSERT ... ON CONFLICT DO UPDATE with _any_ constraint

On 20 May 2015 at 05:49, Geoff Winkless <pgsqladmin@geoff.dj> wrote:

On 19 May 2015 at 21:57, Simon Riggs <simon@2ndquadrant.com> wrote:

It's not clear to me how a single INSERT could cause two or more UPDATEs.

CREATE TABLE mytable (
c1 int NOT NULL,
c2 int NOT NULL,
PRIMARY KEY (c1),
UNIQUE (c2)

);

INSERT INTO mytable (c1, c2) (10, 20);
INSERT INTO mytable (c1, c2) (11, 21);
INSERT INTO mytable (c1, c2) (10, 21) ON CONFLICT DO UPDATE .....

Or did you mean from a coding point of view how it would be possible to
implement?

I mean "how could that possibly have useful meaning?".

--
Simon Riggs http://www.2ndQuadrant.com/
<http://www.2ndquadrant.com/>
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

#17

pg@bowt.ie

almost 11 years ago

In reply to: Simon Riggs (#15)

Re: INSERT ... ON CONFLICT DO UPDATE with _any_ constraint

On Thu, May 21, 2015 at 9:51 AM, Simon Riggs <simon@2ndquadrant.com> wrote:

No not all, but we can evaluate the constraints one at a time in a
consistent order.

We do so currently. Now, you point out that that might not be the most
useful ordering, and as it happens I agree. But changing that ordering
to not just be OID-ordering, but to put the PK first (and so on) isn't
going to fundamentally change anything. FWIW, I think that that much
(PK first) will usually accidentally be true anyway, because of the
way that create table statement is originally executed.

My point is this: We do not need to explicitly specify the constraint we
wish to test to ensure that we get deterministic behaviour. So it is
possible to avoid specifying a constraint/conflict target and still get
deterministic behaviour (which is essential).

It is deterministic, but omitting an inference specification still
risks taking the wrong path. You seem not be acknowledging that you
can still take the wrong path due to a dup violation in the wrong
constraint. So being guaranteed to have observed or not observed a
would-be dup violation in the PK does not buy much.

If I have two constraints and I think about it, I would want to be able to
specify this...

INSERT
ON CONFLICT (col1) DO UPDATE... (handle it one way)
ON CONFLICT (col2) DO UPDATE... (handle it 2nd way)

but I cannot with the current syntax.

It seems strange to force the user to think about constraint handling and
then not offer them any choices once they have done the thinking.

What if both constraints are violated? Won't the update end up in trouble?

If the update is the same no matter which constraint is violated, why would
I need to specify the constraint? We're forcing the developer to make an
arbitrary choice between two constraints.

Why would the update be the same, though? How could that make sense?
You're still going to have to update both unique-indexed columns with
something, and that could fail.

We will see many people ask why they have to specify constraints explicitly.

I'm not sure that we will, actually, but as I said, go ahead and
propose removing the restriction if you think it's important (maybe
start a thread on it).

As I've pointed out, if the underlying model changes then you now have to
explicitly recode all the SQL as well AND time that exactly so you roll out
the new code at the same time you add/change constraints. That makes it much
harder to use this feature than I would like.

If the underlying model changes, then it's good that your queries
break, because they're predicated on the original model. I don't think
that happens very often at all. What is much more routine - adding
redundant indexes to reindex using CREATE INDEX CONCURRENTLY, or
changing the predicate on whatever partial unique indexes happen to be
defined on the table - is handled gracefully.

--
Peter Geoghegan

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#18

simon@2ndQuadrant.com

almost 11 years ago

In reply to: Peter Geoghegan (#17)

Re: INSERT ... ON CONFLICT DO UPDATE with _any_ constraint

On 21 May 2015 at 14:25, Peter Geoghegan <pg@heroku.com> wrote:

If I have two constraints and I think about it, I would want to be able

to

specify this...

INSERT
ON CONFLICT (col1) DO UPDATE... (handle it one way)
ON CONFLICT (col2) DO UPDATE... (handle it 2nd way)

but I cannot with the current syntax.

It seems strange to force the user to think about constraint handling and
then not offer them any choices once they have done the thinking.

What if both constraints are violated? Won't the update end up in trouble?

Great question. We don't handle that at the moment. So how do we handle
that?

If the update is the same no matter which constraint is violated, why

would

I need to specify the constraint? We're forcing the developer to make an
arbitrary choice between two constraints.

Why would the update be the same, though?

*If* is the keyword there.

How could that make sense?

It wouldn't, that is the point. So why does the current syntax force that?

You're still going to have to update both unique-indexed columns with
something, and that could fail.

ISTM clear that you might want to handle each kind of violation
differently, but we cannot.

We will see many people ask why they have to specify constraints
explicitly.

I'm not sure that we will, actually, but as I said, go ahead and
propose removing the restriction if you think it's important (maybe
start a thread on it).

I am. I have. Many times. What is wrong with this thread or all of the
other times I said it?

Please look at the $SUBJECT of this thread. We're here now.

As I've pointed out, if the underlying model changes then you now have to

explicitly recode all the SQL as well AND time that exactly so you roll

out

the new code at the same time you add/change constraints. That makes it

much

harder to use this feature than I would like.

If the underlying model changes, then it's good that your queries
break, because they're predicated on the original model. I don't think

that happens very often at all.

If it seldom happens, then why do we need to specify the conflict-target?
If I know there is only one unique constraint, why can I not rely upon that
knowledge?

What is much more routine - adding
redundant indexes to reindex using CREATE INDEX CONCURRENTLY, or
changing the predicate on whatever partial unique indexes happen to be
defined on the table - is handled gracefully.

What has CREATE INDEX CONCURRENTLY got to do with this? If you don't
specify the conflict-target at all, it wouldn't matter what the indexes
are. If you have two indexes the same then it clearly wouldn't matter which
one was checked first.

--
Simon Riggs http://www.2ndQuadrant.com/
<http://www.2ndquadrant.com/>
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

#19

Stephen Frost

sfrost@snowman.net

almost 11 years ago

In reply to: Simon Riggs (#18)

Re: INSERT ... ON CONFLICT DO UPDATE with _any_ constraint

* Simon Riggs (simon@2ndQuadrant.com) wrote:

On 21 May 2015 at 14:25, Peter Geoghegan <pg@heroku.com> wrote:

If the update is the same no matter which constraint is violated, why

would

I need to specify the constraint? We're forcing the developer to make an
arbitrary choice between two constraints.

Why would the update be the same, though?

*If* is the keyword there.

Agreed.

We will see many people ask why they have to specify constraints
explicitly.

I'm not sure that we will, actually, but as I said, go ahead and
propose removing the restriction if you think it's important (maybe
start a thread on it).

I am. I have. Many times. What is wrong with this thread or all of the
other times I said it?

Please look at the $SUBJECT of this thread. We're here now.

I've also asked for this.

Thanks!

Stephen

#20

pg@bowt.ie

almost 11 years ago

In reply to: Simon Riggs (#18)

Re: INSERT ... ON CONFLICT DO UPDATE with _any_ constraint

On Thu, May 21, 2015 at 11:55 AM, Simon Riggs <simon@2ndquadrant.com> wrote:

It seems strange to force the user to think about constraint handling
and
then not offer them any choices once they have done the thinking.

What if both constraints are violated? Won't the update end up in trouble?

Great question. We don't handle that at the moment. So how do we handle
that?

By writing two separate INSERT ... ON CONFLICT DO UPDATE statements?
There is very little or no disadvantage to doing it that way.

If the update is the same no matter which constraint is violated, why
would
I need to specify the constraint? We're forcing the developer to make an
arbitrary choice between two constraints.

Why would the update be the same, though?

*If* is the keyword there.

I'm having a hard time imagining a scenario in which the update would
be the same. That's why I asked how it could be. I'm asking for a
practical example involving plausible business rules.

How could that make sense?

It wouldn't, that is the point. So why does the current syntax force that?

You're still going to have to update both unique-indexed columns with
something, and that could fail.

ISTM clear that you might want to handle each kind of violation differently,
but we cannot.

I think you can -- with two statements.

We will see many people ask why they have to specify constraints
explicitly.

I'm not sure that we will, actually, but as I said, go ahead and
propose removing the restriction if you think it's important (maybe
start a thread on it).

I am. I have. Many times. What is wrong with this thread or all of the other
times I said it?

Please look at the $SUBJECT of this thread. We're here now.

What do you want me to do about it? I've said that I think that what
you say about not mandating the inference clause in the parser could
be okay. If you want to change it, obviously you're going to need to
get some buy in, and this thread could easily be missed. I'm not
willing to defend mandating it, and I'm not willing to argue for
removing it (to be clear, I think being able to infer a unique index
is very important, but that doesn't mean that I'm attached to
mandating it for UPDATE). That's all.

As I've pointed out, if the underlying model changes then you now have
to
explicitly recode all the SQL as well AND time that exactly so you roll
out
the new code at the same time you add/change constraints. That makes it
much
harder to use this feature than I would like.

If the underlying model changes, then it's good that your queries
break, because they're predicated on the original model. I don't think

that happens very often at all.

If it seldom happens, then why do we need to specify the conflict-target? If
I know there is only one unique constraint, why can I not rely upon that
knowledge?

You say that as if I'm giving you pushback on that point -- for the
third time, I'm not.

If there is more than one unique constraint (or if there might be in
the future), why take the chance that the update will take the wrong
path? I'm not saying that that's the overriding consideration, but it
is certainly a big consideration.

What is much more routine - adding
redundant indexes to reindex using CREATE INDEX CONCURRENTLY, or
changing the predicate on whatever partial unique indexes happen to be
defined on the table - is handled gracefully.

What has CREATE INDEX CONCURRENTLY got to do with this? If you don't specify
the conflict-target at all, it wouldn't matter what the indexes are. If you
have two indexes the same then it clearly wouldn't matter which one was
checked first.

I'm not talking about that here. What I meant is that changes to
unique indexes that don't affect the underlying model (as you put it)
don't break your queries. Changes that do *will* break your queries.
And that's definitely a good thing. I am pretty neutral on whether
it's right to mandate that DO UPDATE statements *must* buy in to this.

--
Peter Geoghegan

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#21

simon@2ndQuadrant.com

almost 11 years ago

In reply to: Peter Geoghegan (#20)

#22

pg@bowt.ie

almost 11 years ago

In reply to: Simon Riggs (#21)

#23