ON CONFLICT and WHERE

Started by Adrian Klaverover 3 years ago5 messagesgeneral
Jump to latest
#1Adrian Klaver
adrian.klaver@aklaver.com

In process of answering an SO question I ran across the below.

The original question example:

CREATE TABLE books (
id int4 NOT NULL,
version int8 NOT NULL,
updated timestamp NULL,
CONSTRAINT books_pkey PRIMARY KEY (id)
);

INSERT INTO books VALUES (12, 0, CURRENT_TIMESTAMP)
ON CONFLICT (id)
WHERE version IS NULL OR updated + INTERVAL '2min' < CURRENT_TIMESTAMP
DO UPDATE
SET version = books.version + 1, updated = CURRENT_TIMESTAMP;

select *, CURRENT_TIMESTAMP, updated + INTERVAL '2min' <
CURRENT_TIMESTAMP from books where id = 12;

INSERT INTO books VALUES (12, 0, CURRENT_TIMESTAMP)
ON CONFLICT (id)
WHERE updated IS NULL OR updated + INTERVAL '2min' < CURRENT_TIMESTAMP
DO UPDATE
SET version = books.version + 1, updated = CURRENT_TIMESTAMP;

select *, CURRENT_TIMESTAMP, updated + INTERVAL '2min' <
CURRENT_TIMESTAMP from books where id = 12;

With select results as:

id | version | updated | current_timestamp
| ?column?
----+---------+----------------------------+--------------------------------+----------
12 | 0 | 11/13/2022 12:21:38.032578 | 11/13/2022 12:21:38.057545
PST | f

id | version | updated | current_timestamp
| ?column?
----+---------+----------------------------+--------------------------------+----------
12 | 1 | 11/13/2022 12:21:38.058673 | 11/13/2022 12:21:40.686231
PST | f

I have not used WHERE with ON CONFLICT myself so it took longer then I
care to admit to correct the above to:

DROP TABLE IF EXISTS books;

CREATE TABLE books (
id int4 NOT NULL,
version int8 NOT NULL,
updated timestamp NULL,
CONSTRAINT books_pkey PRIMARY KEY (id)
);

INSERT INTO books VALUES (12, 0, CURRENT_TIMESTAMP)
ON CONFLICT (id)
DO UPDATE
SET version = books.version + 1, updated = CURRENT_TIMESTAMP
WHERE books.version IS NULL OR books.updated + INTERVAL '2min' <
CURRENT_TIMESTAMP;

select *, CURRENT_TIMESTAMP, updated + INTERVAL '2min' <
CURRENT_TIMESTAMP from books where id = 12;

INSERT INTO books VALUES (12, 0, CURRENT_TIMESTAMP)
ON CONFLICT (id)
DO UPDATE
SET version = books.version + 1, updated = CURRENT_TIMESTAMP
WHERE books.version IS NULL OR books.updated + INTERVAL '2min' <
CURRENT_TIMESTAMP;

select *, CURRENT_TIMESTAMP, updated + INTERVAL '2min' <
CURRENT_TIMESTAMP from books where id = 12

With select results as:

id | version | updated | current_timestamp
| ?column?
----+---------+----------------------------+--------------------------------+----------
12 | 0 | 11/13/2022 12:32:01.427769 | 11/13/2022 12:32:01.463705
PST | f

id | version | updated | current_timestamp
| ?column?
----+---------+----------------------------+--------------------------------+----------
12 | 0 | 11/13/2022 12:32:01.427769 | 11/13/2022 12:32:01.476484
PST | f

I ran this on both version 14 and 15 with same results.

The question is why did the first case just ignore the WHERE instead of
throwing a syntax error?

--
Adrian Klaver
adrian.klaver@aklaver.com

#2Tom Lane
tgl@sss.pgh.pa.us
In reply to: Adrian Klaver (#1)
Re: ON CONFLICT and WHERE

Adrian Klaver <adrian.klaver@aklaver.com> writes:

INSERT INTO books VALUES (12, 0, CURRENT_TIMESTAMP)
ON CONFLICT (id)
WHERE updated IS NULL OR updated + INTERVAL '2min' < CURRENT_TIMESTAMP
DO UPDATE
SET version = books.version + 1, updated = CURRENT_TIMESTAMP;

I have not used WHERE with ON CONFLICT myself so it took longer then I
care to admit to correct the above to:

INSERT INTO books VALUES (12, 0, CURRENT_TIMESTAMP)
ON CONFLICT (id)
DO UPDATE
SET version = books.version + 1, updated = CURRENT_TIMESTAMP
WHERE books.version IS NULL OR books.updated + INTERVAL '2min' <
CURRENT_TIMESTAMP;

The question is why did the first case just ignore the WHERE instead of
throwing a syntax error?

A WHERE placed there is an index_predicate attachment to the ON CONFLICT
clause. It doesn't have any run-time effect other than to allow partial
indexes to be chosen as arbiter indexes. TFM explains

index_predicate

Used to allow inference of partial unique indexes. Any indexes
that satisfy the predicate (which need not actually be partial
indexes) can be inferred.

This strikes me as a bit of a foot-gun. I wonder if we should make
it safer by insisting that the resolved index be partial when there's
a WHERE clause here. (This documentation text is about as clear as
mud, too. What does "inferred" mean here? I think it means "chosen as
arbiter index", but maybe I misunderstand.)

regards, tom lane

#3Adrian Klaver
adrian.klaver@aklaver.com
In reply to: Tom Lane (#2)
Re: ON CONFLICT and WHERE

On 11/13/22 13:07, Tom Lane wrote:

Adrian Klaver <adrian.klaver@aklaver.com> writes:

INSERT INTO books VALUES (12, 0, CURRENT_TIMESTAMP)
ON CONFLICT (id)
WHERE updated IS NULL OR updated + INTERVAL '2min' < CURRENT_TIMESTAMP
DO UPDATE
SET version = books.version + 1, updated = CURRENT_TIMESTAMP;

I have not used WHERE with ON CONFLICT myself so it took longer then I
care to admit to correct the above to:

INSERT INTO books VALUES (12, 0, CURRENT_TIMESTAMP)
ON CONFLICT (id)
DO UPDATE
SET version = books.version + 1, updated = CURRENT_TIMESTAMP
WHERE books.version IS NULL OR books.updated + INTERVAL '2min' <
CURRENT_TIMESTAMP;

The question is why did the first case just ignore the WHERE instead of
throwing a syntax error?

A WHERE placed there is an index_predicate attachment to the ON CONFLICT
clause. It doesn't have any run-time effect other than to allow partial
indexes to be chosen as arbiter indexes. TFM explains

index_predicate

Used to allow inference of partial unique indexes. Any indexes
that satisfy the predicate (which need not actually be partial
indexes) can be inferred.

This strikes me as a bit of a foot-gun. I wonder if we should make
it safer by insisting that the resolved index be partial when there's
a WHERE clause here. (This documentation text is about as clear as
mud, too. What does "inferred" mean here? I think it means "chosen as
arbiter index", but maybe I misunderstand.)

Alright I see how another use of WHERE comes into play.

I do agree with the clarity of the description, especially after looking
at the example:

"
Insert new distributor if possible; otherwise DO NOTHING. Example
assumes a unique index has been defined that constrains values appearing
in the did column on a subset of rows where the is_active Boolean column
evaluates to true:

-- This statement could infer a partial unique index on "did"
-- with a predicate of "WHERE is_active", but it could also
-- just use a regular unique constraint on "did"
INSERT INTO distributors (did, dname) VALUES (10, 'Conrad International')
ON CONFLICT (did) WHERE is_active DO NOTHING;
"

I honestly cannot figure out what that is saying.

regards, tom lane

--
Adrian Klaver
adrian.klaver@aklaver.com

In reply to: Tom Lane (#2)
Re: ON CONFLICT and WHERE

On Sun, Nov 13, 2022 at 1:07 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:

A WHERE placed there is an index_predicate attachment to the ON CONFLICT
clause. It doesn't have any run-time effect other than to allow partial
indexes to be chosen as arbiter indexes. TFM explains

index_predicate

Used to allow inference of partial unique indexes. Any indexes
that satisfy the predicate (which need not actually be partial
indexes) can be inferred.

This strikes me as a bit of a foot-gun. I wonder if we should make
it safer by insisting that the resolved index be partial when there's
a WHERE clause here.

I don't think that it would be safer.

Adrian has asked why it's possible to attach an arbitrary
index_predicate type WHERE clause to an ON CONFLICT query, without
that really changing the behavior of the statement. That *is* a little
odd, so it's certainly a fair question (I can recall perhaps as many
as 5 similar questions over the years). But it's not the end of the
world, either -- there are far worse things.

I think that it would be a lot worse (just for example) to have your
ON CONFLICT query suddenly start throwing an ERROR in production, just
because you replaced a partial unique index with a unique constraint.
If we have a suitable unique index or constraint, why wouldn't we use
it in ON CONFLICT? Maybe it won't work out that way (maybe there won't
be any suitable unique index or constraint), but why not do our utmost
to insulate the user from what might be a serious production issue?
That was the guiding principle.

Overall I'm quite happy with the amount of foot-guns ON CONFLICT has,
especially compared to other comparable features in other DB systems
(which had plenty). There are one or two ostensibly odd things about
the syntax that are downstream consequences of trying to make the
constraint/unique index inference process maximally forgiving. I'm
pretty happy with that trade-off.

(This documentation text is about as clear as
mud, too. What does "inferred" mean here? I think it means "chosen as
arbiter index", but maybe I misunderstand.)

Unique index/constraint inference is the process by which we choose an
arbiter index. See the second paragraph of the "ON CONFLICT Clause"
section of the INSERT docs.

--
Peter Geoghegan

#5jian he
jian.universality@gmail.com
In reply to: Adrian Klaver (#3)
Re: ON CONFLICT and WHERE

On Mon, Nov 14, 2022 at 2:55 AM Adrian Klaver <adrian.klaver@aklaver.com>
wrote:

On 11/13/22 13:07, Tom Lane wrote:

Adrian Klaver <adrian.klaver@aklaver.com> writes:

INSERT INTO books VALUES (12, 0, CURRENT_TIMESTAMP)
ON CONFLICT (id)
WHERE updated IS NULL OR updated + INTERVAL '2min' < CURRENT_TIMESTAMP
DO UPDATE
SET version = books.version + 1, updated = CURRENT_TIMESTAMP;

I have not used WHERE with ON CONFLICT myself so it took longer then I
care to admit to correct the above to:

INSERT INTO books VALUES (12, 0, CURRENT_TIMESTAMP)
ON CONFLICT (id)
DO UPDATE
SET version = books.version + 1, updated = CURRENT_TIMESTAMP
WHERE books.version IS NULL OR books.updated + INTERVAL '2min' <
CURRENT_TIMESTAMP;

The question is why did the first case just ignore the WHERE instead of
throwing a syntax error?

A WHERE placed there is an index_predicate attachment to the ON CONFLICT
clause. It doesn't have any run-time effect other than to allow partial
indexes to be chosen as arbiter indexes. TFM explains

index_predicate

Used to allow inference of partial unique indexes. Any indexes
that satisfy the predicate (which need not actually be partial
indexes) can be inferred.

This strikes me as a bit of a foot-gun. I wonder if we should make
it safer by insisting that the resolved index be partial when there's
a WHERE clause here. (This documentation text is about as clear as
mud, too. What does "inferred" mean here? I think it means "chosen as
arbiter index", but maybe I misunderstand.)

Alright I see how another use of WHERE comes into play.

I do agree with the clarity of the description, especially after looking
at the example:

"
Insert new distributor if possible; otherwise DO NOTHING. Example
assumes a unique index has been defined that constrains values appearing
in the did column on a subset of rows where the is_active Boolean column
evaluates to true:

-- This statement could infer a partial unique index on "did"
-- with a predicate of "WHERE is_active", but it could also
-- just use a regular unique constraint on "did"
INSERT INTO distributors (did, dname) VALUES (10, 'Conrad International')
ON CONFLICT (did) WHERE is_active DO NOTHING;
"

I honestly cannot figure out what that is saying.

regards, tom lane

--
Adrian Klaver
adrian.klaver@aklaver.com

INSERT INTO books VALUES (12, 0, CURRENT_TIMESTAMP)
ON CONFLICT (id)
WHERE version IS NULL OR updated + INTERVAL '2min' < CURRENT_TIMESTAMP
DO UPDATE
SET version = books.version + 1, updated = CURRENT_TIMESTAMP;

Since id is already the primary key, it skipped the WHERE part. it resolves
to the DO UPDATE part.

from test code.

create table insertconflicttest(key int4, fruit text);
create unique index partial_key_index on insertconflicttest(key) where
fruit like '%berry';

In this case, the on conflict clause should be exactly like *on conflict
(key) where fruit like '%berry'*

-- fails
insert into insertconflicttest values (23, 'Blackberry') on conflict (key)
do update set fruit = excluded.fruit;
ERROR: there is no unique or exclusion constraint matching the ON CONFLICT
specification
insert into insertconflicttest values (23, 'Blackberry') on conflict (key)
where fruit like '%berry' or fruit = 'consequential' do nothing;
ERROR: there is no unique or exclusion constraint matching the ON CONFLICT
specification
insert into insertconflicttest values (23, 'Blackberry') on conflict
(fruit) where fruit like '%berry' do update set fruit = excluded.fruit;
ERROR: there is no unique or exclusion constraint matching the ON CONFLICT
specification

--
I recommend David Deutsch's <<The Beginning of Infinity>>

Jian