BUG #14343: UPSERT (ON CONFLICT) doesn't check ON CONFLICT constraint first

Started by Reyes Ponceover 9 years ago6 messagesbugs

reyes.r.ponce@gmail.com

over 9 years ago

The following bug has been logged on the website:

Bug reference: 14343
Logged by: Reyes Ponce
Email address: reyes.r.ponce@gmail.com
PostgreSQL version: 9.5.2
Operating system: RHEL 4.8.2-16, 64-bit
Description:

-- Create a table with a key, one not null column, and one nullable
column.

CREATE TABLE public.MyTable
(
MY_ID integer primary key NOT NULL,
COL1 integer NOT NULL,
COL2 integer,
CRETN_TS timestamp with time zone NOT NULL,
CRETN_USER_ID varchar NOT NULL,
UPDT_TS timestamp with time zone ,
UPDT_USER_ID varchar
);

-- Create a function to upsert whatever is passed in.

CREATE OR REPLACE FUNCTION public.upsert_mytable(
i_my_id integer,
i_col1val integer DEFAULT NULL::integer,
i_col2val integer DEFAULT NULL::integer
)
RETURNS json AS
$BODY$
DECLARE
get_rows json;
beginSql text :=
'INSERT INTO public.MyTable(
MY_ID, COL1, COL2, CRETN_TS, CRETN_USER_ID, UPDT_TS,
UPDT_USER_ID)
VALUES ($1, $2, $3, NOW(), current_user, NOW(), current_user)
ON CONFLICT(MY_ID)
DO UPDATE SET UPDT_TS = NOW(), UPDT_USER_ID = current_user, ';
comma text := '';
endSql text := '';
sql text := '';
BEGIN

-- Validate input parameters - MUST SPECIFY PK

IF ( i_my_id is NULL OR i_my_id = '0') THEN
Raise Exception 'INVALID INPUT: Id cannot be null' using errcode =
'3';
END IF;

/* Create middle SQL (the fields to update).

After we see the first non-null parameter, set the comma to a comma
instead of blank...

The $ variables line up with the order in the "EXECUTE sql USING"
statement below, not in order in the parameter list above.
*/
IF (i_col1val is not NULL) THEN
endSql := endSql || comma || 'COL1 = $2';
comma := ', ';
END IF;

IF (i_col2val is not NULL) THEN
endSql := endSql || comma || 'COL2 = $3';
comma := ', ';
END IF;

-- raise NOTICE '%', beginSql; /* DEBUGGING CODE - Outputs message in
plAdmin messages tab. */
-- raise NOTICE '%', endSql; /* DEBUGGING CODE - Outputs message in
plAdmin messages tab. */

sql := beginSql || endSql;

-- raise NOTICE '%', sql; /* DEBUGGING CODE - Outputs message in plAdmin
messages tab. */

/* The $ variables line up with the order in the USING clause below, not in
order in the parameter list. */
EXECUTE sql
USING i_my_id, i_col1val, i_col2val;

-- Return json.

select row_to_json(f) INTO get_rows
from (
SELECT MY_ID, COL1, COL2, cretn_ts, cretn_user_id, updt_ts,
updt_user_id
FROM public.MyTable
WHERE MY_ID = i_my_id) f;

RETURN get_rows;

END
$BODY$
LANGUAGE plpgsql VOLATILE
COST 100;

-- Call the SP for initial insert (all columns). Works fine.
SELECT public.upsert_mytable(
1,
2,
3
);

-- Call the SP for update of the nullable column. Error because not checking
ON CONFLICT condition first.
SELECT public.upsert_mytable(
1,
NULL,
5
);

/*
ERROR: null value in column "col1" violates not-null constraint
DETAIL: Failing row contains (1, null, 5, 2016-09-27 17:32:51.054896+00,
pl_mstr_usr, 2016-09-27 17:32:51.054896+00, pl_mstr_usr).
CONTEXT: SQL statement "INSERT INTO public.MyTable(
MY_ID, COL1, COL2, CRETN_TS, CRETN_USER_ID, UPDT_TS,
UPDT_USER_ID)
VALUES ($1, $2, $3, NOW(), current_user, NOW(), current_user)
ON CONFLICT(MY_ID)
DO UPDATE SET UPDT_TS = NOW(), UPDT_USER_ID = current_user, COL2 = $3"
PL/pgSQL function upsert_mytable(integer,integer,integer) line 46 at
EXECUTE

********** Error **********

ERROR: null value in column "col1" violates not-null constraint
SQL state: 23502
Detail: Failing row contains (1, null, 5, 2016-09-27 17:32:51.054896+00,
pl_mstr_usr, 2016-09-27 17:32:51.054896+00, pl_mstr_usr).
Context: SQL statement "INSERT INTO public.MyTable(
MY_ID, COL1, COL2, CRETN_TS, CRETN_USER_ID, UPDT_TS,
UPDT_USER_ID)
VALUES ($1, $2, $3, NOW(), current_user, NOW(), current_user)
ON CONFLICT(MY_ID)
DO UPDATE SET UPDT_TS = NOW(), UPDT_USER_ID = current_user, COL2 = $3"
PL/pgSQL function upsert_mytable(integer,integer,integer) line 46 at
EXECUTE

-----------------------------------------------------------------------

ON CONFLICT should have been checked before the attempt to insert!
If i implement as CTE instead of using the new syntax it works fine.
Not checking ON CONFLICT condition first makes the new syntax applicable
to
fewer scenarios than it should be able to be used for. It's useless for this
particular use case...

--
Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-bugs

Tom Lane

tgl@sss.pgh.pa.us

over 9 years ago

In reply to: Reyes Ponce (#1)

Re: BUG #14343: UPSERT (ON CONFLICT) doesn't check ON CONFLICT constraint first

reyes.r.ponce@gmail.com writes:

ERROR: null value in column "col1" violates not-null constraint
DETAIL: Failing row contains (1, null, 5, 2016-09-27 17:32:51.054896+00,
pl_mstr_usr, 2016-09-27 17:32:51.054896+00, pl_mstr_usr).
CONTEXT: SQL statement "INSERT INTO public.MyTable(
MY_ID, COL1, COL2, CRETN_TS, CRETN_USER_ID, UPDT_TS,
UPDT_USER_ID)
VALUES ($1, $2, $3, NOW(), current_user, NOW(), current_user)
ON CONFLICT(MY_ID)
DO UPDATE SET UPDT_TS = NOW(), UPDT_USER_ID = current_user, COL2 = $3"
PL/pgSQL function upsert_mytable(integer,integer,integer) line 46 at
EXECUTE

This test case seems rather overcomplicated, but AFAICS you are
complaining because the NOT NULL constraint is checked before uniqueness
is checked. Sorry, that is not a bug, that is by design.

regards, tom lane

--
Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-bugs

Reyes Ponce

reyes.r.ponce@gmail.com

over 9 years ago

In reply to: Tom Lane (#2)

Re: BUG #14343: UPSERT (ON CONFLICT) doesn't check ON CONFLICT constraint first

Hi Tom,

That may be "the design" and it may be a fine design for insert, but
it's not a great design for upsert, which was sort of my point.

An update statement does not require all NOT NULL columns to be
specified so neither should an upsert require all NOT NULL columns to be
specified in the case where the update will run.

Any chance you guys will do a

UPDATE ... ON MISSING... DO INSERT...

version as I expect in that case it would be implemented closer to the
functionality you get implementing upsert with a CTE (and how upsert in
most NoSql DB works (i.e. doesn't impose more restrictions than update
in the update case)) which would cover far more use cases than the
current design of INSERT... ON CONFLICT... DO UPDATE...?

Thanks for all the hard work. I've only been using Postgres for a few
months, but thus far it's been solid.

-- Reyes

On 9/27/2016 3:06 PM, Tom Lane wrote:

reyes.r.ponce@gmail.com writes:

ERROR: null value in column "col1" violates not-null constraint
DETAIL: Failing row contains (1, null, 5, 2016-09-27 17:32:51.054896+00,
pl_mstr_usr, 2016-09-27 17:32:51.054896+00, pl_mstr_usr).
CONTEXT: SQL statement "INSERT INTO public.MyTable(
MY_ID, COL1, COL2, CRETN_TS, CRETN_USER_ID, UPDT_TS,
UPDT_USER_ID)
VALUES ($1, $2, $3, NOW(), current_user, NOW(), current_user)
ON CONFLICT(MY_ID)
DO UPDATE SET UPDT_TS = NOW(), UPDT_USER_ID = current_user, COL2 = $3"
PL/pgSQL function upsert_mytable(integer,integer,integer) line 46 at
EXECUTE

This test case seems rather overcomplicated, but AFAICS you are
complaining because the NOT NULL constraint is checked before uniqueness
is checked. Sorry, that is not a bug, that is by design.

regards, tom lane

--
Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-bugs

Peter Geoghegan

pg@bowt.ie

over 9 years ago

In reply to: Reyes Ponce (#3)

Re: BUG #14343: UPSERT (ON CONFLICT) doesn't check ON CONFLICT constraint first

On Fri, Sep 30, 2016 at 8:19 PM, Reyes Ponce <reyes.r.ponce@gmail.com> wrote:

Any chance you guys will do a

UPDATE ... ON MISSING... DO INSERT...

version as I expect in that case it would be implemented closer to the
functionality you get implementing upsert with a CTE (and how upsert in
most NoSql DB works (i.e. doesn't impose more restrictions than update
in the update case)) which would cover far more use cases than the
current design of INSERT... ON CONFLICT... DO UPDATE...?

I don't think that that makes sense. If you know ahead of time that
the INSERT path will definitely throw an error, you're either relying
on that path never being taken, in which case you should just use a
plain UPDATE, or you have a bug in your application code, in which
case you should be glad to have it surfaced sooner rather than later.

--
Peter Geoghegan

--
Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-bugs

Reyes Ponce

reyes.r.ponce@gmail.com

over 9 years ago

In reply to: Peter Geoghegan (#4)

Re: BUG #14343: UPSERT (ON CONFLICT) doesn't check ON CONFLICT constraint first

Hi Peter,

In my case I'm writing a stored procedure that will insert or update based
on whether the row already exists. Some things calling it will be
interactive and some will be batch processes. Sometimes all the parameters
will be not null and sometimes some of the parameters will be null. The
stored procedure should do the right thing (insert or update) as best it
can or throw an error. Regardless, of this specific design, the following
are still true:

1) Upsert is becoming a common feature (probably because it matches well
with the definition of REST PUT and POST functionality).

2) The current INSERT... ON CONFLICT... DO UPDATE... doesn't handle the
scenarios stated in my initial email, which upsert functionality can
usually handle.

3) Upsert can be done with CTEs which can handle those scenarios but has
potential race conditions.

which is why I am asking about this (i.e. if the insert version of upsert
can't handle these scenarios, maybe we need an upsert based on update).

On Oct 1, 2016 8:21 AM, "Peter Geoghegan" <pg@heroku.com> wrote:

Show quoted text

On Fri, Sep 30, 2016 at 8:19 PM, Reyes Ponce <reyes.r.ponce@gmail.com>
wrote:

Any chance you guys will do a

UPDATE ... ON MISSING... DO INSERT...

version as I expect in that case it would be implemented closer to the
functionality you get implementing upsert with a CTE (and how upsert in
most NoSql DB works (i.e. doesn't impose more restrictions than update
in the update case)) which would cover far more use cases than the
current design of INSERT... ON CONFLICT... DO UPDATE...?

I don't think that that makes sense. If you know ahead of time that
the INSERT path will definitely throw an error, you're either relying
on that path never being taken, in which case you should just use a
plain UPDATE, or you have a bug in your application code, in which
case you should be glad to have it surfaced sooner rather than later.

--
Peter Geoghegan

Peter Geoghegan

pg@bowt.ie

over 9 years ago

In reply to: Reyes Ponce (#5)

Re: BUG #14343: UPSERT (ON CONFLICT) doesn't check ON CONFLICT constraint first

On Wed, Oct 5, 2016 at 7:34 PM, Reyes Ponce <reyes.r.ponce@gmail.com> wrote:

1) Upsert is becoming a common feature (probably because it matches well
with the definition of REST PUT and POST functionality).

2) The current INSERT... ON CONFLICT... DO UPDATE... doesn't handle the
scenarios stated in my initial email, which upsert functionality can usually
handle.

3) Upsert can be done with CTEs which can handle those scenarios but has
potential race conditions.

I think that your stored procedure needs to learn about the different
cases. Simple as that.

If an INSERT would fail, you have no right to assume an upsert that
*might* take the insert path, but doesn't this time, should not fail
all the time. It's not as if a NOT NULL constraint is based on
anything other than the simple fact that the row that you mean to
INSERT has attributes that are NULL but shouldn't be. Unlike a unique
constraint, it doesn't matter what anybody else may be doing at the
same time, or may have inserted even before you began -- your tuple is
definitely going to violate the constraint.

which is why I am asking about this (i.e. if the insert version of upsert
can't handle these scenarios, maybe we need an upsert based on update).

For reasons that are rather complicated, "an upsert based on update"
is more or less an oxymoron. Basically, all of the useful upsert
guarantees hinge upon the implementation taking the alternative path
in the event of a would-be duplicate violation specifically. You can't
make that work with UPDATE, because it would have to be based on
something *not* existing, which is an impossibly ticklish condition to
rely on, unless you lock the entire table or something.

--
Peter Geoghegan

--
Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-bugs