Solution to UPDATE...INSERT problem

Started by Christopher Kings-Lynnealmost 23 years ago9 messages
#1Christopher Kings-Lynne
chriskl@familyhealth.com.au

Hi Guys,

I just thought I'd share with you guys a very clever solution to the old
'update row. if no rows affected, then insert the row' race condition
problem. A guy at my work came up with it.

We were discussing this earlier on -hackers, but no-one could find a
solution that didn't involve locking the entire table around the
update...insert commands.

The problem is that sometimes the row will be inserted by another process
between your update and insert, causing your insert to fail with a unique
constraint violation.

So, say this is the insert:

INSERT INTO table VALUES (1, 'foo'); // 1 is in the primary key column

Rewrite it like this:

INSERT INTO table SELECT 1, 'foo' EXCEPT SELECT 1, 'foo' FROM table WHERE
pkcol=1;

See? So now that INSERT statement will insert the row if it doesn't exist,
or insert zero rows if it does. You are then guaranteed that your
transaction will not fail and rollback, so you can repeat your update, or do
the insert first and then the update, etc.

Hope that's handy for people,

Chris

#2Tom Lane
tgl@sss.pgh.pa.us
In reply to: Christopher Kings-Lynne (#1)
Re: [HACKERS] Solution to UPDATE...INSERT problem

"Christopher Kings-Lynne" <chriskl@familyhealth.com.au> writes:

INSERT INTO table SELECT 1, 'foo' EXCEPT SELECT 1, 'foo' FROM table WHERE
pkcol=1;

See? So now that INSERT statement will insert the row if it doesn't exist,
or insert zero rows if it does. You are then guaranteed that your
transaction will not fail and rollback, so you can repeat your update, or do
the insert first and then the update, etc.

Uh, why exactly do you think this is race-free?

It looks fancy, but AFAICS the SELECT will return info that is correct
as of its starting timestamp; which is not enough to guarantee that the
INSERT won't conflict with another transaction doing the same thing
concurrently.

regards, tom lane

#3Christopher Kings-Lynne
chriskl@familyhealth.com.au
In reply to: Christopher Kings-Lynne (#1)
Re: Solution to UPDATE...INSERT problem

Uh, why exactly do you think this is race-free?

It looks fancy, but AFAICS the SELECT will return info that is correct
as of its starting timestamp; which is not enough to guarantee that the
INSERT won't conflict with another transaction doing the same thing
concurrently.

How about:

INSERT INTO table SELECT 1, 'foo' WHERE NOT EXISTS (SELECT TRUE FROM table
WHERE pkcol=1 FOR UPDATE);

It's a lot more straightforward and has a FOR UPDATE. Can this still cause
unique constraint failures?

Chris

#4Tom Lane
tgl@sss.pgh.pa.us
In reply to: Christopher Kings-Lynne (#3)
Re: Solution to UPDATE...INSERT problem

"Christopher Kings-Lynne" <chriskl@familyhealth.com.au> writes:

Uh, why exactly do you think this is race-free?

How about:

INSERT INTO table SELECT 1, 'foo' WHERE NOT EXISTS (SELECT TRUE FROM table
WHERE pkcol=1 FOR UPDATE);

It's a lot more straightforward and has a FOR UPDATE. Can this still cause
unique constraint failures?

Certainly. FOR UPDATE locks an existing row; it cannot lock the
condition of non-existence of a row.

regards, tom lane

#5Lincoln Yeoh
lyeoh@pop.jaring.my
In reply to: Christopher Kings-Lynne (#1)
Re: Solution to UPDATE...INSERT problem

AFAIK the "except" select won't see other inserts in uncommitted
transactions. If those transactions are committed you will end up with the
same problem. You can try it yourself, by manually doing two separate
transactions in psql.

You either have to lock the whole table, or lock at the application layer.
Some time back I suggested a "lock on arbitrary string" feature for
postgresql for this and various other purposes, but that feature probably
wouldn't scale in terms of management (it requires 100% cooperation amongst
all apps/clients involved).

There's no "select * from table where pkey=x for insert;" which would block
on uncommitted inserts/updates of pkey=x and other selects for insert/update.

In contrast "select ... for update" blocks on committed stuff.

Regards,
Link.

At 09:55 AM 3/27/03 +0800, Christopher Kings-Lynne wrote:

Show quoted text

Hi Guys,

I just thought I'd share with you guys a very clever solution to the old
'update row. if no rows affected, then insert the row' race condition
problem. A guy at my work came up with it.

We were discussing this earlier on -hackers, but no-one could find a
solution that didn't involve locking the entire table around the
update...insert commands.

The problem is that sometimes the row will be inserted by another process
between your update and insert, causing your insert to fail with a unique
constraint violation.

So, say this is the insert:

INSERT INTO table VALUES (1, 'foo'); // 1 is in the primary key column

Rewrite it like this:

INSERT INTO table SELECT 1, 'foo' EXCEPT SELECT 1, 'foo' FROM table WHERE
pkcol=1;

See? So now that INSERT statement will insert the row if it doesn't exist,
or insert zero rows if it does. You are then guaranteed that your
transaction will not fail and rollback, so you can repeat your update, or do
the insert first and then the update, etc.

Hope that's handy for people,

Chris

---------------------------(end of broadcast)---------------------------
TIP 1: subscribe and unsubscribe commands go to majordomo@postgresql.org

#6Christopher Kings-Lynne
chriskl@familyhealth.com.au
In reply to: Lincoln Yeoh (#5)
Re: [GENERAL] Solution to UPDATE...INSERT problem

AFAIK the "except" select won't see other inserts in uncommitted
transactions. If those transactions are committed you will end up with the
same problem. You can try it yourself, by manually doing two separate
transactions in psql.

Yeah, I see that now.

You either have to lock the whole table, or lock at the application layer.
Some time back I suggested a "lock on arbitrary string" feature for
postgresql for this and various other purposes, but that feature probably
wouldn't scale in terms of management (it requires 100% cooperation

amongst

all apps/clients involved).

There's no "select * from table where pkey=x for insert;" which would

block

on uncommitted inserts/updates of pkey=x and other selects for

insert/update.

How about user locks? Isn't there something in contrib/ for that??? I
could do a userlock on the primary key, whether it existed or not?

Chris

#7Dennis Gearon
gearond@cvc.net
In reply to: Lincoln Yeoh (#5)
Re: Solution to UPDATE...INSERT problem

so the only real solution to this now is in application code outside of a
transatction, i.e. PHP,Perl,VB,C,Python, etc, right?

Lincoln Yeoh wrote:

Show quoted text

AFAIK the "except" select won't see other inserts in uncommitted
transactions. If those transactions are committed you will end up with
the same problem. You can try it yourself, by manually doing two
separate transactions in psql.

You either have to lock the whole table, or lock at the application
layer. Some time back I suggested a "lock on arbitrary string" feature
for postgresql for this and various other purposes, but that feature
probably wouldn't scale in terms of management (it requires 100%
cooperation amongst all apps/clients involved).

There's no "select * from table where pkey=x for insert;" which would
block on uncommitted inserts/updates of pkey=x and other selects for
insert/update.

In contrast "select ... for update" blocks on committed stuff.

Regards,
Link.

At 09:55 AM 3/27/03 +0800, Christopher Kings-Lynne wrote:

Hi Guys,

I just thought I'd share with you guys a very clever solution to the old
'update row. if no rows affected, then insert the row' race condition
problem. A guy at my work came up with it.

We were discussing this earlier on -hackers, but no-one could find a
solution that didn't involve locking the entire table around the
update...insert commands.

The problem is that sometimes the row will be inserted by another process
between your update and insert, causing your insert to fail with a unique
constraint violation.

So, say this is the insert:

INSERT INTO table VALUES (1, 'foo'); // 1 is in the primary key column

Rewrite it like this:

INSERT INTO table SELECT 1, 'foo' EXCEPT SELECT 1, 'foo' FROM table WHERE
pkcol=1;

See? So now that INSERT statement will insert the row if it doesn't
exist,
or insert zero rows if it does. You are then guaranteed that your
transaction will not fail and rollback, so you can repeat your update,
or do
the insert first and then the update, etc.

Hope that's handy for people,

Chris

---------------------------(end of broadcast)---------------------------
TIP 1: subscribe and unsubscribe commands go to majordomo@postgresql.org

---------------------------(end of broadcast)---------------------------
TIP 6: Have you searched our list archives?

http://archives.postgresql.org

#8Haroldo Stenger
hstenger@adinet.com.uy
In reply to: Tom Lane (#2)
Re: [HACKERS] Solution to UPDATE...INSERT problem

El jue, 27-03-2003 a las 03:41, Tom Lane escribió:

"Christopher Kings-Lynne" <chriskl@familyhealth.com.au> writes:

INSERT INTO table SELECT 1, 'foo' EXCEPT SELECT 1, 'foo' FROM table WHERE
pkcol=1;

See? So now that INSERT statement will insert the row if it doesn't exist,
or insert zero rows if it does. You are then guaranteed that your
transaction will not fail and rollback, so you can repeat your update, or do
the insert first and then the update, etc.

Uh, why exactly do you think this is race-free?

It looks fancy, but AFAICS the SELECT will return info that is correct
as of its starting timestamp; which is not enough to guarantee that the
INSERT won't conflict with another transaction doing the same thing
concurrently.

This approach certainly reduces significantly the time span within which
a race could occur, compared to, say, using two separate statements, or
worse, two statements in two consecutive transactions. But race
conditions either exist or they don't, so you're right.

Now, up to my knowledge this problem was only a an intractable one in
PostgreSQL because of transactions going into abort state once a unique
restriction violation happened. If savepoints/nested transactions were
there, one would simply put the insert within a protected area, and
retry as much as needed.

That's my bet why other databases don't seem to have a problem with this
one. Am I right? Or do they have some magic solution other than locking
the whole table?

Regards
Haroldo

#9Lincoln Yeoh
lyeoh@pop.jaring.my
In reply to: Christopher Kings-Lynne (#6)
Re: Solution to UPDATE...INSERT problem

At 05:28 PM 3/27/03 +0800, Christopher Kings-Lynne wrote:

There's no "select * from table where pkey=x for insert;" which would

block

on uncommitted inserts/updates of pkey=x and other selects for

insert/update.

How about user locks? Isn't there something in contrib/ for that??? I
could do a userlock on the primary key, whether it existed or not?

Depends on your case, whether you can correctly convert your potential
primary keys into integers to be locked on.

It still requires full cooperation by all relevant apps/clients.

Actually select ... for updates also require cooperation, but it's a
standard way of doing things, so apps that don't cooperate can be said to
be broken :).

Is there a standard for "select ... for insert"? Or lock table for insert
where pkey=x?

Regards,
Link.