Hard problem with concurrency
OK, this is the problem:
I want to write a bit of SQL such that if a row exists in a table, it is
updated, otherwise it is inserted. Problem is, there is a very high chance of
simultaneous execution of this query on the same row (the rows have a unique
index).
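For concreteness in what follows, a hypothetical table matching this
description (the names are made up, not from the original post):

CREATE TABLE sessions (
    member_id integer NOT NULL,
    last_seen timestamptz NOT NULL DEFAULT now(),
    CONSTRAINT sessions_member_id_key UNIQUE (member_id)
);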
So, strategy one:
begin;
update row;
if (no rows affected) insert row;
commit;
Problem - race condition! If two transactions run at the same time, the
second will end up doing an insert that violates the unique index, which
will cause the query to fail.
Strategy two:
begin;
select row for update;
if (row returned) update;
else insert;
commit;
Problem - race condition. The row-level locking doesn't allow me to lock
'potential rows', so if the row does not yet exist and two transactions run
simultaneously, then the second will die with a unique violation.
Strategy three:
begin;
lock table in exclusive mode;
update row;
if (no rows affected) insert row;
commit;
Problem - Works, but this table needs high concurrency. Every time a member
hits a page of the site that needs authentication, this function is called.
In particular, the login transaction can take a little time sometimes, and we
can't halt everyone else's activities for that duration...
So what is the solution???
I'm not sure if acquiring a ROW EXCLUSIVE MODE lock will help at all. Also,
I can't try the insert and then the update because the INSERT, in Postgres,
will cause an outright transaction failure.
What the heck is the solution??
Chris
Solution one: use sequences for the unique value.
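A minimal sketch of solution one, with hypothetical names - this only helps
when you are free to choose the unique value, since each nextval() call
returns a distinct number without any table locking:

CREATE TABLE foo (id integer PRIMARY KEY, data text);
CREATE SEQUENCE foo_id_seq;
-- concurrent sessions each draw a distinct id, so inserts never collide
INSERT INTO foo (id, data) VALUES (nextval('foo_id_seq'), 'something');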
Solution two: use another table to effect the exclusive locking
and use it to store the "unique" values:
begin;
update row;
if (no rows affected) {
lock table foo in exclusive mode;
find a unique value that is not already in foo
store this value inside of foo
insert row;
}
commit;
Solution three: use your strategy two, but throw a loop around it and have
it try again (with a new value) if it gets a unique violation.
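A sketch of solution three as a PL/pgSQL function, using the hypothetical
sessions table from above (exception trapping needs PostgreSQL 8.0 or later;
untested):

CREATE FUNCTION upsert_session(p_member integer) RETURNS void AS $$
BEGIN
    LOOP
        -- strategy two: lock the row if it already exists
        PERFORM 1 FROM sessions WHERE member_id = p_member FOR UPDATE;
        IF FOUND THEN
            UPDATE sessions SET last_seen = now() WHERE member_id = p_member;
            RETURN;
        END IF;
        BEGIN
            INSERT INTO sessions (member_id, last_seen)
            VALUES (p_member, now());
            RETURN;
        EXCEPTION WHEN unique_violation THEN
            NULL;  -- a concurrent transaction inserted first; loop and update
        END;
    END LOOP;
END;
$$ LANGUAGE plpgsql;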
--
Greg Sabino Mullane greg@turnstep.com
"Christopher Kings-Lynne" <chriskl@familyhealth.com.au> writes:
I can't try the insert and then the update because the INSERT, in Postgres,
will cause an outright transaction failure.
Do the update, then try to insert if the update found nothing, and put
a retry loop around the whole transaction in case you fail because of
concurrent inserts.
Realistically you will need a retry loop in all but the most trivial
cases anyway --- certainly so if you want to use serializable
transaction mode. So I don't think this solution is unworkably complex.
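The shape of that, against the hypothetical sessions table from earlier
(the retry loop itself lives in the application):

BEGIN;
UPDATE sessions SET last_seen = now() WHERE member_id = 42;
-- if the UPDATE reported 0 rows, the application issues:
INSERT INTO sessions (member_id, last_seen) VALUES (42, now());
COMMIT;
-- if the INSERT dies with a duplicate key error (a concurrent insert
-- won), the application rolls back and re-runs the whole transaction;
-- the retry then takes the UPDATE path.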
regards, tom lane
Do the update, then try to insert if the update found nothing, and put
a retry loop around the whole transaction in case you fail because of
concurrent inserts.
Realistically you will need a retry loop in all but the most trivial
cases anyway --- certainly so if you want to use serializable
transaction mode. So I don't think this solution is unworkably complex.
I guess that will work, but it will not prevent our log from being spammed
up with error messages. Also, this is a somewhat simplified case. Some
transactions, such as our login transaction, have rather large numbers of
operations in them, and we don't want to have to roll back the whole thing.
I guess we'll have to live with it.
REPLACE INTO anyone? ;)
Chris
Hm, odd, nobody mentioned this solution:
If you don't have a primary key already, create a unique index on the
combination you want to be unique. Then:
. Try to insert the record
. If you get a duplicate key error
then do update instead
No possibilities of duplicate records due to race conditions. If two people
try to insert/update at the same time you'll only get one of the two results,
but that's the downside of the general approach you've taken. It's a tad
inefficient if the usual case is updates, but certainly not less efficient
than doing table locks.
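Concretely, with the hypothetical sessions table from earlier, and with
each statement run as its own transaction so the expected error aborts
nothing else:

-- step 1: try the insert
INSERT INTO sessions (member_id, last_seen) VALUES (42, now());
-- step 2: only if step 1 failed with a duplicate key error
UPDATE sessions SET last_seen = now() WHERE member_id = 42;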
I'm not sure what you're implementing here. Depending on what it is you might
consider having a table of raw data that you _only_ insert into. Then you
process those results into a table with the consolidated data you're trying to
gather. I've usually found that's more flexible later because then you have
all the raw data in the database even if you only present a limited view.
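A sketch of that layout, names invented: writers only ever insert, so
there is nothing to race over, and the consolidated data is derived:

CREATE TABLE hits_raw (
    member_id integer NOT NULL,
    seen_at   timestamptz NOT NULL DEFAULT now()
);
-- every page hit is just: INSERT INTO hits_raw (member_id) VALUES (42);
CREATE VIEW member_last_seen AS
    SELECT member_id, max(seen_at) AS last_seen
    FROM hits_raw
    GROUP BY member_id;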
--
greg
If you don't have a primary key already, create a unique index on the
combination you want to be unique. Then:
. Try to insert the record
. If you get a duplicate key error
then do update instead
No possibilities of duplicate records due to race conditions. If two people
try to insert/update at the same time you'll only get one of the two results,
but that's the downside of the general approach you've taken. It's a tad
inefficient if the usual case is updates, but certainly not less efficient
than doing table locks.
The idea was to stop our postgres logs being spammed with unique
constraint violation warnings... in which case your solution above is
identical to our current one: update, and if it fails, insert. Except
since the row is likely to already be there, our current way will be a
bit more efficient.
Chris
On Sun, Feb 16, 2003 at 23:51:49 -0500,
Greg Stark <gsstark@mit.edu> wrote:
Hm, odd, nobody mentioned this solution:
If you don't have a primary key already, create a unique index on the
combination you want to be unique. Then:
. Try to insert the record
. If you get a duplicate key error
then do update instead
I think this solution depends on records not being deleted. Otherwise
a record could be deleted between the insert attempt and the update attempt.
Christopher Kings-Lynne wrote:
If you don't have a primary key already, create a unique index on the
combination you want to be unique. Then:
. Try to insert the record
. If you get a duplicate key error
then do update instead
No possibilities of duplicate records due to race conditions. If two people
try to insert/update at the same time you'll only get one of the two results,
but that's the downside of the general approach you've taken. It's a tad
inefficient if the usual case is updates, but certainly not less efficient
than doing table locks.
The idea was to stop our postgres logs being spammed with unique
constraint violation warnings... in which case your solution above is
identical to our current one: update, and if it fails, insert. Except
since the row is likely to already be there, our current way will be a
bit more efficient.
To control the spamming, use server_min_messages before the INSERT.
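Something like this - though note that server_min_messages is a
superuser-only setting, so this is only a sketch:

SET server_min_messages = fatal;  -- keep the expected ERROR out of the log
INSERT INTO sessions (member_id, last_seen) VALUES (42, now());
RESET server_min_messages;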
--
Bruce Momjian | http://candle.pha.pa.us
pgman@candle.pha.pa.us | (610) 359-1001
+ If your life is a hard drive, | 13 Roberts Road
+ Christ can be your backup. | Newtown Square, Pennsylvania 19073
FWIW, that's the approach O*'s taking.
http://otn.oracle.com/products/oracle9i/daily/Aug24.html
-----Original Message-----
From: pgsql-hackers-owner@postgresql.org
[mailto:pgsql-hackers-owner@postgresql.org] On Behalf Of Peter Eisentraut
Sent: Tuesday, February 18, 2003 11:02 AM
To: Christopher Kings-Lynne
Cc: Tom Lane; Hackers
Subject: Re: [HACKERS] Hard problem with concurrency
Christopher Kings-Lynne writes:
REPLACE INTO anyone? ;)
The upcoming SQL 200x standard includes a MERGE command that appears to
fulfill that purpose.
--
Peter Eisentraut peter_e@gmx.net
Christopher Kings-Lynne writes:
REPLACE INTO anyone? ;)
The upcoming SQL 200x standard includes a MERGE command that appears to
fulfill that purpose.
--
Peter Eisentraut peter_e@gmx.net
Peter Eisentraut wrote on Tue, 18.02.2003 at 21:02:
Christopher Kings-Lynne writes:
REPLACE INTO anyone? ;)
The upcoming SQL 200x standard includes a MERGE command that appears to
fulfill that purpose.
Where is this upcoming standard available on the net?
----------------
Hannu
Hannu Krosing writes:
Where is this upcoming standard available on the net?
Near ftp://sqlstandards.org/SC32/WG3/Progression_Documents/FCD
--
Peter Eisentraut peter_e@gmx.net
On 2003-02-18 20:02:29 +0100, Peter Eisentraut wrote:
Christopher Kings-Lynne writes:
REPLACE INTO anyone? ;)
The upcoming SQL 200x standard includes a MERGE command that appears to
fulfill that purpose.
MySQL features a poor man's approach to this problem, their REPLACE command:
http://www.mysql.com/doc/en/REPLACE.html
REPLACE works exactly like INSERT, except that if an old record in the table
has the same value as a new record on a UNIQUE index or PRIMARY KEY, the old
record is deleted before the new record is inserted.
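For reference, the syntax (MySQL, not PostgreSQL; the table name here is
hypothetical):

REPLACE INTO sessions (member_id, last_seen) VALUES (42, NOW());
-- any existing row with member_id = 42 is deleted, then the new row inserted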
I'd love to see this kind of functionality in PG. I've got a database that
caches data which only gets conditional INSERT/UPDATEs, so that would save a
lot of wasted SQL commands.
Vincent van Leeuwen
Media Design
REPLACE INTO anyone? ;)
The upcoming SQL 200x standard includes a MERGE command that appears to
fulfill that purpose.
Is there somewhere that I can read that spec?
Or can you just post the MERGE syntax for us?
*sigh* It's just like a standard to come up with a totally new syntax for a
feature that no-one has except MySQL, who use a different syntax :)
Thanks,
Chris
URL added to developers FAQ.
---------------------------------------------------------------------------
Peter Eisentraut wrote:
Hannu Krosing writes:
Where is this upcoming standard available on net ?
Near ftp://sqlstandards.org/SC32/WG3/Progression_Documents/FCD
--
Peter Eisentraut peter_e@gmx.net
--
Bruce Momjian | http://candle.pha.pa.us
pgman@candle.pha.pa.us | (610) 359-1001
+ If your life is a hard drive, | 13 Roberts Road
+ Christ can be your backup. | Newtown Square, Pennsylvania 19073
Christopher Kings-Lynne wrote:
*sigh* It's just like a standard to come up with a totally new syntax for a
feature that no-one has except MySQL, who use a different syntax :)
You sure? :)
http://otn.oracle.com/products/oracle9i/daily/Aug24.html
MERGE INTO SALES_FACT D
USING SALES_JUL01 S
ON (D.TIME_ID = S.TIME_ID
AND D.STORE_ID = S.STORE_ID
AND D.REGION_ID = S.REGION_ID)
WHEN MATCHED THEN
UPDATE
SET d_parts = d_parts + s_parts,
d_sales_amt = d_sales_amt + s_sales_amt,
d_tax_amt = d_tax_amt + s_tax_amt,
d_discount = d_discount + s_discount
WHEN NOT MATCHED THEN
INSERT (D.TIME_ID ,D.STORE_ID ,D.REGION_ID,
D.PARTS ,D.SALES_AMT ,D.TAX_AMT ,D.DISCOUNT)
VALUES (
S.TIME_ID ,S.STORE_ID ,S.REGION_ID,
S.PARTS ,S.SALES_AMT ,S.TAX_AMT ,S.DISCOUNT);
For those who last played with 8X, they have a couple of other
new features in 9i. This is the best doc I saw talking about them.
Christopher Kings-Lynne writes:
*sigh* It's just like a standard to come up with a totally new syntax for a
feature that no-one has except MySQL, who use a different syntax :)
Actually that command was copied straight out of Oracle.
--
Peter Eisentraut peter_e@gmx.net
Hi,
Vincent van Leeuwen wrote on 2/19/2003 10:08 PM:
On 2003-02-18 20:02:29 +0100, Peter Eisentraut wrote:
Christopher Kings-Lynne writes:
REPLACE INTO anyone? ;)
The upcoming SQL 200x standard includes a MERGE command that appears to
fulfill that purpose.
MySQL features a poor man's approach to this problem, their REPLACE command:
http://www.mysql.com/doc/en/REPLACE.html
REPLACE works exactly like INSERT, except that if an old record in the table
has the same value as a new record on a UNIQUE index or PRIMARY KEY, the old
record is deleted before the new record is inserted.
I'd love to see this kind of functionality in PG. I've got a database that
caches data which only gets conditional INSERT/UPDATEs, so that would save a
lot of wasted SQL commands.
I think this replace function in MySQL is stupid. It deletes the row,
and what if that row is linked into another table? You lose your
relationships.
However, you can easily write a procedure which makes a real replace:
it checks whether the same data (by keys) is already in the table, and
if so makes an update; if not, it does an insert.
You can do anything you like, unlike in MySQL - just write it as you need.
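A sketch of such a procedure, update-first, again using the hypothetical
sessions table (exception trapping needs PostgreSQL 8.0 or later; untested):

CREATE FUNCTION replace_session(p_member integer) RETURNS void AS $$
BEGIN
    -- update in place, so rows referencing this one stay attached
    UPDATE sessions SET last_seen = now() WHERE member_id = p_member;
    IF FOUND THEN
        RETURN;
    END IF;
    BEGIN
        INSERT INTO sessions (member_id, last_seen) VALUES (p_member, now());
    EXCEPTION WHEN unique_violation THEN
        -- a concurrent insert beat us; update the row it created
        -- (strictly, a loop is needed if rows can also be deleted concurrently)
        UPDATE sessions SET last_seen = now() WHERE member_id = p_member;
    END;
END;
$$ LANGUAGE plpgsql;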
C.
On Mon, 17 Feb 2003 09:51:54 +0800, "Christopher Kings-Lynne"
<chriskl@familyhealth.com.au> wrote:
Strategy three:
begin;
lock table in exclusive mode;
update row;
if (no rows affected) insert row;
commit;
Problem - Works, but this table needs high concurrency.
Chris,
distributing congestion might improve that. Instead of locking the
whole table, just lock one row in a dummy table:
CREATE TABLE dummylock(id INT PRIMARY KEY);
INSERT INTO dummylock VALUES (0);
INSERT INTO dummylock VALUES (1);
...
INSERT INTO dummylock VALUES (999);
Create an immutable function lockhash(<type of PK>) returning a value
between 0 and 999.
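For example, assuming an integer primary key (any stable mapping onto
0..999 will do):

CREATE FUNCTION lockhash(integer) RETURNS integer AS
    'SELECT abs($1) % 1000'
    LANGUAGE sql IMMUTABLE;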
BEGIN;
SET TRANSACTION ISOLATION LEVEL READ COMMITTED;
UPDATE t SET c = 'newval' WHERE pk = 'pk';
if (no rows affected) THEN
SELECT * FROM dummylock
WHERE id = lockhash('pk') FOR UPDATE;
-- try again
UPDATE t SET c = 'newval' WHERE pk = 'pk';
if (no rows affected) THEN
INSERT INTO t ...;
END IF;
END IF;
COMMIT;
This is just an idea. Completely untested ...
Servus
Manfred