Hard problem with concurrency
OK, this is the problem:
I want to write a bit of SQL such that if a row exists in a table, it is
updated, otherwise it is inserted. Problem is, there is a very high chance of
simultaneous execution of this query on the same row (the rows have a unique
index).
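For concreteness in what follows, a hypothetical table matching this
description (the names are made up, not from the original post):

CREATE TABLE sessions (
    member_id integer NOT NULL,
    last_seen timestamptz NOT NULL DEFAULT now(),
    CONSTRAINT sessions_member_id_key UNIQUE (member_id)
);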
So, strategy one:
begin;
update row;
if (no rows affected) insert row;
commit;
Problem - race condition! If two transactions run at the same time, the
second will end up doing an insert that violates the unique index, which
will cause the query to fail.
Strategy two:
begin;
select row for update;
if (row returned) update;
else insert;
commit;
Problem - race condition. The row-level locking doesn't allow me to lock
'potential rows', so if the row does not yet exist and two transactions run
simultaneously, then the second will die with a unique violation.
Strategy three:
begin;
lock table in exclusive mode;
update row;
if (no rows affected) insert row;
commit;
Problem - Works, but this table needs high concurrency. Every time a member
hits a page of the site that needs authentication, this function is called.
In particular, the login transaction can take a little time sometimes, and we
can't halt everyone else's activities for that duration...
So what is the solution???
I'm not sure if acquiring a ROW EXCLUSIVE MODE lock will help at all. Also,
I can't try the insert and then the update because the INSERT, in Postgres,
will cause an outright transaction failure.
What the heck is the solution??
Chris
Solution one: use sequences for the unique value.
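A minimal sketch of solution one, with hypothetical names - this only helps
when you are free to choose the unique value, since each nextval() call
returns a distinct number without any table locking:

CREATE TABLE foo (id integer PRIMARY KEY, data text);
CREATE SEQUENCE foo_id_seq;
-- concurrent sessions each draw a distinct id, so inserts never collide
INSERT INTO foo (id, data) VALUES (nextval('foo_id_seq'), 'something');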
Solution two: use another table to effect the exclusive locking
and use it to store the "unique" values:
begin;
update row;
if (no rows affected) {
lock table foo in exclusive mode;
find a unique value that is not already in foo
store this value inside of foo
insert row;
}
commit;
Solution three: use your strategy two, but throw a loop around it and have
it try again (with a new value) if it gets a unique violation.
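A sketch of solution three as a PL/pgSQL function, using the hypothetical
sessions table from above (exception trapping needs PostgreSQL 8.0 or later;
untested):

CREATE FUNCTION upsert_session(p_member integer) RETURNS void AS $$
BEGIN
    LOOP
        -- strategy two: lock the row if it already exists
        PERFORM 1 FROM sessions WHERE member_id = p_member FOR UPDATE;
        IF FOUND THEN
            UPDATE sessions SET last_seen = now() WHERE member_id = p_member;
            RETURN;
        END IF;
        BEGIN
            INSERT INTO sessions (member_id, last_seen)
            VALUES (p_member, now());
            RETURN;
        EXCEPTION WHEN unique_violation THEN
            NULL;  -- a concurrent transaction inserted first; loop and update
        END;
    END LOOP;
END;
$$ LANGUAGE plpgsql;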
--
Greg Sabino Mullane greg@turnstep.com
"Christopher Kings-Lynne" <chriskl@familyhealth.com.au> writes:
I can't try the insert and then the update because the INSERT, in Postgres,
will cause an outright transaction failure.
Do the update, then try to insert if the update found nothing, and put
a retry loop around the whole transaction in case you fail because of
concurrent inserts.
Realistically you will need a retry loop in all but the most trivial
cases anyway --- certainly so if you want to use serializable
transaction mode. So I don't think this solution is unworkably complex.
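The shape of that, against the hypothetical sessions table from earlier
(the retry loop itself lives in the application):

BEGIN;
UPDATE sessions SET last_seen = now() WHERE member_id = 42;
-- if the UPDATE reported 0 rows, the application issues:
INSERT INTO sessions (member_id, last_seen) VALUES (42, now());
COMMIT;
-- if the INSERT dies with a duplicate key error (a concurrent insert
-- won), the application rolls back and re-runs the whole transaction;
-- the retry then takes the UPDATE path.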
regards, tom lane
Do the update, then try to insert if the update found nothing, and put
a retry loop around the whole transaction in case you fail because of
concurrent inserts.
Realistically you will need a retry loop in all but the most trivial
cases anyway --- certainly so if you want to use serializable
transaction mode. So I don't think this solution is unworkably complex.
I guess that will work, but it will not prevent our log from being spammed
up with error messages. Also, this is a somewhat simplified case. Some
transactions, such as our login transaction, have rather large numbers of
operations in them, and we don't want to have to roll back the whole thing.
I guess we'll have to live with it.
REPLACE INTO anyone? ;)
Chris
Hm, odd, nobody mentioned this solution:
If you don't have a primary key already, create a unique index on the
combination you want to be unique. Then:
. Try to insert the record
. If you get a duplicate key error
then do update instead
No possibilities of duplicate records due to race conditions. If two people
try to insert/update at the same time you'll only get one of the two results,
but that's the downside of the general approach you've taken. It's a tad
inefficient if the usual case is updates, but certainly not less efficient
than doing table locks.
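Concretely, with the hypothetical sessions table from earlier, and with
each statement run as its own transaction so the expected error aborts
nothing else:

-- step 1: try the insert
INSERT INTO sessions (member_id, last_seen) VALUES (42, now());
-- step 2: only if step 1 failed with a duplicate key error
UPDATE sessions SET last_seen = now() WHERE member_id = 42;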
I'm not sure what you're implementing here. Depending on what it is you might
consider having a table of raw data that you _only_ insert into. Then you
process those results into a table with the consolidated data you're trying to
gather. I've usually found that's more flexible later because then you have
all the raw data in the database even if you only present a limited view.
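A sketch of that layout, names invented: writers only ever insert, so
there is nothing to race over, and the consolidated data is derived:

CREATE TABLE hits_raw (
    member_id integer NOT NULL,
    seen_at   timestamptz NOT NULL DEFAULT now()
);
-- every page hit is just: INSERT INTO hits_raw (member_id) VALUES (42);
CREATE VIEW member_last_seen AS
    SELECT member_id, max(seen_at) AS last_seen
    FROM hits_raw
    GROUP BY member_id;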
--
greg
If you don't have a primary key already, create a unique index on the
combination you want to be unique. Then:
. Try to insert the record
. If you get a duplicate key error
then do update instead
No possibilities of duplicate records due to race conditions. If two people
try to insert/update at the same time you'll only get one of the two results,
but that's the downside of the general approach you've taken. It's a tad
inefficient if the usual case is updates, but certainly not less efficient
than doing table locks.
The idea was to stop our postgres logs being spammed with unique
constraint violation warnings... in which case your solution above is
identical to our current one: update, and if it fails, insert. Except
since the row is likely to already be there, our current way will be a
bit more efficient.
Chris
On Sun, Feb 16, 2003 at 23:51:49 -0500,
Greg Stark <gsstark@mit.edu> wrote:
Hm, odd, nobody mentioned this solution:
If you don't have a primary key already, create a unique index on the
combination you want to be unique. Then:
. Try to insert the record
. If you get a duplicate key error
then do update instead
I think this solution depends on records not being deleted. Otherwise
a record could be deleted between the insert attempt and the update attempt.
Christopher Kings-Lynne wrote:
If you don't have a primary key already, create a unique index on the
combination you want to be unique. Then:
. Try to insert the record
. If you get a duplicate key error
then do update instead
No possibilities of duplicate records due to race conditions. If two people
try to insert/update at the same time you'll only get one of the two results,
but that's the downside of the general approach you've taken. It's a tad
inefficient if the usual case is updates, but certainly not less efficient
than doing table locks.
The idea was to stop our postgres logs being spammed with unique
constraint violation warnings... in which case your solution above is
identical to our current one: update, and if it fails, insert. Except
since the row is likely to already be there, our current way will be a
bit more efficient.
To control the spamming, use server_min_messages before the INSERT.
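Something like this - though note that server_min_messages is a
superuser-only setting, so this is only a sketch:

SET server_min_messages = fatal;  -- keep the expected ERROR out of the log
INSERT INTO sessions (member_id, last_seen) VALUES (42, now());
RESET server_min_messages;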
--
Bruce Momjian | http://candle.pha.pa.us
pgman@candle.pha.pa.us | (610) 359-1001
+ If your life is a hard drive, | 13 Roberts Road
+ Christ can be your backup. | Newtown Square, Pennsylvania 19073
FWIW, that's the approach O*'s taking.
http://otn.oracle.com/products/oracle9i/daily/Aug24.html
-----Original Message-----
From: pgsql-hackers-owner@postgresql.org
[mailto:pgsql-hackers-owner@postgresql.org] On Behalf Of Peter Eisentraut
Sent: Tuesday, February 18, 2003 11:02 AM
To: Christopher Kings-Lynne
Cc: Tom Lane; Hackers
Subject: Re: [HACKERS] Hard problem with concurrency
Christopher Kings-Lynne writes:
REPLACE INTO anyone? ;)
The upcoming SQL 200x standard includes a MERGE command that appears to
fulfill that purpose.
--
Peter Eisentraut peter_e@gmx.net
Christopher Kings-Lynne writes:
REPLACE INTO anyone? ;)
The upcoming SQL 200x standard includes a MERGE command that appears to
fulfill that purpose.
--
Peter Eisentraut peter_e@gmx.net
Peter Eisentraut wrote on Tue, 18.02.2003 at 21:02:
Christopher Kings-Lynne writes:
REPLACE INTO anyone? ;)
The upcoming SQL 200x standard includes a MERGE command that appears to
fulfill that purpose.
Where is this upcoming standard available on the net?
----------------
Hannu
Hannu Krosing writes:
Where is this upcoming standard available on the net?
Near ftp://sqlstandards.org/SC32/WG3/Progression_Documents/FCD
--
Peter Eisentraut peter_e@gmx.net
On 2003-02-18 20:02:29 +0100, Peter Eisentraut wrote:
Christopher Kings-Lynne writes:
REPLACE INTO anyone? ;)
The upcoming SQL 200x standard includes a MERGE command that appears to
fulfill that purpose.
MySQL features a poor man's approach to this problem, their REPLACE command:
http://www.mysql.com/doc/en/REPLACE.html
REPLACE works exactly like INSERT, except that if an old record in the table
has the same value as a new record on a UNIQUE index or PRIMARY KEY, the old
record is deleted before the new record is inserted.
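For reference, the syntax (MySQL, not PostgreSQL; the table name here is
hypothetical):

REPLACE INTO sessions (member_id, last_seen) VALUES (42, NOW());
-- any existing row with member_id = 42 is deleted, then the new row inserted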
I'd love to see this kind of functionality in PG. I've got a database that
caches data which only gets conditional INSERT/UPDATEs, so that would save a
lot of wasted SQL commands.
Vincent van Leeuwen
Media Design
REPLACE INTO anyone? ;)
The upcoming SQL 200x standard includes a MERGE command that appears to
fulfill that purpose.
Is there somewhere that I can read that spec?
Or can you just post the MERGE syntax for us?
*sigh* It's just like a standard to come up with a totally new syntax for a
feature that no-one has except MySQL, who use a different syntax :)
Thanks,
Chris
URL added to developers FAQ.
---------------------------------------------------------------------------
Peter Eisentraut wrote:
Hannu Krosing writes:
Where is this upcoming standard available on net ?
Near ftp://sqlstandards.org/SC32/WG3/Progression_Documents/FCD
--
Peter Eisentraut peter_e@gmx.net
--
Bruce Momjian | http://candle.pha.pa.us
pgman@candle.pha.pa.us | (610) 359-1001
+ If your life is a hard drive, | 13 Roberts Road
+ Christ can be your backup. | Newtown Square, Pennsylvania 19073
Christopher Kings-Lynne wrote:
*sigh* It's just like a standard to come up with a totally new syntax for a
feature that no-one has except MySQL, who use a different syntax :)
You sure? :)
http://otn.oracle.com/products/oracle9i/daily/Aug24.html
MERGE INTO SALES_FACT D
USING SALES_JUL01 S
ON (D.TIME_ID = S.TIME_ID
AND D.STORE_ID = S.STORE_ID
AND D.REGION_ID = S.REGION_ID)
WHEN MATCHED THEN
UPDATE
SET d_parts = d_parts + s_parts,
d_sales_amt = d_sales_amt + s_sales_amt,
d_tax_amt = d_tax_amt + s_tax_amt,
d_discount = d_discount + s_discount
WHEN NOT MATCHED THEN
INSERT (D.TIME_ID ,D.STORE_ID ,D.REGION_ID,
D.PARTS ,D.SALES_AMT ,D.TAX_AMT ,D.DISCOUNT)
VALUES (
S.TIME_ID ,S.STORE_ID ,S.REGION_ID,
S.PARTS ,S.SALES_AMT ,S.TAX_AMT ,S.DISCOUNT);
For those who last played with 8X, they have a couple of other
new features in 9i. This is the best doc I saw talking about them.
Christopher Kings-Lynne writes:
*sigh* It's just like a standard to come up with a totally new syntax for a
feature that no-one has except MySQL, who use a different syntax :)
Actually that command was copied straight out of Oracle.
--
Peter Eisentraut peter_e@gmx.net
Hi,
Vincent van Leeuwen wrote on 2/19/2003 10:08 PM:
On 2003-02-18 20:02:29 +0100, Peter Eisentraut wrote:
Christopher Kings-Lynne writes:
REPLACE INTO anyone? ;)
The upcoming SQL 200x standard includes a MERGE command that appears to
fulfill that purpose.
MySQL features a poor man's approach to this problem, their REPLACE command:
http://www.mysql.com/doc/en/REPLACE.html
REPLACE works exactly like INSERT, except that if an old record in the table
has the same value as a new record on a UNIQUE index or PRIMARY KEY, the old
record is deleted before the new record is inserted.
I'd love to see this kind of functionality in PG. I've got a database that
caches data which only gets conditional INSERT/UPDATEs, so that would save a
lot of wasted SQL commands.
I think this replace function in MySQL is stupid. It deletes the row,
and what if that row is linked into another table? You lose your
relationships.
However, you can easily write a procedure which makes a real replace:
it checks whether the same data (by keys) is already in the table, and
if so makes an update; if not, it does an insert.
You can do anything you like, unlike in MySQL - just write it as you need.
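A sketch of such a procedure, update-first, again using the hypothetical
sessions table (exception trapping needs PostgreSQL 8.0 or later; untested):

CREATE FUNCTION replace_session(p_member integer) RETURNS void AS $$
BEGIN
    -- update in place, so rows referencing this one stay attached
    UPDATE sessions SET last_seen = now() WHERE member_id = p_member;
    IF FOUND THEN
        RETURN;
    END IF;
    BEGIN
        INSERT INTO sessions (member_id, last_seen) VALUES (p_member, now());
    EXCEPTION WHEN unique_violation THEN
        -- a concurrent insert beat us; update the row it created
        -- (strictly, a loop is needed if rows can also be deleted concurrently)
        UPDATE sessions SET last_seen = now() WHERE member_id = p_member;
    END;
END;
$$ LANGUAGE plpgsql;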
C.
On Mon, 17 Feb 2003 09:51:54 +0800, "Christopher Kings-Lynne"
<chriskl@familyhealth.com.au> wrote:
Strategy three:
begin;
lock table in exclusive mode;
update row;
if (no rows affected) insert row;
commit;
Problem - Works, but this table needs high concurrency.
Chris,
distributing congestion might improve that. Instead of locking the
whole table, just lock one row in a dummy table:
CREATE TABLE dummylock(id INT PRIMARY KEY);
INSERT INTO dummylock VALUES (0);
INSERT INTO dummylock VALUES (1);
...
INSERT INTO dummylock VALUES (999);
Create an immutable function lockhash(<type of PK>) returning a value
between 0 and 999.
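For example, assuming an integer primary key (any stable mapping onto
0..999 will do):

CREATE FUNCTION lockhash(integer) RETURNS integer AS
    'SELECT abs($1) % 1000'
    LANGUAGE sql IMMUTABLE;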
BEGIN;
SET TRANSACTION ISOLATION LEVEL READ COMMITTED;
UPDATE t SET c = 'newval' WHERE pk = 'pk';
if (no rows affected) THEN
SELECT * FROM dummylock
WHERE id = lockhash('pk') FOR UPDATE;
-- try again
UPDATE t SET c = 'newval' WHERE pk = 'pk';
if (no rows affected) THEN
INSERT INTO t ...;
END IF;
END IF;
COMMIT;
This is just an idea. Completely untested ...
Servus
Manfred