Concurrent updates

Started by Steve Ericksonabout 13 years ago2 messagesgeneral
Jump to latest
#1Steve Erickson
serickson@digitiliti.com

I have a table that I want to use as a queue with all functionality (Insert,
update, delete) embodied in a stored procedure. Inserts and deletes are no
problem. An external program would call the stored procedure to get one or
more emails to work on, selecting on "state='N'", then updating the row so
"state='P'". My problem is having multiple threads calling the stored
procedure simultaneously and getting the same row(s). Selecting FOR UPDATE
won't work as, if thread #1 gets 3 rows and thread #2 starts before thread
#1 completes (Commits), thread #2 will select the same 3 rows as thread #1
except, since thread #1 will update the state (changing the state to 'P') so
that those rows no longer meet thread #2 criteria, and thread #2 will
receive zero rows. The table looks like:

CREATE TABLE dss.stage_email

(

emailid bigserial NOT NULL,

email_path text,

state character(1) DEFAULT 'N'::bpchar, -- N = New Email, P=Processing,
D=Deleting

fetch_date timestamp without time zone DEFAULT now(),

ingest_date timestamp without time zone

)

Steve Erickson

Senior Developer

266 East 7th Street, Floor 4

Saint Paul, MN 55101

651.925.3237 office

612.242.1343 cell

NOTICE: This email, including any attachments, is covered by the Electronic
Communications Privacy Act, is confidential and may be legally privileged.
If you are not the intended recipient, you are hereby notified that any
retention, dissemination, distribution or copying of this communication is
strictly prohibited. If you have received this communication in error,
please immediately notify our office by return email or at our telephone
number (651) 925-3200. Then delete and destroy all copies of this email and
any attachments.

Thank you.

#2Laurenz Albe
laurenz.albe@cybertec.at
In reply to: Steve Erickson (#1)
Re: Concurrent updates

Steve Erickson wrote:

I have a table that I want to use as a queue with all functionality (Insert, update, delete) embodied
in a stored procedure. Inserts and deletes are no problem. An external program would call the stored
procedure to get one or more emails to work on, selecting on "state='N'", then updating the row so
"state='P'". My problem is having multiple threads calling the stored procedure simultaneously and
getting the same row(s). Selecting FOR UPDATE won't work as, if thread #1 gets 3 rows and thread #2
starts before thread #1 completes (Commits), thread #2 will select the same 3 rows as thread #1
except, since thread #1 will update the state (changing the state to 'P') so that those rows no longer
meet thread #2 criteria, and thread #2 will receive zero rows. The table looks like:

CREATE TABLE dss.stage_email
(
emailid bigserial NOT NULL,
email_path text,
state character(1) DEFAULT 'N'::bpchar, -- N = New Email, P=Processing, D=Deleting
fetch_date timestamp without time zone DEFAULT now(),
ingest_date timestamp without time zone
)

Strange, it works fine for me.

SELECT * FROM stage_email;

emailid | email_path | state | fetch_date | ingest_date
---------+------------+-------+----------------------------+-------------
1 | one | N | 2013-03-19 09:25:35.25905 |
2 | two | N | 2013-03-19 09:25:38.682343 |
3 | three | N | 2013-03-19 09:25:42.162118 |
4 | four | N | 2013-03-19 09:25:45.626052 |
(4 rows)

Session 1: BEGIN;
Session 1: SELECT * FROM stage_email WHERE state = 'N' ORDER BY emailid LIMIT 2 FOR UPDATE;

emailid | email_path | state | fetch_date | ingest_date
---------+------------+-------+----------------------------+-------------
1 | one | N | 2013-03-19 09:25:35.25905 |
2 | two | N | 2013-03-19 09:25:38.682343 |
(2 rows)

Session 2: BEGIN;
Session 2: SELECT * FROM stage_email WHERE state = 'N' ORDER BY emailid LIMIT 2 FOR UPDATE;
(Session 2 blocks)

Session 1: UPDATE stage_email SET state = 'P' WHERE emailid IN (1, 2);
Session 1: COMMIT;

Now session 2 gets this result:

emailid | email_path | state | fetch_date | ingest_date
---------+------------+-------+----------------------------+-------------
3 | three | N | 2013-03-19 09:25:42.162118 |
4 | four | N | 2013-03-19 09:25:45.626052 |
(2 rows)

So session 2 can continue processing the next two entries.

Maybe you can illustrate your problem with an example.

Yours,
Laurenz Albe

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general