Weird (?) happenings with locks and limits?
I'm using PostgreSQL (via pyPgSQL) to manage a database of tasks. I
want processes to be able to "check out" a task, but I'm seeing some
odd results. If I force two processes to check out tasks at the same
time, some of them get a response indicating there are no further
tasks.
Here are the details. The database is:
CREATE TABLE jobs (
id serial,
assignedto text default NULL
);
The SQL I'm using is:
1) SELECT id FROM jobs WHERE assignedto is NULL FOR UPDATE LIMIT 1;
2) UPDATE jobs SET assignedto = 'assigned' WHERE id = <ID gotten above>;
3) Commit
Each worker is only interested in a single job, hence the "LIMIT 1".
The "weirdness" is that if two processes do step 1 above at the same time,
the second one will get an empty result set. The second process to do step
1 will wait because of the update lock until process 1 gets to step 3. If
I set the limit to 2, then the same thing happens to the third process
that's simultaneously at step 1.
It would seem that the select is performed, the second process then
blocks, and when the first process completes, the row it updated is
removed from the result set of the second one.
I'm not sure if this is a bug or a feature, but it wasn't what I was
expecting to happen. I'll probably modify the way it works so that
either I just use no limit (since I guess that wouldn't impact
performance), or make another table which holds the job number and who
it's assigned to. That way, with a unique constraint on the job number,
I can get feedback that there was a collision (instead of it just
appearing that there are no jobs to work).
Sean
--
Well I've been to one world fair, a picnic and a rodeo, and that's the
stupidest thing I've heard come over a pair of earphones. -- Major Kong
Sean Reifschneider, Inimitably Superfluous <jafo@tummy.com>
tummy.com - Linux Consulting since 1995. Qmail, KRUD, Firewalls, Python
Sean Reifschneider <jafo@tummy.com> writes:
The SQL I'm using is:
1) SELECT id FROM jobs WHERE assignedto is NULL FOR UPDATE LIMIT 1;
2) UPDATE jobs SET assignedto = 'assigned' WHERE id = <ID gotten above>;
3) Commit
The "weirdness" is that if two processes do step 1 above at the same time,
the second one will get an empty result set. The second process to do step
1 will wait because of the update lock until process 1 gets to step 3.
Yeah. The syntax is a little misleading, because actually the FOR
UPDATE lock is the last step. The second guy comes along, finds the
first row WHERE assignedto is NULL, and then blocks because he can't get
a FOR UPDATE lock on it. When the first guy commits, the second can
get a FOR UPDATE lock ... but he now discovers that assignedto isn't
NULL anymore in that row, so he's not interested in it anymore. And
then he continues the SELECT ... but the LIMIT step is underneath
FOR UPDATE, and it thinks it's done; it won't return any more rows.
I'm not sure if switching the order of the steps would improve matters
or not; offhand I suspect it would break other cases that work now.
Even if the command worked the way you hoped, you'd still have no
concurrency in this operation, because everyone entering the transaction
concurrently will find the same first candidate row, and so they'll all
try to lock FOR UPDATE that same row. I don't see any good way around
this, so I'd suggest simplifying matters by not bothering with FOR
UPDATE. Instead perhaps:
BEGIN;
LOCK TABLE jobs IN EXCLUSIVE MODE;
SELECT id FROM jobs WHERE assignedto is NULL LIMIT 1;
UPDATE jobs SET assignedto = 'assigned' WHERE id = <ID gotten above>;
COMMIT;
The LOCK ensures that only one transaction at a time does this.
If you have other updates you want to do to table jobs that don't
involve changing a NULL assignedto field, then an exclusive lock on
jobs is too strong, because it'll lock out those updates too. The
best answer here may be to create a dummy table that's used for nothing
except establishing the exclusive right to run the above sequence. The
LOCK then becomes something like
LOCK TABLE jobs_assign_interlock;
and the rest is the same.
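(Not from the thread: the serialize-the-checkout pattern Tom describes can be sketched in-memory, with a Python threading.Lock standing in for the exclusive table lock and a dict standing in for the jobs table. The names `jobs`, `checkout`, and the worker ids are invented for illustration.)

```python
import threading

# In-memory stand-in for the jobs table: id -> assignedto (None = unassigned).
jobs = {1: None, 2: None, 3: None}

# Plays the role of LOCK TABLE ... IN EXCLUSIVE MODE (or of the dummy
# interlock table): only one worker may run the select-then-update
# sequence at a time.
checkout_lock = threading.Lock()

def checkout(worker):
    """Atomically claim one unassigned job; return its id, or None if none remain."""
    with checkout_lock:
        for job_id, assignee in jobs.items():
            if assignee is None:
                jobs[job_id] = worker  # UPDATE jobs SET assignedto = ...
                return job_id
        return None  # no unassigned jobs left

results = {}
threads = [threading.Thread(target=lambda w=w: results.update({w: checkout(w)}))
           for w in ("w1", "w2", "w3", "w4")]
for t in threads:
    t.start()
for t in threads:
    t.join()
# Three workers each claim a distinct job; the fourth correctly gets None.
```

Because the whole select-then-update runs under one lock, no worker can observe the half-updated state that caused the empty result set in the original FOR UPDATE version.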
regards, tom lane
On Sunday 14 Jul 2002 1:33 am, Sean Reifschneider wrote:
1) SELECT id FROM jobs WHERE assignedto is NULL FOR UPDATE LIMIT 1;
2) UPDATE jobs SET assignedto = 'assigned' WHERE id = <ID gotten above>;
3) Commit
Each worker is only interested in a single job, hence the "LIMIT 1".
The "weirdness" is that if two processes do step 1 above at the same time,
the second one will get an empty result set. The second process to do step
1 will wait because of the update lock until process 1 gets to step 3. If
I set the limit to 2, then the same thing happens to the third process
that's simultaneously at step 1.
Look at what gets returned from the first select (strip the FOR UPDATE) - if
you repeat the select several times you'll get the same id. Now, in theory
you're not *guaranteed* the same id, but in practice that'll tend to be the
case.
So - the first process selects e.g. id=4, and locks it. The second process
runs the same select and comes up with id=4, sees it is already locked and
then blocks waiting to see if process #1 does anything with that row. All
perfectly reasonable. What you want to say is "select id where assignedto is
null and row is not locked" but I don't know how to help you there.
The interesting thing is (possible bug / my misunderstanding?) if I try it on
7.2.1 the second process returns no results. I can see how that might happen
but I'm not convinced it's the "correct" behaviour.
At the start, I've selected for update in process 1 and got id=4.
richardh=> begin;
BEGIN
The select blocks, and when I update that row in process 1, I get...
richardh=> SELECT id FROM jobs WHERE assignedto is NULL FOR UPDATE LIMIT 1;
id
----
(0 rows)
richardh=> select * from jobs;
id | assignedto
----+------------
1 | a
2 | b
3 | c
5 | d
6 |
4 | p1
(6 rows)
richardh=> SELECT id FROM jobs WHERE assignedto is NULL FOR UPDATE LIMIT 1;
id
----
6
(1 row)
You can see how, if the SELECT is looking only at one row, it will see
that row invalidated. I can't tell whether changing the transaction
isolation level would help here (is serializable needed?) or if this is
a bug.
- Richard Huxton
On Mon, Jul 15, 2002 at 10:19:43AM -0400, Tom Lane wrote:
Yeah. The syntax is a little misleading, because actually the FOR
UPDATE lock is the last step. The second guy comes along, finds the
first row WHERE assignedto is NULL, and then blocks because he can't get
a FOR UPDATE lock on it. When the first guy commits, the second can
get a FOR UPDATE lock ... but he now discovers that assignedto isn't
NULL anymore in that row, so he's not interested in it anymore. And
So it's re-verifying the WHERE clauses after the other branch commits. It
would seem like it would be hard to get correct results after a commit
without re-running the SELECT (in the case where the lock blocks after the
select but before the results are returned).
try to lock FOR UPDATE that same row. I don't see any good way around
this, so I'd suggest simplifying matters by not bothering with FOR
UPDATE. Instead perhaps:
That's a good idea, thanks. I'm also considering having another table
which lists what jobs have been assigned, and having that be a unique
constraint, which would cause other processes requesting the same job to
get a constraint violation and let me re-run the query. On the other hand,
I don't really need concurrency, so locking the table should be fine.
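(Not from the thread: the unique-constraint alternative Sean describes can also be sketched in-memory. A guarded dict insert stands in for an INSERT into an assignments table with a unique constraint on the job id, and the worker retries with the next candidate on a collision. All names here are invented for illustration.)

```python
import threading

jobs = [1, 2, 3]          # candidate job ids, currently unassigned
assignments = {}          # stands in for the assignments table
assignments_lock = threading.Lock()

class CollisionError(Exception):
    """Stands in for a unique-constraint violation on the job id."""

def insert_assignment(job_id, worker):
    # The guarded dict insert plays the role of INSERT into a table with
    # a UNIQUE constraint: a duplicate insert raises instead of silently
    # overwriting.
    with assignments_lock:
        if job_id in assignments:
            raise CollisionError(job_id)
        assignments[job_id] = worker

def checkout(worker):
    """Claim a job; on collision, re-run the query with the next candidate."""
    for job_id in jobs:
        try:
            insert_assignment(job_id, worker)
            return job_id
        except CollisionError:
            continue  # someone else grabbed it; try the next candidate
    return None  # every candidate is taken
```

Unlike the FOR UPDATE version, a collision here is an explicit error the worker can react to, rather than a silently empty result set.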
Perhaps the "FOR UPDATE" section of the SELECT documentation should contain
something like:
The lock used by FOR UPDATE may cause some select results to be marked
as no longer valid in cases where another lock has changed related rows.
This may cause some or all of the original SELECT results to become
invalid, producing an artificially small or empty result set.
Thanks,
Sean
--
Follow your dreams. Unless it's the one where you're at work in your
underwear during a fire drill.
Sean Reifschneider, Inimitably Superfluous <jafo@tummy.com>
tummy.com - Linux Consulting since 1995. Qmail, KRUD, Firewalls, Python
Hi Tom,
thanks for your reply
You could try reindexing the system tables, but I don't have a lot of
hope for that. See the REINDEX man page.
I already tried this ... didn't work out.
OK - I am sorry, but the ECC RAM I ordered didn't work in the old server,
so I had to order a new server as fast as possible... it's going to be
running in about 20-30 minutes, hopefully....
How can I start postgres in stand-alone mode with "-o -P" as you wrote
and then do a pg_dump ? Or did I get you wrong?
or is only an initdb possible?
I'd recommend an initdb in any case on the new server. Who knows how
much corruption has snuck into your existing database files? You
certainly cannot copy them over to the new server without great risk.
If you are able to do a pg_dump then you could use the dump file with
some amount of assurance.
This is iffy, but if REINDEX fails, you might try starting the
postmaster with "-o -P" (disable use of system indexes) and then
see if you can do a pg_dump. I think that might get around the loss
of pg_statistic_relid_att_index.
In any case it's foolish to keep running a live application on such
clearly broken hardware. The longer you run, the more corrupt your
database will get. I suggest that you do nothing except an immediate
pg_dump...
regards, tom lane
Henrik Steffen wrote:
Hi Tom,
thanks for your reply
You could try reindexing the system tables, but I don't have a lot of
hope for that. See the REINDEX man page.
I already tried this ... didn't work out.
OK - I am sorry, but the ECC-RAM I ordered didn't run in the old server
so I had to order a new server as fast as possible... it's going to run
in about 20-30 minutes hopefully....
How can I start postgres in stand-alone mode with "-o -P" as you wrote
and then do a pg_dump ? Or did I get you wrong?
postgres -o -P does look valuable:
postmaster -o -P -D ...
That may help.
--
Bruce Momjian | http://candle.pha.pa.us
pgman@candle.pha.pa.us | (610) 853-3000
+ If your life is a hard drive, | 830 Blythe Avenue
+ Christ can be your backup. | Drexel Hill, Pennsylvania 19026
Yes, but that's stand-alone mode...
If I now try to pg_dump from the bash, it says "no postmaster's running".
How can I dump in stand-alone mode?
With kind regards,
Henrik Steffen
Managing Director
top concepts Internetmarketing GmbH
Am Steinkamp 7 - D-21684 Stade - Germany
--------------------------------------------------------
http://www.topconcepts.com Tel. +49 4141 991230
mail: steffen@topconcepts.com Fax. +49 4141 991233
--------------------------------------------------------
24h-Support Hotline: +49 1908 34697 (EUR 1.86/Min,topc)
--------------------------------------------------------
System-Partner gesucht: http://www.franchise.city-map.de
--------------------------------------------------------
Handelsregister: AG Stade HRB 5811 - UstId: DE 213645563
--------------------------------------------------------
----- Original Message -----
From: "Bruce Momjian" <pgman@candle.pha.pa.us>
To: "Henrik Steffen" <steffen@city-map.de>
Cc: "Tom Lane" <tgl@sss.pgh.pa.us>; "pg" <pgsql-general@postgresql.org>
Sent: Monday, July 15, 2002 7:50 PM
Subject: Re: [GENERAL] Tom, I've got an emergency - please help me
"Henrik Steffen" <steffen@city-map.de> writes:
How can I start postgres in stand-alone mode with "-o -P" as you wrote
and then do a pg_dump ? Or did I get you wrong?
No, pg_dump won't talk to a standalone backend. I was wondering if it
would work to start the normal postmaster with -P option (which has to
be passed through to the backend, thus -o) and then dump. Worth a
try...
regards, tom lane
Ah, I see - "-o -P" is something completely different from "-O -P".
ok ... now I am dumping.... no errors so far....
we'll see if it'll work better on the new server.
I am lucky that I didn't lose any data, last backup is 32 hours old :((
thanks to all
----- Original Message -----
From: "Tom Lane" <tgl@sss.pgh.pa.us>
To: "Henrik Steffen" <steffen@city-map.de>
Cc: "pg" <pgsql-general@postgresql.org>
Sent: Monday, July 15, 2002 8:00 PM
Subject: Re: Tom, I've got an emergency - please help me
Tom Lane wrote:
"Henrik Steffen" <steffen@city-map.de> writes:
How can I start postgres in stand-alone mode with "-o -P" as you wrote
and then do a pg_dump ? Or did I get you wrong?
No, pg_dump won't talk to a standalone backend. I was wondering if it
would work to start the normal postmaster with -P option (which has to
be passed through to the backend, thus -o) and then dump. Worth a
try...
Yes use -o -P on normal postmaster start, not stand-alone postgres
backend.
Hello all,
my new hardware is up and running, the dump is restored, the new
server is equipped with 1 GB of ECC RAM and an AMD 1.6 GHz processor
Until now the system has been running without any problems.
Thanks for your help!
Hopefully, the problem is solved now!
----- Original Message -----
From: "Bruce Momjian" <pgman@candle.pha.pa.us>
To: "Tom Lane" <tgl@sss.pgh.pa.us>
Cc: "Henrik Steffen" <steffen@city-map.de>; "pg" <pgsql-general@postgresql.org>
Sent: Monday, July 15, 2002 8:15 PM
Subject: Re: [GENERAL] Tom, I've got an emergency - please help me
On Tue, 16 Jul 2002 07:51:42 +0000 (UTC), steffen@city-map.de ("Henrik
Steffen") wrote:
my new hardware is up and running, the dump is restored, the new
server is equipped with 1 GB of ECC RAM and an AMD 1.6 GHz processor
Until now the system has been running without any problems.
Thanks for your help!
Hopefully, the problem is solved now!
I've watched this thread from the beginning with great interest.
I must say how impressed I am by the support given to Henrik Steffen.
Not just in technical expertise, but with such quick response and
dedication.
Even though the problem was not caused by PostgreSQL, the group seemed
to be resolved to finding a resolution, not just to the cause, but
also to the recovery.
Why do people run their businesses on MS Access / MS SQL Server, or
even competent products like Oracle, when products like PostgreSQL (or
whatever you wish to call it) exist?
I work with commercial DB products (MS SQL/Oracle/DB2), and the company
I work for pays a vendor 10,000 UKP/pa for support for just ONE bespoke
application they use on their Oracle database (which has a separate
support cost of its own). The support calls average 2/pa for this
application. In the two years I've been here, it's been me that's
actually resolved the problems. The application is no more than a
Windows-based SQL report writer. There are approximately 20 such
bespoke applications used by this company.
If I received support like I've seen here, the money would be well
spent.
On Tue, 16 Jul 2002, Ian Linwood wrote:
On Tue, 16 Jul 2002 07:51:42 +0000 (UTC), steffen@city-map.de ("Henrik
Steffen") wrote:my new hardware is up and running, the dump is restored, the new
server is equipped with 1 GB of ECC RAM and an AMD 1.6 GHz processorUntil now the system has been running without any problems.
Thanks for your help!
Hopefully, the problem is solved now!
I've watched this thread from the beginning with great interest.
I must say how impressed I am by the support given to Henrik Steffen.
Not just in technical expertise, but with such quick response and
dedication.
[trimmed]
You're not kidding. I've saved that thread on the off chance it can be used
when selling PostgreSQL to the World, or a client at least.
Could that example perhaps be written up as a case study and used on the web
site for marketing?
--
Nigel J. Andrews
Director
---
Logictree Systems Limited
Computer Consultants