Is what I want possible and if so how?
Hi,
I have a table with events that must be handled by multiple clients. It
does not matter which client handles an event, but no two clients may
handle the same event and an event may only handled once. A client can
only determine the availability of an event by querying the database.
The access to the table should be queue-like with synchronization.
My idea was that a client should do a "SELECT" on the table and mark the
selected records as "being handled" to avoid double handling.
I have read the manual about "LOCK", "SET TRANSACTION" and "SELECT...FOR
UPDATE", but from what I understand, they cannot prevent a "SELECT" from
another client, based on the "SELECT".
Is there a way to make this possible?
TIA
--
Groeten,
Joost Kraaijeveld
Askesis B.V.
Molukkenstraat 14
6524NB Nijmegen
tel: 024-3888063 / 06-51855277
fax: 024-3608416
web: www.askesis.nl
Joost Kraaijeveld <J.Kraaijeveld@Askesis.nl> writes:
I have read the manual about "LOCK", "SET TRANSACTION" and "SELECT...FOR
UPDATE", but from what I understand, they cannot prevent a "SELECT" from
another client, based on the "SELECT".Is there a way to make this possible?
Have all the clients use SELECT FOR UPDATE.
-Doug
On Mon, 2006-07-03 at 07:54 -0400, Douglas McNaught wrote:
Have all the clients use SELECT FOR UPDATE.
OK, thanks, your answer lead me to a re-read of the docs and I think I
found a way to do it.
--
Groeten,
Joost Kraaijeveld
Askesis B.V.
Molukkenstraat 14
6524NB Nijmegen
tel: 024-3888063 / 06-51855277
fax: 024-3608416
web: www.askesis.nl
Joost Kraaijeveld <J.Kraaijeveld@Askesis.nl> writes:
I have a table with events that must be handled by multiple clients. It
does not matter which client handles an event, but no two clients may
handle the same event and an event may only handled once. A client can
only determine the availability of an event by querying the database.
The access to the table should be queue-like with synchronization.
My idea was that a client should do a "SELECT" on the table and mark the
selected records as "being handled" to avoid double handling.
I have read the manual about "LOCK", "SET TRANSACTION" and "SELECT...FOR
UPDATE", but from what I understand, they cannot prevent a "SELECT" from
another client, based on the "SELECT".
Is there a way to make this possible?
Sure, but everybody has to use SELECT FOR UPDATE and/or UPDATE.
For instance, do this in READ COMMITTED mode:
begin;
select jobid from queue
where status = 'waiting'
order by priority
limit 1
for update;
[ if no row returned, rollback, sleep a bit, try again ]
update queue set status = 'busy' where jobid = 'prev result';
commit;
Once TX1 has selected a row for update, it's locked and TX2 will sleep
until TX1 commits before selecting it; then it will see the change of
status and TX2's select will not return the row. This isn't entirely
perfect because LIMIT acts before FOR UPDATE: TX2's select will return
nothing, rather than selecting the next available row as you might wish.
So you might want to retry the select several times before deciding
there's nothing to do.
Also, an index on priority would be a good idea to make the select
fast.
regards, tom lane
On Mon, 2006-07-03 at 17:03, Tom Lane wrote:
status and TX2's select will not return the row. This isn't entirely
perfect because LIMIT acts before FOR UPDATE: TX2's select will return
nothing, rather than selecting the next available row as you might wish.
So you might want to retry the select several times before deciding
there's nothing to do.
We do have a table like this, and in fact we did observe this behavior
that if multiple clients ask for a row at the same time, the first gets
something and the rest nothing. We're actually still looking for an
optimal solution for this...
For now, we added a random field to the table (with values 0-9), and the
clients asks with a where clause for a random value in this field. This
way there's a good chance the clients will not tip on each other's toes
(i.e. the row asked for is not locked by another client). It is still
necessary to retry a few times, but after introducing this random number
mechanism we did notice a significant performance improvement in
emptying the queue... so it must work somehow. It's true that we usually
have 10-15 clients constantly polling the queue, and the queue itself is
usually loaded with at least a few hundred tasks, so the random numbers
are reasonably distributed to be effective.
Now I wonder if there's some other way to get the same result without
additional column in the table ?
Cheers,
Csaba.
Joost,
Everyone has to do a select for update, then they will be serialized.
Dave
On 3-Jul-06, at 6:45 AM, Joost Kraaijeveld wrote:
Show quoted text
Hi,
I have a table with events that must be handled by multiple
clients. It
does not matter which client handles an event, but no two clients may
handle the same event and an event may only handled once. A client
can
only determine the availability of an event by querying the database.
The access to the table should be queue-like with synchronization.My idea was that a client should do a "SELECT" on the table and
mark the
selected records as "being handled" to avoid double handling.I have read the manual about "LOCK", "SET TRANSACTION" and
"SELECT...FOR
UPDATE", but from what I understand, they cannot prevent a "SELECT"
from
another client, based on the "SELECT".Is there a way to make this possible?
TIA
--
Groeten,Joost Kraaijeveld
Askesis B.V.
Molukkenstraat 14
6524NB Nijmegen
tel: 024-3888063 / 06-51855277
fax: 024-3608416
web: www.askesis.nl---------------------------(end of
broadcast)---------------------------
TIP 2: Don't 'kill -9' the postmaster
Csaba Nagy wrote:
On Mon, 2006-07-03 at 17:03, Tom Lane wrote:
status and TX2's select will not return the row. This isn't entirely
perfect because LIMIT acts before FOR UPDATE: TX2's select will return
nothing, rather than selecting the next available row as you might wish.
So you might want to retry the select several times before deciding
there's nothing to do.We do have a table like this, and in fact we did observe this behavior
that if multiple clients ask for a row at the same time, the first gets
something and the rest nothing. We're actually still looking for an
optimal solution for this...For now, we added a random field to the table (with values 0-9), and the
clients asks with a where clause for a random value in this field. This
way there's a good chance the clients will not tip on each other's toes
(i.e. the row asked for is not locked by another client). It is still
necessary to retry a few times, but after introducing this random number
mechanism we did notice a significant performance improvement in
emptying the queue... so it must work somehow. It's true that we usually
have 10-15 clients constantly polling the queue, and the queue itself is
usually loaded with at least a few hundred tasks, so the random numbers
are reasonably distributed to be effective.Now I wonder if there's some other way to get the same result without
additional column in the table ?
For a small number of processes and a large difference in time
between the 'loookup' speed and the 'work' I have used a two-step
process where you first get a batch of records and then try them
all in rapid succession. In pseudocode:
SELECT *
FROM table
WHERE condition
LIMIT number_of_queue_processes + 1;
LOOP;
BEGIN;
SELECT *
FROM table
WHERE condition AND pk = xxx
LIMIT 1 FOR UPDATE NOWAIT;
do something;
COMMIT;
END;
Jochem
Jochem,
For a small number of processes and a large difference in time
between the 'loookup' speed and the 'work' I have used a two-step
process where you first get a batch of records and then try them
all in rapid succession. In pseudocode:SELECT *
FROM table
WHERE condition
LIMIT number_of_queue_processes + 1;LOOP;
BEGIN;
SELECT *
FROM table
WHERE condition AND pk = xxx
LIMIT 1 FOR UPDATE NOWAIT;do something;
COMMIT;
END;
I decided to use the same schema here. The only improvement I can see to
it is to shuffle the batch in random order... that way the competing
processors get a higher chance to avoid collision. I have to see if this
really works out well once I get the chance to deploy it...
Thanks,
Csaba.