Select for update with offset interferes with concurrent transactions

Started by Yngve Nysaeter Pettersen · about 15 years ago · 17 messages · general

Hello all,

I am in the process of migrating a system from Postgresql 8.3 to 9.0, and
have run into a problem with the task queue systems I am using.

The task queue controls the allocation of tasks between about 1000
processes working in parallel, and is essentially a table of

record_id (unique)
project_id
task_description_id
state (idle, started, finished)

Each project currently has about 2 million entries. My plan is to
increase that significantly over the next few months.

To avoid having the processes trample each other's queries (the first
attempt was to select the first matching entries of the table, which
caused one process to block all other transactions), one of the steps I
took was to select a set of idle rows at a random offset into the
project's part of the table, mark them FOR UPDATE, then set each
record's state to started.

SELECT record_id FROM queue WHERE project_id = my_project AND state =
idle LIMIT n OFFSET i FOR UPDATE

At present "n" is 100-150, "i" is a random value in the range 0-10000.

There is, intentionally, no ordering specified, since that would just slow
down the query, and is not necessary.

For reference, the above query is sent through Django's cursor.execute()
call in a manual transaction block.
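
Spelled out as plain SQL (a sketch only; the project id, the n and i
values, and the ids in the final UPDATE are illustrative placeholders),
one claim cycle looks roughly like this:

```sql
BEGIN;

-- claim a slice of idle rows at a random offset (n = 100, i = 5000 here)
SELECT record_id FROM queue
 WHERE project_id = 42 AND state = 'idle'
 LIMIT 100 OFFSET 5000
 FOR UPDATE;

-- mark each record_id returned above as started, then release the locks
UPDATE queue SET state = 'started'
 WHERE record_id IN (101, 102, 103);  -- the ids from the SELECT above

COMMIT;
```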

What I've discovered when using Postgres 9.0 is that the processes are now
blocking every other query into this table, apparently reducing the task
processing speed by at least a factor of 10, and increasing the load on
the server by a similar factor, compared to when Postgres 8.3 was used.
The problem is apparent just after starting, with only 50-100 processes
active (startup is staggered).

Reducing "n" (and looping), or increasing the "i" range did not work.

The reason seems to be this new part of
http://www.postgresql.org/docs/9.0/static/sql-select.html (towards the end
of the FOR UPDATE section):

If a LIMIT is used, locking stops once enough rows have been returned to
satisfy the limit (but note that rows skipped over by OFFSET will get
locked). Similarly, if FOR UPDATE or FOR SHARE is used in a cursor's
query, only rows actually fetched or stepped past by the cursor will be
locked.

I can't find similar text in the 8.3 or 8.4 documentation.

AFAICT, and assuming I have not misunderstood this part of the
documentation, this means that if one of my processing nodes selects a
block of 100 entries at offset 8000 in the result set, then every other
node will be blocked while the block is being processed: not just the
nodes that would have selected rows in the range 0 to 7999, but also
those at offset >= 8100, because they cannot gain access to the locked
rows.
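
A two-session sketch of the effect described above (illustrative table
and values; under the 9.0 rule quoted from the docs, LIMIT n OFFSET i
locks every row stepped past as well as the rows returned, i.e. result
positions 0 through i+n-1, so any two such queries collide at the low
positions):

```sql
-- session A: claims 100 rows at offset 8000; this also locks every row
-- it stepped past, i.e. positions 0..8099 of the matching rows
BEGIN;
SELECT record_id FROM queue
 WHERE project_id = 42 AND state = 'idle'
 LIMIT 100 OFFSET 8000
 FOR UPDATE;

-- session B: even OFFSET 0 now blocks, because its candidate rows
-- (positions 0..99) are inside the set session A locked
BEGIN;
SELECT record_id FROM queue
 WHERE project_id = 42 AND state = 'idle'
 LIMIT 100 OFFSET 0
 FOR UPDATE;   -- waits until session A commits or rolls back
```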

Also, using FOR SHARE does not seem to solve the problem.

IMO, as a database non-expert, locking rows that were not returned as a
result of the query is a bug. As an example, if a query selects the last X
items of the matching rows, that is equivalent to locking the table, or
the relevant part of it, even though the requester has no intention of
modifying those other rows.

Is there any way to avoid this problem? Or do I have to add a random
batch_id field to the queue table in order to separate the processes'
queries so that they do not block each other (as frequently)?

Is it possible to disable the source code causing this (that is, reverting
the patch that introduced the problem, or changing a configuration switch)?

--
Sincerely,
Yngve N. Pettersen
********************************************************************
Senior Developer Email: yngve@opera.com
Opera Software ASA http://www.opera.com/
Phone: +47 23 69 32 60 Fax: +47 23 69 24 01
********************************************************************

#2 Andy Colson
andy@squeakycode.net
In reply to: Yngve Nysaeter Pettersen (#1)
Re: Select for update with offset interferes with concurrent transactions

On 2/1/2011 6:32 AM, Yngve Nysaeter Pettersen wrote:

Hello all,

I am in the process of migrating a system from Postgresql 8.3 to 9.0,
and have run into a problem with the task queue systems I am using.

The task queue controls the allocation of tasks between about 1000
processes working in parallel, and is essentially a table of

record_id (unique)
project_id
task_description_id
state (idle, started, finished)

Each project currently has about 2 million entries. My plan is to
increase that significantly over the next few months.

To avoid having the processes trample each other's queries (the first
attempt was to select the first matching entries of the table, which
caused one to block all other transactions), one of the steps I took was
to select a set of idle rows at a random offset into the table from the
project, mark them for update, then update each record's state as started.

SELECT record_id FROM queue WHERE project_id = my_project AND state =
idle LIMIT n OFFSET i FOR UPDATE

At present "n" is 100-150, "i" is a random value in the range 0-10000.

There is, intentionally, no ordering specified, since that would just
slow down the query, and is not necessary.

For reference, the above query is sent through Django's cursor.execute()
call in a manual transaction block.

What I've discovered when using Postgres 9.0 is that the processes are
now blocking every other query into this table, apparently reducing the
task processing speed by at least a factor of 10, and increasing the
load on the server by a similar factor, compared to when Postgres 8.3
was used. The problem is apparent just after starting, with only 50-100
processes active (startup is staggered).

Reducing "n" (and looping), or increasing the "i" range did not work.

The reason seems to be this new part of
http://www.postgresql.org/docs/9.0/static/sql-select.html (towards the
end of the FOR UPDATE section):

If a LIMIT is used, locking stops once enough rows have been returned to
satisfy the limit (but note that rows skipped over by OFFSET will get
locked). Similarly, if FOR UPDATE or FOR SHARE is used in a cursor's
query, only rows actually fetched or stepped past by the cursor will be
locked.

I can't find similar text in the 8.3 or 8.4 documentation.

AFAICT, and assuming I have not misunderstood this part of the
documentation, this means that if one of my processing nodes selects a
block of 100 entries at offset 8000 in the result set, then every other
node will be blocked while the block is being processed: not just the
nodes that would have selected rows in the range 0 to 7999, but also
those at offset >= 8100, because they cannot gain access to the locked
rows.

Also, using FOR SHARE does not seem to solve the problem.

IMO, as a database non-expert, locking rows that were not returned as a
result of the query is a bug. As an example, if a query selects the last
X items of the matching rows, that is equivalent to locking the table,
or the relevant part of it, even though the requester has no intention
of modifying those other rows.

Is there any way to avoid this problem? Or do I have to add a random
batch_id field to the queue table in order to separate the processes'
queries so that they do not block each other (as frequently)?

Is it possible to disable the source code causing this (that is,
reverting the patch that introduced the problem, or changing a
configuration switch)?

So, if I understand correctly, you:

q = SELECT record_id FROM queue
    WHERE project_id = my_project AND state = idle
    LIMIT n OFFSET i FOR UPDATE
while not q.eof
    update queue set state = started where record_id = x;
    process record_id
    update queue set state = finished where record_id = x;
    q.next;

Might I suggest an alternative:

q = update queue set state = started
    WHERE project_id = my_project AND state = idle
    LIMIT n OFFSET i
    RETURNING project_id;
idlist = @q;
commit;

foreach x in idlist
    process record_id
    begin
        update queue set state = finished where record_id = x;
    commit;

Forgive the part-Perl, part-Python pseudocode. Oh, and I've never done
this, no idea if it actually works. :-)
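
(One caveat on the sketch: PostgreSQL's UPDATE statement has no
LIMIT/OFFSET clause, and the id handed back should presumably be
record_id rather than project_id, so a runnable variant would have to
push the limiting into a subselect, roughly as follows, with
illustrative values:)

```sql
UPDATE queue SET state = 'started'
 WHERE record_id IN (SELECT record_id FROM queue
                      WHERE project_id = 42 AND state = 'idle'
                      LIMIT 100 OFFSET 5000)
RETURNING record_id;
```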

-Andy

#3 Yngve Nysaeter Pettersen
yngve@opera.com
In reply to: Andy Colson (#2)
Re: Select for update with offset interferes with concurrent transactions

Hi,

Thanks for the quick answer, Andy.

On Tue, 01 Feb 2011 16:19:17 +0100, Andy Colson <andy@squeakycode.net>
wrote:

<snip>

So, if I understand correctly, you:

q = SELECT record_id FROM queue
    WHERE project_id = my_project AND state = idle
    LIMIT n OFFSET i FOR UPDATE
while not q.eof
    update queue set state = started where record_id = x;
    process record_id
    update queue set state = finished where record_id = x;
    q.next;

Almost, the update to "started" is done for all selected elements first,
releasing the lock, then the items are processed one at a time, marking
each "finished" as they complete. (each processing step can take minutes,
so keeping a lock the whole time is not an option)

Might I suggest an alternative:

q = update queue set state = started
    WHERE project_id = my_project AND state = idle
    LIMIT n OFFSET i
    RETURNING project_id;
idlist = @q;
commit;

foreach x in idlist
    process record_id
    begin
        update queue set state = finished where record_id = x;
    commit;

Forgive the part-Perl, part-Python pseudocode. Oh, and I've never done
this, no idea if it actually works. :-)

Thanks for that suggestion, I'll take a look at it.

While I hadn't caught on to the "RETURNING" part, I had been wondering
whether a single-step UPDATE might be a solution. One concern I have is
how concurrent updates will affect the returned list (or whether those
rows will just be skipped, as a SELECT would skip them in normal
transaction mode, if I understood correctly), or whether the query might
return an error code (I know that the normal UPDATE return value is the
number of updated rows; I am just not sure whether that applies with
"RETURNING").

Although, I will note that this process (if it works) will, sort of, make
FOR UPDATE redundant. Or, if it doesn't, the current lock-policy might
cause issues for concurrent updates for the use-cases where FOR UPDATE is
relevant.

--
Sincerely,
Yngve N. Pettersen

#4 Andy Colson
andy@squeakycode.net
In reply to: Yngve Nysaeter Pettersen (#3)
Re: Select for update with offset interferes with concurrent transactions

On 2/1/2011 10:59 AM, Yngve Nysaeter Pettersen wrote:

Hi,

Thanks for the quick answer, Andy.

On Tue, 01 Feb 2011 16:19:17 +0100, Andy Colson <andy@squeakycode.net>
wrote:

<snip>

So, if I understand correctly, you:

q = SELECT record_id FROM queue
    WHERE project_id = my_project AND state = idle
    LIMIT n OFFSET i FOR UPDATE
while not q.eof
    update queue set state = started where record_id = x;
    process record_id
    update queue set state = finished where record_id = x;
    q.next;

Almost, the update to "started" is done for all selected elements first,
releasing the lock, then the items are processed one at a time, marking
each "finished" as they complete. (each processing step can take
minutes, so keeping a lock the whole time is not an option)

Might I suggest an alternative:

q = update queue set state = started
    WHERE project_id = my_project AND state = idle
    LIMIT n OFFSET i
    RETURNING project_id;
idlist = @q;
commit;

foreach x in idlist
    process record_id
    begin
        update queue set state = finished where record_id = x;
    commit;

Forgive the part-Perl, part-Python pseudocode. Oh, and I've never done
this, no idea if it actually works. :-)

Thanks for that suggestion, I'll take a look at it.

While I hadn't caught on to the "RETURNING" part, I had been wondering
whether a single-step UPDATE might be a solution. One concern I have is
how concurrent updates will affect the returned list (or whether those
rows will just be skipped, as a SELECT would skip them in normal
transaction mode, if I understood correctly), or whether the query might
return an error code (I know that the normal UPDATE return value is the
number of updated rows; I am just not sure whether that applies with
"RETURNING").

Although, I will note that this process (if it works) will, sort of,
make FOR UPDATE redundant. Or, if it doesn't, the current lock-policy
might cause issues for concurrent updates for the use-cases where FOR
UPDATE is relevant.

Yeah, I'd wondered the same thing. It could be that two updates hitting
the same row will deadlock, or maybe not, I'm not sure. But I think it's
the same as with the select: if you happen to have two limits that hit
the same range, you're in trouble.

I think the random limit thing is a race condition in itself. Whenever you
have multiple processes hitting the same rows you're going to run into
problems. Have you thought of using a sequence instead of a random
limit? Each process could get the next 100 record_ids via a sequence;
then there would be much less chance of deadlock.

-Andy

#5 Tom Lane
tgl@sss.pgh.pa.us
In reply to: Yngve Nysaeter Pettersen (#1)
Re: Select for update with offset interferes with concurrent transactions

"Yngve Nysaeter Pettersen" <yngve@opera.com> writes:

To avoid having the processes trample each other's queries (the first
attempt was to select the first matching entries of the table, which
caused one to block all other transactions), one of the steps I took was
to select a set of idle rows at a random offset into the table from the
project, mark them for update, then update each record's state as started.

SELECT record_id FROM queue WHERE project_id = my_project AND state =
idle LIMIT n OFFSET i FOR UPDATE

At present "n" is 100-150, "i" is a random value in the range 0-10000.

There is, intentionally, no ordering specified, since that would just slow
down the query, and is not necessary.

This seems like a pretty bad design. There are recognized ways to solve
this problem with more predictability and much less chance of different
processes blocking each other. In particular, this query seems to be
based on some untenable assumptions about the physical row order being
stable.

What I've discovered when using Postgres 9.0 is that the processes are now
blocking every other query into this table,

In 9.0, LIMIT/OFFSET processing is done after FOR UPDATE locking, which
means that rows skipped over by OFFSET still get locked, which means
that different sessions executing this query are now practically certain
to block each other, rather than just likely to block each other.
This was an intentional change to improve the predictability of FOR
UPDATE's interactions with LIMIT/OFFSET, and indeed it's improved the
predictability of the behavior for you, just not in the direction you'd
like :-(

regards, tom lane

#6 Yngve Nysaeter Pettersen
yngve@opera.com
In reply to: Tom Lane (#5)
Re: Select for update with offset interferes with concurrent transactions

On Tue, 01 Feb 2011 18:18:17 +0100, Tom Lane <tgl@sss.pgh.pa.us> wrote:

"Yngve Nysaeter Pettersen" <yngve@opera.com> writes:

To avoid having the processes trample each other's queries (the first
attempt was to select the first matching entries of the table, which
caused one to block all other transactions), one of the steps I took was
to select a set of idle rows at a random offset into the table from the
project, mark them for update, then update each record's state as
started.

SELECT record_id FROM queue WHERE project_id = my_project AND state =
idle LIMIT n OFFSET i FOR UPDATE

At present "n" is 100-150, "i" is a random value in the range 0-10000.

There is, intentionally, no ordering specified, since that would just
slow down the query, and is not necessary.

This seems like a pretty bad design.

Well, I don't claim to be a database expert ;).

While there might be better ways, the current one has worked OK in the
year since it was implemented.

There are recognized ways to solve
this problem with more predictability and much less chance of different
processes blocking each other.

I'd appreciate it if you could provide a couple of pointers.

In particular, this query seems to be based on some untenable
assumptions about the physical row order being stable.

No, it does not assume that the row order is stable; I don't really care
about the order of the elements. The actual order of task execution
depends much more significantly on other variables, and the actual order
isn't important at all (although further design changes might impose some
limited category grouping on the queue, even then the ordering within a
group would not be important).

What I've discovered when using Postgres 9.0 is that the processes are
now blocking every other query into this table,

In 9.0, LIMIT/OFFSET processing is done after FOR UPDATE locking, which
means that rows skipped over by OFFSET still get locked, which means
that different sessions executing this query are now practically certain
to block each other, rather than just likely to block each other.
This was an intentional change to improve the predictability of FOR
UPDATE's interactions with LIMIT/OFFSET, and indeed it's improved the
predictability of the behavior for you, just not in the direction you'd
like :-(

That might be, but is it necessary to keep the elements that are not used
in the final response locked past the completion of the query (which is
what it sounds like to me)?

What happens now, if I understand it correctly, is that if a "select foo
from bar order by whatever limit 1 offset tablelen-1 for update" is
performed, the effective operation is also "LOCK bar", not just a row
lock on item tablelen-1 in that table. Was that the intention? (And yes,
I am aware that ordering might be used to reverse the sequence so that
offset 0 can be used, but wouldn't that just as much block the query for
offset 1?)

--
Sincerely,
Yngve N. Pettersen

#7 Yngve Nysaeter Pettersen
yngve@opera.com
In reply to: Andy Colson (#4)
Re: Select for update with offset interferes with concurrent transactions

On Tue, 01 Feb 2011 18:11:23 +0100, Andy Colson <andy@squeakycode.net>
wrote:

On 2/1/2011 10:59 AM, Yngve Nysaeter Pettersen wrote:

Hi,

Thanks for the quick answer, Andy.

On Tue, 01 Feb 2011 16:19:17 +0100, Andy Colson <andy@squeakycode.net>
wrote:

<snip>

So, if I understand correctly, you:

q = SELECT record_id FROM queue
    WHERE project_id = my_project AND state = idle
    LIMIT n OFFSET i FOR UPDATE
while not q.eof
    update queue set state = started where record_id = x;
    process record_id
    update queue set state = finished where record_id = x;
    q.next;

Almost, the update to "started" is done for all selected elements first,
releasing the lock, then the items are processed one at a time, marking
each "finished" as they complete. (each processing step can take
minutes, so keeping a lock the whole time is not an option)

Might I suggest an alternative:

q = update queue set state = started
    WHERE project_id = my_project AND state = idle
    LIMIT n OFFSET i
    RETURNING project_id;
idlist = @q;
commit;

foreach x in idlist
    process record_id
    begin
        update queue set state = finished where record_id = x;
    commit;

Forgive the part-Perl, part-Python pseudocode. Oh, and I've never done
this, no idea if it actually works. :-)

Thanks for that suggestion, I'll take a look at it.

While I hadn't caught on to the "RETURNING" part, I had been wondering
whether a single-step UPDATE might be a solution. One concern I have is
how concurrent updates will affect the returned list (or whether those
rows will just be skipped, as a SELECT would skip them in normal
transaction mode, if I understood correctly), or whether the query might
return an error code (I know that the normal UPDATE return value is the
number of updated rows; I am just not sure whether that applies with
"RETURNING").

Although, I will note that this process (if it works) will, sort of,
make FOR UPDATE redundant. Or, if it doesn't, the current lock-policy
might cause issues for concurrent updates for the use-cases where FOR
UPDATE is relevant.

Yeah, I'd wondered the same thing. It could be that two updates hitting
the same row will deadlock, or maybe not, I'm not sure. But I think it's
the same as with the select: if you happen to have two limits that hit
the same range, you're in trouble.

I think the random limit thing is a race condition in itself. Whenever
you have multiple processes hitting the same rows you're going to run
into problems. Have you thought of using a sequence instead of a random
limit? Each process could get the next 100 record_ids via a sequence;
then there would be much less chance of deadlock.

How would that work, in case you would like to provide an example?

I am not really familiar with sequences, as I have only seen them used for
the "id" field in Django-generated tables.

In case it is relevant, the processes do not (currently, at least) have
a unique ID, though they have a local sequence number for the machine they
are running on.

--
Sincerely,
Yngve N. Pettersen

#8 Andy Colson
andy@squeakycode.net
In reply to: Yngve Nysaeter Pettersen (#7)
Re: Select for update with offset interferes with concurrent transactions

On 2/1/2011 12:10 PM, Yngve Nysaeter Pettersen wrote:

On Tue, 01 Feb 2011 18:11:23 +0100, Andy Colson <andy@squeakycode.net>

I think the random limit thing is a race condition in itself. Whenever
you have multiple processes hitting the same rows you're going to run
into problems. Have you thought of using a sequence instead of a
random limit? Each process could get the next 100 record_ids via a
sequence; then there would be much less chance of deadlock.

How would that work, in case you would like to provide an example?

I am not really familiar with sequences, as I have only seen them used
for the "id" field in Django generated tables.

In case it is relevant, the processes does not (currently, at least)
have a unique ID; though they have a local sequence number for the
machine they are running on.

I have a really simple q table I use.

create table q (id integer not null, msg integer, primary key(id));
create sequence q_add;
create sequence q_read;

I insert via q_add:

andy=# insert into q(id, msg) values(nextval('q_add'), 20);
INSERT 0 1
andy=# insert into q(id, msg) values(nextval('q_add'), 4);
INSERT 0 1
andy=# select * from q;
id | msg
----+-----
1 | 20
2 | 4
(2 rows)

Then I run multiple batch proc's which get their next job like:

andy=# select msg from q where id = (select nextval('q_read'));
msg
-----
20
(1 row)

andy=# select msg from q where id = (select nextval('q_read'));
msg
-----
4
(1 row)

It works for me because I can empty the q table, reset the q_add and
q_read sequences and start over clean. Not sure if it would work for
your setup.

-Andy

#9 Yngve Nysaeter Pettersen
yngve@opera.com
In reply to: Andy Colson (#8)
Re: Select for update with offset interferes with concurrent transactions

Thanks Andy,

On Tue, 01 Feb 2011 19:29:08 +0100, Andy Colson <andy@squeakycode.net>
wrote:

On 2/1/2011 12:10 PM, Yngve Nysaeter Pettersen wrote:

On Tue, 01 Feb 2011 18:11:23 +0100, Andy Colson <andy@squeakycode.net>

I think the random limit thing is a race condition in itself. Whenever
you have multiple processes hitting the same rows you're going to run
into problems. Have you thought of using a sequence instead of a
random limit? Each process could get the next 100 record_ids via a
sequence; then there would be much less chance of deadlock.

How would that work, in case you would like to provide an example?

I am not really familiar with sequences, as I have only seen them used
for the "id" field in Django generated tables.

In case it is relevant, the processes does not (currently, at least)
have a unique ID; though they have a local sequence number for the
machine they are running on.

I have a really simple q table I use.

create table q (id integer not null, msg integer, primary key(id));
create sequence q_add;
create sequence q_read;

I insert via q_add:

andy=# insert into q(id, msg) values(nextval('q_add'), 20);
INSERT 0 1
andy=# insert into q(id, msg) values(nextval('q_add'), 4);
INSERT 0 1
andy=# select * from q;
id | msg
----+-----
1 | 20
2 | 4
(2 rows)

Then I run multiple batch proc's which get their next job like:

andy=# select msg from q where id = (select nextval('q_read'));
msg
-----
20
(1 row)

andy=# select msg from q where id = (select nextval('q_read'));
msg
-----
4
(1 row)

It works for me because I can empty the q table, reset the q_add and
q_read sequences and start over clean. Not sure if it would work for
your setup.

I see how that would work (it is essentially how Django assigns row ids).

My current setup can have multiple runs configured at a time (and has had
several dozen queued, in one case), with varying priorities on each run,
and they might, at least theoretically, be configured in parallel (even
the individual runs are set up in parallel), meaning the ids would not be
sequential (a sequence is used for the id field in each row of the table),
unless they could somehow be allocated for each individual run/project
(multiple sequence objects, one for each run, might be an option, but I
don't like that possibility). And as I mentioned elsewhere in the thread,
I might make the queuing a bit more complex, which would make this system
even more complicated.

So, AFAICT, I am afraid it would not work in the general case for my
project :( .
However, it might be useful in somebody else's project :) .

--
Sincerely,
Yngve N. Pettersen

#10 Andy Colson
andy@squeakycode.net
In reply to: Yngve Nysaeter Pettersen (#9)
Re: Select for update with offset interferes with concurrent transactions

On 2/1/2011 12:51 PM, Yngve Nysaeter Pettersen wrote:

Thanks Andy,

On Tue, 01 Feb 2011 19:29:08 +0100, Andy Colson <andy@squeakycode.net>
wrote:

On 2/1/2011 12:10 PM, Yngve Nysaeter Pettersen wrote:

On Tue, 01 Feb 2011 18:11:23 +0100, Andy Colson <andy@squeakycode.net>

I think the random limit thing is a race condition in itself. Whenever
you have multiple processes hitting the same rows you're going to run
into problems. Have you thought of using a sequence instead of a
random limit? Each process could get the next 100 record_ids via a
sequence; then there would be much less chance of deadlock.

How would that work, in case you would like to provide an example?

I am not really familiar with sequences, as I have only seen them used
for the "id" field in Django generated tables.

In case it is relevant, the processes does not (currently, at least)
have a unique ID; though they have a local sequence number for the
machine they are running on.

I have a really simple q table I use.

create table q (id integer not null, msg integer, primary key(id));
create sequence q_add;
create sequence q_read;

I insert via q_add:

andy=# insert into q(id, msg) values(nextval('q_add'), 20);
INSERT 0 1
andy=# insert into q(id, msg) values(nextval('q_add'), 4);
INSERT 0 1
andy=# select * from q;
id | msg
----+-----
1 | 20
2 | 4
(2 rows)

Then I run multiple batch proc's which get their next job like:

andy=# select msg from q where id = (select nextval('q_read'));
msg
-----
20
(1 row)

andy=# select msg from q where id = (select nextval('q_read'));
msg
-----
4
(1 row)

It works for me because I can empty the q table, reset the q_add and
q_read sequences and start over clean. Not sure if it would work for
your setup.

I see how that would work (it is essentially how Django assigns row ids).

My current setup can have multiple runs configured at a time (and have had
several dozen queued, in one case), with varying priorities on each run,
and they might, at least theoretically, be configured in parallel (even
the individual runs are set up in parallel), meaning the ids would not be
sequential (a sequence is used for the id field in each row of the table),
unless they could somehow be allocated for each individual run/project
(multiple sequence objects, one for each run might be an option, but I
don't like that possibility). And as I mentioned elsewhere in the thread I
might make the queuing a bit more complex, which might make this system
even more complicated.

So, AFAICT I am afraid it would not work in the general case for my
project :( .
However, it might be useful in somebody else's project :) .

No, I didn't think it would work for you; yours looks much more
complicated than mine. Just out of curiosity, have you looked at PgQ?

-Andy

#11 Radosław Smogura
rsmogura@softperience.eu
In reply to: Tom Lane (#5)
Re: Select for update with offset interferes with concurrent transactions

Hmm...

May I ask how this looks in detail? If e.g. I do select * from myeshop
offset 100 limit 20, and I have 1000 rows, which rows will be locked?

a) 0 to 120, or
b) all rows?

Kind regards,
Radek

Tom Lane <tgl@sss.pgh.pa.us> wrote on Tuesday, 01 February 2011 18:18:17:


In 9.0, LIMIT/OFFSET processing is done after FOR UPDATE locking, which
means that rows skipped over by OFFSET still get locked, which means
that different sessions executing this query are now practically certain
to block each other, rather than just likely to block each other.
This was an intentional change to improve the predictability of FOR
UPDATE's interactions with LIMIT/OFFSET, and indeed it's improved the
predictability of the behavior for you, just not in the direction you'd
like :-(

regards, tom lane

#12 Yngve Nysaeter Pettersen
yngve@opera.com
In reply to: Andy Colson (#10)
Re: Select for update with offset interferes with concurrent transactions

On Tue, 01 Feb 2011 20:04:31 +0100, Andy Colson <andy@squeakycode.net>
wrote:

On 2/1/2011 12:51 PM, Yngve Nysaeter Pettersen wrote:

So, AFAICT I am afraid it would not work in the general case for my
project :( .
However, it might be useful in somebody else's project :) .

No, I didn't think it would work for you, yours looks much more
complicated than main. Just out of curiosity, have you looked at PgQ?

I did look around for some queuing systems a year ago; I am not sure if
that one crossed my path, but I didn't find any that I thought would work
for me. That might just be due to the fact that I had only recently
started with database programming (which was also the reason I chose a
framework like Django for most of it; the FOR UPDATE SQL is one of fewer
than 10 places where I use raw SQL in my system, because Django could not
provide the functionality), and I just did not realize that it could help
me.

Regarding PgQ: based on a quick skim, I am not sure how it would fit in
my case. This may be because the tutorial leaves (IMO) a bit too much up
in the air about how the surrounding system is organized, at least for a
relative beginner like myself, and also about how a similar alternative
system would look. A small, complete example showing all the tables
involved, the client(s), the server(s), and the operations performed
might have helped.

--
Sincerely,
Yngve N. Pettersen

#13 David G. Johnston
david.g.johnston@gmail.com
In reply to: Tom Lane (#5)
Re: Select for update with offset interferes with concurrent transactions

If random sampling is desirable would the following construct limit locking
only to the sampled rows?

SELECT id
FROM tasktable
WHERE id IN (SELECT random_id_sample())
FOR UPDATE

The "random_id_sample" would supply a configurable group of IDs off of
tasktable which the FOR UPDATE would then lock.

I guess the issue remains that "random_id_sample()" would still end up
blocking if any of the rows it wants to return are already locked.
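David's random_id_sample() is hypothetical; one way such a sampler could be sketched, assuming the queue table layout described earlier in the thread (record_id, project_id, state) and placeholder values for the project id and batch size, is an inner query ordered by random():

```sql
-- Hedged sketch, not a tested design: lock only a small random sample
-- of idle rows. Table, columns, and literal values are assumptions
-- taken from the thread.
SELECT record_id
  FROM queue
 WHERE record_id IN (
         SELECT record_id
           FROM queue
          WHERE project_id = 42        -- my_project (placeholder)
            AND state = 'idle'
          ORDER BY random()            -- random sample; can be costly on
          LIMIT 100                    -- large tables. n = 100 here.
       )
   FOR UPDATE;
```

The caveat above still applies: if any sampled row is already locked by another session, this query blocks on it.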

I too am using this basic protocol of maintaining state info within the
database and sending every query against it. As I ponder this more it
really seems as if moving some of this logic into the application layer
would possibly make more sense in Yngve's situation (or at least something
to consider). Continue to use the database as a persistence mechanism but
code the "dispatching" of tasks in the application layer and then as each
task is dispatched you simply do an "UPDATE table SET state = 'dispatch'
WHERE id = 'ID'" and a similar UPDATE when the task is returned completed.
This somewhat presumes you still only ever hand off one task at a time. If
you are indeed handing off tasks in batches then it would make sense to have
a "batch" table and operate at the batch level instead of individual tasks -
assigning tasks to a given batch via some standard mechanism.

Either way if you truly want true parallel processing then you need to
create the parallel paths that can operate without clobbering each other and
thus each path needs to have its own pool of tasks since as soon as you have
a shared resource the only true way to make sure it is only allocated once
is to serialize access to it. An alternative method would be to allow
multiple dispatches but have a "write-once" method that is called and sets
an immutable handler_id and then when the processing begins only the handler
with the matching id would be allowed to perform the actual processing.
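The "write-once" idea can be made concrete with a conditional UPDATE whose affected-row count tells the caller whether it won the claim. This is a sketch under assumed names (handler_id column, placeholder ids), not something from the thread's actual schema:

```sql
-- Hedged sketch of the write-once claim. Only the first session to run
-- this for a given record_id matches the row and sees "UPDATE 1";
-- later sessions find handler_id already set and see "UPDATE 0".
UPDATE queue
   SET handler_id = 7,              -- this worker's id (placeholder)
       state      = 'started'
 WHERE record_id  = 12345           -- the task being claimed (placeholder)
   AND handler_id IS NULL;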

I say the above with certainty but at the moment I am using, and fairly happy
with, my limited serialization - especially since I have specific
sub-properties that I can use to limit how many records are locked AND also
because the locking time is very short (I cap around 20 or so active tasks
to dispatch - and only infrequently at that) so my experience and insight into
high-demand situations is limited.

Dave

-----Original Message-----
From: Tom Lane [mailto:tgl@sss.pgh.pa.us]
Sent: Tuesday, February 01, 2011 12:18 PM
To: Yngve Nysaeter Pettersen
Cc: pgsql-general@postgresql.org
Subject: Re: Select for update with offset interferes with concurrent
transactions

"Yngve Nysaeter Pettersen" <yngve@opera.com> writes:

To avoid having the processes trample each other's queries (the first
attempt was to select the first matching entries of the table, which
caused one to block all other transactions), one of the steps I took
was to select a set of idle rows at a random offset into the table
from the project, mark them for update, then update each record's state as

started.

SELECT record_id FROM queue WHERE project_id = my_project AND state
= idle LIMIT n OFFSET i FOR UPDATE

At present "n" is 100-150, "i" is a random value in the range 0-10000.

There is, intentionally, no ordering specified, since that would just
slow down the query, and is not necessary.

This seems like a pretty bad design. There are recognized ways to solve
this problem with more predictability and much less chance of different
processes blocking each other. In particular, this query seems to be based on
some untenable assumptions about the physical row order being stable.

What I've discovered when using Postgres 9.0 is that the processes are
now blocking every other query into this table,

In 9.0, LIMIT/OFFSET processing is done after FOR UPDATE locking, which
means that rows skipped over by OFFSET still get locked, which means that
different sessions executing this query are now practically certain to block
each other, rather than just likely to block each other.
This was an intentional change to improve the predictability of FOR UPDATE's
interactions with LIMIT/OFFSET, and indeed it's improved the predictability
of the behavior for you, just not in the direction you'd like :-(

regards, tom lane

#14Yngve Nysaeter Pettersen
yngve@opera.com
In reply to: David G. Johnston (#13)
Re: Select for update with offset interferes with concurrent transactions

Hello David,

On Wed, 02 Feb 2011 01:36:15 +0100, David Johnston <polobo@yahoo.com>
wrote:

If random sampling is desirable would the following construct limit locking
only to the sampled rows?

SELECT id
FROM tasktable
WHERE id IN (SELECT random_id_sample())
FOR UPDATE

The "random_id_sample" would supply a configurable group of IDs off of
tasktable which the FOR UPDATE would then lock.

I guess the issue remains that "random_id_sample()" would still end up
blocking if any of the rows it wants to return are already locked.

My immediate guess is that this would work, and I might explore it once I
get my new full-scale test-db up and running.

I too am using this basic protocol of maintaining state info within the
database and sending every query against it. As I ponder this more it
really seems as if moving some of this logic into the application layer
would possibly make more sense in Yngve's situation (or at least something
to consider). Continue to use the database as a persistence mechanism but
code the "dispatching" of tasks in the application layer and then as each
task is dispatched you simply do an "UPDATE table SET state = 'dispatch'
WHERE id = 'ID'" and a similar UPDATE when the task is returned completed.
This somewhat presumes you still only ever hand off one task at a time. If
you are indeed handing off tasks in batches then it would make sense to have
a "batch" table and operate at the batch level instead of individual tasks -
assigning tasks to a given batch via some standard mechanism.

If I read you correctly that is what my system does (dispatch = started,
marked by the node that is to do the task).

The reason I am allocating tasks in batches is that there are so many
processes involved that if they pick one at a time they would block each
other. With the block allocation they only need to fetch the tasks once,
meaning that there are not as many requests to the queue at a time, on
average.

Either way if you truly want true parallel processing then you need to
create the parallel paths that can operate without clobbering each other and
thus each path needs to have its own pool of tasks since as soon as you have

That is what the offset part of the query was supposed to achieve.

At the moment I have worked around the problem by breaking the task list
into 2000 subgroups, and each process picks one at random. That limits the
number of processes that get in each other's way, and the measured speed is
now 4-5 times what I saw on Monday, and back in the old range of
performance. However, it is a hack I had hoped to avoid (and I might get
rid of it with the above suggestion).
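The subgroup workaround could be sketched roughly as an extra column plus a per-worker random pick. Column name, subgroup count, and literal values below are assumptions based on the description, not the actual schema:

```sql
-- Hedged sketch of the 2000-subgroup workaround described above.
ALTER TABLE queue ADD COLUMN subgroup integer;
UPDATE queue SET subgroup = floor(random() * 2000);  -- populate once at load

-- Each worker picks one subgroup at random on the client side (here 1234
-- as a placeholder) and confines its FOR UPDATE query to that subgroup,
-- so only workers that happen to pick the same subgroup can collide:
SELECT record_id
  FROM queue
 WHERE project_id = 42          -- my_project (placeholder)
   AND state = 'idle'
   AND subgroup = 1234          -- chosen client-side per worker
 LIMIT 100
   FOR UPDATE;
```

Picking the subgroup value client-side avoids evaluating random() per row in the WHERE clause, which would compare each row against a different random value.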

a shared resource the only true way to make sure it is only allocated once
is to serialize access to it. An alternative method would be to allow
multiple dispatches but have a "write-once" method that is called and sets
an immutable handler_id and then when the processing begins only the
handler with the matching id would be allowed to perform the actual
processing.

This requires the handlers to have a unique ID, which my system has not
needed so far.


--
Sincerely,
Yngve N. Pettersen
********************************************************************
Senior Developer Email: yngve@opera.com
Opera Software ASA http://www.opera.com/
Phone: +47 23 69 32 60 Fax: +47 23 69 24 01
********************************************************************

#15Yngve Nysaeter Pettersen
yngve@opera.com
In reply to: Andy Colson (#2)
Re: Select for update with offset interferes with concurrent transactions

Hello all,

Just a quick update of how it went.

I ended up using code similar to a combination of Andy Colson's and David
Johnston's suggestions below, and performance is back at what it was
before. Thanks for the suggestions!

BTW: AFAICT I never got a response from Tom Lane about whether the new
FOR UPDATE locking policy was intended to effectively lock the entire
table against all other clients running the exact same SELECT ... FOR
UPDATE, just with a different, non-overlapping OFFSET/LIMIT. IMO,
continuing to lock unselected rows after the selection has completed is a
significant performance regression.

Also, an off-topic BTW: I have noticed that autovacuum of a table seems to
block ANALYZE of the same table, because the autovacuum does not release its
lock on the table.

On Tue, 01 Feb 2011 16:19:17 +0100, Andy Colson <andy@squeakycode.net>
wrote:

On 2/1/2011 6:32 AM, Yngve Nysaeter Pettersen wrote:

Hello all,

I am in the process of migrating a system from Postgresql 8.3 to 9.0,
and have run into a problem with the task queue systems I am using.

The task queue controls the allocation of tasks between about 1000
processes working in parallel, and is essentially a table of

<snip>

So, if I understand correctly, you:

q = SELECT record_id FROM queue
    WHERE project_id = my_project AND state = idle
    LIMIT n OFFSET i FOR UPDATE
while not q.eof
    update queue set state = started where record_id = x;
    process record_id
    update queue set state = finished where record_id = x;
    q.next;

Might I suggest an alternative:

q = update queue set state = started
    WHERE project_id = my_project AND state = idle
    LIMIT n OFFSET i
    RETURNING record_id;
idlist = @q;
commit;

foreach x in idlist
    process record_id
    begin
    update queue set state = finished where record_id = x;
    commit;
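As written, Andy's sketch is pseudocode: PostgreSQL's UPDATE statement has no LIMIT/OFFSET clause. A hedged sketch of how the same idea could be expressed as valid SQL, with the row subset picked in a subselect (table and values assumed from the thread):

```sql
-- Claim a batch of idle rows and flip them to 'started' in one statement,
-- returning the claimed ids. Placeholder literals stand in for
-- my_project, n, and i.
UPDATE queue
   SET state = 'started'
 WHERE record_id IN (
         SELECT record_id
           FROM queue
          WHERE project_id = 42      -- my_project (placeholder)
            AND state = 'idle'
          LIMIT 100                  -- n
         OFFSET 500                  -- i
       )
RETURNING record_id;
```

Committing right after this UPDATE, as in the sketch above, keeps the lock window short: other workers only block for the duration of the claim, not for the whole processing run.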

On Wed, 02 Feb 2011 01:36:15 +0100, David Johnston <polobo@yahoo.com>
wrote:

If random sampling is desirable would the following construct limit locking
only to the sampled rows?

SELECT id
FROM tasktable
WHERE id IN (SELECT random_id_sample())
FOR UPDATE

--
Sincerely,
Yngve N. Pettersen
********************************************************************
Senior Developer Email: yngve@opera.com
Opera Software ASA http://www.opera.com/
Phone: +47 23 69 32 60 Fax: +47 23 69 24 01
********************************************************************

#16Andy Colson
andy@squeakycode.net
In reply to: Yngve Nysaeter Pettersen (#15)
Re: Select for update with offset interferes with concurrent transactions

On 3/14/2011 10:13 AM, Yngve N. Pettersen (Developer Opera Software ASA)
wrote:

Hello all,

Just a quick update of how it went.

I ended up using code similar to a combination of Andy Colson's and David
Johnston's suggestions below, and performance is back at what it was
before. Thanks for the suggestions!

Sweet, thanks for the update... I always like to hear how things turn out.

-Andy

#17Merlin Moncure
mmoncure@gmail.com
In reply to: Tom Lane (#5)
Re: Select for update with offset interferes with concurrent transactions

On Tue, Feb 1, 2011 at 11:18 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

"Yngve Nysaeter Pettersen" <yngve@opera.com> writes:

To avoid having the processes trample each other's queries (the first
attempt was to select the first matching entries of the table, which
caused one to block all other transactions), one of the steps I took was
to select a set of idle rows at a random offset into the table from the
project, mark them for update, then update each record's state as started.

   SELECT record_id FROM queue WHERE project_id = my_project AND state =
idle LIMIT n OFFSET i FOR UPDATE

At present "n" is 100-150, "i" is a random value in the range 0-10000.

There is, intentionally, no ordering specified, since that would just slow
down the query, and is not necessary.

This seems like a pretty bad design.  There are recognized ways to solve
this problem with more predictability and much less chance of different
processes blocking each other.  In particular, this query seems be based
on some untenable assumptions about the physical row order being stable.

What I've discovered when using Postgres 9.0 is that the processes are now
blocking every other query into this table,

In 9.0, LIMIT/OFFSET processing is done after FOR UPDATE locking, which
means that rows skipped over by OFFSET still get locked, which means
that different sessions executing this query are now practically certain
to block each other, rather than just likely to block each other.
This was an intentional change to improve the predictability of FOR
UPDATE's interactions with LIMIT/OFFSET, and indeed it's improved the
predictability of the behavior for you, just not in the direction you'd
like :-(

You can get something approximating the old behavior with a CTE:
with q as (select id from foo where <something> limit x offset y)
select * from foo join q using(id) order by id for update;

not that this is a good idea -- it isn't -- but if you must do it that
way, the above might work. CTEs are always something to consider
when dealing with order-of-execution problems, which seem to be burning
just about everyone these days.

merlin