offset and limit in update and subselect

Started by Timothy H. Keittalmost 25 years ago9 messages
#1Timothy H. Keitt
Timothy.Keitt@SUNYSB.Edu

Hi,

It appears that limit and offset do not work in a subselect such as:

update my_table set my_col = new_val where oid = (select oid from
my_table offset row_number limit 1);

Basically, I need to update rows by offset from the beginning of the
table. Even nicer would be

update my_table set my_col = new_val offset row_number limit 1;

But this is not supported either.

Tim

--
Timothy H. Keitt
Department of Ecology and Evolution
State University of New York at Stony Brook
Phone: 631-632-1101, FAX: 631-632-7626
http://life.bio.sunysb.edu/ee/keitt/

#2Timothy H. Keitt
Timothy.Keitt@SUNYSB.Edu
In reply to: Timothy H. Keitt (#1)
Re: offset and limit in update and subselect

I see this (subselect) is available in >=7.1.

Tim

Timothy H. Keitt wrote:

Hi,

It appears that limit and offset do not work in a subselect such as:

update my_table set my_col = new_val where oid = (select oid from
my_table offset row_number limit 1);

Basically, I need to update rows by offset from the beginning of the
table. Even nicer would be

update my_table set my_col = new_val offset row_number limit 1;

But this is not supported either.

Tim

--
Timothy H. Keitt
Department of Ecology and Evolution
State University of New York at Stony Brook
Phone: 631-632-1101, FAX: 631-632-7626
http://life.bio.sunysb.edu/ee/keitt/

#3Tom Lane
tgl@sss.pgh.pa.us
In reply to: Timothy H. Keitt (#1)
Re: offset and limit in update and subselect

"Timothy H. Keitt" <Timothy.Keitt@SUNYSB.Edu> writes:

Basically, I need to update rows by offset from the beginning of the
table.

I think you'd better rethink your data design. Tuple order in a table
is not a defined concept according to SQL. Even if we allowed queries
such as you've described, the results would not be well-defined, but
would change at the slightest provocation. The implementation feels
itself entitled to rearrange tuple order whenever the whim strikes it.

As the documentation tries hard to make plain, LIMIT/OFFSET are only
guaranteed to produce reproducible results if there's also an ORDER BY
that constrains the tuples into a unique ordering.

regards, tom lane

#4Lincoln Yeoh
lyeoh@pop.jaring.my
In reply to: Tom Lane (#3)
Re: offset and limit in update and subselect

At 05:07 PM 2/24/01 -0500, Tom Lane wrote:

is not a defined concept according to SQL. Even if we allowed queries
such as you've described, the results would not be well-defined, but
would change at the slightest provocation. The implementation feels
itself entitled to rearrange tuple order whenever the whim strikes it.

As the documentation tries hard to make plain, LIMIT/OFFSET are only
guaranteed to produce reproducible results if there's also an ORDER BY
that constrains the tuples into a unique ordering.

Hi,

Would it then be fine to use update ... limit in the following scenario?

I have a todo queue:

create table todo ( task text, pid int default 0);

The tasks are inserted into the todo table.

Then the various worker processes do the following update to grab tasks
without duplication.

update todo set pid=$mypid where pid=0 limit 1;

For me it doesn't matter what which row each worker gets, as long as they
only get one each and they are not the same.

What would the performance impact of "order by" be in a LIMIT X case? Would
it require a full table scan?

Thanks,
Link.

#5Tom Lane
tgl@sss.pgh.pa.us
In reply to: Lincoln Yeoh (#4)
Re: Re: offset and limit in update and subselect

Lincoln Yeoh <lyeoh@pop.jaring.my> writes:

Would it then be fine to use update ... limit in the following scenario?
I have a todo queue:
create table todo ( task text, pid int default 0);
The tasks are inserted into the todo table.
Then the various worker processes do the following update to grab tasks
without duplication.
update todo set pid=$mypid where pid=0 limit 1;

There's no LIMIT clause in UPDATE. You could do something like

BEGIN
SELECT taskid FROM todo WHERE pid = 0 FOR UPDATE LIMIT 1;
UPDATE todo SET pid = $mypid WHERE taskid = $selectedid;
COMMIT

(assuming taskid is unique; you could use the OID if you have no
application-defined ID).

What would the performance impact of "order by" be in a LIMIT X case? Would
it require a full table scan?

Yes, unless there's an index on the order-by item. The above example
should be fairly efficient if both pid and taskid are indexed.

Hmm ... trying this out just now, I realize that 7.1 effectively does
the LIMIT before the FOR UPDATE, which is not the way 7.0 behaved.
Ugh. Too late to fix it for 7.1, but I guess FOR UPDATE marking ought
to become a plan node just like LIMIT did.

regards, tom lane

#6Lincoln Yeoh
lyeoh@pop.jaring.my
In reply to: Tom Lane (#5)
Re: Re: offset and limit in update and subselect

At 04:58 PM 25-02-2001 -0500, Tom Lane wrote:

There's no LIMIT clause in UPDATE. You could do something like

Oh. I thought 7.1 had that.

BEGIN
SELECT taskid FROM todo WHERE pid = 0 FOR UPDATE LIMIT 1;
UPDATE todo SET pid = $mypid WHERE taskid = $selectedid;
COMMIT

This is very similar to what I'm testing out in 7.0.3 - except I'm
currently trying "order by random" to prevent blocking. This is because
all worker processes will tend to select stuff in the same order (in the
absence of inserts or updates on that table), and thus they will hit the
same first row (this is what I encountered last week - and I got the wrong
impression that all rows were locked).

What would happen if I rewrite that query to:

update todo set pid = $mypid where exists ( select task id from todo where
pid = 0 for update limit 1);

This is pushing it, but I'm curious on what would happen :).

I'll stick to doing it in two queries, and leave out the "order by random"-
faster select vs low blocking.

Cheerio,
Link.

#7Tom Lane
tgl@sss.pgh.pa.us
In reply to: Lincoln Yeoh (#6)
Re: offset and limit in update and subselect

Lincoln Yeoh <lyeoh@pop.jaring.my> writes:

BEGIN
SELECT taskid FROM todo WHERE pid = 0 FOR UPDATE LIMIT 1;
UPDATE todo SET pid = $mypid WHERE taskid = $selectedid;
COMMIT

This is very similar to what I'm testing out in 7.0.3 - except I'm
currently trying "order by random" to prevent blocking. This is because
all worker processes will tend to select stuff in the same order (in the
absence of inserts or updates on that table), and thus they will hit the
same first row (this is what I encountered last week - and I got the wrong
impression that all rows were locked).

Right. Only the first row is locked, but that doesn't help any. "order
by random" sounds like it might be a good answer, if there aren't many
rows that need to be sorted.

What would happen if I rewrite that query to:

update todo set pid = $mypid where exists ( select task id from todo where
pid = 0 for update limit 1);

Right now you get

ERROR: SELECT FOR UPDATE is not allowed in subselects

This is something that could be fixed if FOR UPDATE were a plan node
instead of a function done at the executor top level.

regards, tom lane

#8Lincoln Yeoh
lyeoh@pop.jaring.my
In reply to: Tom Lane (#7)
Re: offset and limit in update and subselect

At 11:16 PM 25-02-2001 -0500, Tom Lane wrote:

Right. Only the first row is locked, but that doesn't help any. "order
by random" sounds like it might be a good answer, if there aren't many
rows that need to be sorted.

Yep. I'll just see what happens in the testing stages.

What would happen if I rewrite that query to:

update todo set pid = $mypid where exists ( select task id from todo where
pid = 0 for update limit 1);

Right now you get

ERROR: SELECT FOR UPDATE is not allowed in subselects

This is something that could be fixed if FOR UPDATE were a plan node
instead of a function done at the executor top level.

OK. Sounds like it won't be worth the trouble to do, plus deadlocks would
be real fun ;).

Cheerio,
Link.

#9Timothy H. Keitt
Timothy.Keitt@SUNYSB.Edu
In reply to: Timothy H. Keitt (#1)
Re: offset and limit in update and subselect

Hmmm... that's good to know. Basically, I'm trying to model fixed order
tables in another application through a proxy mechanism (see
http://rpgsql.sourceforge.net/). I guess I will have to force row
ordering on all proxied tables.

Tim

Tom Lane wrote:

"Timothy H. Keitt" <Timothy.Keitt@SUNYSB.Edu> writes:

Basically, I need to update rows by offset from the beginning of the
table.

I think you'd better rethink your data design. Tuple order in a table
is not a defined concept according to SQL. Even if we allowed queries
such as you've described, the results would not be well-defined, but
would change at the slightest provocation. The implementation feels
itself entitled to rearrange tuple order whenever the whim strikes it.

As the documentation tries hard to make plain, LIMIT/OFFSET are only
guaranteed to produce reproducible results if there's also an ORDER BY
that constrains the tuples into a unique ordering.

regards, tom lane

--
Timothy H. Keitt
Department of Ecology and Evolution
State University of New York at Stony Brook
Phone: 631-632-1101, FAX: 631-632-7626
http://life.bio.sunysb.edu/ee/keitt/