Endgame for all those SELECT FOR UPDATE changes: fix plan node order

Started by Tom Laneover 16 years ago16 messageshackers
Jump to latest
#1Tom Lane
tgl@sss.pgh.pa.us

Now that we've got a hopefully-non-broken implementation of SELECT FOR
UPDATE locking as a plan node, we can finally contemplate fixing two
misbehaviors that are called out on the SELECT reference page:

It is possible for a SELECT command using both LIMIT and FOR
UPDATE/SHARE clauses to return fewer rows than specified by LIMIT. This
is because LIMIT is applied first. The command selects the specified
number of rows, but might then block trying to obtain a lock on one or
more of them. Once the SELECT unblocks, the row might have been deleted
or updated so that it does not meet the query WHERE condition anymore,
in which case it will not be returned.

Similarly, it is possible for a SELECT command using ORDER BY and FOR
UPDATE/SHARE to return rows out of order. This is because ORDER BY is
applied first. The command orders the result, but might then block
trying to obtain a lock on one or more of the rows. Once the SELECT
unblocks, one of the ordered columns might have been modified and be
returned out of order. A workaround is to perform SELECT ... FOR
UPDATE/SHARE and then SELECT ... ORDER BY.

All that we have to do to fix the first one is to put the LockRows node
below the Limit node instead of above it. The solution for the second
one is to also put LockRows underneath the Sort node, and to regard its
output as unsorted so that a Sort node will certainly be generated.
(This in turn implies that we should prefer the cheapest-total plan
for the rest of the query.)

Does anyone have any objections to this? I can't see that it will break
any applications that work today, but maybe I'm missing something.

regards, tom lane

#2Robert Haas
robertmhaas@gmail.com
In reply to: Tom Lane (#1)
Re: Endgame for all those SELECT FOR UPDATE changes: fix plan node order

On Oct 25, 2009, at 10:34 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Now that we've got a hopefully-non-broken implementation of SELECT FOR
UPDATE locking as a plan node, we can finally contemplate fixing two
misbehaviors that are called out on the SELECT reference page:

It is possible for a SELECT command using both LIMIT and FOR
UPDATE/SHARE clauses to return fewer rows than specified by
LIMIT. This
is because LIMIT is applied first. The command selects the
specified
number of rows, but might then block trying to obtain a lock on
one or
more of them. Once the SELECT unblocks, the row might have been
deleted
or updated so that it does not meet the query WHERE condition
anymore,
in which case it will not be returned.

Similarly, it is possible for a SELECT command using ORDER BY and
FOR
UPDATE/SHARE to return rows out of order. This is because ORDER
BY is
applied first. The command orders the result, but might then block
trying to obtain a lock on one or more of the rows. Once the SELECT
unblocks, one of the ordered columns might have been modified and
be
returned out of order. A workaround is to perform SELECT ... FOR
UPDATE/SHARE and then SELECT ... ORDER BY.

All that we have to do to fix the first one is to put the LockRows
node
below the Limit node instead of above it. The solution for the second
one is to also put LockRows underneath the Sort node, and to regard
its
output as unsorted so that a Sort node will certainly be generated.
(This in turn implies that we should prefer the cheapest-total plan
for the rest of the query.)

This seems like it could potentially introduce a performance
regression, but the current behavior is so bizarre that it seems like
we should still change it.

Does anyone have any objections to this? I can't see that it will
break
any applications that work today, but maybe I'm missing something.

I'm pretty excited about this. It is a nifty piece of foot-gun
removal. Thanks!

...Robert

#3Tom Lane
tgl@sss.pgh.pa.us
In reply to: Robert Haas (#2)
Re: Endgame for all those SELECT FOR UPDATE changes: fix plan node order

Robert Haas <robertmhaas@gmail.com> writes:

On Oct 25, 2009, at 10:34 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

... The solution for the second
one is to also put LockRows underneath the Sort node, and to regard its
output as unsorted so that a Sort node will certainly be generated.
(This in turn implies that we should prefer the cheapest-total plan
for the rest of the query.)

This seems like it could potentially introduce a performance
regression, but the current behavior is so bizarre that it seems like
we should still change it.

Yeah, it could definitely run slower than the existing code --- in
particular the combination of all three (FOR UPDATE ORDER BY LIMIT)
would tend to become a seqscan-and-sort rather than possibly just
reading one end of an index. However, I quote the old aphorism that
it can be made indefinitely fast if it doesn't have to give the right
answer. The reason the current behavior is fast is it's giving the
wrong answer :-(

regards, tom lane

#4Alvaro Herrera
alvherre@2ndquadrant.com
In reply to: Tom Lane (#3)
Re: Endgame for all those SELECT FOR UPDATE changes: fix plan node order

Tom Lane escribi�:

Robert Haas <robertmhaas@gmail.com> writes:

This seems like it could potentially introduce a performance
regression, but the current behavior is so bizarre that it seems like
we should still change it.

Yeah, it could definitely run slower than the existing code --- in
particular the combination of all three (FOR UPDATE ORDER BY LIMIT)
would tend to become a seqscan-and-sort rather than possibly just
reading one end of an index. However, I quote the old aphorism that
it can be made indefinitely fast if it doesn't have to give the right
answer. The reason the current behavior is fast is it's giving the
wrong answer :-(

So this probably merits a warning in the release notes for people to
check that their queries continue to run with the performance they
expect.

--
Alvaro Herrera http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

#5Tom Lane
tgl@sss.pgh.pa.us
In reply to: Alvaro Herrera (#4)
Re: Endgame for all those SELECT FOR UPDATE changes: fix plan node order

Alvaro Herrera <alvherre@commandprompt.com> writes:

Tom Lane escribi�:

Yeah, it could definitely run slower than the existing code --- in
particular the combination of all three (FOR UPDATE ORDER BY LIMIT)
would tend to become a seqscan-and-sort rather than possibly just
reading one end of an index. However, I quote the old aphorism that
it can be made indefinitely fast if it doesn't have to give the right
answer. The reason the current behavior is fast is it's giving the
wrong answer :-(

So this probably merits a warning in the release notes for people to
check that their queries continue to run with the performance they
expect.

One problem with this is that there isn't any good way for someone to
get back the old behavior if they want to. Which might be a perfectly
reasonable thing, eg if they know that no concurrent update is supposed
to change the sort-key column. The obvious thing would be to allow

select * from (select * from foo order by col limit 10) ss for update;

to apply the FOR UPDATE last. Unfortunately, that's not how it works
now because the FOR UPDATE will get pushed down into the subquery.
I was shot down when I proposed a related change, a couple weeks ago
http://archives.postgresql.org/message-id/7741.1255278907@sss.pgh.pa.us
but it seems like we might want to reconsider.

regards, tom lane

#6Robert Haas
robertmhaas@gmail.com
In reply to: Tom Lane (#5)
Re: Endgame for all those SELECT FOR UPDATE changes: fix plan node order

On Mon, Oct 26, 2009 at 10:30 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Alvaro Herrera <alvherre@commandprompt.com> writes:

Tom Lane escribió:

Yeah, it could definitely run slower than the existing code --- in
particular the combination of all three (FOR UPDATE ORDER BY LIMIT)
would tend to become a seqscan-and-sort rather than possibly just
reading one end of an index.  However, I quote the old aphorism that
it can be made indefinitely fast if it doesn't have to give the right
answer.  The reason the current behavior is fast is it's giving the
wrong answer :-(

So this probably merits a warning in the release notes for people to
check that their queries continue to run with the performance they
expect.

One problem with this is that there isn't any good way for someone to
get back the old behavior if they want to.  Which might be a perfectly
reasonable thing, eg if they know that no concurrent update is supposed
to change the sort-key column.  The obvious thing would be to allow

select * from (select * from foo order by col limit 10) ss for update;

to apply the FOR UPDATE last.  Unfortunately, that's not how it works
now because the FOR UPDATE will get pushed down into the subquery.
I was shot down when I proposed a related change, a couple weeks ago
http://archives.postgresql.org/message-id/7741.1255278907@sss.pgh.pa.us
but it seems like we might want to reconsider.

"Shot down" might be an overstatement of the somewhat cautious
reaction that proposal. :-)

Could the desired behavior be obtained using a CTE?

...Robert

#7Tom Lane
tgl@sss.pgh.pa.us
In reply to: Robert Haas (#6)
Re: Endgame for all those SELECT FOR UPDATE changes: fix plan node order

Robert Haas <robertmhaas@gmail.com> writes:

On Mon, Oct 26, 2009 at 10:30 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

One problem with this is that there isn't any good way for someone to
get back the old behavior if they want to. �Which might be a perfectly
reasonable thing, eg if they know that no concurrent update is supposed
to change the sort-key column. �The obvious thing would be to allow

select * from (select * from foo order by col limit 10) ss for update;

to apply the FOR UPDATE last. �Unfortunately, that's not how it works
now because the FOR UPDATE will get pushed down into the subquery.

Could the desired behavior be obtained using a CTE?

Nope, we push FOR UPDATE into WITHs too. I don't really see any way to
deal with this without some sort of semantic changes.

regards, tom lane

#8Tom Lane
tgl@sss.pgh.pa.us
In reply to: Tom Lane (#7)
Re: Endgame for all those SELECT FOR UPDATE changes: fix plan node order

I wrote:

Robert Haas <robertmhaas@gmail.com> writes:

Could the desired behavior be obtained using a CTE?

Nope, we push FOR UPDATE into WITHs too. I don't really see any way to
deal with this without some sort of semantic changes.

... although on reflection, I'm not sure *why* we push FOR UPDATE into
WITHs. That seems a bit antithetical to the position we've evolved that
WITH queries are executed independently of the outer query.

If we removed that bit of behavior, which hopefully is too new for much
code to depend on, then the old FOR-UPDATE-last behavior could be
attained via a WITH. And we'd not have to risk touching the interaction
between plain subqueries and FOR UPDATE, which is something that seems
much more likely to break existing apps.

That seems like a reasonable compromise to me ... any objections?

regards, tom lane

#9Bruce Momjian
bruce@momjian.us
In reply to: Tom Lane (#1)
Re: Endgame for all those SELECT FOR UPDATE changes: fix plan node order

On Sun, Oct 25, 2009 at 7:34 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

All that we have to do to fix the first one is to put the LockRows node
below the Limit node instead of above it.  The solution for the second
one is to also put LockRows underneath the Sort node, and to regard its
output as unsorted so that a Sort node will certainly be generated.
(This in turn implies that we should prefer the cheapest-total plan
for the rest of the query.)

I'm not following how this would work. Would it mean that every record
would end up being locked?

--
greg

#10Tom Lane
tgl@sss.pgh.pa.us
In reply to: Bruce Momjian (#9)
Re: Endgame for all those SELECT FOR UPDATE changes: fix plan node order

Greg Stark <gsstark@mit.edu> writes:

On Sun, Oct 25, 2009 at 7:34 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

All that we have to do to fix the first one is to put the LockRows node
below the Limit node instead of above it. �The solution for the second
one is to also put LockRows underneath the Sort node, and to regard its
output as unsorted so that a Sort node will certainly be generated.
(This in turn implies that we should prefer the cheapest-total plan
for the rest of the query.)

I'm not following how this would work. Would it mean that every record
would end up being locked?

In the case of LIMIT, no, because LIMIT doesn't fetch any more rows than
it needs from its input node. In the case of ORDER BY, yes,
potentially. So we might conceivably decide we should fix the first
issue and not the second. However, I'd prefer to have a solution
whereby the query does what it appears to mean and you have to write
something more complicated if you want performance over correctness.

regards, tom lane

#11Tom Lane
tgl@sss.pgh.pa.us
In reply to: Tom Lane (#8)
Re: Endgame for all those SELECT FOR UPDATE changes: fix plan node order

I wrote:

Robert Haas <robertmhaas@gmail.com> writes:

Could the desired behavior be obtained using a CTE?

Nope, we push FOR UPDATE into WITHs too. I don't really see any way to
deal with this without some sort of semantic changes.

... although on reflection, I'm not sure *why* we push FOR UPDATE into
WITHs. That seems a bit antithetical to the position we've evolved that
WITH queries are executed independently of the outer query.

If we removed that bit of behavior, which hopefully is too new for much
code to depend on, then the old FOR-UPDATE-last behavior could be
attained via a WITH. And we'd not have to risk touching the interaction
between plain subqueries and FOR UPDATE, which is something that seems
much more likely to break existing apps.

On further investigation, this still seems like a good change to make,
but it doesn't solve the problem at hand. If we make FOR UPDATE not
propagate into WITH then the effect of

with w as (select ... order by x limit n) select * from w for update

is not going to be to lock just the rows pulled from the WITH; it's
going to be to not lock any rows at all. So we're still up against a
performance problem if we make these otherwise-desirable changes in plan
node order.

I realized just now that the backwards-compatibility problem is not
nearly as bad as I thought it was, because a lot of the syntaxes
we might want to change the behavior of will fail outright in 8.4
and before. We had this little bit in preptlist.c:

/*
* Currently the executor only supports FOR UPDATE/SHARE at top level
*/
if (root->query_level > 1)
ereport(ERROR,
(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
errmsg("SELECT FOR UPDATE/SHARE is not allowed in subqueries")));

and that is the error you will get if you have FOR UPDATE in or applying
to a sub-select that does not get flattened into the calling query.
So in particular, a sub-select using ORDER BY or LIMIT has never been
compatible with FOR UPDATE at all, and that means we can define the
behavior of that combination freely.

What I am thinking we should do is define that FOR UPDATE happens before
ORDER BY or LIMIT normally, but that if the FOR UPDATE is inherited from
an outer query level, it happens after the sub-select's ORDER BY or
LIMIT. The first provision fixes the bugs noted in our documentation,
and the second one allows people to get back the old behavior if they
need it for performance. This also seems reasonably non-astonishing
from a semantic viewpoint.

Actually implementing this will be more than a one-line change, but it
doesn't seem too terribly difficult --- we'll just need to modify the
query representation of FOR UPDATE enough that we can remember which
case we had.

Comments?

regards, tom lane

#12Robert Haas
robertmhaas@gmail.com
In reply to: Tom Lane (#11)
Re: Endgame for all those SELECT FOR UPDATE changes: fix plan node order

On Tue, Oct 27, 2009 at 11:22 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

I wrote:

Robert Haas <robertmhaas@gmail.com> writes:

Could the desired behavior be obtained using a CTE?

Nope, we push FOR UPDATE into WITHs too.  I don't really see any way to
deal with this without some sort of semantic changes.

... although on reflection, I'm not sure *why* we push FOR UPDATE into
WITHs.  That seems a bit antithetical to the position we've evolved that
WITH queries are executed independently of the outer query.

If we removed that bit of behavior, which hopefully is too new for much
code to depend on, then the old FOR-UPDATE-last behavior could be
attained via a WITH.  And we'd not have to risk touching the interaction
between plain subqueries and FOR UPDATE, which is something that seems
much more likely to break existing apps.

On further investigation, this still seems like a good change to make,
but it doesn't solve the problem at hand.  If we make FOR UPDATE not
propagate into WITH then the effect of

with w as (select ... order by x limit n) select * from w for update

is not going to be to lock just the rows pulled from the WITH; it's
going to be to not lock any rows at all.  So we're still up against a
performance problem if we make these otherwise-desirable changes in plan
node order.

I realized just now that the backwards-compatibility problem is not
nearly as bad as I thought it was, because a lot of the syntaxes
we might want to change the behavior of will fail outright in 8.4
and before.  We had this little bit in preptlist.c:

       /*
        * Currently the executor only supports FOR UPDATE/SHARE at top level
        */
       if (root->query_level > 1)
           ereport(ERROR,
                   (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
           errmsg("SELECT FOR UPDATE/SHARE is not allowed in subqueries")));

and that is the error you will get if you have FOR UPDATE in or applying
to a sub-select that does not get flattened into the calling query.
So in particular, a sub-select using ORDER BY or LIMIT has never been
compatible with FOR UPDATE at all, and that means we can define the
behavior of that combination freely.

What I am thinking we should do is define that FOR UPDATE happens before
ORDER BY or LIMIT normally, but that if the FOR UPDATE is inherited from
an outer query level, it happens after the sub-select's ORDER BY or
LIMIT.  The first provision fixes the bugs noted in our documentation,
and the second one allows people to get back the old behavior if they
need it for performance.  This also seems reasonably non-astonishing
from a semantic viewpoint.

Actually implementing this will be more than a one-line change, but it
doesn't seem too terribly difficult --- we'll just need to modify the
query representation of FOR UPDATE enough that we can remember which
case we had.

When you refer to an "outer query level", is that the same thing as a
sub-select? If so, I think I agree that the behavior is
non-astonishing.

...Robert

#13Tom Lane
tgl@sss.pgh.pa.us
In reply to: Robert Haas (#12)
Re: Endgame for all those SELECT FOR UPDATE changes: fix plan node order

Robert Haas <robertmhaas@gmail.com> writes:

On Tue, Oct 27, 2009 at 11:22 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

What I am thinking we should do is define that FOR UPDATE happens before
ORDER BY or LIMIT normally, but that if the FOR UPDATE is inherited from
an outer query level, it happens after the sub-select's ORDER BY or
LIMIT. �The first provision fixes the bugs noted in our documentation,
and the second one allows people to get back the old behavior if they
need it for performance. �This also seems reasonably non-astonishing
from a semantic viewpoint.

When you refer to an "outer query level", is that the same thing as a
sub-select? If so, I think I agree that the behavior is
non-astonishing.

Right, the case would be something like

select * from
(select * from foo order by x limit n) ss
for update of ss;

If you try this in any existing release it will just fail, because the
planner knows that it hasn't got a way to execute FOR UPDATE in a
subquery.

regards, tom lane

#14Robert Haas
robertmhaas@gmail.com
In reply to: Tom Lane (#13)
Re: Endgame for all those SELECT FOR UPDATE changes: fix plan node order

On Tue, Oct 27, 2009 at 1:06 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Robert Haas <robertmhaas@gmail.com> writes:

On Tue, Oct 27, 2009 at 11:22 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

What I am thinking we should do is define that FOR UPDATE happens before
ORDER BY or LIMIT normally, but that if the FOR UPDATE is inherited from
an outer query level, it happens after the sub-select's ORDER BY or
LIMIT.  The first provision fixes the bugs noted in our documentation,
and the second one allows people to get back the old behavior if they
need it for performance.  This also seems reasonably non-astonishing
from a semantic viewpoint.

When you refer to an "outer query level", is that the same thing as a
sub-select?  If so, I think I agree that the behavior is
non-astonishing.

Right, the case would be something like

       select * from
         (select * from foo order by x limit n) ss
       for update of ss;

If you try this in any existing release it will just fail, because the
planner knows that it hasn't got a way to execute FOR UPDATE in a
subquery.

That's a pretty odd construction.

In some sense I don't like the proposed behavior, because it's
imaginable that someone would use this syntax without realizing that
it could produce wrong answers. My own gut instinct would be to
always push down the FOR UPDATE as being a clearer way to convey what
was meant - but we've already established that not everyone's gut
instincts agree with mine, and if someone does write this, they might
easily fail to understand the risk that it poses.

I'm not sure what to do about it, though. Not giving people ANY way
to recover the old behavior is a little troubling.

...Robert

#15Tom Lane
tgl@sss.pgh.pa.us
In reply to: Robert Haas (#14)
Re: Endgame for all those SELECT FOR UPDATE changes: fix plan node order

Robert Haas <robertmhaas@gmail.com> writes:

On Tue, Oct 27, 2009 at 1:06 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Right, the case would be something like

select * from
(select * from foo order by x limit n) ss
for update of ss;

That's a pretty odd construction.

Dunno why you think that. That's exactly what one would write if one
wanted certain operations to execute in a different order than they're
defined to execute in within a single query level. We have not
previously been very clear about the order of operations for FOR UPDATE
locking relative to other steps, but now we will be.

regards, tom lane

#16Tom Lane
tgl@sss.pgh.pa.us
In reply to: Tom Lane (#15)
Re: Endgame for all those SELECT FOR UPDATE changes: fix plan node order

I wrote:

Robert Haas <robertmhaas@gmail.com> writes:

On Tue, Oct 27, 2009 at 1:06 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Right, the case would be something like

select * from
(select * from foo order by x limit n) ss
for update of ss;

That's a pretty odd construction.

Dunno why you think that. That's exactly what one would write if one
wanted certain operations to execute in a different order than they're
defined to execute in within a single query level. We have not
previously been very clear about the order of operations for FOR UPDATE
locking relative to other steps, but now we will be.

Actually ... it strikes me that there is another way we could approach
this. Namely, leave the semantics as-is (FOR UPDATE runs last) and
document that you can do

select * from
(select * from foo for update) ss
order by x limit n;

if you need FOR UPDATE to run before sorting. Or perhaps better,
redefine the ordering as ORDER BY then FOR UPDATE then LIMIT. Swapping
FOR UPDATE and LIMIT has no performance cost and eliminates the worse of
the two complaints in the documentation, without breaking any working
queries AFAICS. If you have the case where you want to cope with
concurrent updates to the sort key, then you can write the more
complicated query, and it's gonna cost ya. But that's not a typical
usage, as proven by the fact that it took years to realize there was
a problem there. So we shouldn't optimize for that usage at the expense
of cases where the sort key isn't expected to change.

It could be argued that this approach doesn't satisfy the principle of
least astonishment as well as doing FOR UPDATE first, but on reflection
I'm not sure I buy that. The traditional definition has been that we
only lock the rows that are actually returned, and putting FOR UPDATE
underneath the sort will break that expectation. If it's only underneath
LIMIT we can still meet that expectation.

So I'm liking this more the more I think about it ... and it's also
significantly less work than the other way would be.

regards, tom lane