Different results between PostgreSQL and Oracle for "for update" statement
We can reproduce this difference with the following steps.
create table su (a int, b int);
insert into su values(1, 1);
- session 1:
begin;
update su set b = 2 where b = 1;
- sess 2:
select * from su where a in (select a from su where b = 1) for update;
- sess 1:
commit;
Then session 2 can get the result.
PostgreSQL:
a | b
---+---
1 | 2
(1 row)
Oracle: It gets 0 rows.
Oracle's plan is pretty similar to Postgres.
PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------
Plan hash value: 2828511618
-----------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
-----------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 52 | 4 (0)| 00:00:01 |
| 1 | FOR UPDATE | | | | | |
| 2 | BUFFER SORT | | | | | |
|* 3 | HASH JOIN SEMI | | 1 | 52 | 4 (0)| 00:00:01 |
| 4 | TABLE ACCESS FULL| SU | 1 | 26 | 2 (0)| 00:00:01 |
|* 5 | TABLE ACCESS FULL| SU | 1 | 26 | 2 (0)| 00:00:01 |
PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------
-----------------------------------------------------------------------------
Any thoughts on who is wrong?
--
Best Regards
Andy Fan
Andy Fan <zhihui.fan1213@gmail.com> writes:
create table su (a int, b int);
insert into su values(1, 1);
- session 1:
begin;
update su set b = 2 where b = 1;
- sess 2:
select * from su where a in (select a from su where b = 1) for update;
This'd probably work the way you expect if there were "for update"
in the sub-select as well. As is, the sub-select will happily
return "1".
regards, tom lane
On Thu, Nov 19, 2020 at 11:49 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
Andy Fan <zhihui.fan1213@gmail.com> writes:
create table su (a int, b int);
insert into su values(1, 1);- session 1:
begin;
update su set b = 2 where b = 1;- sess 2:
select * from su where a in (select a from su where b = 1) for update;This'd probably work the way you expect if there were "for update"
in the sub-select as well. As is, the sub-select will happily
return "1".regards, tom lane
Hi Tom:
Thank you for your attention. Your suggestion would fix the issue. However
The difference will cause some risks when users move their application from
Oracle
to PostgreSQL. So I'd like to think which behavior is more reasonable.
In our current pg strategy, we would get
select * from su where a in (select a from su where b = 1) for update;
a | b
---+---
1 | 2
(1 row)
The data is obtained from 4 steps:
1. In the subquery, it gets su(a=1, b=1).
2. in the outer query, it is blocked.
3. session 1 is committed. sess 2 can continue.
4. outer query gets su(a=1, b=2).
By comparing the result in step 1 & step 4 in the same query, it
looks like we are using 2 different snapshots for the same query.
I think it is a bit strange.
If we think it is an issue, I think we can fix it with something like this:
+/*
+ * We should use the same RowMarkType for the RangeTblEntry
+ * if the underline relation is the same. Doesn't handle SubQuery for now.
+ */
+static RowMarkType
+select_rowmark_type_for_rangetable(RangeTblEntry *rte,
+ List
*rowmarks,
+
LockClauseStrength strength)
+{
+ ListCell *lc;
+ foreach(lc, rowmarks)
+ {
+ PlanRowMark *rc = lfirst_node(PlanRowMark, lc);
+ if (rc->relid == rte->relid)
+ return rc->markType;
+ }
+ return select_rowmark_type(rte, strength);
+}
+
/*
* preprocess_rowmarks - set up PlanRowMarks if needed
*/
@@ -2722,6 +2743,7 @@ preprocess_rowmarks(PlannerInfo *root)
newrc->strength = rc->strength;
newrc->waitPolicy = rc->waitPolicy;
newrc->isParent = false;
+ newrc->relid = rte->relid;
prowmarks = lappend(prowmarks, newrc);
}
@@ -2742,18 +2764,18 @@ preprocess_rowmarks(PlannerInfo *root)
newrc = makeNode(PlanRowMark);
newrc->rti = newrc->prti = i;
newrc->rowmarkId = ++(root->glob->lastRowMarkId);
- newrc->markType = select_rowmark_type(rte, LCS_NONE);
+ newrc->markType = select_rowmark_type_for_rangetable(rte,
prowmarks, LCS_NONE);
newrc->allMarkTypes = (1 << newrc->markType);
newrc->strength = LCS_NONE;
newrc->waitPolicy = LockWaitBlock; /* doesn't matter */
newrc->isParent = false;
-
+ newrc->relid = rte->relid;
prowmarks = lappend(prowmarks, newrc);
}
-
root->rowMarks = prowmarks;
}
+
/*
* Select RowMarkType to use for a given table
*/
diff --git a/src/include/nodes/plannodes.h b/src/include/nodes/plannodes.h
index ac5685da64..926086a69a 100644
--- a/src/include/nodes/plannodes.h
+++ b/src/include/nodes/plannodes.h
@@ -1103,6 +1103,7 @@ typedef struct PlanRowMark
LockClauseStrength strength; /* LockingClause's strength, or
LCS_NONE */
LockWaitPolicy waitPolicy; /* NOWAIT and SKIP LOCKED options */
bool isParent; /* true if this is a
"dummy" parent entry */
+ Oid relid; /* relation oid */
} PlanRowMark;
Do you think it is a bug and the above method is the right way to go?
--
Best Regards
Andy Fan
On 11/20/20 9:57 AM, Andy Fan wrote:
Thank you for your attention. Your suggestion would fix the issue. However
The difference will cause some risks when users move their application
from Oracle
to PostgreSQL. So I'd like to think which behavior is more reasonable.
I think PostgreSQL's behavior is more reasonable since it only locks the
rows it claims to lock and no extra rows. This makes the code easy to
reason about. And PostgreSQL does not re-evaluate sub queries after
grabbing the lock which while it might be surprising to some people is
also a quite nice consistent behavior in practice as long as you are
aware of it.
I do not see why these two scenarios should behave differently (which I
think they would with your proposed patch):
== Scenario 1
create table su (a int, b int);
insert into su values(1, 1);
- session 1:
begin;
update su set b = 2 where b = 1;
- sess 2:
select * from su where a in (select a from su where b = 1) for update;
- sess 1:
commit;
== Scenario 2
create table su (a int, b int);
insert into su values(1, 1);
create table su2 (a int, b int);
insert into su2 values(1, 1);
- session 1:
begin;
update su set b = 2 where b = 1;
update su2 set b = 2 where b = 1;
- sess 2:
select * from su where a in (select a from su2 where b = 1) for update;
- sess 1:
commit;
Andreas
Hi Andreas:
Thanks for your input.
On Fri, Nov 20, 2020 at 9:37 PM Andreas Karlsson <andreas@proxel.se> wrote:
On 11/20/20 9:57 AM, Andy Fan wrote:
Thank you for your attention. Your suggestion would fix the issue.
However
The difference will cause some risks when users move their application
from Oracle
to PostgreSQL. So I'd like to think which behavior is more reasonable.I think PostgreSQL's behavior is more reasonable since it only locks the
rows it claims to lock and no extra rows. This makes the code easy to
reason about. And PostgreSQL does not re-evaluate sub queries after
grabbing the lock which while it might be surprising to some people is
also a quite nice consistent behavior in practice as long as you are
aware of it.
I admit my way is bad after reading your below question, but I
would not think *it might be surprising to some people* is a good signal
for a design. Would you think "re-evaluate the quals" after grabbing the
lock should be a good idea? And do you know if any other database uses
the postgres's way or Oracle's way? I just heard Oracle might do the
re-check just some minutes before reading your reply and I also found
Oracle doesn't lock the extra rows per my test.
I do not see why these two scenarios should behave differently (which I
think they would with your proposed patch):
Good question! I think my approach doesn't make sense now!
--
Best Regards
Andy Fan
On 11/20/20 3:25 PM, Andy Fan wrote:> On Fri, Nov 20, 2020 at 9:37 PM
Andreas Karlsson <andreas@proxel.se
<mailto:andreas@proxel.se>> wrote:
On 11/20/20 9:57 AM, Andy Fan wrote:
Thank you for your attention. Your suggestion would fix the
issue. However
The difference will cause some risks when users move their
application
from Oracle
to PostgreSQL. So I'd like to think which behavior is morereasonable.
I think PostgreSQL's behavior is more reasonable since it only locks
the
rows it claims to lock and no extra rows. This makes the code easy to
reason about. And PostgreSQL does not re-evaluate sub queries after
grabbing the lock which while it might be surprising to some people is
also a quite nice consistent behavior in practice as long as you are
aware of it.I admit my way is bad after reading your below question, but I
would not think *it might be surprising to some people* is a good signal
for a design. Would you think "re-evaluate the quals" after grabbing the
lock should be a good idea? And do you know if any other database uses
the postgres's way or Oracle's way? I just heard Oracle might do the
re-check just some minutes before reading your reply and I also found
Oracle doesn't lock the extra rows per my test.
Re-evaluating the sub queries is probably a very bad idea in practice
since a sub query can have side effects, side effects which could really
mess up some poor developer's database if they are unaware of it. The
tradeoff PostgreSQL has made is not perfect but on top of my head I
cannot think of anything less bad.
I am sadly not familiar enough with Oracle or have access to any Oracle
license so I cannot comment on how Oracle have implemented their behvior
or what tradeoffs they have made.
Andreas
On Fri, Nov 20, 2020 at 3:04 PM Andreas Karlsson <andreas@proxel.se> wrote:
I am sadly not familiar enough with Oracle or have access to any Oracle
license so I cannot comment on how Oracle have implemented their behvior
or what tradeoffs they have made.
I bet that Oracle does a statement-level rollback for READ COMMITTED
mode's conflict handling. I'm not sure if this means that it locks
multiple rows or not. I think that it only uses one snapshot, which
isn't quite what we do in the Postgres case. It's really complicated
in both systems.
Andy is right to say that it looks like Postgres is using 2 different
snapshots for the same query. That's *kind of* what happens here.
Technically the executor doesn't take a new snapshot, but it does the
moral equivalent. See the EvalPlanQual() section of the executor
README.
FWIW this area is something that isn't very well standardized, despite
what you may hear. For example, InnoDBs REPEATABLE READ doesn't even
use the transaction snapshot for UPDATEs and DELETEs at all:
https://dev.mysql.com/doc/refman/8.0/en/innodb-consistent-read.html
Worst of all, you can update rows that were not visible to the
transaction snapshot, thus rendering them visible (see the "Note" box
in the documentation for an example of this). InnoDB won't throw a
serialization error at any isolation level. So it could be worse!
--
Peter Geoghegan
Thank all of you for your great insight!
On Sat, Nov 21, 2020 at 9:04 AM Peter Geoghegan <pg@bowt.ie> wrote:
On Fri, Nov 20, 2020 at 3:04 PM Andreas Karlsson <andreas@proxel.se>
wrote:I am sadly not familiar enough with Oracle or have access to any Oracle
license so I cannot comment on how Oracle have implemented their behvior
or what tradeoffs they have made.I bet that Oracle does a statement-level rollback for READ COMMITTED
mode's conflict handling.
I'd agree with you about this point, this difference can cause more
different
behavior between Postgres & Oracle (not just select .. for update).
create table dml(a int, b int);
insert into dml values(1, 1), (2,2);
-- session 1:
begin;
delete from dml where a in (select min(a) from dml);
--session 2:
delete from dml where a in (select min(a) from dml);
-- session 1:
commit;
In Oracle: 1 row deleted in sess 2.
In PG: 0 rows are deleted.
I'm not sure if this means that it locks multiple rows or not.
This is something not really exists and you can ignore this part:)
About the statement level rollback, Another difference is related.
create table t (a int primary key, b int);
begin;
insert into t values(1,1);
insert into t values(1, 1);
commit;
Oracle : t has 1 row, PG: t has 0 row (since the whole transaction is
aborted).
I don't mean we need to be the same as Oracle, but to support a
customer who comes from Oracle, it would be good to know the
difference.
--
Best Regards
Andy Fan
so 21. 11. 2020 v 9:59 odesílatel Andy Fan <zhihui.fan1213@gmail.com>
napsal:
Thank all of you for your great insight!
On Sat, Nov 21, 2020 at 9:04 AM Peter Geoghegan <pg@bowt.ie> wrote:
On Fri, Nov 20, 2020 at 3:04 PM Andreas Karlsson <andreas@proxel.se>
wrote:I am sadly not familiar enough with Oracle or have access to any Oracle
license so I cannot comment on how Oracle have implemented their behvior
or what tradeoffs they have made.I bet that Oracle does a statement-level rollback for READ COMMITTED
mode's conflict handling.I'd agree with you about this point, this difference can cause more
different
behavior between Postgres & Oracle (not just select .. for update).create table dml(a int, b int);
insert into dml values(1, 1), (2,2);-- session 1:
begin;
delete from dml where a in (select min(a) from dml);--session 2:
delete from dml where a in (select min(a) from dml);-- session 1:
commit;In Oracle: 1 row deleted in sess 2.
In PG: 0 rows are deleted.I'm not sure if this means that it locks multiple rows or not.
This is something not really exists and you can ignore this part:)
About the statement level rollback, Another difference is related.
create table t (a int primary key, b int);
begin;
insert into t values(1,1);
insert into t values(1, 1);
commit;Oracle : t has 1 row, PG: t has 0 row (since the whole transaction is
aborted).I don't mean we need to be the same as Oracle, but to support a
customer who comes from Oracle, it would be good to know the
difference.
yes, it would be nice to be better documented, somewhere - it should not be
part of Postgres documentation. Unfortunately, people who know Postgres
perfectly do not have the same knowledge about Oracle.
Some differences are documented in Orafce documentation
https://github.com/orafce/orafce/tree/master/doc
but I am afraid so there is nothing about the different behaviour of
snapshots.
Regards
Pavel
Show quoted text
--
Best Regards
Andy Fan
On Sat, Nov 21, 2020 at 12:58 AM Andy Fan <zhihui.fan1213@gmail.com> wrote:
I don't mean we need to be the same as Oracle, but to support a
customer who comes from Oracle, it would be good to know the
difference.
Actually, it is documented here:
https://www.postgresql.org/docs/devel/transaction-iso.html
The description starts with: "UPDATE, DELETE, SELECT FOR UPDATE, and
SELECT FOR SHARE commands behave the same as SELECT in terms of
searching for target rows...".
I imagine that the number of application developers that are aware of
this specific aspect of transaction isolation in PostgreSQL (READ
COMMITTED conflict handling/EvalPlanQual()) is extremely small. In
practice it doesn't come up that often. Though Postgres hackers tend
to think about it a lot because it is hard to maintain.
I'm not saying that that's good or bad. Just that that has been my experience.
I am sure that some application developers really do understand the
single most important thing about READ COMMITTED mode's behavior: each
new command gets its own MVCC snapshot. But I believe that Oracle is
no different. So in practice application developers probably don't
notice any difference between READ COMMITTED mode in practically all
cases. (Again, just my opinion.)
--
Peter Geoghegan
On Sat, Nov 21, 2020 at 11:27 PM Pavel Stehule <pavel.stehule@gmail.com>
wrote:
so 21. 11. 2020 v 9:59 odesílatel Andy Fan <zhihui.fan1213@gmail.com>
napsal:Thank all of you for your great insight!
On Sat, Nov 21, 2020 at 9:04 AM Peter Geoghegan <pg@bowt.ie> wrote:
On Fri, Nov 20, 2020 at 3:04 PM Andreas Karlsson <andreas@proxel.se>
wrote:I am sadly not familiar enough with Oracle or have access to any Oracle
license so I cannot comment on how Oracle have implemented theirbehvior
or what tradeoffs they have made.
I bet that Oracle does a statement-level rollback for READ COMMITTED
mode's conflict handling.I'd agree with you about this point, this difference can cause more
different
behavior between Postgres & Oracle (not just select .. for update).create table dml(a int, b int);
insert into dml values(1, 1), (2,2);-- session 1:
begin;
delete from dml where a in (select min(a) from dml);--session 2:
delete from dml where a in (select min(a) from dml);-- session 1:
commit;In Oracle: 1 row deleted in sess 2.
In PG: 0 rows are deleted.I'm not sure if this means that it locks multiple rows or not.
This is something not really exists and you can ignore this part:)
About the statement level rollback, Another difference is related.
create table t (a int primary key, b int);
begin;
insert into t values(1,1);
insert into t values(1, 1);
commit;Oracle : t has 1 row, PG: t has 0 row (since the whole transaction is
aborted).I don't mean we need to be the same as Oracle, but to support a
customer who comes from Oracle, it would be good to know the
difference.yes, it would be nice to be better documented, somewhere - it should not
be part of Postgres documentation. Unfortunately, people who know Postgres
perfectly do not have the same knowledge about Oracle.Some differences are documented in Orafce documentation
https://github.com/orafce/orafce/tree/master/doc
orafce project is awesome!
but I am afraid so there is nothing about the different behaviour of
snapshots.
https://github.com/orafce/orafce/pull/120 is opened for this.
--
Best Regards
Andy Fan
On Sun, Nov 22, 2020 at 5:56 AM Peter Geoghegan <pg@bowt.ie> wrote:
On Sat, Nov 21, 2020 at 12:58 AM Andy Fan <zhihui.fan1213@gmail.com>
wrote:I don't mean we need to be the same as Oracle, but to support a
customer who comes from Oracle, it would be good to know the
difference.Actually, it is documented here:
https://www.postgresql.org/docs/devel/transaction-iso.htmlThe description starts with: "UPDATE, DELETE, SELECT FOR UPDATE, and
SELECT FOR SHARE commands behave the same as SELECT in terms of
searching for target rows...".I imagine that the number of application developers that are aware of
this specific aspect of transaction isolation in PostgreSQL (READ
COMMITTED conflict handling/EvalPlanQual()) is extremely small. In
practice it doesn't come up that often.
Totally agree with that.
Though Postgres hackers tend to think about it a lot because it is hard to
maintain.
Hackers may care about this if they run into a real user case :)
I'm not saying that that's good or bad. Just that that has been my
experience.I am sure that some application developers really do understand the
single most important thing about READ COMMITTED mode's behavior: each
new command gets its own MVCC snapshot. But I believe that Oracle is
no different. So in practice application developers probably don't
notice any difference between READ COMMITTED mode in practically all
cases. (Again, just my opinion.)--
Peter Geoghegan
--
Best Regards
Andy Fan