possible row locking bug in 7.0.3 & 7.1

Started by Forest Wilkinsonalmost 25 years ago27 messages
#1Forest Wilkinson
fspam@home.com

I'm having a problem with functions written in SQL. Specifically, they
don't seem to be adhering to Postgres locking rules. For the record, I'm
using postgres 7.0.3, installed from RPMs, on Red Hat 6.2. I got the same
results with postgres 7.1 beta 6, installed from sources.

Here's what I'm seeing:
(psql input represented by '<<'; output represented by '>>'.)

session1<< create table idseq
session1<< (
session1<< name varchar(32) not null,
session1<< id int8 not null default 0
session1<< );
session1>> CREATE

session1<< insert into idseq values ('myid');
session1>> INSERT 18734 1

Each row in the table is supposed to represent a named numeric sequence,
much like the sequences built into postgres. (Mine use an int8 though,
so their values can be much higher.)

session1<< create function nextid( varchar(32)) returns int8 as '
session1<< select * from idseq where name = $1::text for update;
session1<< update idseq set id = id + 1 where name = $1::text;
session1<< select id from idseq where name = $1::text;
session1<< ' language 'sql';
session1>> CREATE

The idea here is that the select...for update within the nextid() function
will establish a row level lock, preventing two concurrent function calls
from overlapping.

Next, I test with two sessions as follows:

session1<< begin;
session1>> BEGIN

session2<< begin;
session2>> BEGIN

session1<< select nextid('myid');
session1>> nextid
session1>> --------
session1>> 1
session1>> (1 row)

session2<< select nextid('myid');

(session2 blocks until session1 completes its transaction)

session1<< commit;
session1>> COMMIT

(session2 resumes)

session2>> nextid
session2>> --------
session2>> 0
session2>> (1 row)

What gives??? I expected the second call to nextid() to return 2!

session2<< commit;
session2>> COMMIT

session2<< select * from idseq;
session2>> name | id
session2>> ------+----
session2>> myid | 2
session2>> (1 row)

session1<< select * from idseq;
session1>> name | id
session1>> ------+----
session1>> myid | 2
session1>> (1 row)

As you can see, my nextid() function is not synchronized the way I hoped.
I don't know why, though. Can someone help? I'm going to try out some of my
SPI functions with 7.1 beta 6, to see if they exhibit a locking problem as
well.

Thanks,

Forest Wilkinson

#2Tom Lane
tgl@sss.pgh.pa.us
In reply to: Forest Wilkinson (#1)
Re: possible row locking bug in 7.0.3 & 7.1

Forest Wilkinson <fspam@home.com> writes:

session1<< create function nextid( varchar(32)) returns int8 as '
session1<< select * from idseq where name = $1::text for update;
session1<< update idseq set id = id + 1 where name = $1::text;
session1<< select id from idseq where name = $1::text;
session1<< ' language 'sql';
[ doesn't work as expected in parallel transactions ]

This is a fairly interesting example. What I find is that at the final
SELECT, the function can see both the tuple outdated by the other
transaction AND the new tuple it has inserted. (You can demonstrate
that by doing select count(id) instead of select id.) Whichever one
happens to be visited first is the one that gets returned by the
function, and that's generally the older one in this example.

MVCC seems to be operating as designed here, more or less. The outdated
tuple is inserted by a known-committed transaction, and deleted by a
transaction that's also committed, but one that committed *since the
start of the current transaction*. So its effects should not be visible
to the SELECT, and therefore the tuple should be visible. The anomalous
behavior is not really in the final SELECT, but in the earlier commands
that were able to see the effects of a transaction committed later than
the start of the second session's transaction.

The workaround for Forest is to make the final SELECT be a SELECT FOR
UPDATE, so that it's playing by the same rules as the earlier commands.
But I wonder whether we ought to rethink the MVCC rules so that that's
not necessary. I have no idea how we might change the rules though.
If nothing else, we should document this issue better: SELECT and SELECT
FOR UPDATE have different visibility rules, so you probably don't want
to intermix them.

regards, tom lane

#3Philip Warner
pjw@rhyme.com.au
In reply to: Tom Lane (#2)
Re: [HACKERS] Re: possible row locking bug in 7.0.3 & 7.1

At 18:14 27/03/01 -0500, Tom Lane wrote:

Forest Wilkinson <fspam@home.com> writes:

session1<< create function nextid( varchar(32)) returns int8 as '
session1<< select * from idseq where name = $1::text for update;
session1<< update idseq set id = id + 1 where name = $1::text;
session1<< select id from idseq where name = $1::text;
session1<< ' language 'sql';
[ doesn't work as expected in parallel transactions ]

What I find is that at the final
SELECT, the function can see both the tuple outdated by the other
transaction AND the new tuple it has inserted.

Surely we should distinguish between real new tuples, and new tuple
versions? I don't think it's ever reasonable behaviour to see two versions
of the same row.

(You can demonstrate
that by doing select count(id) instead of select id.) Whichever one
happens to be visited first is the one that gets returned by the
function, and that's generally the older one in this example.

MVCC seems to be operating as designed here, more or less. The outdated
tuple is inserted by a known-committed transaction, and deleted by a
transaction that's also committed, but one that committed *since the
start of the current transaction*. So its effects should not be visible
to the SELECT, and therefore the tuple should be visible. The anomalous
behavior is not really in the final SELECT, but in the earlier commands
that were able to see the effects of a transaction committed later than
the start of the second session's transaction.

Looking at the docs, I see that 'SERIALIZABLE' has the same visibility
rules as 'READ COMMITTED', which is very confusing. I expect that a Read
Committed TX should see committed changes for a TX that commits during the
first TX (although this may need to be limited to TXs started before the
first TX, but I'm not sure). If this is not the case, then we never get
non-repeatable reads, AFAICT:

P2 (������Non-repeatable read������): SQL-transaction T1 reads a row.
SQL-transaction T2 then modifies or deletes that row and performs
a COMMIT. If T1 then attempts to reread the row, it may
receive the modified value or discover that the row has been deleted.

which is one of the differences between SERIALIZABLE and READ-COMMITTED.

The workaround for Forest is to make the final SELECT be a SELECT FOR
UPDATE, so that it's playing by the same rules as the earlier commands.

Eek. Does this seem good to you? I would expect that SELECT and
SELECT...FOR UPDATE should return the same result set.

But I wonder whether we ought to rethink the MVCC rules so that that's
not necessary. I have no idea how we might change the rules though.

Disallowing visibility of two versions of the same row would help.

----------------------------------------------------------------
Philip Warner | __---_____
Albatross Consulting Pty. Ltd. |----/ - \
(A.B.N. 75 008 659 498) | /(@) ______---_
Tel: (+61) 0500 83 82 81 | _________ \
Fax: (+61) 0500 83 82 82 | ___________ |
Http://www.rhyme.com.au | / \|
| --________--
PGP key available upon request, | /
and from pgp5.ai.mit.edu:11371 |/

#4Tom Lane
tgl@sss.pgh.pa.us
In reply to: Philip Warner (#3)
Re: [HACKERS] Re: possible row locking bug in 7.0.3 & 7.1

Philip Warner <pjw@rhyme.com.au> writes:

The workaround for Forest is to make the final SELECT be a SELECT FOR
UPDATE, so that it's playing by the same rules as the earlier commands.

Eek. Does this seem good to you?

I did call it a workaround ;-)

I don't think that we dare try to make any basic changes in MVCC for 7.1
at this late hour, so Forest is going to have to live with that answer
for awhile. But I would like to see a cleaner answer in future
releases. As I've opined before, the whole EvalPlanQual mechanism
strikes me as essentially bogus in any case...

regards, tom lane

#5Forest Wilkinson
fspam@home.com
In reply to: Tom Lane (#2)
Re: possible row locking bug in 7.0.3 & 7.1

On Tuesday 27 March 2001 15:14, Tom Lane wrote:

Forest Wilkinson <fspam@home.com> writes:

session1<< create function nextid( varchar(32)) returns int8 as '
session1<< select * from idseq where name = $1::text for update;
session1<< update idseq set id = id + 1 where name = $1::text;
session1<< select id from idseq where name = $1::text;
session1<< ' language 'sql';
[ doesn't work as expected in parallel transactions ]

[snip]

The workaround for Forest is to make the final SELECT be a SELECT FOR
UPDATE, so that it's playing by the same rules as the earlier commands.
But I wonder whether we ought to rethink the MVCC rules so that that's
not necessary. I have no idea how we might change the rules though.
If nothing else, we should document this issue better: SELECT and SELECT
FOR UPDATE have different visibility rules, so you probably don't want
to intermix them.

My, that's ugly. (But thanks for the workaround.)

If I remember correctly, UPDATE establishes a lock on the affected rows,
which will block another UPDATE on the same rows for the duration of the
transaction. If that's true, shouldn't I be able to achieve my desired
behavior by removing the initial as follows:

create function nextid( varchar(32)) returns int8 as '
update idseq set id = id + 1 where name = $1::text;
select id from idseq where name = $1::text;
' language 'sql';

Or, would I still have to add FOR UPDATE to that final SELECT?

Forest

#6Tom Lane
tgl@sss.pgh.pa.us
In reply to: Forest Wilkinson (#5)
Re: possible row locking bug in 7.0.3 & 7.1

Forest Wilkinson <fspam@home.com> writes:

If I remember correctly, UPDATE establishes a lock on the affected rows,
which will block another UPDATE on the same rows for the duration of the
transaction. If that's true, shouldn't I be able to achieve my desired
behavior by removing the initial as follows:

create function nextid( varchar(32)) returns int8 as '
update idseq set id = id + 1 where name = $1::text;
select id from idseq where name = $1::text;
' language 'sql';

Or, would I still have to add FOR UPDATE to that final SELECT?

You're right, the initial SELECT FOR UPDATE is a waste of cycles
considering that you're not using the value it returns. But you'll
still need the last select to be FOR UPDATE so that it plays by the
same rules as the UPDATE does.

regards, tom lane

#7Hiroshi Inoue
Inoue@tpf.co.jp
In reply to: Forest Wilkinson (#1)
Re: [HACKERS] Re: possible row locking bug in 7.0.3 & 7.1

Tom Lane wrote:

Philip Warner <pjw@rhyme.com.au> writes:

The workaround for Forest is to make the final SELECT be a SELECT FOR
UPDATE, so that it's playing by the same rules as the earlier commands.

Eek. Does this seem good to you?

I did call it a workaround ;-)

I don't think that we dare try to make any basic changes in MVCC for 7.1
at this late hour, so Forest is going to have to live with that answer
for awhile. But I would like to see a cleaner answer in future
releases.

Is it the MVCC's restriction that each query inside a function
must use the same snapshot ?

As I've opined before, the whole EvalPlanQual mechanism
strikes me as essentially bogus in any case...

How would you change it ? UPDATE/SELECT FOR UPDATE have to
SELECT/UPDATE the latest tuples. I don't think of any simple
way for 'SELECT FOR UPDATE' to have the same visibility as
simple SELECT.

regards,
Hiroshi Inoue

#8Mikheev, Vadim
vmikheev@SECTORBASE.COM
In reply to: Hiroshi Inoue (#7)
RE: [HACKERS] Re: possible row locking bug in 7.0.3 & 7.1

I don't think that we dare try to make any basic changes in
MVCC for 7.1 at this late hour, so Forest is going to have
to live with that answer for awhile. But I would like to see
a cleaner answer in future releases.

Is it the MVCC's restriction that each query inside a function
must use the same snapshot ?

No. MVCC restricts what is visible to query itself. Current
function' behaviour is like Oracle' one.
Strictly speaking queries inside function don't use the same
snapshot - they see changes made by other queries of this
function. Should we allow them see changes made by other
transactions? I'm not sure. Maybe by special means like
CREATE SNAPSHOT S;
SELECT FROM foo WITH SNAPSHOT S;
?

For this particular case - concurrent UPDATE then
UPDATE/DELETE + SELECT - there is simple solution: meeting
tuple updated by concurrent *committed* transaction
SELECT (in READ COMMITTED mode) should look in newer tuple
versions. If some of newer tuples is invalidated (updated/deleted)
by *this* transaction and this invalidation is *visible*
to SELECT (older CommandId) then old tuple version must not
be returned (newer tuple version will be returned of course).
Reported problem is caused by bug (only one tuple version must be
returned by SELECT) and this is way to fix it.

But note that for the case of concurrent DELETE then
INSERT + SELECT two tuples will be returned anyway and
I don't think that this is bug.

As I've opined before, the whole EvalPlanQual mechanism
strikes me as essentially bogus in any case...

How would you change it ? UPDATE/SELECT FOR UPDATE have to
SELECT/UPDATE the latest tuples. I don't think of any simple
way for 'SELECT FOR UPDATE' to have the same visibility as
simple SELECT.

Yes, I also don't understand what's wrong with EvalPlanQual.

Vadim

#9Mikheev, Vadim
vmikheev@SECTORBASE.COM
In reply to: Mikheev, Vadim (#8)
RE: Re: [SQL] possible row locking bug in 7.0.3 & 7.1

Looking at the docs, I see that 'SERIALIZABLE' has the same visibility
rules as 'READ COMMITTED', which is very confusing. I expect

Hm, you're right:
http://www.postgresql.org/devel-corner/docs/postgres/xact-read-committed.htm
l

"Read Committed is the default isolation level in Postgres. When
a transaction runs on this isolation level, a SELECT query sees only
data committed before the transaction began..."
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Must be "committed before the *query* began" as it was in 6.5 docs.
Any way to fix it before release?

Vadim

#10Tom Lane
tgl@sss.pgh.pa.us
In reply to: Mikheev, Vadim (#9)
Re: Re: [SQL] possible row locking bug in 7.0.3 & 7.1

"Mikheev, Vadim" <vmikheev@SECTORBASE.COM> writes:

Hm, you're right:
http://www.postgresql.org/devel-corner/docs/postgres/xact-read-committed.html

"Read Committed is the default isolation level in Postgres. When
a transaction runs on this isolation level, a SELECT query sees only
data committed before the transaction began..."
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Must be "committed before the *query* began" as it was in 6.5 docs.

Will fix this shortly...

regards, tom lane

#11Philip Warner
pjw@rhyme.com.au
In reply to: Mikheev, Vadim (#8)
RE: [HACKERS] Re: possible row locking bug in 7.0.3 & 7.1

At 09:58 28/03/01 -0800, Mikheev, Vadim wrote:

Reported problem is caused by bug (only one tuple version must be
returned by SELECT) and this is way to fix it.

I assume this is not possible in 7.1?

----------------------------------------------------------------
Philip Warner | __---_____
Albatross Consulting Pty. Ltd. |----/ - \
(A.B.N. 75 008 659 498) | /(@) ______---_
Tel: (+61) 0500 83 82 81 | _________ \
Fax: (+61) 0500 83 82 82 | ___________ |
Http://www.rhyme.com.au | / \|
| --________--
PGP key available upon request, | /
and from pgp5.ai.mit.edu:11371 |/

#12Mikheev, Vadim
vmikheev@SECTORBASE.COM
In reply to: Philip Warner (#11)
RE: [HACKERS] Re: possible row locking bug in 7.0.3 & 7.1

Reported problem is caused by bug (only one tuple version must be
returned by SELECT) and this is way to fix it.

I assume this is not possible in 7.1?

Just looked in heapam.c - I can fix it in two hours.
The question is - should we do this now?
Comments?

Vadim

#13Philip Warner
pjw@rhyme.com.au
In reply to: Mikheev, Vadim (#12)
RE: [HACKERS] Re: possible row locking bug in 7.0.3 & 7.1

At 19:14 29/03/01 -0800, Mikheev, Vadim wrote:

Reported problem is caused by bug (only one tuple version must be
returned by SELECT) and this is way to fix it.

I assume this is not possible in 7.1?

Just looked in heapam.c - I can fix it in two hours.
The question is - should we do this now?
Comments?

It's a bug; how confident are you of the fix?

----------------------------------------------------------------
Philip Warner | __---_____
Albatross Consulting Pty. Ltd. |----/ - \
(A.B.N. 75 008 659 498) | /(@) ______---_
Tel: (+61) 0500 83 82 81 | _________ \
Fax: (+61) 0500 83 82 82 | ___________ |
Http://www.rhyme.com.au | / \|
| --________--
PGP key available upon request, | /
and from pgp5.ai.mit.edu:11371 |/

#14Hiroshi Inoue
Inoue@tpf.co.jp
In reply to: Philip Warner (#13)
Re: [HACKERS] Re: possible row locking bug in 7.0.3 & 7.1

Philip Warner wrote:

At 19:14 29/03/01 -0800, Mikheev, Vadim wrote:

Reported problem is caused by bug (only one tuple version must be
returned by SELECT) and this is way to fix it.

I assume this is not possible in 7.1?

Just looked in heapam.c - I can fix it in two hours.
The question is - should we do this now?
Comments?

It's a bug; how confident are you of the fix?

I doubt if it's a bug of SELECT. Well what
'concurrent UPDATE then SELECT FOR UPDATE +
SELECT' return ?

regards,
Hiroshi Inoue

#15Mikheev, Vadim
vmikheev@SECTORBASE.COM
In reply to: Hiroshi Inoue (#14)
RE: [HACKERS] Re: possible row locking bug in 7.0.3 & 7.1

I assume this is not possible in 7.1?

Just looked in heapam.c - I can fix it in two hours.
The question is - should we do this now?
Comments?

It's a bug; how confident are you of the fix?

95% -:)

I doubt if it's a bug of SELECT. Well what
'concurrent UPDATE then SELECT FOR UPDATE +
SELECT' return ?

I'm going to add additional check to heapgettup and
heap_fetch:

HeapTupleSatisfies(T) is TRUE:

IF XactIsoLevel is READ_COMMITTED
and snapshot != SnapshotDirty
and !(T->t_data->t_infomask & HEAP_XMAX_INVALID)
and T->t_data->t_infomask & HEAP_XMAX_COMMITTED
and T->t_self != T->t_data->t_ctid
{
FOR ( ; ; )
{
fetch tuple->t_data->t_ctid tuple
IF t_infomask & (HEAP_XMAX_INVALID | HEAP_MARKED_FOR_UPDATE)
break; -- and return T
IF t_infomask & HEAP_XMAX_COMMITTED
{
IF t_self != ctid -- updated
continue;
break; -- deleted, return T
}
-- uncommitted update/delete
IF t_xmax != CurrentTransactionID
break; -- and return T
-- changed by current TX!
IF changed *BEFORE* this query began
{
-- DELETE + SELECT: nothing to return
-- UPDATE + SELECT: newer tuple version
-- will be/was returned by query
return NULL;
}
continue;
}
}

Vadim

#16Philip Warner
pjw@rhyme.com.au
In reply to: Hiroshi Inoue (#14)
Re: [HACKERS] Re: possible row locking bug in 7.0.3 & 7.1

At 13:16 30/03/01 +0900, Hiroshi Inoue wrote:

Philip Warner wrote:

At 19:14 29/03/01 -0800, Mikheev, Vadim wrote:

Reported problem is caused by bug (only one tuple version must be
returned by SELECT) and this is way to fix it.

I assume this is not possible in 7.1?

Just looked in heapam.c - I can fix it in two hours.
The question is - should we do this now?
Comments?

It's a bug; how confident are you of the fix?

I doubt if it's a bug of SELECT.

No idea where the bug is, but SELECT should never return two versions of
the *same* row.

'Well what
'concurrent UPDATE then SELECT FOR UPDATE +
SELECT' return ?

No idea, maybe Vadim or Tom can help?

----------------------------------------------------------------
Philip Warner | __---_____
Albatross Consulting Pty. Ltd. |----/ - \
(A.B.N. 75 008 659 498) | /(@) ______---_
Tel: (+61) 0500 83 82 81 | _________ \
Fax: (+61) 0500 83 82 82 | ___________ |
Http://www.rhyme.com.au | / \|
| --________--
PGP key available upon request, | /
and from pgp5.ai.mit.edu:11371 |/

#17Tom Lane
tgl@sss.pgh.pa.us
In reply to: Hiroshi Inoue (#14)
Re: [HACKERS] Re: possible row locking bug in 7.0.3 & 7.1

Just looked in heapam.c - I can fix it in two hours.
The question is - should we do this now?

This scares the hell out of me.

I do NOT think we should be making quick-hack changes in fundamental
system semantics at this point of the release cycle.

The problem went unnoticed for two full release cycles --- therefore,
it can wait another cycle for a fix that has been considered, reviewed,
and tested. Let's not risk making things worse by releasing a new
behavior we might find out is also wrong.

regards, tom lane

#18Hiroshi Inoue
Inoue@tpf.co.jp
In reply to: Mikheev, Vadim (#15)
Re: [HACKERS] Re: possible row locking bug in 7.0.3 & 7.1

"Mikheev, Vadim" wrote:

I assume this is not possible in 7.1?

Just looked in heapam.c - I can fix it in two hours.
The question is - should we do this now?
Comments?

It's a bug; how confident are you of the fix?

95% -:)

I doubt if it's a bug of SELECT. Well what
'concurrent UPDATE then SELECT FOR UPDATE +
SELECT' return ?

I'm going to add additional check to heapgettup and
heap_fetch:

SELECT seems to be able to return a different result
from that of preceding SELECT FOR UPDATE even after
applying your change.
SELECT doesn't seem guilty but the result is far
from intuitive.
It seems impossoble for all queires inside such
a function to use a common snapshot.

regards,
Hiroshi Inoue

#19Zeugswetter Andreas SB
ZeugswetterA@wien.spardat.at
In reply to: Hiroshi Inoue (#18)
AW: Re: [SQL] possible row locking bug in 7.0.3 & 7.1

I doubt if it's a bug of SELECT. Well what
'concurrent UPDATE then SELECT FOR UPDATE +
SELECT' return ?

I'm going to add additional check to heapgettup and
heap_fetch:

SELECT seems to be able to return a different result
from that of preceding SELECT FOR UPDATE even after
applying your change.

Only if you left this cursor position without doing an actual update
(i.e. after fetch next). The select for update is only supposed to guard
the current cursor position. Once you leave without modification
another session can be allowed to update.
This is how it is supposed to react in read committed
mode. If you don't like this you need repeatable read.

The example given is of questionable value, since a select for update
without a cursor in read committed mode does not need to behave any different
than a simple select without for update.

SELECT doesn't seem guilty but the result is far
from intuitive.

It is intuitive. The bug was iirc, that you saw 2 versions of the same row
in the second select statement (= 2 rows returned by second select).
Vadim's patch will let you see only the newer row.

It seems impossoble for all queires inside such
a function to use a common snapshot.

In read committed they are not required to !

It looks like a lot of people on the list are absolute fans
of repeatable read isolation :-) Not me, I know a lot of applications
where committed read, or even read uncommitted makes a lot more
sense.

Andreas

#20Tom Lane
tgl@sss.pgh.pa.us
In reply to: Zeugswetter Andreas SB (#19)
Re: AW: Re: [SQL] possible row locking bug in 7.0.3 & 7.1

Zeugswetter Andreas SB <ZeugswetterA@Wien.Spardat.at> writes:

It is intuitive. The bug was iirc, that you saw 2 versions of the same row
in the second select statement (= 2 rows returned by second select).

I think we should be extremely wary of assuming that we have a clear
characterization of "what the bug is", let alone "how to fix it".
The real issue here is that SELECT has different MVCC visibility rules
from UPDATE and SELECT FOR UPDATE. I suspect that that *must* be so
in any mode that allows more concurrency than full serializable mode.
Thus, the question we are really facing is how we might alter the
visibility rules in a way that will make the results more intuitive
and/or useful while still allowing concurrency.

This will take thought, research and discussion. A quick fix is the
last thing that should be on our minds.

A first question: where did the MVCC rules come from originally, anyway?
Is there any academic research to look at?

regards, tom lane

#21Zeugswetter Andreas SB
ZeugswetterA@wien.spardat.at
In reply to: Tom Lane (#20)
AW: Re: [SQL] possible row locking bug in 7.0.3 & 7.1

If I remember correctly, UPDATE establishes a lock on the affected rows,
which will block another UPDATE on the same rows for the duration of the
transaction. If that's true, shouldn't I be able to achieve my desired
behavior by removing the initial as follows:

create function nextid( varchar(32)) returns int8 as '
update idseq set id = id + 1 where name = $1::text;
select id from idseq where name = $1::text;
' language 'sql';

Yes, better, but be sure, to only use this function from inside a transaction.
If you use it in autocommit mode (no begin work) you might in theory read a row,
that another session modified between the two lines.

Or, would I still have to add FOR UPDATE to that final SELECT?

Now, this certainly looks very funny. You actually get reasonable results only
if you do include the "for update" with RC1 sources .

To the rest on the list:
Try the above example by adding a lock between the two lines:

create function nextid( varchar(32)) returns int8 as '
update idseq set id = id + 1 where name = $1::text;
select * from lock1;
select id from idseq where name = $1::text for update;
' language 'sql';

session1:
begin work;
lock table lock1 in access exclusive mode;
session 2:
not in txn: select nextid('one'); // this blocks
select nextid('one');
commit work;

And stare at the results you get with and without for update :-(
Something is definitely fishy with the visibility of SELECT here.

Andreas

#22Zeugswetter Andreas SB
ZeugswetterA@wien.spardat.at
In reply to: Zeugswetter Andreas SB (#21)
AW: Re: [SQL] possible row locking bug in 7.0.3 & 7.1

To the rest on the list:
Try the above example by adding a lock between the two lines:

create function nextid( varchar(32)) returns int8 as '
update idseq set id = id + 1 where name = $1::text;
select * from lock1;
select id from idseq where name = $1::text for update;
' language 'sql';

session1:
begin work;
lock table lock1 in access exclusive mode;
session 2:
not in txn: select nextid('one'); // this blocks
select nextid('one');
commit work;

And stare at the results you get with and without for update :-(
Something is definitely fishy with the visibility of SELECT here.

Without "for update" I see a tuple in session2 from before session1 began.
After both complete, the net result is correct (id is incremented by 2).
This is very interesting, unfortunately I must leave Internet access until
monday since my daughter called me home, and mail is so dead slow,
that I did not even receive my last mails yet :-(

Andreas

#23Mikheev, Vadim
vmikheev@SECTORBASE.COM
In reply to: Zeugswetter Andreas SB (#22)
RE: [HACKERS] Re: possible row locking bug in 7.0.3 & 7.1

I doubt if it's a bug of SELECT. Well what
'concurrent UPDATE then SELECT FOR UPDATE +
SELECT' return ?

I'm going to add additional check to heapgettup and
heap_fetch:

SELECT seems to be able to return a different result
from that of preceding SELECT FOR UPDATE even after
applying your change.

Oh, you're right. Well, if we really want that SELECT
returns the same result as SELECT FOR UPDATE *in functions*
(out of functions results are already same) then we have to
add some modifications to fix proposed:

1. If newer version of visible tuple T is marked for update by
us *before* query began then do not return T.

2. If tuple T1 is *not visible* because of it was inserted by
concurrent committed TX then check if it's marked for update
by current TX *before* query began and return *this* tuple
version if yes.

This will be in accordance with standard which requires
us return committed (whenever) rows in READ COMMITTED mode.
In fact, in this mode our SELECTs provide higher isolation
than required by standard returning rows committed *before*
query began. Why? Because of SELECT doesn't lock rows and
the same row may be visited by SELECT in join queries
many times - so we have to be protected against concurrent
updates. SELECT FOR UPDATE protects us BUT if query itself
calls some functions which updates queried table then currently
we may lose information that tuple was marked for update before
query began - so updating tuple inserted by concurrent committed
TX and marked for update by us we would have to save its t_cmax
in t_cmin (and either add new flag to t_infomask or don't turn
OFF HEAP_MARKED_FOR_UPDATE in this case).

This is not what I would like to do in 7.1

SELECT doesn't seem guilty but the result is far
from intuitive.

I think that SELECT is guilty. At least returning two versions
of the same row! (One that could be fixed easy).

Vadim

#24Forest Wilkinson
fspam@home.com
In reply to: Tom Lane (#17)
Re: [HACKERS] Re: possible row locking bug in 7.0.3 & 7.1

On Thursday 29 March 2001 22:15, Tom Lane wrote:

Just looked in heapam.c - I can fix it in two hours.
The question is - should we do this now?

This scares the hell out of me.

I do NOT think we should be making quick-hack changes in fundamental
system semantics at this point of the release cycle.

Although I'm the one who is being bit by this bug, I tend to agree.

The problem went unnoticed for two full release cycles

I first reported the problem on 25 September 2000, on the pgsql-sql list,
message subject "SQL functions not locking properly?" I was using 7.0.2 at
the time. Also, I seem to remember that a problem of this nature bit me in
6.5.x as well.

it can wait another cycle for a fix that has been considered, reviewed,
and tested. Let's not risk making things worse by releasing a new
behavior we might find out is also wrong.

Good point. How long is the next cycle likely to take?

Forest

#25Tom Lane
tgl@sss.pgh.pa.us
In reply to: Forest Wilkinson (#24)
Re: [HACKERS] Re: possible row locking bug in 7.0.3 & 7.1

Forest Wilkinson <fspam@home.com> writes:

Good point. How long is the next cycle likely to take?

Good question. I'd like to say 4 to 6 months, but that was how long 7.1
was supposed to take, and we're pushing a year now.

What might make the most sense is to develop and test a fix in the early
part of the 7.2 development cycle, and then back-patch it into a 7.1.x
release perhaps 2 or 3 months from now.

regards, tom lane

#26Forest Wilkinson
fspam@home.com
In reply to: Philip Warner (#13)
Re: [HACKERS] Re: possible row locking bug in 7.0.3 & 7.1

I see postgres 7.1.1 is out now. Was the fix for this problem included in
the new release?

Show quoted text

On Thursday 29 March 2001 20:02, Philip Warner wrote:

At 19:14 29/03/01 -0800, Mikheev, Vadim wrote:

Reported problem is caused by bug (only one tuple version must be
returned by SELECT) and this is way to fix it.

I assume this is not possible in 7.1?

Just looked in heapam.c - I can fix it in two hours.
The question is - should we do this now?
Comments?

It's a bug; how confident are you of the fix?

#27Mikheev, Vadim
vmikheev@SECTORBASE.COM
In reply to: Forest Wilkinson (#26)
RE: [HACKERS] Re: possible row locking bug in 7.0.3 & 7.1

I see postgres 7.1.1 is out now. Was the fix for this
problem included in the new release?

I fear it will be in 7.2 only.

Show quoted text

On Thursday 29 March 2001 20:02, Philip Warner wrote:

At 19:14 29/03/01 -0800, Mikheev, Vadim wrote:

Reported problem is caused by bug (only one tuple

version must be

returned by SELECT) and this is way to fix it.

I assume this is not possible in 7.1?

Just looked in heapam.c - I can fix it in two hours.
The question is - should we do this now?
Comments?

It's a bug; how confident are you of the fix?

---------------------------(end of
broadcast)---------------------------
TIP 4: Don't 'kill -9' the postmaster