RE: Update is not atomic

Started by Mikheev, Vadim over 24 years ago · 8 messages
#1Mikheev, Vadim
vmikheev@SECTORBASE.COM

The problem can be demonstrated by the following example:

create table a (a numeric primary key);
insert into a values (1);
insert into a values (2);
insert into a values (3);
insert into a values (4);
update a set a=a+1 where a>2;
ERROR: Cannot insert a duplicate key into unique index a_pkey

We use a unique index for UK/PK enforcement but shouldn't. Jan?
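For illustration, the failure mode is one of check timing: each single-row update is checked against the index immediately, so the statement trips over its own intermediate state. A minimal Python model (names like `update_immediate` are hypothetical, this is not PostgreSQL code):

```python
def update_immediate(keys, pred, f):
    """Check uniqueness after every single-row update, as a
    unique index effectively does today."""
    keys = list(keys)
    for i, k in enumerate(keys):
        if pred(k):
            keys[i] = f(k)
            # immediate check: the not-yet-updated '4' is still present
            if len(set(keys)) != len(keys):
                raise ValueError("Cannot insert a duplicate key")
    return keys

def update_deferred(keys, pred, f):
    """Apply the whole statement first, then check uniqueness once."""
    keys = [f(k) if pred(k) else k for k in keys]
    if len(set(keys)) != len(keys):
        raise ValueError("Cannot insert a duplicate key")
    return keys

rows = [1, 2, 3, 4]
# update a set a=a+1 where a>2:
try:
    update_immediate(rows, lambda a: a > 2, lambda a: a + 1)
except ValueError as e:
    print("immediate:", e)   # updating 3 -> 4 collides with the old 4
print("deferred:", update_deferred(rows, lambda a: a > 2, lambda a: a + 1))
```

The final state [1, 2, 4, 5] is perfectly unique; only the row-at-a-time check makes the statement fail.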

Vadim

#2Mikheev, Vadim
vmikheev@SECTORBASE.COM
In reply to: Mikheev, Vadim (#1)
RE: [BUGS] Update is not atomic

>> update a set a=a+1 where a>2;
>> ERROR: Cannot insert a duplicate key into unique index a_pkey
>
> This is a known problem with unique constraints, but it's not
> easy to fix.

Yes, it requires dirty reads.

Vadim

#3Jan Wieck
JanWieck@Yahoo.com
In reply to: Mikheev, Vadim (#1)
Re: [HACKERS] RE: Update is not atomic

Mikheev, Vadim wrote:

> The problem can be demonstrated by the following example:
>
> create table a (a numeric primary key);
> insert into a values (1);
> insert into a values (2);
> insert into a values (3);
> insert into a values (4);
> update a set a=a+1 where a>2;
> ERROR: Cannot insert a duplicate key into unique index a_pkey
>
> We use a unique index for UK/PK enforcement but shouldn't. Jan?

What else can you use other than an index? A "deferred until
statement end" trigger checking for duplicates? I think it'd
have a really bad performance impact.

Whatever the execution order might be, the update of '3' to
'4' will see the other '4' as existent WRT the scan commandId
and given snapshot - right? If, at the point where we now fire
up the ERROR, we instead add the key, the index and the heap to
a list of "possible dupkeys" to be checked at the end of the
actual command, the above would work. The check at statement
end would have to increment the command counter and, for each
entry, do an index scan with the key, counting the number of
found, valid heap tuples.
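That idea can be sketched in miniature, modelling the table as a Python list of keys (an illustration of the proposed control flow, not the real executor; `possible_dupkeys` follows the text above, the rest of the names are made up):

```python
def run_update(table, pred, f):
    """Apply an UPDATE row by row; instead of raising on a suspected
    duplicate, remember the key and recheck at statement end."""
    possible_dupkeys = []
    for i, k in enumerate(table):
        if pred(k):
            new = f(k)
            if new in table:              # where the ERROR fires today
                possible_dupkeys.append(new)
            table[i] = new
    # statement end: recheck each suspect against the final state
    # (stands in for the command-counter bump + index scan)
    for key in possible_dupkeys:
        if table.count(key) > 1:
            raise ValueError(f"duplicate key {key}")
    return table

print(run_update([1, 2, 3, 4], lambda a: a > 2, lambda a: a + 1))
```

A genuine duplicate, e.g. updating [1, 2, 3, 3], still gets caught by the statement-end recheck; only the false alarm from transient overlap disappears.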

Well, with some million rows, doing a "set a = a + 1" could
run out of memory. So this would be something that'd work in
the sandbox and for non-broken applications (tm). Maybe at
some level (when we escalate the lock to a full table lock?)
we simply forget about single keys and instead have a new index
access function that checks the entire index for uniqueness.

Jan

--

#======================================================================#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me. #
#================================================== JanWieck@Yahoo.com #

_________________________________________________________
Do You Yahoo!?
Get your free @yahoo.com address at http://mail.yahoo.com

#4Mikheev, Vadim
vmikheev@SECTORBASE.COM
In reply to: Jan Wieck (#3)
RE: RE: [BUGS] Update is not atomic

>> update a set a=a+1 where a>2;
>> ERROR: Cannot insert a duplicate key into unique index a_pkey
>>
>> We use a unique index for UK/PK enforcement but shouldn't. Jan?
>
> What else can you use other than an index? A "deferred until
> statement end" trigger checking for duplicates? I think it'd
> have a really bad performance impact.

AFAIR, the standard requires "deferred" (until statement/transaction(?)
end) as the default behaviour for RI (all?) constraints. But no matter
what the default is, "deferred" *must* be available => unique indices
must not be used.

> Whatever the execution order might be, the update of '3' to
> '4' will see the other '4' as existent WRT the scan commandId
> and given snapshot - right? If, at the point where we now fire
> up the ERROR, we instead add the key, the index and the heap to
> a list of "possible dupkeys" to be checked at the end of the
> actual command, the above would work. The check at statement
> end would have to increment the command counter and, for each
> entry, do an index scan with the key, counting the number of
> found, valid heap tuples.

Incrementing the command counter is not enough - dirty reads are
required to handle concurrent PK updates.

> Well, with some million rows, doing a "set a = a + 1" could
> run out of memory. So this would be something that'd work in
> the sandbox and for non-broken applications (tm). Maybe at

How is this different from the (deferred) updates of a million FKs
we allow right now? Let the user decide what behaviour
(deferred/immediate) he needs. The point is that right now the user
has no ability to choose what's right for him.

> some level (when we escalate the lock to a full table lock?)
> we simply forget about single keys and instead have a new index
> access function that checks the entire index for uniqueness.

I wouldn't bother to implement this. The user always has the ability
to exclusively lock the table, drop the constraints, update whatever
he wants and recreate the constraints again.

Vadim

#5Jan Wieck
JanWieck@Yahoo.com
In reply to: Mikheev, Vadim (#4)
Re: RE: [BUGS] Update is not atomic

Mikheev, Vadim wrote:

>>> update a set a=a+1 where a>2;
>>> ERROR: Cannot insert a duplicate key into unique index a_pkey
>>>
>>> We use a unique index for UK/PK enforcement but shouldn't. Jan?
>>
>> What else can you use other than an index? A "deferred until
>> statement end" trigger checking for duplicates? I think it'd
>> have a really bad performance impact.
>
> AFAIR, the standard requires "deferred" (until statement/transaction(?)
> end) as the default behaviour for RI (all?) constraints. But no matter
> what the default is, "deferred" *must* be available => unique indices
> must not be used.

Right.

>> Whatever the execution order might be, the update of '3' to
>> '4' will see the other '4' as existent WRT the scan commandId
>> and given snapshot - right? If, at the point where we now fire
>> up the ERROR, we instead add the key, the index and the heap to
>> a list of "possible dupkeys" to be checked at the end of the
>> actual command, the above would work. The check at statement
>> end would have to increment the command counter and, for each
>> entry, do an index scan with the key, counting the number of
>> found, valid heap tuples.
>
> Incrementing the command counter is not enough - dirty reads are
> required to handle concurrent PK updates.

What's that with you and dirty reads? Every so often you tell
me that something would require them - you really like to
read dirty things - no? :-)

So let me get it straight: I execute the entire UPDATE SET
A=A+1, then increment the command counter and don't see my
own results? So an index scan with heap tuple check will
return OLD (+NEW?) rows? Last time I fiddled around with
Postgres it didn't, but I could be wrong.

>> Well, with some million rows, doing a "set a = a + 1" could
>> run out of memory. So this would be something that'd work in
>> the sandbox and for non-broken applications (tm). Maybe at
>
> How is this different from the (deferred) updates of a million FKs
> we allow right now? Let the user decide what behaviour
> (deferred/immediate) he needs. The point is that right now the user
> has no ability to choose what's right for him.

It isn't, and I could live with that. I just wanted to point it
out before we implement it and get complaints.

>> some level (when we escalate the lock to a full table lock?)
>> we simply forget about single keys and instead have a new index
>> access function that checks the entire index for uniqueness.
>
> I wouldn't bother to implement this. The user always has the ability
> to exclusively lock the table, drop the constraints, update whatever
> he wants and recreate the constraints again.

It'd be easy to implement for btree. Just do an entire index
scan - it returns every index entry in sort order. Check if the
heap tuple is alive, and if the key is equal to the previous
one found alive, abort with a dupkey error. Well, not really
super performant, but we were about to run out of memory, so
it's not a performance question any more, it's a question of
survival.
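The scan Jan describes uses constant memory: walk the index in key order and complain only if two adjacent *live* entries share a key. A Python model of that loop (illustrative only; the real thing would live in the btree access method, in C):

```python
def check_index_unique(entries):
    """entries: iterable of (key, alive) pairs in btree (sorted) order.
    Raises on two adjacent live entries with equal keys."""
    prev_live_key = None
    for key, alive in entries:
        if not alive:
            continue                      # dead heap tuple: ignore
        if key == prev_live_key:
            raise ValueError(f"duplicate key {key}")
        prev_live_key = key

# after "update a set a=a+1 where a>2", the old versions of the
# updated rows are dead, so the surviving entries are unique:
check_index_unique([(1, True), (2, True), (3, False),
                    (4, False), (4, True), (5, True)])
```

Only one previous key is retained, which is why this works even when the per-key "possible dupkeys" list would have exhausted memory.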

Jan


#6Mikheev, Vadim
vmikheev@SECTORBASE.COM
In reply to: Jan Wieck (#5)
RE: RE: [BUGS] Update is not atomic

>> Incrementing the command counter is not enough - dirty reads are
>> required to handle concurrent PK updates.
>
> What's that with you and dirty reads? Every so often you tell
> me that something would require them - you really like to
> read dirty things - no? :-)

Dirty things occur - I like to handle them -:)
All the MVCC stuff is just the ability to handle dirties, unlike
the old, locking behaviour, where a transaction closed the doors
to a table while doing its dirty things. "Welcome to the open
world, but be ready to handle dirty things" -:)

> So let me get it straight: I execute the entire UPDATE SET
> A=A+1, then increment the command counter and don't see my
> own results? So an index scan with heap tuple check will
> return OLD (+NEW?) rows? Last time I fiddled around with
> Postgres it didn't, but I could be wrong.

How are you going to see concurrent PK updates without dirty reads?
If two transactions inserted the same PK and perform the duplicate
check at the same time - how will they see the duplicates if neither
has committed yet? Look - there is a very good example of using dirty
reads in the current system: the unique indices from which we started
this thread. So, how does a unique btree handle concurrent (and its
own!) duplicates? The btree calls heap_fetch with SnapshotDirty to see
valid and *going to be valid* tuples with a duplicate key. If VALID
--> ABORT; if UNCOMMITTED (going to be valid) --> wait for the
concurrent transaction to commit/abort (note that for obvious reasons
heap_fetch(SnapshotDirty) doesn't return OLD rows modified by the
current transaction). I had to add all this SnapshotDirty stuff just
to get unique btrees working with MVCC. All I propose now is to add
the ability to perform dirty scans to SPI (and so to PL/*), to be
able to make the right decisions in SPI functions and triggers, and
to make those decisions *at the right time*, unlike the unique btree,
which makes its decision too soon. Is it clear now how to use dirty
reads for PK *and* FK?
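The decision procedure sketched above can be modelled roughly like this (the state names and the `wait_for_commit` callback are illustrative stand-ins for the real C-level machinery around heap_fetch(SnapshotDirty), not its actual API):

```python
from enum import Enum

class TupleState(Enum):
    VALID = 1        # committed and visible: a definite duplicate
    UNCOMMITTED = 2  # inserted by a still-running transaction
    DEAD = 3         # deleted or aborted: no conflict

def resolve_duplicate(state, wait_for_commit):
    """Decide what to do about a conflicting tuple seen via a dirty read.
    wait_for_commit() blocks until the other transaction ends and
    returns True if it committed."""
    if state is TupleState.VALID:
        raise ValueError("Cannot insert a duplicate key")
    if state is TupleState.UNCOMMITTED:
        # the tuple is *going to be valid*; wait and re-decide
        if wait_for_commit():
            raise ValueError("Cannot insert a duplicate key")
        # the other transaction aborted: the tuple never became valid
    # DEAD tuple: insertion may proceed
```

The point of the dirty snapshot is the UNCOMMITTED branch: with an ordinary MVCC snapshot, the concurrent inserter's tuple would simply be invisible and both transactions would "succeed".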

You proposed using shared *row* locks for FK before. I objected then
and object now. It will not work for PK, because PK rows "do not
exist" for concurrent transactions. What would work here are *key*
locks (locks placed on some key in a table, no matter whether a row
with this key exists or not). This is what good locking systems, like
Informix, use. But PG is not a locking system, and there is no reason
to add key-lock overhead, because PG internals are able to handle
dirties and we just need to add the same abilities to the externals.

Vadim

#7Jan Wieck
JanWieck@Yahoo.com
In reply to: Mikheev, Vadim (#6)
Re: RE: [BUGS] Update is not atomic

Mikheev, Vadim wrote:

> Dirty things occur - I like to handle them -:)

Got it now - you're right. Thanks for your patience.

Jan


#8Hannu Krosing
hannu@tm.ee
In reply to: Jan Wieck (#7)
Re: RE: [BUGS] Update is not atomic

Jan Wieck wrote:

> Mikheev, Vadim wrote:
>
>> Dirty things occur - I like to handle them -:)
>
> Got it now - you're right. Thanks for your patience.

But can we count on you (or Vadim) to actually fix this anytime soon?

-------------
Hannu