Savepoints...

Started by Vadim Mikheev over 26 years ago, 10 messages
#1Vadim Mikheev
vadim@krs.ru

To have them I need to add tuple id (6 bytes) to heap tuple
header. Are there objections? Though it's not good to increase
tuple header size, subj is, imho, very nice feature...

Implementation is, hm, "easy":

- heap_insert/heap_delete/heap_replace/heap_mark4update will
remember updated tid (and current command id) in relation cache
and store previously updated tid (remembered in relation cache)
in additional heap header tid;
- lmgr will remember command id when lock was acquired;
- for a savepoint we will just store command id when
the savepoint was set;
- when going to sleep due to a concurrent update of the same row,
backend will store MyProc and tuple id in shmem hash table.

When rolling back to a savepoint, backend will:

- release locks acquired after savepoint;
- for a relation updated after savepoint, get last updated tid
from relation cache, walk through relation, set
HEAP_XMIN_INVALID/HEAP_XMAX_INVALID in all tuples updated
after savepoint and wake up concurrent writers blocked
on these tuples (using shmem hash table mentioned above).
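
A minimal sketch of the rollback walk described above, assuming a toy in-memory heap. The names `HeapTuple`, `ToyRelation`, `prev_tid` and `waiters` are hypothetical stand-ins for the extra header tid, the relation cache entry and the shmem hash table; this is not backend code, and it only deletes tuples that the transaction has not touched before.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass
class HeapTuple:
    kind: str = "committed"          # "committed", "insert" or "delete"
    cid: int = 0                     # command id of the last change
    prev_tid: Optional[int] = None   # stand-in for the extra header tid
    xmin_invalid: bool = False
    xmax_invalid: bool = False


class ToyRelation:
    def __init__(self) -> None:
        self.heap: List[HeapTuple] = []
        self.last_tid: Optional[int] = None      # "remembered in relation cache"
        self.waiters: Dict[int, List[str]] = {}  # tid -> blocked backends

    def load_committed(self) -> int:
        """Add a tuple that existed before the transaction started."""
        self.heap.append(HeapTuple())
        return len(self.heap) - 1

    def insert(self, cid: int) -> int:
        # Link the new tuple to the previously changed one.
        self.heap.append(HeapTuple("insert", cid, self.last_tid))
        self.last_tid = len(self.heap) - 1
        return self.last_tid

    def delete(self, tid: int, cid: int) -> None:
        tup = self.heap[tid]
        if tup.kind != "committed":
            raise ValueError("sketch only deletes tuples untouched by this transaction")
        tup.kind, tup.cid, tup.prev_tid = "delete", cid, self.last_tid
        self.last_tid = tid

    def rollback_to(self, savepoint_cid: int) -> Tuple[List[int], List[str]]:
        """Undo every change made by a command later than the savepoint."""
        undone: List[int] = []
        woken: List[str] = []
        tid = self.last_tid
        while tid is not None and self.heap[tid].cid > savepoint_cid:
            tup = self.heap[tid]
            if tup.kind == "insert":
                tup.xmin_invalid = True      # the insert never happened
            else:
                tup.xmax_invalid = True      # the delete never happened
            woken.extend(self.waiters.pop(tid, []))
            undone.append(tid)
            tid = tup.prev_tid
        self.last_tid = tid
        return undone, woken
```

The walk stops at the first tuple whose command id is not later than the savepoint, so the cost depends on how much was changed after the savepoint, not on the size of the relation.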

The last feature (waking up concurrent writers) is the hardest
part to implement. AFAIK, Oracle 7.3 was not able to do it.
Can someone comment on whether this feature is implemented in
Oracle 8.X or other DBMSes?

Now about implicit savepoints. Backend will place one before
executing each user statement. In the case of failure, transaction
state will be rolled back to the one before execution of the query.
As a side effect, this means that we'll get rid of complaints
about the entire transaction aborting because of a typo
that causes a parser error...
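
A minimal sketch of the implicit-savepoint behaviour, assuming the transaction's effects can be modelled as a plain list. `ToyTransaction` and the two statement functions are hypothetical names used only for illustration.

```python
from typing import Callable, List


class ToyTransaction:
    def __init__(self) -> None:
        self.rows: List[str] = []   # effects of the transaction so far

    def execute(self, statement: Callable[[List[str]], None]) -> str:
        savepoint = len(self.rows)          # implicit savepoint
        try:
            statement(self.rows)
        except Exception as exc:
            del self.rows[savepoint:]       # roll back this statement only
            return f"ERROR: {exc}"
        return "OK"


def insert_one(rows: List[str]) -> None:
    rows.append("one")


def half_done_then_fails(rows: List[str]) -> None:
    rows.append("partial")                  # work done before the failure
    raise ValueError('parse error at or near "blabla"')
```

A failed statement leaves the earlier work of the transaction in place, and the transaction can go on to run further statements.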

Comments?

Vadim

#2Bruce Momjian
maillist@candle.pha.pa.us
In reply to: Vadim Mikheev (#1)
Re: [HACKERS] Savepoints...

To have them I need to add tuple id (6 bytes) to heap tuple
header. Are there objections? Though it's not good to increase
tuple header size, subj is, imho, very nice feature...

Gee, that's a lot of overhead. We would go from 40 bytes ->46 bytes.

How is this different from the tid or oid? Reading your description, I
see there probably isn't another way to do it.

-- 
  Bruce Momjian                        |  http://www.op.net/~candle
  maillist@candle.pha.pa.us            |  (610) 853-3000
  +  If your life is a hard drive,     |  830 Blythe Avenue
  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026
#3Hiroshi Inoue
Inoue@tpf.co.jp
In reply to: Vadim Mikheev (#1)
RE: [HACKERS] Savepoints...

-----Original Message-----
From: owner-pgsql-hackers@postgreSQL.org
[mailto:owner-pgsql-hackers@postgreSQL.org]On Behalf Of Vadim Mikheev
Sent: Wednesday, June 16, 1999 10:13 PM
To: PostgreSQL Developers List
Subject: [HACKERS] Savepoints...

To have them I need to add tuple id (6 bytes) to heap tuple
header. Are there objections? Though it's not good to increase
tuple header size, subj is, imho, very nice feature...

Implementation is, hm, "easy":

- heap_insert/heap_delete/heap_replace/heap_mark4update will
remember updated tid (and current command id) in relation cache
and store previously updated tid (remembered in relation cache)
in additional heap header tid;

- lmgr will remember command id when lock was acquired;

Does this mean that many writing commands in a transaction
require many command id-s to remember ?

Regards.

Hiroshi Inoue
Inoue@tpf.co.jp

#4Vadim Mikheev
vadim@krs.ru
In reply to: Hiroshi Inoue (#3)
Re: [HACKERS] Savepoints...

Hiroshi Inoue wrote:

- lmgr will remember command id when lock was acquired;

Does this mean that many writing commands in a transaction
require many command id-s to remember ?

Did you mean such cases:

begin;
...
update t set...;
...
update t set...;
...
end;

?

We'll remember command id for the first "update t" only
(i.e. for the first ROW EXCLUSIVE mode lock over table t).
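
A minimal sketch of "remember the command id for the first acquisition only", assuming locks are keyed by (table, mode). `ToyLockManager` and its methods are hypothetical names, not lmgr functions.

```python
from typing import Dict, List, Tuple

LockKey = Tuple[str, str]   # (table name, lock mode)


class ToyLockManager:
    def __init__(self) -> None:
        # One entry per lock, no matter how many commands take it again.
        self.first_cid: Dict[LockKey, int] = {}

    def acquire(self, table: str, mode: str, cid: int) -> None:
        # Only the first acquisition records its command id.
        self.first_cid.setdefault((table, mode), cid)

    def release_after(self, savepoint_cid: int) -> List[LockKey]:
        """Release locks first acquired by a command later than the savepoint."""
        released = [key for key, cid in self.first_cid.items()
                    if cid > savepoint_cid]
        for key in released:
            del self.first_cid[key]
        return released
```

With this shape, the memory used grows with the number of distinct locks held, not with the number of writing commands in the transaction.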

Vadim

#5Hiroshi Inoue
Inoue@tpf.co.jp
In reply to: Vadim Mikheev (#4)
RE: [HACKERS] Savepoints...

-----Original Message-----
From: root@sunpine.krs.ru [mailto:root@sunpine.krs.ru]On Behalf Of Vadim
Mikheev
Sent: Thursday, June 17, 1999 12:58 PM
To: Hiroshi Inoue
Cc: PostgreSQL Developers List
Subject: Re: [HACKERS] Savepoints...

Hiroshi Inoue wrote:

- lmgr will remember command id when lock was acquired;

Does this mean that many writing commands in a transaction
require many command id-s to remember ?

Did you mean such cases:

Yes.

begin;
...
update t set...;
...
update t set...;
...
end;

?

We'll remember command id for the first "update t" only
(i.e. for the first ROW EXCLUSIVE mode lock over table t).

How to reduce lock counter for ROW EXCLUSIVE mode lock
over table t?

And more questions.

Could the HEAP_MARKED_FOR_UPDATE state be rolled back?

For example

..
[savepoint 1]
select .. from t1 where key=1 for update;
[savepoint 2]
select .. from t1 where key=1 for update;
[savepoint 3]
update t1 set .. where key=1;

Rollback to savepoint 3 OK ?
Rollback to savepoint 2 OK ?
Rollback to savepoint 1 OK ?

Regards.

Hiroshi Inoue
Inoue@tpf.co.jp

#6Vadim Mikheev
vadim@krs.ru
In reply to: Hiroshi Inoue (#5)
Re: [HACKERS] Savepoints...

Hiroshi Inoue wrote:

We'll remember command id for the first "update t" only
(i.e. for the first ROW EXCLUSIVE mode lock over table t).

How to reduce lock counter for ROW EXCLUSIVE mode lock
over table t?

No reason to do it for a ROW EXCLUSIVE mode lock (backend releases
such locks only on commit/rollback [to savepoint]), but we have to
do it in some other cases - when we explicitly release acquired locks
after a scan/statement is done. And so, you're right: in these cases
we have to track lock acquisitions. Well, we'll add a new arg to
LockAcquire (and other funcs; we have to do it anyway to implement
NO WAIT, WAIT XXX secs locks) to flag lmgr that if the lock counter
is not 0 (for 0s - i.e. first lock acquisition - the command id will be
remembered by lmgr anyway) then this counter must be preserved in
the implicit savepoint. In the case of abort, lock counters will be restored.
Space allocated in the implicit savepoint will be released.
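
A minimal sketch of preserving lock counters in an implicit savepoint, assuming one savepoint per statement. `CountingLockManager` and the `preserve` flag are hypothetical and stand in for the proposed new LockAcquire argument. First acquisitions are tracked in a set here instead of by command id.

```python
from typing import Dict, Set


class CountingLockManager:
    def __init__(self) -> None:
        self.counts: Dict[str, int] = {}
        self.saved: Dict[str, int] = {}   # implicit savepoint: old counters
        self.new_locks: Set[str] = set()  # locks first taken in this statement

    def begin_statement(self) -> None:
        self.saved = {}
        self.new_locks = set()

    def acquire(self, lock: str, preserve: bool = False) -> None:
        current = self.counts.get(lock, 0)
        if current == 0:
            self.new_locks.add(lock)      # first acquisition, nothing to save
        elif preserve and lock not in self.saved and lock not in self.new_locks:
            self.saved[lock] = current    # remember the counter before the statement
        self.counts[lock] = current + 1

    def end_statement(self, failed: bool) -> None:
        if failed:
            for lock in self.new_locks:
                self.counts.pop(lock, None)   # release locks taken by the statement
            self.counts.update(self.saved)    # restore the preserved counters
        self.begin_statement()                # free the savepoint space
```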

All the above will work as long as there is no UNLOCK statement.

Thanks!

And more questions.

Could the HEAP_MARKED_FOR_UPDATE state be rolled back?

Yes. FOR UPDATE changes t_xmax and t_cmax.

Vadim

#7Zeugswetter Andreas IZ5
Andreas.Zeugswetter@telecom.at
In reply to: Vadim Mikheev (#6)
Re: [HACKERS] Savepoints...

Actually, I think a lot of the cases where rollback to savepoint
would be done implicitly could be avoided by adding a fourth
behavior of elog.

This elog, let's e.g. call it elog(WARN,...) would actually behave
like an elog(NOTICE,..) in the backend, but would return ERROR
to the client. I think at least all elogs that happen in the parser
could be handled like this, and a lot of the others.
Of course the client would need an error code, but that is your
2. item anyway :-)

The following example is IMHO not necessary,
with or without savepoints:

regression=> begin work;
BEGIN
regression=> insert into t2 values (1, 'one');
INSERT 151498 1
regression=> blabla;
ERROR: parser: parse error at or near "blabla"
regression=> commit work;  -- actually this is currently a bug,
END                        -- it must ERROR, since only rollback work
                           -- is allowed in txn abort state
regression=> select * from t2;
a|b
-+-
(0 rows)

Andreas

#8Vadim Mikheev
vadim@krs.ru
In reply to: Bruce Momjian (#2)
Re: [HACKERS] Savepoints...

Bruce Momjian wrote:

To have them I need to add tuple id (6 bytes) to heap tuple
header. Are there objections? Though it's not good to increase
tuple header size, subj is, imho, very nice feature...

Gee, that's a lot of overhead. We would go from 40 bytes ->46 bytes.

40? offsetof(HeapTupleHeaderData, t_bits) is 31...

Well, seems that we can remove 5 bytes from tuple header.

1. t_hoff (1 byte) may be computed - no reason to store it.
2. we need both t_cmin and t_cmax only when tuple is updated
by the same xaction as it was inserted - in such cases we
can put delete command id (t_cmax) to t_xmax and set
flag HEAP_XMAX_THE_SAME (as t_xmin), in all other cases
we will overwrite insert command id with delete command id
(no one is interested in t_cmin of committed insert xaction)
-> yet another 4 bytes (sizeof command id).

If now we'll add 6 bytes to header then
offsetof(HeapTupleHeaderData, t_bits) will be 32 and for
no-nulls tuples there will be no difference at all
(with/without additional 6 bytes), due to double alignment
of header. So, the choice is: new feature or more compact
(than current) header for tuples with nulls.
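
The arithmetic above can be checked with a small sketch. The field list and the 8-byte ("double") alignment are assumptions chosen so that the total matches the 31 bytes quoted in this thread; they are not copied from the header file, and the null bitmap handling is simplified.

```python
# Assumed header fields before t_bits: (name, size in bytes).
CURRENT_FIELDS = [
    ("t_oid", 4), ("t_cmin", 4), ("t_cmax", 4), ("t_xmin", 4), ("t_xmax", 4),
    ("t_ctid", 6), ("t_natts", 2), ("t_infomask", 2), ("t_hoff", 1),
]


def offset_of_bits(fields) -> int:
    """Bytes in front of the null bitmap, assuming no padding between fields."""
    return sum(size for _, size in fields)


def header_size(bits_offset: int, bitmap_bytes: int = 0, align: int = 8) -> int:
    """Header length rounded up to the alignment boundary."""
    raw = bits_offset + bitmap_bytes
    return (raw + align - 1) // align * align


current = offset_of_bits(CURRENT_FIELDS)   # header as it is now
compact = current - 1 - 4                  # drop t_hoff and one command id
with_extra_tid = compact + 6               # add the proposed 6-byte tid
```

Under these assumptions a tuple without nulls gets the same aligned header size with or without the extra tid, while a tuple with a one-byte null bitmap crosses the next alignment boundary only when the tid is added.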

How is this different from the tid or oid? Reading your description, I

t_ctid could be used but would require additional disk write.

see there probably isn't another way to do it.

There is one - WAL. I'm thinking about it, but it's too long story -:)

BTW, an additional tid in the header would allow us to implement
RI/U constraints without rules: knowing what tuples were changed
we could just read these tuples and perform checks. This would be
faster and wouldn't require storing deferred rule plans in memory.

I still like the idea of deferred rules, Jan - they allow
implementing much more complex constraints than RI/U ones.
Though, did you think about [deferred] statement level triggers
implementation, Jan? You are the best one who could make it,
because they are children of overwrite system and PL.

Vadim

#9Bruce Momjian
maillist@candle.pha.pa.us
In reply to: Vadim Mikheev (#8)
Re: [HACKERS] Savepoints...

Bruce Momjian wrote:

To have them I need to add tuple id (6 bytes) to heap tuple
header. Are there objections? Though it's not good to increase
tuple header size, subj is, imho, very nice feature...

Gee, that's a lot of overhead. We would go from 40 bytes ->46 bytes.

40? offsetof(HeapTupleHeaderData, t_bits) is 31...

Yes, I saw this. I even updated the FAQ to show a 32-byte overhead.

Well, seems that we can remove 5 bytes from tuple header.

I was hoping you could do something like this.

1. t_hoff (1 byte) may be computed - no reason to store it.

Yes.

2. we need both t_cmin and t_cmax only when tuple is updated
by the same xaction as it was inserted - in such cases we
can put delete command id (t_cmax) to t_xmax and set
flag HEAP_XMAX_THE_SAME (as t_xmin), in all other cases
we will overwrite insert command id with delete command id
(no one is interested in t_cmin of committed insert xaction)
-> yet another 4 bytes (sizeof command id).

Good.

If now we'll add 6 bytes to header then
offsetof(HeapTupleHeaderData, t_bits) will be 32 and for
no-nulls tuples there will be no difference at all
(with/without additional 6 bytes), due to double alignment
of header. So, the choice is: new feature or more compact
(than current) header for tuples with nulls.

That's a tough one. What do other DB's have for row overhead?

How is this different from the tid or oid? Reading your description, I

t_ctid could be used but would require additional disk write.

OK, I understand.

see there probably isn't another way to do it.

There is one - WAL. I'm thinking about it, but it's too long story -:)

OK.

#10Thomas Lockhart
lockhart@alumni.caltech.edu
In reply to: Bruce Momjian (#2)
2 attachment(s)
Info on Data Storage

istm that this discussion and the one on the 1GB limit on table
segments could form the basis for a missing chapter on "Data Storage"
in the Admin Guide. Would someone (other than Vadim, who we need to
keep coding! :) please keep following this and related threads and
extract the info for the Admin Guide chapter? It doesn't need to be
very long, perhaps just suggesting how to calculate table storage
size, discussing upper limits (e.g. 32-bit OID), and describing the
table segmentation scheme. There is already a chapter (with more
detail than the AG needs) in the Developer's Guide which should be
updated too.

Anyway, both chapters are enclosed (the originals are also in
doc/src/sgml/{storage,page}.sgml).
All we really need is the info, and I can do the markup if whoever
picks this up doesn't feel comfortable with trying the SGML markup.

Volunteers appreciated...

- Thomas

To have them I need to add tuple id (6 bytes) to heap tuple
header. Are there objections? Though it's not good to increase
tuple header size, subj is, imho, very nice feature...

Gee, that's a lot of overhead. We would go from 40 bytes ->46 bytes.

40? offsetof(HeapTupleHeaderData, t_bits) is 31...
Well, seems that we can remove 5 bytes from tuple header.
1. t_hoff (1 byte) may be computed - no reason to store it.
2. we need both t_cmin and t_cmax only when tuple is updated
by the same xaction as it was inserted - in such cases we
can put delete command id (t_cmax) to t_xmax and set
flag HEAP_XMAX_THE_SAME (as t_xmin), in all other cases
we will overwrite insert command id with delete command id
(no one is interested in t_cmin of committed insert xaction)
-> yet another 4 bytes (sizeof command id).
If now we'll add 6 bytes to header then
offsetof(HeapTupleHeaderData, t_bits) will be 32 and for
no-nulls tuples there will be no difference at all
(with/without additional 6 bytes), due to double alignment
of header. So, the choice is: new feature or more compact
(than current) header for tuples with nulls.

--
Thomas Lockhart lockhart@alumni.caltech.edu
South Pasadena, California

Attachments:

storage.sgml (text/html; charset=us-ascii)
page.sgml (text/html; charset=us-ascii)