CONSTRAINTS...
So, could someone send me the SQL92 constraints syntax as well as the
definition of what a deferrable constraint is supposed to be?
ADVthanksANCE
-DEJ
"Jackson, DeJuan" <djackson@cpsgroup.com> writes:
So, could someone send me the SQL92 constraints syntax as well as the
definition of what a deferrable constraint is supposed to be?
Maybe someone emailed you a good answer already...
There's a whole chapter on constraints in "A Guide to the SQL
Standard," 4th ed., by Date and Darwen. Constraint syntax includes
CREATE DOMAIN, ALTER DOMAIN, DROP DOMAIN, CREATE ASSERTION, DROP
ASSERTION, CREATE TABLE, ALTER TABLE, FOREIGN KEY, CHECK, and SET
CONSTRAINTS, among others.
A deferrable constraint is one that may be deferred, using INITIALLY
DEFERRED or DEFERRABLE in its definition, or later, using SET
CONSTRAINTS ... DEFERRED.
A constraint is presumably deferred until a) the end of the applicable
transaction, or b) the next COMMIT, or c) the next SET CONSTRAINTS
... IMMEDIATE, whichever comes first. A possible use of deferrable
constraints is to avoid Catch-22 when setting up tables which have
cyclic dependency among foreign keys. [SQL lawyers, check me on this.]
BTW, AltaVista Web search on "sql92 ~ constraint" yielded 8 hits -
7 of which were PostgreSQL items.
From "A Guide to The SQL standard" C.J.DATE:
FOREIGN KEY Syntax:
* base-table-constraint-def
::= [ CONSTRAINT constraint ]
foreign-key-def [ deferrability ]
foreign-key-def
::= FOREIGN KEY ( column-commalist ) references-def
references-def
::= REFERENCES base-table [ ( column-commalist ) ]
[ MATCH { FULL | PARTIAL } ]
[ ON DELETE referential-action ]
[ ON UPDATE referential-action ]
referential-action
::= NO ACTION | CASCADE | SET DEFAULT | SET NULL
deferrability
::= INITIALLY { DEFERRED | IMMEDIATE } [ NOT ] DEFERRABLE
* column-constraint-def
::= references-def [ deferrability ]
14.6 DEFERRED CONSTRAINT CHECKING
Up to this point we have been assuming that all integrity constraints
are checked "immediately," i.e., as the final step in executing any SQL
statement - and, if any constraint is found to be violated, the
offending SQL statement is simply rejected, so that its overall effect
on the database is nil. Sometimes, however, it is necessary that certain
constraints not be checked until some later time, on the grounds that if
they were to be checked "immediately" they would always fail. Here is an
example (involving a referential cycle):
* Suppose we have two base tables, T1 and T2, each of which includes a
foreign key that references some candidate key of the other, and suppose
we start with both tables empty. Then, if all foreign key checking is
done immediately, there is no way to get started: Any attempt to insert
a row into either table will fail, because there is no target row in the
other table that it can possibly reference.
The facilities described immediately following are intended to address
such situations.
1. At any given time, with respect to any given transaction, any given
constraint must be in one of two "modes," immediate or deferred.*
Immediate means the constraint is checked "immediately" (as explained
above); deferred means it is not.
2. Any given constraint definition can optionally include either or both
of the following:
INITIALLY { DEFERRED | IMMEDIATE }
[ NOT ] DEFERRABLE
These specifications appear as the final syntactic component of the
constraint definition. They can appear in either order.
- INITIALLY DEFERRED and NOT DEFERRABLE are mutually exclusive. If
neither INITIALLY DEFERRED nor INITIALLY IMMEDIATE is specified,
INITIALLY IMMEDIATE is implied. If INITIALLY IMMEDIATE is specified or
implied, then if neither DEFERRABLE nor NOT DEFERRABLE is specified, NOT
DEFERRABLE is implied. If INITIALLY DEFERRED is specified, then (as
already explained) NOT DEFERRABLE must not be specified; DEFERRABLE can
be specified, but is implied anyway.
- INITIALLY DEFERRED and INITIALLY IMMEDIATE specify the "initial" mode
of the constraint (i.e., its mode immediately after it is defined and
at the start of every transaction) as deferred or immediate,
respectively.
- DEFERRABLE and NOT DEFERRABLE specify whether or not
this constraint can ever be in deferred mode. DEFERRABLE means it can;
NOT DEFERRABLE means it cannot.
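The defaulting rules in the paragraph above can be condensed into a small
decision function. This is an illustrative sketch only; the function name
and return shape are mine, not from the standard:

```python
# Sketch of the SQL92 defaulting rules for constraint deferrability:
# INITIALLY IMMEDIATE is the default initial mode and implies NOT
# DEFERRABLE unless DEFERRABLE is stated; INITIALLY DEFERRED requires
# (and implies) DEFERRABLE.
def resolve_deferrability(initially=None, deferrable=None):
    """Return (initial_mode, is_deferrable) per the defaulting rules."""
    if initially is None:
        initially = "IMMEDIATE"          # INITIALLY IMMEDIATE is implied
    if initially == "DEFERRED":
        if deferrable is False:
            raise ValueError("INITIALLY DEFERRED conflicts with NOT DEFERRABLE")
        return ("DEFERRED", True)        # DEFERRABLE is implied anyway
    # INITIALLY IMMEDIATE: NOT DEFERRABLE unless stated otherwise
    return ("IMMEDIATE", deferrable if deferrable is not None else False)

print(resolve_deferrability())                    # ('IMMEDIATE', False)
print(resolve_deferrability("DEFERRED"))          # ('DEFERRED', True)
print(resolve_deferrability("IMMEDIATE", True))   # ('IMMEDIATE', True)
```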
3. The SET CONSTRAINTS statement is used to set the mode for specified
constraints with respect to the current transaction and current session
(or the next transaction to be initiated in the current session, if the
SQL-agent has no transaction currently executing). The syntax is:
SET CONSTRAINTS { constraint-commalist | ALL }
{ DEFERRED | IMMEDIATE }
Each "constraint" mentioned by name must be DEFERRABLE; ALL is short-
hand for "all DEFERRABLE constraints." If DEFERRED is specified, the
mode of all indicated constraints is set to deferred. If IMMEDIATE is
specified, the mode of all indicated constraints is set to immediate,
and those constraints are then checked; if any check fails, the SET
CONSTRAINTS fails, and the mode of all indicated constraints remains
unchanged. Note that because of paragraph 4 below, the checks should not
fail if the SET CONSTRAINTS statement is executed while the SQL-agent
has no current transaction.
4. COMMIT implies SET CONSTRAINTS ALL IMMEDIATE (for every active
SQL-session for the applicable SQL-transaction). If some implied
integrity check then fails, the COMMIT fails, and the transaction fails
also (i.e., is rolled back).
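Paragraphs 3 and 4 above can be modeled as a small state machine. This is
a toy construction of mine, not the standard's wording: SET ... IMMEDIATE
runs the checks and leaves every mode untouched when any check fails, and
COMMIT behaves like SET CONSTRAINTS ALL IMMEDIATE:

```python
# Toy model of SET CONSTRAINTS / COMMIT semantics.
class TxnConstraints:
    def __init__(self, checks):
        # checks: name -> zero-argument function returning True if satisfied
        self.checks = checks
        self.mode = {name: "IMMEDIATE" for name in checks}

    def set_constraints(self, names, mode):
        if mode == "DEFERRED":
            for n in names:
                self.mode[n] = "DEFERRED"
            return True
        # IMMEDIATE: run the checks first; fail without changing any mode
        if not all(self.checks[n]() for n in names):
            return False
        for n in names:
            self.mode[n] = "IMMEDIATE"
        return True

    def commit(self):
        # COMMIT implies SET CONSTRAINTS ALL IMMEDIATE
        return self.set_constraints(list(self.checks), "IMMEDIATE")

satisfied = [False]
txn = TxnConstraints({"fk1": lambda: satisfied[0]})
txn.set_constraints(["fk1"], "DEFERRED")
assert txn.commit() is False and txn.mode["fk1"] == "DEFERRED"
satisfied[0] = True                     # fix the data, then commit again
assert txn.commit() is True and txn.mode["fk1"] == "IMMEDIATE"
```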
To revert to the example mentioned at the beginning of this section (the
referential cycle involving two tables): We could deal with the problem
using the foregoing facilities as indicated by the following pseudocode.
Data definitions:
CREATE TABLE T1 CONSTRAINT T1FK FOREIGN KEY ... REFERENCES T2
INITIALLY DEFERRED
CREATE TABLE T2 CONSTRAINT T2FK FOREIGN KEY ... REFERENCES T1
INITIALLY DEFERRED
SQL-transaction:
INSERT INTO T1 ( ... ) VALUES ( ... )
INSERT INTO T2 ( ... ) VALUES ( ... )
SET CONSTRAINTS T1FK, T2FK IMMEDIATE
IF SQLSTATE = code meaning "SET CONSTRAINTS failed"
THEN ROLLBACK --cancel the INSERTs
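As an illustration of why deferral breaks the Catch-22, here is a toy
simulation of the cycle; plain Python lists stand in for T1 and T2, and
a check function stands in for the FK constraints (not real SQL
execution):

```python
# Cyclic foreign keys: with immediate checking the first INSERT fails;
# deferring both checks until SET CONSTRAINTS ... IMMEDIATE lets the
# pair of inserts succeed.
t1, t2 = [], []          # each row: (own_key, foreign_key_into_other_table)

def fk_ok(rows, referenced):
    keys = {r[0] for r in referenced}
    return all(r[1] in keys for r in rows)

def checks_pass():
    return fk_ok(t1, t2) and fk_ok(t2, t1)

# Immediate mode: inserting into T1 alone violates T1's FK into T2.
t1.append((1, 100))
assert not checks_pass()      # would be rejected -> no way to get started

# Deferred mode: perform both inserts, then check once.
t2.append((100, 1))
assert checks_pass()          # SET CONSTRAINTS T1FK, T2FK IMMEDIATE succeeds
```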
Jackson, DeJuan wrote:
So, could someone send me the SQL92 constraints syntax as well as the
definition of what a deferrable constraint is supposed to be?
ADVthanksANCE
-DEJ
-Jose'-
From "A Guide to The SQL standard" C.J.DATE:
FOREIGN KEY Syntax:
[Good description of foreign key constraints - tnx]
Jackson, DeJuan wrote:
So, could someone send me the SQL92 constraints syntax as well as the
definition of what a deferrable constraint is supposed to be?
ADVthanksANCE
-DEJ
-Jose'-
This reminds me on one of my personal TODO's, because it
show's to me that an implementation of constraints using
triggers or the like wouldn't be such a good idea. Especially
the part on deferred constraint checks would mean a lot of
buffering to do the checks at the end.
My idea on constraints was to use the rewrite rule system for
them. I wanted first to implement another optimizable
statement - RAISE. RAISE is mostly the same as a SELECT, but
the result will not be sent to the frontend. Instead it will
produce some formatted elog message(s?).
The syntax of RAISE I have in mind is:
RAISE [ALL | FIRST [n]] expr [, expr ...] FROM ...
Anything after FROM is exactly the same as for a SELECT.
If the first result attribute of RAISE is a (var|bp)char or
text field, single occurrences of % in it will be substituted
by the following attributes. Otherwise all the attrs are
simply concatenated with a padding blank to form the error
message.
ALL or FIRST n means that not only the first error should be
shown. A bit tricky to implement, but I think a bunch of
NOTICE messages and a final "ERROR: 5 errors counted" would be
possible.
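My reading of the message-formatting rules above, as a hypothetical
sketch (RAISE was never implemented, so the function and its behavior
are an interpretation, not an API):

```python
# Turn one RAISE result row into an error message: if the first attribute
# is a text type, each single '%' in it is replaced by the following
# attributes; otherwise all attributes are joined with single blanks.
def format_raise(row, first_is_text):
    if first_is_text:
        msg, args = row[0], list(row[1:])
        out = []
        for ch in msg:
            out.append(str(args.pop(0)) if ch == '%' and args else ch)
        return ''.join(out)
    return ' '.join(str(v) for v in row)

print(format_raise(('Key "%" not in keytab', 42), True))
# -> Key "42" not in keytab
print(format_raise((5, 'errors'), False))
# -> 5 errors
```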
Having this, a foreign key constraint rule could look like
this:
CREATE RULE _CIconstraint_name AS ON INSERT TO mytab DO
RAISE 'Key "%" not in keytab', new.myatt FROM keytab
WHERE NOT EXISTS (SELECT * FROM keytab WHERE keyatt = new.myatt);
Similar rules for update are simple, and an ON DELETE CASCADE
rule isn't that hard either.
For the deferred constraints we now need some more
information on the rules themselves. Currently all queries
thrown in by the rule system are executed prior to the
original query. If we add some syntax to CREATE RULE so we
can tell
CREATE [DEFERRABLE] [INITIALLY DEFERRED] RULE ...
the rule system would be able to collect those queries (they
all would be RAISE statements) to a global querytree list if
they should be deferred. This global list is drained out
(all queries run) when either the transaction commits or the
SET ... IMMEDIATE is executed.
Well, the information to remember isn't a small amount. Per
constraint that is to be deferred, there will be one
querytree. And that for every single INSERT/UPDATE/DELETE.
And if a table has 5 constraints, it will be 5 remembered
querytrees per operation. But the information to remember
doesn't depend on the amount of data affected in the
statement (like it would be in a trigger implementation). So
it will work in a situation like
BEGIN TRANSACTION;
SET CONSTRAINTS ALL DEFERRED;
UPDATE tab1 SET ref1 = ref1 + 1900;
UPDATE tab2 SET key1 = key1 + 1900;
COMMIT TRANSACTION;
even if there are millions of rows in the tables.
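A toy sketch of the bookkeeping Jan describes (the names are invented):
the global list grows by one remembered querytree per deferred
constraint per statement, independent of the number of rows the
statement touched:

```python
# Rule-based deferral scales because memory is per statement+constraint,
# not per affected row.
deferred = []                 # the global deferred querytree list

def run_statement(table, rows_affected, constraints):
    # one remembered querytree per deferred constraint on this statement
    for c in constraints:
        deferred.append((table, c))

run_statement("tab1", rows_affected=1_000_000, constraints=["fk_ref1"])
run_statement("tab2", rows_affected=2_000_000, constraints=["fk_key1"])
assert len(deferred) == 2     # independent of the 3 million rows updated

# At COMMIT (or SET ... IMMEDIATE) the list is drained: each remembered
# query runs exactly once.
while deferred:
    table, constraint = deferred.pop(0)
```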
As Bruce said once I mentioned using the rule system for
constraints: "It's a tempting solution". And I'm glad to have
the work delayed until now because yet the DEFERRED problem
surfaced and could be taken into account too.
Comments? (sure :-)
Jan
--
#======================================================================#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me. #
#======================================== jwieck@debis.com (Jan Wieck) #
For the deferred constraints we now need some more
information on the rules themselves. Currently all queries
thrown in by the rule system are executed prior to the
original query. If we add some syntax to CREATE RULE so we
can tell
CREATE [DEFERRABLE] [INITIALLY DEFERRED] RULE ...
the rule system would be able to collect those queries (they
all would be RAISE statements) to a global querytree list if
they should be deferred. This global list is drained out
(all queries run) when either the transaction commits or the
SET ... IMMEDIATE is executed.
Well, the information to remember isn't a small amount. Per
constraint that is to be deferred, there will be one
querytree. And that for every single INSERT/UPDATE/DELETE.
And if a table has 5 constraints, it will be 5 remembered
querytrees per operation. But the information to remember
doesn't depend on the amount of data affected in the
statement (like it would be in a trigger implementation). So
it will work in a situation like
Let's look at it another way. If we didn't use the query rewrite
system, what method could we use for foreign keys/constraints that would
function better than this?
As far as I remember, triggers are C functions? We can't generate these
on the fly inside the backend. (Though compiling C code from the
backend and dynamically linking it into the engine is way too cool.)
Could we generate generic triggers that would handle most/all
situations? I don't know. Even if we can, would they be much faster
than the needed queries themselves? Seems triggers work on single
tables. How do we span tables? If it is going to launch queries from
the trigger, we should use the rewrite system. It is better suited to
this, with predigested queries and queries that flow through the
executor in step with the user queries!
Maybe let's go with the rewrite system, because it works, and is
flexible and strangely designed for this type of problem. Similar to
how we use the rewrite system for views.
I am basically asking for a reason _not_ to use the rewrite system for
this. I can't think of one myself.
--
Bruce Momjian | http://www.op.net/~candle
maillist@candle.pha.pa.us | (610) 853-3000
+ If your life is a hard drive, | 830 Blythe Avenue
+ Christ can be your backup. | Drexel Hill, Pennsylvania 19026
Let's look at it another way. If we didn't use the query rewrite
system, what method could we use for foreign keys/constraints that would
function better than this?
As far as I remember, triggers are C functions? We can't generate these
on the fly inside the backend. (Though compiling C code from the
backend and dynamically linking it into the engine is way too cool.)
Could we generate generic triggers that would handle most/all
situations? I don't know. Even if we can, would they be much faster
than the needed queries themselves? Seems triggers work on single
tables. How do we span tables? If it is going to launch queries from
the trigger, we should use the rewrite system. It is better suited to
this, with predigested queries and queries that flow through the
executor in step with the user queries!
Generic triggers in C that are argument driven would be
possible. But the drawback is that those triggers have to be
very smart to use saved SPI plans (one for every different
argument set). And it must be row level triggers, so for an
update to a 2 meg row table they will be fired 2 million
times and run their queries inside - will take some time.
More painful in the 2 meg row situation is that trigger
invocation has to be delayed until COMMIT if the constraint
is deferred. I think we cannot remember 2 million OLD plus 2
million NEW tuples if one tuple can have up to 8K (will be
32GB to remember plus overhead), so we need to remember at
least the CTIDs of OLD and NEW and refetch them for the
trigger invocation. OUCH - the OLD ones are at the head and
all the NEW ones are at the end of the table's file!
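The back-of-the-envelope arithmetic above checks out, assuming the
worst case of 8K bytes per tuple:

```python
# 2 million OLD tuples + 2 million NEW tuples at up to 8K each.
rows = 2_000_000                      # rows touched by the UPDATE
tuple_size = 8 * 1024                 # worst-case bytes per tuple (8K)
total = (rows + rows) * tuple_size    # OLD tuples + NEW tuples
assert total == 32_768_000_000        # ~32 GB to remember, before overhead
```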
Maybe let's go with the rewrite system, because it works, and is
flexible and strangely designed for this type of problem. Similar to
how we use the rewrite system for views.
And the other changes I've planned for the rewrite system
will improve this much more.
1. Change pg_rewrite.ev_attr into an int28. This would be
useful for ON UPDATE rules so the rewrite system can
easily check if a rule has to be applied or not. If none
of the named attributes is assigned something different
from its own OLD value, they aren't updated, so the rule
could never result in an action and can be omitted
completely.
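A hypothetical sketch of that check (the representation of assignments
is mine, not pg_rewrite's): an ON UPDATE rule tied to specific
attributes can be skipped when none of those attributes is assigned
anything other than its OLD value:

```python
# Skip a rule whose named attributes are all assigned their OLD values.
def rule_applies(rule_attrs, assignments):
    """assignments: attr -> expression; ('OLD', attr) means unchanged."""
    return any(assignments.get(a, ('OLD', a)) != ('OLD', a)
               for a in rule_attrs)

# UPDATE t SET b = b + 1  -- 'a' is untouched, so a rule on 'a' is omitted
assignments = {'b': ('EXPR', 'b + 1'), 'a': ('OLD', 'a')}
assert rule_applies(['b'], assignments)
assert not rule_applies(['a'], assignments)
```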
2. Create a cross reference catalog that lists all relations
used in a rule (rangetable). If we have a DELETE CASCADE
constraint, the rule is triggered on the key table and
the action is a DELETE from the referencing table. If now
the referencing table is dropped, the rule gets
corrupted because the resulting querytree isn't plannable
any longer (the relations in the rule's rangetable are
identified by the OID in pg_class, not by relname). You
can see the effect if you create a view and drop one of
the base tables.
Well, we need to define what to do if a table is dropped
that occurs in the crossref. First of all, the rules have
to be dropped too, but in the case of a view rule, maybe
the whole view too?
And in the case where a key table to which another one
has a CHECK reference is dropped? The rule action will
always abort, so it isn't useful any more. But I
wouldn't like to silently drop it, because someone might
want to drop and recreate the key table, and this would
silently result in all the constraints having been
lost.
Maybe we should change the rule system altogether so that the
rangetable entries in the rule actions etc. are updated
with a lookup from pg_class at rewrite time. Must be done
carefully, though, because someone might drop a table and recreate
it with a different schema, corrupting the parsetree of
the rule actions.
3. Allow an unlimited number of rules on a relation.
Currently there is a hard coded limit on the number of
rules the relation can hold in its slots.
I am basically asking for a reason _not_ to use the rewrite system for
this. I can't think of one myself.
It might interfere with the new MVCC code. The rule actions
must see exactly the OLD tuples that were used in the
original statements. Not only those in the updated table
itself; think of an INSERT...SELECT or an UPDATE where the
TLE or qual expressions are values from other tables.
Not a real reason, just something to have in mind and maybe
switching silently to another MVCC isolation level if
constraint rules get applied, so all tables read from now on
will get a read lock applied and cannot get updated
concurrently until COMMIT.
And it's a problem I came across just while writing this note
where MVCC already could have broken rewrite rule system
semantics.
Jan
--
#======================================================================#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me. #
#======================================== jwieck@debis.com (Jan Wieck) #
-----Original Message-----
Generic triggers in C that are argument driven would be
possible. But the drawback is that those triggers have to be
very smart to use saved SPI plans (one for every different
argument set). And it must be row level triggers, so for an
update to a 2 meg row table they will be fired 2 million
times and run their queries inside - will take some time.
More painful in the 2 meg row situation is that trigger
invocation has to be delayed until COMMIT if the constraint
is deferred. I think we cannot remember 2 million OLD plus 2
million NEW tuples if one tuple can have up to 8K (will be
32GB to remember plus overhead), so we need to remember at
least the CTID's of OLD and NEW and refetch them for the
trigger invocation. OUTCH - the OLD ones are at the head and
all the NEW ones are at the end of the tables file!
Agreed, scrap the triggers idea.
Maybe let's go with the rewrite system, because it works, and is
flexible and strangely designed for this type of problem. Similar to
how we use the rewrite system for views.
And the other changes I've planned for the rewrite system
will improve this much more.
1. Change pg_rewrite.ev_attr into an int28. This would be
useful for ON UPDATE rules so the rewrite system can
easily check if a rule has to be applied or not. If none
of named attributes gets something different assigned
than it's own OLD value, they aren't updated so the rule
could never result in an action and can be omitted
completely.
Would we add a system table for constraints? I believe that the
DEFERRABLE option means we need special information about all
constraints. (ie. is_deferrable boolean, is_deferred boolean).
2. Create cross reference catalog that lists all relations
used in a rule (rangetable). If we have a DELETE CASCADE
constraint, the rule is triggered on the key table and
the action is a DELETE from the referencing table. If now
the referencing table is dropped, the rule get's
corrupted because the resulting querytree isn't plannable
any longer (the relations in the rules rangetable are
identified by the OID in pg_class, not by relname). You
can see the effect if you create a view and drop one of
the base tables.
*Bleck* We can't place cascading or restriction information into the
table definition. Don't know if you had thought about it, but you could
only create them in reverse order (child<-parent), very messy when you
have many 'child' tables to cascade.
Well, we need to define what to do if a table is dropped
that occurs in the crossref. First of all, the rules have
to be dropped too, but in the case of a view rule, maybe
the whole view too?
I agree that we should drop rules defined on the dropped table, but not
those that simply access the dropped table, including views. If I need
to drop a table which has a view defined for it simply to rearrange
column order, I'd like not to have to recreate the view and/or
any RI constraints that reference that table as well.
And in the case where a key table to which another one
has a CHECK reference is dropped? The rule action will
allways abort, so it isn't useful any more. But I
wouldn't like to silently drop it, because someone might
want to drop and recreate the key table and this would
silently result in that all the constraints have been
lost.
This is what I mean about views and RI-constraints.
Maybe we should change the rulesystem at all so that the
rangetable entries in the rule actions etc. are updated
with a lookup from pg_class at rewrite time. Must be done
carefully because someone might drop a table and recreate
it with a different schema corrupting the parsetree of
the rule actions though.
Could we allow a 'recompile' of the rules? Which would update the rule
actions and report which rules have been corrupted. And possibly a way
to simply drop all invalid rules. We could simply mark the rules that
need to be recompiled on a DROP or ALTER TABLE (to allow for dropping a
column, or changing a column type) from the crossref.
3. Allow an unlimited number of rules on a relation.
Currently there is a hard coded limit on the number of
rules the relation can hold in it's slots.
I agree with this one without reference, especially if we actually
implement RI/constraints using rewrite.
I am basically asking for a reason _not_ to use the rewrite
system for
this. I can't think of one myself.
It might interfere with the new MVCC code. The rule actions
must see exactly the OLD tuples that where used in the
original statements. Not only those in the updated table
itself, think of an INSERT...SELECT or an UPDATE where the
TLE or qual expressions are values from other tables.
Shouldn't be too hard to accomplish. We simply need to be sure that the
values are either cached or re-cacheable/re-producible (if there are
memory concerns) until the end of the transaction. So, either:
1. cache all the values
2. store the part of the parsetree needed to recreate the values and
have MVCC mark any needed rows off limits to vacuum
I know that there are many other subtleties that I'm not addressing such
as indexes, but the above will get you the answer.
FWIW, RI and CONSTRAINTS would be affected by this.
Not a real reason, just something to have in mind and maybe
switching silently to another MVCC isolation level if
constraint rules get applied, so all tables read from now on
will get a read lock applied and cannot get updated
concurrently until COMMIT.
Not needed, see the above. We just need to mark the rows with the tid,
or some-such, because dropped/updated rows will hang around until a
vacuum.
And it's a problem I've came across just writing this note
where MVCC already could have broken rewrite rule system
semantics.
Maybe, but I know you'll straighten it all out. I have faith in you,
Jan.
Jan
-DEJ
P.S. I love the discussion that my simple little request has sparked.
Pardon.
First of all I strongly recommend that someone reads the
section about the rewrite rule system in the programmers
manual. From some statements in this whole discussion I'm in
doubt if anyone really knows how the rule system works in
detail and what it actually does with a query.
And please - there were substantial changes in the rule
system for v6.4. So anyone who THINKS he knows, update your
knowledge.
I really didn't want to step on someone's feet, but detailed
knowledge of the rule system is required here.
DeJuan Jackson wrote:
Agreed scrap the triggers idea.
[...]
Would we add a system table for constraints? I believe that the
DEFERRABLE option means we need special information about all
constraints. (ie. is_deferrable boolean, is_deferred boolean).
No, we don't need another system catalog for that. The only
thing that must be assured is that a rule which is deferrable
is a NON-INSTEAD one with exactly one RAISE statement as
action. Can be checked at rule create time at the same place
the checks for view rules are currently done.
I thought to add is_deferrable and is_initdeferred to
pg_rewrite itself to hold the information about DEFERRABLE
and INITIALLY DEFERRED. If a constraint actually has to be
deferred (added to the global deferred querytree list) or not is
per-transaction runtime information and isn't useful in the
catalog.
2. Create cross reference catalog that lists all relations
used in a rule (rangetable). If we have a DELETE CASCADE
constraint, the rule is triggered on the key table and
the action is a DELETE from the referencing table. If now
the referencing table is dropped, the rule get's
corrupted because the resulting querytree isn't plannable
any longer (the relations in the rules rangetable are
identified by the OID in pg_class, not by relname). You
can see the effect if you create a view and drop one of
the base tables.
*Bleck* We can't place cascading or restriction information into the
table definition. Don't know if you had thought about it, but you could
only create them in reverse order (child<-parent), very messy when you
have many 'child' tables to cascade.
I don't want to put it into the table definition. Anything
that will be required for checks and cascaded deletes are
just rules going into pg_rewrite.
OTOH as I read Oracle's SQL Language Quick Reference I can do
a
ALTER TABLE tab1 MODIFY key1 integer
CONSTRAINT ref_tab2 REFERENCES ... ON DELETE CASCADE;
I think I can create the tables in any order and later define
all the (maybe circular) constraints.
Well, we need to define what to do if a table is dropped
that occurs in the crossref. First of all, the rules have
to be dropped too, but in the case of a view rule, maybe
the whole view too?
I agree that we should drop rules defined on the dropped table, but not
those that simply access the dropped table, including views. If I need
to drop a table which has a view defined for it simply to rearrange
column order, I'd like not to have to recreate the view as well and/or
any RI-constraints that reference that table as well.
[...]
Could we allow a 'recompile' of the rules? Which would update the rule
actions and report which rules have been corrupted. And possibly a way
to simply drop all invalid rules. We could simply mark the rules that
need to be recompiled on a DROP or ALTER TABLE (to allow for dropping a
column, or changing a column type) from the crossref.
Nice idea. Would require the crossref catalog, one more
catalog to store the rules definition text and a compile
status field in pg_rewrite.
If a relation used in a rule is dropped or altered, the
status is set to RECOMPILE. So the rule system can check the
rule locks and try to recompile those rules from the source
before applying them.
Yeah - I really like it! This way only those rules defined on
the dropped table have to be dropped too (as it is now). And
as long as another table used in the rule isn't
recreated/altered with a suitable schema, the action that
fires the rule cannot be performed on that table (the
recompile would fail and abort the transaction).
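A rough sketch of the proposed recompile scheme (the class and field
names are invented for illustration): dropping or altering any relation
in a rule's crossref marks the rule for recompilation, and a failing
recompile aborts the operation instead of silently losing the
constraint:

```python
# Rule recompile-on-demand: mark on DROP/ALTER, recompile before use.
class Rule:
    def __init__(self, name, source, used_relations):
        self.name, self.source = name, source
        self.used_relations = used_relations   # the crossref entries
        self.status = "VALID"

def on_drop_or_alter(relation, rules):
    for r in rules:
        if relation in r.used_relations:
            r.status = "RECOMPILE"

def apply_rule(rule, compile_fn):
    if rule.status == "RECOMPILE":
        if not compile_fn(rule.source):
            # no suitable schema yet: abort instead of dropping the rule
            raise RuntimeError(f"rule {rule.name}: recompile failed")
        rule.status = "VALID"
    # ... rewrite the query using the (re)compiled rule ...

rules = [Rule("_CD_t2", "ON DELETE TO t1 DO DELETE FROM t2 ...", {"t1", "t2"})]
on_drop_or_alter("t2", rules)
assert rules[0].status == "RECOMPILE"
apply_rule(rules[0], lambda src: True)    # t2 recreated with suitable schema
assert rules[0].status == "VALID"
```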
There are some subtle points about the rule caching
performed. Actually the relation descriptor contains rule
locks where in memory parsetrees are placed at heap_open(),
but it must be possible.
It might interfere with the new MVCC code. The rule actions
must see exactly the OLD tuples that where used in the
original statements. Not only those in the updated table
itself, think of an INSERT...SELECT or an UPDATE where the
TLE or qual expressions are values from other tables.
Shouldn't be too hard to accomplish. We simply need to be sure that the
values are either cached or re-cacheable/re-producible (if there are
memory concerns) until the end of the transaction. So, either:
1. cache all the values
2. store the part of the parsetree needed to recreate the values and
have MVCC mark any needed rows off limits to vacuum
I know that there are many other subtleties that I'm not addressing such
as indexes, but the above will get you the answer.
FWIW, RI and CONSTRAINTS would be affected by this.
We cannot cache all the values without the danger of running out
of memory.
Those parts of the parsetree needed to recreate the values
are actually incorporated into the parsetrees of the rule
actions. BUT THESE ARE PARSETREES! They result in another
execution plan and when they get executed they perform their
own table scans!
Not a real reason, just something to have in mind and maybe
switching silently to another MVCC isolation level if
constraint rules get applied, so all tables read from now on
will get a read lock applied and cannot get updated
concurrently until COMMIT.
Not needed, see the above. We just need to mark the rows with the tid,
or some-such, because dropped/updated rows will hang around until a
vacuum.
As I understood MVCC, it is possible that two identical scans
in one and the same transaction can result in different
rows.
Maybe I misunderstood something. Isn't changing the isolation
level exactly what can force MVCC to do locking/marking
with the result that the two identical scans will produce the
same rows?
Vadim - please clarify!
But uh - oh! Now it's me who forgot how in detail the rule
system works :-)
The deferred constraint queries performed at transaction end
will do their own scans. So tuples removed/changed during
this transaction could not be rescanned as they were
(the command counter is incremented!). And the incremented
command counter is required to see that finally all circular
constraints are satisfied.
This one in detail now. Let's have the following setup:
CREATE TABLE t1 (key1 int4);
CREATE TABLE t2 (ref1 int4);
CREATE TABLE t3 (a int4, b int4);
The check constraint for ref1->key1 is the rule
CREATE DEFERRABLE RULE _CU_t2_ref1 AS ON UPDATE TO t2 DO
RAISE 'key "%" not in t1', new.ref1
WHERE NOT EXISTS (SELECT key1 FROM t1 WHERE key1 = new.ref1);
Now we execute
BEGIN;
SET CONSTRAINT ALL DEFERRED;
UPDATE t2 SET ref1 = t3.b WHERE ref1 = t3.a;
DELETE FROM t3;
INSERT INTO t1 VALUES (4);
COMMIT;
The rewritten parsetree for the rule action would look like
one for the query
RAISE 'key "%" not in t1', t3.b
FROM t2, t3
WHERE t2.ref1 = t3.a
AND NOT EXISTS (SELECT key1 FROM t1 WHERE key1 = t3.b);
This parsetree is deferred until the end of the transaction,
but the UPDATE will already have set cmax in the tuples of t2
that were updated. So this RAISE cannot work as expected! To
do so, the scans of t2 and t3 must be done against the
command counter of the UPDATE, but the scan of t1 against the
final command counter at COMMIT.
This mess is why I changed the rule system for v6.4 to
perform the rule actions BEFORE the original query itself.
They cannot work AFTER (what would happen for deferred ones).
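The command-counter problem can be illustrated with a simplified
cmin/cmax visibility model (a toy rule of my own, not the actual heap
tuple visibility code):

```python
# A tuple is visible to a scan at command counter cc if it was inserted
# at or before cc and not yet deleted as of cc.
def visible(tup, cc):
    return tup["cmin"] <= cc and (tup["cmax"] is None or tup["cmax"] > cc)

# In Jan's example, DELETE FROM t3 ran at a later command counter (2)
# than the UPDATE (1), and COMMIT checks run at counter 4.
t3 = [{"a": 1, "b": 4, "cmin": 0, "cmax": 2}]

# Scanned at the UPDATE's counter, the t3 row the RAISE joins on exists...
assert [t for t in t3 if visible(t, 1)]
# ...but at COMMIT the deferred RAISE's scan of t3 comes up empty, so the
# rewritten constraint query can no longer see the rows it needs.
assert not [t for t in t3 if visible(t, 4)]
```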
P.S. I love the discussion that my simple little request has sparked.
Me too!
Jan
--
#======================================================================#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me. #
#======================================== jwieck@debis.com (Jan Wieck) #
Jan Wieck wrote:
Generic triggers in C that are argument driven would be
possible. But the drawback is that those triggers have to be
very smart to use saved SPI plans (one for every different
argument set). And it must be row level triggers, so for an
update to a 2 meg row table they will be fired 2 million
times and run their queries inside - will take some time.
More painful in the 2 meg row situation is that trigger
invocation has to be delayed until COMMIT if the constraint
is deferred. I think we cannot remember 2 million OLD plus 2
million NEW tuples if one tuple can have up to 8K (will be
32GB to remember plus overhead), so we need to remember at
least the CTID's of OLD and NEW and refetch them for the
trigger invocation. OUTCH - the OLD ones are at the head and
all the NEW ones are at the end of the tables file!
(Note that now in the case of UPDATE t_ctid of OLD tuples
points to TID of NEW tuples.)
I am basically asking for a reason _not_ to use the rewrite system for
this. I can't think of one myself.
It might interfere with the new MVCC code. The rule actions
must see exactly the OLD tuples that where used in the
original statements. Not only those in the updated table
itself, think of an INSERT...SELECT or an UPDATE where the
TLE or qual expressions are values from other tables.
Two things define data visibility: SnapShot & CommandId.
We would have to save them for deferred rules and restore them
before running the rule actions. But there is one issue: for which
scans should the old visibility be used? There are scans from the
user query and there are scans added by the rule action. OK,
let's assume that for added scans the current visibility will be used
- this is what we need for RI rules (actually, something more -
see below).
So, the first task is to enable different scans in a (rewritten)
query to use different visibilities (a SnapShot/CommandId pair -
a "snapshot", in short). We have to add new stuff to the Executor
and heap scan code, so I also propose a new feature addition:
1. add
SET SNAPSHOT snapshot_name;
statement to let users define some snapshot.
2. extend query syntax to let users specify what snapshot
must be used when a query table is scanned:
SELECT ... FROM t1 AT SNAPSHOT s1, t2 AT SNAPSHOT s2 ...etc..
Up to now, the new requirement due to MVCC is to take into account
not only the CommandId (as already noted in the last posting I got
from Jan), but the SnapShot too.
Not a real reason, just something to have in mind - maybe
silently switching to another MVCC isolation level if
constraint rules get applied, so that all tables read from then on
get a read lock and cannot be updated
concurrently until COMMIT.
There is no isolation level in MVCC where locking would be used
implicitly. We could use LOCK IN SHARE MODE or SELECT FOR UPDATE
(FOR SHARE LOCK?) - using rules for RI is like implementing
RI at the application level (!) - but this is bad.
Fortunately, there is a way without per-row/table locking,
if the scans added by an RI rule could see uncommitted changes
made by concurrent xactions. The unique btree code already uses a
special SnapshotDirty to see uncommitted changes and
avoid long-term row/page locking. With this Snapshot,
HeapTupleSatisfies returns true if
t_xmin is committed and (t_xmax is invalid OR in-progress)
OR
t_xmin is in-progress and t_xmax is invalid
- so the caller can wait (just like writers of the same row do -
by locking the in-progress xaction ID in the transaction pseudo-table)
for the in-progress t_xmin/t_xmax xaction and decide what to do after the
concurrent xaction COMMITs/ABORTs.
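The predicate quoted above can be sketched as a toy model (the names `Xact` and `satisfies_dirty` are illustrative stand-ins, not the actual PostgreSQL structures or routines):

```python
from enum import Enum

class Xact(Enum):
    COMMITTED = "committed"
    IN_PROGRESS = "in-progress"
    ABORTED = "aborted"
    INVALID = "invalid"      # e.g. no t_xmax set at all

def satisfies_dirty(xmin_state, xmax_state):
    """Toy version of the SnapshotDirty rule quoted above: a tuple
    qualifies if its inserter committed and it is not deleted by a
    committed xaction, OR its inserter is still in progress and
    nobody has deleted it."""
    return ((xmin_state == Xact.COMMITTED
             and xmax_state in (Xact.INVALID, Xact.IN_PROGRESS))
            or (xmin_state == Xact.IN_PROGRESS
                and xmax_state == Xact.INVALID))

# A row inserted by a still-running xaction is seen, so the caller
# can wait on that xaction:
print(satisfies_dirty(Xact.IN_PROGRESS, Xact.INVALID))   # True
# A row deleted by a committed xaction is gone:
print(satisfies_dirty(Xact.COMMITTED, Xact.COMMITTED))   # False
```

This is only the visibility decision; the waiting itself (locking the in-progress xaction ID) is a separate mechanism.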
But before continuing with this approach I need an answer to
one question. Let's consider this case:
Xaction T1 with isolevel SERIALIZABLE inserts some row
into a child table with deferred checking of primary key existence.
There was no primary key P for the row inserted by T1 at the moment
when T1 began, but before T1 begins constraint checking, another
concurrent xaction T2 inserts P and commits.
After that T1 performs the checking and - what? Will the constraint
be satisfied? I ask this because all subsequent selects in T1
will not see P, but will see the foreign key inserted - so, from the
point of view of the application, the child --> parent relationship will be broken...
Comments?
Could someone run the test below in Oracle?
1. In session 1:
CREATE TABLE p (x integer PRIMARY KEY);
CREATE TABLE c (y integer REFERENCES p);
INSERT INTO p VALUES (1);
2. In session 2:
SET TRANSACTION ISOLATION LEVEL SERIALIZABLE;
SELECT * FROM p;
-- empty
INSERT INTO c VALUES (1);
-- what? waits or rejects insertion?
3. In session 1:
COMMIT;
-- what in 1 if it was waiting?
4. In session 2:
INSERT INTO c VALUES (1);
-- Ok or rejected?
SELECT * FROM p;
SELECT * FROM c;
COMMIT;
TIA !!!
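For what it's worth, the single-session half of the deferred-checking semantics can be demonstrated today with SQLite's DEFERRABLE INITIALLY DEFERRED foreign keys (a sketch only - it shows the deferral-to-COMMIT behavior, not the concurrency question asked above):

```python
import sqlite3

# isolation_level=None gives us manual transaction control
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("PRAGMA foreign_keys = ON")   # FK enforcement is off by default in SQLite
conn.execute("CREATE TABLE p (x INTEGER PRIMARY KEY)")
conn.execute("CREATE TABLE c (y INTEGER REFERENCES p(x) "
             "DEFERRABLE INITIALLY DEFERRED)")

conn.execute("BEGIN")
conn.execute("INSERT INTO c VALUES (1)")   # no parent yet - check is deferred
conn.execute("INSERT INTO p VALUES (1)")   # parent arrives before COMMIT
conn.execute("COMMIT")                     # deferred check passes at commit time
print(conn.execute("SELECT y FROM c").fetchall())   # [(1,)]

# Without the parent row, the COMMIT itself fails:
conn.execute("BEGIN")
conn.execute("INSERT INTO c VALUES (2)")
try:
    conn.execute("COMMIT")
except sqlite3.IntegrityError as e:
    conn.execute("ROLLBACK")
    print("deferred FK check failed at COMMIT:", e)
```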
And there's a problem I came across just while writing this note
where MVCC could already have broken the rewrite rule system
semantics.
How?
Vadim
Vadim wrote:
Jan Wieck wrote:
(Note that now in the case of UPDATE t_ctid of OLD tuples
points to TID of NEW tuples.)Two things define data visibility: SnapShot & CommandId.
We would have to save them for deffered rules and restore them
before run rule actions. But there is one issue: for what
scans old visibility should be used? There are scans from
user query and there are scans added by rule action. Ok,
let's assume that for added scans current visibility will be used
- this is what we need for RI rules (actually, something more -
see below).
I addressed that problem (different visibility required for
scans in one command) also in my other mail.
Anyway, I just checked what happens in the following case:
T1: begin;
T1: select ...
T2: update ...
T1: select ... (gets the same (old) values)
That's the result as long as T1 doesn't run in READ COMMITTED
mode. And that's fine, because it doesn't have to worry
about concurrent transactions of others.
So the only problem left is the different visibility. I think
it is possible to change the visibility code not to check
against the global command counter. Instead it might look at
a command counter value in the range table entry related to
the scan node. So the rewrite system and tcop could place
the correct values there during query rewrite/processing.
The range table of a rules rewritten parsetree is a
combination of the range tables from the original user query,
applied view rules and the rule itself. For deferred
rules, only those coming with the rule action itself must
have the command counter at COMMIT. All others must get the
command counter value that is there when the query that fired
this rule gets executed.
The deferred querytrees can first be held in a new list of
the rewritten querytree for the original user statement. The
rewrite system puts into the rangetable entries
USE_CURRENT_CMDID or USE_COMMIT_CMDID depending on where they
are coming from.
Before tcop calls the executor, a new function in the rewrite
system is called to set the actual values for the command
counter to use into the rangetable entries for one query and
its deferred ones. Then it adds all the deferred queries to
the global deferred list and runs the query itself.
At commit time, when all the deferred queries have to get
run, those RTE's in them having USE_COMMIT_CMDID are set to
the command counter at commit before running the plans.
Voila.
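The per-rangetable-entry command counter idea can be mimicked with a toy simulation (everything here is hypothetical shorthand - real PostgreSQL visibility checks involve far more state than a (cmin, cmax) pair):

```python
INVALID = None   # stand-in for an unset cmax (tuple never deleted)

def visible(tup, scan_cmdid):
    """Toy visibility rule: a tuple is visible to a scan if it was
    inserted by an earlier command (cmin < scan_cmdid) and its
    deletion, if any, is not yet visible (cmax >= scan_cmdid)."""
    cmin, cmax = tup
    return cmin < scan_cmdid and (cmax is INVALID or cmax >= scan_cmdid)

# Three tuples as (cmin, cmax): one deleted by command 2, two live.
table = [(0, 2), (1, INVALID), (3, INVALID)]

# An RTE keeping the command counter of the firing query (cmdid 2)
# still sees the pre-existing rows; an RTE marked USE_COMMIT_CMDID
# scans with the counter at COMMIT (say, 4) and sees later inserts
# but not rows meanwhile deleted.
original_scan = [t for t in table if visible(t, 2)]
deferred_scan = [t for t in table if visible(t, 4)]
print(original_scan)   # [(0, 2), (1, None)]
print(deferred_scan)   # [(1, None), (3, None)]
```

The point of the sketch: the same table yields different row sets depending only on which command counter the scan's rangetable entry carries.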
And there's a problem I came across just while writing this note
where MVCC could already have broken the rewrite rule system
semantics.
How?
Yes it did!
If a transaction runs in READ COMMITTED mode, the scans for
the rule actions (performed first) could have different
results than those for the original query (performed last).
For now I see only one solution. READ COMMITTED is forbidden
for anything that invokes non-view rules. This check must be
done in the tcop and SPI, because saved SPI plans can be run
without invoking the rewrite system at any time. So the plan
must remember somewhere if READ COMMITTED is allowed for it
or not.
Jan
Jan Wieck wrote:
Two things define data visibility: SnapShot & CommandId.
We would have to save them for deferred rules and restore them
before running the rule actions. But there is one issue: for which
scans should the old visibility be used? There are scans from the
user query and there are scans added by the rule action. OK,
let's assume that for added scans the current visibility will be used
- this is what we need for RI rules (actually, something more -
see below).
I addressed that problem (different visibility required for
scans in one command) also in my other mail.
Anyway, I just checked what happens in the following case:
T1: begin;
T1: select ...
T2: update ...
T1: select ... (gets the same (old) values)
That's the result as long as T1 doesn't run in READ COMMITTED
mode. And that's fine, because it doesn't have to worry
about concurrent transactions of others.
So the only problem left is the different visibility. I think
it is possible to change the visibility code not to check
against the global command counter. Instead it might look at
a command counter value in the range table entry related to
the scan node. So the rewrite system and tcop could place
the correct values there during query rewrite/processing.
Why do you talk about the CommandId only? What about the SnapShot data?
The difference between scans in the SERIALIZABLE and READ COMMITTED
isolation levels is that in SERIALIZABLE mode all queries use
the same SnapShot data (the array of running xactions),
while in READ COMMITTED mode new SnapShot data is created
for each query.
CommandId defines visibility of self-changes.
SnapShot defines visibility of concurrent changes.
The range table of a rules rewritten parsetree is a
combination of the range tables from the original user query,
applied view rules and the rule itself. For deferred
rules, only those coming with the rule action itself must
have the command counter at COMMIT. All others must get the
Actually, not at COMMIT but when SET CONSTRAINTS ... IMMEDIATE
is called. COMMIT just implicitly switches into immediate mode.
command counter value that is there when the query that fired
this rule gets executed.
The deferred querytrees can first be held in a new list of
the rewritten querytree for the original user statement. The
rewrite system puts into the rangetable entries
USE_CURRENT_CMDID or USE_COMMIT_CMDID depending on where they
are coming from.
Before tcop calls the executor, a new function in the rewrite
system is called to set the actual values for the command
counter to use into the rangetable entries for one query and
its deferred ones. Then it adds all the deferred queries to
the global deferred list and runs the query itself.
At commit time, when all the deferred queries have to get
run, those RTE's in them having USE_COMMIT_CMDID are set to
the command counter at commit before running the plans.
Voila.
And there's a problem I came across just while writing this note
where MVCC could already have broken the rewrite rule system
semantics.
How?
Yes it did!
If a transaction runs in READ COMMITTED mode, the scans for
the rule actions (performed first) could have different
results than those for the original query (performed last).
For now I see only one solution. READ COMMITTED is forbidden
for anything that invokes non-view rules. This check must be
done in the tcop and SPI, because saved SPI plans can be run
without invoking the rewrite system at any time. So the plan
must remember somewhere if READ COMMITTED is allowed for it
or not.
READ COMMITTED will be the default mode once writers in this
mode are supported...
The solution is to use the same SnapShot data for both the
action's and the original scans.
But there are other possibilities for inconsistencies in READ
COMMITTED mode:
create table t (x int);
create table tlog (xold int, xnew int);
insert into t values (1);
create rule r as on update to t do
insert into tlog values (old.x, new.x);
Now consider two concurrent
update t set x = 2 where x = 1;
and
update t set x = 3 where x = 1;
: two rows will be inserted into tlog - (1,2) and (1,3) -
by the rule actions run BEFORE the original queries, but actually
only one update will succeed - the other one will no longer see
1 in t.x after the first update commits, and so the row will not
be updated by the second update...
One approach is to force FOR UPDATE OF the "OLD" table
in the action's INSERT INTO ... SELECT; another one is
to run the rule action AFTER the original query (wouldn't
that be possible, given the ability to directly set
scan visibilities?).
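The anomaly can be mimicked without a database at all. A toy interleaving (illustrative only - the list append stands in for the rule's INSERT into tlog) of the two concurrent updates above, with the rule actions evaluated first:

```python
t = {"x": 1}   # the single row of table t
tlog = []      # stand-in for table tlog

# Both updaters evaluate the rule action against the row as they
# first see it (x = 1), because the actions run BEFORE the original
# queries and neither sees the other's uncommitted change:
for new in (2, 3):
    tlog.append((t["x"], new))   # the ON UPDATE rule's INSERT

# But the updates themselves are serialized on the row: the first
# one commits x = 2; the second re-evaluates its qual (x = 1) after
# that commit, finds it false, and updates nothing.
if t["x"] == 1:
    t["x"] = 2
if t["x"] == 1:        # second updater's qual re-check
    t["x"] = 3

print(tlog)      # [(1, 2), (1, 3)] - two log rows were written
print(t["x"])    # 2 - yet only one update actually happened
```

So the log claims two updates took place while the table reflects only one - exactly the inconsistency described above.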
Actually, for a non-locking RI implementation (using dirty reads),
constraint checking must occur AFTER the data changes are made
(i.e. deferred until query end): this will allow PK deleters
to notice that there are concurrent FK inserters, UK inserters to see
that someone else is trying to insert the same key, etc., wait for the
concurrent COMMIT/ABORT and do the appropriate thing after that.
More letters will follow, but maybe only after the weekend...
Vadim