foreign keys and transactions

Started by Edwin Grubbsabout 24 years ago6 messagesgeneral

egrubbs@rackspace.com

about 24 years ago

Does anyone know a good solution to determining whether a row is
referenced by a foreign key? The problem is that multiple tables may have
foreign keys referencing a single table; therefore, even if you delete a
given foreign key from one table, the delete on the table with the primary
key may fail, which will cause the transaction to abort. For example, a
table of contacts might be referenced by foreign keys in an account table,
a log table, a group table, and an employee table. If we delete an
account, we want to try to delete the contact, and if it fails we can go
on our merry way because it should just mean that it is referenced by
another table. Querying every single table that could possibly have a
foreign key referencing the contact seems error prone and a duplication of
the foreign key checks.

I don't want to just have a separate transaction for each delete from the
table with the primary key, since that will require placing all the
deletes after the transaction which contains all the other statements.
This would make it unbelievably difficult to use functions in our code to
handle related sql queries, since all the deletes would have to be
postponed till after the rest of the transaction has finished.

-Edwin Grubbs

Jan Wieck

JanWieck@Yahoo.com

about 24 years ago

In reply to: Edwin Grubbs (#1)

Re: foreign keys and transactions

Edwin Grubbs wrote:

Does anyone know a good solution to determining whether a row is
referenced by a foreign key? The problem is that multiple tables may have
foreign keys referencing a single table; therefore, even if you delete a
given foreign key from one table, the delete on the table with the primary
key may fail, which will cause the transaction to abort. For example, a
table of contacts might be referenced by foreign keys in an account table,
a log table, a group table, and an employee table. If we delete an
account, we want to try to delete the contact, and if it fails we can go
on our merry way because it should just mean that it is referenced by
another table. Querying every single table that could possibly have a
foreign key referencing the contact seems error prone and a duplication of
the foreign key checks.

I don't want to just have a separate transaction for each delete from the
table with the primary key, since that will require placing all the
deletes after the transaction which contains all the other statements.
This would make it unbelievably difficult to use functions in our code to
handle related sql queries, since all the deletes would have to be
postponed till after the rest of the transaction has finished.

Edwin,

you should define for yourself what the different REFERENCES
from all the foreign key tables mean in your business model,
and then declare them accordingly.

Foreign keys in PostgreSQL can have referential actions. So
you can define per reference, what to do if the primary key
get's changed or deleted. For changes (if you allow them at
all), CASCADE is what you probably want, because all the
references would silently follow.

But for removal of a primary key, you have to decide what the
reference should do. Basically you have these options:

1. ON DELETE NO ACTION

Raise an error and abort the transaction. You know that
one too well already :-)

2. ON DELETE SET NULL

Set the reference to the SQL NULL value. This means, that
the row referencing stays intact, but the foreign key
fields get NULLd out.

3. ON DELETE SET DEFAULT

Similar to SET NULL, but the column default values are
used.

4. ON DELETE CASCADE

The referencing rows get deleted silently.

Jan

#======================================================================#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me. #
#================================================== JanWieck@Yahoo.com #

_________________________________________________________
Do You Yahoo!?
Get your free @yahoo.com address at http://mail.yahoo.com

Stephan Szabo

sszabo@megazone23.bigpanda.com

about 24 years ago

In reply to: Edwin Grubbs (#1)

Re: foreign keys and transactions

On Fri, 8 Feb 2002, Edwin Grubbs wrote:

Does anyone know a good solution to determining whether a row is
referenced by a foreign key? The problem is that multiple tables may have
foreign keys referencing a single table; therefore, even if you delete a
given foreign key from one table, the delete on the table with the primary
key may fail, which will cause the transaction to abort. For example, a
table of contacts might be referenced by foreign keys in an account table,
a log table, a group table, and an employee table. If we delete an
account, we want to try to delete the contact, and if it fails we can go
on our merry way because it should just mean that it is referenced by
another table. Querying every single table that could possibly have a
foreign key referencing the contact seems error prone and a duplication of
the foreign key checks.

Well, this sounds like something that nested transactions or savepoints
might be able to help with, but neither of those are implemented
currently. ISTM, you're kind of trying to do something foreign keys
really weren't meant for.

You might be able to make a trigger that does reference counting. You'd
have to pay attention to concurrent transaction issues and locking, but
that's managable (at least to the level the fk constraints do which is
not quite good.)

Jason Earl

jason.earl@simplot.com

about 24 years ago

In reply to: Edwin Grubbs (#1)

Re: foreign keys and transactions

I am a little confused about what exactly you are trying to say, so if
this comes out all wrong please forgive me.

Edwin Grubbs <egrubbs@rackspace.com> writes:

Does anyone know a good solution to determining whether a row is
referenced by a foreign key? The problem is that multiple tables may
have foreign keys referencing a single table; therefore, even if you
delete a given foreign key from one table, the delete on the table
with the primary key may fail, which will cause the transaction to
abort. For example, a table of contacts might be referenced by
foreign keys in an account table, a log table, a group table, and an
employee table. If we delete an account, we want to try to delete
the contact, and if it fails we can go on our merry way because it
should just mean that it is referenced by another table. Querying
every single table that could possibly have a foreign key
referencing the contact seems error prone and a duplication of the
foreign key checks.

Let's pretend you had a simple schema like this:

CREATE TABLE people (
id SERIAL NOT NULL PRIMARY KEY,
last_name text,
first_name text
);

CREATE TABLE orders (
id SERIAL NOT NULL PRIMARY KEY,
customer int REFERENCES people(id),
item int
);

CREATE TABLE shipments (
id SERIAL NOT NULL PRIMARY KEY,
customer int REFERENCES people(id),
item int
);

Now, clearly this is over-simplified and for educational purposes
only. With this particular arrangement deletes to the "people" table
will fail if there are rows that reference the key in either
"shipments" or "orders." That's probably precisely what you want in
this case, because you don't want to lose orders or lose track of your
shipments. However, let's say that for some reason you wanted to be
able to remove people that had orders, but no shipments. You could
start a transaction with BEGIN, delete from orders where 'customer'
was equal to a certain number and then delete from people where id was
equal to that same number. If that person had an entry in the
shipments table then the delete from people would fail, and the
transaction would get rolled back. The SQL would look like this:

BEGIN;
DELETE FROM orders WHERE customer = <your_people.id>;
DELETE FROM people WHERE id = <your_people.id>;
COMMIT;

If you *always* wanted to be able to delete folks from the people
table who only had entries in the order table you could change the
schema slightly so that it looked like this:

CREATE TABLE orders (
id SERIAL NOT NULL PRIMARY KEY,
customer int REFERENCES people(id)
ON DELETE CASCADE,
item int
);

Then you could remove people from both the people table and the orders
table with a simple:

DELETE FROM people WHERE id = <your_people.id>;

Once again, this would do the right thing if the user only had entries
in the orders table, and if the person had entries in the shipments
table as well it would fail with an error. The only real difference

I don't want to just have a separate transaction for each delete
from the table with the primary key, since that will require placing
all the deletes after the transaction which contains all the other
statements. This would make it unbelievably difficult to use
functions in our code to handle related sql queries, since all the
deletes would have to be postponed till after the rest of the
transaction has finished.

-Edwin Grubbs

I hope that this was an example of what you were looking for. If not,
you might want to be a bit more specific. And remember, transactions
are your friends.

Jason

Edwin Grubbs

egrubbs@rackspace.com

about 24 years ago

In reply to: Jason Earl (#4)

Re: foreign keys and transactions

My main concern is with a transaction being aborted by trying to delete a
row that still has a foreign key referencing it. In your example schema
below, I could check for foreign keys by just checking the "orders" and
"shipments" tables. However, if I then add a "notes" table, I would have
to go through all the code that tries to delete the row out of the
"person" table to make sure it checks the "notes" table for foreign keys
referencing that row. Even though I don't care if the delete statement
fails, I still have to worry about it aborting the entire transaction. It
just seems that there should be an easier way to check for foreign keys
or to prevent the delete from aborting the rest of the transaction.

-Edwin

On 8 Feb 2002, Jason Earl wrote:

Show quoted text

I am a little confused about what exactly you are trying to say, so if
this comes out all wrong please forgive me.

Edwin Grubbs <egrubbs@rackspace.com> writes:

Does anyone know a good solution to determining whether a row is
referenced by a foreign key? The problem is that multiple tables may
have foreign keys referencing a single table; therefore, even if you
delete a given foreign key from one table, the delete on the table
with the primary key may fail, which will cause the transaction to
abort. For example, a table of contacts might be referenced by
foreign keys in an account table, a log table, a group table, and an
employee table. If we delete an account, we want to try to delete
the contact, and if it fails we can go on our merry way because it
should just mean that it is referenced by another table. Querying
every single table that could possibly have a foreign key
referencing the contact seems error prone and a duplication of the
foreign key checks.

Let's pretend you had a simple schema like this:

CREATE TABLE people (
id SERIAL NOT NULL PRIMARY KEY,
last_name text,
first_name text
);

CREATE TABLE orders (
id SERIAL NOT NULL PRIMARY KEY,
customer int REFERENCES people(id),
item int
);

CREATE TABLE shipments (
id SERIAL NOT NULL PRIMARY KEY,
customer int REFERENCES people(id),
item int
);

Now, clearly this is over-simplified and for educational purposes
only. With this particular arrangement deletes to the "people" table
will fail if there are rows that reference the key in either
"shipments" or "orders." That's probably precisely what you want in
this case, because you don't want to lose orders or lose track of your
shipments. However, let's say that for some reason you wanted to be
able to remove people that had orders, but no shipments. You could
start a transaction with BEGIN, delete from orders where 'customer'
was equal to a certain number and then delete from people where id was
equal to that same number. If that person had an entry in the
shipments table then the delete from people would fail, and the
transaction would get rolled back. The SQL would look like this:

BEGIN;
DELETE FROM orders WHERE customer = <your_people.id>;
DELETE FROM people WHERE id = <your_people.id>;
COMMIT;

If you *always* wanted to be able to delete folks from the people
table who only had entries in the order table you could change the
schema slightly so that it looked like this:

CREATE TABLE orders (
id SERIAL NOT NULL PRIMARY KEY,
customer int REFERENCES people(id)
ON DELETE CASCADE,
item int
);

Then you could remove people from both the people table and the orders
table with a simple:

DELETE FROM people WHERE id = <your_people.id>;

Once again, this would do the right thing if the user only had entries
in the orders table, and if the person had entries in the shipments
table as well it would fail with an error. The only real difference

I don't want to just have a separate transaction for each delete
from the table with the primary key, since that will require placing
all the deletes after the transaction which contains all the other
statements. This would make it unbelievably difficult to use
functions in our code to handle related sql queries, since all the
deletes would have to be postponed till after the rest of the
transaction has finished.

-Edwin Grubbs

I hope that this was an example of what you were looking for. If not,
you might want to be a bit more specific. And remember, transactions
are your friends.

Jason

Jason Earl

jason.earl@simplot.com

about 24 years ago

In reply to: Edwin Grubbs (#5)

Re: foreign keys and transactions

Edwin Grubbs <egrubbs@rackspace.com> writes:

My main concern is with a transaction being aborted by trying to delete a
row that still has a foreign key referencing it. In your example schema
below, I could check for foreign keys by just checking the "orders" and
"shipments" tables. However, if I then add a "notes" table, I would have
to go through all the code that tries to delete the row out of the
"person" table to make sure it checks the "notes" table for foreign keys
referencing that row. Even though I don't care if the delete statement
fails, I still have to worry about it aborting the entire transaction. It
just seems that there should be an easier way to check for foreign keys
or to prevent the delete from aborting the rest of the transaction.

-Edwin

That leaves you with one of two options. Either you code into your
application smarts about which tables might rely on "people" or you
simply set all of your tables with the appropriate ON DELETE setting.

For example, if you want to be able to blow away rows in people no
matter what rows might be referencing it in other tables make sure
that when you create your tables they all have appropriate ON DELETE
settings. Like this:

CREATE TABLE notes (
id SERIAL NOT NULL PRIMARY KEY,
customer references people(id)
ON DELETE CASCADE,
);

Now you can delete every all the rows from people that you want, and
the corresponding notes will also disappear. You can also use ON
DELETE NULL if you don't want to potentially blow away valuable
information. If you do that to *every* table that references "people"
then you can delete from people and the referencing rows in the other
tables will simply disappear.

It sounds like you are trying to make things harder than they are. If
you set all of your referencing tables to CASCADE deletes then
PostgreSQL will take care of all of the work of deleting those rows
that reference your foreign key. So just add ON CASCADE DELETE to all
of your tables and be happy :).

Try it, you'll like it.

Jason

Show quoted text

On 8 Feb 2002, Jason Earl wrote:

I am a little confused about what exactly you are trying to say, so if
this comes out all wrong please forgive me.

Edwin Grubbs <egrubbs@rackspace.com> writes:

Does anyone know a good solution to determining whether a row is
referenced by a foreign key? The problem is that multiple tables may
have foreign keys referencing a single table; therefore, even if you
delete a given foreign key from one table, the delete on the table
with the primary key may fail, which will cause the transaction to
abort. For example, a table of contacts might be referenced by
foreign keys in an account table, a log table, a group table, and an
employee table. If we delete an account, we want to try to delete
the contact, and if it fails we can go on our merry way because it
should just mean that it is referenced by another table. Querying
every single table that could possibly have a foreign key
referencing the contact seems error prone and a duplication of the
foreign key checks.

Let's pretend you had a simple schema like this:

CREATE TABLE people (
id SERIAL NOT NULL PRIMARY KEY,
last_name text,
first_name text
);

CREATE TABLE orders (
id SERIAL NOT NULL PRIMARY KEY,
customer int REFERENCES people(id),
item int
);

CREATE TABLE shipments (
id SERIAL NOT NULL PRIMARY KEY,
customer int REFERENCES people(id),
item int
);

Now, clearly this is over-simplified and for educational purposes
only. With this particular arrangement deletes to the "people" table
will fail if there are rows that reference the key in either
"shipments" or "orders." That's probably precisely what you want in
this case, because you don't want to lose orders or lose track of your
shipments. However, let's say that for some reason you wanted to be
able to remove people that had orders, but no shipments. You could
start a transaction with BEGIN, delete from orders where 'customer'
was equal to a certain number and then delete from people where id was
equal to that same number. If that person had an entry in the
shipments table then the delete from people would fail, and the
transaction would get rolled back. The SQL would look like this:

BEGIN;
DELETE FROM orders WHERE customer = <your_people.id>;
DELETE FROM people WHERE id = <your_people.id>;
COMMIT;

If you *always* wanted to be able to delete folks from the people
table who only had entries in the order table you could change the
schema slightly so that it looked like this:

CREATE TABLE orders (
id SERIAL NOT NULL PRIMARY KEY,
customer int REFERENCES people(id)
ON DELETE CASCADE,
item int
);

Then you could remove people from both the people table and the orders
table with a simple:

DELETE FROM people WHERE id = <your_people.id>;

Once again, this would do the right thing if the user only had entries
in the orders table, and if the person had entries in the shipments
table as well it would fail with an error. The only real difference

I don't want to just have a separate transaction for each delete
from the table with the primary key, since that will require placing
all the deletes after the transaction which contains all the other
statements. This would make it unbelievably difficult to use
functions in our code to handle related sql queries, since all the
deletes would have to be postponed till after the rest of the
transaction has finished.

-Edwin Grubbs

I hope that this was an example of what you were looking for. If not,
you might want to be a bit more specific. And remember, transactions
are your friends.

Jason