TRUNCATE

Started by Rod Taylorover 23 years ago13 messages
#1Rod Taylor
rbt@zort.ca

For my own protection I'm adding checks to truncate so that if there
is an ON DELETE trigger it will not execute the truncate command.

Anyway, should it really only be 'Disallow TRUNCATE on tables that are
involved in referential constraints'?

I'm thinking it should check for an on delete rule as well as user
triggers.

--
Rod

#2Tom Lane
tgl@sss.pgh.pa.us
In reply to: Rod Taylor (#1)
Re: TRUNCATE

"Rod Taylor" <rbt@zort.ca> writes:

I'm thinking it should check for an on delete rule as well as user
triggers.

Seems reasonable to me.

Should there be a "FORCE" option to override these checks and do it
anyway? Or is that just asking for trouble?

regards, tom lane

#3Rod Taylor
rbt@zort.ca
In reply to: Rod Taylor (#1)
Re: TRUNCATE

The only time I can think of that a FORCE type mechanism would be
allowed would be internal functions. Perhaps a new cluster (copy
data, truncate table, copy data back sorted).

Internal stuff can call heap_truncate() directly rather than going
through TruncateRelation.

A user style force is to simply drop all rules, foreign keys,
triggers, etc -- do the action -- re-apply constraints. Anything else
could mean their data isn't consistent.

--
Rod
----- Original Message -----
From: "Tom Lane" <tgl@sss.pgh.pa.us>
To: "Rod Taylor" <rbt@zort.ca>
Cc: "Hackers List" <pgsql-hackers@postgresql.org>
Sent: Sunday, May 12, 2002 12:30 PM
Subject: Re: [HACKERS] TRUNCATE

Show quoted text

"Rod Taylor" <rbt@zort.ca> writes:

I'm thinking it should check for an on delete rule as well as user
triggers.

Seems reasonable to me.

Should there be a "FORCE" option to override these checks and do it
anyway? Or is that just asking for trouble?

regards, tom lane

#4Joel Burton
joel@joelburton.com
In reply to: Tom Lane (#2)
Re: TRUNCATE

-----Original Message-----
From: pgsql-hackers-owner@postgresql.org
[mailto:pgsql-hackers-owner@postgresql.org]On Behalf Of Tom Lane
Sent: Sunday, May 12, 2002 12:30 PM
To: Rod Taylor
Cc: Hackers List
Subject: Re: [HACKERS] TRUNCATE

"Rod Taylor" <rbt@zort.ca> writes:

I'm thinking it should check for an on delete rule as well as user
triggers.

Seems reasonable to me.

Should there be a "FORCE" option to override these checks and do it
anyway? Or is that just asking for trouble?

I've relied on being able to TRUNCATE w/o having RI kick in to lots of data
clean ups, forced sorts, etc. I'd find it annoying if I couldn't do this
anymore (or had to do equally-annoying things, like manually drop then
recreate the triggers, etc.)

I'm happy w/o the FORCE option (just let TRUNCATE do it), but if enough
people think that the FORCE keyword should be added to allow overriding of
triggers, that could be a good compromise.

But, please, don't take away the ability to TRUNCATE. Doing it when there
are triggers is one the strengths of TRUNCATE, IMNSHO.

- J.

Joel BURTON | joel@joelburton.com | joelburton.com | aim: wjoelburton
Knowledge Management & Technology Consultant

#5Rod Taylor
rbt@zort.ca
In reply to: Joel Burton (#4)
Re: TRUNCATE

From my limited understanding of truncate in Oracle is it requires the
user to first disable integrity constraints on the table before
truncate will run.

In SQL Server that truncate will not allow truncate if foreign key
constraints exist, but does not execute user delete triggers.

Can't remember nor confirm either of these now. But, for consistency
sake we should enforce the foreign key case. But I really think it
should apply to all constraints, system or user enforced (rules, user
written triggers).

Besides that, theres always Codds twelfth rule which I've always
liked:
The nonsubversion rule: If low-level access is permitted it should not
bypass security or integrity rules.

--
Rod
----- Original Message -----
From: "Joel Burton" <joel@joelburton.com>
To: "Tom Lane" <tgl@sss.pgh.pa.us>; "Rod Taylor" <rbt@zort.ca>
Cc: "Hackers List" <pgsql-hackers@postgresql.org>
Sent: Sunday, May 12, 2002 3:48 PM
Subject: RE: [HACKERS] TRUNCATE

-----Original Message-----
From: pgsql-hackers-owner@postgresql.org
[mailto:pgsql-hackers-owner@postgresql.org]On Behalf Of Tom Lane
Sent: Sunday, May 12, 2002 12:30 PM
To: Rod Taylor
Cc: Hackers List
Subject: Re: [HACKERS] TRUNCATE

"Rod Taylor" <rbt@zort.ca> writes:

I'm thinking it should check for an on delete rule as well as

user

triggers.

Seems reasonable to me.

Should there be a "FORCE" option to override these checks and do

it

anyway? Or is that just asking for trouble?

I've relied on being able to TRUNCATE w/o having RI kick in to lots

of data

clean ups, forced sorts, etc. I'd find it annoying if I couldn't do

this

anymore (or had to do equally-annoying things, like manually drop

then

recreate the triggers, etc.)

I'm happy w/o the FORCE option (just let TRUNCATE do it), but if

enough

people think that the FORCE keyword should be added to allow

overriding of

triggers, that could be a good compromise.

But, please, don't take away the ability to TRUNCATE. Doing it when

there

are triggers is one the strengths of TRUNCATE, IMNSHO.

- J.

Joel BURTON | joel@joelburton.com | joelburton.com | aim:

wjoelburton

Show quoted text

Knowledge Management & Technology Consultant

#6Christopher Kings-Lynne
chriskl@familyhealth.com.au
In reply to: Joel Burton (#4)
Re: TRUNCATE

I'm happy w/o the FORCE option (just let TRUNCATE do it), but if enough
people think that the FORCE keyword should be added to allow overriding of
triggers, that could be a good compromise.

But, please, don't take away the ability to TRUNCATE. Doing it when there
are triggers is one the strengths of TRUNCATE, IMNSHO.

It seems to me that there's more and more need for an 'SET CONSTRAINTS
DISABLED' and 'SET CONSTRAINTS ENABLED' command that affects only foreign
keys. This would basically make it ignore foreign key checks for the
remainder of the transaction. This could be used before a TRUNCATE command,
and would also be essential when we switch to dumping ALTER TABLE/FOREIGN
KEY commands in pg_dump, and we don't want them to be checked...

Chris

#7Joel Burton
joel@joelburton.com
In reply to: Rod Taylor (#5)
Re: TRUNCATE

From my limited understanding of truncate in Oracle is it requires the

user to first disable integrity constraints on the table before
truncate will run.

In SQL Server that truncate will not allow truncate if foreign key
constraints exist, but does not execute user delete triggers.

Can't remember nor confirm either of these now. But, for consistency
sake we should enforce the foreign key case. But I really think it
should apply to all constraints, system or user enforced (rules, user
written triggers).

Besides that, theres always Codds twelfth rule which I've always
liked:
The nonsubversion rule: If low-level access is permitted it should not
bypass security or integrity rules.

Dare I go against Codd, but, really, I've found it very convenient to be
able to export a single table, TRUNCATE it, clean up the data in another
program, and pull it back in. It's much more of a pain to have to dump the
whole db (neccessary or at least sanity preserving if there are lots of
complicated foreign key or trigger rules) or to drop/recreate the
triggers/rules.

The security issue is important, though: it's very likely that I might want
to let an certain class of user DELETE a record (with all the usual
rules/triggers/RI applying), but not let them bypass all that to TRUNCATE.

But I still wouldn't want to see hassle-free truncation disappear in the
name of security or idiot-proofing, if there are reasonable compromises.

- J.

Joel BURTON | joel@joelburton.com | joelburton.com | aim: wjoelburton
Knowledge Management & Technology Consultant

#8Joel Burton
joel@joelburton.com
In reply to: Christopher Kings-Lynne (#6)
Re: TRUNCATE

-----Original Message-----
From: Christopher Kings-Lynne [mailto:chriskl@familyhealth.com.au]
Sent: Sunday, May 12, 2002 10:17 PM
To: Joel Burton; Tom Lane; Rod Taylor
Cc: Hackers List
Subject: RE: [HACKERS] TRUNCATE

I'm happy w/o the FORCE option (just let TRUNCATE do it), but if enough
people think that the FORCE keyword should be added to allow

overriding of

triggers, that could be a good compromise.

But, please, don't take away the ability to TRUNCATE. Doing it

when there

are triggers is one the strengths of TRUNCATE, IMNSHO.

It seems to me that there's more and more need for an 'SET CONSTRAINTS
DISABLED' and 'SET CONSTRAINTS ENABLED' command that affects only foreign
keys. This would basically make it ignore foreign key checks for the
remainder of the transaction. This could be used before a
TRUNCATE command,
and would also be essential when we switch to dumping ALTER TABLE/FOREIGN
KEY commands in pg_dump, and we don't want them to be checked...

This would be different than SET CONSTRAINTS DEFERRED, in that DISABLED
would never perform the checks, even at the end of the transaction?

- J.

Joel BURTON | joel@joelburton.com | joelburton.com | aim: wjoelburton
Knowledge Management & Technology Consultant

#9Noname
nconway@klamath.dyndns.org
In reply to: Christopher Kings-Lynne (#6)
Re: TRUNCATE

On Mon, May 13, 2002 at 10:17:07AM +0800, Christopher Kings-Lynne wrote:

I'm happy w/o the FORCE option (just let TRUNCATE do it), but if enough
people think that the FORCE keyword should be added to allow overriding of
triggers, that could be a good compromise.

But, please, don't take away the ability to TRUNCATE. Doing it when there
are triggers is one the strengths of TRUNCATE, IMNSHO.

It seems to me that there's more and more need for an 'SET CONSTRAINTS
DISABLED' and 'SET CONSTRAINTS ENABLED' command that affects only foreign
keys.

I really dislike the idea of referring to "constraints" but only affecting
foreign key constraints.

And what would be the security/data-integrity ramifications of allowing
this?

Cheers,

Neil

--
Neil Conway <neilconway@rogers.com>
PGP Key ID: DB3C29FC

#10Christopher Kings-Lynne
chriskl@familyhealth.com.au
In reply to: Noname (#9)
Re: TRUNCATE

It seems to me that there's more and more need for an 'SET CONSTRAINTS
DISABLED' and 'SET CONSTRAINTS ENABLED' command that affects

only foreign

keys.

I really dislike the idea of referring to "constraints" but only affecting
foreign key constraints.

All the other SET CONSTRAINTS statments refer only to foreign keys...

And what would be the security/data-integrity ramifications of allowing
this?

Well, if only super users could do it...

Chris

#11Rod Taylor
rbt@zort.ca
In reply to: Joel Burton (#7)
Re: TRUNCATE

I still highly recommend that it be a drop foreign key, grab data,
truncate, import data, reapply foreign key (which will double check
your work) as I believe data and schema integrity should be high goals
of Postgresql (myself anyway).

However, I'd like to know what your doing. ie. Why is this method
the fastest and easiest way.

Given a dataset, how much (%age wise) do you generally modify when you
clean it up? And what is the general dataset size (half million
records?).

I'm making the assumption you almost never delete data (primary key
wise), otherwise foreign keyd data may no longer align. I'm also
making the assumption your either the sole user of the database, or
have a long period where the database is not in use (overnight?).

What do you use to clean it up? Custom script for each job? Regular
expressions? Simple spreadsheet like format filling in numbers?
Complete dump and replace of the data?

Lastly, would a data diff make it easier? Compare the data between
the table (based on the primary key) and your working copy then update
old records as necessary to bring them up to date and insert new
records?

--
Rod
----- Original Message -----
From: "Joel Burton" <joel@joelburton.com>
To: "Rod Taylor" <rbt@zort.ca>; "Tom Lane" <tgl@sss.pgh.pa.us>
Cc: "Hackers List" <pgsql-hackers@postgresql.org>
Sent: Monday, May 13, 2002 12:12 AM
Subject: Re: [HACKERS] TRUNCATE

From my limited understanding of truncate in Oracle is it

requires the

user to first disable integrity constraints on the table before
truncate will run.

In SQL Server that truncate will not allow truncate if foreign key
constraints exist, but does not execute user delete triggers.

Can't remember nor confirm either of these now. But, for

consistency

sake we should enforce the foreign key case. But I really think

it

should apply to all constraints, system or user enforced (rules,

user

written triggers).

Besides that, theres always Codds twelfth rule which I've always
liked:
The nonsubversion rule: If low-level access is permitted it should

not

bypass security or integrity rules.

Dare I go against Codd, but, really, I've found it very convenient

to be

able to export a single table, TRUNCATE it, clean up the data in

another

program, and pull it back in. It's much more of a pain to have to

dump the

whole db (neccessary or at least sanity preserving if there are lots

of

complicated foreign key or trigger rules) or to drop/recreate the
triggers/rules.

The security issue is important, though: it's very likely that I

might want

to let an certain class of user DELETE a record (with all the usual
rules/triggers/RI applying), but not let them bypass all that to

TRUNCATE.

But I still wouldn't want to see hassle-free truncation disappear in

the

name of security or idiot-proofing, if there are reasonable

compromises.

- J.

Joel BURTON | joel@joelburton.com | joelburton.com | aim:

wjoelburton

Knowledge Management & Technology Consultant

---------------------------(end of

broadcast)---------------------------

Show quoted text

TIP 4: Don't 'kill -9' the postmaster

#12Joel Burton
joel@joelburton.com
In reply to: Rod Taylor (#11)
Re: TRUNCATE

I still highly recommend that it be a drop foreign key, grab data,
truncate, import data, reapply foreign key (which will double check
your work) as I believe data and schema integrity should be high goals
of Postgresql (myself anyway).

I agree that they should be high goals.

However, I'd like to know what your doing. ie. Why is this method
the fastest and easiest way.

It's easier than dropping and recreating rules because that takes a bit of
trouble. (If there were any easy way in pg_dump or in psql directly to get
the text of just the rules/triggers/RI declarations for a table, that would
make it a bit easier than pulling that out of the other table stuff in
pg_dump output).

It's easier than a full-database dump/fix/restore because sometimes
(hopefully now historically :) ) pg_dump wasn't a perfect tool: for a while,
it would drop RI statements, or occassionally have a hard time recreating a
view, etc. Plus, of course, with a large database, it can take quite a while
to process.

A limited-to-that-table dump/fix/restore can be a problem because of the
interrelationships of RI among tables. If there were any easier way to dump
information about a table so that I could restore the RI that other tables
have on it, that might be a solution.

Given a dataset, how much (%age wise) do you generally modify when you
clean it up? And what is the general dataset size (half million
records?).

More often than not, I'm working with complex tables and fairly small # of
rows. Perhaps 30 fields x 10,000 records.

I'm making the assumption you almost never delete data (primary key
wise), otherwise foreign keyd data may no longer align. I'm also
making the assumption your either the sole user of the database, or
have a long period where the database is not in use (overnight?).

No, I wouldn't delete things. I don't want to bypass RI, just not have to
deal with removing/creating all the rules every time I need to clean up some
data.

In most cases, yes, I can either take db offline for an hour or ensure that
there will be no writes to the db.

What do you use to clean it up? Custom script for each job? Regular
expressions? Simple spreadsheet like format filling in numbers?
Complete dump and replace of the data?

Generally, I'm doing something like pulling the data into a text file and
using regexes or spreadsheet tools to clean it up. Some of which could be
done (through plperl or plpython or such), but is often easier with full
text manipulation/emacs/etc.

Sometimes, though, I'm just cleaning out test data. For example: often, I'll
create a table where records can't be deleted w/out logging information
going into another table (via rule or trigger, and I usually prohibit
deletions at all from the log table). I'll put some fake records in, delete
a few, see the logging data, and later, when I want to delete the fake data
(& the fake logging data), I'll use TRUNCATE. I could only do this w/a
normal DELETE by dropping these rules/triggers, deleting, and re-creating.
Which is more of a pain than I'd like to do.

Given that only the owner of a table can truncate it, I'm not too worried
about the security of truncate: the owner is the person who would understand
the ramifications of truncate vs. delete. Having it either emit a warning
that there were triggers/rules/RI or (better) requiring a FORCE parameter to
truncate when there are might make others feel safe, though.

- J.

Joel BURTON | joel@joelburton.com | joelburton.com | aim: wjoelburton
Knowledge Management & Technology Consultant

#13Rod Taylor
rbt@zort.ca
In reply to: Joel Burton (#12)
Re: TRUNCATE

A limited-to-that-table dump/fix/restore can be a problem because of

the

interrelationships of RI among tables. If there were any easier way

to dump

information about a table so that I could restore the RI that other

tables

have on it, that might be a solution.

Agreed about making that easier.

What do you use to clean it up? Custom script for each job?

Regular

expressions? Simple spreadsheet like format filling in numbers?
Complete dump and replace of the data?

Generally, I'm doing something like pulling the data into a text

file and

using regexes or spreadsheet tools to clean it up. Some of which

could be

done (through plperl or plpython or such), but is often easier with

full

text manipulation/emacs/etc.

Internal regex support would be useful, as would plpgsql from anywhere
(merge most into standard frontend parser).

Sometimes, though, I'm just cleaning out test data. For example:

often, I'll

create a table where records can't be deleted w/out logging

information

You don't create database testdb with template = productiondb?
Especially since you take it offline anyway.

that there were triggers/rules/RI or (better) requiring a FORCE

parameter to

truncate when there are might make others feel safe, though.

FORCE doesn't really solve the issue for me. I want to remove the
ability to unexpectedly mess up the database. They're usually good
enough to know that drop database is a bad thing. But some of the
other commands have interesting seemingly non-related failures.
Truncate was one, object inter-dependence (what pg_depend covers) was
another area.

Anyway, I'm willing to wait until I (or someone else) can remove the
advantages of truncate over other methods :)