Thread: TRUNCATE

TRUNCATE

From
"Rod Taylor"
Date:
For my own protection I'm adding checks to truncate so that if there
is an ON DELETE trigger it will not execute the truncate command.

Anyway, should it really only be 'Disallow TRUNCATE on tables that are
involved in referential constraints'?

I'm thinking it should check for an on delete rule as well as user
triggers.


--
Rod



Re: TRUNCATE

From
Tom Lane
Date:
"Rod Taylor" <rbt@zort.ca> writes:
> I'm thinking it should check for an on delete rule as well as user
> triggers.

Seems reasonable to me.

Should there be a "FORCE" option to override these checks and do it
anyway?  Or is that just asking for trouble?
        regards, tom lane


Re: TRUNCATE

From
"Rod Taylor"
Date:
The only time I can think of that a FORCE type mechanism would be
allowed would be internal functions.  Perhaps a new cluster (copy
data, truncate table, copy data back sorted).

Internal stuff can call heap_truncate() directly rather than going
through TruncateRelation.

A user style force is to simply drop all rules, foreign keys,
triggers, etc -- do the action -- re-apply constraints.  Anything else
could mean their data isn't consistent.

--
Rod
----- Original Message -----
From: "Tom Lane" <tgl@sss.pgh.pa.us>
To: "Rod Taylor" <rbt@zort.ca>
Cc: "Hackers List" <pgsql-hackers@postgresql.org>
Sent: Sunday, May 12, 2002 12:30 PM
Subject: Re: [HACKERS] TRUNCATE


> "Rod Taylor" <rbt@zort.ca> writes:
> > I'm thinking it should check for an on delete rule as well as user
> > triggers.
>
> Seems reasonable to me.
>
> Should there be a "FORCE" option to override these checks and do it
> anyway?  Or is that just asking for trouble?
>
> regards, tom lane
>



Re: TRUNCATE

From
"Joel Burton"
Date:
> -----Original Message-----
> From: pgsql-hackers-owner@postgresql.org
> [mailto:pgsql-hackers-owner@postgresql.org]On Behalf Of Tom Lane
> Sent: Sunday, May 12, 2002 12:30 PM
> To: Rod Taylor
> Cc: Hackers List
> Subject: Re: [HACKERS] TRUNCATE
>
>
> "Rod Taylor" <rbt@zort.ca> writes:
> > I'm thinking it should check for an on delete rule as well as user
> > triggers.
>
> Seems reasonable to me.
>
> Should there be a "FORCE" option to override these checks and do it
> anyway?  Or is that just asking for trouble?

I've relied on being able to TRUNCATE w/o having RI kick in to lots of data
clean ups, forced sorts, etc. I'd find it annoying if I couldn't do this
anymore (or had to do equally-annoying things, like manually drop then
recreate the triggers, etc.)

I'm happy w/o the FORCE option (just let TRUNCATE do it), but if enough
people think that the FORCE keyword should be added to allow overriding of
triggers, that could be a good compromise.

But, please, don't take away the ability to TRUNCATE. Doing it when there
are triggers is one the strengths of TRUNCATE, IMNSHO.

- J.

Joel BURTON | joel@joelburton.com | joelburton.com | aim: wjoelburton
Knowledge Management & Technology Consultant



Re: TRUNCATE

From
"Rod Taylor"
Date:
From my limited understanding of truncate in Oracle is it requires the
user to first disable integrity constraints on the table before
truncate will run.

In SQL Server that truncate will not allow truncate if foreign key
constraints exist, but does not execute user delete triggers.

Can't remember nor confirm either of these now.  But, for consistency
sake we should enforce the foreign key case.  But I really think it
should apply to all constraints, system or user enforced (rules, user
written triggers).

Besides that, theres always Codds twelfth rule which I've always
liked:
The nonsubversion rule: If low-level access is permitted it should not
bypass security or integrity rules.

--
Rod
----- Original Message -----
From: "Joel Burton" <joel@joelburton.com>
To: "Tom Lane" <tgl@sss.pgh.pa.us>; "Rod Taylor" <rbt@zort.ca>
Cc: "Hackers List" <pgsql-hackers@postgresql.org>
Sent: Sunday, May 12, 2002 3:48 PM
Subject: RE: [HACKERS] TRUNCATE


> > -----Original Message-----
> > From: pgsql-hackers-owner@postgresql.org
> > [mailto:pgsql-hackers-owner@postgresql.org]On Behalf Of Tom Lane
> > Sent: Sunday, May 12, 2002 12:30 PM
> > To: Rod Taylor
> > Cc: Hackers List
> > Subject: Re: [HACKERS] TRUNCATE
> >
> >
> > "Rod Taylor" <rbt@zort.ca> writes:
> > > I'm thinking it should check for an on delete rule as well as
user
> > > triggers.
> >
> > Seems reasonable to me.
> >
> > Should there be a "FORCE" option to override these checks and do
it
> > anyway?  Or is that just asking for trouble?
>
> I've relied on being able to TRUNCATE w/o having RI kick in to lots
of data
> clean ups, forced sorts, etc. I'd find it annoying if I couldn't do
this
> anymore (or had to do equally-annoying things, like manually drop
then
> recreate the triggers, etc.)
>
> I'm happy w/o the FORCE option (just let TRUNCATE do it), but if
enough
> people think that the FORCE keyword should be added to allow
overriding of
> triggers, that could be a good compromise.
>
> But, please, don't take away the ability to TRUNCATE. Doing it when
there
> are triggers is one the strengths of TRUNCATE, IMNSHO.
>
> - J.
>
> Joel BURTON | joel@joelburton.com | joelburton.com | aim:
wjoelburton
> Knowledge Management & Technology Consultant
>



Re: TRUNCATE

From
"Christopher Kings-Lynne"
Date:
> I'm happy w/o the FORCE option (just let TRUNCATE do it), but if enough
> people think that the FORCE keyword should be added to allow overriding of
> triggers, that could be a good compromise.
>
> But, please, don't take away the ability to TRUNCATE. Doing it when there
> are triggers is one the strengths of TRUNCATE, IMNSHO.

It seems to me that there's more and more need for an 'SET CONSTRAINTS
DISABLED' and 'SET CONSTRAINTS ENABLED' command that affects only foreign
keys.  This would basically make it ignore foreign key checks for the
remainder of the transaction.  This could be used before a TRUNCATE command,
and would also be essential when we switch to dumping ALTER TABLE/FOREIGN
KEY commands in pg_dump, and we don't want them to be checked...

Chris



Re: TRUNCATE

From
"Joel Burton"
Date:
> >From my limited understanding of truncate in Oracle is it requires the
> user to first disable integrity constraints on the table before
> truncate will run.
>
> In SQL Server that truncate will not allow truncate if foreign key
> constraints exist, but does not execute user delete triggers.
>
> Can't remember nor confirm either of these now.  But, for consistency
> sake we should enforce the foreign key case.  But I really think it
> should apply to all constraints, system or user enforced (rules, user
> written triggers).
>
> Besides that, theres always Codds twelfth rule which I've always
> liked:
> The nonsubversion rule: If low-level access is permitted it should not
> bypass security or integrity rules.

Dare I go against Codd, but, really, I've found it very convenient to be
able to export a single table, TRUNCATE it, clean up the data in another
program, and pull it back in. It's much more of a pain to have to dump the
whole db (neccessary or at least sanity preserving if there are lots of
complicated foreign key or trigger rules) or to drop/recreate the
triggers/rules.

The security issue is important, though: it's very likely that I might want
to let an certain class of user DELETE a record (with all the usual
rules/triggers/RI applying), but not let them bypass all that to TRUNCATE.

But I still wouldn't want to see hassle-free truncation disappear in the
name of security or idiot-proofing, if there are reasonable compromises.

- J.

Joel BURTON | joel@joelburton.com | joelburton.com | aim: wjoelburton
Knowledge Management & Technology Consultant



Re: TRUNCATE

From
"Joel Burton"
Date:
> -----Original Message-----
> From: Christopher Kings-Lynne [mailto:chriskl@familyhealth.com.au]
> Sent: Sunday, May 12, 2002 10:17 PM
> To: Joel Burton; Tom Lane; Rod Taylor
> Cc: Hackers List
> Subject: RE: [HACKERS] TRUNCATE
>
>
> > I'm happy w/o the FORCE option (just let TRUNCATE do it), but if enough
> > people think that the FORCE keyword should be added to allow
> overriding of
> > triggers, that could be a good compromise.
> >
> > But, please, don't take away the ability to TRUNCATE. Doing it
> when there
> > are triggers is one the strengths of TRUNCATE, IMNSHO.
>
> It seems to me that there's more and more need for an 'SET CONSTRAINTS
> DISABLED' and 'SET CONSTRAINTS ENABLED' command that affects only foreign
> keys.  This would basically make it ignore foreign key checks for the
> remainder of the transaction.  This could be used before a
> TRUNCATE command,
> and would also be essential when we switch to dumping ALTER TABLE/FOREIGN
> KEY commands in pg_dump, and we don't want them to be checked...

This would be different than SET CONSTRAINTS DEFERRED, in that DISABLED
would never perform the checks, even at the end of the transaction?

- J.

Joel BURTON | joel@joelburton.com | joelburton.com | aim: wjoelburton
Knowledge Management & Technology Consultant



Re: TRUNCATE

From
nconway@klamath.dyndns.org (Neil Conway)
Date:
On Mon, May 13, 2002 at 10:17:07AM +0800, Christopher Kings-Lynne wrote:
> > I'm happy w/o the FORCE option (just let TRUNCATE do it), but if enough
> > people think that the FORCE keyword should be added to allow overriding of
> > triggers, that could be a good compromise.
> >
> > But, please, don't take away the ability to TRUNCATE. Doing it when there
> > are triggers is one the strengths of TRUNCATE, IMNSHO.
> 
> It seems to me that there's more and more need for an 'SET CONSTRAINTS
> DISABLED' and 'SET CONSTRAINTS ENABLED' command that affects only foreign
> keys.

I really dislike the idea of referring to "constraints" but only affecting
foreign key constraints.

And what would be the security/data-integrity ramifications of allowing
this?

Cheers,

Neil

-- 
Neil Conway <neilconway@rogers.com>
PGP Key ID: DB3C29FC


Re: TRUNCATE

From
"Christopher Kings-Lynne"
Date:
> > It seems to me that there's more and more need for an 'SET CONSTRAINTS
> > DISABLED' and 'SET CONSTRAINTS ENABLED' command that affects
> only foreign
> > keys.
>
> I really dislike the idea of referring to "constraints" but only affecting
> foreign key constraints.

All the other SET CONSTRAINTS statments refer only to foreign keys...

> And what would be the security/data-integrity ramifications of allowing
> this?

Well, if only super users could do it...

Chris



Re: TRUNCATE

From
"Rod Taylor"
Date:
I still highly recommend that it be a drop foreign key, grab data,
truncate, import data, reapply foreign key (which will double check
your work) as I believe data and schema integrity should be high goals
of Postgresql (myself anyway).

However, I'd like to know what your doing.  ie.  Why is this method
the fastest and easiest way.

Given a dataset, how much (%age wise) do you generally modify when you
clean it up? And what is the general dataset size (half million
records?).

I'm making the assumption you almost never delete data (primary key
wise), otherwise foreign keyd data may no longer align.  I'm also
making the assumption your either the sole user of the database, or
have a long period where the database is not in use (overnight?).

What do you use to clean it up?  Custom script for each job?  Regular
expressions?  Simple spreadsheet like format filling in numbers?
Complete dump and replace of the data?


Lastly, would a data diff make it easier?  Compare the data between
the table (based on the primary key) and your working copy then update
old records as necessary to bring them up to date and insert new
records?

--
Rod
----- Original Message -----
From: "Joel Burton" <joel@joelburton.com>
To: "Rod Taylor" <rbt@zort.ca>; "Tom Lane" <tgl@sss.pgh.pa.us>
Cc: "Hackers List" <pgsql-hackers@postgresql.org>
Sent: Monday, May 13, 2002 12:12 AM
Subject: Re: [HACKERS] TRUNCATE


> > >From my limited understanding of truncate in Oracle is it
requires the
> > user to first disable integrity constraints on the table before
> > truncate will run.
> >
> > In SQL Server that truncate will not allow truncate if foreign key
> > constraints exist, but does not execute user delete triggers.
> >
> > Can't remember nor confirm either of these now.  But, for
consistency
> > sake we should enforce the foreign key case.  But I really think
it
> > should apply to all constraints, system or user enforced (rules,
user
> > written triggers).
> >
> > Besides that, theres always Codds twelfth rule which I've always
> > liked:
> > The nonsubversion rule: If low-level access is permitted it should
not
> > bypass security or integrity rules.
>
> Dare I go against Codd, but, really, I've found it very convenient
to be
> able to export a single table, TRUNCATE it, clean up the data in
another
> program, and pull it back in. It's much more of a pain to have to
dump the
> whole db (neccessary or at least sanity preserving if there are lots
of
> complicated foreign key or trigger rules) or to drop/recreate the
> triggers/rules.
>
> The security issue is important, though: it's very likely that I
might want
> to let an certain class of user DELETE a record (with all the usual
> rules/triggers/RI applying), but not let them bypass all that to
TRUNCATE.
>
> But I still wouldn't want to see hassle-free truncation disappear in
the
> name of security or idiot-proofing, if there are reasonable
compromises.
>
> - J.
>
> Joel BURTON | joel@joelburton.com | joelburton.com | aim:
wjoelburton
> Knowledge Management & Technology Consultant
>
>
> ---------------------------(end of
broadcast)---------------------------
> TIP 4: Don't 'kill -9' the postmaster
>



Re: TRUNCATE

From
"Joel Burton"
Date:
> I still highly recommend that it be a drop foreign key, grab data,
> truncate, import data, reapply foreign key (which will double check
> your work) as I believe data and schema integrity should be high goals
> of Postgresql (myself anyway).

I agree that they should be high goals.

> However, I'd like to know what your doing.  ie.  Why is this method
> the fastest and easiest way.

It's easier than dropping and recreating rules because that takes a bit of
trouble. (If there were any easy way in pg_dump or in psql directly to get
the text of just the rules/triggers/RI declarations for a table, that would
make it a bit easier than pulling that out of the other table stuff in
pg_dump output).

It's easier than a full-database dump/fix/restore because sometimes
(hopefully now historically :) ) pg_dump wasn't a perfect tool: for a while,
it would drop RI statements, or occassionally have a hard time recreating a
view, etc. Plus, of course, with a large database, it can take quite a while
to process.

A limited-to-that-table dump/fix/restore can be a problem because of the
interrelationships of RI among tables. If there were any easier way to dump
information about a table so that I could restore the RI that other tables
have on it, that might be a solution.

> Given a dataset, how much (%age wise) do you generally modify when you
> clean it up? And what is the general dataset size (half million
> records?).

More often than not, I'm working with complex tables and fairly small # of
rows. Perhaps 30 fields x 10,000 records.

> I'm making the assumption you almost never delete data (primary key
> wise), otherwise foreign keyd data may no longer align.  I'm also
> making the assumption your either the sole user of the database, or
> have a long period where the database is not in use (overnight?).

No, I wouldn't delete things. I don't want to bypass RI, just not have to
deal with removing/creating all the rules every time I need to clean up some
data.

In most cases, yes, I can either take db offline for an hour or ensure that
there will be no writes to the db.

> What do you use to clean it up?  Custom script for each job?  Regular
> expressions?  Simple spreadsheet like format filling in numbers?
> Complete dump and replace of the data?

Generally, I'm doing something like pulling the data into a text file and
using regexes or spreadsheet tools to clean it up. Some of which could be
done (through plperl or plpython or such), but is often easier with full
text manipulation/emacs/etc.

Sometimes, though, I'm just cleaning out test data. For example: often, I'll
create a table where records can't be deleted w/out logging information
going into another table (via rule or trigger, and I usually prohibit
deletions at all from the log table). I'll put some fake records in, delete
a few, see the logging data, and later, when I want to delete the fake data
(& the fake logging data), I'll use TRUNCATE. I could only do this w/a
normal DELETE by dropping these rules/triggers, deleting, and re-creating.
Which is more of a pain than I'd like to do.

Given that only the owner of a table can truncate it, I'm not too worried
about the security of truncate: the owner is the person who would understand
the ramifications of truncate vs. delete. Having it either emit a warning
that there were triggers/rules/RI or (better) requiring a FORCE parameter to
truncate when there are might make others feel safe, though.

- J.

Joel BURTON | joel@joelburton.com | joelburton.com | aim: wjoelburton
Knowledge Management & Technology Consultant



Re: TRUNCATE

From
"Rod Taylor"
Date:
> A limited-to-that-table dump/fix/restore can be a problem because of
the
> interrelationships of RI among tables. If there were any easier way
to dump
> information about a table so that I could restore the RI that other
tables
> have on it, that might be a solution.

Agreed about making that easier.

> > What do you use to clean it up?  Custom script for each job?
Regular
> > expressions?  Simple spreadsheet like format filling in numbers?
> > Complete dump and replace of the data?
>
> Generally, I'm doing something like pulling the data into a text
file and
> using regexes or spreadsheet tools to clean it up. Some of which
could be
> done (through plperl or plpython or such), but is often easier with
full
> text manipulation/emacs/etc.

Internal regex support would be useful, as would plpgsql from anywhere
(merge most into standard frontend parser).

> Sometimes, though, I'm just cleaning out test data. For example:
often, I'll
> create a table where records can't be deleted w/out logging
information

You don't create database testdb with template = productiondb?
Especially since you take it offline anyway.

> that there were triggers/rules/RI or (better) requiring a FORCE
parameter to
> truncate when there are might make others feel safe, though.

FORCE doesn't really solve the issue for me.  I want to remove the
ability to unexpectedly mess up the database.  They're usually good
enough to know that drop database is a bad thing.  But some of the
other commands have interesting seemingly non-related failures.
Truncate was one, object inter-dependence (what pg_depend covers) was
another area.

Anyway, I'm willing to wait until I (or someone else) can remove the
advantages of truncate over other methods :)