Thread: TRUNCATE
For my own protection I'm adding checks to truncate so that if there is an ON DELETE trigger it will not execute the truncate command. Anyway, should it really only be 'Disallow TRUNCATE on tables that are involved in referential constraints'? I'm thinking it should check for an on delete rule as well as user triggers. -- Rod
"Rod Taylor" <rbt@zort.ca> writes: > I'm thinking it should check for an on delete rule as well as user > triggers. Seems reasonable to me. Should there be a "FORCE" option to override these checks and do it anyway? Or is that just asking for trouble? regards, tom lane
The only time I can think of that a FORCE type mechanism would be allowed would be internal functions. Perhaps a new cluster (copy data, truncate table, copy data back sorted). Internal stuff can call heap_truncate() directly rather than going through TruncateRelation. A user style force is to simply drop all rules, foreign keys, triggers, etc -- do the action -- re-apply constraints. Anything else could mean their data isn't consistent. -- Rod ----- Original Message ----- From: "Tom Lane" <tgl@sss.pgh.pa.us> To: "Rod Taylor" <rbt@zort.ca> Cc: "Hackers List" <pgsql-hackers@postgresql.org> Sent: Sunday, May 12, 2002 12:30 PM Subject: Re: [HACKERS] TRUNCATE > "Rod Taylor" <rbt@zort.ca> writes: > > I'm thinking it should check for an on delete rule as well as user > > triggers. > > Seems reasonable to me. > > Should there be a "FORCE" option to override these checks and do it > anyway? Or is that just asking for trouble? > > regards, tom lane >
> -----Original Message----- > From: pgsql-hackers-owner@postgresql.org > [mailto:pgsql-hackers-owner@postgresql.org]On Behalf Of Tom Lane > Sent: Sunday, May 12, 2002 12:30 PM > To: Rod Taylor > Cc: Hackers List > Subject: Re: [HACKERS] TRUNCATE > > > "Rod Taylor" <rbt@zort.ca> writes: > > I'm thinking it should check for an on delete rule as well as user > > triggers. > > Seems reasonable to me. > > Should there be a "FORCE" option to override these checks and do it > anyway? Or is that just asking for trouble? I've relied on being able to TRUNCATE w/o having RI kick in to lots of data clean ups, forced sorts, etc. I'd find it annoying if I couldn't do this anymore (or had to do equally-annoying things, like manually drop then recreate the triggers, etc.) I'm happy w/o the FORCE option (just let TRUNCATE do it), but if enough people think that the FORCE keyword should be added to allow overriding of triggers, that could be a good compromise. But, please, don't take away the ability to TRUNCATE. Doing it when there are triggers is one the strengths of TRUNCATE, IMNSHO. - J. Joel BURTON | joel@joelburton.com | joelburton.com | aim: wjoelburton Knowledge Management & Technology Consultant
From my limited understanding of truncate in Oracle is it requires the user to first disable integrity constraints on the table before truncate will run. In SQL Server that truncate will not allow truncate if foreign key constraints exist, but does not execute user delete triggers. Can't remember nor confirm either of these now. But, for consistency sake we should enforce the foreign key case. But I really think it should apply to all constraints, system or user enforced (rules, user written triggers). Besides that, theres always Codds twelfth rule which I've always liked: The nonsubversion rule: If low-level access is permitted it should not bypass security or integrity rules. -- Rod ----- Original Message ----- From: "Joel Burton" <joel@joelburton.com> To: "Tom Lane" <tgl@sss.pgh.pa.us>; "Rod Taylor" <rbt@zort.ca> Cc: "Hackers List" <pgsql-hackers@postgresql.org> Sent: Sunday, May 12, 2002 3:48 PM Subject: RE: [HACKERS] TRUNCATE > > -----Original Message----- > > From: pgsql-hackers-owner@postgresql.org > > [mailto:pgsql-hackers-owner@postgresql.org]On Behalf Of Tom Lane > > Sent: Sunday, May 12, 2002 12:30 PM > > To: Rod Taylor > > Cc: Hackers List > > Subject: Re: [HACKERS] TRUNCATE > > > > > > "Rod Taylor" <rbt@zort.ca> writes: > > > I'm thinking it should check for an on delete rule as well as user > > > triggers. > > > > Seems reasonable to me. > > > > Should there be a "FORCE" option to override these checks and do it > > anyway? Or is that just asking for trouble? > > I've relied on being able to TRUNCATE w/o having RI kick in to lots of data > clean ups, forced sorts, etc. I'd find it annoying if I couldn't do this > anymore (or had to do equally-annoying things, like manually drop then > recreate the triggers, etc.) > > I'm happy w/o the FORCE option (just let TRUNCATE do it), but if enough > people think that the FORCE keyword should be added to allow overriding of > triggers, that could be a good compromise. > > But, please, don't take away the ability to TRUNCATE. Doing it when there > are triggers is one the strengths of TRUNCATE, IMNSHO. > > - J. > > Joel BURTON | joel@joelburton.com | joelburton.com | aim: wjoelburton > Knowledge Management & Technology Consultant >
> I'm happy w/o the FORCE option (just let TRUNCATE do it), but if enough > people think that the FORCE keyword should be added to allow overriding of > triggers, that could be a good compromise. > > But, please, don't take away the ability to TRUNCATE. Doing it when there > are triggers is one the strengths of TRUNCATE, IMNSHO. It seems to me that there's more and more need for an 'SET CONSTRAINTS DISABLED' and 'SET CONSTRAINTS ENABLED' command that affects only foreign keys. This would basically make it ignore foreign key checks for the remainder of the transaction. This could be used before a TRUNCATE command, and would also be essential when we switch to dumping ALTER TABLE/FOREIGN KEY commands in pg_dump, and we don't want them to be checked... Chris
> >From my limited understanding of truncate in Oracle is it requires the > user to first disable integrity constraints on the table before > truncate will run. > > In SQL Server that truncate will not allow truncate if foreign key > constraints exist, but does not execute user delete triggers. > > Can't remember nor confirm either of these now. But, for consistency > sake we should enforce the foreign key case. But I really think it > should apply to all constraints, system or user enforced (rules, user > written triggers). > > Besides that, theres always Codds twelfth rule which I've always > liked: > The nonsubversion rule: If low-level access is permitted it should not > bypass security or integrity rules. Dare I go against Codd, but, really, I've found it very convenient to be able to export a single table, TRUNCATE it, clean up the data in another program, and pull it back in. It's much more of a pain to have to dump the whole db (neccessary or at least sanity preserving if there are lots of complicated foreign key or trigger rules) or to drop/recreate the triggers/rules. The security issue is important, though: it's very likely that I might want to let an certain class of user DELETE a record (with all the usual rules/triggers/RI applying), but not let them bypass all that to TRUNCATE. But I still wouldn't want to see hassle-free truncation disappear in the name of security or idiot-proofing, if there are reasonable compromises. - J. Joel BURTON | joel@joelburton.com | joelburton.com | aim: wjoelburton Knowledge Management & Technology Consultant
> -----Original Message----- > From: Christopher Kings-Lynne [mailto:chriskl@familyhealth.com.au] > Sent: Sunday, May 12, 2002 10:17 PM > To: Joel Burton; Tom Lane; Rod Taylor > Cc: Hackers List > Subject: RE: [HACKERS] TRUNCATE > > > > I'm happy w/o the FORCE option (just let TRUNCATE do it), but if enough > > people think that the FORCE keyword should be added to allow > overriding of > > triggers, that could be a good compromise. > > > > But, please, don't take away the ability to TRUNCATE. Doing it > when there > > are triggers is one the strengths of TRUNCATE, IMNSHO. > > It seems to me that there's more and more need for an 'SET CONSTRAINTS > DISABLED' and 'SET CONSTRAINTS ENABLED' command that affects only foreign > keys. This would basically make it ignore foreign key checks for the > remainder of the transaction. This could be used before a > TRUNCATE command, > and would also be essential when we switch to dumping ALTER TABLE/FOREIGN > KEY commands in pg_dump, and we don't want them to be checked... This would be different than SET CONSTRAINTS DEFERRED, in that DISABLED would never perform the checks, even at the end of the transaction? - J. Joel BURTON | joel@joelburton.com | joelburton.com | aim: wjoelburton Knowledge Management & Technology Consultant
On Mon, May 13, 2002 at 10:17:07AM +0800, Christopher Kings-Lynne wrote: > > I'm happy w/o the FORCE option (just let TRUNCATE do it), but if enough > > people think that the FORCE keyword should be added to allow overriding of > > triggers, that could be a good compromise. > > > > But, please, don't take away the ability to TRUNCATE. Doing it when there > > are triggers is one the strengths of TRUNCATE, IMNSHO. > > It seems to me that there's more and more need for an 'SET CONSTRAINTS > DISABLED' and 'SET CONSTRAINTS ENABLED' command that affects only foreign > keys. I really dislike the idea of referring to "constraints" but only affecting foreign key constraints. And what would be the security/data-integrity ramifications of allowing this? Cheers, Neil -- Neil Conway <neilconway@rogers.com> PGP Key ID: DB3C29FC
> > It seems to me that there's more and more need for an 'SET CONSTRAINTS > > DISABLED' and 'SET CONSTRAINTS ENABLED' command that affects > only foreign > > keys. > > I really dislike the idea of referring to "constraints" but only affecting > foreign key constraints. All the other SET CONSTRAINTS statments refer only to foreign keys... > And what would be the security/data-integrity ramifications of allowing > this? Well, if only super users could do it... Chris
I still highly recommend that it be a drop foreign key, grab data, truncate, import data, reapply foreign key (which will double check your work) as I believe data and schema integrity should be high goals of Postgresql (myself anyway). However, I'd like to know what your doing. ie. Why is this method the fastest and easiest way. Given a dataset, how much (%age wise) do you generally modify when you clean it up? And what is the general dataset size (half million records?). I'm making the assumption you almost never delete data (primary key wise), otherwise foreign keyd data may no longer align. I'm also making the assumption your either the sole user of the database, or have a long period where the database is not in use (overnight?). What do you use to clean it up? Custom script for each job? Regular expressions? Simple spreadsheet like format filling in numbers? Complete dump and replace of the data? Lastly, would a data diff make it easier? Compare the data between the table (based on the primary key) and your working copy then update old records as necessary to bring them up to date and insert new records? -- Rod ----- Original Message ----- From: "Joel Burton" <joel@joelburton.com> To: "Rod Taylor" <rbt@zort.ca>; "Tom Lane" <tgl@sss.pgh.pa.us> Cc: "Hackers List" <pgsql-hackers@postgresql.org> Sent: Monday, May 13, 2002 12:12 AM Subject: Re: [HACKERS] TRUNCATE > > >From my limited understanding of truncate in Oracle is it requires the > > user to first disable integrity constraints on the table before > > truncate will run. > > > > In SQL Server that truncate will not allow truncate if foreign key > > constraints exist, but does not execute user delete triggers. > > > > Can't remember nor confirm either of these now. But, for consistency > > sake we should enforce the foreign key case. But I really think it > > should apply to all constraints, system or user enforced (rules, user > > written triggers). > > > > Besides that, theres always Codds twelfth rule which I've always > > liked: > > The nonsubversion rule: If low-level access is permitted it should not > > bypass security or integrity rules. > > Dare I go against Codd, but, really, I've found it very convenient to be > able to export a single table, TRUNCATE it, clean up the data in another > program, and pull it back in. It's much more of a pain to have to dump the > whole db (neccessary or at least sanity preserving if there are lots of > complicated foreign key or trigger rules) or to drop/recreate the > triggers/rules. > > The security issue is important, though: it's very likely that I might want > to let an certain class of user DELETE a record (with all the usual > rules/triggers/RI applying), but not let them bypass all that to TRUNCATE. > > But I still wouldn't want to see hassle-free truncation disappear in the > name of security or idiot-proofing, if there are reasonable compromises. > > - J. > > Joel BURTON | joel@joelburton.com | joelburton.com | aim: wjoelburton > Knowledge Management & Technology Consultant > > > ---------------------------(end of broadcast)--------------------------- > TIP 4: Don't 'kill -9' the postmaster >
> I still highly recommend that it be a drop foreign key, grab data, > truncate, import data, reapply foreign key (which will double check > your work) as I believe data and schema integrity should be high goals > of Postgresql (myself anyway). I agree that they should be high goals. > However, I'd like to know what your doing. ie. Why is this method > the fastest and easiest way. It's easier than dropping and recreating rules because that takes a bit of trouble. (If there were any easy way in pg_dump or in psql directly to get the text of just the rules/triggers/RI declarations for a table, that would make it a bit easier than pulling that out of the other table stuff in pg_dump output). It's easier than a full-database dump/fix/restore because sometimes (hopefully now historically :) ) pg_dump wasn't a perfect tool: for a while, it would drop RI statements, or occassionally have a hard time recreating a view, etc. Plus, of course, with a large database, it can take quite a while to process. A limited-to-that-table dump/fix/restore can be a problem because of the interrelationships of RI among tables. If there were any easier way to dump information about a table so that I could restore the RI that other tables have on it, that might be a solution. > Given a dataset, how much (%age wise) do you generally modify when you > clean it up? And what is the general dataset size (half million > records?). More often than not, I'm working with complex tables and fairly small # of rows. Perhaps 30 fields x 10,000 records. > I'm making the assumption you almost never delete data (primary key > wise), otherwise foreign keyd data may no longer align. I'm also > making the assumption your either the sole user of the database, or > have a long period where the database is not in use (overnight?). No, I wouldn't delete things. I don't want to bypass RI, just not have to deal with removing/creating all the rules every time I need to clean up some data. In most cases, yes, I can either take db offline for an hour or ensure that there will be no writes to the db. > What do you use to clean it up? Custom script for each job? Regular > expressions? Simple spreadsheet like format filling in numbers? > Complete dump and replace of the data? Generally, I'm doing something like pulling the data into a text file and using regexes or spreadsheet tools to clean it up. Some of which could be done (through plperl or plpython or such), but is often easier with full text manipulation/emacs/etc. Sometimes, though, I'm just cleaning out test data. For example: often, I'll create a table where records can't be deleted w/out logging information going into another table (via rule or trigger, and I usually prohibit deletions at all from the log table). I'll put some fake records in, delete a few, see the logging data, and later, when I want to delete the fake data (& the fake logging data), I'll use TRUNCATE. I could only do this w/a normal DELETE by dropping these rules/triggers, deleting, and re-creating. Which is more of a pain than I'd like to do. Given that only the owner of a table can truncate it, I'm not too worried about the security of truncate: the owner is the person who would understand the ramifications of truncate vs. delete. Having it either emit a warning that there were triggers/rules/RI or (better) requiring a FORCE parameter to truncate when there are might make others feel safe, though. - J. Joel BURTON | joel@joelburton.com | joelburton.com | aim: wjoelburton Knowledge Management & Technology Consultant
> A limited-to-that-table dump/fix/restore can be a problem because of the > interrelationships of RI among tables. If there were any easier way to dump > information about a table so that I could restore the RI that other tables > have on it, that might be a solution. Agreed about making that easier. > > What do you use to clean it up? Custom script for each job? Regular > > expressions? Simple spreadsheet like format filling in numbers? > > Complete dump and replace of the data? > > Generally, I'm doing something like pulling the data into a text file and > using regexes or spreadsheet tools to clean it up. Some of which could be > done (through plperl or plpython or such), but is often easier with full > text manipulation/emacs/etc. Internal regex support would be useful, as would plpgsql from anywhere (merge most into standard frontend parser). > Sometimes, though, I'm just cleaning out test data. For example: often, I'll > create a table where records can't be deleted w/out logging information You don't create database testdb with template = productiondb? Especially since you take it offline anyway. > that there were triggers/rules/RI or (better) requiring a FORCE parameter to > truncate when there are might make others feel safe, though. FORCE doesn't really solve the issue for me. I want to remove the ability to unexpectedly mess up the database. They're usually good enough to know that drop database is a bad thing. But some of the other commands have interesting seemingly non-related failures. Truncate was one, object inter-dependence (what pg_depend covers) was another area. Anyway, I'm willing to wait until I (or someone else) can remove the advantages of truncate over other methods :)