Thread: Problem with disabling triggers in pg_dump
An interesting problem has been brought to my attention in pg_dump (V7.0.2+, I think). It uses the following code to disable triggers prior to a data load: UPDATE "pg_class" SET "reltriggers" = 0 WHERE "relname" !~ '^pg_'; which works fine if it is currently connected as a superuser, or as the datdba. However, if it is connected as anybody else, they will get the error: pg_class: Permission denied The obvious solution is to reconnect as the datdba before running the code. But that option may not be possible because passwords may be enabled and the person restoring the database may not know the password for the datdba, but might have a valid 'superuser' account which they could use instead. So, how does this sound: add another arg to pg_restore, --superuser=name, which allows the person restoring the database to specify the superuser account to use, and if none is specified, then use the account it finds in the dba field of the database it is restoring to. I would also modify pg_dump to dump the original datdba, in case a mythical future release does the 'create database' part as well. Another option would be to pop up a username prompt as well as the password prompt when it wants the superuser on a system that requires passwords. Comments would definitely be appreciated... ---------------------------------------------------------------- Philip Warner | __---_____ Albatross Consulting Pty. Ltd. |----/ - \ (A.C.N. 008 659 498) | /(@) ______---_ Tel: (+61) 0500 83 82 81 | _________ \ Fax: (+61) 0500 83 82 82 | ___________ | Http://www.rhyme.com.au | / \| | --________-- PGP key available upon request, | / and from pgp5.ai.mit.edu:11371 |/
Philip Warner wrote: > > An interesting problem has been brought to my attention in pg_dump > (V7.0.2+, I think). > > It uses the following code to disable triggers prior to a data load: > > UPDATE "pg_class" SET "reltriggers" = 0 WHERE "relname" !~ '^pg_'; > > which works fine if it is currently connected as a superuser, or as the > datdba. However, if it is connected as anybody else, they will get the error: > > pg_class: Permission denied > > The obvious solution is to reconnect as the datdba before running the code. The above was a rude hack. Introduced because of the lack of time when we discovered the problem during 7.0 BETA cycle (IIRC). The ovious "solution" is to fix it. The clean solution would be to have something like ALTER SESSION DISABLE/ENABLE TRIGGERS A global variable (like those for other session settings) would cause the trigger manager to suppress their invocation at all. The new utility has to check if either the user has ALTER ALL TABLES privilege, or has ALTER TABLE privilege for any existing user table where triggers are defined for. Jan -- #======================================================================# # It's easier to get forgiveness for being wrong than for being right. # # Let's break this rule - forgive me. # #================================================== JanWieck@Yahoo.com #
At 15:52 25/07/00 +0200, Jan Wieck wrote: >Philip Warner wrote: >> >> An interesting problem has been brought to my attention in pg_dump >> (V7.0.2+, I think). >> >> It uses the following code to disable triggers prior to a data load: >> >> UPDATE "pg_class" SET "reltriggers" = 0 WHERE "relname" !~ '^pg_'; >> >> which works fine if it is currently connected as a superuser, or as the >> datdba. However, if it is connected as anybody else, they will get the error: >> >> pg_class: Permission denied >> >> The obvious solution is to reconnect as the datdba before running the code. > > The clean solution would be to have something like > > ALTER SESSION DISABLE/ENABLE TRIGGERS Unfortunately, I am trying to keep the new pg_dump working with 7.0.2 as well, so I need a solution for the old hack as well as the new priv model... But, as far as 7.1 goes, this sounds like the way to go. FWIW, we'd also want to consider: ALTER SEESION DISABLE CONSTRAINTS etc etc. plus even, ALTER TRIGGER <name> on <table> DISABLE or ALTER TABLE <name> DISABLE TRIGGERS or (to avoid messing with valid DDL), ALTER SESSION DISABLE TRIGGER <trig-name> ALTER SESSION DISABLE TRIGGERSON <rel-name> ALTER SESSION DISABLE ALL TRIGGERS > A global variable (like those for other session settings) > would cause the trigger manager to suppress their invocation > at all. Maybe it would be best stored on a per-trigger basis. > The new utility has to check if either the user has ALTER ALL > TABLES privilege, or has ALTER TABLE privilege for any > existing user table where triggers are defined for. Or maybe 'ALTER' priv on the individual 'TRIGGER' objects...since I would not necessarily consider a trigger subject to the ALTER TABLE rules. ---------------------------------------------------------------- Philip Warner | __---_____ Albatross Consulting Pty. Ltd. |----/ - \ (A.C.N. 008 659 498) | /(@) ______---_ Tel: (+61) 0500 83 82 81 | _________ \ Fax: (+61) 0500 83 82 82 | ___________ | Http://www.rhyme.com.au | / \| | --________-- PGP key available upon request, | / and from pgp5.ai.mit.edu:11371 |/
Philip Warner <pjw@rhyme.com.au> writes: > The obvious solution is to reconnect as the datdba before running the code. > But that option may not be possible because passwords may be enabled pg_dump scripts that try to restore ownership have always been essentially unusable on systems with password authentication, because it's just not practical to keep entering the passwords for every \connect (even assuming you know them all). This trigger hack is just the tip of the iceberg. We have talked about ways to solve the real problem. One way is to run the entire restore script as superuser, doing something other than \connect to set ownership of created objects. You could do that now with something likecreate table newtable ...update pg_class set relowner = xxx where relname = 'newtable'; although it'd be nicer to invent ALTER commands to handle this. Another issue to think about is that it should be possible to run restore scripts as a non-superuser, with the restriction that all the created objects end up being owned by you not by their original owners. (Compare the behavior of "tar x" when run as superuser or not.) This has not worked in the past (because those \connect commands can't be ignored), but it would work with an ALTER-based approach, because the ALTERs would simply fail. With a slightly smarter pg_restore, there'd be an option not to emit the ALTERs in the first place, but this is inessential. The thing that really bothers me about this reltriggers hack is that it doesn't work if the script is being run as non-superuser. I don't see why it's necessary anyway; shouldn't the order of operations becreate table;load data;create triggers and indexes; ? > So, how does this sound: add another arg to pg_restore, --superuser=name, > which allows the person restoring the database to specify the superuser > account to use, and if none is specified, then use the account it finds in > the dba field of the database it is restoring to. I think pg_restore ought to run under the same userid that would be established for a plain psql session --- ie, -u or $PGUSER or $USER. > I would also modify > pg_dump to dump the original datdba, in case a mythical future release does > the 'create database' part as well. Uh, how does pg_dumpall enter into this? regards, tom lane
At 11:17 25/07/00 -0400, Tom Lane wrote: >Philip Warner <pjw@rhyme.com.au> writes: >> The obvious solution is to reconnect as the datdba before running the code. >> But that option may not be possible because passwords may be enabled > >We have talked about ways to solve the real problem. One way is to >run the entire restore script as superuser, doing something other than >\connect to set ownership of created objects. You could do that now >with something like > create table newtable ... > update pg_class set relowner = xxx where relname = 'newtable'; I considered this, but lack of confidence as to the ramifications of updating pg_class + the fact that it also requires superuser access, put me off. But maybe I should put an 'Ownersip' phase in pg_restore (at the end, obviosly). >although it'd be nicer to invent ALTER commands to handle this. I agree, but not for 7.0.2. Doing something as apparently simple might be a good way for me to see how the process of dealing with a command works from psql->backend->psql, which is something I'll need for 'insert...returning'. >Another issue to think about is that it should be possible to run >restore scripts as a non-superuser, with the restriction that all the >created objects end up being owned by you not by their original owners. This was part of the motivation; if a DBA creates a db and can set it's ownership to the appropriate user, then the user should be able to restore into it. >(Compare the behavior of "tar x" when run as superuser or not.) >This has not worked in the past (because those \connect commands can't >be ignored), I don't follow...do you mean that psql barfs, or do you mean that there has been no way to prevent them being output? >but it would work with an ALTER-based approach, because >the ALTERs would simply fail. With a slightly smarter pg_restore, >there'd be an option not to emit the ALTERs in the first place, but >this is inessential. --ignore-ownership or similar on pg_restore is about 10 lines of code. >The thing that really bothers me about this reltriggers hack is that >it doesn't work if the script is being run as non-superuser. I don't >see why it's necessary anyway; shouldn't the order of operations be > create table; > load data; > create triggers and indexes; Yes, for a full restore that is true. But for a partial restore (where I have assumed triggers may exist), it's nice to disable the triggers...but (and this is a horrible idea), pg_restore could use pg_dump to dump the triggers to a null archive, drop them, then restore them at the end of the restore op... >> So, how does this sound: add another arg to pg_restore, --superuser=name, >> which allows the person restoring the database to specify the superuser >> account to use, and if none is specified, then use the account it finds in >> the dba field of the database it is restoring to. > >I think pg_restore ought to run under the same userid that would be >established for a plain psql session --- ie, -u or $PGUSER or $USER. It does; but with a direct db connection it reconnects appropriately as each user (with password prompts if necessary). The idea was that for the reltriggers hack it knows it needs to connect as a superuser, so I could allow the user to specify a valid one with a known password. >> I would also modify >> pg_dump to dump the original datdba, in case a mythical future release does >> the 'create database' part as well. > >Uh, how does pg_dumpall enter into this? It doesn't really; pg_dumpall inserts text into the output stream and creates one huge script for restoring all databases. It also messes with pg_* tables, so I can't really make the two work together. The current version of pg_restore that I am working on has a --create (-C) option which will issue the appropriate 'create database' commands. But pg_dumpall is unaffected. Sounds like I need to add --ignore-owner. At *some* point in the future, I'll try to put together a pg_backupall & pg_restoreall, but that's not really high on my priorities - I tend to backup DBs individually. ---------------------------------------------------------------- Philip Warner | __---_____ Albatross Consulting Pty. Ltd. |----/ - \ (A.C.N. 008 659 498) | /(@) ______---_ Tel: (+61) 0500 83 82 81 | _________ \ Fax: (+61) 0500 83 82 82 | ___________ | Http://www.rhyme.com.au | / \| | --________-- PGP key available upon request, | / and from pgp5.ai.mit.edu:11371 |/
Philip Warner <pjw@rhyme.com.au> writes: >> (Compare the behavior of "tar x" when run as superuser or not.) >> This has not worked in the past (because those \connect commands can't >> be ignored), > I don't follow...do you mean that psql barfs, or do you mean that there has > been no way to prevent them being output? No, I mean that if they're in the script there is no way to have them be ignored. There is a way to prevent them from being output (-z switch to pg_dump, or some such) but *that puts control at the wrong end of the process*. When you make a tarfile you don't have to specify whether it will be restored with the same file ownerships or not; you determine that when you do the restore. pg_dump scripts need the same flexibility. An additional reason for getting rid of the \connect's is performance; each one causes a fresh backend startup. You might also care to contemplate the implications for per-session variables like the "disable triggers" switch you were suggesting. >> The thing that really bothers me about this reltriggers hack is that >> it doesn't work if the script is being run as non-superuser. I don't >> see why it's necessary anyway; shouldn't the order of operations be >> create table; >> load data; >> create triggers and indexes; > Yes, for a full restore that is true. But for a partial restore (where I > have assumed triggers may exist), it's nice to disable the triggers... Why is that nice? If you are loading more data into an existing table structure, seems to me the *last* thing you'd want is to disable your consistency checks. Too risky --- certainly not something I want pg_dump doing automatically without my knowledge or consent. I've forgotten the exact details of the scenario that forced us to put in the reltriggers hack in the first place, but it may well be that the problem has a better solution now that pg_dump is smart enough to reorder its output. I can see from the CVS logs that Jan put the trigger change into pg_dump on 7-Feb-2000, but I didn't have much luck finding any related discussion in the mail archives. Jan, do you recall the reasoning for it? regards, tom lane
> >> The thing that really bothers me about this reltriggers hack is that > >> it doesn't work if the script is being run as non-superuser. I don't > >> see why it's necessary anyway; shouldn't the order of operations be > >> create table; > >> load data; > >> create triggers and indexes; > > > Yes, for a full restore that is true. But for a partial restore (where I > > have assumed triggers may exist), it's nice to disable the triggers... > > Why is that nice? If you are loading more data into an existing table > structure, seems to me the *last* thing you'd want is to disable your > consistency checks. Too risky --- certainly not something I want > pg_dump doing automatically without my knowledge or consent. > > I've forgotten the exact details of the scenario that forced us to put > in the reltriggers hack in the first place, but it may well be that > the problem has a better solution now that pg_dump is smart enough to > reorder its output. I can see from the CVS logs that Jan put the > trigger change into pg_dump on 7-Feb-2000, but I didn't have much luck > finding any related discussion in the mail archives. Jan, do you recall > the reasoning for it? For constraints it's a circular dependency issue. Unless all of the restore is being done in a single transaction, you can't put data in a table set that is something like table1 references table2 references table1. With deferred triggers you can get data sets that cannot be created outside of a transaction. If we had drop constraint, we could drop and re-add the constraints which would at least give an error message if the data failed the constraint, but is still not a real solution. In addition, if you're loading a data only dump into a created schema, you're asking for trouble if you have triggers anyway, since how does the system know whether or not a particular trigger's effects are already included in the data only dump or not (say, a logging trigger). Some triggers you want to run theoretically because they aren't circular dependencies and do validation of data. Other triggers have effects like creating rows that are already in the data dump and you wouldn't want to run again.
Tom Lane wrote: > >> The thing that really bothers me about this reltriggers hack is that > >> it doesn't work if the script is being run as non-superuser. I don't > >> see why it's necessary anyway; shouldn't the order of operations be > >> create table; > >> load data; > >> create triggers and indexes; > > > Yes, for a full restore that is true. But for a partial restore (where I > > have assumed triggers may exist), it's nice to disable the triggers... > > Why is that nice? If you are loading more data into an existing table > structure, seems to me the *last* thing you'd want is to disable your > consistency checks. Too risky --- certainly not something I want > pg_dump doing automatically without my knowledge or consent. > > I've forgotten the exact details of the scenario that forced us to put > in the reltriggers hack in the first place, but it may well be that > the problem has a better solution now that pg_dump is smart enough to > reorder its output. I can see from the CVS logs that Jan put the > trigger change into pg_dump on 7-Feb-2000, but I didn't have much luck > finding any related discussion in the mail archives. Jan, do you recall > the reasoning for it? It was for data-only dumps. Someone might have a schema, that doesn't dump well. So he keeps his schema creation in separate, hand maintained SQL scripts and takes data-only dumps. On a restore, his schema scripts would already create all the constraints, triggers and the like. Now the restore loadsthe data, table by table. But circular dependencies will fail all the way. And triggers might modify othertables (COPY invokes triggers), but these derived actions are already part of the dumped data. So it'll not doany good. At least it'll not restore the database to the same content it had at dump time. Jan -- #======================================================================# # It's easier to get forgiveness for being wrong than for being right. # # Let's break this rule - forgive me. # #================================================== JanWieck@Yahoo.com #
Tom Lane writes: > We have talked about ways to solve the real problem. One way is to > run the entire restore script as superuser, doing something other than > \connect to set ownership of created objects. You could do that now > with something like > create table newtable ... > update pg_class set relowner = xxx where relname = 'newtable'; > although it'd be nicer to invent ALTER commands to handle this. And this is what good ol' SQL has in store for us: CREATE SCHEMA AUTHORIZATION "username" CREATE TABLE t1 (...) CREATE TABLE t2 (...) ; -- Peter Eisentraut Sernanders väg 10:115 peter_e@gmx.net 75262 Uppsala http://yi.org/peter-e/ Sweden
At 12:23 25/07/00 -0400, Tom Lane wrote: >Philip Warner <pjw@rhyme.com.au> writes: >>> (Compare the behavior of "tar x" when run as superuser or not.) >>> This has not worked in the past (because those \connect commands can't >>> be ignored), > >> I don't follow...do you mean that psql barfs, or do you mean that there has >> been no way to prevent them being output? > >No, I mean that if they're in the script there is no way to have them >be ignored. There is a way to prevent them from being output (-z switch >to pg_dump, or some such) but *that puts control at the wrong end of the >process*. When you make a tarfile you don't have to specify whether it >will be restored with the same file ownerships or not; you determine >that when you do the restore. pg_dump scripts need the same flexibility. OK, so adding '--no-owner, -O(?)' on pg_restore is probably worth doing. >An additional reason for getting rid of the \connect's is performance; >each one causes a fresh backend startup. You might also care to >contemplate the implications for per-session variables like the "disable >triggers" switch you were suggesting. Yes good point; per-session probably is better than updating reltriggers, but I'd argue that a restore operation (even a partial one) should not be done while the DB is online anyway. >>> The thing that really bothers me about this reltriggers hack is that >>> it doesn't work if the script is being run as non-superuser. I don't >>> see why it's necessary anyway; shouldn't the order of operations be >>> create table; >>> load data; >>> create triggers and indexes; > >> Yes, for a full restore that is true. But for a partial restore (where I >> have assumed triggers may exist), it's nice to disable the triggers... > >Why is that nice? If you are loading more data into an existing table >structure, seems to me the *last* thing you'd want is to disable your >consistency checks. Too risky --- certainly not something I want >pg_dump doing automatically without my knowledge or consent. But, but, but...that's the default behaviour in 7.0.2. I can probably safely change the default behaviour in pg_restore, but are you also suggesting I do it for pg_dump as well. >I've forgotten the exact details of the scenario that forced us to put >in the reltriggers hack in the first place, but it may well be that >the problem has a better solution now that pg_dump is smart enough to >reorder its output. I can see from the CVS logs that Jan put the >trigger change into pg_dump on 7-Feb-2000, but I didn't have much luck >finding any related discussion in the mail archives. Jan, do you recall >the reasoning for it? From Jan's subsequent email, this is because of RI triggers. How does this sound as a plan: 1. Add --no-owner to pg_restore & pg_dump (both versions) This will use the current uid to build the entire database. The will affect pg_restore as well as plain text output from pg_dump. 2. Add --superuser[=<name>], -S to pg_restore & pg_dump (both versions) If restoring directly to a db, then pg_restore will check if the current uid has superuser, and if so, will run as though --superuser=<current-uid> were specified (unless another name was already specified). Without --superuser it will NOT DISABLE TRIGGERS, and will issue a warning to that effect (since it can't anyway). With --superuser specified (or defaulted), it will issue '\connect - <superuser>' before disabling triggers, then reconnect as the previous user afterwards. If and when 'SET SESSION CONSTRAINTS DISABLED' & 'SET SESSION TRIGGERS DISABLED' comes along, the code for disabling triggers can be modified appropriately. I'm not too worried about the cost of reconnecting; it is not is any where near as great as the cost of actually restoring the data for any big DB, I think. Does this sound like a reasonable approach? ---------------------------------------------------------------- Philip Warner | __---_____ Albatross Consulting Pty. Ltd. |----/ - \ (A.C.N. 008 659 498) | /(@) ______---_ Tel: (+61) 0500 83 82 81 | _________ \ Fax: (+61) 0500 83 82 82 | ___________ | Http://www.rhyme.com.au | / \| | --________-- PGP key available upon request, | / and from pgp5.ai.mit.edu:11371 |/