Thread: pg_dump and DEFAULT column values
I have a function and a table: create table foo ( f1 integer default foo_func(); ); create function foo_func returns integer as 'select max(f1)+1 from foo' language 'sql'; when I use pg_dump, and then psql (\i <dumpfile>) to dump/reload everything, table foo can't be created because it uses foo_func which can't be created because it uses a table (foo) which doesn't exist. So I always have to hand-edit the dump file: create table foo ( f1 integer; ); create function foo_func returns integer as 'select max(f1) from foo' language 'sql'; alter table foo alter column f1 set default foo_func(); Is there a way for pg_dump to do this automagically? or should I define the table and function differently? Should the default value be replaced with an on insert trigger? eric
"Eric Ridge" <ebr@tcdi.com> writes: > I have a function and a table: > create table foo ( > f1 integer default foo_func(); > ); > create function foo_func returns integer > as 'select max(f1)+1 from foo' > language 'sql'; > when I use pg_dump, and then psql (\i <dumpfile>) to dump/reload > everything, table foo can't be created because it uses foo_func which > can't be created because it uses a table (foo) which doesn't exist. pg_dump has a hard time with circular dependencies :-( Have you considered using a sequence, rather than generating new values as shown above? The approach you are using is extremely fragile: consider what happens if two backends try to insert at the same time. regards, tom lane
> pg_dump has a hard time with circular dependencies :-( but on the plus side, pg_dump is really fast! much faster than dump utilities of other databases I've seen. > Have you considered using a sequence, rather than generating > new values > as shown above? The approach you are using is extremely fragile: > consider what happens if two backends try to insert at the same time. a sequence it is! thanks. eric
> > Have you considered using a sequence, rather than generating > > new values > > as shown above? The approach you are using is extremely fragile: > > consider what happens if two backends try to insert at the > same time. > > a sequence it is! thanks. well, but then again, I want the default value of that field to be 1 greater than the max value, not the next value in a sequence. The client application has the ability to change the value of that field, but new records need to be max+1. So I guess to make pg_dump happy, and to solve potential concurrency issues, I need a trigger for that field? Or will I have the same circular reference problem with a trigger? eric
"Eric Ridge" <ebr@tcdi.com> writes: >> a sequence it is! thanks. > well, but then again, I want the default value of that field to be 1 > greater than the max value, not the next value in a sequence. > The client application has the ability to change the value of that > field, but new records need to be max+1. Hmm. I think you should think hard about why you believe that the default has to work that way and you can't just use a sequence. You're paying a high price to conform to what seems a very questionable set of assumptions. > So I guess to make pg_dump happy, and to solve potential concurrency > issues, I need a trigger for that field? Or will I have the same > circular reference problem with a trigger? The circular reference problem arises because SQL-language functions are parsed and checked at CREATE FUNCTION time. A cheezy way to get around it is to define the function before the table, but write it in a PL language --- presently, at least, PL function definitions are not checked until first use. So you could do create function get_the_max() ... language 'plpgsql'; create table ... default get_the_max() and it should work and also dump correctly. regards, tom lane
> Hmm. I think you should think hard about why you believe that the > default has to work that way and you can't just use a sequence. > You're paying a high price to conform to what seems a very > questionable set of assumptions. I use the field for sorting, and since the client application has the ability to change the value of this field, it is possible they could set it to a number larger than the next value of the sequence. then an insert of a new record would appear before the changed record, instead of after it. > The circular reference problem arises because SQL-language functions > are parsed and checked at CREATE FUNCTION time. A cheezy way to get > around it is to define the function before the table, but write it in > a PL language --- presently, at least, PL function definitions are not > checked until first use. So you could do I ended up creating a "before on insert" trigger in plpgsql. pg_dump seems to like this, and it gives me the ability to easily change how I find the default value for this field. And thanks for your time. It is very much appreciated. eric
"Eric Ridge" <ebr@tcdi.com> writes: >> Hmm. I think you should think hard about why you believe that the >> default has to work that way and you can't just use a sequence. >> You're paying a high price to conform to what seems a very >> questionable set of assumptions. > I use the field for sorting, and since the client application has the > ability to change the value of this field, it is possible they could set > it to a number larger than the next value of the sequence. then an > insert of a new record would appear before the changed record, instead > of after it. But if you allow applications to change the field, then all bets about sort order are off anyway, no? It's far from clear exactly what semantics you think you are guaranteeing here. In any case it seems like changing the initially-assigned field value is an infrequent operation, and that you ought to make *that* be the expensive and not-safe-for-parallelism case, not row insertion. (Perhaps this is a wrong assumption on my part, in which case ignore what follows.) I'd still be inclined to use a sequence for insertion, and to enforce consistency on update with an AFTER UPDATE trigger that does something like if old.fld != new.fld then begin lock mytable; select setval('seq', max(fld)) from table; end; (Untested, but I hope the idea is clear.) regards, tom lane
> But if you allow applications to change the field, then all bets about > sort order are off anyway, no? It's far from clear exactly what > semantics you think you are guaranteeing here. The only thing I'm trying to guarantee is that new records appear last (or first, depending on sort direction). Once the user "physically" re-sorts the records (by changing the value of this field) then I don't care what the old values are. Only that new records are greater than the last existing value. > In any case it seems like changing the initially-assigned field value > is an infrequent operation, and that you ought to make *that* be the > expensive and not-safe-for-parallelism case, not row insertion. About as frequent as insertions. As records are added, they're manually sorted. So every INSERT (eventually) causes a manual UPDATE for all records. In most cases I know before I do the INSERT, how the record needs to be sorted, so I can set the this value accordingly. But other cases involve a bulk insert process, and a manual resorting process. > (Perhaps this is a wrong assumption on my part, in which case ignore > what follows.) I'd still be inclined to use a sequence for insertion, > and to enforce consistency on update with an AFTER UPDATE trigger that > does something like > if old.fld != new.fld then > begin > lock mytable; > select setval('seq', max(fld)) from table; > end; > (Untested, but I hope the idea is clear.) Very clear. However, this table is going to receive many UPDATEs that aren't related to this sorting field. Would the overall overhead of executing the trigger on every UPDATE outweigh the overhead of doing it on BEFORE INSERT: begin if NEW.fld is null then -- user didn't specify a value lock mytable; NEW.fld := select max(fld)+1 from table; end; I guess it depends on the frequency of updates v/s the frequency of inserts. I'm not sure, at this point, which will be greater. eric
You might want to move that logic to your client application or the insert statement, rather than having it done in the database, if people aren't going to be going into the database without going through your application. You could have something like INSERT INTO Table (Field, Field2, Field3) VALUES ((SELECT MAX(Field) FROM Table)+1, 'Value2', 17); Thanks, Peter Darley -----Original Message----- From: pgsql-general-owner@postgresql.org [mailto:pgsql-general-owner@postgresql.org]On Behalf Of Eric Ridge Sent: Wednesday, November 07, 2001 9:46 PM To: Tom Lane Cc: pgsql-general@postgresql.org Subject: Re: [GENERAL] pg_dump and DEFAULT column values > But if you allow applications to change the field, then all bets about > sort order are off anyway, no? It's far from clear exactly what > semantics you think you are guaranteeing here. The only thing I'm trying to guarantee is that new records appear last (or first, depending on sort direction). Once the user "physically" re-sorts the records (by changing the value of this field) then I don't care what the old values are. Only that new records are greater than the last existing value. > In any case it seems like changing the initially-assigned field value > is an infrequent operation, and that you ought to make *that* be the > expensive and not-safe-for-parallelism case, not row insertion. About as frequent as insertions. As records are added, they're manually sorted. So every INSERT (eventually) causes a manual UPDATE for all records. In most cases I know before I do the INSERT, how the record needs to be sorted, so I can set the this value accordingly. But other cases involve a bulk insert process, and a manual resorting process. > (Perhaps this is a wrong assumption on my part, in which case ignore > what follows.) I'd still be inclined to use a sequence for insertion, > and to enforce consistency on update with an AFTER UPDATE trigger that > does something like > if old.fld != new.fld then > begin > lock mytable; > select setval('seq', max(fld)) from table; > end; > (Untested, but I hope the idea is clear.) Very clear. However, this table is going to receive many UPDATEs that aren't related to this sorting field. Would the overall overhead of executing the trigger on every UPDATE outweigh the overhead of doing it on BEFORE INSERT: begin if NEW.fld is null then -- user didn't specify a value lock mytable; NEW.fld := select max(fld)+1 from table; end; I guess it depends on the frequency of updates v/s the frequency of inserts. I'm not sure, at this point, which will be greater. eric ---------------------------(end of broadcast)--------------------------- TIP 5: Have you checked our extensive FAQ? http://www.postgresql.org/users-lounge/docs/faq.html