Thread: pg_dump future problem.
I see an obvious problem in the way pg_dump dumps serials. At the moment, we have length 64 identifiers. If someone has a 64 character table name and adds a serial to it, it will get a truncated sequence name, with _seq o the end. pg_dump will dump it like this: CREATE TABLE really_long_name (a SERIAL UNIQUE ); SELECT SETVAL('really_long_na_seq', 120); However, if we up 7.4, say, to use 128 character identifiers, then this restore will fail, as the sequence name will NOT be truncated and the setval() call will fail. We really need: ALTER SEQUENCE ON really_long_name(a) .... Chris
At 12:14 PM 4/05/2003 +0800, Christopher Kings-Lynne wrote: >CREATE TABLE really_long_name ( > a SERIAL UNIQUE >); At least in production versions, pg_dump does not do this. It will dump it as a integer default nextval('really_long_na_seq') ) not sure about CVS. ---------------------------------------------------------------- Philip Warner | __---_____ Albatross Consulting Pty. Ltd. |----/ - \ (A.B.N. 75 008 659 498) | /(@) ______---_ Tel: (+61) 0500 83 82 81 | _________ \ Fax: (+61) 03 5330 3172 | ___________ | Http://www.rhyme.com.au | / \| | --________-- PGP key available upon request, | / and from pgp5.ai.mit.edu:11371 |/
> At 12:14 PM 4/05/2003 +0800, Christopher Kings-Lynne wrote: > >CREATE TABLE really_long_name ( > > a SERIAL UNIQUE > >); > > At least in production versions, pg_dump does not do this. It will dump it as > a integer default nextval('really_long_na_seq') > ) No, 7.3 release dumps tables like this. Assuming that you've use contrib/adddepend to add the dependencies from your old tables into the dependencies table. (Which you should do...) Even if you don't, then any tables you create under 7.3 will certainly be dumped like that... Chris
At 01:49 PM 4/05/2003 +0800, Christopher Kings-Lynne wrote: >No, 7.3 release dumps tables like this. Good point. I had not even noticed, but I had noticed something odd in the last few dumps. My preference for a solution to this problem would be to remove even more implementation knowledge from pg_dump, and add a command like: alter table really_long_name set next a = <next sequence value>; Assuming this kind of approach is acceptable, I'm even volunteering to do it since it seems small and hits a few areas I have not had a chance to break yet. Unless of course someone else is already in the relevant areas. It has the advantage of bypassing the problem and, if we ever redesign sequences, there is no impact on pg_dump. ---------------------------------------------------------------- Philip Warner | __---_____ Albatross Consulting Pty. Ltd. |----/ - \ (A.B.N. 75 008 659 498) | /(@) ______---_ Tel: (+61) 0500 83 82 81 | _________ \ Fax: (+61) 03 5330 3172 | ___________ | Http://www.rhyme.com.au | / \| | --________-- PGP key available upon request, | / and from pgp5.ai.mit.edu:11371 |/
Christopher Kings-Lynne <chriskl@familyhealth.com.au> writes: > I see an obvious problem in the way pg_dump dumps serials. At the moment, > we have length 64 identifiers. If someone has a 64 character table name > and adds a serial to it, it will get a truncated sequence name, with _seq > o the end. > However, if we up 7.4, say, to use 128 character identifiers, then this > restore will fail, as the sequence name will NOT be truncated and the > setval() call will fail. <yawn> ... AFAIR, we have heard no reports of such problems with the 32-to-64 name conversion. The odds of problems with a (hypothetical) future increase to 128 would be far less. I think we have more immediate problems to worry about. regards, tom lane
Philip Warner <pjw@rhyme.com.au> writes: > My preference for a solution to this problem would be to remove even more > implementation knowledge from pg_dump, and add a command like: > alter table really_long_name set next a = <next sequence value>; And the is_called flag fits into this where? regards, tom lane
At 10:10 AM 4/05/2003 -0400, Tom Lane wrote: >And the is_called flag fits into this where? My recollection is that is_called is used to cover the boundary case where the 'current' value should not be incremented before being returned, but my memory is hazy. If this is the case, can we not just set it to true and set the value to (next - 1) unless (next - 1) < minv, in which case we set it to minv and set is_called to false? Note that for the purpose of serial values we do not need to set the sequence exactly as it was internally, we just need to make sure that the next allocated value will be what we expect. ---------------------------------------------------------------- Philip Warner | __---_____ Albatross Consulting Pty. Ltd. |----/ - \ (A.B.N. 75 008 659 498) | /(@) ______---_ Tel: (+61) 0500 83 82 81 | _________ \ Fax: (+61) 03 5330 3172 | ___________ | Http://www.rhyme.com.au | / \| | --________-- PGP key available upon request, | / and from pgp5.ai.mit.edu:11371 |/
Philip Warner <pjw@rhyme.com.au> writes: > At 10:10 AM 4/05/2003 -0400, Tom Lane wrote: >> And the is_called flag fits into this where? > My recollection is that is_called is used to cover the boundary case where > the 'current' value should not be incremented before being returned, but my > memory is hazy. If this is the case, can we not just set it to true and set > the value to (next - 1) unless (next - 1) < minv, in which case we set it > to minv and set is_called to false? This would fail to cover the case where the user has used setval() to set is_called false and last_value to something other than minv. > Note that for the purpose of serial values we do not need to set the > sequence exactly as it was internally, we just need to make sure that the > next allocated value will be what we expect. I disagree with the notion that pg_dump need not dump legal configurations of the database. ISTM the real issue here is that we don't want pg_dump dumps to hard-wire knowledge of the algorithm for deriving a sequence name from a table and column name. Why not attack that straightforwardly: provide a backend function that computes one from the other? Then the SETVAL calls could become something like SELECT setval(serial_seq_name('table', 'column'), 42, true); and we're not sacrificing anything --- nor going through the very substantial overhead of creating a new ALTER TABLE variant. regards, tom lane
At 09:45 AM 5/05/2003 -0400, Tom Lane wrote: >This would fail to cover the case where the user has used setval() to >set is_called false and last_value to something other than minv. In this case I think they have shot themselves in the foot; the docs clearly state that setval/3 is for internal pg_dump use only. It is also not to be relied upon when there are more than one connection to the db updating the sequence. > > Note that for the purpose of serial values we do not need to set the > > sequence exactly as it was internally, we just need to make sure that the > > next allocated value will be what we expect. > >I disagree with the notion that pg_dump need not dump legal >configurations of the database. What I am suggesting is that we make SERIAL sequences more black-box like so that we have less reliance on specific implementation details in other pieces of code (pg_dump). > SELECT setval(serial_seq_name('table', 'column'), 42, true); > >and we're not sacrificing anything I think we are: setval/3 was written to fix one mess with sequences (inability to set is_called), now we are suggesting another function to patch another (small) hole. ISTM that a more implementation-independant approach might be needed. I am not attached to my solution, but I do think it's a good idea to look at what would have been done with a 'green fields' design, and then ask: can we do it now? Is it worth it? AFAICT, is_called is just a kludge to allow sequences to attain minval at some point; I'm not sure we should be supporting people actually setting it. >--- nor going through the very >substantial overhead of creating a new ALTER TABLE variant. Admittedly it's more complex, but which makes PostgreSQL a better product? ALTER TABLE sometable ALTER COLUMN somecolumn SET NEXT 42; or SELECT setval(serial_seq_name('sometable', 'somecolumn'), 42, true); ---------------------------------------------------------------- Philip Warner | __---_____ Albatross Consulting Pty. Ltd. |----/ - \ (A.B.N. 75 008 659 498) | /(@) ______---_ Tel: (+61) 0500 83 82 81 | _________ \ Fax: (+61) 03 5330 3172 | ___________ | Http://www.rhyme.com.au | / \| | --________-- PGP key available upon request, | / and from pgp5.ai.mit.edu:11371 |/
Philip Warner <pjw@rhyme.com.au> writes: > At 09:45 AM 5/05/2003 -0400, Tom Lane wrote: >> This would fail to cover the case where the user has used setval() to >> set is_called false and last_value to something other than minv. > In this case I think they have shot themselves in the foot; the docs > clearly state that setval/3 is for internal pg_dump use only. There is no such statement visible in http://www.ca.postgresql.org/users-lounge/docs/7.3/postgres/functions-sequence.html nor do I find it anywhere else in the current documents. > It is also > not to be relied upon when there are more than one connection to the db > updating the sequence. Any more or less so than either two-parameter SETVAL or the proposed ALTER TABLE? I don't see how. > I am not attached to my solution, but I do think > it's a good idea to look at what would have been done with a 'green fields' > design, and then ask: can we do it now? Is it worth it? It probably would look different if we were starting from scratch ... but we aren't, and I don't see any problems here that are large enough to justify starting over. regards, tom lane
At 10:58 AM 5/05/2003 -0400, Tom Lane wrote: >Philip Warner <pjw@rhyme.com.au> writes: > > > In this case I think they have shot themselves in the foot; the docs > > clearly state that setval/3 is for internal pg_dump use only. > >There is no such statement visible in >http://www.ca.postgresql.org/users-lounge/docs/7.3/postgres/functions-sequence.html >nor do I find it anywhere else in the current documents. Good point. It's only in the source code. I thought I had updated the docs as well... > > It is also > > not to be relied upon when there are more than one connection to the db > > updating the sequence. > >Any more or less so than either two-parameter SETVAL or the proposed >ALTER TABLE? I don't see how. My recollection is that setting is_called is more fragile than just setting the sequence value, so it not wise to use in general. >It probably would look different if we were starting from scratch >... but we aren't, and I don't see any problems here that are large >enough to justify starting over. I'm not actually suggesting starting over. Just presenting a nicer interface and fixing a bug in the process, rather than building yet another user-visible function as a band-aid solution. ---------------------------------------------------------------- Philip Warner | __---_____ Albatross Consulting Pty. Ltd. |----/ - \ (A.B.N. 75 008 659 498) | /(@) ______---_ Tel: (+61) 0500 83 82 81 | _________ \ Fax: (+61) 03 5330 3172 | ___________ | Http://www.rhyme.com.au | / \| | --________-- PGP key available upon request, | / and from pgp5.ai.mit.edu:11371 |/
Philip Warner <pjw@rhyme.com.au> writes: > Good point. It's only in the source code. I thought I had updated the docs > as well... Actually, I think the sequence of events was that we neglected to remove the statement in the source code when we fixed the documentation. 3-parameter setval is documented because it solves a problem that users have --- pg_dump is not the only application that needs to do this. > My recollection is that setting is_called is more fragile than just setting > the sequence value, so it not wise to use in general. Doesn't look that way to me; we're setting several fields of the sequence record no matter what. Perhaps your recollection predates Vadim's last rewrite of the sequence code? > I'm not actually suggesting starting over. Just presenting a nicer > interface and fixing a bug in the process, rather than building yet another > user-visible function as a band-aid solution. But the proposed "nicer interface" *introduces* a bug, namely the inability to preserve is_called, which is exactly the pg_dump bug that 3-parameter setval was invented to fix. I do not want to go backwards in the name of a "nicer interface". regards, tom lane
At 11:22 AM 5/05/2003 -0400, Tom Lane wrote: >But the proposed "nicer interface" *introduces* a bug, namely the >inability to preserve is_called, which is exactly the pg_dump bug >that 3-parameter setval was invented to fix. No, I don't want to remove the setval functions from general sequences. I just want to use existing restrictions on the SERIAL data type: by definition they start at 0, increase by 1, and are intX. We can ignore the current is_called flag and set the min value or is_called flag in such a way as to get the desired next value. ---------------------------------------------------------------- Philip Warner | __---_____ Albatross Consulting Pty. Ltd. |----/ - \ (A.B.N. 75 008 659 498) | /(@) ______---_ Tel: (+61) 0500 83 82 81 | _________ \ Fax: (+61) 03 5330 3172 | ___________ | Http://www.rhyme.com.au | / \| | --________-- PGP key available upon request, | / and from pgp5.ai.mit.edu:11371 |/