pg_dump versus SERIAL, round N - Mailing list pgsql-hackers
From | Tom Lane |
---|---|
Subject | pg_dump versus SERIAL, round N |
Date | |
Msg-id | 27967.1156002459@sss.pgh.pa.us Whole thread Raw |
Responses |
Re: pg_dump versus SERIAL, round N
|
List | pgsql-hackers |
We have still another complaint here: http://archives.postgresql.org/pgsql-bugs/2006-08/msg00109.php about pg_dump failing to cope nicely with any slightly-unusual condition related to a SERIAL column. We've had previous discussions about this, most recently this thread: http://archives.postgresql.org/pgsql-hackers/2006-04/msg01118.php but still no consensus on what to do. I'm pretty well convinced at this point that having pg_dump dump serial columns via SERIAL declarations is a failure. Doing it that way means that the underlying sequence may change names during reload, which pg_dump is not well prepared to cope with --- we have a hack in place for setval() but not for anything else, notably GRANT, nondefault ALTER SEQUENCE parameters, and references to the sequence from other tables. Even if there were a reasonably simple way to fix all those things, since when is it part of pg_dump's charter to editorialize on your database schema? The goal ought to be to reproduce the state of the database, not to "clean it up". I think that we ought to make pg_dump dump these things as if they were separate sequence objects, ie, in the style CREATE SEQUENCE foo_bar_seq ... any nondefault parameters here ... ... CREATE TABLE foo ( bar int default nextval('foo_bar_seq'), ... This fixes most of the problems at a stroke. The one thing it breaks is that after loading such a dump, there is no longer any dependency from the sequence to the column, and thus dropping the column wouldn't cause the sequence to go away, as it would've in the original database. The new thought I had this morning was to fix that head-on: provide a way for pg_dump to re-establish that dependency. If you were willing to load the dump as superuser it could just INSERT a row into pg_depend, but that's certainly not an acceptable assumption (and it wouldn't be future-proof anyway; we learned the folly of that with pg_dumpall's hacking on the pg_group table...) However, I see nothing wrong with providing a slightly more abstract way of declaring that a sequence "belongs to" a column. If we're willing to hack up the grammar a bit we could make ALTER SEQUENCE do it, perhaps ALTER SEQUENCE foo_bar_seq SERIAL FOR foo.bar; which I would propose being allowed to anyone who owns both the sequence and the table in question. Or the poor man's way to do it would involve creating a pg_set_serial_sequence() function that does the same thing. Further down the road we could consider allowing this command to drop a serial-sequence association or reattach a sequence to a different owning column, though pg_dump needs neither of these. This seems no uglier to me than the existing pg_dump hack that uses pg_get_serial_sequence(), and AFAICS it allows exact reproduction of the state of the database, even in the face of renames, ALTER COLUMN DEFAULT, etc. In terms of the discussion I mentioned above, this amounts to embracing the "SERIAL is a macro" philosophy rather than "SERIAL is a black box", and recognizing that there's one little piece of the implementation that still needs to be exposed so that we can describe exactly what the macro consists of. Comments? regards, tom lane
pgsql-hackers by date: