Thanks, Justin, as always for the gentle prodding to expound on an email
:-).
Here's a much-expanded version of the response about how to change field
types, as well as info about best practices for dumping/restoring.
As of version 7.1, PostgreSQL still lacks some commands to alter tables
that other databases have.
Currently, you can:
* add columns
* add or drop a default value for a column
* rename a column
* rename a table
* add a constraint / referential integrity
* assign the table to a new owner
Some features are not yet implemented:
* dropping a column
* changing a column from NULL to NOT NULL or vice versa
* adding a primary key
* changing a column type
However, these features can be implemented by using temporary tables and
moving the data back and forth.
In these examples, we'll work with this table:
CREATE TABLE Pers (
id INT NOT NULL,
fname TEXT NOT NULL,
lname TEXT NOT NULL,
gender CHAR NOT NULL,
email VARCHAR(30),
phone TEXT NOT NULL
);
The example requires that you create a new temporary table that is
slightly different from the Pers table. In your database, you might not
still have the SQL CREATE TABLE statement handy, but you can always get it
with pg_dump:
$ pg_dump -sn -t <tablename> <dbname>
1) Dropping a Column
To drop a column, create a new table without the offending column.
To drop our phone column:
(See the important comments belong about indexes, triggers, rules, and
sequences.)
CREATE TABLE Pers_temp (
id INT NOT NULL,
fname TEXT NOT NULL,
lname TEXT NOT NULL,
gender CHAR NOT NULL,
email VARCHAR(30)
);
INSERT INTO Pers_temp
SELECT * FROM Pers;
DROP TABLE Pers;
ALTER TABLE Pers_temp RENAME TO Pers;
2) Changing a column from NULL to NOT NULL (or vice versa)
This is much easier -- there's a field in the system catalog that we can
switch.
To change our email field to required:
UPDATE pg_attribute SET attnotnull = TRUE
WHERE attname = 'email'
AND attrelid = ( SELECT oid FROM pg_class WHERE relname = 'pers') ;
(set to FALSE to clear the NOT NULL setting.)
This change does *not* require any fixing or indexes, triggers, etc. The
only problem is when you change a column to NOT NULL, and there are null
values in it. In this case, PostgreSQL will *allow* the initial change,
but refuses any UPDATEs to records with NULL values in the field. If you
do set it to NOT NULL when there are NULL values present, the only sane
choice would be to immediate populate those NULL values.
3) Adding a Primary Key
A primary key in PostgreSQL is a unique index on a column that is declared
NOT NULL. Unless you name the primary key differently, PostgreSQL assumes
the name '<tablename>_pkey' for the key name.
Creating a primary key is easy:
CREATE UNIQUE INDEX pers_pkey ON pers ( id );
4) Changing a column type
This is the same idea as #1, above: create a temporary table and use this
as the new table. (Again, if you use triggers, sequences, rules, see the
important note below).
To make our email field larger:
CREATE TABLE Pers_temp (
id INT NOT NULL PRIMARY KEY,
fname TEXT NOT NULL,
lname TEXT NOT NULL,
gender CHAR NOT NULL,
email VARCHAR(50)
);
INSERT INTO Pers_temp
SELECT * FROM Pers;
DROP TABLE Pers;
ALTER TABLE pers_temp RENAME TO pers;
** IMPORTANT NOTES **
Solutions #2 and #3 should work anywhere without problems.
Solutions #1 and #4 can be a pain if you use the following features of
PostgreSQL:
* sequences
* triggers
* rules
* views
Sequences can be a pain because, in creating our new temporary table, if
we use a SERIAL field type, PostgreSQL creates a sequence that is named
for the table -- with the new, temporary table name. So, if our Pers table
used a SERIAL for the id, our new temporary table (created with field
declaration "id SERIAL NOT NULL" really looks like this:
CREATE TABLE pers_temp (
id INT NOT NULL DEFAULT nextval('pers_temp_id_seq')
...
When we rename this table, the sequence won't be renamed with
it. Everything will still work fine, but it's strange to have Pers getting
its serial id value from pers_temp_id_seq.
Therefore:
DROP SEQUENCE pers_temp_id_seq;
ALTER TABLE pers ALTER COLUMN id SET DEFAULT nextval('pers_id_seq');
Now, you're using your original sequence with the new pers table.
Triggers, rules, and views are more of a problem. Basically, these objects
in PostgreSQL know tables by their OID, not by their names (for a good
description of OIDs, see Momjian's book).
To fix these, you'd have to DROP and CREATE every trigger that was placed
on the pers table, every rule that was placed on it, every view that drew
from it. And, worse, you have to then DROP and CREATE every view that
relied on the view you just dropped, and so on.
Unless your database is small and simple, the best solution for this
is: DON'T DO IT. Instead, make your changes with pg_dump and pg_restore.
(Functions aren't a problem, because functions are re-created with every
new backend. If you have functions that refer to the Pers table, and you
drop it/rename it, you can simply quit psql and restart it to regain use
of your functions that referred to it.)
Doing it with pg_dump / pg_restore:
In complicated cases, the best solution is to make a dump file, edit that,
and restore that:
$ pg_dump -S postgres <dbname> > database.sql
then your your favorite text editor to edit this. Some text editors
(such as Emacs/XEmacs) have an 'SQL mode' which can color-highlight or
help you edit an SQL file.
Then, quit your database and
\c template1
DROP DATABASE <dbname>;
CREATE DATABASE <dbname> WITH TEMPLATE=template0;
\c <dbname>
\i database.sql
will drop your database, recreate it, and read in your (edited) dump file.
Noite the use of TEMPLATE=template0. Template0 is a database that is empty
of all user functions/tables/etc. Template1 (the default template) starts
of empty of user functions, but often, people add
languages/functions/tables, etc to template1, making it a bad choice for
using in a dump/restore cycle.
The restore process shows many NOTICE messages, and if your syntax is
wrong, ERROR messages. To ensure that I don't miss any important messages,
I actually do my dump like this:
1) in my directory, I keep a tiny text file called recreate.sql. In it
is the following:
DROP DATABASE <dbname>;
CREATE DATABASE <dbname> WITH TEMPLATE=template0;
\c <dbname>
2) I add "\i recreate.sql" to the top of the dump file while editing it.
3) So that I can easily find the error messages, I do my recreating
like this:
$ echo "\i database.sql" | psql template1 1>/dev/null
By piping the \i command to psql, rather than redirecting standard
input ("psql template1 < database.sql", I can see line numbers in
the dump file for any error messages (very helpful!).
By redirecting standard output to /dev/null, I now only see
NOTICE and ERROR messages, so I can be sure everything works fine.
4) After re-creating, I re-dump the file to a new name:
$ pg_dump -S postgres <dbname> > database1.sql
5) I compare the lengths the two files to make sure that they are
almost exactly the same. This is just to double-check to make
sure that pg_dump didn't screw anything up. If you're really
paranoid, you can diff the two files to see if they're the
same.
HTH,
--
Joel Burton <jburton@scw.org>
Director of Information Systems, Support Center of Washington