Re: feature request - adding columns with default value - Mailing list pgsql-general

From Tom Lane
Subject Re: feature request - adding columns with default value
Date
Msg-id 5484.1049467180@sss.pgh.pa.us
Whole thread Raw
In response to feature request - adding columns with default value  ("Henrik Steffen" <steffen@city-map.de>)
List pgsql-general
"Henrik Steffen" <steffen@city-map.de> writes:
> when doing an "alter table blub add column blah int2"
> and then issuing an "alter table blub alter column blah set default 0"
> I will always have to manually "update blub set blah=0" to initialize
> all existing records.... this is quite annoying while performing this
> update on a running database with 100 simultaneous clients accessing
> the database and about 443482 rows in the table.

As someone else already pointed out, allowing "alter table add column"
to specify a default would simply mean the system has to go through
these same steps behind your back.  It would *not* be any faster.  What
it would be is a lot worse from a concurrency standpoint, because the
"alter table" transaction has to take an exclusive lock, which would
then have to be held throughout the update of the data rows.  Doing it
in separate steps allows the exclusive lock to be held only while the
critical catalog updates are performed.  The UPDATE part is then just
an ordinary writer that needs no exclusive lock.

> it takes about 10 minutes to perform this update, and for about 6 minutes
> all other updates to the table are in status "waiting", this leads to
> a very high load and all max_connection setting is reached soon
> (on a Dual-Xeon 2.4 GHz machine with 2 GB RAM)

Other transactions should only need to block if they try to update a row
already updated by the "set blah=0" transaction.  You might consider
breaking the big update into a series of smaller transactions, say
updating 10% of the rows at a time.  (If you vacuumed after each of
these smaller updates, you could also minimize table bloat.)

We will eventually support "add column" with default, but AFAICS it will
only be a convenience feature; if you are concerned about keeping the
table available for concurrent use, the multi-step manual approach will
always win.

            regards, tom lane


pgsql-general by date:

Previous
From: Lonni J Friedman
Date:
Subject: Re: unable to dump database, toast errors
Next
From: Stephan Szabo
Date:
Subject: Re: feature request - adding columns with default value