Thread: feature request - adding columns with default value

feature request - adding columns with default value

From
"Henrik Steffen"
Date:
hello,

when doing an "alter table blub add column blah int2"
and then issuing an "alter table blub alter column blah set default 0"
I will always have to manually "update blub set blah=0" to initialize
all existing records.... this is quite annoying while performing this
update on a running database with 100 simultaneous clients accessing
the database and about 443482 rows in the table.

it takes about 10 minutes to perform this update, and for about 6 minutes
all other updates to the table are in status "waiting", this leads to
a very high load and all max_connection setting is reached soon
(on a Dual-Xeon 2.4 GHz machine with 2 GB RAM)

i would suggest to change the "alter table add column" function to
adding a default value while adding the column if this is possible
and if this minimizes "down-time".

thank you

--

Mit freundlichem Gruß

Henrik Steffen
Geschäftsführer

top concepts Internetmarketing GmbH
Am Steinkamp 7 - D-21684 Stade - Germany
--------------------------------------------------------
http://www.topconcepts.com          Tel. +49 4141 991230
mail: steffen@topconcepts.com       Fax. +49 4141 991233
--------------------------------------------------------
24h-Support Hotline:  +49 1908 34697 (EUR 1.86/Min,topc)
--------------------------------------------------------
Ihr SMS-Gateway: JETZT NEU unter: http://sms.city-map.de
System-Partner gesucht: http://www.franchise.city-map.de
--------------------------------------------------------
Handelsregister: AG Stade HRB 5811 - UstId: DE 213645563
--------------------------------------------------------


Re: feature request - adding columns with default value

From
Shridhar Daithankar
Date:
On Friday 04 April 2003 13:38, you wrote:
> hello,
>
> when doing an "alter table blub add column blah int2"
> and then issuing an "alter table blub alter column blah set default 0"
> I will always have to manually "update blub set blah=0" to initialize
> all existing records.... this is quite annoying while performing this
> update on a running database with 100 simultaneous clients accessing
> the database and about 443482 rows in the table.
>
> it takes about 10 minutes to perform this update, and for about 6 minutes
> all other updates to the table are in status "waiting", this leads to
> a very high load and all max_connection setting is reached soon
> (on a Dual-Xeon 2.4 GHz machine with 2 GB RAM)
>
> i would suggest to change the "alter table add column" function to
> adding a default value while adding the column if this is possible
> and if this minimizes "down-time".

May be I don't get it right, but altering defaults on live table, would yield
inconsistent data. Some NULLS in past rows and zeros being inserted in new
rows and that is not correct. Am I right here?

Furthermore mass updating close to 450,000 row would give you a nice waste of
dead tuples and your next vacuum will take loooong time unless you are
vacuuming in tight loops while updating.

I recommend you schedule a downtime, dump the table and reload it. This way
there would be no wasted tuples. Any change in database schema should go thr.
a scheduled maintenance , if you ask me.

 Shridhar


Re: feature request - adding columns with default value

From
"Henrik Steffen"
Date:
fortunately, the vacuum after the update does only take
approx. 30 seconds thanks to the ultra-fast machine we
are using...


--

Mit freundlichem Gruß

Henrik Steffen
Geschäftsführer

top concepts Internetmarketing GmbH
Am Steinkamp 7 - D-21684 Stade - Germany
--------------------------------------------------------
http://www.topconcepts.com          Tel. +49 4141 991230
mail: steffen@topconcepts.com       Fax. +49 4141 991233
--------------------------------------------------------
24h-Support Hotline:  +49 1908 34697 (EUR 1.86/Min,topc)
--------------------------------------------------------
Ihr SMS-Gateway: JETZT NEU unter: http://sms.city-map.de
System-Partner gesucht: http://www.franchise.city-map.de
--------------------------------------------------------
Handelsregister: AG Stade HRB 5811 - UstId: DE 213645563
--------------------------------------------------------

----- Original Message -----
From: "Shridhar Daithankar" <shridhar_daithankar@persistent.co.in>
To: "pgsql" <pgsql-general@postgresql.org>
Sent: Friday, April 04, 2003 10:24 AM
Subject: Re: [GENERAL] feature request - adding columns with default value


> On Friday 04 April 2003 13:38, you wrote:
> > hello,
> >
> > when doing an "alter table blub add column blah int2"
> > and then issuing an "alter table blub alter column blah set default 0"
> > I will always have to manually "update blub set blah=0" to initialize
> > all existing records.... this is quite annoying while performing this
> > update on a running database with 100 simultaneous clients accessing
> > the database and about 443482 rows in the table.
> >
> > it takes about 10 minutes to perform this update, and for about 6
minutes
> > all other updates to the table are in status "waiting", this leads to
> > a very high load and all max_connection setting is reached soon
> > (on a Dual-Xeon 2.4 GHz machine with 2 GB RAM)
> >
> > i would suggest to change the "alter table add column" function to
> > adding a default value while adding the column if this is possible
> > and if this minimizes "down-time".
>
> May be I don't get it right, but altering defaults on live table, would
yield
> inconsistent data. Some NULLS in past rows and zeros being inserted in new
> rows and that is not correct. Am I right here?
>
> Furthermore mass updating close to 450,000 row would give you a nice waste
of
> dead tuples and your next vacuum will take loooong time unless you are
> vacuuming in tight loops while updating.
>
> I recommend you schedule a downtime, dump the table and reload it. This
way
> there would be no wasted tuples. Any change in database schema should go
thr.
> a scheduled maintenance , if you ask me.
>
>  Shridhar
>
>
> ---------------------------(end of broadcast)---------------------------
> TIP 6: Have you searched our list archives?
>
> http://archives.postgresql.org


Re: feature request - adding columns with default value

From
Andrew Sullivan
Date:
On Fri, Apr 04, 2003 at 10:08:50AM +0200, Henrik Steffen wrote:
> it takes about 10 minutes to perform this update, and for about 6 minutes
> all other updates to the table are in status "waiting", this leads to
> a very high load and all max_connection setting is reached soon
> (on a Dual-Xeon 2.4 GHz machine with 2 GB RAM)
>
> i would suggest to change the "alter table add column" function to
> adding a default value while adding the column if this is possible
> and if this minimizes "down-time".

Why do you think that having the back end set the default value on
those other rows won't sugger the very same problem?  What esle is it
to do except implicitly iees the same UPDATE you do?

One way around this problem is to write a little script (I use perl
for this, but pick your favourite flavour) which does the updates
1000 at a time.  This minimises the locking, so you don't have to
have anything WAITING for 10 minutes.

It also allows you to insert the occasional VACUUM so that your FSM
settings don't have to be huge.

A

--
----
Andrew Sullivan                         204-4141 Yonge Street
Liberty RMS                           Toronto, Ontario Canada
<andrew@libertyrms.info>                              M2P 2A8
                                         +1 416 646 3304 x110


Re: feature request - adding columns with default value

From
Diogo de Oliveira Biazus
Date:
Henrik Steffen wrote:

>hello,
>
>when doing an "alter table blub add column blah int2"
>and then issuing an "alter table blub alter column blah set default 0"
>I will always have to manually "update blub set blah=0" to initialize
>all existing records.... this is quite annoying while performing this
>update on a running database with 100 simultaneous clients accessing
>the database and about 443482 rows in the table.
>
>it takes about 10 minutes to perform this update, and for about 6 minutes
>all other updates to the table are in status "waiting", this leads to
>a very high load and all max_connection setting is reached soon
>(on a Dual-Xeon 2.4 GHz machine with 2 GB RAM)
>
>i would suggest to change the "alter table add column" function to
>adding a default value while adding the column if this is possible
>and if this minimizes "down-time".
>
>thank you
>
>
Maybe I'm wrong, but this feature was supported in 7.1,
and it was removed in 7.2.
If what I'm saying is true, I would like to know why was this feature
removed?

--
Diogo de Oliveira Biazus
diogo@ikono.com.br
Ikono Sistemas e Automação
http://www.ikono.com.br


Re: feature request - adding columns with default value

From
Stephan Szabo
Date:
On Fri, 4 Apr 2003, Diogo de Oliveira Biazus wrote:

> Henrik Steffen wrote:
>
> >hello,
> >
> >when doing an "alter table blub add column blah int2"
> >and then issuing an "alter table blub alter column blah set default 0"
> >I will always have to manually "update blub set blah=0" to initialize
> >all existing records.... this is quite annoying while performing this
> >update on a running database with 100 simultaneous clients accessing
> >the database and about 443482 rows in the table.
> >
> >it takes about 10 minutes to perform this update, and for about 6 minutes
> >all other updates to the table are in status "waiting", this leads to
> >a very high load and all max_connection setting is reached soon
> >(on a Dual-Xeon 2.4 GHz machine with 2 GB RAM)
> >
> >i would suggest to change the "alter table add column" function to
> >adding a default value while adding the column if this is possible
> >and if this minimizes "down-time".
> >
> >thank you
> >
> >
> Maybe I'm wrong, but this feature was supported in 7.1,
> and it was removed in 7.2.
> If what I'm saying is true, I would like to know why was this feature
> removed?

AFAIR, some older versions accepted the syntax but then totally ignored
the extra attributes (ie, you could say something like int2 default 0, but
the default was just silently lost).


Re: feature request - adding columns with default value

From
Tom Lane
Date:
"Henrik Steffen" <steffen@city-map.de> writes:
> when doing an "alter table blub add column blah int2"
> and then issuing an "alter table blub alter column blah set default 0"
> I will always have to manually "update blub set blah=0" to initialize
> all existing records.... this is quite annoying while performing this
> update on a running database with 100 simultaneous clients accessing
> the database and about 443482 rows in the table.

As someone else already pointed out, allowing "alter table add column"
to specify a default would simply mean the system has to go through
these same steps behind your back.  It would *not* be any faster.  What
it would be is a lot worse from a concurrency standpoint, because the
"alter table" transaction has to take an exclusive lock, which would
then have to be held throughout the update of the data rows.  Doing it
in separate steps allows the exclusive lock to be held only while the
critical catalog updates are performed.  The UPDATE part is then just
an ordinary writer that needs no exclusive lock.

> it takes about 10 minutes to perform this update, and for about 6 minutes
> all other updates to the table are in status "waiting", this leads to
> a very high load and all max_connection setting is reached soon
> (on a Dual-Xeon 2.4 GHz machine with 2 GB RAM)

Other transactions should only need to block if they try to update a row
already updated by the "set blah=0" transaction.  You might consider
breaking the big update into a series of smaller transactions, say
updating 10% of the rows at a time.  (If you vacuumed after each of
these smaller updates, you could also minimize table bloat.)

We will eventually support "add column" with default, but AFAICS it will
only be a convenience feature; if you are concerned about keeping the
table available for concurrent use, the multi-step manual approach will
always win.

            regards, tom lane


Re: feature request - adding columns with default value

From
Stephan Szabo
Date:
On Fri, 4 Apr 2003, Henrik Steffen wrote:

> when doing an "alter table blub add column blah int2"
> and then issuing an "alter table blub alter column blah set default 0"
> I will always have to manually "update blub set blah=0" to initialize
> all existing records.... this is quite annoying while performing this
> update on a running database with 100 simultaneous clients accessing
> the database and about 443482 rows in the table.
>
> it takes about 10 minutes to perform this update, and for about 6 minutes
> all other updates to the table are in status "waiting", this leads to
> a very high load and all max_connection setting is reached soon
> (on a Dual-Xeon 2.4 GHz machine with 2 GB RAM)
>
> i would suggest to change the "alter table add column" function to
> adding a default value while adding the column if this is possible
> and if this minimizes "down-time".

IIRC, ADD COLUMN adds the column to the end of the list, but doesn't
modify the actual rows stored on disk.  I think when the column past the
end is accessed on a row that doesn't have it a NULL is returned.  SET
DEFAULT just sets the default up but also doesn't modify the actual rows
stored on disk.  Presumably SET DEFAULT should probably do the update, but
it's not likely to be appreciably faster than you doing the update.

Other options would include things like updating the rows on select (that
sounds like it could lead to wierd deadlocks between a select and an
update), returning the default rather than NULL for the access (but that
doesn't work for non-immutable defaults unless you update the row during
the select) or building a new copy of the table (requires potentially alot
of disk space).  In addition, since the transaction in question probably
has an exclusive lock to the table, the other transactions need to wait
while the alter runs (so a new copy doesn't save you from locking reads
while the table is building unless you drop the lock, which is probably
dangerous). I think this was discussed in the past, so the archives can
probably do a better job than I can (since I wasn't paying all that much
attention).


Re: feature request - adding columns with default value

From
Tom Lane
Date:
Diogo de Oliveira Biazus <diogo@ikono.com.br> writes:
> Henrik Steffen wrote:
>> i would suggest to change the "alter table add column" function to
>> adding a default value while adding the column if this is possible
>> and if this minimizes "down-time".

> Maybe I'm wrong, but this feature was supported in 7.1,
> and it was removed in 7.2.

THe old implementation was really equivalent to "alter table add column"
followed by "set default", because it didn't touch the actual rows
(the implied new column stayed NULL).  That's not what the command
should do per SQL spec, so we took it out until we get around to
implementing the spec-compliant behavior.  The current behavior conforms
to spec as far as it goes.

            regards, tom lane


Re: feature request - adding columns with default value

From
Tom Lane
Date:
Stephan Szabo <sszabo@megazone23.bigpanda.com> writes:
> SET DEFAULT just sets the default up but also doesn't modify the actual rows
> stored on disk.  Presumably SET DEFAULT should probably do the update,

No, it shouldn't; the existing behavior is per spec.  SET DEFAULT just
changes the default for future insertions, it's not supposed to touch
the present table contents.

ALTER TABLE ADD COLUMN with a default clause *is* supposed to fill all
rows of the table with the default.  The reason we reject it is we don't
have that behavior implemented.

            regards, tom lane


Re: feature request - adding columns with default value

From
Dennis Gearon
Date:
do ANY databases do this? How does it fit the SQL standard. And, whether you do
it manually or the ALTER TABLE command does it, it's still going to block other
updates.

Henrik Steffen wrote:
> hello,
>
> when doing an "alter table blub add column blah int2"
> and then issuing an "alter table blub alter column blah set default 0"
> I will always have to manually "update blub set blah=0" to initialize
> all existing records.... this is quite annoying while performing this
> update on a running database with 100 simultaneous clients accessing
> the database and about 443482 rows in the table.
>
> it takes about 10 minutes to perform this update, and for about 6 minutes
> all other updates to the table are in status "waiting", this leads to
> a very high load and all max_connection setting is reached soon
> (on a Dual-Xeon 2.4 GHz machine with 2 GB RAM)
>
> i would suggest to change the "alter table add column" function to
> adding a default value while adding the column if this is possible
> and if this minimizes "down-time".
>
> thank you
>
> --
>
> Mit freundlichem Gruß
>
> Henrik Steffen
> Geschäftsführer
>
> top concepts Internetmarketing GmbH
> Am Steinkamp 7 - D-21684 Stade - Germany
> --------------------------------------------------------
> http://www.topconcepts.com          Tel. +49 4141 991230
> mail: steffen@topconcepts.com       Fax. +49 4141 991233
> --------------------------------------------------------
> 24h-Support Hotline:  +49 1908 34697 (EUR 1.86/Min,topc)
> --------------------------------------------------------
> Ihr SMS-Gateway: JETZT NEU unter: http://sms.city-map.de
> System-Partner gesucht: http://www.franchise.city-map.de
> --------------------------------------------------------
> Handelsregister: AG Stade HRB 5811 - UstId: DE 213645563
> --------------------------------------------------------
>
>
> ---------------------------(end of broadcast)---------------------------
> TIP 1: subscribe and unsubscribe commands go to majordomo@postgresql.org
>


Re: feature request - adding columns with default value

From
Stephan Szabo
Date:
On Fri, 4 Apr 2003, Tom Lane wrote:

> Stephan Szabo <sszabo@megazone23.bigpanda.com> writes:
> > SET DEFAULT just sets the default up but also doesn't modify the actual rows
> > stored on disk.  Presumably SET DEFAULT should probably do the update,
>
> No, it shouldn't; the existing behavior is per spec.  SET DEFAULT just
> changes the default for future insertions, it's not supposed to touch
> the present table contents.

Yeah, realized that didn't make much sense afterwards.

> ALTER TABLE ADD COLUMN with a default clause *is* supposed to fill all
> rows of the table with the default.  The reason we reject it is we don't
> have that behavior implemented.

Right.  We should probably change the error text then since it doesn't
actually mention the update being necessary (and what it suggests is just
as wrong as the behavior it would have if we just did the default in one
step).


Re: feature request - adding columns with default value

From
"Wayne Armstrong"
Date:
** Reply to message from Dennis Gearon <gearond@cvc.net> on Fri, 04 Apr 2003
08:19:24 -0800
Yeah,
 Db2 does.
 The reall usefulness of this is the form :-
  Alter table add blah varchar(10) not null default 'No'

  The backend appears to initialise the new column to the default value (just
about instantly - even for large tables), and you can add  a not null column to
the table in one statement.

  Regards,
  Wayne

> do ANY databases do this? How does it fit the SQL standard. And, whether you do
> it manually or the ALTER TABLE command does it, it's still going to block other
> updates.
>
> Henrik Steffen wrote:
> > hello,
> >
> > when doing an "alter table blub add column blah int2"
> > and then issuing an "alter table blub alter column blah set default 0"
> > I will always have to manually "update blub set blah=0" to initialize
> > all existing records.... this is quite annoying while performing this
> > update on a running database with 100 simultaneous clients accessing
> > the database and about 443482 rows in the table.
> >
> > it takes about 10 minutes to perform this update, and for about 6 minutes
> > all other updates to the table are in status "waiting", this leads to
> > a very high load and all max_connection setting is reached soon
> > (on a Dual-Xeon 2.4 GHz machine with 2 GB RAM)
> >
> > i would suggest to change the "alter table add column" function to
> > adding a default value while adding the column if this is possible
> > and if this minimizes "down-time".
> >
> > thank you
> >
> > --
> >
> > Mit freundlichem Gruß
> >
> > Henrik Steffen
> > Geschäftsführer
> >
> > top concepts Internetmarketing GmbH
> > Am Steinkamp 7 - D-21684 Stade - Germany
> > --------------------------------------------------------
> > http://www.topconcepts.com          Tel. +49 4141 991230
> > mail: steffen@topconcepts.com       Fax. +49 4141 991233
> > --------------------------------------------------------
> > 24h-Support Hotline:  +49 1908 34697 (EUR 1.86/Min,topc)
> > --------------------------------------------------------
> > Ihr SMS-Gateway: JETZT NEU unter: http://sms.city-map.de
> > System-Partner gesucht: http://www.franchise.city-map.de
> > --------------------------------------------------------
> > Handelsregister: AG Stade HRB 5811 - UstId: DE 213645563
> > --------------------------------------------------------
> >
> >
> > ---------------------------(end of broadcast)---------------------------
> > TIP 1: subscribe and unsubscribe commands go to majordomo@postgresql.org
> >
>
>
> ---------------------------(end of broadcast)---------------------------
> TIP 5: Have you checked our extensive FAQ?
>
> http://www.postgresql.org/docs/faqs/FAQ.html