Thread: Fast ALTER TABLE ... ADD COLUMN ... DEFAULT xxx?

Fast ALTER TABLE ... ADD COLUMN ... DEFAULT xxx?

From
Dmitry Koterov
Date:
Hello.<br /><br />PostgreSQL is very fast when we perform (even on a huge table)<br /><br />ALTER TABLE ... ADD COLUMN
...NULL;<br /><br />(nullable without a default value). This is because of NULL bitmap in tuples. And it's greatest
featurefor a developer!<br /><br /><br />But another very common-case query like<br /><br />ALTER TABLE ... ADD COLUMN
...BOOLEAN NOT NULL DEFAULT false;<br />or<br />ALTER TABLE ... ADD COLUMN ... INT NOT NULL DEFAULT 0;<br /><br />for a
hugetable is performed very slow - this is because PostgreSQL have to re-create all tuples assigning the default value
tothem. If I have a table with 1 billion rows (for example), I have no chance to perform this query at all - too
slow.<br/><br />(In most cases NOT NULL DEFAULT xxx fields are BOOLEAN, flags: it is not handy to have 3-way flags.)<br
/><br/><br />So, are there plans to optimize such kind of queries? This could be done by many ways:<br /><br />1. Store
theDEFAULT flag directly in NULL BITMAP (add a bit to NULL bitmap not only for NULLable fields, but also for NOT NULL
DEFAULT... fields).<br /> 2. Add another bitmap for each tuple (DEFAULT bitmap). Bit value 0 means that there is a real
valuein a cell, 1 - that the value is default.<br />3. The same as (1), but always force default value to be 0 (or
falseor any other values with meaning "zero") and optimize only these cases.<br /><br /><br /> 

Re: Fast ALTER TABLE ... ADD COLUMN ... DEFAULT xxx?

From
Peter Eisentraut
Date:
On Thursday 21 May 2009 11:06:29 Dmitry Koterov wrote:
> 1. Store the DEFAULT flag directly in NULL BITMAP (add a bit to NULL bitmap
> not only for NULLable fields, but also for NOT NULL DEFAULT ... fields).
> 2. Add another bitmap for each tuple (DEFAULT bitmap). Bit value 0 means
> that there is a real value in a cell, 1 - that the value is default.
> 3. The same as (1), but always force default value to be 0 (or false or any
> other values with meaning "zero") and optimize only these cases.

It seems to me that these solutions would require everyone in the world to pay 
the overhead of a few to many bits in every tuple to optimize what appears to 
be a relatively rare circumstance after all.  Doesn't seem worth it to me.


Re: Fast ALTER TABLE ... ADD COLUMN ... DEFAULT xxx?

From
Sam Mason
Date:
On Thu, May 21, 2009 at 12:06:29PM +0400, Dmitry Koterov wrote:
> ALTER TABLE ... ADD COLUMN ... NULL;
> 
> (nullable without a default value). This is because of NULL bitmap in
> tuples. And it's greatest feature for a developer!

I don't think this is because of the "NULL bitmap".  PG just never needs
to flush the changes to every tuple because it knows that all "old"
tuples (i.e. ones that were created before this column was added) are
supposed to be NULL.

> But another very common-case query like
> 
> ALTER TABLE ... ADD COLUMN ... BOOLEAN NOT NULL DEFAULT false;
> or
> ALTER TABLE ... ADD COLUMN ... INT NOT NULL DEFAULT 0;

> So, are there plans to optimize such kind of queries? This could be done by
> many ways:

I think this hasn't been done before because it's been considered too
difficult to keep track of everything, but I've just tried to come up
with an example of why it's difficult and failed.  If I'm interpreting
things correctly it's not nearly as difficult as I thought it should
be.  All that needs to be tracked is the "first" default value (this is
currently assumed to be NULL).  All subsequent INSERTs will have this
value in the tuple and things should just work out.
 CREATE TABLE t ( i INTEGER PRIMARY KEY ); INSERT INTO t (i) VALUES (1); ALTER TABLE t ADD COLUMN j INTEGER DEFAULT 1;
INSERTINTO t (i) VALUES (2); ALTER TABLE t ALTER j SET DEFAULT 2; INSERT INTO t (i) VALUES (3); ALTER TABLE t ALTER j
DROPDEFAULT; INSERT INTO t (i) VALUES (4);
 

After this we will have the following tuples:
 (1) (2,1) (3,2) (4,NULL)

All that needs to be done is to fill in the "default" for i=1 to the
first default (i.e. the integer 1) and everything is done.

Who wants to tell me what I've missed?

--  Sam  http://samason.me.uk/


Re: Fast ALTER TABLE ... ADD COLUMN ... DEFAULT xxx?

From
Robert Haas
Date:
> (In most cases NOT NULL DEFAULT xxx fields are BOOLEAN, flags: it is not
> handy to have 3-way flags.)

This is certainly not true for me.  I have both nullable booleans and
not-nullable, defaulted columns of other types.

> So, are there plans to optimize such kind of queries? This could be done by
> many ways:
>
> 1. Store the DEFAULT flag directly in NULL BITMAP (add a bit to NULL bitmap
> not only for NULLable fields, but also for NOT NULL DEFAULT ... fields).
> 2. Add another bitmap for each tuple (DEFAULT bitmap). Bit value 0 means
> that there is a real value in a cell, 1 - that the value is default.

Both of these options would expand storage usage by a not-inconsiderable amount.

> 3. The same as (1), but always force default value to be 0 (or false or any
> other values with meaning "zero") and optimize only these cases.

I'm not sure there's any way to know this for an arbitrary data type.

...Robert


Re: Fast ALTER TABLE ... ADD COLUMN ... DEFAULT xxx?

From
Tom Lane
Date:
Sam Mason <sam@samason.me.uk> writes:
> On Thu, May 21, 2009 at 12:06:29PM +0400, Dmitry Koterov wrote:
>> ALTER TABLE ... ADD COLUMN ... NULL;
>> 
>> (nullable without a default value). This is because of NULL bitmap in
>> tuples. And it's greatest feature for a developer!

> I don't think this is because of the "NULL bitmap".

No, it isn't.  It's because each tuple includes the actual count of
fields it contains (t_natts or HeapTupleHeaderGetNatts), and the value
extraction routines are coded to assume that references to fields
beyond that number should yield NULL.  So the ALTER can just leave
the existing rows alone --- only when you update a row will it change
to include the newly added field(s).

AFAICS there's no good way to scale that solution up to handling
non-null values.

> All that needs to be tracked is the "first" default value (this is
> currently assumed to be NULL).

You're being a bit vague, but in any case I don't think it can work
for non-constant defaults (consider DEFAULT NOW()).  And what about
ALTER COLUMN DEFAULT?

(BTW, I'm quite sure schemes like this have been discussed before.
Check the archives...)
        regards, tom lane


Re: Fast ALTER TABLE ... ADD COLUMN ... DEFAULT xxx?

From
Robert Haas
Date:
On Thu, May 21, 2009 at 10:45 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Sam Mason <sam@samason.me.uk> writes:
>> On Thu, May 21, 2009 at 12:06:29PM +0400, Dmitry Koterov wrote:
>>> ALTER TABLE ... ADD COLUMN ... NULL;
>>>
>>> (nullable without a default value). This is because of NULL bitmap in
>>> tuples. And it's greatest feature for a developer!
>
>> I don't think this is because of the "NULL bitmap".
>
> No, it isn't.  It's because each tuple includes the actual count of
> fields it contains (t_natts or HeapTupleHeaderGetNatts), and the value
> extraction routines are coded to assume that references to fields
> beyond that number should yield NULL.  So the ALTER can just leave
> the existing rows alone --- only when you update a row will it change
> to include the newly added field(s).
>
> AFAICS there's no good way to scale that solution up to handling
> non-null values.
>
>> All that needs to be tracked is the "first" default value (this is
>> currently assumed to be NULL).
>
> You're being a bit vague, but in any case I don't think it can work
> for non-constant defaults (consider DEFAULT NOW()).  And what about
> ALTER COLUMN DEFAULT?

I think what Sam is proposing is that, in addition to storing the
default value for a column, you could store the value at the time the
column was added.  These might be the same if the default is a
constant and has never been modified, or they might be different.
This would even work for now() because it's stable, but it couldn't be
used for a volatile function like random() or timeofday(), of course.

...Robert


Re: Fast ALTER TABLE ... ADD COLUMN ... DEFAULT xxx?

From
Greg Stark
Date:
On Thu, May 21, 2009 at 3:45 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
>> All that needs to be tracked is the "first" default value (this is
>> currently assumed to be NULL).
>
> You're being a bit vague, but in any case I don't think it can work
> for non-constant defaults (consider DEFAULT NOW()).  And what about
> ALTER COLUMN DEFAULT?
>
> (BTW, I'm quite sure schemes like this have been discussed before.
> Check the archives...)

Schemes like this have been discussed before but I don't think we
considered applying the limitation that only the "first" default value
would be covered. We always wanted to be able to handle new defaults
or making a non-null column nullable later. I think the main
motivation in the past was space savings for default values rather
than making adding new non-null columns cheap.

AFAICS as long as we only want to handle newly created non-nullable
columns with initial defaults that are applied when the column is
added then we could actually handle it by storing the initial default
value in the catalog and keeping it in the tupledesc at all times.

If you later change the default then it would only affect newly
inserted rows which will have the new default value explicitly
included anyways. And if likewise if you later make the row nullable
the column will be explicitly listed in natts and the null bitmap.

One gotcha I can think of is if the default expression depends on
something like a type. If the default is later changed then people
might be surprised to find there's still an invisible dependency on
the type. But if it's limited to simple constants that should minimize
that issue quite a bit. At least for types the column itself will have
the same dependency anyways.

--
greg


Re: Fast ALTER TABLE ... ADD COLUMN ... DEFAULT xxx?

From
Tom Lane
Date:
Greg Stark <stark@enterprisedb.com> writes:
> Schemes like this have been discussed before but I don't think we
> considered applying the limitation that only the "first" default value
> would be covered. We always wanted to be able to handle new defaults
> or making a non-null column nullable later.

Yeah ... I don't see exactly what it would buy to restrict it to just
the first such value.
        regards, tom lane


Re: Fast ALTER TABLE ... ADD COLUMN ... DEFAULT xxx?

From
Robert Haas
Date:
On Thu, May 21, 2009 at 11:13 AM, Greg Stark <stark@enterprisedb.com> wrote:
> On Thu, May 21, 2009 at 3:45 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>>
>>> All that needs to be tracked is the "first" default value (this is
>>> currently assumed to be NULL).
>>
>> You're being a bit vague, but in any case I don't think it can work
>> for non-constant defaults (consider DEFAULT NOW()).  And what about
>> ALTER COLUMN DEFAULT?
>>
>> (BTW, I'm quite sure schemes like this have been discussed before.
>> Check the archives...)
>
> Schemes like this have been discussed before but I don't think we
> considered applying the limitation that only the "first" default value
> would be covered. We always wanted to be able to handle new defaults
> or making a non-null column nullable later. I think the main

How would this scheme prevent you from doing that?  The tuples that
are added subsequent to the initial ADD COLUMN will have the value for
that column in it, whether NULL or otherwise, default or not.  The
only thing you need to store is the value to be used (in place of
NULL) when reading an old, short tuple.

> One gotcha I can think of is if the default expression depends on
> something like a type. If the default is later changed then people
> might be surprised to find there's still an invisible dependency on
> the type. But if it's limited to simple constants that should minimize
> that issue quite a bit. At least for types the column itself will have
> the same dependency anyways.

I was also wondering whether there might be some gotchas in this area.

...Robert


Re: Fast ALTER TABLE ... ADD COLUMN ... DEFAULT xxx?

From
Greg Stark
Date:
On Thu, May 21, 2009 at 4:22 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Greg Stark <stark@enterprisedb.com> writes:
>> Schemes like this have been discussed before but I don't think we
>> considered applying the limitation that only the "first" default value
>> would be covered. We always wanted to be able to handle new defaults
>> or making a non-null column nullable later.
>
> Yeah ... I don't see exactly what it would buy to restrict it to just
> the first such value.

Well it wouldn't buy you steady-state space savings or performance improvements.

What it would buy you is a much narrowed set of circumstances where
ALTER TABLE ADD COLUMN goes from a fast O(1) catalog change to a
complete table rewrite. The use cases covered such as "boolean DEFAULT
false" or "integer DEFAULT 0" are extremely common.

I think users today often avoid the full table rewrite either make
their application treat null as implicitly the default value or do a
piecemeal rewrite using updates.

I think Robert Haas is right that we could handle any stable
expression by evaluating the expression once and storing only the
final resulting value as a constant. That would avoid the problems
with dependencies and later changes to functions.

Another gotcha is that the default value might be very large.... It
can't be very common but I suppose we would have to take some care
around that.

-- 
greg


Re: Fast ALTER TABLE ... ADD COLUMN ... DEFAULT xxx?

From
Tom Lane
Date:
Greg Stark <stark@enterprisedb.com> writes:
> On Thu, May 21, 2009 at 4:22 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> Yeah ... I don't see exactly what it would buy to restrict it to just
>> the first such value.

> Well it wouldn't buy you steady-state space savings or performance improvements.

> What it would buy you is a much narrowed set of circumstances where
> ALTER TABLE ADD COLUMN goes from a fast O(1) catalog change to a
> complete table rewrite. The use cases covered such as "boolean DEFAULT
> false" or "integer DEFAULT 0" are extremely common.

No, you missed my point --- what's the value of having an implementation
of this that only works for one column?  If we do it, I'd envision it
as an extra column in pg_attribute, and it would work for any column(s).
There's nothing to be gained by restricting it.

> I think Robert Haas is right that we could handle any stable
> expression by evaluating the expression once and storing only the
> final resulting value as a constant. That would avoid the problems
> with dependencies and later changes to functions.

Right, that's *necessary* to avoid changing semantics compared to
the non-optimized behavior.

> Another gotcha is that the default value might be very large.... It
> can't be very common but I suppose we would have to take some care
> around that.

Yeah, that occurred to me too.  We'd probably not be able to toast
the pg_attribute column (depending on exactly how it's
declared/represented) so we'd have to put a limit on the width of
data value that we'd be willing to handle this way.  Seems doable
though.
        regards, tom lane


Re: Fast ALTER TABLE ... ADD COLUMN ... DEFAULT xxx?

From
Greg Stark
Date:

--  
Greg


On 21 May 2009, at 12:26, Tom Lane <tgl@sss.pgh.pa.us> wrote:

> Greg Stark <stark@enterprisedb.com> writes:
>> On Thu, May 21, 2009 at 4:22 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>>> Yeah ... I don't see exactly what it would buy to restrict it to  
>>> just
>>> the first such value.
>
>> Well it wouldn't buy you steady-state space savings or performance  
>> improvements.
>
>> What it would buy you is a much narrowed set of circumstances where
>> ALTER TABLE ADD COLUMN goes from a fast O(1) catalog change to a
>> complete table rewrite. The use cases covered such as "boolean  
>> DEFAULT
>> false" or "integer DEFAULT 0" are extremely common.
>
> No, you missed my point --- what's the value of having an  
> implementation
> of this that only works for one column?  If we do it, I'd envision it
> as an extra column in pg_attribute, and it would work for any  
> column(s).
> There's nothing to be gained by restricting it.
>

Oh, I never meant to restrict it to one column.

It might be nice to have vacuum notice the minimum natts in the table  
and trim the old unneeded ones. But I can't think of a very compelling  
reason to. Perhaps to save memory used in tuplestores?





>> I think Robert Haas is right that we could handle any stable
>> expression by evaluating the expression once and storing only the
>> final resulting value as a constant. That would avoid the problems
>> with dependencies and later changes to functions.
>
> Right, that's *necessary* to avoid changing semantics compared to
> the non-optimized behavior.

I'm coming at it from the other direction. I was assuming we could  
only handle simple constants and am trying to see how wide we can  
expand it. Doing all stable expressions would seem pretty convincingly  
wide to me.


>> Another gotcha is that the default value might be very large.... It
>> can't be very common but I suppose we would have to take some care
>> around that.
>
> Yeah, that occurred to me too.  We'd probably not be able to toast
> the pg_attribute column (depending on exactly how it's
> declared/represented) so we'd have to put a limit on the width of
> data value that we'd be willing to handle this way.  Seems doable
> though.
>
>            regards, tom lane


Re: Fast ALTER TABLE ... ADD COLUMN ... DEFAULT xxx?

From
Sam Mason
Date:
On Thu, May 21, 2009 at 12:26:08PM -0400, Tom Lane wrote:
> I'd envision it
> as an extra column in pg_attribute, and it would work for any column(s).
> There's nothing to be gained by restricting it.

Yes, when I said "first" I meant the only thing that needs to be stored
is the first default value for the column.  The code currently assumes
that this first default value is NULL and is hence able to optimize the
case where you adding a NULLable column.  Adding this first default
value to pg_attribute would allow you to break this assumption and
allow the user to have non-NULL default values without complete table
rewrites.

I think the discussion covered this, but I don't think I was
particularly clear in my first message and thought I should attempt to
explain what I was saying.  Other comments about only allowing STABLE
expressions and non-toastable values make sense and were the sorts of
things I thought I would miss.

--  Sam  http://samason.me.uk/


Re: Fast ALTER TABLE ... ADD COLUMN ... DEFAULT xxx?

From
Dmitry Koterov
Date:
<div class="gmail_quote">On Thu, May 21, 2009 at 6:45 PM, Tom Lane <span dir="ltr"><<a
href="mailto:tgl@sss.pgh.pa.us">tgl@sss.pgh.pa.us</a>></span>wrote:<br /><blockquote class="gmail_quote"
style="border-left:1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;"><div class="im">Sam
Mason<<a href="mailto:sam@samason.me.uk">sam@samason.me.uk</a>> writes:<br /> > On Thu, May 21, 2009 at
12:06:29PM+0400, Dmitry Koterov wrote:<br /> >> ALTER TABLE ... ADD COLUMN ... NULL;<br /> >><br />
>>(nullable without a default value). This is because of NULL bitmap in<br /> >> tuples. And it's greatest
featurefor a developer!<br /><br /> > I don't think this is because of the "NULL bitmap".<br /><br /></div>No, it
isn't. It's because each tuple includes the actual count of<br /> fields it contains (t_natts or
HeapTupleHeaderGetNatts),and the value<br /> extraction routines are coded to assume that references to fields<br />
beyondthat number should yield NULL.  So the ALTER can just leave<br /> the existing rows alone --- only when you
updatea row will it change<br /> to include the newly added field(s).<br /></blockquote></div><br />No, I meant that in
caseof the row (1, NULL, NULL, 2, 3, NULL):<br />- the corresponding NULL bitmap is (100110...)<br />- the
correspondingtuple is (1, 2, 3)<br />- t_natts=3 (if I am not wrong here)<br /><br />And in case of the row (5, 6,
NULL,7, 8, 9):<br />- the corresponding NULL bitmap is (110111...)<br /> - the corresponding tuple is (5, 6, 7, 9)<br
/>- t_natts=4<br /><br />So, without a NULL bitmap, we cannot handle this kind of rows at all by t_natts only. And the
NULLbitmap plays very important role in tuple saving, and I meant exactly that point. <br /><br /><br /> 

Re: Fast ALTER TABLE ... ADD COLUMN ... DEFAULT xxx?

From
Tom Lane
Date:
Dmitry Koterov <dmitry@koterov.ru> writes:
> No, I meant that in case of the row (1, NULL, NULL, 2, 3, NULL):
> - the corresponding NULL bitmap is (100110...)
> - the corresponding tuple is (1, 2, 3)
> - t_natts=3 (if I am not wrong here)

You are wrong --- t_natts would be six here.  In general the length of
the null bitmap in a tuple (if it has one at all) is always exactly
equal to its t_natts value.
        regards, tom lane


Re: Fast ALTER TABLE ... ADD COLUMN ... DEFAULT xxx?

From
Dmitry Koterov
Date:
<div class="gmail_quote"><blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt
0pt0pt 0.8ex; padding-left: 1ex;"><div class="im">Dmitry Koterov <<a
href="mailto:dmitry@koterov.ru">dmitry@koterov.ru</a>>writes:<br /> > No, I meant that in case of the row (1,
NULL,NULL, 2, 3, NULL):<br /> > - the corresponding NULL bitmap is (100110...)<br /> > - the corresponding tuple
is(1, 2, 3)<br /> > - t_natts=3 (if I am not wrong here)<br /><br /></div>You are wrong --- t_natts would be six
here. In general the length of<br /> the null bitmap in a tuple (if it has one at all) is always exactly<br /> equal to
itst_natts value.<br /></blockquote></div><br />And so, the real number of values in the tuple - (1, 2, 3) above - is
equalto the number of 1-bits in NULL bitmap. And the size of NULL bitmap is held in t_natts. I meant that when I said
"thanksto NULL bitmap, adding a new nullable column is cheap". :-) And, of course, thanks to t_natts
(HeapTupleHeaderGetNattsmacro) - too.<br /><br />