Re: Tip box on Adding a Column - Mailing list pgsql-docs

From David Rowley
Subject Re: Tip box on Adding a Column
Date
Msg-id CAApHDvr_N8crEmtxL2QvkDqiAmg-NJExhLWQwCbCWD4hEHmYxQ@mail.gmail.com
Whole thread Raw
In response to Tip box on Adding a Column  (Marcelo Fernandes <marcefern7@gmail.com>)
List pgsql-docs
On Fri, 1 Nov 2024 at 22:06, Marcelo Fernandes <marcefern7@gmail.com> wrote:
> - https://www.postgresql.org/docs/current/ddl-alter.html#DDL-ALTER-ADDING-A-COLUMN
>
> That says:
>
> > From PostgreSQL 11, adding a column with a constant default value no longer
> > means that each row of the table needs to be updated when the ALTER TABLE
> > statement is executed. Instead, the default value will be returned the next
> > time the row is accessed, and applied when the table is rewritten, making the
> > ALTER TABLE very fast even on large tables.
>
> I'm just seeking clarification if this advice is true **even for** new columns
> declared with NOT NULL?

Yes, providing the default expression is immutable.

> Historically, I've had to add new fields on existing big tables with a NULL to
> avoid downtime, but it may be different when a DEFAULT is provided?

We have the ability to store immutable defaults in the catalogue
tables and "fill in the blanks" from there for any tuples that don't
have the new column.  Since we only can store 1 value per column, it
must be a constant, i.e. the same for all rows.

> I have used perf to profile the call-chain for adding a NOT NULL column with
> a default versus just an ordinary NULL with a default, and they are fairly
> similar.
>
> However, I see these functions being called in both cases:
>
> - ATRewriteTables
>      find_composite_type_dependencies
>      systable_beginscan
>      index_rescan
>      btrescan
>
> And the names raised my eyebrow... I don't have a deep understanding of the
> internals here, so it would be great if someone could clarify this for me.

I think you're better off staying in userland here and just doing some
experiments and using the performance numbers to help give you an idea
of what's going on.

Using psql:

\timing on
create table t1 as select a from generate_Series(1,1000000)a;
alter table t1 add column b int null; -- Time: 1.721 ms
alter table t1 add column c int not null default 1234; -- Time: 5.450 ms
alter table t1 add column d int not null default random(1,10); --
Time: 436.735 ms!!

only adding column d required a rewrite, that's because the default
expression isn't immutable.

If timing isn't confirmation enough, try doing the above with: SET
client_min_messages=debug1;

Only adding column "d" gives you: DEBUG:  rewriting table "t1"

David



pgsql-docs by date:

Previous
From: PG Doc comments form
Date:
Subject: PostgreSQL limits
Next
From: PG Doc comments form
Date:
Subject: fir for row level security