Fast AT ADD COLUMN with DEFAULTs - Mailing list pgsql-hackers

From Serge Rielau
Subject Fast AT ADD COLUMN with DEFAULTs
Date
Msg-id 90c85206-f465-472f-b782-5a6412d217f3@rielau.com
Whole thread Raw
Responses Re: Fast AT ADD COLUMN with DEFAULTs  (Vitaly Burovoy <vitaly.burovoy@gmail.com>)
Re: Fast AT ADD COLUMN with DEFAULTs  (Andres Freund <andres@anarazel.de>)
List pgsql-hackers
Dear Hackers,

I’m working on a patch that expands PG’s ability to add columns to a table without a table rewrite (i.e. at O(1) cost) from the nullable-without-default to a more general case.
E.g.
CREATE TABLE T(pk INT NOT NULL PRIMARY KEY);
INSERT INTO T VALEUS (1), (2), (3);
ALTER TABLE T ADD COLUMN c1 INTEGER NOT NULL DEFAULT 5;
INSERT INTO T VALUES (4, DEFAULT);
ALTER TABLE T ALTER COLUMN SET DEFAULT 6;
INSERT INTO T VALUS (5, DEFAULT);
SELECT * FROM T ORDER BY pk;
=>
(1, 5),
(2, 5),
(3, 5),
(4, 5),
(5, 6);

Rows 1-3 have never been updated, yet they know that their values of c1 is 5.

The requirement is driven by large tables for which add column takes too much time and/or produces too large a transaction for comfort.

In simplified terms:
* a second “exist default” is computed and stored in the catalogs at time of 
   AT ADD COLUMN
* The exist default is cached in the tuple descriptor (e.g in attrdef)
* When one of the getAttr or copytuple related routines is invoked the exist default is filled in instead of simply NULL padding if the tuple is shorter the requested attribute number.

Is there an interest in principle in the community for this functionality?

Cheers
Serge Rielau
Salesforce.com

pgsql-hackers by date:

Previous
From: Heikki Linnakangas
Date:
Subject: Re: PostgreSQL - Weak DH group
Next
From: Tom Lane
Date:
Subject: Re: Our "fallback" atomics implementation doesn't actually work