Thread: pgsql: Fast ALTER TABLE ADD COLUMN with a non-NULL default

pgsql: Fast ALTER TABLE ADD COLUMN with a non-NULL default

From
Andrew Dunstan
Date:
Fast ALTER TABLE ADD COLUMN with a non-NULL default

Currently adding a column to a table with a non-NULL default results in
a rewrite of the table. For large tables this can be both expensive and
disruptive. This patch removes the need for the rewrite as long as the
default value is not volatile. The default expression is evaluated at
the time of the ALTER TABLE and the result stored in a new column
(attmissingval) in pg_attribute, and a new column (atthasmissing) is set
to true. Any existing row when fetched will be supplied with the
attmissingval. New rows will have the supplied value or the default and
so will never need the attmissingval.

Any time the table is rewritten all the atthasmissing and attmissingval
settings for the attributes are cleared, as they are no longer needed.

The most visible code change from this is in heap_attisnull, which
acquires a third TupleDesc argument, allowing it to detect a missing
value if there is one. In many cases where it is known that there will
not be any (e.g.  catalog relations) NULL can be passed for this
argument.

Andrew Dunstan, heavily modified from an original patch from Serge
Rielau.
Reviewed by Tom Lane, Andres Freund, Tomas Vondra and David Rowley.

Discussion: https://postgr.es/m/31e2e921-7002-4c27-59f5-51f08404c858@2ndQuadrant.com

Branch
------
master

Details
-------
https://git.postgresql.org/pg/commitdiff/16828d5c0273b4fe5f10f42588005f16b415b2d8

Modified Files
--------------
doc/src/sgml/catalogs.sgml                  |  27 ++
doc/src/sgml/ref/alter_table.sgml           |  34 +-
src/backend/access/common/heaptuple.c       | 629 ++++++++++++++++++++++------
src/backend/access/common/tupdesc.c         |  60 +++
src/backend/catalog/aclchk.c                |   2 +-
src/backend/catalog/heap.c                  | 171 +++++++-
src/backend/catalog/index.c                 |   4 +-
src/backend/commands/cluster.c              |  12 +-
src/backend/commands/functioncmds.c         |   2 +-
src/backend/commands/indexcmds.c            |   4 +-
src/backend/commands/tablecmds.c            |  51 ++-
src/backend/commands/typecmds.c             |   2 +-
src/backend/executor/execExprInterp.c       |   2 +-
src/backend/executor/execMain.c             |  13 +-
src/backend/executor/execTuples.c           |  28 +-
src/backend/executor/execUtils.c            |   2 +
src/backend/optimizer/util/clauses.c        |   4 +-
src/backend/optimizer/util/plancat.c        |   8 +-
src/backend/rewrite/rewriteHandler.c        |   3 +-
src/backend/statistics/extended_stats.c     |   2 +-
src/backend/utils/adt/ri_triggers.c         |  29 +-
src/backend/utils/adt/ruleutils.c           |   6 +-
src/backend/utils/cache/relcache.c          |  82 +++-
src/backend/utils/fmgr/fmgr.c               |   4 +-
src/backend/utils/fmgr/funcapi.c            |   8 +-
src/include/access/htup_details.h           |   4 +-
src/include/access/tupdesc.h                |   3 +
src/include/access/tupdesc_details.h        |  29 ++
src/include/catalog/heap.h                  |   6 +-
src/include/catalog/pg_attribute.h          |  30 +-
src/include/catalog/pg_class.h              |   2 +-
src/test/regress/expected/event_trigger.out |   4 +-
src/test/regress/expected/fast_default.out  | 515 +++++++++++++++++++++++
src/test/regress/parallel_schedule          |   2 +-
src/test/regress/serial_schedule            |   1 +
src/test/regress/sql/fast_default.sql       | 357 ++++++++++++++++
36 files changed, 1898 insertions(+), 244 deletions(-)


Re: pgsql: Fast ALTER TABLE ADD COLUMN with a non-NULL default

From
Andres Freund
Date:
Hi,

On 2018-03-28 00:56:10 +0000, Andrew Dunstan wrote:
> Fast ALTER TABLE ADD COLUMN with a non-NULL default

> src/backend/executor/execExprInterp.c       |   2 +-

I bet this'll break JITed tuple deforming... We'll have to emit a call
to the new slot missingattrs function, instead of using memset directly.

Do you want to attempt to take care of it, or should I?

Greetings,

Andres Freund


Re: pgsql: Fast ALTER TABLE ADD COLUMN with a non-NULL default

From
Andrew Dunstan
Date:
On Wed, Mar 28, 2018 at 11:30 AM, Andres Freund <andres@anarazel.de> wrote:
> Hi,
>
> On 2018-03-28 00:56:10 +0000, Andrew Dunstan wrote:
>> Fast ALTER TABLE ADD COLUMN with a non-NULL default
>
>> src/backend/executor/execExprInterp.c       |   2 +-
>
> I bet this'll break JITed tuple deforming... We'll have to emit a call
> to the new slot missingattrs function, instead of using memset directly.
>
> Do you want to attempt to take care of it, or should I?
>

Please do. I guess I need to do some enabling of that in my test setup.

cheers

andrew

-- 
Andrew Dunstan                https://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


Re: pgsql: Fast ALTER TABLE ADD COLUMN with a non-NULL default

From
Andres Freund
Date:
Hi,

On 2018-03-28 11:37:49 +1030, Andrew Dunstan wrote:
> Please do.

K, not 100% sure I'll get it done today, I'm neck deep looking at
something else.  But if not, I'll surely get it done tomorrow morning
PST.

Note there appear to be some independent failures?


> I guess I need to do some enabling of that in my test setup.

Unless you overhaul fundamental things like tuple deforming it hopefully
shouldn't need to be adjusted that often... ;)

Greetings,

Andres Freund


Re: pgsql: Fast ALTER TABLE ADD COLUMN with a non-NULL default

From
Tom Lane
Date:
Andres Freund <andres@anarazel.de> writes:
> Note there appear to be some independent failures?

It looks to me like the hard failures from this patch are all on your JIT
critters.  There are a bunch of random failures from Simon's commit,
but I think those are just timing, per <31021.1522185986@sss.pgh.pa.us>.

            regards, tom lane


Re: pgsql: Fast ALTER TABLE ADD COLUMN with a non-NULL default

From
Andres Freund
Date:
Hi,

On 2018-03-27 18:10:31 -0700, Andres Freund wrote:
> K, not 100% sure I'll get it done today, I'm neck deep looking at
> something else.  But if not, I'll surely get it done tomorrow morning
> PST.

I've pushed a somewhat hurried fix. It passes all the tests here, but
normally I would've let it rest for more than a minute ;).  I'll check
it again tomorrow.

Let's hope the buildfarm turns a bit greener.

- Andres


Re: pgsql: Fast ALTER TABLE ADD COLUMN with a non-NULL default

From
Andres Freund
Date:
On 2018-03-27 23:40:25 -0400, Tom Lane wrote:
> Andres Freund <andres@anarazel.de> writes:
> > Note there appear to be some independent failures?
> 
> It looks to me like the hard failures from this patch are all on your JIT
> critters.

The one I noticed was:
https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=treepie&dt=2018-03-28%2002%3A43%3A00

*** /var/buildfarm/treepie/build/HEAD/pgsql.build/src/test/regress/expected/fast_default.out    Wed Mar 28 03:28:03
2018
--- /var/buildfarm/treepie/build/HEAD/pgsql.build/src/test/regress/results/fast_default.out    Wed Mar 28 04:56:44
2018
***************
*** 340,348 ****
  
  -- Aggregate function
  SELECT SUM(c_bigint), MAX(c_text), MIN(c_text) FROM T;
!  sum |  max  | min 
! -----+-------+-----
!  200 | hello | 31
  (1 row)
  
  -- ORDER BY
--- 340,348 ----
  
  -- Aggregate function
  SELECT SUM(c_bigint), MAX(c_text), MIN(c_text) FROM T;
!  sum | max |  min  
! -----+-----+-------
!  200 | 40  | hello
  (1 row)
  
  -- ORDER BY

which probably is just collation related and will stably fail?


> There are a bunch of random failures from Simon's commit,
> but I think those are just timing, per <31021.1522185986@sss.pgh.pa.us>.

There's also a few of that annoying postgres_fdw thing. Some of my
critters hit that regularly :(. We need to figure out what's going on
there.

Greetings,

Andres Freund


Re: pgsql: Fast ALTER TABLE ADD COLUMN with a non-NULL default

From
Tom Lane
Date:
Andres Freund <andres@anarazel.de> writes:
> There's also a few of that annoying postgres_fdw thing. Some of my
> critters hit that regularly :(. We need to figure out what's going on
> there.

Yeah, really.  I've made several sustained attempts to reproduce that
locally, on every machine I have, without success.

It's weird, because longfin has shown that failure a couple of times,
but I've not been able to reproduce it in manual runs on longfin's host,
which makes little sense.  I'm reduced to a theory that the buildfarm
script creates an environment different from manual runs --- but I cannot
see what that would be.

            regards, tom lane


Re: pgsql: Fast ALTER TABLE ADD COLUMN with a non-NULL default

From
Tom Lane
Date:
Andres Freund <andres@anarazel.de> writes:
> The one I noticed was:
> https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=treepie&dt=2018-03-28%2002%3A43%3A00
> ...
> which probably is just collation related and will stably fail?

Yeah, magpie is still showing that after the latest commits.
A bit of COLLATE "C" might fix it.

            regards, tom lane


Re: pgsql: Fast ALTER TABLE ADD COLUMN with a non-NULL default

From
Andrew Dunstan
Date:
On Wed, Mar 28, 2018 at 3:18 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Andres Freund <andres@anarazel.de> writes:
>> The one I noticed was:
>> https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=treepie&dt=2018-03-28%2002%3A43%3A00
>> ...
>> which probably is just collation related and will stably fail?
>
> Yeah, magpie is still showing that after the latest commits.
> A bit of COLLATE "C" might fix it.
>

Just testing a fix along those lines now.

cheers

andrew


-- 
Andrew Dunstan                https://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


Re: pgsql: Fast ALTER TABLE ADD COLUMN with a non-NULL default

From
Andres Freund
Date:
On 2018-03-27 23:40:25 -0400, Tom Lane wrote:
> Andres Freund <andres@anarazel.de> writes:
> > Note there appear to be some independent failures?
> 
> It looks to me like the hard failures from this patch are all on your JIT
> critters.  There are a bunch of random failures from Simon's commit,
> but I think those are just timing, per <31021.1522185986@sss.pgh.pa.us>.

There's a weird failure:
https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=whelk&dt=2018-03-28%2005%3A41%3A30

It blames two commits that should not have any effect on the platform,
so I assume it's something not stable...  Hard to say without a
backtrace.

Greetings,

Andres Freund


Re: pgsql: Fast ALTER TABLE ADD COLUMN with a non-NULL default

From
Etsuro Fujita
Date:
(2018/03/28 13:25), Tom Lane wrote:
> Andres Freund<andres@anarazel.de>  writes:
>> There's also a few of that annoying postgres_fdw thing. Some of my
>> critters hit that regularly :(. We need to figure out what's going on
>> there.

I plan to work on this after the commitfest.

> Yeah, really.  I've made several sustained attempts to reproduce that
> locally, on every machine I have, without success.
>
> It's weird, because longfin has shown that failure a couple of times,
> but I've not been able to reproduce it in manual runs on longfin's host,
> which makes little sense.  I'm reduced to a theory that the buildfarm
> script creates an environment different from manual runs --- but I cannot
> see what that would be.

Thank you for taking the time to work on this!

Best regards,
Etsuro Fujita