Re: Document atthasmissing default optimization avoids verification table scan - Mailing list pgsql-hackers

From James Coleman
Subject Re: Document atthasmissing default optimization avoids verification table scan
Date
Msg-id CAAaqYe8qdz1vpb+fVJGHAZfiUA5TZcP1d-JHM0v9M4W2k7amOQ@mail.gmail.com
Whole thread Raw
In response to Re: Document atthasmissing default optimization avoids verification table scan  ("David G. Johnston" <david.g.johnston@gmail.com>)
Responses Re: Document atthasmissing default optimization avoids verification table scan
List pgsql-hackers
On Wed, Jan 19, 2022 at 9:34 PM David G. Johnston
<david.g.johnston@gmail.com> wrote:
>
> On Wed, Jan 19, 2022 at 6:14 PM James Coleman <jtc331@gmail.com> wrote:
>>
>> I'm open to the idea of wordsmithing here, of course, but I strongly
>> disagree that this is irrelevant data.
>
>
> Ok, but wording aside, only changing a tip in the DDL - Add Table section doesn't seem like a complete fix.  The
notesin alter table, where I'd look for such an official directive first, need to be touched as well.
 
>
> For the alter table docs maybe change/add to the existing sentence below (I'm in favor of not pointing out that
scanningthe table doesn't mean we are rewriting it, but maybe I'm making another unwarranted assumption regarding
obviousness...).
>
> "Adding a CHECK or NOT NULL constraint requires scanning the table [but not rewriting it] to verify that existing
rowsmeet the constraint.  It is skipped when done as part of ADD COLUMN unless a table rewrite is required anyway."
 

I'm looking over the docs again to see how it might be better
structured; point is well taken that we should have it clearly in the
primary place.

> On that note, does the check constraint interplay with the default rewrite avoidance in the same way?

I hadn't checked until you asked, but interestingly, no it doesn't (I
assume you mean scan not rewrite in this context):

test=# select seq_scan from pg_stat_all_tables where relname = 't2';
 seq_scan
----------
        2
test=# alter table t2 add column i3 int not null default 5;
ALTER TABLE
test=# select seq_scan from pg_stat_all_tables where relname = 't2';
 seq_scan
----------
        2
test=# alter table t2 add column i4 int default 5 check (i4 < 50);
ALTER TABLE
test=# select seq_scan from pg_stat_all_tables where relname = 't2';
 seq_scan
----------
        3

That seems like an opportunity for improvement here, though it's
obviously a separate patch. I might poke around at that though
later...

> For the Tip I'd almost rather redo it to say:
>
> "Before PostgreSQL 11, adding a new column to a table required rewriting that table, making it a very slow operation.
More recent versions can sometimes optimize away this rewrite and related validation scans.  See the notes in ALTER
TABLEfor details."
 
>
> Though I suppose I'd accept something like (leave existing text, alternative patch text):
>
> "[...]large tables.\nIf the added column also has a not null constraint the usual verification scan is also
skipped."
>
> "constant" is used in the Tip, "non-volatile" is used in alter table - hence a desire to have just one source of
truth,with alter table being the correct place.  We should sync them up otherwise.
 

As noted I'll look over how restructuring might improve and reply with
an updated proposed patch.

Thanks,
James Coleman



pgsql-hackers by date:

Previous
From: Robert Haas
Date:
Subject: Re: removing datlastsysoid
Next
From: Alvaro Herrera
Date:
Subject: Re: row filtering for logical replication