Re: Document atthasmissing default optimization avoids verification table scan - Mailing list pgsql-hackers

From James Coleman
Subject Re: Document atthasmissing default optimization avoids verification table scan
Date
Msg-id CAAaqYe923gXAsFXa_Oo3hRGuH+8BoTbpXa3VfRAvJxn6T+tfDg@mail.gmail.com
Whole thread Raw
In response to Re: Document atthasmissing default optimization avoids verification table scan  (Robert Haas <robertmhaas@gmail.com>)
Responses Re: Document atthasmissing default optimization avoids verification table scan  ("David G. Johnston" <david.g.johnston@gmail.com>)
Re: Document atthasmissing default optimization avoids verification table scan  (Robert Haas <robertmhaas@gmail.com>)
List pgsql-hackers
On Fri, Mar 25, 2022 at 4:40 PM Robert Haas <robertmhaas@gmail.com> wrote:
>
> On Tue, Jan 25, 2022 at 8:49 AM James Coleman <jtc331@gmail.com> wrote:
> > Here's a version that looks like that. I'm not convinced it's an
> > improvement over the previous version: again, I expect more advanced
> > users to already understand this concept, and I think moving it to the
> > ALTER TABLE page could very well have the effect of burying i(amidst
> > the ton of detail on the ALTER TABLE page) concept that would be
> > useful to learn early on in a tutorial like the DDL page. But if
> > people really think this is an improvement, then I can acquiesce.
>
> I vote for rejecting both of these patches.
>
> 0001 adds the following sentence to the documentation: "A <literal>NOT
> NULL</literal> constraint may be added to the new column in the same
> statement without requiring scanning the table to verify the
> constraint." My first reaction when I read this sentence was that it
> was warning the user about the absence of a hazard that no one would
> expect in the first place. We could also document that adding a NOT
> NULL constraint will not cause your gas tank to catch fire, but nobody
> was worried about that until we brought it up.

As noted at minimum we (Braintree Payments) feared this hazard. That's
reasonable because adding a NOT NULL constraint normally requires a
table scan while holding an exclusive lock. It's fairly obvious why
someone like us (any anyone who can't have downtime) would be paranoid
about any possibility of long-running operations under exclusive locks

I realize it's rhetorical flourish, but it hardly seems reasonable to
compare an actual hazard a database could plausibly have (an index it
is an optimization in the code that prevents it from happening -- a
naive implementation would in fact scan the full table on all NOT NULL
constraint additions) with something not at all related to database
(gas tank fires).

I simply do not accept the claim that this is not a reasonable concern
to have nor that this isn't worth documenting. It was worth someone
taking the time to consider as an optimization in the code. And the
consequence of that not having been done could be an outage for an
unsuspecting user. Of all the things we would want to document DDL
that could require executing long operations while holding exclusive
locks seems pretty high on the list.

> I also think that the
> sentence makes the rest of the paragraph harder to understand, because
> the rest of the paragraph is talking about adding a new column with a
> default, and now suddenly we're talking about NOT NULL constraints.

I am, however, happy to hear critiques of the style of the patch or
the best way to document this kind of behavior.

I'm curious though what you'd envision being a better place for this
information. Yes, we're talking about new columns -- that's the
operation under consideration -- but the NOT NULL constraint is part
of the new column definition. I'm not sure where else you would
document something that's a part of adding a new column.

> 0002 moves some advice about adding columns with defaults from one
> part of the documentation to another. Maybe that's a good idea, and
> maybe it isn't, but it also rewords the advice, and in my opinion, the
> new wording is less clear and specific than the existing wording. It
> also changes a sentence that mentions volatile defaults to give a
> specific example of a volatile function -- clock_timestamp -- probably
> because where the documentation was before that function was
> mentioned. However, that sentence seems clear enough as it is and does
> not really need an example.

Adding that example (and, indeed, moving the advice) was per a
previous reviewer's request. So I'm not sure what to do in this
situation -- I'm trying to improve the proposal per reviewer feedback
but there are conflicting reviewers. I suppose we'd need a
tie-breaking reviewer.

Thanks,
James Coleman



pgsql-hackers by date:

Previous
From: Tom Lane
Date:
Subject: Re: Why is lorikeet so unstable in v14 branch only?
Next
From: James Coleman
Date:
Subject: Re: Document atthasmissing default optimization avoids verification table scan