Home > mailing lists

Re: cataloguing NOT NULL constraints - Mailing list pgsql-hackers

From	Robert Haas
Subject	Re: cataloguing NOT NULL constraints
Date	May 13, 2024 18:58:25
Msg-id	CA+TgmoYR356AHUoVT-57Lf-_MheDBqLukJyhipAcxVZHtVtrwA@mail.gmail.com Whole thread Raw
In response to	Re: cataloguing NOT NULL constraints (Alvaro Herrera <alvherre@alvh.no-ip.org>)
Responses	Re: cataloguing NOT NULL constraints
List	pgsql-hackers

Tree view

On Mon, May 13, 2024 at 12:45 PM Alvaro Herrera <alvherre@alvh.no-ip.org> wrote:
> The point is that a column can be in a primary key and not have an
> explicit not-null constraint.  This is different from having a column be
> NOT NULL and having a primary key on top.  In both cases the attnotnull
> flag is set; the difference between these two scenarios is what happens
> if you drop the primary key.  If you do not have an explicit not-null
> constraint, then the attnotnull flag is lost as soon as you drop the
> primary key.  You don't have to do DROP NOT NULL for that to happen
>
> This means that if you have a column that's in the primary key but does
> not have an explicit not-null constraint, then we shouldn't make one up.
> (Which we would, if we were to keep an unadorned NOT NULL that we can't
> drop at the end of the dump.)

It seems to me that the practical thing to do about this problem is
just decide not to solve it. I mean, it's currently the case that if
you establish a PRIMARY KEY when you create a table, the columns of
that key are marked NOT NULL and remain NOT NULL even if the primary
key is later dropped. So, if that didn't change, we would be no less
compliant with the SQL standard (or your reading of it) than we are
now. And if you do really want to make that change, why not split it
out into its own patch, so that the patch that does $SUBJECT is
changing the minimal number of other things at the same time? That
way, reverting something might not involve reverting everything, plus
you could have a separate design discussion about what that fix ought
to look like, separate from the issues that are truly inherent to
cataloging NOT NULL constraints per se.

What I meant about changing the order of operations is that,
currently, the database knows that the column is NOT NULL before the
COPY happens, and I don't think we can change that. I think you agree
-- that's why you invented the throwaway constraints. As far as I can
see, the problems all have to do with getting the "throwaway" part to
happen correctly. It can't be a problem to just mark the relevant
columns NOT NULL in the relevant tables -- we already do that. But if
you want to discard some of those NOT NULL markings once the PRIMARY
KEY is added, you have to know which ones to discard. If we just
consider the most straightforward scenario where somebody does a full
dump-and-restore, getting that right may be annoying, but it seems
like it surely has to be possible. The dump will just have to
understand which child tables (or, more generally, descendent tables)
got a NOT NULL marking on a column because of the PK and which ones
had an explicit marking in the old database and do the right thing in
each case.

But what if somebody does a selective restore of one table from a
partitioning hierarchy? Currently, the columns that would have been
part of the primary key end up NOT NULL, but the primary key itself is
not restored because it can't be. What will happen in this new system?
If you don't apply any NOT NULL constraints to those columns, then a
user who restores one partition from an old dump and tries to reattach
it to the correct partitioned table has to recheck the NOT NULL
constraint, unlike now. If you apply a normal-looking garden-variety
NOT NULL constraint to that column, you've invented a constraint that
didn't exist in the source database. And if you apply a throwaway NOT
NULL constraint but the user never attaches that table anywhere, then
the throwaway constraint survives. None of those options sound very
good to me.

Another scenario: Say that you have a table with a PRIMARY KEY. For
some reason, you want to drop the primary key and then add it back.
Well, with this definitional change, as soon as you drop it, you
forget that the underlying columns don't contain any nulls, so when
you add it back, you have to check them again. I don't know who would
find that behavior an improvement over what we have today.

So I don't really think it's a great idea to change this behavior, but
even if it is, is it such a good idea that we want to sink the whole
patch set repeatedly over it, as has already happened twice now? I
feel that if we did what Tom suggested a year ago in
https://www.postgresql.org/message-id/3801207.1681057430@sss.pgh.pa.us
-- "I'm inclined to think that this idea of suppressing the implied
NOT NULL from PRIMARY KEY is a nonstarter and we should just go ahead
and make such a constraint" -- there's a very good chance that a
revert would have been avoided here and it would still be just as
valid to think of revisiting this particular question in a future
release as it is now.

--
Robert Haas
EDB: http://www.enterprisedb.com

pgsql-hackers by date:

From: Noah Misch
Date: 13 May 2024, 18:54:03
Subject: Re: race condition in pg_class

From: Chapman Flack
Date: 13 May 2024, 18:59:01
Subject: Re: Is there any chance to get some kind of a result set sifting mechanism in Postgres?

Re: cataloguing NOT NULL constraints - Mailing list pgsql-hackers

Previous

Next