Re: cataloguing NOT NULL constraints - Mailing list pgsql-hackers

From Alvaro Herrera
Subject Re: cataloguing NOT NULL constraints
Date
Msg-id 20220919111919.4y2ks5zycilwzkt5@alvherre.pgsql
Whole thread Raw
In response to Re: cataloguing NOT NULL constraints  (Peter Eisentraut <peter.eisentraut@enterprisedb.com>)
List pgsql-hackers
On 2022-Sep-14, Peter Eisentraut wrote:

> Reading through the SQL standard again, I think this patch goes a bit too
> far in folding NOT NULL and CHECK constraints together.  The spec says that
> you need to remember whether a column was defined as NOT NULL, and that the
> commands DROP NOT NULL and SET NOT NULL only affect constraints defined in
> that way.  In this implementation, a constraint defined as NOT NULL is
> converted to a CHECK (x IS NOT NULL) constraint and the original definition
> is forgotten.

Hmm, I don't read it the same way.  Reading SQL:2016, they talk about a
nullability characteristic (/known not nullable/ or /possibly
nullable/):

: 4.13 Columns, fields, and attributes
: [...]
: Every column has a nullability characteristic that indicates whether the
: value from that column can be the null value. A nullability characteristic
: is either known not nullable or possibly nullable.
: Let C be a column of a base table T. C is known not nullable if and only
: if at least one of the following is true:
: — There exists at least one constraint NNC that is enforced and not
: deferrable and that simply contains a <search condition> that is a
: <boolean value expression> that is a readily-known-not-null condition for C.
: [other possibilities]

then in the same section they explain that this is derived from a
table constraint:

: A column C is described by a column descriptor. A column descriptor
: includes:
: [...]
: — If C is a column of a base table, then an indication of whether it is
: defined as NOT NULL and, if so, the constraint name of the associated
: table constraint definition.

   [aside: note that elsewhere (<boolean value expression>), they define
   "readily-known-not-null" in Syntax Rule 3), of 6.39 <boolean value
   expression>:

   : 3) Let X denote either a column C or the <key word> VALUE. Given a
   : <boolean value expression> BVE and X, the notion “BVE is a
   : readily-known-not-null condition for X” is defined as follows.
   : Case:
   :  a) If BVE is a <predicate> of the form “RVE IS NOT NULL”, where RVE is a
   :     <row value predicand> that is a <row value constructor predicand> that
   :     simply contains a <common value expression>, <boolean predicand>, or
   :     <row value constructor element> that is a <column reference> that
   :     references C, then BVE is a readily-known-not-null condition for C.
   :  b) If BVE is the <predicate> “VALUE IS NOT NULL”, then BVE is a
   :     readily-known-not-null condition for VALUE.
   :  c) Otherwise, BVE is not a readily-known-not-null condition for X.
   edisa]

Later, <column definition> says literally that specifying NOT NULL in a
column is equivalent to the CHECK (.. IS NOT NULL) table constraint:

: 11.4 <column definition>
: 
: Syntax Rules,
: 17) If a <column constraint definition> is specified, then let CND be
: the <constraint name definition> if one is specified and let CND be the
: zero-length character character string otherwise; let CA be the
: <constraint characteristics> if specified and let CA be the zero-length
: character string otherwise. The <column constraint definition> is
: equivalent to a <table constraint definition> as follows.
: 
: Case:
: 
: a) If a <column constraint definition> is specified that contains the
: <column constraint> NOT NULL, then it is equivalent to the following
: <table constraint definition>:
:    CND CHECK ( C IS NOT NULL ) CA

In my reading of it, it doesn't follow that you have to remember whether
the table constraint was created by saying explicitly by doing CHECK (c
IS NOT NULL) or as a plain NOT NULL column constraint.  The idea of
being able to do DROP NOT NULL when only a constraint defined as CHECK
(c IS NOT NULL) exists seems to follow from there; and also that you can
use DROP CONSTRAINT to remove one added via plain NOT NULL; and that
both these operations change the nullability characteristic of the
column.  This is made more explicit by the fact that they do state that
the nullability characteristic can *not* be "destroyed" for other types
of constraints, in 11.26 <drop table constraint definition>, Syntax Rule
11)

: 11) Destruction of TC shall not cause the nullability characteristic of
: any of the following columns of T to change from known not nullable to
: possibly nullable:
: 
: a) A column that is a constituent of the primary key of T, if any.
: b) The system-time period start column, if any.
: c) The system-time period end column, if any.
: d) The identity column, if any.

then General Rule 7) explains that this does indeed happen for columns
declared to have some sort of NOT NULL constraint, without saying
exactly how was that constraint defined:

: 7) If TC causes some column COL to be known not nullable and no other
: constraint causes COL to be known not nullable, then the nullability
: characteristic of the column descriptor of COL is changed to possibly
: nullable.

> Besides that, I think that users are not going to like that pg_dump rewrites
> their NOT NULL constraints into CHECK table constraints.

This is a good point, but we could get around it by decreeing that
pg_dump dumps the NOT NULL in the old way if the name is not changed
from whatever would be generated normally.  This would require some
games to remove the CHECK one; and it would also mean that partitions
would not use the same constraint as the parent, but rather it'd have to
generate a new constraint name that uses its own table name, rather than
the parent's.

(This makes me wonder what should happen if you rename a table: should
we go around and rename all the automatically-named constraints as well?
Probably not, but this may annoy people that creates table under one
name, then rename them into their final places afterwards.  pg_dump may
behave funny for those.  We can tackle that later, if ever.  But
consider that moving the table across schemas might cause even weirder
problems, since the standard says constraint names must not conflict
within a schema ...)

> I suspect that this needs a separate contype for NOT NULL constraints that
> is separate from CONSTRAINT_CHECK.

Maybe it is possible to read this in the way you propose, but I think
that interpretation is strictly less useful than the one I propose.
Also, see this reply from Tom to Vitaly Burovoy who was proposing
something that seems to derivate from this interpretation:
https://www.postgresql.org/message-id/flat/17684.1462339177%40sss.pgh.pa.us

-- 
Álvaro Herrera         PostgreSQL Developer  —  https://www.EnterpriseDB.com/
"Hay dos momentos en la vida de un hombre en los que no debería
especular: cuando puede permitírselo y cuando no puede" (Mark Twain)



pgsql-hackers by date:

Previous
From: Justin Pryzby
Date:
Subject: Re: Fix typos in code comments
Next
From: Ashutosh Bapat
Date:
Subject: Re: confirmed_flush_lsn shows LSN of the data that has not yet been received by the logical subscriber.