Re: PostgreSQL domains and NOT NULL constraint - Mailing list pgsql-hackers

From Tom Lane
Subject Re: PostgreSQL domains and NOT NULL constraint
Date
Msg-id 1745571.1697299787@sss.pgh.pa.us
Whole thread Raw
In response to Re: PostgreSQL domains and NOT NULL constraint  (Vik Fearing <vik@postgresfriends.org>)
Responses Re: PostgreSQL domains and NOT NULL constraint
Re: PostgreSQL domains and NOT NULL constraint
List pgsql-hackers
Vik Fearing <vik@postgresfriends.org> writes:
> On 10/13/23 06:37, Tom Lane wrote:
>> Hmph.  The really basic problem here, I think, is that the spec
>> wants to claim that a domain is a data type, but then it backs
>> off and limits where the domain's constraints need to hold.

> I don't think that is an accurate depiction of domains.
> First of all, I am not seeing where it says that a domain is a data 
> type.  It allows domains to be used in some places where a data type is 
> used, but that is not equivalent to a domain /being/ a data type.

Hmm, you are right.  This is something I'd never paid attention to
before, but they do seem to exclude domains from being the declared
type of any expression.  Most notably, not even a CAST to a domain
type produces the domain type.  Per SQL:2021 6.13 <cast specification>
syntax rules:

    1) Case:
        a) If a <domain name> is specified, then let TD be the data
        type of the specified domain.

        b) If a <data type> is specified, then let TD be the data type
        identified by <data type>. <data type> shall not contain a
        <collate clause>.

    2) The declared type of the result of the <cast specification> is TD.

Even more amusingly for our current purposes, CAST does not enforce
NOT NULL.  <cast specification> general rule 2:

    2) Case:
        a) If the <cast operand> specifies NULL, then the result of CS
        is the null value and no further General Rules of this
        Subclause are applied.

        b) If the <cast operand> specifies an <empty specification>,
        then the result of CS is an empty collection of declared type
        TD and no further General Rules of this Subclause are applied.

        c) If SV is the null value, then the result of CS is the null
        value and no further General Rules of this Subclause are
        applied.

So for a null value the spec never reaches GR 23 that says to apply
the domain's constraints.

This is already a sufficient intellectual muddle that I'm not sure
we want to follow it slavishly.  If not-null can be ignored here,
why not elsewhere?

But anyway, yeah, the spec's notion of a domain bears only passing
resemblance to what we've actually implemented.  I'm not really sure
that we want to switch, because AFAICS the spec's model doesn't
include any of these things:

* Domains over other domains

* Domains over arrays, composite types, etc

* Functions accepting or returning domain types

If we were to try to do something closer to what the spec has in mind,
how would we do it without ripping out a ton of functionality that
people have requested and come to depend on?

> Section 4.25.4, "Domain constraints" has this to say (emphasis mine):
>
> - A domain constraint is satisfied by SQL-data *if and only if*, for 
> every *table* T that has a column named C based on that domain, the 
> applicable <search condition> recorded in the appropriate domain 
> constraint usage evaluates to True or Unknown.

I think that isn't particularly relevant, because I believe that by
SQL-data they mean the static contents of a database, so of course
only table contents matter.  What we are concerned about is dynamic
behavior within queries and functions.

> Secondly, why are you so concerned about outer join nulls here and not 
> for any other column marked NOT NULL?

Primarily because that's an easy way for a column that was marked
NOT NULL to read out as NULL.

>> That's fundamentally inconsistent.  It's like claiming that
>> 'foobarbaz' is a valid value of type numeric as long as it's
>> only in flight within a query and you haven't tried to store it
>> into a table.

> It's like claiming that null is a valid value of type numeric as long as 
> it's only in flight within a query and you haven't tried to store it 
> into a table with that column marked NOT NULL.

And?  NULL *is* a valid value of type numeric, as well as all other
base types.

> Allowing a null to be stored in a column where the user has specified 
> NOT NULL, no matter how the user did that, is unacceptable and I am 
> frankly surprised that you are defending it.

What I'm trying to hold onto is the notion that a domain can
meaningfully be considered to be a data type (that is, that a value in
flight can be considered to be of a domain type).  We've been building
the system on that assumption for over twenty years now, and I think
it's pretty deeply ingrained.  I don't understand the consequences
of abandoning it, and I'm not convinced that the spec's model is
sufficiently intellectually rigorous that we can just say "oh, we'll
follow the spec instead of what we've been doing, and it'll be fine".

As a trivial example: our implementation assumes that enforcing a
domain's constraints is to be done by casting the base type value
to the domain type.  Per the above reading of <6.13>, this should
fail to reject nulls, so we'd have to understand and implement
checking of domain constraints in some other way.

Given the exception the spec makes for CAST, I wonder if we shouldn't
just say "NULL is a valid value of every domain type, as well as every
base type.  If you don't like it, too bad; write a separate NOT NULL
constraint for your table column."

            regards, tom lane



pgsql-hackers by date:

Previous
From: David Steele
Date:
Subject: Re: odd buildfarm failure - "pg_ctl: control file appears to be corrupt"
Next
From: David Steele
Date:
Subject: Rename backup_label to recovery_control