Thread: Re: [BUGS] BUG #5053: domain constraints still leak

Re: [BUGS] BUG #5053: domain constraints still leak

From
Tom Lane
Date:
[ probably time to move this thread to -hackers ]

There is some moderately interesting reading material in section
4.17.4 "Domain constraints" of SQL:2008.  In particular, it appears to
me that the standard goes out of its way to NOT claim that every value
that "is of" a domain type satisfies the domain's constraints.  It looks
to me that the implementation they have in mind is that domain
constraints are to be checked:

(1) when a value is assigned to a *table* column having that domain type;

(2) when a value is converted to that domain type by an *explicit*
cast construct;

(3) nowhere else.

If I'm reading this right, it sidesteps most of the concerns we have
been worrying about here, at the cost of being perhaps more surprising
and less useful than one would expect.  It would also mean that a lot
of our existing domain behavior is wrong.  I think there is ammunition
here for an argument that, in effect, values "in flight" in expression
or query evaluation should always be considered to be of base types,
and domain constraints should only be checked when assigning to a
persistent storage location such as a table field or plpgsql variable
(plus the special case for CAST constructs).
        regards, tom lane


Re: [BUGS] BUG #5053: domain constraints still leak

From
Andrew Gierth
Date:
>>>>> "Tom" == Tom Lane <tgl@sss.pgh.pa.us> writes:
Tom> [ probably time to move this thread to -hackers ]
Tom> There is some moderately interesting reading material in sectionTom> 4.17.4 "Domain constraints" of SQL:2008.  In
particular,itTom> appears to me that the standard goes out of its way to NOT claimTom> that every value that "is of" a
domaintype satisfies theTom> domain's constraints.  It looks to me that the implementationTom> they have in mind is
thatdomain constraints are to be checked:
 
Tom> (1) when a value is assigned to a *table* column having thatTom> domain type;
Tom> (2) when a value is converted to that domain type by anTom> *explicit* cast construct;
Tom> (3) nowhere else.

By my reading it's a bit more involved than that. In particular, if
you cast from one rowtype to another, that seems to be defined in terms
of individual casts of each column, so CAST(ROW(null) TO rowtype) where
rowtype has one column of a not-null domain type would still count as an
explicit cast to the domain.

But there's a kicker: in Subclause 6.12, <cast specification>, in the
General Rules is:
 2) Case:    a) If the <cast operand> specifies NULL, then the result of CS is       the null value and no further
GeneralRules of this Subclause       are applied.    b) If the <cast operand> specifies an <empty specification>, then
    the result of CS is an empty collection of declared type TD       and no further General Rules of this Subclause
areapplied.    c) If SV is the null value, then the result of CS is the null       value and no further General Rules
ofthis Subclause are       applied.
 

That "no further General Rules" clause implies (assuming it's not a
blatant mistake in the spec) that this rule is therefore skipped in
the case of nulls:
21) If the <cast specification> contains a <domain name> and that    <domain name> refers to a domain that contains a
<domain   constraint> and if TV does not satisfy the <check constraint    definition> simply contained in the <domain
constraint>,then an    exception condition is raised: integrity constraint violation.
 

Which would imply that you can cast a NULL to a domain type even if
that would violate a constraint. Which would pretty much leave
actual assignment to storage as being the only place for the check
to happen.

-- 
Andrew (irc:RhodiumToad)


Re: [BUGS] BUG #5053: domain constraints still leak

From
Sam Mason
Date:
On Mon, Sep 14, 2009 at 11:20:59PM -0400, Tom Lane wrote:
> There is some moderately interesting reading material in section
> 4.17.4 "Domain constraints" of SQL:2008.

Not sure where to look for a copy of that, nor any particularly helpful
links :(

> In particular, it appears to
> me that the standard goes out of its way to NOT claim that every value
> that "is of" a domain type satisfies the domain's constraints.  It looks
> to me that the implementation they have in mind is that domain
> constraints are to be checked:
> 
> (1) when a value is assigned to a *table* column having that domain type;
>
> (2) when a value is converted to that domain type by an *explicit*
> cast construct;
> 
> (3) nowhere else.

I struggle to get any useful meaning out of the SQL specs, but that
sounds about right to me.

> If I'm reading this right, it sidesteps most of the concerns we have
> been worrying about here, at the cost of being perhaps more surprising
> and less useful than one would expect.

It means that domains are a world away from ADTs (abstract data types)
and just seem to function as quick "templates" for creating new columns.
PG seems to be treating domains as ADTs at the moment, which is the
abstraction that's proved to be more useful in larger programming
projects.

> It would also mean that a lot
> of our existing domain behavior is wrong.  I think there is ammunition
> here for an argument that, in effect, values "in flight" in expression
> or query evaluation should always be considered to be of base types,
> and domain constraints should only be checked when assigning to a
> persistent storage location such as a table field or plpgsql variable
> (plus the special case for CAST constructs).

Are you considering changing PGs behavior here? and if so, what would
happen to existing behavior?

--  Sam  http://samason.me.uk/


Re: [BUGS] BUG #5053: domain constraints still leak

From
Sam Mason
Date:
On Tue, Sep 15, 2009 at 05:13:21AM +0100, Andrew Gierth wrote:
> But there's a kicker: in Subclause 6.12, <cast specification>, in the
> General Rules is:
> 
>      a) If the <cast operand> specifies NULL, then the result of CS is
>         the null value and no further General Rules of this Subclause
>         are applied.
> 
> That "no further General Rules" clause implies (assuming it's not a
> blatant mistake in the spec) that this rule is therefore skipped in
> the case of nulls:

I think the NOT NULL constraint is a PG specific constraint, I can't see
how it's allowed in the spec.  Then again, I have trouble parsing the
spec so could well be wrong about this.

The NOT NULL constraint feels wrong as well, what are the semantics of:
 CREATE DOMAIN d AS INTEGER NOT NULL; SELECT a.n AS aa, b.n AS bb FROM (VALUES (CAST(1 AS d)),(2)) a(n)   LEFT JOIN
(VALUES(CAST(1 AS d))) b(n) ON a.n = b.n;
 

in the presence of it?  I'm expecting aa and bb both to come out as
domain "d", but this shouldn't work with what you're saying the current
semantics should be.

--  Sam  http://samason.me.uk/


Re: [BUGS] BUG #5053: domain constraints still leak

From
Andrew Gierth
Date:
>>>>> "Sam" == Sam Mason <sam@samason.me.uk> writes:
>> But there's a kicker: in Subclause 6.12, <cast specification>, in the>> General Rules is:>> >> a) If the <cast
operand>specifies NULL, then the result of CS is>> the null value and no further General Rules of this Subclause>> are
applied.>>>> That "no further General Rules" clause implies (assuming it's not a>> blatant mistake in the spec) that
thisrule is therefore skipped in>> the case of nulls:
 
Sam> I think the NOT NULL constraint is a PG specific constraint, ISam> can't see how it's allowed in the spec.

That's a good point; it doesn't seem to be.

But the spec _does_ appear to allow CHECK(VALUE IS NOT NULL) as a
domain constraint (in general the spec defines NOT NULL constraints
this way), and the wording from 6.12 implies that that check is still
skipped in the case of NULLs (so that constraint would stop you
inserting a null into a table column (I think), but not from casting a
null value to the domain type).
Sam> The NOT NULL constraint feels wrong as well, what are theSam> semantics of:
Sam>   CREATE DOMAIN d AS INTEGER NOT NULL;Sam>   SELECT a.n AS aa, b.n AS bbSam>   FROM (VALUES (CAST(1 AS d)),(2))
a(n)Sam>    LEFT JOIN (VALUES (CAST(1 AS d))) b(n) ON a.n = b.n;
 
Sam> in the presence of it?  I'm expecting aa and bb both to come outSam> as domain "d", but this shouldn't work with
whatyou're sayingSam> the current semantics should be.
 

I think that's just another example of Tom's initial comment about how
broken domain "not null" constraints are currently.

-- 
Andrew (irc:RhodiumToad)


Re: [BUGS] BUG #5053: domain constraints still leak

From
Tom Lane
Date:
Andrew Gierth <andrew@tao11.riddles.org.uk> writes:
> "Sam" == Sam Mason <sam@samason.me.uk> writes:
>  Sam> The NOT NULL constraint feels wrong as well, what are the
>  Sam> semantics of:

>  Sam>   CREATE DOMAIN d AS INTEGER NOT NULL;
>  Sam>   SELECT a.n AS aa, b.n AS bb
>  Sam>   FROM (VALUES (CAST(1 AS d)),(2)) a(n)
>  Sam>     LEFT JOIN (VALUES (CAST(1 AS d))) b(n) ON a.n = b.n;

>  Sam> in the presence of it?  I'm expecting aa and bb both to come out
>  Sam> as domain "d", but this shouldn't work with what you're saying
>  Sam> the current semantics should be.

> I think that's just another example of Tom's initial comment about how
> broken domain "not null" constraints are currently.

Well, the LEFT JOIN case is exactly why I feel that domain not-null
constraints are inherently broken.  The only clean way around it is to
decree that the output of a left join is not of the domain type after
all, but of its base type.  Which seems to me to be one side effect of
the wording in 4.17.4, though they are extending it to *all* evaluation
contexts not only outer joins.

I haven't yet read the additional material you guys found ...
        regards, tom lane


Re: [BUGS] BUG #5053: domain constraints still leak

From
Sam Mason
Date:
On Tue, Sep 15, 2009 at 02:54:18PM +0100, Andrew Gierth wrote:
> the spec _does_ appear to allow CHECK(VALUE IS NOT NULL) as a
> domain constraint (in general the spec defines NOT NULL constraints
> this way),

Huh, that's a trivial rewrite isn't it.  Not sure why it didn't occur to
me that it's just syntax sugar.

> and the wording from 6.12 implies that that check is still
> skipped in the case of NULLs (so that constraint would stop you
> inserting a null into a table column (I think), but not from casting a
> null value to the domain type).

Explicitly ignoring NULL values in CAST expressions seems like a good
feature as well.  Although it gives me the feeling that domains are more
and more like a mis-designed feature.

> >>>>> "Sam" == Sam Mason <sam@samason.me.uk> writes:
>  Sam> The NOT NULL constraint feels wrong as well, 

> I think that's just another example of Tom's initial comment about how
> broken domain "not null" constraints are currently.

Hum, given that it's just sugar for more general constraints I'm not
sure if it's the not null constraints that are broken or just the
current interpretation of them.  They would do the "right thing" if they
were only checked in a limited number of places that the user was aware
of, which the spec seems to imply is when the user explicitly asks for a
CAST to be performed or when writing into the table.

--  Sam  http://samason.me.uk/


Re: [BUGS] BUG #5053: domain constraints still leak

From
"Kevin Grittner"
Date:
Since our shop seems to use domains more than most, I figured I
should comment on this thread.

>Sam Mason <sam@samason.me.uk> wrote:
>> On Tue, Sep 15, 2009 at 02:54:18PM +0100, Andrew Gierth wrote:
>> and the wording from 6.12 implies that that check is still
>> skipped in the case of NULLs (so that constraint would stop you
>> inserting a null into a table column (I think), but not from
>> casting a null value to the domain type).
> 
> Explicitly ignoring NULL values in CAST expressions seems like a
> good feature as well.
OK by me.
> Although it gives me the feeling that domains are more
> and more like a mis-designed feature.
They have their place, for when you don't really need a new type,
but you want to show that multiple columns contain data from the
same set.  My rule of thumb is this -- if it would make sense for
two columns to be compared for equality, there's a very good chance
they belong in the same domain; if not, they probably don't.  Using
them helps to document complex databases and helps with portability,
quite aside from the issue of constraints.
> Hum, given that it's just sugar for more general constraints I'm
> not sure if it's the not null constraints that are broken or just
> the current interpretation of them.  They would do the "right
> thing" if they were only checked in a limited number of places
> that the user was aware of, which the spec seems to imply is when
> the user explicitly asks for a CAST to be performed or when
> writing into the table.
If that's what the spec says, then +1 from me.  The change won't
cause problems here.
-Kevin