Thread: Re: [BUGS] BUG #5053: domain constraints still leak
[ probably time to move this thread to -hackers ] There is some moderately interesting reading material in section 4.17.4 "Domain constraints" of SQL:2008. In particular, it appears to me that the standard goes out of its way to NOT claim that every value that "is of" a domain type satisfies the domain's constraints. It looks to me that the implementation they have in mind is that domain constraints are to be checked: (1) when a value is assigned to a *table* column having that domain type; (2) when a value is converted to that domain type by an *explicit* cast construct; (3) nowhere else. If I'm reading this right, it sidesteps most of the concerns we have been worrying about here, at the cost of being perhaps more surprising and less useful than one would expect. It would also mean that a lot of our existing domain behavior is wrong. I think there is ammunition here for an argument that, in effect, values "in flight" in expression or query evaluation should always be considered to be of base types, and domain constraints should only be checked when assigning to a persistent storage location such as a table field or plpgsql variable (plus the special case for CAST constructs). regards, tom lane
>>>>> "Tom" == Tom Lane <tgl@sss.pgh.pa.us> writes: Tom> [ probably time to move this thread to -hackers ] Tom> There is some moderately interesting reading material in sectionTom> 4.17.4 "Domain constraints" of SQL:2008. In particular,itTom> appears to me that the standard goes out of its way to NOT claimTom> that every value that "is of" a domaintype satisfies theTom> domain's constraints. It looks to me that the implementationTom> they have in mind is thatdomain constraints are to be checked: Tom> (1) when a value is assigned to a *table* column having thatTom> domain type; Tom> (2) when a value is converted to that domain type by anTom> *explicit* cast construct; Tom> (3) nowhere else. By my reading it's a bit more involved than that. In particular, if you cast from one rowtype to another, that seems to be defined in terms of individual casts of each column, so CAST(ROW(null) TO rowtype) where rowtype has one column of a not-null domain type would still count as an explicit cast to the domain. But there's a kicker: in Subclause 6.12, <cast specification>, in the General Rules is: 2) Case: a) If the <cast operand> specifies NULL, then the result of CS is the null value and no further GeneralRules of this Subclause are applied. b) If the <cast operand> specifies an <empty specification>, then the result of CS is an empty collection of declared type TD and no further General Rules of this Subclause areapplied. c) If SV is the null value, then the result of CS is the null value and no further General Rules ofthis Subclause are applied. That "no further General Rules" clause implies (assuming it's not a blatant mistake in the spec) that this rule is therefore skipped in the case of nulls: 21) If the <cast specification> contains a <domain name> and that <domain name> refers to a domain that contains a <domain constraint> and if TV does not satisfy the <check constraint definition> simply contained in the <domain constraint>,then an exception condition is raised: integrity constraint violation. Which would imply that you can cast a NULL to a domain type even if that would violate a constraint. Which would pretty much leave actual assignment to storage as being the only place for the check to happen. -- Andrew (irc:RhodiumToad)
On Mon, Sep 14, 2009 at 11:20:59PM -0400, Tom Lane wrote: > There is some moderately interesting reading material in section > 4.17.4 "Domain constraints" of SQL:2008. Not sure where to look for a copy of that, nor any particularly helpful links :( > In particular, it appears to > me that the standard goes out of its way to NOT claim that every value > that "is of" a domain type satisfies the domain's constraints. It looks > to me that the implementation they have in mind is that domain > constraints are to be checked: > > (1) when a value is assigned to a *table* column having that domain type; > > (2) when a value is converted to that domain type by an *explicit* > cast construct; > > (3) nowhere else. I struggle to get any useful meaning out of the SQL specs, but that sounds about right to me. > If I'm reading this right, it sidesteps most of the concerns we have > been worrying about here, at the cost of being perhaps more surprising > and less useful than one would expect. It means that domains are a world away from ADTs (abstract data types) and just seem to function as quick "templates" for creating new columns. PG seems to be treating domains as ADTs at the moment, which is the abstraction that's proved to be more useful in larger programming projects. > It would also mean that a lot > of our existing domain behavior is wrong. I think there is ammunition > here for an argument that, in effect, values "in flight" in expression > or query evaluation should always be considered to be of base types, > and domain constraints should only be checked when assigning to a > persistent storage location such as a table field or plpgsql variable > (plus the special case for CAST constructs). Are you considering changing PGs behavior here? and if so, what would happen to existing behavior? -- Sam http://samason.me.uk/
On Tue, Sep 15, 2009 at 05:13:21AM +0100, Andrew Gierth wrote: > But there's a kicker: in Subclause 6.12, <cast specification>, in the > General Rules is: > > a) If the <cast operand> specifies NULL, then the result of CS is > the null value and no further General Rules of this Subclause > are applied. > > That "no further General Rules" clause implies (assuming it's not a > blatant mistake in the spec) that this rule is therefore skipped in > the case of nulls: I think the NOT NULL constraint is a PG specific constraint, I can't see how it's allowed in the spec. Then again, I have trouble parsing the spec so could well be wrong about this. The NOT NULL constraint feels wrong as well, what are the semantics of: CREATE DOMAIN d AS INTEGER NOT NULL; SELECT a.n AS aa, b.n AS bb FROM (VALUES (CAST(1 AS d)),(2)) a(n) LEFT JOIN (VALUES(CAST(1 AS d))) b(n) ON a.n = b.n; in the presence of it? I'm expecting aa and bb both to come out as domain "d", but this shouldn't work with what you're saying the current semantics should be. -- Sam http://samason.me.uk/
>>>>> "Sam" == Sam Mason <sam@samason.me.uk> writes: >> But there's a kicker: in Subclause 6.12, <cast specification>, in the>> General Rules is:>> >> a) If the <cast operand>specifies NULL, then the result of CS is>> the null value and no further General Rules of this Subclause>> are applied.>>>> That "no further General Rules" clause implies (assuming it's not a>> blatant mistake in the spec) that thisrule is therefore skipped in>> the case of nulls: Sam> I think the NOT NULL constraint is a PG specific constraint, ISam> can't see how it's allowed in the spec. That's a good point; it doesn't seem to be. But the spec _does_ appear to allow CHECK(VALUE IS NOT NULL) as a domain constraint (in general the spec defines NOT NULL constraints this way), and the wording from 6.12 implies that that check is still skipped in the case of NULLs (so that constraint would stop you inserting a null into a table column (I think), but not from casting a null value to the domain type). Sam> The NOT NULL constraint feels wrong as well, what are theSam> semantics of: Sam> CREATE DOMAIN d AS INTEGER NOT NULL;Sam> SELECT a.n AS aa, b.n AS bbSam> FROM (VALUES (CAST(1 AS d)),(2)) a(n)Sam> LEFT JOIN (VALUES (CAST(1 AS d))) b(n) ON a.n = b.n; Sam> in the presence of it? I'm expecting aa and bb both to come outSam> as domain "d", but this shouldn't work with whatyou're sayingSam> the current semantics should be. I think that's just another example of Tom's initial comment about how broken domain "not null" constraints are currently. -- Andrew (irc:RhodiumToad)
Andrew Gierth <andrew@tao11.riddles.org.uk> writes: > "Sam" == Sam Mason <sam@samason.me.uk> writes: > Sam> The NOT NULL constraint feels wrong as well, what are the > Sam> semantics of: > Sam> CREATE DOMAIN d AS INTEGER NOT NULL; > Sam> SELECT a.n AS aa, b.n AS bb > Sam> FROM (VALUES (CAST(1 AS d)),(2)) a(n) > Sam> LEFT JOIN (VALUES (CAST(1 AS d))) b(n) ON a.n = b.n; > Sam> in the presence of it? I'm expecting aa and bb both to come out > Sam> as domain "d", but this shouldn't work with what you're saying > Sam> the current semantics should be. > I think that's just another example of Tom's initial comment about how > broken domain "not null" constraints are currently. Well, the LEFT JOIN case is exactly why I feel that domain not-null constraints are inherently broken. The only clean way around it is to decree that the output of a left join is not of the domain type after all, but of its base type. Which seems to me to be one side effect of the wording in 4.17.4, though they are extending it to *all* evaluation contexts not only outer joins. I haven't yet read the additional material you guys found ... regards, tom lane
On Tue, Sep 15, 2009 at 02:54:18PM +0100, Andrew Gierth wrote: > the spec _does_ appear to allow CHECK(VALUE IS NOT NULL) as a > domain constraint (in general the spec defines NOT NULL constraints > this way), Huh, that's a trivial rewrite isn't it. Not sure why it didn't occur to me that it's just syntax sugar. > and the wording from 6.12 implies that that check is still > skipped in the case of NULLs (so that constraint would stop you > inserting a null into a table column (I think), but not from casting a > null value to the domain type). Explicitly ignoring NULL values in CAST expressions seems like a good feature as well. Although it gives me the feeling that domains are more and more like a mis-designed feature. > >>>>> "Sam" == Sam Mason <sam@samason.me.uk> writes: > Sam> The NOT NULL constraint feels wrong as well, > I think that's just another example of Tom's initial comment about how > broken domain "not null" constraints are currently. Hum, given that it's just sugar for more general constraints I'm not sure if it's the not null constraints that are broken or just the current interpretation of them. They would do the "right thing" if they were only checked in a limited number of places that the user was aware of, which the spec seems to imply is when the user explicitly asks for a CAST to be performed or when writing into the table. -- Sam http://samason.me.uk/
Since our shop seems to use domains more than most, I figured I should comment on this thread. >Sam Mason <sam@samason.me.uk> wrote: >> On Tue, Sep 15, 2009 at 02:54:18PM +0100, Andrew Gierth wrote: >> and the wording from 6.12 implies that that check is still >> skipped in the case of NULLs (so that constraint would stop you >> inserting a null into a table column (I think), but not from >> casting a null value to the domain type). > > Explicitly ignoring NULL values in CAST expressions seems like a > good feature as well. OK by me. > Although it gives me the feeling that domains are more > and more like a mis-designed feature. They have their place, for when you don't really need a new type, but you want to show that multiple columns contain data from the same set. My rule of thumb is this -- if it would make sense for two columns to be compared for equality, there's a very good chance they belong in the same domain; if not, they probably don't. Using them helps to document complex databases and helps with portability, quite aside from the issue of constraints. > Hum, given that it's just sugar for more general constraints I'm > not sure if it's the not null constraints that are broken or just > the current interpretation of them. They would do the "right > thing" if they were only checked in a limited number of places > that the user was aware of, which the spec seems to imply is when > the user explicitly asks for a CAST to be performed or when > writing into the table. If that's what the spec says, then +1 from me. The change won't cause problems here. -Kevin