Thread: Domains and type coercion
The DOMAIN patch is completely broken when it comes to type coercion behavior. For one thing, it doesn't know that any operators or functions on a domain's base type can be used with a domain: domain=# create domain zip as char(2); CREATE domain=# create table foo (f1 zip); CREATE domain=# select f1 || 'z' from foo; ERROR: Unable to identify an operator '||' for types 'zip' and 'unknown' You will have to retype this query usingan explicit cast and casting does not help: domain=# select f1::char(2) || 'z' from foo; ERROR: Cannot cast type 'zip' to 'character' There are more subtle problems too. Among other things, it will generate expressions that are supposed to be labeled with the domain type but are actually labeled with the domain's base type, leading to all sorts of confusion. (The reason we had to introduce RelabelType expression nodes a couple years ago was to avoid just this scenario.) I am thinking that a non-broken approach would involve (1) treating a domain as binary-compatible with its base type, and therefore with all other domains on the same base type, and (2) allowing a coercion function that produces the base type to be used to produce the domain type. (The patch tries to do (2), but does it in the wrong places, leading to the mislabeled-expression problem.) An implication of this is that one could not define functions and operators that implement any interesting domain-type-specific behavior. This strikes me as okay --- it seems like domains are a shortcut to save having to invent a real type, and so people wouldn't care about defining domain-specific functions. If we don't accept binary equivalence of domains to base types, then creating a useful domain will be nearly as nontrivial as creating a new base type. Comments? regards, tom lane
> I am thinking that a non-broken approach would involve (1) treating > a domain as binary-compatible with its base type, and therefore with > all other domains on the same base type, and (2) allowing a coercion > function that produces the base type to be used to produce the domain > type. (The patch tries to do (2), but does it in the wrong places, > leading to the mislabeled-expression problem.) 2 was the goal, and it worked enough for any default expression I could come up with -- so I thought it did pretty good. Guess not. It should be binary equivelent to the base type it's made out of.
I wrote: > I am thinking that a non-broken approach would involve (1) treating > a domain as binary-compatible with its base type, and therefore with > all other domains on the same base type, and (2) allowing a coercion > function that produces the base type to be used to produce the domain > type. I've committed code that does this, and it seems to handle the basic cases okay. However, there are still some corner cases that are unfriendly: regression=# create domain mydom as numeric(7,2); CREATE DOMAIN regression=# create table foo (f1 numeric(7,2), f2 mydom); CREATE regression=# insert into foo values(111,222); INSERT 139780 1 regression=# select f1 + 42 from foo;?column? ---------- 153.00 (1 row) regression=# select f2 + 42 from foo; ERROR: Unable to identify an operator '+' for types 'mydom' and 'integer' You will have to retype this query usingan explicit cast The problem seems to be that when parse_func looks for "exact match" operators, it doesn't consider numeric to be an exact match for mydom. So that heuristic fails and we're left with no unique best choice for the operator. I'm not sure if there's anything much that can be done about this. We could treat exact and binary-compatible matches alike (doesn't seem good), or put a special case into the operator selection rules to reduce domains to their basetypes before making the "exact match" test. Neither of these seem real appealing, but if we don't do something I think that domains are going to be a big pain in the neck to use. Any thoughts? regards, tom lane
Tom Lane wrote: > > (...) or put a special case into the operator selection rules to reduce > domains to their basetypes before making the "exact match" test. By definition, which I believe should be read as "A domain is a set of permissible values (of a data type)". What I am trying to say is that the domain is still the same data type w.r.t. operator and functions so reducing it to the basic type for such searchs is the right thing to do. > Neither of these seem real appealing, but if we don't do something > I think that domains are going to be a big pain in the neck to use. > Agreed. -- Fernando Nasser Red Hat - Toronto E-Mail: fnasser@redhat.com 2323 Yonge Street, Suite #300 Toronto, Ontario M4P 2C9
Tom Lane wrote: > > Any thoughts? > As we are talking about CAST, if one CASTs to a domain, SQL99 says we have to check the constraints and issue a "integrity constraint violation" if appropriate (6.22, GR 21). -- Fernando Nasser Red Hat - Toronto E-Mail: fnasser@redhat.com 2323 Yonge Street, Suite #300 Toronto, Ontario M4P 2C9
... > The problem seems to be that when parse_func looks for "exact match" > operators, it doesn't consider numeric to be an exact match for mydom. > So that heuristic fails and we're left with no unique best choice for > the operator. Sure. At the moment there is no reason for parse_func to think that mydom is anything, right? > I'm not sure if there's anything much that can be done about this. Something has to be done ;) > We could treat exact and binary-compatible matches alike (doesn't seem > good), or put a special case into the operator selection rules to reduce > domains to their basetypes before making the "exact match" test. > Neither of these seem real appealing, but if we don't do something > I think that domains are going to be a big pain in the neck to use. There could also be an explicit heuristic *after* the exact match gathering to look for an exact match for domains reduced to their base types. Is there any reason to look for domains before that? - Thomas
Thomas Lockhart <thomas@fourpalms.org> writes: >> We could treat exact and binary-compatible matches alike (doesn't seem >> good), or put a special case into the operator selection rules to reduce >> domains to their basetypes before making the "exact match" test. > There could also be an explicit heuristic *after* the exact match > gathering to look for an exact match for domains reduced to their base > types. Is there any reason to look for domains before that? The problem in the case I gave was that the "exact match" heuristic was throwing away the operator we really wanted to use. I had "domain + int4" where domain is really numeric. In the base case, "numeric + int4", we'll keep both "numeric + numeric" and "int4 + int4" since each has one exact match, and later decide that "numeric + numeric" is preferred. In the domain case we will keep only "int4 + int4" ... oops. Testing later will not help. If we take the hard SQL99 line that domains *are* the base type plus constraints, then we could reduce domains to base types before we start the entire matching process, and this issue would go away. This would prevent declaring any specialized operators or functions for a domain. (In fact, I'd be inclined to set things up so that it's impossible to store domain type OIDs in pg_proc or pg_operator, thus saving the time of doing getBaseType on one side of the match.) Thoughts? regards, tom lane
Tom Lane wrote: > > If we take the hard SQL99 line that domains *are* the base type plus > constraints, then we could reduce domains to base types before we start > the entire matching process, and this issue would go away. This would > prevent declaring any specialized operators or functions for a domain. > (In fact, I'd be inclined to set things up so that it's impossible to > store domain type OIDs in pg_proc or pg_operator, thus saving the time > of doing getBaseType on one side of the match.) Thoughts? > IMHO this is the right thing to do. -- Fernando Nasser Red Hat - Toronto E-Mail: fnasser@redhat.com 2323 Yonge Street, Suite #300 Toronto, Ontario M4P 2C9
> If we take the hard SQL99 line that domains *are* the base type plus > constraints, then we could reduce domains to base types before we start > the entire matching process, and this issue would go away. This would > prevent declaring any specialized operators or functions for a domain. > (In fact, I'd be inclined to set things up so that it's impossible to > store domain type OIDs in pg_proc or pg_operator, thus saving the time > of doing getBaseType on one side of the match.) Thoughts? It would be fairly straight forward to simply copy the domain base type into the atttypid, then create an atttypdomain (normally 0, except in the case of a domain). Everything would use the attypid, except for \d and pg_dump which could use the domain if it exists. Is this something I should do?
"Rod Taylor" <rbt@zort.ca> writes: > It would be fairly straight forward to simply copy the domain base > type into the atttypid, then create an atttypdomain (normally 0, > except in the case of a domain). Everything would use the attypid, > except for \d and pg_dump which could use the domain if it exists. > Is this something I should do? No, because it's quite irrelevant to the problem of type coercion, which works with expressions; attributes are only one part of the expression world. Actually, considering Fernando's point that a CAST ought to apply the constraints associated with a domain type, your attribute-based implementation is wrong anyway. Rather than merging the domain constraints into the table definition (which will be a nightmare for pg_dump to sort out, anyway) keep 'em separate. The constraints could be checked during casting from a base type to a domain type --- take a look at the existing mechanism for enforcing typmod (length limits), which after all is a simplistic kind of domain constraint. regards, tom lane