Re: Re: 4.1beta1: ANYARRAY disallowed for DOMAIN types which happen to be arrays - Mailing list pgsql-hackers

From Noah Misch
Subject Re: Re: 4.1beta1: ANYARRAY disallowed for DOMAIN types which happen to be arrays
Date
Msg-id 20110511191249.GA29592@tornado.gateway.2wire.net
Whole thread Raw
In response to Re: Re: 4.1beta1: ANYARRAY disallowed for DOMAIN types which happen to be arrays  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers
On Wed, May 11, 2011 at 10:22:01AM -0400, Tom Lane wrote:
> Noah Misch <noah@leadboat.com> writes:
> > On Mon, May 09, 2011 at 11:32:28PM -0400, Tom Lane wrote:
> >> So we basically had three alternatives to make it better:
> >>     * downcast to the array type, which would possibly silently
> >>       break applications that were relying on the function result
> >>       being considered of the domain type
> >>     * re-apply domain checks on the function result, which would be
> >>       a performance hit and possibly again result in unobvious
> >>       breakage
> >>     * explicitly break it by throwing a parse error until you
> >>       downcast (and then upcast the function result if you want)
> >> I realize that #3 is a bit unpleasant, but are either of the other two
> >> better?  At least #3 shows you where you need to check for problems.
> 
> > Though I've never used a domain over an array type, I'd strongly prefer #2.
> 
> Hmm.  I hadn't seriously considered that alternative, but we could go in
> that direction.  Logically, what this would probably imply is inserting
> CastToDomain whenever the result of a polymorphic function is deemed to
> be of a domain type, whether the base type is array or not.
> 
> The reason I hadn't taken it very seriously is that I don't think it's
> actually going to end up being consistent.  If we don't do #1 (downcast
> polymorphic arguments to a base type), but consider the arguments passed
> to the function to be of the domain type, then really we have to expect
> the polymorphic function to enforce domain constraints internally; we
> cannot fix it with something as localized as having the function call
> parser stick a CastToDomain on top.  Here's a possibly rather silly
> example:
> 
>     create function negate(anyelement) returns anyelement as
>     $$ select - $1 $$ language sql;
> 
>     create domain pos as int check (value > 0);
> 
>     select negate(42::pos);
> 
> This negate() function will work for any type that has a unary minus
> operator.  But the result of the unary minus operation cannot sanely be
> considered to be of this domain type.

While simple, I think that example covers the salient features.  The git master
behavior is sound: "ERROR:  return type mismatch in function declared to return
pos DETAIL:  Actual return type is integer."  If you defined a unary minus
operator for the domain type itself, the function implementing that operator
would then gain responsibility for preserving any domain constraints.

Now, perhaps it's unfortunate that the example can't be easily rewritten to
actually work for arbitrary domain inputs (without rewriting it in C).  I don't
have any particular ideas for improving that.

> In this simplified example you
> might feel it doesn't matter, since with an external CastToDomain we'd
> throw error anyway a moment later, as soon as control comes back from
> the function.  But what if the function does further operations with the
> value, such as passing it to another polymorphic function?

The SQL PL understands that the value of -($1::pos) has type integer, as does
PL/pgSQL.  I'm not seeing any problems offhand.

> So really, if you go down this path, you end up concluding that PLs
> supporting polymorphic arguments had better be prepared to enforce
> domain constraints all the way through, and thus there should be no need
> for an external CastToDomain --- what comes back from the function ought
> to be checked already.

That was my conclusion.  I'm not aware of any particular holes in this area, but
I won't wager there are none.  The code that produces the domain-typed datums is
not usually part of the PL implementation, so the PL mostly needs to rigorously
track the provenance of its datums.

> Unfortunately, even if the PLs do that (SQL
> functions might get it right, but I'm not real sure whether plpgsql is
> water-tight on this, and I don't trust the other PLs for it at all),
> there's no way that built-in polymorphic functions like array_append are
> going to.
> 
> So on the whole, #2 looks like an implementation quagmire to me: it's
> not clear what to check, or where, or how you know when you're done.

Every C function is responsible for returning a datum consistent with its
declared return type: we do nothing in particular to ensure that a function has
done so.  I see domain constraints as a natural extension of that rule.  If a C
function purports to return a domain-typed datum, perhaps through polymorphism,
it's responsible for having checked the datum against that domain.  Every PL
accepts responsibility for enforcing this on behalf of its functions, much as it
enforces general consistency between return types and actual returned datums.

A function with a polymorphic argument returning the same polymorphic type is
consequently responsible for applying domain checks when the concrete type is a
domain.  (A function with an anyelement argument that returns anyarray is not a
problem, because the return type is never a domain.  Likewise for a function
with an anyarray argument that returns anyelement.)  Functions returning
anyarray are at the most risk, because it's actually practical to write a
function that fabricates an array of arbitrary type.  I can't come up with an
analogous example for a scalar type -- they end up delegating the fabrication to
type-specific code, like your unary negation example.

Again, I think that leaves just array_append, array_prepend, and array_cat
actually needing a new check.

> I'm not willing to volunteer my own time to make it work that way.
> If somebody else who uses domains a lot wants to step up and take
> responsibility, go for it.

I'm willing the implement the above behavior, if it seems reasonable.

nm


pgsql-hackers by date:

Previous
From: Darren Duncan
Date:
Subject: Re: VARIANT / ANYTYPE datatype
Next
From: Darren Duncan
Date:
Subject: Re: VARIANT / ANYTYPE datatype