On Fri, Nov 12, 2010 at 4:07 PM, Jeff Davis <pgsql@j-davis.com> wrote:
> I think the best we'll do is be able to hack on some of the things that
> we actively want and have clear use cases for, such as type interfaces.
> We might have to give up on some of the more ambitious ideas that
> involve propagating interesting information through the type inference
> system; or having any real type that wasn't declared with CREATE TYPE.
> Consider that right now we bundle the element type information along
> with the array _value_.
Here are some weaknesses in the SUM aggregate that run up against the
type system; maybe they'll help crystallize some discussion. The
current result-type mappings are:
SUM(int2) => int4
SUM(int4) => int8
SUM(int8) => numeric
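If it helps to poke at this directly, pg_typeof shows what the parser
resolves a given SUM call to:

    SELECT pg_typeof(sum(1::int2)),
           pg_typeof(sum(1::int4)),
           pg_typeof(sum(1::int8));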
Some weaknesses:
SUM, at any precision, assumes that the type being accumulated into
(which is also the return type) is wide enough to avoid overflow. That
is generally the case, but there's no reason it *must* be true. I'm
especially looking at the int2-to-int4 step, and one could imagine more
interesting scenarios where overflow occurs much more easily.
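To make the arithmetic concrete (a sketch only, assuming the int2 ->
int4 accumulation listed above): 2147483647 / 32767 is about 65538, so
on the order of seventy thousand maximal int2 values is already more
than the accumulating type can represent.

    -- Sketch, assuming the int2 -> int4 accumulation described above;
    -- the arithmetic is the point, not the exact failure you'd observe.
    SELECT sum(32767::int2)            -- 32767 * 70000 is about 2.29e9,
      FROM generate_series(1, 70000);  -- past the int4 maximum of ~2.15e9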
SUM always promotes types upward in precision, and it has no way to
keep the result at the smallest sufficient precision when SUM
expressions are composed.
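For example (small_table, its int2 column x, and the grouping column
grp are hypothetical), composing SUMs promotes at every level no matter
how small the values actually are:

    SELECT pg_typeof(sum(per_group))            -- promoted again here
      FROM (SELECT grp, sum(x) AS per_group     -- promoted once here
              FROM small_table
             GROUP BY grp) s;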
SUM cannot carry any supplementary information about precision, i.e. it
cannot say anything interesting about the typmod, which defeats or
rules out many interesting optimizations.
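A small illustration of the typmod point (table and view names chosen
arbitrarily): the declared type of a summed column drops whatever
precision information the input carried.

    CREATE TEMP TABLE t (x numeric(4,1));
    CREATE TEMP VIEW v AS SELECT sum(x) AS s FROM t;
    \d v    -- s is declared as bare "numeric"; the (4,1) is gone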
I think a type-interface system moves toward solving the first couple
of problems, since SUM could return some abstract type such as
"Integer" and use it to promote more aggressively (avoiding overflow)
or to keep representations small (avoiding unnecessary promotion) at
run time. It might require Integer to be an abstract, non-storable data
type, though, so the current CREATE TYPE is not going to make life
easy.
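To sketch what I mean, in strawman syntax only (nothing like this
exists today):

    -- hypothetical: an abstract, non-storable interface type
    CREATE TYPE INTERFACE integer_like;

    -- hypothetical: declare the concrete types as implementations
    ALTER TYPE int2 ADD INTERFACE integer_like;
    ALTER TYPE int4 ADD INTERFACE integer_like;
    ALTER TYPE int8 ADD INTERFACE integer_like;

    -- hypothetical: SUM declared against the interface, leaving the
    -- implementation free to choose or widen its accumulator
    -- representation at run time rather than committing to a width
    CREATE AGGREGATE sum (integer_like) ( ... );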
The third problem is slightly different: it might require some
user-pluggable code to be called as part of semantic analysis. The idea
of making a Postgres type a more robust first-class object, and of
being able to attach to a function or aggregate another function that
gets called during semantic analysis and returns the proper signature,
has been rolling around in my head, but it might be the whispers of
Cthulhu, too.
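Again strawman only, just to show where such a hook might hang (none of
these options or functions exist):

    -- hypothetical: a function consulted during semantic analysis,
    -- handed the argument types and typmods, returning the result
    -- signature
    CREATE FUNCTION sum_signature(argtypes regtype[], argtypmods int4[])
        RETURNS regtype AS ...;

    -- hypothetical "analysis" attribute attaching it to the aggregate
    CREATE AGGREGATE sum (integer_like) (
        sfunc    = ...,
        stype    = ...,
        analysis = sum_signature
    );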
fdr