Thread: Re: [GENERAL] aggregate returning anyarray and 'cannot determine result data type'
Re: [GENERAL] aggregate returning anyarray and 'cannot determine result data type'
From
Tom Lane
Date:
[ redirecting to -hackers ] Tomas Vondra <tv@fuzzy.cz> writes: > So my plan was to do something like this: > sample_append(internal, anyelement, int) -> internal > sample_final(internal) -> anyarray > CREATE AGGREGATE sample_agg(anyelement, int) ( > SFUNC = sample_append, > STYPE = internal, > FINALFUNC = sample_final > ); > However this leads to > ERROR: cannot determine result data type > DETAIL: A function returning a polymorphic type must have at least > one polymorphic argument > because 'sample_final' produces anyarray but has no polymorphic > argument. Yeah, this is a problem with trying to use internal stype for polymorphic aggregates. The same problem came up in connection with the "ordered set" aggregates that were added recently, and that patch implemented an interesting workaround: the final function for an OSA gets additional dummy arguments of the same type as the aggregate inputs. They are always passed as NULLs at runtime, and have no real value except if the aggregate is polymorphic --- but when it is, they provide a way to resolve the result type of a polymorphic final function, even if the state type is "internal" or otherwise non-polymorphic. I thought at the time that maybe we should offer this feature for regular aggregates as well as ordered-set ones, but didn't do anything about it because there hadn't been demand. If we did have it, you could solve this problem with sample_append(internal, anyelement, int) -> internal sample_final(internal, anyelement, int) -> anyarray CREATE AGGREGATE sample_agg(anyelement, int) ( SFUNC = sample_append, STYPE = internal, FINALFUNC = sample_final); where sample_final would have to be declared non-strict (since it'd always be getting some NULL arguments), but that's a small price to pay. I think it'd be a pretty small adjustment to the already-committed code to allow this to happen. Basically we'd just have to decouple the extra-arguments-to-finalfn behavior from ordered-set aggregates. One potential issue though is that if sample_final existed in both signatures it wouldn't be very clear which one got selected for the aggregate. Perhaps the best fix would be to invent a different CREATE AGGREGATE keyword for finalfns with extra arguments? If so, that's something we ought to do *now*, not in 9.5, because it'll be too late to redefine how to create OSAs once 9.4 ships. Thoughts? regards, tom lane
Re: [GENERAL] aggregate returning anyarray and 'cannot determine result data type'
From
Tom Lane
Date:
I wrote: > The same problem came up in connection with the "ordered set" aggregates > that were added recently, and that patch implemented an interesting > workaround: the final function for an OSA gets additional dummy arguments > of the same type as the aggregate inputs. They are always passed as NULLs > at runtime, and have no real value except if the aggregate is polymorphic > --- but when it is, they provide a way to resolve the result type of a > polymorphic final function, even if the state type is "internal" or > otherwise non-polymorphic. > I thought at the time that maybe we should offer this feature for regular > aggregates as well as ordered-set ones, but didn't do anything about > it because there hadn't been demand. After sleeping on it, I'm convinced that this was an oversight that we should fix before 9.4 ships. The code changes should be pretty minimal; the executor in particular probably needs *less* code to do this in a uniform way. > One potential issue though is that if sample_final existed in both > signatures it wouldn't be very clear which one got selected for the > aggregate. Perhaps the best fix would be to invent a different > CREATE AGGREGATE keyword for finalfns with extra arguments? To be concrete: let's add a new boolean parameter with the semantics of "final function takes extra dummy arguments" (default false). There would need to be one for the separate moving-aggregate final function too, of course. The best naming idea I've got right now is "finalfunc_extra" and "mfinalfunc_extra", but maybe somebody can do better? regards, tom lane
Re: [GENERAL] aggregate returning anyarray and 'cannot determine result data type'
From
Tomas Vondra
Date:
On 23.4.2014 16:07, Tom Lane wrote: > > To be concrete: let's add a new boolean parameter with the semantics > of "final function takes extra dummy arguments" (default false). > There would need to be one for the separate moving-aggregate final > function too, of course. > > The best naming idea I've got right now is "finalfunc_extra" and > "mfinalfunc_extra", but maybe somebody can do better? Do we really need a separate parameter for this? Couldn't this be decided simply using the signature of the final function? Either it has a single parameter (current behavior), or it has the same parameters as the state transition function (new behavior). regards Tomas
Re: [GENERAL] aggregate returning anyarray and 'cannot determine result data type'
From
Tom Lane
Date:
Tomas Vondra <tv@fuzzy.cz> writes: > On 23.4.2014 16:07, Tom Lane wrote: >> To be concrete: let's add a new boolean parameter with the semantics >> of "final function takes extra dummy arguments" (default false). >> There would need to be one for the separate moving-aggregate final >> function too, of course. > Do we really need a separate parameter for this? Couldn't this be > decided simply using the signature of the final function? Either it has > a single parameter (current behavior), or it has the same parameters as > the state transition function (new behavior). The problem is that the CREATE AGGREGATE syntax only specifies the name of the final function, not its argument list, so you have to make an assumption about the argument list in order to look up the final function in the first place. I did consider the idea of looking for both signatures and using whatever we find, but that seems fairly dangerous: the same CREATE AGGREGATE command could give different results depending on what versions of the final function happen to exist. This would create an ordering hazard that pg_dump could not reliably cope with, for example. regards, tom lane
Re: [GENERAL] aggregate returning anyarray and 'cannot determine result data type'
From
Tomas Vondra
Date:
On 25.4.2014 23:26, Tom Lane wrote: > Tomas Vondra <tv@fuzzy.cz> writes: >> On 23.4.2014 16:07, Tom Lane wrote: >>> To be concrete: let's add a new boolean parameter with the >>> semantics of "final function takes extra dummy arguments" >>> (default false). There would need to be one for the separate >>> moving-aggregate final function too, of course. > >> Do we really need a separate parameter for this? Couldn't this be >> decided simply using the signature of the final function? Either >> it has a single parameter (current behavior), or it has the same >> parameters as the state transition function (new behavior). > > The problem is that the CREATE AGGREGATE syntax only specifies the > name of the final function, not its argument list, so you have to > make an assumption about the argument list in order to look up the > final function in the first place. > > I did consider the idea of looking for both signatures and using > whatever we find, but that seems fairly dangerous: the same CREATE > AGGREGATE command could give different results depending on what > versions of the final function happen to exist. This would create an > ordering hazard that pg_dump could not reliably cope with, for > example. Yeah. And it wouldn't be clear which function to use in case two suitable functions (with different signatures) exist. So I guess this actually requires a parameter. I'd vote for "finalfunc_extra" - can't think of a better name, and I'm not sure what the "m" in "mfinalfunc_extra" stands for. regards Tomas
Re: [GENERAL] aggregate returning anyarray and 'cannot determine result data type'
From
Tom Lane
Date:
Tomas Vondra <tv@fuzzy.cz> writes: > On 25.4.2014 23:26, Tom Lane wrote: >> The problem is that the CREATE AGGREGATE syntax only specifies the >> name of the final function, not its argument list, so you have to >> make an assumption about the argument list in order to look up the >> final function in the first place. > Yeah. And it wouldn't be clear which function to use in case two > suitable functions (with different signatures) exist. So I guess this > actually requires a parameter. Exactly. > I'd vote for "finalfunc_extra" - can't think of a better name, and I'm > not sure what the "m" in "mfinalfunc_extra" stands for. Sorry for not being clear. The "m" version is the alternate setting for the moving-aggregate sub-implementation, which is new as of a couple weeks ago: http://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=a9d9acbf219b9e96585779cd5f99d674d4ccba74 regards, tom lane