Thread: idea - new aggregates median, listagg

idea - new aggregates median, listagg

From
Pavel Stehule
Date:
Hello

I am looking on new feature - ORDER clause in aggregate, and I thing,
so we are able to effectively implement some non standard, but well
known aggregates.

a) function median - it is relative frequent request - with usually
slow implementation

b) function listagg (it is analogy of group_concat from MySQL) - it
should simplify report generating and some other

What is your opinion? Do you like to see these functions in core?

Regards
Pavel Stehule


Re: idea - new aggregates median, listagg

From
David Fetter
Date:
On Tue, Dec 15, 2009 at 10:28:49PM +0100, Pavel Stehule wrote:
> Hello
> 
> I am looking on new feature - ORDER clause in aggregate, and I thing,
> so we are able to effectively implement some non standard, but well
> known aggregates.
> 
> a) function median - it is relative frequent request - with usually
> slow implementation

Makes a lot of sense.  I suspect we'll have to provide several
different medians, as there are several precise, useful, and
conflicting definitions.  Some examples below:

http://www.simple-talk.com/sql/t-sql-programming/median-workbench/

> b) function listagg (it is analogy of group_concat from MySQL) - it
> should simplify report generating and some other

This is redundant, as it's equivalent to array_to_string(array_agg()).

If it's done as syntactic sugar over that, it's fine.  If it's a
separate implementation, it's a bad idea.  Probably best as an
optional module of some kind.

Cheers,
David.
-- 
David Fetter <david@fetter.org> http://fetter.org/
Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
Skype: davidfetter      XMPP: david.fetter@gmail.com
iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate


Re: idea - new aggregates median, listagg

From
Pavel Stehule
Date:
2009/12/15 David Fetter <david@fetter.org>:
> On Tue, Dec 15, 2009 at 10:28:49PM +0100, Pavel Stehule wrote:
>> Hello
>>
>> I am looking on new feature - ORDER clause in aggregate, and I thing,
>> so we are able to effectively implement some non standard, but well
>> known aggregates.
>>
>> a) function median - it is relative frequent request - with usually
>> slow implementation
>
> Makes a lot of sense.  I suspect we'll have to provide several
> different medians, as there are several precise, useful, and
> conflicting definitions.  Some examples below:
>
> http://www.simple-talk.com/sql/t-sql-programming/median-workbench/

In this article the are two medians - statistical and financial. I am
for both. But only one can be named "median".

>
>> b) function listagg (it is analogy of group_concat from MySQL) - it
>> should simplify report generating and some other
>
> This is redundant, as it's equivalent to array_to_string(array_agg()).

when I implement it in orafce - the almost all code was parameters
checking. Implementation is trivial, because important part of work
was done with array_agg support. It is wrapper over stringInfo.

>
> If it's done as syntactic sugar over that, it's fine.  If it's a
> separate implementation, it's a bad idea.  Probably best as an
> optional module of some kind

I am not sure if implementation as syntactic sugar is best. It needs
parser changes. And final implementation will not be effective.
listagg should be  faster  - string operations are significantly
simpler than array op. What more - PostgreSQL cannot well wrap SRF and
aggregates. The implementation could be analogy with functions
generate_series and generate_subscripts. generate_subscripts is
redundant, but well implementation needs C coding. Implementation as
syntactic sugar is possible - it could be nice too - listagg is more
shorter than array_to_string(array_agg()) - but it has overhead with
array.

There could be some steps. First - implementation as wrapper over
arrays, second - if this functionality will be popular - native
implementation in C?

Regards
Pavel

>
> Cheers,
> David.
> --
> David Fetter <david@fetter.org> http://fetter.org/
> Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
> Skype: davidfetter      XMPP: david.fetter@gmail.com
> iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics
>
> Remember to vote!
> Consider donating to Postgres: http://www.postgresql.org/about/donate
>


Re: idea - new aggregates median, listagg

From
Robert Haas
Date:
On Wed, Dec 16, 2009 at 1:36 AM, Pavel Stehule <pavel.stehule@gmail.com> wrote:
>>> b) function listagg (it is analogy of group_concat from MySQL) - it
>>> should simplify report generating and some other
>>
>> This is redundant, as it's equivalent to array_to_string(array_agg()).
>
> when I implement it in orafce - the almost all code was parameters
> checking. Implementation is trivial, because important part of work
> was done with array_agg support. It is wrapper over stringInfo.
>
>>
>> If it's done as syntactic sugar over that, it's fine.  If it's a
>> separate implementation, it's a bad idea.  Probably best as an
>> optional module of some kind
>
> I am not sure if implementation as syntactic sugar is best. It needs
> parser changes.

I don't think this is important enough to change the parser for it.  I
don't see a concrete proposal for syntax here, but it seems like  the
standard function call syntax should be adequate.

...Robert


Re: idea - new aggregates median, listagg

From
"Kevin Grittner"
Date:
Pavel Stehule <pavel.stehule@gmail.com> wrote:
> 2009/12/15 David Fetter <david@fetter.org>:
>> http://www.simple-talk.com/sql/t-sql-programming/median-workbench/
> 
> In this article the are two medians - statistical and financial. I
> am for both. But only one can be named "median".
Well, since the statistical median requires that you specify whether
you want the left or right median (in case there are an even number
of values), you could either have two median functions, one of which
would take an extra parameter for this, or you could have median,
left_median, and right_median functions.
-Kevin


Re: idea - new aggregates median, listagg

From
Thom Brown
Date:
2009/12/15 Pavel Stehule <pavel.stehule@gmail.com>
Hello

I am looking on new feature - ORDER clause in aggregate, and I thing,
so we are able to effectively implement some non standard, but well
known aggregates.

a) function median - it is relative frequent request - with usually
slow implementation

b) function listagg (it is analogy of group_concat from MySQL) - it
should simplify report generating and some other

What is your opinion? Do you like to see these functions in core?



I'm probably missing the point here, but when I originally saw MySQL's group_concat function, I found it odd that it featured ordering functionality.  Shouldn't the order by determined by the query itself?  Otherwise it's almost as if its separating the relationship between the result column and the resultset.

Thom

Re: idea - new aggregates median, listagg

From
Pavel Stehule
Date:
2009/12/16 Thom Brown <thombrown@gmail.com>:
> 2009/12/15 Pavel Stehule <pavel.stehule@gmail.com>
>>
>> Hello
>>
>> I am looking on new feature - ORDER clause in aggregate, and I thing,
>> so we are able to effectively implement some non standard, but well
>> known aggregates.
>>
>> a) function median - it is relative frequent request - with usually
>> slow implementation
>>
>> b) function listagg (it is analogy of group_concat from MySQL) - it
>> should simplify report generating and some other
>>
>> What is your opinion? Do you like to see these functions in core?
>>
>>
>
> I'm probably missing the point here, but when I originally saw MySQL's
> group_concat function, I found it odd that it featured ordering
> functionality.  Shouldn't the order by determined by the query itself?
> Otherwise it's almost as if its separating the relationship between the
> result column and the resultset.
>

Aggregates as group_concat or listagg are not typical SQL aggregates.
With these aggregates we are able to do some reports on SQL level
without stored procedures. What I know, order is determined only for
non hash aggregates - and you cannot specify method of aggregation, so
possibility to specify ORDER is important. But this feature isn't
related to this "proposal". It was commited yesterday - so you can
look on discussion about this feature.

Regards
Pavel Stehuke

> Thom
>
>


Re: idea - new aggregates median, listagg

From
Scott Bailey
Date:
Thom Brown wrote:
> 2009/12/15 Pavel Stehule <pavel.stehule@gmail.com 
> <mailto:pavel.stehule@gmail.com>>
> 
>     Hello
> 
>     I am looking on new feature - ORDER clause in aggregate, and I thing,
>     so we are able to effectively implement some non standard, but well
>     known aggregates.
> 
>     a) function median - it is relative frequent request - with usually
>     slow implementation
> 
>     b) function listagg (it is analogy of group_concat from MySQL) - it
>     should simplify report generating and some other
> 
>     What is your opinion? Do you like to see these functions in core?
> 
> 
> 
> I'm probably missing the point here, but when I originally saw MySQL's 
> group_concat function, I found it odd that it featured ordering 
> functionality.  Shouldn't the order by determined by the query itself?  
> Otherwise it's almost as if its separating the relationship between the 
> result column and the resultset.

For xmlagg in particular, it is quite useful to be able order the 
results. And sorting the query doesn't work for ordering the agg unless 
you do it in a subquery. Oracle has this functionality and it is quite 
handy. It would be nice to see listagg with the option to order as well.

Scott