Thread: [mmoncure@gmail.com: Re: [GENERAL] array_to_set functions]

[mmoncure@gmail.com: Re: [GENERAL] array_to_set functions]

From
Decibel!
Date:
ISTM that having a built-in array_to_set function would be awfully
useful... Is the aggregate method below an acceptable way to do it?

----- Forwarded message from Merlin Moncure <mmoncure@gmail.com> -----
On 8/3/07, Guy Fraser <guy@incentre.net> wrote:
> On Wed, 2007-08-01 at 07:14 +0530, Merlin Moncure wrote:
> > On 8/1/07, Decibel! <decibel@decibel.org> wrote:
> > > David Fetter and I just came up with these, perhaps others will find
> > > them useful:
> > >
> > > CREATE OR REPLACE FUNCTION array_to_set(anyarray, int) RETURNS SETOF anyelement LANGUAGE SQL AS $$
> > >     SELECT $1[i] from generate_series(array_lower($1, $2), array_upper($1, $2)) i
> > > $$;
> > > CREATE OR REPLACE FUNCTION array_to_set(anyarray) RETURNS SETOF anyelement LANGUAGE SQL AS $$
> > >     SELECT array_to_set($1, 1)
> > > $$;
> >
> > very nice, although IMO there is a strong justification for these
> > functions to be in core and written in C for efficiency (along with
> > array_accum, which I have hand burn from copying and pasting out of
> > the documentation).
> >
> > merlin
> >
> Excellent timing guys. :^)
>
> I was trying to build a function to list the items of an array, but
> ran into problems and was going to post what I had been working on.
>
> Your functions work great.
>
> In case you don't have the function to generate an array from a set
> here is one I have been using :
>
>
> CREATE AGGREGATE array_accum (
>     BASETYPE = anyelement,
>     SFUNC = array_append,
>     STYPE = anyarray,
>     INITCOND = '{}'
> );

I think that's what just about everyone uses.  Unfortunately the
reverse of the function (array_to_set above) AFAIK does not map
directly to the C array API.

merlin

---------------------------(end of broadcast)---------------------------
TIP 9: In versions below 8.0, the planner will ignore your desire to      choose an index scan if your joining column's
datatypesdo not      match 


----- End forwarded message -----

--
Decibel!, aka Jim Nasby                        decibel@decibel.org
EnterpriseDB      http://enterprisedb.com      512.569.9461 (cell)

Re: [mmoncure@gmail.com: Re: [GENERAL] array_to_set functions]

From
Jeremy Drake
Date:
On Tue, 7 Aug 2007, Decibel! wrote:

> ISTM that having a built-in array_to_set function would be awfully
> useful... Is the aggregate method below an acceptable way to do it?

Umm, the array_to_set function is not an aggregate.  Personally, when I
need this functionality, I use this function conveniently present in the
default install:

select * from information_schema._pg_expandarray(ARRAY['foo', 'bar', 'baz']); x  | n
-----+---foo | 1bar | 2baz | 3
(3 rows)


Not exactly well documented or well known, but it works.

>
> ----- Forwarded message from Merlin Moncure <mmoncure@gmail.com> -----
> On 8/3/07, Guy Fraser <guy@incentre.net> wrote:
> > On Wed, 2007-08-01 at 07:14 +0530, Merlin Moncure wrote:
> > > On 8/1/07, Decibel! <decibel@decibel.org> wrote:
> > > > David Fetter and I just came up with these, perhaps others will find
> > > > them useful:
> > > >
> > > > CREATE OR REPLACE FUNCTION array_to_set(anyarray, int) RETURNS SETOF anyelement LANGUAGE SQL AS $$
> > > >     SELECT $1[i] from generate_series(array_lower($1, $2), array_upper($1, $2)) i
> > > > $$;
> > > > CREATE OR REPLACE FUNCTION array_to_set(anyarray) RETURNS SETOF anyelement LANGUAGE SQL AS $$
> > > >     SELECT array_to_set($1, 1)
> > > > $$;
> > >
> > > very nice, although IMO there is a strong justification for these
> > > functions to be in core and written in C for efficiency (along with
> > > array_accum, which I have hand burn from copying and pasting out of
> > > the documentation).
> > >
> > > merlin
> > >
> > Excellent timing guys. :^)
> >
> > I was trying to build a function to list the items of an array, but
> > ran into problems and was going to post what I had been working on.
> >
> > Your functions work great.
> >
> > In case you don't have the function to generate an array from a set
> > here is one I have been using :
> >
> >
> > CREATE AGGREGATE array_accum (
> >     BASETYPE = anyelement,
> >     SFUNC = array_append,
> >     STYPE = anyarray,
> >     INITCOND = '{}'
> > );
>
> I think that's what just about everyone uses.  Unfortunately the
> reverse of the function (array_to_set above) AFAIK does not map
> directly to the C array API.
>
> merlin
>
> ---------------------------(end of broadcast)---------------------------
> TIP 9: In versions below 8.0, the planner will ignore your desire to
>        choose an index scan if your joining column's datatypes do not
>        match
>
>
> ----- End forwarded message -----
>
>

-- 
Mollison's Bureaucracy Hypothesis:If an idea can survive a bureaucratic review and be implementedit wasn't worth
doing.


Re: [mmoncure@gmail.com: Re: [GENERAL] array_to_set functions]

From
Decibel!
Date:
On Tue, Aug 07, 2007 at 10:18:32PM -0700, Jeremy Drake wrote:
> On Tue, 7 Aug 2007, Decibel! wrote:
>
> > ISTM that having a built-in array_to_set function would be awfully
> > useful... Is the aggregate method below an acceptable way to do it?
>
> Umm, the array_to_set function is not an aggregate.  Personally, when I

Sorry, should have been more clear... array_to_set as shown below isn't,
but array_accum is, and does the opposite (set_to_array).

> need this functionality, I use this function conveniently present in the
> default install:
>
> select * from information_schema._pg_expandarray(ARRAY['foo', 'bar', 'baz']);
>   x  | n
> -----+---
>  foo | 1
>  bar | 2
>  baz | 3
> (3 rows)
>
> Not exactly well documented or well known, but it works.

Worse than that, that's something that's entirely internal and could
change at any release. The fact that it exists for info_schema indicates
further need for these functions to exist in the backend.

> >
> > ----- Forwarded message from Merlin Moncure <mmoncure@gmail.com> -----
> > On 8/3/07, Guy Fraser <guy@incentre.net> wrote:
> > > On Wed, 2007-08-01 at 07:14 +0530, Merlin Moncure wrote:
> > > > On 8/1/07, Decibel! <decibel@decibel.org> wrote:
> > > > > David Fetter and I just came up with these, perhaps others will find
> > > > > them useful:
> > > > >
> > > > > CREATE OR REPLACE FUNCTION array_to_set(anyarray, int) RETURNS SETOF anyelement LANGUAGE SQL AS $$
> > > > >     SELECT $1[i] from generate_series(array_lower($1, $2), array_upper($1, $2)) i
> > > > > $$;
> > > > > CREATE OR REPLACE FUNCTION array_to_set(anyarray) RETURNS SETOF anyelement LANGUAGE SQL AS $$
> > > > >     SELECT array_to_set($1, 1)
> > > > > $$;
> > > >
> > > > very nice, although IMO there is a strong justification for these
> > > > functions to be in core and written in C for efficiency (along with
> > > > array_accum, which I have hand burn from copying and pasting out of
> > > > the documentation).
> > > >
> > > > merlin
> > > >
> > > Excellent timing guys. :^)
> > >
> > > I was trying to build a function to list the items of an array, but
> > > ran into problems and was going to post what I had been working on.
> > >
> > > Your functions work great.
> > >
> > > In case you don't have the function to generate an array from a set
> > > here is one I have been using :
> > >
> > >
> > > CREATE AGGREGATE array_accum (
> > >     BASETYPE = anyelement,
> > >     SFUNC = array_append,
> > >     STYPE = anyarray,
> > >     INITCOND = '{}'
> > > );
> >
> > I think that's what just about everyone uses.  Unfortunately the
> > reverse of the function (array_to_set above) AFAIK does not map
> > directly to the C array API.
> >
> > merlin
> >
> > ---------------------------(end of broadcast)---------------------------
> > TIP 9: In versions below 8.0, the planner will ignore your desire to
> >        choose an index scan if your joining column's datatypes do not
> >        match
> >
> >
> > ----- End forwarded message -----
> >
> >
>
> --
> Mollison's Bureaucracy Hypothesis:
>     If an idea can survive a bureaucratic review and be implemented
>     it wasn't worth doing.
>

--
Decibel!, aka Jim Nasby                        decibel@decibel.org
EnterpriseDB      http://enterprisedb.com      512.569.9461 (cell)

Re: [mmoncure@gmail.com: Re: [GENERAL] array_to_set functions]

From
Michael Glaesemann
Date:
On Aug 8, 2007, at 11:41 , Decibel! wrote:

> On Tue, Aug 07, 2007 at 10:18:32PM -0700, Jeremy Drake wrote:

>> select * from information_schema._pg_expandarray(ARRAY['foo',  
>> 'bar', 'baz']);
>>   x  | n
>> -----+---
>>  foo | 1
>>  bar | 2
>>  baz | 3
>> (3 rows)
>>
>> Not exactly well documented or well known, but it works.
>
> Worse than that, that's something that's entirely internal and could
> change at any release. The fact that it exists for info_schema  
> indicates
> further need for these functions to exist in the backend.

Personally, I think expandarray is more appropriate and its  
functionality probably more generally useful, as it identifies the  
array indices as well. Note you can also rename the columns.

select * from information_schema._pg_expandarray(ARRAY['foo', 'bar',  
'baz']) as b(a,i);  a  | i
-----+---
foo | 1
bar | 2
baz | 3
(3 rows)

array_to_set really isn't, as AFAICS it didn't guarantee element  
uniqueness (but that's just a naming issue).

Michael Glaesemann
grzm seespotcode net



Re: [mmoncure@gmail.com: Re: [GENERAL] array_to_set functions]

From
Decibel!
Date:
On Wed, Aug 08, 2007 at 12:03:34PM -0500, Michael Glaesemann wrote:
> Personally, I think expandarray is more appropriate and its
> functionality probably more generally useful, as it identifies the
> array indices as well. Note you can also rename the columns.

Sure. My point is that we should have a way to convert arrays to sets
and back in the backend.
--
Decibel!, aka Jim Nasby                        decibel@decibel.org
EnterpriseDB      http://enterprisedb.com      512.569.9461 (cell)

Re: [mmoncure@gmail.com: Re: [GENERAL] array_to_set functions]

From
Michael Glaesemann
Date:
On Aug 8, 2007, at 12:18 , Decibel! wrote:

> On Wed, Aug 08, 2007 at 12:03:34PM -0500, Michael Glaesemann wrote:
>> Personally, I think expandarray is more appropriate and its
>> functionality probably more generally useful, as it identifies the
>> array indices as well. Note you can also rename the columns.
>
> Sure. My point is that we should have a way to convert arrays to sets
> and back in the backend.

Can't really argue with you there, as I find array_accum myself.  
(Though I'd still nit-pick that this isn't an array to set  
conversion, but rather array to--possibly single-column--table.)

Michael Glaesemann
grzm seespotcode net



Re: [mmoncure@gmail.com: Re: [GENERAL] array_to_set functions]

From
Bruce Momjian
Date:
TODO item?

---------------------------------------------------------------------------

Michael Glaesemann wrote:
-- Start of PGP signed section.
> 
> On Aug 8, 2007, at 12:18 , Decibel! wrote:
> 
> > On Wed, Aug 08, 2007 at 12:03:34PM -0500, Michael Glaesemann wrote:
> >> Personally, I think expandarray is more appropriate and its
> >> functionality probably more generally useful, as it identifies the
> >> array indices as well. Note you can also rename the columns.
> >
> > Sure. My point is that we should have a way to convert arrays to sets
> > and back in the backend.
> 
> Can't really argue with you there, as I find array_accum myself.  
> (Though I'd still nit-pick that this isn't an array to set  
> conversion, but rather array to--possibly single-column--table.)
> 
> Michael Glaesemann
> grzm seespotcode net
> 
> 
-- End of PGP section, PGP failed!

--  Bruce Momjian  <bruce@momjian.us>          http://momjian.us EnterpriseDB
http://www.enterprisedb.com
 + If your life is a hard drive, Christ can be your backup. +


Re: [mmoncure@gmail.com: Re: [GENERAL] array_to_set functions]

From
"Pavel Stehule"
Date:
2007/8/14, Bruce Momjian <bruce@momjian.us>:
>
> TODO item?
>
>   + If your life is a hard drive, Christ can be your backup. +
>
> ---------------------------(end of broadcast)---------------------------
> TIP 6: explain analyze is your friend

I am against. It's too simple do it in SQL language.

Regards
Pavel Stehule


Re: [mmoncure@gmail.com: Re: [GENERAL] array_to_set functions]

From
Joe Conway
Date:
Bruce Momjian wrote:
> TODO item?
> 

Probably. See SQL2003 UNNEST:

<collection derived table> ::=  UNNEST <left paren> <collection value expression> <right paren>  [ WITH ORDINALITY ]

<collection value expression> ::=    <array value expression>  | <multiset value expression>

Joe


Re: [mmoncure@gmail.com: Re: [GENERAL] array_to_set functions]

From
Decibel!
Date:
On Tue, Aug 14, 2007 at 05:38:33PM +0200, Pavel Stehule wrote:
> 2007/8/14, Bruce Momjian <bruce@momjian.us>:
> >
> > TODO item?
> >
> >   + If your life is a hard drive, Christ can be your backup. +
> >
> > ---------------------------(end of broadcast)---------------------------
> > TIP 6: explain analyze is your friend
>
> I am against. It's too simple do it in SQL language.

Why make everyone who works with arrays create a function just to do
this? Something that's of use to common users should be included, simple
or not.
--
Decibel!, aka Jim Nasby                        decibel@decibel.org
EnterpriseDB      http://enterprisedb.com      512.569.9461 (cell)

Re: [mmoncure@gmail.com: Re: [GENERAL] array_to_set functions]

From
"Merlin Moncure"
Date:
On 8/14/07, Bruce Momjian <bruce@momjian.us> wrote:
>
> TODO item?

I would say yes...array_accum is virtually an essential function when
working with arrays and the suggested array_to_set (and it's built in
cousin, _pg_expand_array) really should not be built around
generate_series when a C function is faster and will scale much
better.

array_to_set, as suggested in SQL, is something only a relative expert
with PostgreSQL could be expected to write.

Thus could generate_series be relieved from providing the only core
function for set returning functions in the documentation.  IMO, this
part of the documentation could use some expansion anyways :)

merlin


Re: [mmoncure@gmail.com: Re: [GENERAL] array_to_set functions]

From
David Fetter
Date:
On Tue, Aug 14, 2007 at 04:08:27PM -0500, Decibel! wrote:
> On Tue, Aug 14, 2007 at 05:38:33PM +0200, Pavel Stehule wrote:
> > 2007/8/14, Bruce Momjian <bruce@momjian.us>:
> > >
> > > TODO item?
> > >
> > >   + If your life is a hard drive, Christ can be your backup. +
> > >
> > > ---------------------------(end of broadcast)---------------------------
> > > TIP 6: explain analyze is your friend
> > 
> > I am against. It's too simple do it in SQL language.
> 
> Why make everyone who works with arrays create a function just to do
> this? Something that's of use to common users should be included, simple
> or not.

As I recall, this wasn't included because it didn't do all of what the
SQL:2003 standard UNNEST does, although looking it over, the only
thing it "doesn't do" is to allow the results to come back in any
order other than the array index's.  Sadly, UNNEST would be a new
feature, and we're *way* past that for 8.3 :/

Cheers,
David.
-- 
David Fetter <david@fetter.org> http://fetter.org/
phone: +1 415 235 3778        AIM: dfetter666                             Skype: davidfetter

Remember to vote!
Consider donating to PostgreSQL: http://www.postgresql.org/about/donate


Re: [mmoncure@gmail.com: Re: [GENERAL] array_to_set functions]

From
"Pavel Stehule"
Date:
2007/8/14, Decibel! <decibel@decibel.org>:
> On Tue, Aug 14, 2007 at 05:38:33PM +0200, Pavel Stehule wrote:
> > 2007/8/14, Bruce Momjian <bruce@momjian.us>:
> > >
> > > TODO item?
> > >
> > >   + If your life is a hard drive, Christ can be your backup. +
> > >
> > > ---------------------------(end of broadcast)---------------------------
> > > TIP 6: explain analyze is your friend
> >
> > I am against. It's too simple do it in SQL language.
>
> Why make everyone who works with arrays create a function just to do
> this? Something that's of use to common users should be included, simple
> or not.
> --

Unpacking array is more SQL construct for me, than SRF function. With
function you cannot conntrol behave of unpacking. With SQL construct I
can

SELECT DISTINCT a(i) FROM generate_series ... remove duplicities
SELECT a(i) FROM generate_series ORDER BY .. sorted output
etc

But I can

SELECT * FROM generate_series(ARRAY[1,3,4,5,7,10]);

else
FUNCTION generate_series(anyarray) returns setof any

Regards
Pavel Stehule


Re: [mmoncure@gmail.com: Re: [GENERAL] array_to_set functions]

From
"Pavel Stehule"
Date:
2007/8/15, Merlin Moncure <mmoncure@gmail.com>:
> On 8/14/07, Bruce Momjian <bruce@momjian.us> wrote:
> >
> > TODO item?
>
> I would say yes...array_accum is virtually an essential function when
> working with arrays and the suggested array_to_set (and it's built in
> cousin, _pg_expand_array) really should not be built around
> generate_series when a C function is faster and will scale much
> better.
>

Hello Merlin

array_accum is good sample of PostgreSQL possibilities. But it is slow.

SELECT ARRAY(SELECT ... FROM ...)) is much faster. :(

so I unlike not necessary aggregate functions

I agree. These constructs can be showed in doc

Regards
Pavel Stehule


Re: [mmoncure@gmail.com: Re: [GENERAL] array_to_set functions]

From
Decibel!
Date:
On Wed, Aug 15, 2007 at 06:47:05AM +0200, Pavel Stehule wrote:
> 2007/8/14, Decibel! <decibel@decibel.org>:
> > On Tue, Aug 14, 2007 at 05:38:33PM +0200, Pavel Stehule wrote:
> > > 2007/8/14, Bruce Momjian <bruce@momjian.us>:
> > > >
> > > > TODO item?
> > > >
> > > >   + If your life is a hard drive, Christ can be your backup. +
> > > >
> > > > ---------------------------(end of broadcast)---------------------------
> > > > TIP 6: explain analyze is your friend
> > >
> > > I am against. It's too simple do it in SQL language.
> >
> > Why make everyone who works with arrays create a function just to do
> > this? Something that's of use to common users should be included, simple
> > or not.
> > --
>
> Unpacking array is more SQL construct for me, than SRF function. With
> function you cannot conntrol behave of unpacking. With SQL construct I
> can

Huh? You can do a DISTINCT or an ORDER BY on the output of a SRF.

> SELECT DISTINCT a(i) FROM generate_series ... remove duplicities
> SELECT a(i) FROM generate_series ORDER BY .. sorted output
> etc
>
> But I can
>
> SELECT * FROM generate_series(ARRAY[1,3,4,5,7,10]);
>
> else
> FUNCTION generate_series(anyarray) returns setof any
--
Decibel!, aka Jim Nasby                        decibel@decibel.org
EnterpriseDB      http://enterprisedb.com      512.569.9461 (cell)

Re: [mmoncure@gmail.com: Re: [GENERAL] array_to_set functions]

From
"Pavel Stehule"
Date:
2007/8/15, Decibel! <decibel@decibel.org>:
> On Wed, Aug 15, 2007 at 06:47:05AM +0200, Pavel Stehule wrote:
> > 2007/8/14, Decibel! <decibel@decibel.org>:
> > > On Tue, Aug 14, 2007 at 05:38:33PM +0200, Pavel Stehule wrote:
> > > > 2007/8/14, Bruce Momjian <bruce@momjian.us>:
> > > > >
> > > > > TODO item?
> > > > >
> > > > >   + If your life is a hard drive, Christ can be your backup. +
> > > > >
> > > > > ---------------------------(end of broadcast)---------------------------
> > > > > TIP 6: explain analyze is your friend
> > > >
> > > > I am against. It's too simple do it in SQL language.
> > >
> > > Why make everyone who works with arrays create a function just to do
> > > this? Something that's of use to common users should be included, simple
> > > or not.
> > > --
> >
> > Unpacking array is more SQL construct for me, than SRF function. With
> > function you cannot conntrol behave of unpacking. With SQL construct I
> > can
>
> Huh? You can do a DISTINCT or an ORDER BY on the output of a SRF.

Yes, but then you get little bit different my form :)


>
> > SELECT DISTINCT a(i) FROM generate_series ... remove duplicities
> > SELECT a(i) FROM generate_series ORDER BY .. sorted output
> > etc
> >
> > But I can
> >
> > SELECT * FROM generate_series(ARRAY[1,3,4,5,7,10]);
> >
> > else
> > FUNCTION generate_series(anyarray) returns setof any
> --
> Decibel!, aka Jim Nasby                        decibel@decibel.org
> EnterpriseDB      http://enterprisedb.com      512.569.9461 (cell)
>
>


Re: [mmoncure@gmail.com: Re: [GENERAL] array_to_set functions]

From
Bruce Momjian
Date:
This has been saved for the 8.4 release:
http://momjian.postgresql.org/cgi-bin/pgpatches_hold

---------------------------------------------------------------------------

Pavel Stehule wrote:
> 2007/8/15, Merlin Moncure <mmoncure@gmail.com>:
> > On 8/14/07, Bruce Momjian <bruce@momjian.us> wrote:
> > >
> > > TODO item?
> >
> > I would say yes...array_accum is virtually an essential function when
> > working with arrays and the suggested array_to_set (and it's built in
> > cousin, _pg_expand_array) really should not be built around
> > generate_series when a C function is faster and will scale much
> > better.
> >
> 
> Hello Merlin
> 
> array_accum is good sample of PostgreSQL possibilities. But it is slow.
> 
> SELECT ARRAY(SELECT ... FROM ...)) is much faster. :(
> 
> so I unlike not necessary aggregate functions
> 
> I agree. These constructs can be showed in doc
> 
> Regards
> Pavel Stehule

--  Bruce Momjian  <bruce@momjian.us>          http://momjian.us EnterpriseDB
http://www.enterprisedb.com
 + If your life is a hard drive, Christ can be your backup. +


Re: [mmoncure@gmail.com: Re: [GENERAL] array_to_set functions]

From
Bruce Momjian
Date:
Added to TODO:

* Add array_accum() and array_to_set() functions for arrays
 http://archives.postgresql.org/pgsql-hackers/2007-08/msg00464.php


---------------------------------------------------------------------------

Merlin Moncure wrote:
> On 8/14/07, Bruce Momjian <bruce@momjian.us> wrote:
> >
> > TODO item?
> 
> I would say yes...array_accum is virtually an essential function when
> working with arrays and the suggested array_to_set (and it's built in
> cousin, _pg_expand_array) really should not be built around
> generate_series when a C function is faster and will scale much
> better.
> 
> array_to_set, as suggested in SQL, is something only a relative expert
> with PostgreSQL could be expected to write.
> 
> Thus could generate_series be relieved from providing the only core
> function for set returning functions in the documentation.  IMO, this
> part of the documentation could use some expansion anyways :)
> 
> merlin
> 
> ---------------------------(end of broadcast)---------------------------
> TIP 3: Have you checked our extensive FAQ?
> 
>                http://www.postgresql.org/docs/faq

--  Bruce Momjian  <bruce@momjian.us>        http://momjian.us EnterpriseDB
http://postgres.enterprisedb.com
 + If your life is a hard drive, Christ can be your backup. +