Thread: is any reason why only one columns subselect are allowed in array()?

is any reason why only one columns subselect are allowed in array()?

From
"Pavel Stehule"
Date:
Hello

I am thinking about allowing 2D array from multicolumn subselect (all
columns have to share same type, ofcourse). Is there some real limit,
why this functionality is disallowed?

Regards
Pavel Stehule


Re: is any reason why only one columns subselect are allowed in array()?

From
"Robert Haas"
Date:
On Tue, Nov 18, 2008 at 10:06 AM, Pavel Stehule <pavel.stehule@gmail.com> wrote:
> Hello
>
> I am thinking about allowing 2D array from multicolumn subselect (all
> columns have to share same type, ofcourse). Is there some real limit,
> why this functionality is disallowed?

Seems like you could just write SELECT ARRAY[col1, col2, col3] instead
of SELECT col1, col2, col3.

...Robert


Re: is any reason why only one columns subselect are allowed in array()?

From
David Fetter
Date:
On Tue, Nov 18, 2008 at 10:29:53AM -0500, Robert Haas wrote:
> On Tue, Nov 18, 2008 at 10:06 AM, Pavel Stehule <pavel.stehule@gmail.com> wrote:
> > Hello
> >
> > I am thinking about allowing 2D array from multicolumn subselect
> > (all columns have to share same type, ofcourse). Is there some
> > real limit, why this functionality is disallowed?
> 
> Seems like you could just write SELECT ARRAY[col1, col2, col3]
> instead of SELECT col1, col2, col3.

If I understand this right, Pavel is thinking that
   ARRAY(SELECT col1, col2, col3...)

should produce an array each element of which is the compound type
consisting of (col1, col2, col3), and if it doesn't produce one, we're
looking at a pretty large POLA violation.

Cheers,
David.
-- 
David Fetter <david@fetter.org> http://fetter.org/
Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
Skype: davidfetter      XMPP: david.fetter@gmail.com

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate


Re: is any reason why only one columns subselect are allowed in array()?

From
Sam Mason
Date:
On Tue, Nov 18, 2008 at 07:37:44AM -0800, David Fetter wrote:
> On Tue, Nov 18, 2008 at 10:29:53AM -0500, Robert Haas wrote:
> > Seems like you could just write SELECT ARRAY[col1, col2, col3]
> > instead of SELECT col1, col2, col3.
> 
> If I understand this right, Pavel is thinking that
> 
>     ARRAY(SELECT col1, col2, col3...)
> 
> should produce an array each element of which is the compound type
> consisting of (col1, col2, col3), and if it doesn't produce one, we're
> looking at a pretty large POLA violation.

I've used this syntax before and got a surprising message back.  I'd
expect to be able to do the following:
 ARRAY((SELECT col1, col2 FROM (VALUES ('a',1), ('b',2)) x(col1,col2)));

and get the following back {"(a,1)","(b,2)"}.  So I think I'm with
David.

 Sam


Re: is any reason why only one columns subselect are allowed in array()?

From
"Pavel Stehule"
Date:
2008/11/18 David Fetter <david@fetter.org>:
> On Tue, Nov 18, 2008 at 10:29:53AM -0500, Robert Haas wrote:
>> On Tue, Nov 18, 2008 at 10:06 AM, Pavel Stehule <pavel.stehule@gmail.com> wrote:
>> > Hello
>> >
>> > I am thinking about allowing 2D array from multicolumn subselect
>> > (all columns have to share same type, ofcourse). Is there some
>> > real limit, why this functionality is disallowed?
>>
>> Seems like you could just write SELECT ARRAY[col1, col2, col3]
>> instead of SELECT col1, col2, col3.
>
> If I understand this right, Pavel is thinking that
>
>    ARRAY(SELECT col1, col2, col3...)
>
> should produce an array each element of which is the compound type
> consisting of (col1, col2, col3), and if it doesn't produce one, we're
> looking at a pretty large POLA violation.
>

yes, I though this.

there are some not necessary limits, because we should some operations:

postgres=# select array(select array[a,b] from foo);
ERROR:  could not find array type for datatype integer[]
postgres=# select array(select * from foo);
ERROR:  subquery must return only one column
LINE 1: select array(select * from foo);

postgres=# select array[[1,2]]||array[[1,3]];  ?column?
---------------{{1,2},{1,3}}
(1 row)

so I don't see reason why dis feature are blocked

regards
Pavel Stehule

> Cheers,
> David.
> --
> David Fetter <david@fetter.org> http://fetter.org/
> Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
> Skype: davidfetter      XMPP: david.fetter@gmail.com
>
> Remember to vote!
> Consider donating to Postgres: http://www.postgresql.org/about/donate
>


Re: is any reason why only one columns subselect are allowed in array()?

From
"Pavel Stehule"
Date:
2008/11/18 Sam Mason <sam@samason.me.uk>:
> On Tue, Nov 18, 2008 at 07:37:44AM -0800, David Fetter wrote:
>> On Tue, Nov 18, 2008 at 10:29:53AM -0500, Robert Haas wrote:
>> > Seems like you could just write SELECT ARRAY[col1, col2, col3]
>> > instead of SELECT col1, col2, col3.
>>
>> If I understand this right, Pavel is thinking that
>>
>>     ARRAY(SELECT col1, col2, col3...)
>>
>> should produce an array each element of which is the compound type
>> consisting of (col1, col2, col3), and if it doesn't produce one, we're
>> looking at a pretty large POLA violation.
>
> I've used this syntax before and got a surprising message back.  I'd
> expect to be able to do the following:
>
>  ARRAY((SELECT col1, col2 FROM (VALUES ('a',1), ('b',2)) x(col1,col2)));
>
> and get the following back {"(a,1)","(b,2)"}.  So I think I'm with
> David.

this is different result - it's array of records, not 2d array.

regards
Pavel

>
>
>  Sam
>
> --
> Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-hackers
>


Re: is any reason why only one columns subselect are allowed in array()?

From
Sam Mason
Date:
On Tue, Nov 18, 2008 at 05:20:27PM +0100, Pavel Stehule wrote:
> 2008/11/18 Sam Mason <sam@samason.me.uk>:
> > I've used this syntax before and got a surprising message back.  I'd
> > expect to be able to do the following:
> >
> >  ARRAY((SELECT col1, col2 FROM (VALUES ('a',1), ('b',2)) x(col1,col2)));
> >
> > and get the following back {"(a,1)","(b,2)"}.  So I think I'm with
> > David.
> 
> this is different result - it's array of records, not 2d array.

Yes, but an array of records is much more natural.  There are only a
few specific cases when what you want to do would be useful.  It also
naturally follows on from the current semantics:
 ARRAY(VALUES (1),(2));

returns a 1d array of integers and not a 2d array of unit width---you
don't get this back:
 ARRAY[ARRAY[1],ARRAY[2]]

But I can't see any reason for changing the semantics between when you
return a single column vs. many.  In fact it may confuse calling code
even more.

Records have a predefined and static (over the duration of the query)
structure so it's easy to make an equivalence between single element
records and the element itself (several very rigorously specified
languages do this very successfully).  It's somewhat annoying that PG
only does this sometimes:
 SELECT x, n FROM now() x(n);

Causes the record "x" to be of type timestamp (i.e. the same as "n") and
not a record containing a timestamp.  Whereas:
 SELECT x, n FROM (VALUES (1)) x(n);

Causes the record "x" to remain as a record containing an integer and
"n" to refer to the same integer.  All good fun, but not very relevant!

The length of an array is specifically unknown, so assuming any
equivalence between arrays and single elements of defined type is
difficult at best.  In your example, everything works out because
you're doing a transition from a tuple (record) to a vector (1d array)
to a matrix (2d array), where the length of the vector is constant (as
predicated on the source being a record) and hence width of the matrix
is constant.  But I can't see why the user would always want to put this
middle step in.

Does that make any sense?

 Sam


Re: is any reason why only one columns subselect are allowed in array()?

From
Sam Mason
Date:
On Tue, Nov 18, 2008 at 05:19:12PM +0100, Pavel Stehule wrote:
> there are some not necessary limits, because we should some operations:
> 
> postgres=# select array(select * from foo);
> ERROR:  subquery must return only one column
> LINE 1: select array(select * from foo);

The current limitation that a sub-query can only return a single column
is very annoying and is one of the more common restrictions I bump into
when writing SQL.

 Sam


Re: is any reason why only one columns subselect are allowed in array()?

From
"Pavel Stehule"
Date:
2008/11/18 Sam Mason <sam@samason.me.uk>:
> On Tue, Nov 18, 2008 at 05:19:12PM +0100, Pavel Stehule wrote:
>> there are some not necessary limits, because we should some operations:
>>
>> postgres=# select array(select * from foo);
>> ERROR:  subquery must return only one column
>> LINE 1: select array(select * from foo);
>
> The current limitation that a sub-query can only return a single column
> is very annoying and is one of the more common restrictions I bump into
> when writing SQL.
>

yes, and typical limitation is return only one row too. So these
typical rules are broken yet. I did some fast hacking and after change
about no more than 10 rows I am able remove this limit. It is similar
aggregates that were one argument too, and we have multi parametric
aggregates now.

postgres=# select array(select a from test);?column?
----------{1,2,3}
(1 row)

postgres=# select array(select a,b from test);     ?column?
---------------------{{1,2},{2,3},{3,4}}
(1 row)

postgres=# select array(select a,b,c from test);        ?column?
---------------------------{{1,2,3},{2,3,4},{3,4,5}}
(1 row)

postgres=# select array(select (a,b,c) from test);           ?column?
---------------------------------{"(1,2,3)","(2,3,4)","(3,4,5)"}
(1 row)


postgres=# select * from test;a | b | c
---+---+---1 | 2 | 32 | 3 | 43 | 4 | 5
(3 rows)

It's need some parser lines more (finding most common type), but it is
full compatible - and usefull for multicolumn time series.

regards
Pavel

>
>  Sam
>
> --
> Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-hackers
>


Re: is any reason why only one columns subselect are allowed in array()?

From
"Pavel Stehule"
Date:
2008/11/18 Sam Mason <sam@samason.me.uk>:
> On Tue, Nov 18, 2008 at 05:20:27PM +0100, Pavel Stehule wrote:
>> 2008/11/18 Sam Mason <sam@samason.me.uk>:
>> > I've used this syntax before and got a surprising message back.  I'd
>> > expect to be able to do the following:
>> >
>> >  ARRAY((SELECT col1, col2 FROM (VALUES ('a',1), ('b',2)) x(col1,col2)));
>> >
>> > and get the following back {"(a,1)","(b,2)"}.  So I think I'm with
>> > David.
>>
>> this is different result - it's array of records, not 2d array.
>
> Yes, but an array of records is much more natural.  There are only a
> few specific cases when what you want to do would be useful.  It also
> naturally follows on from the current semantics:
>
>  ARRAY(VALUES (1),(2));
>
> returns a 1d array of integers and not a 2d array of unit width---you
> don't get this back:
>
>  ARRAY[ARRAY[1],ARRAY[2]]
>
> But I can't see any reason for changing the semantics between when you
> return a single column vs. many.  In fact it may confuse calling code
> even more

There are simple reason - I am not able to iterate over record in
plpgsql, and I should to do it over 2d array. I am sorry, but I don't
see any real reason for this limit - when I use array constructor and
input is one column, then result is one dimensional array (and it's
not important if it is array of scalar or array of record), when input
is tuple - vector, then natural result is array of array, that is 2d
array in pg.

2d arrays are much general than records and it able to store multi
time series, that is important.

regards
Pavel Stehule

p.s. my first qustions was about real limits inside pg, and there are
not any limit.

>
> Records have a predefined and static (over the duration of the query)
> structure so it's easy to make an equivalence between single element
> records and the element itself (several very rigorously specified
> languages do this very successfully).  It's somewhat annoying that PG
> only does this sometimes:
>
>  SELECT x, n FROM now() x(n);
>
> Causes the record "x" to be of type timestamp (i.e. the same as "n") and
> not a record containing a timestamp.  Whereas:
>
>  SELECT x, n FROM (VALUES (1)) x(n);
>
> Causes the record "x" to remain as a record containing an integer and
> "n" to refer to the same integer.  All good fun, but not very relevant!
>
> The length of an array is specifically unknown, so assuming any
> equivalence between arrays and single elements of defined type is
> difficult at best.  In your example, everything works out because
> you're doing a transition from a tuple (record) to a vector (1d array)
> to a matrix (2d array), where the length of the vector is constant (as
> predicated on the source being a record) and hence width of the matrix
> is constant.  But I can't see why the user would always want to put this
> middle step in.
>
> Does that make any sense?
>
>
>  Sam
>
> --
> Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-hackers
>


Re: is any reason why only one columns subselect are allowed in array()?

From
Tom Lane
Date:
Sam Mason <sam@samason.me.uk> writes:
> I've used this syntax before and got a surprising message back.  I'd
> expect to be able to do the following:
>   ARRAY((SELECT col1, col2 FROM (VALUES ('a',1), ('b',2)) x(col1,col2)));
> and get the following back {"(a,1)","(b,2)"}.  So I think I'm with
> David.

I concur --- if we support something like this, the behavior should be
that you get an array of record.  Pavel's proposal for a 2-D array seems
unworkably restrictive.  And I certainly don't want to end up in a
situation where we return either a 2-D array or array of record
depending on whether the parser thinks the column data types match ...
        regards, tom lane


Re: is any reason why only one columns subselect are allowed in array()?

From
"Pavel Stehule"
Date:
2008/11/18 Tom Lane <tgl@sss.pgh.pa.us>:
> Sam Mason <sam@samason.me.uk> writes:
>> I've used this syntax before and got a surprising message back.  I'd
>> expect to be able to do the following:
>>
>> and get the following back {"(a,1)","(b,2)"}.  So I think I'm with
>> David.
>
> I concur --- if we support something like this, the behavior should be
> that you get an array of record.  Pavel's proposal for a 2-D array seems
> unworkably restrictive.  And I certainly don't want to end up in a
> situation where we return either a 2-D array or array of record
> depending on whether the parser thinks the column data types match ...

there are clean rules. you do array from input - when input is 1D
array, then result is 2D array, when input is record, then result is
1D array of record. Where should be problem? I see Sam proposal as
only one special case of my proposal.

Pavel

I am sorry, but you know - record type is very unfriendly to plpgsql.

>
>                        regards, tom lane
>
> --
> Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-hackers
>


Re: is any reason why only one columns subselect are allowed in array()?

From
Sam Mason
Date:
On Tue, Nov 18, 2008 at 06:55:26PM +0100, Pavel Stehule wrote:
> 2008/11/18 Sam Mason <sam@samason.me.uk>:
> > On Tue, Nov 18, 2008 at 05:20:27PM +0100, Pavel Stehule wrote:
> >> 2008/11/18 Sam Mason <sam@samason.me.uk>:
> >> > I've used this syntax before and got a surprising message back.  I'd
> >> > expect to be able to do the following:
> >> >
> >> >  ARRAY((SELECT col1, col2 FROM (VALUES ('a',1), ('b',2)) x(col1,col2)));
> >> >
> >> > and get the following back {"(a,1)","(b,2)"}.  So I think I'm with
> >> > David.
> >>
> >> this is different result - it's array of records, not 2d array.
> >
> > Yes, but an array of records is much more natural.  There are only a
> > few specific cases when what you want to do would be useful.  It also
> > naturally follows on from the current semantics:
> >
> >  ARRAY(VALUES (1),(2));
> >
> > returns a 1d array of integers and not a 2d array of unit width---you
> > don't get this back:
> >
> >  ARRAY[ARRAY[1],ARRAY[2]]
> >
> > But I can't see any reason for changing the semantics between when you
> > return a single column vs. many.  In fact it may confuse calling code
> > even more
> 
> There are simple reason - I am not able to iterate over record in
> plpgsql, and I should to do it over 2d array. I am sorry, but I don't
> see any real reason for this limit - when I use array constructor and
> input is one column, then result is one dimensional array (and it's
> not important if it is array of scalar or array of record), when input
> is tuple - vector, then natural result is array of array, that is 2d
> array in pg.

I really think you're solving this the wrong way around!  Overloading a
general array accumulation function with extra semantics seems strange.
I've always been taught to design things so that the that the basic
semantics should be as simple as possible which maintaining useful
performance.

I don't have your code that allows queries to return more than one row,
but this is what I think you want to do in the context of aggregates:
 CREATE FUNCTION array_concat_(ANYARRAY,ANYARRAY) RETURNS ANYARRAY   AS $$ SELECT array_cat($1,ARRAY[$2]); $$
LANGUAGESQL   IMMUTABLE;
 
 CREATE AGGREGATE array_concat (ANYARRAY) (     sfunc = array_concat_,     stype = ANYARRAY,     initcond = '{}' );

A demo query being:
 SELECT array_concat(a) FROM (VALUES   (ARRAY[1,2,3]),   (ARRAY[5,6,7]),   (ARRAY[7,8,9])) x(a);

is that somewhat correct?

> 2d arrays are much general than records and it able to store multi
> time series, that is important.

From a type-theoretic viewpoint "general" is not a useful description
of the difference between tuples and lists, they both have *different*
semantics and the situation you use them in determines which is more
useful.

> p.s. my first qustions was about real limits inside pg, and there are
> not any limit.

If subqueries can return more than one row, this can be done as you
want---I think so anyway!

 Sam


Re: is any reason why only one columns subselect are allowed in array()?

From
"Pavel Stehule"
Date:
2008/11/18 Sam Mason <sam@samason.me.uk>:
> On Tue, Nov 18, 2008 at 06:55:26PM +0100, Pavel Stehule wrote:
>> 2008/11/18 Sam Mason <sam@samason.me.uk>:
>> > On Tue, Nov 18, 2008 at 05:20:27PM +0100, Pavel Stehule wrote:
>> >> 2008/11/18 Sam Mason <sam@samason.me.uk>:
>> >> > I've used this syntax before and got a surprising message back.  I'd
>> >> > expect to be able to do the following:
>> >> >
>> >> >  ARRAY((SELECT col1, col2 FROM (VALUES ('a',1), ('b',2)) x(col1,col2)));
>> >> >
>> >> > and get the following back {"(a,1)","(b,2)"}.  So I think I'm with
>> >> > David.
>> >>
>> >> this is different result - it's array of records, not 2d array.
>> >
>> > Yes, but an array of records is much more natural.  There are only a
>> > few specific cases when what you want to do would be useful.  It also
>> > naturally follows on from the current semantics:
>> >
>> >  ARRAY(VALUES (1),(2));
>> >
>> > returns a 1d array of integers and not a 2d array of unit width---you
>> > don't get this back:
>> >
>> >  ARRAY[ARRAY[1],ARRAY[2]]
>> >
>> > But I can't see any reason for changing the semantics between when you
>> > return a single column vs. many.  In fact it may confuse calling code
>> > even more
>>
>> There are simple reason - I am not able to iterate over record in
>> plpgsql, and I should to do it over 2d array. I am sorry, but I don't
>> see any real reason for this limit - when I use array constructor and
>> input is one column, then result is one dimensional array (and it's
>> not important if it is array of scalar or array of record), when input
>> is tuple - vector, then natural result is array of array, that is 2d
>> array in pg.
>
> I really think you're solving this the wrong way around!  Overloading a
> general array accumulation function with extra semantics seems strange.
> I've always been taught to design things so that the that the basic
> semantics should be as simple as possible which maintaining useful
> performance.
>
> I don't have your code that allows queries to return more than one row,
> but this is what I think you want to do in the context of aggregates:
>
>  CREATE FUNCTION array_concat_(ANYARRAY,ANYARRAY) RETURNS ANYARRAY
>    AS $$ SELECT array_cat($1,ARRAY[$2]); $$
>    LANGUAGE SQL
>    IMMUTABLE;
>
>  CREATE AGGREGATE array_concat (ANYARRAY) (
>      sfunc = array_concat_,
>      stype = ANYARRAY,
>      initcond = '{}'
>  );
>
> A demo query being:
>
>  SELECT array_concat(a) FROM (VALUES
>    (ARRAY[1,2,3]),
>    (ARRAY[5,6,7]),
>    (ARRAY[7,8,9])) x(a);
>
> is that somewhat correct?
>
yes, it's should be - it's one way

actually there is similar way

select array_agg(a) from ...
select array(select a from ...


>> 2d arrays are much general than records and it able to store multi
>> time series, that is important.
>
> From a type-theoretic viewpoint "general" is not a useful description
> of the difference between tuples and lists, they both have *different*
> semantics and the situation you use them in determines which is more
> useful.

try to iterate over record in plpgsql or sql functions.

regards
Pavel Stehule

>
>> p.s. my first qustions was about real limits inside pg, and there are
>> not any limit.
>
> If subqueries can return more than one row, this can be done as you
> want---I think so anyway!
>
>
>  Sam
>
> --
> Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-hackers
>


Re: is any reason why only one columns subselect are allowed in array()?

From
Sam Mason
Date:
On Tue, Nov 18, 2008 at 06:22:21PM +0000, Sam Mason wrote:
> I've always been taught to design things so that the that the basic
> semantics should be as simple as possible which maintaining useful

this should of course be "while"!           ^^^^^ 

> performance.

>   CREATE FUNCTION array_concat_(ANYARRAY,ANYARRAY) RETURNS ANYARRAY
>     AS $$ SELECT array_cat($1,ARRAY[$2]); $$
>     LANGUAGE SQL
>     IMMUTABLE;
> 
>   CREATE AGGREGATE array_concat (ANYARRAY) (
>       sfunc = array_concat_,
>       stype = ANYARRAY,
>       initcond = '{}'
>   );
> 
> A demo query being:
> 
>   SELECT array_concat(a) FROM (VALUES
>     (ARRAY[1,2,3]),
>     (ARRAY[5,6,7]),
>     (ARRAY[7,8,9])) x(a);
> 
> is that somewhat correct?

I've just realized that this doesn't work in simple cases like the
following:
 SELECT (SELECT array_concat(a) FROM (VALUES   (1), (5), (7)) x(a));

To do this, another pair of function and aggregate is needed, this time
parametrized on ANYNONARRAY rather than ANYARRAY.  It would be nice if
PG's type system was sufficiently clever to allow the above definition
to be parametrized on ANYELEMENT (or maybe something else) and it all
work out, but never mind...


Another point came to mind as well; why does array_accum pretty much
duplicate the behavior of array(subquery)?  The following looks like a
useful and general transform:
 SELECT accum_fn(subquery);

to:
 SELECT (accum_fn(c1..cn) FROM (subquery) x(c1..cn));

What have I missed?

 Sam


Re: is any reason why only one columns subselect are allowed in array()?

From
Tom Lane
Date:
"Pavel Stehule" <pavel.stehule@gmail.com> writes:
> there are clean rules. you do array from input - when input is 1D
> array, then result is 2D array, when input is record, then result is
> 1D array of record. Where should be problem?

That seems all right, but it's *not* what you first proposed, and
what you first proposed is what people are unhappy with.
        regards, tom lane


Re: is any reason why only one columns subselect are allowed in array()?

From
"Pavel Stehule"
Date:
2008/11/18 Sam Mason <sam@samason.me.uk>:
> On Tue, Nov 18, 2008 at 06:22:21PM +0000, Sam Mason wrote:
>> I've always been taught to design things so that the that the basic
>> semantics should be as simple as possible which maintaining useful
>
> this should of course be "while"!           ^^^^^
>
>> performance.
>
>>   CREATE FUNCTION array_concat_(ANYARRAY,ANYARRAY) RETURNS ANYARRAY
>>     AS $$ SELECT array_cat($1,ARRAY[$2]); $$
>>     LANGUAGE SQL
>>     IMMUTABLE;
>>
>>   CREATE AGGREGATE array_concat (ANYARRAY) (
>>       sfunc = array_concat_,
>>       stype = ANYARRAY,
>>       initcond = '{}'
>>   );
>>
>> A demo query being:
>>
>>   SELECT array_concat(a) FROM (VALUES
>>     (ARRAY[1,2,3]),
>>     (ARRAY[5,6,7]),
>>     (ARRAY[7,8,9])) x(a);
>>
>> is that somewhat correct?
>
> I've just realized that this doesn't work in simple cases like the
> following:
>
>  SELECT (SELECT array_concat(a) FROM (VALUES
>    (1), (5), (7)) x(a));
>
> To do this, another pair of function and aggregate is needed, this time
> parametrized on ANYNONARRAY rather than ANYARRAY.  It would be nice if
> PG's type system was sufficiently clever to allow the above definition
> to be parametrized on ANYELEMENT (or maybe something else) and it all
> work out, but never mind...
>
>
> Another point came to mind as well; why does array_accum pretty much
> duplicate the behavior of array(subquery)?  The following looks like a
> useful and general transform:
>
>  SELECT accum_fn(subquery);
>
> to:
>
>  SELECT (accum_fn(c1..cn) FROM (subquery) x(c1..cn));
>
> What have I missed?

array_accum was terrible slow. Non aggregate construct is much faster
- we should to build array in one simple cycle.

Pavel

>
>
>  Sam
>
> --
> Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-hackers
>


Re: is any reason why only one columns subselect are allowed in array()?

From
Sam Mason
Date:
On Tue, Nov 18, 2008 at 07:32:33PM +0100, Pavel Stehule wrote:
> 2008/11/18 Sam Mason <sam@samason.me.uk>:
> > this is what I think you want to do in the context of aggregates:
> >
> >  CREATE FUNCTION array_concat_(ANYARRAY,ANYARRAY) RETURNS ANYARRAY
> >    AS $$ SELECT array_cat($1,ARRAY[$2]); $$
> >    LANGUAGE SQL
> >    IMMUTABLE;
> >
> >  CREATE AGGREGATE array_concat (ANYARRAY) (
> >      sfunc = array_concat_,
> >      stype = ANYARRAY,
> >      initcond = '{}'
> >  );
> >
> > A demo query being:
> >
> >  SELECT array_concat(a) FROM (VALUES
> >    (ARRAY[1,2,3]),
> >    (ARRAY[5,6,7]),
> >    (ARRAY[7,8,9])) x(a);
> >
> > is that somewhat correct?
> >
> yes, it's should be - it's one way
> 
> actually there is similar way
> 
> select array_agg(a) from ...

Sorry, PG 8.3 doesn't seem to have anything that responds to that.

> select array(select a from ...

I just get an error of "could not find array type for datatype
integer[]" when I try to do that, hence why I wrote the above code.

> > On Tue, Nov 18, 2008 at 06:55:26PM +0100, Pavel Stehule wrote:
> >> 2d arrays are much general than records and it able to store multi
> >> time series, that is important.
> >
> > From a type-theoretic viewpoint "general" is not a useful description
> > of the difference between tuples and lists, they both have *different*
> > semantics and the situation you use them in determines which is more
> > useful.
> 
> try to iterate over record in plpgsql or sql functions.

We're talking completely cross purposes here and both saying the same
things.  To be formal, lets start with a hopefully redundant set of
definitions: A list/array is a structure for holding a dynamically
varying number of elements, there are methods of dynamically iterating
through its elements.  A tuple/record holds a static set of elements and
you can not iterate its elements dynamically---if there was some form of
introspection available in PG this would of course be false.

What you're after is an array; I'm just saying that if you add another
step in before you actually receive that array the language will be
strictly more powerful and general.  This step can of course be hidden
behind some utility function; but, IMHO, "array" is a very important
name and the general function should probably behind this name by
default and not the utility function.

 Sam