Thread: SQL question on chunking aggregates

SQL question on chunking aggregates

From
Owen Hartnett
Date:
Hi all:

I have a table that has multiple records for a single owner_id.  I'm able to use array_arg to combine the records into a single row, which works fine.  I'm using this sql:

select owner_id, array_agg(trim(maplot)), array_agg(revallandvalue + revalbuildingvalues) from parcel group by owner_id;

Which results in the following (sometimes there's only one record per aggregate, sometimes multiple):

1030600;"{"154    191"}";"{244690}"
1030900;"{"22    202"}";"{217210}"
1031130;"{"113    135","113    138","113    132","113    130","113    133","113    127","113    126","113    131","113    129","113    136","113    125","113    137","113    134","113    128"}";"{7700,7700,7700,7700,7700,7700,7700,7700,7700,191770,7700,7700,7700,7700}"

What I want to do, is where there are more than 5 rows involved in the aggregate, as in the last example, to split it into multiple rows of 5 aggregated rows.  It's for a mailing list and I want to combine like addresses into one record, but if I'm over 5, I have to print the rest on a separate letter.

1031130;"{"113    135","113    138","113    132","113    130","113    133"}";"{7700,7700,7700,7700,7700}"
1031130;"{"113    127","113    126","113    131","113    129","113    136"}";"{7700, 7700,7700,7700,191770}"
1031130;"{"113    125","113    137","113    134","113    128"}";"{7700,7700,7700,7700}"

 It looks like I should be able to use the window function to do this, but I've been unsuccessful.  The following runs, but doesn't seem to have any effect:

select owner_id, array_agg(trim(maplot)), array_agg(revallandvalue + revalbuildingvalues) from parcel group by owner_id
window mywindow as (rows between current row and 5 following);

Does anyone have any suggestions on what I should try?

-Owen

Re: SQL question on chunking aggregates

From
Merlin Moncure
Date:
On Tue, Mar 4, 2014 at 1:49 PM, Owen Hartnett <owen@clipboardinc.com> wrote:
> Hi all:
>
> I have a table that has multiple records for a single owner_id.  I'm able to
> use array_arg to combine the records into a single row, which works fine.
> I'm using this sql:
>
> select owner_id, array_agg(trim(maplot)), array_agg(revallandvalue +
> revalbuildingvalues) from parcel group by owner_id;
>
> Which results in the following (sometimes there's only one record per
> aggregate, sometimes multiple):
>
> 1030600;"{"154    191"}";"{244690}"
> 1030900;"{"22    202"}";"{217210}"
> 1031130;"{"113    135","113    138","113    132","113    130","113
> 133","113    127","113    126","113    131","113    129","113    136","113
> 125","113    137","113    134","113
> 128"}";"{7700,7700,7700,7700,7700,7700,7700,7700,7700,191770,7700,7700,7700,7700}"
>
> What I want to do, is where there are more than 5 rows involved in the
> aggregate, as in the last example, to split it into multiple rows of 5
> aggregated rows.  It's for a mailing list and I want to combine like
> addresses into one record, but if I'm over 5, I have to print the rest on a
> separate letter.
>
> 1031130;"{"113    135","113    138","113    132","113    130","113
> 133"}";"{7700,7700,7700,7700,7700}"
> 1031130;"{"113    127","113    126","113    131","113    129","113
> 136"}";"{7700, 7700,7700,7700,191770}"
> 1031130;"{"113    125","113    137","113    134","113
> 128"}";"{7700,7700,7700,7700}"
>
>  It looks like I should be able to use the window function to do this, but
> I've been unsuccessful.  The following runs, but doesn't seem to have any
> effect:
>
> select owner_id, array_agg(trim(maplot)), array_agg(revallandvalue +
> revalbuildingvalues) from parcel group by owner_id
> window mywindow as (rows between current row and 5 following);
>
> Does anyone have any suggestions on what I should try?
>
> -Owen

I didn't test it, but something along the lines of:

select
  owner_id,
  array_agg(maplot),
  array_agg(totalvalues)
from
(
  select
    owner_id,
    trim(maplot) as maplot,
    revallandvalue + revalbuildingvalues as totalvalues,
    row_number() over (partition by owner_id) as n
  from parcel
) q
group by owner_id, (n - 1)/5;

merlin


Re: SQL question on chunking aggregates

From
David Johnston
Date:
Merlin Moncure-2 wrote
> On Tue, Mar 4, 2014 at 1:49 PM, Owen Hartnett <

> owen@

> > wrote:
>
>>  It looks like I should be able to use the window function to do this,
>> but
>> I've been unsuccessful.  The following runs, but doesn't seem to have any
>> effect:
>>
>> select owner_id, array_agg(trim(maplot)), array_agg(revallandvalue +
>> revalbuildingvalues) from parcel group by owner_id
>> window mywindow as (rows between current row and 5 following);
>>
>> Does anyone have any suggestions on what I should try?
>>
>> -Owen
>
> I didn't test it, but something along the lines of:
>
> select
>   owner_id,
>   array_agg(maplot),
>   array_agg(totalvalues)
> from
> (
>   select
>     owner_id,
>     trim(maplot) as maplot,
>     revallandvalue + revalbuildingvalues as totalvalues,
>     row_number() over (partition by owner_id) as n
>   from parcel
> ) q
> group by owner_id, (n - 1)/5;

Yeah, a window cannot work because it cannot be defined to provide disjoint
subsets.

In most cases multiple invocations of array_agg(...) - at the same level in
a query - will see the same row order but that is not something that it is
wise to rely upon.  Any time you want to have synchronized array_agg(...)
calls you should add identical explicit ORDER BY clauses to them; or better
yet combine that data into a custom datatype and then store that in the
array.

The solution is as Merlin presents; you need to use integer division to
bucket the rows and then call the array_agg(...) using those groups.  I like
to keep the bucket ID around in order to capture the original order but as
shown it is not a requirement.

David J.






--
View this message in context:
http://postgresql.1045698.n5.nabble.com/SQL-question-on-chunking-aggregates-tp5794680p5794694.html
Sent from the PostgreSQL - general mailing list archive at Nabble.com.