Thread: array_agg performance
Using postgres 8.4.1, I thought the upgrade from 8.4.0 would help with array_agg.
Using ARRAY() with a subselect is yielding results hundreds times faster than array_agg even though its plan looks much worse…
And the entire system is seriously slugglish and near non-responsive while running this with array_agg. Eventually I just have to kill it, after running for several hours array_agg one still won’t finish.
Any ideas as to why? Queries and plans below.
SELECT
trip_id,
ARRAY(select customer_upload_id from details_child_current as d2 where d1.trip_id = d2.trip_id) as c1,
array_to_string(
ARRAY(SELECT
customer_upload_id::text||'=>"'||
replace(
'detail_id=>'||case when detail_id is null then 'null' else quote_ident(detail_id::text) end||
',cp=>'||case when cp is null then 'null' else quote_ident(cp::text) end||
',ap=>'||case when ap is null then 'null' else quote_ident(ap::text) end||
',re=>'||case when re is null then 'null' else quote_ident(re::text) end||
',fleetid=>'||case when fleetid is null then 'null' else quote_ident(fleetid::text) end
,'"',$$\"$$)||'"'
FROM details_child_current as d2 where d1.trip_id = d2.trip_id)
,',')::hstore
as c2
FROM
details_child_current as d1;
"Seq Scan on details_child_current d1 (cost=0.00..6295788.14 rows=376423 width=8)"
" SubPlan 1"
" -> Index Scan using details_current_trip_id on details_child_current d2 (cost=0.00..8.31 rows=1 width=4)"
" Index Cond: ($0 = trip_id)"
" SubPlan 2"
" -> Index Scan using details_current_trip_id on details_child_current d2 (cost=0.00..8.38 rows=1 width=45)"
" Index Cond: ($0 = trip_id)"
SELECT
trip_id,
array_agg(customer_upload_id) as c1,
array_to_string(
array_agg(customer_upload_id::text||'=>"'||
replace(
'detail_id=>'||case when detail_id is null then 'null' else quote_ident(detail_id::text) end||
',cp=>'||case when cp is null then 'null' else quote_ident(cp::text) end||
',ap=>'||case when ap is null then 'null' else quote_ident(ap::text) end||
',re=>'||case when re is null then 'null' else quote_ident(re::text) end||
',fleetid=>'||case when fleetid is null then 'null' else quote_ident(fleetid::text) end
,'"',$$\"$$)||'"'
)
,',')::hstore
as c2
FROM
details_child_current as d1
group by trip_id;
"GroupAggregate (cost=0.00..73447.71 rows=346009 width=53)"
" -> Index Scan using details_current_trip_id on details_child_current d1 (cost=0.00..38618.70 rows=376423 width=53)"
Chris Spotts
Programmer / Analyst
Transcore
christopher.spotts@transcore.com
"Spotts, Christopher" <Christopher.Spotts@TransCore.com> writes: > Using postgres 8.4.1, I thought the upgrade from 8.4.0 would help with > array_agg. > Using ARRAY() with a subselect is yielding results hundreds times faster > than array_agg even though its plan looks much worse... Huh, you sure your server is on 8.4.1? I can't reproduce any memory leak here (not that you've provided a self-contained example). regards, tom lane