array_agg performance

Started by Spotts, Christopherover 16 years ago2 messagesgeneral
Jump to latest
#1Spotts, Christopher
Christopher.Spotts@TransCore.com

Using postgres 8.4.1, I thought the upgrade from 8.4.0 would help with
array_agg.

Using ARRAY() with a subselect is yielding results hundreds times faster
than array_agg even though its plan looks much worse...

And the entire system is seriously slugglish and near non-responsive
while running this with array_agg. Eventually I just have to kill it,
after running for several hours array_agg one still won't finish.

Any ideas as to why? Queries and plans below.

SELECT

trip_id,

ARRAY(select
customer_upload_id from details_child_current as d2 where d1.trip_id =
d2.trip_id) as c1,

array_to_string(

ARRAY(SELECT

customer_upload_id::text||'=>"'||

replace(

'detail_id=>'||case when detail_id is null then 'null' else
quote_ident(detail_id::text) end||

',cp=>'||case when cp is null then 'null' else quote_ident(cp::text)
end||

',ap=>'||case when ap is null then 'null' else quote_ident(ap::text)
end||

',re=>'||case when re is null then 'null' else quote_ident(re::text)
end||

',fleetid=>'||case when fleetid is null then 'null' else
quote_ident(fleetid::text) end

,'"',$$\"$$)||'"'

FROM details_child_current as d2 where d1.trip_id = d2.trip_id)

,',')::hstore

as c2

FROM

details_child_current as
d1;

"Seq Scan on details_child_current d1 (cost=0.00..6295788.14
rows=376423 width=8)"

" SubPlan 1"

" -> Index Scan using details_current_trip_id on
details_child_current d2 (cost=0.00..8.31 rows=1 width=4)"

" Index Cond: ($0 = trip_id)"

" SubPlan 2"

" -> Index Scan using details_current_trip_id on
details_child_current d2 (cost=0.00..8.38 rows=1 width=45)"

" Index Cond: ($0 = trip_id)"

SELECT

trip_id,

array_agg(customer_upload_id) as c1,

array_to_string(

array_agg(customer_upload_id::text||'=>"'||

replace(

'detail_id=>'||case when detail_id is null then 'null' else
quote_ident(detail_id::text) end||

',cp=>'||case when cp is null then 'null' else quote_ident(cp::text)
end||

',ap=>'||case when ap is null then 'null' else quote_ident(ap::text)
end||

',re=>'||case when re is null then 'null' else quote_ident(re::text)
end||

',fleetid=>'||case when fleetid is null then 'null' else
quote_ident(fleetid::text) end

,'"',$$\"$$)||'"'

)

,',')::hstore

as c2

FROM

details_child_current as
d1

group by trip_id;

"GroupAggregate (cost=0.00..73447.71 rows=346009 width=53)"

" -> Index Scan using details_current_trip_id on details_child_current
d1 (cost=0.00..38618.70 rows=376423 width=53)"

Chris Spotts

Programmer / Analyst

Transcore

christopher.spotts@transcore.com

#2Tom Lane
tgl@sss.pgh.pa.us
In reply to: Spotts, Christopher (#1)
Re: array_agg performance

"Spotts, Christopher" <Christopher.Spotts@TransCore.com> writes:

Using postgres 8.4.1, I thought the upgrade from 8.4.0 would help with
array_agg.

Using ARRAY() with a subselect is yielding results hundreds times faster
than array_agg even though its plan looks much worse...

Huh, you sure your server is on 8.4.1? I can't reproduce any memory
leak here (not that you've provided a self-contained example).

regards, tom lane