Thread: BUG #5608: array_agg() consumes too much memory
The following bug has been logged online: Bug reference: 5608 Logged by: Itagaki Takahiro Email address: itagaki.takahiro@gmail.com PostgreSQL version: 9.0beta4 Operating system: Windows 7 (32bit) Description: array_agg() consumes too much memory Details: I encountered "out of memory" error in large GROUP BY query with array_agg(). The server log was filled by the following messages: accumArrayResult: 8192 total in 1 blocks; 7800 free (0 chunks); 392 used Should we choose smaller size of initial memory in accumArrayResult()? It allocates ALLOCSET_DEFAULT_INITSIZE (=8kB) now, but only 10% of the area was used in my case. Note that work_mem is not considered in the case; array_agg() allocates 8kB * (number of groups).
"Itagaki Takahiro" <itagaki.takahiro@gmail.com> writes: > I encountered "out of memory" error in large > GROUP BY query with array_agg(). The server log > was filled by the following messages: > accumArrayResult: 8192 total in 1 blocks; 7800 free (0 chunks); 392 > used > Should we choose smaller size of initial memory in accumArrayResult()? That's not really going to help much, considering that the planner's estimated memory use per hash aggregate is only a few dozen bytes. We have to get that estimate in sync with reality or the problem will remain. Eventually it might be nice to have some sort of way to specify the estimate to use for any aggregate function --- but for a near-term fix maybe we should just hard-wire a special case for array_agg in count_agg_clauses_walker(). I'd be inclined to leave the array_agg code as-is and teach the planner to assume ALLOCSET_DEFAULT_INITSIZE per array_agg aggregate. regards, tom lane
2010/8/10 Tom Lane <tgl@sss.pgh.pa.us>: > Eventually it might be nice to have some sort of way to specify the > estimate to use for any aggregate function --- but for a near-term > fix maybe we should just hard-wire a special case for array_agg in > count_agg_clauses_walker(). The attached patch is the near-term fix; it adds ALLOCSET_DEFAULT_INITSIZE bytes to memory assumption. We might need the same adjustment for string_agg(), that consumes 1024 bytes for the transit condition. array_agg() and string_agg() are only aggregates that have "internal" for aggtranstype. -- Itagaki Takahiro
Attachment
2010/8/14 Itagaki Takahiro <itagaki.takahiro@gmail.com>: > 2010/8/10 Tom Lane <tgl@sss.pgh.pa.us>: >> Eventually it might be nice to have some sort of way to specify the >> estimate to use for any aggregate function --- but for a near-term >> fix maybe we should just hard-wire a special case for array_agg in >> count_agg_clauses_walker(). > > The attached patch is the near-term fix; it adds ALLOCSET_DEFAULT_INITSIZE > bytes to memory assumption. > > We might need the same adjustment for string_agg(), that consumes > 1024 bytes for the transit condition. array_agg() and string_agg() > are only aggregates that have "internal" for aggtranstype. So, is it better to generalize as it adds ALLOCSET_DEFAULT_INITSIZE if the transtype is internal, rather than specifying individual function OID as the patch stands? Regards, -- Hitoshi Harada
Hitoshi Harada <umi.tanuki@gmail.com> writes: > 2010/8/14 Itagaki Takahiro <itagaki.takahiro@gmail.com>: >> 2010/8/10 Tom Lane <tgl@sss.pgh.pa.us>: >>> Eventually it might be nice to have some sort of way to specify the >>> estimate to use for any aggregate function --- but for a near-term >>> fix maybe we should just hard-wire a special case for array_agg in >>> count_agg_clauses_walker(). > So, is it better to generalize as it adds ALLOCSET_DEFAULT_INITSIZE if > the transtype is internal, rather than specifying individual function > OID as the patch stands? Seems like a good idea ... it's ugly, but it seems much less likely to need maintenance. regards, tom lane
Hitoshi Harada <umi.tanuki@gmail.com> writes: > 2010/8/14 Itagaki Takahiro <itagaki.takahiro@gmail.com>: >> The attached patch is the near-term fix; it adds ALLOCSET_DEFAULT_INITSIZE >> bytes to memory assumption. >> >> We might need the same adjustment for string_agg(), that consumes >> 1024 bytes for the transit condition. array_agg() and string_agg() >> are only aggregates that have "internal" for aggtranstype. > So, is it better to generalize as it adds ALLOCSET_DEFAULT_INITSIZE if > the transtype is internal, rather than specifying individual function > OID as the patch stands? I've applied a patch following Hitoshi-san's idea. BTW, a note about the upthread patch: we don't do manual #define's for OIDs in pg_proc.h. Use fmgroids.h for that. regards, tom lane