ERROR: out of memory when using aggregates over a partitioned table - Mailing list pgsql-hackers

From Matteo Beccati
Subject ERROR: out of memory when using aggregates over a partitioned table
Date
Msg-id 49FF89EF.7060404@beccati.com
Whole thread Raw
List pgsql-hackers
Hi everyone,

I'm unexpectedly getting out of memory error both with 8.3.3 and
8.4beta1 when doing something as simple as:

SELECT id, COUNT(*) AS counter, MAX(last_modified) AS last_modified FROM
foo GROUP BY id;

where foo is a partitioned table and id is a uuid column.

It looks like the HashAggregate estimate is set to a default of 200 even
though ndistinct in each partition is averaging at -0.59. As RhodiumToad
pointed out there's a comment explaining the behaviour:
* XXX This means the Var represents a column of an append* relation. Later add code to look at the member relations
and*try to derive some kind of combined statistics?
 

I just wanted to raise it as something that might happen to those using
partitions as it's very likely that a partitioned table is bigger than
the available RAM. However I don't think it happens very often that one
needs to run an aggregate query on it. I just needed it to populate a
separate table that will be kept up to date via triggers.

Here's the EXPLAIN output:
HashAggregate  (cost=1344802.32..1344805.32 rows=200 width=24)  ->  Append  (cost=0.00..969044.47 rows=50101047
width=24)       ->  Seq Scan on foo  (cost=0.00..16.60 rows=660 width=24)        ->  Seq Scan on part_0 foo
(cost=0.00..60523.89rows=3129289
 
width=24)        ->  Seq Scan on part_1 foo  (cost=0.00..60555.37 rows=3130937
width=24)        ->  Seq Scan on part_2 foo  (cost=0.00..60532.17 rows=3129717
width=24)        ->  Seq Scan on part_3 foo  (cost=0.00..60550.86 rows=3130686
width=24)        ->  Seq Scan on part_4 foo  (cost=0.00..60545.07 rows=3130407
width=24)        ->  Seq Scan on part_5 foo  (cost=0.00..60579.93 rows=3131393
width=24)        ->  Seq Scan on part_6 foo  (cost=0.00..60566.70 rows=3131470
width=24)        ->  Seq Scan on part_7 foo  (cost=0.00..60610.66 rows=3133766
width=24)        ->  Seq Scan on part_8 foo  (cost=0.00..60546.67 rows=3129667
width=24)        ->  Seq Scan on part_9 foo  (cost=0.00..60509.92 rows=3128592
width=24)        ->  Seq Scan on part_a foo  (cost=0.00..60581.25 rows=3132225
width=24)        ->  Seq Scan on part_b foo  (cost=0.00..60552.81 rows=3130781
width=24)        ->  Seq Scan on part_c foo  (cost=0.00..60621.15 rows=3134315
width=24)        ->  Seq Scan on part_d foo  (cost=0.00..60714.26 rows=3139126
width=24)        ->  Seq Scan on part_e foo  (cost=0.00..60552.85 rows=3130785
width=24)        ->  Seq Scan on part_f foo  (cost=0.00..60484.31 rows=3127231
width=24)


Cheers
--
Matteo Beccati
http://www.openx.org/


pgsql-hackers by date:

Previous
From: Tom Lane
Date:
Subject: Re: Unicode string literals versus the world
Next
From: Mark Wong
Date:
Subject: community equipment