Home > mailing lists

Re: postmaster consuming /lots/ of memory with hash aggregate. why? - Mailing list pgsql-performance

From	Pierre C
Subject	Re: postmaster consuming /lots/ of memory with hash aggregate. why?
Date	November 6, 2010 05:58:09
Msg-id	op.vlqq6vxteorkce@apollo13 Whole thread
In response to	postmaster consuming /lots/ of memory with hash aggregate. why? (Jon Nelson <jnelson+pgsql@jamponi.net>)
Responses	Re: postmaster consuming /lots/ of memory with hash aggregate. why?
List	pgsql-performance

Tree view

> 2. Why do both HashAggregate and GroupAggregate say the cost estimate
> is 40000 rows?

I've reproduced this :


CREATE TABLE popo AS SELECT (x%1000) AS a,(x%1001) AS b FROM
generate_series( 1,1000000 ) AS x;
VACUUM ANALYZE popo;
EXPLAIN ANALYZE SELECT a,b,count(*) FROM (SELECT * FROM popo UNION ALL
SELECT * FROM popo) AS foo GROUP BY a,b;
                                                          QUERY PLAN

----------------------------------------------------------------------------------------------------------------------------
  HashAggregate  (cost=43850.00..44350.00 rows=40000 width=8) (actual
time=1893.441..2341.780 rows=1000000 loops=1)
    ->  Append  (cost=0.00..28850.00 rows=2000000 width=8) (actual
time=0.025..520.581 rows=2000000 loops=1)
          ->  Seq Scan on popo  (cost=0.00..14425.00 rows=1000000 width=8)
(actual time=0.025..142.639 rows=1000000 loops=1)
          ->  Seq Scan on popo  (cost=0.00..14425.00 rows=1000000 width=8)
(actual time=0.003..114.257 rows=1000000 loops=1)
  Total runtime: 2438.741 ms
(5 lignes)

Temps : 2439,247 ms

I guess the row count depends on the correlation of a and b, which pg has
no idea about. In the first example, there is no correlation, now with
full correlation :


UPDATE popo SET a=b;
VACUUM FULL popo;
VACUUM FULL popo;
ANALYZE popo;
EXPLAIN ANALYZE SELECT a,b,count(*) FROM (SELECT * FROM popo UNION ALL
SELECT * FROM popo) AS foo GROUP BY a,b;
                                                          QUERY PLAN

----------------------------------------------------------------------------------------------------------------------------
  HashAggregate  (cost=43850.00..44350.00 rows=40000 width=8) (actual
time=1226.201..1226.535 rows=1001 loops=1)
    ->  Append  (cost=0.00..28850.00 rows=2000000 width=8) (actual
time=0.008..518.068 rows=2000000 loops=1)
          ->  Seq Scan on popo  (cost=0.00..14425.00 rows=1000000 width=8)
(actual time=0.007..128.609 rows=1000000 loops=1)
          ->  Seq Scan on popo  (cost=0.00..14425.00 rows=1000000 width=8)
(actual time=0.005..128.502 rows=1000000 loops=1)
  Total runtime: 1226.797 ms

pgsql-performance by date:

From: Jon Nelson
Date: 05 November 2010, 21:27:17
Subject: postmaster consuming /lots/ of memory with hash aggregate. why?

From: Jon Nelson
Date: 06 November 2010, 15:37:42
Subject: Re: postmaster consuming /lots/ of memory with hash aggregate. why?

Re: postmaster consuming /lots/ of memory with hash aggregate. why? - Mailing list pgsql-performance

Previous

Next