Re: intagg memory leak - Mailing list pgsql-general

From: Tom Lane
Subject: Re: intagg memory leak
Msg-id: 27959.1212778529@sss.pgh.pa.us
In response to: intagg memory leak (Sam Mason <sam@samason.me.uk>)
List: pgsql-general
Sam Mason <sam@samason.me.uk> writes:
> I'm seeing big memory leaks when doing a query like:

>   SELECT d.source_loc_id, d.movement_date - '2006-1-1',
>     array_to_string(int_array_aggregate(l.source_ls_id),' ') AS livestockids
>   FROM movedates d, livestock_locations l
>   WHERE l.source_loc_id = d.source_loc_id
>     AND d.movement_date BETWEEN l.start_date AND COALESCE(l.end_date,'2500-1-1')
>   GROUP BY d.source_loc_id, d.movement_date
>   ORDER BY d.movement_date, d.source_loc_id;

> Explain gives the following reasonable plan:

>  Sort  (cost=340841771.28..340843520.38 rows=699639 width=12)
>    Sort Key: d.movement_date, d.source_loc_id
>    ->  HashAggregate  (cost=340761605.76..340773849.45 rows=699639 width=12)
>          ->  Merge Join  (cost=19364430.15..327907117.88 rows=1713931718 width=12)

Are the row estimates good?  What you're expecting the thing to do is
aggregate 1.7 billion integers, which would take about 7GB even assuming
zero overhead.  I don't think there's any "memory leak", it's just that
the hash aggregate table is bigger than your machine can stand.
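As a back-of-the-envelope check, assuming 4-byte integers and taking the
join's row estimate at face value:

    1,713,931,718 rows * 4 bytes per int  ~=  6.9 GB

and that's before counting any hash-table entry or per-group state overhead.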

If there are indeed a lot of groups, you could fix the problem by
disabling hash aggregation:
    set enable_hashagg to off
at the cost of having to sort before instead of after the aggregation.
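For example, a minimal sketch reusing your query (afterwards EXPLAIN should
show a GroupAggregate fed by a Sort instead of a HashAggregate; the actual
plan will depend on your data and settings):

    set enable_hashagg to off;
    EXPLAIN
    SELECT d.source_loc_id, d.movement_date - '2006-1-1',
      array_to_string(int_array_aggregate(l.source_ls_id),' ') AS livestockids
    FROM movedates d, livestock_locations l
    WHERE l.source_loc_id = d.source_loc_id
      AND d.movement_date BETWEEN l.start_date AND COALESCE(l.end_date,'2500-1-1')
    GROUP BY d.source_loc_id, d.movement_date
    ORDER BY d.movement_date, d.source_loc_id;
    reset enable_hashagg;   -- restore the default once you're done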

(The reason the planner doesn't figure this out for itself is that it
has no good idea of the amount of workspace needed by each aggregate.
Maybe we need to be more pessimistic about the likely size of array-type
state values...)

            regards, tom lane
