Re: pretty bad n_distinct estimate, causing HashAgg OOM on TPC-H - Mailing list pgsql-hackers

From	Tomas Vondra
Subject	Re: pretty bad n_distinct estimate, causing HashAgg OOM on TPC-H
Date	June 20, 2015 15:49:47
Msg-id	55858B8B.3080309@2ndquadrant.com
In response to	Re: pretty bad n_distinct estimate, causing HashAgg OOM on TPC-H (Feng Tian <ftian@vitessedata.com>)
List	pgsql-hackers

Hi,

On 06/20/2015 05:29 PM, Feng Tian wrote:
>
> I have not read Jeff's patch, but here is how I think hash agg should work,
>
> Hash agg scan lineitem table, perform aggregation in memory.   Once
> workmem is exhausted, it write intermediate state to disk, bucket by
> bucket.  When lineitem table is finished, it reads all tuples from one
> bucket back, combining intermediate state and finalize the aggregation.
>    I saw a quite extensive discussion on combining aggregation on the
> dev list, so I assume it will be added.
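
For readers following along, the scheme described in the quoted text can be sketched roughly as below. This is only an illustration of that description, not the proposed patch and not PostgreSQL code. The function name `spilling_hash_agg`, the `max_groups_in_memory` limit (a stand-in for work_mem, counted in groups rather than bytes), the bucket count, and the use of `pickle` to serialize a (sum, count) state for an average are all assumptions made for the example.

```python
import os
import pickle
import tempfile


def spilling_hash_agg(rows, max_groups_in_memory, num_buckets=4):
    """Compute avg(value) per key, spilling partial states to disk.

    rows: iterable of (key, value) pairs.
    max_groups_in_memory: hypothetical stand-in for work_mem, in groups.
    """
    with tempfile.TemporaryDirectory() as tmpdir:
        paths = [os.path.join(tmpdir, f"bucket_{i}.bin")
                 for i in range(num_buckets)]
        files = [open(p, "wb") for p in paths]
        table = {}  # key -> [sum, count], the intermediate aggregate state

        def spill():
            # Write every in-memory state to the bucket file chosen by its
            # key's hash, then empty the in-memory table.
            for key, state in table.items():
                pickle.dump((key, state), files[hash(key) % num_buckets])
            table.clear()

        try:
            for key, value in rows:
                state = table.get(key)
                if state is None:
                    if len(table) >= max_groups_in_memory:
                        spill()  # "memory" exhausted: flush before adding
                    state = table[key] = [0, 0]
                state[0] += value
                state[1] += 1
            spill()  # input finished: flush whatever is left
        finally:
            for f in files:
                f.close()

        result = {}
        for path in paths:
            # Read one bucket back and combine partial states per key.
            combined = {}
            with open(path, "rb") as f:
                while True:
                    try:
                        key, (s, c) = pickle.load(f)
                    except EOFError:
                        break
                    acc = combined.setdefault(key, [0, 0])
                    acc[0] += s
                    acc[1] += c
            # Finalize: turn (sum, count) into the average.
            for key, (s, c) in combined.items():
                result[key] = s / c
        return result
```

Note that in this sketch a key can be spilled more than once (each time the in-memory table fills up after the key reappears), so the number of states written to disk can exceed the number of distinct groups.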

That's not really how the proposed patch works, partly because we 
don't have a good way to serialize/deserialize the aggregate state. 
There are also various corner cases where you can end up writing much 
more data than you assumed, but let's discuss that in the thread about 
the patch, not here.

regards

--
Tomas Vondra                   http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
