Thread: bad plan: 8.4.8, hashagg, work_mem=1MB.

bad plan: 8.4.8, hashagg, work_mem=1MB.

From: Jon Nelson
PostgreSQL 8.4.8 on CentOS 5.6, x86_64. Default settings except work_mem = 1MB.

NOTE: I am using partitioned tables here, and was querying the
'master' table. Perhaps this is a known issue.

I ran a query recently where the result was very large. The outer-most
part of the query looked like this:

 HashAggregate  (cost=56886512.96..56886514.96 rows=200 width=30)
   ->  Result  (cost=0.00..50842760.97 rows=2417500797 width=30)

The row count for 'Result' is in the right ballpark, but why does
HashAggregate think that it can turn 2 *billion* rows of strings (an
average of 30 bytes long) into only 200?  This is my primary concern.
If I don't disable hash aggregation, PostgreSQL quickly consumes huge
quantities of memory and eventually gets killed by the OOM killer.

After manually disabling hash aggregation, I ran the same query. It
has been running for over 2 days now. The disk is busy, but very
little data is actually being transferred. Total data size is approx.
250GB, perhaps a bit less.
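For reference, "manually disabling hash aggregation" can be done per
session via enable_hashagg; a minimal sketch follows (the real query
is not shown in this thread, so the SELECT below is only a stand-in
with placeholder names):

    -- Disable hash aggregation for this session only; the planner
    -- then falls back to sort-based grouping (GroupAggregate/Unique).
    SET enable_hashagg = off;

    -- Stand-in for the real query: an aggregate over the partitioned
    -- master table (actual table and column names are not shown here).
    EXPLAIN
    SELECT foo, count(*)
      FROM master_table
     GROUP BY foo;

    -- Restore the default afterwards.
    RESET enable_hashagg;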

The query scans 160 or so tables for data. If I use a distinct + union
on each table, the plan looks like this:

 Unique  (cost=357204094.44..357318730.75 rows=22927263 width=28)
   ->  Sort  (cost=357204094.44..357261412.59 rows=22927263 width=28)

23 million rows is more like it, and the cost is much lower. Is there
any possibility that distinct/unique operations could be pushed "down"
into the individual subqueries during planning, to see whether that
plan is less expensive?

In this case, PostgreSQL believes (probably correctly -- I'll let you
know) that computing the distinct values of column foo over the
combined rows of tableA, tableB, tableC, ... is more expensive than
first taking the distinct foo values from each table individually and
then de-duplicating those much smaller result sets.
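
Concretely, the two formulations being compared look roughly like
this (table and column names are placeholders; the real schema is not
shown in the thread):

    -- Formulation 1: one DISTINCT over the union of all partitions,
    -- which is effectively what querying the master table does.
    SELECT DISTINCT foo
      FROM (SELECT foo FROM table_a
            UNION ALL
            SELECT foo FROM table_b
            UNION ALL
            SELECT foo FROM table_c) AS all_rows;

    -- Formulation 2: DISTINCT pushed into each branch first, so the
    -- much smaller per-table result sets are then de-duplicated again
    -- by UNION.
    SELECT DISTINCT foo FROM table_a
    UNION
    SELECT DISTINCT foo FROM table_b
    UNION
    SELECT DISTINCT foo FROM table_c;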

--
Jon

Re: bad plan: 8.4.8, hashagg, work_mem=1MB.

From: Tom Lane
Jon Nelson <jnelson+pgsql@jamponi.net> writes:
> I ran a query recently where the result was very large. The outer-most
> part of the query looked like this:

>  HashAggregate  (cost=56886512.96..56886514.96 rows=200 width=30)
>    ->  Result  (cost=0.00..50842760.97 rows=2417500797 width=30)

> The row count for 'Result' is in the right ballpark, but why does
> HashAggregate think that it can turn 2 *billion* rows of strings (an
> average of 30 bytes long) into only 200?

200 is the default assumption about number of groups when it's unable to
make any statistics-based estimate.  You haven't shown us any details so
it's hard to say more than that.
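
One way to see whether the planner has a statistics-based distinct
estimate to work with is to look at pg_stats for the grouping column
(table and column names below are placeholders):

    -- n_distinct > 0 is an estimated absolute number of distinct
    -- values; n_distinct < 0 is minus a fraction of the row count
    -- (-1 means all values are distinct).  If no row appears here,
    -- the planner falls back to its defaults.
    SELECT schemaname, tablename, attname, n_distinct
      FROM pg_stats
     WHERE tablename = 'some_partition'
       AND attname = 'foo';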

            regards, tom lane

Re: bad plan: 8.4.8, hashagg, work_mem=1MB.

From: Jon Nelson
On Mon, Jun 20, 2011 at 11:08 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Jon Nelson <jnelson+pgsql@jamponi.net> writes:
>> I ran a query recently where the result was very large. The outer-most
>> part of the query looked like this:
>
>>  HashAggregate  (cost=56886512.96..56886514.96 rows=200 width=30)
>>    ->  Result  (cost=0.00..50842760.97 rows=2417500797 width=30)
>
>> The row count for 'Result' is in the right ballpark, but why does
>> HashAggregate think that it can turn 2 *billion* rows of strings (an
>> average of 30 bytes long) into only 200?
>
> 200 is the default assumption about number of groups when it's unable to
> make any statistics-based estimate.  You haven't shown us any details so
> it's hard to say more than that.

What sorts of details would you like? The row count for the Result
line is approximately correct -- the stats for all tables are up to
date (the tables never change after import). The statistics target is
set to 100 currently.
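
If a better estimate is wanted for a particular column, the per-column
statistics target can be raised above the 100 used here and the table
re-analyzed (names are placeholders; whether this helps depends on how
the estimates get combined across the partitions):

    -- Sample this column more heavily than the default target of 100,
    -- then refresh the statistics.
    ALTER TABLE some_partition ALTER COLUMN foo SET STATISTICS 1000;
    ANALYZE some_partition;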


--
Jon

Re: bad plan: 8.4.8, hashagg, work_mem=1MB.

From: Robert Haas
On Mon, Jun 20, 2011 at 3:31 PM, Jon Nelson <jnelson+pgsql@jamponi.net> wrote:
> On Mon, Jun 20, 2011 at 11:08 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> Jon Nelson <jnelson+pgsql@jamponi.net> writes:
>>> I ran a query recently where the result was very large. The outer-most
>>> part of the query looked like this:
>>
>>>  HashAggregate  (cost=56886512.96..56886514.96 rows=200 width=30)
>>>    ->  Result  (cost=0.00..50842760.97 rows=2417500797 width=30)
>>
>>> The row count for 'Result' is in the right ballpark, but why does
>>> HashAggregate think that it can turn 2 *billion* rows of strings (an
>>> average of 30 bytes long) into only 200?
>>
>> 200 is the default assumption about number of groups when it's unable to
>> make any statistics-based estimate.  You haven't shown us any details so
>> it's hard to say more than that.
>
> What sorts of details would you like? The row count for the Result
> line is approximately correct -- the stats for all tables are up to
> date (the tables never change after import). The statistics target is
> set to 100 currently.

The query and the full EXPLAIN output (attached as text files) would
be a good place to start....
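
For example, from psql the full plan can be captured to a file along
these lines (the SELECT is only a stand-in for the real query):

    -- Redirect psql output to a file, print the plan, then switch
    -- output back to the terminal.
    \o plan.txt
    EXPLAIN SELECT foo, count(*) FROM master_table GROUP BY foo;
    \o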

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company