Re: postmaster consuming /lots/ of memory with hash aggregate. why? - Mailing list pgsql-performance

From Jon Nelson
Subject Re: postmaster consuming /lots/ of memory with hash aggregate. why?
Msg-id AANLkTikv9BT3=5C1aXCJomej0gQxTkv3TX1DeiMjA-P-@mail.gmail.com
In response to Re: postmaster consuming /lots/ of memory with hash aggregate. why?  (Pavel Stehule <pavel.stehule@gmail.com>)
Responses Re: postmaster consuming /lots/ of memory with hash aggregate. why?  (Pavel Stehule <pavel.stehule@gmail.com>)
List pgsql-performance
On Thu, Nov 11, 2010 at 10:38 PM, Pavel Stehule <pavel.stehule@gmail.com> wrote:
> 2010/11/12 Jon Nelson <jnelson+pgsql@jamponi.net>:
>> On Thu, Nov 11, 2010 at 10:26 PM, Pavel Stehule <pavel.stehule@gmail.com> wrote:
>>> Hello
>>>
>>> Look at the output of EXPLAIN ANALYZE. Your statistics are probably out
>>> of date, which can confuse the planner. EXPLAIN ANALYZE will show it.
>>
>> As I noted earlier, I did set statistics to 1000 and re-ran vacuum
>> analyze and the plan did not change.
>
> This change may do nothing; it is only the default in the config. Did you
> use ALTER TABLE ALTER COLUMN SET STATISTICS ... and then ANALYZE?

No. To be clear: are you saying that changing the value for
default_statistics_target, restarting postgresql, and re-running
VACUUM ANALYZE does *not* change the statistics for columns
created/populated *prior* to the sequence of operations, and that one
/must/ use ALTER TABLE ALTER COLUMN SET STATISTICS ... and re-ANALYZE?

That does not jibe with the documentation, which appears to suggest
that setting a new default_statistics_target, restarting postgresql,
and then re-ANALYZE'ing a table should be sufficient (provided the
columns have not had a statistics target explicitly set).
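
To make the comparison concrete, here is a minimal sketch of the two
approaches under discussion. The table and column names (my_table, my_col)
are hypothetical placeholders, not the ones from my actual schema:

```sql
-- Hypothetical names (my_table, my_col), for illustration only.

-- Approach 1: raise the default target, then re-analyze.
-- Shown here at session level; the same parameter can be set in postgresql.conf.
SET default_statistics_target = 1000;
ANALYZE my_table;

-- Approach 2: set an explicit per-column target, then re-analyze.
-- A per-column target takes precedence over default_statistics_target.
ALTER TABLE my_table ALTER COLUMN my_col SET STATISTICS 1000;
ANALYZE my_table;

-- Check which columns carry an explicit target. In releases contemporary
-- with this thread, -1 means "use default_statistics_target".
SELECT attname, attstattarget
FROM pg_attribute
WHERE attrelid = 'my_table'::regclass
  AND attnum > 0
  AND NOT attisdropped;
```

If the last query reports no explicit target for the relevant columns, my
reading of the documentation is that approach 1 alone should be enough.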

>> What other diagnostics can I provide? This still doesn't answer the
>> 40000 row question, though. It seems absurd to me that the planner
>> would give up and just use 40000 rows (0.02 percent of the actual
>> result).
>>
>
> There may be some operation that is not well supported, in which case the
> planner uses a fixed percentage of the rows instead of a statistics-based
> estimate.

The strange thing is that the value 40000 keeps popping up in totally
different contexts, with different tables, databases, etc... I tried
digging through the code and the only thing I found was that numGroups
was being set to 40000 but I couldn't see where.

--
Jon
