Re: slow query - Mailing list pgsql-performance

From Anj Adu
Subject Re: slow query
Date
Msg-id AANLkTin5CmUjXSANfUxp5f5Mo-HBa674HIV3j23ashse@mail.gmail.com
Whole thread Raw
In response to Re: slow query  (Anj Adu <fotographs@gmail.com>)
Responses Re: slow query  (Anj Adu <fotographs@gmail.com>)
List pgsql-performance
Does the difference in expected and actual rows as seen by the planner
a big factor? Even after an analyze...the results are similar. (there
is a big diff between expected and actual)
Partition has 25 million rows

On Fri, Jun 4, 2010 at 10:41 AM, Anj Adu <fotographs@gmail.com> wrote:
> 2010/6/4  <tv@fuzzy.cz>:
>>> I am reposting as my original query was mangled
>>>
>>> The link to the explain plan is here as it does not paste well into
>>> the email body.
>>>
>>> http://explain.depesz.com/s/kHa
>>>
>>>
>>> The machine is a 2 cpu quad core 5430 with 32G RAM and 6x450G 15K
>>> single raid-10 array
>>>
>>> 1G work_mem
>>> default_statistics_target=1000
>>> random_page_cost=1
>>
>> Are you sure it's wise to set the work_mem to 1G? Do you really need it?
>> Don't forget this is not a 'total' or 'per query' - each query may
>> allocate multiple work areas (and occupy multiple GB). But I guess this
>> does not cause the original problem.
>>
>> The last row 'random_page_cost=1' - this basically says that reading data
>> by random is just as cheap as reading data sequentially. Which may result
>> in poor performance due to bad plans. Why have you set this value?
>>
>> Sure, there are rare cases where 'random_page_cost=1' is OK.
>
> The default for 8.4 is 2
> I tried with 2 and 1..but the results are not very different. I
> understand that for fast disks (which we have with a decent Raid 10
> setup)..the random_page_cost can be lowered as needed..but I guess it
> did not make a difference here.
>
>
>>
>>>
>>> I am curious why the hash join takes so long. The main table
>>> dev4_act_dy_fact_2010_05_t has 25 million rows. The table is
>>> partitioned into 3 parts per month. Remaining tables are very small (
>>> < 1000 rows)
>>
>> Well, the real cause that makes your query slow is the 'index scan' part.
>>
>> Index Scan using dev4_act_dy_fact_2010_05_t3_thedate on
>> dev4_act_dy_fact_2010_05_t3 a (cost=0.00..94041.89 rows=204276 width=60)
>> (actual time=164533.725..164533.725 rows=0 loops=1)
>>
>> The first thing to note here is the difference in expected and actual
>> number of rows - the planner expects 204276 but gets 0 rows. How large is
>> this partition?
>
> The partition has 25 million rows with indexes on theDate, node_id..
> I altered the random_page_cost to 4 (1 more than the default)..still
> slow. These tables are analyzed every day
> I have an index on each field used in the where criteria,
>>
>> Try to analyze it, set the random_page_cost to something reasonable (e.g.
>> 4) and try to run the query again.
>>
>> Tomas
>>
>>
>

pgsql-performance by date:

Previous
From: Anj Adu
Date:
Subject: Re: slow query
Next
From: Bryan Hinton
Date:
Subject: Re: Performance tuning for postgres