Thread: slow query performance

slow query performance

From: Anj Adu
I can't seem to pinpoint why this query is slow. No full table scans
are being done; the hash join is taking most of the time. The table
dev4_act_action has only 3 rows.

The box is a 2-CPU quad-core Intel 5430 with 32G RAM, running Postgres 8.4.0:
1G work_mem
20G effective_cache_size
random_page_cost=1
default_statistics_target=1000

The largest table in the inner query is dev4_act_dy_fact, which is
partitioned into 3 partitions per month. Each partition has about 25
million rows. The rest of the tables are very small (100-1000 rows).

explain analyze
select ipconvert(srctest_num),CASE targetpt::character varying
            WHEN NULL::text THEN serv.targetsrv
            ELSE targetpt::character varying
        END AS targetsrv, sesstype,hits as cons,bytes, srcz.srcarea as
srcz, dstz.dstarea as dstz from
(
select srctest_num, targetpt,targetsrv_id, sesstype_id, sum(total) as
hits, sum(bin) + sum(bout) as bytes, sourcearea_id, destinationarea_id
 from dev4_act_dy_fact a, dev4_act_action act where thedate between
'2010-05-22' and '2010-05-22'
 and a.action_id = act.action_id and action in ('rejected','sess_rejected')
 and guardid_id in (select guardid_id from dev4_act_guardid where
guardid like 'cust00%')
 and node_id=(select node_id from dev4_act_node where node='10.90.100.2')
 group by srctest_num,targetpt,targetsrv_id,sesstype_id,
sourcearea_id, destinationarea_id
  order by (sum(bin) + sum(bout)) desc
 limit 1000
 ) a left outer join dev4_act_dstarea dstz on a.destinationarea_id =
dstz.dstarea_id
 left outer join dev4_act_srcarea srcz on a.sourcearea_id = srcz.srcarea_id
 left outer join  dev4_act_targetsrv serv on a.targetsrv_id = serv.targetsrv_id
 left outer join dev4_sesstype proto on a.sesstype_id = proto.sesstype_id
 order by bytes desc



"Nested Loop Left Join  (cost=95392.32..95496.13 rows=20 width=510)
(actual time=164533.831..164533.831 rows=0 loops=1)"
"  ->  Nested Loop Left Join  (cost=95392.32..95473.43 rows=20
width=396) (actual time=164533.830..164533.830 rows=0 loops=1)"
"        ->  Nested Loop Left Join  (cost=95392.32..95455.83 rows=20
width=182) (actual time=164533.829..164533.829 rows=0 loops=1)"
"              ->  Nested Loop Left Join  (cost=95392.32..95410.17
rows=20 width=186) (actual time=164533.829..164533.829 rows=0
loops=1)"
"                    ->  Limit  (cost=95392.32..95392.37 rows=20
width=52) (actual time=164533.828..164533.828 rows=0 loops=1)"
"                          InitPlan 1 (returns $0)"
"                            ->  Index Scan using dev4_act_node_uindx
on dev4_act_node  (cost=0.00..2.27 rows=1 width=4) (actual
time=0.052..0.052 rows=0 loops=1)"
"                                  Index Cond: ((node)::text =
'10.90.100.2'::text)"
"                          ->  Sort  (cost=95390.05..95390.10 rows=20
width=52) (actual time=164533.826..164533.826 rows=0 loops=1)"
"                                Sort Key: ((sum(a.bin) + sum(a.bout)))"
"                                Sort Method:  quicksort  Memory: 17kB"
"                                ->  HashAggregate
(cost=95389.22..95389.62 rows=20 width=52) (actual
time=164533.796..164533.796 rows=0 loops=1)"
"                                      ->  Nested Loop Semi Join
(cost=7.37..95388.77 rows=20 width=52) (actual
time=164533.793..164533.793 rows=0 loops=1)"
"                                            ->  Hash Join
(cost=7.37..94836.75 rows=2043 width=56) (actual
time=164533.792..164533.792 rows=0 loops=1)"
"                                                  Hash Cond:
(a.action_id = act.action_id)"
"                                                  ->  Append
(cost=2.80..94045.71 rows=204277 width=60) (actual
time=164533.790..164533.790 rows=0 loops=1)"
"                                                        ->  Bitmap
Heap Scan on dev4_act_dy_fact a  (cost=2.80..3.82 rows=1 width=60)
(actual time=0.064..0.064 rows=0 loops=1)"
"                                                              Recheck
Cond: ((node_id = $0) AND (thedate >= '2010-05-22 00:00:00'::timestamp
without time area) AND (thedate <= '2010-05-22 00:00:00'::timestamp
without time area))"
"                                                              ->
BitmapAnd  (cost=2.80..2.80 rows=1 width=0) (actual time=0.062..0.062
rows=0 loops=1)"
"
->  Bitmap Index Scan on dev4_act_dy_dm_nd_indx  (cost=0.00..1.27
rows=3 width=0) (actual time=0.062..0.062 rows=0 loops=1)"
"
    Index Cond: (node_id = $0)"
"
->  Bitmap Index Scan on dev4_act_dy_dm_cd_indx  (cost=0.00..1.28
rows=3 width=0) (never executed)"
"
    Index Cond: ((thedate >= '2010-05-22 00:00:00'::timestamp without
time area) AND (thedate <= '2010-05-22 00:00:00'::timestamp without
time area))"
"                                                        ->  Index
Scan using dev4_act_dy_fact_2010_05_t3_thedate on
dev4_act_dy_fact_2010_05_t3 a  (cost=0.00..94041.89 rows=204276
width=60) (actual time=164533.725..164533.725 rows=0 loops=1)"
"                                                              Index
Cond: ((thedate >= '2010-05-22 00:00:00'::timestamp without time area)
AND (thedate <= '2010-05-22 00:00:00'::timestamp without time area))"
"                                                              Filter:
(node_id = $0)"
"                                                  ->  Hash
(cost=4.54..4.54 rows=2 width=4) (never executed)"
"                                                        ->  Bitmap
Heap Scan on dev4_act_action act  (cost=2.52..4.54 rows=2 width=4)
(never executed)"
"                                                              Recheck
Cond: ((action)::text = ANY ('{rejected,sess_rejected}'::text[]))"
"                                                              ->
Bitmap Index Scan on dev4_act_action_uindx  (cost=0.00..2.52 rows=2
width=0) (never executed)"
"
Index Cond: ((action)::text = ANY
('{rejected,sess_rejected}'::text[]))"
"                                            ->  Index Scan using
dev4_act_guardid_pk on dev4_act_guardid  (cost=0.00..0.27 rows=1
width=4) (never executed)"
"                                                  Index Cond:
(dev4_act_guardid.guardid_id = a.guardid_id)"
"                                                  Filter:
((dev4_act_guardid.guardid)::text ~~ 'cust00%'::text)"
"                    ->  Index Scan using sys_c006248 on dev4_sesstype
proto  (cost=0.00..0.87 rows=1 width=102) (never executed)"
"                          Index Cond: (a.sesstype_id = proto.sesstype_id)"
"              ->  Index Scan using dev4_act_targetsrv_pk on
dev4_act_targetsrv serv  (cost=0.00..2.27 rows=1 width=4) (never
executed)"
"                    Index Cond: (a.targetsrv_id = serv.targetsrv_id)"
"        ->  Index Scan using dev4_act_srcarea_pk on dev4_act_srcarea
srcz  (cost=0.00..0.87 rows=1 width=222) (never executed)"
"              Index Cond: (a.sourcearea_id = srcz.srcarea_id)"
"  ->  Index Scan using dev4_act_dstarea_pk on dev4_act_dstarea dstz
(cost=0.00..0.87 rows=1 width=122) (never executed)"
"        Index Cond: (a.destinationarea_id = dstz.dstarea_id)"
"Total runtime: 164534.172 ms"

Re: slow query performance

From: Andy Colson
On 6/3/2010 12:47 PM, Anj Adu wrote:
> I cant seem to pinpoint why this query is slow . No full table scans
> are being done. The hash join is taking maximum time. The table
> dev4_act_action has only 3 rows.
>
> box is a 2 cpu quad core intel 5430 with 32G RAM... Postgres 8.4.0
> 1G work_mem
> 20G effective_cache
> random_page_cost=1
> default_statistics_target=1000
>
> The larget table  in the inner query is dev4_act_dy_fact which is
> partitioned into 3 partitions per month. Each partition has about 25
> million rows.
> The rest of the tables are very small (100- 1000 rows)
>
> explain analyze
> select ipconvert(srctest_num),CASE targetpt::character varying
>              WHEN NULL::text THEN serv.targetsrv
>              ELSE targetpt::character varying
>          END AS targetsrv, sesstype,hits as cons,bytes, srcz.srcarea as
> srcz, dstz.dstarea as dstz from
> (
> select srctest_num, targetpt,targetsrv_id, sesstype_id, sum(total) as
> hits, sum(bin) + sum(bout) as bts, sourcearea_id, destinationarea_id
>   from dev4_act_dy_fact a, dev4_act_action act where thedate between
> '2010-05-22' and '2010-05-22'
>   and a.action_id = act.action_id and action in ('rejected','sess_rejected')
>   and guardid_id in (select guardid_id from dev4_act_guardid where
> guardid like 'cust00%')
>   and node_id=(select node_id from dev4_act_node where node='10.90.100.2')
>   group by srctest_num,targetpt,targetsrv_id,sesstype_id,
> sourcearea_id, destinationarea_id
>    order by (sum(bin) + sum(bout)) desc
>   limit 1000
>   ) a left outer join dev4_act_dstarea dstz on a.destinationarea_id =
> dstz.dstarea_id
>   left outer join dev4_act_srcarea srcz on a.sourcearea_id = srcz.srcarea_id
>   left outer join  dev4_act_targetsrv serv on a.targetsrv_id = serv.targetsrv_id
>   left outer join dev4_sesstype proto on a.sesstype_id = proto.sesstype_id
>   order by bytes desc
>
>


Wow, the word wrap on that makes it hard to read... can you paste it
at the site below and send us a link?

http://explain.depesz.com


Re: slow query performance

From: Anj Adu
Link to plan

http://explain.depesz.com/s/kHa

On Thu, Jun 3, 2010 at 11:43 AM, Andy Colson <andy@squeakycode.net> wrote:
>
> Wow, the word wrap on that makes it hard to read... can you paste it here
> and send us a link?
>
> http://explain.depesz.com
>
>

Re: slow query performance

From: Robert Haas
On Thu, Jun 3, 2010 at 4:37 PM, Anj Adu <fotographs@gmail.com> wrote:
> Link to plan
>
> http://explain.depesz.com/s/kHa

Your problem is likely related to the line that's showing up in red:

Index Scan using dev4_act_dy_fact_2010_05_t3_thedate on
dev4_act_dy_fact_2010_05_t3 a (cost=0.00..94041.89 rows=204276
width=60) (actual time=164533.725..164533.725 rows=0 loops=1)
    * Index Cond: ((thedate >= '2010-05-22 00:00:00'::timestamp
without time area) AND (thedate <= '2010-05-22 00:00:00'::timestamp
without time area))
    * Filter: (node_id = $0)

This index scan is estimated to return 204,276 rows and actually
returned zero...  it might work better to rewrite this part of the
query as a join, if you can:

node_id=(select node_id from dev4_act_node where node='10.90.100.2')
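
One hedged sketch of that rewrite, using only the table and column names
already in the posted query (untested, since the schema isn't shown):

```sql
-- Replace the scalar subquery with a join so the planner can estimate
-- node_id selectivity at plan time instead of treating it as a parameter ($0).
select srctest_num, targetpt, targetsrv_id, sesstype_id,
       sum(total) as hits, sum(bin) + sum(bout) as bytes,
       sourcearea_id, destinationarea_id
from dev4_act_dy_fact a
join dev4_act_action act on a.action_id = act.action_id
join dev4_act_node n    on a.node_id   = n.node_id
where n.node = '10.90.100.2'
  and thedate between '2010-05-22' and '2010-05-22'
  and action in ('rejected', 'sess_rejected')
  and guardid_id in (select guardid_id from dev4_act_guardid
                     where guardid like 'cust00%')
group by srctest_num, targetpt, targetsrv_id, sesstype_id,
         sourcearea_id, destinationarea_id;
```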

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

Re: slow query performance

From: Tom Lane
Robert Haas <robertmhaas@gmail.com> writes:
> On Thu, Jun 3, 2010 at 4:37 PM, Anj Adu <fotographs@gmail.com> wrote:
>> Link to plan
>>
>> http://explain.depesz.com/s/kHa

> Your problem is likely related to the line that's showing up in red:

> Index Scan using dev4_act_dy_fact_2010_05_t3_thedate on
> dev4_act_dy_fact_2010_05_t3 a (cost=0.00..94041.89 rows=204276
> width=60) (actual time=164533.725..164533.725 rows=0 loops=1)
>     * Index Cond: ((thedate >= '2010-05-22 00:00:00'::timestamp
> without time area) AND (thedate <= '2010-05-22 00:00:00'::timestamp
> without time area))
>     * Filter: (node_id = $0)

"timestamp without time area"?  Somehow I think this isn't the true
unaltered output of EXPLAIN.

I'm just guessing, since we haven't been shown any table schemas,
but what it looks like to me is that the planner is using an entirely
inappropriate index in which the "thedate" column is a low-order column.
So what looks like a nice tight indexscan range is actually a full-table
indexscan.  The planner knows that this is ridiculously expensive, as
indicated by the high cost estimate.  It would be cheaper to do a
seqscan, which leads me to think the real problem here is the OP has
disabled seqscans.

It might be worth providing an index in which "thedate" is the only, or
at least the first, column.  For this particular query, an index on
node_id and thedate would actually be ideal, but that might be too
specialized.
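
A sketch of what that might look like (the index name here is made up, and
in a partitioned setup each partition would need its own copy):

```sql
-- Hypothetical composite index: node_id first for the equality test,
-- thedate second for the range, matching this query's predicates.
create index dev4_act_dy_fact_2010_05_t3_nd_dt
    on dev4_act_dy_fact_2010_05_t3 (node_id, thedate);
```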

            regards, tom lane

Re: slow query performance

From: Anj Adu
The plan is unaltered. There is a separate index on theDate as well
as one on node_id.

I have not specifically disabled sequential scans.

This query performs much better on 8.1.9 on a similar-sized table
(although random_page_cost=4 on 8.1.9 and 2 on 8.4.0).

On Wed, Jun 9, 2010 at 7:55 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> "timestamp without time area"?  Somehow I think this isn't the true
> unaltered output of EXPLAIN.
>
> I'm just guessing, since we haven't been shown any table schemas,
> but what it looks like to me is that the planner is using an entirely
> inappropriate index in which the "thedate" column is a low-order column.
> So what looks like a nice tight indexscan range is actually a full-table
> indexscan.  The planner knows that this is ridiculously expensive, as
> indicated by the high cost estimate.  It would be cheaper to do a
> seqscan, which leads me to think the real problem here is the OP has
> disabled seqscans.
>
> It might be worth providing an index in which "thedate" is the only, or
> at least the first, column.  For this particular query, an index on
> node_id and thedate would actually be ideal, but that might be too
> specialized.
>
>                        regards, tom lane
>

Re: slow query performance

From: Robert Haas
On Wed, Jun 9, 2010 at 11:17 PM, Anj Adu <fotographs@gmail.com> wrote:
> The plan is unaltered . There is a separate index on theDate as well
> as one on node_id
>
> I have not specifically disabled sequential scans.

Please do "SHOW ALL" and attach the results as a text file.
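
One way to capture that from psql (the `\o` meta-command redirects query
output to a file):

```sql
\o settings.txt
SHOW ALL;
\o
```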

> This query performs much better on 8.1.9 on a similar sized
> table.(althought the random_page_cost=4 on 8.1.9 and 2 on 8.4.0 )

Well that could certainly matter...

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

Re: slow query performance

From: Anj Adu
Attached

Thank you


On Thu, Jun 10, 2010 at 6:28 AM, Robert Haas <robertmhaas@gmail.com> wrote:
> On Wed, Jun 9, 2010 at 11:17 PM, Anj Adu <fotographs@gmail.com> wrote:
>> The plan is unaltered . There is a separate index on theDate as well
>> as one on node_id
>>
>> I have not specifically disabled sequential scans.
>
> Please do "SHOW ALL" and attach the results as a text file.
>
>> This query performs much better on 8.1.9 on a similar sized
>> table.(althought the random_page_cost=4 on 8.1.9 and 2 on 8.4.0 )
>
> Well that could certainly matter...
>
> --
> Robert Haas
> EnterpriseDB: http://www.enterprisedb.com
> The Enterprise Postgres Company
>

Attachment

Re: slow query performance

From: Robert Haas
On Thu, Jun 10, 2010 at 11:32 AM, Anj Adu <fotographs@gmail.com> wrote:
> Attached

Hmm.  Well, I'm not quite sure what's going on here, but I think you
must be using a modified version of PostgreSQL, because, as Tom
pointed out upthread, we don't have a data type called "timestamp with
time area".  It would be called "timestamp with time zone".

Can we see the index and table definitions of the relevant tables
(attached as a text file) and the size of each one (use select
pg_relation_size('name'))?
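
For example (a sketch only; the table names are taken from the posted query,
and pg_size_pretty just makes the byte counts readable):

```sql
-- List on-disk sizes of the tables involved in the slow query.
select relname, pg_size_pretty(pg_relation_size(oid))
from pg_class
where relname in ('dev4_act_dy_fact_2010_05_t3', 'dev4_act_action',
                  'dev4_act_guardid', 'dev4_act_node');
```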

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

Re: slow query performance

From: Anj Adu
You are right: the word "zone" was replaced by "area" (my bad).

Everything else is as is.

Apologies for the confusion.

On Thu, Jun 10, 2010 at 9:42 AM, Robert Haas <robertmhaas@gmail.com> wrote:
> On Thu, Jun 10, 2010 at 11:32 AM, Anj Adu <fotographs@gmail.com> wrote:
>> Attached
>
> Hmm.  Well, I'm not quite sure what's going on here, but I think you
> must be using a modified verison of PostgreSQL, because, as Tom
> pointed out upthread, we don't have a data type called "timestamp with
> time area".  It would be called "timestamp with time zone".
>
> Can we see the index and table definitions of the relevant tables
> (attached as a text file) and the size of each one (use select
> pg_relation_size('name'))?
>
> --
> Robert Haas
> EnterpriseDB: http://www.enterprisedb.com
> The Enterprise Postgres Company
>

Re: slow query performance

From: Robert Haas
On Thu, Jun 10, 2010 at 12:58 PM, Anj Adu <fotographs@gmail.com> wrote:
> you are right..the word "zone" was replaced by "area" (my bad )
>
> everything else is as is.
>
> Apologies for the confusion.

Well, two different people have asked you for the table and index
definitions now, and you haven't provided them... I think it's going
to be hard to troubleshoot this without seeing those definitions (and
also the sizes, which I asked for in my previous email).

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

Re: slow query performance

From: Anj Adu
I changed random_page_cost to 4 (it was 2 earlier) and the performance
issue is gone.

I am not clear why a page cost of 2 on really fast disks would perform badly.

Thank you for all your help and time.
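
For anyone repeating this experiment: the setting can be tried per session
before touching postgresql.conf (a sketch; substitute the real query):

```sql
-- Compare plans under both cost settings in one session.
SET random_page_cost = 2;
EXPLAIN ANALYZE SELECT ...;  -- plan chosen with cheap random I/O assumed
SET random_page_cost = 4;
EXPLAIN ANALYZE SELECT ...;  -- plan chosen with the default assumption
```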

On Thu, Jun 10, 2010 at 8:32 AM, Anj Adu <fotographs@gmail.com> wrote:
> Attached
>
> Thank you
>
>

Re: slow query performance

From: Kenneth Marshall
Hi Anj,

That is an indication that your system was modeled less accurately
with random_page_cost=2: the planner assumes random I/O is cheaper
than it really is and chooses plans based on that model. If that
assumption does not hold, the plan chosen will almost certainly be
slower for any non-trivial query. You can put a 200mph speedometer in
a VW bug but it will never go 200mph.

Regards,
Ken

On Thu, Jun 10, 2010 at 07:54:01PM -0700, Anj Adu wrote:
> I changed random_page_cost=4 (earlier 2) and the performance issue is gone
>
> I am not clear why a page_cost of 2 on really fast disks would perform badly.
>
> Thank you for all your help and time.
>

Re: slow query performance

From: Anj Adu
Is there a way to determine a reasonable value for random_page_cost
via some testing with OS commands? We have several Postgres databases,
and determining this value on a case-by-case basis may not be viable
(we may have to go with the defaults).

On Fri, Jun 11, 2010 at 5:44 AM, Kenneth Marshall <ktm@rice.edu> wrote:
> Hi Anj,
>
> That is an indication that your system was less correctly
> modeled with a random_page_cost=2 which means that the system
> will assume that random I/O is cheaper than it is and will
> choose plans based on that model. If this is not the case,
> the plan chosen will almost certainly be slower for any
> non-trivial query. You can put a 200mph speedometer in a
> VW bug but it will never go 200mph.
>
> Regards,
> Ken
>

Re: slow query performance

From: Kenneth Marshall
If you check the archives, you will see that this is not easy
to do because of the effects of caching. The default values
were actually chosen to be a good compromise between fully
cached in RAM and totally un-cached. The actual best value
depends on the size of your database, the size of its working
set, your I/O system and your memory. The best recommendation
is usually to use the default values unless you know something
about your system that moves it out of that arena.

Regards,
Ken

On Fri, Jun 11, 2010 at 06:23:31AM -0700, Anj Adu wrote:
> Is there a way to determine a reasonable value for random_page_cost
> via some testing with OS commands. We have several postgres databases
> and determining this value on a case by case basis may not be viable
> (we may have to go with the defaults)
>

Re: slow query performance

From: Matthew Wakeling
On Fri, 11 Jun 2010, Kenneth Marshall wrote:
> If you check the archives, you will see that this is not easy
> to do because of the effects of caching.

Indeed. If you were to take the value completely at face value, a modern
hard drive is capable of transferring sequential pages somewhere between
40 and 100 times faster than random pages, depending on the drive.

However, caches tend to favour index scans much more than sequential
scans, so using a value between 40 and 100 would discourage Postgres from
using indexes when they are really the most appropriate option.

Matthew

--
 A. Top Posters
 > Q. What's the most annoying thing in the world?