Thread: Append Cost in query planners

Append Cost in query planners

From
"Nimesh Satam"
Date:
All,
 
We are trying to implement partition on one tables on date basis. the overall cost and timming and cost of the query is increasing on the Append of the child table output. As shown below:
 
->  Append  (cost=0.00..112217.92 rows=2752906 width=52) (actual time=2454.207..20712.021 rows=2752905 loops=1)
 ->  Seq Scan on trm  (cost=0.00..28570.35 rows=1 width=52) (actual time=2423.374..2423.374 rows=0 loops=1)
  Filter: ((sqldate >= '2007-06-01 00:00:00'::timestamp without time zone) AND (sqldate <= '2007-06-03 00:00:00'::timestamp without time zone))
 ->  Seq Scan on trm_d20070601 trm  (cost= 0.00..29203.41 rows=961094 width=52) (actual time=30.825..3027.217 rows=961094 loops=1)
  Filter: ((sqldate >= '2007-06-01 00:00:00'::timestamp without time zone) AND (sqldate <= '2007-06-03 00:00:00'::timestamp without time zone))
 ->  Seq Scan on trm_d20070602 trm  (cost=0.00..27442.52 rows=903168 width=52) (actual time=11.142..2687.422 rows=903168 loops=1)
  Filter: ((sqldate >= '2007-06-01 00:00:00'::timestamp without time zone) AND (sqldate <= '2007-06-03 00:00:00'::timestamp without time zone))
 ->  Seq Scan on trm_d20070603 trm  (cost=0.00..27001.64 rows=888643 width=52) (actual time=13.697..2568.012 rows=888643 loops=1)
  Filter: ((sqldate >= '2007-06-01 00:00:00'::timestamp without time zone) AND (sqldate <= '2007-06-03 00:00:00'::timestamp without time zone))

 
 
Can someone let me know, how we can reduce the overall cost and time of the append operation, and what parameters in the confirguration needs to be changed?
 
 
Let me know if you need any further information to improve the query plan.
 
 
Regards,
Nimesh.

Re: Append Cost in query planners

From
Heikki Linnakangas
Date:
Nimesh Satam wrote:
> We are trying to implement partition on one tables on date basis. the
> overall cost and timming and cost of the query is increasing on the Append
> of the child table output. As shown below:
>
> *->  Append  (cost=0.00..112217.92 rows=2752906 width=52) (actual time=
> 2454.207..20712.021 rows=2752905 loops=1)
>  ->  Seq Scan on trm  (cost=0.00..28570.35 rows=1 width=52) (actual time=
> 2423.374..2423.374 rows=0 loops=1)
>   Filter: ((sqldate >= '2007-06-01 00:00:00'::timestamp without time zone)
> AND (sqldate <= '2007-06-03 00:00:00'::timestamp without time zone))
>  ->  Seq Scan on trm_d20070601 trm  (cost=0.00..29203.41 rows=961094
> width=52) (actual time=30.825..3027.217 rows=961094 loops=1)
>   Filter: ((sqldate >= '2007-06-01 00:00:00'::timestamp without time zone)
> AND (sqldate <= '2007-06-03 00:00:00'::timestamp without time zone))
>  ->  Seq Scan on trm_d20070602 trm  (cost=0.00..27442.52 rows=903168
> width=52) (actual time=11.142..2687.422 rows=903168 loops=1)
>   Filter: ((sqldate >= '2007-06-01 00:00:00'::timestamp without time zone)
> AND (sqldate <= '2007-06-03 00:00:00'::timestamp without time zone))
>  ->  Seq Scan on trm_d20070603 trm  (cost=0.00..27001.64 rows=888643
> width=52) (actual time=13.697..2568.012 rows=888643 loops=1)
>   Filter: ((sqldate >= '2007-06-01 00:00:00'::timestamp without time zone)
> AND (sqldate <= '2007-06-03 00:00:00'::timestamp without time zone))*
>
> Can someone let me know, how we can reduce the overall cost and time of the
> append operation, and what parameters in the confirguration needs to be
> changed?

Does the query really return almost 3 million rows? If that's the case,
I'm afraid there isn't much you can do, software-wise. If not, show us
the complete query and EXPLAIN ANALYZE output.

--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com

Re: Append Cost in query planners

From
Heikki Linnakangas
Date:
Nimesh Satam wrote:
> Following is the full plan of the query using partition. Let me know if you
> need any further information.

What indexes are there on the table partitions? You didn't post the
query, but it looks like your doing a join between rpt_network and the
partitioned table. An index on the join key might help...

--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com

Re: Append Cost in query planners

From
"Nimesh Satam"
Date:
Heikki,


Thanks for your reply. Will try to do the changes and rivert back. I had one more requirement for partitioning.

I wanted to inherit two different tables for partition. Below is the query used to create the table, an crete the inheritance.


CREATE TABLE metrics_d20070601 (CHECK (sqldate = '20070601')) INHERITS (metrics, date);

Further more we are using the below mentioned query:

SELECT rs.id AS sid, rs.name AS sname, rc.id AS cid, rc.name AS cname, rc.type AS rtype, rc.act_type AS acttype, ra.id AS adid, ra.name AS avname, rch.id AS chid, rch.name AS chname, rcr.dim AS dim, SUM(metrics.imp_del) AS imp, SUM(metrics.clidel) AS cli, date.sqldate AS date, rg.id AS gid 
FROM metrics, rn CROSS JOIN date, ra, rs, rc, rch, rcr, rg
WHERE metrics.netkey = rn.key
AND rn.id = 10
AND metrics.advkey = ra.key
AND metrics.campkey = rc.key
AND metrics.skey = rs.key
AND metrics.chkey = rch.key
AND metrics.cr_key = rcr.key 
AND date.sqldate BETWEEN '6/01/2007' AND '6/01/2007'
AND metrics.gkey = rg.key 
GROUP BY date.sqldate , rs.id, rs.name, ra.id, ra.name, rc.id, rc.name, rc.rev_type , rc.act_type, rch.id, rch.name, rcr.dim, rg.id;

And the query execution plan is as below

                                                                                                                                  QUERY PLAN                                                                                                             
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 GroupAggregate  (cost=589766.28..651315.41 rows=1119075 width=127)
   ->  Sort  (cost=589766.28..592563.97 rows=1119075 width=127)
         Sort Key: public.date.sqldate, rs.id, rs.name, ra.id, ra.name, rc.id, rc.name, rc.rtype, rc.act_type, rch.id, rch.name, rcr.dim, rg.id
         ->  Hash Join  (cost=64914.87..433619.51 rows=1119075 width=127)
               Hash Cond: ("outer".adv_key = "inner"."key")
               ->  Hash Join  (cost=64419.08..402349.16 rows=1119075 width=111)
                     Hash Cond: ("outer".s_key = "inner"."key")
                     ->  Hash Join  (cost= 63827.54..368185.38 rows=1119075 width=96)
                           Hash Cond: ("outer".campkey = "inner"."key")
                           ->  Hash Join  (cost=61339.00..323731.53 rows=1119075 width=66)
                                 Hash Cond: ("outer".chkey = "inner"."key")
                                 ->  Hash Join  (cost=59480.62..293896.26 rows=1119075 width=46)
                                       Hash Cond: ("outer".cr_key = "inner"."key")
                                       ->  Hash Join  (cost=51298.73..243749.06 rows=1119075 width=48)
                                             Hash Cond: ("outer".gkey = "inner"."key")
                                             ->  Hash Join  (cost=51051.50..204334.21 rows=1119075 width=48)
                                                   Hash Cond: (("outer".netkey = "inner"."key") AND ("outer".date_key = "inner"."key"))
                                                   ->  Append  (cost=0.00..51795.56 rows=1901256 width=48)
                                                         ->  Seq Scan on metrics  (cost= 0.00..25614.71 rows=940271 width=48)
                                                         ->  Seq Scan on metrics_d20070601 metrics  (cost=0.00..26180.85 rows=960985 width=48)
                                                   ->  Hash  (cost=40615.57..40615.57 rows=960986 width=16)
                                                         ->  Nested Loop  (cost=0.00..40615.57 rows=960986 width=16)
                                                               ->  Index Scan using rpt_netw_key_idx on rn  (cost=0.00..16.92 rows=1 width=4)
                                                                     Filter: (id = 10)
                                                               ->  Append  (cost=0.00..30988.79 rows=960986 width=12)
                                                                     ->  Index Scan using rpt_dt_sqldt_idx on date  (cost=0.00..3.02 rows=1 width=12)
                                                                           Index Cond: ((sqldate >= '2007-06-01 00:00:00'::timestamp without time zone) AND (sqldate <= '2007-06-01 00:00:00'::timestamp without time zone))
                                                                     ->  Seq Scan on metrics_d20070601 rpt_date  (cost=0.00..30985.78 rows=960985 width=12)
                                                                           Filter: ((sqldate >= '2007-06-01 00:00:00'::timestamp without time zone) AND (sqldate <= '2007-06-01 00:00:00'::timestamp without time zone))
                                             ->  Hash  (cost=223.18..223.18 rows=9618 width=8)
                                                   ->  Seq Scan on rg  (cost=0.00..223.18 rows=9618 width=8)
                                       ->  Hash  (cost=7367.71..7367.71 rows=325671 width=6)
                                             ->  Seq Scan on rc  (cost=0.00..7367.71 rows=325671 width=6)
                                 ->  Hash  (cost= 1652.51..1652.51 rows=82351 width=28)
                                       ->  Seq Scan on rch  (cost=0.00..1652.51 rows=82351 width=28)
                           ->  Hash  (cost=2283.83..2283.83 rows=81883 width=38)
                                 ->  Seq Scan on rc  (cost=0.00..2283.83 rows=81883 width=38)
                     ->  Hash  (cost=520.63..520.63 rows=28363 width=23)
                           ->  Seq Scan on rs  (cost= 0.00..520.63 rows=28363 width=23)
               ->  Hash  (cost=435.63..435.63 rows=24063 width=24)
                     ->  Seq Scan on radv  (cost=0.00..435.63 rows=24063 width=24)
(41 rows)

Can you let me know how we can avoid the double looping on the metrics table. This been a big table causes the queries to slowdown.


Regards & Thanks,
Nimesh.


On 10/28/07, Heikki Linnakangas <heikki@enterprisedb.com > wrote:
Nimesh Satam wrote:
> Following is the full plan of the query using partition. Let me know if you
> need any further information.

What indexes are there on the table partitions? You didn't post the
query, but it looks like your doing a join between rpt_network and the
partitioned table. An index on the join key might help...

--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com

Re: Append Cost in query planners

From
Heikki Linnakangas
Date:
Nimesh Satam wrote:
> Heikki,
>
>
> Thanks for your reply. Will try to do the changes and rivert back. I had one
> more requirement for partitioning.
>
> I wanted to inherit two different tables for partition. Below is the query
> used to create the table, an crete the inheritance.
>
>
> CREATE TABLE metrics_d20070601 (CHECK (sqldate = '20070601')) INHERITS
> (metrics, date);
>
> Further more we are using the below mentioned query:
>
> SELECT rs.id AS sid, rs.name AS sname, rc.id AS cid, rc.name AS cname,
> rc.type AS rtype, rc.act_type AS acttype, ra.id AS adid, ra.name AS avname,
> rch.id AS chid, rch.name AS chname, rcr.dim AS dim, SUM(metrics.imp_del) AS
> imp, SUM(metrics.clidel) AS cli, date.sqldate AS date, rg.id AS gid
> FROM metrics, rn CROSS JOIN date, ra, rs, rc, rch, rcr, rg
> WHERE metrics.netkey = rn.key
> AND rn.id = 10
> AND metrics.advkey = ra.key
> AND metrics.campkey = rc.key
> AND metrics.skey = rs.key
> AND metrics.chkey = rch.key
> AND metrics.cr_key = rcr.key
> AND date.sqldate BETWEEN '6/01/2007' AND '6/01/2007'
> AND metrics.gkey = rg.key
> GROUP BY date.sqldate, rs.id, rs.name, ra.id, ra.name, rc.id, rc.name,
> rc.rev_type, rc.act_type, rch.id, rch.name, rcr.dim, rg.id;
>
> And the query execution plan is as below
>
>
> QUERY
> PLAN
>
>
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
>  GroupAggregate  (cost=589766.28..651315.41 rows=1119075 width=127)
>    ->  Sort  (cost=589766.28..592563.97 rows=1119075 width=127)
>          Sort Key: public.date.sqldate, rs.id, rs.name, ra.id, ra.name,
> rc.id, rc.name, rc.rtype, rc.act_type, rch.id, rch.name, rcr.dim, rg.id
>          ->  Hash Join  (cost=64914.87..433619.51 rows=1119075 width=127)
>                Hash Cond: ("outer".adv_key = "inner"."key")
>                ->  Hash Join  (cost=64419.08..402349.16 rows=1119075
> width=111)
>                      Hash Cond: ("outer".s_key = "inner"."key")
>                      ->  Hash Join  (cost=63827.54..368185.38 rows=1119075
> width=96)
>                            Hash Cond: ("outer".campkey = "inner"."key")
>                            ->  Hash Join
> (cost=61339.00..323731.53rows=1119075 width=66)
>                                  Hash Cond: ("outer".chkey = "inner"."key")
>                                  ->  Hash Join
> (cost=59480.62..293896.26rows=1119075 width=46)
>                                        Hash Cond: ("outer".cr_key =
> "inner"."key")
>                                        ->  Hash Join  (cost=
> 51298.73..243749.06 rows=1119075 width=48)
>                                              Hash Cond: ("outer".gkey =
> "inner"."key")
>                                              ->  Hash Join  (cost=
> 51051.50..204334.21 rows=1119075 width=48)
>                                                    Hash Cond:
> (("outer".netkey = "inner"."key") AND ("outer".date_key = "inner"."key"))
>                                                    ->  Append  (cost=
> 0.00..51795.56 rows=1901256 width=48)
>                                                          ->  Seq Scan on
> metrics  (cost=0.00..25614.71 rows=940271 width=48)
>                                                          ->  Seq Scan on
> metrics_d20070601 metrics  (cost=0.00..26180.85 rows=960985 width=48)
>                                                    ->  Hash  (cost=
> 40615.57..40615.57 rows=960986 width=16)
>                                                          ->  Nested Loop
> (cost=0.00..40615.57 rows=960986 width=16)
>                                                                ->  Index
> Scan using rpt_netw_key_idx on rn  (cost=0.00..16.92 rows=1 width=4)
>                                                                      Filter:
> (id = 10)
>                                                                ->  Append
> (cost=0.00..30988.79 rows=960986 width=12)
>                                                                      ->
> Index Scan using rpt_dt_sqldt_idx on date  (cost=0.00..3.02 rows=1 width=12)
>
> Index Cond: ((sqldate >= '2007-06-01 00:00:00'::timestamp without time zone)
> AND (sqldate <= '2007-06-01 00:00:00'::timestamp without time zone))
>                                                                      ->  Seq
> Scan on metrics_d20070601 rpt_date  (cost=0.00..30985.78 rows=960985
> width=12)
>
> Filter: ((sqldate >= '2007-06-01 00:00:00'::timestamp without time zone) AND
> (sqldate <= '2007-06-01 00:00:00'::timestamp without time zone))
>                                              ->  Hash
> (cost=223.18..223.18rows=9618 width=8)
>                                                    ->  Seq Scan on rg
> (cost=0.00..223.18 rows=9618 width=8)
>                                        ->  Hash
> (cost=7367.71..7367.71rows=325671 width=6)
>                                              ->  Seq Scan on rc  (cost=
> 0.00..7367.71 rows=325671 width=6)
>                                  ->  Hash  (cost=1652.51..1652.51 rows=82351
> width=28)
>                                        ->  Seq Scan on rch  (cost=
> 0.00..1652.51 rows=82351 width=28)
>                            ->  Hash  (cost=2283.83..2283.83 rows=81883
> width=38)
>                                  ->  Seq Scan on rc
> (cost=0.00..2283.83rows=81883 width=38)
>                      ->  Hash  (cost=520.63..520.63 rows=28363 width=23)
>                            ->  Seq Scan on rs  (cost=0.00..520.63 rows=28363
> width=23)
>                ->  Hash  (cost=435.63..435.63 rows=24063 width=24)
>                      ->  Seq Scan on radv  (cost=0.00..435.63 rows=24063
> width=24)
> (41 rows)
>
> Can you let me know how we can avoid the double looping on the metrics
> table. This been a big table causes the queries to slowdown.

Well, if the index on metrics.netkey helps, it doesn't matter if it's
scanned twice.

On a query with that many tables involved, you should try raising
join_collapse_limit from the default. That query accesses 9 tables,
which is just above the default join_collapse_limit of 8, so the planner
is not considering all possible join orders.

--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com

Re: Append Cost in query planners

From
"Nimesh Satam"
Date:
Heikki,


Thanks for the information. join_collapse_limit = 1 is already set before sending the query plan.

Will a index scan on metrics.netkey help in improving the performance what other configuration parameters should we consider while opting for partition?


Regards,
Nimesh.

On 10/29/07, Heikki Linnakangas <heikki@enterprisedb.com> wrote:
Nimesh Satam wrote:
> Heikki,
>
>
> Thanks for your reply. Will try to do the changes and rivert back. I had one
> more requirement for partitioning.
>
> I wanted to inherit two different tables for partition. Below is the query
> used to create the table, an crete the inheritance.
>
>
> CREATE TABLE metrics_d20070601 (CHECK (sqldate = '20070601')) INHERITS
> (metrics, date);
>
> Further more we are using the below mentioned query:
>
> SELECT rs.id AS sid, rs.name AS sname, rc.id AS cid, rc.name AS cname,
> rc.type AS rtype, rc.act_type AS acttype, ra.id AS adid, ra.name AS avname,
> rch.id AS chid, rch.name AS chname, rcr.dim AS dim, SUM(metrics.imp_del) AS
> imp, SUM(metrics.clidel) AS cli, date.sqldate AS date, rg.id AS gid
> FROM metrics, rn CROSS JOIN date, ra, rs, rc, rch, rcr, rg
> WHERE metrics.netkey = rn.key
> AND rn.id = 10
> AND metrics.advkey = ra.key
> AND metrics.campkey = rc.key
> AND metrics.skey = rs.key
> AND metrics.chkey = rch.key
> AND metrics.cr_key = rcr.key
> AND date.sqldate BETWEEN '6/01/2007' AND '6/01/2007'
> AND metrics.gkey = rg.key
> GROUP BY date.sqldate, rs.id , rs.name, ra.id, ra.name, rc.id, rc.name,
> rc.rev_type, rc.act_type, rch.id, rch.name, rcr.dim, rg.id;
>
> And the query execution plan is as below
>
>
> QUERY
> PLAN
>
> ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
>  GroupAggregate  (cost=589766.28..651315.41 rows=1119075 width=127)
>    ->  Sort  (cost=589766.28..592563.97 rows=1119075 width=127)
>          Sort Key: public.date.sqldate, rs.id, rs.name, ra.id, ra.name,
> rc.id, rc.name, rc.rtype, rc.act_type , rch.id, rch.name, rcr.dim, rg.id
>          ->  Hash Join  (cost=64914.87..433619.51 rows=1119075 width=127)
>                Hash Cond: ("outer".adv_key = "inner"."key")
>                ->  Hash Join  (cost=64419.08..402349.16 rows=1119075
> width=111)
>                      Hash Cond: ("outer".s_key = "inner"."key")
>                      ->  Hash Join  (cost= 63827.54..368185.38 rows=1119075
> width=96)
>                            Hash Cond: ("outer".campkey = "inner"."key")
>                            ->  Hash Join
> (cost= 61339.00..323731.53rows=1119075 width=66)
>                                  Hash Cond: ("outer".chkey = "inner"."key")
>                                  ->  Hash Join
> (cost= 59480.62..293896.26rows=1119075 width=46)
>                                        Hash Cond: ("outer".cr_key =
> "inner"."key")
>                                        ->  Hash Join  (cost=
> 51298.73..243749.06 rows=1119075 width=48)
>                                              Hash Cond: ("outer".gkey =
> "inner"."key")
>                                              ->  Hash Join  (cost=
> 51051.50..204334.21 rows=1119075 width=48)
>                                                    Hash Cond:
> (("outer".netkey = "inner"."key") AND ("outer".date_key = "inner"."key"))
>                                                    ->  Append  (cost=
> 0.00..51795.56 rows=1901256 width=48)
>                                                          ->  Seq Scan on
> metrics  (cost= 0.00..25614.71 rows=940271 width=48)
>                                                          ->  Seq Scan on
> metrics_d20070601 metrics  (cost=0.00..26180.85 rows=960985 width=48)
>                                                    ->  Hash  (cost=
> 40615.57..40615.57 rows=960986 width=16)
>                                                          ->  Nested Loop
> (cost=0.00..40615.57 rows=960986 width=16)
>                                                                ->  Index
> Scan using rpt_netw_key_idx on rn  (cost=0.00..16.92 rows=1 width=4)
>                                                                      Filter:
> (id = 10)
>                                                                ->  Append
> (cost=0.00..30988.79 rows=960986 width=12)
>                                                                      ->
> Index Scan using rpt_dt_sqldt_idx on date  (cost=0.00..3.02 rows=1 width=12)
>
> Index Cond: ((sqldate >= '2007-06-01 00:00:00'::timestamp without time zone)
> AND (sqldate <= '2007-06-01 00:00:00'::timestamp without time zone))
>                                                                      ->  Seq
> Scan on metrics_d20070601 rpt_date  (cost=0.00..30985.78 rows=960985
> width=12)
>
> Filter: ((sqldate >= '2007-06-01 00:00:00'::timestamp without time zone) AND
> (sqldate <= '2007-06-01 00:00:00'::timestamp without time zone))
>                                              ->  Hash
> (cost=223.18..223.18rows=9618 width=8)
>                                                    ->  Seq Scan on rg
> (cost=0.00..223.18 rows=9618 width=8)
>                                        ->  Hash
> (cost=7367.71..7367.71rows=325671 width=6)
>                                              ->  Seq Scan on rc  (cost=
> 0.00..7367.71 rows=325671 width=6)
>                                  ->  Hash  (cost=1652.51..1652.51 rows=82351
> width=28)
>                                        ->  Seq Scan on rch  (cost=
> 0.00..1652.51 rows=82351 width=28)
>                            ->  Hash  (cost=2283.83..2283.83 rows=81883
> width=38)
>                                  ->  Seq Scan on rc
> (cost=0.00..2283.83rows=81883 width=38)
>                      ->  Hash  (cost= 520.63..520.63 rows=28363 width=23)
>                            ->  Seq Scan on rs  (cost=0.00..520.63 rows=28363
> width=23)
>                ->  Hash  (cost=435.63..435.63 rows=24063 width=24)
>                      ->  Seq Scan on radv  (cost=0.00..435.63 rows=24063
> width=24)
> (41 rows)
>
> Can you let me know how we can avoid the double looping on the metrics
> table. This been a big table causes the queries to slowdown.

Well, if the index on metrics.netkey helps, it doesn't matter if it's
scanned twice.

On a query with that many tables involved, you should try raising
join_collapse_limit from the default. That query accesses 9 tables,
which is just above the default join_collapse_limit of 8, so the planner
is not considering all possible join orders.

--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com

Re: Append Cost in query planners

From
Heikki Linnakangas
Date:
Nimesh Satam wrote:
> Thanks for the information. join_collapse_limit = 1 is already set before
> sending the query plan.

I assume that was a typo, and you really set it to 10 or higher...

> Will a index scan on metrics.netkey help in improving the performance what
> other configuration parameters should we consider while opting for
> partition?

Whether it helps or not depends on many things, like the distribution of
the data, but it's worth trying. I'd try that first, before worrying
about config parameters.

--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com