Home > mailing lists

Re: Optimizing nested loops in the query plan - Mailing list pgsql-general

From	Masaru Sugawara
Subject	Re: Optimizing nested loops in the query plan
Date	February 8, 2002 14:51:30
Msg-id	20020209042826.ABE3.RK73@echna.ne.jp Whole thread Raw
In response to	Optimizing nested loops in the query plan (Fran Fabrizio <ffabrizio@mmrd.com>)
List	pgsql-general

Tree view

On Wed, 06 Feb 2002 17:36:43 -0500
Fran Fabrizio <ffabrizio@mmrd.com> wrote:


> This is a query I'm working on at the moment.  It's taking about 4
> seconds to run, and I see that most of the cost is on the Nested Loop
> area of the explain output.  Is there a better way to write this query
> to improve execution time, the tables aren't that large and I feel like
> there's a good chance I can get it to run faster.   I'm not too familiar
> with which part of the query is causing the nested loop and how to
> optimize that sort of thing.  findregion() is a function I wrote which
> executes very quickly.  'select findregion(entity_id) from
> current_status' returns in subsecond time so I don't think this is the
> problem.  Thanks for any tips you can give me.
>
> explain select findregion(cs.entity_id) as region, r.name, cs.status,
> count(*) from current_status cs, region r where r.region_id =
> findregion(cs.entity_id) group by region, r.name, cs.status;


 I don't know how many rows the tables have and the query returns,
 but the more rows they have, the more chance to reduce the execution
 time there seems to be.  In case of the small data, on the other hand,
 execution time of the following query may be longer than yours.


-- If the region table doesn't have indices
(1)create index idx_region_region_id on region(region_id);

(2)explain
   select cs.entity_id as region, r.name, cs.status, count(*)
    from region r inner join
         (select distinct findregion(entity_id) as entity_id, status
          from current_status
          order by 1
         ) as cs on  r.region_id = cs.entity_id
   group by region, r.name, cs.status;

NOTICE:  QUERY PLAN:
Aggregate  (cost=13688.40..14338.40 rows=6500 width=24)
  ->  Group  (cost=13688.40..14175.90 rows=65000 width=24)
     ->  Sort  (cost=13688.40..13688.40 rows=65000 width=24)
          ->  Merge Join  (cost=7522.19..7674.94 rows=65000 width=24)
                ->  Index Scan using region_pkey on region r
                   (cost=0.00 59.00 rows=1000 width=16)
                ->  Sort  (cost=7522.19..7522.19 rows=6500 width=8)
                      ->  Subquery Scan cs  (cost=6785.54..7110.54
                                             rows=65 width=8)
                            ->  Unique  (cost=6785.54..7110.54 rows=6500
                                         with=8)
                                ->  Sort  (cost=6785.54..6785.54 rows=650
                                            width=8)
                                      ->  Seq Scan on current_status
                                        (st=0.00..1065.00 rows=65000 width=8)



-- And, if the findregion() function always returns the same result
-- when given the same arguments, you'll be able to create an index
-- on the function. The query time may become shorter.

(3)create function findregion(int4) returns int4 as '
   ...
   ' language 'plpgsql' with (iscachable);

(4)create index idx_findregion on current_status(findregion(entity_id));

(5)explain
   select cs.entity_id as region, r.name, cs.status, count(*)
    from region r inner join
         (select findregion(entity_id) as entity_id, status
          from current_status
          order by 1
         ) as cs on  r.region_id = cs.entity_id
   group by region, r.name, cs.status;

NOTICE:  QUERY PLAN:
Aggregate  (cost=106808.97..113308.97 rows=65000 width=24)
  ->  Group  (cost=106808.97..111683.97 rows=650000 width=24)
        ->  Sort  (cost=106808.97..106808.97 rows=650000 width=24)
              ->  Merge Join  (cost=9077.21..9961.21 rows=650000 width=24)
                    ->  Index Scan using region_pkey on region2 r
                         (cost=0.00..59.00 rows=1000 width=16)
                    ->  Sort  (cost=9077.21..9077.21 rows=65000 width=8)
                          ->  Subquery Scan cs  (cost=0.00..3356.67
                                          rows=65000 width=8)
                                ->  Index Scan using idx_findregion on
                                         current_status2 (cost=0.00..3356.67
                                             rows=65000 width=8)


Regards,
Masaru Sugawara

pgsql-general by date:

From: Mayan
Date: 08 February 2002, 14:22:02
Subject: First time installer !!

From: Jeff Davis
Date: 08 February 2002, 15:31:22
Subject: Re: First time installer !!

Re: Optimizing nested loops in the query plan - Mailing list pgsql-general

Previous

Next