Re: Query using SeqScan instead of IndexScan - Mailing list pgsql-performance

From Jim C. Nasby
Subject Re: Query using SeqScan instead of IndexScan
Date
Msg-id 20060331155911.GP49405@pervasive.com
Whole thread Raw
In response to Query using SeqScan instead of IndexScan  (Brendan Duddridge <brendan@clickspace.com>)
Responses Re: Query using SeqScan instead of IndexScan  (Brendan Duddridge <brendan@clickspace.com>)
List pgsql-performance
What's the correlation of category_id? The current index scan cost
estimator places a heavy penalty on anything with a correlation much
below about 90%.

On Wed, Mar 29, 2006 at 08:12:28PM -0700, Brendan Duddridge wrote:
> Hi,
>
> I have a query that is using a sequential scan instead of an index
> scan. I've turned off sequential scans and it is in fact faster with
> the index scan.
>
> Here's my before and after.
>
> Before:
>
> ssdev=# SET enable_seqscan TO DEFAULT;
> ssdev=# explain analyze select cp.product_id
>         from category_product cp, product_attribute_value pav
>         where cp.category_id = 1001082 and cp.product_id =
>         pav.product_id;
>
>
>            QUERY PLAN
> ------------------------------------------------------------------------
> ------------------------------------------------------------------------
> ------------------------------
> Hash Join  (cost=25.52..52140.59 rows=5139 width=4) (actual
> time=4.521..2580.520 rows=19695 loops=1)
>    Hash Cond: ("outer".product_id = "inner".product_id)
>    ->  Seq Scan on product_attribute_value pav  (cost=0.00..40127.12
> rows=2387312 width=4) (actual time=0.039..1469.295 rows=2385846 loops=1)
>    ->  Hash  (cost=23.10..23.10 rows=970 width=4) (actual
> time=2.267..2.267 rows=1140 loops=1)
>          ->  Index Scan using x_category_product__category_id_fk_idx
> on category_product cp  (cost=0.00..23.10 rows=970 width=4) (actual
> time=0.122..1.395 rows=1140 loops=1)
>                Index Cond: (category_id = 1001082)
> Total runtime: 2584.221 ms
> (7 rows)
>
>
> After:
>
> ssdev=# SET enable_seqscan TO false;
> ssdev=# explain analyze select cp.product_id
>         from category_product cp, product_attribute_value pav
>         where cp.category_id = 1001082 and cp.product_id =
>         pav.product_id;
>
>
>               QUERY PLAN
> ------------------------------------------------------------------------
> ------------------------------------------------------------------------
> -------------------------------------
> Nested Loop  (cost=0.00..157425.22 rows=5139 width=4) (actual
> time=0.373..71.177 rows=19695 loops=1)
>    ->  Index Scan using x_category_product__category_id_fk_idx on
> category_product cp  (cost=0.00..23.10 rows=970 width=4) (actual
> time=0.129..1.438 rows=1140 loops=1)
>          Index Cond: (category_id = 1001082)
>    ->  Index Scan using product_attribute_value__product_id_fk_idx
> on product_attribute_value pav  (cost=0.00..161.51 rows=61 width=4)
> (actual time=0.016..0.053 rows=17 loops=1140)
>          Index Cond: ("outer".product_id = pav.product_id)
> Total runtime: 74.747 ms
> (6 rows)
>
> There's quite a big difference in speed there. 2584.221 ms vs. 74.747
> ms.
>
> Any ideas what I can do to improve this without turning sequential
> scanning off?
>
> Thanks,
>
> ____________________________________________________________________
> Brendan Duddridge | CTO | 403-277-5591 x24 |  brendan@clickspace.com
>
> ClickSpace Interactive Inc.
> Suite L100, 239 - 10th Ave. SE
> Calgary, AB  T2G 0V9
>
> http://www.clickspace.com
>



--
Jim C. Nasby, Sr. Engineering Consultant      jnasby@pervasive.com
Pervasive Software      http://pervasive.com    work: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf       cell: 512-569-9461

pgsql-performance by date:

Previous
From: "Jim C. Nasby"
Date:
Subject: Re: Indexes with descending date columns
Next
From: Frederic Back
Date:
Subject: un-'vacuum analyse'