Re: Query using SeqScan instead of IndexScan - Mailing list pgsql-performance

From Brendan Duddridge
Subject Re: Query using SeqScan instead of IndexScan
Date
Msg-id EBEB7ACF-84A8-4FCF-A481-8C087FF372D1@clickspace.com
Whole thread Raw
In response to Re: Query using SeqScan instead of IndexScan  ("Jim C. Nasby" <jnasby@pervasive.com>)
Responses Re: Query using SeqScan instead of IndexScan
List pgsql-performance
Hi Jim,

I'm not quite sure what you mean by the correlation of category_id?
The category_id is part of a compound primary key in the
category_product
table. The primary key on category_product is (category_id, product_id).

Here's the definitions of the two tables involved in the join:

             Table "public.category_product"
        Column        |         Type         | Modifiers
---------------------+----------------------+-----------
category_id         | integer              | not null
product_id          | integer              | not null
en_name_sort_order  | integer              |
fr_name_sort_order  | integer              |
merchant_sort_order | integer              |
price_sort_order    | integer              |
merchant_count      | integer              |
is_active           | character varying(5) |
Indexes:
     "x_category_product_pk" PRIMARY KEY, btree (category_id,
product_id)
     "category_product__is_active_idx" btree (is_active)
     "category_product__merchant_sort_order_idx" btree
(merchant_sort_order)
     "x_category_product__category_id_fk_idx" btree (category_id)
CLUSTER
     "x_category_product__product_id_fk_idx" btree (product_id)
Foreign-key constraints:
     "x_category_product_category_fk" FOREIGN KEY (category_id)
REFERENCES category(category_id) DEFERRABLE INITIALLY DEFERRED
     "x_category_product_product_fk" FOREIGN KEY (product_id)
REFERENCES product(product_id) DEFERRABLE INITIALLY DEFERRED



              Table "public.product_attribute_value"
            Column           |         Type          | Modifiers
----------------------------+-----------------------+-----------
attribute_id               | integer               | not null
attribute_unit_id          | integer               |
attribute_value_id         | integer               |
boolean_value              | character varying(5)  |
decimal_value              | numeric(30,10)        |
product_attribute_value_id | integer               | not null
product_id                 | integer               | not null
product_reference_id       | integer               |
status_code                | character varying(32) |
Indexes:
     "product_attribute_value_pk" PRIMARY KEY, btree
(product_attribute_value_id)
     "product_attribute_value__attribute_id_fk_idx" btree (attribute_id)
     "product_attribute_value__attribute_unit_id_fk_idx" btree
(attribute_unit_id)
     "product_attribute_value__attribute_value_id_fk_idx" btree
(attribute_value_id)
     "product_attribute_value__product_id_fk_idx" btree (product_id)
     "product_attribute_value__product_reference_id_fk_idx" btree
(product_reference_id)
Foreign-key constraints:
     "product_attribute_value_attribute_fk" FOREIGN KEY
(attribute_id) REFERENCES attribute(attribute_id) DEFERRABLE
INITIALLY DEFERRED
     "product_attribute_value_attributeunit_fk" FOREIGN KEY
(attribute_unit_id) REFERENCES attribute_unit(attribute_unit_id)
DEFERRABLE INITIALLY DEFERRED
     "product_attribute_value_attributevalue_fk" FOREIGN KEY
(attribute_value_id) REFERENCES attribute_value(attribute_value_id)
DEFERRABLE INITIALLY DEFERRED
     "product_attribute_value_product_fk" FOREIGN KEY (product_id)
REFERENCES product(product_id) DEFERRABLE INITIALLY DEFERRED
     "product_attribute_value_productreference_fk" FOREIGN KEY
(product_reference_id) REFERENCES product(product_id) DEFERRABLE
INITIALLY DEFERRED


Not sure if that helps answer your question, but the query is pretty
slow. Sometimes it takes 5 - 15 seconds depending on the category_id
specified.

Thanks,

____________________________________________________________________
Brendan Duddridge | CTO | 403-277-5591 x24 |  brendan@clickspace.com

ClickSpace Interactive Inc.
Suite L100, 239 - 10th Ave. SE
Calgary, AB  T2G 0V9

http://www.clickspace.com

On Mar 31, 2006, at 8:59 AM, Jim C. Nasby wrote:

> What's the correlation of category_id? The current index scan cost
> estimator places a heavy penalty on anything with a correlation much
> below about 90%.
>
> On Wed, Mar 29, 2006 at 08:12:28PM -0700, Brendan Duddridge wrote:
>> Hi,
>>
>> I have a query that is using a sequential scan instead of an index
>> scan. I've turned off sequential scans and it is in fact faster with
>> the index scan.
>>
>> Here's my before and after.
>>
>> Before:
>>
>> ssdev=# SET enable_seqscan TO DEFAULT;
>> ssdev=# explain analyze select cp.product_id
>>         from category_product cp, product_attribute_value pav
>>         where cp.category_id = 1001082 and cp.product_id =
>>         pav.product_id;
>>
>>
>>            QUERY PLAN
>> ---------------------------------------------------------------------
>> ---
>> ---------------------------------------------------------------------
>> ---
>> ------------------------------
>> Hash Join  (cost=25.52..52140.59 rows=5139 width=4) (actual
>> time=4.521..2580.520 rows=19695 loops=1)
>>    Hash Cond: ("outer".product_id = "inner".product_id)
>>    ->  Seq Scan on product_attribute_value pav  (cost=0.00..40127.12
>> rows=2387312 width=4) (actual time=0.039..1469.295 rows=2385846
>> loops=1)
>>    ->  Hash  (cost=23.10..23.10 rows=970 width=4) (actual
>> time=2.267..2.267 rows=1140 loops=1)
>>          ->  Index Scan using x_category_product__category_id_fk_idx
>> on category_product cp  (cost=0.00..23.10 rows=970 width=4) (actual
>> time=0.122..1.395 rows=1140 loops=1)
>>                Index Cond: (category_id = 1001082)
>> Total runtime: 2584.221 ms
>> (7 rows)
>>
>>
>> After:
>>
>> ssdev=# SET enable_seqscan TO false;
>> ssdev=# explain analyze select cp.product_id
>>         from category_product cp, product_attribute_value pav
>>         where cp.category_id = 1001082 and cp.product_id =
>>         pav.product_id;
>>
>>
>>               QUERY PLAN
>> ---------------------------------------------------------------------
>> ---
>> ---------------------------------------------------------------------
>> ---
>> -------------------------------------
>> Nested Loop  (cost=0.00..157425.22 rows=5139 width=4) (actual
>> time=0.373..71.177 rows=19695 loops=1)
>>    ->  Index Scan using x_category_product__category_id_fk_idx on
>> category_product cp  (cost=0.00..23.10 rows=970 width=4) (actual
>> time=0.129..1.438 rows=1140 loops=1)
>>          Index Cond: (category_id = 1001082)
>>    ->  Index Scan using product_attribute_value__product_id_fk_idx
>> on product_attribute_value pav  (cost=0.00..161.51 rows=61 width=4)
>> (actual time=0.016..0.053 rows=17 loops=1140)
>>          Index Cond: ("outer".product_id = pav.product_id)
>> Total runtime: 74.747 ms
>> (6 rows)
>>
>> There's quite a big difference in speed there. 2584.221 ms vs. 74.747
>> ms.
>>
>> Any ideas what I can do to improve this without turning sequential
>> scanning off?
>>
>> Thanks,
>>
>> ____________________________________________________________________
>> Brendan Duddridge | CTO | 403-277-5591 x24 |  brendan@clickspace.com
>>
>> ClickSpace Interactive Inc.
>> Suite L100, 239 - 10th Ave. SE
>> Calgary, AB  T2G 0V9
>>
>> http://www.clickspace.com
>>
>
>
>
> --
> Jim C. Nasby, Sr. Engineering Consultant      jnasby@pervasive.com
> Pervasive Software      http://pervasive.com    work: 512-231-6117
> vcard: http://jim.nasby.net/pervasive.vcf       cell: 512-569-9461
>
> ---------------------------(end of
> broadcast)---------------------------
> TIP 1: if posting/reading through Usenet, please send an appropriate
>        subscribe-nomail command to majordomo@postgresql.org so that
> your
>        message can get through to the mailing list cleanly
>


Attachment

pgsql-performance by date:

Previous
From: "Magnus Hagander"
Date:
Subject: Re: [Solved] Slow performance on Windows .NET and OleDb
Next
From: "chris smith"
Date:
Subject: Re: Query using SeqScan instead of IndexScan