Performance in subquery - Mailing list pgsql-general

From Brian Haney
Subject Performance in subquery
Date
Msg-id 002601bfa8a6$70aa26e0$8101a8c0@specter.fresno.cybernaut.com
Whole thread Raw
List pgsql-general
I have multivendor product catalog and want to be able to search for
arbitrary text in a table of product attributes.  Two of the several
tables are:

Table    = products
+----------------------------------+----------------------------------
+-------+
|              Field               |              Type
| Length|
+----------------------------------+----------------------------------
+-------+
| prod_id                          | int4 not null default nextval
 |     4 |
| vendor_id                        | int4
|     4 |
| category_id                      | int4
|     4 |
| prod_name                        | text not null
|   var |
| prod_description                 | text not null
|   var |
| prod_price_low                   | numeric
| 30.6  |
| prod_price_high                  | numeric
| 30.6  |
| prod_promo                       | text
|   var |
| prod_promo_url                   | text
|   var |
| prod_datasht_url                 | text
|   var |
| prod_buynow_url                  | text
|   var |
| prod_asp_url                     | text
|   var |
| prod_active                      | bool default 'false'
|     1 |
| last_modified                    | timestamp
|     4 |
| prod_demo_url                    | text
|   var |
| prod_ephone                      | text
|   var |
+----------------------------------+----------------------------------
+-------+
Index:    products_pkey

Table    = prod_attr
+----------------------------------+----------------------------------
+-------+
|              Field               |              Type
| Length|
+----------------------------------+----------------------------------
+-------+
| attr_id                          | int4 not null
|     4 |
| prod_id                          | int4 not null
|     4 |
| prod_attr_text                   | text not null
|   var |
+----------------------------------+----------------------------------
+-------+
Indices:  prod_attr_pkey
          prod_attr_text_idx

The products table has 538 records and the prod_attr table has 7870.

When I enter the query:

SELECT prod_id FROM prod_attr WHERE prod_attr_text LIKE '%Linux%';

it performs quite well to give me a list of the product IDs almost
instantaneously.

But when I query:

select prod_name from products where prod_id in ( SELECT prod_id FROM
prod_attr
WHERE prod_attr_text LIKE '%Linux%');

It takes over 30 seconds to get the results.  Here are the EXPLAINs:

First query:

Seq Scan on prod_attr  (cost=317.84 rows=1 width=4)

Second query:

Seq Scan on products  (cost=31.75 rows=538 width=12)
  SubPlan
    ->  Seq Scan on prod_attr  (cost=317.84 rows=1 width=4)

As you can see, I have created and index for prod_attr
(prod_attr_text), but it has had no discernable effect.

The explains imply to me that the real cost of the second query is
scanning the 7870 records, but then I would expect the second query to
take only slightly longer than the first.  I suspect something else is
causing the second query to be so doggone slow.

Any help would be greatly appreciated.

--
Brian Haney  VP Engineering/CTO
brian@ibsystems.com  Internet Business Systems, Inc.


pgsql-general by date:

Previous
From: Jeffrey
Date:
Subject: Re: excell to postgres
Next
From: Hal Snyder
Date:
Subject: Re: excell to postgres