Thread: Yet another 'why does it not use my index' question.

Yet another 'why does it not use my index' question.

From

"Ryan"

Date:

07 May 2003, 12:07:31

Ok, I have two tables (Postgresql 7.3.2 on Debian):

               Table "public.zip"
   Column   |         Type          | Modifiers
------------+-----------------------+-----------
 zip        | character varying(5)  |
 city       | character varying(25) |
 county     | character varying(30) |
 countyfips | character varying(5)  |
 state_full | character varying(30) |
 state      | character varying(2)  |
 citytype   | character(1)          |
 zipcodetyp | character(1)          |
 areacode   | character varying(3)  |
 timezone   | character varying(10) |
 dst        | character(1)          |
 latitude   | double precision      |
 longitude  | double precision      |
 country    | character varying(10) |
Indexes: zip_idx btree (zip)

   Table "public.client_options"
    Column    |  Type  | Modifiers
--------------+--------+-----------
 client_id    | bigint | not null
 option_name  | text   | not null
 option_value | text   | not null
Foreign Key constraints: [...omitted...]

I wanted to do the following:

midas=# explain analyze select * from zip where zip in
 (select option_value from client_options where option_name = 'ZIP_CODE' );
                               QUERY PLAN
---------------------------------------------------------------------------
 Seq Scan on zip  (cost=0.00..206467.85 rows=38028 width=112)
                  (actual time=58.45..4676.76 rows=8 loops=1)
   Filter: (subplan)
   SubPlan
     ->  Seq Scan on client_options  (cost=0.00..5.36 rows=3 width=14)
                                (actual time=0.02..0.05 rows=3 loops=76056)
           Filter: (option_name = 'ZIP_CODE'::text)
 Total runtime: 4676.87 msec

Or even:

midas=# explain analyze select * from zip z, client_options c where
c.option_name = 'ZIP_CODE' and c.option_value = z.zip;
                              QUERY PLAN
---------------------------------------------------------------------------
 Nested Loop  (cost=0.00..9915.14 rows=10 width=148)
              (actual time=26.63..2864.01 rows=8 loops=1)
   Join Filter: ("outer".option_value = ("inner".zip)::text)
   ->  Seq Scan on client_options c  (cost=0.00..5.36 rows=3 width=36)
                                    (actual time=0.25..0.34 rows=3 loops=1)
         Filter: (option_name = 'ZIP_CODE'::text)
   ->  Seq Scan on zip z  (cost=0.00..2352.56 rows=76056 width=112)
                          (actual time=0.07..809.19 rows=76056 loops=3)
 Total runtime: 2864.16 msec


If I wanted to do select the zip codes out of the client_options and then
select the zipcodes seperately, I would be looking at times of .14 msec
and 222.82 msec respectively.

Oh, and yes, I have done a vacuum analyze.

(the reason I'm trying to join these tables is to get longitude and
latitude coordinates to use with the earthdistance <@> operator, it just
takes entirely too long)

What am I doing wrong?

Ryan

Re: Yet another 'why does it not use my index' question.

From

Bruno Wolff III

Date:

07 May 2003, 12:27:03

On Wed, May 07, 2003 at 09:11:49 -0500,
  Ryan <pgsql-performance@seahat.com> wrote:
> I wanted to do the following:
>
> midas=# explain analyze select * from zip where zip in
>  (select option_value from client_options where option_name = 'ZIP_CODE' );

Until 7.4 comes out IN will be slow and you should use a join to do this.

> midas=# explain analyze select * from zip z, client_options c where
> c.option_name = 'ZIP_CODE' and c.option_value = z.zip;

I think the problem here might be related to option_value being text
and zip being char varying. This might prevent an index from being used
to do the join.

Re: Yet another 'why does it not use my index' question.

From

"Ryan"

Date:

07 May 2003, 12:39:14

> On Wed, May 07, 2003 at 09:11:49 -0500,
>   Ryan <pgsql-performance@seahat.com> wrote:
>> I wanted to do the following:
>>
>> midas=# explain analyze select * from zip where zip in
>>  (select option_value from client_options where option_name =
>> 'ZIP_CODE' );
>
> Until 7.4 comes out IN will be slow and you should use a join to do
> this.
>
>> midas=# explain analyze select * from zip z, client_options c where
>> c.option_name = 'ZIP_CODE' and c.option_value = z.zip;
>
> I think the problem here might be related to option_value being text and
> zip being char varying. This might prevent an index from being used to
> do the join.
HMMMM.  I'll have to re-insert that table (it was a dbf2pg job) and change
that.  Any reason why postgres is so picky about varchar/text conversion,
considering they are practally the same thing?

Something intresting however.  If I do this:
select * from zip where zip = 98404;
I get a seq scan, as postgres types it to text.

but if I do this:
select * from zip where zip = '98404';
Postgres types it as character varying and uses the index.

Not that it would happen any time soon, but it would be nice if explain
analyze would tell you why it chose an seq scan on an indexed field.
(e.g. You should know better than to try an index with a different type!)

Ryan