Query does not use index - Mailing list pgsql-novice

From Martin Hampl
Subject Query does not use index
Date
Msg-id 495395E0-9A14-11D8-81E2-000393674318@gmx.de
Whole thread Raw
Responses Re: Query does not use index
List pgsql-novice
Hi,

I hope this message won't be sent several times to the list. If so,
please accept my apologies.

It would be very nice, if someone could have a look at the query I'm
trying to optimize. At the moment, I don't understand PostgreSQL's
behaviour and are stuck. Thanks a lot in advance.

I am using PostgreSQL 7.4.1

In my database are the following tables:

              Table "public.token"
   Column  |          Type          | Modifiers
----------+------------------------+-----------
  text_id  | integer                | not null
  position | integer                | not null
  word     | character varying(255) |
Indexes:
     "token_pkey" primary key, btree (text_id, "position")
     "word_idx" btree (word)

and

        Table "public.s"
  Column  |  Type   | Modifiers
---------+---------+-----------
  text_id | integer | not null
  s       | integer | not null
  start   | integer |
  stop    | integer |
Indexes:
     "s_pkey" primary key, btree (text_id, s)
     "s_begin_idx" btree (text_id, "start")
     "s_end_idx" btree (text_id, stop)
Foreign-key constraints:
     "$1" FOREIGN KEY (text_id, "start") REFERENCES token(text_id,
"position")
     "$2" FOREIGN KEY (text_id, stop) REFERENCES token(text_id,
"position")

(I hope it's readable)

I have the following query:

select * from
     token,
     s
where
     token.word = 'FACTSHEET' and
     s.text_id = token.text_id and
     s.start = token.position


and PostgreSQL generates the following query plan

                                             QUERY PLAN
------------------------------------------------------------------------
--------------------------
  Nested Loop  (cost=0.00..39120.95 rows=1 width=32)
    ->  Index Scan using s_pkey on s  (cost=0.00..24698.44 rows=3367
width=16)
    ->  Index Scan using token_pkey on token  (cost=0.00..4.27 rows=1
width=16)
          Index Cond: (("outer".text_id = token.text_id) AND
("outer"."start" = token."position"))
          Filter: ((word)::text = 'FACTSHEET'::text)

and the execution takes ages.

Now, 'FACTSHEET' is *very* seldom (5 of 23 million) and I intended it
to first search for 'FACTSHEET' and then use index s_begin_idx (and the
values of token.text_id and token.position) to derive the corresponding
records in s.

I tried to force PostgreSQL in doing it in this order by trying the
following:

select * from (select text_id, position from token where word =
'FACTSHEET') t left join s on (s.text_id = t.text_id and  s.start =
t.position )

                                           QUERY PLAN
------------------------------------------------------------------------
-----------------------
  Hash Left Join  (cost=24715.28..56630.78 rows=7892 width=24)
    Hash Cond: (("outer".text_id = "inner".text_id) AND
("outer"."position" = "inner"."start"))
    ->  Index Scan using word_idx on token  (cost=0.00..31402.51
rows=7892 width=8)
          Index Cond: ((word)::text = 'FACTSHEET'::text)
    ->  Hash  (cost=24698.44..24698.44 rows=3367 width=16)
          ->  Index Scan using s_pkey on s  (cost=0.00..24698.44
rows=3367 width=16)


Now it *does* first search for 'FACTSHEET' but it still does not use
s_begin_idx and I have no idea why. Any ideas, what I could do?

Thanks,
Martin.



pgsql-novice by date:

Previous
From: daq
Date:
Subject: Re: starting posgresql for the first time and most probably last :)
Next
From: Tom Lane
Date:
Subject: Re: Query does not use index