Thread: [Fwd: query efficiency - Can I speed it up?]

[Fwd: query efficiency - Can I speed it up?]

From

ann hedley

Date:

01 February 2007, 03:41:25

--
Ann

"In a world without walls and fences - who needs Windows and Gates ?"
                                                           (unknown)

Dear All

Can anyone tell me if/how I can speed up this query?  You could just
point me at a good 'guide to building efficient queries' if I'm missing
something obvious.  I've tried a few online sources and most of my
queries are much faster now but this one ???

Thanks

nemdb3=# explain select pept_id,description,confidence from gotcha inner
join go on (gotcha.go_term=go.go_term) where go.description~'oxygen' and
( spid='ALP' or spid='ASP' or spid='DIP' or spid='GPP' ) order by
gotcha.confidence;

QUERY PLAN

------------------------------------------------------------------------------------------------------------------------------------------------------
 Sort  (cost=187523.03..187523.54 rows=203 width=54)
   Sort Key: gotcha.confidence
   ->  Nested Loop  (cost=0.00..187515.25 rows=203 width=54)
         ->  Seq Scan on go  (cost=0.00..266.50 rows=4 width=50)
               Filter: (description ~ 'oxygen'::text)
         ->  Index Scan using gotcha_go_term_sp_id on gotcha
(cost=0.00..46809.29 rows=232 width=32)
               Index Cond: ((gotcha.go_term)::text =
("outer".go_term)::text)
               Filter: (((spid)::text = 'ALP'::text) OR ((spid)::text =
'ASP'::text) OR ((spid)::text = 'DIP'::text) OR ((spid)::text =
'GPP'::text))
(8 rows)

nemdb3=# \d go
                  Table "public.go"
     Column      |         Type          | Modifiers
-----------------+-----------------------+-----------
 go_term         | character varying(14) |
 go_asp          | character varying(1)  |
 description     | text                  |
 hierarchy_level | integer               |
Indexes:
    "go_go_term" btree (go_term)

nemdb3=# \d gotcha
             Table "public.gotcha"
   Column   |         Type          | Modifiers
------------+-----------------------+-----------
 run_no     | character varying(20) |
 pept_id    | character varying(15) |
 contig     | integer               |
 confidence | integer               |
 go_term    | character varying(14) |
 hits       | integer               |
 spid       | character varying(5)  |
Indexes:
    "gotcha_go_term_sp_id" btree (go_term, spid)
    "gotcha_pept_id" btree (pept_id)
    "gotcha_spid" btree (spid)

nemdb3=# select relname, relkind, reltuples, relpages from pg_class
where relname like 'go%';
        relname         | relkind |  reltuples  | relpages
------------------------+---------+-------------+----------
 gotcha                 | r       | 2.70698e+07 |   695188
 go                     | r       |       10600 |      134
 gotcha_pept_id         | i       | 2.70698e+07 |   104266
 go_go_term             | i       |       10600 |       43
 gotcha_spid            | i       | 2.70698e+07 |    74225
 gotcha_go_term_sp_id   | i       | 2.70698e+07 |   134201
 (11 rows)

nemdb3=#




--
Ann

"In a world without walls and fences - who needs Windows and Gates ?"
                                                           (unknown)

Re: [Fwd: query efficiency - Can I speed it up?]

From

Tom Lane

Date:

01 February 2007, 12:48:38

ann hedley <ann.hedley@ed.ac.uk> writes:
> Can anyone tell me if/how I can speed up this query?

Try a newer version of Postgres --- I think 8.1 is the first one that
can make decent use of that two-column index in this type of query.
Notice that in the plan, only the "go_term" column is being checked
in the index condition:

>          ->  Index Scan using gotcha_go_term_sp_id on gotcha
> (cost=0.00..46809.29 rows=232 width=32)
>                Index Cond: ((gotcha.go_term)::text =
> ("outer".go_term)::text)
>                Filter: (((spid)::text = 'ALP'::text) OR ((spid)::text =
> 'ASP'::text) OR ((spid)::text = 'DIP'::text) OR ((spid)::text =
> 'GPP'::text))

I take it from the enormous cost that there are going to be lots of rows
with the same go_term, and it's the spid filter that is cutting it down
to a reasonable number of rows ... but this plan is going to visit the
heap for every row matching go_term, because the planner isn't smart
enough to fold the OR'd restriction clause together with the join
clause to make an index condition.  It applies it as a "filter" instead
which is way way slower in this situation.  I can't tell exactly which
PG release you're using, but it's definitely older than 8.1.

            regards, tom lane

Re: [Fwd: query efficiency - Can I speed it up?]

From

ann hedley

Date:

01 February 2007, 13:25:17

psql (PostgreSQL) 8.0.3

Thanks, I'll try an upgrade.

Tom Lane wrote:
> ann hedley <ann.hedley@ed.ac.uk> writes:
>
>> Can anyone tell me if/how I can speed up this query?
>>
>
> Try a newer version of Postgres --- I think 8.1 is the first one that
> can make decent use of that two-column index in this type of query.
> Notice that in the plan, only the "go_term" column is being checked
> in the index condition:
>
>
>>          ->  Index Scan using gotcha_go_term_sp_id on gotcha
>> (cost=0.00..46809.29 rows=232 width=32)
>>                Index Cond: ((gotcha.go_term)::text =
>> ("outer".go_term)::text)
>>                Filter: (((spid)::text = 'ALP'::text) OR ((spid)::text =
>> 'ASP'::text) OR ((spid)::text = 'DIP'::text) OR ((spid)::text =
>> 'GPP'::text))
>>
>
> I take it from the enormous cost that there are going to be lots of rows
> with the same go_term, and it's the spid filter that is cutting it down
> to a reasonable number of rows ... but this plan is going to visit the
> heap for every row matching go_term, because the planner isn't smart
> enough to fold the OR'd restriction clause together with the join
> clause to make an index condition.  It applies it as a "filter" instead
> which is way way slower in this situation.  I can't tell exactly which
> PG release you're using, but it's definitely older than 8.1.
>
>             regards, tom lane
>
>


--
Ann

"In a world without walls and fences - who needs Windows and Gates ?"
                                                           (unknown)