More index / search speed questions - Mailing list pgsql-general

From Mitch Vincent
Subject More index / search speed questions
Date
Msg-id 00ee01c023fb$a6e98860$0200000a@doot
Whole thread Raw
Responses Re: More index / search speed questions
List pgsql-general
This is a bit long, sorry about that..

ipa=# explain select a.app_id, a.appcode, a.firstname, a.middlename,
a.lastname, a.state, a.degree1, a.d1date, a.degree2, a.d2date, a.salary,
a.skill1, a.skill2, a.skill3, a.objective, a.employer, a.sic1, a.sic2,
a.sic3, a.prefs1, a.prefs2, a.sells, a.profile from applicants as
a,resumes_fti as rf,profile_fti as pf where (( pf.string='engineer' ) OR (
rf.string='engineer' ) ) and a.app_id=rf.app_id and a.app_id=pf.app_id order
by (case when a.resubmitted > a.created then a.resubmitted else a.created
end) desc limit 10 offset 0 ;
NOTICE:  QUERY PLAN:

Sort  (cost=167674.25..167674.25 rows=69399 width=252)
  ->  Merge Join  (cost=2096.22..160747.54 rows=69399 width=252)
        ->  Merge Join  (cost=2096.22..13223.87 rows=300530 width=236)
              ->  Index Scan using profile_fti_id_index on profile_fti pf
(cost=0.00..7261.23 rows=300530 width=16)
              ->  Sort  (cost=2096.22..2096.22 rows=8784 width=220)
                    ->  Seq Scan on applicants a  (cost=0.00..1520.84
rows=8784 width=220)
        ->  Index Scan using resumes_fti_id_index on resumes_fti rf
(cost=0.00..103748.68 rows=2200898 width=16)

EXPLAIN

Nasty slow, it hasn't even finished yet (and it's been at least 15
minutes)..

The resumes_fti table is huge, no doubt (2200898 rows) and the profile_fti
table is pretty big too (300573 rows).. Indexes are on all the app_id
columns (in all tables) ,  rf.string and pf.string, a.resubmitted and
a.created.

The merge join is evidently what's the killer and the seqscan on applicants
(10,000ish rows) doesn't help..

I'm looking for any way to speed this up some.. If I search on the
applicants table too then the search is a lot faster :

ipa=# explain select a.app_id, a.appcode, a.firstname, a.middlename,
a.lastname, a.state, a.degree1, a.d1date, a.degree2, a.d2date, a.salary,
a.skill1, a.skill2, a.skill3, a.objective, a.employer, a.sic1, a.sic2,
a.sic3, a.prefs1, a.prefs2, a.sells, a.profile from applicants as
a,resumes_fti as rf,profile_fti as pf where ( ( a.skill1=123 OR a.skill2=123
OR a.skill3=123 )) AND (( pf.string='engineer' OR pf.string='process' ) OR
( rf.string='engineer' OR rf.string='process' ) ) and a.app_id=rf.app_id and
a.app_id=pf.app_id order by (case when a.resubmitted > a.created then
a.resubmitted else a.created end) desc limit 10 offset 0

NOTICE:  QUERY PLAN:

Sort  (cost=48181.12..48181.12 rows=1754 width=252)
  ->  Merge Join  (cost=33645.39..48086.60 rows=1754 width=252)
        ->  Index Scan using profile_fti_id_index on profile_fti pf
(cost=0.00..7261.23 rows=300530 width=16)
        ->  Sort  (cost=33645.39..33645.39 rows=18580 width=236)
              ->  Nested Loop  (cost=0.00..32327.91 rows=18580 width=236)
                    ->  Index Scan using applicants_skill1,
applicants_skill2, applicants_skill3 on applicants a  (cost=0.00..294.14
rows=111 width=220)
                    ->  Index Scan using resumes_fti_id_index on resumes_fti
rf  (cost=0.00..287.31 rows=84 width=16)

EXPLAIN

...That takes only a few seconds..

So, can anyone smack me around with a clue stick?

-Mitch


pgsql-general by date:

Previous
From: "Rob Hutton"
Date:
Subject: RE: replication
Next
From: "Adam Lang"
Date:
Subject: Re: replication