Re: Full text indexing preformance! (long) - Mailing list pgsql-hackers

From           | Mitch Vincent
Subject        | Re: Full text indexing preformance! (long)
Date           |
Msg-id         | 005e01bfc9a8$1a3f2da0$0300000a@doot.org
In response to | Full text indexing preformance! (long) ("Mitch Vincent" <mitch@venux.net>)
Responses      | Re: Full text indexing preformance! (long)
List           | pgsql-hackers
Hi Tom, thanks for your reply. I increased BLCKSZ to 32k and re-compiled, then I imported all the resumes (some of which I couldn't get before) and this problem completely disappeared. The query is very fast now (0.039792 seconds, to be exact).

One thing I did run into was this: in my paging system I only need 10 records at a time, so I LIMIT the query. The problem comes when I need to get a total of all the records that matched the query (as a good search engine, I must tell people how many records were found). I can't count() and LIMIT in the same query, so I'm forced to do two queries, one with count() and one without. An example:

select a.appcode, a.firstname, a.middlename, a.lastname, a.state,
       a.degree1, a.d1date, a.degree2, a.d2date, a.salary,
       a.skill1, a.skill2, a.skill3, a.objective, a.employer,
       a.sic1, a.sic2, a.sic3, a.prefs1, a.prefs2, a.sells
from applicants as a, applicants_states as s,
     applicants_resumes as ar, resumes_fti as rf
where a.status = 'A'
  and lower(a.firstname) ~ lower('^a')
  and s.rstate = 'AL'
  and rf.string ~ '^engineer'
  and a.app_id = s.app_id
  and ar.app_id = a.app_id
  and rf.id = ar.oid
limit 10 offset 0

Vs.

select count(a.app_id) as total
from applicants as a, applicants_states as s,
     applicants_resumes as ar, resumes_fti as rf
where a.status = 'A'
  and lower(a.firstname) ~ lower('^a')
  and s.rstate = 'AL'
  and rf.string ~ '^engineer'
  and a.app_id = s.app_id
  and ar.app_id = a.app_id
  and rf.id = ar.oid

However, the count() query has to go through the entire record set (which makes sense), and it takes about 4 or 5 seconds. The plan for the count() query:
NOTICE:  QUERY PLAN:

Aggregate  (cost=56.61..56.61 rows=1 width=20)
  ->  Nested Loop  (cost=0.00..56.61 rows=1 width=20)
        ->  Nested Loop  (cost=0.00..10.74 rows=1 width=16)
              ->  Nested Loop  (cost=0.00..8.59 rows=1 width=12)
                    ->  Index Scan using resumes_fti_index on resumes_fti rf  (cost=0.00..4.97 rows=1 width=4)
                    ->  Index Scan using applicants_resumes_index on applicants_resumes ar  (cost=0.00..3.61 rows=1 width=8)
              ->  Index Scan using applicants_app_id on applicants a  (cost=0.00..2.14 rows=1 width=4)
        ->  Index Scan using applicants_states_app_id on applicants_states s  (cost=0.00..45.86 rows=1 width=4)

And the stats:

ProcessQuery
! system usage stats:
!       5.088647 elapsed 4.954981 user 0.125561 system sec
!       [4.976752 user 0.132817 sys total]
!       0/0 [0/0] filesystem blocks in/out
!       0/4607 [0/4846] page faults/reclaims, 0 [0] swaps
!       0 [0] signals rcvd, 0/0 [3/3] messages rcvd/sent
!       0/52 [3/57] voluntary/involuntary context switches
! postgres usage stats:
!       Shared blocks: 0 read, 0 written, buffer hit rate = 100.00%
!       Local  blocks: 0 read, 0 written, buffer hit rate = 0.00%
!       Direct blocks: 0 read, 0 written
CommitTransactionCommand
proc_exit(0)

The "0/4607 [0/4846] page faults/reclaims" figure is greatly increased in this query compared to the other. Is that to be expected? Is there anything else I can do to get the total number of records matched by the query and still use LIMIT (I doubt it)? If there isn't anything I can do, which looks to be the case here, I still appreciate all the help you've given me. I look forward to your response. Thanks!

-Mitch
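The two-query paging pattern described above can be sketched in miniature as follows. This is a hedged illustration only, using an in-memory SQLite database from Python rather than the PostgreSQL schema from this thread; the table, columns, and data are simplified placeholders, not the applicants/resumes_fti tables discussed here. The point it demonstrates is the structure: one LIMIT/OFFSET query for the visible page, and a second count() query over the same WHERE clause for the total.

```python
import sqlite3

# Hypothetical stand-in for the applicants table (25 matching rows).
conn = sqlite3.connect(":memory:")
conn.execute("create table applicants (app_id integer, firstname text, status text)")
conn.executemany(
    "insert into applicants values (?, ?, ?)",
    [(i, "name%d" % i, "A") for i in range(25)],
)

# The WHERE clause is shared by both queries so the count matches the paged set.
where = "status = 'A'"

# Query 1: one page of results, using LIMIT/OFFSET for paging.
page = conn.execute(
    "select app_id, firstname from applicants where " + where + " limit 10 offset 0"
).fetchall()

# Query 2: a separate count() over the same WHERE clause for the total.
(total,) = conn.execute(
    "select count(app_id) as total from applicants where " + where
).fetchone()

print(len(page), total)  # 10 rows on this page, 25 matches in total
```

As the email notes, the count() query still has to visit every matching row, which is why it stays slow even when the paged query returns quickly.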