mnogosearch under 7.4 ... - Mailing list pgsql-performance

From Marc G. Fournier
Subject mnogosearch under 7.4 ...
Date
Msg-id 20031222165147.S916@ganymede.hub.org
Whole thread Raw
List pgsql-performance
G'day all ...

Dave asked me today about 'slow downs' on the search engines, so am
looking at the various queries generated by enabling
log_statement/log_duration, to get a feel for is something is "off" ...
and the following seems a bit weird ...

QueryA and QueryB are the same query, but against two different tables in
the databases ... QueryA takes ~4x longer to run then QueryB, but both
EXPLAINs look similar ... in fact, looking at the EXPLAIN ANALYZE output,
I would expect that QueryB would be the slower of the two ... but, the
actual vs estimated times for ndict5/ndict4 seem off (ndict4 is estimated
high, ndict5 is estimated low) ...

QueryA:

186_archives=# explain analyze SELECT ndict5.url_id,ndict5.intag
                                 FROM ndict5, url
                                WHERE ndict5.word_id=1343124681
                                  AND url.rec_id=ndict5.url_id
                                  AND ((url.url || '') LIKE 'http://archives.postgresql.org/%%');
                                                              QUERY PLAN

--------------------------------------------------------------------------------------------------------------------------------------
 Nested Loop  (cost=0.00..69799.69 rows=44 width=8) (actual time=113.067..26477.672 rows=14112 loops=1)
   ->  Index Scan using n5_word on ndict5  (cost=0.00..34321.89 rows=8708 width=8) (actual time=27.349..25031.666
rows=15501loops=1) 
         Index Cond: (word_id = 1343124681)
   ->  Index Scan using url_rec_id on url  (cost=0.00..4.06 rows=1 width=4) (actual time=0.061..0.068 rows=1
loops=15501)
         Index Cond: (url.rec_id = "outer".url_id)
         Filter: ((url || ''::text) ~~ 'http://archives.postgresql.org/%%'::text)
 Total runtime: 26550.566 ms
(7 rows)

QueryB:

186_archives=# explain analyze SELECT ndict4.url_id,ndict4.intag
                                 FROM ndict4, url
                                WHERE ndict4.word_id=-2038735111
                                  AND url.rec_id=ndict4.url_id
                                  AND ((url.url || '') LIKE 'http://archives.postgresql.org/%%');

                                                             QUERY PLAN

------------------------------------------------------------------------------------------------------------------------------------
 Nested Loop  (cost=0.00..99120.97 rows=62 width=8) (actual time=26.330..6630.581 rows=2694 loops=1)
   ->  Index Scan using n4_word on ndict4  (cost=0.00..48829.52 rows=12344 width=8) (actual time=7.954..6373.098
rows=2900loops=1) 
         Index Cond: (word_id = -2038735111)
   ->  Index Scan using url_rec_id on url  (cost=0.00..4.06 rows=1 width=4) (actual time=0.059..0.066 rows=1
loops=2900)
         Index Cond: (url.rec_id = "outer".url_id)
         Filter: ((url || ''::text) ~~ 'http://archives.postgresql.org/%%'::text)
 Total runtime: 6643.462 ms
(7 rows)




----
Marc G. Fournier           Hub.Org Networking Services (http://www.hub.org)
Email: scrappy@hub.org           Yahoo!: yscrappy              ICQ: 7615664

pgsql-performance by date:

Previous
From: Dave Cramer
Date:
Subject: Re: is it possible to get the optimizer to use indexes
Next
From: Michael Guerin
Date:
Subject: Re: postgresql performance on linux port