Home > mailing lists

How to get around LIKE inefficiencies? - Mailing list pgsql-hackers

From	The Hermit Hacker
Subject	How to get around LIKE inefficiencies?
Date	November 5, 2000 20:05:31
Msg-id	Pine.BSF.4.21.0011052045440.494-100000@thelab.hub.org Whole thread Raw
Responses	Re: How to get around LIKE inefficiencies? Re: How to get around LIKE inefficiencies?
List	pgsql-hackers

Tree view

I'm tryin to figure out how to speed up udmsearch when run under
postgresql, and am being hit by atrocious performance when using a LIKE
query ... the query looks like:

SELECT ndict.url_id,ndict.intag  FROM ndict,url WHERE ndict.word_id=1971739852   AND url.rec_id=ndict.url_id    AND
(url.urlLIKE 'http://www.postgresql.org/%');
 

Take off the AND ( LIKE ) part of the query, finishes almost as soon as
you hit return.  Put it back in, and you can go for coffee before it
finishes ...
If I do 'SELECT url_id FROM ndict WHERE word_id=1971739852', there
are 153 records returned ... is there some way, that I'm not thinking, of
re-writing the above so that it 'resolves' the equality before the LIKE in
order to reduce the number of tuples that it has to do the LIKE on?  Is
there some way of writing the above so that it doesn't take forever to
execute?
I'm running this on a Dual-PIII 450 Server, 512Meg of RAM, zero
swap space being used ... the database has its indices on one hard drive,
the tables themselves are on a second one ... its PgSQL 7.0.2 (Tom,
anything in v7.0.3 that might improve this?) and startup is as:

#!/bin/tcsh
setenv PORT 5432
setenv POSTMASTER /pgsql/bin/postmaster
unlimit
${POSTMASTER} -B 384 -N 192 \             -o "-F -S 32768" \             -i -p ${PORT} -D/pgsql/data >&
/pgsql/logs/postmaster.${PORT}.$$ &
So its not like I'm not throwing alot of resources at this ...
Is there anything that we can do to improve this?  I was trying to
think of some way to use a subselect to narrow the search results, or
something ...
Oh, the above explains down to:

NOTICE:  QUERY PLAN:

Nested Loop  (cost=0.00..1195.14 rows=1 width=10) ->  Index Scan using url_url on url  (cost=0.00..2.73 rows=1 width=4)
-> Index Scan using n_word on ndict  (cost=0.00..1187.99 rows=353 width=6)
 

EXPLAIN
ndict: 663018 tuples  url:  29276 tuples

Marc G. Fournier                   ICQ#7615664               IRC Nick: Scrappy
Systems Administrator @ hub.org 
primary: scrappy@hub.org           secondary: scrappy@{freebsd|postgresql}.org

pgsql-hackers by date:

From: Tatsuo Ishii
Date: 05 November 2000, 20:04:10
Subject: Re: Unicode conversion (Re: [COMMITTERS] pgsql (configure.in))

From: Tom Lane
Date: 05 November 2000, 21:28:21
Subject: Re: How to get around LIKE inefficiencies?

How to get around LIKE inefficiencies? - Mailing list pgsql-hackers

Previous

Next