Re: type-casting and LIKE queries - Mailing list pgsql-general

From Lincoln Yeoh
Subject Re: type-casting and LIKE queries
Date
Msg-id 5.1.0.14.1.20030317120905.02bf7e30@mbox.jaring.my
Whole thread Raw
In response to Re: type-casting and LIKE queries  (valerian <valerian2@hotpop.com>)
List pgsql-general
What you're asking for comes under full text indexing. There's a fair bit
of research in this field.

Supposedly a way to do this is to create an index of substrings.

e.g. this is the text
Index:
this is the text
  his is the text
   is is the text
    s is the text

And so on.

But without compression and other tricks it might not perform well. For the
index can become really huge so using it could be slower than or be about
the same speed as a seq scan of the main table.

A similar method is to just index keywords. If that is sufficient you could
look at the full text index thing for Postgresql.

If you're using it for phone numbers, I'd think most people are ok with
searching for the starting digits, or the ending digits.

For email you could try keywords.

In my experience if the keyword table isn't huge then a substring search on
the keyword table can be pretty fast.

Hope that helps,
Link.

At 09:05 PM 3/16/03 -0500, valerian wrote:

>test=> EXPLAIN ANALYZE SELECT * FROM test WHERE reverse_lc(email) LIKE
>'fdsa%';
>                                                          QUERY PLAN

>-----------------------------------------------------------------------------------------------------------------------------
>  Index Scan using test_email_revlc_idx on test  (cost=0.00..125.62
> rows=38 width=45) (actual time=0.39..0.39 rows=0 loops=1)
>    Index Cond: ((reverse_lc((email)::text) >= 'fdsa'::text) AND
> (reverse_lc((email)::text) < 'fdsb'::text))
>    Filter: (reverse_lc((email)::text) ~~ 'fdsa%'::text)
>  Total runtime: 0.53 msec
>(4 rows)
>
>So that takes care of the first two types of queries, but not the one
>that has a % both at the beginning and end of the search key.
>
>Any ideas on how to handle those?



pgsql-general by date:

Previous
From: Alvaro Herrera
Date:
Subject: Re: type-casting and LIKE queries
Next
From: javier garcia - CEBAS
Date:
Subject: copying between Postgres databases