Re: type-casting and LIKE queries - Mailing list pgsql-general

From valerian
Subject Re: type-casting and LIKE queries
Date
Msg-id 20030317003437.GA23009@hotpop.com
Whole thread Raw
In response to Re: type-casting and LIKE queries  (Lincoln Yeoh <lyeoh@pop.jaring.my>)
Responses Re: type-casting and LIKE queries  (valerian <valerian2@hotpop.com>)
List pgsql-general
On Sat, Mar 15, 2003 at 02:00:35PM +0800, Lincoln Yeoh wrote:
> Are you using the ? character as a wildcard? AFAIK _ and % are the wildcard
> characters for LIKE.

Sorry, that was a typo.  My application allows searching with * and ?
characters, but it translates them to % and _ when creating the sql
statement.

> Would having two indexes cover enough cases for you? One that allows
> indexed LIKE '407%'. And one that allows '%4820'.
>
> For the latter just reverse the string and index it, and do a search on
> LIKE '0284%'.

Hey that's an interesting idea!

> Maybe you could create a function that reverses strings, not sure how that
> would work tho - could look messy since you probably don't want to reverse
> the % too.

So I guess that trick wouldn't work if the search string has a % both
at the beginning and the end...

> Not sure if Postgresql would be able to productively use both indexes to
> find a substring in the middle of text, given a suitable query. If the
> substring is in a fixed position in the middle I think it can.

I can't guarantee fixed positions though, because I also have to be able
to run arbitrary searches on varchar columns too...  For instance, I
have a column 'email' of type varchar(255), which has a unique index on
lower(email).  Some typical searches might be:

   *@hotmail.com
   johndoe@*
   *unix*

The first two are no problem if I use the 'reverse index' trick you
described.  But I don't see how the indexes can get used in the third
case?

However, I went ahead and created the a 'reverse_lc' function to see what
kind of performace I would get.  The planner isn't using my index
though. :(


test=> CREATE UNIQUE INDEX test_email_lc_idx ON test (lower(email));
CREATE INDEX
test=> CREATE UNIQUE INDEX test_email_revlc_idx ON test (reverse_lc(email));
CREATE INDEX
test=> VACUUM ANALYZE test ;
VACUUM
test=> EXPLAIN ANALYZE SELECT * FROM test WHERE lower(email) LIKE 'asdf%';
                                                        QUERY PLAN

--------------------------------------------------------------------------------------------------------------------------
 Index Scan using test_email_lc_idx on test  (cost=0.00..125.62 rows=38 width=45) (actual time=0.50..0.50 rows=0
loops=1)
   Index Cond: ((lower((email)::text) >= 'asdf'::text) AND (lower((email)::text) < 'asdg'::text))
   Filter: (lower((email)::text) ~~ 'asdf%'::text)
 Total runtime: 0.64 msec
(4 rows)

test=> EXPLAIN ANALYZE SELECT * FROM test WHERE reverse_lc(email) LIKE '%asdf';
                                              QUERY PLAN
------------------------------------------------------------------------------------------------------
 Seq Scan on test  (cost=0.00..193.56 rows=38 width=45) (actual time=5852.42..5852.42 rows=0 loops=1)
   Filter: (reverse_lc((email)::text) ~~ '%asdf'::text)
 Total runtime: 5852.54 msec
(3 rows)

test=> SELECT reverse_lc('%asdf');
 reverse_lc
------------
 fdsa%
(1 row)


Here's the code for it:

CREATE FUNCTION reverse_lc(text) RETURNS text IMMUTABLE AS '
    DECLARE
        old_str ALIAS FOR $1;
        low     text;
        new_str text;
        len     integer;
        i       integer;
    BEGIN
        len := length(old_str);
        low := lower(old_str);
        new_str := '''';
        i := 0;
        WHILE i < len LOOP
            new_str := rpad(new_str, i+1, substr(old_str, len-i, 1));
            i := i+1;
        END LOOP;
        RETURN new_str;
    END;
' LANGUAGE 'plpgsql';

Did I forget to do something?



pgsql-general by date:

Previous
From: "Marc G. Fournier"
Date:
Subject: Re: [webmaster] Online docs down again
Next
From: valerian
Date:
Subject: Re: type-casting and LIKE queries