Re: How to search ignoring spaces and minus signs - Mailing list pgsql-general

From Craig Ringer
Subject Re: How to search ignoring spaces and minus signs
Date
Msg-id 4CB67EAE.40408@postnewspapers.com.au
Whole thread Raw
In response to How to search ignoring spaces and minus signs  ("Andrus" <kobruleht2@hot.ee>)
List pgsql-general
On 14/10/10 01:45, Andrus wrote:
> CHAR(20) columns in 8.4 database may contains spaces and - signs like
>
> 13-333-333
> 12 3-44
> 33 33 333
> 12345
>
> User enters code to search without spaces and - signs, like 12344
> How to search for product code ignoring spaces and - signs?
>
> For example searching for code 12344 should return
> 12 3-44  as matching item.

Either convert the user input into a suitable regular expression, or
pre-process the column being searched to strip spaces and - signs.

I'd recommend writing a simple SQL function that uses a regexp_replace()
or a couple of regular replace() calls to simplify the column being
searched down to only numbers. If you want to strip *everything* that's
not a number, you could use:

CREATE OR REPLACE FUNCTION strip_nondigits(text) RETURNS text AS $$
SELECT regexp_replace($1, E'[^0-9]', '', 'g');
$$ LANGUAGE 'sql' IMMUTABLE STRICT;

See: http://www.postgresql.org/docs/current/static/functions-string.html

http://www.postgresql.org/docs/current/static/functions-matching.html#FUNCTIONS-POSIX-REGEXP

http://www.postgresql.org/docs/current/interactive/xfunc-sql.html

You can now create a functional index on your target column that uses
that function, allowing you to run indexed searches against the column
without repeatedly re-evaluating the stripping expression for every
value during every search. The query planner will recognise when you use
the indexed expression in a query, and will use the index where
appropriate. Make the index like this:

CREATE INDEX stripped_numbers ON mytable ((strip_nondigits(thecolumn));

then use it like this:

SELECT * FROM mytable WHERE strip_nondigits('123-user-input-here') =
strip_nondigits(thecolumn);


EXPLAIN should show that an index scan of stripped_numbers is being
used, at least if the table contains non-trivial amounts of data.

Alternately, you could use a trigger to maintain a stripped version of
the field as an additional column in each row. Both approaches cost you
a bit of time during updates/inserts though, and the functional index is
probably easier.


--
Craig Ringer

Tech-related writing: http://soapyfrogs.blogspot.com/

pgsql-general by date:

Previous
From: Bruno Baguette
Date:
Subject: How to iterate fields into a NEW.* RECORD in a TRIGGER procedure ?
Next
From: "Reuven M. Lerner"
Date:
Subject: Re: Passing refcursors between pl/pgsql functions