Thread: How to search ignoring spaces and minus signs

How to search ignoring spaces and minus signs

From
"Andrus"
Date:
CHAR(20) columns in 8.4 database may contains spaces and - signs like

13-333-333
12 3-44
33 33 333
12345

User enters code to search without spaces and - signs, like 12344
How to search for product code ignoring spaces and - signs?

For example searching for code 12344 should return
12 3-44  as matching item.

Andrus.

Re: How to search ignoring spaces and minus signs

From
Pavel Stehule
Date:
Hello

you can  use a own function

CREATE OR REPLACE FUNCTION clean_some(text)
RETURNS text AS $$
SELECT replace(replace($1, ' ',''),'-','')
$$ LANGUAGE sql;

then you can do query with where clause

WHERE clean_some(colum) = clean_some('userinput');

you can enhance it with functional index

CRETE INDEX xx ON your_table((clean_some(column));

Regards

Pavel Stehule

2010/10/13 Andrus <kobruleht2@hot.ee>:
> CHAR(20) columns in 8.4 database may contains spaces and - signs like
>
> 13-333-333
> 12 3-44
> 33 33 333
> 12345
>
> User enters code to search without spaces and - signs, like 12344
> How to search for product code ignoring spaces and - signs?
>
> For example searching for code 12344 should return
> 12 3-44  as matching item.
>
> Andrus.
>
> --
> Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-general
>

Re: How to search ignoring spaces and minus signs

From
Merlin Moncure
Date:
2010/10/13 Andrus <kobruleht2@hot.ee>:
> CHAR(20) columns in 8.4 database may contains spaces and - signs like
>
> 13-333-333
> 12 3-44
> 33 33 333
> 12345
>
> User enters code to search without spaces and - signs, like 12344
> How to search for product code ignoring spaces and - signs?
>
> For example searching for code 12344 should return
> 12 3-44  as matching item.

postgres=# select regexp_replace('1243-56 6536  7', E'[^0-9]', '', 'g');
 regexp_replace
----------------
 12435665367
(1 row)

that's the key. obviously, searching with that as predicate through
large table will be unpleasant unless you precalculate the above into
index expression.

merlin

Re: How to search ignoring spaces and minus signs

From
Darren Duncan
Date:
Andrus wrote:
> CHAR(20) columns in 8.4 database may contains spaces and - signs like
>
> 13-333-333
> 12 3-44
> 33 33 333
> 12345
>
> User enters code to search without spaces and - signs, like 12344
> How to search for product code ignoring spaces and - signs?
>
> For example searching for code 12344 should return
> 12 3-44  as matching item.

Is your search an exact match ignoring the spaces and dashes, or a substring
search?  If the field contained 0123445 then is a search for 12344 supposed to
match it or not?

The best simple way to do what you want is to add another CHAR(20) column for
each of the existing ones like you describe where the extra column has a copy of
the original one but with the spaces and dashes removed.

Then when doing searches you search on the new copy and when displaying you
display the original copy.

Doing this would save the database having to do the most expensive kinds of
computations repeatedly at the time of searching given that these can be
staticly precomputed.

Moreover, if your search is exact-match, you get additional speed gains by
having an index on the search column.  (I don't know if there is any kind of
useful index for substring matches besides full text search.)

-- Darren Duncan

Re: How to search ignoring spaces and minus signs

From
Craig Ringer
Date:
On 14/10/10 01:45, Andrus wrote:
> CHAR(20) columns in 8.4 database may contains spaces and - signs like
>
> 13-333-333
> 12 3-44
> 33 33 333
> 12345
>
> User enters code to search without spaces and - signs, like 12344
> How to search for product code ignoring spaces and - signs?
>
> For example searching for code 12344 should return
> 12 3-44  as matching item.

Either convert the user input into a suitable regular expression, or
pre-process the column being searched to strip spaces and - signs.

I'd recommend writing a simple SQL function that uses a regexp_replace()
or a couple of regular replace() calls to simplify the column being
searched down to only numbers. If you want to strip *everything* that's
not a number, you could use:

CREATE OR REPLACE FUNCTION strip_nondigits(text) RETURNS text AS $$
SELECT regexp_replace($1, E'[^0-9]', '', 'g');
$$ LANGUAGE 'sql' IMMUTABLE STRICT;

See: http://www.postgresql.org/docs/current/static/functions-string.html

http://www.postgresql.org/docs/current/static/functions-matching.html#FUNCTIONS-POSIX-REGEXP

http://www.postgresql.org/docs/current/interactive/xfunc-sql.html

You can now create a functional index on your target column that uses
that function, allowing you to run indexed searches against the column
without repeatedly re-evaluating the stripping expression for every
value during every search. The query planner will recognise when you use
the indexed expression in a query, and will use the index where
appropriate. Make the index like this:

CREATE INDEX stripped_numbers ON mytable ((strip_nondigits(thecolumn));

then use it like this:

SELECT * FROM mytable WHERE strip_nondigits('123-user-input-here') =
strip_nondigits(thecolumn);


EXPLAIN should show that an index scan of stripped_numbers is being
used, at least if the table contains non-trivial amounts of data.

Alternately, you could use a trigger to maintain a stripped version of
the field as an additional column in each row. Both approaches cost you
a bit of time during updates/inserts though, and the functional index is
probably easier.


--
Craig Ringer

Tech-related writing: http://soapyfrogs.blogspot.com/

Re: How to search ignoring spaces and minus signs

From
Rajesh Kumar Mallah
Date:
Dear Andrus,

Quick & Dirty Soln:

SELECT * from  table where regexp_replace( col   , '[-\\s+]' , '' ,
'g')  ilike '%search_term%'  ;

note above sql will not use any index if you have to search 10000s of
rows use alternate
approaches.

regds
Rajesh Kumar Mallah.



2010/10/13 Andrus <kobruleht2@hot.ee>:
> CHAR(20) columns in 8.4 database may contains spaces and - signs like
>
> 13-333-333
> 12 3-44
> 33 33 333
> 12345
>
> User enters code to search without spaces and - signs, like 12344
> How to search for product code ignoring spaces and - signs?
>
> For example searching for code 12344 should return
> 12 3-44  as matching item.
>
> Andrus.
>
> --
> Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-general
>

Re: How to search ignoring spaces and minus signs

From
Christian Ramseyer
Date:
On 10/13/2010 07:45 PM, Andrus wrote:
> CHAR(20) columns in 8.4 database may contains spaces and - signs like
>
> 13-333-333
> 12 3-44
> 33 33 333
> 12345
>
> User enters code to search without spaces and - signs, like 12344
> How to search for product code ignoring spaces and - signs?
>

There are many options to do that...

You could use a regular expression like this:

DMP=# select * from foo where prod_code ~ '1[ -]*2[ -]*3[ -]*4[ -]*4';
  prod_code
-----------
  12 3-44
  12-3-44
  123 44

[ -]* means "zero or more dashes or spaces".

Maybe easier or (computationally) faster (YMMV on both counts) would be
to replace() the dashes and spaces on the fly first and only search the
cleaned string:

DMP=# select prod_code,
   replace(replace(prod_code, '-', ''),' ','') from foo
  where replace(replace(prod_code, '-', ''),' ','') = '12344';
  prod_code | replace
-----------+---------
  12 3-44   | 12344
  12-3-44   | 12344
  123 44    | 12344

Or just store the codes in a uniform format to begin with.

Christian

Re: How to search ignoring spaces and minus signs

From
Jayadevan M
Date:
Hello,
>
> For example searching for code 12344 should return
> 12 3-44  as matching item.
>
> Andrus.
>
This will do?
postgres=# select name from myt;
    name
------------
 13-333-333
 12 3-44
 33 33 333
 12345
(4 rows)

postgres=# select * from myt where  translate(translate(name,'-',''),'
','') = '13333333';
    name
------------
 13-333-333
(1 row)

postgres=# select * from myt where  translate(translate(name,'-',''),'
','') = '12344';
  name
---------
 12 3-44
(1 row)

Regards,
Jayadevan





DISCLAIMER:

"The information in this e-mail and any attachment is intended only for
the person to whom it is addressed and may contain confidential and/or
privileged material. If you have received this e-mail in error, kindly
contact the sender and destroy all copies of the original communication.
IBS makes no warranty, express or implied, nor guarantees the accuracy,
adequacy or completeness of the information contained in this email or any
attachment and is not liable for any errors, defects, omissions, viruses
or for resultant loss or damage, if any, direct or indirect."






Re: How to search ignoring spaces and minus signs

From
Josh Kupershmidt
Date:
2010/10/13 Andrus <kobruleht2@hot.ee>:
> CHAR(20) columns in 8.4 database may contains spaces and - signs like
>
> 13-333-333
> 12 3-44
> 33 33 333
> 12345
>
> User enters code to search without spaces and - signs, like 12344
> How to search for product code ignoring spaces and - signs?
>
> For example searching for code 12344 should return
> 12 3-44  as matching item.

How about using translate() to strip off dashes and spaces:
  SELECT mycol FROM mytable
  WHERE translate(mycol, '- ', '') = '12344';

Josh

Re: How to search ignoring spaces and minus signs

From
Grzegorz Jaśkiewicz
Date:
select regexp_replace(myval, E'(\\D)', '', 'g') from foo;

for added speed, you might consider this:
CREATE INDEX ON foo((regexp_replace(myval, E'(\\D)', '', 'g'))::bigint);


which is also going to protect you against inserts where value doesn't
contain any digits.

and added benefit of index:
gj=# select * from foo where (regexp_replace(myval::text,
'(\D)'::text, ''::text, 'g'::text)::bigint) = 5656;
 id | myval
----+-------
 61 | 56-56
(1 row)

Time: 1.356 ms
gj=# explain select * from foo where (regexp_replace(myval::text,
'(\D)'::text, ''::text, 'g'::text)::bigint) = 5656;
                                            QUERY PLAN
---------------------------------------------------------------------------------------------------
 Index Scan using foo_regexp_replace_idx on foo  (cost=0.01..8.38
rows=1 width=17)
   Index Cond: ((regexp_replace((myval)::text, '(\D)'::text, ''::text,
'g'::text))::bigint = 5656)
(2 rows)

gj=# \d+ foo
                                             Table "public.foo"
 Column |         Type          |                    Modifiers
            | Storage  | Description
--------+-----------------------+--------------------------------------------------+----------+-------------
 id     | integer               | not null default
nextval('foo_id_seq'::regclass) | plain    |
 myval  | character varying(20) | not null
            | extended |
Indexes:
    "foo_pkey" PRIMARY KEY, btree (id)
    "foo_regexp_replace_idx" btree ((regexp_replace(myval::text,
'(\D)'::text, ''::text, 'g'::text)::bigint))
Has OIDs: no


or even create a view that would allow you to make it nice and easy:
gj=# CREATE VIEW foo_view AS select id, (regexp_replace(myval::text,
'(\D)'::text, ''::text, 'g'::text)::bigint) AS int_val FROM foo;
CREATE VIEW

gj=# select * from foo_view where int_val = 1212;
 id | int_val
----+---------
 17 |    1212
(1 row)

Time: 0.709 ms
gj=# explain select * from foo_view where int_val = 1212;
                                            QUERY PLAN
---------------------------------------------------------------------------------------------------
 Index Scan using foo_regexp_replace_idx on foo  (cost=0.01..8.39
rows=1 width=17)
   Index Cond: ((regexp_replace((myval)::text, '(\D)'::text, ''::text,
'g'::text))::bigint = 1212)
(2 rows)


HTH

--
GJ

Re: How to search ignoring spaces and minus signs

From
Osvaldo Kussama
Date:
2010/10/13 Andrus <kobruleht2@hot.ee>:
> CHAR(20) columns in 8.4 database may contains spaces and - signs like
>
> 13-333-333
> 12 3-44
> 33 33 333
> 12345
>
> User enters code to search without spaces and - signs, like 12344
> How to search for product code ignoring spaces and - signs?
>
> For example searching for code 12344 should return
> 12 3-44  as matching item.
>


Try:
SELECT * FROM your_table WHERE regexp_replace(your_col,
'[^[:digit:]]', '', 'g') = your_number;

Osvaldo

Re: How to search ignoring spaces and minus signs

From
Joshua Berry
Date:
Hi Andrus,

2010/10/13 Andrus <kobruleht2@hot.ee>
CHAR(20) columns in 8.4 database may contains spaces and - signs like

13-333-333
12 3-44
33 33 333
12345

User enters code to search without spaces and - signs, like 12344
How to search for product code ignoring spaces and - signs?

For example searching for code 12344 should return
12 3-44  as matching item.

Check out http://www.postgresql.org/docs/8.4/interactive/functions-string.html and try using these functions to modify the comparison in the WHERE portion of a query.
For example, if the CHAR(20) column that you mentioned is called 'sku' and the user enters a value like '12345'

select * from table where replace(replace(sku, ' ', ''), '-', '') = '12344'

-Joshua