Thread: How to search ignoring spaces and minus signs
CHAR(20) columns in 8.4 database may contains spaces and - signs like 13-333-333 12 3-44 33 33 333 12345 User enters code to search without spaces and - signs, like 12344 How to search for product code ignoring spaces and - signs? For example searching for code 12344 should return 12 3-44 as matching item. Andrus.
Hello you can use a own function CREATE OR REPLACE FUNCTION clean_some(text) RETURNS text AS $$ SELECT replace(replace($1, ' ',''),'-','') $$ LANGUAGE sql; then you can do query with where clause WHERE clean_some(colum) = clean_some('userinput'); you can enhance it with functional index CRETE INDEX xx ON your_table((clean_some(column)); Regards Pavel Stehule 2010/10/13 Andrus <kobruleht2@hot.ee>: > CHAR(20) columns in 8.4 database may contains spaces and - signs like > > 13-333-333 > 12 3-44 > 33 33 333 > 12345 > > User enters code to search without spaces and - signs, like 12344 > How to search for product code ignoring spaces and - signs? > > For example searching for code 12344 should return > 12 3-44 as matching item. > > Andrus. > > -- > Sent via pgsql-general mailing list (pgsql-general@postgresql.org) > To make changes to your subscription: > http://www.postgresql.org/mailpref/pgsql-general >
2010/10/13 Andrus <kobruleht2@hot.ee>: > CHAR(20) columns in 8.4 database may contains spaces and - signs like > > 13-333-333 > 12 3-44 > 33 33 333 > 12345 > > User enters code to search without spaces and - signs, like 12344 > How to search for product code ignoring spaces and - signs? > > For example searching for code 12344 should return > 12 3-44 as matching item. postgres=# select regexp_replace('1243-56 6536 7', E'[^0-9]', '', 'g'); regexp_replace ---------------- 12435665367 (1 row) that's the key. obviously, searching with that as predicate through large table will be unpleasant unless you precalculate the above into index expression. merlin
Andrus wrote: > CHAR(20) columns in 8.4 database may contains spaces and - signs like > > 13-333-333 > 12 3-44 > 33 33 333 > 12345 > > User enters code to search without spaces and - signs, like 12344 > How to search for product code ignoring spaces and - signs? > > For example searching for code 12344 should return > 12 3-44 as matching item. Is your search an exact match ignoring the spaces and dashes, or a substring search? If the field contained 0123445 then is a search for 12344 supposed to match it or not? The best simple way to do what you want is to add another CHAR(20) column for each of the existing ones like you describe where the extra column has a copy of the original one but with the spaces and dashes removed. Then when doing searches you search on the new copy and when displaying you display the original copy. Doing this would save the database having to do the most expensive kinds of computations repeatedly at the time of searching given that these can be staticly precomputed. Moreover, if your search is exact-match, you get additional speed gains by having an index on the search column. (I don't know if there is any kind of useful index for substring matches besides full text search.) -- Darren Duncan
On 14/10/10 01:45, Andrus wrote: > CHAR(20) columns in 8.4 database may contains spaces and - signs like > > 13-333-333 > 12 3-44 > 33 33 333 > 12345 > > User enters code to search without spaces and - signs, like 12344 > How to search for product code ignoring spaces and - signs? > > For example searching for code 12344 should return > 12 3-44 as matching item. Either convert the user input into a suitable regular expression, or pre-process the column being searched to strip spaces and - signs. I'd recommend writing a simple SQL function that uses a regexp_replace() or a couple of regular replace() calls to simplify the column being searched down to only numbers. If you want to strip *everything* that's not a number, you could use: CREATE OR REPLACE FUNCTION strip_nondigits(text) RETURNS text AS $$ SELECT regexp_replace($1, E'[^0-9]', '', 'g'); $$ LANGUAGE 'sql' IMMUTABLE STRICT; See: http://www.postgresql.org/docs/current/static/functions-string.html http://www.postgresql.org/docs/current/static/functions-matching.html#FUNCTIONS-POSIX-REGEXP http://www.postgresql.org/docs/current/interactive/xfunc-sql.html You can now create a functional index on your target column that uses that function, allowing you to run indexed searches against the column without repeatedly re-evaluating the stripping expression for every value during every search. The query planner will recognise when you use the indexed expression in a query, and will use the index where appropriate. Make the index like this: CREATE INDEX stripped_numbers ON mytable ((strip_nondigits(thecolumn)); then use it like this: SELECT * FROM mytable WHERE strip_nondigits('123-user-input-here') = strip_nondigits(thecolumn); EXPLAIN should show that an index scan of stripped_numbers is being used, at least if the table contains non-trivial amounts of data. Alternately, you could use a trigger to maintain a stripped version of the field as an additional column in each row. Both approaches cost you a bit of time during updates/inserts though, and the functional index is probably easier. -- Craig Ringer Tech-related writing: http://soapyfrogs.blogspot.com/
Dear Andrus, Quick & Dirty Soln: SELECT * from table where regexp_replace( col , '[-\\s+]' , '' , 'g') ilike '%search_term%' ; note above sql will not use any index if you have to search 10000s of rows use alternate approaches. regds Rajesh Kumar Mallah. 2010/10/13 Andrus <kobruleht2@hot.ee>: > CHAR(20) columns in 8.4 database may contains spaces and - signs like > > 13-333-333 > 12 3-44 > 33 33 333 > 12345 > > User enters code to search without spaces and - signs, like 12344 > How to search for product code ignoring spaces and - signs? > > For example searching for code 12344 should return > 12 3-44 as matching item. > > Andrus. > > -- > Sent via pgsql-general mailing list (pgsql-general@postgresql.org) > To make changes to your subscription: > http://www.postgresql.org/mailpref/pgsql-general >
On 10/13/2010 07:45 PM, Andrus wrote: > CHAR(20) columns in 8.4 database may contains spaces and - signs like > > 13-333-333 > 12 3-44 > 33 33 333 > 12345 > > User enters code to search without spaces and - signs, like 12344 > How to search for product code ignoring spaces and - signs? > There are many options to do that... You could use a regular expression like this: DMP=# select * from foo where prod_code ~ '1[ -]*2[ -]*3[ -]*4[ -]*4'; prod_code ----------- 12 3-44 12-3-44 123 44 [ -]* means "zero or more dashes or spaces". Maybe easier or (computationally) faster (YMMV on both counts) would be to replace() the dashes and spaces on the fly first and only search the cleaned string: DMP=# select prod_code, replace(replace(prod_code, '-', ''),' ','') from foo where replace(replace(prod_code, '-', ''),' ','') = '12344'; prod_code | replace -----------+--------- 12 3-44 | 12344 12-3-44 | 12344 123 44 | 12344 Or just store the codes in a uniform format to begin with. Christian
Hello, > > For example searching for code 12344 should return > 12 3-44 as matching item. > > Andrus. > This will do? postgres=# select name from myt; name ------------ 13-333-333 12 3-44 33 33 333 12345 (4 rows) postgres=# select * from myt where translate(translate(name,'-',''),' ','') = '13333333'; name ------------ 13-333-333 (1 row) postgres=# select * from myt where translate(translate(name,'-',''),' ','') = '12344'; name --------- 12 3-44 (1 row) Regards, Jayadevan DISCLAIMER: "The information in this e-mail and any attachment is intended only for the person to whom it is addressed and may contain confidential and/or privileged material. If you have received this e-mail in error, kindly contact the sender and destroy all copies of the original communication. IBS makes no warranty, express or implied, nor guarantees the accuracy, adequacy or completeness of the information contained in this email or any attachment and is not liable for any errors, defects, omissions, viruses or for resultant loss or damage, if any, direct or indirect."
2010/10/13 Andrus <kobruleht2@hot.ee>: > CHAR(20) columns in 8.4 database may contains spaces and - signs like > > 13-333-333 > 12 3-44 > 33 33 333 > 12345 > > User enters code to search without spaces and - signs, like 12344 > How to search for product code ignoring spaces and - signs? > > For example searching for code 12344 should return > 12 3-44 as matching item. How about using translate() to strip off dashes and spaces: SELECT mycol FROM mytable WHERE translate(mycol, '- ', '') = '12344'; Josh
select regexp_replace(myval, E'(\\D)', '', 'g') from foo; for added speed, you might consider this: CREATE INDEX ON foo((regexp_replace(myval, E'(\\D)', '', 'g'))::bigint); which is also going to protect you against inserts where value doesn't contain any digits. and added benefit of index: gj=# select * from foo where (regexp_replace(myval::text, '(\D)'::text, ''::text, 'g'::text)::bigint) = 5656; id | myval ----+------- 61 | 56-56 (1 row) Time: 1.356 ms gj=# explain select * from foo where (regexp_replace(myval::text, '(\D)'::text, ''::text, 'g'::text)::bigint) = 5656; QUERY PLAN --------------------------------------------------------------------------------------------------- Index Scan using foo_regexp_replace_idx on foo (cost=0.01..8.38 rows=1 width=17) Index Cond: ((regexp_replace((myval)::text, '(\D)'::text, ''::text, 'g'::text))::bigint = 5656) (2 rows) gj=# \d+ foo Table "public.foo" Column | Type | Modifiers | Storage | Description --------+-----------------------+--------------------------------------------------+----------+------------- id | integer | not null default nextval('foo_id_seq'::regclass) | plain | myval | character varying(20) | not null | extended | Indexes: "foo_pkey" PRIMARY KEY, btree (id) "foo_regexp_replace_idx" btree ((regexp_replace(myval::text, '(\D)'::text, ''::text, 'g'::text)::bigint)) Has OIDs: no or even create a view that would allow you to make it nice and easy: gj=# CREATE VIEW foo_view AS select id, (regexp_replace(myval::text, '(\D)'::text, ''::text, 'g'::text)::bigint) AS int_val FROM foo; CREATE VIEW gj=# select * from foo_view where int_val = 1212; id | int_val ----+--------- 17 | 1212 (1 row) Time: 0.709 ms gj=# explain select * from foo_view where int_val = 1212; QUERY PLAN --------------------------------------------------------------------------------------------------- Index Scan using foo_regexp_replace_idx on foo (cost=0.01..8.39 rows=1 width=17) Index Cond: ((regexp_replace((myval)::text, '(\D)'::text, ''::text, 'g'::text))::bigint = 1212) (2 rows) HTH -- GJ
2010/10/13 Andrus <kobruleht2@hot.ee>: > CHAR(20) columns in 8.4 database may contains spaces and - signs like > > 13-333-333 > 12 3-44 > 33 33 333 > 12345 > > User enters code to search without spaces and - signs, like 12344 > How to search for product code ignoring spaces and - signs? > > For example searching for code 12344 should return > 12 3-44 as matching item. > Try: SELECT * FROM your_table WHERE regexp_replace(your_col, '[^[:digit:]]', '', 'g') = your_number; Osvaldo
Hi Andrus,
2010/10/13 Andrus <kobruleht2@hot.ee>
Check out http://www.postgresql.org/docs/8.4/interactive/functions-string.html and try using these functions to modify the comparison in the WHERE portion of a query.
For example, if the CHAR(20) column that you mentioned is called 'sku' and the user enters a value like '12345'
select * from table where replace(replace(sku, ' ', ''), '-', '') = '12344'
-Joshua
CHAR(20) columns in 8.4 database may contains spaces and - signs like
13-333-333
12 3-44
33 33 333
12345
User enters code to search without spaces and - signs, like 12344
How to search for product code ignoring spaces and - signs?
For example searching for code 12344 should return
12 3-44 as matching item.
Check out http://www.postgresql.org/docs/8.4/interactive/functions-string.html and try using these functions to modify the comparison in the WHERE portion of a query.
For example, if the CHAR(20) column that you mentioned is called 'sku' and the user enters a value like '12345'
select * from table where replace(replace(sku, ' ', ''), '-', '') = '12344'
-Joshua