Thread: Unexpected Results from regexp_replace
I would like to extract only letters and numbers [0-9a-z] from an input string. Using regexp_replace in postgres, I get unexpected results. The pattern matching seems to stop before the end of the string value is reached. Some of the values that should have been excluded are still there. What is the trick for removing all instances of unwanted characters and spaces? I tried regexp_matches but the results were no better. In Oracle, the same code produces the expected results. Below is the code and the output from postgres and from oracle, for your reference. Thanks for your help! Sue ----------------------------------------------------------------------- POSTGRESQL: Incorrect results...left over unwanted characters create table test_str (greeting varchar(256),v_greeting varchar(256)); select * from test_str; insert into test_str (greeting) values ('Hello Worldy'); insert into test_str (greeting) values ('Hello, World!'); insert into test_str (greeting) values ('hello_world'); insert into test_str (greeting) values ('Hello! World.'); insert into test_str (greeting) values ('hello - world'); insert into test_str (greeting) values ('hello_world n'); insert into test_str (greeting) values ('hello world n n'); select greeting,regexp_replace (lower(greeting),'[^0-9a-z]{1,}','') str from test_str ; greeting str Hello Worldy helloworldy Hello, World! helloworld! hello_world helloworld Hello! World. helloworld. hello - world helloworld hello_world n helloworld n hello world n n helloworld n n ------------------------------------------------------------------------ ORACLE: Correct results...all unwanted characters are gone create table test_str (greeting varchar2(256),v_greeting varchar2(256)); select * from test_str; insert into test_str (greeting) values ('Hello Worldy'); insert into test_str (greeting) values ('Hello, World!'); insert into test_str (greeting) values ('hello_world'); insert into test_str (greeting) values ('Hello! World.'); insert into test_str (greeting) values ('hello - world'); insert into test_str (greeting) values ('hello_world n'); insert into test_str (greeting) values ('hello world n n'); select greeting,regexp_replace (lower(greeting),'[^0-9a-z]{1,}','') str from test_str ; GREETING STR Hello Worldy helloworldy Hello, World! helloworld hello_world helloworld Hello! World. helloworld hello - world helloworld hello_world n helloworldn hello world n n helloworldnn -- ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Susan E Hurst Principal Consultant Email: susan.hurst@brookhurstdata.com Mobile: 314-486-3261
What is the trick for removing all instances of unwanted characters and spaces?
(g)lobal flag.
SELECT regexp_replace('hello - world n', '[^0-9a-z]+', '', 'g')
;
David J.
Excellent! I tried the global flag before but I had it in the wrong place. Thanks for the quick service, David! --- ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Susan E Hurst Principal Consultant Email: susan.hurst@brookhurstdata.com Mobile: 314-486-3261 On 2015-11-10 14:19, David G. Johnston wrote: > On Tue, Nov 10, 2015 at 1:12 PM, Susan Hurst > <susan.hurst@brookhurstdata.com> wrote: > >> What is the trick for removing all instances of unwanted characters >> and spaces? > > (g)lobal flag. > > SELECT regexp_replace('hello - world n', '[^0-9a-z]+', '', 'g') > ; > > David J.