Thread: regexp_replace puzzle
I've got a problem with regexp_replace which I could reduce to the following: CREATE FUNCTION digest(text, text) RETURNS bytea LANGUAGE c IMMUTABLE STRICT AS '$libdir/pgcrypto', 'pg_digest'; CREATE FUNCTION sha224enc(text) RETURNS text AS $$ BEGIN RAISE WARNING 'arg=»%«', $1; RETURN encode(digest($1, 'sha224'), 'hex'); END; $$ LANGUAGE plpgsql IMMUTABLE; CREATE TABLE t1 ( id serial NOT NULL, val text NOT NULL, PRIMARY KEY (id) ); INSERT INTO t1 (val) VALUES ('d111111'); INSERT INTO t1 (val) VALUES ('xd222222'); INSERT INTO t1 (val) VALUES ('x d333333'); SELECT val, regexp_replace(val, '^(.*\W)?(C\d{7}|[DI]\d{6}|S\d{10})(\W.*)?$', '\1' || '»\2«='|| sha224enc('\2') || '\3', 'i') FROM t1 WHERE val ~* '^(.*\W)?(C\d{7}|[DI]\d{6}|S\d{10})(\W.*)?$'; (I want to replace patterns within a string by their SHA-224 hash.) However, when I run this example I get: WARNING: arg=»\2« val | regexp_replace -----------+---------------------------------------------------------------------- d111111 | »d111111«=da2c99be8170ce02b04ee7d90877ae9d86fea467abb973c74c708b27 x d333333 | x »d333333«=da2c99be8170ce02b04ee7d90877ae9d86fea467abb973c74c708b27 (2 rows) i.e. the first '\2' gets properly expanded by the second paren match, but the second '\2' doesn't get expanded. What am I overlooking?
2010/3/10 Harald Fuchs <hari.fuchs@gmail.com>: > I've got a problem with regexp_replace which I could reduce to the following: > > CREATE FUNCTION digest(text, text) RETURNS bytea > LANGUAGE c IMMUTABLE STRICT > AS '$libdir/pgcrypto', 'pg_digest'; > > CREATE FUNCTION sha224enc(text) RETURNS text AS $$ > BEGIN > RAISE WARNING 'arg=»%«', $1; > RETURN encode(digest($1, 'sha224'), 'hex'); > END; > $$ LANGUAGE plpgsql IMMUTABLE; > > CREATE TABLE t1 ( > id serial NOT NULL, > val text NOT NULL, > PRIMARY KEY (id) > ); > > INSERT INTO t1 (val) VALUES ('d111111'); > INSERT INTO t1 (val) VALUES ('xd222222'); > INSERT INTO t1 (val) VALUES ('x d333333'); > > SELECT val, > regexp_replace(val, > '^(.*\W)?(C\d{7}|[DI]\d{6}|S\d{10})(\W.*)?$', > '\1' || '»\2«='|| sha224enc('\2') || '\3', 'i') > FROM t1 > WHERE val ~* > '^(.*\W)?(C\d{7}|[DI]\d{6}|S\d{10})(\W.*)?$'; > > (I want to replace patterns within a string by their SHA-224 hash.) > However, when I run this example I get: > > WARNING: arg=»\2« > val | regexp_replace > -----------+---------------------------------------------------------------------- > d111111 | »d111111«=da2c99be8170ce02b04ee7d90877ae9d86fea467abb973c74c708b27 > x d333333 | x »d333333«=da2c99be8170ce02b04ee7d90877ae9d86fea467abb973c74c708b27 > (2 rows) > > i.e. the first '\2' gets properly expanded by the second paren match, > but the second '\2' doesn't get expanded. > > What am I overlooking? > Use g flag. "Flag g causes the function to find each match in the string, not only the first one, and return a row for each such match." [1] Osvaldo [1] http://www.postgresql.org/docs/current/interactive/functions-matching.html#FUNCTIONS-POSIX-REGEXP
On Wed, 10 Mar 2010 13:41:54 +0100, Harald Fuchs wrote about [GENERAL] regexp_replace puzzle: [snip] > SELECT val, > regexp_replace(val, > '^(.*\W)?(C\d{7}|[DI]\d{6}|S\d{10})(\W.*)?$', > '\1' || '»\2«='|| sha224enc('\2') || '\3', 'i') [snip] >i.e. the first '\2' gets properly expanded by the second paren match, >but the second '\2' doesn't get expanded. The second instance of '\2' is first passed to sha224enc(), then that function's result is passed to regexp_replace. -- Regards, Dave [RLU #314465] ======================================================================= david.w.noon@ntlworld.com (David W Noon) =======================================================================