regexp_replace puzzle

Started by Harald Fuchsabout 16 years ago3 messagesgeneral
Jump to latest
#1Harald Fuchs
hari.fuchs@gmail.com

I've got a problem with regexp_replace which I could reduce to the following:

CREATE FUNCTION digest(text, text) RETURNS bytea
LANGUAGE c IMMUTABLE STRICT
AS '$libdir/pgcrypto', 'pg_digest';

CREATE FUNCTION sha224enc(text) RETURNS text AS $$
BEGIN
RAISE WARNING 'arg=ᅵ%ᅵ', $1;
RETURN encode(digest($1, 'sha224'), 'hex');
END;
$$ LANGUAGE plpgsql IMMUTABLE;

CREATE TABLE t1 (
id serial NOT NULL,
val text NOT NULL,
PRIMARY KEY (id)
);

INSERT INTO t1 (val) VALUES ('d111111');
INSERT INTO t1 (val) VALUES ('xd222222');
INSERT INTO t1 (val) VALUES ('x d333333');

SELECT val,
regexp_replace(val,
'^(.*\W)?(C\d{7}|[DI]\d{6}|S\d{10})(\W.*)?$',
'\1' || 'ᅵ\2ᅵ='|| sha224enc('\2') || '\3', 'i')
FROM t1
WHERE val ~*
'^(.*\W)?(C\d{7}|[DI]\d{6}|S\d{10})(\W.*)?$';

(I want to replace patterns within a string by their SHA-224 hash.)
However, when I run this example I get:

WARNING: arg=ᅵ\2ᅵ
val | regexp_replace
-----------+----------------------------------------------------------------------
d111111 | ᅵd111111ᅵ=da2c99be8170ce02b04ee7d90877ae9d86fea467abb973c74c708b27
x d333333 | x ᅵd333333ᅵ=da2c99be8170ce02b04ee7d90877ae9d86fea467abb973c74c708b27
(2 rows)

i.e. the first '\2' gets properly expanded by the second paren match,
but the second '\2' doesn't get expanded.

What am I overlooking?

#2Osvaldo Kussama
osvaldo.kussama@gmail.com
In reply to: Harald Fuchs (#1)
Re: regexp_replace puzzle

2010/3/10 Harald Fuchs <hari.fuchs@gmail.com>:

I've got a problem with regexp_replace which I could reduce to the following:

 CREATE FUNCTION digest(text, text) RETURNS bytea
     LANGUAGE c IMMUTABLE STRICT
     AS '$libdir/pgcrypto', 'pg_digest';

 CREATE FUNCTION sha224enc(text) RETURNS text AS $$
 BEGIN
   RAISE WARNING 'arg=»%«', $1;
   RETURN encode(digest($1, 'sha224'), 'hex');
 END;
 $$ LANGUAGE plpgsql IMMUTABLE;

 CREATE TABLE t1 (
   id serial NOT NULL,
   val text NOT NULL,
   PRIMARY KEY (id)
 );

 INSERT INTO t1 (val) VALUES ('d111111');
 INSERT INTO t1 (val) VALUES ('xd222222');
 INSERT INTO t1 (val) VALUES ('x d333333');

 SELECT val,
        regexp_replace(val,
         '^(.*\W)?(C\d{7}|[DI]\d{6}|S\d{10})(\W.*)?$',
         '\1' || '»\2«='|| sha224enc('\2') || '\3', 'i')
 FROM t1
 WHERE val ~*
         '^(.*\W)?(C\d{7}|[DI]\d{6}|S\d{10})(\W.*)?$';

(I want to replace patterns within a string by their SHA-224 hash.)
However, when I run this example I get:

 WARNING:  arg=»\2«
     val    |                            regexp_replace
 -----------+----------------------------------------------------------------------
  d111111   | »d111111«=da2c99be8170ce02b04ee7d90877ae9d86fea467abb973c74c708b27
  x d333333 | x »d333333«=da2c99be8170ce02b04ee7d90877ae9d86fea467abb973c74c708b27
 (2 rows)

i.e. the first '\2' gets properly expanded by the second paren match,
but the second '\2' doesn't get expanded.

What am I overlooking?

Use g flag.
"Flag g causes the function to find each match in the string, not only
the first one, and return a row for each such match." [1]http://www.postgresql.org/docs/current/interactive/functions-matching.html#FUNCTIONS-POSIX-REGEXP

Osvaldo

[1]: http://www.postgresql.org/docs/current/interactive/functions-matching.html#FUNCTIONS-POSIX-REGEXP

#3David W Noon
dwnoon@ntlworld.com
In reply to: Harald Fuchs (#1)
Re: regexp_replace puzzle

On Wed, 10 Mar 2010 13:41:54 +0100, Harald Fuchs wrote about [GENERAL]
regexp_replace puzzle:

[snip]

SELECT val,
regexp_replace(val,
'^(.*\W)?(C\d{7}|[DI]\d{6}|S\d{10})(\W.*)?$',
'\1' || '»\2«='|| sha224enc('\2') || '\3', 'i')

[snip]

i.e. the first '\2' gets properly expanded by the second paren match,
but the second '\2' doesn't get expanded.

The second instance of '\2' is first passed to sha224enc(), then that
function's result is passed to regexp_replace.
--
Regards,

Dave [RLU #314465]
=======================================================================
david.w.noon@ntlworld.com (David W Noon)
=======================================================================