regexp_replace failing on 9.0.4

Started by Rob Sargentabout 13 years ago9 messagesgeneral
Jump to latest
#1Rob Sargent
robjsargent@gmail.com

On our 9.0.4[1]PostgreSQL 9.0.4 on x86_64-unknown-linux-gnu, compiled by GCC gcc (SUSE Linux) 4.5.0 20100604 [gcc-4_5-branch revision 160292], 64-bit server my regexp_replace is a no-op, but on the 9.0.3[2]PostgreSQL 9.0.3 on x86_64-unknown-linux-gnu, compiled by GCC gcc (SUSE Linux) 4.5.0 20100604 [gcc-4_5-branch revision 160292], 64-bit
test machine and my 9.1.2[3]PostgreSQL 9.1.2 on x86_64-unknown-linux-gnu, compiled by gcc (Ubuntu/Linaro 4.6.1-9ubuntu3) 4.6.1, 64-bit dev box all is fine

This is may statement

update cms.segment_data s
set text = regexp_replace(s.text,
'(^.*)ns/acres/pathology/dx/1.5(.*$)',
E'\\1ns/acres/pathology/dx/1.6\\2')
from
dm.payload_segment p,
support.fix_pathdx_namespace f
where s.id = p.segment_data_id
and p.item_id = f.item_id
and p.content_version = f.maxversion
;

"UPDATE 76" reported, but zero changes persist

When I just select the regexp like so

select legacy_id,
regexp_replace( substring(s.text, 1, 150) ,
'(^.*)ns/acres/pathology/dx/1.5(.*$)',
E'\\1ns/acres/pathology/dx/1.6\\2')
from
dm.payload_segment p,
support.fix_pathdx_namespace f,
cms.segment_data s
where s.id = p.segment_data_id
and p.item_id = f.item_id
and p.content_version = f.maxversion
;

I get the corrected data (chiefly the "1.6")

Can anyone see where I've gone off track?

[1]: PostgreSQL 9.0.4 on x86_64-unknown-linux-gnu, compiled by GCC gcc (SUSE Linux) 4.5.0 20100604 [gcc-4_5-branch revision 160292], 64-bit
(SUSE Linux) 4.5.0 20100604 [gcc-4_5-branch revision 160292], 64-bit

[2]: PostgreSQL 9.0.3 on x86_64-unknown-linux-gnu, compiled by GCC gcc (SUSE Linux) 4.5.0 20100604 [gcc-4_5-branch revision 160292], 64-bit
(SUSE Linux) 4.5.0 20100604 [gcc-4_5-branch revision 160292], 64-bit

[3]: PostgreSQL 9.1.2 on x86_64-unknown-linux-gnu, compiled by gcc (Ubuntu/Linaro 4.6.1-9ubuntu3) 4.6.1, 64-bit
(Ubuntu/Linaro 4.6.1-9ubuntu3) 4.6.1, 64-bit

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

#2Tom Lane
tgl@sss.pgh.pa.us
In reply to: Rob Sargent (#1)
Re: regexp_replace failing on 9.0.4

Rob Sargent <robjsargent@gmail.com> writes:

On our 9.0.4[1] server my regexp_replace is a no-op, but on the 9.0.3[2]
test machine and my 9.1.2[3] dev box all is fine

AFAICS from the commit logs, there were no changes affecting the regex
code between 9.0.3 and 9.0.4. I'm suspicious that your data is
different on the different servers.

regards, tom lane

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

#3Rob Sargent
robjsargent@gmail.com
In reply to: Tom Lane (#2)
Re: regexp_replace failing on 9.0.4

On 03/18/2013 01:19 PM, Tom Lane wrote:

Rob Sargent <robjsargent@gmail.com> writes:

On our 9.0.4[1] server my regexp_replace is a no-op, but on the 9.0.3[2]
test machine and my 9.1.2[3] dev box all is fine

AFAICS from the commit logs, there were no changes affecting the regex
code between 9.0.3 and 9.0.4. I'm suspicious that your data is
different on the different servers.

regards, tom lane

Good to hear, thought I might have glossed over the telling release note
- my usual mo

I restored my dev db from prod just for this run. test and prod are out
of sync by a couple of weeks.

What I had to do ultimately was as follows,

create table support.duplicate_pathdx_namespace
as select item_id, legacy_id, name, locked_by, maxversion,
regexp_replace(substring(content,1,150),
'(^.*)ns/acres/pathology/dx/1.5(.*$)',
E'\\1ns/acres/pathology/dx/1.6\\2')
|| substring(content, 151) as content
from support.fix_pathdx_namespace;

and the update finally stuck, then copied that reconstructed textblock
over to the target production table in an regexp-less update. Doing the
reg_exp_replace on the whole text blog as part of create temp table did
not work.

I'll double check my dev server from same pre-run dump.

Weird.

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

#4Tom Lane
tgl@sss.pgh.pa.us
In reply to: Rob Sargent (#3)
Re: regexp_replace failing on 9.0.4

Rob Sargent <robjsargent@gmail.com> writes:

On 03/18/2013 01:19 PM, Tom Lane wrote:

Rob Sargent <robjsargent@gmail.com> writes:

On our 9.0.4[1] server my regexp_replace is a no-op, but on the 9.0.3[2]
test machine and my 9.1.2[3] dev box all is fine

AFAICS from the commit logs, there were no changes affecting the regex
code between 9.0.3 and 9.0.4. I'm suspicious that your data is
different on the different servers.

Good to hear, thought I might have glossed over the telling release note
- my usual mo

Maybe we're barking up the wrong tree by suspecting the regex itself.
Perhaps the updates were suppressed by a trigger, or the transaction
rolled back instead of committing, or some such?

regards, tom lane

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

#5Rob Sargent
robjsargent@gmail.com
In reply to: Tom Lane (#4)
Re: regexp_replace failing on 9.0.4

On 03/18/2013 02:40 PM, Tom Lane wrote:

Rob Sargent <robjsargent@gmail.com> writes:

On 03/18/2013 01:19 PM, Tom Lane wrote:

Rob Sargent <robjsargent@gmail.com> writes:

On our 9.0.4[1] server my regexp_replace is a no-op, but on the 9.0.3[2]
test machine and my 9.1.2[3] dev box all is fine

AFAICS from the commit logs, there were no changes affecting the regex
code between 9.0.3 and 9.0.4. I'm suspicious that your data is
different on the different servers.

Good to hear, thought I might have glossed over the telling release note
- my usual mo

Maybe we're barking up the wrong tree by suspecting the regex itself.
Perhaps the updates were suppressed by a trigger, or the transaction
rolled back instead of committing, or some such?

regards, tom lane

The work was all rolled into a function:
o find the chapters;
o copy the necessary data (mainly the text blob) into a back-out
table
o "lock" the chapters (protect them from exposure to the client app)
o perform the regexp_replace as the update to prod. table

The function was exec'd in a tx and committed, leaving the back-out
table and the programmatic locks in place, but the update itself had
been a no-op and continued to be with ad hoc update statements, until I
hit the final goofy answer ( rg_replace(string, start) || substring(end) )

Have not yet had a chance to re-create on dev. Test worked like a charm.

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

#6Rob Sargent
robjsargent@gmail.com
In reply to: Tom Lane (#4)
Re: regexp_replace failing on 9.0.4

Maybe we're barking up the wrong tree by suspecting the regex itself.
Perhaps the updates were suppressed by a trigger, or the transaction
rolled back instead of committing, or some such?

regards, tom lane

Barking mad, more like it. I had rolled back the execution of my
function in my dev env. and running again produces the same result as
experienced in production system.

Why the update of the text type field view regexp_replace is failing
still confuses me, but that's pretty much my natural state. The size of
the text field ranges from 7.5k to 24k char.

Here is the update (again)
update cms.segment_data s
set text = regexp_replace(f.content,
'(^.*)ns/acres/pathology/dx/1.5(.*$)', E'\\1ns/acres/pathology/dx/1.6\\2')
from
dm.payload_segment p,
support.fix_pathdx_namespace f
where s.id = p.segment_data_id
and p.item_id = f.item_id
and p.content_version = f.maxversion
;

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

#7Rob Sargent
robjsargent@gmail.com
In reply to: Tom Lane (#4)
Re: regexp_replace failing on 9.0.4

On 03/18/2013 02:40 PM, Tom Lane wrote:

Rob Sargent <robjsargent@gmail.com> writes:

On 03/18/2013 01:19 PM, Tom Lane wrote:

Rob Sargent <robjsargent@gmail.com> writes:

On our 9.0.4[1] server my regexp_replace is a no-op, but on the 9.0.3[2]
test machine and my 9.1.2[3] dev box all is fine

AFAICS from the commit logs, there were no changes affecting the regex
code between 9.0.3 and 9.0.4. I'm suspicious that your data is
different on the different servers.

Good to hear, thought I might have glossed over the telling release note
- my usual mo

Maybe we're barking up the wrong tree by suspecting the regex itself.
Perhaps the updates were suppressed by a trigger, or the transaction
rolled back instead of committing, or some such?

regards, tom lane

For fun I decided to install 9.2 and thought I would try my luck there.
Here's was I saw (apologies for the wide output).

#localhost:cms# select count(*) from pg_trigger;
+-------+
| count |
+-------+
| 364 |
+-------+
(1 row)

Time: 0.407 ms
#localhost:cms# select tgname from pg_trigger where tgname !~
'^RI_ConstraintTrigger';
+--------+
| tgname |
+--------+
+--------+
(0 rows)

#localhost:cms# select version();
+-------------------------------------------------------------------------------------------------------+
|                 version 
     |
+---------------------------------------------------------------------------------------------------------
| PostgreSQL 9.2.3 on x86_64-unknown-linux-gnu, compiled by gcc 
(Ubuntu/Linaro 4.6.3-1ubuntu5) 4.6.3, 64-bit |
+-------------------------------------------------------------------------------------------------------+
(1 row)
select substring(text,1,150) from cms.segment_data
where id = 'c092880f-8484-4b29-b712-f3df12216701';
+----------------------------------------------------------------------------------------------------------------+
|                                                   substring 
                                          |
+----------------------------------------------------------------------------------------------------------------+
| <?xml version="1.0" encoding="UTF-8"?>\r 
                                         +|
| <pathology xmlns="http://amirsys.com/ns/acres/pathology/dx/1.5" 
enable-tables="true"><metadata><id>diversion_c |
+----------------------------------------------------------------------------------------------------------------+
(1 row)

#localhost:cms# begin;
BEGIN

<< simple update in place>>
update cms.segment_data
set text = regexp_replace(text,'(^.*)ns/acres/pathology/dx/1.5(.*$)',
E'\\1ns/acres/pathology/dx/1.6\\2')
where id = 'c092880f-8484-4b29-b712-f3df12216701';
UPDATE 1

< check >>
select substring(text,1,150) from cms.segment_data
where id = 'c092880f-8484-4b29-b712-f3df12216701';
+----------------------------------------------------------------------------------------------------------------+
|                                                 substring 
                                        |
+----------------------------------------------------------------------------------------------------------------+
| <?xml version="1.0" encoding="UTF-8"?>\r 
                                         +|
| <pathology xmlns="http://amirsys.com/ns/acres/pathology/dx/1.5" 
enable-tables="true"><metadata><id>diversion_c |
+----------------------------------------------------------------------------------------------------------------+
(1 row) NO CHANGE (still "1.5");

<< update in parts >>
update cms.segment_data set text =
regexp_replace(substring(text,1,150),
'(^.*)ns/acres/pathology/dx/1.5(.*$)',
E'\\1ns/acres/pathology/dx/1.6\\2') || substring(text, 151)
where id = 'c092880f-8484-4b29-b712-f3df12216701';
UPDATE 1

<<check>>
select substring(text,1,150) from cms.segment_data
where id = 'c092880f-8484-4b29-b712-f3df12216701';
+----------------------------------------------------------------------------------------------------------------+
|                                                   substring 
                                          |
+----------------------------------------------------------------------------------------------------------------+
| <?xml version="1.0" encoding="UTF-8"?>\r 
                                         +|
| <pathology xmlns="http://amirsys.com/ns/acres/pathology/dx/1.6" 
enable-tables="true"><metadata><id>diversion_c |
+----------------------------------------------------------------------------------------------------------------+
(1 row)

<<CHANGED!! (now "1.6")>>

ROLLBACK

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

#8Tom Lane
tgl@sss.pgh.pa.us
In reply to: Rob Sargent (#7)
Re: regexp_replace failing on 9.0.4

Rob Sargent <robjsargent@gmail.com> writes:

For fun I decided to install 9.2 and thought I would try my luck there.
Here's was I saw (apologies for the wide output).

<< simple update in place>>
update cms.segment_data
set text = regexp_replace(text,'(^.*)ns/acres/pathology/dx/1.5(.*$)',
E'\\1ns/acres/pathology/dx/1.6\\2')
where id = 'c092880f-8484-4b29-b712-f3df12216701';
UPDATE 1
[ doesn't change the first 150 characters of the field ]

<< update in parts >>
update cms.segment_data set text =
regexp_replace(substring(text,1,150),
'(^.*)ns/acres/pathology/dx/1.5(.*$)',
E'\\1ns/acres/pathology/dx/1.6\\2') || substring(text, 151)
where id = 'c092880f-8484-4b29-b712-f3df12216701';
UPDATE 1
[ does change the first 150 characters of the field ]

I'm suspicious that there is more than one match to the substring
in that field, with the extra match(es) coming beyond char 150.
The way that regexp is written, I think it would replace the last
match not the first.

regards, tom lane

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

#9Rob Sargent
robjsargent@gmail.com
In reply to: Tom Lane (#8)
Re: regexp_replace failing on 9.0.4

On 03/21/2013 06:25 PM, Tom Lane wrote:

Rob Sargent <robjsargent@gmail.com> writes:

For fun I decided to install 9.2 and thought I would try my luck there.
Here's was I saw (apologies for the wide output).

<< simple update in place>>
update cms.segment_data
set text = regexp_replace(text,'(^.*)ns/acres/pathology/dx/1.5(.*$)',
E'\\1ns/acres/pathology/dx/1.6\\2')
where id = 'c092880f-8484-4b29-b712-f3df12216701';
UPDATE 1
[ doesn't change the first 150 characters of the field ]

<< update in parts >>
update cms.segment_data set text =
regexp_replace(substring(text,1,150),
'(^.*)ns/acres/pathology/dx/1.5(.*$)',
E'\\1ns/acres/pathology/dx/1.6\\2') || substring(text, 151)
where id = 'c092880f-8484-4b29-b712-f3df12216701';
UPDATE 1
[ does change the first 150 characters of the field ]

I'm suspicious that there is more than one match to the substring
in that field, with the extra match(es) coming beyond char 150.
The way that regexp is written, I think it would replace the last
match not the first.

regards, tom lane

Indeed there are (or at least may be) other instances of the namespace
string. I was operating on the assumption that only the first would get
hit, but I fear greediness has gotten the better of my yet again.

And there's reason to believe the "9.0.3" test server db did not have
the proliferations of the string.

Thanks as always.

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general