Regex help again (sorry, I am bad at these)

Started by Christopher Molnarover 10 years ago7 messagesgeneral
Jump to latest
#1Christopher Molnar
cmolnar@ourworldservices.com

Sorry to have to ask the experts here for some regex assistance again. I am
admittadly awful with these and could use some help.

Have the following string (this is an example) that needs to be changed.
Need to use a search and replace because the links across over 200K records
are similar but not the same.

'<p>Complete the attached lab and submit via dropbox</p>\r<p><a href="
https://owncloud.porterchester.edu/HVACR/PCI_GasHeat/GasElectrical/HVACR1114_LAB_13A.pdf&quot;
title="Lab 13A">Lab 13A<\a>'

Need the final string to separate the "LAB_13A.pdf" from the rest of the
URL by inserting a "&file=" in front of it. The final string should look
like:

'<p>Complete the attached lab and submit via dropbox</p>\r<p><a href="
https://owncloud.porterchester.edu/HVACR/PCI_GasHeat/GasElectrical/&amp;file=HVACR1114_LAB_13A.pdf&quot;
title="Lab 13A">Lab 13A<\a>'

I have tried something like:

update pcilms_assign set intro=regexp_replace(intro, '/([^/]*)\" title=$',
'&files=\1') where intro like '%https://owncloud.porterchester.edu%&#39; and
course=18 and id=55413;

and the result puts the &file= in the wrong place (at the end of the whole
string).

Any suggestions?

Thanks!

-Chris

#2Félix GERZAGUET
felix.gerzaguet@gmail.com
In reply to: Christopher Molnar (#1)
Re: Regex help again (sorry, I am bad at these)

Hello Chris,

On Mon, Dec 28, 2015 at 8:10 PM, Christopher Molnar <
cmolnar@ourworldservices.com> wrote:

Any suggestions?

This seems to works:

select regexp_replace('<p>Complete the attached lab and submit via
dropbox</p>\r<p><a href="
https://owncloud.porterchester.edu/HVACR/PCI_GasHeat/GasElectrical/HVACR1114_LAB_13A.pdf&quot;
title="Lab 13A">Lab 13A<\a>', '/([^/]*)\" title=', '/&file=\1" title=')

Regards,

Félix

#3Melvin Davidson
melvin6925@gmail.com
In reply to: Christopher Molnar (#1)
Re: Regex help again (sorry, I am bad at these)

Will this work?

UPDATE your_table
SET your_column = REPLACE (your_column, 'HVACR1114_LAB_13A.pdf',
'&file=HVACR1114_LAB_13A.pdf')
WHERE <whatever_expression is needed>;

Your mileage may vary because you have not stated your VERSION of
PostgreSQL or your O/S.

On Mon, Dec 28, 2015 at 2:10 PM, Christopher Molnar <
cmolnar@ourworldservices.com> wrote:

Sorry to have to ask the experts here for some regex assistance again. I
am admittadly awful with these and could use some help.

Have the following string (this is an example) that needs to be changed.
Need to use a search and replace because the links across over 200K records
are similar but not the same.

'<p>Complete the attached lab and submit via dropbox</p>\r<p><a href="
https://owncloud.porterchester.edu/HVACR/PCI_GasHeat/GasElectrical/HVACR1114_LAB_13A.pdf&quot;
title="Lab 13A">Lab 13A<\a>'

Need the final string to separate the "LAB_13A.pdf" from the rest of the
URL by inserting a "&file=" in front of it. The final string should look
like:

'<p>Complete the attached lab and submit via dropbox</p>\r<p><a href="
https://owncloud.porterchester.edu/HVACR/PCI_GasHeat/GasElectrical/&amp;file=HVACR1114_LAB_13A.pdf&quot;
title="Lab 13A">Lab 13A<\a>'

I have tried something like:

update pcilms_assign set intro=regexp_replace(intro, '/([^/]*)\"
title=$', '&files=\1') where intro like '%
https://owncloud.porterchester.edu%&#39; and course=18 and id=55413;

and the result puts the &file= in the wrong place (at the end of the whole
string).

Any suggestions?

Thanks!

-Chris

--
*Melvin Davidson*
I reserve the right to fantasize. Whether or not you
wish to share my fantasy is entirely up to you.

#4David G. Johnston
david.g.johnston@gmail.com
In reply to: Christopher Molnar (#1)
Re: Regex help again (sorry, I am bad at these)

On Mon, Dec 28, 2015 at 12:10 PM, Christopher Molnar <
cmolnar@ourworldservices.com> wrote:

Given this...

'<p>Complete the attached lab and submit via dropbox</p>\r<p><a href="
https://owncloud.porterchester.edu/HVACR/PCI_GasHeat/GasElectrical/HVACR1114_LAB_13A.pdf&quot;
title="Lab 13A">Lab 13A<\a>'

​I have no clue how the following gives you any matches...​
specifically the presence of the "$" after the title= causes the entire
pattern to always fail since that isn't the end of the string.

update pcilms_assign set intro=regexp_replace(intro, '/([^/]*)\"
title=$', '&files=\1') where intro like '%
https://owncloud.porterchester.edu%&#39; and course=18 and id=55413;

and the result puts the &file= in the wrong place (at the end of the whole
string).

​The basic problem is that entirety of the content that your pattern
matches ​is replaced with the totality of the replacement expression.
Since you are matching the literal "title=" you have to somehow place that
same literal in the result. You can capture it and then use "\2" or you
can place it literally like Félix shows.

Alternatively, don't capture it. The way you match something without
capturing it is by using what is termed a "zero-width" expression or a
"look-around". In this case you want to "look-ahead" which is expressed
thusly: (?=)

So...

'/([^/]*)(?=" title=)'

SELECT regexp_replace('<a href="https://www.www.www/path/FILE.pdf&quot;
title="FILE">', '/([^/]*)(?=" title=)', '&files=\1')

David J.

#5David G. Johnston
david.g.johnston@gmail.com
In reply to: Melvin Davidson (#3)
Re: Regex help again (sorry, I am bad at these)

On Mon, Dec 28, 2015 at 12:25 PM, Melvin Davidson <melvin6925@gmail.com>
wrote:

Will this work?

UPDATE your_table
SET your_column = REPLACE (your_column, 'HVACR1114_LAB_13A.pdf',
'&file=HVACR1114_LAB_13A.pdf')
WHERE <whatever_expression is needed>;

Your mileage may vary because you have not stated your VERSION of
PostgreSQL or your O/S.

​What part of your solution is version or O/S dependent?

David J.

#6Melvin Davidson
melvin6925@gmail.com
In reply to: David G. Johnston (#5)
Re: Regex help again (sorry, I am bad at these)

Although, in this particular case, it is not version or O/S specific, it is
generally a good policy (and manners) to state them whenever contacting
this mail list.
In that way, future users that refer back to problems have it documented as
to which are and are not version specific.

On Mon, Dec 28, 2015 at 2:42 PM, David G. Johnston <
david.g.johnston@gmail.com> wrote:

On Mon, Dec 28, 2015 at 12:25 PM, Melvin Davidson <melvin6925@gmail.com>
wrote:

Will this work?

UPDATE your_table
SET your_column = REPLACE (your_column, 'HVACR1114_LAB_13A.pdf',
'&file=HVACR1114_LAB_13A.pdf')
WHERE <whatever_expression is needed>;

Your mileage may vary because you have not stated your VERSION of
PostgreSQL or your O/S.

​What part of your solution is version or O/S dependent?

David J.

--
*Melvin Davidson*
I reserve the right to fantasize. Whether or not you
wish to share my fantasy is entirely up to you.

#7Christopher Molnar
cmolnar@ourworldservices.com
In reply to: Félix GERZAGUET (#2)
Re: Regex help again (sorry, I am bad at these)

Thank you Felix that was exactly what I needed!

-Chris

On Mon, Dec 28, 2015 at 2:23 PM, Félix GERZAGUET <felix.gerzaguet@gmail.com>
wrote:

Show quoted text

Hello Chris,

On Mon, Dec 28, 2015 at 8:10 PM, Christopher Molnar <
cmolnar@ourworldservices.com> wrote:

Any suggestions?

This seems to works:

select regexp_replace('<p>Complete the attached lab and submit via
dropbox</p>\r<p><a href="
https://owncloud.porterchester.edu/HVACR/PCI_GasHeat/GasElectrical/HVACR1114_LAB_13A.pdf&quot;
title="Lab 13A">Lab 13A<\a>', '/([^/]*)\" title=', '/&file=\1" title=')

Regards,

Félix