Replacing Ordinal Suffixes

Started by George Weaverabout 12 years ago18 messagesgeneral
Jump to latest
#1George Weaver
gweaver@shaw.ca

Hi list,

I'm stumped.

I am trying to use Regexp_Replace to replace ordinal suffixes in addresses (eg have '126th' want '126') for comparison purposes. So far no luck.

I have found that

SELECT REGEXP_REPLACE(LOWER('300 North 126th Street'), '(?!/D)(st|nd|rd|th)', '', 'g');
regexp_replace
------------------
300 nor 126 reet

but

SELECT REGEXP_REPLACE(LOWER('300 North 126th Street'), '(?=/D)(st|nd|rd|th)', '', 'g');
regexp_replace
------------------------
300 north 126th street

I'm a novice with regular expressions and google hasn't helped much.

Any suggestions?

Thanks,
George

#2Paul Jungwirth
pj@illuminatedcomputing.com
In reply to: George Weaver (#1)
Re: Replacing Ordinal Suffixes

Try this:

SELECT REGEXP_REPLACE(LOWER('300 North 126th Street'),
'(\d)(st|nd|rd|th)', '\1', 'g');

Note that matching a number is \d not /D: backslash, not forward
slash, and lowercase d not uppercase. \d means a digit, \D means
anything except a digit.

Also, I don't think Postgres supports positive lookbehind expressions
(which are actually (?<=foo), not (?!foo)), but you can get the same
effect by capturing the number with (\d) and then outputting it again
with the \1.

Paul

On Fri, Feb 28, 2014 at 2:04 PM, George Weaver <gweaver@shaw.ca> wrote:

Hi list,

I'm stumped.

I am trying to use Regexp_Replace to replace ordinal suffixes in addresses
(eg have '126th' want '126') for comparison purposes. So far no luck.

I have found that

SELECT REGEXP_REPLACE(LOWER('300 North 126th Street'),
'(?!/D)(st|nd|rd|th)', '', 'g');
regexp_replace
------------------
300 nor 126 reet

but

SELECT REGEXP_REPLACE(LOWER('300 North 126th Street'),
'(?=/D)(st|nd|rd|th)', '', 'g');
regexp_replace
------------------------
300 north 126th street

I'm a novice with regular expressions and google hasn't helped much.

Any suggestions?

Thanks,
George

--
_________________________________
Pulchritudo splendor veritatis.

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

#3Steve Atkins
steve@blighty.com
In reply to: George Weaver (#1)
Re: Replacing Ordinal Suffixes

On Feb 28, 2014, at 2:04 PM, George Weaver <gweaver@shaw.ca> wrote:

Hi list,

I'm stumped.

I am trying to use Regexp_Replace to replace ordinal suffixes in addresses (eg have '126th' want '126') for comparison purposes. So far no luck.

I have found that

SELECT REGEXP_REPLACE(LOWER('300 North 126th Street'), '(?!/D)(st|nd|rd|th)', '', 'g');
regexp_replace
------------------
300 nor 126 reet

but

SELECT REGEXP_REPLACE(LOWER('300 North 126th Street'), '(?=/D)(st|nd|rd|th)', '', 'g');
regexp_replace
------------------------
300 north 126th street

I'm a novice with regular expressions and google hasn't helped much.

Any suggestions?

Maybe this?

select regexp_replace('300 North 126th Street', '(\d+)(?:st|nd|rd|th)', '\1', 'gi');

Cheers,
Steve

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

#4Bret Stern
bret_stern@machinemanagement.com
In reply to: Steve Atkins (#3)
Re: Replacing Ordinal Suffixes

This is a kick *ss forum. I must say.

On Fri, 2014-02-28 at 14:17 -0800, Steve Atkins wrote:

On Feb 28, 2014, at 2:04 PM, George Weaver <gweaver@shaw.ca> wrote:

Hi list,

I'm stumped.

I am trying to use Regexp_Replace to replace ordinal suffixes in addresses (eg have '126th' want '126') for comparison purposes. So far no luck.

I have found that

SELECT REGEXP_REPLACE(LOWER('300 North 126th Street'), '(?!/D)(st|nd|rd|th)', '', 'g');
regexp_replace
------------------
300 nor 126 reet

but

SELECT REGEXP_REPLACE(LOWER('300 North 126th Street'), '(?=/D)(st|nd|rd|th)', '', 'g');
regexp_replace
------------------------
300 north 126th street

I'm a novice with regular expressions and google hasn't helped much.

Any suggestions?

Maybe this?

select regexp_replace('300 North 126th Street', '(\d+)(?:st|nd|rd|th)', '\1', 'gi');

Cheers,
Steve

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

#5George Weaver
gweaver@cleartagsoftware.com
In reply to: George Weaver (#1)
Re: Replacing Ordinal Suffixes

----- Original Message -----
From: Steve Atkins
To: pgsql-general
Sent: Friday, February 28, 2014 4:17 PM
Subject: Re: [GENERAL] Replacing Ordinal Suffixes

On Feb 28, 2014, at 2:04 PM, George Weaver <gweaver@shaw.ca> wrote:

Hi list,

I'm stumped.

I am trying to use Regexp_Replace to replace ordinal suffixes in addresses
(eg have '126th' want '126') for comparison purposes. So far no luck.

I have found that

SELECT REGEXP_REPLACE(LOWER('300 North 126th Street'),
'(?!/D)(st|nd|rd|th)', '', 'g');
regexp_replace
------------------
300 nor 126 reet

but

SELECT REGEXP_REPLACE(LOWER('300 North 126th Street'),
'(?=/D)(st|nd|rd|th)', '', 'g');
regexp_replace
------------------------
300 north 126th street

I'm a novice with regular expressions and google hasn't helped much.

Any suggestions?

Maybe this?

select regexp_replace('300 North 126th Street', '(\d+)(?:st|nd|rd|th)',
'\1', 'gi');

Hi Steve,

Thanks, but no luck:

select regexp_replace('300 North 126th Street', E'(\d+)(?:st|nd|rd|th)',
E'\1', 'gi');
regexp_replace
------------------------
300 North 126th Street

George

Cheers,
Steve

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

#6George Weaver
gweaver@shaw.ca
In reply to: George Weaver (#1)
Re: Replacing Ordinal Suffixes

From: Paul Jungwirth

Try this:

SELECT REGEXP_REPLACE(LOWER('300 North 126th Street'),

'(\d)(st|nd|rd|th)', '\1', 'g');

Hi Paul,

No luck...

SELECT REGEXP_REPLACE(LOWER('300 North 126th Street'), E'(\d)(st|nd|rd|th)',
E'\1', 'g');
regexp_replace
------------------------
300 north 126th street
(1 row)

Note that matching a number is \d not /D: backslash, not forward
slash, and lowercase d not uppercase. \d means a digit, \D means
anything except a digit.

Also, I don't think Postgres supports positive lookbehind expressions
(which are actually (?<=foo), not (?!foo)), but you can get the same
effect by capturing the number with (\d) and then outputting it again
with the \1.

Paul

On Fri, Feb 28, 2014 at 2:04 PM, George Weaver <gweaver@shaw.ca> wrote:

Hi list,

I'm stumped.

I am trying to use Regexp_Replace to replace ordinal suffixes in addresses
(eg have '126th' want '126') for comparison purposes. So far no luck.

I have found that

SELECT REGEXP_REPLACE(LOWER('300 North 126th Street'),
'(?!/D)(st|nd|rd|th)', '', 'g');
regexp_replace
------------------
300 nor 126 reet

but

SELECT REGEXP_REPLACE(LOWER('300 North 126th Street'),
'(?=/D)(st|nd|rd|th)', '', 'g');
regexp_replace
------------------------
300 north 126th street

I'm a novice with regular expressions and google hasn't helped much.

Any suggestions?

Thanks,
George

--
_________________________________
Pulchritudo splendor veritatis.

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

#7George Weaver
gweaver@shaw.ca
In reply to: George Weaver (#6)
Re: Replacing Ordinal Suffixes

From: Steve Atkins

On Feb 28, 2014, at 2:04 PM, George Weaver <gweaver@shaw.ca> wrote:

Hi list,

I'm stumped.

I am trying to use Regexp_Replace to replace ordinal suffixes in addresses
(eg have '126th' want '126') for comparison purposes. So far no luck.

I have found that

SELECT REGEXP_REPLACE(LOWER('300 North 126th Street'),
'(?!/D)(st|nd|rd|th)', '', 'g');
regexp_replace
------------------
300 nor 126 reet

but

SELECT REGEXP_REPLACE(LOWER('300 North 126th Street'),
'(?=/D)(st|nd|rd|th)', '', 'g');
regexp_replace
------------------------
300 north 126th street

I'm a novice with regular expressions and google hasn't helped much.

Any suggestions?

Maybe this?

select regexp_replace('300 North 126th Street', '(\d+)(?:st|nd|rd|th)',
'\1', 'gi');

Hi Steve,

Thanks, but no luck:

select regexp_replace('300 North 126th Street', E'(\d+)(?:st|nd|rd|th)',
E'\1', 'gi');
regexp_replace
------------------------
300 North 126th Street

George

Cheers,
Steve

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

#8Steve Atkins
steve@blighty.com
In reply to: George Weaver (#7)
Re: Replacing Ordinal Suffixes

On Feb 28, 2014, at 2:43 PM, George Weaver <gweaver@shaw.ca> wrote:

From: Steve Atkins

Maybe this?

select regexp_replace('300 North 126th Street', '(\d+)(?:st|nd|rd|th)',
'\1', 'gi');

Hi Steve,

Thanks, but no luck:

select regexp_replace('300 North 126th Street', E'(\d+)(?:st|nd|rd|th)',
E'\1', 'gi');
regexp_replace
------------------------
300 North 126th Street

George

Those E’s you added completely change the meaning. If you want to
use E-style literals (and you probably don’t) you’ll need to double the
backslashes in all the strings.

Cheers,
Steve

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

#9George Weaver
gweaver@shaw.ca
In reply to: George Weaver (#7)
Re: Replacing Ordinal Suffixes

----- Original Message -----
From: Steve Atkins

On Feb 28, 2014, at 2:43 PM, George Weaver <gweaver@shaw.ca> wrote:

Maybe this?

select regexp_replace('300 North 126th Street', '(\d+)(?:st|nd|rd|th)',
'\1', 'gi');

Hi Steve,

Thanks, but no luck:

select regexp_replace('300 North 126th Street', E'(\d+)(?:st|nd|rd|th)',
E'\1', 'gi');
regexp_replace
------------------------
300 North 126th Street

George

Those E�s you added completely change the meaning. If you want to
use E-style literals (and you probably don�t) you�ll need to double the
backslashes in all the strings.

Hi Steve,

Without the E's:

development=# select regexp_replace('300 North 126th Street',
'(\d+)(?:st|nd|rd|th)', '\1', 'gi');
WARNING: nonstandard use of escape in a string literal
LINE 1: select regexp_replace('300 North 126th Street', '(\d+)(?:st|...
^
HINT: Use the escape string syntax for escapes, e.g., E'\r\n'.
WARNING: nonstandard use of escape in a string literal
LINE 1: ...'300 North 126th Street', '(\d+)(?:st|nd|rd|th)', '\1', 'gi'...
^
HINT: Use the escape string syntax for escapes, e.g., E'\r\n'.
regexp_replace
------------------------
300 North 126th Street
(1 row)

Frustrating...

Cheers,
Steve

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

#10Ian Lawrence Barwick
barwick@gmail.com
In reply to: George Weaver (#9)
Re: Replacing Ordinal Suffixes

2014-03-01 8:16 GMT+09:00 George Weaver <gweaver@shaw.ca>:

----- Original Message ----- From: Steve Atkins

On Feb 28, 2014, at 2:43 PM, George Weaver <gweaver@shaw.ca> wrote:

Maybe this?

select regexp_replace('300 North 126th Street', '(\d+)(?:st|nd|rd|th)',
'\1', 'gi');

Hi Steve,

Thanks, but no luck:

select regexp_replace('300 North 126th Street', E'(\d+)(?:st|nd|rd|th)',
E'\1', 'gi');
regexp_replace
------------------------
300 North 126th Street

George

Those E’s you added completely change the meaning. If you want to
use E-style literals (and you probably don’t) you’ll need to double the
backslashes in all the strings.

Hi Steve,

Without the E's:

development=# select regexp_replace('300 North 126th Street',
'(\d+)(?:st|nd|rd|th)', '\1', 'gi');
WARNING: nonstandard use of escape in a string literal
LINE 1: select regexp_replace('300 North 126th Street', '(\d+)(?:st|...
^
HINT: Use the escape string syntax for escapes, e.g., E'\r\n'.
WARNING: nonstandard use of escape in a string literal
LINE 1: ...'300 North 126th Street', '(\d+)(?:st|nd|rd|th)', '\1', 'gi'...
^
HINT: Use the escape string syntax for escapes, e.g., E'\r\n'.

regexp_replace
------------------------
300 North 126th Street
(1 row)

Frustrating...

Per Steve Atkin's note about double backslashes:

postgres=> select regexp_replace('300 North 126th Street',
E'(\\d+)(?:st|nd|rd|th)', E'\\1', 'gi');
regexp_replace
----------------------
300 North 126 Street
(1 row)

Regards

Ian Barwick

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

#11James Cloos
cloos@jhcloos.com
In reply to: George Weaver (#6)
Re: Replacing Ordinal Suffixes

"GW" == George Weaver <gweaver@shaw.ca> writes:

GW> SELECT REGEXP_REPLACE(LOWER('300 North 126th Street'),
GW> E'(\d)(st|nd|rd|th)', E'\1', 'g');
GW> regexp_replace
GW> ------------------------
GW> 300 north 126th street
GW> (1 row)

The E'' syntax eats your backslashes. For that version, try just:

SELECT REGEXP_REPLACE(LOWER('300 North 126th Street'),'(\d)(st|nd|rd|th)', '\1', 'g');

-JimC
--
James Cloos <cloos@jhcloos.com> OpenPGP: 1024D/ED7DAEA6

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

#12George Weaver
gweaver@shaw.ca
In reply to: George Weaver (#7)
Re: Replacing Ordinal Suffixes

From: Ian Lawrence Barwick

----- Original Message ----- From: Steve Atkins

On Feb 28, 2014, at 2:43 PM, George Weaver <gweaver@shaw.ca> wrote:

Maybe this?

select regexp_replace('300 North 126th Street', '(\d+)(?:st|nd|rd|th)',
'\1', 'gi');

Hi Steve,

Thanks, but no luck:

select regexp_replace('300 North 126th Street', E'(\d+)(?:st|nd|rd|th)',
E'\1', 'gi');
regexp_replace
------------------------
300 North 126th Street

George

Those E’s you added completely change the meaning. If you want to
use E-style literals (and you probably don’t) you’ll need to double the
backslashes in all the strings.

Hi Steve,

Without the E's:

development=# select regexp_replace('300 North 126th Street',
'(\d+)(?:st|nd|rd|th)', '\1', 'gi');
WARNING: nonstandard use of escape in a string literal
LINE 1: select regexp_replace('300 North 126th Street', '(\d+)(?:st|...
^
HINT: Use the escape string syntax for escapes, e.g., E'\r\n'.
WARNING: nonstandard use of escape in a string literal
LINE 1: ...'300 North 126th Street', '(\d+)(?:st|nd|rd|th)', '\1', 'gi'...
^
HINT: Use the escape string syntax for escapes, e.g., E'\r\n'.

regexp_replace
------------------------
300 North 126th Street
(1 row)

Frustrating...

Per Steve Atkin's note about double backslashes:

postgres=> select regexp_replace('300 North 126th Street',
E'(\\d+)(?:st|nd|rd|th)', E'\\1', 'gi');
regexp_replace
----------------------
300 North 126 Street
(1 row)

Hi Ian,

I just got that as well - awesome!

Regards

Ian Barwick

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

#13Alvaro Herrera
alvherre@2ndquadrant.com
In reply to: George Weaver (#12)
Re: Replacing Ordinal Suffixes

George Weaver wrote:

Hi Ian,

I just got that as well - awesome!

http://xkcd.com/1313/

--
�lvaro Herrera http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

#14George Weaver
gweaver@shaw.ca
In reply to: George Weaver (#1)
Re: Replacing Ordinal Suffixes

Hi All,

From: James Cloos

The E'' syntax eats your backslashes. For that version, try just:

SELECT REGEXP_REPLACE(LOWER('300 North 126th Street'),'(\d)(st|nd|rd|th)',
'\1', 'g');

Actually, I found that the double backslashes are required whether the E is
used or not:

development=# select regexp_replace('300 North 126th Street',
'(\d+)(?:st|nd|rd|th)', '\1', 'gi');
WARNING: nonstandard use of escape in a string literal
LINE 1: select regexp_replace('300 North 126th Street', '(\d+)(?:st|...
^
HINT: Use the escape string syntax for escapes, e.g., E'\r\n'.
WARNING: nonstandard use of escape in a string literal
LINE 1: ...'300 North 126th Street', '(\d+)(?:st|nd|rd|th)', '\1', 'gi'...
^
HINT: Use the escape string syntax for escapes, e.g., E'\r\n'.
regexp_replace
------------------------
300 North 126th Street
(1 row)

development=# select regexp_replace('300 North 126th Street',
'(\\d+)(?:st|nd|rd|th)', '\\1', 'gi');
WARNING: nonstandard use of \\ in a string literal
LINE 1: select regexp_replace('300 North 126th Street', '(\\d+)(?:st...
^
HINT: Use the escape string syntax for backslashes, e.g., E'\\'.
WARNING: nonstandard use of \\ in a string literal
LINE 1: ...300 North 126th Street', '(\\d+)(?:st|nd|rd|th)', '\\1', 'gi...
^
HINT: Use the escape string syntax for backslashes, e.g., E'\\'.
regexp_replace
----------------------
300 North 126 Street
(1 row)

development=# select regexp_replace('300 North 126th Street',
E'(\\d+)(?:st|nd|rd|th)', E'\\1', 'gi');
regexp_replace
----------------------
300 North 126 Street

I appreciate the help of everyone who responded!

Cheers,
George

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

#15Tom Lane
tgl@sss.pgh.pa.us
In reply to: George Weaver (#14)
Re: Replacing Ordinal Suffixes

"George Weaver" <gweaver@shaw.ca> writes:

Actually, I found that the double backslashes are required whether the E is
used or not:

You must be using a relatively old PG version then. Default behavior
since around 9.1 has been that backslashes aren't special except
in E'' strings.

regards, tom lane

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

#16George Weaver
gweaver@shaw.ca
In reply to: George Weaver (#1)
Re: Replacing Ordinal Suffixes

Actually, I found that the double backslashes are required whether the E
is
used or not:

You must be using a relatively old PG version then. Default behavior
since around 9.1 has been that backslashes aren't special except
in E'' strings.

Hmm.

development=# select version();
version
------------------------------------------------------------
PostgreSQL 9.1.9, compiled by Visual C++ build 1500, 32-bit
(1 row)

regards, tom lane

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

#17Steve Atkins
steve@blighty.com
In reply to: George Weaver (#16)
Re: Replacing Ordinal Suffixes

On Mar 1, 2014, at 11:45 AM, George Weaver <gweaver@shaw.ca> wrote:

Actually, I found that the double backslashes are required whether the E is
used or not:

You must be using a relatively old PG version then. Default behavior
since around 9.1 has been that backslashes aren't special except
in E'' strings.

Hmm.

development=# select version();
version
------------------------------------------------------------
PostgreSQL 9.1.9, compiled by Visual C++ build 1500, 32-bit
(1 row)

steve=# select version();
version
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------
PostgreSQL 9.1.4 on x86_64-apple-darwin12.0.0, compiled by i686-apple-darwin11-llvm-gcc-4.2 (GCC) 4.2.1 (Based on Apple Inc. build 5658) (LLVM build 2336.11.00), 64-bit
(1 row)

steve=# select regexp_replace('300 North 126th Street', '(\d+)(?:st|nd|rd|th)', '\1', 'gi');
regexp_replace
----------------------
300 North 126 Street
(1 row)

I suspect you have standard_conforming_strings set to off (it defaults to on in 9.1), possibly for backwards compatibility to support an app you’re using that’s not been updated, possibly accidentally.

Cheers,
Steve

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

#18George Weaver
gweaver@shaw.ca
In reply to: George Weaver (#1)
Re: Replacing Ordinal Suffixes

----- Original Message -----
From: Steve Atkins

<snip>

I suspect you have standard_conforming_strings set to off (it defaults to
on in 9.1), possibly for backwards compatibility to support an app you�re
using that�s not been updated, possibly accidentally.

You're right - it was off (now On). Though for the life of me I can't say
why...

On the production installations I support its On.

Thanks!

Cheers,
Steve

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general