BUG #11523: Regular expressions work differently on different platforms

Started by Daniel Migowski, over 11 years ago, 2 messages, bugs
#1 Daniel Migowski
dmigowski@ikoffice.de

The following bug has been logged on the website:

Bug reference: 11523
Logged by: Daniel Migowski
Email address: dmigowski@ikoffice.de
PostgreSQL version: 9.1.2
Operating system: Debian Linux 6.0.6 + Windows 7
Description:

I recently found that regular expressions, or specifically the [:space:]
shorthand escape, work differently on Windows and Linux. On Linux the
non-breakable space is not included in the shorthand escape, on Windows it
is. The following statement is therefore true on Windows and false on
Linux:

select convert_from(E'\\xA0'::bytea,'ISO8859-1') ~ '\s'

This breaks email validation here, and also the restore of a backup created
on Linux onto my Windows machine. Is it possible to fix that? Is there a
reason that UTF-8 on Linux differs from UTF-8 on Windows?

--
Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-bugs

#2 Tom Lane
tgl@sss.pgh.pa.us
In reply to: Daniel Migowski (#1)
Re: BUG #11523: Regular expressions work differently on different platforms

dmigowski@ikoffice.de writes:

> I recently found that regular expressions, or specifically the [:space:]
> shorthand escape, work differently on Windows and Linux. On Linux the
> non-breakable space is not included in the shorthand escape, on Windows it
> is.

That would depend on what locale you're using for LC_CTYPE. We can't do
much about the fact that locale definitions vary across platforms. In
principle you could use C locale, which *is* standardized, but that cure
may be worse than the disease for your purposes.
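
As a rough sketch of how one might check this, the queries below show which
LC_CTYPE the current database uses and then repeat the original test with
the pattern forced to the C collation. The COLLATE clause is an assumption
that the server is 9.1 or later, where pattern matching operators honor an
explicit collation; the column alias is only illustrative. The result is
deliberately not shown, since it should be checked on each platform.

```sql
-- Character-classification locale seen by the current session.
SHOW lc_ctype;

-- The same setting as recorded for the current database.
SELECT datname, datctype
FROM pg_database
WHERE datname = current_database();

-- Original test, but with the pattern explicitly under the C collation.
-- Assumes standard_conforming_strings = on, so '\s' reaches the regex
-- engine with its backslash intact.
SELECT convert_from(E'\\xA0'::bytea, 'ISO8859-1') ~ ('\s' COLLATE "C")
       AS nbsp_matches_space_in_c;
```

Comparing the output of the first two queries on the Linux and Windows
servers should show whether the two installations were initialized with
different locales.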

You could always spell it out with whatever set of characters you consider
whitespace: [ \t\r\n] or something like that. For purposes like email
address validation, the set of whitespace characters allowed by the
relevant RFCs is probably smaller than most locales' [:space:] anyway.
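
A minimal sketch of that approach, reusing the statement from the report
with an explicit bracket expression in place of \s. The set [ \t\r\n] is
only the example given above, not a list taken from any RFC, and the column
aliases are illustrative. It assumes standard_conforming_strings = on, so
the backslashes in the pattern are passed through to the regex engine.

```sql
-- Non-breaking space (0xA0 in ISO 8859-1) is not in the explicit set,
-- so this no longer depends on the locale's definition of [:space:].
SELECT convert_from(E'\\xA0'::bytea, 'ISO8859-1') ~ '[ \t\r\n]'
       AS nbsp_in_explicit_set;      -- false

-- A real tab character (written with the E'' escape syntax) is in the set.
SELECT E'a\tb' ~ '[ \t\r\n]'
       AS tab_in_explicit_set;       -- true
```

Because the pattern names its characters explicitly, it should give the
same answer on both platforms regardless of LC_CTYPE.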

regards, tom lane
