order of (escaped) characters in regex range
Dear List,
I found this interesting:
SELECT regexp_matches('123-A' , E'(3[A-Z\- ])');
ERROR: invalid regular expression: invalid character range
whereas:
SELECT regexp_matches('123-A' , E'(3[\- A-Z])');
regexp_matches
----------------
{3-}
(1 row)
Notice the order of (escaped) characters and ranges in the last bit of the
expression.
Am I missing some key concept of regular expressions?
Regards,
Rob
On 13 December 2011 14:04, InterRob <rob.marjot@gmail.com> wrote:
> SELECT regexp_matches('123-A' , E'(3[A-Z\- ])');
> ERROR: invalid regular expression: invalid character range
Hi Rob,
try '\\-' instead of '\-'
and it works :)
regards
Szymon
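One way to see why the doubled backslash matters is to select the string literals on their own, which shows the pattern text the regex engine receives once the E'' literal has been parsed. This is only a sketch; the column aliases are made up for illustration.

```sql
-- In an E'' literal, a backslash in front of a character that is not a
-- recognized escape is dropped, so \- reaches the regex engine as a bare -.
SELECT E'(3[A-Z\- ])'  AS failing_pattern,    -- (3[A-Z- ])
       E'(3[\- A-Z])'  AS working_pattern,    -- (3[- A-Z])
       E'(3[A-Z\\- ])' AS doubled_backslash;  -- (3[A-Z\- ])
```

In the first pattern the bare - directly follows the range A-Z, which the engine rejects. In the second it is the first character of the bracket expression and is taken literally. Only the third actually hands a backslash to the regex engine.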
True, but still weird...
And are you sure it does the same thing?
On Dec 13, 2011, at 8:09, Szymon Guz <mabewlun@gmail.com> wrote:
> try '\\-' instead of '\-'
> and it works :)
If you don't intend to use PostgreSQL escapes in your string, then omit the leading 'E'.
In a character class the - symbol has special meaning if it appears anywhere other than as the first (or last) character of the group. To avoid that special meaning you have to escape it. If it appears first it always means a literal -. The PostgreSQL documentation does not fully describe regular expressions, but a reference book on them would note this particular behavior.
David J.
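The rules described above can be tried with plain '...' literals. This is a sketch that assumes standard_conforming_strings is on (the default since PostgreSQL 9.1), so that backslashes in a plain literal reach the regex engine unchanged.

```sql
-- Hyphen first in the bracket expression: taken literally.
SELECT regexp_matches('123-A', '(3[- A-Z])');   -- {3-}

-- Hyphen escaped for the regex engine; no E prefix, so the backslash survives.
SELECT regexp_matches('123-A', '(3[A-Z\- ])');  -- {3-}

-- Unescaped hyphen right after a range: reproduces the error from the first message.
SELECT regexp_matches('123-A', '(3[A-Z- ])');
-- ERROR:  invalid regular expression: invalid character range
```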
Thanks guys, I see what you mean.
I do intend to use the PG escaping, in order to avoid that annoying
warning... Hence, my expression should indeed be:
SELECT regexp_matches('123-A' , E'(3[A-Z\\-\\(\\) ])');
In the above expression I added the parentheses as I wish to match these
as well :))
Thanks!
On Tue, Dec 13, 2011 at 02:51:15PM +0100, InterRob wrote:
> SELECT regexp_matches('123-A' , E'(3[A-Z\\-\\(\\) ])');
> In the above expression I added the parentheses as I wish to match these as well :))
Instead of putting in that much quoting, just do:
SELECT regexp_matches('123-A' , '(3[A-Z() -])');
( and ) don't need to be quoted inside a bracket expression, and if you move the - to the beginning
or end (I prefer the end) of the bracket expression, it doesn't need to be quoted either.
Best regards,
depesz
--
The best thing about modern society is how easy it is to avoid contact with it.
http://depesz.com/
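To check that the unquoted bracket expression suggested above covers every intended character, it can be run against a few inputs. The sample strings and the aliases t, s and m are hypothetical and only serve as an illustration.

```sql
-- Each sample has a different character right after the 3.
SELECT s, regexp_matches(s, '(3[A-Z() -])') AS m
FROM (VALUES ('123-A'), ('123(A'), ('123)A'), ('123A')) AS t(s);
-- 123-A -> {3-}
-- 123(A -> {3(}
-- 123)A -> {3)}
-- 123A  -> {3A}
```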
On Tue, Dec 13, 2011 at 7:51 AM, InterRob <rob.marjot@gmail.com> wrote:
> SELECT regexp_matches('123-A' , E'(3[A-Z\\-\\(\\) ])');
I advise dollar quoting when writing complicated regular expressions:
E'(3[A-Z\\-\\(\\) ])'
becomes
$$(3[A-Z\-\(\) ])$$
merlin
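A small sketch to confirm that the two spellings denote the same pattern text: compare the literals directly, then run the dollar-quoted one. The alias same_pattern is made up for illustration.

```sql
-- The E'' literal with doubled backslashes and the dollar-quoted literal
-- yield identical strings, so the comparison returns true.
SELECT E'(3[A-Z\\-\\(\\) ])' = $$(3[A-Z\-\(\) ])$$ AS same_pattern;  -- t

-- Inside $$...$$ nothing is interpreted, so the regex escapes are written once.
SELECT regexp_matches('123-A', $$(3[A-Z\-\(\) ])$$);  -- {3-}
```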
On Tuesday, December 13, 2011 11:39 AM, Merlin Moncure wrote:
> I advise dollar quoting when writing complicated regular expressions:
> E'(3[A-Z\\-\\(\\) ])'
> becomes
> $$(3[A-Z\-\(\) ])$$
Aside from backward compatibility, and the various warnings, is there any
reason to prefer dollar-quoting over a non-SQL-escaped string literal (i.e.,
'3[A-Z\-\(\) ]' ) ?
David J.
On Tue, Dec 13, 2011 at 10:53 AM, David Johnston <polobo@yahoo.com> wrote:
> Aside from backward compatibility, and the various warnings, is there any
> reason to prefer dollar-quoting over a non-SQL-escaped string literal (i.e.,
> '3[A-Z\-\(\) ]' ) ?
yeah -- because sooner or later you have to stick a single quote in
there (of course, you can double the ', but I personally think that's
awful).
merlin
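The single-quote point can be illustrated with a pattern that contains an apostrophe. The input string here is hypothetical; both statements should return {b's}.

```sql
-- Plain literal: every single quote inside the pattern has to be doubled.
SELECT regexp_matches('Rob''s 123-A', '(b''s)');

-- Dollar-quoted literal: the quote is written as-is.
SELECT regexp_matches('Rob''s 123-A', $$(b's)$$);
```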