Regexp confusion
Trying to match some numbers, and I'm having some regexp problems. I've
boiled it down to the following:
/* (1) */ select '3.14' similar to E'^\\d+\\.\\d+$'; -- true
/* (2) */ select '3.14' similar to E'^\\d+(\\.\\d+)$'; -- true
/* (3) */ select '3.14' similar to E'^\\d+(\\.\\d+)*$'; -- true
/* (4) */ select '3.14' similar to E'^\\d+(\\.\\d+)?$'; -- false
/* (5) */ select '3.14' similar to E'^\\d+(\\.\\d+)+$'; -- true
So, based on (1) and (2), the pattern '\.\d+' occurs once. So why does
(4) return false? Between (3), (4), and (5), it appears as though the
group is matching multiple times.
Thanks,
--
Doug Gorley | doug.gorley@gmail.com
I think the confusion is about what SIMILAR TO supports: the ? quantifier is not among those things.
See here:
http://www.postgresql.org/docs/8.4/static/functions-matching.html#FUNCTIONS-SIMILARTO-REGEXP
You probably want to use ~ instead of SIMILAR TO.
(SIMILAR TO is a weird beast that the SQL committee came up with,
vaguely based on regular expressions.)
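
A minimal sketch of that suggestion, assuming the goal is to accept an
integer with an optional fractional part. The extra test strings '3' and
'3.' are illustrative and not from the original post:

```sql
-- POSIX regular-expression match (~) instead of SIMILAR TO.
-- With ~, ^ and $ are anchors and ? means "zero or one".
select '3.14' ~ E'^\\d+(\\.\\d+)?$';  -- true
select '3'    ~ E'^\\d+(\\.\\d+)?$';  -- true: the group is optional
select '3.'   ~ E'^\\d+(\\.\\d+)?$';  -- false: a dot must be followed by digits
```

Unlike SIMILAR TO, ~ matches anywhere in the string unless anchored, so
the ^ and $ are needed here.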
--
Alvaro Herrera http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support
Alvaro Herrera <alvherre@commandprompt.com> writes:
I think the confusion is about what SIMILAR TO supports: the ? quantifier is not among those things.
Hmm ... actually I think *none* of those should have succeeded, because
^ and $ are not supposed to be metacharacters in SIMILAR TO. We are
failing to quote them, but apparently we need to --- it looks like the
regexp engine processes ^^ at the start of the pattern the same as ^,
and likewise for $$ at the end.
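
One way to look at this, as a sketch only: similar_escape is the function
that converts a SIMILAR TO pattern into a POSIX one, and it should be
callable from SQL. Its exact output depends on the server version, so
none is shown here. The second query is an assumed example of the kind of
doubled-anchor pattern described above, not something taken from the
server's actual translation.

```sql
-- Show the POSIX regex that a SIMILAR TO pattern is translated into.
-- The second argument is the escape character (backslash here).
select similar_escape(E'^\\d+\\.\\d+$', E'\\');

-- Doubled anchors passed straight to the regexp engine; per the
-- observation above they are expected to act like a single ^ and $.
select '3.14' ~ E'^^\\d+\\.\\d+$$';
```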
regards, tom lane
Alvaro Herrera <alvherre@commandprompt.com> writes:
I think the confusion is about what SIMILAR TO supports: the ? quantifier is not among those things.
Actually, upon looking into SQL:2008, it seems it's supposed to support
? now, and also {m,n} style bounds. Those weren't there in SQL99 ...
I've changed the similar_escape code to not escape ? and {, so that
those things will work now, and to escape ^ and $ instead.
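
As a sketch of what that change should allow, assuming a server that
includes it: the patterns below drop ^ and $ (SIMILAR TO always has to
match the whole string) and use ? and a {m,n} bound. These queries are
illustrative and not taken from the thread.

```sql
-- SIMILAR TO with the SQL:2008 quantifiers; no anchors are needed
-- because the pattern must match the entire string.
select '3.14' similar to E'\\d+(\\.\\d+)?';      -- optional fractional part
select '3'    similar to E'\\d+(\\.\\d+)?';      -- no fractional part
select '3.14' similar to E'\\d+(\\.\\d+){0,1}';  -- same idea with a bound
```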
regards, tom lane