psql/pg_dump vs. dollar signs in identifiers
An example being discussed on the jdbc list led me to try this:
regression=# create table a$b$c (f1 int);
CREATE TABLE
regression=# \d a$b$c
Did not find any relation named "a$b$c".
It works if you use quotes:
regression=# \d "a$b$c"
Table "public.a$b$c"
Column | Type | Modifiers
--------+---------+-----------
f1 | integer |
The reason it doesn't work without quotes is that processSQLNamePattern()
thinks this:
* Inside double quotes, or at all times if force_escape is true,
* quote regexp special characters with a backslash to avoid
* regexp errors. Outside quotes, however, let them pass through
* as-is; this lets knowledgeable users build regexp expressions
* that are more powerful than shell-style patterns.
and of course $ is a regexp special character, so it bollixes up the
match.
Now, because we surround the pattern with ^...$ anyway, I can't offhand
see a use-case for putting $ with its regexp meaning into the pattern.
And since we do allow $ as a non-first character of identifiers, there
is a use-case for expecting it to be treated like an ordinary character.
So I'm thinking that $ ought to be quoted whether it's inside double
quotes or not. This change would affect psql's describe commands as
well as pg_dump -t and -n patterns.
Comments?
regards, tom lane
"Tom Lane" <tgl@sss.pgh.pa.us> writes:
Now, because we surround the pattern with ^...$ anyway, I can't offhand
see a use-case for putting $ with its regexp meaning into the pattern.
It's possible to still usefully use $ in the regexp, but its existence at the
end means there should always be a way to write the regexp without needing
another one inside.
Incidentally, are these really regexps? I always thought they were globs.
And experiments seem to back up my memory:
postgres=# \d foo*
Table "public.foo^bar"
Column | Type | Modifiers
--------+---------+-----------
i | integer |
postgres=# \d foo.*
Did not find any relation named "foo.*".
Comments?
The first half of the logic applies to ^ as well. There's no use case for
regexps using ^ inside. You would have to use quotes to create the table but
we could have \d foo^* work:
postgres=# \d foo^*
Did not find any relation named "foo^*".
--
Gregory Stark
EnterpriseDB http://www.enterprisedb.com
Gregory Stark <stark@enterprisedb.com> writes:
Incidentally, are these really regexps? I always thought they were globs.
They're regexps under the hood, but we treat . as a schema separator
and translate * to .*, which makes it look like mostly a glob scheme.
But you can make use of brackets, |, +, ...
regards, tom lane
On Mon, Jul 09, 2007 at 07:04:27PM +0100, Gregory Stark wrote:
"Tom Lane" <tgl@sss.pgh.pa.us> writes:
Now, because we surround the pattern with ^...$ anyway, I can't offhand
see a use-case for putting $ with its regexp meaning into the pattern.
It's possible to still usefully use $ in the regexp, but its existence at the
end means there should always be a way to write the regexp without needing
another one inside.
Unless you're doing multi-line regex, what's the point of a $ anywhere
but the end of the expression? Am I missing something? Likewise with ^.
I'm inclined to escape $ as Tom suggested.
--
Jim Nasby decibel@decibel.org
EnterpriseDB http://enterprisedb.com 512.569.9461 (cell)
"Jim C. Nasby" <decibel@decibel.org> writes:
Unless you're doing multi-line regex, what's the point of a $ anywhere
but the end of the expression? Am I missing something? Likewise with ^.
Leaving out the backslashes, you can do things like (foo$|baz|qux)(baz|qux|)
to say that all 9 combinations of those two tokens are valid except that foo
must be followed by the empty second half.
But it can always be refactored into something more normal like
(foo|((baz|qux)(baz|qux)?))
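A quick way to check that the two forms accept the same names is to run both over the nine combinations. This is only a sketch using Python's re module as a stand-in (the backend's regexp engine may differ in details), and fullmatch plays the role of the surrounding ^...$; the variable names are just for illustration.

```python
import re
from itertools import product

# The two patterns from the text, backslashes left out as in the original.
with_dollar = re.compile(r"(foo$|baz|qux)(baz|qux|)")
refactored = re.compile(r"(foo|((baz|qux)(baz|qux)?))")

# All nine combinations of a first token and a (possibly empty) second token.
firsts = ["foo", "baz", "qux"]
seconds = ["baz", "qux", ""]
candidates = [a + b for a, b in product(firsts, seconds)]


def accepted(regex):
    """Return the candidates that the regex matches in full."""
    return {s for s in candidates if regex.fullmatch(s)}


print(sorted(accepted(with_dollar)))
print(accepted(with_dollar) == accepted(refactored))
```

Both patterns reject foobaz and fooqux and accept the other seven strings, so the embedded $ is not buying anything that the refactored form cannot express.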
I'm inclined to escape $ as Tom suggested.
Yeah, I have a tendency to look for the most obscure counter-example if only
to be sure I really understand precisely how obscure it is. I do agree that
it's not a realistic concern. Especially since I never even realized we
handled regexps here at all :)
IIRC some regexp engines don't treat $ specially at all except at the end of
the regexp. Tom's just suggesting doing the same thing here where
complicated regexps are even *less* likely and dollars as literals more.
--
Gregory Stark
EnterpriseDB http://www.enterprisedb.com