unc paths, like and backslashes on 8.4
I just don't get how we are supposed to use LIKE with backslashes in
strings in 8.4. This is particularly vexing, because I have a field
containing UNC paths that I need to search on (and eventually update). I
have been looking at this page for guidance:
http://www.postgresql.org/docs/8.4/static/functions-matching.html
So I will ask my questions first, then show you what I tried:
1) Why do I get a warning when doubling a backslash?
2) What is the meaning of "E" syntax (E'\\\\fs1\\bar')?
3) If I have backslashes in my table, how can I get them back out?
4) I'd like to run an update to change the value '\\fs1\bar' to
\\fs1\foo\bar'. What incantation would do that.
So, trying to figure it out on my own...
CREATE TABLE FOOBAR
( UNC_PATH VARCHAR(100)
);
/* first insert attempt */
INSERT INTO FOOBAR VALUES ('\\FS1\BAR');
returns a warning:
WARNING: nonstandard use of \\ in a string literal
LINE 1: INSERT INTO FOOBAR VALUES ('\\FS1\BAR');
^
HINT: Use the escape string syntax for backslashes, e.g., E'\\'.
Query returned successfully: 1 row affected, 21 ms execution time.
but the row is inserted. There is one leading backslash, and the "b" is
some unprintable character. Let's try the "E" syntax, whatever that is:
INSERT INTO FOOBAR VALUES (E'\\FS1\BAR');
No warning, but exactly the same results again (one leading backslash, "b"
replaced by unprintable char). Let's try E with doubled backslashes:
INSERT INTO FOOBAR VALUES (E'\\\\FS1\\BAR');
okay, that worked. Yay. Now let's see if I can get the record back out
with "LIKE":
SELECT * FROM FOOBAR WHERE UNC_PATH LIKE E'\\\\FS1%';
That gets back a record, but the value returned is "\FS1BAR". I'm missing
two backslashes. I'm too confused to even attempt the update.
-Chris
On Sep 26, 2012, at 20:42, Chris Curvey <chris@chriscurvey.com> wrote:
I just don't get how we are supposed to use LIKE with backslashes in strings in 8.4. This is particularly vexing, because I have a field containing UNC paths that I need to search on (and eventually update). I have been looking at this page for guidance: http://www.postgresql.org/docs/8.4/static/functions-matching.html
So I will ask my questions first, then show you what I tried:
1) Why do I get a warning when doubling a backslash?
2) What is the meaning of "E" syntax (E'\\\\fs1\\bar')?
3) If I have backslashes in my table, how can I get them back out?
4) I'd like to run an update to change the value '\\fs1\bar' to \\fs1\foo\bar'. What incantation would do that.So, trying to figure it out on my own...
CREATE TABLE FOOBAR
( UNC_PATH VARCHAR(100)
);/* first insert attempt */
INSERT INTO FOOBAR VALUES ('\\FS1\BAR');returns a warning:
WARNING: nonstandard use of \\ in a string literal
LINE 1: INSERT INTO FOOBAR VALUES ('\\FS1\BAR');
^
HINT: Use the escape string syntax for backslashes, e.g., E'\\'.
Query returned successfully: 1 row affected, 21 ms execution time.but the row is inserted. There is one leading backslash, and the "b" is some unprintable character. Let's try the "E" syntax, whatever that is:
INSERT INTO FOOBAR VALUES (E'\\FS1\BAR');
No warning, but exactly the same results again (one leading backslash, "b" replaced by unprintable char). Let's try E with doubled backslashes:
INSERT INTO FOOBAR VALUES (E'\\\\FS1\\BAR');
okay, that worked. Yay. Now let's see if I can get the record back out with "LIKE":
SELECT * FROM FOOBAR WHERE UNC_PATH LIKE E'\\\\FS1%';
That gets back a record, but the value returned is "\FS1BAR". I'm missing two backslashes. I'm too confused to even attempt the update.
-Chris
First, please read the follow section of the docs, though especially 4.1.2
http://www.postgresql.org/docs/8.4/interactive/sql-syntax-lexical.html
Note the callout regarding standard conforming strings.
Since LIKE is an escapable pattern and you are using it in an escapable string literal the backslashes behave as such:
"\\\\" perform string literal escape -> "\\" perform like escape -> "\"
So on the first pass the four become two since each pair represents a single backslash post-literal-escape. Then the pair supplied to the LIKE becomes one post-like-escape.
Post back here if the reason and behavior of E'' is still unclear after reading the documentation.
David J.
Chris Curvey wrote:
1) Why do I get a warning when doubling a backslash?
2) What is the meaning of "E" syntax (E'\\\\fs1\\bar')?
3) If I have backslashes in my table, how can I get them back out?
4) I'd like to run an update to change the value '\\fs1\bar' to
\\fs1\foo\bar'. What incantation
would do that.
So, trying to figure it out on my own...
CREATE TABLE FOOBAR
( UNC_PATH VARCHAR(100)
);/* first insert attempt */
INSERT INTO FOOBAR VALUES ('\\FS1\BAR');returns a warning:
WARNING: nonstandard use of \\ in a string literal
LINE 1: INSERT INTO FOOBAR VALUES ('\\FS1\BAR');
^
HINT: Use the escape string syntax for backslashes, e.g., E'\\'.
Query returned successfully: 1 row affected, 21 ms execution time.but the row is inserted. There is one leading backslash, and the "b"
is some unprintable character.
You have standard_conforming_strings set to "off" and
escape_string_warning set to "on".
So backslash sequences (backslash + something) are not taken literally,
but interpreted as escape sequences.
Let's try the "E" syntax, whatever that is:
INSERT INTO FOOBAR VALUES (E'\\FS1\BAR');
No warning, but exactly the same results again (one leading backslash,
"b" replaced by unprintable
char). Let's try E with doubled backslashes:
You are insinuating that the "E" syntax is not well-documented.
Have you tried to read up on it?
What happens here is that you got rid of the warning because
you explicitly said "I'm going to use escape sequences".
INSERT INTO FOOBAR VALUES (E'\\\\FS1\\BAR');
okay, that worked. Yay. Now let's see if I can get the record back
out with "LIKE":
SELECT * FROM FOOBAR WHERE UNC_PATH LIKE E'\\\\FS1%';
That gets back a record, but the value returned is "\FS1BAR". I'm
missing two backslashes. I'm too
confused to even attempt the update.
In LIKE expressions, backslash acts as an escape character,
so you have double escaping: once from the LIKE pattern, and
again from standard_conforming_strings.
You'd have to write:
SELECT * FROM FOOBAR WHERE UNC_PATH LIKE E'\\\\\\\\FS1%';
or say that backslash is not an escape character:
SELECT * FROM FOOBAR WHERE UNC_PATH LIKE E'\\\\FS1%' ESCAPE '';
Yours,
Laurenz Albe
Many thanks to David and Albe for their kind assistance. I've looked at
the docs and run some experiments, and this is what I found. I'm going to
answer my own questions in a slightly different order, because it seems to
make the explanation more logical. Note that the rules appear to be
different for LIKE clauses. (more on that later)
2) What is the meaning of "E" syntax (E'\\\\fs1\\bar')?
the "E" syntax allows you to enter "special" characters into a field (\n =
newline, \t = tab, \b = backspace, etc). A double-backslash is interpreted
as a backslash. Any character besides backslash and
b,f,n,r,t,x,0,1,2,3,4,5,6,7,8,9 is taken literally and the backslash is
ignored. (See table 4-1 in the docs for details on what each character
means).
Example: E'eat\tat\njoes' is interpreted as "eat<tab>at<newline>joes"
1) Why do I get a warning when doubling a backslash?
because STANDARD_CONFORMING_STRINGS is off and ESCAPE_STRING_WARNING is on
(thanks Albe). Turning on STANDARD_CONFORMING_STRINGS makes a literal
backslash "just another character" (unless using E-syntax). Turning off
STANDARD_CONFORMING_STRINGS is the equivalent of using E-syntax for each
literal string. (Are those statements true?)
ESCAPE_STRING_WARNING is there to notify you if you are writing code that
may behave differently in the future. The warning is just a warning, the
statement will go through.
Example:
'eat\tat\njoes' is interpreted as "eat<tab>at<newline>joes"
STANDARD_CONFORMING_STRINGS is off
'eat\tat\njoes' is interpreted as
"eat[backslash][tee]at[backslash][en]joes" STANDARD_CONFORMING_STRINGS is on
By changing to the 'E' syntax (E'eat\tat\njoes'), you are specifically
saying that you want the <tab> and <newline> characters, regardless of how
STANDARD_CONFORMING_STRINGS is set. (I guess there is a logical
implication here that if STANDARD_CONFORMING_STRINGS is off, then the only
way to get a tab into a field would be to insert a ctrl-I somehow.)
added bonus information that will be obvious to PG masters, but I had to
find it: you can find the server setting with "SHOW
STANDARD_CONFORMING_STRINGS" and you can set it for your session with "SET
STANDARD_CONFORMING_STRINGS=ON".
3) If I have backslashes in my table, how can I get them back out?
I'm not sure how I got myself into that situation.
4) I'd like to run an update to change the value '\\fs1\bar' to
\\fs1\foo\bar'. What incantation would do that.
SET STANDARD_CONFORMING_STRINGS=ON;
UPDATE FOOBAR
SET UNC_PATH = REPLACE('\bar','\foo\bar')
WHERE UNC_PATH LIKE E'\\\\fs1\bar%' ESCAPE '';
====================================
I've tested the above rules for SELECT, INSERT, and UPDATE, and the rules
seem to hold. They also hold for WHERE clauses when searching for
equality. But they don't seem to hold for LIKE.
SELECT * FROM FOOBAR WHERE UNC_PATH = '\\fs1\bar' -- works
SELECT * FROM FOOBAR WHERE UNC_PATH LIKE '\\fs1\bar' -- no workie, although
I would have expected it to.
SELECT * FROM FOOBAR WHERE UNC_PATH LIKE E'\\\\fs1\\bar' -- no workie,
although I would have expected it to
SELECT * FROM FOOBAR WHERE UNC_PATH LIKE E'\\\\fs1\\bar' ESCAPE '' --
works.
So I guess the rules for string interpretation of backslashes in LIKE are
just different.
A) The backslash always escapes another backslash, regardless of what the
ESCAPE clause is. In fact, it appears that you HAVE to specify some other
escape clause to get it to work
B) You can't use another character to escape a backslash.
SELECT * FROM FOOBAR WHERE UNC_PATH LIKE E'@\@\fs1%' ESCAPE '@' -- does not
work.
I think I got it now. LIKE-with-backslash is just different. This has
been a learning experience!
From: pgsql-general-owner@postgresql.org
[mailto:pgsql-general-owner@postgresql.org] On Behalf Of Chris Curvey
Sent: Thursday, September 27, 2012 1:44 PM
To: pgsql
Subject: Re: [GENERAL] unc paths, like and backslashes on 8.4
Many thanks to David and Albe for their kind assistance. I've looked at the
docs and run some experiments, and this is what I found. I'm going to
answer my own questions in a slightly different order, because it seems to
make the explanation more logical. Note that the rules appear to be
different for LIKE clauses. (more on that later)
2) What is the meaning of "E" syntax (E'\\\\fs1\\bar')?
the "E" syntax allows you to enter "special" characters into a field (\n =
newline, \t = tab, \b = backspace, etc). A double-backslash is interpreted
as a backslash. Any character besides backslash and
b,f,n,r,t,x,0,1,2,3,4,5,6,7,8,9 is taken literally and the backslash is
ignored. (See table 4-1 in the docs for details on what each character
means).
Example: E'eat\tat\njoes' is interpreted as "eat<tab>at<newline>joes"
1) Why do I get a warning when doubling a backslash?
because STANDARD_CONFORMING_STRINGS is off and ESCAPE_STRING_WARNING is on
(thanks Albe). Turning on STANDARD_CONFORMING_STRINGS makes a literal
backslash "just another character" (unless using E-syntax). Turning off
STANDARD_CONFORMING_STRINGS is the equivalent of using E-syntax for each
literal string. (Are those statements true?)
Yes, they are.
ESCAPE_STRING_WARNING is there to notify you if you are writing code that
may behave differently in the future. The warning is just a warning, the
statement will go through.
Example:
'eat\tat\njoes' is interpreted as "eat<tab>at<newline>joes"
STANDARD_CONFORMING_STRINGS is off
'eat\tat\njoes' is interpreted as "eat[backslash][tee]at[backslash][en]joes"
STANDARD_CONFORMING_STRINGS is on
By changing to the 'E' syntax (E'eat\tat\njoes'), you are specifically
saying that you want the <tab> and <newline> characters, regardless of how
STANDARD_CONFORMING_STRINGS is set. (I guess there is a logical
implication here that if STANDARD_CONFORMING_STRINGS is off, then the only
way to get a tab into a field would be to insert a ctrl-I somehow.)
added bonus information that will be obvious to PG masters, but I had to
find it: you can find the server setting with "SHOW
STANDARD_CONFORMING_STRINGS" and you can set it for your session with "SET
STANDARD_CONFORMING_STRINGS=ON".
3) If I have backslashes in my table, how can I get them back out?
I'm not sure how I got myself into that situation.
4) I'd like to run an update to change the value '\\fs1\bar' to
\\fs1\foo\bar <file:///\\fs1\foo\bar> '. What incantation would do that.
SET STANDARD_CONFORMING_STRINGS=ON;
UPDATE FOOBAR
SET UNC_PATH = REPLACE('\bar','\foo\bar')
WHERE UNC_PATH LIKE E'\\\\fs1\bar%' ESCAPE '';
====================================
I've tested the above rules for SELECT, INSERT, and UPDATE, and the rules
seem to hold. They also hold for WHERE clauses when searching for equality.
But they don't seem to hold for LIKE.
SELECT * FROM FOOBAR WHERE UNC_PATH = '\\fs1\bar' -- works
SELECT * FROM FOOBAR WHERE UNC_PATH LIKE '\\fs1\bar' -- no workie, although
I would have expected it to. [Like escapes the first pair]
SELECT * FROM FOOBAR WHERE UNC_PATH LIKE E'\\\\fs1\\bar' -- no workie,
although I would have expected it to [literal escape, then like escape]
SELECT * FROM FOOBAR WHERE UNC_PATH LIKE E'\\\\fs1\\bar' ESCAPE '' --
works. [literal escape, not Like escape]
So I guess the rules for string interpretation of backslashes in LIKE are
just different.
A) The backslash always escapes another backslash, regardless of what the
ESCAPE clause is. In fact, it appears that you HAVE to specify some other
escape clause to get it to work
B) You can't use another character to escape a backslash.
SELECT * FROM FOOBAR WHERE UNC_PATH LIKE E'@\@\fs1%' ESCAPE '@' -- does not
work.
If you change the LIKE escape character the "\" no longer has any
special meaning. But to get your new escape character you double-it up as
well. LIKE '@@' matches a single "@" in the example above
In your example you get E'@\@\fs1%' -> E'@{\@}{\f}s1% ->
'@[formfeed]s1%' -> no-idea what '{@[formfeed]}s1%' resolves to. The
literal escaping always occurs first, then the LIKE escaping. Note that the
"{}" denotes a single expression seen by the literal parser.
I think I got it now. LIKE-with-backslash is just different. This has been
a learning experience!