Thread: unc paths, like and backslashes on 8.4

unc paths, like and backslashes on 8.4

From
Chris Curvey
Date:
I just don't get how we are supposed to use LIKE with backslashes in strings in 8.4.  This is particularly vexing, because I have a field containing UNC paths that I need to search on (and eventually update).  I have been looking at this page for guidance:  http://www.postgresql.org/docs/8.4/static/functions-matching.html

So I will ask my questions first, then show you what I tried:

1) Why do I get a warning when doubling a backslash?  
2) What is the meaning of "E" syntax (E'\\\\fs1\\bar')?
3) If I have backslashes in my table, how can I get them back out?
4) I'd like to run an update to change the value '\\fs1\bar' to \\fs1\foo\bar'.  What incantation would do that.

So, trying to figure it out on my own...

CREATE TABLE FOOBAR
(  UNC_PATH VARCHAR(100)
);

/* first insert attempt */
INSERT INTO FOOBAR VALUES ('\\FS1\BAR');

returns a warning:

WARNING:  nonstandard use of \\ in a string literal
LINE 1: INSERT INTO FOOBAR VALUES ('\\FS1\BAR');
                                   ^
HINT:  Use the escape string syntax for backslashes, e.g., E'\\'.
Query returned successfully: 1 row affected, 21 ms execution time.

but the row is inserted.  There is one leading backslash, and the "b" is some unprintable character.  Let's try the "E" syntax, whatever that is:

INSERT INTO FOOBAR VALUES (E'\\FS1\BAR');

No warning, but exactly the same results again (one leading backslash, "b" replaced by unprintable char).  Let's try E with doubled backslashes:

INSERT INTO FOOBAR VALUES (E'\\\\FS1\\BAR');

okay, that worked.  Yay.   Now let's see if I can get the record back out with "LIKE":

SELECT * FROM FOOBAR WHERE UNC_PATH LIKE E'\\\\FS1%';

That gets back a record, but the value returned is "\FS1BAR". I'm missing two backslashes.  I'm too confused to even attempt the update.

-Chris

Re: unc paths, like and backslashes on 8.4

From
David Johnston
Date:
On Sep 26, 2012, at 20:42, Chris Curvey <chris@chriscurvey.com> wrote:

I just don't get how we are supposed to use LIKE with backslashes in strings in 8.4.  This is particularly vexing, because I have a field containing UNC paths that I need to search on (and eventually update).  I have been looking at this page for guidance:  http://www.postgresql.org/docs/8.4/static/functions-matching.html

So I will ask my questions first, then show you what I tried:

1) Why do I get a warning when doubling a backslash?  
2) What is the meaning of "E" syntax (E'\\\\fs1\\bar')?
3) If I have backslashes in my table, how can I get them back out?
4) I'd like to run an update to change the value '\\fs1\bar' to \\fs1\foo\bar'.  What incantation would do that.

So, trying to figure it out on my own...

CREATE TABLE FOOBAR
(  UNC_PATH VARCHAR(100)
);

/* first insert attempt */
INSERT INTO FOOBAR VALUES ('\\FS1\BAR');

returns a warning:

WARNING:  nonstandard use of \\ in a string literal
LINE 1: INSERT INTO FOOBAR VALUES ('\\FS1\BAR');
                                   ^
HINT:  Use the escape string syntax for backslashes, e.g., E'\\'.
Query returned successfully: 1 row affected, 21 ms execution time.

but the row is inserted.  There is one leading backslash, and the "b" is some unprintable character.  Let's try the "E" syntax, whatever that is:

INSERT INTO FOOBAR VALUES (E'\\FS1\BAR');

No warning, but exactly the same results again (one leading backslash, "b" replaced by unprintable char).  Let's try E with doubled backslashes:

INSERT INTO FOOBAR VALUES (E'\\\\FS1\\BAR');

okay, that worked.  Yay.   Now let's see if I can get the record back out with "LIKE":

SELECT * FROM FOOBAR WHERE UNC_PATH LIKE E'\\\\FS1%';

That gets back a record, but the value returned is "\FS1BAR". I'm missing two backslashes.  I'm too confused to even attempt the update.

-Chris


First, please read the follow section of the docs, though especially 4.1.2

http://www.postgresql.org/docs/8.4/interactive/sql-syntax-lexical.html

Note the callout regarding standard conforming strings.

Since LIKE is an escapable pattern and you are using it in an escapable string literal the backslashes behave as such:

"\\\\" perform string literal escape -> "\\" perform like escape -> "\"

So on the first pass the four become two since each pair represents a single backslash post-literal-escape.  Then the pair supplied to the LIKE becomes one post-like-escape.

Post back here if the reason and behavior of E'' is still unclear after reading the documentation.

David J.




Re: unc paths, like and backslashes on 8.4

From
"Albe Laurenz"
Date:
Chris Curvey wrote:
> 1) Why do I get a warning when doubling a backslash?
> 2) What is the meaning of "E" syntax (E'\\\\fs1\\bar')?
> 3) If I have backslashes in my table, how can I get them back out?
> 4) I'd like to run an update to change the value '\\fs1\bar' to
\\fs1\foo\bar'.  What incantation
> would do that.
>
> So, trying to figure it out on my own...
>
> CREATE TABLE FOOBAR
> (  UNC_PATH VARCHAR(100)
> );
>
> /* first insert attempt */
> INSERT INTO FOOBAR VALUES ('\\FS1\BAR');
>
> returns a warning:
>
> WARNING:  nonstandard use of \\ in a string literal
> LINE 1: INSERT INTO FOOBAR VALUES ('\\FS1\BAR');
>                                    ^
> HINT:  Use the escape string syntax for backslashes, e.g., E'\\'.
> Query returned successfully: 1 row affected, 21 ms execution time.
>
> but the row is inserted.  There is one leading backslash, and the "b"
is some unprintable character.

You have standard_conforming_strings set to "off" and
escape_string_warning set to "on".

So backslash sequences (backslash + something) are not taken literally,
but interpreted as escape sequences.

> Let's try the "E" syntax, whatever that is:
>
> INSERT INTO FOOBAR VALUES (E'\\FS1\BAR');
>
> No warning, but exactly the same results again (one leading backslash,
"b" replaced by unprintable
> char).  Let's try E with doubled backslashes:

You are insinuating that the "E" syntax is not well-documented.
Have you tried to read up on it?
What happens here is that you got rid of the warning because
you explicitly said "I'm going to use escape sequences".

> INSERT INTO FOOBAR VALUES (E'\\\\FS1\\BAR');
>
> okay, that worked.  Yay.   Now let's see if I can get the record back
out with "LIKE":
>
> SELECT * FROM FOOBAR WHERE UNC_PATH LIKE E'\\\\FS1%';
>
> That gets back a record, but the value returned is "\FS1BAR". I'm
missing two backslashes.  I'm too
> confused to even attempt the update.

In LIKE expressions, backslash acts as an escape character,
so you have double escaping: once from the LIKE pattern, and
again from standard_conforming_strings.

You'd have to write:

SELECT * FROM FOOBAR WHERE UNC_PATH LIKE E'\\\\\\\\FS1%';

or say that backslash is not an escape character:

SELECT * FROM FOOBAR WHERE UNC_PATH LIKE E'\\\\FS1%' ESCAPE '';

Yours,
Laurenz Albe


Re: unc paths, like and backslashes on 8.4

From
Chris Curvey
Date:
Many thanks to David and Albe for their kind assistance.  I've looked at the docs and run some experiments, and this is what I found.  I'm going to answer my own questions in a slightly different order, because it seems to make the explanation more logical.  Note that the rules appear to be different for LIKE clauses. (more on that later)

2) What is the meaning of "E" syntax (E'\\\\fs1\\bar')?

the "E" syntax allows you to enter "special" characters into a field (\n = newline, \t = tab, \b = backspace, etc).  A double-backslash is interpreted as a backslash. Any character besides backslash and b,f,n,r,t,x,0,1,2,3,4,5,6,7,8,9 is taken literally and the backslash is ignored.  (See table 4-1 in the docs for details on what each character means).

Example:  E'eat\tat\njoes' is interpreted as "eat<tab>at<newline>joes"
 
1) Why do I get a warning when doubling a backslash?  

because STANDARD_CONFORMING_STRINGS is off and ESCAPE_STRING_WARNING is on (thanks Albe).  Turning on STANDARD_CONFORMING_STRINGS makes a literal backslash "just another character" (unless using E-syntax).  Turning off STANDARD_CONFORMING_STRINGS is the equivalent of using E-syntax for each literal string.  (Are those statements true?)

ESCAPE_STRING_WARNING is there to notify you if you are writing code that may behave differently in the future.  The warning is just a warning, the statement will go through.

Example:  
'eat\tat\njoes' is interpreted as "eat<tab>at<newline>joes" STANDARD_CONFORMING_STRINGS is off
'eat\tat\njoes' is interpreted as "eat[backslash][tee]at[backslash][en]joes" STANDARD_CONFORMING_STRINGS is on

By changing to the 'E' syntax (E'eat\tat\njoes'), you are specifically saying that you want the <tab> and <newline> characters, regardless of how STANDARD_CONFORMING_STRINGS is set.    (I guess there is a logical implication here that if STANDARD_CONFORMING_STRINGS is off, then the only way to get a tab into a field would be to insert a ctrl-I somehow.)

added bonus information that will be obvious to PG masters, but I had to find it: you can find the server setting with "SHOW STANDARD_CONFORMING_STRINGS" and you can set it for your session with "SET STANDARD_CONFORMING_STRINGS=ON".

 
3) If I have backslashes in my table, how can I get them back out?

I'm not sure how I got myself into that situation.
 
4) I'd like to run an update to change the value '\\fs1\bar' to \\fs1\foo\bar'.  What incantation would do that.

SET STANDARD_CONFORMING_STRINGS=ON;

UPDATE FOOBAR
SET UNC_PATH = REPLACE('\bar','\foo\bar')
WHERE UNC_PATH LIKE E'\\\\fs1\bar%' ESCAPE '';

====================================

I've tested the above rules for SELECT, INSERT, and UPDATE, and the rules seem to hold.  They also hold for WHERE clauses when searching for equality.  But they don't seem to hold for LIKE.

SELECT * FROM FOOBAR WHERE UNC_PATH = '\\fs1\bar' -- works
SELECT * FROM FOOBAR WHERE UNC_PATH LIKE '\\fs1\bar' -- no workie, although I would have expected it to.
SELECT * FROM FOOBAR WHERE UNC_PATH LIKE E'\\\\fs1\\bar' -- no workie, although I would have expected it to
SELECT * FROM FOOBAR WHERE UNC_PATH LIKE E'\\\\fs1\\bar' ESCAPE ''  -- works.

So I guess the rules for string interpretation of backslashes in LIKE are just different.  

A) The backslash always escapes another backslash, regardless of what the ESCAPE clause is. In fact, it appears that you HAVE to specify some other escape clause to get it to work 

B) You can't use another character to escape a backslash.

SELECT * FROM FOOBAR WHERE UNC_PATH LIKE E'@\@\fs1%' ESCAPE '@' -- does not work.

I think I got it now.  LIKE-with-backslash is just different.  This has been a learning experience!

Re: unc paths, like and backslashes on 8.4

From
"David Johnston"
Date:

From: pgsql-general-owner@postgresql.org [mailto:pgsql-general-owner@postgresql.org] On Behalf Of Chris Curvey
Sent: Thursday, September 27, 2012 1:44 PM
To: pgsql
Subject: Re: [GENERAL] unc paths, like and backslashes on 8.4

 

Many thanks to David and Albe for their kind assistance.  I've looked at the docs and run some experiments, and this is what I found.  I'm going to answer my own questions in a slightly different order, because it seems to make the explanation more logical.  Note that the rules appear to be different for LIKE clauses. (more on that later)

 

2) What is the meaning of "E" syntax (E'\\\\fs1\\bar')?

 

the "E" syntax allows you to enter "special" characters into a field (\n = newline, \t = tab, \b = backspace, etc).  A double-backslash is interpreted as a backslash. Any character besides backslash and b,f,n,r,t,x,0,1,2,3,4,5,6,7,8,9 is taken literally and the backslash is ignored.  (See table 4-1 in the docs for details on what each character means).

 

Example:  E'eat\tat\njoes' is interpreted as "eat<tab>at<newline>joes"

 

1) Why do I get a warning when doubling a backslash?  

 

because STANDARD_CONFORMING_STRINGS is off and ESCAPE_STRING_WARNING is on (thanks Albe).  Turning on STANDARD_CONFORMING_STRINGS makes a literal backslash "just another character" (unless using E-syntax).  Turning off STANDARD_CONFORMING_STRINGS is the equivalent of using E-syntax for each literal string.  (Are those statements true?)

 

>>>>> Yes, they are.

 

ESCAPE_STRING_WARNING is there to notify you if you are writing code that may behave differently in the future.  The warning is just a warning, the statement will go through.

 

Example:  

'eat\tat\njoes' is interpreted as "eat<tab>at<newline>joes" STANDARD_CONFORMING_STRINGS is off

'eat\tat\njoes' is interpreted as "eat[backslash][tee]at[backslash][en]joes" STANDARD_CONFORMING_STRINGS is on

 

By changing to the 'E' syntax (E'eat\tat\njoes'), you are specifically saying that you want the <tab> and <newline> characters, regardless of how STANDARD_CONFORMING_STRINGS is set.    (I guess there is a logical implication here that if STANDARD_CONFORMING_STRINGS is off, then the only way to get a tab into a field would be to insert a ctrl-I somehow.)

 

added bonus information that will be obvious to PG masters, but I had to find it: you can find the server setting with "SHOW STANDARD_CONFORMING_STRINGS" and you can set it for your session with "SET STANDARD_CONFORMING_STRINGS=ON".

 

 

3) If I have backslashes in my table, how can I get them back out?

 

I'm not sure how I got myself into that situation.

 

4) I'd like to run an update to change the value '\\fs1\bar' to \\fs1\foo\bar'.  What incantation would do that.


SET STANDARD_CONFORMING_STRINGS=ON;

 

UPDATE FOOBAR

SET UNC_PATH = REPLACE('\bar','\foo\bar')

WHERE UNC_PATH LIKE E'\\\\fs1\bar%' ESCAPE '';

 

====================================

I've tested the above rules for SELECT, INSERT, and UPDATE, and the rules seem to hold.  They also hold for WHERE clauses when searching for equality.  But they don't seem to hold for LIKE.

 

SELECT * FROM FOOBAR WHERE UNC_PATH = '\\fs1\bar' -- works

SELECT * FROM FOOBAR WHERE UNC_PATH LIKE '\\fs1\bar' -- no workie, although I would have expected it to. [Like escapes the first pair]

SELECT * FROM FOOBAR WHERE UNC_PATH LIKE E'\\\\fs1\\bar' -- no workie, although I would have expected it to [literal escape, then like escape]

SELECT * FROM FOOBAR WHERE UNC_PATH LIKE E'\\\\fs1\\bar' ESCAPE ''  -- works. [literal escape, not Like escape]

 

So I guess the rules for string interpretation of backslashes in LIKE are just different.  

 

A) The backslash always escapes another backslash, regardless of what the ESCAPE clause is. In fact, it appears that you HAVE to specify some other escape clause to get it to work 

 

B) You can't use another character to escape a backslash.

 

SELECT * FROM FOOBAR WHERE UNC_PATH LIKE E'@\@\fs1%' ESCAPE '@' -- does not work.

 

>>>>>> If you change the LIKE escape character the “\” no longer has any special meaning.  But to get your new escape character you double-it up as well.  LIKE ‘@@’ matches a single “@” in the example above

>>>>>> In your example you get E‘@\@\fs1%’ -> E’@{\@}{\f}s1% -> ‘@[formfeed]s1%’ -> no-idea what ‘{@[formfeed]}s1%’ resolves to.  The literal escaping always occurs first, then the LIKE escaping. Note that the “{}” denotes a single expression seen by the literal parser.

 

I think I got it now.  LIKE-with-backslash is just different.  This has been a learning experience!