Re: storing strings with embedded '\' - Mailing list pgsql-sql

From Tom Lane
Subject Re: storing strings with embedded '\'
Date
Msg-id 16213.916154247@sss.pgh.pa.us
Whole thread Raw
In response to storing strings with embedded '\'  ("Joel Fischer" <joelf@min.ascend.com>)
List pgsql-sql
"Joel Fischer" <joelf@min.ascend.com> writes:
> Hello, interested in storing strings with embedded '\' characters?
> Using \\ appears to store \\.  Suggestions?

You didn't say what frontend you are using, nor what version.  But it
could be that everything's fine and you're just misinterpreting what
you see on output.

The backend itself expects backslashes to be doubled in data it
receives.  There are actually two different rules:

* In string constants in SQL statements, backslash works somewhat
like it does in C: it's an escape that causes the next character(s)
to be interpreted specially.  For example \n gets converted to a
linefeed.  If you want a literal backslash you have to write \\.
Also, if you want a quote character you write \' to keep it from getting
interpreted as the end of the string.  This is all required by the
SQL standard.

* In COPY data, backslash is still special, but it's used for fewer
cases.  \N means a null field, \\ means a backslash data character,
and you also have to backslash return and tab data characters to keep
them from being interpreted as field or row separators.

Now, *output* from the backend works a little differently: in the
results of an ordinary SELECT statement, the data is just given
verbatim.  (The frontend/backend protocol doesn't need any quoting
since everything is treated as counted strings.)  COPY OUT quotes the
outgoing data as above, so that it will be interpreted properly by
COPY IN.

So the question is, what frontend code are you using, and what might it
be doing to the data?  My guess is that you are using an old version of
psql that doubles backslashes during output --- ie, it takes the single
backslash that comes back from the backend after a SELECT and prints it
as \\.  There's only one backslash in the stored data, though, and
that's what you'd see if you were programming in C or Tcl or Perl rather
than going through psql.

We got rid of that psql behavior in 6.4 because the consensus was that
it was confusing.  Now psql just prints what the backend sends.

"Frank Morton" <fmorton@base2inc.com> writes:
> This is a real pain. What I do is have a "cleanString" method that
> I pass everything through when constructing SQL statements
> that changes all '\' to '\\' and all seems to work just fine.

Yes, you have to do that because the SQL spec says so.  I hope your
subroutine is set up to quote ' as well.

> Also note that pg_dump has a bug in this area that I am unable
> to get attention from anyone. In my setting, if I pass "\net" in
> the actual SQL it turns into "\\net" but when read as part of a select,
> it is "\net" as expected. However, pg_dump does not double-up
> the backslash, so it is unusable to psql.

Um, I just checked this and it appears to work fine in 6.4.  What
release are you using?

            regards, tom lane

pgsql-sql by date:

Previous
From: "Neil Burrows"
Date:
Subject: A path through a tree
Next
From: David Martinez Cuevas
Date:
Subject: Re: [SQL] A path through a tree