Escape handling in COPY, strings, psql - Mailing list pgsql-hackers

From Bruce Momjian
Subject Escape handling in COPY, strings, psql
Date
Msg-id 200505290358.j4T3w1n25524@candle.pha.pa.us
List pgsql-hackers
Tom Lane wrote:
> Bruce Momjian <pgman@candle.pha.pa.us> writes:
> > Here is an updated version of the COPY \x patch.  It is the first patch
> > attached.
> > Also, I realized that if we support \x in COPY, we should also support
> > \x in strings to the backend.  This is the second patch.
> 
> Do we really want to do any of these things?  We've been getting beaten
> up recently about the fact that we have non-SQL-spec string escapes
> (ie, all the backslash stuff) so I'm a bit dubious about adding more,
> especially when there's little if any demand for it.

I thought about that, but adding additional escape letters isn't our
problem --- it is the escape mechanism itself that is the issue.

I have wanted to post on this issue so now is a good time.  I think we
have been validly beaten up in that we pride ourselves on standards
compliance but have an escape requirement on all strings.  Our string
escapes are a major problem --- not the number of them but the
requirement to double backslashes on input, like 'C:\\tmp'.  I am
thinking the only clean solution is to add a special keyword like ESCAPE
before strings that contain escape information.  I think a GUC is too
general.  If the string is a constant, you know whether it contains
escapes just by looking at it, and if it is a variable, hopefully you
know whether it has escapes.

Basically, I think we have to deal with this somehow.  I think it could
be implemented by looking for the ESCAPE keyword in parser/scan.l and
handling it all in there by ignoring backslash escapes unless ESCAPE
precedes the string.  By the time you are in gram.y, it is too late.
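
To make the proposal concrete, here is a rough sketch in Python rather than
flex.  It is not the scan.l code and not a patch: decode_literal and its
escape_prefixed flag are hypothetical names, and the set of single-letter
escapes is only an assumption for illustration.  It shows a literal body
being decoded one way when the proposed ESCAPE keyword precedes it and
another way when it does not.

```python
def decode_literal(body: str, escape_prefixed: bool) -> str:
    """Decode the raw text found between the quotes of a string literal.

    Hypothetical helper. escape_prefixed stands for "the proposed ESCAPE
    keyword appeared before the literal". Without it, a backslash is an
    ordinary character and only the doubled quote '' is special.
    """
    # Assumed escape letters, for illustration only.
    simple = {"n": "\n", "t": "\t", "r": "\r", "b": "\b", "f": "\f"}
    out = []
    i = 0
    while i < len(body):
        ch = body[i]
        if ch == "'" and body[i + 1 : i + 2] == "'":
            # Doubled quote: one literal quote in either mode.
            out.append("'")
            i += 2
        elif ch == "\\" and escape_prefixed and i + 1 < len(body):
            # Backslash escapes are honoured only for ESCAPE-prefixed literals.
            nxt = body[i + 1]
            out.append(simple.get(nxt, nxt))
            i += 2
        else:
            out.append(ch)
            i += 1
    return "".join(out)
```

With this, the body C:\tmp keeps its backslash when no keyword is present,
while an ESCAPE-prefixed literal still needs C:\\tmp to produce the same
value, because there \t would be read as a tab.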

> I don't object too much to the COPY addition, since that's outside any
> spec anyway, but I do think we ought to think twice about adding this
> to SQL literal handling.
> 
> > Third, I found out that psql has some unusual handling of escaped
> > numbers.  Instead of using \ddd as octal, it treats \ddd as decimal,
> > \0ddd as octal, and \0xddd as hexadecimal.  It is basically following
> > the strtol() rules for an escaped value.  This seems confusing and
> > contradicts how the rest of our system works.
> 
> I agree, that's just going to confuse people.
> 
> > ! xqescape        [\\][^0-7x]
> 
> If you are going to insist on this, at least make it case-insensitive.

The submitted COPY patch also was case-insensitive, \x and \X, but I
changed that because we are case-sensitive for all backslashes in COPY,
and C is the same (\n and \N are different too, so we actually use the
case-sensitivity).  Should we allow \X just so it is case-insensitive
like the SQL specification X'4f'?  That is the only logic I can think of
for it to be case-insensitive, but we have to then do that at all
levels, and I am not sure it makes sense.
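
For illustration, a small sketch of what case-sensitive handling means for a
COPY text field.  This is Python, not the patch itself; decode_copy_field is
a hypothetical name, the escape table is abbreviated, and taking at most two
hex digits after \x is an assumption made for the example.

```python
def decode_copy_field(raw: str):
    """Decode one COPY text-format field, treating escapes case-sensitively.

    Hypothetical helper: returns None for the NULL marker \\N, otherwise the
    decoded string.
    """
    if raw == r"\N":
        return None  # upper-case N is the NULL marker, unlike \n
    simple = {"n": "\n", "t": "\t", "r": "\r", "\\": "\\"}
    hexdigits = "0123456789abcdefABCDEF"
    out = []
    i = 0
    while i < len(raw):
        ch = raw[i]
        if ch != "\\" or i + 1 == len(raw):
            out.append(ch)
            i += 1
            continue
        nxt = raw[i + 1]
        i += 2
        if nxt == "x":
            # Lower-case x only: read up to two hex digits (assumed limit).
            j = i
            while j < len(raw) and j - i < 2 and raw[j] in hexdigits:
                j += 1
            if j > i:
                out.append(chr(int(raw[i:j], 16)))
                i = j
            else:
                out.append("x")  # no digits followed, keep the letter
        else:
            # Unknown letters, including upper-case X, stand for themselves.
            out.append(simple.get(nxt, nxt))
    return "".join(out)
```

Here \x41 decodes to A while \X41 is left as the three characters X41, in
the same way that \n is a newline and \N is NULL.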

-- 
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 359-1001
  +  If your life is a hard drive,     |  13 Roberts Road
  +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073
 

