Re: Escape handling in COPY, strings, psql - Mailing list pgsql-hackers

From Bruce Momjian
Subject Re: Escape handling in COPY, strings, psql
Date
Msg-id 200505300350.j4U3orF01875@candle.pha.pa.us
In response to Re: Escape handling in COPY, strings, psql  (Peter Eisentraut <peter_e@gmx.net>)
Responses Re: Escape handling in COPY, strings, psql  (Peter Eisentraut <peter_e@gmx.net>)
List pgsql-hackers
Peter Eisentraut wrote:
> Bruce Momjian wrote:
> > >  I am thinking the only clean solution is to add a special keyword
> > > like ESCAPE before strings that contain escape information.  I
> > > think a GUC is too general.  You know if the string is a constant
> > > if it contains escapes just by looking at it, and if it is a
> > > variable, hopefully you know if it has escapes.
> 
> I do support gradually phasing out backslash escapes in standard string 
> literals in the interest of portability.  Most of the current escape 
> sequences are of limited value anyway.  Let's think about ways to get 
> there:
> 
> Enabling escape sequences in string literals controls the formatting of 
> input (and output?) data, so it is akin to, say, the client encoding 
> and the date style, so a GUC variable isn't out of the question in my 
> mind.  It makes most sense, though, if we want to eventually make users 
> switch it off all the time, that is, as a transition aid.  But before 
> that can happen, we need to come up with an alternative mechanism to 
> enter weird characters.
> 
> One such way may be to provide functions (say, chr(), tab(), etc.) to 
> give access to unprintable characters, but that will result in terrible 
> performance for long strings and it also won't help with COPY or places 
> where only literals are allowed.
> 
> Another way would be to allow escape sequences only in specially marked 
> strings.  The proposal above doing 'foo' ESCAPE 'x' seems fairly 
> elegant for SQL linguists but would be pretty weird to implement in the 
> lexer.  It won't help with COPY either, but that is really the case for 
> all solutions.

I was suggesting ESCAPE 'string' or ESC 'string'.  The marker has to come
before the string so scan.l can alter its processing of the string.
After the string is too late --- there is no way to undo any escaping
that has already happened, and the string might already have been used
by gram.y.
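
To illustrate why the marker has to precede the string, here is a rough
Python sketch of a scanner that decides how to treat backslashes before it
reads the body of the literal.  It is not scan.l: scan_literal() and the
ESCAPES table are hypothetical names, it uses the E prefix letter from the
quoted proposal as the marker, and only \n, \t and \\ are recognized.

```python
# Hypothetical sketch, not the real scan.l logic.
# Assumption: only \n, \t and \\ are treated as escape sequences.
ESCAPES = {"n": "\n", "t": "\t", "\\": "\\"}

def scan_literal(src):
    """Decode one SQL string literal, optionally prefixed with E."""
    # The prefix is inspected first, so the escape mode is fixed
    # before any character of the body is consumed.
    escapes_on = src[:1] in ("E", "e")
    body = src[1:] if escapes_on else src
    if len(body) < 2 or body[0] != "'" or body[-1] != "'":
        raise ValueError("not a string literal")
    out = []
    i, end = 1, len(body) - 1
    while i < end:
        ch = body[i]
        if ch == "'":
            # A quote inside the literal must be doubled.
            if i + 1 < end and body[i + 1] == "'":
                out.append("'")
                i += 2
                continue
            raise ValueError("unescaped quote")
        if ch == "\\" and escapes_on:
            if i + 1 >= end:
                raise ValueError("dangling backslash")
            nxt = body[i + 1]
            # Unknown escapes fall back to the escaped character itself.
            out.append(ESCAPES.get(nxt, nxt))
            i += 2
            continue
        # Without the marker a backslash is an ordinary character.
        out.append(ch)
        i += 1
    return "".join(out)
```

With the marker, scan_literal(r"E'a\tb'") yields a tab between a and b;
without it, scan_literal(r"'a\tb'") keeps the backslash and the t as two
ordinary characters.  A marker placed after the literal would arrive only
once one of these two results had already been produced.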

I could probably hack up a sample implementation if people are
interested.

> A more compact representation may be using a prefix letter, like E'foo'.  
> This fits the SQL syntax, is familiar with Python programmers (although 
> in the other direction), and can be implemented efficiently in the 
> lexer.  I like that the best, personally.
> 
> For COPY, we would probably have to use a flag in the COPY command 
> itself either way (like already done for NULL AS).

I agree with Tom that COPY has to be left unchanged.  The fundamental
problem is the representation of NULL values, which I don't think we can
do without some escape mechanism.  Single quotes can be escaped by
doubling them, but once you need to represent something more, like
NULLs, I can't think of a solution without escapes.
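
As a rough illustration of the NULL problem, the Python sketch below
mimics the COPY text-format convention in which NULL is written as \N and
a backslash in the data is doubled.  encode_field() and decode_field() are
hypothetical helpers, not backend code, and they handle only tab, newline
and backslash.

```python
# Hypothetical helpers; a simplified model of COPY text-format fields.
# Assumption: only tab, newline and backslash need escaping here.
UNESCAPE = {"t": "\t", "n": "\n", "\\": "\\"}

def encode_field(value):
    """Encode one column value; None stands for SQL NULL."""
    if value is None:
        return "\\N"
    # Backslashes are doubled first, so data that happens to be the
    # two characters \N cannot be confused with the NULL marker.
    return (value.replace("\\", "\\\\")
                 .replace("\t", "\\t")
                 .replace("\n", "\\n"))

def decode_field(text):
    """Reverse encode_field()."""
    if text == "\\N":
        return None
    out = []
    i = 0
    while i < len(text):
        if text[i] == "\\" and i + 1 < len(text):
            nxt = text[i + 1]
            out.append(UNESCAPE.get(nxt, nxt))
            i += 2
        else:
            out.append(text[i])
            i += 1
    return "".join(out)
```

Here a NULL and a two-character string made of a backslash and N encode
differently (\N versus \\N), which is exactly the distinction that is lost
once the escape mechanism is taken away.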

-- 
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 359-1001
  +  If your life is a hard drive,     |  13 Roberts Road
  +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073
 

