Re: COPY formatting - Mailing list pgsql-hackers

From Andrew Dunstan
Subject Re: COPY formatting
Date
Msg-id 405B432A.9030407@dunslane.net
In response to Re: COPY formatting  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers
Tom Lane wrote:

>Andrew Dunstan <andrew@dunslane.net> writes:
>
>>There are some wrinkles, though, concerning the interaction of CSV's
>>notion of escaping and COPY's notion of escaping. If someone wants to
>>undertake this I can flesh those out in a further email.
>
>Please do that, so that the info is in the archives in case someone else
>wants to tackle the project.

Briefly:

According to my understanding, in a CSV file backslash has no magical 
meaning unless it is the escape character, in which case we only expect 
to find it prefacing either itself or the quote character inside a 
quoted field. Otherwise, it is just another character.
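
As a rough illustration of that reading, here is a minimal Python sketch. The function name split_csv_line and its parameters are hypothetical and are not part of COPY or any existing patch. It treats the escape character as special only inside a quoted field, and only when it precedes itself or the quote character:

```python
def split_csv_line(line, delim=",", quote='"', escape="\\"):
    """Split one CSV line into fields (hypothetical helper, not COPY code).

    The escape character is special only inside a quoted field, and only
    when it is followed by itself or by the quote character. Everywhere
    else it is copied through as ordinary data.
    """
    fields, buf, in_quotes = [], [], False
    i = 0
    while i < len(line):
        ch = line[i]
        if in_quotes:
            nxt = line[i + 1] if i + 1 < len(line) else ""
            if ch == escape and nxt in (escape, quote):
                buf.append(nxt)      # escaped escape or escaped quote
                i += 2
                continue
            if ch == quote:
                in_quotes = False    # closing quote
            else:
                buf.append(ch)       # includes a lone escape character
        elif ch == quote:
            in_quotes = True         # opening quote
        elif ch == delim:
            fields.append("".join(buf))
            buf = []
        else:
            buf.append(ch)           # backslash is plain data out here
        i += 1
    fields.append("".join(buf))
    return fields
```

Passing escape='"' gives the doubled-quote convention, since the quote character then escapes itself.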

One way of handling this might be to have two modes of COPY processing:
. if a quote delimiter is specified, turn off all of COPY's usual
backslash processing, and make the default NULL marker the empty string
. if no quote delimiter is specified, act as now.

OTOH it might be a good idea to be able to turn off backslash processing 
even without a quote delimiter, e.g. in a CSV-like file using tab as the 
delimiter and no quote escaping, so maybe another switch on COPY would 
be a better way to go.
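
The two alternatives above can be sketched together in Python. All names here (CopySettings, resolve_settings, the backslash argument) are hypothetical and describe no existing COPY option. The sketch assumes that an explicit switch, when given, overrides the default implied by the quote delimiter, and that the NULL marker default depends only on whether a quote delimiter is present:

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class CopySettings:
    backslash_processing: bool
    null_marker: str


def resolve_settings(quote: Optional[str] = None,
                     backslash: Optional[bool] = None,
                     null_marker: Optional[str] = None) -> CopySettings:
    """Work out effective settings (hypothetical, for illustration only)."""
    if backslash is None:
        # Two-mode default: a quote delimiter turns backslash processing off.
        backslash = quote is None
    if null_marker is None:
        # Assumption: \N as in COPY's text format when there is no quote
        # delimiter, the empty string when there is one.
        null_marker = "\\N" if quote is None else ""
    return CopySettings(backslash, null_marker)
```

With this shape, resolve_settings(backslash=False) covers the tab-delimited, unquoted case, while resolve_settings(quote='"') gives the CSV-style defaults.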

Another issue I wondered about is how to specify nicely that TAB is the
field delimiter - I hate putting a literal semantic tab in files, and
consider its magical use in places like Makefiles and syslog.conf files
some of the worst decisions ever made in computing ;-). I'd like a nicer
*visible* way of specifying it, maybe with either \t or ^I.
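
A small Python sketch of what accepting such visible spellings could look like. The function parse_delimiter is hypothetical, and the caret rule below is just the usual control-character notation, not something COPY accepts today:

```python
def parse_delimiter(spec):
    """Turn a visible delimiter spec into one character (hypothetical).

    Accepts the two-character string backslash-t, caret notation such
    as ^I, or a single literal character.
    """
    if spec == "\\t":
        return "\t"
    if len(spec) == 2 and spec[0] == "^" and "@" <= spec[1].upper() <= "_":
        # Caret notation: ^@ is 0, ^A is 1, ..., ^I is 9 (tab).
        return chr(ord(spec[1].upper()) - 64)
    if len(spec) == 1:
        return spec
    raise ValueError("unrecognized delimiter spec: %r" % (spec,))
```

Either spelling keeps the tab visible in the command text, so no literal tab character has to appear in a script.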

I'm sure other issues will arise - that's all that's in my head for the 
moment :-)

cheers

andrew

