Re: multiline CSV fields - Mailing list pgsql-hackers
From | Andrew Dunstan |
---|---|
Subject | Re: multiline CSV fields |
Date | |
Msg-id | 41947F3C.2050906@dunslane.net Whole thread Raw |
In response to | Re: multiline CSV fields (Bruce Momjian <pgman@candle.pha.pa.us>) |
Responses |
Re: multiline CSV fields
|
List | pgsql-hackers |
This example should fail on data line 2 or 3 on any platform, regardless of the platform's line-end convention, although I haven't tested on Windows. cheers andrew [andrew@aloysius inst]$ bin/psql -e -f csverr.sql ; od -c /tmp/csverrtest.csv create table csverrtest (a int, b text, c int); CREATE TABLE insert into csverrtest values(1,'a',1); INSERT 122471 1 insert into csverrtest values(2,'foo\r\nbar',2); INSERT 122472 1 insert into csverrtest values(3,'baz\nblurfl',3); INSERT 122473 1 insert into csverrtest values(4,'d',4); INSERT 122474 1 insert into csverrtest values(5,'e',5); INSERT 122475 1 copy csverrtest to '/tmp/csverrtest.csv' csv; COPY truncate csverrtest; TRUNCATE TABLE copy csverrtest from '/tmp/csverrtest.csv' csv; psql:cvserr.sql:9: ERROR: literal carriage return found in data HINT: Use "\r" to represent carriage return. CONTEXT: COPY csverrtest, line 2: "2,"foo" drop table csverrtest; DROP TABLE 0000000 1 , a , 1 \n 2 , " f o o \r \n b a 0000020 r " , 2 \n 3 , " b a z \n b l u r 0000040 f l " , 3 \n 4 , d , 4 \n 5 , e , 0000060 5 \n 0000062 [andrew@aloysius inst]$ Bruce Momjian wrote: >Can I see an example of such a failure line? > >--------------------------------------------------------------------------- > >Andrew Dunstan wrote: > > >>Darcy Buskermolen has drawn my attention to unfortunate behaviour of >>COPY CSV with fields containing embedded line end chars if the embedded >>sequence isn't the same as those of the file containing the CSV data. In >>that case we error out when reading the data in. This means there are >>cases where we can produce a CSV data file which we can't read in, which >>is not at all pleasant. >> >>Possible approaches to the problem: >>. make it a documented limitation >>. have a "csv read" mode for backend/commands/copy.c:CopyReadLine() that >>relaxes some of the restrictions on inconsistent line endings >>. escape embedded line end chars >> >>The last really isn't an option, because the whole point of CSVs is to >>play with other programs, and my understanding is that those that >>understand multiline fields (e.g. Excel) expect them not to be escaped, >>and do not produce them escaped. >> >>So right now I'm tossing up in my head between the first two options. Or >>maybe there's another solution I haven't thought of. >> >>Thoughts? >> >>cheers >> >>andrew >> >>---------------------------(end of broadcast)--------------------------- >>TIP 1: subscribe and unsubscribe commands go to majordomo@postgresql.org >> >> >> > > >
pgsql-hackers by date: