Re: multiline CSV fields - Mailing list pgsql-hackers

From Bruce Momjian
Subject Re: multiline CSV fields
Date
Msg-id 200411301934.iAUJY6u04191@candle.pha.pa.us
In response to Re: multiline CSV fields  (Andrew Dunstan <andrew@dunslane.net>)
List pgsql-hackers
Andrew Dunstan wrote:
> 
> 
> Bruce Momjian wrote:
> 
> >I am wondering if one good solution would be to pre-process the input
> >stream in copy.c to convert newline to \n and carriage return to \r and
> >double data backslashes and tell copy.c to interpret those like it does
> >for normal text COPY files.  That way, the changes to copy.c might be
> >minimal; basically, place a filter in front of the CSV file that cleans
> >up the input so it can be more easily processed.
> >  
> >
> 
> This would have to parse the input stream, because you would need to 
> know which CRs and LFs were part of the data stream and so should be 
> escaped, and which really ended data lines and so should be left alone. 
> However, while the idea is basically sound, parsing the stream twice 
> seems crazy. My argument has been that at this stage in the dev cycle we 
> should document the limitation, maybe issue a warning as you want, and 
> make the more invasive code changes to fix it properly in 8.1. If you 

OK, right.
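
To illustrate why the pre-processing filter has to parse the stream, here is a minimal sketch in Python (not the C of copy.c). The function name escape_embedded_newlines is hypothetical, and the sketch assumes the CSV quote character is also the escape character, so a doubled quote inside a quoted field simply toggles the state twice. It escapes CR and LF only while inside a quoted field and leaves record-ending newlines alone:

```python
def escape_embedded_newlines(stream: str, quote: str = '"') -> str:
    """Escape CR/LF that occur inside quoted CSV fields.

    Hypothetical helper for illustration only; assumes quote == escape
    character, so '""' inside a quoted field toggles the state twice.
    """
    out = []
    in_quotes = False
    for ch in stream:
        if ch == quote:
            # Entering or leaving a quoted field.
            in_quotes = not in_quotes
            out.append(ch)
        elif ch == "\\":
            # Double data backslashes so the text-mode reader keeps them.
            out.append("\\\\")
        elif in_quotes and ch == "\n":
            out.append("\\n")   # embedded line feed becomes the two characters \ n
        elif in_quotes and ch == "\r":
            out.append("\\r")   # embedded carriage return becomes \ r
        else:
            # Includes CR/LF outside quotes: these really end a data line.
            out.append(ch)
    return "".join(out)
```

Even this small filter needs the in_quotes state, which is the parsing work that the later attribute-splitting step would then repeat.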

> don't want to wait, then following your train of thought a bit, ISTM 
> that the correct solution is a routine for CSV mode that combines the 
> functions of CopyReadAttributeCSV() and CopyReadLine(). Then we'd have a 
> genuine and fast fix for Greg's and Darcy's problem.

So we are fine for 8.0, apart from adding the warning, and you think we
can fix it properly in 8.1.  Good.
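
As a rough picture of what a combined routine could look like, here is a minimal single-pass sketch in Python. It is not the PostgreSQL implementation; read_csv_record and its parameters are hypothetical names, and it assumes the quote character doubles as the escape character. It finds the end of the record and splits the attributes in the same scan, so a newline inside a quoted field never ends the line:

```python
def read_csv_record(buf: str, pos: int = 0, delim: str = ",", quote: str = '"'):
    """Read one CSV record starting at pos; return (fields, next_pos).

    Hypothetical sketch: line reading and attribute splitting in one pass.
    """
    fields = []
    field = []
    in_quotes = False
    n = len(buf)
    while pos < n:
        ch = buf[pos]
        if in_quotes:
            if ch == quote:
                if pos + 1 < n and buf[pos + 1] == quote:
                    # Doubled quote inside a quoted field is a literal quote.
                    field.append(quote)
                    pos += 2
                    continue
                in_quotes = False       # closing quote
            else:
                field.append(ch)        # may be an embedded CR or LF
            pos += 1
        elif ch == quote:
            in_quotes = True            # opening quote
            pos += 1
        elif ch == delim:
            fields.append("".join(field))
            field = []
            pos += 1
        elif ch in "\r\n":
            # Unquoted newline ends the record; treat CRLF as one terminator.
            pos += 1
            if ch == "\r" and pos < n and buf[pos] == "\n":
                pos += 1
            break
        else:
            field.append(ch)
            pos += 1
    fields.append("".join(field))
    return fields, pos


# Usage: walk a buffer record by record.
data = 'a,"x\ny"\nb,c\n'
pos = 0
records = []
while pos < len(data):
    record, pos = read_csv_record(data, pos)
    records.append(record)
```

Because the quote state is tracked while scanning for the line end, the input is parsed only once.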

-- 
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 359-1001
  +  If your life is a hard drive,     |  13 Roberts Road
  +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073