On Nov 11, 2004, at 2:56 PM, Andrew Dunstan wrote:
>
>
> Tom Lane wrote:
>
>> Andrew Dunstan <andrew@dunslane.net> writes:
>>
>>> Patrick B Kelly wrote:
>>>
>>>> Actually, when I try to export a sheet with multi-line cells from
>>>> Excel, it tells me that this feature is incompatible with the CSV
>>>> format and will not include them in the CSV file.
>>>>
>>
>>
>>> It probably depends on the version. I have just tested with Excel
>>> 2000 on a WinXP machine and it both read and wrote these files.
>>>
>>
>> I'd be inclined to define Excel 2000 as broken, honestly, if it's
>> writing unescaped newlines as data. To support this would mean throwing
>> away most of our ability to detect incorrectly formatted CSV files.
>> A simple error like a missing close quote would look to the machine like
>> the rest of the file is a single long data line where all the newlines
>> are embedded in data fields. How likely is it that you'll get a useful
>> error message out of that? Most likely the error message would point to
>> the end of the file, or at least someplace well removed from the actual
>> mistake.
>>
>> I would vote in favor of removing the current code that attempts to
>> support unquoted newlines, and waiting to see if there are complaints.
>>
>>
>>
>
> This feature was specifically requested when we discussed what sort of
> CSVs we would handle.
>
> And it does in fact work as long as the newline style is the same.
>
> I just had an idea. How about if we add a new CSV option MULTILINE. If
> absent, then on output we would not output unescaped LF/CR characters
> and on input we would not allow fields with embedded unescaped LF/CR
> characters. In both cases we could error out for now, with perhaps an
> 8.1 TODO to provide some other behaviour.
>
> Or we could drop the whole multiline "feature" for now and make the
> whole thing an 8.1 item, although it would be a bit of a pity when it
> does work in what will surely be the most common case.
>
What about just coding an FSM (finite state machine) into
backend/commands/copy.c:CopyReadLine() that does not treat any flavor
of newline character as a line terminator while it is inside a data
field?
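
To make the idea concrete, here is a rough sketch of such a state machine.
It is written in Python for brevity rather than the C of copy.c, and it is
not the actual CopyReadLine() code. The function name split_csv_records is
made up for illustration, and the sketch assumes that fields are quoted
with a double quote and that embedded quotes are escaped by doubling them.

```python
def split_csv_records(data, quote='"'):
    """Split raw CSV text into records with a two-state machine.

    Hypothetical helper for illustration only. CR, LF and CRLF end a
    record when seen outside quotes; inside quotes they are kept as data.
    Returns (records, in_quotes); in_quotes is True when the input ended
    while a quoted field was still open.
    """
    records = []
    current = []
    in_quotes = False
    i = 0
    n = len(data)
    while i < n:
        ch = data[i]
        if ch == quote:
            # A doubled quote inside a quoted field toggles the state
            # twice, so it needs no special case here.
            in_quotes = not in_quotes
            current.append(ch)
        elif ch in "\r\n" and not in_quotes:
            # Outside quotes: end of record. Treat CRLF as one terminator.
            if ch == "\r" and i + 1 < n and data[i + 1] == "\n":
                i += 1
            records.append("".join(current))
            current = []
        else:
            # Ordinary data, including any newline inside a quoted field.
            current.append(ch)
        i += 1
    if current:
        # Last record had no trailing newline, or a quote was left open.
        records.append("".join(current))
    return records, in_quotes
```

With this, 'a,"x\ny",b\r\nc,d\n' splits into two records and the embedded
LF stays inside the first one. It also shows the problem Tom describes:
if the close quote is missing, everything after the open quote ends up in
one long record, and the only hint is the in_quotes flag at end of input.
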
Patrick B. Kelly
------------------------------------------------------ http://patrickbkelly.org