On Nov 11, 2004, at 10:07 PM, Andrew Dunstan wrote:
>
>
> Patrick B Kelly wrote:
>
>>
>>
>>
>> My suggestion is to simply have CopyReadLine recognize these two
>> states (in-field and out-of-field) and execute the current logic only
>> while in the second state. It would not be too hard but as you
>> mentioned it is non-trivial.
>>
>>
>>
>
> We don't know what state we expect the end of line to be in until
> after we have actually read the line. To know how to treat the end of
> line on your scheme we would have to parse as we go rather than after
> reading the line as now. Changing this would not only be
> non-trivial but also significantly invasive to the code.
>
>

Perhaps I am misunderstanding the code. As I read it, the code currently
goes through the input character by character looking for NL and EOF
characters. It appears to be very well structured for what I am
proposing. The section in question is a small and clearly defined loop
which reads the input one character at a time and decides when it has
reached the end of the line or file. Each call of CopyReadLine attempts
to get one more line.

I would propose that each call starts out in the out-of-field state and
that the state is toggled by each unescaped quote encountered in the
stream. While in the in-field state, it would only look for the next
unescaped quote. While in the out-of-field state, it would execute the
existing logic and also look for the next unescaped quote.
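
To make the proposal concrete, here is a minimal standalone sketch of the
two-state scan. It is not the actual CopyReadLine code: the function name
find_line_end, its parameters, and the in-memory buffer are all hypothetical,
and the "existing logic" is reduced to looking for a single NL character.

```c
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical helper, not part of PostgreSQL: return the index of the
 * newline that ends the first logical line in buf, or len if no such
 * newline exists (for example, an unterminated quoted field).
 */
size_t
find_line_end(const char *buf, size_t len, char quote, char escape)
{
    bool   in_field = false;   /* every call starts out-of-field */
    size_t i;

    for (i = 0; i < len; i++)
    {
        char c = buf[i];

        /*
         * With a distinct escape character, skip whatever it escapes.
         * When escape == quote, a doubled quote simply toggles the
         * state twice, so no special case is needed.
         */
        if (in_field && escape != quote && c == escape && i + 1 < len)
        {
            i++;
            continue;
        }

        if (c == quote)
        {
            in_field = !in_field;   /* unescaped quote toggles state */
            continue;
        }

        /* Stand-in for the existing end-of-line logic. */
        if (!in_field && c == '\n')
            return i;
    }
    return len;
}
```

For the input a,"b<NL>c",d<NL> this reports the second NL as the end of the
line, because the first one is seen while in the in-field state. The real
loop would also have to keep handling CR, EOF, and reading across buffer
boundaries, which this sketch leaves out.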

I may not be explaining myself well, or I may fundamentally
misunderstand how copy works. I would be happy to code the change and
send it to you for review, if you would be interested in looking it
over and it is felt to be a worthwhile capability.

Patrick B. Kelly
------------------------------------------------------
http://patrickbkelly.org