Re: COPY TO STDOUT WITH (FORMAT CSV, HEADER), and embedded newlines - Mailing list pgsql-general

From Francisco Olarte
Subject Re: COPY TO STDOUT WITH (FORMAT CSV, HEADER), and embedded newlines
Date
Msg-id CA+bJJbywV2J0sSFxoVhnAb59oask-=RbOctf9BvepROw0PNsjg@mail.gmail.com
In response to Re: COPY TO STDOUT WITH (FORMAT CSV, HEADER), and embedded newlines  ("Daniel Verite" <daniel@manitou-mail.org>)
List pgsql-general
Hi Daniel:

On Fri, 11 Mar 2022 at 19:38, Daniel Verite <daniel@manitou-mail.org> wrote:
> > These values are 'normal'. I'm not used to CSV, but I suppose such
> > newlines must be encoded, perhaps as \n, since AFAIK CSV needs to be
> > 1 line per row, no?
> No, but such fields must be enclosed by double quotes, as documented
> in RFC 4180 https://datatracker.ietf.org/doc/html/rfc4180

CSV is really poisonous. In the Multiplan days, when it was nearly
RFC 4180, it was tolerable, but these days, when everybody uses Excel to
spit out "localized CSV", it is hell (in Spain it uses ; as the delimiter
because it localizes numbers with , as the decimal separator; you may
have similar problems).

Anyway, I was going to point out that RFC 4180 is a bit misleading. In 2.1 it states:
>>>
   1.  Each record is located on a separate line, delimited by a line
       break (CRLF).  For example:

       aaa,bbb,ccc CRLF
       zzz,yyy,xxx CRLF
<<<

Which may lead you to believe you can read by lines, but a few lines
after that, in 2.6, it says:

>>>
   6.  Fields containing line breaks (CRLF), double quotes, and commas
       should be enclosed in double-quotes.  For example:

       "aaa","b CRLF
       bb","ccc" CRLF
       zzz,yyy,xxx
<<<

Which somehow contradicts 2.1.
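
To make the mismatch concrete, here is a minimal sketch of what a naive line-oriented reader does with the 2.6 example. The function name split_physical_lines is made up for illustration; it just wraps std::getline and drops the CR of each CRLF.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: split text into physical lines, the way a naive
// line-by-line reader would, ignoring CSV quoting entirely.
std::vector<std::string> split_physical_lines(const std::string& text) {
    std::vector<std::string> lines;
    std::istringstream in(text);
    std::string line;
    while (std::getline(in, line)) {
        // std::getline splits on LF only, so drop the CR of a CRLF pair.
        if (!line.empty() && line.back() == '\r') line.pop_back();
        lines.push_back(line);
    }
    return lines;
}
```

Fed the 2.6 sample ("aaa","b CRLF bb","ccc" CRLF zzz,yyy,xxx), this returns three physical lines, although the text holds only two records: the first record is cut in the middle of its quoted field.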

In C/C++ it is easily parsed with a simple state machine reading char
by char, which is one of the strong points of those languages, but
reading lines as strings usually leads to complex logic.
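
A minimal sketch of such a state machine, under some assumptions of mine: the names parse_csv, Record and State are invented for this example, the whole input is already in a std::string, a CR outside quotes is simply skipped, and an unterminated quoted field is flushed at end of input instead of being reported as an error. The delim parameter is there only to cope with the ; flavour mentioned above.

```cpp
#include <string>
#include <vector>

using Record = std::vector<std::string>;

// Parse RFC 4180 style CSV one character at a time (illustrative sketch).
std::vector<Record> parse_csv(const std::string& text, char delim = ',') {
    enum class State { FieldStart, Unquoted, Quoted, QuoteSeen };
    std::vector<Record> records;
    Record record;
    std::string field;
    State state = State::FieldStart;

    auto end_field = [&] {
        record.push_back(field);
        field.clear();
        state = State::FieldStart;
    };
    auto end_record = [&] {
        end_field();
        records.push_back(record);
        record.clear();
    };
    // Handling shared by every state that is outside a quoted section.
    auto outside = [&](char c) {
        if (c == delim)     end_field();
        else if (c == '\n') end_record();
        else if (c == '\r') { /* assumed to be the CR of a CRLF: skip it */ }
        else              { field += c; state = State::Unquoted; }
    };

    for (char c : text) {
        switch (state) {
        case State::FieldStart:
            if (c == '"') state = State::Quoted;
            else outside(c);
            break;
        case State::Unquoted:
            outside(c);
            break;
        case State::Quoted:
            // Delimiters, CR and LF are plain data inside quotes.
            if (c == '"') state = State::QuoteSeen;
            else field += c;
            break;
        case State::QuoteSeen:
            // "" is an escaped quote; anything else closed the quoted part.
            if (c == '"') { field += '"'; state = State::Quoted; }
            else { state = State::Unquoted; outside(c); }
            break;
        }
    }
    // Flush a last record that has no trailing line break.
    if (state != State::FieldStart || !record.empty()) end_record();
    return records;
}
```

With the 2.6 sample as input this yields two records, and the second field of the first one keeps its embedded CRLF. Note that the parser never looks at lines at all, only at characters and the current state, which is why the embedded line break needs no special logic.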

Francisco Olarte.


