Re: csv format for psql - Mailing list pgsql-hackers

From Daniel Verite
Subject Re: csv format for psql
Date
Msg-id ea4145e4-8c7c-4541-af24-500f1e47de88@manitou-mail.org
In response to Re: csv format for psql  (Michael Paquier <michael@paquier.xyz>)
Responses Re: csv format for psql
List pgsql-hackers
    Michael Paquier wrote:

> Still what's the point except complicating the code?  We don't care
> about anything fancy in the backend-side ProcessCopyOptions() when
> checking cstate->delim, and having some consistency looks like a good
> thing to me.

The backend has its reasons that don't apply to the psql output
format, mostly import performance according to [1].
It's not that nobody wants a delimiter outside of US-ASCII;
people do ask for that sometimes:

https://www.postgresql.org/message-id/f02ulk%242r3u%241%40news.hub.org
https://github.com/greenplum-db/gpdb/issues/1246

> However there is no option to specify
> an escape character, no option to specify a quote character, and it is
> not possible to force quotes for all values.  Those are huge advantages
> as any output can be made compatible with other CSV variants.  Isn't
> what is presented too limited?

The patch follows the guidelines of RFC 4180 [2], with two exceptions:
the field separator, which we let the user define, and the end of
lines, which are OS-dependent instead of the fixed CRLF that the IETF
seems to see as the norm.
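
To make the two exceptions concrete, here is a rough sketch using Python's
standard csv module rather than psql itself. The helper name write_rows and
the sample rows are made up for illustration; only the csv.writer parameters
are real. Everything except the separator and the end of line stays at the
RFC 4180 behavior.

```python
import csv
import io
import os


def write_rows(rows, sep=",", eol=os.linesep):
    """Serialize rows RFC 4180-style, with a caller-chosen separator and end of line.

    Hypothetical helper for illustration only; this is not how psql does it.
    """
    buf = io.StringIO(newline="")  # newline="" keeps line endings untouched
    writer = csv.writer(
        buf,
        delimiter=sep,              # exception 1: separator is configurable
        lineterminator=eol,         # exception 2: end of line follows the OS
        quotechar='"',              # fixed, as in the RFC
        doublequote=True,           # embedded quotes are doubled, as in the RFC
        quoting=csv.QUOTE_MINIMAL,  # quote only when a field needs it
    )
    writer.writerows(rows)
    return buf.getvalue()


rows = [["id", "note"], [1, "a;b"]]
strict = write_rows(rows, sep=",", eol="\r\n")   # what the RFC describes
relaxed = write_rows(rows, sep=";", eol="\n")    # custom separator, Unix EOL
```

With the semicolon separator the second field has to be quoted because it
contains the separator, while with the comma it does not.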

The only reference to escaping in the RFC is:
       "If double-quotes are used to enclose fields, then a double-quote
       appearing inside a field must be escaped by preceding it with
       another double quote"
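
As a minimal sketch of that rule (not the patch's code; quote_field and its
sep parameter are names invented here), a field is enclosed in double quotes
only when it contains the separator, a double quote, or a line break, and
embedded double quotes are doubled:

```python
def quote_field(value: str, sep: str = ",") -> str:
    """Apply RFC 4180-style quoting to a single field (illustrative only)."""
    # A field needs enclosing quotes if it contains the separator,
    # a double quote, or a CR/LF character.
    needs_quotes = any(token in value for token in (sep, '"', "\r", "\n"))
    if not needs_quotes:
        return value
    # Escape each embedded double quote by preceding it with another one.
    return '"' + value.replace('"', '""') + '"'


plain = quote_field("plain")
with_sep = quote_field("a,b")
with_quote = quote_field('say "hi"')
```

Note that no separate escape character is involved: the quote character
escapes itself.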

The problem with using a non-standard QUOTE or ESCAPE is that it's a
violation of the format that goes further than choosing a separator
other than comma, which is already a pain point.
We can always add these options later if there is demand. I suspect it
will never happen.
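
To illustrate the concern with a small experiment (again with Python's csv
module as a stand-in, not psql; the sample row is arbitrary), a file written
with a single quote as the quote character is silently misread by a consumer
that assumes the RFC 4180 double quote:

```python
import csv
import io

row = ["a,b", "c"]

# Writer side: a non-standard quote character.
out = io.StringIO(newline="")
csv.writer(out, quotechar="'").writerow(row)
text = out.getvalue()

# Reader side: a consumer assuming the standard double quote.
parsed = list(csv.reader(io.StringIO(text, newline="")))

# Reader side: a consumer that was told about the non-standard quote.
parsed_ok = list(csv.reader(io.StringIO(text, newline=""), quotechar="'"))
```

Here the standard reader splits the first field at the embedded comma and
keeps the single quotes as data, so two fields come back as three. The
consumer has to know the quoting convention out of band to get the original
row back.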

I looked at the 2004 archives from when CSV was added to COPY (that's
around commit 862b20b38, in case anyone cares to look), but
I couldn't find a discussion on these options. All I could find is that
they were present from the start.

But again, COPY is concerned with importing data that preexists,
even if it's weird, whereas psql output formats are not.


[1] https://www.postgresql.org/message-id/4C9D2BC5.1080006%40optonline.net
[2] https://tools.ietf.org/html/rfc4180


Best regards,
--
Daniel Vérité
PostgreSQL-powered mailer: http://www.manitou-mail.org
Twitter: @DanielVerite

