Re: csv format for psql - Mailing list pgsql-hackers

From Daniel Verite
Subject Re: csv format for psql
Date
Msg-id 1be5aee2-0548-4cf8-94b7-5565d347cb08@manitou-mail.org
Whole thread Raw
In response to Re: csv format for psql  (Fabien COELHO <coelho@cri.ensmp.fr>)
Responses Re: csv format for psql
Re: csv format for psql
List pgsql-hackers
    Fabien COELHO wrote:

> I also think that a short option brings little value, and "--csv" is good
> enough for the purpose, so I would agree to remove the "-C" binding.

It's not that accepting -C brings much value by itself, it's that
loosing the consistency across all options comes with a negative cost.
The point is that up to now all options have a short form and a long
form, so --csv would be a deliberate exception. I'm rather unconvinced
it's justified, but I seem to be alone in that case, so I'll comply.


> About "fieldsep_csv", I do not like much the principle of having different
> output variables to represent the same concept depending on the format. I
> would rather have reused fieldsep as in your previous submission and set
> it to "," when under --csv

The trouble with fieldsep is that it defaults to '|', which
both you and Pavel say you dislike. Fair enough, it's better
to have ',' by default, but the cleanest solution to that
is fieldsep_csv with its own default.
The solution to set fieldsep automatically to ',' with
\pset format csv is problematic.

For instance
 \pset format csv
 \pset fieldsep ';'

changes fieldsep to ';' as expected, but in the other order

 \pset fieldsep ';'
 \pset format csv

you get ',' while reasonably you'd expect ';'

Same problem on the command line. Options are evaluated left-to-right:

 $ psql --csv -F';'
would work as expected, but
 $ psql -F';' --csv
would not.

I don't feel good about solving these issues with ad-hoc rules,
or inventing the notion that a pset variable has been defined
but not user-redefined. This stuff has "inconsistent" written all over it
and I don't see a maintainer going along with that.

> The "\n" eol style is hardcoded. Should it use "recordsep"? For instance,
> https://tools.ietf.org/html/rfc4180 seems to specify CRLF end of lines.
> The definition is evolving, eg https://www.w3.org/TR/tabular-data-model/
> accepts both "\r" and "\r\n". I do not like using windows eol, but I think
> that it should be possible to do it, which is not the case with this
> version.

Interesting point. The output stream is opened in text mode so printing
'\n' should generate LF on Unix, CR LF on Windows, and I think CR on MacOS.
I think that's for the best.

recordsep in the unaligned mode doesn't play the role of a line ending
because the last line is not finished by recordsep. According to the source
code, this is intended, see print_unaligned_text() in print.c:
  /*
   * The last record is terminated by a newline, independent of the set
   * record separator.    But when the record separator is a zero byte, we
   * use that (compatible with find -print0 and xargs).
   */


> The "\pset format" error message in "do_pset" shows values in seemingly
> random order. The situation is pre-existing but not really satisfactory.
> I'd suggest to put all values in alphabetical order.

ok

> In csv_print_field & csv_print_text, you are not consistent when putting
> braces on blocks with only one instruction. I'd suggest not to put braces
> in that case.

ok

> I'd suggest that tests should include more types, not just strings. I
> would suggest int, float, timestamp, bytea, an array (which uses , as a
> separator), json (which uses both " and ,)...

I'll do but the printout code is type-agnostic so it's not supposed
to make a difference compared to mere literals.
Cases with NULLs are missing though, I'll go add some too.


Best regards,
--
Daniel Vérité
PostgreSQL-powered mailer: http://www.manitou-mail.org
Twitter: @DanielVerite


pgsql-hackers by date:

Previous
From: Arthur Zakirov
Date:
Subject: Re: [HACKERS] [FEATURE PATCH] pg_stat_statements with plans (v02)
Next
From: "Daniel Verite"
Date:
Subject: Re: csv format for psql