Re: Re: csv format for psql - Mailing list pgsql-hackers

From Pavel Stehule
Subject Re: Re: csv format for psql
Date
Msg-id CAFj8pRBU4OVKTrrJwkZGQ28nAzLG+k_huj+m+vTkyhFoYce7Bw@mail.gmail.com
In response to Re: Re: csv format for psql  (Pavel Stehule <pavel.stehule@gmail.com>)
Responses Re: Re: csv format for psql  (Pavel Stehule <pavel.stehule@gmail.com>)
Re: Re: csv format for psql  (Pavel Stehule <pavel.stehule@gmail.com>)
Re: Re: csv format for psql  (Fabien COELHO <coelho@cri.ensmp.fr>)
List pgsql-hackers


2018-03-24 8:24 GMT+01:00 Pavel Stehule <pavel.stehule@gmail.com>:


2018-03-24 8:15 GMT+01:00 Fabien COELHO <coelho@cri.ensmp.fr>:

Hello Pavel,

The patch adds a simple way to generate csv output from "psql" queries,
much simpler than playing around with COPY or \copy. It allows generating
a clean CSV dump from something as short as:

  sh> psql --csv -c 'TABLE foo' > foo.csv

Documentation is clear.

Tests cover a significant number of cases (fieldsep, expanded, tuples-only).
Although recordsep changes are not actually tested, they worked interactively
and I think that the tests are sufficient as is.

There are some remaining points on which input from a committer or other people would be nice:

(1) There is some mild disagreement about whether the fieldsep should be format-specific or shared with other formats. I do not think that a specific fieldsep is worth it, but this is a marginal preference, and other people's opinions differ. What is best is not obvious.

Pavel also suggested special handling based on whether the fieldsep is explicitly set or not. I'm not too keen on that because it departs significantly from the way psql formatting is currently handled, and what is happening becomes unclear to the user.

(2) For interactive use, two commands are required: \pset format csv + \pset fieldsep ',' (or ';' or '\t' or whatever...). Maybe some \csv command similar to \H would be appropriate, or not, to set both values more efficiently. Could be something for another patch.
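To illustrate why the choice among ',', ';' and '\t' matters beyond the separator character itself, here is a minimal sketch using Python's standard csv module. It is not the psql patch, and the patch's quoting rules may differ; the `dump` helper and the sample rows are made up for illustration. The point is only that which fields need quoting depends on the separator in use.

```python
import csv
import io


def dump(rows, delimiter):
    """Serialize rows with Python's csv module (a stand-in, not psql)."""
    buf = io.StringIO()
    writer = csv.writer(buf, delimiter=delimiter, lineterminator="\n")
    writer.writerows(rows)
    return buf.getvalue()


# Hypothetical sample data: one value contains a comma, one a semicolon.
rows = [["id", "note"], [1, "a,b"], [2, "c;d"]]

comma = dump(rows, ",")       # "a,b" must be quoted here
semicolon = dump(rows, ";")   # "c;d" must be quoted here
tab = dump(rows, "\t")        # neither value needs quoting

print(comma, semicolon, tab, sep="---\n")
```

With a comma separator only the value containing a comma gets quoted, with a semicolon only the one containing a semicolon, and with a tab neither does.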

I am not sure what the status of the patch is if we do not have a clear
consensus.

I am sorry, but I don't think this interface is good enough. Using | as
the default CSV separator is just wrong. That, and only that, is the
problem. Everything else is perfect.

I do not think that there is a perfect solution, so some compromise will be needed or we won't get it.

(1) patch v4:

    "\pset format csv" retains the current fieldsep value, so fields are
    separated by whatever is in the variable, which means that for getting
    a standard csv two commands are needed, which is clearly documented,
    but may be considered as surprising. ISTM that the underlying point is
    that "format" is really about string escaping, not about the full output
    format, but this is a pre-existing situation.

    I'm suggesting to add \csv which would behave like \H to toggle CSV
    mode so as to improve this situation, with a caveat which is that
    toggling back \csv would have forgotten the previous settings (just
    like \H does, though, so would for instance reset to aligned with |),
    so it would not be perfect.

This doesn't solve the usual case of setting the format with \pset format csv.
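The behavior described in (1) can be sketched with a toy model. This is not psql source code: the `PsqlSettings` class, its methods, and the `toggle_csv` stand-in for the proposed \csv command are all hypothetical, and escaping is left out so that only the separator logic shows.

```python
class PsqlSettings:
    """Toy model of option (1); names here are illustrative only."""

    def __init__(self):
        self.format = "aligned"   # default output format
        self.fieldsep = "|"       # default field separator

    def pset(self, name, value):
        # Setting one variable never touches the other.
        setattr(self, name, value)

    def toggle_csv(self):
        # Hypothetical \csv: sets both values at once and, like \H,
        # does not remember what was configured before.
        if self.format == "csv":
            self.format, self.fieldsep = "aligned", "|"
        else:
            self.format, self.fieldsep = "csv", ","

    def render(self, row):
        # Escaping is omitted; only the separator choice is modeled.
        return self.fieldsep.join(row)


s = PsqlSettings()
s.pset("format", "csv")
one_command = s.render(["a", "b", "c"])    # fieldsep is still "|"
s.pset("fieldsep", ",")
two_commands = s.render(["a", "b", "c"])   # now a standard csv line

t = PsqlSettings()
t.pset("fieldsep", ";")
t.toggle_csv()
t.toggle_csv()
after_round_trip = t.fieldsep              # the earlier ";" is lost

print(one_command, two_commands, after_round_trip)
```

In this model a single "format csv" yields a|b|c, the second command is needed to get a,b,c, and toggling \csv on and off resets a previously chosen ';' to '|'.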
 

(2) your proposal as I understand it:

    "\pset format csv" may or may not use the fieldsep, depending on
    whether it was explicitly set, a piece of information which is not shown, i.e.:

      \pset fieldsep # fieldsep separator is "|"
      \pset format csv # would output a,b,c or a|b|c...

    Because it depends on whether fieldsep was set explicitly to '|' or
    whether it has this value only because of the default.

    This kind of unclear behavior does not seem desirable.
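The concern in (2) can be made concrete with a small model, assuming the proposal works the way it is described above. The `Settings` class, the `fieldsep_explicit` flag, and the display string are hypothetical and are not taken from the actual patch.

```python
class Settings:
    """Toy model of option (2) as described above; not the real patch."""

    DEFAULT_FIELDSEP = "|"

    def __init__(self):
        self.fieldsep = self.DEFAULT_FIELDSEP
        self.fieldsep_explicit = False   # hidden state, never displayed

    def pset_fieldsep(self, value):
        self.fieldsep = value
        self.fieldsep_explicit = True

    def show_fieldsep(self):
        # Stand-in for what "\pset fieldsep" reports to the user.
        return f'fieldsep is "{self.fieldsep}"'

    def csv_line(self, row):
        # csv uses fieldsep only when the user set it explicitly.
        sep = self.fieldsep if self.fieldsep_explicit else ","
        return sep.join(row)


untouched = Settings()
explicit = Settings()
explicit.pset_fieldsep("|")   # same value as the default, but explicit

same_display = untouched.show_fieldsep() == explicit.show_fieldsep()
untouched_out = untouched.csv_line(["a", "b", "c"])
explicit_out = explicit.csv_line(["a", "b", "c"])

print(same_display, untouched_out, explicit_out)
```

Both sessions report the same separator, yet one outputs a,b,c and the other a|b|c, because the deciding flag is not visible.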

Please check and test the attached patch. It is very simple to use, and there is no unclear behavior. You just have to accept that formats can have their own default separators.
 

(3) other option, always use a comma:

    this was rejected because some people like their comma separated
    values to be separated by semi-colons or tabs (aka tsv).

(4) other option, Daniel v3 or v2:

    use a distinct "fieldsep_csv" variable initially set to ','. This adds
    yet another specific variable that has to be remembered, some styles
    would use fieldsep but csv would not so it is some kind of exception
    that I would wish to avoid.
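For comparison, option (4) can be sketched the same way. This is only a model of the idea, with a hypothetical `Settings` class; `fieldsep_csv` is the variable name used in the description above, and everything else is made up for illustration.

```python
class Settings:
    """Toy model of option (4): csv has its own separator variable."""

    def __init__(self):
        self.format = "unaligned"
        self.fieldsep = "|"        # used by the unaligned format
        self.fieldsep_csv = ","    # used only by the csv format

    def separator(self):
        # csv is the one format that ignores the shared fieldsep.
        return self.fieldsep_csv if self.format == "csv" else self.fieldsep

    def render(self, row):
        # Escaping is omitted; only the separator choice is modeled.
        return self.separator().join(row)


s = Settings()
s.fieldsep = ";"
unaligned_out = s.render(["a", "b", "c"])   # follows fieldsep

s.format = "csv"
csv_out = s.render(["a", "b", "c"])         # ignores fieldsep

s.fieldsep_csv = "\t"
tsv_out = s.render(["a", "b", "c"])         # follows fieldsep_csv

print(unaligned_out, csv_out, repr(tsv_out))
```

The csv output is standard with a single command, at the cost of a second variable that behaves differently from fieldsep.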

My current preference order in the suggested solutions is 1, 4, 2, 3, with a significant preference for 1.

I think @1 solves nothing: people are using \pset format ...

@3 is clearly bad; there is nothing to discuss.

@4 can be a compromise solution, but then fieldsep should be renamed. Currently fieldsep is used only for the unaligned format and for nothing else. If we introduce fieldsep_csv, then fieldsep should be renamed to fieldsep_unaligned. I can live with that.

But I think a default fieldsep is the better option. Please try my patch and comment on it.

minor fix

Regards

Pavel
 

Regards

Pavel

--
Fabien.


