Thread: csv format for psql

Re: csv format for psql

From

Fabien COELHO

Date:

30 January 2018, 12:14:11

Hello Daniel,

> This patch implements csv as an output format in psql
> (\pset format csv).

Would you consider registering it in the next CF?

-- 
Fabien.

Re: csv format for psql

From

Pavel Stehule

Date:

30 January 2018, 17:24:57

2018-01-30 9:31 GMT+01:00 Daniel Verite <daniel@manitou-mail.org>:

Hi,

This patch implements csv as an output format in psql
(\pset format csv). It's quite similar to the unaligned format,
except that it applies CSV quoting rules (obviously!) and that
it prints no footer and no title.
As with unaligned, a header with column names is output unless
tuples_only is on. It also supports the fieldsep/fielsep_zero
and recordsep/recordsep_zero settings.

Most of times, the need for CSV is covered by \copy or COPY with
the CSV option, but there are some cases where it would be more
practical to have it as an output format in psql.

I absolutely agree

* \copy does not interpolate psql variables and is a single-line
command, so making a query fit these contraints can be cumbersome.
It can be got around by defining a temporary view and
\copy from that view, but that doesn't work in a read-only context
such as when connected to a standby.

* the server-side COPY TO STDOUT can also be used from psql,
typically with psql -c "COPY (query) TO STDOUT CSV" > file.csv,
but that's too simple to extract multiple result sets per script.
COPY is also more rigid than psql in the options to delimit
fields and records.

* copy with csv can't help for the output of meta-commands
such as \gx, \crosstabview, \l, \d ... whereas a CSV format within psql
does work with these.

It is great - long time I miss this feature - It is interesting for scripting, ETL, ..

This format is too important, so some special short or long option can be practical (it will be printed in help)

some like --csv

I found one issue - PostgreSQL default field separator is "|". Maybe good time to use more common "," ?

Or when field separator was not explicitly defined, then use "," for CSV, and "|" for other. Although it can be little bit messy

Thank you

Pavel

Best regards,
--
Daniel Vérité
PostgreSQL-powered mailer: http://www.manitou-mail.org
Twitter: @DanielVerite

Re: csv format for psql

From

"Daniel Verite"

Date:

31 January 2018, 14:39:53

    Fabien COELHO wrote:

> > This patch implements csv as an output format in psql
> > (\pset format csv).
>
> Would you consider registering it in the next CF?

Done here:

https://commitfest.postgresql.org/17/1500/

Best regards,
--
Daniel Vérité
PostgreSQL-powered mailer: http://www.manitou-mail.org
Twitter: @DanielVerite

Re: csv format for psql

From

"Daniel Verite"

Date:

31 January 2018, 15:58:45

    Pavel Stehule wrote:

> This format is too important, so some special short or long option can be
> practical (it will be printed in help)
>
> some like --csv

I guess -C/--csv could be used, like there already is -H/--html.

> I found one issue - PostgreSQL default field separator is "|". Maybe good
> time to use more common "," ?
>
> Or when field separator was not explicitly defined, then use "," for CSV,
> and "|" for other. Although it can be little bit messy

Currently there's a strong analogy between the unaligned
mode and csv mode. In particular they use the existing pset
variables fieldsep, fieldsep_zero, recordsep, recordsep_zero
in the same way. If we want to make csv special with regard
to the delimiters,  that complicates the user interface
For instance if fieldsep was changed automatically by
\pset format csv, it's not obvious if/when we should switch it
back to its previous state, and how the fieldsep switch done
automatically would mix or conflict with other \pset
commands issued by the user.
Or we need to duplicate these variables. Or duplicate
only fieldsep, having a fieldsep_csv, leaving out the
other variables and not being as close to the unaligned
mode.
These options don't appeal to me much compared
to the simplicity of the current patch.

Also, although the comma is the separator defined by the
RFC4180 and the default for COPY CSV, people also use the
semicolon extensively (because it's what Excel does I guess),
which somehow mitigates the importance of comma as the
default value.

Best regards,
--
Daniel Vérité
PostgreSQL-powered mailer: http://www.manitou-mail.org
Twitter: @DanielVerite

Re: csv format for psql

From

Pavel Stehule

Date:

31 January 2018, 16:09:40

2018-01-31 13:58 GMT+01:00 Daniel Verite <daniel@manitou-mail.org>:

Pavel Stehule wrote:

> This format is too important, so some special short or long option can be
> practical (it will be printed in help)
>
> some like --csv

I guess -C/--csv could be used, like there already is -H/--html.

> I found one issue - PostgreSQL default field separator is "|". Maybe good
> time to use more common "," ?
>
> Or when field separator was not explicitly defined, then use "," for CSV,
> and "|" for other. Although it can be little bit messy

Currently there's a strong analogy between the unaligned
mode and csv mode. In particular they use the existing pset
variables fieldsep, fieldsep_zero, recordsep, recordsep_zero
in the same way. If we want to make csv special with regard
to the delimiters, that complicates the user interface
For instance if fieldsep was changed automatically by
\pset format csv, it's not obvious if/when we should switch it
back to its previous state, and how the fieldsep switch done
automatically would mix or conflict with other \pset
commands issued by the user.
Or we need to duplicate these variables. Or duplicate
only fieldsep, having a fieldsep_csv, leaving out the
other variables and not being as close to the unaligned
mode.
These options don't appeal to me much compared
to the simplicity of the current patch.

Also, although the comma is the separator defined by the
RFC4180 and the default for COPY CSV, people also use the
semicolon extensively (because it's what Excel does I guess),
which somehow mitigates the importance of comma as the
default value.

The functionality is clean and great - default setting "|" is less. I have not strong opinion about it and I understand to your arguments. Has anybody some idea?

Regards

Pavel

Best regards,
--
Daniel Vérité
PostgreSQL-powered mailer: http://www.manitou-mail.org
Twitter: @DanielVerite

Re: csv format for psql

From

Pavel Stehule

Date:

05 February 2018, 10:58:28

2018-01-31 14:09 GMT+01:00 Pavel Stehule <pavel.stehule@gmail.com>:

2018-01-31 13:58 GMT+01:00 Daniel Verite <daniel@manitou-mail.org>:
Pavel Stehule wrote:

> This format is too important, so some special short or long option can be
> practical (it will be printed in help)
>
> some like --csv

I guess -C/--csv could be used, like there already is -H/--html.

> I found one issue - PostgreSQL default field separator is "|". Maybe good
> time to use more common "," ?
>
> Or when field separator was not explicitly defined, then use "," for CSV,
> and "|" for other. Although it can be little bit messy

Currently there's a strong analogy between the unaligned
mode and csv mode. In particular they use the existing pset
variables fieldsep, fieldsep_zero, recordsep, recordsep_zero
in the same way. If we want to make csv special with regard
to the delimiters, that complicates the user interface
For instance if fieldsep was changed automatically by
\pset format csv, it's not obvious if/when we should switch it
back to its previous state, and how the fieldsep switch done
automatically would mix or conflict with other \pset
commands issued by the user.
Or we need to duplicate these variables. Or duplicate
only fieldsep, having a fieldsep_csv, leaving out the
other variables and not being as close to the unaligned
mode.
These options don't appeal to me much compared
to the simplicity of the current patch.

Also, although the comma is the separator defined by the
RFC4180 and the default for COPY CSV, people also use the
semicolon extensively (because it's what Excel does I guess),
which somehow mitigates the importance of comma as the
default value.

The functionality is clean and great - default setting "|" is less. I have not strong opinion about it and I understand to your arguments. Has anybody some idea?

two other ideas from me

1. currently \pset format xxx command has not any other parameters. Other optional parameters can be set of possible attributes like {, ; | \t header_off escape_all}. If this set will be known by tabcomplete, then it can be very friendly and I don't think so code will be more complex than in your proposal (long options can be "--csv --csv-options=", header_off" )

some like "\pset format csv , header_off

or just limited variant with only field separator specification

\pset format csv ;

2. if we support csv format, then we can introduce \gcsv xxx | gnuplot .... It is just idea.

@1 is much more interesting than @2

Regards

Pavel

Regards

Pavel

Best regards,
--
Daniel Vérité
PostgreSQL-powered mailer: http://www.manitou-mail.org
Twitter: @DanielVerite

Re: csv format for psql

From

Pavel Stehule

Date:

05 February 2018, 11:00:29

2018-01-31 13:58 GMT+01:00 Daniel Verite <daniel@manitou-mail.org>:

Pavel Stehule wrote:

> This format is too important, so some special short or long option can be
> practical (it will be printed in help)
>
> some like --csv

I guess -C/--csv could be used, like there already is -H/--html.

I prefer just long option only. Maybe in future we can use short "C" for something else better. There is only few free short commands.

Regards

Pavel

> I found one issue - PostgreSQL default field separator is "|". Maybe good
> time to use more common "," ?
>
> Or when field separator was not explicitly defined, then use "," for CSV,
> and "|" for other. Although it can be little bit messy

Currently there's a strong analogy between the unaligned
mode and csv mode. In particular they use the existing pset
variables fieldsep, fieldsep_zero, recordsep, recordsep_zero
in the same way. If we want to make csv special with regard
to the delimiters, that complicates the user interface
For instance if fieldsep was changed automatically by
\pset format csv, it's not obvious if/when we should switch it
back to its previous state, and how the fieldsep switch done
automatically would mix or conflict with other \pset
commands issued by the user.
Or we need to duplicate these variables. Or duplicate
only fieldsep, having a fieldsep_csv, leaving out the
other variables and not being as close to the unaligned
mode.
These options don't appeal to me much compared
to the simplicity of the current patch.

Also, although the comma is the separator defined by the
RFC4180 and the default for COPY CSV, people also use the
semicolon extensively (because it's what Excel does I guess),
which somehow mitigates the importance of comma as the
default value.

Best regards,
--
Daniel Vérité
PostgreSQL-powered mailer: http://www.manitou-mail.org
Twitter: @DanielVerite

Re: csv format for psql

From

"Daniel Verite"

Date:

12 February 2018, 16:46:41

    Pavel Stehule wrote:

> > I guess -C/--csv could be used, like there already is -H/--html.
> >
>
> I prefer just long option only. Maybe in future we can use short "C" for
> something else better. There is only few free short commands.

Looking at parse_psql_options(), currently psql has 35 long options
and all of them have a corresponding single-character invocation.
Unless there is a consensus for it, I don't feel like it's time to create
the precedent that some options deserve short forms and others
don't.

Sure at some point we'll run out of letters, but looking back ten
years ago, on Jan 2008 psql had 31 options, so the rate of adding new
ones does not look worrying.

Besides I like the fact that -C can be seen as a drop-in replacement for
-A, because in most cases, it's just a safer version of -A, as it
deals with the situation that the separator might be in the contents,
which is the main weakness of -A.

Best regards,
--
Daniel Vérité
PostgreSQL-powered mailer: http://www.manitou-mail.org
Twitter: @DanielVerite

Re: csv format for psql

From

"Daniel Verite"

Date:

12 February 2018, 17:33:41

    Pavel Stehule wrote:

> some like "\pset format csv , header_off

If we did that, we'd need to reconsider the interactions
of this with \t on|off and \pset footer on|off
and how to keep things consistent with the unaligned
format.
I feel like it's not worth the trouble.
We can still add this later if users are confused with
the interface, but that interface is already well
established with the unaligned format.

> 2. if we support csv format, then we can introduce \gcsv xxx | gnuplot ....

I thought committers rejected this because they didn't want a
new variant of \g
The csv format makes it possible to not add any new \g-style
metacommand and inject csv into the existing ones.

Best regards,
--
Daniel Vérité
PostgreSQL-powered mailer: http://www.manitou-mail.org
Twitter: @DanielVerite

Re: csv format for psql

From

Pavel Stehule

Date:

12 February 2018, 17:39:31

2018-02-12 15:33 GMT+01:00 Daniel Verite <daniel@manitou-mail.org>:

Pavel Stehule wrote:

> some like "\pset format csv , header_off

If we did that, we'd need to reconsider the interactions
of this with \t on|off and \pset footer on|off
and how to keep things consistent with the unaligned
format.

I feel like it's not worth the trouble.
We can still add this later if users are confused with
the interface, but that interface is already well
established with the unaligned format.

> 2. if we support csv format, then we can introduce \gcsv xxx | gnuplot ....

I thought committers rejected this because they didn't want a
new variant of \g
The csv format makes it possible to not add any new \g-style
metacommand and inject csv into the existing ones.

Best regards,
--
Daniel Vérité
PostgreSQL-powered mailer: http://www.manitou-mail.org
Twitter: @DanielVerite

Re: csv format for psql

From

Fabien COELHO

Date:

26 February 2018, 13:14:48

Hello Daniel,

> This patch implements csv as an output format in psql
> (\pset format csv). It's quite similar to the unaligned format,
> except that it applies CSV quoting rules (obviously!) and that
> it prints no footer and no title.
> As with unaligned, a header with column names is output unless
> tuples_only is on. It also supports the fieldsep/fielsep_zero
> and recordsep/recordsep_zero settings.

Patch applies cleanly and compiles. "make check" ok, although there is
no specific test for this feature...

The documentation should mention the other CSV options (COPY, \copy, ...) 
and explain how they compare to this one. Maybe a specific paragraph about 
how to do CSV? I understand "\pset format csv" as triggering that all 
outputs compared to per command options.

Given that this is somehow already available, I'm wondering why there is 
no code sharing.

I find it annoying that setting csv keeps the default '|' separator, where
ISTM that it should be by default "," (as in COMMA separated value:-).
However it would not be a good idea to change another variables when setting
one, obviously.

Maybe some \csv command could set the format to csv, fieldsep to ",", 
tuples_only to on, recordsep to '\n'? Not sure whether it would be 
acceptable, though, and how to turn it off once turned on... Probably an 
average (aka not good) idea:-)

The format adds column headers, however they are not escaped properly:

   psql> \pset format csv
   psql> \pset fieldsep ,
   psql> SELECT 1 AS "hello, world", 2 AS """";
     hello, world,"
     1,2

Also it does not seem to work properly in expanded mode, both for the
column and values:

   psql> \x
   psql> SELECT 1 AS "bla""", E'\n,"' AS foo;
     bla",1
     foo,
     ,"

There MUST be some tests, especially with ugly stuff (escapes, newlines,
double quotes, various types, expanded or not, field seps, strange 
column names...).


> Most of times, the need for CSV is covered by \copy or COPY with
> the CSV option, but there are some cases where it would be more
> practical to have it as an output format in psql.
>
> * \copy does not interpolate psql variables and is a single-line
> command, so making a query fit these contraints can be cumbersome.
> It can be got around by defining a temporary view and
> \copy from that view, but that doesn't work in a read-only context
> such as when connected to a standby.
>
> * the server-side COPY TO STDOUT can also be used from psql,
> typically with psql -c "COPY (query) TO STDOUT CSV" > file.csv,
> but that's too simple to extract multiple result sets per script.
> COPY is also more rigid than psql in the options to delimit
> fields and records.
>
> * copy with csv can't help for the output of meta-commands
> such as \gx, \crosstabview, \l, \d ... whereas a CSV format within psql
> does work with these.

-- 
Fabien.

Re: csv format for psql

From

"Daniel Verite"

Date:

01 March 2018, 19:10:08

    Fabien COELHO wrote:

> Maybe some \csv command could set the format to csv, fieldsep to ",",
> tuples_only to on, recordsep to '\n'? Not sure whether it would be
> acceptable, though, and how to turn it off once turned on... Probably an
> average (aka not good) idea:-)

Thanks for reviewing this patch!

Attached is a v2 fixing the bugs you mentioned, and adding ---csv/-C
as discussed upthread. I'll add some regression tests shortly.

About the default separator, the approach taken by this patch is that
the csv output is equivalent to unaligned with csv quoting, so it has
the same behavior and follows the options of the unaligned mode
as much as possible.

That seemed like a good idea to me when I wrote it, but maybe
a better idea would to have a new \pset fieldsep_csv
parameter with its own command-line option, and ignore {fieldsep,
fieldsep_zero, recordsep, recordsep_zero} for this format, just like
they are being ignored for HTML, or latex, or asciidoc...
Does anyone who think that csv should be added care
about this "alignment" with unaligned? :)

Best regards,
--
Daniel Vérité
PostgreSQL-powered mailer: http://www.manitou-mail.org
Twitter: @DanielVerite

Attachment

psql-csv-format-v2.patch

Re: csv format for psql

From

Pavel Stehule

Date:

01 March 2018, 19:15:18

2018-03-01 17:10 GMT+01:00 Daniel Verite <daniel@manitou-mail.org>:

Fabien COELHO wrote:

> Maybe some \csv command could set the format to csv, fieldsep to ",",
> tuples_only to on, recordsep to '\n'? Not sure whether it would be
> acceptable, though, and how to turn it off once turned on... Probably an
> average (aka not good) idea:-)

Thanks for reviewing this patch!

Attached is a v2 fixing the bugs you mentioned, and adding ---csv/-C
as discussed upthread. I'll add some regression tests shortly.

About the default separator, the approach taken by this patch is that
the csv output is equivalent to unaligned with csv quoting, so it has
the same behavior and follows the options of the unaligned mode
as much as possible.

That seemed like a good idea to me when I wrote it, but maybe
a better idea would to have a new \pset fieldsep_csv
parameter with its own command-line option, and ignore {fieldsep,
fieldsep_zero, recordsep, recordsep_zero} for this format, just like
they are being ignored for HTML, or latex, or asciidoc...
Does anyone who think that csv should be added care
about this "alignment" with unaligned? :)

good idea

Pavel

Best regards,
--
Daniel Vérité
PostgreSQL-powered mailer: http://www.manitou-mail.org
Twitter: @DanielVerite

Re: csv format for psql

From

Fabien COELHO

Date:

07 March 2018, 12:45:03

Hello Daniel,

> Attached is a v2 fixing the bugs you mentioned, and adding ---csv/-C
> as discussed upthread. I'll add some regression tests shortly.

Basically I'm waiting for the version with regression tests before 
reviewing.

It is unclear whether committer will like it.

From my point of view being able to simply set postgres to generate csv is 
fine with me, with example uses such as:

     psql --csv 'TABLE Stuff;' > stuff.csv

So that having the --csv option to turn to "full csv", i.e. set the format 
and various seperators as expected, would be a nice have.

-- 
Fabien.

Re: csv format for psql

From

Pavel Stehule

Date:

07 March 2018, 13:00:59

2018-03-07 10:45 GMT+01:00 Fabien COELHO <coelho@cri.ensmp.fr>:

Hello Daniel,

Attached is a v2 fixing the bugs you mentioned, and adding ---csv/-C
as discussed upthread. I'll add some regression tests shortly.

Basically I'm waiting for the version with regression tests before reviewing.

It is unclear whether committer will like it.

From my point of view being able to simply set postgres to generate csv is fine with me, with example uses such as:

psql --csv 'TABLE Stuff;' > stuff.csv

So that having the --csv option to turn to "full csv", i.e. set the format and various seperators as expected, would be a nice have.

There is commad -c and it should be used. The --csv options should not to have a parameter. I don't like a idea to have more options for query execution.

Regards

Pavel

--
Fabien.

Re: csv format for psql

From

Fabien COELHO

Date:

07 March 2018, 21:40:49

Hello Pavel,

>>         psql --csv 'TABLE Stuff;' > stuff.csv
>
> There is commad -c and it should be used. The --csv options should not to
> have a parameter. I don't like a idea to have more options for query
> execution.

Yes, I agree and that is indeed what I meant, sorry for the typo. The 
cleaner example would be something like:

   psql --csv -c 'TABLE foo' > foo.csv

With a -c to introduce the command.

-- 
Fabien.

Re: csv format for psql

From

Pavel Stehule

Date:

07 March 2018, 21:41:52

2018-03-07 19:40 GMT+01:00 Fabien COELHO <coelho@cri.ensmp.fr>:

Hello Pavel,

psql --csv 'TABLE Stuff;' > stuff.csv

There is commad -c and it should be used. The --csv options should not to
have a parameter. I don't like a idea to have more options for query
execution.

Yes, I agree and that is indeed what I meant, sorry for the typo. The cleaner example would be something like:

psql --csv -c 'TABLE foo' > foo.csv

With a -c to introduce the command.

ok :) it has sense now

Regards

Pavel

--
Fabien.

Re: csv format for psql

From

David Fetter

Date:

07 March 2018, 21:48:22

On Wed, Mar 07, 2018 at 07:40:49PM +0100, Fabien COELHO wrote:
> 
> Hello Pavel,
> 
> >>        psql --csv 'TABLE Stuff;' > stuff.csv
> >
> >There is commad -c and it should be used. The --csv options should not to
> >have a parameter. I don't like a idea to have more options for query
> >execution.
> 
> Yes, I agree and that is indeed what I meant, sorry for the typo. The
> cleaner example would be something like:
> 
>   psql --csv -c 'TABLE foo' > foo.csv
> 
> With a -c to introduce the command.

This seems pretty specialized.  If we're adding something new, how about 

    psql --format=csv -o foo.csv -c 'TABLE foo'

Or we could stick with:

    psql -P format=csv -o foo.csv -c 'TABLE foo'

Best,
David.
-- 
David Fetter <david(at)fetter(dot)org> http://fetter.org/
Phone: +1 415 235 3778

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate

Re: csv format for psql

From

Fabien COELHO

Date:

07 March 2018, 22:04:05

>>   psql --csv -c 'TABLE foo' > foo.csv
>>
>> With a -c to introduce the command.
>
> This seems pretty specialized.  If we're adding something new, how about
>
>    psql --format=csv -o foo.csv -c 'TABLE foo'
>
> Or we could stick with:
>
>    psql -P format=csv -o foo.csv -c 'TABLE foo'

Currently "-P format=csv" uses the unaligned formating separators, i.e. 
'|' is used. I was suggesting that a special long option could switch 
several variables to some specific values, i.e.

   --csv

Would be equivalent to something like:

   -P format=csv -P fieldsep=, -P recordsep=\n (?) -P tuples_only=on ...

I.e. really generate some csv from the data in just one option, not many.

But this is obviously debatable.

-- 
Fabien.

Re: csv format for psql

From

David Fetter

Date:

07 March 2018, 22:25:54

On Wed, Mar 07, 2018 at 08:04:05PM +0100, Fabien COELHO wrote:
> 
> >>  psql --csv -c 'TABLE foo' > foo.csv
> >>
> >>With a -c to introduce the command.
> >
> >This seems pretty specialized.  If we're adding something new, how about
> >
> >   psql --format=csv -o foo.csv -c 'TABLE foo'
> >
> >Or we could stick with:
> >
> >   psql -P format=csv -o foo.csv -c 'TABLE foo'
> 
> Currently "-P format=csv" uses the unaligned formating separators, i.e. '|'
> is used. I was suggesting that a special long option could switch several
> variables to some specific values, i.e.
> 
>   --csv
> 
> Would be equivalent to something like:
> 
>   -P format=csv -P fieldsep=, -P recordsep=\n (?) -P tuples_only=on ...

We have some inconsistency here in that fewer table formats are
supported, but I think asciidoc, etc., do this correctly via
invocations like:

    psql -P format=asciidoc -o foo.adoc -AtXc 'TABLE foo'

> I.e. really generate some csv from the data in just one option, not many.
> 
> But this is obviously debatable.

I suspect we'll get requests for an all-JSON option, HTML tables,
etc., assuming we don't have them already.

I'm hoping we can have that all in one framework.  I get that setting
each of tuples_only, fieldsep, recordsep, etc.  might be a bit of a
lift for some users, but it's not clear how we'd make a sane default
that made choices among those correct for enough users.

For example, do we know that we want tuples_only behavior by default?
A lot of people's CSV tools assume a header row.

Best,
David.
-- 
David Fetter <david(at)fetter(dot)org> http://fetter.org/
Phone: +1 415 235 3778

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate

Re: csv format for psql

From

Pavel Stehule

Date:

07 March 2018, 22:30:33

2018-03-07 20:25 GMT+01:00 David Fetter <david@fetter.org>:

On Wed, Mar 07, 2018 at 08:04:05PM +0100, Fabien COELHO wrote:
>
> >> psql --csv -c 'TABLE foo' > foo.csv
> >>
> >>With a -c to introduce the command.
> >
> >This seems pretty specialized. If we're adding something new, how about
> >
> > psql --format=csv -o foo.csv -c 'TABLE foo'
> >
> >Or we could stick with:
> >
> > psql -P format=csv -o foo.csv -c 'TABLE foo'
>
> Currently "-P format=csv" uses the unaligned formating separators, i.e. '|'
> is used. I was suggesting that a special long option could switch several
> variables to some specific values, i.e.
>
> --csv
>
> Would be equivalent to something like:
>
> -P format=csv -P fieldsep=, -P recordsep=\n (?) -P tuples_only=on ...

We have some inconsistency here in that fewer table formats are
supported, but I think asciidoc, etc., do this correctly via
invocations like:

psql -P format=asciidoc -o foo.adoc -AtXc 'TABLE foo'

> I.e. really generate some csv from the data in just one option, not many.
>
> But this is obviously debatable.

I suspect we'll get requests for an all-JSON option, HTML tables,
etc., assuming we don't have them already.

I'm hoping we can have that all in one framework. I get that setting
each of tuples_only, fieldsep, recordsep, etc. might be a bit of a
lift for some users, but it's not clear how we'd make a sane default
that made choices among those correct for enough users.

For example, do we know that we want tuples_only behavior by default?
A lot of people's CSV tools assume a header row.

I am for default header - it can be disabled by -t option

Pavel

Best,
David.
--
David Fetter <david(at)fetter(dot)org> http://fetter.org/
Phone: +1 415 235 3778

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate

Re: csv format for psql

From

"Daniel Verite"

Date:

07 March 2018, 23:00:20

 I wrote:

> a better idea would to have a new \pset fieldsep_csv

PFA a v3 patch that implements that, along with
regression tests this time.

Best regards,
--
Daniel Vérité
PostgreSQL-powered mailer: http://www.manitou-mail.org
Twitter: @DanielVerite

Attachment

psql-csv-format-v3.patch

Re: csv format for psql

From

"Daniel Verite"

Date:

07 March 2018, 23:08:43

    David Fetter wrote:

> This seems pretty specialized.  If we're adding something new, how about
>
>    psql --format=csv -o foo.csv -c 'TABLE foo'

It's a bit easier to memorize than -P format=csv,
but psql doesn't have any long option that does not a have a short
form with a single letter, and both -F and -f are already taken.
Contrary to -C that isn't used until now.

> Or we could stick with:
>
>    psql -P format=csv -o foo.csv -c 'TABLE foo'

That already works as of the current patch, just
like with the other formats.

Best regards,
--
Daniel Vérité
PostgreSQL-powered mailer: http://www.manitou-mail.org
Twitter: @DanielVerite

Re: csv format for psql

From

"Daniel Verite"

Date:

07 March 2018, 23:31:11

    David Fetter wrote:

> We have some inconsistency here in that fewer table formats are
> supported, but I think asciidoc, etc., do this correctly via
> invocations like:
>
>    psql -P format=asciidoc -o foo.adoc -AtXc 'TABLE foo'

-A is equivalent to -P format=unaligned, so in the above
invocation, it cancels the effect of -P format=asciidoc.
Anyway -P format=name on the command line
is the same as "\pset format name" as a
metacommand, so it works for all formats.

Some formats (unaligned, html)    have corresponding
command-line options (-A, -H), and others don't.
In this patch, -C is used so that csv would be in the
category of formats that can be switched on with the simpler
invocation on the command line.
If we don't like that, we can leave out -C for future use
and let users write -P format=csv.
That's not the best choice from my POV though, as csv
is a primary choice to export tabular data.

Best regards,
--
Daniel Vérité
PostgreSQL-powered mailer: http://www.manitou-mail.org
Twitter: @DanielVerite

Re: csv format for psql

From

Pavel Stehule

Date:

07 March 2018, 23:37:26

2018-03-07 21:31 GMT+01:00 Daniel Verite <daniel@manitou-mail.org>:

David Fetter wrote:

> We have some inconsistency here in that fewer table formats are
> supported, but I think asciidoc, etc., do this correctly via
> invocations like:
>
> psql -P format=asciidoc -o foo.adoc -AtXc 'TABLE foo'

-A is equivalent to -P format=unaligned, so in the above
invocation, it cancels the effect of -P format=asciidoc.
Anyway -P format=name on the command line
is the same as "\pset format name" as a
metacommand, so it works for all formats.

Some formats (unaligned, html) have corresponding
command-line options (-A, -H), and others don't.
In this patch, -C is used so that csv would be in the
category of formats that can be switched on with the simpler
invocation on the command line.
If we don't like that, we can leave out -C for future use
and let users write -P format=csv.
That's not the best choice from my POV though, as csv
is a primary choice to export tabular data.

-C can be used for certificates or some similar. I like csv, but I am not sure, so it is too important to get short option (the list of free chars will be only shorter)

Regards

Pavel

Best regards,
--
Daniel Vérité
PostgreSQL-powered mailer: http://www.manitou-mail.org
Twitter: @DanielVerite

Re: csv format for psql

From

David Fetter

Date:

08 March 2018, 00:16:21

On Wed, Mar 07, 2018 at 09:37:26PM +0100, Pavel Stehule wrote:
> 2018-03-07 21:31 GMT+01:00 Daniel Verite <daniel@manitou-mail.org>:
> 
> >         David Fetter wrote:
> >
> > > We have some inconsistency here in that fewer table formats are
> > > supported, but I think asciidoc, etc., do this correctly via
> > > invocations like:
> > >
> > >    psql -P format=asciidoc -o foo.adoc -AtXc 'TABLE foo'
> >
> > -A is equivalent to -P format=unaligned, so in the above
> > invocation, it cancels the effect of -P format=asciidoc.
> > Anyway -P format=name on the command line
> > is the same as "\pset format name" as a
> > metacommand, so it works for all formats.
> >
> > Some formats (unaligned, html)  have corresponding
> > command-line options (-A, -H), and others don't.
> > In this patch, -C is used so that csv would be in the
> > category of formats that can be switched on with the simpler
> > invocation on the command line.
> > If we don't like that, we can leave out -C for future use
> > and let users write -P format=csv.
> > That's not the best choice from my POV though, as csv
> > is a primary choice to export tabular data.
> >
> 
> -C can be used for certificates or some similar. I like csv, but I am not
> sure, so it is too important to get short option (the list of free chars
> will be only shorter)

+1 for not using up a single-letter option for this.

Best,
David.
-- 
David Fetter <david(at)fetter(dot)org> http://fetter.org/
Phone: +1 415 235 3778

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate

Re: csv format for psql

From

Pavel Stehule

Date:

08 March 2018, 10:29:39

2018-03-07 22:16 GMT+01:00 David Fetter <david@fetter.org>:

On Wed, Mar 07, 2018 at 09:37:26PM +0100, Pavel Stehule wrote:
> 2018-03-07 21:31 GMT+01:00 Daniel Verite <daniel@manitou-mail.org>:
>
> > David Fetter wrote:
> >
> > > We have some inconsistency here in that fewer table formats are
> > > supported, but I think asciidoc, etc., do this correctly via
> > > invocations like:
> > >
> > > psql -P format=asciidoc -o foo.adoc -AtXc 'TABLE foo'
> >
> > -A is equivalent to -P format=unaligned, so in the above
> > invocation, it cancels the effect of -P format=asciidoc.
> > Anyway -P format=name on the command line
> > is the same as "\pset format name" as a
> > metacommand, so it works for all formats.
> >
> > Some formats (unaligned, html) have corresponding
> > command-line options (-A, -H), and others don't.
> > In this patch, -C is used so that csv would be in the
> > category of formats that can be switched on with the simpler
> > invocation on the command line.
> > If we don't like that, we can leave out -C for future use
> > and let users write -P format=csv.
> > That's not the best choice from my POV though, as csv
> > is a primary choice to export tabular data.
> >
>
> -C can be used for certificates or some similar. I like csv, but I am not
> sure, so it is too important to get short option (the list of free chars
> will be only shorter)

+1 for not using up a single-letter option for this.

Is there some rule, so alone long options are disallowed? When this software will be more mature, then we cannot to find "inteligent" short option for lot of tasks.

Regards

Pavel

Best,
David.
--
David Fetter <david(at)fetter(dot)org> http://fetter.org/
Phone: +1 415 235 3778

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate

Re: csv format for psql

From

Fabien COELHO

Date:

08 March 2018, 11:29:06

>> I.e. really generate some csv from the data in just one option, not many.
>>
>> But this is obviously debatable.
>
> I suspect we'll get requests for an all-JSON option, HTML tables,
> etc., assuming we don't have them already.

I would definitely be fine with --html (which indeed already exists) & 
--json (which does not, but could some day) as long options.

> I'm hoping we can have that all in one framework.

ISTM that it is more or less the case if an option simply presets a bunch 
of existing table output options which is an existing framework.

> I get that setting each of tuples_only, fieldsep, recordsep, etc. 
> might be a bit of a lift for some users, but it's not clear how we'd 
> make a sane default that made choices among those correct for enough 
> users. For example, do we know that we want tuples_only behavior by 
> default? A lot of people's CSV tools assume a header row.

If there is a possible disagreement on one option, then let it out and use 
the corresponding short option if needed?

Tuple only:

   psql --csv -t -c 'TABLE foo' -o foo.csv

With title headers:

   psql --csv    -c 'TABLE foo' -o foo.csv

Would be okay.

-- 
Fabien.

Re: csv format for psql

From

Pavel Stehule

Date:

08 March 2018, 11:40:59

2018-03-08 9:29 GMT+01:00 Fabien COELHO <coelho@cri.ensmp.fr>:

I.e. really generate some csv from the data in just one option, not many.

But this is obviously debatable.

I suspect we'll get requests for an all-JSON option, HTML tables,
etc., assuming we don't have them already.

I would definitely be fine with --html (which indeed already exists) & --json (which does not, but could some day) as long options.

I'm hoping we can have that all in one framework.

ISTM that it is more or less the case if an option simply presets a bunch of existing table output options which is an existing framework.

I get that setting each of tuples_only, fieldsep, recordsep, etc. might be a bit of a lift for some users, but it's not clear how we'd make a sane default that made choices among those correct for enough users. For example, do we know that we want tuples_only behavior by default? A lot of people's CSV tools assume a header row.

If there is a possible disagreement on one option, then let it out and use the corresponding short option if needed?

Tuple only:

psql --csv -t -c 'TABLE foo' -o foo.csv

With title headers:

psql --csv -c 'TABLE foo' -o foo.csv

Would be okay.

Pavel

--
Fabien.

Re: csv format for psql

From

Fabien COELHO

Date:

08 March 2018, 13:05:50

Hello Daniel,

> PFA a v3 patch that implements that, along with regression tests this 
> time.

My 0.02 €:

Patch applies cleanly, compiles, make check ok, doc generation ok.

I'm in favor of having a simple psql way to generate a convenient and 
compliant CSV output for export/import.

I also think that a short option brings little value, and "--csv" is good 
enough for the purpose, so I would agree to remove the "-C" binding.

About "fieldsep_csv", I do not like much the principle of having different 
output variables to represent the same concept depending on the format. I 
would rather have reused fieldsep as in your previous submission and set 
it to "," when under --csv. This is not a strong opinion and other people 
may differ: the committer opinion is the one to follow:-)

The "\n" eol style is hardcoded. Should it use "recordsep"? For instance, 
https://tools.ietf.org/html/rfc4180 seems to specify CRLF end of lines. 
The definition is evolving, eg https://www.w3.org/TR/tabular-data-model/ 
accepts both "\r" and "\r\n". I do not like using windows eol, but I think 
that it should be possible to do it, which is not the case with this 
version.

The "\pset format" error message in "do_pset" shows values in seemingly 
random order. The situation is pre-existing but not really satisfactory. 
I'd suggest to put all values in alphabetical order.

In csv_print_field & csv_print_text, you are not consistent when putting 
braces on blocks with only one instruction. I'd suggest not to put braces 
in that case.

I'd suggest that tests should include more types, not just strings. I 
would suggest int, float, timestamp, bytea, an array (which uses , as a 
separator), json (which uses both " and ,)...

-- 
Fabien.

Re: csv format for psql

From

Pavel Stehule

Date:

08 March 2018, 13:17:14

2018-03-08 11:05 GMT+01:00 Fabien COELHO <coelho@cri.ensmp.fr>:

Hello Daniel,

PFA a v3 patch that implements that, along with regression tests this time.

My 0.02 €:

Patch applies cleanly, compiles, make check ok, doc generation ok.

I'm in favor of having a simple psql way to generate a convenient and compliant CSV output for export/import.

I also think that a short option brings little value, and "--csv" is good enough for the purpose, so I would agree to remove the "-C" binding.

About "fieldsep_csv", I do not like much the principle of having different output variables to represent the same concept depending on the format. I would rather have reused fieldsep as in your previous submission and set it to "," when under --csv. This is not a strong opinion and other people may differ: the committer opinion is the one to follow:-)

The "\n" eol style is hardcoded. Should it use "recordsep"? For instance, https://tools.ietf.org/html/rfc4180 seems to specify CRLF end of lines. The definition is evolving, eg https://www.w3.org/TR/tabular-data-model/ accepts both "\r" and "\r\n". I do not like using windows eol, but I think that it should be possible to do it, which is not the case with this version.

In this case recordsep is not problem - default is ok.

The "\pset format" error message in "do_pset" shows values in seemingly random order. The situation is pre-existing but not really satisfactory. I'd suggest to put all values in alphabetical order.

In csv_print_field & csv_print_text, you are not consistent when putting braces on blocks with only one instruction. I'd suggest not to put braces in that case.

I'd suggest that tests should include more types, not just strings. I would suggest int, float, timestamp, bytea, an array (which uses , as a separator), json (which uses both " and ,)...

--
Fabien.

Re: csv format for psql

From

Pavel Stehule

Date:

08 March 2018, 13:21:10

2018-03-08 11:17 GMT+01:00 Pavel Stehule <pavel.stehule@gmail.com>:

2018-03-08 11:05 GMT+01:00 Fabien COELHO <coelho@cri.ensmp.fr>:

Hello Daniel,

PFA a v3 patch that implements that, along with regression tests this time.

My 0.02 €:

Patch applies cleanly, compiles, make check ok, doc generation ok.

I'm in favor of having a simple psql way to generate a convenient and compliant CSV output for export/import.

I also think that a short option brings little value, and "--csv" is good enough for the purpose, so I would agree to remove the "-C" binding.

About "fieldsep_csv", I do not like much the principle of having different output variables to represent the same concept depending on the format. I would rather have reused fieldsep as in your previous submission and set it to "," when under --csv. This is not a strong opinion and other people may differ: the committer opinion is the one to follow:-)

Looks like complex rule. But maybe it is acceptable. If the format is csv, and fieldsep is not defined, then use ','. If it defined, then use defined field sep. I am not sure, if is possible distinct between default and explicitly defined field separator.

If not, then I prefer fieldsep_csv.

The "\n" eol style is hardcoded. Should it use "recordsep"? For instance, https://tools.ietf.org/html/rfc4180 seems to specify CRLF end of lines. The definition is evolving, eg https://www.w3.org/TR/tabular-data-model/ accepts both "\r" and "\r\n". I do not like using windows eol, but I think that it should be possible to do it, which is not the case with this version.

In this case recordsep is not problem - default is ok.

The "\pset format" error message in "do_pset" shows values in seemingly random order. The situation is pre-existing but not really satisfactory. I'd suggest to put all values in alphabetical order.

In csv_print_field & csv_print_text, you are not consistent when putting braces on blocks with only one instruction. I'd suggest not to put braces in that case.

I'd suggest that tests should include more types, not just strings. I would suggest int, float, timestamp, bytea, an array (which uses , as a separator), json (which uses both " and ,)...

--
Fabien.

Re: csv format for psql

From

Peter Eisentraut

Date:

09 March 2018, 05:13:14

On 3/8/18 05:05, Fabien COELHO wrote:
> I'm in favor of having a simple psql way to generate a convenient and 
> compliant CSV output for export/import.

yes

> I also think that a short option brings little value, and "--csv" is good 
> enough for the purpose, so I would agree to remove the "-C" binding.

yes

> About "fieldsep_csv", I do not like much the principle of having different 
> output variables to represent the same concept depending on the format. I 
> would rather have reused fieldsep as in your previous submission and set 
> it to "," when under --csv.

yes

> The "\n" eol style is hardcoded. Should it use "recordsep"?

yes

> The "\pset format" error message in "do_pset" shows values in seemingly 
> random order. The situation is pre-existing but not really satisfactory. 
> I'd suggest to put all values in alphabetical order.

yes

> I'd suggest that tests should include more types, not just strings. I 
> would suggest int, float, timestamp, bytea, an array (which uses , as a 
> separator), json (which uses both " and ,)...

sounds good

-- 
Peter Eisentraut              http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Re: csv format for psql

From

Pavel Stehule

Date:

09 March 2018, 07:23:15

2018-03-09 3:13 GMT+01:00 Peter Eisentraut <peter.eisentraut@2ndquadrant.com>:

On 3/8/18 05:05, Fabien COELHO wrote:
> I'm in favor of having a simple psql way to generate a convenient and
> compliant CSV output for export/import.

yes

> I also think that a short option brings little value, and "--csv" is good
> enough for the purpose, so I would agree to remove the "-C" binding.

yes

> About "fieldsep_csv", I do not like much the principle of having different
> output variables to represent the same concept depending on the format. I
> would rather have reused fieldsep as in your previous submission and set
> it to "," when under --csv.

yes

how will be possible to set different separator ';'? I don't see it with described design

Regards

Pavel

> The "\n" eol style is hardcoded. Should it use "recordsep"?

yes

> The "\pset format" error message in "do_pset" shows values in seemingly
> random order. The situation is pre-existing but not really satisfactory.
> I'd suggest to put all values in alphabetical order.

yes

> I'd suggest that tests should include more types, not just strings. I
> would suggest int, float, timestamp, bytea, an array (which uses , as a
> separator), json (which uses both " and ,)...

sounds good

--
Peter Eisentraut http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Re: csv format for psql

From

Fabien COELHO

Date:

09 March 2018, 10:40:40

>>> About "fieldsep_csv", I do not like much the principle of having 
>>> different output variables to represent the same concept depending on 
>>> the format. I would rather have reused fieldsep as in your previous 
>>> submission and set it to "," when under --csv.
>>
>> yes
>>
>
> how will be possible to set different separator ';'? I don't see it with
> described design

Indeed, it should be possible. I think that the following should be made 
to work:

   psql --csv -P fieldsep=; -c 'TABLE foo' > foo.csv

So that it can be changed the semi-colon (or tab or whatever) style if 
required.

-- 
Fabien.

Re: csv format for psql

From

Pavel Stehule

Date:

09 March 2018, 19:48:00

2018-03-09 8:40 GMT+01:00 Fabien COELHO <coelho@cri.ensmp.fr>:

About "fieldsep_csv", I do not like much the principle of having different output variables to represent the same concept depending on the format. I would rather have reused fieldsep as in your previous submission and set it to "," when under --csv.

yes

how will be possible to set different separator ';'? I don't see it with
described design

Indeed, it should be possible. I think that the following should be made to work:

psql --csv -P fieldsep=; -c 'TABLE foo' > foo.csv

So that it can be changed the semi-colon (or tab or whatever) style if required.

ok, then all is ok

Regards

Pavel

--
Fabien.

Re: csv format for psql

From

"Daniel Verite"

Date:

09 March 2018, 22:20:18

    Fabien COELHO wrote:

> I also think that a short option brings little value, and "--csv" is good
> enough for the purpose, so I would agree to remove the "-C" binding.

It's not that accepting -C brings much value by itself, it's that
loosing the consistency across all options comes with a negative cost.
The point is that up to now all options have a short form and a long
form, so --csv would be a deliberate exception. I'm rather unconvinced
it's justified, but I seem to be alone in that case, so I'll comply.

> About "fieldsep_csv", I do not like much the principle of having different
> output variables to represent the same concept depending on the format. I
> would rather have reused fieldsep as in your previous submission and set
> it to "," when under --csv

The trouble with fieldsep is that it defaults to '|', which
both you and Pavel say you dislike. Fair enough, it's better
to have ',' by default, but the cleanest solution to that
is fieldsep_csv with its own default.
The solution to set fieldsep automatically to ',' with
\pset format csv is problematic.

For instance
 \pset format csv
 \pset fieldsep ';'

changes fieldsep to ';' as expected, but in the other order

 \pset fieldsep ';'
 \pset format csv

you get ',' while reasonably you'd expect ';'

Same problem on the command line. Options are evaluated left-to-right:

 $ psql --csv -F';'
would work as expected, but
 $ psql -F';' --csv
would not.

I don't feel good about solving these issues with ad-hoc rules,
or inventing the notion that a pset variable has been defined
but not user-redefined. This stuff has "inconsistent" written all over it
and I don't see a maintainer going along with that.

> The "\n" eol style is hardcoded. Should it use "recordsep"? For instance,
> https://tools.ietf.org/html/rfc4180 seems to specify CRLF end of lines.
> The definition is evolving, eg https://www.w3.org/TR/tabular-data-model/
> accepts both "\r" and "\r\n". I do not like using windows eol, but I think
> that it should be possible to do it, which is not the case with this
> version.

Interesting point. The output stream is opened in text mode so printing
'\n' should generate LF on Unix, CR LF on Windows, and I think CR on MacOS.
I think that's for the best.

recordsep in the unaligned mode doesn't play the role of a line ending
because the last line is not finished by recordsep. According to the source
code, this is intended, see print_unaligned_text() in print.c:
  /*
   * The last record is terminated by a newline, independent of the set
   * record separator.    But when the record separator is a zero byte, we
   * use that (compatible with find -print0 and xargs).
   */

> The "\pset format" error message in "do_pset" shows values in seemingly
> random order. The situation is pre-existing but not really satisfactory.
> I'd suggest to put all values in alphabetical order.

ok

> In csv_print_field & csv_print_text, you are not consistent when putting
> braces on blocks with only one instruction. I'd suggest not to put braces
> in that case.

ok

> I'd suggest that tests should include more types, not just strings. I
> would suggest int, float, timestamp, bytea, an array (which uses , as a
> separator), json (which uses both " and ,)...

I'll do but the printout code is type-agnostic so it's not supposed
to make a difference compared to mere literals.
Cases with NULLs are missing though, I'll go add some too.

Best regards,
--
Daniel Vérité
PostgreSQL-powered mailer: http://www.manitou-mail.org
Twitter: @DanielVerite

Re: csv format for psql

From

"Daniel Verite"

Date:

09 March 2018, 22:42:36

    I wrote:

> recordsep in the unaligned mode doesn't play the role of a line ending
> because the last line is not finished by recordsep. According to the source
>code, this is intended, see print_unaligned_text() in print.c:

Something else comes to mind: CSV allows linefeeds inside fields, and
we don't want to replace these with record separators.
So the notion that recordsep can be used to choose line endings
is even less okay than if there was just the last line issue.

Best regards,
--
Daniel Vérité
PostgreSQL-powered mailer: http://www.manitou-mail.org
Twitter: @DanielVerite

Re: csv format for psql

From

"David G. Johnston"

Date:

09 March 2018, 23:00:43

On Fri, Mar 9, 2018 at 12:42 PM, Daniel Verite <daniel@manitou-mail.org> wrote:

I wrote:

> recordsep in the unaligned mode doesn't play the role of a line ending
> because the last line is not finished by recordsep. According to the source
>code, this is intended, see print_unaligned_text() in print.c:

Something else comes to mind: CSV allows linefeeds inside fields, and
we don't want to replace these with record separators.
So the notion that recordsep can be used to choose line endings
is even less okay than if there was just the last line issue.

I'm not following - if recordsep is not something that would interpreted as a newline then the file we output would have not structural newlines. It might have data newlines but those would be quoted.

David J.

Re: csv format for psql

From

"Daniel Verite"

Date:

10 March 2018, 01:18:22

    David G. Johnston wrote:

> I'm not following - if recordsep is not something that would
> interpreted as a newline then the file we output would have not
> structural newlines.
> It might have data newlines but those would be quoted.

They would be, but I don't quite understand the point
in your first sentence.
All I'm saying is that setting recordsep to '\r\n' does
not achieve the goal of obtaining CRLF line endings.

Here's an example with the unaligned mode, linux host:

(psql -At -P recordsep=$'\r\n' << EOF
  select E'A\nB' union E'C\nD';
EOF
) | hexdump -C

The result is:
00000000  41 0a 42 0d 0a 43 0a 44  0a      |A.B..C.D.|

The expectation of CRLF line endings is that every LF would be
preceded by CR, but here we get that only for 1 LF out of 4. That's
useless.

It's not a bug. We asked for a CRLF to separate our two
records and that's exactly what we got, no more no less.

In csv output, the difference would be that there would a double quote
character before A and after B, and before C and after D but otherwise
it would be the same mix of LF and CRLF.

I think that the point of recordsep in unaligned mode is you can set it
to something that never appears in the data, especially when embedded
newlines might be in the data. In CSV this is solved differently so
we don't need it.

Best regards,
--
Daniel Vérité
PostgreSQL-powered mailer: http://www.manitou-mail.org
Twitter: @DanielVerite

Re: csv format for psql

From

"David G. Johnston"

Date:

10 March 2018, 05:58:10

On Fri, Mar 9, 2018 at 3:18 PM, Daniel Verite <daniel@manitou-mail.org> wrote:

I think that the point of recordsep in unaligned mode is you can set it

to something that never appears in the data, especially when embedded
newlines might be in the data. In CSV this is solved differently so
we don't need it.

I'd rather argue it from the standpoint that \copy doesn't use recordsep nor fieldsep and thus neither should --csv; which is arguably a convenience invocation of \copy that pipes to psql's stdout (and overcomes \copy's single-line limitation - which I think still exists... - and inability to use variables - does it?...). COPY doesn't allow for changing the record separator and the newline output is system-dependent. I can accept the same limitation with this feature.

I suppose the question is how many "COPY" options do we want to expose on the command line, and how does it look?

I'll put a -1 on having a short option (-C or otherwise); "that is the way its always been done" doesn't work for me here - by way of example "-a and -A" is ill-advised; --echo-all does not seem important enough to warrant a short option (especially not a lower-case one) and so the more useful unaligned mode is forced into the secondary capital A position.

David J.

Re: csv format for psql

From

Fabien COELHO

Date:

10 March 2018, 09:45:03

Hello Daniel,

About "-C", I'm fine if it is used and if it is not used, really. psql 
begins to be quite full of one letter options, currently 34 of them, with 
upper & lower cases and numbers included.


> The solution to set fieldsep automatically to ',' with
> \pset format csv is problematic.

I agree. I'm really advocating that --csv would set fieldsep, but manual 
pset on format would still do what is expected, and only that, i.e. --csv 
is NOT a simple shortcut for  -P format=csv".

> Same problem on the command line. Options are evaluated left-to-right:
>
> $ psql --csv -F';'
> would work as expected, but
> $ psql -F';' --csv
> would not.

ISTM that having an option overriding another one after it is standard 
practice.

I would be fine with that if --csv is documented as "setting format, 
fieldsep and recordsep to default suited for outputting CSV".

Now this is just a personal opinion.

>> The "\n" eol style is hardcoded. Should it use "recordsep"? For instance,
>> https://tools.ietf.org/html/rfc4180 seems to specify CRLF end of lines.
>> The definition is evolving, eg https://www.w3.org/TR/tabular-data-model/
>> accepts both "\r" and "\r\n". I do not like using windows eol, but I think
>> that it should be possible to do it, which is not the case with this
>> version.
>
> Interesting point. The output stream is opened in text mode so printing
> '\n' should generate LF on Unix, CR LF on Windows, and I think CR on MacOS.
> I think that's for the best.

I did not know that C's putc/printf/... would change output on sight on 
different systems. I'm not sure I like it. It would still mean that one 
cannot change the format to suits \r\n or \n at will, which is kind of 
disappointing.

> recordsep in the unaligned mode doesn't play the role of a line ending
> because the last line is not finished by recordsep.

It could just be with CSV format? As you point out, there is already an 
exception with the separator is '\0'. Note that the last line of a CSV 
file may or may not end with \n or \r\n, according to specs.

>> I'd suggest that tests should include more types, not just strings. I
>> would suggest int, float, timestamp, bytea, an array (which uses , as a
>> separator), json (which uses both " and ,)...
>
> I'll do but the printout code is type-agnostic so it's not supposed
> to make a difference compared to mere literals.

Sure, but it seems better to actually see that it works properly for non 
trivial cases.

> Cases with NULLs are missing though, I'll go add some too.

Indeed.

-- 
Fabien.

Re: csv format for psql

From

Fabien COELHO

Date:

10 March 2018, 09:49:45

>> recordsep in the unaligned mode doesn't play the role of a line ending
>> because the last line is not finished by recordsep. According to the source
>> code, this is intended, see print_unaligned_text() in print.c:
>
> Something else comes to mind: CSV allows linefeeds inside fields, and
> we don't want to replace these with record separators.

Sure.

> So the notion that recordsep can be used to choose line endings
> is even less okay than if there was just the last line issue.

I'm not following. ISTM that the escaping macanism would work in pretty 
all reasonable cases, although it might be possible to shot oneself in the 
foot by setting manually strange values for recordsep, eg '"'. I do not 
see that as a significant issue. If the user asks for something stupid, 
they get something stupid, fine.

-- 
Fabien.

Re: Re: csv format for psql

From

David Steele

Date:

21 March 2018, 20:27:33

Hi Daniel,

On 3/10/18 1:49 AM, Fabien COELHO wrote:
> 
>>> recordsep in the unaligned mode doesn't play the role of a line ending
>>> because the last line is not finished by recordsep. According to the
>>> source
>>> code, this is intended, see print_unaligned_text() in print.c:
>>
>> Something else comes to mind: CSV allows linefeeds inside fields, and
>> we don't want to replace these with record separators.
> 
> Sure.
> 
>> So the notion that recordsep can be used to choose line endings
>> is even less okay than if there was just the last line issue.
> 
> I'm not following. ISTM that the escaping macanism would work in pretty
> all reasonable cases, although it might be possible to shot oneself in
> the foot by setting manually strange values for recordsep, eg '"'. I do
> not see that as a significant issue. If the user asks for something
> stupid, they get something stupid, fine.

This patch looks like it is getting close but there has been no update
in a while.

Do you know when you'll have an updated patch that addresses the minor
issues brought up in review and the concern above?

Regards,
-- 
-David
david@pgmasters.net

Re: Re: csv format for psql

From

"Daniel Verite"

Date:

22 March 2018, 18:30:41

    David Steele wrote:

> Do you know when you'll have an updated patch that addresses the minor
> issues brought up in review and the concern above?

Here's an update addressing the issues discussed:

- fieldsep and recordsep are used, no more fieldsep_csv
- the command line option is --csv without short option and is equivalent
 to -P format=csv -P fieldsep=','
- pset output formats are reordered alphabetically on display
- a couple more cases in the regression tests


Best regards,
--
Daniel Vérité
PostgreSQL-powered mailer: http://www.manitou-mail.org
Twitter: @DanielVerite

Attachment

psql-csv-format-v4.patch

Re: Re: csv format for psql

From

Pavel Stehule

Date:

22 March 2018, 18:57:30

2018-03-22 16:30 GMT+01:00 Daniel Verite <daniel@manitou-mail.org>:

David Steele wrote:

> Do you know when you'll have an updated patch that addresses the minor
> issues brought up in review and the concern above?

Here's an update addressing the issues discussed:

- fieldsep and recordsep are used, no more fieldsep_csv
- the command line option is --csv without short option and is equivalent
to -P format=csv -P fieldsep=','
- pset output formats are reordered alphabetically on display
- a couple more cases in the regression tests

I am sorry, I don't think so this design is correct. It introduce dependency on arguments order

# correct output

[pavel@nemesis postgresql]$ psql --csv -c "table foo" -F ';'
a;b;c
1;2;Hello
3;4;Nazdar

# error -F is ignored
[pavel@nemesis postgresql]$ psql -F ';' --csv -c "table foo"
a,b,c
1,2,Hello
3,4,Nazdar

Using \pset format csv means overwriting field sep every time - nobody uses |

I think so dependency on order of psql arguments is significant problem

Regards

Pavel

Best regards,
--
Daniel Vérité
PostgreSQL-powered mailer: http://www.manitou-mail.org
Twitter: @DanielVerite

Re: Re: csv format for psql

From

Fabien COELHO

Date:

22 March 2018, 20:38:39

Hello Pavel,

> Using \pset format csv means overwriting field sep every time - nobody uses
> |

Yep. The alternative is to have a csv-specific separator variable, which 
does not seem very useful, must be remembered, but this is indeed 
debatable.

> I think so dependency on order of psql arguments is significant problem

This is intentional, and this issue/feature already exists, the last 
argument overwrite previous settings thus will win, eg:

   psql --pset=format=troff --html -c 'SELECT 1'

Will output in html, not in troff.

-- 
Fabien.

Re: Re: csv format for psql

From

Pavel Stehule

Date:

22 March 2018, 21:28:14

2018-03-22 18:38 GMT+01:00 Fabien COELHO <coelho@cri.ensmp.fr>:

Hello Pavel,

Using \pset format csv means overwriting field sep every time - nobody uses
|

Yep. The alternative is to have a csv-specific separator variable, which does not seem very useful, must be remembered, but this is indeed debatable.

I think so dependency on order of psql arguments is significant problem

This is intentional, and this issue/feature already exists, the last argument overwrite previous settings thus will win, eg:

psql --pset=format=troff --html -c 'SELECT 1'

Will output in html, not in troff.

Can we introduce some format specific default separators - if we would not to introduce csv_field_sep options?

It should not be hard. All formats can has '|' like now, and csv can have a ',' - then if field separator is not explicit, then default field separator is used, else specified field separator is used.

You can see my idea in attached patch

Regards

Pavel

postgres=# \pset format csv
Output format is csv.
postgres=# select * from foo;
a,b,c
1,2,Hello
3,4,Nazdar
postgres=# \pset fieldsep ;
Field separator is ";".
postgres=# select * from foo;
a;b;c
1;2;Hello
3;4;Nazdar

--
Fabien.

Attachment

default_format_fieldsep.patch

Re: Re: csv format for psql

From

Pavel Stehule

Date:

22 March 2018, 22:10:16

2018-03-22 19:28 GMT+01:00 Pavel Stehule <pavel.stehule@gmail.com>:

2018-03-22 18:38 GMT+01:00 Fabien COELHO <coelho@cri.ensmp.fr>:

Hello Pavel,

Using \pset format csv means overwriting field sep every time - nobody uses
|

Yep. The alternative is to have a csv-specific separator variable, which does not seem very useful, must be remembered, but this is indeed debatable.

I think so dependency on order of psql arguments is significant problem

This is intentional, and this issue/feature already exists, the last argument overwrite previous settings thus will win, eg:

psql --pset=format=troff --html -c 'SELECT 1'

Will output in html, not in troff.

Can we introduce some format specific default separators - if we would not to introduce csv_field_sep options?

It should not be hard. All formats can has '|' like now, and csv can have a ',' - then if field separator is not explicit, then default field separator is used, else specified field separator is used.

You can see my idea in attached patch

Regards

Pavel

postgres=# \pset format csv
Output format is csv.
postgres=# select * from foo;
a,b,c
1,2,Hello
3,4,Nazdar
postgres=# \pset fieldsep ;
Field separator is ";".
postgres=# select * from foo;
a;b;c
1;2;Hello
3;4;Nazdar

The default fieldsep should be "off" that means so format defaults are used. ',' is used for CSV, | for any else.

So all will work like now, but there will be bigger freedom with new format design. Now, all formats with possibility to setting fieldsep, should to share default '|', what I think, is not practical.

--
Fabien.

Re: Re: csv format for psql

From

Pavel Stehule

Date:

23 March 2018, 07:55:22

2018-03-22 20:10 GMT+01:00 Pavel Stehule <pavel.stehule@gmail.com>:

2018-03-22 19:28 GMT+01:00 Pavel Stehule <pavel.stehule@gmail.com>:

2018-03-22 18:38 GMT+01:00 Fabien COELHO <coelho@cri.ensmp.fr>:

Hello Pavel,

Using \pset format csv means overwriting field sep every time - nobody uses
|

Yep. The alternative is to have a csv-specific separator variable, which does not seem very useful, must be remembered, but this is indeed debatable.

I think so dependency on order of psql arguments is significant problem

This is intentional, and this issue/feature already exists, the last argument overwrite previous settings thus will win, eg:

psql --pset=format=troff --html -c 'SELECT 1'

Will output in html, not in troff.

Can we introduce some format specific default separators - if we would not to introduce csv_field_sep options?

It should not be hard. All formats can has '|' like now, and csv can have a ',' - then if field separator is not explicit, then default field separator is used, else specified field separator is used.

You can see my idea in attached patch

Regards

Pavel

postgres=# \pset format csv
Output format is csv.
postgres=# select * from foo;
a,b,c
1,2,Hello
3,4,Nazdar
postgres=# \pset fieldsep ;
Field separator is ";".
postgres=# select * from foo;
a;b;c
1;2;Hello
3;4;Nazdar

The default fieldsep should be "off" that means so format defaults are used. ',' is used for CSV, | for any else.

So all will work like now, but there will be bigger freedom with new format design. Now, all formats with possibility to setting fieldsep, should to share default '|', what I think, is not practical.

I can make patch, if there will be a agreement.

comments?

Regards

Pavel

--
Fabien.

Re: Re: csv format for psql

From

"Daniel Verite"

Date:

23 March 2018, 14:59:25

    Pavel Stehule wrote:

> It should not be hard. All formats can has '|' like now, and csv can have a
> ',' - then if field separator is not explicit, then default field separator
> is used, else specified field separator is used.
>
> You can see my idea in attached patch

With that patch, consider this sequence:

postgres=# \pset format csv
Output format is csv.
postgres=# \pset fieldsep
Field separator is "|".
postgres=# select 1 as a,2 as b;
a,b
1,2

Here psql reports that fieldsep is "|" and right away is using something
else in the output. That doesn't look good.

You may object that it's fixable by tweaking the output of \pset,
\pset fieldsep, and \? variables so that it knows that the current
output format is going to use a "hidden" default separator, and
then these commands should display that value instead.
But that'd be somewhat playing whack-a-mole, as the following
sequence would now be possible, with '|' being used  as
the separator instead of the ',' reported just above:

postgres=# \pset format csv
Output format is csv.
postgres=# \pset fieldsep
Field separator is ",".
postgres=# \a
Output format is aligned.
postgres=# select 1 as a,2 as b;
 a | b
---+---
 1 | 2

Best regards,
--
Daniel Vérité
PostgreSQL-powered mailer: http://www.manitou-mail.org
Twitter: @DanielVerite

Re: Re: csv format for psql

From

Fabien COELHO

Date:

23 March 2018, 20:55:17

Hello Daniel,

>> Do you know when you'll have an updated patch that addresses the minor
>> issues brought up in review and the concern above?
>
> Here's an update addressing the issues discussed:
>
> - fieldsep and recordsep are used, no more fieldsep_csv
> - the command line option is --csv without short option and is equivalent
> to -P format=csv -P fieldsep=','
> - pset output formats are reordered alphabetically on display
> - a couple more cases in the regression tests

Patch applies cleanly, compiles, doc gen ok, "make check" ok.

The patch adds a simple way to generate csv output from "psql" queries, 
much simpler than playing around with COPY or \copy. It allows to generate 
a clean CSV dump from something as short as:

   sh> psql --csv -c 'TABLE foo' > foo.csv

Documentation is clear.

Test cover a significant number of cases (fieldsep, expanded, tuples-only).
Although recordsep changes are not actually tested, it worked interactively
and I think that tests are sufficient as is.

There are somehow remaining point about which a committer/other people input
would be nice:

(1) There are some mild disagreement whether the fieldsep should be format
     specific shared with other format. I do not think that a specific fieldsep
     is worth it, but this is a marginal preference, and other people opinion
     differ. What is best is not obvious.

     Pavel also suggested to have a special handling based on whether
     the fieldsep is explicitely set or not. I'm not too keen on that because
     it departs significantly from the way psql formatting is currently
     handled, and what is happening becomes unclear to the user.

(2) For interactive use, two commands are required: \pset format csv +
     \pset fieldsep ',' (or ';' or '\t' or whatever...). Maybe some \csv
     command similar  to \H would be appropriate, or not, to set both values
     more efficiently. Could be something for another patch.

Not sure what is the status of the patch if we do not have a clear 
consensus.

-- 
Fabien.

Re: Re: csv format for psql

From

Pavel Stehule

Date:

23 March 2018, 21:44:19

2018-03-23 12:59 GMT+01:00 Daniel Verite <daniel@manitou-mail.org>:

Pavel Stehule wrote:

> It should not be hard. All formats can has '|' like now, and csv can have a
> ',' - then if field separator is not explicit, then default field separator
> is used, else specified field separator is used.
>
> You can see my idea in attached patch

With that patch, consider this sequence:

postgres=# \pset format csv
Output format is csv.
postgres=# \pset fieldsep
Field separator is "|".
postgres=# select 1 as a,2 as b;
a,b
1,2

Here psql reports that fieldsep is "|" and right away is using something
else in the output. That doesn't look good.

yes - my patch was proof concept - nothing more.

But this can be simply solved - if we have a table of default field separator, then if separator is not explicit, then default for used format is printed.

You may object that it's fixable by tweaking the output of \pset,
\pset fieldsep, and \? variables so that it knows that the current
output format is going to use a "hidden" default separator, and
then these commands should display that value instead.
But that'd be somewhat playing whack-a-mole, as the following
sequence would now be possible, with '|' being used as
the separator instead of the ',' reported just above:

postgres=# \pset format csv
Output format is csv.
postgres=# \pset fieldsep
Field separator is ",".
postgres=# \a
Output format is aligned.
postgres=# select 1 as a,2 as b;
a | b
---+---
1 | 2

I am sorry, but path that I sent was just proof concept - I didn't implement defaults for any other related formats.

I'll try to send cleaner patch tomorrow.

Regards

Pavel

Best regards,
--
Daniel Vérité
PostgreSQL-powered mailer: http://www.manitou-mail.org
Twitter: @DanielVerite

Re: Re: csv format for psql

From

Pavel Stehule

Date:

23 March 2018, 21:51:08

2018-03-23 18:55 GMT+01:00 Fabien COELHO <fabien.coelho@mines-paristech.fr>:

Hello Daniel,

Do you know when you'll have an updated patch that addresses the minor
issues brought up in review and the concern above?

Here's an update addressing the issues discussed:

- fieldsep and recordsep are used, no more fieldsep_csv
- the command line option is --csv without short option and is equivalent
to -P format=csv -P fieldsep=','
- pset output formats are reordered alphabetically on display
- a couple more cases in the regression tests

Patch applies cleanly, compiles, doc gen ok, "make check" ok.

The patch adds a simple way to generate csv output from "psql" queries, much simpler than playing around with COPY or \copy. It allows to generate a clean CSV dump from something as short as:

sh> psql --csv -c 'TABLE foo' > foo.csv

Documentation is clear.

Test cover a significant number of cases (fieldsep, expanded, tuples-only).
Although recordsep changes are not actually tested, it worked interactively
and I think that tests are sufficient as is.

There are somehow remaining point about which a committer/other people input
would be nice:

(1) There are some mild disagreement whether the fieldsep should be format
specific shared with other format. I do not think that a specific fieldsep
is worth it, but this is a marginal preference, and other people opinion
differ. What is best is not obvious.

Pavel also suggested to have a special handling based on whether
the fieldsep is explicitely set or not. I'm not too keen on that because
it departs significantly from the way psql formatting is currently
handled, and what is happening becomes unclear to the user.

(2) For interactive use, two commands are required: \pset format csv +
\pset fieldsep ',' (or ';' or '\t' or whatever...). Maybe some \csv
command similar to \H would be appropriate, or not, to set both values
more efficiently. Could be something for another patch.

Not sure what is the status of the patch if we do not have a clear consensus.

I am sorry, but I don't think so this interface is good enough. Using | as default CSV separator is just wrong. It and only it is a problem. Any other is perfect.

Regards

Pavel

--
Fabien.

Re: Re: csv format for psql

From

Fabien COELHO

Date:

24 March 2018, 10:15:08

Hello Pavel,

>> The patch adds a simple way to generate csv output from "psql" queries,
>> much simpler than playing around with COPY or \copy. It allows to generate
>> a clean CSV dump from something as short as:
>>
>>   sh> psql --csv -c 'TABLE foo' > foo.csv
>>
>> Documentation is clear.
>>
>> Test cover a significant number of cases (fieldsep, expanded, tuples-only).
>> Although recordsep changes are not actually tested, it worked interactively
>> and I think that tests are sufficient as is.
>>
>> There are somehow remaining point about which a committer/other people 
>> input would be nice:
>>
>> (1) There are some mild disagreement whether the fieldsep should be 
>> format specific shared with other format. I do not think that a 
>> specific fieldsep is worth it, but this is a marginal preference, and 
>> other people opinion differ. What is best is not obvious.
>>
>> Pavel also suggested to have a special handling based on whether the 
>> fieldsep is explicitely set or not. I'm not too keen on that because it 
>> departs significantly from the way psql formatting is currently 
>> handled, and what is happening becomes unclear to the user.
>>
>> (2) For interactive use, two commands are required: \pset format csv + 
>> \pset fieldsep ',' (or ';' or '\t' or whatever...). Maybe some \csv 
>> command similar to \H would be appropriate, or not, to set both values 
>> more efficiently. Could be something for another patch.
>>
>> Not sure what is the status of the patch if we do not have a clear
>> consensus.
>
> I am sorry, but I don't think so this interface is good enough. Using | as
> default CSV separator is just wrong. It and only it is a problem. Any other
> is perfect.

I do not think that there is a perfect solution, so some compromise will 
be needed or we won't get it.

(1) patch v4:

     "\pset format csv" retains the current fieldsep value, so fields are
     separated by whatever is in the variable, which means that for getting
     a standard csv two commands are needed, which is clearly documented,
     but may be considered as surprising. ISTM that the underlying point is
     that "format" is really about string escaping, not about the full output
     format, but this is a pre-existing situation.

     I'm suggesting to add \csv which would behave like \H to toggle CSV
     mode so as to improve this situation, with a caveat which is that
     toggling back \csv would have forgotted the previous settings (just
     like \H does, though, so would for instance reset to aligned with |),
     so it would not be perfect.

(2) your proposal as I understand it:

     "\pset format csv" may or may not use the fieldsep, depending on
     whether it was explicitely set, an information which is not shown, i.e.:

       \pset fieldsep # fieldsep separator is "|"
       \pset format csv # would output a,b,c or a|b|c...

     Because it depends on whether fieldsep was set explicitely to '|' or
     whether it has this value but it was due to the default.

     This kind of unclear behavioral determinism does not seem desirable.

(3) other option, always use a comma:

     this was rejected because some people like their comma separated
     values to be separated by semi-colons or tabs (aka tsv).

(4) other option, Daniel v3 or v2:

     use a distinct "fieldsep_csv" variable initially set to ','. This adds
     yet another specific variable that has to be remembered, some styles
     would use fieldsep but csv would not so it is some kind of exception
     that I would wish to avoid.

My current preference order in the suggested solutions is 1, 4, 2, 3, with 
a significant preference for 1.

-- 
Fabien.

Re: Re: csv format for psql

From

Pavel Stehule

Date:

24 March 2018, 10:24:47

2018-03-24 8:15 GMT+01:00 Fabien COELHO <coelho@cri.ensmp.fr>:

Hello Pavel,

The patch adds a simple way to generate csv output from "psql" queries,
much simpler than playing around with COPY or \copy. It allows to generate
a clean CSV dump from something as short as:

sh> psql --csv -c 'TABLE foo' > foo.csv

Documentation is clear.

Test cover a significant number of cases (fieldsep, expanded, tuples-only).
Although recordsep changes are not actually tested, it worked interactively
and I think that tests are sufficient as is.

There are somehow remaining point about which a committer/other people input would be nice:

(1) There are some mild disagreement whether the fieldsep should be format specific shared with other format. I do not think that a specific fieldsep is worth it, but this is a marginal preference, and other people opinion differ. What is best is not obvious.

Pavel also suggested to have a special handling based on whether the fieldsep is explicitely set or not. I'm not too keen on that because it departs significantly from the way psql formatting is currently handled, and what is happening becomes unclear to the user.

(2) For interactive use, two commands are required: \pset format csv + \pset fieldsep ',' (or ';' or '\t' or whatever...). Maybe some \csv command similar to \H would be appropriate, or not, to set both values more efficiently. Could be something for another patch.

Not sure what is the status of the patch if we do not have a clear
consensus.

I am sorry, but I don't think so this interface is good enough. Using | as
default CSV separator is just wrong. It and only it is a problem. Any other
is perfect.

I do not think that there is a perfect solution, so some compromise will be needed or we won't get it.

(1) patch v4:

"\pset format csv" retains the current fieldsep value, so fields are
separated by whatever is in the variable, which means that for getting
a standard csv two commands are needed, which is clearly documented,
but may be considered as surprising. ISTM that the underlying point is
that "format" is really about string escaping, not about the full output
format, but this is a pre-existing situation.

I'm suggesting to add \csv which would behave like \H to toggle CSV
mode so as to improve this situation, with a caveat which is that
toggling back \csv would have forgotted the previous settings (just
like \H does, though, so would for instance reset to aligned with |),
so it would not be perfect.

this doesn't solve usual format settings by \pset format csv

(2) your proposal as I understand it:

"\pset format csv" may or may not use the fieldsep, depending on
whether it was explicitely set, an information which is not shown, i.e.:

\pset fieldsep # fieldsep separator is "|"
\pset format csv # would output a,b,c or a|b|c...

Because it depends on whether fieldsep was set explicitely to '|' or
whether it has this value but it was due to the default.

This kind of unclear behavioral determinism does not seem desirable.

please, check and test attached patch. It is very simply for usage - and there is not any unclear behave. Just you should to accept so formats can have own defaults for separators.

(3) other option, always use a comma:

this was rejected because some people like their comma separated
values to be separated by semi-colons or tabs (aka tsv).

(4) other option, Daniel v3 or v2:

use a distinct "fieldsep_csv" variable initially set to ','. This adds
yet another specific variable that has to be remembered, some styles
would use fieldsep but csv would not so it is some kind of exception
that I would wish to avoid.

My current preference order in the suggested solutions is 1, 4, 2, 3, with a significant preference for 1.

I am thinking so @1 solves nothing - people are using \pset format ...

@3 is clearly bad - there are not any discussion

@4 can be compromise solution, but then there should be renamed fieldsep. Now, fieldsep is used just for unaligned format - for nothing else. If we introduce fieldsep_csv, then fieldsep should be renamed to fieldsep_unaligned. I can live with it.

But I think so default fieldsep is better option. Please, try my patch and comment it.

Regards

Pavel

--
Fabien.

Attachment

default_format_fieldsep-2.patch

Re: Re: csv format for psql

From

Pavel Stehule

Date:

24 March 2018, 11:05:29

2018-03-24 8:24 GMT+01:00 Pavel Stehule <pavel.stehule@gmail.com>:

2018-03-24 8:15 GMT+01:00 Fabien COELHO <coelho@cri.ensmp.fr>:

Hello Pavel,

The patch adds a simple way to generate csv output from "psql" queries,
much simpler than playing around with COPY or \copy. It allows to generate
a clean CSV dump from something as short as:

sh> psql --csv -c 'TABLE foo' > foo.csv

Documentation is clear.

Test cover a significant number of cases (fieldsep, expanded, tuples-only).
Although recordsep changes are not actually tested, it worked interactively
and I think that tests are sufficient as is.

There are somehow remaining point about which a committer/other people input would be nice:

(1) There are some mild disagreement whether the fieldsep should be format specific shared with other format. I do not think that a specific fieldsep is worth it, but this is a marginal preference, and other people opinion differ. What is best is not obvious.

Pavel also suggested to have a special handling based on whether the fieldsep is explicitely set or not. I'm not too keen on that because it departs significantly from the way psql formatting is currently handled, and what is happening becomes unclear to the user.

(2) For interactive use, two commands are required: \pset format csv + \pset fieldsep ',' (or ';' or '\t' or whatever...). Maybe some \csv command similar to \H would be appropriate, or not, to set both values more efficiently. Could be something for another patch.

Not sure what is the status of the patch if we do not have a clear
consensus.

I am sorry, but I don't think so this interface is good enough. Using | as
default CSV separator is just wrong. It and only it is a problem. Any other
is perfect.

I do not think that there is a perfect solution, so some compromise will be needed or we won't get it.

(1) patch v4:

"\pset format csv" retains the current fieldsep value, so fields are
separated by whatever is in the variable, which means that for getting
a standard csv two commands are needed, which is clearly documented,
but may be considered as surprising. ISTM that the underlying point is
that "format" is really about string escaping, not about the full output
format, but this is a pre-existing situation.

I'm suggesting to add \csv which would behave like \H to toggle CSV
mode so as to improve this situation, with a caveat which is that
toggling back \csv would have forgotted the previous settings (just
like \H does, though, so would for instance reset to aligned with |),
so it would not be perfect.

this doesn't solve usual format settings by \pset format csv

(2) your proposal as I understand it:

"\pset format csv" may or may not use the fieldsep, depending on
whether it was explicitely set, an information which is not shown, i.e.:

\pset fieldsep # fieldsep separator is "|"
\pset format csv # would output a,b,c or a|b|c...

Because it depends on whether fieldsep was set explicitely to '|' or
whether it has this value but it was due to the default.

This kind of unclear behavioral determinism does not seem desirable.

please, check and test attached patch. It is very simply for usage - and there is not any unclear behave. Just you should to accept so formats can have own defaults for separators.

(3) other option, always use a comma:

this was rejected because some people like their comma separated
values to be separated by semi-colons or tabs (aka tsv).

(4) other option, Daniel v3 or v2:

use a distinct "fieldsep_csv" variable initially set to ','. This adds
yet another specific variable that has to be remembered, some styles
would use fieldsep but csv would not so it is some kind of exception
that I would wish to avoid.

My current preference order in the suggested solutions is 1, 4, 2, 3, with a significant preference for 1.

I am thinking so @1 solves nothing - people are using \pset format ...

@3 is clearly bad - there are not any discussion

@4 can be compromise solution, but then there should be renamed fieldsep. Now, fieldsep is used just for unaligned format - for nothing else. If we introduce fieldsep_csv, then fieldsep should be renamed to fieldsep_unaligned. I can live with it.

But I think so default fieldsep is better option. Please, try my patch and comment it.

minor fix

Regards

Pavel

Regards

Pavel

--
Fabien.

Attachment

default_format_fieldsep-3.patch

Re: Re: csv format for psql

From

Pavel Stehule

Date:

24 March 2018, 11:07:47

minor fix

all is working now

[pavel@nemesis postgresql]$ psql --csv -F ';' -c "table foo" postgres
a;b;c
3;4;Nazdar
3;4;Nazdar
[pavel@nemesis postgresql]$ psql -F ';' --csv -c "table foo" postgres
a;b;c
3;4;Nazdar
3;4;Nazdar
[pavel@nemesis postgresql]$ psql --csv -c "table foo" postgres
a,b,c
3,4,Nazdar
3,4,Nazdar
[pavel@nemesis postgresql]$ psql -A -c "table foo" postgres
a|b|c
3|4|Nazdar
3|4|Nazdar
(2 rows)
[pavel@nemesis postgresql]$ psql -A -F , -c "table foo" postgres
a,b,c
3,4,Nazdar
3,4,Nazdar
(2 rows)

Regards

Pavel

Regards

Pavel

Regards

Pavel

--
Fabien.

Re: Re: csv format for psql

From

Pavel Stehule

Date:

24 March 2018, 11:28:39

minor fix

another fix in help

Pavel

postgres=# \f
Field separator is "".
postgres=# \pset format csv
Output format is csv.
postgres=# \f
Field separator is ",".
postgres=# \pset format unaligned
Output format is unaligned.
postgres=# \f
Field separator is "|".
postgres=# \f ;
Field separator is ";".
postgres=# \pset format unaligned
Output format is unaligned.
postgres=# \f
Field separator is ";".
postgres=# \pset format csv
Output format is csv.
postgres=# \f
Field separator is ";".

Regards

Regards

Pavel

Regards

Pavel

--
Fabien.

Attachment

default_format_fieldsep-4.patch

Re: Re: csv format for psql

From

Fabien COELHO

Date:

24 March 2018, 12:06:12

Hello Pavel,

>>>     I'm suggesting to add \csv which would behave like \H to toggle CSV
>>>     mode so as to improve this situation, with a caveat which is that
>>>     toggling back \csv would have forgotted the previous settings (just
>>>     like \H does, though, so would for instance reset to aligned with |),
>>>     so it would not be perfect.
>>
>> this doesn't solve usual format settings by \pset format csv

Indeed, it does not. Alas, "format" is not strictly about format but more 
about string escaping.

>>> (2) your proposal as I understand it:
>>>
>>>     "\pset format csv" may or may not use the fieldsep, depending on
>>>     whether it was explicitely set, an information which is not shown,
>>> i.e.:
>>>
>>>       \pset fieldsep # fieldsep separator is "|"
>>>       \pset format csv # would output a,b,c or a|b|c...
>>>
>>>     Because it depends on whether fieldsep was set explicitely to '|' or
>>>     whether it has this value but it was due to the default.
>>>
>>>     This kind of unclear behavioral determinism does not seem desirable.
>>>
>>
>> please, check and test attached patch. It is very simply for usage - and
>> there is not any unclear behave. Just you should to accept so formats can
>> have own defaults for separators.

I checked, and the behavior is intrinsically strange.

Your patch shows an empty '' fieldsep at startup, which is debatable 
because it is not its actual value.

  psql> \pset # fieldsep '' -- doubtful

ISTM that at the minimum it should show a special "<format-default>" or 
whatever value, which creates some kind of exception because this special 
value cannot be set and special values are a bad thing.

When a format is set, a current default fielsep is shown.

  psql> \pset format unaligned
  psql> \pset # fieldsep '|' -- new default shown
  psql> SELECT 1 AS one, 2 AS two;
    one|two
    1|2

  psql> \pset format csv
  psql> \pset # fieldsep ',' -- new default shown

  psql> SELECT 1 AS one, 2 AS two;
    one,two
    1,2

However, if fieldsep is set once, the behaviors changes indefinitely:

  psql> \pset fieldsep '|'
  psql> \pset format unaligned
    # unaligned with '|'

  psql> \pset format csv
    # csv with '|'

That is once the fieldsep has been used, you are back to the v4 behavior 
that you are disagreeing with in the first place.

The patch creates an incomplete state automaton which some unreachable 
states, because once a value is set there is no way to switch back to the 
previous "use default" behavior.

Also, if I do "\pset fielsep ''" then the \pset output does not allow to 
distinguish between the unset state and set to empty state.

I would not expect a committer to accept such a strange no-possible-return 
to previous state because of a hidden state (whether fieldsep has been set 
or not in the past) behavior.

So I do not think that this proposed version is really satisfactory.

-- 
Fabien.

Re: Re: csv format for psql

From

Pavel Stehule

Date:

24 March 2018, 15:41:21

2018-03-24 10:06 GMT+01:00 Fabien COELHO <coelho@cri.ensmp.fr>:

Hello Pavel,

I'm suggesting to add \csv which would behave like \H to toggle CSV
mode so as to improve this situation, with a caveat which is that
toggling back \csv would have forgotted the previous settings (just
like \H does, though, so would for instance reset to aligned with |),
so it would not be perfect.

this doesn't solve usual format settings by \pset format csv

Indeed, it does not. Alas, "format" is not strictly about format but more about string escaping.

(2) your proposal as I understand it:

"\pset format csv" may or may not use the fieldsep, depending on
whether it was explicitely set, an information which is not shown,
i.e.:

\pset fieldsep # fieldsep separator is "|"
\pset format csv # would output a,b,c or a|b|c...

Because it depends on whether fieldsep was set explicitely to '|' or
whether it has this value but it was due to the default.

This kind of unclear behavioral determinism does not seem desirable.

please, check and test attached patch. It is very simply for usage - and
there is not any unclear behave. Just you should to accept so formats can
have own defaults for separators.

I checked, and the behavior is intrinsically strange.

Your patch shows an empty '' fieldsep at startup, which is debatable because it is not its actual value.

it is correct. Default format is aligned, that doesn't use fieldsep.

psql> \pset # fieldsep '' -- doubtful

ISTM that at the minimum it should show a special "<format-default>" or whatever value, which creates some kind of exception because this special value cannot be set and special values are a bad thing.

When a format is set, a current default fielsep is shown.

psql> \pset format unaligned
psql> \pset # fieldsep '|' -- new default shown
psql> SELECT 1 AS one, 2 AS two;
one|two
1|2

psql> \pset format csv
psql> \pset # fieldsep ',' -- new default shown

psql> SELECT 1 AS one, 2 AS two;
one,two
1,2

However, if fieldsep is set once, the behaviors changes indefinitely:

psql> \pset fieldsep '|'
psql> \pset format unaligned
# unaligned with '|'

psql> \pset format csv
# csv with '|'

There should be similar mechanism like fieldsep_zero, that reset settings. Some like fieldsep_default.

That is once the fieldsep has been used, you are back to the v4 behavior that you are disagreeing with in the first place.

The patch creates an incomplete state automaton which some unreachable states, because once a value is set there is no way to switch back to the previous "use default" behavior.

It is not implemented, but it is small problem

Also, if I do "\pset fielsep ''" then the \pset output does not allow to distinguish between the unset state and set to empty state.

This is question - how to fieldsep be related to current format. Aligned format doesn't use fieldsep.

The alternative can be words "not used" -- and then you can see difference between "" and other. But if you see the code, the empty string is used like info about empty fieldsep now.

I would not expect a committer to accept such a strange no-possible-return to previous state because of a hidden state (whether fieldsep has been set or not in the past) behavior.

This design is very similar to already implemented fieldsep_zero - look to code. It is nothing new.

So I do not think that this proposed version is really satisfactory.

So I can do better?

1. use special default string for formats that doesn't field sep - like "not used" or some

2. I can implemet the option fieldsep_default - very similary to fieldsep_zero to reset fieldsep to default state.

please, check updated patch

Regards

Pavel

--
Fabien.

Attachment

default_format_fieldsep-5.patch

Re: Re: csv format for psql

From

Fabien COELHO

Date:

25 March 2018, 16:21:03


Hello Pavel,

>> [...]
> it is correct. Default format is aligned, that doesn't use fieldsep.

My comment is that currently fieldsep is kind of a variable, the value of 
which is displayed and reliable wrt commands executed afterwards, and the 
proposed approach changes that by adding a new "default" state.

>> [...]
> There should be similar mechanism like fieldsep_zero, that reset settings.
> Some like fieldsep_default.

The user-visible fieldsep_zero is a horrible hack which exposes the fact 
that C strings cannot hold character zero because it is the end of string 
marker. The user should not have to care! I wish it would be simply 
removed, and that "\pset fieldsep '\0'" or something like that would do 
the trick.

Having another one of these, which would create 4 boolean states which 
would have to be interpreted (eg if default is true and zero is true and 
sep is ';', and format is csv, what will I get?) by the user to guess what 
is going to happen looks unhelpful.

For me adding another user-visible fieldsep-related boolean a no go.

>> Also, if I do "\pset fielsep ''" then the \pset output does not allow to
>> distinguish between the unset state and set to empty state.
>
> This is question - how to fieldsep be related to current format. Aligned
> format doesn't use fieldsep.

For me \pset reflects the current value of output options. With the 
proposed approach it does not mean that anymore, as already said above.

>> I would not expect a committer to accept such a strange no-possible-return
>> to previous state because of a hidden state (whether fieldsep has been set
>> or not in the past) behavior.
>
> This design is very similar to already implemented fieldsep_zero - look to
> code. It is nothing new.

Alas you are right. I suggest not to go on the same path again:-)

> So I can do better?

Dunno. Possibly.

> 1. use special default string for formats that doesn't field sep - like
> "not used" or some
> 2. I can implemet the option fieldsep_default - very similary to
> fieldsep_zero to reset fieldsep to default state.

I strongly dislike option 2, as expressed above. I would enthousiastically 
review any patch that would aim are removing these "*_zero" options. I 
might submit it someday.

I'm more unclear about option 1. Maybe it could be managed cleanly.

I'm still at odds that it would mean that \pset would not show the actual 
setting anymore, but something harder to explain, "actual value or some 
format-specific value, depending".

-- 
Fabien.

Re: Re: csv format for psql

From

Pavel Stehule

Date:

25 March 2018, 19:02:41

2018-03-25 15:21 GMT+02:00 Fabien COELHO <coelho@cri.ensmp.fr>:

Hello Pavel,

[...]
it is correct. Default format is aligned, that doesn't use fieldsep.

My comment is that currently fieldsep is kind of a variable, the value of which is displayed and reliable wrt commands executed afterwards, and the proposed approach changes that by adding a new "default" state.

[...]
There should be similar mechanism like fieldsep_zero, that reset settings.
Some like fieldsep_default.

The user-visible fieldsep_zero is a horrible hack which exposes the fact that C strings cannot hold character zero because it is the end of string marker. The user should not have to care! I wish it would be simply removed, and that "\pset fieldsep '\0'" or something like that would do the trick.

Having another one of these, which would create 4 boolean states which would have to be interpreted (eg if default is true and zero is true and sep is ';', and format is csv, what will I get?) by the user to guess what is going to happen looks unhelpful.

For me adding another user-visible fieldsep-related boolean a no go.

Also, if I do "\pset fielsep ''" then the \pset output does not allow to
distinguish between the unset state and set to empty state.

This is question - how to fieldsep be related to current format. Aligned
format doesn't use fieldsep.

For me \pset reflects the current value of output options. With the proposed approach it does not mean that anymore, as already said above.

I would not expect a committer to accept such a strange no-possible-return
to previous state because of a hidden state (whether fieldsep has been set
or not in the past) behavior.

This design is very similar to already implemented fieldsep_zero - look to
code. It is nothing new.

Alas you are right. I suggest not to go on the same path again:-)

So I can do better?

Dunno. Possibly.

1. use special default string for formats that doesn't field sep - like
"not used" or some
2. I can implemet the option fieldsep_default - very similary to
fieldsep_zero to reset fieldsep to default state.

I strongly dislike option 2, as expressed above. I would enthousiastically review any patch that would aim are removing these "*_zero" options. I might submit it someday.

I can remove it simply - but a alternative is implementation of some \pset_reset command maybe:

\pset reset fieldsep

what do you think?

I'm more unclear about option 1. Maybe it could be managed cleanly.

I'm still at odds that it would mean that \pset would not show the actual setting anymore, but something harder to explain, "actual value or some format-specific value, depending".

It can be formulated little bit different - "when a value of field separator is not entered, then format specific default is used (if can be specified - some formats doesn't allow to specify field separator)."

Regards

Pavel

--
Fabien.

Re: Re: csv format for psql

From

Pavel Stehule

Date:

26 March 2018, 08:12:41

2018-03-25 18:02 GMT+02:00 Pavel Stehule <pavel.stehule@gmail.com>:

2018-03-25 15:21 GMT+02:00 Fabien COELHO <coelho@cri.ensmp.fr>:

Hello Pavel,

[...]
it is correct. Default format is aligned, that doesn't use fieldsep.

My comment is that currently fieldsep is kind of a variable, the value of which is displayed and reliable wrt commands executed afterwards, and the proposed approach changes that by adding a new "default" state.

[...]
There should be similar mechanism like fieldsep_zero, that reset settings.
Some like fieldsep_default.

The user-visible fieldsep_zero is a horrible hack which exposes the fact that C strings cannot hold character zero because it is the end of string marker. The user should not have to care! I wish it would be simply removed, and that "\pset fieldsep '\0'" or something like that would do the trick.

Having another one of these, which would create 4 boolean states which would have to be interpreted (eg if default is true and zero is true and sep is ';', and format is csv, what will I get?) by the user to guess what is going to happen looks unhelpful.

For me adding another user-visible fieldsep-related boolean a no go.

Also, if I do "\pset fielsep ''" then the \pset output does not allow to
distinguish between the unset state and set to empty state.

This is question - how to fieldsep be related to current format. Aligned
format doesn't use fieldsep.

For me \pset reflects the current value of output options. With the proposed approach it does not mean that anymore, as already said above.

I would not expect a committer to accept such a strange no-possible-return
to previous state because of a hidden state (whether fieldsep has been set
or not in the past) behavior.

This design is very similar to already implemented fieldsep_zero - look to
code. It is nothing new.

Alas you are right. I suggest not to go on the same path again:-)

So I can do better?

Dunno. Possibly.

1. use special default string for formats that doesn't field sep - like
"not used" or some
2. I can implemet the option fieldsep_default - very similary to
fieldsep_zero to reset fieldsep to default state.

I strongly dislike option 2, as expressed above. I would enthousiastically review any patch that would aim are removing these "*_zero" options. I might submit it someday.

I can remove it simply - but a alternative is implementation of some \pset_reset command maybe:

\pset reset fieldsep

what do you think?

implemented in attached patch

I'm more unclear about option 1. Maybe it could be managed cleanly.

I'm still at odds that it would mean that \pset would not show the actual setting anymore, but something harder to explain, "actual value or some format-specific value, depending".

It can be formulated little bit different - "when a value of field separator is not entered, then format specific default is used (if can be specified - some formats doesn't allow to specify field separator)."

it can be clean from following result

postgres=# \pset fieldsep
User didn't specified field separator.
Current format doesn't specify default field separator.
postgres=# \pset fieldsep ;
Field separator is ";".
postgres=# \pset fieldsep
Field separator is ";".

fieldsep_default was removed

Regards

Pavel

Regards

Pavel

--
Fabien.

Attachment

default_format_fieldsep-6.patch

Re: Re: csv format for psql

From

"Daniel Verite"

Date:

26 March 2018, 15:44:20

    Pavel Stehule wrote:

> implemented in attached patch

Consider your initial output of \pset, when no option is specified
=================
$ ./psql
psql (11devel)
Type "help" for help.

postgres=# \pset
border             1
columns          0
expanded         off
fieldsep         not used
fieldsep_zero         ERROR
footer             on
format             aligned
linestyle         ascii
null             ''
numericlocale         off
pager             1
pager_min_lines      0
recordsep         '\n'
recordsep_zero         off
reset             ERROR
tableattr
title
tuples_only         off
unicode_border_linestyle single
unicode_column_linestyle single
unicode_header_linestyle single
================

These entries with ERROR correspond in fact to no error at all,
or we have to pretend that the default state of psql is erroneous,
which doesn't make sense.

Also "reset" is not a variable, it seems to be a command,
so it probably shouldn't be there in the first place.

More generally, I'd think the point of reusing "fieldsep" was to
reuse the concept, not reimplement it, let alone changing
bits of behavior of the unaligned mode along the way.

With this patch, again without specifying any option, just looking
at what fieldsep is leads to this:

postgres=# \pset fieldsep
User didn't specified field separator.
Current format doesn't specify default field separator.

If this is the way to "solve" the fact that a user has to do
 \pset fieldsep ','
to get commas in csv mode, then IMV the proposed solution
is clearly worse than the stated problem, and worse than
simply adding fieldsep_csv to be independant from the
unaligned mode.


Best regards,
--
Daniel Vérité
PostgreSQL-powered mailer: http://www.manitou-mail.org
Twitter: @DanielVerite

Re: Re: csv format for psql

From

Pavel Stehule

Date:

26 March 2018, 16:45:01

2018-03-26 14:44 GMT+02:00 Daniel Verite <daniel@manitou-mail.org>:

Pavel Stehule wrote:

> implemented in attached patch

Consider your initial output of \pset, when no option is specified
=================
$ ./psql
psql (11devel)
Type "help" for help.

postgres=# \pset
border 1
columns 0
expanded off
fieldsep not used
fieldsep_zero ERROR
footer on
format aligned
linestyle ascii
null ''
numericlocale off
pager 1
pager_min_lines 0
recordsep '\n'
recordsep_zero off
reset ERROR
tableattr
title
tuples_only off
unicode_border_linestyle single
unicode_column_linestyle single
unicode_header_linestyle single
================

These entries with ERROR correspond in fact to no error at all,
or we have to pretend that the default state of psql is erroneous,
which doesn't make sense.

Also "reset" is not a variable, it seems to be a command,
so it probably shouldn't be there in the first place.

fixed

More generally, I'd think the point of reusing "fieldsep" was to
reuse the concept, not reimplement it, let alone changing
bits of behavior of the unaligned mode along the way.

With this patch, again without specifying any option, just looking
at what fieldsep is leads to this:

postgres=# \pset fieldsep
User didn't specified field separator.
Current format doesn't specify default field separator.

If this is the way to "solve" the fact that a user has to do
\pset fieldsep ','
to get commas in csv mode, then IMV the proposed solution
is clearly worse than the stated problem, and worse than
simply adding fieldsep_csv to be independant from the
unaligned mode.

I don't understand what is wrong there? There can be any message (short - just "unset",...). I understand so default visual can look strange, because aligned mode has not default field separator, but I don't see any other possibility.

Can I do some recapitulation:

1. using CSV with default | as field separator is wrong - probably there is a agreement

2. partial solution is fieldsep_X where X will be format name. It is better than using | for csv, but it introduces new problems:

a) there are not new long, short options for this value

b) the list of pset options is bloating - every possible new format can introduce fieldsep_X option

c) can be messy for people, because the setting fieldsep can has zero effect on csv or some other formats that don't share default with unaligned format.

So my position - I am very strong against to introduce CSV format with | as field separator, and I am not happy if we introduce fieldsep_X like options, because it is not too good too.

Regards

Pavel

Best regards,
--
Daniel Vérité
PostgreSQL-powered mailer: http://www.manitou-mail.org
Twitter: @DanielVerite

Attachment

default_format_fieldsep-7.patch

Re: Re: csv format for psql

From

"Daniel Verite"

Date:

26 March 2018, 21:26:26

    Pavel Stehule wrote:

> b) the list of pset options is bloating - every possible new format can
> introduce fieldsep_X option

What new format?
The usefulness of fieldsep does not extend outside of xSV, and it's
no suprise that there have been no other use for a fieldsep-like
variable until now, with the other formats supported in psql.

In fact it's even absurd for any format to use a non-fixed separator
at a place that is key for being parsed unambiguously.

We could even support only the comma and make it non-configurable
based on the fact it's Comma-Separated-Values, not
Whatever-Separated-Values, except that won't do much
to serve the users interests, as the reality is that
people use various separators.

Best regards,
--
Daniel Vérité
PostgreSQL-powered mailer: http://www.manitou-mail.org
Twitter: @DanielVerite

Re: Re: csv format for psql

From

Pavel Stehule

Date:

26 March 2018, 22:51:25

2018-03-26 20:26 GMT+02:00 Daniel Verite <daniel@manitou-mail.org>:

Pavel Stehule wrote:

> b) the list of pset options is bloating - every possible new format can
> introduce fieldsep_X option

What new format?
The usefulness of fieldsep does not extend outside of xSV, and it's
no suprise that there have been no other use for a fieldsep-like
variable until now, with the other formats supported in psql.

I can imagine format based on \t as field separator, or some other invisible chars.

In fact it's even absurd for any format to use a non-fixed separator
at a place that is key for being parsed unambiguously.

We could even support only the comma and make it non-configurable
based on the fact it's Comma-Separated-Values, not
Whatever-Separated-Values, except that won't do much
to serve the users interests, as the reality is that
people use various separators.

I don't think so allow only comma as separator is the solution.

Regards

Pavel

Best regards,
--
Daniel Vérité
PostgreSQL-powered mailer: http://www.manitou-mail.org
Twitter: @DanielVerite

Re: Re: csv format for psql

From

Fabien COELHO

Date:

27 March 2018, 12:35:44

Hello Pavel,

I'd like to convince you to compromise, because otherwise I'm afraid we 
will not get this feature.

>> 1. use special default string for formats that doesn't field sep - like
>>> "not used" or some
>>> 2. I can implemet the option fieldsep_default - very similary to
>>> fieldsep_zero to reset fieldsep to default state.
>>>
>>
>> I strongly dislike option 2, as expressed above. I would enthousiastically
>> review any patch that would aim are removing these "*_zero" options. I
>> might submit it someday.
>>
>
> I can remove it simply - but a alternative is implementation of some
> \pset_reset command maybe:
>
> \pset reset fieldsep
>
> what do you think?

I think that changing the semantics of \pset is a nonstarter, it should 
be only about "output [p]arameter [set]ting", because it has been like 
that for the last XX years and people are used to it.

>> I'm more unclear about option 1. Maybe it could be managed cleanly.
>>
>> I'm still at odds that it would mean that \pset would not show the actual
>> setting anymore, but something harder to explain, "actual value or some
>> format-specific value, depending".
>
> It can be formulated little bit different - "when a value of field
> separator is not entered, then format specific default is used (if can be
> specified - some formats doesn't allow to specify field separator)."

Well, that is 3 lines of explanations where people thought they were just 
setting a simple variable to a simple value, or showing the actual current 
value which would be used if needed.

My opinion is that some of what you are suggesting could have participated 
to an alternate and interesting output-parameters-for-psql design, but we 
are a much too late to change that.

The purpose of the patch is just to enable having clean CSV quite easily 
from psql, possibly for pg11... Changing the design and the underlying 
user visible behavior would require a much wider and difficult to obtain 
consensus, and is very unlikely to get in for pg11, if ever.

The current version allows "--csv" or need two "\pset" to achieve CSV, 
without changing the preceding behavior, however unperfect it is...

Could you live with that with the benefit of getting it in? I do not claim 
it is a perfect solution, just that it is a reasonable one.

The dynamic default changing approach departs too much for the current 
user expectations, the user-benefit is not worth it, and committers are 
very likely to think like that.

The only no-surprise, no-behavioral-change, alternative I see is to have a 
csv-specific fieldsep. I'm not keen on that one because it is yet another 
variable, one has to remember it, and then it asks the question about 
recordsep... and I think there are already too many parameters, and more 
is not desirable, although I probably could live with that if I can live 
with "fieldsep_zero":-)

-- 
Fabien.

Re: Re: csv format for psql

From

Pavel Stehule

Date:

27 March 2018, 21:34:50

2018-03-27 11:35 GMT+02:00 Fabien COELHO <coelho@cri.ensmp.fr>:

Hello Pavel,

I'd like to convince you to compromise, because otherwise I'm afraid we will not get this feature.

1. use special default string for formats that doesn't field sep - like
"not used" or some
2. I can implemet the option fieldsep_default - very similary to
fieldsep_zero to reset fieldsep to default state.

I strongly dislike option 2, as expressed above. I would enthousiastically
review any patch that would aim are removing these "*_zero" options. I
might submit it someday.

I can remove it simply - but a alternative is implementation of some
\pset_reset command maybe:

\pset reset fieldsep

what do you think?

I think that changing the semantics of \pset is a nonstarter, it should be only about "output [p]arameter [set]ting", because it has been like that for the last XX years and people are used to it.

I'm more unclear about option 1. Maybe it could be managed cleanly.

I'm still at odds that it would mean that \pset would not show the actual
setting anymore, but something harder to explain, "actual value or some
format-specific value, depending".

It can be formulated little bit different - "when a value of field
separator is not entered, then format specific default is used (if can be
specified - some formats doesn't allow to specify field separator)."

Well, that is 3 lines of explanations where people thought they were just setting a simple variable to a simple value, or showing the actual current value which would be used if needed.

My opinion is that some of what you are suggesting could have participated to an alternate and interesting output-parameters-for-psql design, but we are a much too late to change that.

The purpose of the patch is just to enable having clean CSV quite easily from psql, possibly for pg11... Changing the design and the underlying user visible behavior would require a much wider and difficult to obtain consensus, and is very unlikely to get in for pg11, if ever.

The current version allows "--csv" or need two "\pset" to achieve CSV, without changing the preceding behavior, however unperfect it is...

Could you live with that with the benefit of getting it in? I do not claim it is a perfect solution, just that it is a reasonable one.

The dynamic default changing approach departs too much for the current user expectations, the user-benefit is not worth it, and committers are very likely to think like that.

The only no-surprise, no-behavioral-change, alternative I see is to have a csv-specific fieldsep. I'm not keen on that one because it is yet another variable, one has to remember it, and then it asks the question about recordsep... and I think there are already too many parameters, and more is not desirable, although I probably could live with that if I can live with "fieldsep_zero":-)

I don't share your opinion so introduction csv-specific fieldsep is surprise less. Current design is wrong - this thread is a evidence. And if we introduce csv-specific fieldsep, then we multiply this wrong design. The fix in this direction is renaming fieldsep to fieldsep-unaliagn - but it is probably too big change too. So this design is nothing what I can mark as good solution. I can live with because it is much better than using pipe as separator for csv, and because real impact is small - for almost people it will be invisible - but have not good feeling from it. Are there some possible alternatives?

Regards

Pavel

--
Fabien.

Re: Re: csv format for psql

From

Fabien COELHO

Date:

28 March 2018, 11:24:07

Hello Pavel,

>> I'd like to convince you to compromise, because otherwise I'm afraid we
>> will not get this feature.
>
> [...]
>>
>> The only no-surprise, no-behavioral-change, alternative I see is to have a
>> csv-specific fieldsep. I'm not keen on that one because it is yet another
>> variable, one has to remember it, and then it asks the question about
>> recordsep... and I think there are already too many parameters, and more is
>> not desirable, although I probably could live with that if I can live with
>> "fieldsep_zero":-)
>>
>>
> I don't share your opinion so introduction csv-specific fieldsep is
> surprise less. Current design is wrong - this thread is a evidence.

Wrong is too strong a word. Current design is not perfect, sure. Proposed 
alternatives are somehow worse, at least to some people mind, which 
explains why we arrived on this version after a few iterations.

> And if we introduce csv-specific fieldsep, then we multiply this wrong 
> design. The fix in this direction is renaming fieldsep to 
> fieldsep-unaliagn - but it is probably too big change too. So this 
> design is nothing what I can mark as good solution.

Good, we somehow agree on something!

> I can live with because it is much better than using pipe as separator 
> for csv, and because real impact is small - for almost people it will be 
> invisible - but have not good feeling from it.

> Are there some possible alternatives?

Given the date and the fact that the cf end is 3 days away, the proposed 
short term alternative is Daniel's version, that I feel is reasonable. Ok, 
people have to do two pset to get comma-separated csv, otherwise they get 
pipe-separated csv in one pset.

You could compromise on that for now, and submit an improvement patch for 
a later version if you wish.

Otherwise, ISTM that the current result of this thread is that there will 
be no simple CSV in pg11:-(

Or maybe I can mark Daniel's latest version as "ready" and point out that 
there is some disagreement on the thread, so it is not a full consensus. 
Then a committer can decide whether it is better in like that or that it 
should be put back in some design stage, possibly with their preference 
wrt to the kind of solution they think best.

-- 
Fabien.

Re: Re: csv format for psql

From

Pavel Stehule

Date:

28 March 2018, 19:42:34

2018-03-28 10:24 GMT+02:00 Fabien COELHO <coelho@cri.ensmp.fr>:

Hello Pavel,

I'd like to convince you to compromise, because otherwise I'm afraid we
will not get this feature.

[...]

The only no-surprise, no-behavioral-change, alternative I see is to have a
csv-specific fieldsep. I'm not keen on that one because it is yet another
variable, one has to remember it, and then it asks the question about
recordsep... and I think there are already too many parameters, and more is
not desirable, although I probably could live with that if I can live with
"fieldsep_zero":-)

I don't share your opinion so introduction csv-specific fieldsep is
surprise less. Current design is wrong - this thread is a evidence.

Wrong is too strong a word. Current design is not perfect, sure. Proposed alternatives are somehow worse, at least to some people mind, which explains why we arrived on this version after a few iterations.

And if we introduce csv-specific fieldsep, then we multiply this wrong design. The fix in this direction is renaming fieldsep to fieldsep-unaliagn - but it is probably too big change too. So this design is nothing what I can mark as good solution.

Good, we somehow agree on something!

I can live with because it is much better than using pipe as separator for csv, and because real impact is small - for almost people it will be invisible - but have not good feeling from it.

Are there some possible alternatives?

Given the date and the fact that the cf end is 3 days away, the proposed short term alternative is Daniel's version, that I feel is reasonable. Ok, people have to do two pset to get comma-separated csv, otherwise they get pipe-separated csv in one pset.

I dislike the last Daniel's design. I am sorry.

You could compromise on that for now, and submit an improvement patch for a later version if you wish.

I am able to accept csv specific field sep independent on unaligned field sep.

I have not any other idea. And there is not any good (acceptable) ideas. It is hard to expect so there will be change next year. This space is small, and there are not too much possible variants. We checked all possibility without any agreement.

Otherwise, ISTM that the current result of this thread is that there will be no simple CSV in pg11:-(

Can be. If there is not good enough solution now. If we merge it now, then will be hard to change it due possible compatibility issues.

Or maybe I can mark Daniel's latest version as "ready" and point out that there is some disagreement on the thread, so it is not a full consensus. Then a committer can decide whether it is better in like that or that it should be put back in some design stage, possibly with their preference wrt to the kind of solution they think best.

You can do it. But from my view the Daniel's latest version (with respect to his work) is the worst variant :(. So I am against to commit this version.

Regards

Pavel

--
Fabien.

Re: csv format for psql

From

"David G. Johnston"

Date:

28 March 2018, 22:19:28

On Wednesday, March 28, 2018, Pavel Stehule <pavel.stehule@gmail.com> wrote:

Are there some possible alternatives?

Given the date and the fact that the cf end is 3 days away, the proposed short term alternative is Daniel's version, that I feel is reasonable. Ok, people have to do two pset to get comma-separated csv, otherwise they get pipe-separated csv in one pset.

Could someone post how captions, rows-only, and footer pset settings factor into this? Specifically are they fixed to on/off or will they hide/show if users request them explicitly?

My take on this is that --csv mode is/should be an alternate output mode from the existing pset controlled one, and functions basically like "\copy to stdout" and all echoing and metadata outputs are disabled and only query results, with header and the user specified delimiter, are output. No control other than the delimiter seems to be provided in the current design but that could be expanded upon. In that specification the existing fieldsep argument that is tied to pset should not be used and something like --csv-fieldsep should be provided (I like the prefixing to tie the option lexically to the master --csv option).

David J.

Re: csv format for psql

From

"David G. Johnston"

Date:

28 March 2018, 22:33:29

On Wednesday, March 28, 2018, David G. Johnston <david.g.johnston@gmail.com> wrote:

On Wednesday, March 28, 2018, Pavel Stehule <pavel.stehule@gmail.com> wrote:

Are there some possible alternatives?

Given the date and the fact that the cf end is 3 days away, the proposed short term alternative is Daniel's version, that I feel is reasonable. Ok, people have to do two pset to get comma-separated csv, otherwise they get pipe-separated csv in one pset.

Could someone post how captions, rows-only, and footer pset settings factor into this? Specifically are they fixed to on/off or will they hide/show if users request them explicitly?

I found the original post that covers that indeed we simply fix these values, which is why I was thinking.

and something like --csv-fieldsep should be provided (I like the prefixing to tie the option lexically to the master --csv option).

Or, really, just make --csv take an optional argument which, if present, is the delimiter. No separate argument needed, and we ignore the pset fieldsep argument like we ignore everything else.

Need to decide whether to "not ignore" --tuples-only, which doesn't seem controversial aside from being a pset argument that isn't being ignored in this design...

David J.

Re: csv format for psql

From

"David G. Johnston"

Date:

28 March 2018, 22:35:28

On Monday, March 26, 2018, Daniel Verite <daniel@manitou-mail.org> wrote:

We could even support only the comma and make it non-configurable
based on the fact it's Comma-Separated-Values, not
Whatever-Separated-Values, except that won't do much
to serve the users interests, as the reality is that
people use various separators.

I like to call it "Character Separated Values" now for just that reason.

David J.

Re: csv format for psql

From

"Joshua D. Drake"

Date:

28 March 2018, 22:43:07

On 03/28/2018 12:35 PM, David G. Johnston wrote:

On Monday, March 26, 2018, Daniel Verite <daniel@manitou-mail.org> wrote:

We could even support only the comma and make it non-configurable
based on the fact it's Comma-Separated-Values, not
Whatever-Separated-Values, except that won't do much
to serve the users interests, as the reality is that
people use various separators.

I like to call it "Character Separated Values" now for just that reason.

Isn't the actual wording Character Delimited Values? I may be picking at hairs here but every single time I use anything to import a CSV or other delimited file (TAB or | usually) that is what the import screen says.

JD

David J.

-- 
Command Prompt, Inc. || http://the.postgres.company/ || @cmdpromptinc
***  A fault and talent of mine is to tell it exactly how it is.  ***
PostgreSQL centered full stack support, consulting and development. 
Advocate: @amplifypostgres || Learn: https://postgresconf.org
*****     Unless otherwise stated, opinions are my own.   *****

Re: csv format for psql

From

"David G. Johnston"

Date:

28 March 2018, 23:32:46

On Wednesday, March 28, 2018, Fabien COELHO <coelho@cri.ensmp.fr> wrote:

And if we introduce csv-specific fieldsep, then we multiply this wrong design. The fix in this direction is renaming fieldsep to fieldsep-unaliagn - but it is probably too big change too. So this design is nothing what I can mark as good solution.

Good, we somehow agree on something!

I can live with because it is much better than using pipe as separator for csv, and because real impact is small - for almost people it will be invisible - but have not good feeling from it.

Concretely...I'm in favor of the "\pset fieldsep_csv ," solution and csv format should always use its existing value. Teach \pset fieldsep to fail if the current format is csv. Being able to specify the csv fieldsep like "\pset format csv ;" would be a plus.

Unaligned format could grow its own fieldsep if it wanted to but it can also just use the default provided fieldsep var whose default value is pipe. If it did grow one it would need to understand "not set" in order to preserve existing behavior. We don't have that problem with csv.

I don't believe we can modify fieldsep without causing unwanted grief.

David J.

Re: csv format for psql

From

Isaac Morland

Date:

29 March 2018, 02:19:43

On 28 March 2018 at 15:43, Joshua D. Drake <jd@commandprompt.com> wrote:

On 03/28/2018 12:35 PM, David G. Johnston wrote:
I like to call it "Character Separated Values" now for just that reason.

Isn't the actual wording Character Delimited Values? I may be picking at hairs here but every single time I use anything to import a CSV or other delimited file (TAB or | usually) that is what the import screen says.

CSV stands for Comma Separated Values, and not anything else common as far as I can tell. A Google search for "csv" turns up the Wikipedia page describing the file format as the first hit, followed by the Wikipedia disambiguation page for CSV, which links to the aforementioned Wikipedia page as the only file-format-related link.

The actual reason I'm posting this is because some of the discussion has made me a bit confused: there is already a CSV format defined for the COPY command and used by the psql \copy. I just want to check that what is being discussed here would use the exact same format as the existing CSV COPY format; and the configurability of them should be the exact same options (which already includes being able to change the delimiter). I think it's important that Postgres not have two CSVs with slightly different behaviours. Indeed, psql could use COPY behind the scenes to generate the CSV output, which would guarantee no nasty surprises.

If this is already widely accepted or if I'm horribly misunderstanding the discussion then I'm sorry for the noise.

Re: csv format for psql

From

"David G. Johnston"

Date:

29 March 2018, 02:38:51

On Wed, Mar 28, 2018 at 4:19 PM, Isaac Morland <isaac.morland@gmail.com> wrote:

On 28 March 2018 at 15:43, Joshua D. Drake <jd@commandprompt.com> wrote:
On 03/28/2018 12:35 PM, David G. Johnston wrote:
I like to call it "Character Separated Values" now for just that reason.

Isn't the actual wording Character Delimited Values? I may be picking at hairs here but every single time I use anything to import a CSV or other delimited file (TAB or | usually) that is what the import screen says.

CSV stands for Comma Separated Values, and not anything else common as far as I can tell.

Language evolves - Wikipedia just hasn't recognized this particular evolution yet :)

The actual reason I'm posting this is because some of the discussion has made me a bit confused: there is already a CSV format defined for the COPY command and used by the psql \copy. I just want to check that what is being discussed here would use the exact same format as the existing CSV COPY format; and the configurability of them should be the exact same options (which already includes being able to change the delimiter).

Nope, the \copy meta-command, and corresponding COPY SQL command, have additional options that I don't believe will be accessible here. Specifically the "OIDS", "QUOTE", and various "FORCE_" options are not controllable, nor are ESCAPE and ENCODING (this last I think, or at least not as flexible).

I think it's important that Postgres not have two CSVs with slightly different behaviours. Indeed, psql could use COPY behind the scenes to generate the CSV output, which would guarantee no nasty surprises.

If this is already widely accepted or if I'm horribly misunderstanding the discussion then I'm sorry for the noise.

I agree this is a possible approach but the way the thinking is presently is that this is just another "\pset format" option with forced defaults excepting the delimiter (DELIMITER),presence of the column names (HEADER), NULL (I think...)

Issuing \copy with the equivalent settings should result in the same output...but the two mechanisms are distinct for better and worse.

David J.

Re: csv format for psql

From

"Daniel Verite"

Date:

29 March 2018, 14:53:55

    David G. Johnston wrote:

> Could someone post how captions, rows-only, and footer pset settings factor
> into this?  Specifically are they fixed to on/off or will they hide/show if
> users request them explicitly?

This is described in the doc with:

+       <para><literal>csv</literal> format writes columns separated
+       by <literal>fieldsep</literal>, applying the CSV quoting rules
+       described in RFC-4180 and compatible with the CSV format
+       of the <command>COPY</command> command.
+       The header with column names is output unless the
+       <literal>tuples_only</literal> parameter is <literal>on</literal>.
+       Title and footers are not printed.
+       </para>


Best regards,
--
Daniel Vérité
PostgreSQL-powered mailer: http://www.manitou-mail.org
Twitter: @DanielVerite

Re: csv format for psql

From

"Daniel Verite"

Date:

29 March 2018, 15:17:38

    David G. Johnston wrote:

> Or, really, just make --csv take an optional argument which, if present, is
> the delimiter.

I don't think psql can support optional arguments because
psql --option foo
would be ambiguous, since foo could be the option's value or
the name of a database to connect to.

Best regards,
--
Daniel Vérité
PostgreSQL-powered mailer: http://www.manitou-mail.org
Twitter: @DanielVerite

Re: csv format for psql

From

Pavel Stehule

Date:

29 March 2018, 15:19:46

2018-03-29 14:17 GMT+02:00 Daniel Verite <daniel@manitou-mail.org>:

David G. Johnston wrote:

> Or, really, just make --csv take an optional argument which, if present, is
> the delimiter.

I don't think psql can support optional arguments because
psql --option foo
would be ambiguous, since foo could be the option's value or
the name of a database to connect to.

sure

Pavel

Best regards,
--
Daniel Vérité
PostgreSQL-powered mailer: http://www.manitou-mail.org
Twitter: @DanielVerite

Re: csv format for psql

From

"Daniel Verite"

Date:

29 March 2018, 15:42:01

    Isaac Morland wrote:

> The actual reason I'm posting this is because some of the discussion has
> made me a bit confused: there is already a CSV format defined for the COPY
> command and used by the psql \copy. I just want to check that what is being
> discussed here would use the exact same format as the existing CSV COPY

Please see the start of the thread at
https://www.postgresql.org/message-id/a8de371e-006f-4f92-ab72-2bbe3ee78f03%40manitou-mail.org
where what is proposed is compared to what \copy and COPY
already provide.

About the CSV acronym itself, the COPY documentation
https://www.postgresql.org/docs/current/static/sql-copy.html
already refers to it as "Comma Separated Values", even though
as we know, the delimiter is user-configurable.
There's no difference in psql that would justify a different wording.

Best regards,
--
Daniel Vérité
PostgreSQL-powered mailer: http://www.manitou-mail.org
Twitter: @DanielVerite

Re: csv format for psql

From

"Daniel Verite"

Date:

29 March 2018, 17:30:00

    David G. Johnston wrote:

> Unaligned format could grow its own fieldsep if it wanted to but it can
> also just use the default provided fieldsep var whose default value is
> pipe.  If it did grow one it would need to understand "not set" in order to
> preserve existing behavior.  We don't have that problem with csv.

fielsep is already owned by the unaligned format. No other format
uses fieldsep currently. Why would it needs to grow its own?
I don't quite see what you mean here.

For csv, Fabien and Peter expressed the opinion that we shouldn't
create another fieldsep-like variable specifically for it, but instead
reuse fieldsep. That's what my latest patch does.

Now it turns out that sharing fieldsep comes with some problems.

1. since a single variable can't have two different default values,
we have either to accept '|' as a default csv separator, or introduce
some tricks to automagically commute the value.

2. when a user does \pset fieldsep ';' there are doubts
on whether this should affect the current mode only, or both,
or even the other one when the user goes back and forth between
formats. Any proposal where a \pset var value affects by
side-effect another pset variable is inconsistent with what has been
done until now with these variables, and I expect that
side-effects are not exactly sought after in general.

3. the command-line cannot express: use character 'X' for unaligned
and 'Y' for csv for the rest of the session. The current patch provides
'--csv' to select both the csv format and a comma separator, but
when combined with -F, the end result depend on the relative position
of the options, which may feel unintuitive or buggy.
Again we could tweak that, but only by being inconsistent
with how we process other options.

Personally I think the benefit of sharing fieldsep is not worth these
problems, but I'm waiting for the discussion to continue with
more opinions.

Best regards,
--
Daniel Vérité
PostgreSQL-powered mailer: http://www.manitou-mail.org
Twitter: @DanielVerite

Re: csv format for psql

From

"David G. Johnston"

Date:

29 March 2018, 18:08:49

On Thu, Mar 29, 2018 at 7:30 AM, Daniel Verite <daniel@manitou-mail.org> wrote:

David G. Johnston wrote:

> Unaligned format could grow its own fieldsep if it wanted to but it can
> also just use the default provided fieldsep var whose default value is
> pipe. If it did grow one it would need to understand "not set" in order to
> preserve existing behavior. We don't have that problem with csv.

fielsep is already owned by the unaligned format. No other format
uses fieldsep currently. Why would it needs to grow its own?
I don't quite see what you mean here.

Its mostly a terminology concern...

For csv, Fabien and Peter expressed the opinion that we shouldn't
create another fieldsep-like variable specifically for it, but instead
reuse fieldsep. That's what my latest patch does.

Now it turns out that sharing fieldsep comes with some problems.
[...]

Personally I think the benefit of sharing fieldsep is not worth these
problems, but I'm waiting for the discussion to continue with
more opinions.

Basically that because "fieldsep" lacks any kind of suffix it can be considered "unowned" versus "fieldsep_csv" which would be "owned" by the "csv" format. Given the problems I do want an "owned by csv" variable. An owned version for unaligned would then be named "fieldsep_unaligned". But we wouldn't want to rename fieldsep, we'd simply leave it in place as a generic fieldsep variable that formats can choose to use if they wish - with the limitations that come with sharing it with unaligned. Decent odds that no other format would want to but we cannot remove it so we might as well better describe its place now that two formats care about field separation.

fieldsep is "shared" even though at present no one else is sharing it. fieldsep_csv is not shared. fieldsep_unaligned doesn't exist because unaligned uses the shared variable. No one else uses fieldsep, shared or owned.

Saying the "fieldsep" is owned by unaligned mode would be technically correct, though there is no "mark" that makes is readily obvious.

Implementation wise "ownership" could mean that changing the delimiter can only occur while the format type matches. But we could not do that for fieldsep since no such ordering currently is enforced.

David J.

Re: csv format for psql

From

"David G. Johnston"

Date:

30 March 2018, 01:15:07

On Thu, Mar 29, 2018 at 7:30 AM, Daniel Verite <daniel@manitou-mail.org> wrote:

Personally I think the benefit of sharing fieldsep is not worth these
problems, but I'm waiting for the discussion to continue with
more opinions.

Apologies in advance if I mis-represent someone's position.

It seems like having a dedicated option is the consensus opinion. Daniel (the original author) and I both prefer it, Pavel will accept it. Fabien is opposed.

Peter E. was opposed, wanting to leverage both fieldsep and recordsep, but hasn't chimed in recently. His opinion at this point might push this over the edge since he is also a committer.

I would probably suggest maybe just calling it "\pset separator" to match the "S" in "cSv" and not have any name overlap with the fieldsep variable used with unaligned mode. The user will have to learn anything and being more distinct should help the process. We would not consult recordsep though the end-of-line choice should be platform dependent.

David J.

Re: csv format for psql

From

Pavel Stehule

Date:

30 March 2018, 12:45:16

2018-03-30 0:15 GMT+02:00 David G. Johnston <david.g.johnston@gmail.com>:

On Thu, Mar 29, 2018 at 7:30 AM, Daniel Verite <daniel@manitou-mail.org> wrote:
Personally I think the benefit of sharing fieldsep is not worth these
problems, but I'm waiting for the discussion to continue with
more opinions.

Apologies in advance if I mis-represent someone's position.

It seems like having a dedicated option is the consensus opinion. Daniel (the original author) and I both prefer it, Pavel will accept it. Fabien is opposed.

Peter E. was opposed, wanting to leverage both fieldsep and recordsep, but hasn't chimed in recently. His opinion at this point might push this over the edge since he is also a committer.

I would probably suggest maybe just calling it "\pset separator" to match the "S" in "cSv" and not have any name overlap with the fieldsep variable used with unaligned mode. The user will have to learn anything and being more distinct should help the process. We would not consult recordsep though the end-of-line choice should be platform dependent.

-1. The difference between fieldsep and separator is not clear, and the relation between separator and csv is not clean too.

fieldsep_csv or csv_fieldsep is not nice, but it is clear.

Regards

Pavel

David J.

Re: csv format for psql

From

Fabien COELHO

Date:

31 March 2018, 11:33:38

Bonjour Daniel,

> For csv, Fabien and Peter expressed the opinion that we shouldn't
> create another fieldsep-like variable specifically for it, but instead
> reuse fieldsep. That's what my latest patch does.
>
> Now it turns out that sharing fieldsep comes with some problems.

Personnaly I do not have any problem with CSV defaulting to '|' separator, 
given that anyway people often use anything but a comma for the purpose, 
including '|'.

However Pavel wants to block the patch on this point. Too bad.

Maybe you can do (again) a patch with a 'fieldsep_csv' or whatever 
additional variable? Not sure about 'recordsep'...

And then we forward to committers both version and they can chose 
whatever they think best, including not committing anything?

-- 
Fabien.

Re: csv format for psql

From

Gavin Flower

Date:

31 March 2018, 11:44:29

On 31/03/18 21:33, Fabien COELHO wrote:
>
> Bonjour Daniel,
>
>> For csv, Fabien and Peter expressed the opinion that we shouldn't
>> create another fieldsep-like variable specifically for it, but instead
>> reuse fieldsep. That's what my latest patch does.
>>
>> Now it turns out that sharing fieldsep comes with some problems.
>
> Personnaly I do not have any problem with CSV defaulting to '|' 
> separator, given that anyway people often use anything but a comma for 
> the purpose, including '|'.
>
> However Pavel wants to block the patch on this point. Too bad.
>
> Maybe you can do (again) a patch with a 'fieldsep_csv' or whatever 
> additional variable? Not sure about 'recordsep'...
>
> And then we forward to committers both version and they can chose 
> whatever they think best, including not committing anything?
>
Would suggest the choice of 'fieldsep_csv' out of those two (a record 
would correspond to the whole row), and the default being a comma.


Cheers,
Gavin

Re: csv format for psql

From

Isaac Morland

Date:

31 March 2018, 18:22:43

On 31 March 2018 at 04:33, Fabien COELHO <coelho@cri.ensmp.fr> wrote:

Bonjour Daniel,

For csv, Fabien and Peter expressed the opinion that we shouldn't
create another fieldsep-like variable specifically for it, but instead
reuse fieldsep. That's what my latest patch does.

Now it turns out that sharing fieldsep comes with some problems.

Personnaly I do not have any problem with CSV defaulting to '|' separator, given that anyway people often use anything but a comma for the purpose, including '|'.

However Pavel wants to block the patch on this point. Too bad.

OK, mostly trying to avoid commenting because I doubt I have much to add. But. If I ask for CSV and don't specify any overrides, I expect to get "C"omma separated values, not some other character. More specifically, if I say --csv I expect to get files that are identical with what I would get if I used COPY ... CSV. Actually, COPY ... CSV HEADER, given that psql shows column headings. This also implies that I expect the quoting and related details that are associated with CSV.

And I don't think I'm a weird user. If --csv does anything even a little different from a simple COPY invocation on the same query, at some point it's going to bite somebody and they will rightfully curse the design decisions taken in this thread.

Re: csv format for psql

From

Fabien COELHO

Date:

01 April 2018, 09:30:17

Hello Isaac,

>> Personnaly I do not have any problem with CSV defaulting to '|' separator,
>> given that anyway people often use anything but a comma for the purpose,
>> including '|'.
>>
>> However Pavel wants to block the patch on this point. Too bad.
>
> OK, mostly trying to avoid commenting because I doubt I have much to 
> add. But. If I ask for CSV and don't specify any overrides, I expect to 
> get "C"omma separated values, not some other character. More 
> specifically, if I say --csv I expect to get files that are identical 
> with what I would get if I used COPY ... CSV.

My summary was incomplete. The --csv option implementation by Daniel 
already does that.

The issue Pavel is complaining about is that in interactive mode "\pset 
format csv" does not do the same: it triggers the csv-rule string-escaping 
mechanism, but does not reset the "fieldsep" variable (eh, it sets the 
"format" variable) so the default separator under this interactive use is 
"|" if the "fieldsep" variable is shared.

I have suggested a "\csv" interactive command to set both as a convenient 
shorthand for "\pset format csv & \pset fieldsep ','", similarly to --csv, 
but Pavel still wants "\pset format csv" to trigger the full csv output.

A consequence I forgot about adding "fieldsep_csv", is that it probably 
has to duplicate the "_zero" ugly hack to be consistent with existing 
"*sep" variables, or else be inconsistent. Sigh.

-- 
Fabien.

Re: csv format for psql

From

Pavel Stehule

Date:

01 April 2018, 10:27:24

2018-04-01 8:30 GMT+02:00 Fabien COELHO <coelho@cri.ensmp.fr>:

Hello Isaac,

Personnaly I do not have any problem with CSV defaulting to '|' separator,
given that anyway people often use anything but a comma for the purpose,
including '|'.

However Pavel wants to block the patch on this point. Too bad.

OK, mostly trying to avoid commenting because I doubt I have much to add. But. If I ask for CSV and don't specify any overrides, I expect to get "C"omma separated values, not some other character. More specifically, if I say --csv I expect to get files that are identical with what I would get if I used COPY ... CSV.

My summary was incomplete. The --csv option implementation by Daniel already does that.

The issue Pavel is complaining about is that in interactive mode "\pset format csv" does not do the same: it triggers the csv-rule string-escaping mechanism, but does not reset the "fieldsep" variable (eh, it sets the "format" variable) so the default separator under this interactive use is "|" if the "fieldsep" variable is shared.

I have suggested a "\csv" interactive command to set both as a convenient shorthand for "\pset format csv & \pset fieldsep ','", similarly to --csv, but Pavel still wants "\pset format csv" to trigger the full csv output.

A consequence I forgot about adding "fieldsep_csv", is that it probably has to duplicate the "_zero" ugly hack to be consistent with existing "*sep" variables, or else be inconsistent. Sigh.

there are not any relation between fieldsep_csv and fieldsep_zero.

The base of this issue is in wrong concept of fieldsep. It is based on idea so one value can be good enough for all formats. It is worked when all formats that used this variable is ONE format - unaligned.

similar issue is with tuplesonly option - see Isaac's mail.

For different formats the different defaults are more consistent - the other formats does it - doesn't use fieldsep - but it lost internal consistency, because its does quietly.

We can have different ideas about users's expectations - but we should to share opinion, so correct solution should be consistent. If I choose some format, then behave should be same without dependency on activation mechanism.

so \csv should be same like \pset format csv or --csv.

I don't share option so CSV format should be exactly same like CSV COPY. COPY is designed for backups - and header is not too important there. When I seen some csv, then there usually header was used.

If I can search some precedent, then it is \pset linestyle and \pset unicode_*** options.

so we can to have

\pset format xxx

and set of local possibly changed defaults

\pset csv_fieldsep ,

\pset csv_tuplesonly on

\pset unaligned_fieldsep |

\pset unaligned_tuplesonly off

the fieldsep, tuplesonly can be just links to currently used ****_fieldsep and ****_tuplesonly

This is consistent - defaults can be correct for Isaac, and I can paste to my .psqlrc

\pset csv_tuplesonly off

and the semantic is clean, and this option will be active just for csv, and doesn't depend on current format - so it can be used in .psqlrc

When it is works well for linestyle, then it can works for some other too

Regards

Pavel

--
Fabien.

Re: csv format for psql

From

Peter Eisentraut

Date:

06 April 2018, 06:08:34

On 4/1/18 03:27, Pavel Stehule wrote:
> I don't share option so CSV format should be exactly same like CSV COPY.
> COPY is designed for backups - and header is not too important there.
> When I seen some csv, then there usually header was used.

I think in practice a lot of people use COPY also because it's a nice
way to get CSV output, even if it's not for backups.  The options that
COPY has for CSV are clearly designed around making the output
compatible with various CSV-variants.  If we create another way to do
CSV in psql, it would be totally confusing if some of the options are
the same and some are different or missing.

Another thought: Isn't CSV just the same as unaligned output plus some
quoting?  Could we add a quote character setting and then define --csv
to be quote-character = " and fieldsep = , ?

-- 
Peter Eisentraut              http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Re: csv format for psql

From

"Jonathan S. Katz"

Date:

06 April 2018, 06:46:34

> On Apr 5, 2018, at 11:08 PM, Peter Eisentraut <peter.eisentraut@2ndquadrant.com> wrote:
>
> On 4/1/18 03:27, Pavel Stehule wrote:
>> I don't share option so CSV format should be exactly same like CSV COPY.
>> COPY is designed for backups - and header is not too important there.
>> When I seen some csv, then there usually header was used.
>
> I think in practice a lot of people use COPY also because it's a nice
> way to get CSV output, even if it's not for backups.  The options that
> COPY has for CSV are clearly designed around making the output
> compatible with various CSV-variants.

+1

From a user standpoint this was mostly how I use COPY.  Someone
requests a report that they can manipulate in $SPREADSHEET.  I write
a query, place it inside a “COPY” statement with FORMAT CSV,
HEADER TRUE, save to file, and deliver.

> Another thought: Isn't CSV just the same as unaligned output plus some
> quoting?  Could we add a quote character setting and then define --csv
> to be quote-character = " and fieldsep = , ?

This makes a lot of sense. I’ve also generated CSV files using a
combination of:

    \f ,
    \a
    \o file.csv

and then running the query, but if any of the fields contained a “,” if would
inevitably break in $SPREADSHEET.

Jonathan

Re: csv format for psql

From

Pavel Stehule

Date:

06 April 2018, 08:38:15

2018-04-06 5:46 GMT+02:00 Jonathan S. Katz <jkatz@postgresql.org>:

> On Apr 5, 2018, at 11:08 PM, Peter Eisentraut <peter.eisentraut@2ndquadrant.com> wrote:
>
> On 4/1/18 03:27, Pavel Stehule wrote:
>> I don't share option so CSV format should be exactly same like CSV COPY.
>> COPY is designed for backups - and header is not too important there.
>> When I seen some csv, then there usually header was used.
>
> I think in practice a lot of people use COPY also because it's a nice
> way to get CSV output, even if it's not for backups. The options that
> COPY has for CSV are clearly designed around making the output
> compatible with various CSV-variants.

+1

From a user standpoint this was mostly how I use COPY. Someone
requests a report that they can manipulate in $SPREADSHEET. I write
a query, place it inside a “COPY” statement with FORMAT CSV,
HEADER TRUE, save to file, and deliver.

> Another thought: Isn't CSV just the same as unaligned output plus some
> quoting? Could we add a quote character setting and then define --csv
> to be quote-character = " and fieldsep = , ?

This makes a lot of sense. I’ve also generated CSV files using a
combination of:

\f ,
\a
\o file.csv

and then running the query, but if any of the fields contained a “,” if would
inevitably break in $SPREADSHEET.

unfortunately, most used CSV separator in Czech Repuplic is ; due MS Excel settings for CR

Regards

Pavel

Jonathan

Re: csv format for psql

From

"Jonathan S. Katz"

Date:

06 April 2018, 15:43:59

Hi Pavel,

On Apr 6, 2018, at 1:38 AM, Pavel Stehule <pavel.stehule@gmail.com> wrote:

2018-04-06 5:46 GMT+02:00 Jonathan S. Katz <jkatz@postgresql.org>:

> On Apr 5, 2018, at 11:08 PM, Peter Eisentraut <peter.eisentraut@2ndquadrant.com> wrote:
>
> On 4/1/18 03:27, Pavel Stehule wrote:
>> I don't share option so CSV format should be exactly same like CSV COPY.
>> COPY is designed for backups - and header is not too important there.
>> When I seen some csv, then there usually header was used.
>
> I think in practice a lot of people use COPY also because it's a nice
> way to get CSV output, even if it's not for backups. The options that
> COPY has for CSV are clearly designed around making the output
> compatible with various CSV-variants.

+1

From a user standpoint this was mostly how I use COPY. Someone
requests a report that they can manipulate in $SPREADSHEET. I write
a query, place it inside a “COPY” statement with FORMAT CSV,
HEADER TRUE, save to file, and deliver.

> Another thought: Isn't CSV just the same as unaligned output plus some
> quoting? Could we add a quote character setting and then define --csv
> to be quote-character = " and fieldsep = , ?

This makes a lot of sense. I’ve also generated CSV files using a
combination of:

  \f ,
  \a
  \o file.csv

and then running the query, but if any of the fields contained a “,” if would
inevitably break in $SPREADSHEET.

unfortunately, most used CSV separator in Czech Repuplic is ; due MS Excel settings for CR

Sure. So in that case with the above, I would have used "\f ;” or

DELIMITER ‘;’ so it sounds like Peter’s suggestion is still valid.

Jonathan

Re: csv format for psql

From

"Daniel Verite"

Date:

07 April 2018, 14:55:28

    Pavel Stehule wrote:

> so we can to have
>
> \pset format xxx
>
> and set of local possibly changed defaults
>
> \pset csv_fieldsep ,
> \pset csv_tuplesonly on
> \pset unaligned_fieldsep |
> \pset unaligned_tuplesonly off

tuples_only (\t) is a single setting that is currently used by all
formats. It makes sense as it is and I don't quite see what we
would gain by "exploding" it.
There's also "footer" that goes in tandem with "tuples_only",
to switch off the footer while keeping the header and title.
Whatever change is considered to "tuples_only", "footer" must
be considered with it.
For the csv format, tuples_only=off is interpreted as "output the header"
and tuples_only=on as "don't output the header". This is consistent
with what other formats do.

Best regards,
--
Daniel Vérité
PostgreSQL-powered mailer: http://www.manitou-mail.org
Twitter: @DanielVerite

Re: csv format for psql

From

Pavel Stehule

Date:

07 April 2018, 15:00:52

2018-04-07 13:55 GMT+02:00 Daniel Verite <daniel@manitou-mail.org>:

Pavel Stehule wrote:

> so we can to have
>
> \pset format xxx
>
> and set of local possibly changed defaults
>
> \pset csv_fieldsep ,
> \pset csv_tuplesonly on
> \pset unaligned_fieldsep |
> \pset unaligned_tuplesonly off

tuples_only (\t) is a single setting that is currently used by all
formats. It makes sense as it is and I don't quite see what we
would gain by "exploding" it.

It was a example, how the one default can be not good enough.

Usually we expect in align, unalign headers by default. But somebody can expect tuples only by default for CSV format.

There's also "footer" that goes in tandem with "tuples_only",
to switch off the footer while keeping the header and title.
Whatever change is considered to "tuples_only", "footer" must
be considered with it.
For the csv format, tuples_only=off is interpreted as "output the header"
and tuples_only=on as "don't output the header". This is consistent
with what other formats do.

My note was not about the implementations, it was about different expectations of some users - looks on Isaac's mail, pls.

Best regards,
--
Daniel Vérité
PostgreSQL-powered mailer: http://www.manitou-mail.org
Twitter: @DanielVerite

Re: csv format for psql

From

"Daniel Verite"

Date:

07 April 2018, 15:23:49

    Peter Eisentraut wrote:

> Another thought: Isn't CSV just the same as unaligned output plus some
> quoting?  Could we add a quote character setting and then define --csv
> to be quote-character = " and fieldsep = , ?

Plus footer set to off. So --csv would be like
\pset format unaligned
\pset fieldsep ','
\pset footer off
\pset unaligned_quote_character '"'

I guess we'd add a \csv command that does these together.

There would still be some discomfort with some of the issues
discussed upthread. For instance
  psql -F ';' --csv
is likely to reset fieldsep to ',' having then a different outcome than
  psql --csv -F';'

And there's the point on how to come back from \csv to another
output, given that it has overwritten "footer" that is used across
all formats, and fieldsep used by unaligned.
So a user might need to figure out anyway the
intricaties behind \csv, if it's not an independant format.

On the whole I'm inclined to resubmit the patch with
fieldsep_csv and some minor changes based on the rest
of the discussion.

Best regards,
--
Daniel Vérité
PostgreSQL-powered mailer: http://www.manitou-mail.org
Twitter: @DanielVerite

Re: csv format for psql

From

"Daniel Verite"

Date:

07 April 2018, 16:24:21

    Isaac Morland wrote:

> OK, mostly trying to avoid commenting because I doubt I have much to add.
> But. If I ask for CSV and don't specify any overrides, I expect to get
> "C"omma separated values, not some other character. More specifically, if I
> say --csv I expect to get files that are identical with what I would get if
> I used COPY ... CSV. Actually, COPY ... CSV HEADER, given that psql shows
> column headings. This also implies that I expect the quoting and related
> details that are associated with CSV.

Yes, this is what the current patch does. Only the first version
used the same default delimiter as the unaligned mode, but
nobody liked that part.
As for the header, it's driven by how it's already done for other
formats in psql, with the tuples_only setting, which implies
that column names are output by default as you mention.
Both settings must be changeable, the exact details of how/when
is what's being discussed.

The output might still differ compared to COPY in that line endings
depend on the client-side OS. There's also the minor issue
of a single \ by itself on a line, which gets quoted by COPY
and not by psql.

Best regards,
--
Daniel Vérité
PostgreSQL-powered mailer: http://www.manitou-mail.org
Twitter: @DanielVerite

Re: csv format for psql

From

"Daniel Verite"

Date:

07 April 2018, 16:48:38

    Daniel Verite wrote:

> The output might still differ compared to COPY in that line endings
> depend on the client-side OS. There's also the minor issue
> of a single \ by itself on a line, which gets quoted by COPY
> and not by psql.

I meant \. or backslash followed by period.
This sequence means nothing in particular in csv outside
of COPY, but it means end-of-data for COPY.

Best regards,
--
Daniel Vérité
PostgreSQL-powered mailer: http://www.manitou-mail.org
Twitter: @DanielVerite

Re: csv format for psql

From

Pavel Stehule

Date:

08 April 2018, 10:38:43

2018-04-07 14:23 GMT+02:00 Daniel Verite <daniel@manitou-mail.org>:

Peter Eisentraut wrote:

> Another thought: Isn't CSV just the same as unaligned output plus some
> quoting? Could we add a quote character setting and then define --csv
> to be quote-character = " and fieldsep = , ?

Plus footer set to off. So --csv would be like
\pset format unaligned
\pset fieldsep ','
\pset footer off
\pset unaligned_quote_character '"'

I guess we'd add a \csv command that does these together.

There would still be some discomfort with some of the issues
discussed upthread. For instance
psql -F ';' --csv
is likely to reset fieldsep to ',' having then a different outcome than
psql --csv -F';'

And there's the point on how to come back from \csv to another
output, given that it has overwritten "footer" that is used across
all formats, and fieldsep used by unaligned.
So a user might need to figure out anyway the
intricaties behind \csv, if it's not an independant format.

On the whole I'm inclined to resubmit the patch with
fieldsep_csv and some minor changes based on the rest
of the discussion.

Best regards,
--
Daniel Vérité
PostgreSQL-powered mailer: http://www.manitou-mail.org
Twitter: @DanielVerite

Re: csv format for psql

From

Pavel Stehule

Date:

03 June 2018, 19:45:39

2018-04-07 14:23 GMT+02:00 Daniel Verite <daniel@manitou-mail.org>:

Peter Eisentraut wrote:

> Another thought: Isn't CSV just the same as unaligned output plus some
> quoting? Could we add a quote character setting and then define --csv
> to be quote-character = " and fieldsep = , ?

Plus footer set to off. So --csv would be like
\pset format unaligned
\pset fieldsep ','
\pset footer off
\pset unaligned_quote_character '"'

I guess we'd add a \csv command that does these together.

There would still be some discomfort with some of the issues
discussed upthread. For instance
psql -F ';' --csv
is likely to reset fieldsep to ',' having then a different outcome than
psql --csv -F';'

And there's the point on how to come back from \csv to another
output, given that it has overwritten "footer" that is used across
all formats, and fieldsep used by unaligned.
So a user might need to figure out anyway the
intricaties behind \csv, if it's not an independant format.

On the whole I'm inclined to resubmit the patch with
fieldsep_csv and some minor changes based on the rest
of the discussion.

Can we restart discussion about this feature? I am sure so there is strong common agreement about usability of this feature. Now, we have good enough time to design and implement it well.

I am starting this discussion with some initial requirement and we can talk about validity of these requirements first.

1. command line long option --csv set output format CSV, default separator should be ','

2. option -F should to work too (there is (was) first issue - this value is now semicolon). I don't like introduction a new like "-F" option just for CSV, because we can share this option for all output formats where separator can be customized.

3. The behave should not depend on order of usage

4. I miss some short/long option for enabling/disabling header (default should be on), there is option -t, that is inverse value of "use header". I am not sure if it is enough.

5. In interactive mode I prefer activating CSV format via \pset format csv .. like almost all any other supported formats

It should to set output configuration like mentioned

\pset format unaligned
\pset fieldsep ',' -- if not assigned before
\pset footer off
\pset unaligned_quote_character '"'

similarly like command line I am expecting so I can set \pset fieldsep and \pset format in any order.

expected behave:

psql -c "select ..." --csv -F';'

psql -c "select ..." -F';' --csv

interactive

\pset format csv

\pset fieldsep ;

\pset format csv

I don't like when one command overwrite settings of some other command. We can introduce some similar like GUC where default values from configure file can be overwritten by custom setting for session. So I am able to thinking about some settings

\pset csv_default_separator

\pset csv_default_header

Then there is question to simplify all and use \pset csv_separator, and csv format independent of fieldseparator value? It is possible, but I don't think so have more option for similar value is good idea (for interactive mode).

Why I propose these rich and strong requirements?

1. CSV is fundamental format - I don't know any other with similar value for data import/export

2. unfortunately, important feature (separator) is not same in all areas - in Czech rep most used separator is semicolon - so typical case needs setting format and setting separator.

Regards

Pavel

Re: csv format for psql

From

"David G. Johnston"

Date:

03 June 2018, 20:16:24

On Sunday, June 3, 2018, Pavel Stehule <pavel.stehule@gmail.com> wrote:

\pset fieldsep ;
\pset format csv

I don't like when one command overwrite settings of some other command. We can introduce some similar like GUC where default values from configure file can be overwritten by custom setting for session. So I am able to thinking about some settings

like

\pset csv_default_separator
\pset csv_default_header

Then there is question to simplify all and use \pset csv_separator, and csv format independent of fieldseparator value? It is possible, but I don't think so have more option for similar value is good idea (for interactive mode).

Having a dedicated option seems infinitely better than adding new settings for defaults and having to keep track of whether the shared field separator is a default versus a user specified value.

Recently we reworked server GUCs to avoid this kind of unset/default behavior. I don't see how introducing or relying upon it in psql would be an advantage.

At this point -1, keep the status quo, for any implementation that tries to make the unaligned mode field separator perform double duty. I'm open, but unlikely, to be convinced that it can be done without unforeseen bad side effects and degraded usability.

David J.

Re: csv format for psql

From

Pavel Stehule

Date:

03 June 2018, 20:43:05

2018-06-03 19:16 GMT+02:00 David G. Johnston <david.g.johnston@gmail.com>:

On Sunday, June 3, 2018, Pavel Stehule <pavel.stehule@gmail.com> wrote:

\pset fieldsep ;
\pset format csv

I don't like when one command overwrite settings of some other command. We can introduce some similar like GUC where default values from configure file can be overwritten by custom setting for session. So I am able to thinking about some settings

like

\pset csv_default_separator
\pset csv_default_header

Then there is question to simplify all and use \pset csv_separator, and csv format independent of fieldseparator value? It is possible, but I don't think so have more option for similar value is good idea (for interactive mode).

Having a dedicated option seems infinitely better than adding new settings for defaults and having to keep track of whether the shared field separator is a default versus a user specified value.

Recently we reworked server GUCs to avoid this kind of unset/default behavior. I don't see how introducing or relying upon it in psql would be an advantage.

I am thinking so psql design is little bit special, because we should to think about more modes - interactive and not interactive, and our goal should be some consistency from user perspective in interactive mode. But for me, the special CSV options are acceptable - although little bit unfriendly from user perspective.

At this point -1, keep the status quo, for any implementation that tries to make the unaligned mode field separator perform double duty. I'm open, but unlikely, to be convinced that it can be done without unforeseen bad side effects and degraded usability.

With respect to your opinion, I don't agree so current status is good - mainly in this case.

Regards

Pavel

David J.

Re: csv format for psql

From

"Daniel Verite"

Date:

10 August 2018, 13:25:49

    Pavel Stehule wrote:

> > On the whole I'm inclined to resubmit the patch with
> > fieldsep_csv and some minor changes based on the rest
> > of the discussion.
> >
>
> +1

PFA an updated version.
Usage from the command line:
$ psql --csv          # or -P format=csv
$ psql --csv -P fieldsep_csv=";"  # for non-comma csv separator

From inside psql:

\pset format csv
\pset fieldsep_csv '\t'

Best regards,
--
Daniel Vérité
PostgreSQL-powered mailer: http://www.manitou-mail.org
Twitter: @DanielVerite

Attachment

psql-csv-format-v5.patch

Re: csv format for psql

From

Fabien COELHO

Date:

10 August 2018, 16:32:08

Hello Daniel,

> PFA an updated version.
> Usage from the command line:
> $ psql --csv          # or -P format=csv
> $ psql --csv -P fieldsep_csv=";"  # for non-comma csv separator
>
> From inside psql:
>
> \pset format csv
> \pset fieldsep_csv '\t'

Patch applies cleanly, compiles, global make check ok.

Doc: "according to the csv rules" -> "according to csv rules."?

Doc: "RFC-4180" -> "RFC 4180"?

The previous RFC specifies CRLF as eol, but '\n' is hardcoded in the 
source. I'm fine with that, but I'd suggest that the documentation should 
said which EOL is used.

ISTM that "--csv" & "-C" are not documented, neither in sgml nor under
--help.

"fieldsep_csv" does not show on the list of output options under "\?".

There seems to be a test in the code to set an empty string "" by default,
but it is unclear to me when this is triggered.

I'd tend to use "CSV" instead of "csv" everywhere it makes sense, eg in 
the doc (CSV rules) and in variable names in the code (FooCsv -> FooCSV?), 
but that is pretty debatable.

-- 
Fabien.

Re: csv format for psql

From

Pavel Stehule

Date:

11 August 2018, 09:03:43

2018-08-10 12:25 GMT+02:00 Daniel Verite <daniel@manitou-mail.org>:

Pavel Stehule wrote:

> > On the whole I'm inclined to resubmit the patch with
> > fieldsep_csv and some minor changes based on the rest
> > of the discussion.
> >
>
> +1

PFA an updated version.
Usage from the command line:
$ psql --csv # or -P format=csv
$ psql --csv -P fieldsep_csv=";" # for non-comma csv separator

From inside psql:

\pset format csv
\pset fieldsep_csv '\t'

quick check +1 - I have not a objections

Regards

Pavel

Best regards,
--
Daniel Vérité
PostgreSQL-powered mailer: http://www.manitou-mail.org
Twitter: @DanielVerite

Re: csv format for psql

From

"Daniel Verite"

Date:

22 August 2018, 23:05:41

    Fabien COELHO wrote:

> Doc: "according to the csv rules" -> "according to csv rules."?

Fixed.

> Doc: "RFC-4180" -> "RFC 4180"?

Fixed. The other references to RFCs use this syntax indeed.

> The previous RFC specifies CRLF as eol, but '\n' is hardcoded in the
> source. I'm fine with that, but I'd suggest that the documentation should
> said which EOL is used.

'\n' gets translated by libc when the output is in text mode.
We discussed this upthread, but maybe it should be a code comment:
added now.

> ISTM that "--csv" & "-C" are not documented, neither in sgml nor under
> --help.
>
> "fieldsep_csv" does not show on the list of output options under "\?".

Oops, fixed.

> There seems to be a test in the code to set an empty string "" by default,
> but it is unclear to me when this is triggered.

Where is that code?

> I'd tend to use "CSV" instead of "csv" everywhere it makes sense, eg in
> the doc (CSV rules) and in variable names in the code (FooCsv -> FooCSV?),
> but that is pretty debatable.

I've changed to upper case in a couple places and added <acronym> tags,
but depending on the context sometimes lower case feels more consistent.
This is the same as, for instance, ASCII. We display \pset linestyle as
ascii, not ASCII, presumably because everything else in the \pset area
is lower case. But both cases are accepted in input.

Also added a CSV entry in the doc index per Alvaro's suggestion in
https://www.postgresql.org/message-id/20180809202125.r4mdx2jzm7hytetz@alvherre.pgsql
with pointers to psql and COPY.

Best regards,
--
Daniel Vérité
PostgreSQL-powered mailer: http://www.manitou-mail.org
Twitter: @DanielVerite

Attachment

psql-csv-format-v6.patch

Re: csv format for psql

From

Fabien COELHO

Date:

23 August 2018, 11:30:05

Bonjour Daniel,

> '\n' gets translated by libc when the output is in text mode.
> We discussed this upthread, but maybe it should be a code comment:
> added now.

My point was more about the documentation which should be clear about what 
is the EOL. I understand from your point above that the EOL is the 
platform-specific one, which is yet again fine with me, but should be said 
clearly in the documentation?

>> There seems to be a test in the code to set an empty string "" by 
>> default, but it is unclear to me when this is triggered.
>
> Where is that code?

  + else if (strcmp(param, "fieldsep_csv") == 0)
  +     return pset_quoted_string(popt->topt.fieldSepCsv
  +                               ? popt->topt.fieldSepCsv
  +                               : "");

It is unclear to me when this "" is triggered. Never? If so, maybe a 
comment should say so?

A few comments about v6: Patch applies cleanly, make check ok, doc build 
ok.

Why removing "-C"? As we already have "-A" an "-H", I was fine with it.

It seems that you changed the indentation in "psql-ref.sgml":

       </varlistentry>
  -    <varlistentry>
  +     <varlistentry>

but maybe this is just the previous entry which was misaligned...

  -  {"unaligned", "aligned", "wrapped", "html", "asciidoc",
  -   "latex", "latex-longtable", "troff-ms", NULL};
  +  {"aligned", "asciidoc", "csv", "html", "latex", "latex-longtable",
  +   "unaligned", "troff-ms", "wrapped", NULL};

If you want alphabetical, 'u' > 't'.

While testing I found a small issue if "fieldsep_csv" is set to a strange 
value:

   \pset format_csv ',,'
   SELECT ',', ',';
   -- gives the ambiguous:
   ,,,,

The rule to decide whether to quote should be made safer/smarter. I'd 
suggest that if the string contains any of the caracters used in format 
csv it should be quoted.

-- 
Fabien.

Re: csv format for psql

From

"Daniel Verite"

Date:

28 August 2018, 19:45:25

    Fabien COELHO wrote:

> My point was more about the documentation which should be clear about what
> is the EOL. I understand from your point above that the EOL is the
> platform-specific one, which is yet again fine with me, but should be said
> clearly in the documentation?

Okay, I've added a bit in the doc.

>   + else if (strcmp(param, "fieldsep_csv") == 0)
>   +     return pset_quoted_string(popt->topt.fieldSepCsv
>   +                               ? popt->topt.fieldSepCsv
>   +                               : "");
>
> It is unclear to me when this "" is triggered. Never? If so, maybe a
> comment should say so?

Currently popt->topt.fieldSepCsv can't be NULL so I've simplified this
to just return pset_quoted_string(popt->topt.fieldSepCsv).

> Why removing "-C"? As we already have "-A" an "-H", I was fine with it.

It was a leftover from v3. Participants in the thread don't seem to
want the short option, to my surprise. Pavel argued first against -C
upthread, I argued quite a bit in favor of it, the "for" had 0 upvote, and
"against" had at least 4 I think, including yours in [1].

> It seems that you changed the indentation in "psql-ref.sgml":
>
>        </varlistentry>
>   -    <varlistentry>
>   +     <varlistentry>

Fixed.

>   -  {"unaligned", "aligned", "wrapped", "html", "asciidoc",
>   -   "latex", "latex-longtable", "troff-ms", NULL};
>   +  {"aligned", "asciidoc", "csv", "html", "latex", "latex-longtable",
>   +   "unaligned", "troff-ms", "wrapped", NULL};
>
> If you want alphabetical, 'u' > 't'.

Fixed.

>
> While testing I found a small issue if "fieldsep_csv" is set to a strange
> value:
>
>    \pset format_csv ',,'
>    SELECT ',', ',';
>    -- gives the ambiguous:
>    ,,,,
>
> The rule to decide whether to quote should be made safer/smarter. I'd
> suggest that if the string contains any of the caracters used in format
> csv it should be quoted.

You meant \pset fieldsep_csv ',,'
If you do that even SELECT 'foo', 'bar' comes out wrong because it looks
like a 3-field row: foo,,bar
If we want to prevent people to shoot themselves in the foot with that
sort of thing, I've added a couple tests: No double quote, no LF or
CR, single character (but multibyte allowed) for the separator.

[1]
https://www.postgresql.org/message-id/alpine.DEB.2.20.1803081004241.2916%40lancre

Best regards,
--
Daniel Vérité
PostgreSQL-powered mailer: http://www.manitou-mail.org
Twitter: @DanielVerite

Best regards,
--
Daniel Vérité
PostgreSQL-powered mailer: http://www.manitou-mail.org
Twitter: @DanielVerite

Attachment

psql-csv-format-v7.patch

Re: csv format for psql

From

Fabien COELHO

Date:

28 August 2018, 21:54:02

Bonjour Daniel,

> Currently popt->topt.fieldSepCsv can't be NULL so I've simplified this
> to just return pset_quoted_string(popt->topt.fieldSepCsv).

Ok.

>> While testing I found a small issue if "fieldsep_csv" is set to a strange
>> value:
>>
>>    \pset format_csv ',,'
>
> You meant \pset fieldsep_csv ',,'

Indeed.

> If you do that even SELECT 'foo', 'bar' comes out wrong because it looks
> like a 3-field row: foo,,bar

Yes and no. I asked for ",," as a separator, so probably I really want 
that and I'd be okay with the result.

> If we want to prevent people to shoot themselves in the foot with that
> sort of thing, I've added a couple tests: No double quote, no LF or
> CR, single character (but multibyte allowed) for the separator.

Ok, why not.

Patch applies cleanly, compiles, "make check" ok.

I tried "\pset fieldsep_csv '\0'" which could be seen as one character, 
but it does not want it. I'm okay with this behavior.

I'd suggest to add a test about rejected fieldsep_csv values, which raises 
both errors.

-- 
Fabien.

Re: csv format for psql

From

Michael Paquier

Date:

01 October 2018, 08:42:29

On Tue, Aug 28, 2018 at 08:54:02PM +0200, Fabien COELHO wrote:
> Patch applies cleanly, compiles, "make check" ok.
>
> I tried "\pset fieldsep_csv '\0'" which could be seen as one character, but
> it does not want it. I'm okay with this behavior.
>
> I'd suggest to add a test about rejected fieldsep_csv values, which raises
> both errors.

The patch has been waiting for input from the author since August, so I
am marking it as returned with feedback.  If you can produce a new
version, please feel free to post it.
--
Michael

Attachment

signature.asc

Re: csv format for psql

From

"Daniel Verite"

Date:

02 November 2018, 12:35:26

    Michael Paquier wrote:

> If you can produce a new  version, please feel free to post it.

Here's a rebased version with a couple regression tests
added per the discussions during the previous CF.

Now at https://commitfest.postgresql.org/20/1861/

Best regards,
--
Daniel Vérité
PostgreSQL-powered mailer: http://www.manitou-mail.org
Twitter: @DanielVerite

Attachment

psql-csv-format-v8.patch

Re: csv format for psql

From

Fabien COELHO

Date:

03 November 2018, 08:45:14

Bonjour Daniel,

> Here's a rebased version with a couple regression tests
> added per the discussions during the previous CF.
>
> Now at https://commitfest.postgresql.org/20/1861/

Patch applies cleanly, compiles, make check ok, doc gen ok.

Fine with me. I switched the patch to "Ready".

-- 
Fabien.

Re: csv format for psql

From

Michael Paquier

Date:

06 November 2018, 06:06:29

On Sat, Nov 03, 2018 at 09:45:14AM +0100, Fabien COELHO wrote:
> Patch applies cleanly, compiles, make check ok, doc gen ok.
>
> Fine with me. I switched the patch to "Ready".

I have begun looking at this patch, and there are some noise diffs
because of the reordering of the items you are doing in psql code.
Ordering them in alphabetical order is a good idea due to the high
number of options available, and more would pile up even if this
separates a bit "aligned" and "unaligned", so I have have separated
those diffs from the core patch and committed it, leaving the core
portion of the patch aside for later.
--
Michael

Attachment

signature.asc

Re: csv format for psql

From

"Daniel Verite"

Date:

06 November 2018, 13:57:15

    Michael Paquier wrote:

> Ordering them in alphabetical order is a good idea due to the high
> number of options available, and more would pile up even if this
> separates a bit "aligned" and "unaligned", so I have have separated
> those diffs from the core patch and committed it, leaving the core
> portion of the patch aside for later.

Here's a rebased version following these changes.

Best regards,
--
Daniel Vérité
PostgreSQL-powered mailer: http://www.manitou-mail.org
Twitter: @DanielVerite

Attachment

psql-csv-format-v9.patch

Re: csv format for psql

From

Michael Paquier

Date:

07 November 2018, 08:23:55

On Tue, Nov 06, 2018 at 02:57:15PM +0100, Daniel Verite wrote:
> Here's a rebased version following these changes.

Thanks for doing a rebase.

First, it seems to me that it is necessary to overcomplicate the
interface of psql for that.  Most of the other formats don't have their
own switch, and it is perfectly possible to use CSV format with just
--pset=format=csv.  -C could also be useful for other things, say
compression.

Not reusing fieldsep and creating a new variable looks like the correct
way to go, as there is no need to bother about cross-option checks if
the format is moved from unaligned to csv and the other way around,
especially as the delimiter of CSV needs cannot be larger than
one-byte.

+   pset.popt.topt.fieldSepCsv = pg_strdup(",");
Let's store that in a variable instead of hardcoding it.

In the regression tests, "col 9" is wanted with a newline?

I am still digging into the details of this patch, and I am
unfortunately running out of fuel for the day.
--
Michael

Attachment

signature.asc

Re: csv format for psql

From

Fabien COELHO

Date:

07 November 2018, 09:01:20

Bonjour Michaël,

> First, it seems to me that it is necessary to overcomplicate the
> interface of psql for that.  Most of the other formats don't have their
> own switch, and it is perfectly possible to use CSV format with just
> --pset=format=csv.

Although this work, it is not very user friendly nor easy to memorize.

Quite a few direct format options already exists, eg -A --no-align (yuk, 
-A means not to align... :-(), -H --html...

In this context "--csv" looks pretty unambiguous and more useful, so I 
would be in favor of keeping this option.

> -C could also be useful for other things, say compression.

Sure. -C is *NOT* used for --csv in the patch, although it may have been 
with an earlier version?

> In the regression tests, "col 9" is wanted with a newline?

Yes, the point is to check whether quoting & escaping are triggered 
appropriately in column names and values.

-- 
Fabien.

Re: csv format for psql

From

"Daniel Verite"

Date:

07 November 2018, 16:36:54

    Michael Paquier wrote:

> -C could also be useful for other things, say compression.

The patch does not grab any short letter, as the consensus
went that way.

> +   pset.popt.topt.fieldSepCsv = pg_strdup(",");
> Let's store that in a variable instead of hardcoding it.

I guess it could go into a #define in psql/settings.h, along with
these:
#define DEFAULT_FIELD_SEP "|"
#define DEFAULT_RECORD_SEP "\n"

> In the regression tests, "col 9" is wanted with a newline?

Yes, since we support column names with embedded newlines (even
though it's hard to think of a legit use case for that) and CSV fields
support embedded newlines too.

Best regards,
--
Daniel Vérité
PostgreSQL-powered mailer: http://www.manitou-mail.org
Twitter: @DanielVerite

Re: csv format for psql

From

Michael Paquier

Date:

08 November 2018, 04:58:34

On Wed, Nov 07, 2018 at 05:36:54PM +0100, Daniel Verite wrote:
> I guess it could go into a #define in psql/settings.h, along with
> these:
> #define DEFAULT_FIELD_SEP "|"
> #define DEFAULT_RECORD_SEP "\n"

Yes.

> Yes, since we support column names with embedded newlines (even
> though it's hard to think of a legit use case for that) and CSV fields
> support embedded newlines too.

Hm.  We would likely want at least a comment about that..

Anyway, I am still going through the patch, so no need to send a new
version for now.
--
Michael

Attachment

signature.asc

Re: csv format for psql

From

Michael Paquier

Date:

09 November 2018, 05:56:34

On Thu, Nov 08, 2018 at 01:58:34PM +0900, Michael Paquier wrote:
> Anyway, I am still going through the patch, so no need to send a new
> version for now.

Okay, I have done a round of more in-depth review, and the patch looks
to be in pretty good shape.

Relying on tuples_only to decide if the header should be printed or not
looks good to me.

+    /* check for value being non-empty and with an MB length of 1 */
+    if (*value == '\0' || value[PQmblen(value, pset.encoding)] != '\0')

It seems to me that this can just be replaced with that:
if (strlen(value) != 1)

Attached is what I am finishing up with for the moment.  Comments are
welcome.  I am still planning look at that stuff a bit more once again,
particularly the printing part, but I am lacking of time now..
--
Michael

Attachment

Re: csv format for psql

From

Pavel Stehule

Date:

09 November 2018, 06:41:52

pá 9. 11. 2018 v 6:57 odesílatel Michael Paquier <michael@paquier.xyz> napsal:

On Thu, Nov 08, 2018 at 01:58:34PM +0900, Michael Paquier wrote:
> Anyway, I am still going through the patch, so no need to send a new
> version for now.

Okay, I have done a round of more in-depth review, and the patch looks
to be in pretty good shape.

Relying on tuples_only to decide if the header should be printed or not
looks good to me.

+ /* check for value being non-empty and with an MB length of 1 */
+ if (*value == '\0' || value[PQmblen(value, pset.encoding)] != '\0')

It seems to me that this can just be replaced with that:
if (strlen(value) != 1)

Attached is what I am finishing up with for the moment. Comments are
welcome. I am still planning look at that stuff a bit more once again,
particularly the printing part, but I am lacking of time now..

looks ok

Pavel

--
Michael

Re: csv format for psql

From

"Daniel Verite"

Date:

09 November 2018, 10:40:44

    Michael Paquier wrote:

> +    /* check for value being non-empty and with an MB length of 1 */
> +    if (*value == '\0' || value[PQmblen(value, pset.encoding)] != '\0')
>
> It seems to me that this can just be replaced with that:
> if (strlen(value) != 1)

The above is meant to accept a multibyte character as the separator,
in which case strlen(value) would be greater than 1.

Best regards,
--
Daniel Vérité
PostgreSQL-powered mailer: http://www.manitou-mail.org
Twitter: @DanielVerite

Re: csv format for psql

From

Michael Paquier

Date:

09 November 2018, 12:11:10

On Fri, Nov 09, 2018 at 11:40:44AM +0100, Daniel Verite wrote:
> The above is meant to accept a multibyte character as the separator,
> in which case strlen(value) would be greater than 1.

Still what's the point except complicating the code?  We don't care
about anything fancy in the backend-side ProcessCopyOptions() when
checking cstate->delim, and having some consistency looks like a good
thing to me.

I am also questioning a couple of things: it is possible to choose
freely how to represent a null value thanks to "\pset null", which gives
an equivalent to the COPY option.  Choosing if a header is present or
not can be done with tuples_only.  However there is no option to specify
an escape character, no option to specify a quote character, and it is
not possible to force quotes for all values.  Those are huge advantages
as any output can be made compatible with other CSV variants.  Isn't
what is presented too limited?  \copy has at least the advantage to push
down those responsibilities to the backend..
--
Michael

Attachment

signature.asc

Re: csv format for psql

From

"Daniel Verite"

Date:

09 November 2018, 16:28:07

    Michael Paquier wrote:

> Still what's the point except complicating the code?  We don't care
> about anything fancy in the backend-side ProcessCopyOptions() when
> checking cstate->delim, and having some consistency looks like a good
> thing to me.

The backend has its reasons that don't apply to the psql output
format, mostly import performance according to [1]
It's not that nobody wants delimiter outside of US-ASCII,
as people do ask for that sometimes:

https://www.postgresql.org/message-id/f02ulk%242r3u%241%40news.hub.org
https://github.com/greenplum-db/gpdb/issues/1246

> However there is no option to specify
> an escape character, no option to specify a quote character, and it is
> not possible to force quotes for all values.  Those are huge advantages
> as any output can be made compatible with other CSV variants.  Isn't
> what is presented too limited?

The guidelines that the patch has been following are those of RFC 4180 [2]
with two exceptions on the field separator that we can define
and the end of lines that are OS-dependant instead of the fixed CRLF
that IETF seems to see as the norm.

The only reference to escaping in the RFC is:
       "If double-quotes are used to enclose fields, then a double-quote
       appearing inside a field must be escaped by preceding it with
       another double quote"

The problem with using non-standard QUOTE or ESCAPE is that it's a
violation of the format that goes further than choosing a separator
different than comma, which is already a pain point.
We can always add these options later if there is demand. I suspect it
will never happen.

I looked at the 2004 archives when CSV was added to COPY, that's
around commit 862b20b38 in case anyone cares to look, but
I couldn't find a discussion on these options, all I could find is they were
present from the start.

But again COPY is concerned with importing the data that preexists,
even if it's weird, whereas a psql output formats are not.

[1] https://www.postgresql.org/message-id/4C9D2BC5.1080006%40optonline.net
[2] https://tools.ietf.org/html/rfc4180

Best regards,
--
Daniel Vérité
PostgreSQL-powered mailer: http://www.manitou-mail.org
Twitter: @DanielVerite

Re: csv format for psql

From

Michael Paquier

Date:

09 November 2018, 23:44:14

On Fri, Nov 09, 2018 at 05:28:07PM +0100, Daniel Verite wrote:
> But again COPY is concerned with importing the data that preexists,
> even if it's weird, whereas a psql output formats are not.

Hm.  I checked the contents of the patch in details which provide output
consistent with COPY, but after looking at the global picture I am
getting cold feet on this patch for a couple of reasons:
- This stuff adds new code paths in the frontend mimicking what the
backend already does for years, both doing the same thing.
- There are already three ways to fetch data in this format with COPY,
\copy and file_fdw, with all three using the same code paths for option
validations (I can see the arguments at the top of the thread for which
COPY SELECT can actually do everything you want with?).
- The experience is confusing, as the psql format uses different options
than the backend to do the same things:
-- tuples_only instead of HEADER.
-- fieldsep_csv instead of DELIMITER
-- null is an equivalent of the one with the same name, which is
actually consistent.
-- encoding is also an equivalent of ENCODING.
-- and all the others missing.
That looks like a lot.
--
Michael

Attachment

signature.asc

Re: csv format for psql

From

"Daniel Verite"

Date:

11 November 2018, 14:51:27

    Michael Paquier wrote:

> - The experience is confusing, as the psql format uses different options
> than the backend to do the same things:
> -- tuples_only instead of HEADER.
> -- fieldsep_csv instead of DELIMITER
> -- null is an equivalent of the one with the same name, which is
> actually consistent.
> -- encoding is also an equivalent of ENCODING.

That conveniently ignores the fact that "\pset format somename"
as a command doesn't take any additional option, contrary to COPY.
We can't do \pset format csv (delimiter=';')
If we choosed "delimiter" as an option name, it would have to exist
within 20 other names in the \pset namespace and then it would be
too vague, whereas "fieldsep_csv" makes it clear what it applies to.
"tuples_only" is preexisting, and I don't see how the comment that it's
not called "header" could be actionable.

Overall, you seem to posit that we should mimic the entire
COPY TO interface to implement 'psql --csv'.
But the starting point is that 'psql --csv' is just a slightly
different (safer) variant of 'psql -At', which is not concerned
at all with being consistent with COPY.
The model of the csv output format is the unaligned output format,
it's just not COPY.

Best regards,
--
Daniel Vérité
PostgreSQL-powered mailer: http://www.manitou-mail.org
Twitter: @DanielVerite

Re: csv format for psql

From

"David G. Johnston"

Date:

12 November 2018, 04:18:39

On Friday, November 9, 2018, Michael Paquier <michael@paquier.xyz> wrote:

On Fri, Nov 09, 2018 at 05:28:07PM +0100, Daniel Verite wrote:
> But again COPY is concerned with importing the data that preexists,
> even if it's weird, whereas a psql output formats are not.

Hm. I checked the contents of the patch in details which provide output
consistent with COPY, but after looking at the global picture I am
getting cold feet on this patch for a couple of reasons:
- This stuff adds new code paths in the frontend mimicking what the
backend already does for years, both doing the same thing.

From the original post:

"copy with csv can't help for the output of meta-commands

such as \gx, \crosstabview, \l, \d ... whereas a CSV format within psql
does work with these."

Formatting is usually a client-side concern so this feature fits well there. The fact that we’ve provided a server interface for the same doesn’t preclude its potential desirability in the client.

- There are already three ways to fetch data in this format with COPY,
\copy and file_fdw, with all three using the same code paths for option
validations (I can see the arguments at the top of the thread for which
COPY SELECT can actually do everything you want with?).

Not always conveniently.

- The experience is confusing, as the psql format uses different options
than the backend to do the same things:

Yes, those who use psql need to learn its features. I’d posit that since this syntax is being learned anyway that transferring said knowledge to a newly added csv format will not be confusing. No more so that having to do something that is usually client-side (formatting) on the server in the first place. That we don’t fully replicate the server functionality does’t bother be. This is meant to be a simple and quick ability that handles 95% of use cases and defers to the more powerful server version for the outliers.

Feature-wise I’m on board. Given it’s already written I’d say it should go in unless there are code complexity and support concerns - which given the prevalence of other formats I have to believe there are not.

David J.

Re: csv format for psql

From

Pavel Stehule

Date:

12 November 2018, 05:36:18

po 12. 11. 2018 v 5:19 odesílatel David G. Johnston <david.g.johnston@gmail.com> napsal:

On Friday, November 9, 2018, Michael Paquier <michael@paquier.xyz> wrote:
On Fri, Nov 09, 2018 at 05:28:07PM +0100, Daniel Verite wrote:
> But again COPY is concerned with importing the data that preexists,
> even if it's weird, whereas a psql output formats are not.

Hm. I checked the contents of the patch in details which provide output
consistent with COPY, but after looking at the global picture I am
getting cold feet on this patch for a couple of reasons:
- This stuff adds new code paths in the frontend mimicking what the
backend already does for years, both doing the same thing.

From the original post:

"copy with csv can't help for the output of meta-commands
such as \gx, \crosstabview, \l, \d ... whereas a CSV format within psql
does work with these."

Formatting is usually a client-side concern so this feature fits well there. The fact that we’ve provided a server interface for the same doesn’t preclude its potential desirability in the client.

- There are already three ways to fetch data in this format with COPY,
\copy and file_fdw, with all three using the same code paths for option
validations (I can see the arguments at the top of the thread for which
COPY SELECT can actually do everything you want with?).

Not always conveniently.

- The experience is confusing, as the psql format uses different options
than the backend to do the same things:

Yes, those who use psql need to learn its features. I’d posit that since this syntax is being learned anyway that transferring said knowledge to a newly added csv format will not be confusing. No more so that having to do something that is usually client-side (formatting) on the server in the first place. That we don’t fully replicate the server functionality does’t bother be. This is meant to be a simple and quick ability that handles 95% of use cases and defers to the more powerful server version for the outliers.

This patch is not simple - not due own complexity, but due current state of psql and output format support. The psql is aged software and implement new format well known is not simple.

The COPY statement has different purpose and because it has server side implementation, it cannot to cover client side space.

I afraid so there hardy be designed some better - and it is unhappy so psql has not native csv support yet.

Regards

Pavel

Feature-wise I’m on board. Given it’s already written I’d say it should go in unless there are code complexity and support concerns - which given the prevalence of other formats I have to believe there are not.

David J.

Re: csv format for psql

From

Fabien COELHO

Date:

14 November 2018, 12:17:19

Bonjour Michaël,

>> But again COPY is concerned with importing the data that preexists,
>> even if it's weird, whereas a psql output formats are not.
>
> Hm.  I checked the contents of the patch in details which provide output
> consistent with COPY, but after looking at the global picture I am
> getting cold feet on this patch for a couple of reasons:
> - This stuff adds new code paths in the frontend mimicking what the
> backend already does for years, both doing the same thing.

  - Backend's COPY cannot write to client space, so the comparison
    is not relevant.

  - "\copy (SQL-query) TO STDOUT CSV;" is kind of a pain, because one has
    to edit around the query, which is not convenient, esp from the
    command line:

    sh> psql --csv -c 'SELECT 1 as one, 2 as two' > out.csv

    vs

    sh> psql -c "\copy (SELECT 1 AS one, 2 as two) TO STDOUT CSV" > out.csv

    or mixing the output file name inside the argument, which is very
    unshell like:

    sh> psql -c "\copy (SELECT 1 AS one, 2 as two) TO 'out.csv' CSV"

If you have a "query.sql" file that you want to output in csv, there is no 
simple way to do that with \copy/COPY, whereas "psql --csv -f query.sql" 
looks pretty straightforward to me. Also, in a makefile, I could write:

    %.csv: %.sql
        psql --csv -f $< > $@

My point is that \copy, COPY and the proposed CSV format do not address 
the same use cases.

> - There are already three ways to fetch data in this format with COPY,
> \copy and file_fdw, with all three using the same code paths for option
> validations (I can see the arguments at the top of the thread for which
> COPY SELECT can actually do everything you want with?).
> - The experience is confusing, as the psql format uses different options
> than the backend to do the same things:
> -- tuples_only instead of HEADER.
> -- fieldsep_csv instead of DELIMITER
> -- null is an equivalent of the one with the same name, which is
> actually consistent.
> -- encoding is also an equivalent of ENCODING.
> -- and all the others missing.
> That looks like a lot.

I disagree on this one: the proposed csv format just follows the existing 
psql format pattern used for 8 formats and reuses it for csv.

Note that there are existing command line options for tuples_only (-t), 
encoding is inherited from the shell and does not need to be changed that 
often nowadays, fieldsep_csv is kind-of a pain, but then if someone wants 
"comma-separated-values" NOT separated by commas, probably they can handle 
it.

Basically the proposed patch addresses a simple and convenient use case 
which are neither addressed by \copy nor COPY. The fact that more options 
are available with these commands does it precludes its usefulness as is.

-- 
Fabien.

Re: csv format for psql

From

Tom Lane

Date:

26 November 2018, 01:26:47

Fabien COELHO <coelho@cri.ensmp.fr> writes:
> Basically the proposed patch addresses a simple and convenient use case 
> which are neither addressed by \copy nor COPY. The fact that more options 
> are available with these commands does it precludes its usefulness as is.

Yeah, I agree that this option is useful independently of whether COPY
provides something similar.  I think the killer argument is that right
now, psql-ref.sgml repeatedly suggests that unaligned mode with fieldsep
',' is a reasonable way to produce comma-separated output for consumption
by other programs.  That's like handing our users a loaded foot-gun.
And, in fact, right now *none* of psql's table output formats is both
unambiguous and reasonably simple/popular to use.  So the astonishing
thing about this patch, IMO, is that we didn't do it a decade ago.

I went through the documentation to improve that point, and did some other
cosmetic cleanup including rebasing up to HEAD, and got the attached.

I think there are two remaining points to settle:

1. Are we limiting the separator to be a single-byte character or not?
If we are, the code could be made simpler and faster by working with a
"char" value instead of a string.  If we're not, then Michael's change
needs to be undone (but I didn't do that here).

I feel that if we allow multi-byte characters here, we might as well
take the training wheels off and just say you can use any separator
string you want, as long as it doesn't contain double quote, \r, or \n.
Most programs reading a file are not going to perceive a difference
between separating fields with a single multibyte character and multiple
single-byte characters; either they can cope or they can't.  A fair
number of them are going to be in the latter category.  So we can either
say "do what you wish, it's your problem whether anything can read the
result" or "we're going to restrict you to something that (perhaps) is
more widely readable".  I'm a bit inclined to the former viewpoint.
If we were in the business of being restrictive, why would we allow
the field separator to be changed at all?  The name of the format is
*comma* separated values, not something else.

2. Speaking of the field separator, I'm pretty desperately unhappy
with the choice of "fieldsep_csv" as the parameter name.  The trouble
with that is that it encourages sticking "fieldsep_csv" in between
"fieldsep" and "fieldsep_zero", because alphabet.  But "fieldsep" and
"fieldsep_zero" are *intimately* tied together, in fact it's only a
dubious implementation choice that made them separate parameters at all.
It does not make any semantic sense to stick other vaguely-related
parameters in between, neither in the documentation nor in \pset output.

We could avoid this self-inflicted confusion by choosing a different
parameter name.  I'd be good with "csv_fieldsep" or "csvfieldsep".

Or we could kill both issues by hard-wiring the separator as ','.

Thoughts?

            regards, tom lane

diff --git a/doc/src/sgml/ref/psql-ref.sgml b/doc/src/sgml/ref/psql-ref.sgml
index 6e6d0f4..d53451b 100644
*** a/doc/src/sgml/ref/psql-ref.sgml
--- b/doc/src/sgml/ref/psql-ref.sgml
*************** PostgreSQL documentation
*** 68,75 ****
        <listitem>
        <para>
        Switches to unaligned output mode. (The default output mode is
!       otherwise aligned.)  This is equivalent to <command>\pset format
!       unaligned</command>.
        </para>
        </listitem>
      </varlistentry>
--- 68,75 ----
        <listitem>
        <para>
        Switches to unaligned output mode. (The default output mode is
!       <literal>aligned</literal>.)  This is equivalent to
!       <command>\pset format unaligned</command>.
        </para>
        </listitem>
      </varlistentry>
*************** EOF
*** 152,157 ****
--- 152,167 ----
      </varlistentry>

      <varlistentry>
+       <term><option>--csv</option></term>
+       <listitem>
+       <para>
+       Switches to <acronym>CSV</acronym> (Comma-Separated Values) output
+       mode.  This is equivalent to <command>\pset format csv</command>.
+       </para>
+       </listitem>
+     </varlistentry>
+
+     <varlistentry>
        <term><option>-d <replaceable class="parameter">dbname</replaceable></option></term>
        <term><option>--dbname=<replaceable class="parameter">dbname</replaceable></option></term>
        <listitem>
*************** EOF
*** 270,277 ****
        <term><option>--html</option></term>
        <listitem>
        <para>
!       Turn on <acronym>HTML</acronym> tabular output. This is
!       equivalent to <literal>\pset format html</literal> or the
        <command>\H</command> command.
        </para>
        </listitem>
--- 280,287 ----
        <term><option>--html</option></term>
        <listitem>
        <para>
!       Switches to <acronym>HTML</acronym> output mode.  This is
!       equivalent to <command>\pset format html</command> or the
        <command>\H</command> command.
        </para>
        </listitem>
*************** lo_import 152801
*** 2547,2554 ****
            <listitem>
            <para>
            Specifies the field separator to be used in unaligned output
!           format. That way one can create, for example, tab- or
!           comma-separated output, which other programs might prefer. To
            set a tab as field separator, type <literal>\pset fieldsep
            '\t'</literal>. The default field separator is
            <literal>'|'</literal> (a vertical bar).
--- 2557,2564 ----
            <listitem>
            <para>
            Specifies the field separator to be used in unaligned output
!           format. That way one can create, for example, tab-separated
!           output, which other programs might prefer. To
            set a tab as field separator, type <literal>\pset fieldsep
            '\t'</literal>. The default field separator is
            <literal>'|'</literal> (a vertical bar).
*************** lo_import 152801
*** 2557,2562 ****
--- 2567,2585 ----
            </varlistentry>

            <varlistentry>
+           <term><literal>fieldsep_csv</literal></term>
+           <listitem>
+           <para>
+           Specifies the field separator to be used in
+           <acronym>CSV</acronym> output format.  If the separator character
+           appears in a field's value, that field is output inside double
+           quotes, following standard <acronym>CSV</acronym> rules.
+           The default is a comma.
+           </para>
+           </listitem>
+           </varlistentry>
+
+           <varlistentry>
            <term><literal>fieldsep_zero</literal></term>
            <listitem>
            <para>
*************** lo_import 152801
*** 2585,2606 ****
            <listitem>
            <para>
            Sets the output format to one of <literal>aligned</literal>,
!           <literal>asciidoc</literal>, <literal>html</literal>,
!           <literal>latex</literal> (uses <literal>tabular</literal>),
            <literal>latex-longtable</literal>, <literal>troff-ms</literal>,
            <literal>unaligned</literal>, or <literal>wrapped</literal>.
            Unique abbreviations are allowed.
            </para>

            <para><literal>unaligned</literal> format writes all columns of a row on one
            line, separated by the currently active field separator. This
            is useful for creating output that might be intended to be read
!           in by other programs (for example, tab-separated or comma-separated
!           format).
            </para>

!           <para><literal>aligned</literal> format is the standard, human-readable,
!           nicely formatted text output;  this is the default.
            </para>

            <para><literal>wrapped</literal> format is like <literal>aligned</literal> but wraps
--- 2608,2655 ----
            <listitem>
            <para>
            Sets the output format to one of <literal>aligned</literal>,
!           <literal>asciidoc</literal>,
!           <literal>csv</literal>,
!           <literal>html</literal>,
!           <literal>latex</literal>,
            <literal>latex-longtable</literal>, <literal>troff-ms</literal>,
            <literal>unaligned</literal>, or <literal>wrapped</literal>.
            Unique abbreviations are allowed.
            </para>

+           <para><literal>aligned</literal> format is the standard,
+           human-readable, nicely formatted text output; this is the default.
+           </para>
+
            <para><literal>unaligned</literal> format writes all columns of a row on one
            line, separated by the currently active field separator. This
            is useful for creating output that might be intended to be read
!           in by other programs, for example, tab-separated or comma-separated
!           format.  However, the field separator character is not treated
!           specially if it appears in a column's value;
!           so <acronym>CSV</acronym> format may be better suited for such
!           purposes.
            </para>

!           <para><literal>csv</literal> format
!           <indexterm>
!            <primary>CSV (Comma-Separated Values) format</primary>
!            <secondary>in psql</secondary>
!           </indexterm>
!           writes column values separated by commas, applying the quoting
!           rules described in
!           <ulink url="https://tools.ietf.org/html/rfc4180">RFC 4180</ulink>.
!           This output is compatible with the CSV format of the server's
!           <command>COPY</command> command.
!           A header line with column names is generated unless
!           the <literal>tuples_only</literal> parameter is
!           <literal>on</literal>. Titles and footers are not printed.
!           Each row is terminated by the system-dependent end-of-line character,
!           which is typically a single newline (<literal>\n</literal>) for
!           Unix-like systems or a carriage return and newline sequence
!           (<literal>\r\n</literal>) for Microsoft Windows.
!           Field separator characters other than comma can be selected with
!           <command>\pset fieldsep_csv</command>.
            </para>

            <para><literal>wrapped</literal> format is like <literal>aligned</literal> but wraps
*************** lo_import 152801
*** 2620,2627 ****
            language. They are not complete documents! This might not be
            necessary in <acronym>HTML</acronym>, but in
            <application>LaTeX</application> you must have a complete
!           document wrapper.  <literal>latex-longtable</literal>
!           also requires the <application>LaTeX</application>
            <literal>longtable</literal> and <literal>booktabs</literal> packages.
            </para>
            </listitem>
--- 2669,2680 ----
            language. They are not complete documents! This might not be
            necessary in <acronym>HTML</acronym>, but in
            <application>LaTeX</application> you must have a complete
!           document wrapper.
!           The <literal>latex</literal> format
!           uses <application>LaTeX</application>'s <literal>tabular</literal>
!           environment.
!           The <literal>latex-longtable</literal> format
!           requires the <application>LaTeX</application>
            <literal>longtable</literal> and <literal>booktabs</literal> packages.
            </para>
            </listitem>
*************** first second
*** 4588,4604 ****

  peter@localhost testdb=> <userinput>\pset border 1</userinput>
  Border style is 1.
! peter@localhost testdb=> <userinput>\pset format unaligned</userinput>
! Output format is unaligned.
! peter@localhost testdb=> <userinput>\pset fieldsep ","</userinput>
! Field separator is ",".
  peter@localhost testdb=> <userinput>\pset tuples_only</userinput>
! Showing only tuples.
  peter@localhost testdb=> <userinput>SELECT second, first FROM my_table;</userinput>
  one,1
  two,2
  three,3
  four,4
  </programlisting>
    Alternatively, use the short commands:
  <programlisting>
--- 4641,4664 ----

  peter@localhost testdb=> <userinput>\pset border 1</userinput>
  Border style is 1.
! peter@localhost testdb=> <userinput>\pset format csv</userinput>
! Output format is csv.
  peter@localhost testdb=> <userinput>\pset tuples_only</userinput>
! Tuples only is on.
  peter@localhost testdb=> <userinput>SELECT second, first FROM my_table;</userinput>
  one,1
  two,2
  three,3
  four,4
+ peter@localhost testdb=> <userinput>\pset format unaligned</userinput>
+ Output format is unaligned.
+ peter@localhost testdb=> <userinput>\pset fieldsep '\t'</userinput>
+ Field separator is "    ".
+ peter@localhost testdb=> <userinput>SELECT second, first FROM my_table;</userinput>
+ one     1
+ two     2
+ three   3
+ four    4
  </programlisting>
    Alternatively, use the short commands:
  <programlisting>
diff --git a/src/bin/psql/command.c b/src/bin/psql/command.c
index ee88e1c..a18c8a7 100644
*** a/src/bin/psql/command.c
--- b/src/bin/psql/command.c
*************** exec_command_pset(PsqlScanState scan_sta
*** 1957,1964 ****

              int            i;
              static const char *const my_list[] = {
!                 "border", "columns", "expanded", "fieldsep", "fieldsep_zero",
!                 "footer", "format", "linestyle", "null",
                  "numericlocale", "pager", "pager_min_lines",
                  "recordsep", "recordsep_zero",
                  "tableattr", "title", "tuples_only",
--- 1957,1964 ----

              int            i;
              static const char *const my_list[] = {
!                 "border", "columns", "expanded", "fieldsep", "fieldsep_csv",
!                 "fieldsep_zero", "footer", "format", "linestyle", "null",
                  "numericlocale", "pager", "pager_min_lines",
                  "recordsep", "recordsep_zero",
                  "tableattr", "title", "tuples_only",
*************** _align2string(enum printFormat in)
*** 3616,3621 ****
--- 3616,3624 ----
          case PRINT_ASCIIDOC:
              return "asciidoc";
              break;
+         case PRINT_CSV:
+             return "csv";
+             break;
          case PRINT_HTML:
              return "html";
              break;
*************** do_pset(const char *param, const char *v
*** 3696,3701 ****
--- 3699,3705 ----
              /* remember to update error message below when adding more */
              {"aligned", PRINT_ALIGNED},
              {"asciidoc", PRINT_ASCIIDOC},
+             {"csv", PRINT_CSV},
              {"html", PRINT_HTML},
              {"latex", PRINT_LATEX},
              {"latex-longtable", PRINT_LATEX_LONGTABLE},
*************** do_pset(const char *param, const char *v
*** 3727,3733 ****
              }
              if (match_pos < 0)
              {
!                 psql_error("\\pset: allowed formats are aligned, asciidoc, html, latex, latex-longtable, troff-ms,
unaligned,wrapped\n"); 
                  return false;
              }
              else
--- 3731,3737 ----
              }
              if (match_pos < 0)
              {
!                 psql_error("\\pset: allowed formats are aligned, asciidoc, csv, html, latex, latex-longtable,
troff-ms,unaligned, wrapped\n"); 
                  return false;
              }
              else
*************** do_pset(const char *param, const char *v
*** 3865,3870 ****
--- 3869,3895 ----
          popt->topt.fieldSep.separator_zero = true;
      }

+     /* field separator for CSV format */
+     else if (strcmp(param, "fieldsep_csv") == 0)
+     {
+         if (value)
+         {
+             /* CSV separator has to be a one-byte character */
+             if (strlen(value) != 1)
+             {
+                 psql_error("\\pset: the CSV field separator must be a single character\n");
+                 return false;
+             }
+             if (value[0] == '"' || value[0] == '\n' || value[0] == '\r')
+             {
+                 psql_error("\\pset: the CSV field separator cannot be a double quote, a newline, or a carriage
return\n");
+                 return false;
+             }
+             free(popt->topt.fieldSepCsv);
+             popt->topt.fieldSepCsv = pg_strdup(value);
+         }
+     }
+
      /* record separator for unaligned text */
      else if (strcmp(param, "recordsep") == 0)
      {
*************** printPsetInfo(const char *param, struct
*** 4013,4018 ****
--- 4038,4050 ----
          printf(_("Field separator is zero byte.\n"));
      }

+     /* show field separator for CSV format */
+     else if (strcmp(param, "fieldsep_csv") == 0)
+     {
+         printf(_("Field separator for CSV is \"%s\".\n"),
+                popt->topt.fieldSepCsv);
+     }
+
      /* show disable "(x rows)" footer */
      else if (strcmp(param, "footer") == 0)
      {
*************** pset_value_string(const char *param, str
*** 4207,4212 ****
--- 4239,4246 ----
          return pset_quoted_string(popt->topt.fieldSep.separator
                                    ? popt->topt.fieldSep.separator
                                    : "");
+     else if (strcmp(param, "fieldsep_csv") == 0)
+         return pset_quoted_string(popt->topt.fieldSepCsv);
      else if (strcmp(param, "fieldsep_zero") == 0)
          return pstrdup(pset_bool_string(popt->topt.fieldSep.separator_zero));
      else if (strcmp(param, "footer") == 0)
diff --git a/src/bin/psql/help.c b/src/bin/psql/help.c
index 586aebd..961acd9 100644
*** a/src/bin/psql/help.c
--- b/src/bin/psql/help.c
*************** usage(unsigned short int pager)
*** 108,113 ****
--- 108,114 ----

      fprintf(output, _("\nOutput format options:\n"));
      fprintf(output, _("  -A, --no-align           unaligned table output mode\n"));
+     fprintf(output, _("      --csv                CSV (Comma-Separated Values) table output mode\n"));
      fprintf(output, _("  -F, --field-separator=STRING\n"
                        "                           field separator for unaligned output (default: \"%s\")\n"),
              DEFAULT_FIELD_SEP);
*************** slashUsage(unsigned short int pager)
*** 272,281 ****
      fprintf(output, _("  \\H                     toggle HTML output mode (currently %s)\n"),
              ON(pset.popt.topt.format == PRINT_HTML));
      fprintf(output, _("  \\pset [NAME [VALUE]]   set table output option\n"
!                       "                         (NAME := {border|columns|expanded|fieldsep|fieldsep_zero|\n"
!                       "                         footer|format|linestyle|null|numericlocale|pager|\n"
!                       "                         pager_min_lines|recordsep|recordsep_zero|tableattr|title|\n"
!                       "                         tuples_only|unicode_border_linestyle|\n"
                        "                         unicode_column_linestyle|unicode_header_linestyle})\n"));
      fprintf(output, _("  \\t [on|off]            show only rows (currently %s)\n"),
              ON(pset.popt.topt.tuples_only));
--- 273,282 ----
      fprintf(output, _("  \\H                     toggle HTML output mode (currently %s)\n"),
              ON(pset.popt.topt.format == PRINT_HTML));
      fprintf(output, _("  \\pset [NAME [VALUE]]   set table output option\n"
!                       "                         (NAME := {border|columns|expanded|fieldsep|fieldsep_csv|\n"
!                       "                         fieldsep_zero|footer|format|linestyle|null|numericlocale|\n"
!                       "                         pager|pager_min_lines|recordsep|recordsep_zero|tableattr|\n"
!                       "                         title|tuples_only|unicode_border_linestyle|\n"
                        "                         unicode_column_linestyle|unicode_header_linestyle})\n"));
      fprintf(output, _("  \\t [on|off]            show only rows (currently %s)\n"),
              ON(pset.popt.topt.tuples_only));
diff --git a/src/bin/psql/settings.h b/src/bin/psql/settings.h
index 69e617e..93d0b95 100644
*** a/src/bin/psql/settings.h
--- b/src/bin/psql/settings.h
***************
*** 13,18 ****
--- 13,19 ----
  #include "fe_utils/print.h"

  #define DEFAULT_FIELD_SEP "|"
+ #define DEFAULT_FIELD_SEP_CSV ","
  #define DEFAULT_RECORD_SEP "\n"

  #if defined(WIN32) || defined(__CYGWIN__)
diff --git a/src/bin/psql/startup.c b/src/bin/psql/startup.c
index be57574..84d02f4 100644
*** a/src/bin/psql/startup.c
--- b/src/bin/psql/startup.c
*************** main(int argc, char *argv[])
*** 144,149 ****
--- 144,151 ----
      pset.popt.topt.stop_table = true;
      pset.popt.topt.default_footer = true;

+     pset.popt.topt.fieldSepCsv = pg_strdup(DEFAULT_FIELD_SEP_CSV);
+
      pset.popt.topt.unicode_border_linestyle = UNICODE_LINESTYLE_SINGLE;
      pset.popt.topt.unicode_column_linestyle = UNICODE_LINESTYLE_SINGLE;
      pset.popt.topt.unicode_header_linestyle = UNICODE_LINESTYLE_SINGLE;
*************** parse_psql_options(int argc, char *argv[
*** 468,473 ****
--- 470,476 ----
          {"expanded", no_argument, NULL, 'x'},
          {"no-psqlrc", no_argument, NULL, 'X'},
          {"help", optional_argument, NULL, 1},
+         {"csv", no_argument, NULL, 2},
          {NULL, 0, NULL, 0}
      };

*************** parse_psql_options(int argc, char *argv[
*** 658,663 ****
--- 661,669 ----
                      exit(EXIT_SUCCESS);
                  }
                  break;
+             case 2:
+                 pset.popt.topt.format = PRINT_CSV;
+                 break;
              default:
          unknown_option:
                  fprintf(stderr, _("Try \"%s --help\" for more information.\n"),
diff --git a/src/bin/psql/tab-complete.c b/src/bin/psql/tab-complete.c
index 9dbd555..b51fd57 100644
*** a/src/bin/psql/tab-complete.c
--- b/src/bin/psql/tab-complete.c
*************** psql_completion(const char *text, int st
*** 2605,2610 ****
--- 2605,2611 ----
      /* Complete CREATE EVENT TRIGGER <name> ON with event_type */
      else if (Matches("CREATE", "EVENT", "TRIGGER", MatchAny, "ON"))
          COMPLETE_WITH("ddl_command_start", "ddl_command_end", "sql_drop");
+
      /*
       * Complete CREATE EVENT TRIGGER <name> ON <event_type>.  EXECUTE FUNCTION
       * is the recommended grammar instead of EXECUTE PROCEDURE in version 11
*************** psql_completion(const char *text, int st
*** 3525,3531 ****
          COMPLETE_WITH_QUERY(Query_for_list_of_roles);
      else if (TailMatchesCS("\\pset"))
          COMPLETE_WITH_CS("border", "columns", "expanded",
!                          "fieldsep", "fieldsep_zero", "footer", "format",
                           "linestyle", "null", "numericlocale",
                           "pager", "pager_min_lines",
                           "recordsep", "recordsep_zero",
--- 3526,3533 ----
          COMPLETE_WITH_QUERY(Query_for_list_of_roles);
      else if (TailMatchesCS("\\pset"))
          COMPLETE_WITH_CS("border", "columns", "expanded",
!                          "fieldsep", "fieldsep_csv", "fieldsep_zero",
!                          "footer", "format",
                           "linestyle", "null", "numericlocale",
                           "pager", "pager_min_lines",
                           "recordsep", "recordsep_zero",
*************** psql_completion(const char *text, int st
*** 3536,3542 ****
      else if (TailMatchesCS("\\pset", MatchAny))
      {
          if (TailMatchesCS("format"))
!             COMPLETE_WITH_CS("aligned", "asciidoc", "html", "latex",
                               "latex-longtable", "troff-ms", "unaligned",
                               "wrapped");
          else if (TailMatchesCS("linestyle"))
--- 3538,3544 ----
      else if (TailMatchesCS("\\pset", MatchAny))
      {
          if (TailMatchesCS("format"))
!             COMPLETE_WITH_CS("aligned", "asciidoc", "csv", "html", "latex",
                               "latex-longtable", "troff-ms", "unaligned",
                               "wrapped");
          else if (TailMatchesCS("linestyle"))
diff --git a/src/fe_utils/print.c b/src/fe_utils/print.c
index cb9a9a0..3b07280 100644
*** a/src/fe_utils/print.c
--- b/src/fe_utils/print.c
*************** print_aligned_vertical(const printTableC
*** 1737,1742 ****
--- 1737,1846 ----


  /**********************/
+ /* CSV format          */
+ /**********************/
+
+
+ static void
+ csv_escaped_print(const char *str, FILE *fout)
+ {
+     const char *p;
+
+     fputc('"', fout);
+     for (p = str; *p; p++)
+     {
+         if (*p == '"')
+             fputc('"', fout);    /* double quotes are doubled */
+         fputc(*p, fout);
+     }
+     fputc('"', fout);
+ }
+
+ static void
+ csv_print_field(const char *str, FILE *fout, const char *sep)
+ {
+     /*----------------
+      * Enclose and escape field contents when one of these conditions is met:
+      * - the field separator is found in the contents.
+      * - the field contains a CR or LF.
+      * - the field contains a double quote.
+      *----------------
+      */
+     if ((sep != NULL && *sep != '\0' && strstr(str, sep) != NULL) ||
+         strcspn(str, "\r\n\"") != strlen(str))
+         csv_escaped_print(str, fout);
+     else
+         fputs(str, fout);
+ }
+
+ static void
+ print_csv_text(const printTableContent *cont, FILE *fout)
+ {
+     const char *const *ptr;
+     int            i;
+
+     if (cancel_pressed)
+         return;
+
+     /*
+      * The title and footer are never printed in csv format. The header is
+      * printed if opt_tuples_only is false.
+      *
+      * Despite RFC 4180 saying that end of lines are CRLF, terminate lines
+      * with '\n', which prints out as the system-dependent EOL string in text
+      * mode (typically LF on Unix and CRLF on Windows).
+      */
+     if (cont->opt->start_table && !cont->opt->tuples_only)
+     {
+         /* print headers */
+         for (ptr = cont->headers; *ptr; ptr++)
+         {
+             if (ptr != cont->headers)
+                 fputs(cont->opt->fieldSepCsv, fout);
+             csv_print_field(*ptr, fout, cont->opt->fieldSepCsv);
+         }
+         fputc('\n', fout);
+     }
+
+     /* print cells */
+     for (i = 0, ptr = cont->cells; *ptr; i++, ptr++)
+     {
+         csv_print_field(*ptr, fout, cont->opt->fieldSepCsv);
+         if ((i + 1) % cont->ncolumns)
+             fputs(cont->opt->fieldSepCsv, fout);
+         else
+             fputc('\n', fout);
+     }
+ }
+
+ static void
+ print_csv_vertical(const printTableContent *cont, FILE *fout)
+ {
+     const char *const *ptr;
+     int            i;
+
+     /* print records */
+     for (i = 0, ptr = cont->cells; *ptr; i++, ptr++)
+     {
+         if (cancel_pressed)
+             return;
+
+         /* print name of column */
+         csv_print_field(cont->headers[i % cont->ncolumns], fout,
+                         cont->opt->fieldSepCsv);
+
+         /* print field separator */
+         fputs(cont->opt->fieldSepCsv, fout);
+
+         /* print field value */
+         csv_print_field(*ptr, fout, cont->opt->fieldSepCsv);
+
+         fputc('\n', fout);
+     }
+ }
+
+
+ /**********************/
  /* HTML printing ******/
  /**********************/

*************** printTable(const printTableContent *cont
*** 3234,3239 ****
--- 3338,3349 ----
              else
                  print_aligned_text(cont, fout, is_pager);
              break;
+         case PRINT_CSV:
+             if (cont->opt->expanded == 1)
+                 print_csv_vertical(cont, fout);
+             else
+                 print_csv_text(cont, fout);
+             break;
          case PRINT_HTML:
              if (cont->opt->expanded == 1)
                  print_html_vertical(cont, fout);
diff --git a/src/include/fe_utils/print.h b/src/include/fe_utils/print.h
index b761349..665a782 100644
*** a/src/include/fe_utils/print.h
--- b/src/include/fe_utils/print.h
*************** enum printFormat
*** 28,33 ****
--- 28,34 ----
      PRINT_NOTHING = 0,            /* to make sure someone initializes this */
      PRINT_ALIGNED,
      PRINT_ASCIIDOC,
+     PRINT_CSV,
      PRINT_HTML,
      PRINT_LATEX,
      PRINT_LATEX_LONGTABLE,
*************** typedef struct printTableOpt
*** 112,117 ****
--- 113,119 ----
      const printTextFormat *line_style;    /* line style (NULL for default) */
      struct separator fieldSep;    /* field separator for unaligned text mode */
      struct separator recordSep; /* record separator for unaligned text mode */
+     char       *fieldSepCsv;    /* field separator for csv format */
      bool        numericLocale;    /* locale-aware numeric units separator and
                                   * decimal marker */
      char       *tableAttr;        /* attributes for HTML <table ...> */
diff --git a/src/test/regress/expected/psql.out b/src/test/regress/expected/psql.out
index 3818cfe..ceea28f 100644
*** a/src/test/regress/expected/psql.out
--- b/src/test/regress/expected/psql.out
*************** border                   1
*** 262,267 ****
--- 262,268 ----
  columns                  0
  expanded                 off
  fieldsep                 '|'
+ fieldsep_csv             ','
  fieldsep_zero            off
  footer                   on
  format                   aligned
*************** last error message: division by zero
*** 3243,3245 ****
--- 3244,3389 ----
  \echo 'last error code:' :LAST_ERROR_SQLSTATE
  last error code: 22012
  \unset FETCH_COUNT
+ -- test csv format
+ -- test multi-line headers, wrapping, quoting rules and newline indicators
+ prepare q as select 'ab,cd' as col1,
+   'ab' as "col,2",
+   E'a\tb' as col3,
+   '"' as col4,
+   '""' as col5,
+   'a"b' as "col""6",
+   E'a\nb' as col7,
+   NULL as col8,
+   'ab' as "col
+ 9",
+   'cd' as "col
+
+ 10",
+   array['ab', E'cd\nef'] as col11,
+   '{"a":"a,b", "a,b":null, "c":"a,\"b"}'::json as col12
+   from generate_series(1,2);
+ \pset format csv
+ \pset fieldsep_csv ','
+ \pset expanded off
+ \t off
+ execute q;
+ col1,"col,2",col3,col4,col5,"col""6",col7,col8,"col
+ 9","col
+
+ 10",col11,col12
+ "ab,cd",ab,a    b,"""","""""","a""b","a
+ b",,ab,cd,"{ab,""cd
+ ef""}","{""a"":""a,b"", ""a,b"":null, ""c"":""a,\""b""}"
+ "ab,cd",ab,a    b,"""","""""","a""b","a
+ b",,ab,cd,"{ab,""cd
+ ef""}","{""a"":""a,b"", ""a,b"":null, ""c"":""a,\""b""}"
+ \pset fieldsep_csv '\t'
+ execute q;
+ col1    col,2    col3    col4    col5    "col""6"    col7    col8    "col
+ 9"    "col
+
+ 10"    col11    col12
+ ab,cd    ab    "a    b"    """"    """"""    "a""b"    "a
+ b"        ab    cd    "{ab,""cd
+ ef""}"    "{""a"":""a,b"", ""a,b"":null, ""c"":""a,\""b""}"
+ ab,cd    ab    "a    b"    """"    """"""    "a""b"    "a
+ b"        ab    cd    "{ab,""cd
+ ef""}"    "{""a"":""a,b"", ""a,b"":null, ""c"":""a,\""b""}"
+ \t on
+ execute q;
+ ab,cd    ab    "a    b"    """"    """"""    "a""b"    "a
+ b"        ab    cd    "{ab,""cd
+ ef""}"    "{""a"":""a,b"", ""a,b"":null, ""c"":""a,\""b""}"
+ ab,cd    ab    "a    b"    """"    """"""    "a""b"    "a
+ b"        ab    cd    "{ab,""cd
+ ef""}"    "{""a"":""a,b"", ""a,b"":null, ""c"":""a,\""b""}"
+ \t off
+ \pset expanded on
+ execute q;
+ col1    ab,cd
+ col,2    ab
+ col3    "a    b"
+ col4    """"
+ col5    """"""
+ "col""6"    "a""b"
+ col7    "a
+ b"
+ col8
+ "col
+ 9"    ab
+ "col
+
+ 10"    cd
+ col11    "{ab,""cd
+ ef""}"
+ col12    "{""a"":""a,b"", ""a,b"":null, ""c"":""a,\""b""}"
+ col1    ab,cd
+ col,2    ab
+ col3    "a    b"
+ col4    """"
+ col5    """"""
+ "col""6"    "a""b"
+ col7    "a
+ b"
+ col8
+ "col
+ 9"    ab
+ "col
+
+ 10"    cd
+ col11    "{ab,""cd
+ ef""}"
+ col12    "{""a"":""a,b"", ""a,b"":null, ""c"":""a,\""b""}"
+ \pset fieldsep_csv ','
+ execute q;
+ col1,"ab,cd"
+ "col,2",ab
+ col3,a    b
+ col4,""""
+ col5,""""""
+ "col""6","a""b"
+ col7,"a
+ b"
+ col8,
+ "col
+ 9",ab
+ "col
+
+ 10",cd
+ col11,"{ab,""cd
+ ef""}"
+ col12,"{""a"":""a,b"", ""a,b"":null, ""c"":""a,\""b""}"
+ col1,"ab,cd"
+ "col,2",ab
+ col3,a    b
+ col4,""""
+ col5,""""""
+ "col""6","a""b"
+ col7,"a
+ b"
+ col8,
+ "col
+ 9",ab
+ "col
+
+ 10",cd
+ col11,"{ab,""cd
+ ef""}"
+ col12,"{""a"":""a,b"", ""a,b"":null, ""c"":""a,\""b""}"
+ -- illegal csv separators
+ \pset fieldsep_csv ''
+ \pset: the CSV field separator must be a single character
+ \pset fieldsep_csv ',,'
+ \pset: the CSV field separator must be a single character
+ \pset fieldsep_csv '\0'
+ \pset: the CSV field separator must be a single character
+ \pset fieldsep_csv '\n'
+ \pset: the CSV field separator cannot be a double quote, a newline, or a carriage return
+ \pset fieldsep_csv '\r'
+ \pset: the CSV field separator cannot be a double quote, a newline, or a carriage return
+ \pset fieldsep_csv '"'
+ \pset: the CSV field separator cannot be a double quote, a newline, or a carriage return
+ deallocate q;
+ \pset format aligned
+ \pset expanded off
+ \t off
diff --git a/src/test/regress/sql/psql.sql b/src/test/regress/sql/psql.sql
index b45da9b..fd2d0df 100644
*** a/src/test/regress/sql/psql.sql
--- b/src/test/regress/sql/psql.sql
*************** select 1/(15-unique2) from tenk1 order b
*** 688,690 ****
--- 688,737 ----
  \echo 'last error code:' :LAST_ERROR_SQLSTATE

  \unset FETCH_COUNT
+
+ -- test csv format
+ -- test multi-line headers, wrapping, quoting rules and newline indicators
+ prepare q as select 'ab,cd' as col1,
+   'ab' as "col,2",
+   E'a\tb' as col3,
+   '"' as col4,
+   '""' as col5,
+   'a"b' as "col""6",
+   E'a\nb' as col7,
+   NULL as col8,
+   'ab' as "col
+ 9",
+   'cd' as "col
+
+ 10",
+   array['ab', E'cd\nef'] as col11,
+   '{"a":"a,b", "a,b":null, "c":"a,\"b"}'::json as col12
+   from generate_series(1,2);
+
+ \pset format csv
+ \pset fieldsep_csv ','
+ \pset expanded off
+ \t off
+ execute q;
+ \pset fieldsep_csv '\t'
+ execute q;
+ \t on
+ execute q;
+ \t off
+ \pset expanded on
+ execute q;
+ \pset fieldsep_csv ','
+ execute q;
+
+ -- illegal csv separators
+ \pset fieldsep_csv ''
+ \pset fieldsep_csv ',,'
+ \pset fieldsep_csv '\0'
+ \pset fieldsep_csv '\n'
+ \pset fieldsep_csv '\r'
+ \pset fieldsep_csv '"'
+
+ deallocate q;
+ \pset format aligned
+ \pset expanded off
+ \t off

Re: csv format for psql

From

Corey Huinker

Date:

26 November 2018, 04:19:11

Or we could kill both issues by hard-wiring the separator as ','.

I've never encountered a situation where a customer wanted a custom delimiter AND quoted strings. So either they wanted pure CSV or a customed TSV.

Could we have another output type called "separated" that uses the existing --fieldsep / --recordsep? Word will get out that csv is faster, but we'd still have the flexibility if somebody really wanted it.

Re: csv format for psql

From

Tom Lane

Date:

26 November 2018, 04:23:16

Corey Huinker <corey.huinker@gmail.com> writes:
> Could we have another output type called "separated" that uses the existing
> --fieldsep / --recordsep?

Uh, what's the difference from the existing unaligned format?

            regards, tom lane

Re: csv format for psql

From

Andrew Gierth

Date:

26 November 2018, 04:36:19

>>>>> "Tom" == Tom Lane <tgl@sss.pgh.pa.us> writes:

 Tom> Or we could kill both issues by hard-wiring the separator as ','.

Using ';' for the delimiter isn't all that rare.

But I don't see any reason to allow multibyte or non-ascii characters or
arbitrary strings.

-- 
Andrew (irc:RhodiumToad)

Re: csv format for psql

From

Pavel Stehule

Date:

26 November 2018, 04:53:05

po 26. 11. 2018 v 5:36 odesílatel Andrew Gierth <andrew@tao11.riddles.org.uk> napsal:

>>>>> "Tom" == Tom Lane <tgl@sss.pgh.pa.us> writes:

Tom> Or we could kill both issues by hard-wiring the separator as ','.

Using ';' for the delimiter isn't all that rare.

; is default for CSV produced by MS Excel in Czech mutation - so for some countries, like CR is common.

Regards

Pavel

But I don't see any reason to allow multibyte or non-ascii characters or
arbitrary strings.

--
Andrew (irc:RhodiumToad)

Re: csv format for psql

From

Corey Huinker

Date:

26 November 2018, 04:56:56

On Sun, Nov 25, 2018 at 11:23 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:

Corey Huinker <corey.huinker@gmail.com> writes:
> Could we have another output type called "separated" that uses the existing
> --fieldsep / --recordsep?

Uh, what's the difference from the existing unaligned format?

No footer and I guess we'd want to escape instances of fieldsep and recordsep in the data, so I guess if we had an option to escape instances of fieldsep/recordsep found in the data, unaligned would work fine.

Re: csv format for psql

From

Magnus Hagander

Date:

26 November 2018, 09:16:09

n Mon, Nov 26, 2018 at 5:53 AM Pavel Stehule <pavel.stehule@gmail.com> wrote:

po 26. 11. 2018 v 5:36 odesílatel Andrew Gierth <andrew@tao11.riddles.org.uk> napsal:
>>>>> "Tom" == Tom Lane <tgl@sss.pgh.pa.us> writes:

Tom> Or we could kill both issues by hard-wiring the separator as ','.

Using ';' for the delimiter isn't all that rare.

; is default for CSV produced by MS Excel in Czech mutation - so for some countries, like CR is common.

I believe this is a common occurance in countries that use decimal comma instead of decimal point, which is a fairly large portion of the world.

/Magnus

Re: csv format for psql

From

"Daniel Verite"

Date:

26 November 2018, 11:40:31

    Tom Lane wrote:

> And, in fact, right now *none* of psql's table output formats is both
> unambiguous and reasonably simple/popular to use.  So the astonishing
> thing about this patch, IMO, is that we didn't do it a decade ago.

Yeah, that's what motivated this submission in the first place.

> I feel that if we allow multi-byte characters here, we might as well
> take the training wheels off and just say you can use any separator
> string you want, as long as it doesn't contain double quote, \r, or \n.

The reason behind disallowing multiple characters was the
likeliness of them being mistakes. For instance, this example came
up at some point in the discussion:
  \pset fieldsep_csv ,,
To me the probability that a user has fat-fingered this is pretty high,
and this would silently produce a really bogus file.

Another kind of mistake comes from the difficulty of properly
quoting on the command line:
psql -- csv -P fieldsep_csv='\t'
would be interpreted as a two-character separator despite being
obviously not the user's intention.

About disallowing characters beyond US-ASCII, I can't find a similar
justification. COPY does not allow them, but it's justified (in the
archives) by the fear of being slower when importing, which is not a
concern here.

> We could avoid this self-inflicted confusion by choosing a different
> parameter name.  I'd be good with "csv_fieldsep" or "csvfieldsep".

+1

> Or we could kill both issues by hard-wiring the separator as ','.

Ideally people would understand that they can use -A for any delimiter
but no quoting, or --csv with strict quoting and in that case a fixed
delimiter is fine, since it's going to be safely quoted, its presence in
the data is not a problem. But I'm not too confident that everyone
would understand it that way, even if it's well explained in the doc.

When one is told "please produce CSV files with semi-colons as
separators", it's simpler to just produce that rather than arguing
that these requirements are probably ill-advised.

Best regards,
--
Daniel Vérité
PostgreSQL-powered mailer: http://www.manitou-mail.org
Twitter: @DanielVerite

Re: csv format for psql

From

"David G. Johnston"

Date:

26 November 2018, 15:28:46

On Sun, Nov 25, 2018 at 6:27 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> I think there are two remaining points to settle:
>
> 1. Are we limiting the separator to be a single-byte character or not?

I agree with what others have said that expanding functionality in
this direction is more likely to mask errors than be useful.

> I'm a bit inclined to the former viewpoint.
> If we were in the business of being restrictive, why would we allow
> the field separator to be changed at all?  The name of the format is
> *comma* separated values, not something else.

I still stand by the more inclusive, and arguably modern, name
"character separated values" for the abbreviation...which can be taken
to mean any single character quite easily and is how it appears to be
used these days in any case.

> 2. Speaking of the field separator, I'm pretty desperately unhappy
> with the choice of "fieldsep_csv" as the parameter name.[...]
> We could avoid this self-inflicted confusion by choosing a different
> parameter name.  I'd be good with "csv_fieldsep" or "csvfieldsep".

Make sense to me - with the underscore personally.

David J.

Re: csv format for psql

From

Tom Lane

Date:

26 November 2018, 16:14:05

"David G. Johnston" <david.g.johnston@gmail.com> writes:
> On Sun, Nov 25, 2018 at 6:27 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> 1. Are we limiting the separator to be a single-byte character or not?

> I agree with what others have said that expanding functionality in
> this direction is more likely to mask errors than be useful.

OK, reasonable arguments were made why not to allow multi-character
separators.  Should we then match the server and insist on a single-byte
separator?  It's a bit inconsistent if psql can be made to emit "csv"
files that COPY can't read, especially when it's otherwise a subset
of what COPY allows.

>> 2. Speaking of the field separator, I'm pretty desperately unhappy
>> with the choice of "fieldsep_csv" as the parameter name.[...]
>> We could avoid this self-inflicted confusion by choosing a different
>> parameter name.  I'd be good with "csv_fieldsep" or "csvfieldsep".

> Make sense to me - with the underscore personally.

Barring complaints, I'll switch it to "csv_fieldsep".

            regards, tom lane

Re: csv format for psql

From

"Daniel Verite"

Date:

26 November 2018, 17:47:32

    Tom Lane wrote:

> OK, reasonable arguments were made why not to allow multi-character
> separators.  Should we then match the server and insist on a single-byte
> separator?  It's a bit inconsistent if psql can be made to emit "csv"
> files that COPY can't read, especially when it's otherwise a subset
> of what COPY allows.

I seem to be the only one advocating to accept an Unicode
character here, and I won't insist on it.

There's still a minor annoyance to solve if we want to claim full
compatibility with COPY CSV: backslash followed by dot must be
quoted to avoid being interpreted as an end of data indicator.

A proposed fix is attached. print_csv_vertical() is left unchanged
because it's not possible currently to end up with \. alone
on a line with the expanded display: we'd need to allow
first for an empty field separator, I believe.

Best regards,
--
Daniel Vérité
PostgreSQL-powered mailer: http://www.manitou-mail.org
Twitter: @DanielVerite

Attachment

psql-csv-backslash-dot-fix.txt

Re: csv format for psql

From

"Daniel Verite"

Date:

26 November 2018, 19:01:20

    I wrote:

> A proposed fix is attached. print_csv_vertical() is left unchanged
> because it's not possible currently to end up with \. alone
> on a line with the expanded display

On second thought, it is possible

postgres=# \pset format csv
Output format is csv.

postgres=# \pset fieldsep_csv '.'
Field separator for CSV is ".".

postgres=# \x
Expanded display is on.

postgres=# select '' as "\" ;
\.

PFA an upgraded fix.

Best regards,
--
Daniel Vérité
PostgreSQL-powered mailer: http://www.manitou-mail.org
Twitter: @DanielVerite

Attachment

psql-csv-backslash-dot-fix-v2.txt

Re: csv format for psql

From

Tom Lane

Date:

26 November 2018, 20:37:11

"Daniel Verite" <daniel@manitou-mail.org> writes:
>> A proposed fix is attached. print_csv_vertical() is left unchanged
>> because it's not possible currently to end up with \. alone
>> on a line with the expanded display

> On second thought, it is possible
> postgres=# \pset format csv
> Output format is csv.
> postgres=# \pset fieldsep_csv '.'
> Field separator for CSV is ".".
> postgres=# \x
> Expanded display is on.
> postgres=# select '' as "\" ;
> \.

Ugh.  There's more to that than meets the eye; consider also

\pset csv_fieldsep '.'
select '\' as d1, '' as d2;

or the reverse case where we set the fieldsep to '\' and then
emit a second field containing '.'.

I think that the approach you're using doesn't really scale to handle
all these cases sanely, so what I did instead was just to make
csv_print_field force field quoting if any of these cases could
possibly apply.  That will result in excess quoting in some
cases, but I think that's fine, since they're all pretty uncommon.

(BTW, it seems only chance that the server's CSV mode isn't also
subject to this case, but since it always quotes empty fields,
I think we're OK there.)

Pushed with that correction and some other last-minute review.

            regards, tom lane

Re: csv format for psql

From

Pavel Stehule

Date:

26 November 2018, 22:05:51

Pushed with that correction and some other last-minute review.

Great! Thank you

Pavel

regards, tom lane

Re: csv format for psql

From

"Daniel Verite"

Date:

27 November 2018, 12:02:01

    Tom Lane wrote:

> what I did instead was just to make
> csv_print_field force field quoting if any of these cases could
> possibly apply.  That will result in excess quoting in some
> cases, but I think that's fine, since they're all pretty uncommon.

Indeed.

> (BTW, it seems only chance that the server's CSV mode isn't also
> subject to this case, but since it always quotes empty fields,
> I think we're OK there.)
>
> Pushed with that correction and some other last-minute review.

Thanks!


Best regards,
--
Daniel Vérité
PostgreSQL-powered mailer: http://www.manitou-mail.org
Twitter: @DanielVerite