Thread: Doc patch on psql output formats

Doc patch on psql output formats

From

"Daniel Verite"

Date:

06 November 2018, 14:46:32

  Hi,

psql's documentation has this mention about output formats:
 "Unique abbreviations are allowed. (That would mean one letter is enough.)"

but "one letter is enough" is not true since 9.3 that added
"latex-longtable" sharing the same start as "latex", and then
9.5 added "asciidoc" with the same first letter as "aligned".

When a non-unique abbreviation is used, psql uses the first
match in an arbitrary order defined in do_pset() by
a cascade of pg_strncasecmp().
(the recent commit add9182e reorders them alphabetically
but it turns out that it doesn't change the relative positions
of  "aligned" / "asciidoc", or "latex" / "latex-longtables"
so \pset format a and \pset format l will continue to
select "aligned" and "latex" as before).

Anyway, "Unique abbreviations are allowed" deserves to
be changed as well.

PFA a doc patch to say simply "Abbreviations are allowed".


Best regards,
--
Daniel Vérité
PostgreSQL-powered mailer: http://www.manitou-mail.org
Twitter: @DanielVerite

Attachment

doc-psql-output-formats.patch

Re: Doc patch on psql output formats

From

Tom Lane

Date:

06 November 2018, 15:03:14

"Daniel Verite" <daniel@manitou-mail.org> writes:
> psql's documentation has this mention about output formats:
>  "Unique abbreviations are allowed. (That would mean one letter is enough.)"

> but "one letter is enough" is not true since 9.3 that added
> "latex-longtable" sharing the same start as "latex", and then 
> 9.5 added "asciidoc" with the same first letter as "aligned".

Yeah, that text has clearly outstayed its welcome.

> When a non-unique abbreviation is used, psql uses the first
> match in an arbitrary order defined in do_pset() by
> a cascade of pg_strncasecmp().

Ugh.  Should we not fix the code so that it complains if there's
not a unique match?  I would bet that the code was also written
on the assumption that any abbrevation must be unique.

            regards, tom lane

Re: Doc patch on psql output formats

From

"Daniel Verite"

Date:

06 November 2018, 17:18:37

    Tom Lane wrote:

> > When a non-unique abbreviation is used, psql uses the first
> > match in an arbitrary order defined in do_pset() by
> > a cascade of pg_strncasecmp().
>
> Ugh.  Should we not fix the code so that it complains if there's
> not a unique match?  I would bet that the code was also written
> on the assumption that any abbrevation must be unique.

There's a backward compatibility issue, since a script issuing
  \pset format a
would now fail instead of setting the format to "aligned".
The doc patch is meant to describe the current behavior.

OTOH if we don't fail with non-unique matches, there is the risk that
in the future, a new format matching /^a/ will come alphabetically
before "aligned", and will be picked up instead of "aligned".

In both cases using abbreviations in scripts is not a great
idea. Anyway I will look into the changes required in do_pset to
implement the error on multiple matches, if that's the preferred
behavior.

Best regards,
--
Daniel Vérité
PostgreSQL-powered mailer: http://www.manitou-mail.org
Twitter: @DanielVerite

Re: Doc patch on psql output formats

From

Michael Paquier

Date:

07 November 2018, 01:56:00

On Tue, Nov 06, 2018 at 06:18:37PM +0100, Daniel Verite wrote:
> In both cases using abbreviations in scripts is not a great
> idea. Anyway I will look into the changes required in do_pset to
> implement the error on multiple matches, if that's the preferred
> behavior.

You would also break the compatibility of a script using only "a" by
issuing an error.  Anyway, complaining about an unmatch sounds better
than a silent failure.
--
Michael

Attachment

signature.asc

Re: Doc patch on psql output formats

From

"Daniel Verite"

Date:

14 November 2018, 17:22:09

    Tom Lane wrote:

> > but "one letter is enough" is not true since 9.3 that added
> > "latex-longtable" sharing the same start as "latex", and then
> > 9.5 added "asciidoc" with the same first letter as "aligned".
>
> Yeah, that text has clearly outstayed its welcome.
>
> > When a non-unique abbreviation is used, psql uses the first
> > match in an arbitrary order defined in do_pset() by
> > a cascade of pg_strncasecmp().
>
> Ugh.  Should we not fix the code so that it complains if there's
> not a unique match?  I would bet that the code was also written
> on the assumption that any abbrevation must be unique.

Here's a patch making "\pset format" reject ambiguous abbreviations.


Best regards,
--
Daniel Vérité
PostgreSQL-powered mailer: http://www.manitou-mail.org
Twitter: @DanielVerite

Attachment

psql-format-abbreviations.patch

Re: Doc patch on psql output formats

From

Tom Lane

Date:

14 November 2018, 21:41:27

"Daniel Verite" <daniel@manitou-mail.org> writes:
>     Tom Lane wrote:
>> Ugh.  Should we not fix the code so that it complains if there's
>> not a unique match?  I would bet that the code was also written
>> on the assumption that any abbrevation must be unique.

> Here's a patch making "\pset format" reject ambiguous abbreviations.

Pushed.  (I simplified the code a bit by using just one state variable,
and also made the error message more verbose.)

            regards, tom lane

Re: Doc patch on psql output formats

From

"Daniel Verite"

Date:

15 November 2018, 11:49:28

    Tom Lane wrote:

> Pushed.  (I simplified the code a bit by using just one state variable,
> and also made the error message more verbose.)

Thanks!


Best regards,
--
Daniel Vérité
PostgreSQL-powered mailer: http://www.manitou-mail.org
Twitter: @DanielVerite

Re: Doc patch on psql output formats

From

Tom Lane

Date:

25 November 2018, 23:01:26

"Daniel Verite" <daniel@manitou-mail.org> writes:
> Tom Lane wrote:
>> Pushed.  (I simplified the code a bit by using just one state variable,
>> and also made the error message more verbose.)

> Thanks!

I noticed while poking at the csv patch that we'd outsmarted ourselves
with this one.  As of HEAD, it's impossible to select latex output format
at all:

regression=# \pset format latex
\pset: ambiguous abbreviation "latex" matches both "latex" and "latex-longtable"

We could fix that by adding a special case to accept an exact match
immediately.  However, that would still leave us in a situation where
"latex" can never be abbreviated at all, which does not seem very nice
(not to mention not being backwards-compatible).  Instead I propose
treating "latex-longtable" as a special case, as attached.  With this
code, "l", "la", up through "latex" are all accepted as "latex", while
"latex-" through "latex-longtable" are accepted as "latex-longtable".
This has less ideological purity than one could wish, but it's backwards
compatible and arguably a lot more user-friendly than what we'd have
if we insist on an exact match for "latex".

In future, let's reject any proposal to invent switch or option names
such that one is a prefix of another.

            regards, tom lane

diff --git a/src/bin/psql/command.c b/src/bin/psql/command.c
index ee88e1c..13d4c57 100644
--- a/src/bin/psql/command.c
+++ b/src/bin/psql/command.c
@@ -3698,7 +3698,6 @@ do_pset(const char *param, const char *value, printQueryOpt *popt, bool quiet)
             {"asciidoc", PRINT_ASCIIDOC},
             {"html", PRINT_HTML},
             {"latex", PRINT_LATEX},
-            {"latex-longtable", PRINT_LATEX_LONGTABLE},
             {"troff-ms", PRINT_TROFF_MS},
             {"unaligned", PRINT_UNALIGNED},
             {"wrapped", PRINT_WRAPPED}
@@ -3725,13 +3724,22 @@ do_pset(const char *param, const char *value, printQueryOpt *popt, bool quiet)
                     }
                 }
             }
-            if (match_pos < 0)
+            if (match_pos >= 0)
+                popt->topt.format = formats[match_pos].number;
+            else if (pg_strncasecmp("latex-longtable", value, vallen) == 0)
+            {
+                /*
+                 * We must treat latex-longtable specially because latex is a
+                 * prefix of it; if both were in the table above, we'd think
+                 * "latex" is ambiguous.
+                 */
+                popt->topt.format = PRINT_LATEX_LONGTABLE;
+            }
+            else
             {
                 psql_error("\\pset: allowed formats are aligned, asciidoc, html, latex, latex-longtable, troff-ms,
unaligned,wrapped\n"); 
                 return false;
             }
-            else
-                popt->topt.format = formats[match_pos].number;
         }
     }

Re: Doc patch on psql output formats

From

"Daniel Verite"

Date:

26 November 2018, 12:00:36

    Tom Lane wrote:

> As of HEAD, it's impossible to select latex output format
> at all:
>
> regression=# \pset format latex
> \pset: ambiguous abbreviation "latex" matches both "latex" and
> "latex-longtable"

Oops!

> We could fix that by adding a special case to accept an exact match
> immediately.

Personally I would favor that one, but to me the problem is that
allowing abbreviations doesn't really work well in the long run,
that is, if new format names are going to appear recurringly in the
future.

In interactive mode, tab-completion seems good enough (maybe \pset was
not tab-completed in the past, when abbreviations were initially
implemented?).

In non-interactive mode, anything abbreviated may clash in the future
with another format, so there's a forward-compatibility hazzard that
should motivate to avoid them as well in scripts.

Best regards,
--
Daniel Vérité
PostgreSQL-powered mailer: http://www.manitou-mail.org
Twitter: @DanielVerite

Re: Doc patch on psql output formats

From

Tom Lane

Date:

26 November 2018, 14:39:50

"Daniel Verite" <daniel@manitou-mail.org> writes:
>     Tom Lane wrote:
>> As of HEAD, it's impossible to select latex output format
>> at all:
>> regression=# \pset format latex
>> \pset: ambiguous abbreviation "latex" matches both "latex" and
>> "latex-longtable"

> Oops!

>> We could fix that by adding a special case to accept an exact match
>> immediately.

> Personally I would favor that one, but to me the problem is that
> allowing abbreviations doesn't really work well in the long run,
> that is, if new format names are going to appear recurringly in the
> future.

Perhaps, but that ship sailed years ago.  I do not think we can or
should remove the ability to write abbreviations here.  People
writing scripts are obviously taking a risk of future breakage
if they abbreviate, but how is it an improvement in their lives
if we change the risk from "uncertain" to "100%"?

            regards, tom lane