Thread: Doc patch on psql output formats
Hi, psql's documentation has this mention about output formats: "Unique abbreviations are allowed. (That would mean one letter is enough.)" but "one letter is enough" is not true since 9.3 that added "latex-longtable" sharing the same start as "latex", and then 9.5 added "asciidoc" with the same first letter as "aligned". When a non-unique abbreviation is used, psql uses the first match in an arbitrary order defined in do_pset() by a cascade of pg_strncasecmp(). (the recent commit add9182e reorders them alphabetically but it turns out that it doesn't change the relative positions of "aligned" / "asciidoc", or "latex" / "latex-longtables" so \pset format a and \pset format l will continue to select "aligned" and "latex" as before). Anyway, "Unique abbreviations are allowed" deserves to be changed as well. PFA a doc patch to say simply "Abbreviations are allowed". Best regards, -- Daniel Vérité PostgreSQL-powered mailer: http://www.manitou-mail.org Twitter: @DanielVerite
Attachment
"Daniel Verite" <daniel@manitou-mail.org> writes: > psql's documentation has this mention about output formats: > "Unique abbreviations are allowed. (That would mean one letter is enough.)" > but "one letter is enough" is not true since 9.3 that added > "latex-longtable" sharing the same start as "latex", and then > 9.5 added "asciidoc" with the same first letter as "aligned". Yeah, that text has clearly outstayed its welcome. > When a non-unique abbreviation is used, psql uses the first > match in an arbitrary order defined in do_pset() by > a cascade of pg_strncasecmp(). Ugh. Should we not fix the code so that it complains if there's not a unique match? I would bet that the code was also written on the assumption that any abbrevation must be unique. regards, tom lane
Tom Lane wrote: > > When a non-unique abbreviation is used, psql uses the first > > match in an arbitrary order defined in do_pset() by > > a cascade of pg_strncasecmp(). > > Ugh. Should we not fix the code so that it complains if there's > not a unique match? I would bet that the code was also written > on the assumption that any abbrevation must be unique. There's a backward compatibility issue, since a script issuing \pset format a would now fail instead of setting the format to "aligned". The doc patch is meant to describe the current behavior. OTOH if we don't fail with non-unique matches, there is the risk that in the future, a new format matching /^a/ will come alphabetically before "aligned", and will be picked up instead of "aligned". In both cases using abbreviations in scripts is not a great idea. Anyway I will look into the changes required in do_pset to implement the error on multiple matches, if that's the preferred behavior. Best regards, -- Daniel Vérité PostgreSQL-powered mailer: http://www.manitou-mail.org Twitter: @DanielVerite
On Tue, Nov 06, 2018 at 06:18:37PM +0100, Daniel Verite wrote: > In both cases using abbreviations in scripts is not a great > idea. Anyway I will look into the changes required in do_pset to > implement the error on multiple matches, if that's the preferred > behavior. You would also break the compatibility of a script using only "a" by issuing an error. Anyway, complaining about an unmatch sounds better than a silent failure. -- Michael
Attachment
Tom Lane wrote: > > but "one letter is enough" is not true since 9.3 that added > > "latex-longtable" sharing the same start as "latex", and then > > 9.5 added "asciidoc" with the same first letter as "aligned". > > Yeah, that text has clearly outstayed its welcome. > > > When a non-unique abbreviation is used, psql uses the first > > match in an arbitrary order defined in do_pset() by > > a cascade of pg_strncasecmp(). > > Ugh. Should we not fix the code so that it complains if there's > not a unique match? I would bet that the code was also written > on the assumption that any abbrevation must be unique. Here's a patch making "\pset format" reject ambiguous abbreviations. Best regards, -- Daniel Vérité PostgreSQL-powered mailer: http://www.manitou-mail.org Twitter: @DanielVerite
Attachment
"Daniel Verite" <daniel@manitou-mail.org> writes: > Tom Lane wrote: >> Ugh. Should we not fix the code so that it complains if there's >> not a unique match? I would bet that the code was also written >> on the assumption that any abbrevation must be unique. > Here's a patch making "\pset format" reject ambiguous abbreviations. Pushed. (I simplified the code a bit by using just one state variable, and also made the error message more verbose.) regards, tom lane
Tom Lane wrote: > Pushed. (I simplified the code a bit by using just one state variable, > and also made the error message more verbose.) Thanks! Best regards, -- Daniel Vérité PostgreSQL-powered mailer: http://www.manitou-mail.org Twitter: @DanielVerite
"Daniel Verite" <daniel@manitou-mail.org> writes: > Tom Lane wrote: >> Pushed. (I simplified the code a bit by using just one state variable, >> and also made the error message more verbose.) > Thanks! I noticed while poking at the csv patch that we'd outsmarted ourselves with this one. As of HEAD, it's impossible to select latex output format at all: regression=# \pset format latex \pset: ambiguous abbreviation "latex" matches both "latex" and "latex-longtable" We could fix that by adding a special case to accept an exact match immediately. However, that would still leave us in a situation where "latex" can never be abbreviated at all, which does not seem very nice (not to mention not being backwards-compatible). Instead I propose treating "latex-longtable" as a special case, as attached. With this code, "l", "la", up through "latex" are all accepted as "latex", while "latex-" through "latex-longtable" are accepted as "latex-longtable". This has less ideological purity than one could wish, but it's backwards compatible and arguably a lot more user-friendly than what we'd have if we insist on an exact match for "latex". In future, let's reject any proposal to invent switch or option names such that one is a prefix of another. regards, tom lane diff --git a/src/bin/psql/command.c b/src/bin/psql/command.c index ee88e1c..13d4c57 100644 --- a/src/bin/psql/command.c +++ b/src/bin/psql/command.c @@ -3698,7 +3698,6 @@ do_pset(const char *param, const char *value, printQueryOpt *popt, bool quiet) {"asciidoc", PRINT_ASCIIDOC}, {"html", PRINT_HTML}, {"latex", PRINT_LATEX}, - {"latex-longtable", PRINT_LATEX_LONGTABLE}, {"troff-ms", PRINT_TROFF_MS}, {"unaligned", PRINT_UNALIGNED}, {"wrapped", PRINT_WRAPPED} @@ -3725,13 +3724,22 @@ do_pset(const char *param, const char *value, printQueryOpt *popt, bool quiet) } } } - if (match_pos < 0) + if (match_pos >= 0) + popt->topt.format = formats[match_pos].number; + else if (pg_strncasecmp("latex-longtable", value, vallen) == 0) + { + /* + * We must treat latex-longtable specially because latex is a + * prefix of it; if both were in the table above, we'd think + * "latex" is ambiguous. + */ + popt->topt.format = PRINT_LATEX_LONGTABLE; + } + else { psql_error("\\pset: allowed formats are aligned, asciidoc, html, latex, latex-longtable, troff-ms, unaligned,wrapped\n"); return false; } - else - popt->topt.format = formats[match_pos].number; } }
Tom Lane wrote: > As of HEAD, it's impossible to select latex output format > at all: > > regression=# \pset format latex > \pset: ambiguous abbreviation "latex" matches both "latex" and > "latex-longtable" Oops! > We could fix that by adding a special case to accept an exact match > immediately. Personally I would favor that one, but to me the problem is that allowing abbreviations doesn't really work well in the long run, that is, if new format names are going to appear recurringly in the future. In interactive mode, tab-completion seems good enough (maybe \pset was not tab-completed in the past, when abbreviations were initially implemented?). In non-interactive mode, anything abbreviated may clash in the future with another format, so there's a forward-compatibility hazzard that should motivate to avoid them as well in scripts. Best regards, -- Daniel Vérité PostgreSQL-powered mailer: http://www.manitou-mail.org Twitter: @DanielVerite
"Daniel Verite" <daniel@manitou-mail.org> writes: > Tom Lane wrote: >> As of HEAD, it's impossible to select latex output format >> at all: >> regression=# \pset format latex >> \pset: ambiguous abbreviation "latex" matches both "latex" and >> "latex-longtable" > Oops! >> We could fix that by adding a special case to accept an exact match >> immediately. > Personally I would favor that one, but to me the problem is that > allowing abbreviations doesn't really work well in the long run, > that is, if new format names are going to appear recurringly in the > future. Perhaps, but that ship sailed years ago. I do not think we can or should remove the ability to write abbreviations here. People writing scripts are obviously taking a risk of future breakage if they abbreviate, but how is it an improvement in their lives if we change the risk from "uncertain" to "100%"? regards, tom lane