Thread: Make psql ignore trailing semicolons in \sf, \ef, etc

Make psql ignore trailing semicolons in \sf, \ef, etc

From
Tom Lane
Date:
We had a complaint (see [1], but it's not the first IIRC) about how
psql doesn't behave very nicely if one ends \sf or allied commands
with a semicolon:

regression=# \sf sin(float8);
ERROR:  expected a right parenthesis

This is a bit of a usability gotcha, since many other backslash
commands are forgiving about trailing semicolons.  I looked at
the code and found that it's actually trying to ignore semicolons,
by passing semicolon = true to psql_scan_slash_option.  But that
fails to work because it's also passing type = OT_WHOLE_LINE,
and the whole-line code path ignores the semicolon flag.  Probably
that's just because nobody needed to use that combination back in
the day.  There's another user of OT_WHOLE_LINE, exec_command_help,
which also wants this behavior and has written its own stripping
code to get it.  That seems pretty silly, so here's a quick finger
exercise to move that logic into psql_scan_slash_option.

Is this enough of a bug to deserve back-patching?  I'm not sure.

            regards, tom lane

[1] https://www.postgresql.org/message-id/CAEs%3D6D%3DnwX2wm0hjkaw6C_LnqR%2BNFtnnzbSzeZq-xcfi_ooKSw%40mail.gmail.com

diff --git a/src/bin/psql/command.c b/src/bin/psql/command.c
index 9103bc3465..2f74bfdc96 100644
--- a/src/bin/psql/command.c
+++ b/src/bin/psql/command.c
@@ -1640,18 +1640,7 @@ exec_command_help(PsqlScanState scan_state, bool active_branch)
     if (active_branch)
     {
         char       *opt = psql_scan_slash_option(scan_state,
-                                                 OT_WHOLE_LINE, NULL, false);
-        size_t        len;
-
-        /* strip any trailing spaces and semicolons */
-        if (opt)
-        {
-            len = strlen(opt);
-            while (len > 0 &&
-                   (isspace((unsigned char) opt[len - 1])
-                    || opt[len - 1] == ';'))
-                opt[--len] = '\0';
-        }
+                                                 OT_WHOLE_LINE, NULL, true);

         helpSQL(opt, pset.popt.topt.pager);
         free(opt);
@@ -3165,6 +3154,10 @@ ignore_slash_filepipe(PsqlScanState scan_state)
  * This *MUST* be used for inactive-branch processing of any slash command
  * that takes an OT_WHOLE_LINE option.  Otherwise we might consume a different
  * amount of option text in active and inactive cases.
+ *
+ * Note: although callers might pass "semicolon" as either true or false,
+ * we need not duplicate that here, since it doesn't affect the amount of
+ * input text consumed.
  */
 static void
 ignore_slash_whole_line(PsqlScanState scan_state)
diff --git a/src/bin/psql/psqlscanslash.l b/src/bin/psql/psqlscanslash.l
index 514977e59d..10a35dffc4 100644
--- a/src/bin/psql/psqlscanslash.l
+++ b/src/bin/psql/psqlscanslash.l
@@ -18,6 +18,8 @@
  */
 #include "postgres_fe.h"

+#include <ctype.h>
+
 #include "common.h"
 #include "psqlscanslash.h"

@@ -640,7 +642,22 @@ psql_scan_slash_option(PsqlScanState state,
             termPQExpBuffer(&mybuf);
             return NULL;
         case xslashwholeline:
-            /* always okay */
+            /*
+             * In whole-line mode, we interpret semicolon = true as stripping
+             * trailing whitespace as well as semi-colons; this gives the
+             * nearest equivalent to what semicolon = true does in normal
+             * mode.  Note there's no concept of quoting in this mode.
+             */
+            if (semicolon)
+            {
+                while (mybuf.len > 0 &&
+                       (mybuf.data[mybuf.len - 1] == ';' ||
+                        (isascii((unsigned char) mybuf.data[mybuf.len - 1]) &&
+                         isspace((unsigned char) mybuf.data[mybuf.len - 1]))))
+                {
+                    mybuf.data[--mybuf.len] = '\0';
+                }
+            }
             break;
         default:
             /* can't get here */
diff --git a/src/test/regress/expected/psql.out b/src/test/regress/expected/psql.out
index 5d61e4c7bb..4f3fd46420 100644
--- a/src/test/regress/expected/psql.out
+++ b/src/test/regress/expected/psql.out
@@ -5323,7 +5323,7 @@ END
          LANGUAGE sql
          IMMUTABLE PARALLEL SAFE STRICT COST 1
 1       RETURN ($2 + $1)
-\sf ts_debug(text)
+\sf ts_debug(text);
 CREATE OR REPLACE FUNCTION pg_catalog.ts_debug(document text, OUT alias text, OUT description text, OUT token text,
OUTdictionaries regdictionary[], OUT dictionary regdictionary, OUT lexemes text[]) 
  RETURNS SETOF record
  LANGUAGE sql
diff --git a/src/test/regress/sql/psql.sql b/src/test/regress/sql/psql.sql
index f199d624d3..c997106b9f 100644
--- a/src/test/regress/sql/psql.sql
+++ b/src/test/regress/sql/psql.sql
@@ -1315,7 +1315,7 @@ drop role regress_psql_user;
 \sf information_schema._pg_index_position
 \sf+ information_schema._pg_index_position
 \sf+ interval_pl_time
-\sf ts_debug(text)
+\sf ts_debug(text);
 \sf+ ts_debug(text)

 -- AUTOCOMMIT

Re: Make psql ignore trailing semicolons in \sf, \ef, etc

From
"Tristan Partin"
Date:
On Mon Jan 8, 2024 at 2:48 PM CST, Tom Lane wrote:
> We had a complaint (see [1], but it's not the first IIRC) about how
> psql doesn't behave very nicely if one ends \sf or allied commands
> with a semicolon:
>
> regression=# \sf sin(float8);
> ERROR:  expected a right parenthesis
>
> This is a bit of a usability gotcha, since many other backslash
> commands are forgiving about trailing semicolons.  I looked at
> the code and found that it's actually trying to ignore semicolons,
> by passing semicolon = true to psql_scan_slash_option.  But that
> fails to work because it's also passing type = OT_WHOLE_LINE,
> and the whole-line code path ignores the semicolon flag.  Probably
> that's just because nobody needed to use that combination back in
> the day.  There's another user of OT_WHOLE_LINE, exec_command_help,
> which also wants this behavior and has written its own stripping
> code to get it.  That seems pretty silly, so here's a quick finger
> exercise to move that logic into psql_scan_slash_option.
>
> Is this enough of a bug to deserve back-patching?  I'm not sure.
>
>             regards, tom lane
>
> [1] https://www.postgresql.org/message-id/CAEs%3D6D%3DnwX2wm0hjkaw6C_LnqR%2BNFtnnzbSzeZq-xcfi_ooKSw%40mail.gmail.com

>  +                        /*
>  +                         * In whole-line mode, we interpret semicolon = true as stripping
>  +                         * trailing whitespace as well as semi-colons; this gives the
>  +                         * nearest equivalent to what semicolon = true does in normal
>  +                         * mode.  Note there's no concept of quoting in this mode.
>  +                         */
>  +                        if (semicolon)
>  +                        {
>  +                                while (mybuf.len > 0 &&
>  +                                           (mybuf.data[mybuf.len - 1] == ';' ||
>  +                                                (isascii((unsigned char) mybuf.data[mybuf.len - 1]) &&
>  +                                                 isspace((unsigned char) mybuf.data[mybuf.len - 1]))))
>  +                                {
>  +                                        mybuf.data[--mybuf.len] = '\0';
>  +                                }
>  +                        }

Seems like if there was going to be any sort of casting, it would be to
an int, which is what the man page says for these two function, though
isascii(3) explicitly mentions "unsigned char."

Small English nit-pick: I would drop the hyphen between semi and colons.

As for backpatching, seems useful in the sense that people can write the
same script for all supported version of Postgres using the relaxed
syntax.

--
Tristan Partin
Neon (https://neon.tech)



Re: Make psql ignore trailing semicolons in \sf, \ef, etc

From
Tom Lane
Date:
"Tristan Partin" <tristan@neon.tech> writes:
> On Mon Jan 8, 2024 at 2:48 PM CST, Tom Lane wrote:
>> +                                                (isascii((unsigned char) mybuf.data[mybuf.len - 1]) &&
>> +                                                 isspace((unsigned char) mybuf.data[mybuf.len - 1]))))

> Seems like if there was going to be any sort of casting, it would be to
> an int, which is what the man page says for these two function, though
> isascii(3) explicitly mentions "unsigned char."

Casting to unsigned char is our standard pattern for using these
functions.  If "char" is signed (which is the only case in which
this changes anything) then casting to int would imply sign-extension
of the char's high-order bit, which is exactly what must not happen
in order to produce a legal value to be passed to these functions.
POSIX says:

    The c argument is an int, the value of which the application shall
    ensure is a character representable as an unsigned char or equal
    to the value of the macro EOF. If the argument has any other
    value, the behavior is undefined.

If we cast to unsigned char, then the subsequent implicit cast to int
will do zero-extension which is what we need.

> Small English nit-pick: I would drop the hyphen between semi and colons.

Me too, except that it's spelled like that in nearby comments.
Shall I change them all?

            regards, tom lane



Re: Make psql ignore trailing semicolons in \sf, \ef, etc

From
Laurenz Albe
Date:
On Mon, 2024-01-08 at 15:48 -0500, Tom Lane wrote:
> Is this enough of a bug to deserve back-patching?  I'm not sure.

I like the patch, but I wouldn't back-patch it.  I'd call the current
behavior a slight inconsistency rather than an outright bug, and I think
that we should be conservative with back-patching.

Yours,
Laurenz Albe



Re: Make psql ignore trailing semicolons in \sf, \ef, etc

From
"Tristan Partin"
Date:
On Mon Jan 8, 2024 at 6:08 PM CST, Tom Lane wrote:
> "Tristan Partin" <tristan@neon.tech> writes:
> > On Mon Jan 8, 2024 at 2:48 PM CST, Tom Lane wrote:
> >> +                                                (isascii((unsigned char) mybuf.data[mybuf.len - 1]) &&
> >> +                                                 isspace((unsigned char) mybuf.data[mybuf.len - 1]))))
>
> > Seems like if there was going to be any sort of casting, it would be to
> > an int, which is what the man page says for these two function, though
> > isascii(3) explicitly mentions "unsigned char."
>
> Casting to unsigned char is our standard pattern for using these
> functions.  If "char" is signed (which is the only case in which
> this changes anything) then casting to int would imply sign-extension
> of the char's high-order bit, which is exactly what must not happen
> in order to produce a legal value to be passed to these functions.
> POSIX says:
>
>     The c argument is an int, the value of which the application shall
>     ensure is a character representable as an unsigned char or equal
>     to the value of the macro EOF. If the argument has any other
>     value, the behavior is undefined.
>
> If we cast to unsigned char, then the subsequent implicit cast to int
> will do zero-extension which is what we need.

Thanks for the explanation.

> > Small English nit-pick: I would drop the hyphen between semi and colons.
>
> Me too, except that it's spelled like that in nearby comments.
> Shall I change them all?

I'll leave it up to you. Patch looks good as-is.

--
Tristan Partin
Neon (https://neon.tech)



Re: Make psql ignore trailing semicolons in \sf, \ef, etc

From
Tom Lane
Date:
Laurenz Albe <laurenz.albe@cybertec.at> writes:
> On Mon, 2024-01-08 at 15:48 -0500, Tom Lane wrote:
>> Is this enough of a bug to deserve back-patching?  I'm not sure.

> I like the patch, but I wouldn't back-patch it.  I'd call the current
> behavior a slight inconsistency rather than an outright bug, and I think
> that we should be conservative with back-patching.

Nobody spoke in favor of back-patching, so committed to HEAD only.

            regards, tom lane