Thread: improve pgbench syntax error messages

improve pgbench syntax error messages

From
Fabien COELHO
Date:
Report the origin of syntax errors from pgbench.

Currently only the column number (for expressions) and command are 
essentially reported:
  sh> ./pgbench -f bad.sql  syntax error at column 14  set: parse error

The patch helps locate the origin of errors with the file name, line 
number and the actual text triggering the issue (either the line or an 
extract for expressions):
  sh> ./pgbench -f bad.sql  syntax error at column 14  error while processing "bad.sql" line 3: (1021 * :id) %  set:
parseerror
 

Whether using a macro is the right tool is debatable. The contents could 
be expanded, but that would mean replicating the same message over and 
over again, so it seems cleaner to me this way. An function seems 
overkill.

-- 
Fabien.

Re: improve pgbench syntax error messages

From
Robert Haas
Date:
On Tue, Mar 3, 2015 at 3:48 AM, Fabien COELHO <coelho@cri.ensmp.fr> wrote:
> Report the origin of syntax errors from pgbench.
>
> Currently only the column number (for expressions) and command are
> essentially reported:
>
>   sh> ./pgbench -f bad.sql
>   syntax error at column 14
>   set: parse error
>
> The patch helps locate the origin of errors with the file name, line number
> and the actual text triggering the issue (either the line or an extract for
> expressions):
>
>   sh> ./pgbench -f bad.sql
>   syntax error at column 14
>   error while processing "bad.sql" line 3: (1021 * :id) %
>   set: parse error
>
> Whether using a macro is the right tool is debatable. The contents could be
> expanded, but that would mean replicating the same message over and over
> again, so it seems cleaner to me this way. An function seems overkill.

As I mentioned on the other thread, I'd really like to get this into a
better format, where each error message is on one line.  Looking at
that, you can't tell whether you've got one mistake, two mistakes, or
three mistakes.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: improve pgbench syntax error messages

From
Fabien COELHO
Date:
> As I mentioned on the other thread, I'd really like to get this into a
> better format, where each error message is on one line.  Looking at
> that, you can't tell whether you've got one mistake, two mistakes, or
> three mistakes.

Indeed. Here is a v2.
  sh> ./pgbench -f bad.sql  bad.sql:3: syntax error at column 23 in command "set"  \set aid (1021 * :id) %
         ^ error found here
 
  sh> ./pgbench -f bad2.sql  bad2.sql:1: unexpected argument (x) at column 25 in command "setrandom"     \setrandom id
11000 x                          ^ error found here
 
  sh> ./pgbench -f bad3.sql  bad3.sql:1: too many arguments (gaussian) at column 35 in command "setrandom"  \setrandom
foo1 10 gaussian 10.3 1                                    ^ error found here
 
 sh> ./pgbench -f bad4.sql  bad4.sql:1: missing threshold argument (exponential) in command "setrandom"  \setrandom foo
110 exponential
 


I think that transforming unexpected garbage in errors would be a good 
move, even if this breaks backwards compatibility (currently a warning is 
printed about ignored extra stuff).

-- 
Fabien.

Re: improve pgbench syntax error messages

From
Fabien COELHO
Date:
> Indeed. Here is a v2.

Here is a v3, which (1) activates better error messages from bison
and (2) improves the error reporting from the scanner as well.

>  sh> ./pgbench -f bad.sql
>  bad.sql:3: syntax error at column 23 in command "set"
>  \set aid (1021 * :id) %
>                        ^ error found here

the improved message is:  bad.sql:3: syntax error, unexpected $end at column 23 in command "set"  \set aid (1021 * :id)
%                       ^ error found here
 

Also scanner errors are better:
  sh> ./pgbench -f bad5.sql  bad5.sql:1: unexpected character (a) at column 12 in command "set"  \set foo 12abc
   ^ error found here
 

-- 
Fabien.

Re: improve pgbench syntax error messages

From
Fabien COELHO
Date:
> Here is a v3, which (1) activates better error messages from bison
> and (2) improves the error reporting from the scanner as well.

v4.

While adding a basic function call syntax to expressions, a noticed that 
it would be useful to access the "detail" field of syntax errors so as to 
report the name of the unknown function. This version just adds the hook 
(expr_yyerror_detailed) that could be called later for this purpose.

-- 
Fabien.

Re: improve pgbench syntax error messages

From
Robert Haas
Date:
On Sat, Mar 7, 2015 at 5:49 AM, Fabien COELHO <coelho@cri.ensmp.fr> wrote:
>> Here is a v3, which (1) activates better error messages from bison
>> and (2) improves the error reporting from the scanner as well.
>
> v4.
>
> While adding a basic function call syntax to expressions, a noticed that it
> would be useful to access the "detail" field of syntax errors so as to
> report the name of the unknown function. This version just adds the hook
> (expr_yyerror_detailed) that could be called later for this purpose.

Let's not add code that isn't doing anything yet.  We can patch the
system more later if we need to.

+/* better error reporting with bison */
+%define parse.lac full
+%define parse.error verbose

What does this do?  The comments should explain, I think.  Does it
work on all versions of bison >= the minimum version we support?

+void syntax_error(const char *source, const int lineno,

Not project style.

+       if (line) {

That isn't either.

How about having syntax_error using appendPQExpBuffer() and friends
instead of printing data one character at a time?

+/* error message generation helper */
+#define SYNTAX_ERROR(msg, more, col)                                   \
+       syntax_error(source, lineno, my_commands->line,         \
+                                my_commands->argv[0], msg, more, col)
+

I don't like this.  Just substitute the expansion.  Otherwise there's
a non-obvious dependency on my_commands.

+               /* stop "set" argument parsing the variable name */

This comment addition isn't related to the purpose of this patch, and
I also can't understand what the new comment is supposed to mean.
It's the value we don't want to strtok()-ify, not the name.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: improve pgbench syntax error messages

From
Fabien COELHO
Date:
Hello,

Here is a v5.

>> While adding a basic function call syntax to expressions, a noticed that it
>> would be useful to access the "detail" field of syntax errors so as to
>> report the name of the unknown function. This version just adds the hook
>> (expr_yyerror_detailed) that could be called later for this purpose.
>
> Let's not add code that isn't doing anything yet.  We can patch the
> system more later if we need to.

Ok.

> +/* better error reporting with bison */
> +%define parse.lac full
> +%define parse.error verbose
>
> What does this do?

It adds an explanation on syntax errors, instead of the laconic "syntax 
error" which is akin to a boolean.

> The comments should explain, I think.

I thought it did with the sentence "better error reporting with bison".

> Does it work on all versions of bison >= the minimum version we support?

Good question. After some digging, it seems to be 1.85 since pg 7.4... It 
does not seem to have a '%define' directive yet. That's too bad, because I 
like the better messages. So I put the define in a comment.

Maybe some conditional activation from configure generated macro could be 
possible, but I did not find anything about bison version.

> +void syntax_error(const char *source, const int lineno,
> +       if (line) {
>
> Not project style.

Indeed.

> How about having syntax_error using appendPQExpBuffer() and friends
> instead of printing data one character at a time?

This is just a basic error message printed before exiting. The simpler and 
more straightforward the better, IMHO. Pgbench relies on basic 
"fprintf(stderr, ...)" for error messages everywhere, I tried to blend in, 
eg I avoided "fputc" for printing alignment spaces for instance. PQ buffer 
stuff is not used anywhere else in pgbench. It seems to me overkill to 
introduce it there just for this function. I'll do it only if required.

> +/* error message generation helper */
> +#define SYNTAX_ERROR(msg, more, col)                                   \
> +       syntax_error(source, lineno, my_commands->line,         \
> +                                my_commands->argv[0], msg, more, col)
> +
>
> I don't like this.  Just substitute the expansion.  Otherwise there's
> a non-obvious dependency on my_commands.

I wanted to have one line calls and to avoid repeating the same list of 
arguments over and over again, especially with the very long "my_commands" 
variable name.

Once expanded, the result is either two long lines or three 
under-80-column lines per call. Not sure which one of these ugly choices 
is best. I took the first option.

> +               /* stop "set" argument parsing the variable name */
>
> This comment addition isn't related to the purpose of this patch, and
> I also can't understand what the new comment is supposed to mean.
> It's the value we don't want to strtok()-ify, not the name.

I wanted to write: "stop set argument parsing after the variable name"
(I forgot "after") because I had to figure out the max_args management
and a minimal comment would have helped. Maybe not with this comment 
though. I removed it.

-- 
Fabien.

Re: improve pgbench syntax error messages

From
Fabien COELHO
Date:
> Here is a v5.

Here is v6, just a rebase.

-- 
Fabien.

Re: improve pgbench syntax error messages

From
Robert Haas
Date:
On Sun, Mar 29, 2015 at 2:20 PM, Fabien COELHO <coelho@cri.ensmp.fr> wrote:
>> Here is a v5.
>
> Here is v6, just a rebase.

Committed with minor stylistic fixes.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: improve pgbench syntax error messages

From
Fabien COELHO
Date:
>> Here is v6, just a rebase.
>
> Committed with minor stylistic fixes.

Thanks!

-- 
Fabien.