Re: Invalid "trailing junk" error message when non-English letters are used - Mailing list pgsql-hackers

From Tom Lane
Subject Re: Invalid "trailing junk" error message when non-English letters are used
Date
Msg-id 1709057.1725490341@sss.pgh.pa.us
Whole thread Raw
In response to Re: Invalid "trailing junk" error message when non-English letters are used  (Pavel Borisov <pashkin.elfe@gmail.com>)
Responses Re: Invalid "trailing junk" error message when non-English letters are used
List pgsql-hackers
Karina Litskevich <litskevichkarina@gmail.com> writes:
> I see the two solutions here: either move the rule for decinteger_junk
> below the rules for hexinteger_junk, octinteger_junk and bininteger_junk,
> or just use a single rule decinteger_junk for all these cases, since the
> error message is the same anyway. I implemented the latter in the second
> version of the patch, also renamed this common rule to integer_junk.

That seems reasonable, but IMO this code was unacceptably
undercommented before and what you've done has made it worse.
We really need a comment block associated with the flex macros,
perhaps along the lines of

/*
 * An identifier immediately following a numeric literal is disallowed
 * because in some cases it's ambiguous what is meant: for example,
 * 0x1234 could be either a hexinteger or a decinteger "0" and an
 * identifier "x1234".  We can detect such problems by seeing if
 * integer_junk matches a longer substring than any of the XXXinteger
 * patterns.  (One "junk" pattern is sufficient because this will match
 * all the same strings we'd match with {hexinteger}{identifier} etc.)
 * Note that the rule for integer_junk must appear after the ones for
 * XXXinteger to make this work correctly.
 */

(Hmm, actually, is that last sentence true?  My flex is a bit rusty.)

param_junk really needs a similar comment, or maybe we could put
all the XXX_junk macros together and use one comment for all.

> Additionally, I noticed that this patch is going to change error messages
> in some cases, though I don't think it's a big deal. Example:
> Without patch:
> postgres=# select 0xyz;
> ERROR:  invalid hexadecimal integer at or near "0x"
> With patch:
> postgres=# select 0xyz;
> ERROR:  trailing junk after numeric literal at or near "0xyz"

That's sort of annoying, but I don't really see a better way,
or at least not one that's worth the effort.

>> FWIW output of the whole string in the error message doesnt' look nice to
>> me, but other places of code do this anyway e.g:
>> select ('1'||repeat('p',1000000))::integer;
>> This may be worth fixing.

I think this is nonsense: we are already in the habit of repeating the
whole failing query string in the STATEMENT field.  In any case it's
not something for this patch to worry about.

            regards, tom lane



pgsql-hackers by date:

Previous
From: Daniel Gustafsson
Date:
Subject: Re: [PATCH] Avoid mixing custom and OpenSSL BIO functions
Next
From: Thomas Munro
Date:
Subject: Re: Use streaming read API in ANALYZE